Friday, September 18, 2015

Computational biologist's lab notebook

The Schnell lab at the University of Michigan recently published an interesting editorial in PLOS Computational Biology (article link). It covers various aspects of a typical bioinformatics project, especially in a life science research environment. The article is written to read like a wet-lab protocol and rule book. My personal favorites are rule #4 and rule #5, about not only recording all scientific activity (e.g. which bioinformatics programs / code were used, data sources, data references, parameters used etc.) but also keeping track of it in a date-wise fashion. These rules are basically referring to version control in a code development setting. Give it a read; it is an interesting article.


Ten Simple Rules for a Computational Biologist’s Laboratory Notebook (Santiago Schnell)
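
As a rough illustration of the date-wise record keeping described above (this is my own sketch, not something taken from the article), each analysis step could be appended to a plain-text log as one timestamped JSON line. The function name `log_entry`, the field names and the file name `notebook.jsonl` are all hypothetical choices made for this example.

```python
import json
from datetime import datetime, timezone


def log_entry(path, program, version, parameters, data_sources, note=""):
    """Append one timestamped record of an analysis step to a JSON-lines file."""
    entry = {
        # UTC timestamp so that entries sort chronologically
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "program": program,
        "version": version,
        "parameters": parameters,
        "data_sources": data_sources,
        "note": note,
    }
    # Append mode: earlier entries are never overwritten
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry


def read_entries(path):
    """Read all notebook entries back in the order they were written."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

A call such as `log_entry("notebook.jsonl", "aligner", "1.0", {"threads": 4}, ["sample_A.fastq"])` (placeholder values) would add one line to the log. Keeping the log file itself under version control gives a dated history of both the code and the notes.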


Monday, August 24, 2015

Another RNA-Seq tool from Lior Pachter's lab: Sleuth



A new RNA-Seq differential expression analysis program, 'sleuth', was released today from Lior Pachter's group. It challenges the currently used 'count-based' methods for RNA-Seq data, and it accounts for both biological and technical variation.
He raises a few good points: RNA-Seq based expression estimation is challenging because transcription varies biologically within and between cells. Also, technical variation in a repeated sequencing experiment can sometimes overshadow the 'true' expression level changes.

See more details here:
https://liorpachter.wordpress.com/2015/08/17/a-sleuth-for-rna-seq/

software download: http://pachterlab.github.io/sleuth/






Wednesday, July 22, 2015

RNA-Seq expression metrics: FPKM, RPKM and TPM explained by StatQuest


FPKM / RPKM and TPM are among the most widely used transcript and gene expression normalization methods in RNA-Seq studies. FPKM and RPKM are similar metrics, except that the former applies to paired-end sequencing (fragments) while the latter is for single-end data (reads). TPM is a bit different because it is calculated using a different order of operations: RPKM / FPKM normalize for sequencing depth first and then for gene length, whereas TPM normalizes for gene length first and then for sequencing depth. StatQuest explains it well in a nutshell (please watch the video below and read the summary at the following link):



For a text summary, please refer to the StatQuest page.
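
To make the difference in the order of operations concrete, here is a minimal Python sketch on made-up numbers. The function names and the toy counts and lengths are my own assumptions for illustration; real pipelines typically use effective transcript lengths and the output of a dedicated quantification tool.

```python
def rpkm(counts, lengths_bp):
    """RPKM (or FPKM if counts are fragments): depth first, then length."""
    # Step 1: scale by sequencing depth (total reads, in millions)
    per_million = sum(counts) / 1e6
    rpm = [c / per_million for c in counts]
    # Step 2: scale by gene length (in kilobases)
    return [r / (length / 1e3) for r, length in zip(rpm, lengths_bp)]


def tpm(counts, lengths_bp):
    """TPM: length first, then depth."""
    # Step 1: scale by gene length (in kilobases) -> reads per kilobase
    rpk = [c / (length / 1e3) for c, length in zip(counts, lengths_bp)]
    # Step 2: scale by the total of the length-normalized values (in millions)
    per_million = sum(rpk) / 1e6
    return [r / per_million for r in rpk]


if __name__ == "__main__":
    # Toy data: three genes in one sample (hypothetical values)
    counts = [10, 20, 30]
    lengths_bp = [2000, 4000, 1000]
    print("RPKM:", rpkm(counts, lengths_bp))
    print("TPM: ", tpm(counts, lengths_bp))
```

Because the depth scaling comes last in TPM, the TPM values of a sample always add up to one million, which makes proportions comparable across samples. The RPKM values generally do not add up to the same total from sample to sample.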


New ovarian cancer study on large structural variants based on TCGA data is out

We have recently published a study on ovarian cancer using whole genome sequence (control + tumor), RNA-Seq and microarray gene expression chip data from the same patients. The data were acquired from TCGA. The paper is titled:

 "Integrated sequence and expression analysis of ovarian cancer structural variants underscores the importance of gene fusion regulation" (BMC Medical Genomics, Pubmed)

Here is a brief summary:
  • First, we designed an integrated bioinformatics workflow (Figure S1) to process and analyze the large volumes of genomic and transcriptomic data, and to detect structural variant (SV) breakpoints accurately at single nucleotide resolution.
  • Using whole genome sequencing (WGS) data, we detected various classes of SVs and further classified them as 'germline derived' or 'somatically derived'.
  • Then, based on their structure and underlying genomic regions, we determined their potential to create functional gene fusions at the RNA level.
  • Using RNA-Seq and microarray data, we measured the proportion of potential gene-fusion-forming SVs that actually get transcribed.
  • The observations suggest the existence of regulatory mechanism(s) that suppress the expression of the more established germline SVs (which could be segregating in the natural population) but facilitate the expression of selected somatically derived SVs at the RNA level in ovarian tumors.
  • Our findings resonate with the observations of Bueno et al. (Pubmed) that the expression of the BCR-ABL gene fusion, a very well known tumor driver in non-solid tumors, can be regulated by the genetic and epigenetic silencing of miR-203.
  • Our findings are also relevant to the fact that recent studies have found several gene fusions considered to be cancer biomarkers in healthy individuals. Simply put, it is not the occurrence of SVs at the genomic level but the regulation of their expression at the RNA level that contributes to their biological and clinical significance in the onset and progression of cancer.
Feel free to contact me if you have any questions or need more details about the paper.