Investigation of factors affecting RNA-seq gene expression calls

Annu Int Conf IEEE Eng Med Biol Soc. 2014:2014:5232-5. doi: 10.1109/EMBC.2014.6944805.

Abstract

RNA-seq enables quantification of the human transcriptome. Estimation of gene expression is a fundamental issue in the analysis of RNA-seq data. However, there is an inherent ambiguity in distinguishing between genes with very low expression and experimental or transcriptional noise. We conducted an exploratory investigation of some factors that may affect gene expression calls. We observed that the distributions of reads that map to exonic, intronic, and intergenic regions are distinct. These distributions may provide useful insights into the behavior of gene expression noise. Moreover, we observed that these distributions are qualitatively similar between two sequence mapping algorithms. Finally, we examined the relationship between gene length and gene expression calls, and observed that they are correlated. This preliminary investigation is important for RNA-seq gene expression analysis because it may lead to more effective algorithms for distinguishing between true gene expression and experimental or transcriptional noise.
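
The two analyses mentioned above (assigning mapped reads to exonic, intronic, or intergenic regions, and relating gene length to read counts) can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy annotation `GENES`, the function names, the half-open coordinate convention, and the use of Pearson correlation are all assumptions made for illustration, since the abstract does not specify the tools or the correlation measure used.

```python
from math import sqrt

# Hypothetical toy annotation on a single chromosome.
# Coordinates are half-open intervals [start, end); real data would
# come from an annotation file and an aligner's output.
GENES = [
    {"name": "geneA", "start": 100, "end": 500,
     "exons": [(100, 200), (400, 500)]},
    {"name": "geneB", "start": 800, "end": 1000,
     "exons": [(800, 1000)]},
]


def classify_read(pos, genes=GENES):
    """Label a mapped read position as exonic, intronic, or intergenic."""
    for gene in genes:
        if gene["start"] <= pos < gene["end"]:
            for ex_start, ex_end in gene["exons"]:
                if ex_start <= pos < ex_end:
                    return "exonic"
            # Inside the gene body but outside every exon.
            return "intronic"
    return "intergenic"


def region_counts(positions, genes=GENES):
    """Count reads per region class for a list of mapped positions."""
    counts = {"exonic": 0, "intronic": 0, "intergenic": 0}
    for pos in positions:
        counts[classify_read(pos, genes)] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up read positions, for demonstration only.
    reads = [150, 250, 450, 600, 900, 50]
    print(region_counts(reads))

    # Made-up gene lengths and read counts, for demonstration only.
    lengths = [1000, 2000, 3000]
    read_counts = [20, 40, 60]
    print(pearson(lengths, read_counts))
```

Comparing the per-class counts (or their per-base densities) across samples or aligners is one simple way to examine how intronic and intergenic reads, which may reflect noise, differ from exonic reads. A rank-based measure such as Spearman correlation could be substituted for `pearson` if the relationship between length and counts is not expected to be linear.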

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA, Intergenic / genetics
  • Exons / genetics
  • Gene Expression Profiling*
  • Gene Expression Regulation
  • Humans
  • Introns / genetics
  • Sequence Analysis, RNA / methods*
  • Transcriptome / genetics

Substances

  • DNA, Intergenic