Prediction of mRNA polyadenylation sites by support vector machine

Bioinformatics. 2006 Oct 1;22(19):2320-5. doi: 10.1093/bioinformatics/btl394. Epub 2006 Jul 26.

Abstract

mRNA polyadenylation is responsible for the 3' end formation of most mRNAs in eukaryotic cells and is linked to termination of transcription. Prediction of mRNA polyadenylation sites [poly(A) sites] can help identify genes, define gene boundaries, and elucidate regulatory mechanisms. Current methods for poly(A) site prediction achieve moderate sensitivity and specificity. Here, we present a method using a support vector machine for poly(A) site prediction. Using 15 cis-regulatory elements that are over-represented in various regions surrounding poly(A) sites, this method achieves higher sensitivity and similar specificity when compared with polyadq, a common tool for poly(A) site prediction. In addition, we found that while the polyadenylation signal AAUAAA and U-rich elements are primary determinants for poly(A) site prediction, other elements contribute to both sensitivity and specificity of the prediction, indicating a combinatorial mechanism involving multiple elements when choosing poly(A) sites in human cells.
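As a rough illustration of the feature-based approach summarized in the abstract, the sketch below turns the sequence around a candidate poly(A) site into a small numeric vector. It is not the authors' implementation: the two features (AAUAAA count upstream, U fraction downstream), the 40-nt window sizes, and the function names `count_motif`, `u_fraction`, and `poly_a_features` are all assumptions chosen for illustration, and the 15 cis-elements used in the paper are not reproduced here.

```python
from typing import List

# Hypothetical window size (nt) on each side of the candidate cleavage site.
# The paper's actual regions are not given in the abstract.
WINDOW = 40


def count_motif(text: str, motif: str) -> int:
    """Count occurrences of motif in text, allowing overlaps."""
    width = len(motif)
    return sum(
        1 for i in range(len(text) - width + 1) if text[i:i + width] == motif
    )


def u_fraction(text: str) -> float:
    """Fraction of U residues in text (0.0 for an empty string)."""
    return text.count("U") / len(text) if text else 0.0


def poly_a_features(seq: str, site: int) -> List[float]:
    """Build an illustrative feature vector for a candidate site.

    seq  : DNA or RNA sequence; T is converted to U.
    site : 0-based index of the candidate cleavage position.
    """
    rna = seq.upper().replace("T", "U")
    upstream = rna[max(0, site - WINDOW):site]
    downstream = rna[site:min(len(rna), site + WINDOW)]
    return [
        float(count_motif(upstream, "AAUAAA")),  # polyadenylation signal
        u_fraction(downstream),                  # crude U-richness measure
    ]


# Toy sequence: one AATAAA upstream of position 40, a T-run just downstream.
example = "C" * 20 + "AATAAA" + "C" * 14 + "T" * 10 + "G" * 30
print(poly_a_features(example, 40))
```

Vectors of this kind, computed for known poly(A) sites and for background positions, could then be used to train a support vector machine classifier, for example scikit-learn's `sklearn.svm.SVC`. The choice of kernel and parameters would be a separate decision that the abstract does not specify.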

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Artificial Intelligence*
  • Base Sequence
  • Chromosome Mapping / methods*
  • Molecular Sequence Data
  • Pattern Recognition, Automated / methods
  • Poly A / genetics*
  • Polyadenylation / genetics*
  • RNA, Messenger / genetics*
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Sequence Alignment / methods*
  • Sequence Analysis, RNA / methods*

Substances

  • RNA, Messenger
  • Poly A