Splicing of internal large exons is defined by novel cis-acting sequence elements

Nucleic Acids Res. 2012 Oct;40(18):9244-54. doi: 10.1093/nar/gks652. Epub 2012 Jul 11.

Abstract

Human internal exons have an average size of 147 nt, and most are <300 nt. This small size is thought to facilitate exon definition. A small number of large internal exons have been identified and shown to be alternatively spliced. We identified 1115 internal exons >1000 nt in the human genome; these were found in 5% of all protein-coding genes, and most were expressed and translated. Surprisingly, 40% of these were expressed at levels similar to the flanking exons, suggesting they were constitutively spliced. While all of the large exons had strong splice sites, the constitutively spliced large exons had a higher ratio of splicing enhancers/silencers and were more conserved across mammals than the alternatively spliced large exons. We asked if large exons contain specific sequences that promote splicing and identified 38 sequences enriched in the large exons relative to small exons. The consensus sequence is C-rich with a central invariant CA dinucleotide. Mutation of these sequences in a candidate large exon indicated that these are important for recognition of large exons by the splicing machinery. We propose that these sequences are large exon splicing enhancers (LESEs).
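
The size criterion described above (internal exons >1000 nt) can be illustrated with a minimal sketch. This is not the authors' pipeline. The `(start, end)` tuple representation, the 0-based half-open coordinates, and the function name `large_internal_exons` are assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical representation: one exon as (start, end), 0-based half-open,
# so its length in nucleotides is end - start.
Exon = Tuple[int, int]


def large_internal_exons(exons: List[Exon], min_len: int = 1000) -> List[Exon]:
    """Return the internal exons of one transcript that are longer than min_len nt.

    "Internal" here means neither the first nor the last exon of the transcript.
    """
    ordered = sorted(exons)   # order exons by genomic start
    internal = ordered[1:-1]  # drop the two terminal exons (empty if < 3 exons)
    return [(start, end) for start, end in internal if end - start > min_len]


# Toy transcript with four exons; the coordinates are invented for the example.
transcript = [(0, 150), (500, 1700), (2000, 2140), (3000, 4500)]
result = large_internal_exons(transcript)
```

In this toy transcript only `(500, 1700)` is returned: it is internal and 1200 nt long. The 1500-nt exon at `(3000, 4500)` is excluded because it is the terminal exon.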

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Alternative Splicing*
  • Base Sequence
  • Conserved Sequence
  • Evolution, Molecular
  • Exons*
  • Gene Expression
  • Genome, Human
  • Humans
  • Introns
  • RNA Splice Sites
  • Regulatory Sequences, Ribonucleic Acid*
  • Sequence Analysis, RNA

Substances

  • RNA Splice Sites
  • Regulatory Sequences, Ribonucleic Acid