Genome-wide detection of alternative splicing in expressed sequences of human genes

Nucleic Acids Res. 2001 Jul 1;29(13):2850-9. doi: 10.1093/nar/29.13.2850.

Abstract

We have identified 6201 alternative splice relationships in human genes through a genome-wide analysis of expressed sequence tags (ESTs). Starting with approximately 2.1 million human mRNA and EST sequences, we mapped expressed sequences onto the draft human genome sequence and only accepted splices that obeyed the standard splice site consensus. A large fraction (47%) of these were observed multiple times, indicating that they comprise a substantial fraction of the mRNA species. The vast majority of the detected alternative forms appear to be novel, and produce highly specific, biologically meaningful control of function in both known and novel human genes. Examples include specific removal of the lysosomal targeting signal from HLA-DM beta chain, and replacement of the C-terminal transmembrane domain and cytoplasmic tail in an Fc receptor beta chain homolog with a different transmembrane domain and cytoplasmic tail, likely modulating its signal transduction activity. Our data indicate that a large proportion of human genes, probably 42% or more, are alternatively spliced, but that this appears to be observed mainly in certain types of molecules (e.g. cell surface receptors) and systemic functions, particularly the immune system and nervous system. These results provide a comprehensive dataset for understanding the role of alternative splicing in the human genome, accessible at http://www.bioinformatics.ucla.edu/HASDB.
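
The splice-site filtering step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that "standard splice site consensus" means the canonical GT...AG intron boundaries, that candidate introns are given as 0-based half-open coordinates on the plus strand, and that the genome is a plain string. The function names and the toy sequence are hypothetical and are not taken from the paper or from HASDB.

```python
def is_canonical_intron(genome: str, start: int, end: int) -> bool:
    """Return True if genome[start:end] begins with GT and ends with AG.

    Assumption: plus-strand coordinates, 0-based, half-open [start, end).
    """
    intron = genome[start:end].upper()
    # Require at least 4 bases so the GT and AG dinucleotides do not overlap.
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


def filter_canonical_splices(genome: str, introns: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Keep only the candidate introns that obey the GT...AG rule."""
    return [(s, e) for (s, e) in introns if is_canonical_intron(genome, s, e)]


# Toy example (hypothetical sequence): exon, canonical intron, exon, filler.
toy_genome = "ATGGCC" + "GTAAGTTTTCAG" + "GCTTGA" + "CCATAACC"
candidates = [(6, 18), (24, 32)]  # second candidate lacks GT...AG
accepted = filter_canonical_splices(toy_genome, candidates)
print(accepted)
```

A real pipeline would also need to handle minus-strand alignments (where the intron reads CT...AC on the plus strand) and alignment uncertainty near exon boundaries; both are omitted here for brevity.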

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Alternative Splicing / genetics*
  • Base Sequence
  • Computational Biology
  • Consensus Sequence / genetics
  • Conserved Sequence / genetics
  • Databases as Topic
  • Exons / genetics
  • Expressed Sequence Tags
  • Genes / genetics*
  • Genome, Human*
  • Genomics*
  • HLA-D Antigens / chemistry
  • HLA-D Antigens / genetics
  • Humans
  • Internet
  • Introns / genetics
  • Multigene Family / genetics
  • Myotonin-Protein Kinase
  • Polymorphism, Single Nucleotide / genetics
  • Protein Serine-Threonine Kinases / genetics
  • RNA Splice Sites / genetics
  • RNA, Messenger / analysis
  • RNA, Messenger / genetics
  • Reproducibility of Results
  • Sequence Alignment

Substances

  • DMPK protein, human
  • HLA-D Antigens
  • HLA-DM antigens
  • RNA Splice Sites
  • RNA, Messenger
  • Myotonin-Protein Kinase
  • Protein Serine-Threonine Kinases