Abundant novel transcriptional units and unconventional gene pairs on human chromosome 22

Genome Res. 2006 Jan;16(1):45-54. doi: 10.1101/gr.3883606. Epub 2005 Dec 12.

Abstract

Novel transcriptional units (TUs) are EST-supported transcribed features not corresponding to known genes. Unconventional gene pairs (UGPs) are pairs of genes and/or TUs sharing exon-to-exon cis-antisense overlaps or putative bidirectional promoters. Computational TU and UGP discovery followed by manual curation was performed in the entire published 34.9-Mb human chromosome 22 euchromatic sequence. Novel TUs (n = 517) were as abundant as known genes (n = 492) and typically lacked nonprimate DNA and protein homologies. One hundred seventy-one (33%) of the TUs, but only 13 (3%) of the genes, both lacked nonprimate conservation and localized to gaps in the human-mouse BLASTZ alignment. Novel TUs were richer in exonic primate-specific interspersed repetitive elements (P = 0.001) and were more likely than known genes to rely on splice junctions provided by these elements: 19% of spliced TUs, versus 5% of spliced genes, had a splice site within a primate-specific repeat. Hence, novel TUs and known genes may represent different portions of the transcriptome. Two hundred nine (21%) of chromosome 22 transcripts participated in 77 cis-antisense and 42 promoter-sharing UGPs. Transcripts involved simultaneously in both UGP types were more common than expected (P = 0.01). UGPs were nonrandomly distributed along the sequence: 89 (75%) clustered in distinct regions, the sum of which equaled 4.4 Mb (<13% of the chromosome). Eighty (67%) of the UGPs possessed significant locus structure differences between primates and rodents. Since some TUs may be functional noncoding transcripts and since the cis-regulatory potential of UGPs is well recognized, TUs and UGPs specific to the primate lineage may contribute to the genomic basis for primate-specific phenotypes.
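The percentages quoted in the abstract can be recomputed from its own counts. The short Python sketch below does only that arithmetic. The denominators are assumptions inferred from the reported figures, not stated explicitly in the abstract: the transcript total is taken as novel TUs plus known genes (517 + 492), and the UGP total as cis-antisense plus promoter-sharing pairs (77 + 42). All variable and function names are illustrative.

```python
# Counts quoted in the abstract (human chromosome 22 euchromatic sequence).
NOVEL_TUS = 517
KNOWN_GENES = 492
TUS_UNCONSERVED_IN_GAPS = 171    # TUs lacking nonprimate conservation, in BLASTZ gaps
GENES_UNCONSERVED_IN_GAPS = 13   # known genes meeting the same two criteria
TRANSCRIPTS_IN_UGPS = 209
CIS_ANTISENSE_UGPS = 77
PROMOTER_SHARING_UGPS = 42
CLUSTERED_UGPS = 89
UGPS_DIFFERING_IN_RODENTS = 80
CLUSTER_SPAN_MB = 4.4
EUCHROMATIN_MB = 34.9


def pct(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


def summary():
    # Assumption: "transcripts" means novel TUs plus known genes.
    transcripts = NOVEL_TUS + KNOWN_GENES
    # Assumption: the UGP total is the sum of the two pair types.
    ugps = CIS_ANTISENSE_UGPS + PROMOTER_SHARING_UGPS
    return {
        "transcripts": transcripts,
        "ugps": ugps,
        "tus_unconserved_pct": pct(TUS_UNCONSERVED_IN_GAPS, NOVEL_TUS),
        "genes_unconserved_pct": pct(GENES_UNCONSERVED_IN_GAPS, KNOWN_GENES),
        "transcripts_in_ugps_pct": pct(TRANSCRIPTS_IN_UGPS, transcripts),
        "clustered_ugps_pct": pct(CLUSTERED_UGPS, ugps),
        "ugps_differing_pct": pct(UGPS_DIFFERING_IN_RODENTS, ugps),
        # Left unrounded so it can be compared with the "<13%" bound.
        "cluster_span_pct": 100 * CLUSTER_SPAN_MB / EUCHROMATIN_MB,
    }


if __name__ == "__main__":
    for name, value in summary().items():
        print(name, value)
```

Under these assumed denominators the recomputed values agree with the reported 33%, 3%, 21%, 75%, and 67%, and the clustered regions cover roughly 12.6% of the sequence, consistent with the stated bound of under 13%.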

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Animals
  • Chromosomes, Human, Pair 22 / genetics*
  • Euchromatin / genetics*
  • Humans
  • Interspersed Repetitive Sequences / genetics*
  • Mice
  • Open Reading Frames / genetics*
  • Transcription, Genetic / genetics*

Substances

  • Euchromatin