Construction of representative transcript and protein sets of human, mouse, and rat as a platform for their transcriptome and proteome analysis

Takeya Kasukawa; Shintaro Katayama; Hideya Kawaji; Harukazu Suzuki; David A Hume; Yoshihide Hayashizaki

doi:10.1016/j.ygeno.2004.08.011

Genomics. 2004 Dec;84(6):913-21. doi: 10.1016/j.ygeno.2004.08.011.

Authors

Takeya Kasukawa¹, Shintaro Katayama, Hideya Kawaji, Harukazu Suzuki, David A Hume, Yoshihide Hayashizaki

Affiliation

¹ Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center, RIKEN Yokohama Institute, Yokohama, Kanagawa 230-0045, Japan. rgscerg@gsc.riken.jp

PMID: 15533708

Abstract

The number of mammalian transcripts identified by full-length cDNA projects and genome sequencing projects is increasing remarkably. Clustering them into a strictly nonredundant and comprehensive set provides a platform for functional analysis of the transcriptome and proteome, but the quality of the clustering and predictive usefulness have previously required manual curation to identify truncated transcripts and inappropriate clustering of closely related sequences. A Representative Transcript and Protein Sets (RTPS) pipeline was previously designed to identify the nonredundant and comprehensive set of mouse transcripts based on clustering of a large mouse full-length cDNA set (FANTOM2). Here we propose an alternative method that is more robust, requires less manual curation, and is applicable to other organisms in addition to mouse. RTPSs of human, mouse, and rat have been produced by this method and used for validation. Their comprehensiveness and quality are discussed by comparison with other clustering approaches. The RTPSs are available at .

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Animals
Cluster Analysis
Computational Biology
DNA, Complementary / genetics*
Gene Expression Profiling
Genomics / standards*
Humans
Mice
Proteome* / standards*
Rats
Reference Standards

Substances

DNA, Complementary
Proteome