Gene mining: a novel and powerful ensemble decision approach to hunting for disease genes using microarray expression profiling

Nucleic Acids Res. 2004 May 17;32(9):2685-94. doi: 10.1093/nar/gkh563. Print 2004.

Abstract

Current applications of microarrays focus on the precise classification or discovery of biological types, for example tumor versus normal phenotypes in cancer research. Several challenging tasks of the post-genomic era, such as hunting for the genes underlying complex diseases in genome-wide gene expression profiles and building the corresponding gene networks, have been largely overlooked for lack of an efficient analysis approach. We have therefore developed an ensemble decision approach that can efficiently perform multiple gene mining tasks. Applied to two publicly available data sets (colon data and leukemia data), the approach identified 20 highly significant colon cancer genes and 23 highly significant molecular signatures for refining the acute leukemia phenotype, most of which have been verified either by biological experiments or by alternative analysis approaches. Furthermore, the globally optimal gene subsets identified by the approach have so far achieved the highest accuracy for classification of colon cancer tissue types. This analysis strategy offers the promise of advancing microarray technology as a means of deciphering the genetic complexity of complex diseases.
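
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way an ensemble decision approach could rank genes, not the authors' method. It assumes (hypothetically) an ensemble of one-split decision trees, each trained on a bootstrap resample of the tissue samples and a random subset of genes, with genes ranked by how often they are chosen as the splitting feature. The function names, parameters, and the toy data are all illustrative assumptions; the toy matrix is synthetic and stands in for neither the colon nor the leukemia data set.

```python
import random
from collections import Counter


def best_stump(X, y, gene_indices):
    """Return (gene, errors) for the single-gene threshold split that
    misclassifies the fewest samples among the candidate genes."""
    best_gene, best_err = None, len(y) + 1
    for g in gene_indices:
        values = sorted(set(row[g] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            # Errors when predicting class 1 above the threshold...
            err = sum((row[g] > t) != bool(label) for row, label in zip(X, y))
            # ...or with the opposite polarity, whichever is better.
            err = min(err, len(y) - err)
            if err < best_err:
                best_gene, best_err = g, err
    return best_gene, best_err


def rank_genes(X, y, n_trees=200, n_candidate_genes=3, seed=0):
    """Rank genes by how often an ensemble of stumps selects them.
    All parameters are hypothetical defaults chosen for the toy example."""
    rng = random.Random(seed)
    n_samples, n_genes = len(X), len(X[0])
    counts = Counter()
    for _ in range(n_trees):
        # Bootstrap resample of the tissue samples.
        idx = [rng.randrange(n_samples) for _ in range(n_samples)]
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        # Random subset of genes offered to this tree.
        genes = rng.sample(range(n_genes), n_candidate_genes)
        gene, _ = best_stump(Xb, yb, genes)
        if gene is not None:
            counts[gene] += 1
    return counts.most_common()


# Synthetic toy data: 20 samples x 6 genes, two classes (0 = normal, 1 = tumor).
# Only gene 2 is constructed to differ between the classes.
y = [0] * 10 + [1] * 10
X = []
for i, label in enumerate(y):
    row = [((i * 7 + g * 13) % 11) / 10 for g in range(6)]  # class-independent values
    row[2] += 5 * label  # shift gene 2 upward in the "tumor" class
    X.append(row)

ranking = rank_genes(X, y)
for gene, votes in ranking:
    print(f"gene {gene}: selected by {votes} trees")
```

Ranking by selection frequency across many resampled trees is one simple way to make the gene list less sensitive to any single sample; a realistic analysis would use deeper trees, far more genes, and a significance assessment of the selection counts.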

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Acute Disease
  • Algorithms
  • Colonic Neoplasms / genetics*
  • Databases, Genetic
  • Decision Trees*
  • Gene Expression Profiling
  • Genetic Predisposition to Disease / genetics*
  • Humans
  • Leukemia / classification
  • Leukemia / genetics*
  • Oligonucleotide Array Sequence Analysis*