Detecting epistatic effects in association studies at a genomic level based on an ensemble approach

Bioinformatics. 2011 Jul 1;27(13):i222-9. doi: 10.1093/bioinformatics/btr227.

Abstract

Motivation: Most complex diseases involve multiple genes and their interactions. Although genome-wide association studies (GWAS) have had some success in identifying genetic variants underlying complex diseases, most existing studies rely on single-locus approaches, which detect single nucleotide polymorphisms (SNPs) essentially on the basis of their marginal associations with phenotypes.

Results: In this article, we propose an ensemble approach based on boosting to study gene-gene interactions. We extend the basic AdaBoost algorithm by incorporating an intuitive importance score based on Gini impurity to select candidate SNPs. Permutation tests are used to assess statistical significance. We have performed extensive simulation studies using three interaction models to evaluate the efficacy of our approach at realistic GWAS sizes, and have compared it with existing epistatic detection algorithms. Our results indicate that our approach is valid and efficient for GWAS, and that on disease models with epistasis it has more power than existing programs.
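
As a rough illustration of the idea in the Results paragraph, the Python sketch below boosts single-SNP decision stumps with AdaBoost, accumulates the weighted Gini impurity decrease of each selected SNP as an importance score, and calibrates the top score with a label-permutation test. The choice of weak learner, the exact form of the score, and all names (`gini`, `fit_stump`, `boosted_importance`, `permutation_pvalue`, the toy data) are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import math
import random


def gini(pos, neg):
    """Gini impurity 2p(1-p) of a node with case weight `pos` and control weight `neg`."""
    total = pos + neg
    if total == 0:
        return 0.0
    p = pos / total
    return 2.0 * p * (1.0 - p)


def fit_stump(genotypes, labels, weights, snp):
    """Three-way split on one SNP's genotype (0/1/2).

    Returns (weighted Gini decrease, per-genotype predicted label in {+1, -1}).
    """
    pos = [0.0, 0.0, 0.0]
    neg = [0.0, 0.0, 0.0]
    for row, y, w in zip(genotypes, labels, weights):
        if y == 1:
            pos[row[snp]] += w
        else:
            neg[row[snp]] += w
    total = sum(pos) + sum(neg)
    parent = gini(sum(pos), sum(neg))
    children = sum((pos[g] + neg[g]) / total * gini(pos[g], neg[g]) for g in range(3))
    rule = [1 if pos[g] >= neg[g] else -1 for g in range(3)]
    return parent - children, rule


def boosted_importance(genotypes, labels, rounds=20):
    """AdaBoost over single-SNP stumps; importance = summed Gini decrease per SNP."""
    n, m = len(genotypes), len(genotypes[0])
    weights = [1.0 / n] * n
    importance = [0.0] * m
    for _ in range(rounds):
        stumps = [fit_stump(genotypes, labels, weights, j) for j in range(m)]
        best = max(range(m), key=lambda j: stumps[j][0])
        gain, rule = stumps[best]
        err = sum(w for row, y, w in zip(genotypes, labels, weights) if rule[row[best]] != y)
        if err >= 0.5:  # weak learner no better than chance: stop boosting
            break
        err = max(err, 1e-10)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1.0 - err) / err)
        importance[best] += gain
        # Standard AdaBoost reweighting: misclassified samples gain weight.
        weights = [w * math.exp(-alpha * y * rule[row[best]])
                   for row, y, w in zip(genotypes, labels, weights)]
        norm = sum(weights)
        weights = [w / norm for w in weights]
    return importance


def permutation_pvalue(genotypes, labels, rounds=20, n_perm=100, seed=0):
    """Permutation p-value for the largest importance score (labels shuffled)."""
    rng = random.Random(seed)
    observed = max(boosted_importance(genotypes, labels, rounds))
    exceed = 0
    for _ in range(n_perm):
        shuffled = list(labels)
        rng.shuffle(shuffled)
        if max(boosted_importance(genotypes, shuffled, rounds)) >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_perm)


# Toy data (hypothetical): 60 samples, 3 SNPs; case status is driven by SNP 1.
toy_genotypes = [[i % 3, (i // 3) % 3, (i // 9) % 3] for i in range(60)]
toy_labels = [1 if row[1] >= 1 else -1 for row in toy_genotypes]
scores = boosted_importance(toy_genotypes, toy_labels, rounds=5)
p_value = permutation_pvalue(toy_genotypes, toy_labels, rounds=5, n_perm=20, seed=0)
print(scores, p_value)
```

Single-SNP stumps are used here only to keep the sketch short. On their own they capture marginal signal, so detecting interactions with little or no marginal effect would likely need a richer weak learner (for example, deeper trees) than this toy version assumes.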

Contact: jingli@case.edu.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Disease / genetics*
  • Epistasis, Genetic*
  • Genome-Wide Association Study*
  • Humans
  • Phenotype
  • Polymorphism, Single Nucleotide