Mixture model for sub-phenotyping in GWAS

Pac Symp Biocomput. 2012:363-74.

Abstract

Genome-wide association (GWA) studies have led to the discovery of genetic variants underlying several complex diseases, including Crohn's disease and age-related macular degeneration (AMD). Still, geneticists find that in the majority of studies the effect size, even when significant, tends to be very small. Several factors contribute to this problem, such as rare variants, complex relationships among SNPs (epistatic effects), and heterogeneity of the phenotype. In this work we focus on addressing phenotypic heterogeneity. We introduce the problem of identifying, from GWAS data, separate genotypic markers from overlapping mixtures of clinically indistinguishable phenotypes. We propose a generative model for this scenario and derive an expectation-maximization (EM) procedure to fit the model to data, as well as a novel screening procedure designed to identify skew specific to certain phenotypic regimes. We present results on several simulated datasets as well as preliminary findings from applying the model to a type 2 diabetes dataset.
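
The abstract does not specify the generative model or its EM updates, so the following is only an illustrative sketch of EM for a generic latent-class genotype mixture, not the authors' method. It assumes (hypothetically) that each individual belongs to one of K hidden sub-phenotypes, that SNPs are independent given the sub-phenotype, and that each genotype is a minor-allele count drawn from Binomial(2, theta). The function names `simulate` and `fit_mixture_em`, the allele frequencies, and the sample size are all made up for illustration; controls and the screening procedure are not modeled.

```python
import math
import random


def simulate(n, freqs_by_component, mix_weight, seed=1):
    """Draw n genotype rows from a two-component mixture (toy data, not real GWAS)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        k = 0 if rng.random() < mix_weight else 1
        # Each genotype is the number of minor alleles (0, 1 or 2) at one SNP.
        rows.append([sum(rng.random() < f for _ in range(2))
                     for f in freqs_by_component[k]])
    return rows


def fit_mixture_em(genotypes, n_components=2, n_iter=50, seed=0):
    """EM for a mixture of independent Binomial(2, theta) genotype models.

    Returns (weights, thetas, log_likelihoods); the log-likelihood is tracked
    up to an additive constant (binomial coefficients are omitted).
    """
    rng = random.Random(seed)
    n, m = len(genotypes), len(genotypes[0])
    eps = 1e-6
    weights = [1.0 / n_components] * n_components
    # Random starting allele frequencies break the symmetry between components.
    thetas = [[rng.uniform(0.2, 0.8) for _ in range(m)]
              for _ in range(n_components)]
    log_likelihoods = []

    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each individual.
        resp = []
        total_ll = 0.0
        for row in genotypes:
            log_joint = []
            for k in range(n_components):
                lp = math.log(weights[k])
                for x, t in zip(row, thetas[k]):
                    lp += x * math.log(t) + (2 - x) * math.log(1.0 - t)
                log_joint.append(lp)
            top = max(log_joint)  # log-sum-exp for numerical stability
            log_norm = top + math.log(sum(math.exp(v - top) for v in log_joint))
            total_ll += log_norm
            resp.append([math.exp(v - log_norm) for v in log_joint])
        log_likelihoods.append(total_ll)

        # M-step: re-estimate mixing weights and per-SNP allele frequencies.
        masses = [sum(r[k] for r in resp) for k in range(n_components)]
        floored = [max(mass / n, eps) for mass in masses]
        norm = sum(floored)
        weights = [w / norm for w in floored]
        for k in range(n_components):
            for j in range(m):
                allele_sum = sum(r[k] * row[j] for r, row in zip(resp, genotypes))
                theta = allele_sum / (2.0 * masses[k]) if masses[k] > 0 else 0.5
                thetas[k][j] = min(max(theta, eps), 1.0 - eps)
    return weights, thetas, log_likelihoods


# Toy usage: two hidden sub-phenotypes that differ only at the first two SNPs.
data = simulate(300, [[0.1, 0.1, 0.5, 0.5], [0.6, 0.6, 0.5, 0.5]], 0.5)
weights, thetas, lls = fit_mixture_em(data)
```

In this toy setup the tracked log-likelihood should not decrease from one iteration to the next, which is a convenient sanity check on any EM implementation. Component labels are arbitrary, so the fitted components may come out in either order.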

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology
  • Computer Simulation
  • Databases, Genetic / statistics & numerical data
  • Diabetes Mellitus, Type 2 / genetics
  • Genetic Association Studies
  • Genome-Wide Association Study / statistics & numerical data*
  • Humans
  • Models, Genetic
  • Models, Statistical
  • Phenotype
  • Polymorphism, Single Nucleotide