A simple and efficient algorithm for genome-wide homozygosity analysis in disease

Mol Syst Biol. 2009:5:304. doi: 10.1038/msb.2009.53. Epub 2009 Sep 15.

Abstract

Here we propose a simple statistical algorithm for rapidly scoring loci associated with disease or traits caused by recessive mutations or deletions, using genome-wide single-nucleotide polymorphism (SNP) genotyping data from unrelated cases and controls. The algorithm identifies loci by defining homozygous segments of the genome that are present at significantly different frequencies in cases and controls. We found that false-positive loci could be effectively removed from the output of this procedure by applying different physical size thresholds to the homozygous segments. The procedure is then repeated on random sub-datasets until the number of selected loci converges. We demonstrate the method on a publicly available data set for Alzheimer's disease and identify 26 candidate risk loci on the 22 autosomes. In this data set, these loci can explain 75% of the genetic risk variability of the disease.
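The core idea of the abstract (find homozygous segments, filter them by physical size, then compare how often each SNP is covered in cases versus controls) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not name the test statistic, so the two-sided Fisher exact test used here is an assumption, as are the 0/1/2 genotype coding (1 = heterozygous), the absence of missing calls, and every function and parameter name (`homozygous_segments`, `coverage_counts`, `score_loci`, `min_bp`). The iterative resampling over random sub-datasets is not shown.

```python
from math import comb


def homozygous_segments(genotypes, positions, min_bp):
    """Return (start, end) index pairs for runs of consecutive homozygous
    calls whose physical span is at least min_bp base pairs.
    Assumed coding: 0 or 2 = homozygous, 1 = heterozygous."""
    segments = []
    start = None
    for i, g in enumerate(genotypes):
        if g != 1:
            if start is None:
                start = i          # a new homozygous run begins here
        elif start is not None:
            # a heterozygous call ends the run at index i - 1
            if positions[i - 1] - positions[start] >= min_bp:
                segments.append((start, i - 1))
            start = None
    if start is not None and positions[-1] - positions[start] >= min_bp:
        segments.append((start, len(genotypes) - 1))
    return segments


def coverage_counts(group, positions, min_bp):
    """For each SNP, count the individuals in group whose retained
    homozygous segments cover that SNP."""
    counts = [0] * len(positions)
    for genotypes in group:
        for start, end in homozygous_segments(genotypes, positions, min_bp):
            for i in range(start, end + 1):
                counts[i] += 1
    return counts


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, col1 = a + b, a + c

    def prob(x):
        # hypergeometric probability of x in the top-left cell
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = prob(a)
    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)
    # sum over all tables at least as extreme as the observed one
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


def score_loci(cases, controls, positions, min_bp):
    """Per-SNP p-values comparing segment coverage in cases vs controls."""
    case_cov = coverage_counts(cases, positions, min_bp)
    ctrl_cov = coverage_counts(controls, positions, min_bp)
    return [
        fisher_two_sided(a, len(cases) - a, c, len(controls) - c)
        for a, c in zip(case_cov, ctrl_cov)
    ]
```

Calling `score_loci` repeatedly with different `min_bp` values mimics, in a very simplified way, the use of different physical size thresholds described above; choosing those thresholds and the convergence criterion is left open here because the abstract does not specify them.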

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, N.I.H., Intramural

MeSH terms

  • Algorithms*
  • Alzheimer Disease / genetics*
  • Genetic Predisposition to Disease
  • Genome, Human
  • Genome-Wide Association Study / methods*
  • Humans
  • Models, Genetic*
  • Models, Statistical
  • Polymorphism, Single Nucleotide
  • Systems Biology / methods*