Common statistical issues in genome-wide association studies: a review on power, data quality control, genotype calling and population structure

Curr Opin Lipidol. 2008 Apr;19(2):133-43. doi: 10.1097/MOL.0b013e3282f5dd77.

Abstract

Purpose of review: Genetic association studies that survey the entire genome have become a common design for uncovering the genetic basis of common diseases, including lipid-related traits. Such studies have identified several novel loci that influence blood lipids. The present review highlights the statistical challenges associated with such large-scale genetic studies and discusses the available methodological strategies for handling these issues.

Recent findings: The successful analysis of genome-wide data assayed on commercial genotyping arrays depends on careful exploration of the data. Unaccounted sample failures, genotyping errors and population structure can introduce misleading signals that mimic genuine association. Careful interpretation of useful summary statistics and graphical data displays can minimize the extent of false associations that need to be followed up in replication or fine-mapping experiments.
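
As a rough illustration of the kind of summary statistics referred to above, the following Python sketch computes three quantities that are commonly inspected during quality control: a call rate (to flag sample or marker failures), a Hardy-Weinberg chi-square test (to flag possible genotyping errors), and a genomic inflation factor (to flag possible population structure). The abstract does not name these particular statistics, so the choice, the function names and the input conventions (for example `None` marking a failed call) are assumptions made for illustration only.

```python
import math
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def call_rate(genotypes):
    """Fraction of non-missing calls; None marks a failed call (assumed convention)."""
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def hwe_chi_square(n_aa, n_ab, n_bb):
    """1-df chi-square test of Hardy-Weinberg proportions from genotype counts.

    Returns (statistic, p_value). This is the simple asymptotic test;
    an exact test is usually preferred when genotype counts are small.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip empty expected cells (monomorphic marker) to avoid dividing by zero.
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a 1-df chi-square: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def genomic_inflation(chi2_stats):
    """Median observed 1-df association statistic divided by its expected median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN
```

In such a sketch, a low call rate or a very small Hardy-Weinberg p-value would mark a sample or marker for closer inspection, and an inflation factor well above 1 would suggest that population structure or another systematic bias is inflating the association statistics. Any thresholds would have to be chosen per study.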

Summary: Recently published genome-wide studies are beginning to yield valuable insights into the importance of well-designed methodological and statistical techniques for sensible interpretation of the plethora of genetic data generated.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Genetic Predisposition to Disease / genetics*
  • Genetics, Population / standards
  • Genetics, Population / statistics & numerical data*
  • Genome, Human / genetics*
  • Genotype
  • Humans
  • Polymorphism, Single Nucleotide / genetics
  • Quality Control