Unbiased methods for population-based association studies

Genet Epidemiol. 2001 Dec;21(4):273-84. doi: 10.1002/gepi.1034.

Abstract

Large, population-based samples and large-scale genotyping are being used to evaluate disease/gene associations. A substantial drawback of such samples is that population substructure can induce spurious associations between genes and disease. We review two methods, called genomic control (GC) and structured association (SA), that obviate many of the concerns about population substructure by using the features of the genomes present in the sample to correct for stratification. The GC approach exploits the fact that population substructure generates "overdispersion" of statistics used to assess association. By testing multiple polymorphisms throughout the genome, only some of which are pertinent to the disease of interest, the degree of overdispersion generated by population substructure can be estimated and taken into account. The SA approach assumes that the sampled population, although heterogeneous, is composed of subpopulations that are themselves homogeneous. By using multiple polymorphisms throughout the genome, this "latent class method" estimates the probability that each sampled individual derives from each of these latent subpopulations. GC has the advantages of robustness, simplicity, and wide applicability, even to experimental designs such as DNA pooling. SA is somewhat more complicated but has the advantage of greater power in some realistic settings, such as admixed populations or when association varies widely across subpopulations. It, too, is widely applicable. Both methods also have weaknesses, as elaborated in our review.
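
The GC idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each marker yields a 1-degree-of-freedom chi-square association statistic, that the overdispersion factor (here called `lam`) is estimated as the median of those statistics divided by the median of the 1-df chi-square distribution (approximately 0.4549), and that the estimate is floored at 1. The function names and the toy statistics are hypothetical and chosen only for illustration.

```python
from statistics import median

# Approximate median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549


def inflation_factor(marker_stats):
    """Estimate the overdispersion factor from genome-wide marker statistics.

    Assumption: most markers are unrelated to the disease, so the median
    of their 1-df chi-square statistics reflects inflation caused by
    population substructure rather than true association.
    """
    lam = median(marker_stats) / CHI2_1DF_MEDIAN
    # Assumption: never deflate, so the estimate is floored at 1.
    return max(lam, 1.0)


def gc_adjust(stat, lam):
    """Rescale a candidate marker's statistic by the estimated factor."""
    return stat / lam


# Made-up toy statistics for seven markers (not real data).
stats = [0.1, 0.3, 0.5, 0.7, 0.9, 1.2, 2.0]
lam = inflation_factor(stats)
adjusted = gc_adjust(6.0, lam)
print(f"lambda = {lam:.3f}, adjusted statistic = {adjusted:.3f}")
```

The median is used in this sketch because it is less sensitive than the mean to the few markers that may be truly associated with the disease. The adjusted statistic would then be compared with the usual 1-df chi-square reference distribution.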

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.
  • Review

MeSH terms

  • Analysis of Variance
  • Bias*
  • Case-Control Studies*
  • Confounding Factors, Epidemiologic
  • Data Interpretation, Statistical*
  • Epidemiologic Studies*
  • Gene Pool
  • Genetic Heterogeneity
  • Genetic Markers / genetics
  • Genetics, Population*
  • Genomics / methods*
  • Genomics / standards
  • Genotype
  • Haplotypes / genetics
  • Humans
  • Linkage Disequilibrium / genetics
  • Models, Genetic*
  • Molecular Epidemiology* / methods*
  • Molecular Epidemiology* / standards
  • Polymorphism, Genetic / genetics*
  • Quantitative Trait, Heritable
  • Reproducibility of Results

Substances

  • Genetic Markers