ComPaSS-GWAS: A method to reduce type I error in genome-wide association studies when replication data are not available

Genet Epidemiol. 2019 Feb;43(1):102-111. doi: 10.1002/gepi.22168. Epub 2018 Oct 18.

Abstract

Results from association studies are traditionally corroborated by replicating the findings in an independent data set. Although replication studies may be comparable for the main trait or phenotype of interest, it is unlikely that secondary phenotypes will be comparable across studies, making replication problematic. Alternatively, there may simply not be a replication sample available because of the nature or frequency of the phenotype. In these situations, an approach based on complementary pairs stability selection for genome-wide association studies (ComPaSS-GWAS) is proposed as an ad hoc alternative to replication. In this method, the sample is randomly split into two conditionally independent halves multiple times (resamples) and a GWAS is performed on each half in each resample. Similar in spirit to testing for association with independent discovery and replication samples, a marker is corroborated if its p-value is significant in both halves of the resample. Simulation experiments were performed for both nongenetic and genetic models. The type I error rate and power of ComPaSS-GWAS were determined and compared to the statistical properties of a traditional GWAS. Simulation results show that the type I error rate decreased as the number of resamples increased with only a small reduction in power and that these results were comparable with those from a traditional GWAS. Blood levels of vitamin pyridoxal 5'-phosphate from the Trinity Student Study (TSS) were used to validate this approach. The results from the validation study were compared to, and were consistent with, those obtained from previously published independent replication data and functional studies.
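The resampling scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the per-marker test (a simple correlation test with a normal approximation), the threshold `alpha`, the function names `marginal_pvalues` and `compass_gwas`, the simulated data, and the choice to summarize results as the fraction of resamples in which a marker is significant in both halves. The abstract does not specify how results are combined across resamples.

```python
import math
import numpy as np

_erfc = np.vectorize(math.erfc)

def marginal_pvalues(genotypes, phenotype):
    """Two-sided p-value per marker from a correlation test.

    Assumption: a normal approximation to the t statistic stands in
    for whatever association test a real GWAS would use.
    """
    n = genotypes.shape[0]
    g = genotypes - genotypes.mean(axis=0)
    y = phenotype - phenotype.mean()
    denom = np.sqrt((g ** 2).sum(axis=0) * (y ** 2).sum())
    with np.errstate(divide="ignore", invalid="ignore"):
        # Markers with no variation in this half get r = 0 (p = 1).
        r = np.where(denom > 0, (g * y[:, None]).sum(axis=0) / denom, 0.0)
    r = np.clip(r, -0.999999, 0.999999)
    z = r * np.sqrt((n - 2) / (1.0 - r ** 2))
    return _erfc(np.abs(z) / math.sqrt(2.0))

def compass_gwas(genotypes, phenotype, n_resamples=20, alpha=1e-4, rng=None):
    """Fraction of resamples in which each marker is significant in BOTH halves."""
    rng = np.random.default_rng() if rng is None else rng
    n, m = genotypes.shape
    half = n // 2
    both_significant = np.zeros(m)
    for _ in range(n_resamples):
        perm = rng.permutation(n)
        a, b = perm[:half], perm[half:2 * half]  # complementary, disjoint halves
        p_a = marginal_pvalues(genotypes[a], phenotype[a])
        p_b = marginal_pvalues(genotypes[b], phenotype[b])
        both_significant += (p_a < alpha) & (p_b < alpha)
    return both_significant / n_resamples

# Hypothetical simulated data: 400 individuals, 50 markers, marker 0 causal.
rng = np.random.default_rng(0)
genotypes = rng.binomial(2, 0.3, size=(400, 50)).astype(float)
phenotype = 1.0 * genotypes[:, 0] + rng.normal(size=400)

freq = compass_gwas(genotypes, phenotype, n_resamples=20, alpha=1e-4, rng=rng)
print("corroboration frequency, causal marker:", freq[0])
print("max corroboration frequency, null markers:", freq[1:].max())
```

In this sketch a null marker must pass the threshold in two disjoint halves of the same resample to be counted, which mirrors the discovery-plus-replication logic the abstract describes. The trade-off is that each half has only half the sample size, so the per-half test is less powerful than a test on the full sample.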

Keywords: GWAS; corroboration; power; replication; type I error.

Publication types

  • Research Support, N.I.H., Intramural

MeSH terms

  • Computer Simulation
  • Genome-Wide Association Study*
  • Humans
  • Models, Genetic
  • Phenotype
  • Polymorphism, Single Nucleotide / genetics
  • Reproducibility of Results