An evaluation of methods correcting for cell-type heterogeneity in DNA methylation studies

Genome Biol. 2016 May 3;17:84. doi: 10.1186/s13059-016-0935-y.

Abstract

Background: Many different methods exist to adjust for variability in cell-type mixture proportions when analyzing data from DNA methylation studies. Here we present the results of an extensive simulation study, built on cell-separated DNA methylation profiles from Illumina Infinium 450K methylation data, to compare the performance of eight methods, including the most commonly used approaches.

Results: We designed a rich multi-layered simulation containing a set of probes with true associations with either binary or continuous phenotypes, confounding by cell type, variability in means and standard deviations for population parameters, additional variability at the level of an individual cell-type-specific sample, and variability in the mixture proportions across samples. Performance varied substantially across methods and simulations. In particular, the number of false positives was sometimes unrealistically high, indicating limited ability to discriminate the true signals from those appearing significant through confounding. Methods that filtered probes consequently had poor power. QQ plots of p values across all tested probes showed that adjustments did not always improve the distribution. The same methods were used to examine associations between smoking and methylation data from a case-control study of colorectal cancer, and we also explored the effect of cell-type adjustments on associations between methylation and rheumatoid arthritis case-control status.
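To make the confounding mechanism concrete, the following is a minimal sketch, not the authors' simulation. It assumes only two cell types, a binary phenotype, and illustrative parameter values. All names (`simulate`, `residuals`, `t_stat`, `run`) and numbers are hypothetical. A few probes carry a true phenotype effect, and the cell-type proportion differs between cases and controls, so null probes with different cell-type means can appear associated unless the proportion is adjusted for.

```python
import math
import random
from statistics import fmean


def simulate(n_samples=200, n_probes=300, n_true=10, seed=0):
    """Two-cell-type mixtures whose proportions differ by phenotype (assumed values)."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n_samples)]  # binary phenotype, alternating control/case
    # Proportion of cell type A; cases carry more of it on average (the confounder).
    w = [min(max(rng.gauss(0.5 + 0.15 * yi, 0.08), 0.0), 1.0) for yi in y]
    probes = []
    for p in range(n_probes):
        mu_a, mu_b = rng.uniform(0.1, 0.9), rng.uniform(0.1, 0.9)  # cell-type means
        effect = 0.05 if p < n_true else 0.0  # first n_true probes are truly associated
        probes.append([wi * mu_a + (1 - wi) * mu_b + effect * yi + rng.gauss(0, 0.02)
                       for wi, yi in zip(w, y)])
    return y, w, probes


def residuals(v, x):
    """Residuals from a least-squares fit of v on x with an intercept."""
    mx, mv = fmean(x), fmean(v)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (vi - mv) for xi, vi in zip(x, v)) / sxx
    return [vi - mv - slope * (xi - mx) for xi, vi in zip(x, v)]


def t_stat(v, x, n_adjusted=0):
    """t statistic for the association of v with x, from their (partial) correlation."""
    n = len(v)
    mx, mv = fmean(x), fmean(v)
    sxx = sum((xi - mx) ** 2 for xi in x)
    svv = sum((vi - mv) ** 2 for vi in v)
    sxv = sum((xi - mx) * (vi - mv) for xi, vi in zip(x, v))
    r = sxv / math.sqrt(sxx * svv)
    return r * math.sqrt((n - 2 - n_adjusted) / (1 - r * r))


def run(threshold=4.0, n_true=10):
    """Count probes passing |t| > threshold, with and without cell-type adjustment."""
    y, w, probes = simulate(n_true=n_true)
    y_res = residuals(y, w)  # phenotype with the cell-type proportion regressed out
    counts = {"tp_unadjusted": 0, "fp_unadjusted": 0, "tp_adjusted": 0, "fp_adjusted": 0}
    for p, row in enumerate(probes):
        kind = "tp" if p < n_true else "fp"
        if abs(t_stat(row, y)) > threshold:
            counts[kind + "_unadjusted"] += 1
        # Adjusted test: regress the proportion out of both variables, then correlate.
        if abs(t_stat(residuals(row, w), y_res, n_adjusted=1)) > threshold:
            counts[kind + "_adjusted"] += 1
    return counts


if __name__ == "__main__":
    print(run())
```

The adjusted test here uses the true mixture proportion, which is an idealization. In real data the proportions are not observed and must be estimated from reference profiles or captured indirectly, which is where the compared methods differ.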

Conclusions: We recommend surrogate variable analysis for cell-type mixture adjustment since performance was stable under all our simulated scenarios.

Keywords: Cell-type mixture; DNA methylation; Deconvolution; Matrix decomposition.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Arthritis, Rheumatoid / genetics
  • Case-Control Studies
  • Cell Separation / methods*
  • Cell Separation / standards
  • Colorectal Neoplasms / genetics
  • Computer Simulation
  • DNA Methylation*
  • Genetic Heterogeneity*
  • Humans
  • Monocytes / cytology
  • Monocytes / metabolism
  • Organ Specificity
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Sequence Analysis, DNA / methods*
  • Sequence Analysis, DNA / standards