The Illumina Infinium HumanMethylation450 BeadChip has emerged as one of the most popular platforms for genome wide profiling of DNA methylation. While the technology is wide-spread, systematic technical biases are believed to be present in the data. For example, this array incorporates two different chemical assays, i.e., Type I and Type II probes, which exhibit different technical characteristics and potentially complicate the computational and statistical analysis. Several normalization methods have been introduced recently to adjust for possible biases. However, there is considerable debate within the field on which normalization procedure should be used and indeed whether normalization is even necessary. Yet despite the importance of the question, there has been little comprehensive comparison of normalization methods. We sought to systematically compare several popular normalization approaches using the Norwegian Mother and Child Cohort Study (MoBa) methylation data set and the technical replicates analyzed with it as a case study. We assessed both the reproducibility between technical replicates following normalization and the effect of normalization on association analysis. Results indicate that the raw data are already highly reproducible, some normalization approaches can slightly improve reproducibility, but other normalization approaches may introduce more variability into the data. Results also suggest that differences in association analysis after applying different normalizations are not large when the signal is strong, but when the signal is more modest, different normalizations can yield very different numbers of findings that meet a weaker statistical significance threshold. Overall, our work provides useful, objective assessment of the effectiveness of key normalization methods.
Keywords: association testing; cotinine exposure; genome wide methylation profiling; normalization; reproducibility.