5-hydroxymethylcytosine (5hmC) is a methylation state linked to gene regulation and commonly found in cells of the central nervous system. 5hmC is associated with demethylation of cytosines from 5-methylcytosine (5mC) to the unmethylated state. The presence of 5hmC can be inferred from a paired experiment in which bisulfite and oxidation-bisulfite treatments are applied to the same sample, followed by a methylation assay on a platform such as the Illumina Infinium MethylationEPIC BeadChip (EPIC). Existing methods for analyzing the resulting EPIC data have limitations: most ignore the correlation between the two experiments and any imprecision associated with DNA damage from the additional treatment. Estimates of 5mC/5hmC levels free from these limitations are desirable to reveal associations between methylation states and phenotypes. We propose a hierarchical Bayesian method called Constrained HYdroxy Methylation Estimation (CHYME) that simultaneously estimates 5mC/5hmC signals and any associations between these signals and covariates or phenotypes, while accounting for the potential impact of DNA damage and the dependencies induced by the experimental design. Simulations show that CHYME controls type 1 error and has better power than a range of alternative methods, including the popular OxyBS method and linear models on transformed proportions. The other methods we examined suffer from severely inflated type 1 error for inference on 5hmC proportions. We use CHYME to explore genome-wide associations between 5mC/5hmC levels and cause of death in postmortem prefrontal cortex brain tissue samples. These analyses indicate that CHYME is a useful tool for revealing phenotypic associations with 5mC/5hmC levels.
Keywords: 5-hydroxymethylcytosine (5hmC); 5-methylcytosine (5mC); hierarchical Bayesian methods; methylation data; paired bisulfite and oxidation-bisulfite treatments; phenotype-methylation associations; prefrontal cortex brain samples.
© 2022 Wiley Periodicals LLC.
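
To illustrate why the paired design described in the abstract calls for constrained estimation, the following minimal sketch computes the naive subtraction estimate of 5hmC. It is not CHYME and not OxyBS. It assumes the standard interpretation of the two treatments: the bisulfite (BS) signal reflects 5mC plus 5hmC, while the oxidation-bisulfite (oxBS) signal reflects 5mC alone. The function name `naive_5hmc` and the beta values are hypothetical and chosen only for illustration.

```python
import numpy as np

def naive_5hmc(beta_bs, beta_oxbs):
    """Naive per-CpG estimates from paired methylation proportions (beta values).

    Assumption: BS beta ~ 5mC + 5hmC, and oxBS beta ~ 5mC only.
    This is plain subtraction, not the CHYME or OxyBS estimator.
    """
    beta_bs = np.asarray(beta_bs, dtype=float)
    beta_oxbs = np.asarray(beta_oxbs, dtype=float)
    five_mc = beta_oxbs             # 5mC proportion
    five_hmc = beta_bs - beta_oxbs  # 5hmC proportion (can be negative under noise)
    unmeth = 1.0 - beta_bs          # unmethylated proportion
    return five_mc, five_hmc, unmeth

# Hypothetical beta values for three CpG sites (illustrative only)
beta_bs = [0.80, 0.50, 0.30]
beta_oxbs = [0.60, 0.50, 0.35]

mc, hmc, c = naive_5hmc(beta_bs, beta_oxbs)

# Count sites where subtraction yields an impossible negative proportion
n_negative = int(np.sum(hmc < 0))
```

At the third site the oxBS value exceeds the BS value, so the subtraction gives a negative 5hmC proportion. Measurement noise can produce such values even when the true proportions are valid, which is one reason an estimator that keeps the three proportions non-negative and summing to one is attractive.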