A varying-coefficient model for the analysis of methylation sequencing data

Comput Biol Chem. 2024 Aug:111:108094. doi: 10.1016/j.compbiolchem.2024.108094. Epub 2024 May 18.

Abstract

DNA methylation is an important epigenetic modification involved in gene regulation. Advances in next-generation sequencing technology have enabled the retrieval of DNA methylation information at single-base resolution. However, due to the sequencing process and the limited amount of isolated DNA, DNA methylation data are often noisy and sparse, which complicates the identification of differentially methylated regions (DMRs), especially when few replicates are available. We present a varying-coefficient model for detecting DMRs using single-base-resolved methylation information. The model simultaneously smooths the methylation profiles and allows detection of DMRs while accounting for additional covariates. The proposed model accounts for possible overdispersion by using a beta-binomial distribution. The overdispersion itself can be modeled as a function of the genomic region and explanatory variables. We illustrate the properties of the proposed model by applying it to two real-life case studies.
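As an illustration only, the two ingredients named in the abstract can be sketched in a few lines of Python: a beta-binomial likelihood for the number of methylated reads at a CpG site, and a mean that varies along the genome through position-dependent coefficients. The abstract does not give the parameterization, link function, or spline basis, so everything here is an assumption: the mean/overdispersion parameterization (mu, phi), the logit link, and the names `betabinom_logpmf`, `methylation_mean`, `beta0`, and `beta1` are hypothetical and are not taken from the paper.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def betabinom_logpmf(k, n, mu, phi):
    """Log-probability of k methylated reads out of n total reads.

    Assumed parameterization (not from the paper): mean proportion mu in (0, 1)
    and overdispersion phi in (0, 1), giving
    Var(k) = n * mu * (1 - mu) * (1 + (n - 1) * phi).
    """
    # Convert (mu, phi) to the usual beta shape parameters.
    a = mu * (1.0 - phi) / phi
    b = (1.0 - mu) * (1.0 - phi) / phi
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)


def methylation_mean(position, group, beta0, beta1):
    """Varying-coefficient mean under an assumed logit link.

    beta0 and beta1 are callables of genomic position; in a fitted model they
    would be smooth functions (for example splines). group is a 0/1 covariate.
    """
    eta = beta0(position) + beta1(position) * group
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficient functions: a flat baseline and a group effect
# that is non-zero only between positions 100 and 200 (a toy "DMR").
def beta0(t):
    return -1.0


def beta1(t):
    return 1.5 if 100 <= t <= 200 else 0.0


mu_inside = methylation_mean(150, 1, beta0, beta1)
mu_outside = methylation_mean(50, 1, beta0, beta1)
loglik = betabinom_logpmf(k=7, n=10, mu=mu_inside, phi=0.1)
```

In this toy setup a region where `beta1` departs from zero plays the role of a DMR: inside it the two groups have different expected methylation proportions, and outside it they coincide. Letting phi depend on position and covariates in the same way would mirror the abstract's statement that the overdispersion itself can be modeled.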

Keywords: Beta-binomial model; CpG site; Differentially methylated regions; Methylation sequencing; Smoothing splines; Varying-coefficient model.

MeSH terms

  • DNA Methylation*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Sequence Analysis, DNA* / methods