The three-dimensional organization of chromatin plays a critical role in gene regulation. Recently developed technologies, such as HiChIP and proximity ligation-assisted ChIP-Seq (PLAC-seq) (hereafter referred to as HP for brevity), can measure chromosome spatial organization by interrogating chromatin interactions mediated by a protein of interest. While offering cost-efficiency over genome-wide unbiased high-throughput chromosome conformation capture (Hi-C) data, HP data remain sparse at kilobase (Kb) resolution with the current sequencing depth in the order of 108 reads per sample. Deep learning models, including HiCPlus, HiCNN, HiCNN2, DeepHiC and Variationally Encoded Hi-C Loss Enhancer (VEHiCLE), have been developed to enhance the sequencing depth of Hi-C data, but their performance on HP data has not been benchmarked. Here, we performed a comprehensive evaluation of HP data sequencing depth enhancement using models developed for Hi-C data. Specifically, we analyzed various HP data, including Smc1a HiChIP data of the human lymphoblastoid cell line GM12878, H3K4me3 PLAC-seq data of four human neural cell types as well as of mouse embryonic stem cells (mESC), and mESC CCCTC-binding factor (CTCF) PLAC-seq data. Our evaluations lead to the following three findings: (i) most models developed for Hi-C data achieve reasonable performance when applied to HP data (e.g. with Pearson correlation ranging 0.76-0.95 for pairs of loci within 300 Kb), and the enhanced datasets lead to improved statistical power for detecting long-range chromatin interactions, (ii) models trained on HP data outperform those trained on Hi-C data and (iii) most models are transferable across cell types. Our results provide a general guideline for HP data enhancement using existing methods designed for Hi-C data.
Keywords: Hi-C; HiChIP; PLAC-seq; deep learning; enhancement; evaluation.
© The Author(s) 2022. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.