Background: Ultra-Deep Sequencing (UDS) enabled identification of specific changes in human genome occurring in malignant tumors, with current approaches calling for the detection of specific mutations associated with certain cancers. However, such associations are frequently idiosyncratic and cannot be generalized for diagnostics. Mitochondrial DNA (mtDNA) has been shown to be functionally associated with several cancer types. Here, we study the association of intra-host mtDNA diversity with Hepatocellular Carcinoma (HCC).
Results: UDS mtDNA exome data from blood of patients with HCC (n = 293) and non-cancer controls (NC, n = 391) were used to: (i) measure the genetic heterogeneity of nucleotide sites from the entire population of intra-host mtDNA variants rather than to detect specific mutations, and (ii) apply machine learning algorithms to develop a classifier for HCC detection. Average total entropy of HCC mtDNA is 1.24-times lower than of NC mtDNA (p = 2.84E-47). Among all polymorphic sites, 2.09% had a significantly different mean entropy between HCC and NC, with 0.32% of the HCC mtDNA sites having greater (p < 0.05) and 1.77% of the sites having lower mean entropy (p < 0.05) as compared to NC. The entropy profile of each sample was used to further explore the association between mtDNA heterogeneity and HCC by means of a Random Forest (RF) classifier The RF-classifier separated 232 HCC and 232 NC patients with accuracy of up to 99.78% and average accuracy of 92.23% in the 10-fold cross-validation. The classifier accurately separated 93.08% of HCC (n = 61) and NC (n = 159) patients in a validation dataset that was not used for the RF parameter optimization.
Conclusions: Polymorphic sites contributing most to the mtDNA association with HCC are scattered along the mitochondrial genome, affecting all mitochondrial genes. The findings suggest that application of heterogeneity profiles of intra-host mtDNA variants from blood may help overcome barriers associated with the complex association of specific mutations with cancer, enabling the development of accurate, rapid, inexpensive and minimally invasive diagnostic detection of cancer.
Keywords: Entropy; Liquid biopsy; Machine learning; Mitochondria.