Screening and diagnosis of colorectal cancer and advanced adenoma by Bionic Glycome method and machine learning

Am J Cancer Res. 2021 Jun 15;11(6):3002-3020. eCollection 2021.

Abstract

Colorectal cancer (CRC), one of the major health problems worldwide, mostly develops from colorectal adenomas. Advanced adenomas are generally considered precancerous lesions, and their removal is recommended. Screening for colorectal cancer is usually performed by fecal tests (FOBT or FIT) and colonoscopy; however, the benefits of these methods are limited by uptake and adherence, and an effective noninvasive screening method for advanced adenomas is currently lacking. N-glycans in human serum hold great potential as biomarkers for the diagnosis of human cancers. Our aim was to discover blood-based markers for the screening and diagnosis of advanced adenomas and CRC, and to ascertain their efficiency in classifying healthy controls, patients with advanced adenomas, and patients with CRC. To this end, we combined machine learning techniques with a reliable and simple quantitative method that uses "Bionic Glycome" as an internal standard and is based on high-throughput Matrix-assisted Laser Desorption/Ionization Mass Spectrometry (MALDI-MS). The quantitative results showed that multi-antennary, sialylated N-glycans are positively correlated with CRC progression, while bi-antennary core-fucosylated N-glycans are negatively correlated with CRC progression. Machine learning is a powerful classification tool, suitable for mining big data, especially the large amounts of data generated by high-throughput technologies. Using the predictive model constructed by machine learning, we obtained a classification accuracy of 75% for 189 samples including CRC, advanced adenomas and healthy controls, and an accuracy of 87% for detection of the disease group that required treatment, comprising CRC and advanced adenomas. The model was also successfully applied to the prediction of 176 samples collected a few months later, and five samples were wrongly predicted in the disease group. Overall, the diagnostic model constructed here has valuable potential for the clinical detection of advanced adenomas and colorectal cancer and could compensate for the limitations of current screening methods for CRC and advanced adenomas.
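
As a rough illustration of how a three-class accuracy and a disease-group accuracy of the kind reported above relate to each other, the following minimal sketch computes both from the same set of predictions. The labels, predictions, class abbreviations (`CRC`, `AA`, `HC`) and function names are hypothetical toy values chosen for this example; they are not data or code from the study, and no particular machine learning model is assumed.

```python
# Toy illustration only: labels and predictions are invented, not study data.
DISEASE = {"CRC", "AA"}  # assumed grouping: classes that require treatment


def accuracy(truth, predicted):
    """Fraction of samples whose predicted label equals the true label."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    if not truth:
        raise ValueError("at least one sample is required")
    correct = sum(t == p for t, p in zip(truth, predicted))
    return correct / len(truth)


def to_binary(labels):
    """Collapse three-class labels into 'disease' vs 'healthy'."""
    return ["disease" if label in DISEASE else "healthy" for label in labels]


# Hypothetical true classes and model outputs for eight samples.
truth = ["CRC", "CRC", "AA", "AA", "HC", "HC", "HC", "CRC"]
predicted = ["CRC", "AA", "AA", "CRC", "HC", "AA", "HC", "CRC"]

three_class_acc = accuracy(truth, predicted)
binary_acc = accuracy(to_binary(truth), to_binary(predicted))

print(f"three-class accuracy: {three_class_acc:.3f}")
print(f"disease vs healthy accuracy: {binary_acc:.3f}")
```

In this toy case, confusions between CRC and advanced adenoma count as errors in the three-class setting but not after the two classes are merged, so the disease-group accuracy is the higher of the two.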

Keywords: Colorectal cancer; advanced adenoma; biomarker; internal standard; machine learning; mass spectrometry; serum N-glycome quantification.