Using Copy Number Variation Data and Neural Networks to Predict Cancer Metastasis Origin Achieves High Area under the Curve Value with a Trade-Off in Precision

Curr Issues Mol Biol. 2024 Aug 1;46(8):8301-8319. doi: 10.3390/cimb46080490.

Abstract

The accurate identification of the primary tumor origin in metastatic cancer cases is crucial for guiding treatment decisions and improving patient outcomes. Copy number alterations (CNAs) and copy number variation (CNV) have emerged as valuable genomic markers for predicting the origin of metastases. However, current models that predict cancer type based on CNV or CNA suffer from low AUC values. To address this challenge, we employed a cutting-edge neural network approach utilizing a dataset comprising CNA profiles from twenty different cancer types. We developed two workflows: the first evaluated the performance of two deep neural networks, one ReLU-based and the other a 2D convolutional network. In the second workflow, we stratified cancer types based on anatomical and physiological classifications, constructing shallow neural networks to differentiate between cancer types within the same cluster. Both approaches demonstrated high AUC values, with deep neural networks achieving a precision of 60%, suggesting a mathematical relationship between CNV type, location, and cancer type. Our findings highlight the potential of using CNA/CNV to aid pathologists in accurately identifying cancer origins with accessible clinical tests.
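
The contrast between a high AUC and a more modest precision can be illustrated on toy data. The following is a minimal sketch, not the authors' evaluation code: the function names, the three-class toy scores, and the convention that a class that is never predicted contributes a precision of 0 are all assumptions made for illustration. It computes a macro-averaged one-vs-rest AUC (how well each class's score ranks its own samples above the rest) and a macro-averaged precision (how often the top-scoring class is actually correct).

```python
from typing import List, Sequence


def binary_auc(is_positive: Sequence[bool], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as half a win."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(labels: Sequence[int], scores: Sequence[Sequence[float]]) -> float:
    """Unweighted mean of one-vs-rest AUCs, one per class (score column)."""
    n_classes = len(scores[0])
    aucs = [
        binary_auc([y == c for y in labels], [row[c] for row in scores])
        for c in range(n_classes)
    ]
    return sum(aucs) / n_classes


def argmax_predictions(scores: Sequence[Sequence[float]]) -> List[int]:
    """Predicted class = index of the highest score in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


def macro_precision(labels: Sequence[int], preds: Sequence[int], n_classes: int) -> float:
    """Unweighted mean of per-class precision TP / (TP + FP).
    Assumed convention: a class that is never predicted contributes 0."""
    total = 0.0
    for c in range(n_classes):
        true_labels_of_predicted = [y for y, p in zip(labels, preds) if p == c]
        if true_labels_of_predicted:
            hits = sum(1 for y in true_labels_of_predicted if y == c)
            total += hits / len(true_labels_of_predicted)
    return total / n_classes


# Hypothetical toy scores: three samples, three classes, one sample per class.
# Class 0 always receives the largest score, but within every column the
# sample that truly belongs to that class still has the highest value.
TOY_LABELS = [0, 1, 2]
TOY_SCORES = [
    [0.9, 0.05, 0.05],
    [0.6, 0.30, 0.10],
    [0.6, 0.10, 0.30],
]

if __name__ == "__main__":
    preds = argmax_predictions(TOY_SCORES)
    print("macro one-vs-rest AUC:", macro_ovr_auc(TOY_LABELS, TOY_SCORES))
    print("macro precision:", macro_precision(TOY_LABELS, preds, n_classes=3))
```

On these toy scores every class ranks its own sample first within its column, so the macro one-vs-rest AUC is 1.0, yet the top-scoring class is always class 0, so the macro precision is only 1/9. The toy case is deliberately extreme; it shows only that the two metrics answer different questions, which is one way a multi-class model can report a high AUC alongside a lower precision.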

Keywords: AI; CNV; copy number variant.

Grants and funding

We would like to acknowledge the financial support provided by PM Forskningscenter (grant number 067).