Background: In studies that identify prognostic markers in cancer, tumor-adjacent normal tissues typically serve only as a baseline for measuring expression differences against tumor tissues rather than as a main target of investigation. Accordingly, previous studies have performed differential expression analysis between tumors and adjacent normal tissues before prognostic analysis. However, recent studies have suggested that the prognostic relevance of differentially expressed genes (DEGs) is insignificant for some cancers, which challenges this conventional approach.
Methods: This study investigated the prognostic efficacy of transcriptomic data from tumors and adjacent normal tissues using The Cancer Genome Atlas dataset. Prognostic analysis was performed using Cox regression models, and survival prediction was performed using machine-learning models and feature selection methods.
Results: For kidney, liver, and head and neck cancers, adjacent normal tissues harbored higher proportions of prognostic genes and yielded better survival prediction performance in machine-learning models than tumor tissues and DEGs. Furthermore, applying a distance correlation-based feature selection method to kidney and liver cancer with external datasets showed that the genes selected from adjacent normal tissues achieved higher prediction performance than those selected from tumor tissues. These results suggest that the expression levels of genes in adjacent normal tissues are potential prognostic markers. The source code of this study is available at https://github.com/DMCB-GIST/Survival_Normal.
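As an illustration of what distance correlation-based feature selection can look like, the following is a minimal sketch and not the implementation in the repository above. It assumes a samples-by-genes expression matrix and a plain numeric outcome (for example, observed survival time with censoring ignored), which is a simplification. The function names `distance_correlation` and `rank_genes` are hypothetical and introduced only for this example.

```python
import numpy as np


def distance_correlation(x, y):
    """Sample distance correlation between two 1-D numeric arrays."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    # Pairwise absolute-difference (distance) matrices.
    a = np.abs(x - x.T)
    b = np.abs(y - y.T)
    # Double-center each matrix: subtract row and column means, add grand mean.
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()
    dcov2 = (A * B).mean()   # squared sample distance covariance
    dvar_x = (A * A).mean()  # squared sample distance variance of x
    dvar_y = (B * B).mean()  # squared sample distance variance of y
    denom = np.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        # A constant variable has no dependence to measure.
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def rank_genes(expr, outcome, top_k):
    """Rank genes (columns of expr) by distance correlation with the outcome.

    Assumption: `outcome` is a plain numeric vector; censoring is ignored here.
    Returns the indices of the top_k genes and the score of every gene.
    """
    expr = np.asarray(expr, dtype=float)
    scores = np.array(
        [distance_correlation(expr[:, j], outcome) for j in range(expr.shape[1])]
    )
    order = np.argsort(scores)[::-1][:top_k]
    return order, scores
```

Under these assumptions, `rank_genes(expr, outcome, top_k=10)` would return the column indices of the ten genes most strongly dependent on the outcome. Unlike Pearson correlation, distance correlation is zero only under independence, so it can also detect nonlinear associations.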
Keywords: machine learning; survival prediction; tumor adjacent normal tissues.
© 2023 The Authors. Cancer Medicine published by John Wiley & Sons Ltd.