Background: Diffuse gliomas are a group of diseases with varying degrees of malignancy and complex heterogeneity. Previous studies have proposed biomarkers for particular glioma grades, but few have systematically analyzed different grades to search for molecular markers.
Methods: Weighted gene co-expression network analysis (WGCNA) was used to find genes significantly associated with malignant progression of diffuse glioma in TCGA glioma sequencing expression data and in a merged meta-dataset of GEO expression profiles. Lasso regression was used to build candidate models, and the best model was selected by CPE, IDI, and C-index. A risk score model was used to evaluate the prognostic power of the gene signature. Multi-omics data, including CNV, methylation, clinical traits, and mutation, were used for model evaluation.
Results: WGCNA identified 67 genes significantly associated with malignant progression of diffuse glioma. From these, we established a new four-gene molecular marker (KDELR2, EMP3, TIMP1, and TAGLN2). Multivariate Cox analysis identified the four-gene risk score as an independent predictor of prognosis in patients with diffuse gliomas, and its predictive power was independent of the histopathological grade of the glioma. The risk score retained good predictive power in five independent test datasets. Combining the prognosis model with specific molecular characteristics improved predictive power. Furthermore, by combining IDH1 mutation status with ATRX mutation status, we divided the low-risk group into three subtypes, LowRisk_IDH1wt, LowRisk_IDH1mut/ATRXmut, and LowRisk_IDH1mut/ATRXwt, which differed markedly in survival. Further analysis showed that the four-gene prognosis model was associated with multi-omics features.
Conclusion: We established a malignancy-related four-gene molecular marker from glioma expression profiles drawn from multiple microarray and sequencing datasets. The four markers had good predictive power for the overall survival of glioma patients and were associated with the clinical and genetic background of gliomas, including clinical features, gene mutations, methylation, CNV, and signaling pathways.
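The risk score model described in the Methods can be illustrated with a minimal sketch. It assumes the common form of such a score, a weighted sum of each gene's expression multiplied by its regression coefficient, and a median cutoff for splitting patients into high-risk and low-risk groups. The coefficient values, the median cutoff, and the function names `risk_score` and `stratify` are hypothetical placeholders for illustration only; the fitted coefficients and the actual cutoff are not stated in this abstract.

```python
from statistics import median

# Hypothetical placeholder coefficients for the four signature genes.
# These are NOT the fitted values from the study.
COEFFICIENTS = {"KDELR2": 0.30, "EMP3": 0.20, "TIMP1": 0.10, "TAGLN2": 0.25}


def risk_score(expression):
    """Weighted sum of expression values over the four signature genes."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def stratify(samples):
    """Label each sample 'high' or 'low' risk relative to the cohort median.

    The median cutoff is an assumption; samples at the median are labeled 'low'.
    """
    scores = {sample_id: risk_score(expr) for sample_id, expr in samples.items()}
    cutoff = median(scores.values())
    return {
        sample_id: "high" if score > cutoff else "low"
        for sample_id, score in scores.items()
    }


# Toy cohort with made-up expression values.
toy_cohort = {
    "s1": {"KDELR2": 1.0, "EMP3": 1.0, "TIMP1": 1.0, "TAGLN2": 1.0},
    "s2": {"KDELR2": 2.0, "EMP3": 2.0, "TIMP1": 2.0, "TAGLN2": 2.0},
    "s3": {"KDELR2": 3.0, "EMP3": 3.0, "TIMP1": 3.0, "TAGLN2": 3.0},
}
groups = stratify(toy_cohort)
```

In a real analysis the coefficients would come from the selected Lasso-penalized Cox model, and the resulting groups would then be compared by survival analysis.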
Keywords: bioinformatics; diffuse glioma; expression data; prognosis model; systematic analysis.
© The author(s).