Deep Learning data integration for better risk stratification models of bladder cancer

AMIA Jt Summits Transl Sci Proc. 2018 May 18:2017:197-206. eCollection 2018.

Abstract

We propose an unsupervised multi-omics integration pipeline, based on a deep-learning autoencoder, to predict survival subtypes in bladder cancer (BC). We used the TCGA dataset, comprising mRNA, miRNA, and methylation data, to infer two survival subtypes. We then constructed a supervised classification model to predict the survival subgroup of any new individual sample. Our training data yielded two subgroups with significantly different survival (p-value=8e-4), where the high-risk survival subgroup was enriched for KRT6/14 overexpression and PI3K-Akt pathways. We tested the robustness of the model by randomly splitting the main dataset into multiple training and test folds, which gave significant p-values overall. We then successfully inferred the subtypes for a subset of samples held out as a test dataset (p-value=0.03). We further applied our pipeline to predict the survival subgroups in a separate validation dataset with miRNA data (p-value=0.02). In conclusion, the present pipeline is an effective approach to inferring the survival subtype of a new sample, as exemplified by BC.
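The p-values above compare survival between two inferred subgroups. The abstract does not say which test produced them; the log-rank test is a common choice for this comparison and is assumed here. The following is a minimal sketch in plain Python, in which the function name `logrank_test` and the toy follow-up data are hypothetical and not taken from the study.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test (assumed here; not confirmed by the abstract).

    times_*  : follow-up times for each sample in the group
    events_* : 1 if death was observed, 0 if the sample was censored
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    # Pool the samples, tagging group membership (0 = group A, 1 = group B).
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]

    # Distinct times at which at least one death was observed.
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n = sum(1 for ti, _, _ in data if ti >= t)                # at risk, both groups
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)   # at risk, group A
        d = sum(1 for ti, e, _ in data if ti == t and e)          # deaths, both groups
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)

        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the group A death count at time t.
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0

    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical toy data: a "high-risk" group with early deaths and a
# "low-risk" group with late deaths (all events observed, no censoring).
high_risk_times, high_risk_events = [1, 2, 3, 4, 5], [1, 1, 1, 1, 1]
low_risk_times, low_risk_events = [10, 11, 12, 13, 14], [1, 1, 1, 1, 1]
chi2, p_value = logrank_test(high_risk_times, high_risk_events,
                             low_risk_times, low_risk_events)
print(f"chi2={chi2:.2f}, p={p_value:.4f}")
```

A small p-value indicates that the two subgroups differ in survival. In practice, a maintained survival-analysis library would usually be preferable to a hand-rolled implementation like this one.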