Joint multi-omics discriminant analysis with consistent representation learning using PANDA

Muhammad Aminu; Lingzhi Hong; Natalie Vokes; Stephanie T Schmidt; Maliazurina Saad; Bo Zhu; Xiuning Le; Cascone Tina; Ajay Sheshadri; Bo Wang; David Jaffray; Andy Futreal; J Jack Lee; Lauren A Byers; Don Gibbons; John Heymach; Ken Chen; Chao Cheng; Jianjun Zhang; Jia Wu

doi:10.21203/rs.3.rs-4353037/v1

Res Sq [Preprint]. 2024 May 17:rs.3.rs-4353037. doi: 10.21203/rs.3.rs-4353037/v1.

Authors

Muhammad Aminu¹, Lingzhi Hong^{2,1}, Natalie Vokes², Stephanie T Schmidt², Maliazurina Saad¹, Bo Zhu², Xiuning Le², Cascone Tina², Ajay Sheshadri³, Bo Wang⁴, David Jaffray⁵, Andy Futreal⁶, J Jack Lee⁷, Lauren A Byers², Don Gibbons², John Heymach², Ken Chen⁸, Chao Cheng⁹, Jianjun Zhang², Jia Wu^{1,2}

Affiliations

¹ Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.
² Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.
³ Department of Pulmonary Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.
⁴ Department of Medical Biophysics, University of Toronto, Ontario, Canada.
⁵ Office of the Chief Technology and Digital Officer, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.
⁶ Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.
⁷ Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.
⁸ Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.
⁹ Department of Medicine, Institution of Clinical and Translational Research, Baylor College of Medicine, Houston, TX, USA.

Abstract

Integrative multi-omics analysis provides deeper insight into the underlying biology and causes of disease, and enables more realistic modeling of them, than single-omics analysis does. Several integrative multi-omics methods have been proposed and have shown promising results in integrating distinct omics datasets. However, the distributions of different omics data types are inconsistent because of technological variation, which poses a challenge for paired integrative multi-omics methods. In addition, existing discriminant analysis-based integrative methods do not effectively exploit correlation and consistent discriminant structures, so using them requires a compromise between correlation and discrimination. Herein we present PAN-omics Discriminant Analysis (PANDA), a joint discriminant analysis method that seeks omics-specific discriminant common spaces by jointly learning consistent discriminant latent representations for each omics. PANDA jointly maximizes between-class and minimizes within-class omics variation in a common space while simultaneously modeling the relationships among omics at the consistency representation and cross-omics correlation levels. This removes the compromise between discrimination and correlation that existing integrative multi-omics methods require. Because consistency representation learning is incorporated into PANDA's objective function, the method seeks a common discriminant space that minimizes the differences in distributions among omics, can yield more robust latent representations than other methods, and is robust to inconsistency among the different omics. We compared PANDA with 10 other state-of-the-art multi-omics data integration methods on both simulated and real-world multi-omics datasets and found that PANDA consistently outperformed them while providing meaningful discriminant latent representations.
PANDA is implemented in both R and MATLAB, with code available at https://github.com/WuLabMDA/PANDA.
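
The two quantities the abstract refers to, discrimination (between-class versus within-class variation) and cross-omics correlation, can be illustrated with a minimal Python/NumPy sketch. This is not the PANDA objective or the authors' implementation (which is in R and MATLAB at the repository above); it only computes the classical Fisher scatter ratio and a Pearson correlation between projections of two views. The function names, the synthetic data, and the variable names `X_rna` and `X_prot` are assumptions made for illustration.

```python
import numpy as np


def scatter_matrices(X, y):
    """Between-class (S_b) and within-class (S_w) scatter of X (samples x features)."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    S_b = np.zeros((d, d))
    S_w = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        diff = (mean_c - mean_all)[:, None]
        S_b += Xc.shape[0] * (diff @ diff.T)   # class size times outer product of mean shift
        centered = Xc - mean_c
        S_w += centered.T @ centered           # spread of samples around their class mean
    return S_b, S_w


def fisher_ratio(X, y, w):
    """Classical discriminant criterion along direction w: larger means better class separation."""
    S_b, S_w = scatter_matrices(X, y)
    return float(w @ S_b @ w) / float(w @ S_w @ w)


def cross_view_correlation(X1, w1, X2, w2):
    """Pearson correlation between 1-D projections of two paired views."""
    return float(np.corrcoef(X1 @ w1, X2 @ w2)[0, 1])


# Hypothetical paired data: 100 samples, two classes, two "omics" views.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 50)
X_rna = rng.normal(size=(100, 5))
X_prot = rng.normal(size=(100, 4))
X_rna[:, 0] += 3.0 * y    # class signal lives in feature 0 of each view
X_prot[:, 0] += 3.0 * y

w_rna = np.eye(5)[0]      # direction aligned with the class signal
w_prot = np.eye(4)[0]
w_noise = np.eye(5)[1]    # direction carrying no class signal

ratio_signal = fisher_ratio(X_rna, y, w_rna)
ratio_noise = fisher_ratio(X_rna, y, w_noise)
corr = cross_view_correlation(X_rna, w_rna, X_prot, w_prot)

print(f"Fisher ratio (signal direction): {ratio_signal:.3f}")
print(f"Fisher ratio (noise direction):  {ratio_noise:.3f}")
print(f"Cross-view correlation:          {corr:.3f}")
```

In this toy setting the direction that separates the classes in each view is also the direction along which the two views correlate. With real omics data the two criteria can pull in different directions, which is the compromise the abstract says PANDA is designed to avoid.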

Publication types

Preprint
