SNMF: Integrated Learning of Mutational Signatures and Prediction of DNA Repair Deficiencies

bioRxiv [Preprint]. 2024 Nov 28:2024.11.27.624656. doi: 10.1101/2024.11.27.624656.

Abstract

Motivation: Many tumours show deficiencies in DNA damage response (DDR), which influence tumorigenesis and progression but also expose vulnerabilities with therapeutic potential. Assessing which patients might benefit from DDR-targeting therapy requires knowledge of tumour DDR deficiency status, for which mutational signatures are reportedly better predictors than loss-of-function mutations in select genes. However, signatures are identified independently using unsupervised learning, and are therefore not optimised to distinguish between different pathway or gene deficiencies.

Results: We propose SNMF, a supervised non-negative matrix factorisation that jointly optimises the learning of signatures: (1) shared across samples, and (2) predictive of DDR deficiency. We applied SNMF to mutation profiles of human induced pluripotent cell lines carrying gene knockouts linked to three DDR pathways. The SNMF model achieved high accuracy (0.971) and learned more complete signatures of the DDR status of a sample, further discerning distinct mechanisms within a pathway. Cell line SNMF signatures recapitulated tumour-derived COSMIC signatures and predicted DDR pathway deficiency of TCGA tumours with high recall, suggesting that SNMF-like models can leverage libraries of induced DDR deficiencies to decipher intricate DDR signatures underlying patient tumours.
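The joint objective described above can be illustrated with a minimal sketch. This is not the authors' implementation: the loss form (squared reconstruction error plus a weighted softmax cross-entropy on the exposures), the weight `lam`, the projected gradient descent optimiser, and all function and variable names are assumptions made for illustration only. It assumes the mutation profiles in `X` are non-negative and scaled to roughly unit range (for example, per-sample proportions).

```python
import numpy as np


def softmax(z):
    """Row-wise softmax with a max-shift for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def supervised_nmf(X, y, n_signatures, n_classes, lam=1.0, lr=0.02,
                   n_iter=500, seed=0):
    """Hypothetical supervised NMF: factorise X ~ E @ S with E, S >= 0
    while a softmax classifier on the exposures E predicts the labels y.

    X: (samples x mutation types) non-negative matrix.
    y: integer class labels (e.g. DDR pathway index), one per sample.
    Returns exposures E, signatures S, classifier weights B, loss history.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    E = rng.random((n, n_signatures))        # per-sample exposures
    S = rng.random((n_signatures, m))        # shared signatures
    B = np.zeros((n_signatures, n_classes))  # classifier weights
    Y = np.eye(n_classes)[y]                 # one-hot labels
    history = []
    for _ in range(n_iter):
        R = E @ S - X                        # reconstruction residual
        P = softmax(E @ B)                   # predicted class probabilities
        recon = (R ** 2).sum() / n
        xent = -np.log(P[np.arange(n), y] + 1e-12).mean()
        history.append(recon + lam * xent)
        G = (P - Y) / n                      # d(cross-entropy)/d(logits)
        grad_E = 2.0 * R @ S.T / n + lam * G @ B.T
        grad_S = 2.0 * E.T @ R / n
        grad_B = lam * E.T @ G
        # Gradient step, then projection onto the non-negative orthant.
        E = np.maximum(E - lr * grad_E, 0.0)
        S = np.maximum(S - lr * grad_S, 0.0)
        B = B - lr * grad_B
    return E, S, B, history


if __name__ == "__main__":
    # Synthetic stand-in data; not real mutation profiles.
    rng = np.random.default_rng(1)
    X = rng.random((30, 12))
    y = rng.integers(0, 3, size=30)
    E, S, B, history = supervised_nmf(X, y, n_signatures=4, n_classes=3)
    print(history[0], history[-1])
```

In this sketch the exposures receive gradients from both terms, so the signatures are pushed to reconstruct the profiles and to separate the classes at the same time. Predicting the status of unseen samples would additionally require inferring their exposures with the signatures held fixed, which is omitted here.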

Availability: https://github.com/joanagoncalveslab/SNMF

Publication types

  • Preprint