Improved enzyme functional annotation prediction using contrastive learning with structural inference

Commun Biol. 2024 Dec 23;7(1):1690. doi: 10.1038/s42003-024-07359-z.

Abstract

Deep learning has advanced rapidly across scientific disciplines in recent years. Predicting enzyme function remains a prominent challenge in this area, and numerous computational methods, many of them based on deep learning, have been developed to address it. However, most of these methods use either amino acid sequence data or protein structure data, neglecting the potential synergy of combining both modalities. To address this gap, we propose CLEAN-Contact, a Contrastive Learning framework for Enzyme functional ANnotation prediction that combines protein amino acid sequences with Contact maps. We evaluate CLEAN-Contact against state-of-the-art enzyme function prediction models on multiple benchmark datasets. Using CLEAN-Contact, we also predict previously unknown enzyme functions within the proteome of Prochlorococcus marinus MED4. Our findings show that CLEAN-Contact outperforms these models, marking a step forward in enzyme function prediction accuracy.
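
To make the general idea concrete, the following is a minimal sketch of contrastive learning over fused sequence and contact-map features. It is not the authors' implementation. The abstract does not say how the two modalities are combined or which loss is used, so the fusion by concatenation, the triplet margin loss, and the nearest-center prediction rule are all assumptions, as are the function names, the toy vectors, and the placeholder labels "EC_A" and "EC_B".

```python
import math

def combine(seq_emb, contact_emb):
    """Fuse a sequence feature vector and a contact-map feature vector.
    Concatenation is an assumption made for this sketch."""
    return list(seq_emb) + list(contact_emb)

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Contrastive (triplet) objective: the loss is zero once the positive
    is closer to the anchor than the negative by at least `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

def nearest_label(query, centers):
    """Return the label whose center embedding is closest to the query."""
    return min(centers, key=lambda label: euclidean(query, centers[label]))

# Toy, made-up feature vectors (hypothetical values, not real protein data).
anchor = combine([1.0, 0.0], [0.0, 1.0])
positive = combine([0.9, 0.1], [0.1, 0.9])   # same function as the anchor
negative = combine([0.0, 1.0], [1.0, 0.0])   # different function

loss = triplet_margin_loss(anchor, positive, negative)
centers = {"EC_A": positive, "EC_B": negative}  # placeholder class centers
predicted = nearest_label(anchor, centers)
print(loss, predicted)
```

In a trained model the feature vectors would come from learned encoders and the loss would drive parameter updates; here the vectors are fixed so that only the objective and the distance-based prediction step are illustrated.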

MeSH terms

  • Amino Acid Sequence
  • Computational Biology / methods
  • Databases, Protein
  • Deep Learning*
  • Enzymes* / chemistry
  • Enzymes* / metabolism
  • Molecular Sequence Annotation / methods
  • Prochlorococcus / enzymology
  • Prochlorococcus / genetics
  • Proteome

Substances

  • Enzymes
  • Proteome