Learning Protein Embedding to Improve Protein Fold Recognition Using Deep Metric Learning

J Chem Inf Model. 2022 Sep 12;62(17):4283-4291. doi: 10.1021/acs.jcim.2c00959. Epub 2022 Aug 25.

Abstract

Protein fold recognition, the task of predicting the most likely fold type of a query protein, is a critical step in protein structure and function prediction. With the growing use of deep learning in bioinformatics, protein fold recognition has made impressive progress. In this study, to extract fold-specific features and improve protein fold recognition, we proposed a unified deep metric learning framework based on a joint loss function, termed NPCFold. We also proposed an integrated machine learning model based on the similarity of proteins across various properties, termed NPCFoldpro. Benchmark experiments show that both NPCFold and NPCFoldpro outperform existing protein fold recognition methods at the fold level, indicating that the proposed strategies of fusing loss functions and fusing features can improve fold recognition performance.
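
As an illustration only, the general idea of training an embedding with a joint loss and then recognizing a fold by embedding similarity can be sketched as below. The abstract does not name the individual loss terms, so the triplet and cross-entropy terms, the `weight` parameter, and every function name here are assumptions chosen for the example, not the NPCFold implementation.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products act as cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Mean hinge loss on squared Euclidean distances (assumed metric term)."""
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))


def cross_entropy(logits, labels):
    """Mean softmax cross-entropy (assumed classification term)."""
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels, dtype=int)
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-np.mean(log_probs[np.arange(len(labels)), labels]))


def joint_loss(anchor, positive, negative, logits, labels, weight=0.5):
    """Weighted sum of the two terms; `weight` is a hypothetical hyperparameter."""
    metric_term = triplet_loss(anchor, positive, negative)
    class_term = cross_entropy(logits, labels)
    return weight * metric_term + (1.0 - weight) * class_term


def recognize_fold(query_emb, template_embs, template_folds):
    """Return the fold label of the template most similar to the query."""
    query = l2_normalize(np.asarray(query_emb, dtype=float)[None, :])[0]
    templates = l2_normalize(np.asarray(template_embs, dtype=float))
    sims = templates @ query  # cosine similarity to every template
    return template_folds[int(np.argmax(sims))]
```

In this sketch, a query protein is assigned the fold of its nearest template in the learned embedding space; a trained network would supply the embeddings and logits that are passed in here as plain arrays.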

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology* / methods
  • Machine Learning
  • Proteins* / chemistry

Substances

  • Proteins