DeepFrag-k: a fragment-based deep learning approach for protein fold recognition

BMC Bioinformatics. 2020 Nov 18;21(Suppl 6):203. doi: 10.1186/s12859-020-3504-z.

Abstract

Background: Protein fold recognition is one of the central problems in structural bioinformatics. In this paper, we design a novel deep learning architecture, called DeepFrag-k, which identifies fold-discriminative features at the fragment level to improve the accuracy of protein fold recognition. DeepFrag-k is composed of two stages. The first stage employs a multi-modal Deep Belief Network (DBN) to predict the potential structural fragments for a given sequence; the prediction is represented as a fragment vector. The second stage uses a deep convolutional neural network (CNN) to classify the fragment vector into the corresponding fold.
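
The two-stage data flow described above can be sketched roughly as follows. This is only an illustration of the shapes involved, not the authors' implementation: the multi-modal DBN is replaced by a single random sigmoid layer, the deep CNN by one 1D convolution plus a dense softmax layer, and all weights are untrained. The amino acid composition features, the values of `K_FRAGMENTS`, `N_FOLDS` and `KERNEL_WIDTH`, the example sequence, and every function name are assumptions introduced for the sketch.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical sizes: 100 echoes the "top-100" fragments in the Results;
# the number of folds and the kernel width are arbitrary choices.
K_FRAGMENTS = 100
N_FOLDS = 7
KERNEL_WIDTH = 5


def composition_features(sequence):
    """Assumed input features: fraction of each of the 20 amino acids."""
    counts = [sequence.count(a) for a in AMINO_ACIDS]
    total = max(sum(counts), 1)
    return [c / total for c in counts]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def random_matrix(rng, rows, cols):
    return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]


def stage1_fragment_vector(features, weights):
    """Stand-in for the DBN: one score in (0, 1) per fragment type."""
    return [sigmoid(sum(w * f for w, f in zip(row, features))) for row in weights]


def conv1d_relu(vector, kernel):
    """Valid 1D cross-correlation followed by ReLU."""
    width = len(kernel)
    return [
        max(sum(kernel[j] * vector[i + j] for j in range(width)), 0.0)
        for i in range(len(vector) - width + 1)
    ]


def stage2_fold_probabilities(fragment_vector, kernel, dense):
    """Stand-in for the CNN: convolution, dense layer, softmax over folds."""
    hidden = conv1d_relu(fragment_vector, kernel)
    scores = [sum(w * h for w, h in zip(row, hidden)) for row in dense]
    return softmax(scores)


rng = random.Random(0)  # fixed seed so the untrained weights are reproducible
stage1_weights = random_matrix(rng, K_FRAGMENTS, len(AMINO_ACIDS))
conv_kernel = [rng.gauss(0.0, 0.5) for _ in range(KERNEL_WIDTH)]
dense_weights = random_matrix(rng, N_FOLDS, K_FRAGMENTS - KERNEL_WIDTH + 1)

sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # arbitrary illustrative string
fragment_vector = stage1_fragment_vector(composition_features(sequence), stage1_weights)
fold_probs = stage2_fold_probabilities(fragment_vector, conv_kernel, dense_weights)
predicted_fold = max(range(N_FOLDS), key=lambda i: fold_probs[i])
```

Because the weights are random, `predicted_fold` carries no biological meaning here. The point is only that stage 1 turns a sequence into a fixed-length fragment vector, and stage 2 consumes that vector alone to produce a distribution over folds.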

Results: Our results show that DeepFrag-k yields 92.98% accuracy in predicting the top-100 most popular fragments, which can be used to generate discriminative fragment feature vectors to improve protein fold recognition.

Conclusions: A set of fragments can serve as structural "keywords" that distinguish between major protein folds. The deep learning architecture in DeepFrag-k accurately identifies these fragments as structural features to improve protein fold recognition.

Keywords: Deep learning; Fold recognition; Protein fragments.

MeSH terms

  • Computational Biology*
  • Deep Learning*
  • Neural Networks, Computer
  • Protein Folding*
  • Proteins

Substances

  • Proteins