DeepFrag-k: a fragment-based deep learning approach for protein fold recognition

Wessam Elhefnawy; Min Li; Jianxin Wang; Yaohang Li

doi:10.1186/s12859-020-3504-z


BMC Bioinformatics. 2020 Nov 18;21(Suppl 6):203. doi: 10.1186/s12859-020-3504-z.

Authors

Wessam Elhefnawy¹, Min Li², Jianxin Wang², Yaohang Li³

Affiliations

¹ Department of Computer Science, Old Dominion University, Norfolk, U.S.A.
² Department of Computer Science, Central South University, Changsha, China.
³ Department of Computer Science, Old Dominion University, Norfolk, U.S.A. yaohang@cs.odu.edu.

Abstract

Background: Protein fold recognition is one of the most essential problems in structural bioinformatics. In this paper, we design a novel deep learning architecture, DeepFrag-k, which identifies fold-discriminative features at the fragment level to improve the accuracy of protein fold recognition. DeepFrag-k is composed of two stages: the first employs a multi-modal Deep Belief Network (DBN) to predict the potential structural fragments for a given sequence, represented as a fragment vector; the second uses a deep convolutional neural network (CNN) to classify the fragment vector into the corresponding fold.
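The two-stage data flow described above (sequence, then fragment vector, then fold label) can be sketched as follows. This is a minimal illustration of the interfaces only, not the authors' implementation: the toy motif matcher stands in for the multi-modal DBN, the linear scorer stands in for the deep CNN, and all names, motifs, weights, and fold labels (`recognize_fold`, `make_toy_fragment_model`, `fold_a`, and so on) are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical type aliases; the paper does not prescribe these names.
FragmentModel = Callable[[str], List[float]]   # sequence -> k fragment scores
FoldModel = Callable[[Sequence[float]], str]   # fragment vector -> fold label


def make_toy_fragment_model(motifs: Sequence[str]) -> FragmentModel:
    """Stand-in for stage 1 (the paper uses a multi-modal DBN).

    Scores each of the k "fragments" as 1.0 if a toy sequence motif occurs
    in the input and 0.0 otherwise. Real fragments are structural; sequence
    motifs are used here only to keep the sketch self-contained.
    """
    def model(sequence: str) -> List[float]:
        return [1.0 if motif in sequence else 0.0 for motif in motifs]
    return model


def make_toy_fold_model(weights: Dict[str, Sequence[float]]) -> FoldModel:
    """Stand-in for stage 2 (the paper uses a deep CNN).

    Returns the fold whose weight vector has the largest dot product
    with the fragment vector.
    """
    def model(fragment_vector: Sequence[float]) -> str:
        def score(fold: str) -> float:
            return sum(w * x for w, x in zip(weights[fold], fragment_vector))
        return max(weights, key=score)
    return model


def recognize_fold(sequence: str,
                   fragment_model: FragmentModel,
                   fold_model: FoldModel) -> str:
    """Chain the two stages: sequence -> fragment vector -> fold label."""
    fragment_vector = fragment_model(sequence)
    return fold_model(fragment_vector)


# Toy setup with k = 3 made-up fragments and two made-up folds.
toy_fragments = make_toy_fragment_model(["AAA", "GPG", "KKL"])
toy_folds = make_toy_fold_model({
    "fold_a": [1.0, 0.0, 0.5],
    "fold_b": [0.0, 1.0, 0.0],
})

print(toy_fragments("MKKLAAAV"))                          # fragment vector
print(recognize_fold("MKKLAAAV", toy_fragments, toy_folds))
```

Keeping the fragment vector as the only value passed between the stages mirrors the design in the abstract: the second stage sees only fragment-level features, so either stage could in principle be replaced without changing the other.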

Results: Our results show that DeepFrag-k yields 92.98% accuracy in predicting the top-100 most popular fragments, which can be used to generate discriminative fragment feature vectors to improve protein fold recognition.

Conclusions: There is a set of fragments that can serve as structural "keywords" distinguishing between major protein folds. The deep learning architecture in DeepFrag-k is able to accurately identify these fragments as structural features to improve protein fold recognition.

Keywords: Deep learning; Fold recognition; Protein fragments.

MeSH terms

Computational Biology*
Deep Learning*
Neural Networks, Computer
Protein Folding*
Proteins

Substances

Proteins
