TMPpred: A support vector machine-based thermophilic protein identifier

Anal Biochem. 2022 May 15:645:114625. doi: 10.1016/j.ab.2022.114625. Epub 2022 Feb 23.

Abstract

Motivation: Thermostability allows proteins to overcome temperature limits and perform a wider range of functions. Using machine learning, we explored the mechanisms underlying protein thermostability.

Results: Unlike other methods that pursue only model performance, we aim to identify important features that can serve as a reference for in vitro experiments. We framed the task as a binary classification problem, namely distinguishing thermophilic proteins from nonthermophilic proteins. Through support vector machine-based model construction and analysis, we inferred that Gly, Ala, Ser and Thr may be the most important components at the residue level in determining the thermal stability of proteins. Our proposed model achieves a sensitivity (Sn) of 0.892, a specificity (Sp) of 0.857, an accuracy (ACC) of 0.87566 and an area under the ROC curve (AUC) of 0.874. For the convenience of other researchers, we wrapped our model and deployed it as a web server, which is accessible at http://112.124.26.17:7000/TMPpred/index.html.
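As an illustration of the binary classification setup and the reported metrics, the following is a minimal sketch and not the authors' implementation. It assumes that residue-level amino acid composition is used as a feature representation, which the abstract does not state, and that labels are encoded as 1 for thermophilic and 0 for nonthermophilic. The function names `amino_acid_composition` and `classification_metrics` are hypothetical. The classifier itself is omitted: in practice the composition vectors would be passed to a support vector machine from a library, and the predictions scored as shown.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard residue in a protein sequence.

    Assumption: composition is used as the feature vector; the abstract
    does not specify the features used by TMPpred.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def classification_metrics(y_true, y_pred):
    """Compute Sn, Sp and ACC for binary labels.

    Assumed encoding: 1 = thermophilic (positive), 0 = nonthermophilic.
    Sn  = TP / (TP + FN)   sensitivity
    Sp  = TN / (TN + FP)   specificity
    ACC = (TP + TN) / N    accuracy
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sn = tp / (tp + fn) if (tp + fn) else float("nan")
    sp = tn / (tn + fp) if (tn + fp) else float("nan")
    acc = (tp + tn) / len(pairs) if pairs else float("nan")
    return {"Sn": sn, "Sp": sp, "ACC": acc}


if __name__ == "__main__":
    # Toy sequence, invented for illustration only.
    comp = amino_acid_composition("GGASTGAK")
    print({aa: round(comp[aa], 3) for aa in "GAST"})

    # Toy labels and predictions, invented for illustration only.
    print(classification_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1]))
```

The AUC is not computed here because it requires continuous decision scores rather than hard class labels.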

Keywords: Binary classification; Machine learning; Support vector machine; Thermostability of protein.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Machine Learning*
  • Proteins / chemistry
  • Support Vector Machine*

Substances

  • Proteins