MIonSite: Ligand-specific prediction of metal ion-binding sites via enhanced AdaBoost algorithm with protein sequence information

Anal Biochem. 2019 Feb 1:566:75-88. doi: 10.1016/j.ab.2018.11.009. Epub 2018 Nov 9.

Abstract

Accurately identifying metal ion-binding sites solely from protein sequences is valuable for both basic experimental biology and drug discovery studies. Although considerable progress has been made, metal ion-binding site prediction remains a challenging problem because of the small size and high versatility of metal ions. In this paper, we develop a ligand-specific predictor called MIonSite for predicting metal ion-binding sites from protein sequences. MIonSite first extracts discriminative features for each residue from protein evolutionary information, predicted secondary structure, predicted solvent accessibility, and conservation information calculated with the Jensen-Shannon divergence score. An enhanced AdaBoost algorithm is then designed to cope with the severe class imbalance inherent in metal ion-binding site prediction, where non-binding sites far outnumber metal ion-binding sites. A new gold-standard benchmark dataset, consisting of training and independent validation subsets for Zn2+, Ca2+, Mg2+, Mn2+, Fe3+, Cu2+, Fe2+, Co2+, Na+, K+, Cd2+, and Ni2+, is constructed to compare the proposed MIonSite with existing predictors. Experimental results demonstrate that MIonSite achieves high prediction performance and outperforms other state-of-the-art sequence-based predictors. The standalone program of MIonSite and the corresponding datasets can be freely downloaded at https://github.com/LiangQiaoGu/MIonSite.git for academic use.
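
The abstract names the Jensen-Shannon divergence as the source of the conservation feature but does not say how the score is computed. As a rough illustration only, the sketch below scores one alignment column by its divergence from a background distribution. The uniform background, the pseudocount value, and the function names `js_divergence` and `column_conservation` are assumptions made for this example and are not taken from MIonSite.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two discrete distributions."""
    def kl(a, b):
        # Terms with a_i == 0 contribute nothing to the Kullback-Leibler sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    # Mixture distribution; it is positive wherever p or q is positive.
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def column_conservation(column, pseudocount=1e-6):
    """Score one alignment column; higher values mean stronger conservation.

    Assumption: a uniform background over the 20 standard amino acids.
    Gaps and non-standard symbols are ignored.
    """
    counts = Counter(c for c in column if c in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    freqs = [(counts.get(a, 0) + pseudocount) / total for a in AMINO_ACIDS]
    background = [1 / len(AMINO_ACIDS)] * len(AMINO_ACIDS)
    return js_divergence(freqs, background)


if __name__ == "__main__":
    # A fully conserved cysteine column versus a highly variable column.
    print(column_conservation("CCCCCCCCCC"))
    print(column_conservation("ACDEFGHIKL"))
```

With base-2 logarithms the score lies between 0 and 1, so a column occupied by a single residue type scores higher than a variable one. In a sequence-based pipeline of the kind described above, such a per-residue score would be one input among the other features.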

Keywords: Imbalance learning; Ligand-specific; Metal ion-binding site prediction; Sequence-based.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Binding Sites
  • Databases, Protein
  • Datasets as Topic
  • Ligands*
  • Metals / chemistry*
  • Protein Binding
  • Protein Conformation
  • Protein Domains
  • Proteins / chemistry*
  • Software

Substances

  • Ligands
  • Metals
  • Proteins