Metal ions are central to the molecular function of many proteins. Knowing where they bind in experimentally determined structures is therefore important; however, such structures often lose their bound metal ions during sample preparation. Identifying these metal-binding sites becomes difficult when the receptor is novel or when its conformation differs between the bound and unbound states. Locating such sites in theoretical models is also challenging because of uncertainties in side-chain modeling. We address the problem by employing the Geometric Hashing algorithm to create a template library of functionally important binding sites and to match query structures against the available templates. The matching is performed on a structure ensemble obtained from coarse-grained molecular dynamics simulation, in which metal-specific amino acids are screened to infer the true site. Tests on 1347 non-redundant monomeric protein structures show that Ca2+, Zn2+, Mg2+, Cu2+, and Fe3+ binding-site residues can be classified with aggregate performance, measured across all possible thresholds on a scale of 0 to 1, of 0.92, 0.95, 0.80, 0.90, and 0.92, respectively. The performance for Ca2+ and Zn2+ is notably superior to that of state-of-the-art methods such as IonCom and MIB. Specific case studies show that the additionally predicted metal-binding site residues have the features necessary for ion binding. These include new sites not predicted by other methods. The use of coarse-grained dynamics thus provides a generalized approach to improving metal-binding site prediction. The work is expected to improve our ability to correctly predict protein molecular function where knowledge of metal binding is a key requirement.
Keywords: binding residue; coarse-grained dynamics; geometric hashing; metal-binding; site prediction.
© 2021 Wiley Periodicals LLC.
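The template matching step named in the abstract can be illustrated with a minimal sketch of generic 3D geometric hashing. It is not the authors' implementation. The function names, the 1.0 Å bin size, and the toy coordinates (standing in for, say, positions of coordinating atoms) are all assumptions made for illustration. Each ordered triplet of non-collinear points defines a local frame, the remaining points are expressed in that frame and quantized into hash keys, and a query votes for the template whose stored keys it reproduces.

```python
from collections import defaultdict
from itertools import permutations
import math

def _sub(a, b): return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def _dot(a, b): return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])
def _unit(a):
    n = math.sqrt(_dot(a, a))
    return None if n < 1e-9 else (a[0] / n, a[1] / n, a[2] / n)

def local_frame(p0, p1, p2):
    """Origin and orthonormal axes from three points; None if collinear."""
    x = _unit(_sub(p1, p0))
    if x is None:
        return None
    z = _unit(_cross(x, _sub(p2, p0)))
    if z is None:
        return None
    return p0, (x, _cross(z, x), z)

def frame_keys(points, basis, bin_size):
    """Quantized local coordinates of all non-basis points for one basis."""
    frame = local_frame(*(points[i] for i in basis))
    if frame is None:
        return []
    origin, axes = frame
    keys = []
    for idx, p in enumerate(points):
        if idx in basis:
            continue
        d = _sub(p, origin)
        keys.append(tuple(round(_dot(d, ax) / bin_size) for ax in axes))
    return keys

def build_table(templates, bin_size=1.0):
    """Hash table: quantized key -> set of (template name, basis) entries."""
    table = defaultdict(set)
    for name, pts in templates.items():
        for basis in permutations(range(len(pts)), 3):
            for key in frame_keys(pts, basis, bin_size):
                table[key].add((name, basis))
    return table

def best_match(table, query, bin_size=1.0):
    """Return (template name, votes) for the best-supported template basis."""
    best = (None, 0)
    for basis in permutations(range(len(query)), 3):
        votes = defaultdict(int)
        for key in set(frame_keys(query, basis, bin_size)):
            for entry in table.get(key, ()):
                votes[entry] += 1
        for (name, _), count in votes.items():
            if count > best[1]:
                best = (name, count)
    return best

# Hypothetical toy templates (coordinates in angstroms, invented for the demo).
templates = {
    "site_A": [(0.0, 0.0, 0.0), (2.5, 0.0, 0.0), (0.0, 3.0, 0.5),
               (1.5, 1.5, 2.5), (-1.0, 1.0, -2.0)],
    "decoy": [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0),
              (0.0, 12.0, 0.0), (0.0, 0.0, 14.0)],
}
table = build_table(templates)

# Query: site_A rotated 90 degrees about z and translated by (10, -5, 3).
query = [(-y + 10.0, x - 5.0, z + 3.0) for x, y, z in templates["site_A"]]
print(best_match(table, query))
```

Because the keys are computed in a frame attached to the points themselves, the rotated and translated query reproduces the keys stored for the template. A practical pipeline would also need tolerance for coordinate noise near bin edges and, as the abstract describes, matching across an ensemble of conformations rather than a single structure.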