Residue-level error detection in cryoelectron microscopy models

Structure. 2023 Jul 6;31(7):860-869.e4. doi: 10.1016/j.str.2023.05.002. Epub 2023 May 29.

Abstract

Building accurate protein models into moderate-resolution (3-5 Å) cryoelectron microscopy (cryo-EM) maps is challenging and error-prone. We have developed MEDIC (Model Error Detection in Cryo-EM), a robust statistical model that identifies local backbone errors in protein structures built into cryo-EM maps by combining local fit-to-density with deep-learning-derived structural information. MEDIC is validated on a set of 28 structures that were subsequently solved at higher resolution, where it identifies the differences between the low- and high-resolution structures with 68% precision and 60% recall. We additionally use this model to fix over 100 errors in 12 deposited structures and to identify errors in 4 refined AlphaFold predictions with 80% precision and 60% recall. As modelers more frequently use deep-learning predictions as a starting point for refinement and rebuilding, MEDIC's ability to handle errors in structures derived from both hand-building and machine-learning methods makes it a powerful tool for structural biologists.
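
The precision and recall figures quoted above are residue-level metrics. As a minimal sketch of how such figures are conventionally computed, assuming errors are represented as sets of residue numbers, the example below compares a set of flagged residues against a reference set. The function name `precision_recall` and the residue numbers are hypothetical illustrations and are not taken from MEDIC or its evaluation data.

```python
def precision_recall(flagged, reference):
    """Return (precision, recall) for two collections of residue numbers.

    flagged:   residues a detector marked as erroneous (hypothetical input)
    reference: residues that truly differ, e.g. versus a higher-resolution model
    """
    flagged, reference = set(flagged), set(reference)
    true_positives = len(flagged & reference)
    # Precision: fraction of flagged residues that are real errors.
    precision = true_positives / len(flagged) if flagged else 0.0
    # Recall: fraction of real errors that were flagged.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up residue numbers, for illustration only.
    flagged = {12, 13, 14, 40, 41}
    reference = {13, 14, 15, 41, 77}
    p, r = precision_recall(flagged, reference)
    print(f"precision={p:.2f} recall={r:.2f}")
```

Under this definition, high precision means few correctly built residues are flagged, while high recall means few true errors are missed.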

Keywords: cryoelectron microscopy; machine learning; model building; protein model validation.

MeSH terms

  • Cryoelectron Microscopy / methods
  • Machine Learning*
  • Models, Molecular
  • Protein Conformation
  • Proteins* / chemistry

Substances

  • Proteins