Assembling multidomain protein structures through analogous global structural alignments

Proc Natl Acad Sci U S A. 2019 Aug 6;116(32):15930-15938. doi: 10.1073/pnas.1905068116. Epub 2019 Jul 24.

Abstract

Most proteins exist with multiple domains in cells for cooperative functionality. However, structural biology and protein folding methods are often optimized for single-domain structures, resulting in a rapidly growing gap between the improved capability for tertiary structure determination and high demand for multidomain structure models. We have developed a pipeline, termed DEMO, for constructing multidomain protein structures by docking-based domain assembly simulations, with interdomain orientations determined by the distance profiles from analogous templates as detected through domain-level structure alignments. The pipeline was tested on a comprehensive benchmark set of 356 proteins consisting of 2-7 continuous and discontinuous domains, for which DEMO generated models with correct global fold (TM-score > 0.5) for 86% of cases with continuous domains and for 100% of cases with discontinuous domain structures, starting from randomly oriented target-domain structures. DEMO was also applied to reassemble multidomain targets in the CASP12 and CASP13 experiments using domain structures excised from the top server predictions, where the full-length DEMO models showed a significantly improved quality over the original server models. Finally, sparse restraints of mass spectrometry-generated cross-linking data and cryo-EM density maps are incorporated into DEMO, resulting in improvements in the average TM-score by 6.3% and 12.5%, respectively. The results demonstrate an efficient approach to assembling multidomain structures, which can be easily used for automated, genome-scale multidomain protein structure assembly.
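The TM-score > 0.5 criterion used above for a correct global fold can be illustrated with a minimal sketch. This is not DEMO code. It assumes the model has already been superposed onto the native structure and that the per-residue C-alpha distances for the aligned residues are supplied by the caller. The standard TM-score program additionally searches for the superposition that maximizes the score, which is omitted here. The function names and the 0.5 Å lower bound on d0 for short chains are illustrative assumptions.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (Angstroms) from the TM-score definition."""
    # Assumption: clamp d0 at 0.5 A for short chains, where the formula
    # would otherwise give very small or undefined values.
    if target_length <= 15:
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score_fixed_superposition(distances, target_length):
    """TM-score for one fixed superposition.

    distances: C-alpha distances (Angstroms) between aligned residue pairs.
    target_length: number of residues in the native (target) structure,
                   used for normalization even if fewer residues are aligned.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


def has_correct_global_fold(score, threshold=0.5):
    """Apply the TM-score > 0.5 cutoff quoted in the abstract."""
    return score > threshold
```

Because the score is normalized by the target length rather than the number of aligned residues, a model that places only half of a 100-residue target perfectly scores 0.5 and does not pass the strict cutoff.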

Keywords: domain assembly; multidomain protein; multidomain template recognition; protein structure prediction.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Cross-Linking Reagents / chemistry
  • Cryoelectron Microscopy
  • Databases, Protein
  • Models, Molecular
  • Protein Domains
  • Proteins / chemistry*
  • Software

Substances

  • Cross-Linking Reagents
  • Proteins