High-resolution protein structure determination starting with a global fold calculated from exact solutions to the RDC equations

J Biomol NMR. 2009 Nov;45(3):265-81. doi: 10.1007/s10858-009-9366-3. Epub 2009 Aug 27.

Abstract

We present a novel structure determination approach that exploits the global orientational restraints from RDCs to resolve ambiguous NOE assignments. Unlike traditional approaches that bootstrap the initial fold from ambiguous NOE assignments, we start by using RDCs to compute accurate secondary structure element (SSE) backbones at the beginning of structure calculation. Our structure determination package, called RDC-PANDA (RDC-based SSE PAcking with NOEs for Structure Determination and NOE Assignment), consists of three modules: (1) RDC-EXACT; (2) PACKER; and (3) HANA (HAusdorff-based NOE Assignment). RDC-EXACT computes the globally optimal solution of backbone dihedral angles for each secondary structure element by exactly solving a system of quartic RDC equations derived by Wang and Donald (Proceedings of the IEEE computational systems bioinformatics conference (CSB), Stanford, CA, 2004a; J Biomol NMR 29(3):223-242, 2004b), and systematically searching over the roots, each of which is a backbone dihedral φ or ψ angle consistent with the RDC data. Using a small number of unambiguous inter-SSE NOEs extracted using only chemical shift information, PACKER performs a systematic search for the core structure, including all SSE backbone conformations. HANA uses a Hausdorff-based scoring function to measure the similarity between the experimental spectra and the back-computed NOE pattern for each side-chain from a statistically diverse rotamer library, and drives the selection of optimal position-specific rotamers for filtering ambiguous NOE assignments. Finally, a local minimization approach is used to compute the loops and refine side-chain conformations by fixing the core structure as a rigid body while allowing movement of loops and side-chains.
RDC-PANDA was applied to NMR data for the FF Domain 2 of human transcription elongation factor CA150 (RNA polymerase II C-terminal domain interacting protein), human ubiquitin, the ubiquitin-binding zinc finger domain of the human Y-family DNA polymerase Eta (pol eta UBZ), and the human Set2-Rpb1 interacting domain (hSRI). These results demonstrate the efficiency and accuracy of our algorithm, and show that RDC-PANDA can be successfully applied to high-resolution protein structure determination using only a limited set of NMR data by first computing RDC-defined backbones.
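
The abstract does not specify the scoring function used by HANA, so the following is only a minimal sketch of the general idea, assuming the classical symmetric Hausdorff distance between two sets of 2D peak positions. The function names (`directed_hausdorff`, `hausdorff`, `best_rotamer`) and the peak coordinates are hypothetical and chosen for illustration; they are not taken from the RDC-PANDA package.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest neighbour in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def best_rotamer(experimental_peaks, predicted_by_rotamer):
    """Return the rotamer whose back-computed peaks lie closest to the data.

    `predicted_by_rotamer` maps a rotamer label to its list of predicted
    peak positions (an assumed data layout, for illustration only).
    """
    return min(
        predicted_by_rotamer,
        key=lambda name: hausdorff(experimental_peaks, predicted_by_rotamer[name]),
    )


# Hypothetical peak positions (arbitrary 2D chemical-shift coordinates).
experimental = [(1.0, 4.2), (2.1, 7.9)]
candidates = {
    "rotamer_a": [(1.0, 4.3), (2.0, 7.9)],
    "rotamer_b": [(3.5, 4.2), (2.1, 6.0)],
}

print(best_rotamer(experimental, candidates))
```

A smaller Hausdorff distance means every predicted peak has a nearby experimental peak and vice versa, which is why a measure of this kind can rank candidate rotamers by how well their back-computed NOE pattern matches the spectrum.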

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Humans
  • Models, Molecular
  • Nuclear Magnetic Resonance, Biomolecular / methods*
  • Protein Structure, Tertiary
  • Proteins / chemistry*

Substances

  • Proteins

Associated data

  • PDB/2KIQ