A Bayesian approach for determining protein side-chain rotamer conformations using unassigned NOE data

J Comput Biol. 2011 Nov;18(11):1661-79. doi: 10.1089/cmb.2011.0172. Epub 2011 Oct 4.

Abstract

A major bottleneck in protein structure determination via nuclear magnetic resonance (NMR) is the lengthy and laborious process of assigning resonances and nuclear Overhauser effect (NOE) cross peaks. Recent studies have shown that accurate backbone folds can be determined using sparse NMR data, such as residual dipolar couplings (RDCs) or backbone chemical shifts. This raises the question of whether accurate protein side-chain conformations can also be determined using sparse or unassigned NMR data. We address this question using unassigned nuclear Overhauser effect spectroscopy (NOESY) data, which record the through-space dipolar interactions between protons that are close in three-dimensional (3D) space. We propose a Bayesian approach with a Markov random field (MRF) model to integrate the likelihood function derived from observed experimental data with prior information (i.e., empirical molecular mechanics energies) about the protein structures. We unify the side-chain structure prediction problem with the side-chain structure determination problem using unassigned NMR data, and apply the deterministic dead-end elimination (DEE) and A* search algorithms to provably find the globally optimal solution that maximizes the posterior probability. We employ a Hausdorff-based measure to derive the likelihood of a rotamer or a pairwise rotamer interaction from unassigned NOESY data. In addition, we apply a systematic and rigorous approach to estimate the experimental noise in NMR data, which also determines the weighting factor of the data term in the scoring function derived from the Bayesian framework. We tested our approach on real NMR data of three proteins: the FF Domain 2 of human transcription elongation factor CA150 (FF2), the B1 domain of Protein G (GB1), and human ubiquitin. The promising results indicate that our algorithm can be applied in high-resolution protein structure determination. Since our approach does not require any NOE assignment, it can accelerate the NMR structure determination process.
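
The search step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each rotamer and each rotamer pair already carries a score equal to its negative log-posterior contribution (an empirical energy plus a weighted data term), so that maximizing the posterior means minimizing the total score. The scores below are invented toy numbers, all function and variable names are hypothetical, the pruning rule is the Goldstein criterion (one common DEE variant; the abstract does not say which is used), and exhaustive enumeration stands in for A* search.

```python
from itertools import product

# Toy scores (invented): SELF_E[i][r] is the score of rotamer r at residue i,
# PAIR_E[(i, j)][r][s] is the score of rotamers r and s at residues i < j.
SELF_E = [[0.0, 1.0, 5.0], [0.5, 0.0], [2.0, 0.0, 0.3]]
PAIR_E = {
    (0, 1): [[0.0, 2.0], [0.5, 0.0], [0.0, 0.0]],
    (0, 2): [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]],
    (1, 2): [[0.0, 0.0, 1.5], [0.0, 1.0, 0.0]],
}


def pair(pair_e, i, r, j, s):
    """Look up a pairwise score regardless of the order of i and j."""
    return pair_e[(i, j)][r][s] if i < j else pair_e[(j, i)][s][r]


def total_score(assign, self_e, pair_e):
    """Sum of self and pairwise scores for one full rotamer assignment."""
    n = len(assign)
    score = sum(self_e[i][assign[i]] for i in range(n))
    score += sum(pair(pair_e, i, assign[i], j, assign[j])
                 for i in range(n) for j in range(i + 1, n))
    return score


def goldstein_dee(self_e, pair_e):
    """Remove rotamers that cannot be part of any global minimum."""
    n = len(self_e)
    alive = [set(range(len(e))) for e in self_e]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for r in list(alive[i]):
                for t in alive[i] - {r}:
                    # Smallest possible advantage of r over t in any context.
                    gap = self_e[i][r] - self_e[i][t]
                    for j in range(n):
                        if j != i:
                            gap += min(pair(pair_e, i, r, j, s)
                                       - pair(pair_e, i, t, j, s)
                                       for s in alive[j])
                    if gap > 0:  # t always beats r, so r is a dead end
                        alive[i].discard(r)
                        changed = True
                        break
    return alive


def map_assignment(self_e, pair_e, alive):
    """Enumerate surviving rotamers (a stand-in for A* search)."""
    candidates = product(*(sorted(a) for a in alive))
    return min(candidates, key=lambda a: total_score(a, self_e, pair_e))


alive = goldstein_dee(SELF_E, PAIR_E)
best = map_assignment(SELF_E, PAIR_E, alive)
print(alive, best, total_score(best, SELF_E, PAIR_E))
```

Because the Goldstein criterion only discards a rotamer when another rotamer at the same residue scores strictly better in every surviving context, the minimum found after pruning has the same score as the minimum over the full rotamer space. That property is what makes a deterministic, provably optimal search possible.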

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Amino Acids / chemistry*
  • Bacterial Proteins / chemistry
  • Bayes Theorem*
  • Computer Simulation*
  • Humans
  • Likelihood Functions
  • Magnetic Resonance Spectroscopy
  • Markov Chains
  • Models, Molecular*
  • Protein Conformation
  • Protein Structure, Tertiary
  • Signal-To-Noise Ratio
  • Transcriptional Elongation Factors / chemistry
  • Ubiquitin / chemistry

Substances

  • Amino Acids
  • Bacterial Proteins
  • IgG Fc-binding protein, Streptococcus
  • TCERG1 protein, human
  • Transcriptional Elongation Factors
  • Ubiquitin