A MOTIF-BASED METHOD FOR PREDICTING INTERFACIAL RESIDUES IN BOTH THE RNA AND PROTEIN COMPONENTS OF PROTEIN-RNA COMPLEXES

Pac Symp Biocomput. 2016:21:445-455.

Abstract

Efforts to predict interfacial residues in protein-RNA complexes have largely focused on predicting RNA-binding residues in proteins. The complementary problem of predicting protein-binding residues in RNA sequences, however, has received relatively little attention to date. Although the value of sequence motifs for classifying and annotating protein sequences is well established, sequence motifs have not been widely applied to predicting interfacial residues in macromolecular complexes. Here, we propose a novel sequence motif-based method for "partner-specific" interfacial residue prediction. Given a specific protein-RNA pair, the goal is to simultaneously predict RNA-binding residues in the protein sequence and protein-binding residues in the RNA sequence. In 5-fold cross-validation experiments, our method, PS-PRIP, achieved 92% Specificity and 61% Sensitivity, with a Matthews correlation coefficient (MCC) of 0.58, in predicting RNA-binding sites in proteins. The method achieved 69% Specificity and 75% Sensitivity, but with a low MCC of 0.13, in predicting protein-binding sites in RNAs. Similar performance was obtained when PS-PRIP was tested on two independent "blind" datasets of experimentally validated protein-RNA interactions, suggesting the method should be widely applicable and valuable for identifying potential interfacial residues in protein-RNA complexes for which structural information is not available. The PS-PRIP webserver and datasets are available at: http://pridb.gdcb.iastate.edu/PSPRIP/.
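The three measures reported above can be computed from per-residue confusion counts. The sketch below is illustrative only: the abstract does not state the formulas used, so it assumes the standard definitions (Sensitivity = TP/(TP+FN), Specificity = TN/(TN+FP)), and the function name `binary_metrics` and all counts are hypothetical, not data from the paper. It shows how the same Sensitivity and Specificity can give a much lower MCC when interfacial residues are a small fraction of all residues.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, mcc) from confusion counts.

    Assumes the standard definitions; the paper may define
    Specificity differently.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts: 100 interfacial residues in both cases,
# but ten times as many non-interfacial residues in the second.
moderate = binary_metrics(tp=75, fp=310, tn=690, fn=25)
imbalanced = binary_metrics(tp=75, fp=3100, tn=6900, fn=25)

for label, (sens, spec, mcc) in (("moderate", moderate),
                                 ("imbalanced", imbalanced)):
    print(f"{label}: Sensitivity={sens:.2f} "
          f"Specificity={spec:.2f} MCC={mcc:.2f}")
```

Both hypothetical cases share the same Sensitivity and Specificity, yet the MCC drops as the negative class grows, because MCC also penalizes the large absolute number of false positives.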

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Amino Acid Motifs
  • Amino Acid Sequence
  • Base Sequence
  • Binding Sites / genetics
  • Computational Biology / methods
  • Computational Biology / statistics & numerical data
  • Databases, Nucleic Acid / statistics & numerical data
  • Databases, Protein / statistics & numerical data
  • Escherichia coli Proteins / chemistry
  • Escherichia coli Proteins / genetics
  • Escherichia coli Proteins / metabolism
  • Models, Molecular
  • Protein Binding
  • RNA / chemistry*
  • RNA / genetics
  • RNA / metabolism*
  • RNA, Bacterial / chemistry
  • RNA, Bacterial / genetics
  • RNA, Bacterial / metabolism
  • RNA, Ribosomal, 16S / chemistry
  • RNA, Ribosomal, 16S / genetics
  • RNA, Ribosomal, 16S / metabolism
  • RNA-Binding Proteins / chemistry*
  • RNA-Binding Proteins / genetics
  • RNA-Binding Proteins / metabolism*
  • Ribosomal Proteins / chemistry
  • Ribosomal Proteins / genetics
  • Ribosomal Proteins / metabolism
  • Software

Substances

  • Escherichia coli Proteins
  • RNA, Bacterial
  • RNA, Ribosomal, 16S
  • RNA-Binding Proteins
  • Ribosomal Proteins
  • ribosomal protein S11
  • ribosomal protein S4
  • RNA