Predicting protein secondary structure content. A tandem neural network approach

J Mol Biol. 1992 Jun 5;225(3):713-27. doi: 10.1016/0022-2836(92)90396-2.

Abstract

A priori knowledge of secondary structure content can be of great use in both theoretical and experimental determination of protein structure. We present a method that uses two computer-simulated neural networks placed in "tandem" to predict the secondary structure content of water-soluble, globular proteins. The first network, NET1, predicts a protein's helix and strand content from its amino acid composition, molecular weight and heme presence. Because NET1 contained more adjustable parameters (network weights) than learning examples, it was prone to memorization, that is, fitting the training examples without generalizing to new, previously unseen ones. To overcome this problem, we designed a second network, NET2, which learned to determine when NET1 was in a state of generalization. Together, the two networks produce prediction errors as low as 5.0% and 5.6% for helix and strand content, respectively, on a set of protein crystal structures bearing little homology to those used in network training. A comparison with three other methods (a multiple linear regression analysis, a non-hidden-node network analysis and a secondary structure assignment analysis) shows that the tandem neural network scheme gave the best predictions of secondary structure content among the methods compared. The results suggest that sequence information is not necessary for highly accurate predictions of protein secondary structure content.
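The inputs and the tandem arrangement described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the input encoding, the network architectures or the way NET2 judges NET1, so the 22-element layout, the 100 kDa weight scaling, the `net1` and `net2` callables and the 0.5 threshold are all assumptions made for illustration only.

```python
from typing import Callable, List, Mapping, Optional, Sequence, Tuple

# The 20 standard amino acids, one-letter codes, in a fixed (arbitrary) order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def net1_inputs(residue_counts: Mapping[str, int],
                molecular_weight_da: float,
                has_heme: bool) -> List[float]:
    """Build a hypothetical NET1 input vector from composition, weight and heme flag.

    Only residue counts are needed, not the order of residues in the sequence.
    """
    total = sum(residue_counts.get(a, 0) for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("no standard residues in residue_counts")
    # Fractional amino acid composition; the 20 values sum to 1.
    composition = [residue_counts.get(a, 0) / total for a in AMINO_ACIDS]
    # Assumed scaling: divide by 100 kDa so the value is of order 1.
    scaled_weight = molecular_weight_da / 100_000.0
    return composition + [scaled_weight, 1.0 if has_heme else 0.0]


Prediction = Tuple[float, float]  # (helix fraction, strand fraction)


def tandem_predict(features: Sequence[float],
                   net1: Callable[[Sequence[float]], Prediction],
                   net2: Callable[[Sequence[float], Prediction], float],
                   threshold: float = 0.5) -> Optional[Prediction]:
    """Return NET1's prediction only if the second model judges it trustworthy.

    `net1` and `net2` are placeholders for trained models. What NET2 actually
    takes as input is an assumption here.
    """
    prediction = net1(features)
    trust = net2(features, prediction)
    return prediction if trust >= threshold else None
```

With stand-in models, `tandem_predict(net1_inputs({"A": 2, "G": 1, "L": 1}, 50000.0, True), net1, net2)` returns NET1's output when `net2` reports a score of at least 0.5 and `None` otherwise.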

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Databases, Bibliographic
  • Neural Networks, Computer
  • Protein Conformation*
  • Proteins / chemistry*
  • Structure-Activity Relationship
  • Thermodynamics

Substances

  • Proteins