TOPITS: threading one-dimensional predictions into three-dimensional structures

Proc Int Conf Intell Syst Mol Biol. 1995;3:314-21.

Abstract

Homology modelling is currently the only theoretical tool that can successfully predict protein 3D structure. Because 3D structure is conserved within sequence families, homology modelling allows 3D structure to be predicted for 20% of SWISSPROT. 20% of the proteins in PDB are remote homologues of another PDB protein. Threading techniques attempt to predict such remote homologues based on sequence information. Here, a new threading method is presented. First, for a list of PDB proteins, the 3D structure was projected onto 1D strings of secondary structure and relative solvent accessibility. Then, secondary structure and accessibility were predicted by neural network systems (PHD). Finally, the predicted and observed 1D strings were aligned by dynamic programming. The resulting alignment was used to detect remote 3D homologues. Four results stand out. Firstly, even for an optimal prediction (assignment based on known structure), only about half the hits that ranked above a given threshold were correctly identified as remote homologues; only about 25% of the first hits were correct. Secondly, real predictions (PHD) were not much worse: about 20% of the first hits were correct. Thirdly, a simple filtering procedure improved prediction performance to about 30% correct first hits. The correct hit ranked among the first three for more than 23 out of 46 cases. Fourthly, combining the 1D threading with sequence alignments markedly improved the performance of the threading method TOPITS for some selected cases.
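
The alignment step described above can be illustrated with a minimal sketch. It assumes a plain global (Needleman-Wunsch style) alignment over per-residue pairs of secondary structure state and accessibility state. The state alphabets (H/E/L and b/e), the scoring constants, and the function names are hypothetical choices made for illustration; they are not the scoring scheme or implementation used by TOPITS.

```python
from typing import Dict, List, Tuple

# Hypothetical scoring constants, chosen only for illustration.
MATCH_SS = 2       # same secondary structure state (e.g. H, E, L)
MISMATCH_SS = -1   # different secondary structure state
MATCH_ACC = 1      # same accessibility state (e.g. b = buried, e = exposed)
MISMATCH_ACC = 0   # different accessibility state
GAP = -2           # linear penalty per gapped position

Residue = Tuple[str, str]  # (secondary structure state, accessibility state)


def to_residues(ss: str, acc: str) -> List[Residue]:
    """Pair up two equally long 1D strings into per-residue states."""
    if len(ss) != len(acc):
        raise ValueError("1D strings must have the same length")
    return list(zip(ss, acc))


def pair_score(a: Residue, b: Residue) -> int:
    """Score one predicted residue against one observed residue."""
    ss = MATCH_SS if a[0] == b[0] else MISMATCH_SS
    acc = MATCH_ACC if a[1] == b[1] else MISMATCH_ACC
    return ss + acc


def align_score(pred: List[Residue], obs: List[Residue]) -> int:
    """Best global alignment score, computed by dynamic programming."""
    n, m = len(pred), len(obs)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * GAP
    for j in range(1, m + 1):
        dp[0][j] = j * GAP
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + pair_score(pred[i - 1], obs[j - 1]),
                dp[i - 1][j] + GAP,  # gap in the observed string
                dp[i][j - 1] + GAP,  # gap in the predicted string
            )
    return dp[n][m]


def rank_folds(
    pred: List[Residue], library: Dict[str, List[Residue]]
) -> List[Tuple[int, str]]:
    """Rank library structures by alignment score, best first."""
    scored = [(align_score(pred, obs), name) for name, obs in library.items()]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    query = to_residues("HHHHLLEE", "bbeebbee")
    library = {
        "fold_a": to_residues("HHHHLLEE", "bbeebbee"),
        "fold_b": to_residues("EEEELLHH", "eebbeebb"),
    }
    for score, name in rank_folds(query, library):
        print(name, score)
```

In this sketch the top-ranked library entry plays the role of the "first hit"; a real application would also need a score threshold and normalisation for sequence length, which are omitted here.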

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Conserved Sequence
  • Databases, Factual*
  • Mathematics
  • Models, Molecular*
  • Molecular Sequence Data
  • Protein Conformation*
  • Protein Structure, Secondary*
  • Proteins / chemistry*
  • Proteins / genetics
  • Sequence Homology, Amino Acid

Substances

  • Proteins

Associated data

  • SWISSPROT/UNKNOWN