Detecting and sorting targeting peptides with neural networks and support vector machines

J Bioinform Comput Biol. 2006 Feb;4(1):1-18. doi: 10.1142/s0219720006001771.

Abstract

This paper presents a composite multi-layer classifier system for predicting the subcellular localization of proteins from their amino acid sequence. The work extends our previous predictor, PProwler v1.1, which is itself built upon the SignalP and TargetP series of predictors. In this study we outline experiments conducted to improve the classifier design. The major improvement came from using support vector machines as a "smart gate" that sorts the outputs of several different targeting peptide detection networks. Our final model (PProwler v1.2) gives Matthews correlation coefficient (MCC) values of 0.873 for non-plant and 0.849 for plant proteins. It improves upon the accuracy of our previous subcellular localization predictor (PProwler v1.1) by 2% for plant data, which represents a 7.5% improvement upon TargetP.
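
For readers unfamiliar with the MCC figures quoted above, the following is a minimal sketch of how a binary Matthews correlation coefficient can be computed from confusion-matrix counts. The function name `binary_mcc` and the example counts are hypothetical and are not taken from the paper. The abstract does not say how the per-class values were aggregated, so this shows only the standard two-class definition.

```python
import math


def binary_mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a two-class confusion matrix.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Common convention: MCC is defined as 0 when any marginal sum is zero.
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts for illustration only; NOT data from the paper.
    print(round(binary_mcc(tp=90, tn=85, fp=15, fn=10), 3))
```

Unlike raw accuracy, the MCC uses all four cells of the confusion matrix, which makes it a common choice when class sizes are unbalanced, as is typical for localization data sets.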

MeSH terms

  • Amino Acid Sequence
  • Artificial Intelligence
  • Computational Biology*
  • Computer Simulation
  • Neural Networks, Computer
  • Plant Proteins / chemistry
  • Plant Proteins / genetics
  • Plant Proteins / metabolism
  • Protein Sorting Signals / genetics
  • Proteins / chemistry*
  • Proteins / genetics
  • Proteins / metabolism*
  • Sequence Analysis, Protein
  • Subcellular Fractions / metabolism

Substances

  • Plant Proteins
  • Protein Sorting Signals
  • Proteins