Peptide identification by database search of mixture tandem mass spectra

Mol Cell Proteomics. 2011 Dec;10(12):M111.010017. doi: 10.1074/mcp.M111.010017. Epub 2011 Aug 23.

Abstract

In high-throughput proteomics, the development of computational methods and the development of novel experimental strategies often rely on each other. In certain areas, mass spectrometry methods for data acquisition are ahead of the computational methods used to interpret the resulting tandem mass spectra. In particular, although there are numerous situations in which a mixture tandem mass spectrum can contain fragment ions from two or more peptides, nearly all database search tools still assume that each tandem mass spectrum comes from one peptide. Common examples include mixture spectra from co-eluting peptides in complex samples, spectra generated by data-independent acquisition methods, and spectra from peptides with complex post-translational modifications. We propose a new database search tool (MixDB) that is able to identify mixture tandem mass spectra from more than one peptide. We show that peptides can be reliably identified with up to 95% accuracy from mixture spectra while considering only 0.01% of all possible peptide pairs (a four orders of magnitude speedup). Comparison with current database search methods indicates that our approach has better or comparable sensitivity and precision at identifying single-peptide spectra while simultaneously being able to identify 38% more peptides from mixture spectra at significantly higher precision.
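The "four orders of magnitude" figure follows directly from the 0.01% fraction: scoring 1 in 10,000 candidate pairs reduces the pair-scoring work by a factor of about 10^4. The arithmetic can be sketched as below. This is only an illustration, not MixDB's filtering procedure; the function name `pair_search_space` and the candidate count of 10,000 peptides are hypothetical and do not come from the paper.

```python
from math import comb


def pair_search_space(num_candidates: int, keep_one_in: int = 10_000):
    """Illustrative arithmetic only (hypothetical helper, not part of MixDB).

    num_candidates: assumed number of candidate peptides for one spectrum.
    keep_one_in:    1 / fraction of pairs scored; 10_000 corresponds to 0.01%.
    Returns (all_pairs, pairs_scored, speedup).
    """
    # Number of unordered peptide pairs among the candidates.
    all_pairs = comb(num_candidates, 2)
    # Pairs actually scored if only 0.01% of them are considered.
    pairs_scored = max(1, all_pairs // keep_one_in)
    # Reduction in pair-scoring work relative to exhaustive enumeration.
    speedup = all_pairs / pairs_scored
    return all_pairs, pairs_scored, speedup


# Assumed example: 10,000 candidate peptides for a single mixture spectrum.
all_pairs, pairs_scored, speedup = pair_search_space(10_000)
print(f"all pairs: {all_pairs}, scored: {pairs_scored}, speedup: {speedup:.0f}x")
```

The quadratic growth of `all_pairs` with the number of candidates is why exhaustive pair enumeration is costly and why restricting the search to a small fraction of pairs matters.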

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Data Interpretation, Statistical*
  • Databases, Protein*
  • Models, Statistical
  • Peptide Fragments / chemistry*
  • Reproducibility of Results
  • Saccharomyces cerevisiae Proteins / chemistry
  • Search Engine*
  • Support Vector Machine
  • Tandem Mass Spectrometry*

Substances

  • Peptide Fragments
  • Saccharomyces cerevisiae Proteins