A large-scale genomically predicted protein mass database enables rapid and broad-spectrum identification of bacterial and archaeal isolates by mass spectrometry

Genome Biol. 2023 Dec 5;24(1):257. doi: 10.1186/s13059-023-03096-4.

Abstract

MALDI-TOF MS-based microbial identification relies on reference spectral libraries, which limits the screening of diverse isolates, including uncultured lineages. We present a new strategy for broad-spectrum identification of bacterial and archaeal isolates by MALDI-TOF MS using a large-scale database of protein masses predicted from nearly 200,000 publicly available genomes. We verify the ability of the database to identify microorganisms at the species level and below, achieving correct identification for >90% of measured spectra. We further demonstrate its utility by combining it with metagenomics to identify uncultured strains from mouse feces; customizing the database with metagenome-assembled genomes enables the identification of new strains.
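
The general idea of matching a measured spectrum against genome-predicted protein masses can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names (`protein_mass`, `count_matched_peaks`, `rank_candidates`), the 500 ppm tolerance, the restriction to singly protonated ions, and the fraction-of-matched-peaks score are all assumptions made for illustration. The sketch also ignores post-translational processing such as N-terminal methionine removal.

```python
from bisect import bisect_left

# Standard average residue masses in Da (amino acid minus one water).
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528   # Da, added once per polypeptide chain
PROTON = 1.00728   # Da, for the singly protonated ion [M+H]+


def protein_mass(sequence):
    """Average mass (Da) of an unmodified protein sequence."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in sequence) + WATER


def predicted_mz(sequence):
    """m/z of the singly protonated ion (assumed charge state)."""
    return protein_mass(sequence) + PROTON


def count_matched_peaks(peaks, reference_mz, tol_ppm=500.0):
    """Count observed peaks with a reference m/z within +/- tol_ppm."""
    ref = sorted(reference_mz)
    matched = 0
    for peak in peaks:
        tol = peak * tol_ppm * 1e-6
        # First reference value that is >= the lower bound of the window.
        i = bisect_left(ref, peak - tol)
        if i < len(ref) and ref[i] <= peak + tol:
            matched += 1
    return matched


def rank_candidates(peaks, database, tol_ppm=500.0):
    """Rank genomes by the fraction of observed peaks they explain.

    `database` maps a genome name to a list of predicted m/z values
    (a hypothetical layout, not the published database format).
    """
    if not peaks:
        return []
    scores = {
        name: count_matched_peaks(peaks, mzs, tol_ppm) / len(peaks)
        for name, mzs in database.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling `rank_candidates(observed_peaks, database)` returns the candidate genomes in descending score order. Adding the predicted masses of a metagenome-assembled genome as another entry in `database` corresponds, in this toy setting, to the database customization described in the abstract.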

Keywords: Archaeal genomes; Bacterial genomes; Culturomics; MALDI-TOF MS; Microbial identification; Protein mass database; Uncultured microbes.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Archaea* / genetics
  • Bacteria* / genetics
  • Databases, Factual
  • Mice
  • Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization / methods