Analysis of 16S ribosomal RNA (rRNA) gene amplification data for microbial barcoding can be inaccurate across complex environmental samples. A method, ANCHOR, is presented that is designed for improved species-level microbial identification using paired-end sequences directly, multiple high-complexity samples and multiple reference databases. A standard operating procedure (SOP) is reported alongside benchmarking against artificial, single-sample and replicated mock data sets. The method is then directly tested using a real-world data set from surface swabs of the International Space Station (ISS). Simple mock community analysis identified 100% of the expected species and 99% of expected gene copy variants (100% identical). Analysis of a replicated mock community revealed similar or better numbers of expected species than MetaAmp, DADA2, Mothur and QIIME1. Analysis of the ISS microbiome identified 714 putative unique species/strains, and differential abundance analysis distinguished significant differences between the Destiny module (U.S. laboratory) and Harmony module (sleeping quarters). Harmony was remarkably dominated by human gastrointestinal tract bacteria, similar to enclosed environments on Earth; however, Destiny module bacteria also derived from nonhuman microbiome carriers present on the ISS, the laboratory's research animals. ANCHOR can help substantially improve sequence resolution of 16S rRNA gene amplification data within biologically replicated environmental experiments, and integrated multidatabase annotation enhances interpretation of complex, nonreference microbiomes.
© 2019 The Authors. Environmental Microbiology published by Society for Applied Microbiology and John Wiley & Sons Ltd.