Mock microbial community meta-analysis using different trimming of amplicon read lengths

Environ Microbiol. 2024 Jan;26(1):e16566. doi: 10.1111/1462-2920.16566. Epub 2023 Dec 27.

Abstract

Trimming of sequencing reads is a pre-processing step that discards sequence segments such as primers, adapters and low-quality nucleotides that would otherwise interfere with clustering and classification steps. We evaluated the impact of trimming length of paired-end 16S and 18S rRNA amplicon reads on the ability to reconstruct the taxonomic composition and relative abundances of communities with a known composition in both even and uneven proportions. We found that maximizing read retention maximizes recall but reduces precision by increasing false positives. The presence of expected taxa was accurately predicted across broad trim length ranges, but recovering the original relative proportions remains difficult. We show that parameters that maximize taxonomic recovery do not simultaneously maximize relative abundance accuracy. Trim length is one of several experimental parameters that have a non-uniform impact across microbial clades, making it difficult to optimize. This study offers insights and guidelines, and helps researchers assess the significance of their decisions when trimming raw reads in a microbiome analysis based on overlapping or non-overlapping paired-end amplicons.
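
The recall/precision trade-off and the separate question of relative abundance accuracy can be illustrated with a minimal sketch. This is not the authors' pipeline: the taxon names, the abundance values, the two "trim setting" profiles and the choice of Bray-Curtis dissimilarity as the abundance error measure are all assumptions made for illustration only.

```python
# Illustrative sketch: score an observed profile against a mock community.
# All taxa, abundances and metric choices below are hypothetical.

def detection_metrics(expected, observed, min_abundance=0.0):
    """Return (precision, recall) for presence/absence of expected taxa."""
    detected = {t for t, a in observed.items() if a > min_abundance}
    expected_taxa = set(expected)
    tp = len(detected & expected_taxa)   # expected taxa that were found
    fp = len(detected - expected_taxa)   # taxa found but not in the mock
    fn = len(expected_taxa - detected)   # expected taxa that were missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

def bray_curtis(expected, observed):
    """Bray-Curtis dissimilarity between two relative abundance profiles."""
    taxa = set(expected) | set(observed)
    num = sum(abs(expected.get(t, 0.0) - observed.get(t, 0.0)) for t in taxa)
    den = sum(expected.get(t, 0.0) + observed.get(t, 0.0) for t in taxa)
    return num / den if den else 0.0

# Hypothetical even mock community with four taxa.
expected = {"taxon_A": 0.25, "taxon_B": 0.25, "taxon_C": 0.25, "taxon_D": 0.25}

# Hypothetical result when many reads are retained: all taxa found, one spurious.
lenient = {"taxon_A": 0.40, "taxon_B": 0.30, "taxon_C": 0.15,
           "taxon_D": 0.10, "taxon_X": 0.05}

# Hypothetical result when fewer reads are retained: no spurious taxa, one missed.
strict = {"taxon_A": 0.35, "taxon_B": 0.35, "taxon_C": 0.30}

for name, profile in (("lenient", lenient), ("strict", strict)):
    precision, recall = detection_metrics(expected, profile)
    error = bray_curtis(expected, profile)
    print(f"{name}: precision={precision:.2f} recall={recall:.2f} "
          f"bray_curtis={error:.2f}")
```

In this toy case the lenient profile has full recall but lower precision, the strict profile has the opposite pattern, and both are equally far from the true proportions, which is the kind of situation in which presence/absence scores alone cannot rank trim settings.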

Publication types

  • Meta-Analysis

MeSH terms

  • DNA Primers / genetics
  • High-Throughput Nucleotide Sequencing
  • Microbiota* / genetics
  • RNA, Ribosomal, 16S / genetics
  • RNA, Ribosomal, 18S
  • Sequence Analysis, DNA

Substances

  • RNA, Ribosomal, 16S
  • RNA, Ribosomal, 18S
  • DNA Primers