Phylogenetic marker gene sequencing is often used as a quick and cost-effective way of evaluating microbial composition within a community. While 16S rRNA gene sequencing (16S) is commonly used for bacteria and archaea, other marker genes are preferable in certain situations, such as when 16S sequences cannot distinguish between taxa within a group. Another such situation arises when researchers want to study cospeciation between microbes and host taxa that diverged too recently to be resolved by the slowly evolving 16S rRNA gene. For example, the bacterial gyrase subunit B (gyrB) gene has been used to investigate cospeciation between the microbiome and various hominid species. However, to date, the only primers available for investigating gyrB of the Bacteroidaceae, Bifidobacteriaceae, and Lachnospiraceae families generate short amplicons sized for Illumina MiSeq sequencing. Here, we update this methodology by designing gyrB primers for these three families that are suited to long-read PacBio sequencing, and we characterize them against established short-read gyrB primer sets. We demonstrate both bioinformatically and analytically that, compared with established short-read sequencing, these longer amplicons sequenced with PacBio circular consensus sequencing (CCS) offer more sequence space for greater taxonomic resolution, lower off-target amplification rates, and lower error rates. The availability of these long-read gyrB primers will be integral to the continued analysis of cospeciation between bacterial members of the gut microbiome and recently diverged host species.
Importance: Previous studies have shown that the marker gene gyrase subunit B (gyrB) can be used to study codiversification between the gut microbiome and hominids. However, only primers for short-read sequencing have been developed, and these offer limited resolution for subspecies assignment. In the present study, we create new gyrB primer sets for long-read sequencing and compare them to the existing short-read gyrB primers. We show that longer reads yield better taxonomic resolution, lower off-target amplification, and lower error rates, all of which are vital for accurate estimates of codiversification.
Keywords: GyrB; codiversification; cospeciation; long-read sequencing.