R2C2 + UMI: Combining concatemeric and unique molecular identifier-based consensus sequencing enables ultra-accurate sequencing of amplicons on Oxford Nanopore Technologies sequencers

PNAS Nexus. 2024 Aug 21;3(9):pgae336. doi: 10.1093/pnasnexus/pgae336. eCollection 2024 Sep.

Abstract

The sequencing of PCR amplicons is a core application of high-throughput sequencing technology. Using unique molecular identifiers (UMIs), individual amplified molecules can be sequenced to very high accuracy on an Illumina sequencer. However, Illumina sequencers have limited read length and are therefore restricted to sequencing amplicons shorter than 600 bp unless inefficient synthetic long-read approaches are used. Native long-read sequencers from Pacific Biosciences and Oxford Nanopore Technologies can, using consensus read approaches, match or exceed Illumina quality while achieving much longer read lengths. Using a circularization-based concatemeric consensus sequencing approach (R2C2) paired with UMIs (R2C2 + UMI), we show that we can sequence an ∼550-nt antibody heavy chain (immunoglobulin heavy chain, IGH) amplicon and an ∼1,500-nt 16S amplicon at accuracies up to and exceeding Q50 (<1 error in 100,000 sequenced bases), which exceeds the accuracies of UMI-supported paired-end Illumina sequencing as well as synthetic long-read approaches.
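
The Q50 figure quoted above can be related to a per-base error rate with the standard Phred convention, Q = -10 · log10(p), where p is the probability that a base is wrong. The following is a minimal sketch of that arithmetic only. The function names are illustrative assumptions and are not taken from the R2C2 + UMI software.

```python
import math


def phred_to_error_rate(q: float) -> float:
    """Convert a Phred-scaled quality Q to a per-base error probability."""
    return 10 ** (-q / 10)


def error_rate_to_phred(p: float) -> float:
    """Convert a per-base error probability to a Phred-scaled quality."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Q50 corresponds to roughly one error per 100,000 bases.
    p = phred_to_error_rate(50)
    print(f"Q50 error rate: {p:.1e} (about 1 error in {round(1 / p):,} bases)")
    # For comparison, Q30 corresponds to roughly one error per 1,000 bases.
    print(f"Q30 error rate: {phred_to_error_rate(30):.1e}")
```

Under this convention each additional 10 Q units means a tenfold lower error rate, so Q50 represents a 100-fold lower error rate than Q30.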