Species identification of non-human biological evidence through DNA nucleotide sequencing is routinely used for forensic genetic analysis to support law enforcement. The gold standard for forensic genetics is conventional Sanger sequencing; however, this is gradually being replaced by high-throughput sequencing (HTS) approaches, which can generate millions of individual reads in a single experiment. HTS, which now dominates molecular biology research, has already been demonstrated for use in a number of forensic genetic analysis applications, including species identification. However, generating HTS data has to date required expensive equipment and is cost-effective only when large numbers of samples are analysed simultaneously. The Oxford Nanopore Technologies (ONT) MinION™ is an affordable, small-footprint DNA sequencing device with the potential to quickly deliver reliable and cost-effective data. However, there has been no formal validation of forensic species identification using high-throughput (deep read) sequence data from the MinION, making it currently impractical for many wildlife forensic end-users. Here, we present a MinION deep read sequence data validation study for species identification. First, we tested whether the clustering-based bioinformatics pipeline NGSpeciesID can be used to generate an accurate consensus sequence for species identification. Second, we systematically evaluated the read variation distribution around the generated consensus sequences to establish how much confidence can be placed in the accuracy of the resulting consensus sequence and to determine how to interpret individual sample results. Finally, we investigated the impact of differences between the MinION consensus and Sanger control sequences on correct species identification to assess how accurately the MinION consensus sequence differentiates the true species from the next most similar species.
This validation study establishes that ONT MinION sequence data used in conjunction with the NGSpeciesID pipeline can produce consensus DNA sequences of sufficient accuracy for forensic genetic species identification.
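The final validation step, separating the true species from the next most similar species, can be illustrated with a minimal Python sketch. This is not the study's method or any part of the NGSpeciesID pipeline: the function names, the toy sequences, and the use of simple per-position identity on pre-aligned, equal-length sequences are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two pre-aligned sequences.

    Assumption: both sequences are already aligned and of equal length.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def rank_references(consensus: str, references: Dict[str, str]) -> List[Tuple[str, float]]:
    """Score the consensus against each reference, best match first."""
    scores = [(name, percent_identity(consensus, seq)) for name, seq in references.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


def identification_margin(consensus: str, references: Dict[str, str]) -> Tuple[str, float, float]:
    """Return (best species, best identity, gap to the next most similar species)."""
    ranked = rank_references(consensus, references)
    if len(ranked) < 2:
        raise ValueError("need at least two reference sequences")
    (best_name, best_score), (_, second_score) = ranked[0], ranked[1]
    return best_name, best_score, best_score - second_score


# Hypothetical toy data, not real barcode sequences.
references = {
    "species_A": "ACGTACGTAC",
    "species_B": "ACGTACGTTT",
    "species_C": "TTTTACGTTT",
}
consensus = "ACGTACGTAC"
best, identity, margin = identification_margin(consensus, references)
print(best, identity, margin)
```

A wider margin between the best and second-best match indicates that small consensus errors are less likely to change the species assignment. Real barcode comparisons would need a proper alignment step, which this sketch leaves out.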
Keywords: Bioinformatic pipeline; DNA barcoding; High-throughput sequencing (HTS); MinION; MtDNA; NGSpeciesID; Species identification; Validation.
Copyright © 2021 The Authors. Published by Elsevier B.V. All rights reserved.