DNAscan2: a versatile, scalable, and user-friendly analysis pipeline for human next-generation sequencing data

Bioinformatics. 2023 Apr 3;39(4):btad152. doi: 10.1093/bioinformatics/btad152.

Abstract

Summary: The widespread adoption of next-generation sequencing (NGS) across basic research and clinical genetics means that users with highly variable informatics skills, computing facilities, and application purposes need to process, analyse, and interpret NGS data. In this landscape, versatility, scalability, and user-friendliness are key characteristics of NGS analysis software. We developed DNAscan2, a highly flexible, end-to-end pipeline for the analysis of NGS data, which (i) can be used for the detection of multiple variant types, including SNVs, small indels, transposable elements, short tandem repeats, and other large structural variants; (ii) covers all standard steps of NGS analysis, from quality control of raw data and genome alignment to variant calling, annotation, and generation of reports for the interpretation and prioritization of results; (iii) is highly adaptable, as it can be deployed and run via either a graphical user interface for non-bioinformaticians or a command line tool for personal computer usage; (iv) is scalable, as it can be executed in parallel as a Snakemake workflow; and (v) is computationally efficient, minimizing RAM and CPU time requirements.

Availability and implementation: DNAscan2 is implemented in Python 3 and is available at https://github.com/KHP-Informatics/DNAscanv2.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • High-Throughput Nucleotide Sequencing* / methods
  • Humans
  • INDEL Mutation
  • Quality Control
  • Software*
  • Workflow