Long walk to genomics: History and current approaches to genome sequencing and assembly

Comput Struct Biotechnol J. 2019 Nov 17:18:9-19. doi: 10.1016/j.csbj.2019.11.002. eCollection 2020.

Abstract

Genomes represent the starting point of genetic studies. Since the discovery of the structure of DNA, scientists have devoted great effort to determining genome sequences exactly. In this review we provide a comprehensive historical background of the improvements in DNA sequencing technologies that have accompanied the major milestones in genome sequencing and assembly, ranging from early sequencing methods to Next-Generation Sequencing platforms. We then focus on the advantages and challenges of the current technologies and approaches, collectively known as Third Generation Sequencing. As these technical advancements have been accompanied by progress in analytical methods, we also review the bioinformatic tools currently employed in de novo genome assembly, as well as some applications of Third Generation Sequencing technologies and high-quality reference genomes.

Keywords: BAC, Bacterial Artificial Chromosome; Bioinformatics; Genome assembly; HGP, Human Genome Project; HMW, high molecular weight; HapMap, haplotype map; NGS, Next Generation Sequencing; Next-generation; OLC, Overlap-Layout-Consensus; QV, Quality Value; Reference; SBS, Sequencing by Synthesis; SMRT, Single Molecule Real-Time; SNPs, Single Nucleotide Polymorphisms; SRA, Short Read Archive; SV, Structural Variant; Sequencing; TGS, Third Generation Sequencing; Third-generation; WGS, Whole Genome Sequencing; ZMW, Zero-Mode Waveguide; bp, base pair; dNTPs, deoxynucleoside triphosphates; ddNTP, 2′,3′-dideoxynucleoside triphosphate.

Publication types

  • Review