Using long and linked reads to improve an Atlantic herring (Clupea harengus) genome assembly

Sci Rep. 2019 Nov 27;9(1):17716. doi: 10.1038/s41598-019-54151-9.

Abstract

Atlantic herring (Clupea harengus) is one of the most abundant fish species in the world. It is an important economic and nutritional resource, as well as a crucial part of the North Atlantic ecosystem. In 2016, a draft herring genome assembly was published. Given the importance of the species, we sought to independently verify and potentially improve the herring genome assembly. We sequenced the herring genome, generating paired-end, mate-pair, linked and long reads. Three assembly versions of the herring genome were generated: a de novo assembly (A1), which was scaffolded using linked and long reads (A2) and then merged with the previously published assembly (A3). The resulting assemblies were compared using parameters describing their size, fragmentation, correctness, and completeness. Results showed that the A2 assembly was less fragmented, more complete and more correct than A1. A3 showed improvement in fragmentation and correctness compared with A2 and the published assembly but was slightly less complete than the published assembly. Thus, we confirmed the previously published herring assembly and improved it by further scaffolding the assembly and removing low-quality sequences using linked and long reads, and by merging assemblies.
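The abstract does not name the specific parameters used to describe size and fragmentation. As an illustration only, the sketch below computes sequence count, total length, and N50, a widely used fragmentation summary (the length of the sequence at which the cumulative length, counted from the longest sequence down, first reaches half of the total). The function names `n50` and `summarize` are hypothetical, and the lengths are toy values, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # First sequence at which the cumulative length covers half the assembly.
        if running >= half_total:
            return length


def summarize(lengths):
    """Collect simple size and fragmentation statistics for one assembly."""
    return {
        "sequences": len(lengths),
        "total_bp": sum(lengths),
        "longest": max(lengths),
        "n50": n50(lengths),
    }


# Toy example (assumed values): the same 10 bp split into 5 pieces vs. 2 pieces.
fragmented = summarize([2, 2, 2, 2, 2])
scaffolded = summarize([6, 4])
```

With equal total length, the assembly with fewer, longer sequences has the higher N50, which is the sense in which one assembly is "less fragmented" than another.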

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Contig Mapping / methods*
  • Contig Mapping / standards
  • Fishes / genetics*
  • Genome*
  • Whole Genome Sequencing / methods*
  • Whole Genome Sequencing / standards