Capture of complete ciliate chromosomes in single sequencing reads reveals widespread chromosome isoforms

BMC Genomics. 2019 Dec 30;20(1):1037. doi: 10.1186/s12864-019-6189-9.

Abstract

Background: Whole-genome shotgun sequencing, which stitches together millions of short sequencing reads into a single genome, ushered in the era of modern genomics and led to a rapid expansion of the number of genome sequences available. Nevertheless, assembly of short reads remains difficult, resulting in fragmented genome sequences. Ultimately, only a sequencing technology capable of capturing complete chromosomes in a single run could resolve all ambiguities. Even "third generation" sequencing technologies produce reads far shorter than most eukaryotic chromosomes. However, the ciliate Oxytricha trifallax has a somatic genome with thousands of chromosomes averaging only 3.2 kbp, making it an ideal candidate for exploring the benefits of sequencing whole chromosomes without assembly.

Results: We used single-molecule real-time sequencing to capture thousands of complete chromosomes in single reads and to update the published Oxytricha trifallax JRB310 genome assembly. In this version, over 50% of the completed chromosomes with two telomeres derive from single reads. The improved assembly includes over 12,000 new chromosome isoforms and demonstrates that somatic chromosomes derive from variable rearrangements between somatic segments encoded up to 191,000 base pairs away. However, while long reads reduce the need for assembly, a hybrid approach that supplements long-read sequencing with short reads for error correction produced the most complete and accurate assembly overall.
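As an illustration of what "a complete chromosome in a single read" could mean operationally, the following minimal sketch flags reads that carry a telomeric repeat at both ends. It is not the authors' pipeline. The repeat motifs (C4A4 at the read start, G4T4 at the read end), the 40-base search window, and the function name `has_two_telomeres` are assumptions made for this example and are not stated in the abstract.

```python
import re

# Assumption: telomeres appear as at least two tandem C4A4 units at the
# start of a read and at least two tandem G4T4 units at its end.
START_TELOMERE = re.compile(r"(?:CCCCAAAA){2,}")
END_TELOMERE = re.compile(r"(?:GGGGTTTT){2,}")


def has_two_telomeres(read: str, window: int = 40) -> bool:
    """Return True if both ends of the read contain a telomeric repeat.

    `window` is a hypothetical tolerance: how many bases from each end
    are searched, allowing for a few untrimmed or erroneous bases.
    """
    read = read.upper()
    head, tail = read[:window], read[-window:]
    return bool(START_TELOMERE.search(head)) and bool(END_TELOMERE.search(tail))


# Toy reads: a telomere-capped sequence and one missing its 5' telomere.
complete_read = "CCCCAAAA" * 3 + "ACGT" * 50 + "GGGGTTTT" * 3
partial_read = "ACGT" * 50 + "GGGGTTTT" * 3
```

A real screen over raw long reads would also need to tolerate sequencing errors inside the repeats, which exact regular-expression matching does not.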

Conclusions: This assembly provides the first example of complete eukaryotic chromosomes captured by single sequencing reads and demonstrates that traditional approaches to genome assembly can mask considerable structural variation.

Keywords: Alternative fragmentation; Ciliate; Genome assembly; Oxytricha; PacBio; SMRT sequencing.

MeSH terms

  • Chromosomes*
  • Ciliophora / genetics*
  • Computational Biology / methods
  • Genetic Variation*
  • Genome
  • Genomics / methods
  • High-Throughput Nucleotide Sequencing*
  • Hybridization, Genetic
  • Sequence Analysis, DNA*