Long-read sequencing and genome assembly of natural history collection samples and challenging specimens

bioRxiv [Preprint]. 2024 Sep 27:2024.03.04.583385. doi: 10.1101/2024.03.04.583385.

Abstract

Museum collections harbor millions of samples that remain largely unused for long-read sequencing. Here, we use ethanol-preserved samples containing kilobase-sized DNA to show that amplification-free protocols can yield contiguous genome assemblies. Additionally, using a modified amplification-based protocol that employs an alternative polymerase to overcome PCR bias, we assembled the 3.1 Gb maned sloth genome, exceeding the protocol's previous 500 Mb genome-size limit. Our protocol also improves assemblies of other difficult-to-sequence molluscs and arthropods, including millimeter-sized organisms. By highlighting collections as valuable sample resources and facilitating genome assembly of tiny and challenging organisms, our study advances efforts to obtain reference genomes of all eukaryotes.

Keywords: PCR amplification; genome assembly; long-read sequencing; museum collections.

Publication types

  • Preprint