Read length versus depth of coverage for viral quasispecies reconstruction

PLoS One. 2012;7(10):e47046. doi: 10.1371/journal.pone.0047046. Epub 2012 Oct 3.

Abstract

Recent advances in sequencing technology have opened up unprecedented opportunities in many application areas. Virus samples can now be sequenced efficiently at very deep coverage to infer the genetic diversity of the underlying virus populations. Several sequencing platforms with different underlying technologies and performance characteristics are available for viral diversity studies. Here, we investigate how the differences between two common platforms, provided by 454/Roche and Illumina, affect viral diversity estimation and the reconstruction of viral haplotypes. Using a mixture of ten HIV clones sequenced with both platforms and additional simulation experiments, we assessed the trade-off between sequencing coverage, read length, and error rate. At a fixed cost, short Illumina reads can be generated at higher coverage and allow variants to be detected at lower frequencies. They can also be sufficient to assess the diversity of the sample if the sequences are dissimilar enough, but, in general, assembly of full-length haplotypes is feasible only with the longer 454/Roche reads. The quantitative comparison highlights the advantages and disadvantages of both platforms and provides guidance for the design of viral diversity studies.
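The link between coverage and the lowest detectable variant frequency can be illustrated with a minimal sketch. It assumes a simple per-site binomial model in which a variant is called only when its read count exceeds what sequencing errors alone would plausibly produce. This model, the function names, and all numbers (coverage values, error rate, variant frequency, significance level) are illustrative assumptions and are not taken from the study.

```python
from math import exp, lgamma, log, log1p


def binom_pmf(k, n, p):
    """Binomial probability P(X = k) for n trials, computed in log space."""
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log1p(-p))
    return exp(log_pmf)


def min_supporting_reads(coverage, error_rate, alpha=1e-3):
    """Smallest read count k such that errors alone reach k with probability <= alpha."""
    cdf = 0.0  # holds P(X <= k - 1) under the error-only model
    for k in range(coverage + 1):
        if 1.0 - cdf <= alpha:
            return k
        cdf += binom_pmf(k, coverage, error_rate)
    return coverage + 1  # no achievable count is rare enough under errors alone


def detection_power(coverage, variant_freq, error_rate, alpha=1e-3):
    """Probability that a variant at variant_freq reaches the calling threshold.

    Simplification (assumption): errors affecting the variant reads are ignored.
    """
    k = min_supporting_reads(coverage, error_rate, alpha)
    below = sum(binom_pmf(i, coverage, variant_freq) for i in range(min(k, coverage + 1)))
    return max(0.0, 1.0 - below)


# Hypothetical scenario: a 1% variant, 0.3% per-site error rate, two coverage levels.
low_cov_power = detection_power(coverage=500, variant_freq=0.01, error_rate=0.003)
high_cov_power = detection_power(coverage=20000, variant_freq=0.01, error_rate=0.003)
print(f"power at 500x:   {low_cov_power:.3f}")
print(f"power at 20000x: {high_cov_power:.3f}")
```

Under these assumptions, raising the coverage at a fixed error rate increases the chance of separating a rare variant from error background. The sketch says nothing about haplotype assembly, for which the abstract identifies read length as the limiting factor.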

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Genes, Viral / genetics*
  • Haplotypes / genetics
  • Sequence Analysis, DNA / methods*
  • Software

Grants and funding

Part of this work has been funded by the Swiss National Science Foundation under grant number CR32I2_127017. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. No additional external funding has been received for this study.