Next-generation transcriptome assembly and analysis: Impact of ploidy

Methods. 2020 Apr 1:176:14-24. doi: 10.1016/j.ymeth.2019.06.001. Epub 2019 Jun 6.

Abstract

Whole genome duplications (WGD) occur widely in plants, but their effects extend across all branches of life. WGD events have major evolutionary consequences, often leading to large structural changes within chromosomes and massive changes in gene expression that facilitate rapid speciation and gene diversification. Even in species that are currently diploid, ancestral duplication events leave lasting traces in the genome, most notably as highly similar gene families retained from WGD. However, the impact of ploidy on bioinformatics workflows has not been well studied. In this review, we give an overview of the biological significance of polyploidy in different organisms. We describe how polyploid transcriptomes affect bioinformatics analyses, focusing on transcriptome assembly and transcript quantification. We discuss the benefits of using simulated benchmarking data to examine the performance of various methods. We also present an example strategy for generating simulated allopolyploid transcriptomes and RNAseq datasets, and show how these benchmark datasets can be used to assess the performance of transcript assembly and quantification methods. Our benchmarking study shows that all transcriptome assembly methods are affected by polyploid genomes. Quantification accuracy is also affected by polyploidy, to a degree that depends on the method. These simulated datasets can be adapted for testing other analyses, such as read mapping, variant calling, and differential expression, under biologically realistic conditions.
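
The abstract does not describe the simulation strategy in detail, so the following is only a toy sketch of the general idea, not the authors' method. It assumes a subgenome B can be approximated by applying random substitutions to each transcript of a subgenome A (a real allopolyploid would combine transcriptomes from two related diploid species), and it samples error-free single-end reads whose true origin is known. All names here (`mutate`, `make_allopolyploid`, `simulate_reads`, `divergence`) are hypothetical.

```python
import random


def mutate(seq, rate, rng):
    """Return a copy of seq in which each A/C/G/T base is substituted
    by a different base with probability `rate`."""
    bases = "ACGT"
    out = []
    for b in seq:
        if b in bases and rng.random() < rate:
            out.append(rng.choice([x for x in bases if x != b]))
        else:
            out.append(b)
    return "".join(out)


def make_allopolyploid(transcripts, divergence=0.03, seed=0):
    """Build a toy allotetraploid transcriptome.

    Assumption: subgenome A is the input set unchanged, and subgenome B
    is a diverged copy (homeolog) of every transcript.
    """
    rng = random.Random(seed)
    combined = {}
    for name, seq in transcripts.items():
        combined[f"{name}_A"] = seq
        combined[f"{name}_B"] = mutate(seq, divergence, rng)
    return combined


def simulate_reads(transcriptome, counts, read_len=50, seed=0):
    """Sample error-free single-end reads.

    `counts` maps transcript id -> number of reads to draw.
    Returns a list of (read_sequence, source_transcript_id) pairs, so the
    true origin of every read is known for benchmarking.
    """
    rng = random.Random(seed)
    reads = []
    for tid, n in counts.items():
        seq = transcriptome[tid]
        if len(seq) < read_len:
            continue  # transcript too short to yield a full-length read
        for _ in range(n):
            start = rng.randrange(len(seq) - read_len + 1)
            reads.append((seq[start:start + read_len], tid))
    return reads
```

Because each simulated read carries its source transcript, the per-transcript read counts form a ground truth against which the estimates of a quantification tool, or the homeolog separation of an assembler, could be compared.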

Keywords: Polyploidy; RNAseq; Simulation; Transcript quantification; Transcriptome assembly; Whole genome duplication.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Review

MeSH terms

  • Computational Biology / methods*
  • Polyploidy*
  • RNA-Seq / methods*
  • Sequence Alignment
  • Transcriptome / genetics*