Differential allelic representation (DAR) identifies candidate eQTLs and improves transcriptome analysis

PLoS Comput Biol. 2024 Feb 12;20(2):e1011868. doi: 10.1371/journal.pcbi.1011868. eCollection 2024 Feb.

Abstract

In comparisons between mutant and wild-type genotypes, transcriptome analysis can reveal the direct impacts of a mutation, together with the homeostatic responses of the biological system. Recent studies have highlighted that, when the effects of homozygosity for recessive mutations are studied in non-isogenic backgrounds, genes located proximal to the mutation on the same chromosome often appear over-represented among those genes identified as differentially expressed (DE). One hypothesis suggests that DE genes chromosomally linked to a mutation may not reflect functional responses to the mutation but, instead, result from an unequal distribution of expression quantitative trait loci (eQTLs) between sample groups of mutant or wild-type genotypes. This is problematic because eQTL expression differences are difficult to distinguish from genes that are DE due to functional responses to a mutation. Here we show that chromosomally co-located differentially expressed genes (CC-DEGs) are also observed in analyses of dominant mutations in heterozygotes. We define a method and a metric to quantify, in RNA-sequencing data, localised differential allelic representation (DAR) between those sample groups subjected to differential expression analysis. We show how the DAR metric can predict regions prone to eQTL-driven differential expression, and how it can improve functional enrichment analyses through gene exclusion or weighting-based approaches. Advantageously, this improved ability to identify probable eQTLs also reveals examples of CC-DEGs that are likely to be functionally related to a mutant phenotype. This supports a long-standing prediction that selection for advantageous linkage disequilibrium influences chromosome evolution. By comparing the genomes of zebrafish (Danio rerio) and medaka (Oryzias latipes), a teleost with a conserved ancestral karyotype, we find possible examples of chromosomal aggregation of CC-DEGs during evolution of the zebrafish lineage. Our method for DAR analysis requires only RNA-sequencing data, facilitating its application across new and existing datasets.

MeSH terms

  • Animals
  • Gene Expression Profiling
  • Genotype
  • Quantitative Trait Loci* / genetics
  • RNA
  • Transcriptome / genetics
  • Zebrafish* / genetics

Substances

  • RNA