rvTWAS: identifying gene-trait association using sequences by utilizing transcriptome-directed feature selection

Genetics. 2024 Feb 7;226(2):iyad204. doi: 10.1093/genetics/iyad204.

Abstract

Transcriptome-wide association studies (TWAS) have successfully integrated transcriptome data to help identify the genetic basis of complex traits. However, TWAS is applicable only to common variants, excluding the rare variants found in exome or whole-genome sequences. This is partly due to an inherent limitation of TWAS protocols, which rely on predicting gene expression. Our previous research revealed a key insight into TWAS: its 2 steps, building and applying expression prediction models, are essentially genetic feature selection and feature aggregation, neither of which has to involve prediction. With TWAS disentangled in this way, the inability of rare variants to predict expression traits is no longer an obstacle. Herein, we developed "rare variant TWAS," or rvTWAS, which first uses a Bayesian model to conduct expression-directed feature selection and then uses a kernel machine to carry out feature aggregation, forming a model that leverages expression for association mapping that includes rare variants. We demonstrated the performance of rvTWAS through thorough simulations and real-data analysis of 3 psychiatric disorders: schizophrenia, bipolar disorder, and autism spectrum disorder. We confirmed that rvTWAS outperforms existing TWAS protocols and revealed additional genes underlying psychiatric disorders. In particular, we formed a hypothetical mechanism in which zinc finger genes impact all 3 disorders through transcriptional regulation. rvTWAS will open a door to sequence-based association mapping that integrates gene expression.
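The two-step structure described above (expression-directed feature selection, then kernel-based feature aggregation) can be illustrated with a minimal sketch. This is not the rvTWAS implementation. Step 1 here substitutes a simple absolute marginal correlation with expression for the Bayesian model. Step 2 uses a weighted linear kernel with a permutation p-value in place of the kernel machine's own test. The function names `expression_directed_weights` and `kernel_score_test` and all parameters are hypothetical.

```python
import numpy as np

def expression_directed_weights(G_ref, expr):
    """Step 1 (stand-in): weight each variant by |correlation| with expression.

    Assumption: marginal correlation replaces the Bayesian feature selection.
    G_ref: (n_ref, m) genotype matrix; expr: (n_ref,) expression of one gene.
    """
    Gc = G_ref - G_ref.mean(axis=0)
    ec = expr - expr.mean()
    denom = np.sqrt((Gc ** 2).sum(axis=0) * (ec ** 2).sum())
    # Monomorphic variants (zero variance) get weight 0 instead of NaN.
    corr = np.divide(Gc.T @ ec, denom,
                     out=np.zeros(G_ref.shape[1]), where=denom > 0)
    return np.abs(corr)

def kernel_score_test(G, y, weights, n_perm=999, seed=0):
    """Step 2 (stand-in): aggregate variants with a weighted linear kernel.

    Statistic: Q = r^T K r with K = Gc diag(weights) Gc^T and r = y - mean(y).
    Assumption: the p-value comes from permuting the phenotype residuals.
    """
    rng = np.random.default_rng(seed)
    Gw = (G - G.mean(axis=0)) * np.sqrt(weights)  # column-scaled genotypes
    r = y - y.mean()

    def q_stat(res):
        s = Gw.T @ res  # weighted per-variant scores
        return float(s @ s)  # equals res^T K res

    q_obs = q_stat(r)
    q_perm = np.array([q_stat(rng.permutation(r)) for _ in range(n_perm)])
    p_value = (1 + np.sum(q_perm >= q_obs)) / (n_perm + 1)
    return q_obs, p_value
```

In this sketch the weights are learned in a reference panel that has both genotypes and expression, then applied to a separate genotype-phenotype cohort. No expression value is predicted at any point, which is why weak individual prediction by rare variants does not block the test.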

Keywords: Bayesian feature selection; gene–trait association mapping; kernel-based feature aggregation; rare genetic variants; transcriptome-wide association study.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Autism Spectrum Disorder* / genetics
  • Bayes Theorem
  • Genetic Predisposition to Disease
  • Genome-Wide Association Study / methods
  • Humans
  • Phenotype
  • Polymorphism, Single Nucleotide
  • Quantitative Trait Loci
  • Transcriptome*