Partitioning gene-based variance of complex traits by gene score regression

PLoS One. 2020 Aug 20;15(8):e0237657. doi: 10.1371/journal.pone.0237657. eCollection 2020.

Abstract

Most loci identified by genome-wide association studies (GWAS) are not annotated to known genes in the human genome, which makes biological interpretation difficult. Transcriptome-wide association studies (TWAS) associate complex traits with genotype-based predictions of gene expression derived from expression quantitative trait loci (eQTL) studies, thus improving the interpretability of GWAS findings. However, these results can suffer from a high false positive rate, because the predicted expression of different genes may be highly correlated due to linkage disequilibrium between eQTL. We propose a novel statistical method, Gene Score Regression (GSR), to detect causal gene sets for complex traits while accounting for gene-to-gene correlations. We reason that non-causal genes that are highly correlated with causal genes will also exhibit a high marginal association with the complex trait. Consequently, by regressing the marginal associations of complex traits on the sum of the gene-to-gene correlations in each gene set, we can assess the amount of variance of the complex traits explained by the predicted expression of the genes in each gene set and identify plausible causal gene sets. GSR can operate on either GWAS summary statistics or observed gene expression. Therefore, it may be widely applied to annotate GWAS results and identify the underlying biological pathways. We demonstrate the high accuracy and computational efficiency of GSR compared to state-of-the-art methods through simulations and real data applications. GSR is openly available at https://github.com/li-lab-mcgill/GSR.
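
The regression idea described above can be illustrated with a small sketch. This is not the authors' implementation (see the GSR repository for that); it assumes, by analogy with LD score regression, that each gene's squared marginal z-score is modeled as an intercept plus the sample size times a weighted sum of per-set gene scores, where a gene score is the sum of squared correlations between that gene and the genes in the set. The function names, the use of squared correlations, the toy data, and the ordinary least-squares fit are all illustrative assumptions.

```python
import numpy as np


def gene_scores(corr, membership):
    """Per-gene, per-set scores (hypothetical helper).

    corr:       (G, G) gene-to-gene correlation matrix of predicted expression.
    membership: (G, C) 0/1 matrix, membership[j, c] = 1 if gene j is in set c.
    Returns a (G, C) matrix whose entry [i, c] is the sum over genes j in
    set c of corr[i, j] ** 2 (squared correlations are an assumption here).
    """
    return (corr ** 2) @ membership


def fit_gene_score_regression(z, scores, n_samples):
    """Ordinary least squares of z**2 on [1, N * scores] (hypothetical helper).

    Returns the intercept and one coefficient per gene set; a larger
    coefficient suggests the set explains more trait variance per gene.
    """
    chi2 = z ** 2
    design = np.column_stack([np.ones(len(z)), n_samples * scores])
    coef, *_ = np.linalg.lstsq(design, chi2, rcond=None)
    return coef[0], coef[1:]


# Toy data: 8 genes, two gene sets (A = genes 0-2, B = genes 3-5),
# genes 6 and 7 belong to neither set.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))   # stand-in for predicted expression
X[:, 1] += 0.8 * X[:, 0]        # correlated genes within set A
X[:, 6] += 0.8 * X[:, 0]        # gene outside set A but correlated with it
corr = np.corrcoef(X, rowvar=False)

membership = np.zeros((8, 2))
membership[0:3, 0] = 1
membership[3:6, 1] = 1

scores = gene_scores(corr, membership)

# Noise-free z-scores generated from the assumed model, with only set A causal.
n_samples = 1000
tau_true = np.array([2e-3, 0.0])
z = np.sqrt(1.0 + n_samples * scores @ tau_true)

intercept, tau_hat = fit_gene_score_regression(z, scores, n_samples)
print("intercept:", intercept)
print("per-set coefficients:", tau_hat)
```

In this toy setup gene 6 is not in set A but has a high score for it, so its marginal statistic is inflated even though it is not causal; the regression attributes that signal to set A rather than to the gene itself. Real data would add sampling noise, so weighting and standard errors (not shown) would matter.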

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Gene Expression Regulation / genetics
  • Genetic Predisposition to Disease
  • Genome, Human / genetics
  • Genome-Wide Association Study / statistics & numerical data*
  • Genotype
  • Humans
  • Linkage Disequilibrium
  • Models, Genetic
  • Molecular Sequence Annotation*
  • Multifactorial Inheritance / genetics*
  • Polymorphism, Single Nucleotide / genetics
  • Quantitative Trait Loci
  • Transcriptome / genetics*

Grants and funding

The research is supported by Canada First Research Excellence Fund (CFREF) Healthy Brains, Healthy Life (HBHL) New Investigator fund (249591) at McGill University and Montreal Neurologic Institute (MNI) and NSERC Discovery Grant (RGPIN-2019-0621). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. No author received a salary from any of the funders.