Quantile Regression for biomarkers in the UK Biobank

bioRxiv [Preprint]. 2023 Jun 7:2023.06.05.543699. doi: 10.1101/2023.06.05.543699.

Abstract

Genome-wide association studies (GWAS) for biomarkers important for clinical phenotypes can lead to clinically relevant discoveries. GWAS for quantitative traits are based on simplified regression models that describe the conditional mean of a phenotype as a linear function of genotype. An alternative, easy-to-apply approach is quantile regression, which naturally extends linear regression to the analysis of the entire conditional distribution of a phenotype of interest by modeling conditional quantiles within a regression framework. Quantile regression can be applied efficiently at biobank scale using standard statistical packages in much the same way as linear regression, and it has several unique advantages: it identifies variants with heterogeneous effects across different quantiles, including non-additive effects and variants involved in gene-environment interactions; it accommodates a wide range of phenotype distributions and is invariant to trait transformation; and overall it provides more detailed information about the underlying genotype-phenotype associations. Here, we demonstrate the value of quantile regression in the context of GWAS by applying it to 39 quantitative traits in the UK Biobank (n>300,000 individuals). Across these 39 traits we identify 7,297 significant loci, including 259 loci detected only by quantile regression. We show that quantile regression can help uncover replicable but unmodeled gene-environment interactions, and can provide additional key insights into poorly understood genotype-phenotype correlations for clinically relevant biomarkers at minimal additional cost.
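
To make the contrast between mean regression and quantile regression concrete, the following is a minimal sketch and not the authors' pipeline. It assumes a single variant coded 0/1/2 with no covariates and uses a small synthetic dataset in which the variant changes only the spread of the phenotype. The function names (`pinball_loss`, `fit_quantile_line`, `ols_slope`) and the toy data are illustrative assumptions. The fit enumerates lines through pairs of data points, which is exact for one predictor but feasible only for tiny samples; a biobank-scale analysis would rely on a standard statistical package instead.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss: tau * r if r >= 0, else (tau - 1) * r."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def fit_quantile_line(x, y, tau):
    """Exact quantile regression of y on one predictor x (toy-sized data only).

    With an intercept and one slope, a minimizer of the total check loss
    exists among the lines through two data points with distinct x values,
    so a brute-force search over such pairs finds it.
    """
    best = None
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            if x[i] == x[j]:
                continue  # same genotype: the pair does not define a line
            slope = (y[j] - y[i]) / (x[j] - x[i])
            intercept = y[i] - slope * x[i]
            loss = sum(
                pinball_loss(yk - intercept - slope * xk, tau)
                for xk, yk in zip(x, y)
            )
            if best is None or loss < best[0]:
                best = (loss, intercept, slope)
    return best[1], best[2]


def ols_slope(x, y):
    """Ordinary least-squares slope, i.e. the conditional-mean effect."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Synthetic toy data (an assumption for illustration, not UK Biobank data):
# within each genotype group the phenotype is a symmetric grid whose spread
# grows with the allele count, so the mean stays fixed while the tails move.
genotype, phenotype = [], []
for g in (0, 1, 2):
    for k in range(21):
        z = (k - 10) / 10          # symmetric grid from -1.0 to 1.0
        genotype.append(g)
        phenotype.append((1 + g) * z)

mean_slope = ols_slope(genotype, phenotype)
quantile_slopes = {
    tau: fit_quantile_line(genotype, phenotype, tau)[1]
    for tau in (0.1, 0.5, 0.9)
}

print(f"mean-regression slope: {mean_slope:.3f}")
for tau, slope in quantile_slopes.items():
    print(f"quantile {tau}: slope {slope:.3f}")
```

By construction of the toy data, the mean and median slopes are zero while the slopes at the 0.1 and 0.9 quantiles have opposite signs. This is the kind of heterogeneous effect across quantiles that a conditional-mean model cannot detect.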

Publication types

  • Preprint