Variance Estimation and Confidence Intervals from Genome-wide Association Studies Through High-dimensional Misspecified Mixed Model Analysis

J Stat Plan Inference. 2022 Sep:220:15-23. doi: 10.1016/j.jspi.2022.01.003. Epub 2022 Jan 25.

Abstract

We study variance estimation and the associated confidence intervals for parameters characterizing genetic effects in genome-wide association studies (GWAS) under misspecified mixed model analysis. Previous studies have shown that, despite the model misspecification, certain quantities of genetic interest are consistently estimable, and that consistent estimators of these quantities can be obtained by the restricted maximum likelihood (REML) method under a misspecified linear mixed model. However, the asymptotic variance of such a REML estimator is complicated and not readily implementable in practice. In this paper, we develop practical and computationally convenient methods for estimating such asymptotic variances and constructing the associated confidence intervals. The performance of the proposed methods is evaluated empirically through Monte Carlo simulations and a real-data application.
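To make the setting concrete, the following is a minimal simulation sketch, not the method proposed in the paper. It assumes a setup in which only `m` of `p` SNPs are causal while the fitted mixed model gives every SNP a random effect, fits that working model by REML, and forms a textbook Wald interval for heritability from the expected information of the working model via the delta method. That interval is a stand-in chosen for illustration; the abstract does not specify the proposed variance estimator, and all names (`reml_h2_wald`, `demo`) and parameter values are hypothetical.

```python
import numpy as np


def reml_h2_wald(y, Z, z_crit=1.96):
    """REML fit of y = mu + Z a / sqrt(p) + e with a Wald interval for h2.

    Assumption: this is the standard working-model Wald interval, used here
    only as an illustration; it is not the estimator developed in the paper.
    """
    n, p = Z.shape
    # Orthonormal basis A for the complement of the intercept column, so that
    # the likelihood of A'y is the restricted (REML) likelihood.
    Q, _ = np.linalg.qr(np.column_stack([np.ones(n), np.eye(n)[:, : n - 1]]))
    A = Q[:, 1:]
    K = Z @ Z.T / p  # genetic relationship matrix
    s, U = np.linalg.eigh(A.T @ K @ A)
    s = np.clip(s, 0.0, None)  # guard against tiny negative eigenvalues
    yr = U.T @ (A.T @ y)  # rotated responses with diagonal covariance
    k = n - 1

    def neg_profile(h2):
        # Profile out the residual variance for a fixed variance ratio.
        gamma = h2 / (1.0 - h2)
        d = gamma * s + 1.0
        sig_e = np.mean(yr**2 / d)
        return 0.5 * (k * np.log(sig_e) + np.sum(np.log(d)))

    # Simple grid search over h2 = sig_g / (sig_g + sig_e).
    grid = np.linspace(0.001, 0.999, 999)
    h2 = grid[np.argmin([neg_profile(h) for h in grid])]
    gamma = h2 / (1.0 - h2)
    sig_e = np.mean(yr**2 / (gamma * s + 1.0))
    sig_g = gamma * sig_e

    # Expected REML information for (sig_g, sig_e) under the working model.
    v = sig_g * s + sig_e
    info = 0.5 * np.array(
        [[np.sum(s**2 / v**2), np.sum(s / v**2)],
         [np.sum(s / v**2), np.sum(1.0 / v**2)]]
    )
    cov = np.linalg.inv(info)
    # Delta method for h2 = sig_g / (sig_g + sig_e).
    total = sig_g + sig_e
    grad = np.array([sig_e, -sig_g]) / total**2
    se = float(np.sqrt(grad @ cov @ grad))
    return float(h2), se, (float(h2 - z_crit * se), float(h2 + z_crit * se))


def demo(n=600, p=1000, m=200, h2_true=0.5, seed=0):
    """Simulate sparse causal effects and fit the all-SNP working model."""
    rng = np.random.default_rng(seed)
    maf = rng.uniform(0.05, 0.5, size=p)
    G = rng.binomial(2, maf, size=(n, p)).astype(float)
    Z = (G - G.mean(axis=0)) / G.std(axis=0)  # standardized genotypes
    causal = rng.choice(p, size=m, replace=False)  # only m SNPs are causal
    alpha = rng.normal(0.0, np.sqrt(h2_true / m), size=m)
    y = Z[:, causal] @ alpha + rng.normal(0.0, np.sqrt(1.0 - h2_true), size=n)
    return reml_h2_wald(y, Z)


if __name__ == "__main__":
    h2_hat, se_hat, ci = demo()
    print(f"h2 = {h2_hat:.3f}, SE = {se_hat:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

Repeating `demo` over many seeds and recording how often the interval covers `h2_true` gives a rough Monte Carlo check of coverage, in the spirit of the simulation evaluation mentioned above. Because the interval here relies on the working model being correct, its coverage under misspecification is not guaranteed.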

Keywords: GWAS; asymptotic approximation; confidence intervals; heritability; mis-LMM; unbiasedness; variance.