Using Summary Statistics to Model Multiplicative Combinations of Initially Analyzed Phenotypes With a Flexible Choice of Covariates

Jack M Wolf; Jason Westra; Nathan Tintle

doi:10.3389/fgene.2021.745901

Front Genet. 2021 Oct 12:12:745901. doi: 10.3389/fgene.2021.745901. eCollection 2021.

Authors

Jack M Wolf¹, Jason Westra², Nathan Tintle²,³

Affiliations

¹ Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, United States.
² Department of Mathematics, Computer Science, and Statistics, Dordt University, Sioux Center, IA, United States.
³ Department of Population Health Nursing Science, College of Nursing, University of Illinois Chicago, Chicago, IL, United States.

Abstract

While electronic medical record and biobank data hold great promise, major questions remain about patient privacy, computational hurdles, and data access. One promising recent development is pre-computing non-individually identifiable summary statistics and making them publicly available for exploration and downstream analysis. In this manuscript we demonstrate how to use pre-computed linear association statistics between individual genetic variants and phenotypes to infer genetic relationships between products of phenotypes (e.g., ratios; logical combinations of binary phenotypes using "and" and "or") with customized covariate choices. We propose a method to approximate covariate-adjusted linear models for products and logical combinations of phenotypes using only pre-computed summary statistics. We evaluate our method's accuracy through several simulation studies and an application modeling ratios of fatty acids using data from the Framingham Heart Study. These studies show that the method consistently recapitulates the results of analyses performed on individual-level data, including maintenance of the Type I error rate, power, and effect size estimates. The proposed method is implemented in the publicly available R package pcsstools.
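As a rough illustration of why summary statistics can stand in for individual-level data, the Python sketch below recovers ordinary least squares estimates for a covariate-adjusted linear model from sample means, a sample covariance matrix, and the sample size alone. It is not the pcsstools implementation and does not cover the product or logical-combination approximations described above. The function name `ols_from_summary`, the simulated variables (`g`, `age`, `y`), and all parameter values are hypothetical and chosen only for the demonstration.

```python
import numpy as np


def ols_from_summary(means, cov, n, response_index):
    """Recover OLS estimates from summary statistics only (hypothetical helper).

    means: sample means of all variables (predictors and response)
    cov: sample covariance matrix of the same variables (n - 1 denominator)
    n: sample size
    response_index: position of the response in `means` and `cov`
    """
    means = np.asarray(means, dtype=float)
    cov = np.asarray(cov, dtype=float)
    pred = [i for i in range(len(means)) if i != response_index]

    s_xx = cov[np.ix_(pred, pred)]           # predictor covariance block
    s_xy = cov[pred, response_index]         # predictor-response covariances
    s_yy = cov[response_index, response_index]

    # Slopes solve S_xx b = S_xy; the intercept follows from the means.
    slopes = np.linalg.solve(s_xx, s_xy)
    intercept = means[response_index] - means[pred] @ slopes

    # Residual variance and slope standard errors, again from summaries only.
    rss = (n - 1) * (s_yy - s_xy @ slopes)
    sigma2 = rss / (n - len(pred) - 1)
    slope_se = np.sqrt(sigma2 * np.diag(np.linalg.inv(s_xx)) / (n - 1))
    return intercept, slopes, slope_se


# Simulated individual-level data (assumed values, for illustration only).
rng = np.random.default_rng(0)
n = 500
g = rng.binomial(2, 0.3, size=n).astype(float)   # genotype coded 0/1/2
age = rng.normal(50, 10, size=n)                 # a covariate
y = 1.0 + 0.5 * g + 0.02 * age + rng.normal(size=n)

# The only quantities that would need to be shared.
data = np.column_stack([g, age, y])
means = data.mean(axis=0)
cov = np.cov(data, rowvar=False)

intercept, slopes, slope_se = ols_from_summary(means, cov, n, response_index=2)

# Reference fit on the individual-level data for comparison.
design = np.column_stack([np.ones(n), g, age])
beta_full, *_ = np.linalg.lstsq(design, y, rcond=None)

print("summary-based:   ", np.concatenate([[intercept], slopes]))
print("individual-level:", beta_full)
```

The two printed coefficient vectors should agree up to floating-point error, because the least squares solution depends on the data only through these first and second moments. Extending the idea to products of phenotypes requires approximating moments that are not directly available, which is the problem the paper addresses.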

Keywords: covariate adjustment; linear models; multiplication; phenotype; summary statistics.

Publication types

Review

Grants and funding

R15 HG006915/HG/NHGRI NIH HHS/United States