In many settings, researchers do not have direct access to data on 1 or more variables needed for an analysis and instead use regression-based estimates of those variables. Using such estimates in place of the original data, however, introduces complications and can result in uninterpretable analyses. Using simulations and observational data, we illustrate the issues that arise when an average treatment effect is estimated from data in which the outcome of interest is predicted from an auxiliary model. We show that bias in either direction can result, under both the null and alternative hypotheses.
Keywords: imputation; measurement error; proxy variables.
© The Author(s) 2020. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
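The mechanism summarized in the abstract can be illustrated with a minimal simulation sketch. It is not the authors' simulation design; the data-generating model, the variable names (`x`, `w`, `t`, `y`), the parameters `true_effect` and `proxy_shift`, and the function `simulate` are all assumptions introduced for illustration. An auxiliary linear model is fit to predict the outcome `y` from a noisy proxy `w` in a separate training sample, and the average treatment effect of a randomized treatment `t` is then estimated as a difference in means, once from the true outcomes and once from the predicted outcomes.

```python
import numpy as np


def simulate(true_effect, proxy_shift, n=20000, seed=0):
    """Return (ATE estimate from true outcomes, ATE estimate from predicted outcomes).

    Hypothetical setup: y depends on a latent covariate x and on treatment t;
    w is a noisy proxy of x that treatment may shift by `proxy_shift`.
    """
    rng = np.random.default_rng(seed)

    # Training sample for the auxiliary model (no treatment involved):
    # regress y on the proxy w by ordinary least squares.
    x_tr = rng.normal(size=n)
    w_tr = x_tr + rng.normal(scale=0.5, size=n)
    y_tr = x_tr + rng.normal(scale=0.5, size=n)
    slope, intercept = np.polyfit(w_tr, y_tr, 1)

    # Analysis sample with a randomized binary treatment.
    x = rng.normal(size=n)
    t = rng.integers(0, 2, size=n)
    w = x + proxy_shift * t + rng.normal(scale=0.5, size=n)
    y = x + true_effect * t + rng.normal(scale=0.5, size=n)

    # Predicted outcome from the auxiliary model, used in place of y.
    y_hat = intercept + slope * w

    def diff_in_means(v):
        return v[t == 1].mean() - v[t == 0].mean()

    return diff_in_means(y), diff_in_means(y_hat)


# Null hypothesis true (no effect on y), but treatment shifts the proxy.
null_true, null_pred = simulate(true_effect=0.0, proxy_shift=1.0)

# Alternative true (effect of 1 on y), but treatment leaves the proxy unchanged.
alt_true, alt_pred = simulate(true_effect=1.0, proxy_shift=0.0)

print(f"null:        true-outcome ATE {null_true:.2f}, predicted-outcome ATE {null_pred:.2f}")
print(f"alternative: true-outcome ATE {alt_true:.2f}, predicted-outcome ATE {alt_pred:.2f}")
```

Under these assumed variances the fitted slope is close to 0.8, so in the first scenario the predicted-outcome estimate lands near 0.8 although the true effect is 0, and in the second it lands near 0 although the true effect is 1. Setting `proxy_shift` to a negative value would push the predicted-outcome estimate in the opposite direction.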