Logistic regression with correlated measurement error and misclassification in covariates

Stat Methods Med Res. 2023 Apr;32(4):789-805. doi: 10.1177/09622802231154324. Epub 2023 Feb 15.

Abstract

In many areas of research, such as nutritional epidemiology, modeling may involve both measurement error in continuous covariates and misclassification of categorical variables. It is well known that ignoring measurement error or misclassification can lead to biased results. However, most research has addressed these two problems separately, and handling both simultaneously in a single analysis has received less attention. In this article, we propose a new correction method for logistic regression that simultaneously handles correlated measurement errors in multivariate continuous covariates and misclassification in a categorical variable. The method is not computationally intensive because a closed form of the approximate likelihood function, conditional on the observed covariates, is derived. The asymptotic normality of the proposed estimator is established under regularity conditions, and its finite-sample performance is examined empirically in simulation studies. We apply the new estimation method to handle measurement error in some nutrients of interest and misclassification of a categorical physical activity variable in the European Prospective Investigation into Cancer and Nutrition-InterAct Study data. The analyses show that fruit intake is negatively associated with type 2 diabetes among physically active women, that protein intake is positively associated with type 2 diabetes in the less physically active group, and that actual physical activity has a greater effect on reducing the risk of type 2 diabetes than observed physical activity.
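The bias from ignoring covariate error can be illustrated with a small simulation. The sketch below is not the correction method proposed in the article. It only compares a logistic fit on the true covariates with a naive fit on error-prone ones. Everything in it is a hypothetical choice made for illustration: the coefficient values, the covariance matrices, the 20% misclassification rate, and the function names `fit_logistic` and `simulate`.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Fit a logistic regression by Newton-Raphson.

    X must already contain an intercept column.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        weights = p * (1.0 - p)
        gradient = X.T @ (y - p)
        hessian = X.T @ (X * weights[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta


def simulate(n=20000, seed=0):
    """Return (true, oracle, naive) coefficient vectors.

    All numeric settings are illustrative assumptions, not values
    taken from the article.
    """
    rng = np.random.default_rng(seed)

    # True covariates: two correlated continuous ones and one binary one.
    x = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.3], [0.3, 1.0]], size=n)
    z = rng.binomial(1, 0.5, size=n)

    # Outcome generated from the true covariates.
    beta_true = np.array([-1.0, 1.0, -0.5, -1.0])  # intercept, x1, x2, z
    eta = beta_true[0] + x @ beta_true[1:3] + beta_true[3] * z
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))

    # Observed continuous covariates: additive errors that are correlated.
    u = rng.multivariate_normal([0.0, 0.0], [[0.5, 0.25], [0.25, 0.5]], size=n)
    w = x + u

    # Observed binary covariate: each label flipped with probability 0.2.
    flip = rng.binomial(1, 0.2, size=n)
    z_obs = np.where(flip == 1, 1 - z, z)

    ones = np.ones(n)
    oracle = fit_logistic(np.column_stack([ones, x, z]), y)
    naive = fit_logistic(np.column_stack([ones, w, z_obs]), y)
    return beta_true, oracle, naive


if __name__ == "__main__":
    beta_true, oracle, naive = simulate()
    print("true  :", np.round(beta_true, 3))
    print("oracle:", np.round(oracle, 3))
    print("naive :", np.round(naive, 3))
```

Under these assumed settings the naive coefficients are typically pulled toward zero relative to the fit on the true covariates, which is the kind of bias a joint correction for measurement error and misclassification aims to remove.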

Keywords: Approximate likelihood estimation; correlated measurement error; logistic regression; misclassification; nutritional epidemiology.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bias
  • Computer Simulation
  • Diabetes Mellitus, Type 2*
  • Female
  • Humans
  • Likelihood Functions
  • Logistic Models
  • Prospective Studies