A multiple imputation approach for MNAR mechanisms compatible with Heckman's model

Stat Med. 2016 Jul 30;35(17):2907-20. doi: 10.1002/sim.6902. Epub 2016 Feb 18.

Abstract

Standard implementations of multiple imputation (MI) provide unbiased inferences under the assumption that the underlying missing-data mechanism is missing at random (MAR). When data are instead missing not at random (MNAR), standard MI can yield biased inferences. Originating in econometrics, Heckman's model, also called the sample selection method, deals with selected samples using two linked linear equations, termed the selection equation and the outcome equation. It has been successfully applied to MNAR outcomes. Nevertheless, the method only addresses missing outcomes, which is a strong limitation in clinical epidemiology settings, where covariates are also often missing. We propose to extend the validity of MI to some MNAR mechanisms by using Heckman's model as the imputation model together with a two-step estimation process. This approach provides a solution that can be used in an MI by chained equations framework to impute missing data (outcomes or covariates) resulting from either a MAR or an MNAR mechanism, provided the MNAR mechanism is compatible with Heckman's model. The approach is illustrated using a real dataset from a randomised trial in patients with seasonal influenza. Copyright © 2016 John Wiley & Sons, Ltd.
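
For readers unfamiliar with the two equations mentioned above, the textbook form of Heckman's sample selection model can be sketched as follows. The notation (x_i, w_i, beta, gamma, sigma, rho, and the response indicator r_i) is assumed here for illustration and is not taken from the paper, whose exact specification may differ.

```latex
% Textbook Heckman selection model; all symbols are illustrative assumptions.
\begin{align*}
\text{Outcome equation:}\quad   & y_i = x_i^\top \beta + \varepsilon_i, \\
\text{Selection equation:}\quad & r_i^* = w_i^\top \gamma + \eta_i, \qquad
                                  r_i = \mathbf{1}\{ r_i^* > 0 \}, \\
\text{Joint errors:}\quad       & \begin{pmatrix} \varepsilon_i \\ \eta_i \end{pmatrix}
                                  \sim \mathcal{N}\!\left(
                                  \begin{pmatrix} 0 \\ 0 \end{pmatrix},
                                  \begin{pmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{pmatrix}
                                  \right), \\
\text{Observed cases:}\quad     & E[\, y_i \mid r_i = 1 \,] = x_i^\top \beta
                                  + \rho\sigma \, \frac{\phi(w_i^\top \gamma)}{\Phi(w_i^\top \gamma)}, \\
\text{Missing cases:}\quad      & E[\, y_i \mid r_i = 0 \,] = x_i^\top \beta
                                  - \rho\sigma \, \frac{\phi(w_i^\top \gamma)}{1 - \Phi(w_i^\top \gamma)}.
\end{align*}
```

Here y_i is observed only when r_i = 1, and phi and Phi denote the standard normal density and distribution function. In this formulation, rho = 0 means that missingness is unrelated to the unobserved value once the covariates are accounted for, which corresponds to a MAR mechanism, while rho different from 0 gives an MNAR mechanism. In the classical two-step estimator, gamma is first estimated by a probit regression of r_i, and the outcome equation is then fitted on the observed cases with the estimated ratio phi/Phi (the inverse Mills ratio) added as a regressor.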

Keywords: Heckman's model; missing data; missing not at random (MNAR); multiple imputation; chained equation; sample selection.

MeSH terms

  • Data Accuracy*
  • Data Interpretation, Statistical*
  • Humans
  • Influenza, Human / drug therapy
  • Models, Statistical
  • Randomized Controlled Trials as Topic*
  • Research Design