In this article, we first review the literature on dealing with missing values on a covariate in randomized studies and summarize what has been done and what is lacking to date. We then investigate the situation with a continuous outcome and a missing binary covariate in more detail through simulations, comparing the performance of multiple imputation (MI) with various simple alternative methods. Finally, we extend this to the case of a time-to-event outcome. The simulations consider five different missingness scenarios: missing completely at random (MCAR), missing at random (MAR) with missingness depending only on the treatment, and missing not at random (MNAR) with missingness depending on the covariate itself (MNAR1), on both the treatment and the covariate (MNAR2), or on the treatment, the covariate, and their interaction (MNAR3). Here, we distinguish two different cases: (1) when the covariate is measured before randomization (best practice), where only MCAR and MNAR1 are plausible, and (2) when it is measured after randomization but before treatment (which sometimes occurs in nonpharmaceutical research), where the other three missingness mechanisms can also occur. The proposed methods are compared based on the treatment effect estimate and its standard error. The simulation results suggest that the patterns of results are very similar for all missingness scenarios in case (1), and also in case (2) except for MNAR3. Furthermore, in each scenario for a continuous outcome, there is at least one simple method that performs at least as well as MI, while for a time-to-event outcome MI performs best.
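As a rough illustration of how the five missingness scenarios differ, the sketch below models the probability that the binary covariate is missing with a logistic function of treatment, covariate, and their interaction. The logistic form, the coefficient values, the 1:1 randomization, the 50% covariate prevalence, and the names `MECHANISMS`, `missing_probability`, and `simulate` are all illustrative assumptions, not the settings used in the article.

```python
import math
import random

# Hypothetical logistic coefficients per mechanism:
# (intercept, treatment, covariate, treatment-by-covariate interaction).
# Illustrative values only, not those used in the article.
MECHANISMS = {
    "MCAR":  (-1.0, 0.0, 0.0, 0.0),  # depends on nothing
    "MAR":   (-1.0, 1.0, 0.0, 0.0),  # depends on treatment only
    "MNAR1": (-1.0, 0.0, 1.0, 0.0),  # depends on the covariate itself
    "MNAR2": (-1.0, 1.0, 1.0, 0.0),  # depends on treatment and covariate
    "MNAR3": (-1.0, 1.0, 1.0, 1.0),  # adds their interaction
}


def missing_probability(mechanism, treatment, covariate):
    """Probability that the covariate is missing for one participant."""
    b0, bt, bx, btx = MECHANISMS[mechanism]
    eta = b0 + bt * treatment + bx * covariate + btx * treatment * covariate
    return 1.0 / (1.0 + math.exp(-eta))


def simulate(mechanism, n=1000, seed=0):
    """Return (treatment, true covariate, observed covariate) tuples.

    The observed covariate is None when it is missing.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        treatment = rng.randint(0, 1)   # assumed 1:1 randomization
        covariate = rng.randint(0, 1)   # assumed 50% prevalence
        p = missing_probability(mechanism, treatment, covariate)
        observed = None if rng.random() < p else covariate
        rows.append((treatment, covariate, observed))
    return rows
```

Calling `simulate("MNAR3")`, for example, yields a data set in which missingness depends on the treatment arm, the covariate value, and their combination; the same data-generating step could then feed any of the compared analysis methods.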
Keywords: mean imputation; missing covariate; missing-indicator method; multiple imputation; randomized studies; review.
© 2020 The Authors. Pharmaceutical Statistics published by John Wiley & Sons Ltd.