A doubly robust estimator for the Mann Whitney Wilcoxon rank sum test when applied for causal inference in observational studies

J Appl Stat. 2024 May 15;51(16):3267-3291. doi: 10.1080/02664763.2024.2346357. eCollection 2024.

Abstract

The Mann-Whitney-Wilcoxon rank sum test (MWWRST) is a widely used method for comparing two treatment groups in randomized controlled trials, particularly when dealing with highly skewed data. However, when applied to observational study data, the MWWRST often yields invalid results for causal inference. To address this limitation, Wu et al. (Causal inference for Mann-Whitney-Wilcoxon rank sum and other nonparametric statistics, Stat. Med. 33 (2014), pp. 1261-1271) introduced an approach that incorporates inverse probability weighting (IPW) into this rank-based statistic to mitigate confounding effects. Subsequently, Mao (On causal estimation using U-statistics, Biometrika 105 (2018), pp. 215-220), Zhang et al. (Estimating Mann Whitney-type causal effects, J. Causal Inference 7 (2019), ARTICLE ID 20180010), and Ai et al. (A Mann-Whitney test of distributional effects in a multivalued treatment, J. Stat. Plan. Inference 209 (2020), pp. 85-100) extended this IPW estimator to develop doubly robust estimators. Nevertheless, each of these approaches has notable limitations. Mao's method imposes stringent assumptions that may not align with real-world study data. Zhang et al.'s estimators rely on bootstrap inference, which is computationally inefficient and lacks known asymptotic properties. Meanwhile, Ai et al. primarily focus on testing the null hypothesis of equal distributions between two groups, a more stringent hypothesis that may not be well suited to the primary practical application of the MWWRST. In this paper, we aim to address these limitations by leveraging functional response models (FRM) to develop doubly robust estimators. We demonstrate the performance of our proposed approach using both simulated and real study data.

Keywords: Functional response models; U-statistics generalized estimating equations; inverse probability weighting; mean rank; mean score imputation; outcome regression.
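
As an illustration of the IPW idea mentioned in the abstract, the following minimal sketch weights each treated-control pair in the Mann-Whitney kernel by inverse propensity scores. It is not the doubly robust FRM estimator proposed in the paper, nor necessarily the exact estimator of Wu et al.; it is a generic normalized (Hajek-type) weighted version of the statistic. The function name `ipw_mann_whitney`, the target quantity P(Y(1) > Y(0)) + 0.5 P(Y(1) = Y(0)), and the assumption that propensity scores `ps` are already available (known or estimated elsewhere) are all choices made for this example.

```python
import numpy as np


def ipw_mann_whitney(y, z, ps):
    """Illustrative IPW-weighted Mann-Whitney estimate (hypothetical helper).

    y  : outcomes
    z  : treatment indicator (1 = treated, 0 = control)
    ps : propensity scores P(Z = 1 | X), assumed given and strictly in (0, 1)

    Returns a normalized weighted estimate of
    P(Y(1) > Y(0)) + 0.5 * P(Y(1) = Y(0)).
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    ps = np.asarray(ps, dtype=float)

    # Inverse probability weights; each is zero for units in the other arm.
    w1 = z / ps
    w0 = (1.0 - z) / (1.0 - ps)

    # Pairwise kernel h(y_i, y_j) = 1{y_i > y_j} + 0.5 * 1{y_i == y_j}.
    diff = y[:, None] - y[None, :]
    kernel = (diff > 0) + 0.5 * (diff == 0)

    # Weight for the ordered pair (i treated, j control); a unit paired
    # with itself gets weight zero because z * (1 - z) = 0.
    pair_w = w1[:, None] * w0[None, :]

    return float((pair_w * kernel).sum() / pair_w.sum())
```

With constant propensity scores the weights cancel and the estimate reduces to the usual unweighted Mann-Whitney proportion U / (n1 * n0), which is a convenient sanity check. Variance estimation, propensity score modeling, and the outcome-regression component needed for double robustness are deliberately left out of this sketch.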