Tree-based scan statistics to generate drug repurposing hypotheses: a test case using sodium-glucose cotransporter-2 inhibitors

Am J Epidemiol. 2024 Sep 11:kwae355. doi: 10.1093/aje/kwae355. Online ahead of print.

Abstract

Most drug repurposing studies using real-world data have focused on validating, rather than generating, hypotheses. We used tree-based scan statistics to generate repurposing hypotheses for sodium-glucose cotransporter-2 inhibitors (SGLT2i). We used an active-comparator, new-user design to create a 1:1 propensity-score matched cohort of initiators of SGLT2i and dipeptidyl peptidase-4 inhibitors (DPP4i) in the Merative™ MarketScan® Research Databases. Tree-based scan statistics were estimated across an ICD-10-CM-based hierarchical outcome tree using incident outcomes identified from hospital and outpatient diagnoses. We used an adjusted P≤0.01 as the threshold for statistical alert to prioritize associations for evaluation as repurposing signals. We varied the analyses by tree size, scanning level, and the clinical settings from which outcomes were identified. There were 80,510 matched SGLT2i-DPP4i initiator pairs with 215,333 outcomes among SGLT2i initiators and 223,428 outcomes among DPP4i initiators. There were 18 prioritized associations, which included chronic kidney disease (P=0.0001), an expected signal, and anemia (P=0.0001). Heart failure (P=0.0167), another expected signal, fell just outside the statistical alert threshold. Narrowing the outcome tree, scanning at different tree levels, and including outcomes from different clinical settings influenced the scan statistics. We identified signals aligning with recently approved indications of SGLT2i, plus potential repurposing signals supported by existing evidence but requiring future validation.
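
To make the scanning step concrete, the following is a minimal sketch of a tree-based scan statistic, not the authors' implementation. The abstract does not state the probability model or the scan direction, so the sketch assumes an unconditional Bernoulli model with null probability p = 0.5 (a natural choice under 1:1 matching) and scans for a deficit of outcomes in the exposed group. The tree, node labels, counts, and all function names are hypothetical and chosen only for illustration.

```python
import math
import random


def bernoulli_llr(c, n, p=0.5, direction="decrease"):
    """Log-likelihood ratio for one node: c exposed-group outcomes out of n total.

    Assumed model: unconditional Bernoulli with null probability p.
    Returns 0 when the observed proportion is not in the scanned direction.
    """
    if n == 0:
        return 0.0
    rate = c / n
    if direction == "decrease" and rate >= p:
        return 0.0
    if direction == "increase" and rate <= p:
        return 0.0

    def xlogy(x, y):
        # Convention 0 * log(0) = 0, so empty cells contribute nothing.
        return 0.0 if x == 0 else x * math.log(y)

    alternative = xlogy(c, rate) + xlogy(n - c, 1 - rate)
    null = c * math.log(p) + (n - c) * math.log(1 - p)
    return alternative - null


def node_counts(leaf_counts, ancestors):
    """Aggregate (exposed, total) outcome counts from leaves up to every ancestor."""
    totals = {}
    for leaf, (c, n) in leaf_counts.items():
        for node in ancestors[leaf]:
            tc, tn = totals.get(node, (0, 0))
            totals[node] = (tc + c, tn + n)
    return totals


def scan(leaf_counts, ancestors, p=0.5, direction="decrease"):
    """Return the node with the largest LLR, that LLR, and all node LLRs."""
    totals = node_counts(leaf_counts, ancestors)
    llrs = {node: bernoulli_llr(c, n, p, direction) for node, (c, n) in totals.items()}
    best = max(llrs, key=llrs.get)
    return best, llrs[best], llrs


def monte_carlo_p(leaf_counts, ancestors, p=0.5, direction="decrease",
                  replicates=999, seed=0):
    """Adjusted P-value for the maximum LLR via Monte Carlo simulation.

    Under the null, each outcome belongs to the exposed group with probability p.
    Comparing against the maximum over all nodes adjusts for multiple testing.
    """
    rng = random.Random(seed)
    _, observed, _ = scan(leaf_counts, ancestors, p, direction)
    at_least_as_extreme = 1  # the observed data set counts as one replicate
    for _ in range(replicates):
        simulated = {}
        for leaf, (_exposed, total) in leaf_counts.items():
            sim_exposed = sum(rng.random() < p for _k in range(total))
            simulated[leaf] = (sim_exposed, total)
        _, t, _ = scan(simulated, ancestors, p, direction)
        if t >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / (replicates + 1)


# Hypothetical three-leaf tree; each leaf lists itself and its ancestors.
ANCESTORS = {
    "A1": ["A1", "A", "root"],
    "A2": ["A2", "A", "root"],
    "B1": ["B1", "B", "root"],
}
# Invented counts: (outcomes among exposed, outcomes in both groups combined).
LEAF_COUNTS = {"A1": (30, 100), "A2": (45, 100), "B1": (50, 100)}

best_node, best_llr, _ = scan(LEAF_COUNTS, ANCESTORS)
p_value = monte_carlo_p(LEAF_COUNTS, ANCESTORS)
print(best_node, round(best_llr, 3), p_value)
```

In this sketch the adjusted P-value is the rank of the observed maximum among the simulated maxima, which is why a threshold such as P≤0.01 can be applied to a node without a separate multiple-comparison correction. Restricting `ANCESTORS` to a subtree or to particular levels corresponds loosely to the tree-size and scanning-level variations described above.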

Keywords: Drug repurposing; TreeScan; data-mining; drug repositioning; pharmacoepidemiology; real-world data; tree-based scan statistics.