Purpose: In claims data, dichotomous covariates (C) are often assumed to be absent unless a claim for the condition is observed. When the amount of available historical data differs among subjects, investigators must choose between assessing C using all available historical data and assessing it using data from a fixed window. Our purpose was to compare estimation under these two approaches.
Methods: We simulated cohorts of 20,000 subjects with dichotomous variables representing exposure (E), outcome (D), and a single time-invariant C, and with varying availability of historical data. C was operationally defined under each paradigm and used to estimate the C-adjusted risk ratio for the effect of E on D via Mantel-Haenszel methods.
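As an illustration only, the comparison described in the Methods can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the simulation design used in the study: the prevalence of C, the exposure and outcome risks, the true risk ratio of 2.0, the 1 to 10 years of available history, the 25% yearly chance that a claim for C is recorded, and the names `mantel_haenszel_rr` and `simulate_once` are all hypothetical choices made for the example. The fixed window is taken to be the most recent year, the only look-back period shared by every simulated subject.

```python
import random


def mantel_haenszel_rr(exposure, outcome, stratum):
    """Mantel-Haenszel summary risk ratio of exposure on outcome across strata."""
    numerator = 0.0
    denominator = 0.0
    for level in set(stratum):
        rows = [i for i, s in enumerate(stratum) if s == level]
        total = len(rows)
        n_exposed = sum(exposure[i] for i in rows)
        n_unexposed = total - n_exposed
        exposed_cases = sum(1 for i in rows if exposure[i] and outcome[i])
        unexposed_cases = sum(1 for i in rows if not exposure[i] and outcome[i])
        numerator += exposed_cases * n_unexposed / total
        denominator += unexposed_cases * n_exposed / total
    return numerator / denominator


def simulate_once(n=20000, seed=1):
    """Simulate one cohort and adjust for C under each operational definition."""
    rng = random.Random(seed)
    E, D, C_true, C_all, C_fixed = [], [], [], [], []
    for _ in range(n):
        # All probabilities below are hypothetical, chosen only for illustration.
        c = rng.random() < 0.30                    # true time-invariant covariate
        e = rng.random() < (0.50 if c else 0.20)   # C raises the chance of exposure
        risk = 0.05 * (2.0 if e else 1.0) * (3.0 if c else 1.0)  # true RR of E is 2.0
        d = rng.random() < risk
        years = rng.randint(1, 10)                 # available history varies by subject
        # claims[0] is the most recent year; a claim can appear only if C is present.
        claims = [c and rng.random() < 0.25 for _ in range(years)]
        E.append(int(e))
        D.append(int(d))
        C_true.append(int(c))
        C_all.append(int(any(claims)))             # all-available look-back
        C_fixed.append(int(claims[0]))             # fixed window: most recent year only
    return {
        "true_C": mantel_haenszel_rr(E, D, C_true),
        "all_available": mantel_haenszel_rr(E, D, C_all),
        "fixed_window": mantel_haenszel_rr(E, D, C_fixed),
    }


estimates = simulate_once()
for label, rr in estimates.items():
    print(f"{label}: adjusted RR = {rr:.2f}")
```

Because C is assumed absent unless a claim is seen, both operational definitions can only miss true cases; a shorter look-back misses more of them and so leaves more residual confounding in the adjusted estimate. Repeating `simulate_once` over many seeds would allow bias and mean squared error to be compared between the two definitions.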
Results: In the base case scenario, using all available information yielded less bias and lower mean squared error than using a fixed window; differences were magnified at higher modeled confounder strength. Upon introduction of an unmeasured covariate (F), the all-available approach remained less biased in most circumstances and, in all instances, yielded estimates that better approximated those adjusted for the true (modeled) value of C.
Conclusions: In most instances considered, operationally defining time-invariant dichotomous C based on all available historical data, rather than on data observed over a commonly shared fixed historical window, results in less biased estimates.
Copyright © 2013 John Wiley & Sons, Ltd.