Importance: Risk assessment tools for exercise treadmill testing may have limited external validity. Cardiovascular mortality has decreased in recent decades, and women have been underrepresented in prior cohorts.
Objectives: To determine whether exercise and clinical variables are associated with differential mortality outcomes in men and women and to assess whether sex-specific risk scores better estimate all-cause mortality.
Design, setting, and participants: This retrospective cohort study included 59 877 patients seen at the Cleveland Clinic Foundation (CCF cohort) from January 1, 2000, through December 31, 2010, and 49 278 patients seen at the Henry Ford Hospital (FIT cohort) from January 1, 1991, through December 31, 2009. All patients were 18 years or older and underwent exercise treadmill testing. Data were analyzed from January 1, 2000, to October 27, 2011, in the CCF cohort and from January 1, 1991, to April 1, 2013, in the FIT cohort.
Main outcomes and measurements: The CCF cohort was divided randomly into derivation and validation samples, and separate risk scores were developed for men and women. Net reclassification, C statistics, and integrated discrimination improvement were used to compare the sex-specific risk scores with other tools that have all-cause mortality as the outcome. Discrimination and calibration of these sex-specific risk scores were also evaluated in the FIT cohort.
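As an illustration of two of the comparison metrics named above, the following is a minimal sketch, not the study's analysis code. It assumes death is treated as a simple binary outcome (the study's outcome is time to death, so the authors may well have used survival-based versions of these metrics), and all function names, risk values, and risk categories are hypothetical toy inputs rather than study data.

```python
from itertools import product


def c_statistic(risks, died):
    """Share of (death, survivor) pairs in which the patient who died
    had the higher predicted risk; tied risks count as half."""
    events = [r for r, d in zip(risks, died) if d]
    nonevents = [r for r, d in zip(risks, died) if not d]
    if not events or not nonevents:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for e, n in product(events, nonevents):
        if e > n:
            score += 1.0
        elif e == n:
            score += 0.5
    return score / (len(events) * len(nonevents))


def net_reclassification(old_cat, new_cat, died):
    """Categorical net reclassification improvement: net upward moves
    among events minus net upward moves among non-events."""
    def net_up(pairs):
        up = sum(1 for o, n in pairs if n > o)
        down = sum(1 for o, n in pairs if n < o)
        return (up - down) / len(pairs)

    ev = [(o, n) for o, n, d in zip(old_cat, new_cat, died) if d]
    ne = [(o, n) for o, n, d in zip(old_cat, new_cat, died) if not d]
    if not ev or not ne:
        raise ValueError("need at least one event and one non-event")
    return net_up(ev) - net_up(ne)


# Hypothetical toy data: 1 = died, 0 = survived.
died = [1, 1, 0, 0, 0]
old_risk = [0.30, 0.10, 0.20, 0.05, 0.15]   # predicted risk, comparison tool
new_risk = [0.40, 0.25, 0.20, 0.05, 0.10]   # predicted risk, new score
old_cat = [1, 0, 1, 0, 1]                   # risk category: 0 low, 1 mid, 2 high
new_cat = [2, 0, 1, 0, 0]

c_old = c_statistic(old_risk, died)
c_new = c_statistic(new_risk, died)
nri = net_reclassification(old_cat, new_cat, died)
```

A higher C statistic means the score ranks patients who died above those who survived more often. A positive net reclassification value means the new score moves patients who died into higher risk categories and survivors into lower ones more often than the reverse.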
Results: The CCF cohort included 59 877 patients (59.4% men; 40.5% women) with a median (interquartile range [IQR]) age of 54 (45-63) years and 2521 deaths (4.2%) during a median (IQR) follow-up of 7 (4.1-9.6) years. The FIT cohort included 49 278 patients (52.5% men; 47.4% women) with a median (IQR) age of 54 (46-64) years and 6643 deaths (13.5%) during a median (IQR) follow-up of 10.2 (7-13.4) years. In the CCF validation sample, C statistics for the sex-specific risk scores (0.79 in women and 0.81 in men) were higher than those for other tools in women (0.70 for Duke Treadmill Score; 0.74 for Lauer nomogram) and men (0.72 for Duke Treadmill Score; 0.75 for Lauer nomogram). Net reclassification and integrated discrimination improvement were superior with the sex-specific risk scores, mostly owing to correct reclassification of events. The sex-specific risk scores in the FIT cohort demonstrated similar discrimination (C statistic, 0.78 for women and 0.79 for men), and calibration was reasonable.
Conclusions and relevance: Sex-specific risk scores better estimate mortality in patients undergoing exercise treadmill testing. In particular, these sex-specific risk scores help to identify patients at the highest residual risk in the present era.