Geographic and temporal validity of prediction models: different approaches were useful to examine model performance

Peter C Austin; David van Klaveren; Yvonne Vergouwe; Daan Nieboer; Douglas S Lee; Ewout W Steyerberg

doi:10.1016/j.jclinepi.2016.05.007

J Clin Epidemiol. 2016 Nov:79:76-85. doi: 10.1016/j.jclinepi.2016.05.007. Epub 2016 Jun 2.

Authors

Peter C Austin¹, David van Klaveren², Yvonne Vergouwe³, Daan Nieboer³, Douglas S Lee⁴, Ewout W Steyerberg³

Affiliations

¹ Institute for Clinical Evaluative Sciences, G106, 2075 Bayview Avenue, Toronto, Ontario M4N 3M5, Canada; Institute of Health Policy, Management and Evaluation, University of Toronto, 155 College Street, Suite 425, Toronto, Ontario M5T 3M6, Canada; Schulich Heart Research Program, Sunnybrook Research Institute, 2056 Bayview Avenue, Toronto, Ontario M4N 3M5, Canada. Electronic address: peter.austin@ices.on.ca.
² Department of Public Health, Erasmus MC-University Medical Center Rotterdam, PO Box 2040, Rotterdam 3000 CA, The Netherlands; Predictive Analytics and Comparative Effectiveness Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, 800 Washington St, Boston, MA 02111, USA.
³ Department of Public Health, Erasmus MC-University Medical Center Rotterdam, PO Box 2040, Rotterdam 3000 CA, The Netherlands.
⁴ Institute for Clinical Evaluative Sciences, G106, 2075 Bayview Avenue, Toronto, Ontario M4N 3M5, Canada; Institute of Health Policy, Management and Evaluation, University of Toronto, 155 College Street, Suite 425, Toronto, Ontario M5T 3M6, Canada; Peter Munk Cardiac Centre and Joint Department of Medical Imaging, Division of Cardiology, Department of Medicine, University of Toronto, 200 Elizabeth Street, NU 4-482, Toronto, Ontario M5G 2C4, Canada.

Abstract

Objective: Validation of clinical prediction models traditionally refers to the assessment of model performance in new patients. We studied different approaches to geographic and temporal validation in the setting of multicenter data from two time periods.

Study design and setting: We illustrated different analytic methods for validation using a sample of 14,857 patients hospitalized with heart failure at 90 hospitals in two distinct time periods. Bootstrap resampling was used to assess internal validity. Meta-analytic methods were used to assess geographic transportability. Each hospital was used once as a validation sample, with the remaining hospitals used for model derivation. Hospital-specific estimates of discrimination (c-statistic) and calibration (calibration intercepts and slopes) were pooled using random-effects meta-analysis methods. I² statistics and prediction interval width quantified geographic transportability. Temporal transportability was assessed using patients from the earlier period for model derivation and patients from the later period for model validation.
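The geographic validation scheme described above (each hospital held out once, the model derived on the remaining hospitals, discrimination measured in the held-out hospital) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names `c_statistic` and `leave_one_hospital_out` are hypothetical, the pairwise c-statistic is the textbook definition for a binary outcome, and model fitting and the calibration intercept and slope are left out.

```python
from collections import defaultdict


def c_statistic(risks, outcomes):
    """Share of (event, non-event) pairs in which the event has the higher
    predicted risk; tied predictions count as half a concordant pair."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                score += 1.0
            elif e == n:
                score += 0.5
    return score / (len(events) * len(non_events))


def leave_one_hospital_out(hospital_ids):
    """Yield (held_out, derivation_idx, validation_idx) once per hospital.

    hospital_ids holds one hospital label per patient (hypothetical input
    layout); the indices refer to positions in that list."""
    by_hospital = defaultdict(list)
    for i, h in enumerate(hospital_ids):
        by_hospital[h].append(i)
    for held_out, validation_idx in by_hospital.items():
        derivation_idx = [
            i for h, idx in by_hospital.items() if h != held_out for i in idx
        ]
        yield held_out, derivation_idx, validation_idx


# Toy usage: three hospitals with two patients each.
splits = list(leave_one_hospital_out(["A", "A", "B", "B", "C", "C"]))
```

In a full analysis, a model would be fitted on `derivation_idx` within each split and `c_statistic` applied to its predictions for `validation_idx`, giving one estimate per hospital to pool afterwards.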

Results: Estimates of reproducibility, pooled hospital-specific performance, and temporal transportability were on average very similar, with c-statistics of 0.75. Between-hospital variation was moderate according to I² statistics and prediction intervals for c-statistics.
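For readers unfamiliar with the I² statistic and prediction interval mentioned here, the sketch below shows one common way to obtain them from hospital-specific estimates and their standard errors. It rests on assumptions the abstract does not confirm: the DerSimonian-Laird estimator of between-hospital variance, pooling on the original scale, and a normal quantile for the prediction interval (a t quantile with k - 2 degrees of freedom is often used instead). The function name `pool_random_effects` and the input values are hypothetical.

```python
import math
from statistics import NormalDist


def pool_random_effects(estimates, std_errors, level=0.95):
    """Random-effects pooling (DerSimonian-Laird, assumed) of k estimates."""
    k = len(estimates)
    if k < 2 or k != len(std_errors):
        raise ValueError("need two or more estimates with matching standard errors")

    # Fixed-effect (inverse-variance) weights and Cochran's Q.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = k - 1

    # Between-hospital variance tau^2 and I^2 (as a proportion, not a percent).
    scale = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / scale) if scale > 0 else 0.0
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    # Random-effects weights, pooled estimate, and its standard error.
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))

    # Normal-approximation intervals (assumption; see lead-in).
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    ci_half = z * se_pooled
    pi_half = z * math.sqrt(tau2 + se_pooled ** 2)
    return {
        "pooled": pooled,
        "tau2": tau2,
        "i2": i2,
        "ci": (pooled - ci_half, pooled + ci_half),
        "prediction_interval": (pooled - pi_half, pooled + pi_half),
    }


# Made-up inputs for illustration only; not values from the study.
result = pool_random_effects([0.70, 0.80], [0.01, 0.01])
```

The confidence interval describes uncertainty in the average performance across hospitals, whereas the prediction interval also includes the between-hospital variance and so indicates the range in which performance at a new hospital would be expected to fall.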

Conclusion: This study illustrates how performance of prediction models can be assessed in settings with multicenter data at different time periods.

Keywords: Calibration; Clinical prediction model; Discrimination; Receiver operating characteristic curve; Risk prediction; Validation; c-statistic.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Heart Failure / epidemiology*
Humans
Logistic Models
Models, Statistical*
Multicenter Studies as Topic / statistics & numerical data
Ontario / epidemiology
ROC Curve
Reproducibility of Results
Spatio-Temporal Analysis*
