GRADE concept paper 2: Concepts for judging certainty on the calibration of prognostic models in a body of validation studies

Farid Foroutan; Gordon Guyatt; Marialena Trivella; Nina Kreuzberger; Nicole Skoetz; Richard D Riley; Pavel S Roshanov; Ana Carolina Alba; Nigar Sekercioglu; Carlos Canelo-Aybar; Zachary Munn; Romina Brignardello-Petersen; Holger J Schünemann; Alfonso Iorio

doi:10.1016/j.jclinepi.2021.11.024

J Clin Epidemiol. 2022 Mar:143:202-211. doi: 10.1016/j.jclinepi.2021.11.024. Epub 2021 Nov 18.

Affiliations

¹ Ted Rogers Centre for Heart Research, Peter Munk Cardiac Centre, Toronto, Ontario, Canada; Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada. Electronic address: foroutaf@mcmaster.ca.
² Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada.
³ Ted Rogers Centre for Heart Research, Peter Munk Cardiac Centre, Toronto, Ontario, Canada; Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada; Division of Nephrology, Department of Medicine, London Health Sciences Centre, London, UK; NK: Cochrane Haematology, Department I of Internal Medicine, Center for Integrated Oncology Aachen Bonn Cologne Duesseldorf, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany; Evidence-based Oncology, Department I of Internal Medicine, Center for Integrated Oncology Aachen Bonn Cologne Duesseldorf, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany; School of Medicine, Keele University, Keele, United Kingdom.
⁴ NK: Cochrane Haematology, Department I of Internal Medicine, Center for Integrated Oncology Aachen Bonn Cologne Duesseldorf, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.
⁵ NK: Cochrane Haematology, Department I of Internal Medicine, Center for Integrated Oncology Aachen Bonn Cologne Duesseldorf, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany; Evidence-based Oncology, Department I of Internal Medicine, Center for Integrated Oncology Aachen Bonn Cologne Duesseldorf, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.
⁶ School of Medicine, Keele University, Keele, United Kingdom.
⁷ Division of Nephrology, Department of Medicine, London Health Sciences Centre, London, UK.
⁸ Ted Rogers Centre for Heart Research, Peter Munk Cardiac Centre, Toronto, Ontario, Canada.
⁹ CIBER de Epidemiología y Salud Pública (CIBERESP), Madrid, Spain; Iberoamerican Cochrane Centre - Department of Clinical Epidemiology and Public Health, Biomedical Research Institute Sant Pau (IIB Sant Pau), Sant Antonio María Claret 167, 08025 Barcelona, Spain.

PMID: 34800677

Abstract

Background: Prognostic models combine several prognostic factors to provide an estimate of the likelihood (or risk) of future events in individual patients, conditional on their prognostic factor values. A fundamental part of evaluating prognostic models is undertaking studies to determine whether their predictive performance, such as calibration and discrimination, is reproduced across settings. Systematic reviews and meta-analyses of studies evaluating prognostic models' performance are a necessary step for selecting models for clinical practice and for testing the underlying assumption that their use will improve outcomes, including patient reassurance and optimal future planning.

Methods: In this paper, we highlight key concepts in evaluating the certainty of evidence regarding the calibration of prognostic models.

Results and conclusion: Four concepts are key to evaluating the certainty of evidence on prognostic models' performance regarding calibration. First, the inference regarding calibration may take one of two forms: rating certainty that a model's performance is satisfactory or, instead, rating certainty that it is unsatisfactory. Either form requires defining the threshold for satisfactory (or unsatisfactory) model performance. Second, inconsistency is the critical GRADE domain for deciding whether we are rating certainty in the model performance being satisfactory or unsatisfactory. Third, depending on whether one is rating certainty in satisfactory or unsatisfactory performance, different patterns of inconsistency of results across studies will inform ratings of certainty of evidence. Fourth, exploring the distribution of point estimates of the observed-to-expected ratio across individual studies, and its determinants, will bear on the need for and direction of future research.
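
Illustrative sketch (not part of the published abstract): the fourth concept refers to the distribution of observed-to-expected (O:E) ratio point estimates across validation studies. A minimal Python sketch of that idea follows. It assumes each study reports observed and model-expected event counts, and it uses a hypothetical satisfactory-calibration range of 0.8 to 1.25; the paper leaves the choice of threshold to the reviewer. The study names and counts are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ValidationStudy:
    name: str
    observed: int    # events actually observed in the validation cohort
    expected: float  # events predicted by the prognostic model


def oe_ratio(study: ValidationStudy) -> float:
    """Observed-to-expected ratio; 1.0 indicates perfect calibration-in-the-large."""
    return study.observed / study.expected


def summarize(studies, lower=0.8, upper=1.25):
    """Split studies by whether their O:E point estimate lies in [lower, upper].

    The bounds are hypothetical thresholds for satisfactory calibration,
    chosen only for this example.
    """
    ratios = {s.name: oe_ratio(s) for s in studies}
    within = [name for name, r in ratios.items() if lower <= r <= upper]
    outside = [name for name in ratios if name not in within]
    return ratios, within, outside


# Invented data for illustration only.
studies = [
    ValidationStudy("Study A", observed=90, expected=100.0),
    ValidationStudy("Study B", observed=120, expected=100.0),
    ValidationStudy("Study C", observed=150, expected=100.0),
]

ratios, within, outside = summarize(studies)
for name, ratio in ratios.items():
    print(f"{name}: O:E = {ratio:.2f}")
print("Within threshold:", within)
print("Outside threshold:", outside)
```

In this sketch, point estimates falling on both sides of the threshold would signal inconsistency that bears on the certainty rating. A full assessment would also consider the confidence interval around each estimate, which this example omits.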

Keywords: Calibration; Certainty in evidence; Discrimination; GRADE; Meta-Analysis; Prognosis; Prognostic models; Systematic review.

MeSH terms

Calibration
Forecasting
Humans
Probability
Prognosis*

Grants and funding

PG/17/49/33099/BHF_/British Heart Foundation/United Kingdom