Illusory generalizability of clinical prediction models

Adam M Chekroud; Matt Hawrilenko; Hieronimus Loho; Julia Bondar; Ralitza Gueorguieva; Alkomiet Hasan; Joseph Kambeitz; Philip R Corlett; Nikolaos Koutsouleris; Harlan M Krumholz; John H Krystal; Martin Paulus

doi:10.1126/science.adg8538

Science. 2024 Jan 12;383(6679):164-167. doi: 10.1126/science.adg8538. Epub 2024 Jan 11.

Affiliations

¹ Spring Health, New York City, NY 10010, USA.
² Department of Psychiatry, Yale University School of Medicine, New Haven, CT 06520, USA.
³ Department of Biostatistics, Yale University, New Haven, CT 06520, USA.
⁴ Department of Psychiatry, Psychotherapy and Psychosomatics, University Augsburg, 86159 Augsburg, Germany.
⁵ Department of Psychiatry and Psychotherapy, University of Cologne, Faculty of Medicine and University Hospital of Cologne, Cologne, Germany.
⁶ Department of Psychiatry and Psychotherapy, Ludwig-Maximilians-University, Munich, Germany.
⁷ Center for Outcomes Research and Evaluation, Yale New Haven Hospital, New Haven, CT 06520, USA.
⁸ Laureate Institute for Brain Research, Tulsa, OK 74136, USA.

PMID: 38207039
DOI: 10.1126/science.adg8538

Abstract

It is widely hoped that statistical models can improve decision-making related to medical treatments. Because of the cost and scarcity of medical outcomes data, this hope is typically based on investigators observing a model's success in one or two datasets or clinical contexts. We scrutinized this optimism by examining how well a machine learning model performed across several independent clinical trials of antipsychotic medication for schizophrenia. Models predicted patient outcomes with high accuracy within the trial in which the model was developed but performed no better than chance when applied out-of-sample. Pooling data across trials to predict outcomes in the trial left out did not improve predictions. These results suggest that models predicting treatment outcomes in schizophrenia are highly context-dependent and may have limited generalizability.
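The three evaluation settings named in the abstract (within-trial, trial-to-trial, and pooled leave-one-trial-out) can be illustrated with a minimal sketch. This is not the authors' pipeline or data. It assumes synthetic "trials" in which each trial's outcome depends on a different feature, a toy nearest-centroid classifier, and a simple hold-out split in place of whatever validation scheme the study used. All names (`make_trial`, `fit_centroids`, `N_FEATURES`, and so on) are hypothetical.

```python
import random

N_FEATURES = 4    # assumption: one feature per synthetic trial
N_PATIENTS = 400  # assumption: patients per synthetic trial


def make_trial(trial_index, seed):
    """Synthetic trial: the outcome depends only on feature `trial_index`."""
    rng = random.Random(seed)
    rows = []
    for _ in range(N_PATIENTS):
        x = [rng.gauss(0.0, 1.0) for _ in range(N_FEATURES)]
        y = 1 if x[trial_index] + rng.gauss(0.0, 0.5) > 0 else 0
        rows.append((x, y))
    return rows


def fit_centroids(rows):
    """Toy model: store the mean feature vector of each outcome class."""
    centroids = {}
    for label in (0, 1):
        members = [x for x, y in rows if y == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict(centroids, x):
    """Predict the class whose centroid is nearer in squared distance."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return 1 if sq_dist(centroids[1]) < sq_dist(centroids[0]) else 0


def accuracy(centroids, rows):
    return sum(predict(centroids, x) == y for x, y in rows) / len(rows)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


trials = [make_trial(k, seed=k) for k in range(N_FEATURES)]
half = N_PATIENTS // 2

# Within-trial: fit on the first half of a trial, test on its second half.
within = mean(accuracy(fit_centroids(t[:half]), t[half:]) for t in trials)

# Trial-to-trial: fit on one whole trial, test on a different trial.
cross = mean(
    accuracy(fit_centroids(src), dst)
    for i, src in enumerate(trials)
    for j, dst in enumerate(trials)
    if i != j
)

# Leave-one-trial-out: pool all other trials, test on the held-out trial.
pooled = mean(
    accuracy(
        fit_centroids([r for j, t in enumerate(trials) if j != i for r in t]),
        held_out,
    )
    for i, held_out in enumerate(trials)
)

print(f"within-trial accuracy:        {within:.2f}")
print(f"trial-to-trial accuracy:      {cross:.2f}")
print(f"leave-one-trial-out accuracy: {pooled:.2f}")
```

Because the synthetic trials are built so that no predictor carries over from one trial to another, the sketch only shows how the three accuracies are computed and why pooling cannot help in that contrived case. It says nothing about why the real trials behaved as reported.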

MeSH terms

Adolescent
Adult
Aged
Aged, 80 and over
Antipsychotic Agents* / therapeutic use
Child
Female
Humans
Machine Learning*
Male
Middle Aged
Models, Statistical
Prognosis
Schizophrenia* / drug therapy
Treatment Outcome
Young Adult

Substances

Antipsychotic Agents