Machine-learning based risk prediction of outcomes in patients hospitalised with COVID-19 in Australia: the AUS-COVID score

J Am Med Inform Assoc. 2025 Jan 25:ocaf016. doi: 10.1093/jamia/ocaf016. Online ahead of print.

Abstract

Objective: We aimed to develop a highly interpretable and effective machine-learning-based risk prediction algorithm (the AUS-COVID Score) to predict in-hospital mortality, intubation and adverse cardiovascular events in patients hospitalised with COVID-19 in Australia.

Materials and methods: This prospective study across 21 hospitals included 1714 consecutive patients aged ≥18 years in their index hospitalization with COVID-19. The dataset was split into training (80%) and test (20%) sets. Eight supervised machine-learning (ML) methods were used: LASSO, ridge, elastic net (EN), decision tree, support vector machine, random forest, AdaBoost and gradient boosting. A feature selection method was used to identify informative variables, which were considered in groups of 5, 10, 15, 20, or all. The final model for each outcome was selected by balancing the area under the curve (AUC) against interpretability, as measured by the number of included variables. The coefficients of the final models were used to build the AUS-COVID Score.
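The split, evaluation and selection steps described above can be illustrated with a minimal sketch. It is not the study's code: the function names, the random seed, the rank-based AUC formula and the `tolerance` rule for trading AUC against the number of variables are all assumptions made for illustration, since the abstract does not state how that balance was quantified.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and return (train, test) lists; 80/20 by default."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pick_model(auc_by_size, tolerance=0.02):
    """Assumed selection rule: the smallest number of variables whose
    test AUC is within `tolerance` of the best AUC observed."""
    best = max(auc_by_size.values())
    return min(k for k, a in auc_by_size.items() if a >= best - tolerance)
```

With 1714 patients, this split yields 1371 training and 343 test records under the rounding used here; the exact counts in the study are not given in the abstract. A call such as `pick_model({5: 0.80, 10: 0.805, 20: 0.81})` (toy values, not study results) returns 5, favouring the more interpretable model when the loss in AUC is small.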

Results & discussion: Among the patients, 181 (10.6%) died in-hospital, 148 (8.6%) required intubation and 90 (5.3%) had adverse cardiovascular events. The LASSO model performed best for predicting in-hospital mortality (AUC 0.85) using five variables: age, respiratory rate, COVID-19 features on chest X-ray (CXR), troponin elevation, and COVID-19 vaccination (≥1 dose). The EN model performed best for predicting intubation (AUC 0.75) and adverse cardiovascular events (AUC 0.64), each with five variables. A user-friendly web-based application was built for clinician use at the bedside.
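To show how the coefficients of a five-variable model can be turned into a bedside risk estimate, the sketch below assumes a logistic form (penalised logistic regression is a common choice for a binary outcome, but the abstract does not confirm it). The intercept, coefficient values, variable encodings and function name are hypothetical placeholders, not the published AUS-COVID Score.

```python
import math

# HYPOTHETICAL values for illustration only; NOT the published coefficients.
EXAMPLE_INTERCEPT = -6.0
EXAMPLE_COEFFICIENTS = {
    "age_years": 0.05,          # continuous
    "respiratory_rate": 0.04,   # breaths per minute
    "cxr_covid_features": 0.6,  # 1 if present on chest X-ray, else 0
    "troponin_elevated": 0.9,   # 1 if elevated, else 0
    "vaccinated": -0.7,         # 1 if at least one dose, else 0
}


def predicted_risk(patient, coefficients=EXAMPLE_COEFFICIENTS,
                   intercept=EXAMPLE_INTERCEPT):
    """Return a probability in (0, 1) from a logistic linear predictor."""
    linear = intercept + sum(
        coefficients[name] * patient[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear))


example_patient = {
    "age_years": 70,
    "respiratory_rate": 24,
    "cxr_covid_features": 1,
    "troponin_elevated": 0,
    "vaccinated": 1,
}
print(f"Illustrative risk: {predicted_risk(example_patient):.3f}")
```

Keeping the model to a handful of coefficients is what makes this kind of calculation simple enough to embed in a web form; the actual score should be taken from the published model rather than from these placeholder numbers.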

Conclusion: The AUS-COVID Score is an accurate and practical machine-learning-based risk score to predict in-hospital mortality, intubation, and adverse cardiovascular events in hospitalized COVID-19 patients.

Keywords: COVID-19; cardiovascular disease; machine-learning; mortality; risk prediction.