Utilizing electronic health data and machine learning for the prediction of 30-day unplanned readmission or all-cause mortality in heart failure

Cardiovasc Digit Health J. 2020 Oct 22;1(2):71-79. doi: 10.1016/j.cvdhj.2020.07.004. eCollection 2020 Sep-Oct.

Abstract

Background: Existing risk assessment tools for heart failure (HF) outcomes use structured databases with static, single-timepoint clinical data and have limited accuracy.

Objective: The purpose of this study was to develop a comprehensive approach for accurate prediction of 30-day unplanned readmission and all-cause mortality (ACM) that integrates clinical and physiological data available in the electronic health record system.

Methods: Three predictive models for 30-day unplanned readmissions or ACM were created using an extreme gradient boosting approach: (1) index admission model; (2) index discharge model; and (3) feature-aggregated model. Performance was assessed by the area under the curve (AUC) metric and compared with that of the HOSPITAL score, a widely used predictive model for hospital readmission.
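As an illustration of the AUC metric named above (not the authors' code), here is a minimal sketch that computes AUC as the probability that a randomly chosen patient with the outcome receives a higher predicted risk than a randomly chosen patient without it. The function name `auc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def auc(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 outcomes (1 = readmission or death within 30 days)
    scores: iterable of predicted risks, same length as labels
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)
print(toy_auc)
```

The quadratic pairwise loop is chosen for readability; a rank-based formula gives the same value and scales better to cohorts of several thousand patients.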

Results: A total of 3774 patients with a primary billing diagnosis of HF were included (614 experienced the primary outcome), with 796 variables used in the admission and discharge models, and 2032 in the feature-aggregated model. The index admission model had AUC = 0.723, the index discharge model had AUC = 0.754, and the feature-aggregated model had AUC = 0.756 for prediction of 30-day unplanned readmission or ACM. For comparison, the HOSPITAL score had AUC = 0.666 (admission model: P = .093; discharge model: P = .022; feature-aggregated model: P = .012).
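The figures reported above can be put side by side with a short calculation. This sketch only derives the event rate and the absolute AUC differences from the numbers in the abstract; the variable names are hypothetical, and it does not reproduce the statistical test behind the reported P values.

```python
# Counts and AUC values as reported in the abstract.
n_patients = 3774
n_events = 614  # patients with 30-day unplanned readmission or ACM

event_rate = n_events / n_patients  # about 0.163, i.e. roughly 16.3% of patients

auc_hospital = 0.666
model_aucs = {
    "index admission": 0.723,
    "index discharge": 0.754,
    "feature-aggregated": 0.756,
}

# Absolute AUC difference of each model over the HOSPITAL score.
auc_gain = {name: round(value - auc_hospital, 3) for name, value in model_aucs.items()}

print(f"event rate: {event_rate:.1%}")
for name, gain in auc_gain.items():
    print(f"{name}: +{gain:.3f} AUC vs HOSPITAL score")
```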

Conclusion: These models predict risk of HF hospitalizations and ACM in patients admitted with HF and emphasize the importance of incorporating large numbers of variables in machine learning models to identify predictors for future investigation.

Keywords: Big data; Electronic health data; Heart failure; Machine learning; Readmission.