Domain generalization for enhanced predictions of hospital readmission on unseen domains among patients with diabetes

Artif Intell Med. 2024 Dec:158:103010. doi: 10.1016/j.artmed.2024.103010. Epub 2024 Nov 10.

Abstract

A model that predicts the risk of hospital readmission can help identify patients who may benefit from extra care. Developing hospital-specific readmission risk prediction models using local data is not feasible for many institutions, and models developed on data from one hospital may not generalize well to another. An end-to-end adaptable readmission model that generalizes to unseen test domains is lacking. We propose an early readmission risk domain generalization network, ERR-DGN, for cross-domain knowledge transfer. ERR-DGN internalizes the shared patterns and characteristics that are consistent across source domains, enabling it to adapt to a new domain. It transforms source datasets to a common embedding space while capturing relevant long-term temporal dependencies of sequential data. Domain generalization is then applied on domain-specific fully connected linear layers. The model is optimized by a loss function that combines the task-specific loss with a distribution discrepancy loss that matches the mean embeddings of multiple source distributions. A model was developed using electronic health record (EHR) data of 201,688 patients with diabetes across urban, suburban, rural, and mixed hospital systems to improve 30-day readmission predictions for 67,066 unseen patients with diabetes at a rural hospital. We also explored how model performance varied by the number of sites and over time. The proposed method outperformed the baseline models, yielding a 6-percentage-point increase in F1-score (0.79 ± 0.006 vs. 0.73 ± 0.007). Model performance peaked with the inclusion of three sites. Model performance was relatively stable for 3 years and then declined at 4 years. ERR-DGN may be an effective tool for learning from data at multiple sites and subsequently applying a hospital readmission prediction model to a new site. Including a relatively small number of varied sites may be sufficient to achieve peak performance. Periodic retraining at least every 3 years may mitigate model degradation over time.
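
The combined objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the discrepancy term is the average squared distance between the mean embeddings of each pair of source domains, that the task-specific loss is binary cross-entropy, and that the two are combined with a hypothetical weight `lam`. All function names are illustrative.

```python
import math


def mean_embedding(batch):
    """Column-wise mean of a batch of embedding vectors (list of lists)."""
    n = len(batch)
    return [sum(col) / n for col in zip(*batch)]


def discrepancy_loss(domains):
    """Average squared Euclidean distance between the mean embeddings
    of every pair of source domains (assumed form of the discrepancy term)."""
    means = [mean_embedding(d) for d in domains]
    total, pairs = 0.0, 0
    for i in range(len(means)):
        for j in range(i + 1, len(means)):
            total += sum((a - b) ** 2 for a, b in zip(means[i], means[j]))
            pairs += 1
    return total / pairs if pairs else 0.0


def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy (assumed task-specific loss for readmission)."""
    losses = [
        -(y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps)))
        for p, y in zip(probs, labels)
    ]
    return sum(losses) / len(losses)


def total_loss(probs, labels, domains, lam=1.0):
    """Task loss plus lam times the discrepancy loss; lam is a hypothetical weight."""
    return binary_cross_entropy(probs, labels) + lam * discrepancy_loss(domains)


# Two toy source domains with 2-dimensional embeddings.
site_a = [[0.0, 0.0], [2.0, 2.0]]  # mean embedding [1.0, 1.0]
site_b = [[1.0, 1.0], [3.0, 3.0]]  # mean embedding [2.0, 2.0]
print(discrepancy_loss([site_a, site_b]))
print(total_loss([0.5, 0.5], [1, 0], [site_a, site_b], lam=0.5))
```

In this sketch, minimizing the discrepancy term pulls the mean embeddings of the source sites together, while the task term keeps the shared embedding useful for predicting readmission.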

Keywords: Deep learning; Domain adaptation; Domain generalization; Electronic health records data; Readmission prediction; Transfer learning.

MeSH terms

  • Diabetes Mellitus* / therapy
  • Electronic Health Records*
  • Humans
  • Neural Networks, Computer
  • Patient Readmission* / statistics & numerical data
  • Risk Assessment