The value of missing information in severity of illness score development

J Biomed Inform. 2019 Sep:97:103255. doi: 10.1016/j.jbi.2019.103255. Epub 2019 Jul 23.

Abstract

Objective: We investigate the hypothesis that combining information about which variables are missing with appropriate imputation improves the performance of severity of illness scoring systems used to predict critical patient outcomes.

Study design and setting: We quantify the impact of missing and imputed variables on the performance of prediction models used in the development of a sepsis-related severity of illness scoring system. Electronic health record (EHR) data were compiled from Christiana Care Health System (CCHS) on 119,968 adult patients hospitalized between July 2013 and December 2015. Two outcomes were considered for prediction: (1) first transfer to the intensive care unit (ICU) and (2) in-hospital mortality. Five different prediction models were employed. Indicator variables were included in these models to flag when a variable was missing and had been imputed.
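The indicator approach described above can be illustrated with a minimal sketch. The abstract does not state which imputation method was used, so median imputation here is an assumption, and the function name, the variable names (`lactate`, `heart_rate`), and the toy records are all hypothetical. For each variable, the sketch fills missing values and appends a 0/1 column recording that the value was imputed, so a downstream prediction model can use both.

```python
from statistics import median


def impute_with_indicators(rows, variables):
    """Median-impute each variable and append a 0/1 missingness indicator.

    Assumption: median imputation stands in for whatever "appropriate
    imputation" the study used; the abstract does not specify it.
    """
    # Compute each variable's median from observed (non-None) values only.
    medians = {}
    for v in variables:
        observed = [r[v] for r in rows if r.get(v) is not None]
        medians[v] = median(observed) if observed else 0.0

    augmented = []
    for r in rows:
        new = dict(r)  # copy so the input records are not mutated
        for v in variables:
            missing = r.get(v) is None
            new[v] = medians[v] if missing else r[v]
            new[v + "_missing"] = 1 if missing else 0
        augmented.append(new)
    return augmented


# Hypothetical toy records, not data from the study.
records = [
    {"lactate": 1.0, "heart_rate": 80},
    {"lactate": None, "heart_rate": 100},
    {"lactate": 3.0, "heart_rate": None},
]
augmented = impute_with_indicators(records, ["lactate", "heart_rate"])
```

In this sketch the second record receives the median lactate of the observed values and `lactate_missing = 1`, while fully observed records keep their values with indicators set to 0. In practice the medians would be computed on training data only and reused on held-out data.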

Results: We observed statistically significant gains in prediction performance when moving from models that did not indicate missing information to those that did. Moreover, the gain was larger in models that use summary variables as predictors than in those that use all variables.

Conclusion: When developing prediction models using longitudinal EHR data, researchers should explore the incorporation of indicators for missing variables along with appropriate imputation.

Keywords: Electronic health records; Missing data; Prediction models; Sepsis; Severity of illness scores.

MeSH terms

  • Adolescent
  • Adult
  • Aged
  • Aged, 80 and over
  • Area Under Curve
  • Computational Biology / methods
  • Data Interpretation, Statistical
  • Electronic Health Records / statistics & numerical data
  • Female
  • Hospital Mortality
  • Humans
  • Intensive Care Units
  • Logistic Models
  • Male
  • Middle Aged
  • Models, Statistical
  • Outcome Assessment, Health Care / statistics & numerical data
  • Sepsis / mortality
  • Severity of Illness Index*
  • Support Vector Machine
  • Young Adult