A Machine Learning Model for Predicting Mortality within 90 Days of Dialysis Initiation

Kidney360. 2022 Jul 20;3(9):1556-1565. doi: 10.34067/KID.0007012021. eCollection 2022 Sep 29.

Abstract

Background: The first 90 days after dialysis initiation are associated with high morbidity and mortality in end-stage kidney disease (ESKD) patients. A machine learning-based tool for predicting mortality could inform patient-clinician shared decision making on whether to initiate dialysis or pursue medical management. We used the eXtreme Gradient Boosting (XGBoost) algorithm to predict mortality in the first 90 days after dialysis initiation in a nationally representative population from the United States Renal Data System.

Methods: A cohort of adults initiating dialysis between 2008 and 2017 was studied for the outcome of death within 90 days of dialysis initiation. The study dataset included 188 candidate predictors prognostic of early mortality that were known on or before the first day of dialysis and was partitioned into training (70%) and testing (30%) subsets. XGBoost models were fit to both a complete-case dataset and a dataset obtained from multiple imputation. Model performance was evaluated by c-statistics overall and stratified by subgroups of age, sex, race, and dialysis modality.
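
For illustration only, the 70%/30% partition and the c-statistic evaluation described above can be sketched in plain Python. The c-statistic is the probability that a randomly chosen patient who died was assigned a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The function names, the seed, the toy risks, and the "HD"/"PD" labels below are hypothetical and are not taken from the study, whose models were fit with XGBoost on United States Renal Data System data.

```python
import random
from collections import defaultdict

def split_indices(n, train_fraction=0.7, seed=0):
    """Shuffle row indices and split them into training and testing sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(train_fraction * n)
    return idx[:cut], idx[cut:]

def c_statistic(risks, died):
    """Concordance between predicted risk and outcome (ties count 0.5).

    Pairwise O(n^2) version for clarity; a rank-based formula would be
    needed for a cohort of realistic size.
    """
    pos = [r for r, d in zip(risks, died) if d == 1]
    neg = [r for r, d in zip(risks, died) if d == 0]
    if not pos or not neg:
        return None  # undefined unless both outcomes are present
    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))

def c_statistic_by_group(risks, died, groups):
    """Compute the c-statistic separately within each subgroup label."""
    buckets = defaultdict(lambda: ([], []))
    for r, d, g in zip(risks, died, groups):
        buckets[g][0].append(r)
        buckets[g][1].append(d)
    return {g: c_statistic(rs, ds) for g, (rs, ds) in buckets.items()}

# Hypothetical toy data: predicted 90-day risk, observed death, modality.
risks = [0.05, 0.45, 0.40, 0.30, 0.70, 0.20]
died = [0, 0, 1, 0, 1, 0]
modality = ["HD", "HD", "HD", "PD", "PD", "PD"]

train_idx, test_idx = split_indices(10)
overall = c_statistic(risks, died)
by_modality = c_statistic_by_group(risks, died, modality)
print(overall, by_modality)
```

Reporting the statistic within each subgroup, as well as overall, shows whether discrimination holds up in groups that make up a small share of the cohort.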

Results: The analysis included 1,150,195 patients with ESKD, of whom 86,083 (8%) died in the first 90 days after dialysis initiation. The XGBoost models discriminated mortality risk in the nonimputed (c=0.826, 95% CI, 0.823 to 0.828) and imputed (c=0.827, 95% CI, 0.823 to 0.827) models and performed well across nearly every subgroup (race, age, sex, and dialysis modality) evaluated (c>0.75). As the predicted risk threshold rose from 10% to 50%, sensitivity declined (0.69 to 0.04) and specificity improved (0.79 to 0.99); the positive likelihood ratio was highest at the 40% threshold, whereas the negative likelihood ratio was lowest at the 10% threshold. After calibration using isotonic regression, the model accurately estimated the probability of mortality across all ranges of predicted risk.
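
As a rough illustration of the threshold metrics and the calibration step reported above, the sketch below computes sensitivity, specificity, and likelihood ratios at a given risk threshold, and fits an isotonic (non-decreasing) step function with the pool-adjacent-violators algorithm. It assumes a patient is flagged high risk when predicted risk is at or above the threshold, and that both outcomes are present in the data. The function names and toy values are hypothetical; the abstract does not describe the authors' implementation, and library implementations of isotonic regression typically interpolate between points rather than use a pure step lookup.

```python
def threshold_metrics(risks, died, threshold):
    """Sensitivity, specificity, LR+ and LR- when risk >= threshold is 'high risk'."""
    tp = sum(1 for r, d in zip(risks, died) if r >= threshold and d == 1)
    fn = sum(1 for r, d in zip(risks, died) if r < threshold and d == 1)
    tn = sum(1 for r, d in zip(risks, died) if r < threshold and d == 0)
    fp = sum(1 for r, d in zip(risks, died) if r >= threshold and d == 0)
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    lr_pos = sens / (1 - spec) if spec < 1 else float("inf")
    lr_neg = (1 - sens) / spec if spec > 0 else float("inf")
    return {"sensitivity": sens, "specificity": spec,
            "lr_pos": lr_pos, "lr_neg": lr_neg}

def isotonic_fit(risks, died):
    """Pool-adjacent-violators fit of observed death rate against raw risk.

    Returns a list of (upper_risk, calibrated_rate) steps in increasing order.
    """
    # Pool tied risks first so each distinct risk is one weighted point.
    pooled = {}
    for r, d in zip(risks, died):
        s, n = pooled.get(r, (0.0, 0))
        pooled[r] = (s + d, n + 1)
    blocks = []  # each block: [sum of outcomes, count, largest risk in block]
    for r in sorted(pooled):
        s, n = pooled[r]
        blocks.append([s, n, r])
        # Merge backwards while an earlier block has a higher mean rate.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s2, n2, hi = blocks.pop()
            blocks[-1][0] += s2
            blocks[-1][1] += n2
            blocks[-1][2] = hi
    return [(hi, s / n) for s, n, hi in blocks]

def isotonic_predict(steps, risk):
    """Map a raw risk to the calibrated rate of the first step covering it."""
    for hi, rate in steps:
        if risk <= hi:
            return rate
    return steps[-1][1]

# Hypothetical toy data: raw predicted risks and observed 90-day deaths.
risks = [0.02, 0.05, 0.08, 0.12, 0.15, 0.22, 0.35, 0.45, 0.55, 0.80]
died = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]

table = {t: threshold_metrics(risks, died, t) for t in (0.10, 0.30, 0.50)}
steps = isotonic_fit(risks, died)
calibrated = [isotonic_predict(steps, r) for r in risks]
```

On the toy data, raising the threshold lowers sensitivity and raises specificity, which is the same direction of trade-off the results describe; the numbers themselves are illustrative and carry no clinical meaning.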

Conclusions: The XGBoost-based model developed in this study discriminated risk of early mortality after dialysis initiation with excellent calibration and performed well across key subgroups.

Keywords: ESRD; United States Renal Data System; chronic kidney failure; chronic renal failure; dialysis; end stage kidney disease; machine learning; mortality; outcomes; prediction modeling.

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Adult
  • Cohort Studies
  • Female
  • Humans
  • Kidney Failure, Chronic* / ethnology
  • Kidney Failure, Chronic* / mortality
  • Kidney Failure, Chronic* / therapy
  • Machine Learning*
  • Male
  • Models, Statistical*
  • Renal Dialysis* / methods
  • Renal Dialysis* / statistics & numerical data
  • Reproducibility of Results
  • Risk Assessment
  • Time Factors
  • Treatment Outcome
  • United States / epidemiology