Machine learning ensemble models predict total charges and drivers of cost for transsphenoidal surgery for pituitary tumor

J Neurosurg. 2018 Sep 21;131(2):507-516. doi: 10.3171/2018.4.JNS18306. Print 2019 Aug 1.

Abstract

Objective: Efficient allocation of resources in the healthcare system enables providers to care for more and needier patients. Identifying drivers of total charges for transsphenoidal surgery (TSS) for pituitary tumors, which are poorly understood, represents an opportunity for neurosurgeons to reduce waste and provide higher-quality care for their patients. In this study the authors used a large, national database to build machine learning (ML) ensembles that directly predict total charges in this patient population. They then interrogated the ensembles to identify variables that predict high charges.

Methods: The authors created a training data set of 15,487 patients who underwent TSS between 2002 and 2011 and were registered in the National Inpatient Sample. Thirty-two ML algorithms were trained to predict total charges from 71 collected variables, and the most predictive algorithms were combined to form an ensemble model. The model was internally and externally validated to demonstrate generalizability. Permutation importance and partial dependence analyses were performed to identify the strongest drivers of total charges. Given the overwhelming influence of length of stay (LOS), a second ensemble excluding LOS as a predictor was built to identify additional drivers of total charges.
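The idea behind permutation importance can be illustrated with a minimal sketch. It is not the authors' pipeline. The data are synthetic; the names toy_model, los_days, and unused_flag are hypothetical; the $20,000 base charge is an arbitrary placeholder; and the error metric assumes the common log(1 + x) form of RMSLE. A feature's importance is estimated as the average increase in error after its values are shuffled across patients.

```python
import math
import random


def rmsle(actual, predicted):
    """Root mean square logarithmic error, log(1 + x) convention (an assumption)."""
    squared = [(math.log1p(p) - math.log1p(a)) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def toy_model(row):
    """Hypothetical stand-in for a fitted model: placeholder base charge plus a per-day charge."""
    return 20000.0 + 5000.0 * row["los_days"]


def permutation_importance(predict, rows, targets, column, n_repeats=20, seed=0):
    """Mean increase in RMSLE after shuffling one column across patients."""
    rng = random.Random(seed)
    baseline = rmsle(targets, [predict(r) for r in rows])
    increases = []
    for _ in range(n_repeats):
        values = [r[column] for r in rows]
        rng.shuffle(values)
        # Rebuild each row with the shuffled value; all other columns stay intact.
        permuted = [{**r, column: v} for r, v in zip(rows, values)]
        increases.append(rmsle(targets, [predict(r) for r in permuted]) - baseline)
    return sum(increases) / n_repeats


# Synthetic patients: LOS drives the target; "unused_flag" never enters the model.
data_rng = random.Random(42)
rows = [{"los_days": d, "unused_flag": data_rng.randint(0, 1)} for d in range(1, 11)]
targets = [toy_model(r) for r in rows]

los_importance = permutation_importance(toy_model, rows, targets, "los_days")
flag_importance = permutation_importance(toy_model, rows, targets, "unused_flag")
print(f"los_days: {los_importance:.3f}  unused_flag: {flag_importance:.3f}")
```

In this toy setting, shuffling the feature the model ignores leaves the error unchanged, whereas shuffling LOS raises it. That contrast is what lets the analysis rank predictors.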

Results: An ensemble model comprising 3 gradient boosted tree classifiers best predicted total charges (root mean square logarithmic error = 0.446; 95% CI 0.439-0.453; holdout = 0.455). LOS was by far the strongest predictor of total charges, increasing total predicted charges by approximately $5000 per day. In the absence of LOS, the strongest predictors of total charges were admission type, hospital region, race, any postoperative complication, and hospital ownership type.
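For reference, the abstract does not spell out the formula for root mean square logarithmic error. A commonly used definition, assumed here, with y_i the observed charge, \hat{y}_i the predicted charge, and n the number of patients, is:

```latex
\mathrm{RMSLE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(\log(1+\hat{y}_i) - \log(1+y_i)\bigr)^{2}}
```

Under this assumed definition, and for charges much larger than 1, an error of 0.446 in log space corresponds roughly to predictions that are off by a multiplicative factor of about e^{0.446} ≈ 1.56.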

Conclusions: ML ensembles predict total charges for TSS with good fidelity. The authors identified extended LOS, nonelective admission type, non-Southern hospital region, minority race, postoperative complication, and private investor hospital ownership as drivers of total charges and potential targets for cost-lowering interventions.

Keywords: LOS = length of stay; ML = machine learning; NIS = National (Nationwide) Inpatient Sample; RMSLE = root mean square logarithmic error; TSS = transsphenoidal surgery; machine learning; outcomes modeling; pituitary surgery; total charges; transsphenoidal surgery.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Adenoma / economics
  • Adenoma / epidemiology
  • Adenoma / surgery*
  • Adult
  • Aged
  • Costs and Cost Analysis / methods
  • Costs and Cost Analysis / trends*
  • Databases, Factual / economics
  • Databases, Factual / trends
  • Female
  • Forecasting
  • Health Care Costs / trends*
  • Humans
  • Machine Learning / trends*
  • Male
  • Middle Aged
  • Pituitary Neoplasms / economics
  • Pituitary Neoplasms / epidemiology
  • Pituitary Neoplasms / surgery*
  • Sphenoid Sinus / surgery*
  • United States / epidemiology