Predictive models for secondary epilepsy in patients with acute ischemic stroke within one year

Elife. 2024 Nov 14:13:RP98759. doi: 10.7554/eLife.98759.

Abstract

Background: Post-stroke epilepsy (PSE) is a critical complication that worsens both prognosis and quality of life in patients with ischemic stroke. An interpretable machine learning model was developed to predict PSE using medical records from four hospitals in Chongqing.

Methods: Medical records, imaging reports, and laboratory test results from 21,459 ischemic stroke patients were collected and analyzed. Univariable and multivariable statistical analyses identified key predictive factors. The dataset was split into a 70% training set and a 30% testing set. To address class imbalance, the Synthetic Minority Oversampling Technique combined with Edited Nearest Neighbors (SMOTE-ENN) was employed. Nine widely used machine learning algorithms were evaluated using relevant prediction metrics, with SHAP (SHapley Additive exPlanations) used to interpret the model and assess the contributions of different features.
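The split and resampling steps described in the Methods can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: the toy two-feature data, the function names (`stratified_split`, `smote`, `edited_nearest_neighbors`), the neighbor counts, and the choice to resample only the training portion are all assumptions made for illustration. The ENN step here uses a simple majority vote, which is a simplification of library implementations.

```python
import random
from math import dist

def stratified_split(y, test_frac=0.3, seed=0):
    """Return (train_idx, test_idx), keeping class proportions in both parts."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(set(y)):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_frac)
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    return train_idx, test_idx

def k_nearest(point, X, k, skip=None):
    """Indices of the k points in X closest to `point` (optionally skipping one index)."""
    candidates = [i for i in range(len(X)) if i != skip]
    return sorted(candidates, key=lambda i: dist(point, X[i]))[:k]

def smote(X, y, minority=1, k=5, seed=0):
    """Oversample the minority class by interpolating between minority neighbors.
    Assumes binary labels and at least two minority samples."""
    rng = random.Random(seed)
    minority_pts = [x for x, v in zip(X, y) if v == minority]
    n_needed = (len(y) - len(minority_pts)) - len(minority_pts)
    X_new, y_new = list(X), list(y)
    for _ in range(max(n_needed, 0)):
        j = rng.randrange(len(minority_pts))
        base = minority_pts[j]
        neighbor = minority_pts[rng.choice(k_nearest(base, minority_pts, k, skip=j))]
        gap = rng.random()  # position along the segment base -> neighbor
        X_new.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
        y_new.append(minority)
    return X_new, y_new

def edited_nearest_neighbors(X, y, k=3):
    """Drop samples whose label disagrees with the majority of their k neighbors."""
    keep = []
    for i, (x, label) in enumerate(zip(X, y)):
        votes = [y[j] for j in k_nearest(x, X, k, skip=i)]
        if max(set(votes), key=votes.count) == label:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]

# Toy imbalanced data (hypothetical): 180 negatives, 20 positives, two features.
data_rng = random.Random(42)
X = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(180)]
X += [(data_rng.gauss(3, 1), data_rng.gauss(3, 1)) for _ in range(20)]
y = [0] * 180 + [1] * 20

train_idx, test_idx = stratified_split(y, test_frac=0.3)
X_train = [X[i] for i in train_idx]
y_train = [y[i] for i in train_idx]

X_bal, y_bal = smote(X_train, y_train)                      # balance the classes
X_clean, y_clean = edited_nearest_neighbors(X_bal, y_bal)   # remove noisy samples
print(len(y_train), y_bal.count(0), y_bal.count(1), len(y_clean))
```

In this sketch the test portion is left untouched, so that any evaluation would reflect the original class balance; the abstract does not state at which stage the resampling was applied.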

Results: Regression analyses revealed that complications such as hydrocephalus, cerebral hernia, and deep vein thrombosis, as well as involvement of specific brain regions (frontal, parietal, and temporal lobes), significantly contributed to PSE. Factors such as age, gender, NIH Stroke Scale (NIHSS) scores, and laboratory results such as white blood cell (WBC) count and D-dimer levels were associated with increased PSE risk. Tree-based methods such as Random Forest, XGBoost, and LightGBM showed strong predictive performance, achieving an AUC of 0.99.
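For readers unfamiliar with the reported metric, the AUC (area under the ROC curve) can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below computes it by direct pairwise comparison; the function name `roc_auc` and the example labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.
    Ties between a positive and a negative score count as half a correct pair."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = PSE, 0 = no PSE) and model scores.
labels = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, predicted)
print(auc)  # 3 of the 4 positive/negative pairs are ordered correctly -> 0.75
```

The pairwise form is quadratic in the number of samples, so it is meant only to show the definition; rank-based formulas give the same value more efficiently on large datasets.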

Conclusions: The model accurately predicts PSE risk, with tree-based models demonstrating superior performance. NIHSS score, WBC count, and D-dimer were identified as the most crucial predictors.

Funding: The research was funded by the Central University basic research young teachers and students research ability promotion sub-project (2023CDJYGRH-ZD06) and by the Emergency Medicine Chongqing Key Laboratory Talent Innovation and development joint fund project (2024RCCX10).

Keywords: epilepsy; machine learning; neuroscience; stroke.

MeSH terms

  • Aged
  • Aged, 80 and over
  • Epilepsy* / complications
  • Female
  • Humans
  • Ischemic Stroke* / complications
  • Machine Learning*
  • Male
  • Middle Aged
  • Prognosis
  • Risk Factors

Associated data

  • Dryad/10.5061/dryad.w0vt4b92c