Predicting suicide attempts in adolescents with longitudinal clinical data and machine learning

J Child Psychol Psychiatry. 2018 Dec;59(12):1261-1270. doi: 10.1111/jcpp.12916. Epub 2018 Apr 30.

Abstract

Background: Adolescents have high rates of nonfatal suicide attempts, but clinically practical risk prediction remains a challenge. Screening can be time-consuming to implement at scale, if it is done at all. Computational algorithms may predict suicide risk using only routinely collected clinical data. We used a machine learning approach validated on longitudinal clinical data in adults to address this challenge in adolescents.

Methods: This is a retrospective, longitudinal cohort study. Data were collected from the Vanderbilt Synthetic Derivative from January 1998 to December 2015 and included 974 adolescents with nonfatal suicide attempts and multiple control comparisons: 496 adolescents with other self-injury (OSI), 7,059 adolescents with depressive symptoms, and 25,081 adolescent general hospital controls. Candidate predictors included diagnostic, demographic, medication, and socioeconomic factors. Outcome was determined by multiexpert review of electronic health records. Random forests were validated with optimism adjustment at multiple time points (from 1 week to 2 years). Recalibration was done via isotonic regression. Evaluation metrics included discrimination (AUC, sensitivity/specificity, precision/recall) and calibration (calibration plots, slope/intercept, Brier score).
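The recalibration and calibration-scoring steps named in the Methods can be illustrated with a minimal sketch. It assumes pool-adjacent-violators as the isotonic fitting procedure and uses a tiny set of made-up risk scores and outcomes; the function names (fit_isotonic, predict_isotonic, brier_score) and the data are hypothetical and are not taken from the study.

```python
from bisect import bisect_left


def fit_isotonic(scores, labels):
    """Fit a non-decreasing step function from raw score to observed event rate
    using pool-adjacent-violators (an assumed, standard isotonic procedure)."""
    blocks = []  # each block: [sum of labels, count, largest score in block]
    for s, y in sorted(zip(scores, labels)):
        if blocks and blocks[-1][2] == s:
            # Tied scores must share one calibrated value.
            blocks[-1][0] += y
            blocks[-1][1] += 1
        else:
            blocks.append([float(y), 1, s])
        # Pool backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    uppers = [b[2] for b in blocks]
    means = [b[0] / b[1] for b in blocks]
    return uppers, means


def predict_isotonic(model, score):
    """Return the calibrated value of the first block whose upper bound >= score."""
    uppers, means = model
    idx = min(bisect_left(uppers, score), len(means) - 1)
    return means[idx]


def brier_score(probs, labels):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


# Hypothetical raw model outputs and outcomes (1 = attempt, 0 = no attempt).
raw_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
outcomes = [0, 0, 0, 0, 0, 1, 0, 1]

model = fit_isotonic(raw_scores, outcomes)
recalibrated = [predict_isotonic(model, s) for s in raw_scores]

raw_brier = brier_score(raw_scores, outcomes)
recal_brier = brier_score(recalibrated, outcomes)
print(f"Brier score before recalibration: {raw_brier:.4f}")
print(f"Brier score after recalibration:  {recal_brier:.4f}")
```

In this sketch the recalibration map is fitted and scored on the same eight points, so the improvement is optimistic; fitting on one sample and scoring on held-out or resampled data would give a fairer estimate.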

Results: Computational models performed well and did not require face-to-face screening. Performance improved as suicide attempts became more imminent. Discrimination was good in comparison with OSI controls (AUC = 0.83 [95% CI 0.82-0.84] at 720 days; AUC = 0.85 [0.84-0.87] at 7 days) and depressed controls (AUC = 0.87 [0.85-0.90] at 720 days; AUC = 0.90 [0.85-0.94] at 7 days) and best in comparison with general hospital controls (AUC = 0.94 [0.92-0.96] at 720 days; AUC = 0.97 [0.95-0.98] at 7 days). Random forests significantly outperformed logistic regression in every comparison. Recalibration improved performance by as much as ninefold; clinical recommendations based on poorly calibrated predictions can lead to decision errors.
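An AUC such as those reported above can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, with ties counted as one half. The following minimal sketch computes it that way on made-up scores; the function name pairwise_auc and the data are hypothetical and are not taken from the study.

```python
def pairwise_auc(scores, labels):
    """AUC as the fraction of case-control pairs in which the case is
    ranked higher; tied pairs count as half a win."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical predicted risks and outcomes (1 = case, 0 = control).
example_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
example_labels = [0, 0, 0, 0, 0, 1, 0, 1]

example_auc = pairwise_auc(example_scores, example_labels)
print(f"AUC = {example_auc:.3f}")
```

The pairwise form is quadratic in sample size, which is fine for illustration; on cohorts of tens of thousands a rank-based formula gives the same value more cheaply.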

Conclusions: Machine learning on longitudinal clinical data may provide a scalable approach to broaden screening for risk of nonfatal suicide attempts in adolescents.

Keywords: Suicide; adolescent; attempted; decision support techniques; electronic health records; machine learning.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Adolescent
  • Depression / epidemiology
  • Depression / psychology
  • Female
  • Humans
  • Longitudinal Studies
  • Machine Learning*
  • Male
  • Retrospective Studies
  • Risk Assessment
  • Self-Injurious Behavior / epidemiology
  • Self-Injurious Behavior / psychology
  • Suicide, Attempted / prevention & control*
  • Suicide, Attempted / psychology
  • Suicide, Attempted / statistics & numerical data