Predicting suicide attempts in adolescents with longitudinal clinical data and machine learning

Colin G Walsh; Jessica D Ribeiro; Joseph C Franklin

doi:10.1111/jcpp.12916

J Child Psychol Psychiatry. 2018 Dec;59(12):1261-1270. doi: 10.1111/jcpp.12916. Epub 2018 Apr 30.

Authors

Colin G Walsh¹, Jessica D Ribeiro², Joseph C Franklin²

Affiliations

¹ Vanderbilt University Medical Center, Nashville, TN, USA.
² Florida State University, Tallahassee, FL, USA.

PMID: 29709069
DOI: 10.1111/jcpp.12916

Abstract

Background: Adolescents have high rates of nonfatal suicide attempts, but clinically practical risk prediction remains a challenge. Screening can be time-consuming to implement at scale, if it is done at all. Computational algorithms may predict suicide risk using only routinely collected clinical data. We used a machine learning approach validated on longitudinal clinical data in adults to address this challenge in adolescents.

Methods: This is a retrospective, longitudinal cohort study. Data were collected from the Vanderbilt Synthetic Derivative from January 1998 to December 2015 and included 974 adolescents with nonfatal suicide attempts and multiple control comparisons: 496 adolescents with other self-injury (OSI), 7,059 adolescents with depressive symptoms, and 25,081 adolescent general hospital controls. Candidate predictors included diagnostic, demographic, medication, and socioeconomic factors. Outcome was determined by multiexpert review of electronic health records. Random forests were validated with optimism adjustment at multiple time points (from 1 week to 2 years). Recalibration was done via isotonic regression. Evaluation metrics included discrimination (AUC, sensitivity/specificity, precision/recall) and calibration (calibration plots, slope/intercept, Brier score).
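As an illustration of the recalibration and calibration metrics named above, the following is a minimal sketch of isotonic recalibration (via the pool-adjacent-violators algorithm) and the Brier score in plain Python. It is not the authors' implementation: the function names, the toy scores, and the toy outcomes are all assumptions made for this example, and the data are synthetic.

```python
from bisect import bisect_left
from collections import defaultdict


def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def fit_isotonic(scores, outcomes):
    """Fit a non-decreasing step function from raw score to observed event rate.

    Uses pool-adjacent-violators. Returns (uppers, values): block i covers raw
    scores up to and including uppers[i] and maps them to values[i].
    """
    # Pool observations that share the same raw score before fitting.
    grouped = defaultdict(lambda: [0.0, 0])
    for s, y in zip(scores, outcomes):
        grouped[s][0] += y
        grouped[s][1] += 1

    blocks = []  # each block: [sum of outcomes, count, largest score in block]
    for s in sorted(grouped):
        total, count = grouped[s]
        blocks.append([total, count, s])
        # Merge backwards while the previous block's mean is not below this one's.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] >= blocks[-1][0] / blocks[-1][1]
        ):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper

    uppers = [b[2] for b in blocks]
    values = [b[0] / b[1] for b in blocks]
    return uppers, values


def recalibrate(score, uppers, values):
    """Map one raw score to its calibrated value (clipped at the top block)."""
    i = bisect_left(uppers, score)
    return values[min(i, len(values) - 1)]


# Synthetic toy data (assumption): raw model scores and observed 0/1 outcomes.
raw_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
outcomes = [0, 0, 0, 0, 0, 1, 0, 1]

uppers, values = fit_isotonic(raw_scores, outcomes)
calibrated = [recalibrate(s, uppers, values) for s in raw_scores]

print("Brier score before:", brier_score(raw_scores, outcomes))
print("Brier score after: ", brier_score(calibrated, outcomes))
```

Here the mapping is fitted and scored on the same toy data, which overstates the benefit; in practice the recalibration map would be fitted on data held out from model training and evaluated with some form of optimism correction.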

Results: Computational models performed well and did not require face-to-face screening. Performance improved as suicide attempts became more imminent. Discrimination was good in comparison with OSI controls (AUC = 0.83 [0.82-0.84] at 720 days; AUC = 0.85 [0.84-0.87] at 7 days) and depressed controls (AUC = 0.87 [95% CI 0.85-0.90] at 720 days; AUC = 0.90 [0.85-0.94] at 7 days) and best in comparison with general hospital controls (AUC = 0.94 [0.92-0.96] at 720 days; AUC = 0.97 [0.95-0.98] at 7 days). Random forests significantly outperformed logistic regression in every comparison. Recalibration improved performance by as much as ninefold; this matters because clinical recommendations based on poorly calibrated predictions can lead to decision errors.
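For readers unfamiliar with the discrimination metric reported above, AUC can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. The sketch below computes it by direct pairwise comparison; the function name and the four-point toy data are assumptions for illustration and are unrelated to the study data.

```python
def auc(scores, outcomes):
    """Fraction of case/control pairs in which the case has the higher score.

    Ties between a case and a control count as half a win.
    """
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Synthetic toy data (assumption): two controls followed by two cases.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_outcomes = [0, 0, 1, 1]

toy_auc = auc(toy_scores, toy_outcomes)
print(toy_auc)  # 0.75: three of the four case/control pairs are ordered correctly
```

The pairwise loop is quadratic in cohort size, so it suits small illustrations only; a rank-based formulation gives the same value on large cohorts.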

Conclusions: Machine learning on longitudinal clinical data may provide a scalable approach to broaden screening for risk of nonfatal suicide attempts in adolescents.

Keywords: Suicide; adolescent; attempted; decision support techniques; electronic health records; machine learning.

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Adolescent
Depression / epidemiology
Depression / psychology
Female
Humans
Longitudinal Studies
Machine Learning*
Male
Retrospective Studies
Risk Assessment
Self-Injurious Behavior / epidemiology
Self-Injurious Behavior / psychology
Suicide, Attempted / prevention & control*
Suicide, Attempted / psychology
Suicide, Attempted / statistics & numerical data

Grants and funding

UL1 RR024975/RR/NCRR NIH HHS/United States