Development of a Late-Life Dementia Prediction Index with Supervised Machine Learning in the Population-Based CAIDE Study

J Alzheimers Dis. 2017;55(3):1055-1067. doi: 10.3233/JAD-160560.

Abstract

Background and objective: This study aimed to develop a late-life dementia prediction model using a novel validated supervised machine learning method, the Disease State Index (DSI), in the Finnish population-based CAIDE study.

Methods: The CAIDE study was based on previous population-based midlife surveys. CAIDE participants were re-examined twice in late life, and the first late-life re-examination was used as the baseline for the present study. The main study population included 709 subjects who were cognitively normal at the first re-examination and returned for the second re-examination up to 10 years later (incident dementia n = 39). An extended population (n = 1009, incident dementia n = 151) also included non-participants and non-survivors, for whom data came from national registers. DSI was used to develop a dementia index based on first re-examination assessments. Performance in predicting dementia was assessed as area under the ROC curve (AUC).
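
For readers unfamiliar with the performance measure named above, AUC can be read as the probability that a randomly chosen person who later develops dementia receives a higher index value than a randomly chosen person who does not. The following is a minimal sketch of that calculation only. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations; they are not CAIDE data, and the sketch does not reproduce the DSI method or the authors' analysis code.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison (Mann-Whitney formulation).

    labels: 1 = incident dementia, 0 = no dementia (hypothetical coding)
    scores: risk index values, higher = higher predicted risk
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")

    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0   # case ranked above non-case
            elif c == n:
                wins += 0.5   # ties count as half
    return wins / (len(cases) * len(controls))


# Invented toy data for illustration only (not from the study).
labels = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.3, 0.35, 0.8, 0.4, 0.9]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from non-cases.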

Results: AUCs for DSI were 0.79 and 0.75 for the main and extended populations, respectively. Included predictors were cognition, vascular factors, age, subjective memory complaints, and APOE genotype.

Conclusion: The supervised machine learning method performed well in identifying comprehensive profiles for predicting dementia development up to 10 years later. DSI could thus be useful for identifying individuals who are most at risk and may benefit from dementia prevention interventions.

Keywords: Computer-assisted decision making; dementia; prediction; prevention; supervised machine learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aged
  • Apolipoproteins E / genetics
  • Cerebrovascular Disorders / epidemiology
  • Cognition / physiology
  • Community Health Planning
  • Dementia / diagnosis*
  • Dementia / epidemiology*
  • Dementia / genetics
  • Female
  • Finland / epidemiology
  • Humans
  • Male
  • Neuropsychological Tests
  • Predictive Value of Tests
  • ROC Curve
  • Reproducibility of Results
  • Retrospective Studies
  • Risk Factors
  • Severity of Illness Index*
  • Supervised Machine Learning*

Substances

  • Apolipoproteins E