Practical foundations of machine learning for addiction research. Part II. Workflow and use cases

Am J Drug Alcohol Abuse. 2022 May 4;48(3):272-283. doi: 10.1080/00952990.2021.1966435. Epub 2022 Apr 7.

Abstract

In a continuum with applied statistics, machine learning offers a wide variety of tools to explore, analyze, and understand addiction data. These tools include algorithms that extract useful information from data to build models, which can then solve particular tasks to answer scientific questions about addiction. In this second part of a two-part review on machine learning, we explain how to apply machine learning methods to addiction research. Like other analytical tools, machine learning methods require careful implementation to support a reproducible and transparent research process with reliable results. This review describes a workflow to guide the application of machine learning in addiction research, detailing study design, data collection, data pre-processing, modeling, and communication of results. Key issues when applying machine learning include how to train, validate, and test a model; how to detect and characterize overfitting; and how to determine an adequate sample size. We also illustrate the process and its nuances with examples of how addiction researchers have applied machine learning techniques with different goals, study designs, and data sources, and we explain the main limitations of machine learning approaches and how best to address them. Used well, machine learning enriches the addiction research toolkit.
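
The train, validate, and test steps named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the review: the synthetic one-feature dataset, the 60/20/20 split, the k-nearest-neighbours classifier, and all function names are hypothetical. The sketch fits each candidate model on the training set, compares training and validation accuracy to expose overfitting, and touches the test set only once, after the model has been chosen.

```python
import random


def make_synthetic_data(n, seed=0):
    """Hypothetical toy data: one noisy feature and a binary label."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # The two classes overlap heavily, so no model can be perfect.
        x = rng.gauss(1.0 if label else -1.0, 1.5)
        data.append((x, label))
    return data


def split_three_ways(data, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle once, then cut into training, validation, and test sets."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    n_train = round(len(rows) * train_frac)
    n_val = round(len(rows) * val_frac)
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (k is odd)."""
    neighbours = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in neighbours)
    return 1 if 2 * votes > k else 0


def accuracy(train, rows, k):
    """Fraction of rows whose label the k-NN model predicts correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in rows)
    return hits / len(rows)


data = make_synthetic_data(300)
train, val, test = split_three_ways(data)

# Compare training and validation accuracy for several model complexities.
# Smaller k means a more flexible model that can memorise the training set.
results = {}
for k in (1, 5, 15, 31):
    results[k] = (accuracy(train, train, k), accuracy(train, val, k))
    print(f"k={k:2d}  train={results[k][0]:.2f}  validation={results[k][1]:.2f}")

# Choose the model on validation data only, then use the test set once.
best_k = max(results, key=lambda k: results[k][1])
test_acc = accuracy(train, test, best_k)
print(f"selected k={best_k}, test accuracy={test_acc:.2f}")
```

With k = 1 every training point is its own nearest neighbour, so training accuracy is perfect by construction; a validation accuracy well below the training accuracy is the usual sign of overfitting. Holding the test set back until after model selection keeps the final estimate from being biased by the selection step.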

Keywords: machine learning; addiction research methods; artificial intelligence; data analysis; data science.

Publication types

  • Review

MeSH terms

  • Data Collection
  • Humans
  • Machine Learning*
  • Workflow