Much attention is currently devoted to developing diagnostic classifiers for mental disorders. Complementing these efforts, we highlight the potential of machine learning to yield biological insights into the psychopathology and nosology of mental disorders. Studies to this end have mainly used brain imaging data, which can be obtained noninvasively from large cohorts and have repeatedly been argued to reveal potential intermediate phenotypes. This may become particularly relevant in light of recent efforts to identify magnetic resonance imaging-derived biomarkers that yield insight into pathophysiological processes and to refine the taxonomy of mental illness. In particular, the accuracy of machine learning models may be used as a dependent variable to identify features relevant to pathophysiology. Moreover, such approaches may help disentangle the dimensional (within diagnosis) and often overlapping (across diagnoses) symptomatology of psychiatric illness. We also point out a multiview perspective that combines data from different sources, bridging molecular and system-level information. We then summarize recent efforts toward a data-driven definition of subtypes or disease entities through unsupervised and semisupervised approaches. The latter, blending unsupervised and supervised concepts, may represent a particularly promising avenue toward dissecting heterogeneous categories. Finally, we raise several technical and conceptual issues related to the reviewed approaches. In particular, we discuss common pitfalls pertaining to flawed input data or analytic procedures that would likely lead to unreliable outputs.
Keywords: Biomarker; Heterogeneous dissection; Machine learning; Multiview integration; Nosology; Psychiatric disorder.
Copyright © 2022 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.