Cluster failure or power failure? Evaluating sensitivity in cluster-level inference

Neuroimage. 2020 Apr 1:209:116468. doi: 10.1016/j.neuroimage.2019.116468. Epub 2019 Dec 15.

Abstract

Pioneering work in human neuroscience has relied on the ability to map brain function using task-based fMRI, but the empirical validity of these inferential methods is still being characterized. A recent landmark study by Eklund and colleagues showed that popular multiple comparison corrections based on cluster extent suffer from unexpectedly low specificity (i.e., high false positive rate). Yet that study's focus on specificity, while important, is incomplete. The validity of a method depends also on its sensitivity (i.e., true positive rate or power), yet the sensitivity of cluster correction remains poorly understood. Here, we assessed the sensitivity of gold-standard nonparametric cluster correction by resampling real data from five tasks in the Human Connectome Project and comparing results with those from the full "ground truth" datasets (n = 480-493). Critically, we found that sensitivity after correction is lower than may be practical for many fMRI applications. In particular, sensitivity to medium-sized effects (|Cohen's d| = 0.5) was less than 20% across tasks on average, about three times smaller than without any correction. Furthermore, cluster extent correction exhibited a spatial bias in sensitivity that was independent of effect size. In comparison, correction based on the Threshold-Free Cluster Enhancement (TFCE) statistic approximately doubled sensitivity across tasks but increased spatial bias. These results suggest that we have, until now, only measured the tip of the iceberg in the activation-mapping literature due to our goal of limiting the familywise error rate through cluster extent-based inference. There is a need to revise our practices to improve sensitivity; we therefore conclude with a list of modern strategies to boost sensitivity while maintaining respectable specificity in future investigations.
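
The resampling logic described above (draw small subsamples from a large dataset, run corrected inference on each, and count how often voxels with a known full-sample effect size are detected) can be illustrated with a rough sketch. This is not the authors' pipeline: it assumes a plain subjects-by-voxels array, and it swaps in a voxelwise sign-flip permutation test on the maximum |t| as a simple stand-in for cluster extent or TFCE correction. The function names, the effect-size bin limits, and the parameter defaults are all hypothetical choices made for illustration.

```python
import numpy as np


def cohens_d(x):
    """One-sample Cohen's d per voxel: mean over subjects / sample SD."""
    return x.mean(axis=0) / x.std(axis=0, ddof=1)


def max_t_threshold(x, n_perm, rng):
    """95th percentile of the max |t| under random sign flips.

    Stand-in for a nonparametric FWER correction (voxelwise, not
    cluster-based; an assumption of this sketch).
    """
    n = x.shape[0]
    max_t = np.empty(n_perm)
    for i in range(n_perm):
        signs = rng.choice([-1.0, 1.0], size=(n, 1))  # one sign per subject
        max_t[i] = np.abs(cohens_d(signs * x) * np.sqrt(n)).max()
    return np.quantile(max_t, 0.95)


def estimate_sensitivity(full, n_sub, n_rep, d_lo, d_hi, n_perm=100, seed=0):
    """Mean detection rate over voxels whose full-sample |d| lies in [d_lo, d_hi)."""
    rng = np.random.default_rng(seed)
    d_full = np.abs(cohens_d(full))            # "ground truth" effect sizes
    target = (d_full >= d_lo) & (d_full < d_hi)
    if not target.any():
        return float("nan")                    # no voxels in this effect-size bin
    hits = np.zeros(full.shape[1])
    for _ in range(n_rep):
        idx = rng.choice(full.shape[0], size=n_sub, replace=False)
        sample = full[idx]
        t = cohens_d(sample) * np.sqrt(n_sub)  # one-sample t statistic per voxel
        hits += np.abs(t) > max_t_threshold(sample, n_perm, rng)
    return float((hits[target] / n_rep).mean())
```

With synthetic data shaped like `(480, n_voxels)`, a call such as `estimate_sensitivity(data, n_sub=20, n_rep=50, d_lo=0.4, d_hi=0.6)` returns the fraction of subsamples in which medium-effect voxels survive correction. The values this toy produces are not comparable to the figures reported in the abstract, since the data, the correction, and the binning all differ.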

Keywords: Activation; Empirical; HCP; Power; Resampling; Sensitivity; fMRI.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Adult
  • Brain / diagnostic imaging*
  • Cluster Analysis
  • Connectome
  • Data Interpretation, Statistical*
  • Functional Neuroimaging / standards*
  • Humans
  • Magnetic Resonance Imaging / standards*
  • Neuropsychological Tests
  • Reproducibility of Results
  • Sensitivity and Specificity