Some methodological questions concerning receiver operating characteristic (ROC) analysis as a method for assessing image quality in radiology

J Digit Imaging. 1990 Nov;3(4):211-8. doi: 10.1007/BF03168117.

Abstract

This paper raises five methodological questions concerning receiver operating characteristic (ROC) analysis: (1) can the ROC "confidence criterion" be applied in a valid, reliable way; (2) can ROC deal with ambiguous findings; (3) can ROC deal effectively with false-negative findings; (4) are ROC curves susceptible to valid statistical testing; and (5) are ROC results useful in choosing among alternative imaging modalities? A review of the evidence leads to six conclusions. First, using ROC, all radiological findings must be unambiguously scored as true-positive, true-negative, false-positive, or false-negative, often forcing arbitrary, procrustean choices on readers and evaluators. Second, ROC requires radiologists to report findings by confidence level on a consistent, reliable basis throughout an ROC experiment, something that seems unrealistic given what is known about human performance in almost all perceptual tasks of comparable complexity. Third, as gathered during the typical experiment, ROC data are probably nominal but are treated as if they were ordinal (or even interval) data, leading to distorted results. Fourth, ROC does not deal effectively with false negatives, despite their importance. Fifth, there is no satisfactory method for statistically testing the significance of observed differences between two ROC curves if they are based on nominal data. Finally, the artificial tasks required of radiologists in an ROC evaluation limit the usefulness of ROC results in choosing among imaging modalities.
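For readers unfamiliar with the mechanics being questioned, the sketch below shows one common way an empirical ROC curve is built from confidence ratings. It is not taken from the paper: the five-point rating scale, the function names `roc_points` and `trapezoid_auc`, and the example data are all assumptions made for illustration. Each rating is thresholded at every confidence level, so the resulting points depend on readers using the scale consistently and on every case having an unambiguous ground truth, which are the two assumptions the abstract disputes.

```python
from typing import List, Tuple


def roc_points(ratings: List[int], truth: List[bool],
               levels: int = 5) -> List[Tuple[float, float]]:
    """Return (false-positive fraction, true-positive fraction) pairs.

    Assumes (hypothetically) an integer confidence scale 1..levels, where a
    higher rating means greater confidence that an abnormality is present,
    and a binary ground truth for every case.
    """
    n_pos = sum(truth)
    n_neg = len(truth) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")

    points = [(0.0, 0.0)]
    # Sweep the decision threshold from the strictest to the most lenient level.
    for threshold in range(levels, 0, -1):
        tp = sum(1 for r, t in zip(ratings, truth) if t and r >= threshold)
        fp = sum(1 for r, t in zip(ratings, truth) if not t and r >= threshold)
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_auc(points: List[Tuple[float, float]]) -> float:
    """Area under the empirical curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up example: four abnormal and four normal cases rated by one reader.
example_ratings = [5, 4, 3, 2, 1, 2, 3, 1]
example_truth = [True, True, True, True, False, False, False, False]

example_points = roc_points(example_ratings, example_truth)
example_auc = trapezoid_auc(example_points)
```

Note that thresholding with `>=` treats the ratings as at least ordinal. If the ratings are effectively nominal, as the abstract argues, the ordering of the thresholds and therefore the curve itself loses its meaning.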

MeSH terms

  • False Negative Reactions
  • False Positive Reactions
  • Humans
  • Observer Variation
  • ROC Curve*
  • Radiographic Image Enhancement*