Autosegmentation for thoracic radiation treatment planning: A grand challenge at AAPM 2017

Med Phys. 2018 Oct;45(10):4568-4581. doi: 10.1002/mp.13141. Epub 2018 Sep 19.

Abstract

Purpose: This report presents the methods and results of the Thoracic Auto-Segmentation Challenge organized at the 2017 Annual Meeting of the American Association of Physicists in Medicine. The purpose of the challenge was to provide a benchmark dataset and platform for evaluating the performance of methods for autosegmentation of organs at risk (OARs) in thoracic CT images.

Methods: Sixty thoracic CT scans provided by three different institutions were separated into 36 training, 12 offline testing, and 12 online testing scans. Eleven participants completed the offline challenge, and seven completed the online challenge. The OARs were left and right lungs, heart, esophagus, and spinal cord. Clinical contours used for treatment planning were quality checked and edited to adhere to the RTOG 1106 contouring guidelines. Algorithms were evaluated using the Dice coefficient, Hausdorff distance, and mean surface distance. A consolidated score was computed by normalizing the metrics against interrater variability and averaging over all patients and structures.
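
As an illustration of the three evaluation metrics named above, the following is a minimal Python sketch, not the challenge's actual evaluation code. It assumes segmentations are given as binary masks (for Dice) and that structure surfaces are given as arrays of point coordinates in millimeters (for the distance metrics). The function names, the symmetric averaging used for mean surface distance, and the use of the maximum (rather than a percentile) Hausdorff distance are assumptions made for illustration. The consolidated score is not sketched because the abstract does not state the normalization formula.

```python
import numpy as np


def dice(mask_a, mask_b):
    """Dice coefficient between two binary masks of the same shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        # Both masks empty: treated here as perfect agreement (assumption).
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / total


def _directed_min_distances(p, q):
    """For each point in p, the Euclidean distance to its nearest point in q."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Brute-force pairwise distances; adequate for small point sets only.
    diff = p[:, None, :] - q[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)


def hausdorff(surface_a, surface_b):
    """Symmetric (maximum) Hausdorff distance between two surface point sets."""
    d_ab = _directed_min_distances(surface_a, surface_b)
    d_ba = _directed_min_distances(surface_b, surface_a)
    return max(d_ab.max(), d_ba.max())


def mean_surface_distance(surface_a, surface_b):
    """Average of the two directed mean nearest-point distances (assumed form)."""
    d_ab = _directed_min_distances(surface_a, surface_b)
    d_ba = _directed_min_distances(surface_b, surface_a)
    return 0.5 * (d_ab.mean() + d_ba.mean())


if __name__ == "__main__":
    # Toy inputs, not challenge data.
    auto_mask = np.array([[1, 1], [0, 0]])
    manual_mask = np.array([[1, 0], [0, 0]])
    auto_surface = np.array([[0.0, 0.0], [1.0, 0.0]])
    manual_surface = np.array([[0.0, 0.0], [3.0, 0.0]])

    print("Dice:", dice(auto_mask, manual_mask))
    print("Hausdorff (mm):", hausdorff(auto_surface, manual_surface))
    print("Mean surface distance (mm):",
          mean_surface_distance(auto_surface, manual_surface))
```

Dice measures volumetric overlap and is sensitive to structure size, which is one reason small or thin structures tend to score lower on it, while the two distance metrics report boundary disagreement in physical units.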

Results: The interrater study revealed the highest variability in Dice for the esophagus and spinal cord, and in surface distances for the lungs and heart. Five of the seven algorithms that participated in the online challenge employed deep-learning methods. Although the top three participants using deep learning produced the best segmentation for all structures, there was no significant difference in performance among them. The fourth-place participant used a multi-atlas-based approach. The highest Dice scores were produced for the lungs, with averages ranging from 0.95 to 0.98, while the lowest Dice scores were produced for the esophagus, with a range of 0.55-0.72.

Conclusion: The results of the challenge showed that the lungs and heart can be segmented fairly accurately by various algorithms, while deep-learning methods performed better on the esophagus. Our dataset, together with the manual contours for all training cases, continues to be publicly available as an ongoing benchmarking resource.

Keywords: automatic segmentation; grand challenge; lung cancer; radiation therapy.

MeSH terms

  • Algorithms
  • Humans
  • Organs at Risk / radiation effects
  • Radiotherapy Planning, Computer-Assisted / methods*
  • Radiotherapy, Image-Guided / adverse effects
  • Radiotherapy, Image-Guided / methods*
  • Thorax / diagnostic imaging*
  • Thorax / radiation effects*
  • Tomography, X-Ray Computed