Segmentation of glioma is crucial for quantitative brain tumor assessment, to guide therapeutic research and clinical management, but very time-consuming. Fully automated tools for the segmentation of multi-sequence MRI are needed. We developed and pretrained a deep learning (DL) model using publicly available datasets A (n = 210) and B (n = 369) containing FLAIR, T2WI, and contrast-enhanced (CE)-T1WI. This was then fine-tuned with our institutional dataset (n = 197) containing ADC, T2WI, and CE-T1WI, manually annotated by radiologists, and split into training (n = 100) and testing (n = 97) sets. The Dice similarity coefficient (DSC) was used to compare model outputs and manual labels. A third independent radiologist assessed segmentation quality on a semi-quantitative 5-scale score. Differences in DSC between new and recurrent gliomas, and between uni or multifocal gliomas were analyzed using the Mann-Whitney test. Semi-quantitative analyses were compared using the chi-square test. We found that there was good agreement between segmentations from the fine-tuned DL model and ground truth manual segmentations (median DSC: 0.729, std-dev: 0.134). DSC was higher for newly diagnosed (0.807) than recurrent (0.698) (p < 0.001), and higher for unifocal (0.747) than multi-focal (0.613) cases (p = 0.001). Semi-quantitative scores of DL and manual segmentation were not significantly different (mean: 3.567 vs. 3.639; 93.8% vs. 97.9% scoring ≥ 3, p = 0.107). In conclusion, the proposed transfer learning DL performed similarly to human radiologists in glioma segmentation on both structural and ADC sequences. Further improvement in segmenting challenging postoperative and multifocal glioma cases is needed.
Keywords: BraTS; Deep learning; Glioma; Segmentation.
© 2024. The Author(s).