Purpose: The objective of this study is to evaluate the diagnostic accuracy, interobserver variability, and common lexicon pitfalls of the ACR O-RADS scoring system among staff radiologists with no prior O-RADS experience.
Materials and methods: After independent review of the ACR O-RADS publications and 30 training cases, three fellowship-trained, board-certified staff radiologists scored 50 pelvic ultrasound exams using the O-RADS system. Diagnostic accuracy and the area under the receiver operating characteristic curve (AUC) were analyzed for each reader. Overall agreement and pair-wise agreement between readers were also analyzed.
Results: Specificities (92 to 100%) and NPVs (92 to 100%) were excellent, while sensitivities (72 to 100%) and PPVs (66 to 100%) were more variable. Considering O-RADS 4 and O-RADS 5 as predictors of malignancy, individual reader AUC values ranged from 0.94 to 0.98 (p < 0.001). Overall inter-reader agreement for all 3 readers was "very good," k = 0.82 (95% CI 0.73 to 0.90, p < 0.001). Pair-wise agreement between readers was also "very good," k = 0.86-0.92. Fourteen of 150 lesions were misclassified, with the most common error being down-scoring of a solid lesion with irregular outer contours.
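As an illustration of how the per-reader metrics and pair-wise agreement reported above are typically computed, the following is a minimal sketch rather than the study's actual analysis. It assumes O-RADS 4 and 5 count as test-positive, uses unweighted Cohen's kappa for a reader pair (the abstract does not state which kappa variant was used), and runs on small hypothetical score lists that are not study data. The function names are illustrative.

```python
from collections import Counter


def diagnostic_metrics(scores, malignant, threshold=4):
    """Sensitivity, specificity, PPV, and NPV for one reader.

    A score >= threshold (O-RADS 4 or 5 by default) is treated as
    test-positive. Assumes each denominator is non-zero.
    """
    tp = fp = tn = fn = 0
    for score, is_malignant in zip(scores, malignant):
        positive = score >= threshold
        if positive and is_malignant:
            tp += 1
        elif positive:
            fp += 1
        elif is_malignant:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def cohen_kappa(reader_a, reader_b):
    """Unweighted Cohen's kappa for two readers scoring the same lesions."""
    n = len(reader_a)
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    counts_a, counts_b = Counter(reader_a), Counter(reader_b)
    # Chance agreement: product of the two readers' marginal proportions.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical O-RADS scores and reference outcomes, for illustration only.
reader_a = [2, 3, 4, 5, 2, 4, 5, 3]
reader_b = [2, 3, 4, 5, 3, 4, 5, 3]
malignant = [False, False, True, True, False, False, True, False]

metrics_a = diagnostic_metrics(reader_a, malignant)
kappa_ab = cohen_kappa(reader_a, reader_b)
```

Agreement across all three readers at once would need a multi-rater statistic such as Fleiss' kappa, which this sketch does not cover.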
Conclusion: Even without specific training, experienced ultrasound readers can achieve excellent diagnostic performance and high inter-reader reliability with self-directed review of guidelines and cases. The study highlights the effectiveness of ACR O-RADS as a stratification tool for radiologists and supports its continued use in practice.
Keywords: Accuracy; Inter-observer variability; O-RADS; Ovarian cysts; Reliability; Ultrasound.
© 2021. The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.