Purpose: To evaluate whether automated knowledge-based planning (KBP) (a) is noninferior to human-driven planning across multiple disease sites and (b) systematically affects dosimetric plan quality and variability.
Methods and materials: Clinical KBP automated planning routines were developed for prostate, prostatic fossa, hypofractionated lung, and head and neck. Clinical implementation consisted of independent generation of human-driven and KBP plans (145 cases across all sites), followed by blinded plan selection. Reviewing physicians were prompted to select a single plan; when plan equivalence was volunteered, this was scored as a KBP selection. Plan selection analysis used a noninferiority framework testing the hypothesis that KBP is not worse than human-driven planning (threshold: lower bound of the 95% confidence interval [CI] > 0.45 = noninferiority; > 0.5 = superiority). Target and organ-at-risk metrics were compared by dose differencing: ΔD_x = D_{x,human} − D_{x,KBP} (2-tailed paired t test, Bonferroni-corrected P < .05 significance threshold). To evaluate the aggregated effect of KBP on planning performance, we compared post-KBP dosimetric parameters against 183 plans generated just before KBP implementation (2-tailed unpaired t test, Bonferroni-corrected P < .05).
Results: Across all disease sites, the KBP success rate (physician preferred + equivalent) was noninferior compared with human-driven planning (83 of 145 = 57.2%; 95% CI, 49.2%-65.3%) but did not cross the threshold for superiority. By disease site, the KBP success rate was superior for head and neck ([22 + 2]/36 = 66.7%; 95% CI, 51%-82%) and noninferior for lung stereotactic body radiation therapy ([21 + 2]/36 = 63.9%; 95% CI, 48%-80%) but did not meet noninferiority criteria for prostate ([16 + 3]/41 = 46.3%; 95% CI, 31%-62%) or prostatic fossa ([17 + 0]/32 = 53.1%; 95% CI, 36%-70%). Prostate, prostatic fossa, and head and neck showed significant differences in KBP-selected plans versus human-selected plans, with KBP generally exhibiting greater organ-at-risk sparing and human plans exhibiting better target homogeneity. Analysis of plan quality pre- and post-KBP showed some reductions in organ doses and quality metric variability in prostate and head and neck.
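The noninferiority classification can be illustrated with a minimal sketch that applies the stated thresholds (lower 95% CI bound > 0.45 for noninferiority, > 0.5 for superiority) to the reported selection counts. The abstract does not state how the intervals were computed; the normal-approximation (Wald) interval used here is an assumption, chosen because it approximately reproduces the reported bounds. The function names `wald_interval` and `classify` are hypothetical.

```python
import math


def wald_interval(successes, n, z=1.96):
    """Return (proportion, lower, upper) for a normal-approximation 95% CI.

    Assumption: the abstract does not name the interval method; the Wald
    interval is used here only because it closely matches the reported bounds.
    """
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width


def classify(lower, noninferiority=0.45, superiority=0.50):
    """Apply the thresholds from the methods to a lower CI bound."""
    if lower > superiority:
        return "superior"
    if lower > noninferiority:
        return "noninferior"
    return "noninferiority not shown"


# KBP successes (physician preferred + equivalent) and case counts,
# taken from the results above.
counts = {
    "all sites": (83, 145),
    "head and neck": (22 + 2, 36),
    "lung SBRT": (21 + 2, 36),
    "prostate": (16 + 3, 41),
    "prostatic fossa": (17 + 0, 32),
}

for site, (successes, n) in counts.items():
    p, lower, upper = wald_interval(successes, n)
    print(f"{site}: {p:.1%} (95% CI, {lower:.1%}-{upper:.1%}) -> {classify(lower)}")
```

Under this assumption, the classification of each site depends only on where the lower bound falls relative to the two thresholds, which is why prostatic fossa, despite a success rate above 50%, does not meet the noninferiority criterion.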
Conclusions: Fully automated KBP was noninferior to human-driven plan optimization across multiple disease sites. Dosimetric analysis of treatment plans before and after KBP implementation showed a systematic shift to higher plan quality and lower variability with the introduction of KBP.
Copyright © 2019 Elsevier Inc. All rights reserved.