Clinical validation of AI-assisted contouring in prostate radiation therapy treatment planning: Highlighting automation bias and the need for standardized quality assurance


Abstract

PURPOSE: This study evaluated the impact of a commercial AI-assisted contouring tool on intra- and inter-observer variability in prostate radiation therapy and assessed the dosimetric consequences of geometric contour differences.

METHODS: Two experienced radiation oncologists independently delineated the clinical target volume (CTV) and organs at risk (OARs) for prostate cancer patients. Manual contours (C(man)) and AI-generated contours (C(AI)) were compared with adjusted AI contours (C(AI,adj)). A consensus reference (C(ref)) served as the benchmark. To evaluate clinical impact, treatment plans were recalculated and replanned on each contour set under identical beam geometries to assess dose-volume histogram (DVH) parameters.

RESULTS: AI-assisted contouring significantly improved both intra- and inter-observer agreement. Inter-observer analysis showed that the Dice similarity coefficient (DSC) for the CTV increased from 0.78 (± 0.11) for C(man) to 0.89 (± 0.09) for C(AI,adj). Similarly, intra-observer analysis showed that both oncologists achieved significantly higher DSC values for C(AI,adj) than for C(man). A thorough geometric comparison against C(ref) revealed that although adjustments to C(AI) improved accuracy, they generally did not surpass C(man) for the CTV and rectum. Dosimetric analyses demonstrated that, under fixed plan geometry, both C(man) and C(AI,adj) contours yielded lower planning target volume (PTV) D95% values than C(ref), whereas after replanning, all plans met institutional criteria with no clinically significant differences among contour sets.

CONCLUSION: AI-assisted contouring in prostate radiotherapy reduced intra- and inter-observer variability and improved contouring consistency. However, C(AI,adj) did not consistently surpass C(man), especially for the CTV and rectum, where automation bias or selective clinical acceptance may have influenced edits. Fixed-plan recalculations revealed dose differences arising from minor geometric deviations.
These findings underscore the importance of structured quality assurance (QA) and human oversight to mitigate automation bias while leveraging AI's efficiency. The single-institution design with two oncologists and one AI software limits generalizability, underscoring the need for multi-observer validation.
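The overlap metric reported throughout the abstract, the Dice similarity coefficient, is defined for two binary segmentation masks A and B as DSC = 2|A ∩ B| / (|A| + |B|). The following is a minimal illustrative sketch of that computation on voxel masks; it is not the study's analysis code, and the toy masks and function name are assumptions for demonstration only.

```python
import numpy as np

def dice_similarity(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Dice similarity coefficient between two binary segmentation masks.

    DSC = 2 * |A intersect B| / (|A| + |B|), ranging from 0 (no overlap)
    to 1 (identical contours).
    """
    a = mask_a.astype(bool)
    b = mask_b.astype(bool)
    intersection = np.logical_and(a, b).sum()
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * intersection / total

# Hypothetical example: two partially overlapping square "contours"
a = np.zeros((10, 10), dtype=bool)
a[2:8, 2:8] = True    # 36 voxels
b = np.zeros((10, 10), dtype=bool)
b[4:10, 4:10] = True  # 36 voxels; overlaps a in a 4x4 region (16 voxels)
print(round(dice_similarity(a, b), 3))  # 2*16 / (36+36) ≈ 0.444
```

In practice the same formula applies slice-wise or to full 3D contour volumes; a value such as the 0.89 reported for C(AI,adj) indicates substantially tighter agreement than the 0.78 for manual contours.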
