When AI joins the table: evaluating large language model performance in soft tissue sarcoma tumor board decisions



Abstract

OBJECTIVES: Multidisciplinary tumor boards (MDTs) are critical for the personalized management of soft tissue sarcomas (STS), but they are limited by time, costs, and resource demands. With recent advances in large language models (LLMs) like ChatGPT, there is growing interest in evaluating their potential role in augmenting MDT workflows. This study aimed to assess the clinical performance of ChatGPT-4o in real-world STS cases using predefined evaluation criteria, comparing its treatment suggestions with expert MDT decisions.

MATERIALS AND METHODS: This retrospective study included 152 patients presented to the multidisciplinary sarcoma tumor board. ChatGPT-4o was prompted to generate guideline-based treatment recommendations based on anonymized tumor board registration letters. Outputs were scored by blinded expert reviewers using a five-domain framework: diagnostic modalities, therapeutic modalities, treatment sequencing/timing, chemotherapy regimen, and clinical contextualization. Descriptive statistics and non-parametric ANOVA with post hoc tests assessed performance, including subgroup analysis by sarcoma subtype.

RESULTS: ChatGPT-4o scores were significantly lower than the maximum achievable value of 1.0 across all five criteria (all p < 0.0001). Among individual domains, clinical contextualization significantly outperformed all other criteria in pairwise comparisons (all p < 0.05). No significant performance differences were observed across sarcoma subtypes (H = 19.74, p = 0.138).

CONCLUSIONS: ChatGPT-4o demonstrated substantial expert-rated performance in generating tumor board recommendations for soft tissue sarcoma cases, particularly excelling in personalized contextualization. Discrepancies in treatment sequencing and chemotherapy selection highlight the need for expert oversight. These findings support the feasibility of LLM integration into oncology workflows, warranting further refinement toward safe, supportive clinical use.
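
The subtype comparison in the results is reported as an H statistic, which is consistent with a Kruskal-Wallis test, although the abstract only says "non-parametric ANOVA". As a minimal sketch under that assumption, the statistic can be computed from per-group scores as below. The function name `kruskal_wallis_h`, the group labels, and the toy scores are hypothetical and are not taken from the study; the sketch also assumes every group is non-empty and that not all scores are identical.

```python
from collections import defaultdict


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic.

    `groups` maps a group label (e.g. a subtype name) to a non-empty
    list of numeric scores. Illustrative helper, not the study's code.
    """
    # Pool all observations and sort them so ranks can be assigned.
    pooled = sorted(
        (score, name) for name, scores in groups.items() for score in scores
    )
    n_total = len(pooled)

    rank_sums = defaultdict(float)
    tie_term = 0
    i = 0
    while i < n_total:
        # Find the run of tied values starting at position i.
        j = i
        while j < n_total and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the mean of the 1-based ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        t = j - i
        tie_term += t**3 - t
        i = j

    h = 12 / (n_total * (n_total + 1)) * sum(
        rank_sums[name] ** 2 / len(scores) for name, scores in groups.items()
    ) - 3 * (n_total + 1)

    # Standard correction for ties; equals 1 when there are none.
    correction = 1 - tie_term / (n_total**3 - n_total)
    return h / correction


# Toy, made-up scores for three hypothetical groups.
toy_scores = {
    "group_a": [0.1, 0.2, 0.3],
    "group_b": [0.4, 0.5, 0.6],
    "group_c": [0.7, 0.8, 0.9],
}
h_value = kruskal_wallis_h(toy_scores)
```

Under the null hypothesis, H is usually compared against a chi-squared distribution with (number of groups minus 1) degrees of freedom to obtain a p-value. That step is omitted here to keep the sketch free of third-party dependencies.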
