Improving diagnostic accuracy in preoperative glioma classification: performance of knowledge-enhanced large language models compared with radiologists



Authors: Li, Shuang; Fang, Xin; Jin, Yuqi; Deng, YuJiao; Hu, Wei; Wu, Bing; Zhou, Xiaobo; Wang, Guotai; Li, Kang; Yue, Qiang

Journal:	npj Precision Oncology	Impact factor:	8.000
Year:	2025	Citation:	2025 Nov 27;9(1):383
DOI:	10.1038/s41698-025-01171-6	Disease type:	Glioma

Abstract

Accurate preoperative MRI classification of gliomas is essential but challenging due to complex radiological features and inter-observer variability. This study evaluated three large language models (LLMs) for VASARI-based glioma classification compared to radiologist interpretations. We retrospectively analyzed 150 histopathologically confirmed gliomas (43 circumscribed astrocytic, 53 high-grade diffuse, 54 low-grade diffuse gliomas) using standardized MRI protocols. Three radiologists extracted VASARI features, while three LLMs (GPT-4, Claude3.5-Sonnet, Claude3.0-Opus) analyzed these features using standard input-output or knowledge-enhanced prompting incorporating diagnostic guidelines. Knowledge-enhanced prompting consistently outperformed standard prompting, improving diagnostic consistency (intra-model agreement: Sonnet κ = 0.91, Opus κ = 0.92, GPT-4 κ = 0.72). For diffuse versus circumscribed classification, senior radiologists (AUC = 0.88) and Claude3.5-Sonnet with knowledge-enhanced prompting (AUC = 0.84) performed similarly (p > 0.05). LLM assistance significantly improved junior radiologists' performance, with AUC increases from 0.77 to 0.83 (p = 0.026). Knowledge-enhanced LLMs demonstrate diagnostic performance comparable to experienced radiologists and improve junior accuracy, suggesting potential as decision-support tools requiring radiologist oversight.
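The intra-model agreement values quoted in the abstract are kappa statistics. As a rough illustration of how such a value can be computed, the sketch below implements Cohen's kappa for two repeated runs of one model over the same cases. The abstract does not state which kappa variant the authors used, so the choice of Cohen's kappa is an assumption. The function name `cohen_kappa` and the six example labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(run_a, run_b):
    """Cohen's kappa between two equally long label sequences."""
    if len(run_a) != len(run_b) or not run_a:
        raise ValueError("runs must be non-empty and of equal length")
    n = len(run_a)
    # Observed agreement: fraction of cases labelled identically in both runs.
    p_observed = sum(a == b for a, b in zip(run_a, run_b)) / n
    # Chance agreement, from each run's marginal label frequencies.
    freq_a, freq_b = Counter(run_a), Counter(run_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both runs used one identical label throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical labels (not study data): two runs of one model on six cases.
run_1 = ["diffuse", "diffuse", "circumscribed", "diffuse", "circumscribed", "diffuse"]
run_2 = ["diffuse", "diffuse", "circumscribed", "circumscribed", "circumscribed", "diffuse"]
kappa = cohen_kappa(run_1, run_2)
print(f"kappa = {kappa:.2f}")
```

In this toy example the two runs agree on five of six cases (observed agreement 0.83) while chance agreement is 0.50, so kappa is about 0.67. Kappa corrects raw agreement for the agreement expected by chance, which is why it is preferred over simple percent agreement when reporting consistency.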
