Improving diagnostic accuracy in preoperative glioma classification: performance of knowledge-enhanced large language models compared with radiologists



Abstract

Accurate preoperative MRI classification of gliomas is essential but challenging due to complex radiological features and inter-observer variability. This study evaluated three large language models (LLMs) for VASARI-based glioma classification and compared them with radiologist interpretations. We retrospectively analyzed 150 histopathologically confirmed gliomas (43 circumscribed astrocytic, 53 high-grade diffuse, 54 low-grade diffuse gliomas) imaged with standardized MRI protocols. Three radiologists extracted VASARI features, while three LLMs (GPT-4, Claude3.5-Sonnet, Claude3.0-Opus) analyzed these features using either standard input-output prompting or knowledge-enhanced prompting that incorporated diagnostic guidelines. Knowledge-enhanced prompting consistently outperformed standard prompting, improving diagnostic consistency (intra-model agreement: Sonnet κ = 0.91, Opus κ = 0.92, GPT-4 κ = 0.72). For diffuse versus circumscribed classification, senior radiologists (AUC = 0.88) and Claude3.5-Sonnet with knowledge-enhanced prompting (AUC = 0.84) performed similarly (p > 0.05). LLM assistance significantly improved junior radiologists' performance, raising AUC from 0.77 to 0.83 (p = 0.026). Knowledge-enhanced LLMs demonstrate diagnostic performance comparable to experienced radiologists and improve junior radiologists' accuracy, suggesting potential as decision-support tools that require radiologist oversight.
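The intra-model agreement values above are reported as κ. As a minimal sketch of how such a statistic can be computed, the example below implements unweighted Cohen's kappa between two repeated runs of the same model on the same cases. The abstract does not state which kappa variant the authors used, so treating it as unweighted Cohen's kappa is an assumption. The function name `cohen_kappa` and the toy labels are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(run_a, run_b):
    """Unweighted Cohen's kappa between two equally long label sequences.

    Assumption: each sequence holds one predicted class per case, in the
    same case order, e.g. two repeated runs of the same model.
    """
    if len(run_a) != len(run_b) or not run_a:
        raise ValueError("runs must be non-empty and of equal length")
    n = len(run_a)
    # Observed agreement: fraction of cases given the same label in both runs.
    observed = sum(a == b for a, b in zip(run_a, run_b)) / n
    # Chance agreement: sum over classes of the product of marginal frequencies.
    freq_a, freq_b = Counter(run_a), Counter(run_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both runs used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy labels for illustration only (not taken from the study).
run_1 = ["circumscribed", "high-grade", "low-grade",
         "high-grade", "low-grade", "circumscribed"]
run_2 = ["circumscribed", "high-grade", "low-grade",
         "low-grade", "low-grade", "circumscribed"]

kappa = cohen_kappa(run_1, run_2)
print(f"kappa = {kappa:.2f}")
```

In this toy case the two runs agree on 5 of 6 cases (observed agreement 0.83) and chance agreement is 0.33, so κ is lower than the raw agreement rate. That gap is the reason κ, rather than simple percent agreement, is the usual choice for reporting consistency.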
