A retrospective evaluation of the potential of ChatGPT in the accurate diagnosis of acute stroke

回顾性评估 ChatGPT 在急性卒中准确诊断中的应用潜力

阅读:1

Abstract

PURPOSE: Stroke is a neurological emergency requiring rapid, accurate diagnosis to prevent severe consequences. Early diagnosis is crucial for reducing morbidity and mortality. Artificial intelligence (AI) diagnosis support tools, such as Chat Generative Pre-trained Transformer (ChatGPT), offer rapid diagnostic advantages. This study assesses ChatGPT's accuracy in interpreting diffusion-weighted imaging (DWI) for acute stroke diagnosis. METHODS: A retrospective analysis was conducted to identify the presence of stroke using DWI and apparent diffusion coefficient (ADC) map images. Patients aged >18 years who exhibited diffusion restriction and had a clinically explainable condition were included in the study. Patients with artifacts that affected image homogeneity, accuracy, and clarity, as well as those who had undergone previous surgery or had a history of stroke, were excluded from the study. ChatGPT was asked four consecutive questions regarding the identification of the magnetic resonance imaging (MRI) sequence, the demonstration of diffusion restriction on the ADC map after sequence recognition, and the identification of hemispheres and specific lobes. Each question was repeated 10 times to ensure consistency. Senior radiologists subsequently verified the accuracy of ChatGPT's responses, classifying them as either correct or incorrect. We assumed a response to be incorrect if it was partially correct or suggested multiple answers. These responses were systematically recorded. We also recorded non-responses from ChatGPT-4V when it failed to provide an answer to a query. We assessed ChatGPT-4V's performance by calculating the number and percentage of correct responses, incorrect responses, and non-responses across all images and questions, a metric known as "accuracy." ChatGPT-4V was considered successful if it answered ≥80% of the examples correctly. RESULTS: A total of 530 diffusion MRI, of which 266 were stroke images and 264 were normal, were evaluated in the study. For the initial query identifying MRI sequence type, ChatGPT-4V's accuracy was 88.3% for stroke and 90.1% for normal images. For detecting diffusion restriction, ChatGPT-4V had an accuracy of 79.5% for stroke images, with a 15% false positive rate for normal images. Regarding identifying the brain or cerebellar hemisphere involved, ChatGPT-4V correctly identified the hemisphere in 26.2% of stroke images. For identifying the specific brain lobe or cerebellar area affected, ChatGPT-4V had a 20.4% accuracy for stroke images. The diagnostic sensitivity of ChatGPT-4V in acute stroke was found to be 79.57%, with a specificity of 84.87%, a positive predictive value of 83.86%, a negative predictive value of 80.80%, and a diagnostic odds ratio of 21.86. CONCLUSION: Despite limitations, ChatGPT shows potential as a supportive tool for healthcare professionals in interpreting diffusion examinations in stroke cases, where timely diagnosis is critical. CLINICAL SIGNIFICANCE: ChatGPT can play an important role in various aspects of stroke cases, such as risk assessment, early diagnosis, and treatment planning.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。