Performance of ChatGPT-4 as an Auxiliary Tool: Evaluation of Accuracy and Repeatability on Orthodontic Radiology Questions

Abstract

Background: Large language models (LLMs) are increasingly considered for use in dentistry, yet their accuracy in orthodontic radiology remains uncertain. This study evaluated the performance of ChatGPT-4 on questions aligned with current radiology guidelines. Methods: Fifty short, guideline-anchored questions were authored, of which thirty were selected a priori for their diagnostic relevance. Using the ChatGPT-4 web interface in March 2025, we obtained 30 answers per item (900 in total) across two user accounts and three times of day, each in a new chat with a standardised prompt. Two blinded experts graded all responses on a 3-point scale (0 = incorrect, 1 = partially correct, 2 = correct); disagreements were adjudicated. The primary outcome was strict accuracy (the proportion of answers graded 2). Secondary outcomes were partial-credit performance (mean score on the 0-2 scale) and inter-rater agreement, assessed with multiple coefficients. Results: Strict accuracy was 34.1% (95% CI 31.0-37.2), with wide item-level variability (0-100%). The mean partial-credit score was 1.09/2.00 (median 1.02; IQR 0.53-1.83). Inter-rater agreement was high (percent agreement: 0.938, with coefficients indicating substantial to almost-perfect reliability). Conclusions: Under the conditions of this study, ChatGPT-4 demonstrated limited strict accuracy on orthodontic radiology questions, while expert grading of its responses showed substantial reliability. These findings underline its potential as a complementary educational and decision-support resource while also highlighting its present limitations. Its role should remain supportive and informative, never replacing the critical appraisal and professional judgement of the clinician.
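As a rough illustration of how the headline figures fit together, the sketch below recomputes a 95% confidence interval for strict accuracy and shows how percent agreement and one chance-corrected coefficient could be obtained from two raters' grades. Several things here are assumptions rather than details from the abstract: the count of 307 correct answers is inferred from 34.1% of 900, the interval uses the simple Wald (normal-approximation) formula, Cohen's kappa stands in for the unnamed "multiple coefficients", and the two short grade lists are invented toy data.

```python
import math


def wald_ci(successes, n, z=1.96):
    """Proportion with a Wald (normal-approximation) confidence interval."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width


def percent_agreement(rater_a, rater_b):
    """Share of responses that both raters graded identically."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b, categories=(0, 1, 2)):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    # Chance agreement from each rater's marginal grade frequencies.
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)


# Assumption: 307 of 900 answers graded 2, inferred from the reported 34.1%.
p, lower, upper = wald_ci(307, 900)
print(f"strict accuracy {p:.1%} (95% CI {lower:.1%} to {upper:.1%})")

# Hypothetical toy grades on the 0-2 scale, not study data.
grades_a = [2, 2, 1, 0, 2, 1, 0, 0]
grades_b = [2, 2, 1, 0, 1, 1, 0, 0]
print(f"percent agreement {percent_agreement(grades_a, grades_b):.3f}")
print(f"Cohen's kappa {cohen_kappa(grades_a, grades_b):.3f}")
```

Under these assumptions the Wald interval comes out at roughly 31.0% to 37.2%, in line with the reported figures. It treats all 900 answers as independent, which repeated answers to the same 30 items may not be, so the study itself may have used a different interval method.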
