An Assessment of the Accuracy and Consistency of ChatGPT in the Management of Midshaft Clavicle Fractures


Abstract

Background: Midshaft clavicle fractures are common orthopaedic injuries with no consensus on optimal management. Large language models (LLMs) such as ChatGPT (OpenAI, San Francisco, USA) present a novel tool for patient education and clinical decision-making. This study aimed to evaluate the accuracy and consistency of ChatGPT's responses to patient-focused and clinical decision-making questions regarding this injury.

Methods: ChatGPT-4o mini was prompted three times with 14 patient-focused and orthopaedic clinical decision-making questions. References were requested for each response. Response accuracy was graded as: (I) comprehensive; (II) correct but inadequate; (III) mixed with correct and incorrect information; or (IV) completely incorrect. Two consultant and two trainee orthopaedic surgeons evaluated the accuracy and consistency of responses. References provided by ChatGPT were evaluated for accuracy.

Results: All 42 responses were graded as (III), indicating a mix of correct and incorrect information, with 78.6% consistency across the responses. Of the 128 references provided, 0.8% were correct, 10.9% were incorrect, and 88.3% were fabricated. Only 3.1% of references accurately reflected the cited conclusions.

Conclusion: ChatGPT demonstrates limitations in accuracy and consistency when answering patient-focused queries or aiding in orthopaedic clinical decision-making for midshaft clavicle fractures. Caution is advised before integrating ChatGPT into clinical workflows for patients or orthopaedic clinicians.
