Widely Available Large Language Models Are Not a Reliable Source to Address Medical Treatment Recommendations of Patients After a First-Time Anteroinferior Shoulder Dislocation

Abstract

PURPOSE: To assess the ability of ChatGPT 3.5 to aid in the treatment planning process for first-time anteroinferior shoulder dislocation.

METHODS: Forty fictional patient cases were created, varying in 15 different characteristics whose distribution was randomized. Six orthopaedic surgeons (3 residents and 3 specialists in shoulder surgery) were then asked to determine the best treatment option for each case. Their answers were compared with the treatment recommendations proposed by ChatGPT in 2 different sessions on the basis of preselected literature. To counteract the wide dispersion of responses, tendencies toward nonoperative, open surgical, or arthroscopic treatment were subsequently defined. The results were then analyzed descriptively.

RESULTS: The mean age of the fictional patients was 44 years (range, 13-80 years), and 57.5% of the patients were female. The ChatGPT responses from the 2 sessions agreed in 70.0% of cases. In contrast, the 3 residents agreed with each other in 35% of all cases and the 3 specialists agreed in 32.5% of all cases. The ChatGPT responses exactly matched all human assessments in 12.5% of cases. In 65.0% of all cases, the physicians showed similar tendencies in their choice of therapy, resulting in a 55.0% match between ChatGPT and the surgeons.

CONCLUSIONS: There was no clear consensus regarding the treatment of first-time anteroinferior shoulder dislocations, either among physicians or between physicians and ChatGPT 3.5. However, ChatGPT 3.5 and the physicians showed similar tendencies regarding treatment in more than half of the cases. Because of its inconsistent responses, ChatGPT 3.5 cannot yet be considered a reliable tool for therapy planning.

CLINICAL RELEVANCE: ChatGPT 3.5, which is widely available and free of charge, is increasingly used in clinical settings. However, it is crucial to highlight its limitations in treatment planning, especially for pathologies on which there is no clear consensus even among experienced surgeons.
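For readers who want to relate the reported percentages to case counts, the following is a minimal Python sketch, not the authors' analysis. It assumes every percentage in the abstract is taken over all 40 fictional cases; the abstract does not state the denominators explicitly. The names `cases_from_percent` and `percent_agreement` are hypothetical helpers introduced here for illustration. The second function shows how simple percent agreement between two sets of categorical treatment choices could be computed.

```python
from fractions import Fraction

N_CASES = 40  # number of fictional patient cases reported in the abstract

# Percentages as reported in the abstract. Assumption: each one is a share
# of all 40 cases; the abstract does not state the denominators.
REPORTED_PERCENTAGES = {
    "female patients": 57.5,
    "ChatGPT session 1 vs. session 2": 70.0,
    "agreement among the 3 residents": 35.0,
    "agreement among the 3 specialists": 32.5,
    "exact match, ChatGPT vs. all surgeons": 12.5,
    "surgeons with similar tendencies": 65.0,
    "tendency match, ChatGPT vs. surgeons": 55.0,
}


def cases_from_percent(percent, n=N_CASES):
    """Convert a percentage of n cases into a whole number of cases."""
    count = Fraction(str(percent)) * n / 100
    if count.denominator != 1:
        raise ValueError(f"{percent}% of {n} is not a whole number of cases")
    return int(count)


def percent_agreement(labels_a, labels_b):
    """Share of positions (in percent) where two label sequences match."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and equally long")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100.0 * matches / len(labels_a)


for label, percent in REPORTED_PERCENTAGES.items():
    print(f"{label}: {percent}% = {cases_from_percent(percent)} of {N_CASES} cases")
```

Under that assumption, each reported percentage corresponds to a whole number of cases (for example, 70.0% is 28 of 40). Percent agreement does not correct for agreement expected by chance, which is consistent with the abstract's statement that the results were analyzed descriptively.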
