Exploring the proficiency of ChatGPT-4: An evaluation of its performance in the Taiwan advanced medical licensing examination



Abstract

BACKGROUND: Taiwan is well known for its high-quality healthcare system, and the country's medical licensing examinations offer a way to evaluate ChatGPT's medical proficiency.

METHODS: We analyzed exam data from February 2022, July 2022, February 2023, and July 2023. Each exam included four papers with 80 single-choice questions, grouped as descriptive or picture-based. We evaluated the questions with ChatGPT-4; for questions it answered incorrectly, we re-prompted it using a "chain of thought" approach. Accuracy rates were calculated as percentages.

RESULTS: ChatGPT-4's accuracy across the exams ranged from 63.75% to 93.75% (February 2022 to July 2023). The highest accuracy (93.75%) was on the February 2022 Medicine Exam (3). The subjects with the highest rates of incorrect answers were ophthalmology (28.95%), breast surgery (27.27%), plastic surgery (26.67%), orthopedics (25.00%), and general surgery (24.59%). With "chain of thought" prompting, the accuracy on the re-prompted questions ranged from 0.00% to 88.89%, and the final overall accuracy ranged from 90% to 98%.

CONCLUSION: ChatGPT-4 passed Taiwan's advanced medical licensing examinations, and "chain of thought" prompting raised its overall accuracy above 90%.
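The scoring described in METHODS can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline; the function names and the example numbers are hypothetical, chosen only to show how a per-paper accuracy and a post-"chain of thought" overall accuracy would be computed from an 80-question paper.

```python
# Hypothetical sketch of the accuracy bookkeeping described in METHODS:
# each exam paper has 80 single-choice questions; accuracy is the share
# of correct first-pass answers, and questions missed on the first pass
# are re-asked with a chain-of-thought prompt.

def accuracy(correct: int, total: int) -> float:
    """Accuracy as a percentage, rounded to two decimals."""
    return round(100.0 * correct / total, 2)

def final_accuracy(first_pass_correct: int, cot_recovered: int, total: int) -> float:
    """Overall accuracy after chain-of-thought re-prompting of missed items."""
    return accuracy(first_pass_correct + cot_recovered, total)

# Illustrative numbers (not from the study): 75/80 correct on the first
# pass, then 3 of the 5 misses recovered via chain-of-thought prompting.
first_pass = accuracy(75, 80)          # 93.75
after_cot = final_accuracy(75, 3, 80)  # 97.5
```

Under this bookkeeping, the study's reported "accuracy of CoT prompting" would correspond to `cot_recovered` divided by the number of first-pass misses, while the 90%-98% final figures correspond to `final_accuracy`.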
