Integrating AI into clinical education: evaluating general practice trainees' proficiency in distinguishing AI-generated hallucinations and impacting factors

将人工智能融入临床教育:评估全科医生培训生区分人工智能生成的幻觉的能力及其影响因素

阅读:1

Abstract

OBJECTIVE: To assess the ability of General Practice (GP) Trainees to detect AI-generated hallucinations in simulated clinical practice, ChatGPT-4o was utilized. The hallucinations were categorized into three types based on the accuracy of the answers and explanations: (1) correct answers with incorrect or flawed explanations, (2) incorrect answers with explanations that contradict factual evidence, and (3) incorrect answers with correct explanations. METHODS: This multi-center, cross-sectional survey study involved 142 GP Trainees, all of whom were undergoing General Practice Specialist Training and volunteered to participate. The study evaluated the accuracy and consistency of ChatGPT-4o, as well as the Trainees' response time, accuracy, sensitivity (d'), and response tendencies (β). Binary regression analysis was used to explore factors affecting the Trainees' ability to identify errors generated by ChatGPT-4o. RESULTS: A total of 137 participants were included, with a mean age of 25.93 years. Half of the participants were unfamiliar with AI, and 35.0% had never used it. ChatGPT-4o's overall accuracy was 80.8%, which slightly decreased to 80.1% after human verification. However, the accuracy for professional practice (Subject 4) was only 57.0%, and after human verification, it dropped further to 44.2%. A total of 87 AI-generated hallucinations were identified, primarily occurring at the application and evaluation levels. The mean accuracy of detecting these hallucinations was 55.0%, and the mean sensitivity (d') was 0.39. Regression analysis revealed that shorter response times (OR = 0.92, P = 0.02), higher self-assessed AI understanding (OR = 0.16, P = 0.04), and more frequent AI use (OR = 10.43, P = 0.01) were associated with stricter error detection criteria. CONCLUSIONS: The study concluded that GP trainees faced challenges in identifying ChatGPT-4o's errors, particularly in clinical scenarios. This highlights the importance of improving AI literacy and critical thinking skills to ensure effective integration of AI into medical education.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。