Artificial Intelligence Chatbots in Pediatric Emergencies: A Reliable Lifeline or a Risk?

Abstract

Introduction
Artificial intelligence (AI) chatbots have rapidly gained popularity as sources of health information, particularly with the recent growth of digital medicine. Recent studies have shown that Chat Generative Pre-Trained Transformer (ChatGPT; OpenAI, San Francisco, CA), a widely used AI chatbot, has at times surpassed emergency department physicians in diagnostic accuracy and has passed basic life support (BLS) examinations, underscoring its potential in emergency settings. Parents are a key demographic for online health information and frequently turn to chatbots for urgent guidance during child-related emergencies such as choking. While research has extensively examined AI chatbots' effectiveness in delivering adult BLS guidelines, their accuracy and reliability in providing pediatric BLS guidance aligned with American Heart Association (AHA) standards remain underexplored. This gap raises concerns about the safety and appropriateness of relying on AI chatbots in pediatric emergencies. We therefore compared the performance of two ChatGPT versions, ChatGPT-4o and ChatGPT-4o mini, against the AHA's established pediatric protocols, aiming to identify the improvements needed before such tools can be integrated into emergency response frameworks and provide parents with reliable assistance in critical situations.

Methodology
A prospective comparative content analysis was conducted of responses from ChatGPT (version 4o and its mini version) against the 2020 AHA Guidelines for Cardiopulmonary Resuscitation and Emergency Cardiovascular Care. The analysis focused on pediatric BLS, using 13 broad questions designed to cover all key components, including fundamental concepts such as the pediatric chain of survival and specific emergencies such as choking. Responses were evaluated for completeness and conformity to the AHA guidelines. Completeness was rated as 'Completely Addressed', 'Partially Addressed', or 'Not Addressed', with partial responses further classified as 'Superficial', 'Inaccurate', or 'Hallucination'. Conformity to the AHA 2020 guidelines was analyzed and classified in the same way. Reliability was assessed using Cronbach's alpha, and Cohen's kappa was used to measure inter-rater agreement between responses generated on two separate devices for the same set of questions.

Results
Content analysis of ChatGPT responses revealed that only 9.61% were completely addressed, and just 5.77% fully conformed to the AHA 2020 pediatric BLS guidelines. A majority of the responses (61.54%) were partially addressed and lacked depth, while 59.61% conformed only partially and superficially to the guidelines. Additionally, 5.77% of the queries were not addressed at all. ChatGPT-4o responses were generally more detailed and comprehensive than those from ChatGPT-4o mini. Inter-rater agreement between the two users ranged from slight to substantial.

Conclusions
While chatbots may assist with basic guidance, they lack the accuracy, depth, and hands-on instruction crucial for life-saving procedures. Misinterpretation or incomplete information from chatbots could lead to critical errors in emergencies. Widespread BLS training therefore remains essential to ensure individuals have the practical skills and precise knowledge needed to respond effectively in real-life situations.
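The inter-rater agreement statistic used in the Methodology can be sketched as follows. This is a minimal illustration of Cohen's kappa for two raters assigning categorical labels to the same items; the rating labels and data below are hypothetical, not the study's actual ratings.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa: chance-corrected agreement between two raters
    who assign one categorical label to each of the same n items."""
    assert len(ratings_a) == len(ratings_b) and ratings_a
    n = len(ratings_a)
    # Observed agreement: fraction of items both raters labelled identically.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected agreement under independence, from each rater's marginal frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    p_e = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical completeness ratings of the 13 questions from two devices.
device_1 = ['Complete', 'Partial', 'Partial', 'Partial', 'Not', 'Partial',
            'Complete', 'Partial', 'Partial', 'Partial', 'Partial', 'Not', 'Partial']
device_2 = ['Complete', 'Partial', 'Superficial', 'Partial', 'Not', 'Partial',
            'Partial', 'Partial', 'Partial', 'Superficial', 'Partial', 'Not', 'Partial']
kappa = cohens_kappa(device_1, device_2)  # ~0.57, "moderate" on the Landis-Koch scale
```

On the commonly used Landis and Koch scale, kappa below 0.20 is "slight" and 0.61-0.80 is "substantial", which is the range of agreement the Results report between the two users.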
