Artificial Intelligence in Peripheral Artery Disease Education: A Battle Between ChatGPT and Google Gemini


Abstract

Background: Peripheral artery disease (PAD) is a prevalent yet often overlooked manifestation of atherosclerosis that contributes significantly to cardiovascular morbidity and mortality. With the increasing reliance on artificial intelligence (AI) for medical information, it is essential to assess the accuracy and readability of AI-generated health content, especially for common cardiovascular diseases.

Objective: This study evaluates the accuracy, completeness, and readability of responses generated by OpenAI's ChatGPT (San Francisco, CA) and Google's Gemini (Mountain View, CA) to common questions about PAD. AI responses were compared against Cleveland Clinic's frequently asked questions (FAQs) on PAD to assess their reliability as a patient education tool.

Methods: ChatGPT 4.0 and Gemini 1.0 were prompted in three formats (no prompt (Form 1), patient-level prompt (Form 2), and physician-level prompt (Form 3)) before answering 19 questions from Cleveland Clinic's FAQs on PAD. Responses were categorized as correct, partially correct, or incorrect based on percent content alignment. Readability was assessed using the Flesch-Kincaid (FK) grade level, and word-count differences were analyzed. Chi-square tests and one-way analysis of variance (ANOVA) were used for statistical analysis, with a significance threshold of p < 0.05.

Results: ChatGPT provided 70% correct and 30% partially correct responses, with no incorrect answers. Gemini provided 52% correct, 45% partially correct, and 3% incorrect responses. ChatGPT's accuracy was significantly higher (p < 0.05). FK analysis showed no significant readability difference between the two chatbots (mean FK grade: ChatGPT, 10.81; Gemini, 10.73), although both exceeded the recommended reading level for patient education materials. ChatGPT's responses were significantly longer than Gemini's (p < 0.0001).

Conclusion: Both ChatGPT and Gemini provided mostly accurate and comprehensive responses to commonly asked questions about PAD, demonstrating their potential as supplementary patient education tools with appropriate provider oversight. However, the reading grade level of these materials exceeded nationally recommended levels, which warrants improvement in AI-driven health communication. Given the growing reliance on AI in healthcare, further research should explore ways to make AI-generated medical content more accessible and evaluate its impact on patient outcomes.
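The Flesch-Kincaid grade level used in the Methods is a closed-form formula: FK = 0.39 × (words/sentences) + 11.8 × (syllables/words) − 15.59. A minimal Python sketch is shown below; the syllable counter is a naive vowel-group heuristic assumed for illustration, not the specific readability tool used in the study:

```python
import re

def fk_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    """
    # Split sentences on terminal punctuation; keep non-empty segments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)

    def count_syllables(word: str) -> int:
        # Naive heuristic (an assumption): one syllable per vowel group,
        # with a minimum of one per word.
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    total_syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (total_syllables / len(words))
            - 15.59)
```

On this scale, a score around 10.8 (as both chatbots produced) corresponds roughly to an 11th-grade reading level, above the 6th-to-8th-grade range commonly recommended for patient education materials.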
