Exploring the Role of ChatGPT-4, BingAI, and Gemini as Virtual Consultants to Educate Families about Retinopathy of Prematurity

Abstract

BACKGROUND: Large language models (LLMs) are increasingly used to provide medical information. Our aim was to evaluate the effectiveness of three LLMs, ChatGPT-4, BingAI, and Gemini, in responding to patient inquiries about retinopathy of prematurity (ROP). METHODS: Three ophthalmologists assessed the LLMs' answers to fifty real-life patient inquiries using a 5-point Likert scale. The responses were also evaluated for reliability with the DISCERN instrument and the EQIP framework, and for readability using the Flesch Reading Ease (FRE), Flesch-Kincaid Grade Level (FKGL), and Coleman-Liau Index. RESULTS: ChatGPT-4 outperformed BingAI and Gemini: it received the top score of 5 points for 90% of responses (45 out of 50) and ratings of "agreed" or "strongly agreed" for 98% (49 out of 50). It also led in accuracy and reliability, with DISCERN and EQIP scores of 63 and 72.2, respectively. BingAI followed with scores of 53 and 61.1, while Gemini had the best readability (FRE score of 39.1) but lower reliability scores. Statistically significant performance differences were observed, particularly in the screening, diagnosis, and treatment categories. CONCLUSION: ChatGPT-4 excelled in providing detailed and reliable responses to ROP-related queries, although its texts were more complex. All models delivered generally accurate information as per DISCERN and EQIP assessments.
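For readers unfamiliar with the readability measures named in the methods, the following is a minimal sketch of how FRE, FKGL, and the Coleman-Liau Index are conventionally computed from text counts. The formulas are the standard published ones; the function names, the `text_counts` helper, and the vowel-group syllable heuristic are illustrative assumptions and are not the tooling used in the study, which the abstract does not specify.

```python
import re


def flesch_reading_ease(words, sentences, syllables):
    # Standard FRE formula: higher scores mean easier text.
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    # Standard FKGL formula: approximate US school grade level.
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def coleman_liau_index(letters, words, sentences):
    # L = letters per 100 words, S = sentences per 100 words.
    letters_per_100 = letters / words * 100
    sentences_per_100 = sentences / words * 100
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


def naive_syllables(word):
    # Assumption: approximate syllables as runs of vowels (crude heuristic;
    # dedicated readability tools use more careful rules).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def text_counts(text):
    # Hypothetical helper: derive the raw counts the formulas need.
    words = re.findall(r"[A-Za-z]+", text)
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    return {
        "words": len(words),
        "sentences": sentences,
        "syllables": sum(naive_syllables(w) for w in words),
        "letters": sum(len(w) for w in words),
    }


if __name__ == "__main__":
    sample = "The baby was born early. Screening is needed."
    c = text_counts(sample)
    print(flesch_reading_ease(c["words"], c["sentences"], c["syllables"]))
    print(flesch_kincaid_grade(c["words"], c["sentences"], c["syllables"]))
    print(coleman_liau_index(c["letters"], c["words"], c["sentences"]))
```

On the conventional Flesch interpretation scale, scores between 30 and 50 are usually described as difficult, college-level text, so even the most readable model reported here (FRE 39.1) produced answers well above a typical patient-education reading level.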
