Evaluating ChatGPT's Utility in Biologic Therapy for Systemic Lupus Erythematosus: Comparative Study of ChatGPT and Google Web Search



Abstract

BACKGROUND: Systemic lupus erythematosus (SLE) is a life-threatening, multisystem autoimmune disease, and biologic therapy is a promising treatment for it. However, public understanding of this therapy remains insufficient, and the quality of related information on the internet varies, which affects patients' acceptance of the treatment. The effectiveness of artificial intelligence tools such as ChatGPT (OpenAI) in disseminating health care knowledge has attracted significant attention, and research on ChatGPT's utility in answering questions about biologic therapy for SLE could help promote this treatment.

OBJECTIVE: This study aimed to evaluate ChatGPT's utility as a tool for users to obtain health information about biologic therapy for SLE.

METHODS: Twenty common questions related to biologic therapy for SLE, their corresponding answers, and the sources of those answers were extracted from both Google Web Search and ChatGPT-4o (OpenAI). Based on Rothwell's classification, the questions were categorized into 3 types: fact, policy, and value. Answer sources were classified into 5 categories: commercial, academic, medical practice, government, and social media. The accuracy and completeness of the answers were assessed using Likert scales, and their readability was evaluated using the Flesch Reading Ease and Flesch-Kincaid Grade Level (FKGL) scores.

RESULTS: By question type, fact questions made up the largest share for ChatGPT-4o (10/20), followed by policy (7/20) and value (3/20); for Google Web Search, fact questions were also most common (12/20), followed by value (5/20) and policy (3/20). By source, ChatGPT-4o's answers drew on 48 sources, most of them academic (29/48), whereas Google Web Search's answers came from 20 sources distributed evenly across the 5 categories.

For accuracy, ChatGPT-4o's mean score of 5.83 (SD 0.49) was higher than that of Google Web Search (mean 4.75, SD 0.94), a mean difference of 1.08 (95% CI 0.61-1.54). For completeness, ChatGPT-4o's mean score of 2.88 (SD 0.32) was higher than that of Google Web Search (mean 1.68, SD 0.69), a mean difference of 1.2 (95% CI 0.96-1.44). For readability, the Flesch Reading Ease scores were 11.7 for ChatGPT-4o and 14.9 for Google Web Search, and the FKGL scores were 16.2 and 20, respectively, indicating that both sets of answers were difficult to read and required college graduate-level reading proficiency. When ChatGPT was asked to respond at a sixth-grade level, the readability of its answers improved significantly.

CONCLUSIONS: ChatGPT's answers were accurate, rigorous, and comprehensive, were supported by professional materials, and demonstrated humanistic care. However, their readability was low, requiring users to have a college education. Given the study's limitations in question scope, comparison dimensions, research perspective, and language, further in-depth comparative research is recommended.
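The readability metrics used in the study are computable from simple text statistics. The sketch below applies the standard published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas; the syllable counter is a naive vowel-group heuristic (the study does not describe its tooling, so this is an illustration, not the authors' method):

```python
import re

def count_syllables(word: str) -> int:
    # Naive heuristic: one syllable per run of consecutive vowels.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def readability(text: str) -> tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)        # mean words per sentence
    spw = syllables / len(words)             # mean syllables per word
    fre = 206.835 - 1.015 * wps - 84.6 * spw   # higher = easier to read
    fkgl = 0.39 * wps + 11.8 * spw - 15.59     # approximate US school grade
    return fre, fkgl
```

An FKGL in the 16-20 range, as reported for both sources, corresponds to text pitched at college graduates; the "sixth-grade level" prompt asks the model to push the FKGL down toward 6.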
