An evaluation of the reliability and readability of large language models in the dissemination of traumatic brain injury information


Abstract

OBJECTIVE: To compare the reliability and readability of responses from Generative Pre-trained Transformer versions 3.5 (GPT-3.5) and 4.0 (GPT-4.0) on traumatic brain injury (TBI) topics against Model Systems Knowledge Translation Center (MSKTC) fact sheets.

METHODS: This study analyzed responses from GPT-3.5 and GPT-4.0 for accuracy, comprehensiveness, and readability against MSKTC fact sheets, incorporating a correlation analysis between reliability and readability scores.

RESULTS: Findings showed an improvement in reliability from GPT-3.5 (mean score = 3.21) to GPT-4.0 (mean score = 3.63), indicating better accuracy and completeness in the latter. Despite these advances, responses generally remained accurate but not fully comprehensive. Readability comparisons found the MSKTC fact sheets to be significantly more reader-friendly than responses from both artificial intelligence (AI) versions, with no strong correlation between reliability and readability.

CONCLUSION: The study highlights progress in AI-generated information on TBI from GPT-3.5 to GPT-4.0 in terms of reliability. However, challenges persist in matching the readability of standard patient education materials, emphasizing the need for future AI developments to focus on enhancing understandability alongside accuracy.
