Can ChatGPT assist authors with abstract writing in medical journals? Evaluating the quality of scientific abstracts generated by ChatGPT and original abstracts

ChatGPT能否帮助作者撰写医学期刊摘要?评估ChatGPT生成的科学摘要与原始摘要的质量

阅读:1

Abstract

INTRODUCTION: ChatGPT, a sophisticated large language model (LLM), has garnered widespread attention for its ability to mimic human-like communication. As recent studies indicate a potential supportive role of ChatGPT in academic writing, we assessed the LLM's capacity to generate accurate and comprehensive scientific abstracts from published Randomised Controlled Trial (RCT) data, focusing on the adherence to the Consolidated Standards of Reporting Trials for Abstracts (CONSORT-A) statement, in comparison to the original authors' abstracts. METHODOLOGY: RCTs, identified in a PubMed/MEDLINE search post-September 2021 across various medical disciplines, were subjected to abstract generation via ChatGPT versions 3.5 and 4, following the guidelines of the respective journals. The overall quality score (OQS) of each abstract was determined by the total number of adequately reported components from the 18-item CONSORT-A checklist. Additional outcome measures included percent adherence to each CONOSORT-A item, readability, hallucination rate, and regression analysis of reporting quality determinants. RESULTS: Original abstracts achieved a mean OQS of 11.89 (95% CI: 11.23-12.54), outperforming GPT 3.5 (7.89; 95% CI: 7.32-8.46) and GPT 4 (5.18; 95% CI: 4.64-5.71). Compared to GPT 3.5 and 4 outputs, original abstracts were more adherent with 10 and 14 CONSORT-A items, respectively. In blind assessments, GPT 3.5-generated abstracts were deemed most readable in 62.22% of cases which was significantly greater than the original (31.11%; P = 0.003) and GPT 4-generated (6.67%; P<0.001) abstracts. Moreover, ChatGPT 3.5 exhibited a hallucination rate of 0.03 items per abstract compared to 1.13 by GPT 4. No determinants for improved reporting quality were identified for GPT-generated abstracts. CONCLUSIONS: While ChatGPT could generate more readable abstracts, their overall quality was inferior to the original abstracts. Yet, its proficiency to concisely relay key information with minimal error holds promise for medical research and warrants further investigations to fully ascertain the LLM's applicability in this domain.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。