Can ChatGPT Recognize Its Own Writing in Scientific Abstracts?


Abstract

BACKGROUND: With the growing use of generative AI in scientific writing, distinguishing between AI-generated and human-authored content has become a pressing challenge. It remains unclear whether ChatGPT (OpenAI, San Francisco, CA) can accurately and consistently recognize its own output.

METHODS: We randomly selected 100 research articles published in 2000, before the advent of generative AI, from 10 high-impact internal medicine journals. For each article, a structured abstract was generated using ChatGPT-4.0 based on the full PDF. The original and AI-generated abstracts (n = 200) were then evaluated twice by ChatGPT-4.0, which was asked to rate the likelihood of authorship on a 0-10 scale (0 = definitely human, 10 = definitely ChatGPT, 5 = undetermined). Scores of 0-4 were classified as human-authored and 6-10 as AI-generated.

RESULTS: Misclassification rates were high in both rounds (49% and 47.5%). No abstract received a score of 5. Score distributions overlapped substantially between groups, with no statistically significant difference (Wilcoxon p-values of 0.93 and 0.21). Cohen's kappa for binary classification was 0.33 (95% CI: 0.19-0.46) and weighted kappa on the 0-10 scale was 0.24 (95% CI: 0.15-0.34), both reflecting poor agreement.

CONCLUSION: ChatGPT-4.0 cannot reliably identify whether a scientific abstract was written by itself or by humans. More robust external tools are needed to ensure transparency in academic authorship.
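The analysis described in the abstract (converting the 0-10 scores to binary labels, computing misclassification rates, Cohen's kappa, weighted kappa, and a Wilcoxon test) can be illustrated with a short sketch. The code below is a hypothetical reconstruction, not the authors' actual analysis: the placeholder data, the quadratic weighting for the weighted kappa, and the use of the paired signed-rank variant of the Wilcoxon test are all assumptions.

```python
# Hypothetical sketch of the agreement analysis described in the abstract.
# Assumes two arrays of ChatGPT's 0-10 authorship scores (round1, round2) for
# the same 200 abstracts, plus a boolean array marking AI-generated abstracts.
import numpy as np
from sklearn.metrics import cohen_kappa_score
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
# Placeholder data: replace with the real per-abstract scores.
round1 = rng.integers(0, 11, size=200)
round2 = rng.integers(0, 11, size=200)
is_ai = np.tile([False, True], 100)  # original vs ChatGPT-generated abstract per article

# Binary classification: 0-4 -> human (0), 6-10 -> AI (1); a score of 5 would be
# undetermined (no abstract received a 5 in the study).
to_binary = lambda s: (s >= 6).astype(int)

# Misclassification rate per round: predicted label vs true origin.
for name, scores in (("round 1", round1), ("round 2", round2)):
    mis = np.mean(to_binary(scores) != is_ai.astype(int))
    print(f"{name} misclassification rate: {mis:.1%}")

# Agreement between the two scoring rounds.
kappa_binary = cohen_kappa_score(to_binary(round1), to_binary(round2))
kappa_weighted = cohen_kappa_score(round1, round2, weights="quadratic")  # weighting scheme assumed
print(f"Cohen's kappa (binary): {kappa_binary:.2f}")
print(f"Weighted kappa (0-10 scale): {kappa_weighted:.2f}")

# Paired Wilcoxon signed-rank test comparing the scores given to each article's
# original abstract vs its AI-generated counterpart (paired design assumed).
stat, p = wilcoxon(round1[~is_ai], round1[is_ai])
print(f"Wilcoxon p-value (round 1): {p:.2f}")
```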
