A linguistic comparison between human- and AI-generated content

Abstract

This study explores the linguistic differences between AI-generated and human-written texts, particularly in Portuguese. We created two datasets: one of human-written texts, both factual and false, and another of texts generated by advanced large language models (LLMs), namely GPT-4o, Mistral Large, and Llama 3.3 70B, using various prompts. Using tools such as Linguistic Inquiry and Word Count (LIWC) and the Sparse Additive Generative Model (SAGE), we identified distinctive traits: AI-generated text tends to be more formal, structured, positive, and motivational, while human texts vary more in length, exhibit negative emotions, and often use personal references. Additionally, a misinformation detection model performed well on human texts (93% accuracy) but struggled with LLM outputs (75% accuracy). This gap highlights the distinct linguistic patterns of AI-generated misinformation and underscores the need for better detection methods to tackle misleading content in Portuguese.
