Abstract
This study examines the linguistic differences between AI-generated and human-written texts, particularly in Portuguese. We built two datasets: one containing factual and false human-written texts, and another containing texts generated by advanced large language models (LLMs: GPT-4o, Mistral Large, and Llama 3.3 70B) under various prompts. Using Linguistic Inquiry and Word Count (LIWC) and the Sparse Additive Generative Model (SAGE), we identified distinctive traits: AI-generated text tends to be more formal, structured, positive, and motivational, whereas human-written texts vary more in length, express more negative emotion, and make more frequent use of personal references. Additionally, a misinformation detection model performed well on human-written texts (93% accuracy) but struggled with LLM outputs (75% accuracy). These findings highlight the distinctive linguistic patterns of AI-generated misinformation and underscore the need for better detection methods to counter misleading content in Portuguese.