Abstract
Online social networks (OSNs) are currently the most widely used interactive media for interpersonal communication, emotional expression, and information sharing. Alongside helpful and engaging content, inappropriate or abusive content, such as toxicity, hate speech, and insults, is also shared on these networks. Any kind of online abuse, including but not limited to cyberbullying, discrimination, abusive language, profanity, flaming, hate speech, and harassment, is considered toxic content. Most toxicity detection work has focused on English text, with comparatively little effort devoted to Arabic. In this work, we constructed a standard Arabic dataset that can be used for toxicity and abuse detection on OSNs. The dataset was annotated by five experts who are native, fluent Arabic speakers and linguists. To evaluate the dataset, we conducted a series of experiments comparing sixteen machine learning algorithms, the FastText model, and seven transfer learning architectures. We also used four text representation techniques: bag of words (BOW), term frequency-inverse document frequency (TF-IDF), FastText, and bidirectional encoder representations from transformers (BERT). Our experimental results showed that the fine-tuned MARBERTv2 model with BERT embeddings outperformed the other models, achieving an F1-score of 92.43% and an accuracy of 92.21%. This study highlights the importance of addressing toxicity on social media platforms across diverse languages and cultures, and the results represent a notable advance in the classification of toxic tweets in Arabic.