Using generative adversarial network to improve the accuracy of detecting AI-generated tweets

Abstract

This paper presents a novel approach that uses state-of-the-art generative Artificial Intelligence (AI) models to improve the accuracy of machine learning methods in detecting AI-generated texts. Generative capabilities are combined with ensemble-based learning to characterize the attributes of generated text. The proposed methodology has four steps. The first step is preprocessing, which removes irrelevant data through noise removal, text tokenization, stop-word removal, word normalization, and handling of uncommon words. The second step is feature engineering and text representation, in which every preprocessed text is represented by a square matrix that encodes word correlations, co-occurrence, and word weights. The third step is Generative Adversarial Network (GAN)-based feature extraction: a GAN model is trained to extract features that are effective for classifying texts by creator type, after which its discriminator is used as a strong feature extraction model. The fourth step is weighted Random Forest (RF)-based detection, in which the features extracted by the GAN discriminator serve as input to the RF-based detection model. The approach captures the differences between human-written and AI-generated texts and achieves an average accuracy of 99.60%, a 1.5% improvement over comparative methods.
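As an illustration of the first two steps only, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the stop-word list, the tokenization rules, or how the square matrix is built, so everything here is an assumption: the names `STOP_WORDS`, `preprocess`, and `cooccurrence_matrix` are hypothetical, the diagonal is assumed to hold term frequency as a stand-in for word weights, and off-diagonal entries are assumed to count co-occurrences within a sliding window. The GAN and weighted RF steps are not shown.

```python
import re

# Hypothetical, tiny stop-word list; the abstract does not name one.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "on", "for"}


def preprocess(text):
    """Assumed preprocessing: noise removal, tokenization, stop-word removal."""
    text = text.lower()  # simple word normalization
    # Treat URLs, @mentions, and #hashtags as noise (an assumption for tweets).
    text = re.sub(r"https?://\S+|[@#]\w+", " ", text)
    tokens = re.findall(r"[a-z']+", text)
    return [t for t in tokens if t not in STOP_WORDS]


def cooccurrence_matrix(tokens, vocab, window=2):
    """Build a len(vocab) x len(vocab) matrix for one text.

    Assumed layout: diagonal = term frequency, off-diagonal = number of
    times two vocabulary words appear within `window` tokens of each other.
    """
    index = {word: i for i, word in enumerate(vocab)}
    n = len(vocab)
    matrix = [[0.0] * n for _ in range(n)]
    for pos, tok in enumerate(tokens):
        i = index.get(tok)
        if i is None:
            continue  # uncommon word outside the vocabulary is skipped
        matrix[i][i] += 1.0
        for other in tokens[pos + 1 : pos + 1 + window]:
            j = index.get(other)
            if j is not None and j != i:
                matrix[i][j] += 1.0
                matrix[j][i] += 1.0  # keep the matrix symmetric
    return matrix


tokens = preprocess("The cat sat on the mat https://x.co @bob")
matrix = cooccurrence_matrix(tokens, vocab=["cat", "sat", "mat"], window=1)
```

Using a fixed vocabulary keeps every text's matrix the same size, which a downstream model such as a GAN discriminator would need; how the paper fixes the matrix dimensions is not stated in the abstract.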
