Automated detection of corruption reports in text via deep reinforcement learning



Abstract

Detecting corruption reports in text faces two obstacles: the large number of candidate features and the imbalanced target classes in the dataset. To address them, we propose a novel approach that uses deep reinforcement learning to identify corruption reports in text. The approach has four main phases and integrates deep reinforcement learning, feature selection, and feature description techniques. The first phase is data preprocessing, which prepares the texts for the subsequent phases. The second phase is feature extraction, which uses three feature types: statistical features, which describe a text's attributes in terms of frequency and statistics; Term Frequency-Inverse Document Frequency (TF-IDF) features, which weight terms according to their occurrence in each text and across the dataset; and Word2Vec features, which not only describe the importance of features but also model the co-occurrence and relationships of terms. In the third phase, the three feature sets are combined, and Singular Value Decomposition (SVD) is used to select an optimal subset and reduce the dataset's dimensionality. In the fourth and final phase, a Convolutional Neural Network (CNN) performs the detection, with its configuration tuned by a Q-learning model. In experiments on identifying corruption reports in text, the proposed approach achieved an average accuracy of 90.04% and an F-measure of 0.9. These results indicate that the method outperforms existing approaches and identifies positive samples in corruption-related texts more accurately.
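
The TF-IDF weighting used in the feature extraction phase can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `tf_idf`, the whitespace tokenization, the `log(N / df)` variant of the inverse document frequency, and the sample sentences are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant (the abstract does not state which one is used):
      tf  = term count / document length
      idf = log(N / df), df = number of documents containing the term
    """
    # Naive tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: count each term once per document.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical sample texts, invented for this example.
docs = [
    "official accepted a bribe for the contract",
    "the contract was awarded after public tender",
    "the minister denied the bribe allegation",
]
weights = tf_idf(docs)
```

In this sketch a term that occurs in every document (here "the") receives a weight of zero, while a term confined to one document (here "official") is weighted more heavily than one shared by two documents (here "bribe"), which is the behavior the abstract describes as weighting terms by their occurrence in each text and across the dataset.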
