A hybrid re-fusion model for text classification


Abstract

Text classification is an important task in the field of natural language processing, aiming to automatically assign text data to predefined categories. The BertGCN model combines the advantages of both BERT and GCN, enabling it to effectively handle text data for classification. However, it still has limitations on complex text classification tasks. BERT processes sequence information in segments and cannot directly capture long-distance dependencies across segments, which is a limitation when dealing with long sequences. GCN tends to suffer from the over-smoothing problem in deep networks, leading to information loss. To overcome these limitations, we propose the XLG-Net model, which integrates XLNet and GCNII to enhance text classification performance. XLNet employs permutation language modeling together with architectural improvements from Transformer-XL, not only improving the ability to capture long-distance dependencies but also enhancing the model's understanding of complex language structures. Additionally, we introduce GCNII to overcome the over-smoothing problem in GCN. GCNII effectively retains the initial features of nodes by incorporating initial residual connections and identity mapping mechanisms, ensuring effective information transmission even in deep networks. Furthermore, to achieve excellent performance on both long and short texts, we apply the design philosophy of DoubleMix to the XLNet model, mixing hidden states to improve the model's accuracy and robustness. Experimental results demonstrate that the XLG-Net model achieves significant performance improvements on four benchmark text classification datasets, validating the model's effectiveness on complex text classification tasks.
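The GCNII mechanism mentioned in the abstract, an initial residual connection plus an identity mapping at each layer, can be sketched as follows. This is a minimal NumPy illustration of one propagation step, not the paper's implementation; the function name, the single shared weight matrix, and the ReLU activation are assumptions.

```python
import numpy as np

def gcnii_layer(H, H0, A_hat, W, alpha, beta):
    """One GCNII propagation step (illustrative sketch, not the paper's code).

    H     : current node features, shape (n, d)
    H0    : initial node features used by the initial residual, shape (n, d)
    A_hat : normalized adjacency matrix with self-loops, shape (n, n)
    W     : layer weight matrix, shape (d, d)
    alpha : strength of the initial residual connection
    beta  : how far this layer's transform deviates from the identity
    """
    # Initial residual: mix the smoothed features with the layer-0 features,
    # so deep stacks cannot fully wash out the original node information.
    P = (1 - alpha) * (A_hat @ H) + alpha * H0
    # Identity mapping: keep most of P unchanged and add only a small
    # learned transform, which counteracts over-smoothing in deep networks.
    Z = (1 - beta) * P + beta * (P @ W)
    return np.maximum(Z, 0.0)  # ReLU activation
```

Setting alpha close to 0 and beta close to 0 recovers near-plain graph propagation, while larger alpha keeps the output anchored to the initial features even after many layers.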
