A novel transformer-based semantic feature extraction method for multi-label text classification

Abstract

Multi-label text classification is a critical task in natural language processing in which each document may belong to multiple categories. The setting is challenging because it involves complex label dependencies and requires fine-grained semantic features to be extracted for each label. In this paper, we propose TMSFE, a new Transformer-based semantic feature extraction method for multi-label text classification. TMSFE integrates label-specific query embeddings with a multi-head attention mechanism to extract discriminative features for each candidate label, and it leverages a latent semantic space to make feature extraction more efficient. Unlike conventional single-label classifiers or flat multi-label methods, the proposed model uses a DeBERTaV3-based Transformer encoder to jointly model document and label semantics. In addition, a SimCSE-based latent semantic space module projects text and label representations into a shared latent semantic space. Finally, a sigmoid-based multi-label classification head is applied to the extracted features. Results show that TMSFE consistently outperforms baseline models, achieving lower Hamming loss and higher feature extraction accuracy.
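The pipeline summarized in the abstract (label-specific queries attending over token representations, followed by a per-label sigmoid decision and evaluation by Hamming loss) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a single attention head instead of multi-head attention, plain Python lists in place of DeBERTaV3 encoder outputs, and omits the SimCSE-based latent space entirely. All function names, vectors, and weights below are hypothetical and chosen only for illustration.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def label_attention(token_vecs, label_queries):
    """Single-head scaled dot-product attention (a simplification of the
    multi-head mechanism named in the abstract). Each label query pools
    the token vectors into one label-specific feature vector."""
    dim = len(token_vecs[0])
    scale = math.sqrt(dim)
    features = []
    for q in label_queries:
        weights = softmax([dot(q, t) / scale for t in token_vecs])
        pooled = [sum(w * t[d] for w, t in zip(weights, token_vecs))
                  for d in range(dim)]
        features.append(pooled)
    return features

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(features, label_weights, label_biases, threshold=0.5):
    """Sigmoid head: one independent binary decision per label."""
    probs = [sigmoid(dot(f, w) + b)
             for f, w, b in zip(features, label_weights, label_biases)]
    preds = [1 if p >= threshold else 0 for p in probs]
    return preds, probs

def hamming_loss(y_true, y_pred):
    """Fraction of label slots predicted incorrectly, over all documents."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p
                for true_row, pred_row in zip(y_true, y_pred)
                for t, p in zip(true_row, pred_row))
    return wrong / total

# Toy stand-ins for encoder outputs and learned parameters (assumed values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # three token vectors
queries = [[2.0, 0.0], [0.0, 2.0]]              # one query per label
features = label_attention(tokens, queries)
preds, probs = predict(features, [[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0])

# Two of six label slots differ, so the Hamming loss is 2/6.
loss = hamming_loss([[1, 0, 1], [0, 1, 0]], [[1, 1, 1], [0, 1, 1]])
```

Because each label has its own sigmoid output, the labels are decided independently and any number of them can be active for one document, which is what distinguishes this head from a softmax over mutually exclusive classes.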
