Semi-Supervised Relation Extraction Informed by Area Under the Margin Ranking and Large Language Models



Abstract

Relation extraction is an important task for understanding relationships between entities, building knowledge graphs, and facilitating knowledge discovery. Pre-trained models can be fine-tuned for relation extraction when a substantial amount of labeled data is available, but acquiring such data is generally difficult. Semi-supervised techniques for low-resource relation extraction, such as self-training, offer a promising solution by leveraging both limited labeled data and large amounts of unlabeled data. Traditional self-training methods use a teacher-student framework, in which a student model is iteratively trained on pseudo-labels generated by the teacher. These pseudo-labels can be noisy, which hurts performance. To address this limitation, we introduce RE-AUM-LLM, a new model that generates high-quality pseudo-labels by combining self-training with Area Under the Margin (AUM) ranking and Large Language Models (LLMs) such as Llama 3.1. Experimental results on two benchmark datasets show that the proposed approach achieves state-of-the-art results for low-resource relation extraction compared with several strong baselines. We will make the code publicly available to enable reproducibility and further research in this area.
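
The abstract does not spell out how AUM scores are computed or thresholded in RE-AUM-LLM, so the following is only a minimal sketch of the general AUM idea: for each example, take the logit of its assigned (pseudo-)label minus the largest other logit, and average that margin over training epochs. Examples with a low or negative average are likely mislabeled. The function names, the `threshold` parameter, and the toy logits are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def margin(logits: Sequence[float], assigned: int) -> float:
    """Assigned-label logit minus the largest logit among the other classes."""
    largest_other = max(v for i, v in enumerate(logits) if i != assigned)
    return logits[assigned] - largest_other


def area_under_margin(logits_per_epoch: Sequence[Sequence[float]], assigned: int) -> float:
    """Average the margin of one example over all recorded epochs."""
    margins = [margin(epoch_logits, assigned) for epoch_logits in logits_per_epoch]
    return sum(margins) / len(margins)


def select_pseudo_labels(
    histories: Sequence[Sequence[Sequence[float]]],
    pseudo_labels: Sequence[int],
    threshold: float,
) -> List[int]:
    """Return indices of examples whose AUM reaches the (hypothetical) threshold."""
    return [
        idx
        for idx, (history, label) in enumerate(zip(histories, pseudo_labels))
        if area_under_margin(history, label) >= threshold
    ]


# Toy data (invented): two unlabeled examples, two epochs, three relation classes.
histories = [
    [[2.0, 0.5, 0.1], [3.0, 0.2, 0.0]],  # consistently confident in class 0
    [[0.4, 0.5, 0.3], [0.2, 0.6, 0.5]],  # class 0 is never the top logit
]
pseudo_labels = [0, 0]
kept = select_pseudo_labels(histories, pseudo_labels, threshold=0.0)
```

In this toy run only the first example is kept, since its pseudo-label stays well ahead of the other classes across epochs while the second example's margin is negative. How the retained pseudo-labels are then combined with LLM output is not described in the abstract and is left out of the sketch.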
