Contrastive learning enhanced retrieval-augmented few-shot framework for multi-label patent classification

Abstract

The rapid expansion of patent databases poses increasing challenges for multi-label patent classification, particularly for inventions spanning multiple technological domains. Conventional approaches are hindered by high annotation costs and limited scalability, while often neglecting the semantic structure of patent documents. Here, we present a retrieval-enhanced few-shot learning framework that combines patent-specific contrastive pre-training with semantic retrieval to enable scalable multi-label classification. Drone technologies are selected as the evaluation domain due to their multidisciplinary characteristics encompassing mechanical, electronic, and software aspects. The proposed method learns domain-adapted embeddings that capture multi-label co-occurrence patterns and leverages retrieval-augmented few-shot learning with structured reasoning to reduce reliance on extensive annotations. Experiments on a curated dataset of 15,000 annotated drone patents across ten categories demonstrate that the framework achieves Macro-F1 and Micro-F1 scores of 0.847 and 0.892, corresponding to improvements of 30% and 23% over few-shot baselines. Furthermore, contrastive pre-training yields notable benefits for underrepresented categories, with performance improvements reaching 16% over transformer-based approaches. These results indicate that the proposed approach offers an effective and resource-efficient solution for multi-label patent classification, with potential to improve the scalability and accessibility of intellectual property analysis.
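The abstract reports both Macro-F1 and Micro-F1, and the two can diverge sharply when some categories are rare. The following is a minimal sketch of how these metrics are conventionally computed for multi-label predictions. The function names, the three category names, and the four toy documents are illustrative assumptions; they are not taken from the paper's dataset or code.

```python
from typing import List, Set, Tuple


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(
    y_true: List[Set[str]], y_pred: List[Set[str]], labels: List[str]
) -> Tuple[float, float]:
    """Return (macro_f1, micro_f1) for multi-label predictions."""
    per_label_f1 = []
    total_tp = total_fp = total_fn = 0
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if label in t and label in p)
        fp = sum(1 for t, p in pairs if label not in t and label in p)
        fn = sum(1 for t, p in pairs if label in t and label not in p)
        per_label_f1.append(f1_from_counts(tp, fp, fn))
        total_tp, total_fp, total_fn = total_tp + tp, total_fp + fp, total_fn + fn
    # Macro: unweighted mean of per-label F1, so rare labels count as much as common ones.
    macro = sum(per_label_f1) / len(per_label_f1)
    # Micro: F1 over pooled counts, so frequent labels dominate.
    micro = f1_from_counts(total_tp, total_fp, total_fn)
    return macro, micro


# Hypothetical toy data: "software" is rare and is missed by the classifier.
labels = ["mechanical", "electronic", "software"]
y_true = [{"mechanical", "electronic"}, {"electronic"}, {"software", "electronic"}, {"mechanical"}]
y_pred = [{"mechanical", "electronic"}, {"electronic"}, {"electronic"}, {"mechanical", "electronic"}]

macro, micro = macro_micro_f1(y_true, y_pred, labels)
print(round(macro, 3), round(micro, 3))  # 0.619 0.833
```

In this toy case the single missed rare label pulls Macro-F1 well below Micro-F1, which is why gains on underrepresented categories tend to show up more strongly in the macro score.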
