A museum artifact classification model based on cross-modal attention fusion and generative data augmentation

Abstract

Cultural heritage preservation has garnered global attention. Museum artifact classification, a core task in this field, is hindered by weak collaboration between modalities and a scarcity of high-quality annotated data. Traditional methods and single-modality deep learning models struggle to achieve both efficiency and accuracy. To address this, this paper proposes a museum artifact classification model (the VBG Model) based on cross-modal attention fusion and generative data augmentation. The model builds an integrated multimodal framework by adapting the Vision Transformer (ViT), BERT, and a Generative Adversarial Network (GAN) to the task. ViT extracts global visual features from artifact images, while BERT mines the historical and cultural semantics of the accompanying text. A bidirectional interactive attention fusion layer aligns the two sets of features. The GAN generates diverse samples, forming a closed "generation-feedback-optimization" loop that alleviates data scarcity. In experiments on the MET and MS COCO datasets, the VBG Model achieves 92% classification accuracy, 0.85 mAP, and an 88% F1 score on the former, and 90% accuracy, 0.83 mAP, and an 86% F1 score on the latter. These results exceed those of competing models such as ResNet and DenseNet. Ablation experiments confirm that the cross-modal fusion and generative data augmentation modules are both essential; removing either one results in a 5%-9% drop in accuracy. The current model still has room for improvement in training time and generated image quality. Future work will focus on lightweight design and multi-scale fusion to improve performance and the ability to distinguish similar artifacts, providing technical support for digital artifact management and cultural heritage preservation.
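
The abstract does not specify how the bidirectional interactive attention fusion layer is built, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes that the ViT image tokens and BERT text tokens have already been projected to a shared dimension, omits learned query/key/value projections, and uses hypothetical names (`cross_attend`, `mean_pool`, `bidirectional_fusion`) and toy vectors throughout.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attend(queries, keys_values):
    """Scaled dot-product attention: each query becomes a
    softmax-weighted average of the other modality's tokens.
    Assumption: no learned projections, keys double as values."""
    dim = len(queries[0])
    attended = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in keys_values]
        weights = softmax(scores)
        attended.append([sum(w * v[j] for w, v in zip(weights, keys_values))
                         for j in range(dim)])
    return attended

def mean_pool(tokens):
    """Average a list of token vectors into a single vector."""
    n = len(tokens)
    return [sum(t[j] for t in tokens) / n for j in range(len(tokens[0]))]

def bidirectional_fusion(image_tokens, text_tokens):
    """Attend in both directions, pool each side, and concatenate."""
    img_ctx = cross_attend(image_tokens, text_tokens)  # image queries text
    txt_ctx = cross_attend(text_tokens, image_tokens)  # text queries image
    return mean_pool(img_ctx) + mean_pool(txt_ctx)     # list concatenation

# Toy stand-ins for projected ViT patch tokens and BERT word tokens.
image_tokens = [[1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0]]
text_tokens = [[0.5, 0.5, 0.0, 0.0],
               [0.0, 0.0, 0.5, 0.5]]

fused = bidirectional_fusion(image_tokens, text_tokens)
print(len(fused))  # twice the shared token dimension
```

In a full model, a fused vector of this kind would typically be passed to a classification head; the pooling and concatenation used here are illustrative choices rather than details taken from the paper.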
