Toward non-invasive early pest surveillance: cross-modal adaptation using PLMS acoustic-visual representation and pre-trained transfer learning

Abstract

Pest infestations pose significant threats to agricultural productivity and ecological balance, making early prevention crucial for effective management. Toward non-invasive early-stage pest surveillance, this study introduces a novel cross-modal adaptation paradigm built on the InsectSound1000 database, a comprehensive bioacoustic repository. First, adaptive audio preprocessing applies a low-pass filter to the raw signals to remove high-frequency interference and prevent aliasing, then downsamples them to reduce computational complexity. Second, patch-level log-scale mel spectrum (PLMS) spectrograms are proposed to convert acoustic signals into visual representations, refining time-frequency patterns through patch-level hierarchical decomposition to capture low-frequency and localized spectral features. The logarithmic transformation further enhances subtle low-frequency insect sound characteristics, improving model sensitivity and generalization. Third, the PLMS spectrograms undergo data augmentation and are then passed to a pre-trained You Only Look Once version 11 (YOLOv11) model for deep transfer learning, enabling efficient extraction of high-level semantic features. Finally, we compare the proposed algorithm with traditional acoustic features and networks, and investigate how optimized downsampling can balance preservation of the signal's frequency content against computational cost. Experimental results show that the proposed method achieves an Accuracy@1 of 96.49%, a Macro-F1 score of 96.49%, and a Macro-AUC of 99.93% at a sampling rate of 2500 Hz. These findings indicate that cross-modal adaptation with PLMS spectrograms and YOLOv11-based transfer learning can significantly enhance pest sound detection, providing a robust framework for non-invasive, early-stage agricultural pest surveillance.
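
The preprocessing and spectrogram steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the filter design, the mel parameters, or how the patch-level decomposition is performed. The 10 kHz input rate, the windowed-sinc filter, the FFT, hop and mel-band settings, the decibel scaling, and the non-overlapping 16 x 16 tiling are all assumptions chosen for illustration, as are the function names. Only the 2500 Hz target rate comes from the text.

```python
import numpy as np

def lowpass_fir(cutoff_hz, sr, num_taps=101):
    """Windowed-sinc low-pass FIR filter (Hamming window) with unity DC gain."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    h = np.sinc(2 * cutoff_hz / sr * n) * np.hamming(num_taps)
    return h / h.sum()

def downsample(x, sr, target_sr):
    """Low-pass below the new Nyquist frequency, then keep every k-th sample."""
    k = sr // target_sr                      # assumes an integer rate ratio
    h = lowpass_fir(0.45 * target_sr, sr)    # cutoff just under target_sr / 2
    return np.convolve(x, h, mode="same")[::k]

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2), n_mels + 2))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    fb = np.zeros((n_mels, freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def log_mel(x, sr, n_fft=256, hop=64, n_mels=32):
    """Log-scale (dB) mel spectrogram, shape (n_mels, n_frames)."""
    n_frames = 1 + (len(x) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = mel_filterbank(n_mels, n_fft, sr) @ power.T
    return 10.0 * np.log10(mel + 1e-10)      # small offset avoids log(0)

def to_patches(spec, patch=16):
    """Tile the spectrogram into non-overlapping patch x patch blocks
    (an assumed, simplified stand-in for the patch-level decomposition)."""
    h = spec.shape[0] // patch * patch
    w = spec.shape[1] // patch * patch
    s = spec[:h, :w].reshape(h // patch, patch, w // patch, patch)
    return s.swapaxes(1, 2).reshape(-1, patch, patch)

# Synthetic 1 s signal at an assumed 10 kHz rate: a 200 Hz tone standing in
# for a low-frequency insect sound, plus 4 kHz interference.
sr, target_sr = 10_000, 2_500
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 4000 * t)

y = downsample(x, sr, target_sr)
spec = log_mel(y, target_sr)
patches = to_patches(spec)
print(y.shape, spec.shape, patches.shape)
```

The filter is applied before decimation because content above the new Nyquist frequency (1250 Hz at a 2500 Hz rate) would otherwise fold back into the band of interest. In a pipeline like the one described, the resulting spectrogram would be rendered as an image, augmented, and passed to the pre-trained classifier. That stage is omitted here.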
