Abstract
This research presents DuSAFNet, a lightweight deep neural network for fine-grained bird audio classification. DuSAFNet combines dual-path feature fusion, spectral-temporal attention, and a multi-band ArcMarginProduct classifier. Unlike single-feature approaches, it captures both local spectral textures and long-range temporal dependencies in Mel-spectrogram inputs and explicitly enhances inter-class separability across low-, mid-, and high-frequency bands. On a curated dataset of 17,653 three-second recordings spanning 18 species, DuSAFNet achieves 96.88% accuracy and a 96.83% F1 score with only 6.77M parameters and 2.275 GFLOPs. Cross-dataset evaluation on Birdsdata yields 93.74% accuracy, demonstrating robust generalization to new recording conditions. Its lightweight design and high performance make DuSAFNet well suited for edge-device deployment and real-time alerts for rare or threatened species. This work lays the foundation for scalable, automated acoustic monitoring to inform biodiversity assessments and conservation planning.
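For readers unfamiliar with the margin-based head named in the abstract, the following is a minimal PyTorch sketch of a standard ArcMarginProduct (ArcFace-style) layer. It is an illustration of the general technique only, not the paper's multi-band variant; all dimensions and hyperparameter values (`s`, `m`) here are assumptions for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ArcMarginProduct(nn.Module):
    """Additive angular-margin classification head (ArcFace-style sketch)."""

    def __init__(self, in_features: int, out_features: int,
                 s: float = 30.0, m: float = 0.50):
        super().__init__()
        # One weight vector per class; compared to embeddings by cosine similarity.
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.xavier_uniform_(self.weight)
        self.s = s  # logit scale
        self.m = m  # angular margin added to the target class

    def forward(self, embeddings: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Cosine similarity between L2-normalized embeddings and class weights.
        cosine = F.linear(F.normalize(embeddings), F.normalize(self.weight))
        theta = torch.acos(cosine.clamp(-1 + 1e-7, 1 - 1e-7))
        # Penalize only the target-class angle, then rescale for cross-entropy.
        target = F.one_hot(labels, num_classes=cosine.size(1)).bool()
        logits = torch.where(target, torch.cos(theta + self.m), cosine)
        return logits * self.s
```

Adding the margin `m` to the target-class angle forces embeddings of the same class to cluster more tightly on the hypersphere, which is what drives the improved inter-class separability the abstract refers to.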