Abstract
BACKGROUND: Bioacoustics classification plays a crucial role in ecological surveillance and neonatal health monitoring. Infant cry analysis can aid early health diagnostics, while ecological acoustics informs conservation. However, environmental noise, signal variability, and limited annotated datasets often hinder model reliability and deployment. Robust feature extraction and denoising techniques have therefore become critical for accurate interpretation of acoustic events across diverse bioacoustic domains under real-world conditions.
OBJECTIVE: This review systematically evaluates advances in noise-resilient feature extraction and denoising techniques for bioacoustics classification. Specifically, it examines methodological trends, model types, cross-domain transferability between clinical and ecological applications, and evidence for real-world deployment.
METHODS: A systematic review was conducted by searching 8 electronic databases, including IEEE Xplore, ScienceDirect, Web of Science, ACM Digital Library, and Scopus, through December 2024. Eligible studies developed audio-based classification models, reported empirical or computational evaluations of bioacoustics classification using machine learning or deep learning methods, and considered noise explicitly or implicitly. Two reviewers independently screened studies, extracted data, and assessed quality. Risk of bias was assessed using a customized tool, and reporting quality was evaluated using the TRIPOD (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis) checklist.
RESULTS: Of the 5462 records identified, 132 studies met the eligibility criteria. The majority (112/132, 84.8%) focused on model innovation, with deep learning and hybrid approaches predominating. Feature extraction played a critical role, with nearly all studies (127/132, 96.2%) clearly reporting their feature extraction approach. Mel-frequency cepstral coefficients, spectrograms, and filter bank-based representations were the most common feature representations. Nearly half (62/132, 47%) of the studies incorporated noise-resilient methods, such as adaptive deep models, wavelet transforms, and spectral filtering. However, only a small minority (19/132, 14.4%) demonstrated real-world deployment across neonatal care and ecological field settings.
CONCLUSIONS: The integration of noise-resilient techniques has significantly improved classification performance, but real-world deployment and appropriate use of denoising strategies across datasets remain limited. Cross-domain synthesis reveals shared challenges, including dataset heterogeneity, inconsistent reporting, and reliance on synthetic noise. Future work should prioritize harmonized benchmarks, cross-domain generalization, and real-world deployment, and should explore opportunities for transferability between domains.