Abstract
Early and reliable detection of cardiac murmurs from phonocardiogram (PCG) recordings is essential for improving cardiovascular screening and supporting diagnosis in primary care. However, automated murmur classification remains challenging due to signal variability, class imbalance, and temporal dependence within heart-sound sequences. This study presents a leakage-safe heart-sound classification framework that combines peak-based segmentation, Mel-Frequency Cepstral Coefficient (MFCC) feature extraction, Synthetic Minority Over-sampling Technique (SMOTE)-based class balancing, and Recurrent Neural Network (RNN)-driven temporal modeling. Segmentation was performed around cardiac onset peaks, and evaluation was conducted using recording-level splits for the PhysioNet 2016 dataset and patient-level splits for the PhysioNet 2022 dataset, so that correlated segments from the same recording or patient could not appear in both the training and test sets. The proposed model achieved 98.6% accuracy (precision = 98.26%, recall = 98.95%, F1-score = 98.61%) on PhysioNet 2022 and 98.5% accuracy (precision = 98.49%, recall = 98.52%, F1-score = 98.50%) on PhysioNet 2016, demonstrating consistently high performance across datasets with different class distributions. These results indicate that combining temporal modeling with balanced learning improves robustness in murmur detection. The findings highlight the potential of PCG-based deep learning systems to support scalable, non-invasive cardiac screening, particularly in settings with limited access to specialist assessment.
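
The recording-level and patient-level splits mentioned above can be illustrated with a minimal sketch. It is not the study's implementation: the function name `group_split`, the 20% test fraction, the seed, and the synthetic patient identifiers are all assumptions chosen for illustration. The idea is to partition the group identifiers (recordings or patients) first and only then assign segments, so that no group contributes segments to both partitions.

```python
import random


def group_split(group_ids, test_fraction=0.2, seed=0):
    """Split segment indices so that no group appears in both partitions.

    group_ids: one group label per segment (e.g. a patient or recording ID).
    Returns (train_idx, test_idx) as lists of segment indices.
    """
    # Shuffle the unique groups, not the segments themselves.
    unique_groups = sorted(set(group_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_groups)

    # Hold out a fraction of the groups (at least one) for testing.
    n_test = max(1, round(len(unique_groups) * test_fraction))
    test_groups = set(unique_groups[:n_test])

    # Every segment follows its group into exactly one partition.
    train_idx = [i for i, g in enumerate(group_ids) if g not in test_groups]
    test_idx = [i for i, g in enumerate(group_ids) if g in test_groups]
    return train_idx, test_idx


# Hypothetical data: 10 patients with 5 segments each (IDs are made up).
segment_patients = [f"patient_{p:02d}" for p in range(10) for _ in range(5)]

train_idx, test_idx = group_split(segment_patients, test_fraction=0.2, seed=42)
train_patients = {segment_patients[i] for i in train_idx}
test_patients = {segment_patients[i] for i in test_idx}
```

Under this scheme, any class balancing step such as SMOTE would normally be fitted on the training indices only, since synthetic samples interpolated from held-out segments would reintroduce the leakage the split is meant to prevent. Whether the study applies balancing in exactly this order is an assumption here.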