Abstract
PURPOSE: Hearing-aid users often face challenges in noisy environments because the timing, level, and spectral cues they rely on are compromised by current-generation hearing aids. This study explored a physiologically based speech-segregation algorithm that selectively removes or attenuates unwanted sound sources.

METHOD: In our previously developed localization algorithm, a unique normalization step ensures that the time-frequency responses always reside inside the unit circle of the model space. Given the "sparseness" property of everyday sounds, these responses form distinct clusters that correspond to the source locations. In the present study, the localization model was adapted to segregate speech by applying a binary mask that removes the cluster associated with the unwanted sound. The speech target was one of 200 intelligible sentences. The interfering sound consisted of time-reversed sentences, presented in random order and spoken by the same speaker. Automatic speech recognition was used to transcribe the sound mixture before and after applying the segregation algorithm.

RESULTS: When both the target and the noise were located at the front, applying a hard mask (i.e., gains of 1 or 0) removed the noise energy almost perfectly. When the sound sources moved to the side or back, where angular separations were smaller, the clusters became less distinguishable, leading to worse intelligibility performance. Applying a soft mask (i.e., gains of 1 or 0.2) instead yielded slightly lower performance for frontal sources but improved performance for sources at the back and side.

CONCLUSION: Our algorithm performs localization and segregation in a single, straightforward framework, potentially enabling spatial hearing aids to function better in challenging listening environments.
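As a minimal illustration of the masking step summarized above: a time-frequency bin assigned to the unwanted cluster is scaled by 0 (hard mask) or 0.2 (soft mask), while target bins pass through unchanged. The sketch below assumes a complex STFT of the mixture and a precomputed cluster assignment; all identifiers are hypothetical stand-ins, and the study's actual localization and clustering pipeline is not reproduced.

```python
import numpy as np

def apply_tf_mask(mixture_tf, target_cluster, floor=0.0):
    """Attenuate time-frequency bins not assigned to the target cluster.

    mixture_tf     : complex STFT of the mixture, shape (freq, time)
    target_cluster : boolean array, True where a bin belongs to the target
    floor          : gain for non-target bins; 0.0 gives the hard mask,
                     0.2 gives the soft mask described in the abstract
    """
    mask = np.where(target_cluster, 1.0, floor)
    return mixture_tf * mask

# Toy usage with random data standing in for a real STFT and clustering result.
rng = np.random.default_rng(0)
mixture = rng.standard_normal((257, 100)) + 1j * rng.standard_normal((257, 100))
assignment = rng.random((257, 100)) > 0.5   # placeholder cluster labels

hard = apply_tf_mask(mixture, assignment, floor=0.0)   # hard mask: 1 or 0
soft = apply_tf_mask(mixture, assignment, floor=0.2)   # soft mask: 1 or 0.2
```

The soft mask trades some noise suppression at the front for robustness when clusters overlap (side and back sources), since misassigned target bins are attenuated rather than deleted outright.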