Abstract
Several segregation cues help listeners understand speech in the presence of distractor talkers, most notably differences in talker sex (i.e., differences in fundamental frequency and vocal tract length) and spatial location. It is unclear, however, how these cues work together, namely whether they show additive or even synergistic effects. Furthermore, previous research suggests better performance for target words that occur later in a sentence or sequence. We additionally investigated for which segregation cues or cue combinations this build-up occurs and whether it depends on memory effects. Twenty normal-hearing participants completed a speech-on-speech masking experiment using the OLSA (a German matrix test) speech material. We adaptively measured speech-reception thresholds for different segregation cues (differences in spatial location, fundamental frequency, and talker sex) and response conditions (which word or words had to be reported). The results show better thresholds for single-word reports than for multiple-word reports, reflecting memory constraints for multiple-word reports. We also found additivity of segregation cues for multiple-word reports but sub-additivity for single-word reports. Finally, we observed a build-up of release from speech-on-speech masking that depended on response and cue conditions: there was no build-up for multiple-word reports, whereas for single-word reports there was a continuous build-up in all but the easiest condition (different-sex, spatially separated maskers). These results shed further light on how listeners follow a target talker in the presence of competing talkers, i.e., the classical cocktail-party problem, and indicate the potential for performance improvement from enhancing segregation cues for hearing-impaired listeners.