Phonological complexity, speech style, and individual differences influence ASR performance for Tarifit

Abstract

This study examines individual differences through the lens of automatic speech recognition (ASR) transfer (applying ASR trained on one language to a new language) from Arabic to Tarifit, an under-resourced Amazigh language with typologically rare phonological structures. Thirty-seven native Tarifit speakers produced target words in both clear and casual speaking styles, allowing us to assess how phonological complexity and speech clarity interact to influence ASR performance. Results show that clear speech significantly improves recognition accuracy, particularly for words with rising-sonority onset clusters. In contrast, falling-sonority clusters and initial geminate consonants, both typologically marked structures, yield higher error rates even when spoken clearly. Importantly, we observe substantial speaker-level variability in ASR outcomes, though demographic factors such as age and gender do not predict performance. These findings suggest that individual differences in speech production and phonological encoding play a critical role in shaping ASR success. By using ASR as a proxy for perceptual processing, this work contributes to our understanding of how phonological structure and speaker variability jointly influence speech perception, with implications for inclusive ASR design and phonological theory.
