Parallel hierarchical encoding of linguistic representations in the human auditory cortex and recurrent automatic speech recognition systems


Abstract

The human brain's ability to transform acoustic speech signals into rich linguistic representations has inspired advances in automatic speech recognition (ASR) systems. While ASR systems now achieve human-level performance under controlled conditions, prior research on their parallels with the brain has been limited by the use of biologically implausible models, narrow feature sets, and comparisons that primarily emphasize predictability of brain activity without fully exploring shared underlying representations. Additionally, studies comparing the brain to text-based language models overlook the acoustic stages of speech processing, an essential step in transforming sound to meaning. Leveraging high-resolution intracranial recordings and a recurrent ASR model, this study bridges these gaps by uncovering a striking correspondence in the hierarchical encoding of linguistic features, from low-level acoustic signals to high-level semantic processing. Specifically, we demonstrate that neural activity in distinct regions of the auditory cortex aligns with representations in corresponding layers of the ASR model and, crucially, that both systems encode similar features at each stage of processing, from acoustic to phonetic, lexical, and semantic information. These findings suggest that both systems, despite their distinct architectures, converge on similar strategies for language processing, providing insight into the optimal computational principles underlying linguistic representation and the shared constraints shaping human and artificial speech processing.
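The layer-wise alignment the abstract describes is typically assessed with encoding models: a regression is fit from each model layer's activations to each neural recording site, and the layer that best predicts a site's activity indicates where that site falls in the hierarchy. The sketch below illustrates the idea on synthetic data; all shapes, the ridge penalty, and the train/test split are illustrative assumptions, not details from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (hypothetical shapes): activations from 3 ASR model
# layers and responses from 2 electrode sites, over 500 time points.
n_time, n_sites = 500, 2
layer_acts = [rng.standard_normal((n_time, d)) for d in (16, 32, 64)]

# Make each site driven by a different layer so the analysis has structure:
# site 0 reflects the early layer, site 1 the late layer (plus small noise).
neural = np.column_stack([
    layer_acts[0] @ rng.standard_normal(16),
    layer_acts[2] @ rng.standard_normal(64),
])
neural += 0.1 * rng.standard_normal(neural.shape)

def ridge_fit_predict(X_tr, y_tr, X_te, alpha=1.0):
    """Closed-form ridge regression: w = (X'X + aI)^-1 X'y."""
    w = np.linalg.solve(X_tr.T @ X_tr + alpha * np.eye(X_tr.shape[1]),
                        X_tr.T @ y_tr)
    return X_te @ w

# Fit on the first half, evaluate prediction correlation on the second half.
split = n_time // 2
scores = np.zeros((len(layer_acts), n_sites))
for li, X in enumerate(layer_acts):
    pred = ridge_fit_predict(X[:split], neural[:split], X[split:])
    for s in range(n_sites):
        scores[li, s] = np.corrcoef(pred[:, s], neural[split:, s])[0, 1]

# The best-predicting layer per site recovers the planted hierarchy.
best_layer = scores.argmax(axis=0)
print("best layer per site:", best_layer.tolist())
```

In a real analysis of this kind, the held-out prediction correlations per layer and per recording site, mapped back onto cortical anatomy, are what reveal whether early auditory regions align with early model layers and higher-order regions with deeper layers.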
