An Audio-Ultrasound Synchronized Database of Tongue Movement for Mandarin speech

普通话语音舌运动的音频-超声同步数据库

阅读:1

Abstract

Ultrasound imaging has been widely adopted in speech research to visualize dynamic tongue movements during speech production. These images are universally used as visual feedback in interventions for articulation disorders or visual cues in speech recognition. Nevertheless, the availability of high-quality audio-ultrasound datasets remains scarce. The present study, therefore, aims to construct a multimodal database designed for Mandarin speech. The dataset integrates synchronized ultrasound images of lingual movement, and the corresponding audio recordings and text annotations elicited from 43 healthy speakers and 11 patients with dysarthria through speech tasks (including vowels, monosyllables, and sentences), with a total duration of 22.31 hours. In addition, a customized helmet structure was employed to stabilize the ultrasound probe, precisely controlling for head movement and minimizing displacement interference, The proposed database carries apparent values in automatic speech recognition, silent interface development, and research in speech pathology and linguistics.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。