Synchronous Analysis of Speech Production and Lips Movement to Detect Parkinson's Disease Using Deep Learning Methods

利用深度学习方法同步分析语音产生和唇部运动以检测帕金森病

阅读:1

Abstract

BACKGROUND/OBJECTIVES: Parkinson's disease (PD) affects more than 6 million people worldwide. Its accurate diagnosis and monitoring are key factors to reduce its economic burden. Typical approaches consider either speech signals or video recordings of the face to automatically model abnormal patterns in PD patients. METHODS: This paper introduces, for the first time, a new methodology that performs the synchronous fusion of information extracted from speech recordings and their corresponding videos of lip movement, namely the bimodal approach. RESULTS: Our results indicate that the introduced method is more accurate and suitable than unimodal approaches or classical asynchronous approaches that combine both sources of information but do not incorporate the underlying temporal information. CONCLUSIONS: This study demonstrates that using a synchronous fusion strategy with concatenated projections based on attention mechanisms, i.e., speech-to-lips and lips-to-speech, exceeds previous results reported in the literature. Complementary information between lip movement and speech production is confirmed when advanced fusion strategies are employed. Finally, multimodal approaches, combining visual and speech signals, showed great potential to improve PD classification, generating more confident and robust models for clinical diagnostic support.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。