Cross-Modal Alignment and Rectified Flow-Based Latent Representation Synthesis for Enhanced Speech-Driven Alzheimer's Disease Detection

Abstract

To address the limited accuracy of speech-based Alzheimer's Disease (AD) screening and the shortage of paired multimodal data, this paper proposes a detection framework based on feature alignment and Rectified Flow-driven latent representation generation. The EEG dataset consists of 36 AD patients and 29 Healthy Controls (HC). The speech dataset contains 399 samples, which include 114 AD cases, 132 Mild Cognitive Impairment (MCI) cases, and 153 HC cases. We extracted multidimensional features of EEG signals, such as time-domain and frequency-domain characteristics, alongside behavioral representations of speech. A heterogeneous alignment network was used to map these features into a common semantic subspace, where an adaptive interpolation strategy reconstructed the missing pathological trajectories of MCI within the latent space. On this basis, a conditional Rectified Flow model was introduced to learn the optimal transport mapping from speech to EEG. This model generated physiological-information-rich latent representations to compensate for semantic gaps. Experimental results showed that the fused features from speech and latent representations achieved a three-class classification accuracy of 89.08%, a precision of 88.77%, and a recall of 88.71%. This performance represented an accuracy improvement of 9.28% compared with the speech-based baseline system. Our method combines the convenience of speech screening with the high reliability of neurophysiological signals, and it provides a new approach for low-cost early detection of AD.
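The abstract does not give the training objective, so the following is only a minimal sketch of a generic conditional Rectified Flow, not the authors' implementation. It assumes the flow starts from a speech-derived latent `x0` and ends at the paired EEG latent `x1`, with the speech features also passed as the condition `cond`; the paper may instead start from noise. All function names here are hypothetical. The sketch shows the three standard ingredients: straight-line interpolation between the endpoints, regression of a velocity model onto the constant displacement `x1 - x0`, and Euler integration to generate an EEG-like latent from speech at inference time.

```python
from typing import Callable, List, Sequence

Vector = List[float]
# Hypothetical signature for the velocity model: v(x_t, t, cond) -> velocity.
VelocityFn = Callable[[Sequence[float], float, Sequence[float]], Vector]


def interpolate(x0: Sequence[float], x1: Sequence[float], t: float) -> Vector:
    """Point on the straight path x_t = (1 - t) * x0 + t * x1."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0: Sequence[float], x1: Sequence[float]) -> Vector:
    """Rectified Flow regression target: the constant displacement x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]


def rectified_flow_loss(
    velocity_fn: VelocityFn,
    x0: Sequence[float],
    x1: Sequence[float],
    cond: Sequence[float],
    t: float,
) -> float:
    """Mean squared error between predicted and target velocity at time t."""
    xt = interpolate(x0, x1, t)
    pred = velocity_fn(xt, t, cond)
    tgt = target_velocity(x0, x1)
    return sum((p - g) ** 2 for p, g in zip(pred, tgt)) / len(tgt)


def euler_sample(
    velocity_fn: VelocityFn,
    x0: Sequence[float],
    cond: Sequence[float],
    steps: int = 10,
) -> Vector:
    """Integrate dx/dt = v(x, t, cond) from t = 0 to t = 1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        v = velocity_fn(x, t, cond)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x
```

In practice `velocity_fn` would be a trained neural network and `t` would be sampled uniformly from [0, 1] for each training pair. With a velocity model that has learned the displacement exactly, the loss is zero and `euler_sample` carries `x0` onto `x1` along a straight line, which is why Rectified Flow can sample with few integration steps.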
