Dynamic context-aware multi-modal deep learning for longitudinal prediction of Parkinson's disease progression

Abstract

Accurately forecasting the progression of Parkinson's disease (PD) motor symptoms in early-to-moderate stages is essential for timely intervention and personalized patient care but remains challenging due to heterogeneous and longitudinal symptom evolution. We present a novel dynamic context-aware multi-modal deep learning framework that predicts future motor symptom severity by integrating advanced voice biomarkers with signal processing techniques, clinical progression features, demographic metadata, and semantically enriched patient summary embeddings derived from comprehensive clinical narratives via state-of-the-art natural language processing. Leveraging bidirectional LSTMs augmented with multi-head self-attention, our architecture captures complex temporal dependencies while preventing information leakage. To ensure robust evaluation despite limited sample size (42 patients), we implemented repeated 5-fold cross-validation at the patient level (8 repetitions, 40 total folds), substantially exceeding standard evaluation rigor. Our approach achieves exceptional performance (R² = 0.9925 ± 0.0027, RMSE = 0.67 ± 0.19, MAE = 0.50 ± 0.15), with all 40 folds achieving R² > 0.989, significantly outperforming classical machine learning baselines ([Formula: see text] and 0.002785) and all previously published methods on this dataset. Cross-validated ablation studies (240 total model trainings across 6 configurations) reveal that clinical features establish a strong baseline (R² = 0.9887 ± 0.0043), while text embeddings provide the largest incremental gain (3.82% RMSE reduction). Voice biomarkers contribute modestly to accuracy (2.72%) but substantially enhance stability (10-fold lower variability). The full multi-modal model achieves optimal performance (7.50% RMSE reduction vs. clinical-only) with the lowest variability (CV = 0.27%), demonstrating that dynamic cross-modal fusion enhances both accuracy and robustness.
These findings, validated through 40 independent evaluations with each patient tested 8 times, demonstrate that integrating engineered temporal dynamics and contextual embeddings through advanced temporal modeling enables accurate longitudinal predictions of early-to-moderate PD progression. Complete code and implementation details are publicly available to ensure reproducibility.
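The abstract names the temporal backbone (a bidirectional LSTM augmented with multi-head self-attention) but not its dimensions or readout. A minimal PyTorch sketch of that architecture, with hypothetical layer sizes and mean pooling as a placeholder readout, might look like:

```python
import torch
import torch.nn as nn

class BiLSTMAttentionRegressor(nn.Module):
    """Bidirectional LSTM encoder followed by multi-head self-attention,
    pooled into a single motor-severity score. All sizes are illustrative,
    not taken from the paper."""

    def __init__(self, n_features=32, hidden=64, n_heads=4):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden, batch_first=True,
                               bidirectional=True)
        self.attention = nn.MultiheadAttention(embed_dim=2 * hidden,
                                               num_heads=n_heads,
                                               batch_first=True)
        self.regressor = nn.Linear(2 * hidden, 1)

    def forward(self, x):            # x: (batch, time, features)
        h, _ = self.encoder(x)       # (batch, time, 2 * hidden)
        a, _ = self.attention(h, h, h)
        return self.regressor(a.mean(dim=1)).squeeze(-1)  # (batch,)

# Hypothetical usage: 3 patients, 12 visits each, 32 fused features per visit.
model = BiLSTMAttentionRegressor()
scores = model(torch.randn(3, 12, 32))
```

The "information leakage" the abstract guards against is presumably handled at the data-split level (whole patients held out), not inside the network, so no causal masking is applied here.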
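The evaluation protocol (5 patient-level folds, 8 repetitions, 40 total folds, each patient tested 8 times) can be reproduced with a short NumPy-only sketch; the function name and seed handling below are my own, not the paper's:

```python
import numpy as np

def repeated_patient_kfold(patient_ids, n_splits=5, n_repeats=8, seed=0):
    """Yield (train_idx, test_idx) pairs of sample indices in which each
    test fold holds out whole patients, with a fresh patient shuffle per
    repetition so the folds differ across repeats."""
    patients = np.unique(patient_ids)
    rng = np.random.default_rng(seed)
    for _ in range(n_repeats):
        order = rng.permutation(patients)
        for test_patients in np.array_split(order, n_splits):
            test_mask = np.isin(patient_ids, test_patients)
            yield np.flatnonzero(~test_mask), np.flatnonzero(test_mask)

# 42 patients with, say, 3 visits each -> 5 folds x 8 repeats = 40 splits.
pids = np.repeat(np.arange(42), 3)
folds = list(repeated_patient_kfold(pids))
```

Because every patient lands in exactly one test fold per repetition, each patient is evaluated exactly `n_repeats` times, matching the "each patient tested 8 times" claim.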
