An explainable ensemble machine learning model using baseline blood transcriptomics to predict Parkinson's disease motor progression

Abstract

INTRODUCTION: Predicting Parkinson's disease (PD) motor progression remains challenging despite advances in neuroimaging. Blood-based transcriptomic profiling offers a more accessible and cost-effective alternative. This study aimed to develop and validate a machine learning approach that uses blood-based transcriptomic data to predict 12-month motor severity in PD, and to identify the transcriptomic features and biological pathways most strongly associated with progression.

METHODS: A Stacking Regressor ensemble model combining three gradient boosting algorithms (XGBoost, LightGBM, CatBoost) was developed using baseline Parkinson's Progression Markers Initiative (PPMI) data (n = 390), integrating blood RNA sequencing (RNA-seq) and clinical features to predict 12-month Unified Parkinson's Disease Rating Scale (UPDRS) Part III scores. SHapley Additive exPlanations (SHAP) analysis was applied to identify key prognostic features, evaluating seven PD risk genes (SNCA, LRRK2, GBA, PRKN, PINK1, PARK7, VPS35) and pathway scores for mitochondrial dysfunction, neuroinflammation, and autophagy.

RESULTS: On an independent test set (n = 78), the model achieved a coefficient of determination (R²) of 0.551 and a mean absolute error (MAE) of 6.01. SHAP analysis identified the interaction between baseline UPDRS and PINK1 (UPDRS_BL × PINK1) as the most influential feature (mean |SHAP| = 0.283). Among transcriptomic features, VPS35 (mean |SHAP| = 0.010), GBA, and LRRK2 were most prominent. Among pathway scores, mitochondrial dysfunction showed the highest contribution (mean |SHAP| = 0.008).

DISCUSSION: The study establishes that machine learning integrating blood transcriptomics and clinical data effectively predicts motor progression in PD. Crucially, the interplay between initial clinical state and specific genetic backgrounds, particularly PINK1, is a more powerful prognostic indicator than either factor alone. This study provides systematic evidence that mitochondrial dysfunction is a dominant prognostic signal for disease progression, nominating key genes and pathways for future mechanistic and therapeutic investigation.
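
For readers unfamiliar with the quantities reported above, the following is a minimal sketch of how R², MAE, and a mean |SHAP| feature ranking are conventionally computed. It is not the authors' code: the function names, the feature labels, and every number in it are hypothetical toy values chosen for illustration, and the SHAP values are assumed to have been produced elsewhere as one row per patient and one column per feature.

```python
from statistics import fmean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted scores."""
    return fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_abs_shap(shap_rows, feature_names):
    """Rank features by mean |SHAP| across samples, largest first."""
    n_samples = len(shap_rows)
    scores = {
        name: sum(abs(row[j]) for row in shap_rows) / n_samples
        for j, name in enumerate(feature_names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy 12-month UPDRS Part III scores; NOT data from the study.
y_true = [20.0, 30.0, 25.0, 35.0]
y_pred = [22.0, 28.0, 27.0, 31.0]

# Hypothetical feature labels and toy SHAP values (rows = patients).
feature_names = ["UPDRS_BL_x_PINK1", "VPS35", "mito_pathway_score"]
shap_rows = [
    [0.5, -0.25, 0.0],
    [-0.25, 0.25, 0.125],
]

mae = mean_absolute_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
ranking = mean_abs_shap(shap_rows, feature_names)

print(f"MAE = {mae:.3f}, R2 = {r2:.3f}")
for name, score in ranking:
    print(f"{name}: mean |SHAP| = {score:.4f}")
```

Taking the absolute value before averaging is what makes mean |SHAP| a measure of overall influence: a feature that pushes predictions up for some patients and down for others still ranks highly, whereas a signed mean could cancel to zero.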
