Abstract
OBJECTIVE: This study aims to construct and validate a machine learning model based on radiomics analysis of magnetic resonance imaging (MRI) to identify high-risk subtypes of Parkinson’s disease (PD) patients who may progress to depression.
METHODS: The study used data from 272 PD patients in the PPMI database, 81 of whom developed depression during a five-year follow-up period. The cohort was randomly divided into a training set (n = 191) and a test set (n = 81). Radiomic features of white matter, gray matter, and cerebrospinal fluid were extracted from structural MRI scans in the training set and then reduced in dimensionality to create a radiomics marker. Multivariate logistic regression was used to select clinical predictors of depression in Parkinson’s disease (DPD), which were then combined with the radiomics marker to develop an integrated model for predicting DPD using the decision tree algorithm. Model performance was evaluated with the receiver operating characteristic (ROC) curve in both the training and test sets, and classification performance was visualized with a confusion matrix.
RESULTS: The areas under the curve (AUCs) for the radiomics marker in the training and test sets were 0.729 and 0.695, respectively. Multivariate logistic regression analysis indicated that mild cognitive impairment (MCI), UPDRS I score, and the radiomics marker were significant predictors of DPD. The integrated model incorporating the radiomics marker achieved AUCs of 0.841 and 0.823 in the training and test sets, respectively, with sensitivities of 0.912 and 0.875 and specificities of 0.767 and 0.714. The confusion matrix showed statistically significant differences (P < 0.05) in the number of individuals who actually progressed to depression between the low-risk and high-risk groups classified by the model, in both the training and test sets.
CONCLUSION: This study demonstrates that an integrated model combining radiomics with machine learning can serve as a valuable clinical tool for predicting the onset of DPD in PD patients, potentially facilitating adaptive strategies for clinical management.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12883-026-04779-8.
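
For readers unfamiliar with the metrics reported in RESULTS, the following is a minimal sketch of how AUC, sensitivity, and specificity are computed from predicted risk scores. It is not the authors' pipeline: the function names, the toy labels and scores, and the 0.5 threshold are all hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def confusion_counts(labels, scores, threshold):
    """Return (tp, fp, tn, fn), calling a case high-risk if score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(tp, fp, tn, fn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data (NOT from the study): 1 = developed depression.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

toy_auc = roc_auc(toy_labels, toy_scores)
toy_sens, toy_spec = sensitivity_specificity(
    *confusion_counts(toy_labels, toy_scores, threshold=0.5)
)
```

The pairwise definition of AUC is used here because it needs no external library and makes the interpretation explicit; on real cohorts an established implementation would normally be preferred.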