Abstract
Major depressive disorder (MDD) is a globally prevalent psychiatric disorder that significantly impairs quality of life and increases suicide risk. Accurate identification of MDD is critical for clinically assisted diagnosis. Although substantial progress has been made in MDD identification, the extraction of region-of-interest (ROI) features from functional brain networks remains underexplored. Furthermore, most studies rely on small-scale resting-state functional magnetic resonance imaging (rs-fMRI) datasets, which limits the generalizability of their findings to large-scale brain networks. To address these issues, we propose a novel graph embedding-based feature selection classification framework (GEF-FSC) to identify MDD from multi-site rs-fMRI data. The framework employs the node2vec algorithm to learn local and global functional connectivity (FC) features of ROIs via flexible random walks, capturing structural information in functional brain networks. Random Forest is then applied for feature selection on the learned embedding features, followed by classification using an ensemble classifier. This approach captures complex, higher-order structural information between ROIs and retains the most important features, improving classification accuracy by reducing redundancy in high-dimensional FC features. Evaluated on the REST-meta-MDD dataset, our framework achieved 81.65% accuracy with the Dosenbach template and 75.30% with the AAL atlas. In comparative experiments against eight benchmark methods and six state-of-the-art classifiers, it achieved superior accuracy, sensitivity, specificity, and F1-score. Interpretability analysis highlighted key brain regions and networks consistent with previous findings. The GEF-FSC framework effectively classifies MDD and identifies key brain regions and networks associated with the disorder, underscoring the importance of higher-order structural information in improving diagnostic accuracy.
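To make the "flexible random walks" of node2vec concrete, the following is a minimal sketch of a second-order biased walk over a thresholded FC graph. It is an illustration under stated assumptions, not the authors' implementation: the toy 4-ROI matrix, the 0.4 threshold, the values of the return parameter `p` and in-out parameter `q`, and the helper names `build_graph` and `node2vec_walk` are all hypothetical. In a full pipeline the resulting walks would typically be passed to a skip-gram model to produce ROI embeddings, which is not shown here.

```python
import random


def build_graph(fc, threshold):
    """Build a weighted adjacency dict from a symmetric FC matrix.

    Keeps an edge i-j when |fc[i][j]| >= threshold (an assumed rule).
    """
    n = len(fc)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            w = abs(fc[i][j])
            if w >= threshold:
                graph[i][j] = w
                graph[j][i] = w
    return graph


def node2vec_walk(graph, start, length, p, q, rng):
    """Sample one node2vec-style walk of at most `length` nodes.

    The weight of stepping from the current node to neighbor x is scaled by
    1/p if x is the previous node (return), 1 if x is also a neighbor of the
    previous node, and 1/q otherwise (moving outward).
    """
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbors = sorted(graph[cur])
        if not neighbors:
            break  # isolated node: the walk cannot continue
        if len(walk) == 1:
            # First step has no previous node, so use edge weights only.
            weights = [graph[cur][x] for x in neighbors]
        else:
            prev = walk[-2]
            weights = []
            for x in neighbors:
                w = graph[cur][x]
                if x == prev:
                    weights.append(w / p)
                elif x in graph[prev]:
                    weights.append(w)
                else:
                    weights.append(w / q)
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


# Toy symmetric FC matrix for 4 hypothetical ROIs (values are made up).
fc = [
    [1.0, 0.8, 0.1, 0.6],
    [0.8, 1.0, 0.7, 0.2],
    [0.1, 0.7, 1.0, 0.5],
    [0.6, 0.2, 0.5, 1.0],
]

rng = random.Random(0)  # fixed seed so repeated runs give the same walks
graph = build_graph(fc, threshold=0.4)
walks = [node2vec_walk(graph, s, 6, p=1.0, q=0.5, rng=rng) for s in graph]
for walk in walks:
    print(walk)
```

With `q` below 1 the walk is biased toward nodes farther from the previous node, which favors exploring more global structure; with `q` above 1 it stays closer to the local neighborhood. This trade-off is what allows a single embedding to reflect both local and global connectivity.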