Abstract
INTRODUCTION: Multiple sclerosis (MS) is a complex disease characterized by diverse clinical presentations and progression patterns. Accurate classification and prediction of disease severity are crucial for personalized treatment. We applied machine learning (ML) to demographic, clinical and MRI data to distinguish MS patients from healthy controls (HC), classify MS phenotypes and predict disability as measured by the Expanded Disability Status Scale (EDSS) score.
METHODS: We included 1,554 MS patients and 520 HC from the Italian Neuroimaging Network Initiative repository, all with a neurological assessment and brain T2-/3D T1-weighted MRI. Derived MRI features included total and regional T2 lesion volumes (LV) and normalized volumes of cortical and subcortical grey matter (GM), white matter, cerebellum and brainstem. ML models, including support vector machines, multi-layer perceptron networks, random forest and gradient boosting, were trained for the classification and prediction tasks. SHAP analysis ranked the most influential variables.
RESULTS: ML models achieved 89-96% accuracy in distinguishing MS patients from HC, driven mainly by T2 LV and brainstem/cerebellar GM volumes. Relapsing vs progressive MS was classified with 92% accuracy, with EDSS, age, and thalamic and cortical GM volumes as key predictors. EDSS prediction achieved an intra-class correlation of 0.56-0.76; the most relevant contributors were T2 LV, sex, cortical/cerebellar GM and thalamic volumes.
DISCUSSION: ML models demonstrated high accuracy in detecting MS, differentiating phenotypes and predicting disability. Integrating demographic, clinical and MRI measures emerges as an effective strategy for patient classification and disease severity assessment.