Abstract
BACKGROUND: Post-thrombotic syndrome (PTS) is a common and debilitating complication of lower extremity deep vein thrombosis (LEDVT). Early assessment of PTS risk remains a clinical challenge. METHODS: This retrospective, multicenter cohort study included 265 patients with unprovoked LEDVT. Baseline clinical and biochemical data, including serum uric acid, body mass index (BMI), and treatment delay, were recorded. Ten machine learning (ML) models (e.g., support vector machine [SVM], XGBoost, and LightGBM) were constructed to predict PTS occurrence using a 70:30 training-validation split. Model performance was evaluated by F1-score, area under the receiver operating characteristic curve (AUC), calibration curves, and decision curve analysis. Model interpretability was assessed with SHapley Additive exPlanations (SHAP). RESULTS: PTS occurred in 92 patients (34.7%). In multivariate logistic regression, iliofemoral DVT, elevated uric acid, prolonged treatment delay, higher BMI, and lack of statin use were independent predictors of PTS. Among the ML models, SVM achieved the best performance on the held-out set (AUC = 0.985, F1 = 0.926). SHAP analysis identified BMI, treatment delay, and DVT location as the top contributors to prediction, while uric acid showed moderate influence. A web-based risk calculator was deployed for clinical use. CONCLUSIONS: ML models predicted PTS with high accuracy, and the SVM algorithm outperformed the other algorithms. Adding uric acid and treatment timing refined risk stratification. The resulting tool may assist clinicians in the early identification and risk stratification of patients for individualized care.
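
As an illustration of the two headline metrics reported above, the following minimal sketch computes AUC and F1 from held-out labels and predicted risk scores. It is not the study's code: the labels, scores, and the 0.5 decision threshold are hypothetical toy values chosen only to show how the quantities are defined.

```python
# Illustrative sketch only: toy labels and scores, NOT data from the study.

def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical held-out labels (1 = PTS, 0 = no PTS) and model risk scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

auc = roc_auc(y_true, scores)
f1 = f1_score(y_true, y_pred)
print(f"AUC = {auc:.3f}, F1 = {f1:.3f}")
```

AUC depends only on how the model ranks patients, whereas F1 depends on the chosen threshold, which is one reason the two are usually reported together with calibration and decision curve analysis.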