Abstract
OBJECTIVE: Flares of axial spondyloarthritis (axSpA) are common yet unpredictable. We aimed to develop and internally validate a machine learning (ML) model to forecast flares 3, 6, 9 and 12 months ahead using routinely collected electronic health record (EHR) data.

METHODS: We performed a retrospective cohort study of 282 patients with axSpA seen at our centre between January 2018 and May 2024. Flares were defined as a rise of ≥ 2.1 units in the Bath Ankylosing Spondylitis Disease Activity Index (BASDAI), a rise of ≥ 0.9 units in the axSpA Disease Activity Score and a clinician-confirmed flare. Ninety-eight candidate predictors spanning demographics, patient-reported outcomes, laboratory indices and comorbidities were aggregated into time-series windows. We applied Light Gradient-Boosting Machine (LGBM) and eXtreme Gradient Boosting (XGBoost) to forecast the risk of flare. Model interpretability was assessed with SHapley Additive exPlanations (SHAP).

RESULTS: Of 282 patients, 100 (35.5%) experienced at least one flare. At the 3-month horizon, the LGBM model achieved an accuracy of 0.846 (95% CI 0.545–0.980), sensitivity of 0.833 (95% CI 0.358–0.995), specificity of 0.857 (95% CI 0.421–0.996) and area under the receiver operating characteristic curve (AUROC) of 0.845 (95% CI 0.615–1.000). Performance decreased modestly at 12 months (AUROC 0.773, 95% CI 0.562–0.984). The top SHAP contributors included comorbidity burden, BASDAI, C-reactive protein (CRP) deviation, lymphocyte count, age and deprivation index. Individual-level SHAP plots enabled personalised risk profiles.

CONCLUSION: This proof-of-concept study demonstrates the feasibility of using explainable ML models to predict axSpA flares up to 1 year in advance from real-world EHR data. Embedding the algorithm in electronic records could help triage high-risk patients to earlier review and therapy adjustment. This approach offers a novel strategy to inform treat-to-target care pathways and support future integration into digital rheumatology systems.
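
The 95% confidence intervals reported for accuracy, sensitivity and specificity are wide, which is typical of interval estimates for a proportion computed on a small evaluation set. The following is a minimal sketch of one common way to obtain such an interval, the exact (Clopper-Pearson) binomial method. The abstract does not state which interval method was used, so the choice of method is an assumption, and the counts in the usage example (5 detected flares out of 6) are hypothetical values chosen only because 5/6 ≈ 0.833; they are not taken from the study.

```python
from math import comb


def binom_tail_ge(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def solve_increasing(f, target, iters=100):
    """Bisection on [0, 1] for f(p) = target, where f is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a proportion k / n."""
    if not 0 <= k <= n or n == 0:
        raise ValueError("need n > 0 and 0 <= k <= n")
    # Lower bound: the p at which P(X >= k) equals alpha / 2 (0 if k == 0).
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: binom_tail_ge(k, n, p), alpha / 2
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2, which is the
    # same as P(X >= k + 1) = 1 - alpha / 2 (1 if k == n).
    upper = 1.0 if k == n else solve_increasing(
        lambda p: binom_tail_ge(k + 1, n, p), 1 - alpha / 2
    )
    return lower, upper


if __name__ == "__main__":
    # Hypothetical counts, for illustration only: 5 of 6 flares detected.
    low, high = clopper_pearson(5, 6)
    print(f"sensitivity 5/6 = {5 / 6:.3f}, 95% CI {low:.3f} to {high:.3f}")
```

With these hypothetical counts the interval runs from roughly 0.36 to just under 1, which illustrates how a high point estimate from a handful of cases can still be compatible with much poorer true performance. Larger evaluation sets or external validation would be needed to narrow such intervals.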