Abstract
BACKGROUND: Immune cells are known to be involved in rheumatoid arthritis (RA), but the relationship between other blood cell indices and RA disease activity, and the mechanisms underlying it, remains unclear.
OBJECTIVE: This study aimed to develop an interpretable machine learning model based on blood cell parameters to assess RA disease severity and to assist in personalized treatment decisions.
DESIGN: This was a retrospective case-control study.
METHODS: Routine blood test and biochemical data were analyzed from 4401 patients seen at the First Affiliated Hospital of Guangxi Medical University between January 1, 2018, and January 1, 2024. The primary outcome was disease severity stratification. Recursive feature elimination was applied to identify key variables, and 10 machine learning algorithms were benchmarked on 55 clinical features with internal validation. Model interpretability was assessed with SHapley Additive exPlanations (SHAP), while logistic regression and restricted cubic spline models were used to examine associations between blood cell indices and disease severity. In addition, Mendelian randomization analysis was performed to explore potential causal relationships.
RESULTS: Blood cell indices were identified as the primary factors associated with RA severity. In model evaluation, the Random Forest model achieved the best performance, with test set AUCs of 0.870 and 0.874. Mendelian randomization supported a causal relationship between blood cell indices and RA risk.
CONCLUSION: These results reinforce the associations between blood cell indices and RA severity. The machine learning model showed good predictive performance for RA severity and may assist clinicians in developing personalized treatment strategies.
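As a rough illustration of the modeling steps named in the methods (recursive feature elimination, a Random Forest classifier, and a test set AUC), a minimal sketch is given below. It is not the study's code: it assumes scikit-learn, uses synthetic data in place of the 55 clinical features, treats severity as a binary label, and the function name `run_sketch`, the split ratio, the number of retained features, and all hyperparameters are hypothetical choices made only for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


def run_sketch(n_features_to_keep=10, seed=0):
    """Hypothetical RFE + Random Forest pipeline on synthetic data."""
    # Synthetic stand-in for 55 clinical features; NOT the study data.
    # Binary severity label is an assumption made for this sketch.
    X, y = make_classification(
        n_samples=600, n_features=55, n_informative=8,
        n_redundant=4, random_state=seed,
    )
    # Hold out a test set (the 70/30 ratio is an assumption).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed,
    )
    # Recursive feature elimination: drop the 5 least important
    # features per round until n_features_to_keep remain.
    selector = RFE(
        RandomForestClassifier(n_estimators=50, random_state=seed),
        n_features_to_select=n_features_to_keep, step=5,
    )
    selector.fit(X_train, y_train)
    # Refit a Random Forest on the selected features only.
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(selector.transform(X_train), y_train)
    # Score the held-out set and compute the area under the ROC curve.
    scores = model.predict_proba(selector.transform(X_test))[:, 1]
    return selector.support_, roc_auc_score(y_test, scores)


selected_mask, test_auc = run_sketch()
print(f"features kept: {int(selected_mask.sum())}, test AUC: {test_auc:.3f}")
```

The feature selector is fitted on the training split only, so the held-out AUC is not influenced by the selection step. The AUC printed here reflects the synthetic data and says nothing about the values reported in the results.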