Abstract
OBJECTIVES: To develop and evaluate a deep learning model for diagnosing untreated rheumatoid arthritis (RA) using digital camera images of bilateral dorsal hands, benchmarking its performance against the widely used 2010 ACR/EULAR criteria as a clinical reference standard.

METHODS: This pilot study included 170 participants (86 RA, 84 non-RA) who presented with joint symptoms at participating medical institutions. Digital images of both dorsal hands were captured under standardized conditions and processed using a deep learning-based background removal algorithm. A Swin Transformer-based model was developed and trained on these images. Model performance was evaluated using the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, and calibration metrics. Gradient-weighted Class Activation Mapping (Grad-CAM) was used to visualize the image regions driving the model's predictions.

RESULTS: The deep learning model achieved an AUROC of 0.870 (95% CI: 0.708–0.988), compared with 0.981 (95% CI: 0.953–1.010) for the ACR/EULAR criteria; the difference did not reach statistical significance (p = 0.131). The model's sensitivity was comparable to that of the ACR/EULAR criteria, but its specificity, accuracy, and F1-score were lower. After Platt scaling, calibration analysis showed good alignment with ideal calibration in the 0.4–0.6 probability range. Grad-CAM visualization confirmed that the model focused on clinically relevant joint regions, particularly the metacarpophalangeal and proximal interphalangeal joints.

CONCLUSION: Our deep learning-based approach for RA diagnosis using standard digital camera images demonstrated clinically viable performance, albeit with lower specificity than the ACR/EULAR criteria. This accessible screening tool could potentially expedite early RA detection, particularly in resource-limited settings. Larger multi-centre studies are needed to validate our findings and establish broader clinical applicability.
CLINICAL TRIAL NUMBER: Not applicable.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s41927-026-00639-7.
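The two evaluation steps named in the abstract, AUROC and Platt scaling, can be illustrated with a minimal sketch. This is not the study's code: the function names (`auroc`, `fit_platt`, `apply_platt`), the toy labels and scores, and the plain gradient-descent fit are all assumptions made for illustration. The sketch computes AUROC as the fraction of correctly ordered positive/negative pairs, then fits a logistic map from raw model scores to probabilities, which is the core idea of Platt scaling (the original formulation also smooths the targets, which is omitted here).

```python
import math


def auroc(labels, scores):
    """AUROC as the Mann-Whitney statistic: the fraction of
    (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_platt(scores, labels, lr=0.1, steps=5000):
    """Fit p = sigmoid(a * score + b) by gradient descent on log loss.
    Simplified Platt scaling: no target smoothing (an assumption)."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def apply_platt(scores, a, b):
    """Map raw scores to calibrated probabilities."""
    return [sigmoid(a * s + b) for s in scores]


# Toy data for illustration only (1 = RA, 0 = non-RA); not study data.
labels = [0, 0, 0, 1, 0, 1, 1, 1]
scores = [-2.1, -1.3, -0.4, -0.2, 0.3, 0.9, 1.6, 2.4]

a, b = fit_platt(scores, labels)
probs = apply_platt(scores, a, b)

print("AUROC (raw scores):", auroc(labels, scores))  # 15 of 16 pairs ordered correctly
print("AUROC (calibrated):", auroc(labels, probs))
```

Because the fitted map is monotonically increasing when `a > 0`, calibration of this kind changes the probability values but not the ranking of cases, so the AUROC is unchanged. In practice the calibration parameters would be fitted on held-out data rather than on the same cases used for evaluation.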