Abstract
BACKGROUND: Rheumatoid arthritis (RA) is characterized by persistent synovial inflammation, yet the molecular mechanisms distinguishing early from late-stage disease remain incompletely understood. Identifying stage-specific biomarkers and pathogenic cellular interactions is crucial for precision medicine.
OBJECTIVE: To characterize the transcriptomic landscape and cellular composition of early versus late RA synovium, identify diagnostic biomarkers, and elucidate key pathogenic cell–cell interactions driving disease chronicity.
METHODS: Synovial tissues from 51 RA patients (13 early, 38 late-stage) were analyzed using histopathology, immunohistochemistry, bulk RNA sequencing (n = 19), and single-cell RNA sequencing (scRNA-seq, n = 6; 3 early vs. 3 late-stage RA). Machine learning algorithms (LASSO, SVM-RFE, random forest) were employed to identify diagnostic biomarkers. An artificial neural network (ANN) model was constructed and validated. Cell–cell communication analysis was performed using CellChat.
RESULTS: Histopathological analysis revealed significantly increased infiltration of macrophages (CD68+) and plasma cells (CD138+) in late-stage RA (P < 0.05). RNA sequencing identified 87 differentially expressed genes, with interferon-stimulated genes significantly upregulated. Integrated machine learning identified a minimal three-gene signature (CXCL10, ISG15, IFIH1) as a promising candidate model for RA staging. The three-gene ANN model showed excellent diagnostic performance (AUC = 0.922). Notably, CXCL10 emerged as the most critical component, with a stand-alone classification performance of AUC = 0.767 in this cohort, and was the sole independent predictor in multivariable analysis (OR = 7.271, P = 0.022). High CXCL10 expression was significantly associated with M1 macrophage infiltration (r = 0.446, P = 0.005) and was enriched in chemokine and JAK-STAT pathways. scRNA-seq revealed macrophages as the primary source of CXCL10, with upstream stimulation from CD8+ T cells via the IFN-γ-CXCL10-CXCR3 axis. Critically, we identified an expanded TREM2+ macrophage subset in late RA that highly expressed APRIL (TNFSF13) and expanded in parallel with plasma cells expressing the APRIL receptors (BCMA+/TACI+). This TREM2+ macrophage-plasma cell niche may represent a pathogenic circuit that contributes to autoimmune chronicity.
CONCLUSIONS: Late-stage RA appears to be characterized by a CXCL10-driven inflammatory signature and an expanded TREM2+ macrophage-plasma cell survival niche. CXCL10 is a promising candidate biomarker for disease staging that may be mechanistically linked to pathogenesis. The IFN-γ-CXCL10-CXCR3 axis and the APRIL-BCMA/TACI pathway may constitute therapeutic targets for refractory RA.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13075-026-03764-3.