Abstract
This study identifies diagnostic biomarkers of osteoarthritis (OA)-related synovitis from synovial tissue expression and develops a validated diagnostic nomogram. We analyzed Gene Expression Omnibus (GEO) synovium datasets (training: GSE55235, GSE55457, GSE82107, OA = 30 vs controls = 27; validation: GSE89408, OA = 22 vs controls = 28; cartilage comparator: GSE129147, OA = 10 vs controls = 9). Weighted gene correlation network analysis was applied to identify phenotype-linked modules, followed by 4 machine learning models (random forest [RF], support vector machine [SVM], extreme gradient boosting [XGB], generalized linear model [GLM]) to rank genes, selection of hub genes from the top SVM features, construction and validation of a multigene nomogram predicting OA-related synovitis vs control, integrative pathway and immune profiling (Gene Ontology/Kyoto Encyclopedia of Genes and Genomes, single-sample gene set enrichment analysis [ssGSEA]), competitive endogenous RNA network analysis, and hypothesis-generating protein-ligand docking. In the training synovium set (GSE55235 + GSE55457 + GSE82107; outcome = OA-related synovitis vs control), areas under the curve (AUCs; 95% confidence intervals [CIs]) were RF 0.944 (0.882-1.000), SVM 1.000 (0.997-1.000), XGB 0.917 (0.842-0.992), and GLM 0.944 (0.882-1.000). In the external synovium validation dataset GSE89408 (outcome = OA-related synovitis vs control), AUCs (95% CIs) were RF 0.729 (0.585-0.873), SVM 0.792 (0.662-0.922), XGB 0.717 (0.571-0.863), and GLM 0.771 (0.636-0.906); the drop from training to external performance underscores that external validation is the fairer test of model generalizability. The cartilage comparator GSE129147 (outcome = OA vs control in cartilage) yielded an SVM AUC of 0.833 (0.333-1.000), suggesting some cross-tissue consistency of the synovium-derived signature, although the wide interval limits inference.
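As a reading aid for the AUCs and intervals reported above, the following minimal Python sketch computes a rank-based AUC and a percentile bootstrap 95% interval. The abstract does not state how the intervals were obtained, so the bootstrap is an assumption, and the function names (`auc`, `bootstrap_ci`) and toy scores are hypothetical illustrations rather than study data.

```python
import random

def auc(scores, labels):
    """Chance that a random case outscores a random control (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(scores, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method, not the study's)."""
    rng = random.Random(seed)
    idx = range(len(scores))
    stats = []
    while len(stats) < n_boot:
        sample = [rng.choice(idx) for _ in idx]
        ys = [labels[i] for i in sample]
        if 0 < sum(ys) < len(ys):  # skip resamples that miss a class
            stats.append(auc([scores[i] for i in sample], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical classifier scores: 1 = OA-related synovitis, 0 = control.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
print(auc(toy_scores, toy_labels), bootstrap_ci(toy_scores, toy_labels))
```

With cohorts as small as the cartilage comparator (10 vs 9), intervals of this kind are expected to be wide, which is consistent with the 0.333-1.000 range reported there.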
Five hub genes (CTSH, EFNB2 [ephrin-B2], YIPF2, ZNF671, and SLC27A6) were identified from 462 intersecting genes; they were taken from the SVM model because it showed the smallest residuals and the best internal discrimination among the 4 tested algorithms. The 5-gene nomogram showed good calibration and a decision-curve net benefit across threshold probabilities of 10% to 40%, supporting its diagnostic utility. ssGSEA revealed enriched immune-related pathways and higher infiltration of B cells, macrophages, mast cells, and T-cell subsets in OA synovium; this infiltration was closely associated with the expression of hub genes such as YIPF2 and ZNF671, which are linked to adaptive-immune and inflammatory signaling. Molecular docking indicated that dexamethasone and triamcinolone acetonide bind to the protein products of the hub genes (binding energies -7.1 to -8.5 kcal/mol). The 5-gene synovium-based SVM model provides a validated diagnostic nomogram for OA-related synovitis; docking findings are hypothesis-generating and not evidence of therapeutic efficacy.
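The decision-curve net benefit mentioned above can be illustrated with a short Python sketch using the standard definition, net benefit = TP/n - (FP/n) x pt/(1 - pt), compared against a "classify everyone as positive" strategy. The function names and the toy probabilities are hypothetical and are not the study's nomogram outputs.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of calling a sample positive when its predicted risk >= threshold."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)

def net_benefit_treat_all(labels, threshold):
    """Reference strategy: every sample is called positive."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Hypothetical predicted risks: 1 = OA-related synovitis, 0 = control.
toy_probs = [0.95, 0.85, 0.70, 0.35, 0.60, 0.30, 0.15, 0.05]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

for pt in (0.10, 0.20, 0.30, 0.40):
    model_nb = net_benefit(toy_probs, toy_labels, pt)
    all_nb = net_benefit_treat_all(toy_labels, pt)
    print(f"threshold={pt:.2f} model={model_nb:.3f} treat_all={all_nb:.3f}")
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all value and zero (the treat-none strategy), which is the comparison a decision curve plots across the 10% to 40% range.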