Abstract
BACKGROUND: Primary Sjögren's syndrome (pSS) is an autoimmune inflammatory disorder that may affect the lungs, leading to interstitial lung disease (ILD). However, recognition of progression from pSS to ILD is frequently delayed because interdisciplinary diagnostic criteria are not standardized and reliable shared biomarkers are lacking. This diagnostic challenge, compounded by significant pathophysiological divergence between the target organs, has hindered elucidation of their comorbidity mechanisms. This study employs integrated bioinformatics to identify shared biomarkers in pSS and ILD, decipher their pathogenic mechanisms, and predict targeted therapeutics via network pharmacology.

METHODS: We retrieved gene expression profiles of pSS and ILD from the Gene Expression Omnibus (GEO) database. Differentially expressed gene (DEG) analysis was performed on these profiles, and the resulting DEGs were further screened using four machine learning algorithms. Concurrently, weighted gene co-expression network analysis (WGCNA) was applied to identify gene modules, and enrichment analysis of the WGCNA-derived genes was conducted to explore their biological functions. Genes obtained from the WGCNA and machine learning approaches were then intersected to identify candidate biomarkers for pSS-ILD. The diagnostic potential of these candidate genes was evaluated in both discovery and validation sets using receiver operating characteristic (ROC) curves. Finally, we performed immune cell infiltration analysis of the candidate genes, constructed regulatory networks of transcription factor (TF)-gene and miRNA-gene interactions, predicted drug targets, and carried out molecular docking coupled with molecular dynamics simulations for the predicted drugs.

RESULTS: Differential expression analysis identified 25 genes shared between the pSS and ILD gene expression profiles, and the machine learning algorithms refined these DEGs to six key genes.
WGCNA revealed 39 intersecting genes significantly enriched in biological processes including cell division, oocyte maturation, and metabolic regulation. Intersecting the machine learning and WGCNA results yielded two hub genes (CYSLTR1 and SIGLEC10), both demonstrating robust diagnostic value in the discovery and validation cohorts. Immune cell infiltration profiling showed increased levels of activated CD4+ memory T cells and memory B cells and decreased levels of resting NK cells. Regulatory network analysis indicated FOXC1, hsa-miR-27a-3p, hsa-miR-195-5p, and hsa-miR-26a-5p as potential co-regulators of CYSLTR1 and SIGLEC10 expression. Finally, ten candidate drug compounds targeting the hub genes were prioritized, exemplified by Rev-5901 (CTD 00002161), zafirlukast (BOSS database), and montelukast (CTD 00003205). Molecular docking demonstrated substantial binding affinity of both montelukast and zafirlukast for CYSLTR1, and molecular dynamics simulations further supported the stability of their complexes.

CONCLUSION: CYSLTR1 and SIGLEC10 show diagnostic potential for pSS-ILD. Their mechanism of action likely involves synergistic upregulation of memory B cells, which promotes disease progression. Furthermore, we identified montelukast as a potential therapeutic agent, a finding that may help improve clinical outcomes for pSS-ILD patients.
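
The gene-intersection and ROC steps summarized above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names `intersect_genes` and `roc_auc` are hypothetical, the gene symbols other than CYSLTR1 and SIGLEC10 are placeholders, and the expression values are invented toy numbers rather than study data. The AUC is computed through the Mann-Whitney formulation, which is one standard way to obtain the area under a ROC curve.

```python
def intersect_genes(ml_genes, wgcna_genes):
    """Return the genes selected by both approaches, sorted for stable output."""
    return sorted(set(ml_genes) & set(wgcna_genes))


def roc_auc(case_values, control_values):
    """Area under the ROC curve via the Mann-Whitney statistic.

    Equals the probability that a randomly chosen case has a higher value
    than a randomly chosen control, counting ties as one half.
    """
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Placeholder gene lists: only CYSLTR1 and SIGLEC10 are taken from the abstract;
# GENE_A..GENE_Y are hypothetical stand-ins for the other selected genes.
ml_genes = ["CYSLTR1", "SIGLEC10", "GENE_A", "GENE_B", "GENE_C", "GENE_D"]
wgcna_genes = ["CYSLTR1", "SIGLEC10", "GENE_X", "GENE_Y"]
hub_genes = intersect_genes(ml_genes, wgcna_genes)

# Toy expression values for one candidate gene (assumed, not study data).
toy_cases = [7.1, 6.8, 7.4, 6.9]
toy_controls = [6.0, 6.8, 5.9, 6.2]
auc = roc_auc(toy_cases, toy_controls)
```

In practice the same AUC would be computed separately in the discovery and validation sets, so that a gene is retained only if it discriminates well in both.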