Abstract
BACKGROUND: Non-small cell lung cancer (NSCLC) is a leading cause of cancer-related mortality worldwide, which makes understanding its risk factors and developing effective treatment strategies an urgent research priority. Air pollutants have emerged as potential environmental risk factors, and in-depth exploration is needed to elucidate their impact on NSCLC pathogenesis. METHODS: This study combines transcriptomic data analysis, machine learning, and molecular docking simulations to assess the association between NSCLC and seven air pollutants: carbon monoxide (CO), nitric oxide (NO), nitrogen dioxide (NO₂), sulfur dioxide (SO₂), benzo[a]anthracene (BaA), benzo[a]pyrene (BaP), and 3-methylcholanthrene (3-MC). We identified 30 gene targets associated with air pollutants in NSCLC, indicating significant molecular alterations. Pathway enrichment analysis was then performed to identify crucial pathways implicated in tumorigenesis, with particular emphasis on the cell cycle and p53 signaling pathways. RESULTS: Using machine learning, seven core genes (CKS1B, GAPDH, TYMS, AURKA, CCNE1, PARP1, and MGLL) were identified as promising diagnostic markers, achieving an area under the curve (AUC) value exceeding 0.95 during validation. Additionally, molecular docking revealed strong binding interactions between the proteins encoded by these core genes and selected air pollutants, and molecular dynamics simulations confirmed the stability of these interactions. CONCLUSIONS: Our findings suggest a significant association between air pollutants and the development of NSCLC, and they propose potential biomarkers for improved diagnostic accuracy as well as potential therapeutic targets. Future research should prioritize clinical validation of these findings and the investigation of targeted therapies that account for environmental risk factors, with the aim of improving NSCLC management and patient outcomes.