Abstract
BACKGROUND: Benzo[a]pyrene (BaP) is a pervasive environmental carcinogen present in PM2.5, tobacco smoke, and vehicular emissions. Although its toxic potential is recognized, its molecular mechanisms in lung carcinogenesis remain unclear. This study aimed to identify the core targets of BaP-induced lung cancer and to elucidate the underlying pathological processes through an integrated computational approach.
METHODS: The toxicity of BaP was assessed with ProTox 3.0 and ADMETlab 3.0. Potential targets of BaP were identified by comparing chemical toxicology data from the Comparative Toxicogenomics Database (CTD) with lung adenocarcinoma genomic data from The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO). Machine learning (ML) algorithms were applied to screen for signature genes, complemented by immune infiltration analysis. Molecular docking simulations were conducted to evaluate the binding capacity between BaP and the identified protein targets.
RESULTS: Machine learning and immune infiltration analysis identified five core genes (KDR, NQO1, MMP12, MMP13, PLAU) associated with BaP-induced oncogenesis, particularly through angiogenesis and extracellular matrix remodeling. Molecular docking supported direct interactions between BaP and these targets, with docking scores below -5 kcal/mol indicating strong binding affinity.
CONCLUSIONS: This study revealed a multi-target mechanism by which BaP promotes lung cancer development, primarily involving angiogenesis and extracellular matrix reorganization. These findings provide novel insights into BaP’s carcinogenic mechanisms and offer potential targets for interventions in environmental chemical-induced lung cancer.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s40360-025-01064-1.