Abstract
BACKGROUND: Resistance to tyrosine kinase inhibitors (TKIs) poses a significant challenge in the targeted therapy of lung adenocarcinoma (LUAD), highlighting the need to identify key molecular markers associated with both drug resistance and prognosis to guide precision treatment. This study aimed to elucidate the molecular mechanisms underlying TKI resistance in LUAD, identify core differentially expressed genes (DEGs), clarify the relationships of gene clusters with patient survival and drug response, and construct and validate a prognostic risk model for LUAD, thereby providing a foundation for precision therapy and prognostic assessment. METHODS: Multiple LUAD-related datasets, including GSE162045 and GSE114647, were integrated. Core overlapping DEGs were identified using Venn diagrams, and a gene correlation network was constructed. Consensus clustering was applied to group samples, and t-SNE dimensionality reduction was used to visually assess the stability and distinctness of the clusters. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis and Gene Set Enrichment Analysis (GSEA) were performed to explore the functional enrichment of DEGs. The half-maximal inhibitory concentration (IC50) values of 12 drugs were compared across clusters to evaluate differences in drug sensitivity. Prognosis-related core genes were selected via LASSO regression to construct a risk model, whose performance was subsequently validated in the GSE31210 cohort using Sankey diagrams, Kaplan-Meier survival curves, and receiver operating characteristic (ROC) curves. Expression differences of core genes among clusters and between risk groups were analyzed, and Kaplan-Meier curves were plotted to assess the association between individual gene expression and survival.
The expression of PLEK2 in LUAD tissues was analyzed in multiple datasets (GSE19804, GSE19188, GSE44077, GSE30219), and its protein level in epidermal growth factor receptor (EGFR)-TKI-resistant LUAD cell lines was measured by Western blot. RESULTS: Twelve core DEGs (e.g., HMGA1, PLEK2) were identified. With the cluster number (K) set to 2, samples were stably divided into Cluster A and Cluster B. Ten core genes were expressed at significantly different levels between the two clusters (P<0.0001), and patients in Cluster A had significantly better overall survival (OS), disease-free survival (DFS), and progression-free survival (PFS) than those in Cluster B. The mutation profiles of high-frequency genes such as TP53, KRAS, and EGFR differed notably between the clusters. KEGG enrichment analysis revealed that the DEGs were primarily enriched in pathways such as "Cell cycle" and "Neuroactive ligand-receptor interaction". GSEA indicated significant associations with gene sets related to malignant tumor progression. Drug sensitivity analysis demonstrated significant differences in IC50 values for 10 drugs between the two clusters. A risk model based on 9 genes was constructed. Patients in the high-risk group had a higher proportion of deaths and a significantly lower survival probability (P<0.0001). The area under the curve (AUC) values for the model at 1, 3, and 5 years were 0.700, 0.647, and 0.675, respectively. Validation in the GSE31210 cohort confirmed the model's stability and generalizability. The expression of core genes differed significantly between risk groups (P<0.0001). High expression of HMGA1 and PLEK2 was associated with poor prognosis, whereas the expression of ID3 and DAPK2 showed no significant association with prognosis.
Univariate Cox regression incorporating clinical variables and the LASSO risk score showed that the risk score was significantly associated with OS (HR=0.49, P=3.80×10⁻⁶). After multivariate adjustment, the risk score remained an independent prognostic factor (HR=0.57, P=6.40×10⁻⁴). Analysis of public datasets and Western blot experiments confirmed that PLEK2 expression was upregulated in LUAD tissues and further elevated in EGFR-TKI-resistant cell lines. CONCLUSIONS: The risk model constructed in this study effectively predicts the prognosis of LUAD patients. PLEK2 is highly expressed in LUAD and is associated with EGFR-TKI resistance, suggesting its potential as a prognostic biomarker and therapeutic target.
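
As an illustration of how a LASSO-derived risk model of this kind is typically applied, the following minimal Python sketch computes a linear risk score per sample and splits samples into high- and low-risk groups. It is a sketch under stated assumptions, not the authors' implementation: the coefficient values, the three-gene subset, the toy expression values, the median cutoff, and the function names `risk_score` and `assign_risk_groups` are all hypothetical, since the abstract reports neither the nine model genes nor their coefficients nor the cutoff used.

```python
from statistics import median

# Hypothetical coefficients for illustration only; the abstract does not
# report the fitted LASSO coefficients or the full list of nine model genes.
COEFFICIENTS = {"HMGA1": 0.21, "PLEK2": 0.15, "DAPK2": -0.08}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Return the linear predictor: sum of coefficient * expression per gene."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def assign_risk_groups(samples, coefficients=COEFFICIENTS):
    """Score every sample and label it 'high' or 'low' risk.

    The cutoff is the median score across samples (an assumed convention;
    the abstract does not state which cutoff was used).
    """
    scores = {sid: risk_score(expr, coefficients) for sid, expr in samples.items()}
    cutoff = median(scores.values())
    groups = {sid: ("high" if s > cutoff else "low") for sid, s in scores.items()}
    return groups, scores


# Toy expression values (made up) for four samples.
samples = {
    "S1": {"HMGA1": 5.0, "PLEK2": 4.0, "DAPK2": 1.0},
    "S2": {"HMGA1": 1.0, "PLEK2": 1.0, "DAPK2": 5.0},
    "S3": {"HMGA1": 3.0, "PLEK2": 2.0, "DAPK2": 3.0},
    "S4": {"HMGA1": 6.0, "PLEK2": 5.0, "DAPK2": 0.5},
}

groups, scores = assign_risk_groups(samples)
for sid in samples:
    print(sid, round(scores[sid], 2), groups[sid])
```

In practice the resulting group labels would then feed the Kaplan-Meier comparison and the time-dependent ROC analysis described above, using survival times and event indicators that this toy example omits.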