Abstract
Gastric cancer (GC) is a gastrointestinal malignancy with high morbidity and mortality, and sensitive, specific biomarkers for its detection are urgently needed. Protein palmitoylation, a reversible lipid modification, has been linked to tumorigenesis, yet its role in GC remains insufficiently explored. This study combined palmitoylation-associated features with machine learning methods and used the SHapley Additive exPlanations (SHAP) framework to improve model interpretability. Gene expression profiling datasets were collected from public repositories and corrected for batch effects. Differentially expressed genes (DEGs) were identified and subjected to functional enrichment analysis. By intersecting the DEGs with a palmitoylation gene set and integrating LASSO regression, SVM-RFE, and random forest algorithms, four core genes (ASPA, RBM20, COL4A1, and MAL) were selected. Ten machine learning models were built, among which the Gradient Boosting Machine (GBM) model performed best (AUC = 0.963). SHAP analysis showed that the four core genes contributed notably to model classification. The study also explored gene expression characteristics, immune cell correlations, and spatial heterogeneity. Limitations include the lack of in vivo animal model validation, unclear mechanisms of interaction between the core genes and immune cells, and insufficient sample diversity. Overall, this research provides new insights into GC pathogenesis and suggests directions for future studies on diagnosis and treatment.
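
The gene-selection step summarized above can be illustrated with a minimal Python sketch. It assumes that "integration" of LASSO regression, SVM-RFE, and random forest means keeping only the genes retained by all three methods, which the abstract does not state explicitly. The function name consensus_features and the identifiers G1 to G9 are hypothetical placeholders, not the study's code or gene lists.

```python
from functools import reduce


def consensus_features(degs, gene_set, selector_outputs):
    """Return genes that are DEGs, belong to gene_set, and are kept by every selector.

    Assumption: the selectors are combined by strict intersection.
    """
    # Step 1: candidate genes are DEGs that also appear in the curated gene set.
    candidates = set(degs) & set(gene_set)
    # Step 2: restrict each selector's output to the candidate genes.
    kept = [set(selected) & candidates for selected in selector_outputs]
    if not kept:
        # No selector output supplied: fall back to the candidate genes.
        return sorted(candidates)
    # Step 3: keep only the genes that every selector retained.
    return sorted(reduce(set.intersection, kept))


# Placeholder identifiers only; these are NOT the study's gene lists.
degs = ["G1", "G2", "G3", "G4", "G5", "G6"]
palmitoylation_related = ["G2", "G3", "G4", "G5", "G9"]
selected = {
    "lasso": ["G2", "G3", "G4"],
    "svm_rfe": ["G3", "G4", "G5"],
    "random_forest": ["G2", "G3", "G4", "G5"],
}

core = consensus_features(degs, palmitoylation_related, selected.values())
print(core)  # ['G3', 'G4']
```

A strict intersection is the most conservative way to combine selectors. A majority-vote rule (genes kept by at least two of the three methods) would be a looser alternative under the same inputs.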
