Abstract
BACKGROUND: Gastric gastrointestinal stromal tumors (gastric GISTs, GG) are clinically important mesenchymal tumors, yet no biomarker has been identified for their detection. We first observed mucosal atrophy surrounding GG tumors, leading to the hypothesis that localized atrophy may alter serum pepsinogen (PG) levels. We therefore developed a machine learning (ML) model incorporating serum PG levels and clinical features to predict GG and differentiate it from gastric cancer (GC). METHODS: We retrospectively analyzed GG and GC patients whose serum PG levels were measured before any medical intervention. Seven ML algorithms were assessed, and feature importance was determined using SHapley Additive exPlanations (SHAP). Gastric atrophy was assessed histologically using the updated Sydney System. RESULTS: After screening 562 GG and 1090 GC patients, 100 GG and 174 GC samples were included. Among the seven algorithms, the multilayer perceptron (MLP) model achieved the highest AUC. The final MLP model, which included four features (gender, PGI level, PGI/PGII ratio, and carcinoembryonic antigen [CEA]), predicted GG with an AUC of 0.854. Considering clinical practice and the feature importance identified by the final MLP model, we established a Positive-Gastric-GIST-PG-CEA criterion (PGI < 70 ng/mL, PGI/PGII ratio ≥ 3.0, and CEA ≤ 5 μg/L), with cutoff values taken from the ROC curve. The Positive-Gastric-GIST-PG-CEA criterion performed well in predicting GG (AUC = 0.772, accuracy = 0.748, specificity = 0.787, sensitivity = 0.680), and its performance did not differ significantly from that of the final MLP model (ΔAUC = 0.082, p > 0.05). Based on SHAP analysis, the contributions of PGI level, PGI/PGII ratio, and CEA to the performance of the Positive-Gastric-GIST-PG-CEA model were 0.33, 0.15, and 0.13, respectively. Histopathological evaluation of gastric mucosal atrophy in 50 GG patients revealed peri-tumoral glandular atrophy in 29 cases (58%). CONCLUSIONS: The Positive-Gastric-GIST-PG-CEA criterion is valuable for detecting GG and distinguishing it from GC.
Integrating this criterion into existing PG testing could aid GG detection at no additional cost.
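For illustration only, the three-threshold criterion stated above can be written as a simple rule. The sketch below is not the authors' implementation: the function names `is_positive_gastric_gist_pg_cea` and `accuracy_from_rates` are hypothetical, the sample serum values are invented for the demonstration, and the rule assumes that all three conditions must hold simultaneously, as the "and" in the criterion suggests. The second helper only checks that the reported accuracy is arithmetically consistent with the reported sensitivity, specificity, and cohort sizes (100 GG, 174 GC).

```python
def is_positive_gastric_gist_pg_cea(pgi_ng_ml: float,
                                    pgii_ng_ml: float,
                                    cea_ug_l: float) -> bool:
    """Apply the thresholds quoted in the abstract.

    Assumption: a patient is 'positive' only when all three conditions
    hold at once (PGI < 70 ng/mL, PGI/PGII >= 3.0, CEA <= 5 ug/L).
    """
    if pgii_ng_ml <= 0:
        raise ValueError("PGII must be positive to form the PGI/PGII ratio")
    ratio = pgi_ng_ml / pgii_ng_ml
    return pgi_ng_ml < 70.0 and ratio >= 3.0 and cea_ug_l <= 5.0


def accuracy_from_rates(sensitivity: float, specificity: float,
                        n_positive: int, n_negative: int) -> float:
    """Overall accuracy implied by sensitivity, specificity, and class sizes."""
    correct = sensitivity * n_positive + specificity * n_negative
    return correct / (n_positive + n_negative)


if __name__ == "__main__":
    # Hypothetical serum values, chosen only to exercise the rule.
    print(is_positive_gastric_gist_pg_cea(55.0, 11.0, 2.1))
    # Consistency check using only figures reported in the abstract.
    print(round(accuracy_from_rates(0.680, 0.787, 100, 174), 3))
```

With the reported sensitivity (0.680), specificity (0.787), and cohort sizes, the implied accuracy rounds to 0.748, which agrees with the value given in the results.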