Integrated bioinformatic and machine learning analysis identifies MCM7 and ADAM17 as potential biomarkers for early stage gastric cancer



Abstract

BACKGROUND: Early detection of gastric cancer is crucial for improving prognosis, yet current diagnostic biomarkers remain insufficient for identifying early gastric cancer (EGC, stage I-II). Although previous studies have proposed molecular markers, few have validated them systematically across multiple cohorts, and their diagnostic accuracy and immune relevance remain unclear. This study aimed to identify and validate potential early diagnostic biomarkers for EGC using an integrated bioinformatic and machine learning framework.

METHODS: Transcriptome data from four Gene Expression Omnibus (GEO) datasets, comprising 434 tumor and 100 normal samples, were integrated. Only stage I-II gastric cancer samples, defined by pathological criteria according to the American Joint Committee on Cancer Tumor-Node-Metastasis (AJCC TNM) staging system, were included; advanced-stage cases were excluded to ensure a homogeneous early-stage cohort. Normal gastric tissues obtained from non-tumor regions of gastrectomy specimens served as controls. Differentially expressed genes (DEGs) were identified using the limma algorithm. Three machine-learning methods, namely least absolute shrinkage and selection operator (LASSO) regression, support vector machine recursive feature elimination (SVM-RFE), and random forest (RF), were applied to screen feature genes. A diagnostic support vector machine (SVM) model was constructed based on the overlapping DEGs. External validation was conducted using The Cancer Genome Atlas - Stomach Adenocarcinoma (TCGA-STAD) and Human Protein Atlas (HPA) datasets. Functional enrichment and CIBERSORT immune infiltration analyses were performed to explore potential mechanisms.

RESULTS: A total of 101 DEGs were identified, and four feature genes (MCM7, ADAM17, DPT, and KIT) were selected by all three machine-learning algorithms. The SVM diagnostic model showed excellent performance [area under the curve (AUC) = 0.998, sensitivity = 96.5%, specificity = 95.2%]. Among the four feature genes, MCM7 and ADAM17 were significantly overexpressed in tumor tissues and associated with a poor prognosis (P < 0.05, AUC > 0.85). SHapley Additive exPlanations (SHAP) analysis revealed that these two genes contributed most to the model's predictions. The functional analysis showed that MCM7 was enriched in DNA replication and cell cycle pathways, while ADAM17 was involved in inflammatory and tumor-related signaling. The immune infiltration analysis indicated that both genes were significantly associated with various immune cell subpopulations, suggesting a potential role in modulating the tumor immune microenvironment.

CONCLUSIONS: This study identified MCM7 and ADAM17 as potential biomarkers for EGC through integrated multi-cohort bioinformatic analysis. Further experimental and clinical studies are required to validate their diagnostic specificity and applicability in real-world settings.
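Two steps in the abstract are simple enough to illustrate directly: keeping only the genes that every selector agrees on, and summarizing a classifier with AUC, sensitivity, and specificity. The Python sketch below is a minimal illustration under stated assumptions, not the authors' pipeline. The function names, the placeholder genes (GENE_A, GENE_B, GENE_C), the toy labels and scores, and the 0.5 threshold are all hypothetical; only the four overlapping gene names come from the abstract. The actual LASSO, SVM-RFE, and RF fitting is not reproduced here.

```python
from typing import Dict, List, Sequence, Set, Tuple


def consensus_features(selections: Dict[str, Set[str]]) -> Set[str]:
    """Return the genes chosen by every selector (plain set intersection)."""
    if not selections:
        return set()
    return set.intersection(*selections.values())


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (tumor, normal) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney U formulation).
    Labels: 1 = tumor, 0 = normal.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= threshold is called tumor."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one sample of each class")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical selector outputs: GENE_A/B/C are placeholders, not real results.
selections: Dict[str, Set[str]] = {
    "lasso": {"MCM7", "ADAM17", "DPT", "KIT", "GENE_A"},
    "svm_rfe": {"MCM7", "ADAM17", "DPT", "KIT", "GENE_B"},
    "random_forest": {"MCM7", "ADAM17", "DPT", "KIT", "GENE_A", "GENE_C"},
}
consensus: List[str] = sorted(consensus_features(selections))

# Toy decision scores for 4 tumor (1) and 4 normal (0) samples.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)

print("consensus genes:", consensus)
print(f"AUC={auc:.4f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

On the toy data the intersection keeps only the four genes shared by all three selectors, and the metrics describe the made-up scores rather than anything reported in the study. In practice the score vector would come from the fitted SVM's decision function on held-out samples, and the threshold would be chosen on training data rather than fixed at 0.5.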
