Integrating machine learning and genetic evidence to uncover novel gene biomarkers for colorectal cancer diagnosis



Abstract

From 2020 to 2022, colorectal cancer (CRC) cases increased, making it the third most common cancer and the second leading cause of cancer-related deaths worldwide. Early detection remains a significant challenge because reliable diagnostic biomarkers are lacking. This study aimed to develop a robust gene-based diagnostic model for CRC using publicly available databases such as GEO and GEPIA2. The approach integrated differential expression analysis, weighted gene co-expression network analysis (WGCNA), and 113 machine learning combinations derived from 12 algorithms. The best-performing model was then validated on independent datasets and characterized further with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses, protein-protein interaction (PPI) networks, receiver operating characteristic (ROC) curves, and assessments of immune infiltration and tumor-node-metastasis (TNM) staging. Notably, the glmBoost + RF combination yielded an eight-gene diagnostic model with high precision, pinpointing key genes such as CLDN1, IFITM1, and FOXQ1, which exhibited strong diagnostic performance (AUC > 0.9). Furthermore, Mendelian randomization (MR) analysis suggested that IFITM1 may be a causal gene for CRC; it also showed significant associations with immune cell profiles and has established roles in immune regulation and tumor progression. Collectively, these findings highlight IFITM1, SCGN, and FOXQ1 as promising early diagnostic biomarkers and therapeutic targets for CRC, laying a foundation for future research on early detection and intervention strategies in colorectal cancer management.
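
To make the "AUC > 0.9" criterion concrete, the following is a minimal sketch of how a single gene's diagnostic performance could be scored from expression values. It is not the authors' pipeline: the function name `roc_auc`, the gene labels `gene_A` and `gene_B`, and all expression values are hypothetical and chosen only for illustration. The AUC is computed as the fraction of tumor/normal sample pairs in which the tumor sample has the higher expression, with ties counted as one half.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample
    (label 1) scores higher than a randomly chosen negative sample
    (label 0); ties contribute 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = tumor, 0 = normal (not taken from the study).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_expression = {
    "gene_A": [8.1, 7.9, 9.2, 6.5, 5.0, 6.0, 4.8, 5.5],  # cleanly separated
    "gene_B": [5.0, 6.1, 4.9, 5.8, 5.2, 6.0, 5.1, 5.7],  # overlapping
}

auc_by_gene = {gene: roc_auc(values, labels) for gene, values in toy_expression.items()}

# Keep only genes that clear the 0.9 threshold mentioned in the abstract.
candidates = [gene for gene, auc in auc_by_gene.items() if auc > 0.9]
print(auc_by_gene, candidates)
```

A gene that is down-regulated in tumors would score below 0.5 under this definition, so in practice one might compare `max(auc, 1 - auc)` against the threshold instead.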
