Bioinformatics combined with machine learning for the identification of malignant transformation markers in colorectal polyps

Abstract

BACKGROUND: Colorectal polyps are key precancerous lesions of colorectal cancer (CRC), but their origin and the mechanisms by which they progress to cancer remain incompletely understood, which limits early prevention and control of CRC. This study aimed to screen core genes regulating colorectal tumorigenesis and to construct a reliable diagnostic model for CRC.

METHODS: The GSE209741 dataset was first analyzed with the edgeR package and weighted gene co-expression network analysis (WGCNA) to identify differentially expressed genes (DEGs) and module genes, and functional enrichment analysis was then performed to reveal core biological pathways and functions. By integrating the GSE161277 single-cell RNA sequencing dataset, 57 epithelial cell-specific regulatory molecules were screened. In the TCGA-COADREAD cohort, feature genes were selected by combining the Boruta algorithm, LASSO regression, and an XGBoost model. A ridge regression diagnostic model was then built from six core genes (EIF2S3, GTF3A, HMGA1, HSP90AB1, PABPC1, S100A11), and its performance was verified in an internal validation set and in the external independent cohort GSE41258. In addition, the UALCAN database was used to validate the protein expression levels of the core genes in tumor tissues, survival analysis was performed to explore their association with CRC prognosis, and qRT-PCR was used to verify differences in mRNA expression of the six core genes between CRC cell lines (SW480, HCT116) and the normal colorectal epithelial cell line NCM460.

RESULTS: The diagnostic model showed excellent diagnostic performance in both the internal and external datasets. The UALCAN database confirmed that protein expression of all six genes was significantly upregulated in CRC tissues. Survival analysis showed that high expression of EIF2S3 and S100A11 was associated with poor prognosis in CRC patients. qRT-PCR further verified that mRNA expression of the six core genes was significantly elevated in CRC cell lines.

CONCLUSION: This study identified six key genes regulating colorectal tumorigenesis and constructed a high-performance diagnostic model. These findings provide novel insights into the molecular mechanisms underlying the initiation and progression of CRC, and offer potential biomarkers and therapeutic targets for the clinical diagnosis and treatment of CRC.
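
To make the final modelling step more concrete, the following is a minimal sketch of a ridge-type diagnostic classifier on the six core genes. It is not the authors' code. It assumes the "ridge regression diagnostic model" is an L2-penalized logistic regression, which the abstract does not state, and it runs on synthetic expression values generated in the script rather than on TCGA-COADREAD or GSE41258 data. The penalty strength `lam`, the learning rate, the sample sizes, and all function names are hypothetical choices made for illustration.

```python
import math
import random

# The six core genes named in the abstract; all values below are synthetic.
GENES = ["EIF2S3", "GTF3A", "HMGA1", "HSP90AB1", "PABPC1", "S100A11"]


def simulate(n, shift, rng):
    """Return n synthetic samples; tumor samples (label 1) are shifted upward."""
    X, y = [], []
    for i in range(n):
        label = i % 2  # alternate normal (0) and tumor (1)
        X.append([rng.gauss(shift * label, 1.0) for _ in GENES])
        y.append(label)
    return X, y


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_ridge_logistic(X, y, lam=0.1, lr=0.1, epochs=500):
    """Gradient descent on mean log-loss + (lam / 2) * ||w||^2."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        # The L2 penalty shrinks the gene weights; the intercept is unpenalized.
        w = [wj - lr * (gj / n + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Predicted probability of the tumor class for each sample."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


def auc(y, scores):
    """Probability that a random positive outscores a random negative."""
    pos = [s for s, t in zip(scores, y) if t == 1]
    neg = [s for s, t in zip(scores, y) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
X_train, y_train = simulate(200, 1.0, rng)  # stands in for a training split
X_test, y_test = simulate(100, 1.0, rng)    # stands in for a held-out cohort
w, b = fit_ridge_logistic(X_train, y_train)
test_auc = auc(y_test, predict(X_test, w, b))
print(dict(zip(GENES, (round(v, 3) for v in w))), round(test_auc, 3))
```

In practice the penalty strength would be tuned by cross-validation, and the held-out AUC would be computed on real cohorts. The numbers this script prints describe only the simulated data and say nothing about the performance reported in the study.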
