Abstract
Background The efficacy of cetuximab in the treatment of colorectal cancer (CRC) varies considerably between individuals. However, effective models to predict treatment outcomes are still lacking in clinical practice. Methods Two datasets (GSE106582 and GSE83889) were used to identify differentially expressed genes (DEGs) in CRC with the 'Limma' package in R. Hypoxia-related genes were retrieved from the Molecular Signatures Database and cross-referenced with the CRC DEGs. Protein expression levels were verified using immunohistochemistry (IHC) data from the Human Protein Atlas (HPA), and prognostic significance was assessed with the Kaplan-Meier plotter platform. Pathway and immune infiltration analyses were performed using the GSCA platform. A prediction model for cetuximab treatment response was constructed using the K-nearest neighbors (KNN) algorithm on the GSE108277 dataset, with feature selection performed by the permutation importance method. Results Comparison of cancer tissues with normal controls in GSE106582 and GSE83889 identified 417 overlapping DEGs, including 16 hypoxia-related genes. Six genes (BGN, DDIT4, MIF, SLC2A1, STC2, and TGFBI) were upregulated, and 10 genes (CA12, CITED2, MT1E, MT2A, NEDD4L, PCK1, PLAC8, PPARGC1A, SELENBP1, and SRPX) were downregulated in CRC. Survival analysis revealed that the 16 hypoxia-related DEGs were associated with the survival outcomes of CRC patients. Pathway analysis indicated that these genes were mainly involved in the EMT, cell cycle, and RTK pathways. Furthermore, these genes were associated with immune cell infiltration and may regulate the immune microenvironment. A prediction model for cetuximab response was developed based on 10 key genes (CA12, DDIT4, MIF, MT2A, NEDD4L, PLAC8, SELENBP1, SLC2A1, SRPX, and TGFBI) using data from GSE108277.
The model performed well, with an accuracy of 0.9500, precision of 0.8378, recall of 1.0000, F1-score of 0.9118, and an area under the receiver operating characteristic curve (ROC-AUC) of 0.9663. Conclusion Our study identifies 10 hypoxia-related DEGs as key players in CRC progression and cetuximab response. We also developed a model that predicts the response of CRC patients to cetuximab treatment. These findings may provide valuable biomarkers for CRC prognosis and help guide more effective therapeutic strategies.
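The modelling step named in Methods (a KNN classifier with permutation-importance feature ranking) can be illustrated with a minimal from-scratch sketch. It is not the authors' implementation: the abstract does not state which library, value of k, or train/test split was used, so k = 5, the split sizes, and all helper names below are assumptions made for illustration. The data are synthetic stand-ins, not GSE108277 expression values. The final line only checks that the reported F1-score follows from the reported precision and recall.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=5):
    """Fraction of held-out samples whose KNN prediction matches the label."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y))
    return hits / len(test_y)


def permutation_importance(train_X, train_y, test_X, test_y, k=5, n_repeats=10, seed=0):
    """Mean drop in held-out accuracy after shuffling one feature column at a time."""
    rng = random.Random(seed)
    baseline = accuracy(train_X, train_y, test_X, test_y, k)
    importances = []
    for j in range(len(test_X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in test_X]
            rng.shuffle(column)  # break the link between feature j and the label
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(test_X, column)]
            drops.append(baseline - accuracy(train_X, train_y, shuffled, test_y, k))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def make_synthetic(n, rng):
    """SYNTHETIC data (assumption): feature 0 separates the classes, features 1 and 2 are noise."""
    X, y = [], []
    for i in range(n):
        label = i % 2  # 1 = hypothetical responder, 0 = hypothetical non-responder
        X.append([rng.gauss(4.0 * label, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])
        y.append(label)
    return X, y


rng = random.Random(42)
train_X, train_y = make_synthetic(80, rng)
test_X, test_y = make_synthetic(40, rng)
baseline, importances = permutation_importance(train_X, train_y, test_X, test_y)
print("held-out accuracy:", baseline)
print("permutation importances:", importances)
print("F1 from reported precision and recall:", round(f1_score(0.8378, 1.0), 4))
```

In this sketch the informative feature should receive a much larger importance than the two noise features, which is the property a permutation-based selection step relies on. The F1 value recomputed from the rounded precision (0.8378) and recall (1.0000) agrees with the reported 0.9118 to within rounding.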