Interpretable machine learning-based survival prediction and key gene identification in cancer using gene expression and clinical data

Abstract

BACKGROUND: Gastrointestinal cancer is a common malignancy with high incidence and poor prognosis. Accurate prognostic prediction can improve the treatment of cancer patients, but the clinical features currently in use provide insufficient information. This study aimed to establish an efficient survival prediction model for gastrointestinal cancer based on gene expression and clinical data.

METHODS: Using the gastrointestinal cancer samples in The Cancer Genome Atlas, we built survival prediction models that take gene expression profiling data as input molecular features. A series of bioinformatics methods was applied to comprehensively analyze the identified gastrointestinal cancer-related genes, and the molecular mechanisms by which the newly identified genes mediate cancer occurrence were preliminarily explored.

RESULTS: The random forest-based model (I) achieved an accuracy of 94.98% with a Matthews correlation coefficient (MCC) of 0.8995. The support vector machine-based model (II) achieved an accuracy of 94.98% with an MCC of 0.9000. Survival differed significantly between the two subtypes (S1 and S2, with 3-year survival rates of ≥75% and ≤45%, respectively), and these subtypes have independent predictive value for patient survival. The models constructed in this study are inherently interpretable. Twenty key genes related to gastrointestinal cancer were identified. The comprehensive functional analysis provides important clues to the potential mechanisms by which the selected cancer-related genes act in tumor initiation and progression. Most importantly, drug target prediction for these genes identified potential targeted drugs for seven of them (NR3C1, HNF4A, DNAAF9, CDX2, ATP2B4, RBMS3, LIFR).

CONCLUSIONS: The findings of this study have significant implications for survival prediction and treatment decisions in gastrointestinal cancer.
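
The results above report each model's accuracy alongside its Matthews correlation coefficient (MCC). As a minimal sketch of how these two metrics relate for a two-class (S1 versus S2) classifier, the snippet below computes both from a confusion matrix. The function names and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not the study's data or code.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) through 0 (chance level)
    to +1 (perfect prediction). Returns 0.0 when the denominator is
    zero, a common convention for degenerate confusion matrices.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical counts (assumption, not taken from the study):
# 100 samples per class, 5 errors in each direction.
tp, tn, fp, fn = 95, 95, 5, 5
print(f"accuracy = {accuracy(tp, tn, fp, fn):.4f}")
print(f"MCC      = {mcc(tp, tn, fp, fn):.4f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the two classes are of unequal size.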
