Interpretable machine learning driven biomarker identification and validation for prostate cancer



Abstract

BACKGROUND: Prostate cancer (PCa), a common malignancy among men globally, requires biomarkers for early diagnosis and for predicting progression. This study aimed to identify the key genes involved in the occurrence and development of PCa.

METHODS: Using data from the Gene Expression Omnibus (GEO) database, this study integrated multi-chip datasets and performed differential expression analysis and enrichment analysis to pinpoint PCa-related genes. Machine learning models were then constructed using least absolute shrinkage and selection operator (LASSO) regression, support vector machine (SVM), and random forest (RF) methods. The optimal model was selected for further study, and the contribution of the related genes was explained using SHapley Additive exPlanations (SHAP) analysis. Gene set enrichment analysis (GSEA) and immune cell infiltration analysis were used to explore the underlying molecular mechanisms.

RESULTS: A total of 222 differentially expressed genes (DEGs) were identified and found to be enriched in functions and pathways potentially associated with PCa. Using multiple machine learning models, eight PCa-related core genes (TRPM4, EDN3, EFCAB4A, FAM83B, PENK, NUDT10, KRT14, and CXCL13) were identified. The RF model, which was the most accurate, was selected for further study with SHAP analysis, which revealed the contribution of each of these genes. GSEA and immune cell infiltration analysis uncovered distinctions between PCa and normal tissues.

CONCLUSIONS: This study offered potential biomarkers and a theoretical basis for the diagnosis and treatment of PCa.
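The abstract does not say how the SHAP values were computed, so the following is only a minimal sketch of the idea behind SHAP: the Shapley value of a feature is its average marginal contribution to a model's output over all coalitions of the other features. Everything here is an assumption made for illustration: `toy_model` is a hypothetical three-feature score (not the authors' RF model), the inputs are made-up numbers rather than expression values, and absent features are replaced by a fixed baseline. The brute-force enumeration is only practical for a handful of features; the SHAP library uses far more efficient algorithms for tree models.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, sample, baseline):
    """Exact Shapley values for one sample.

    Features outside a coalition take their value from `baseline`
    (an assumption of this sketch, not something stated in the abstract).
    """
    n = len(sample)
    features = list(range(n))
    values = [0.0] * n

    def coalition_output(coalition):
        # Build an input where only coalition members keep their real value.
        x = [sample[i] if i in coalition else baseline[i] for i in features]
        return predict(x)

    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                gain = coalition_output(s | {i}) - coalition_output(s)
                values[i] += weight * gain
    return values


def toy_model(x):
    # Hypothetical score: two linear terms plus one interaction term.
    return 2.0 * x[0] - 1.0 * x[1] + 0.5 * x[0] * x[2]


sample = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, sample, baseline)
print(phi)
```

In this toy case the interaction term is split evenly between the two features it involves, and the attributions sum to the difference between the model output for the sample and for the baseline. That additivity is what makes per-gene contributions in a SHAP plot comparable with one another.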
