Serum Untargeted Metabolomics Integrated with SHAP-Based Machine Learning for Multiclass Stratification of Prostate Cancer, Prostatitis, and Benign Prostatic Hyperplasia

Abstract

BACKGROUND: Prostate cancer, benign prostatic hyperplasia, and prostatitis overlap substantially in clinical symptoms and biological characteristics, which hampers early, non-invasive differential diagnosis. Untargeted metabolomics enables comprehensive profiling of disease-associated metabolic alterations; however, its high dimensionality and strong correlations among features challenge conventional statistical approaches.

METHODS: To address these challenges, we analyzed serum untargeted LC-MS data following standardized preprocessing. We used a nested cross-validation strategy to evaluate a range of feature selection methods and machine learning classifiers, and found multiclass LASSO regression to be the most effective feature selection approach.

RESULTS: An optimized Random Forest model showed the strongest performance in distinguishing among prostate cancer, prostatitis, benign prostatic hyperplasia, and healthy controls (out-of-fold accuracy: 93.8%; macro-F1: 0.937). Additionally, SHAP (SHapley Additive exPlanations) analysis translated statistical feature importance into biologically meaningful modules, revealing that distinct, disease-specific patterns of metabolic reprogramming drove the model's multiclass discrimination.

CONCLUSIONS: This study demonstrates the value of integrating serum untargeted metabolomics with explainable machine learning for multiclass differentiation of major prostate diseases, providing a promising non-invasive framework for diagnostic stratification and metabolic biomarker discovery.
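
The workflow described in METHODS and RESULTS (LASSO-type feature selection feeding a Random Forest, evaluated by nested cross-validation and summarized as out-of-fold accuracy and macro-F1) can be illustrated with a minimal scikit-learn sketch. Everything specific here is an assumption for illustration, not the authors' implementation: the data are synthetic (`make_classification` stands in for a preprocessed LC-MS feature table with four classes), the LASSO step is approximated by an L1-penalized multinomial logistic regression inside `SelectFromModel`, and the fold counts, hyperparameter grid, and random seeds are arbitrary. The SHAP step is omitted because it needs the separate `shap` package.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_predict
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in (assumption) for a samples x metabolite-features table
# with four classes: three prostate conditions plus healthy controls.
X, y = make_classification(
    n_samples=160, n_features=60, n_informative=10, n_redundant=10,
    n_classes=4, n_clusters_per_class=1, class_sep=2.0, random_state=0,
)

# Scaling, L1-based feature selection, and the classifier live in one pipeline,
# so every step is refit on training folds only (no information leakage).
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectFromModel(
        LogisticRegression(penalty="l1", solver="saga", max_iter=5000, random_state=0),
        threshold=1e-5,  # keep features whose coefficients are not shrunk to zero
    )),
    ("clf", RandomForestClassifier(n_estimators=100, random_state=0)),
])

# Hypothetical tuning grid: L1 strength (via C) and Random Forest tree depth.
param_grid = {
    "select__estimator__C": [0.5, 5.0],
    "clf__max_depth": [None, 5],
}

inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=1)  # tuning
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=2)  # evaluation

# Inner loop: hyperparameter search. Outer loop: each sample is predicted
# exactly once by a model that never saw it during selection or tuning.
search = GridSearchCV(pipeline, param_grid, cv=inner_cv, scoring="f1_macro")
oof_pred = cross_val_predict(search, X, y, cv=outer_cv)

acc = accuracy_score(y, oof_pred)
macro_f1 = f1_score(y, oof_pred, average="macro")
print(f"out-of-fold accuracy: {acc:.3f}, macro-F1: {macro_f1:.3f}")
```

Pooling the outer-fold predictions before scoring is what "out-of-fold" refers to here; macro-F1 averages the per-class F1 scores with equal weight, so a small class counts as much as a large one.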
