Machine learning approaches for predicting progression in hormone-sensitive prostate cancer patients



Abstract

OBJECTIVE: Almost all hormone-sensitive prostate cancer (HSPC) cases eventually progress to castration-resistant prostate cancer (CRPC) following androgen deprivation therapy (ADT). This study aims to develop a machine learning (ML) model to predict progression in HSPC patients. We also performed statistical analyses on the dataset to identify significant features and clinical markers that predict the transition from HSPC to CRPC.
METHODS: Data from 410 HSPC patients treated at Yunnan Cancer Hospital between 1 January 2017 and 31 May 2022 were analyzed. Predictive analyses were performed on a set of features recorded at the patients' initial visits. The ML methods employed were decision tree (DT), random forest (RF), XGBoost, artificial neural network (ANN), and support vector machine (SVM). Feature selection was conducted using a genetic algorithm (GA). The models were trained on 80% of the data and evaluated on the remaining 20%, held out as a test set. Model performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), with calibration plots and learning curves used to assess calibration and fit. Evaluation metrics included accuracy (ACC), precision (PRE), specificity (SPE), sensitivity (SEN), and F1 score.
RESULTS: Evaluation metrics were visualized with confusion matrices and ROC curves. Ensemble learning methods, particularly RF and XGBoost, demonstrated the best model performance. RF achieved a score of 0.838 (95% CI: 0.8324-0.902) on the training dataset and 0.817 (95% CI: 0.659-0.829) on the test dataset (AUC: 0.873, 95% CI: 0.730-0.878). XGBoost achieved a score of 0.814 (95% CI: 0.790-0.878) on the training dataset and 0.805 (95% CI: 0.707-0.829) on the test dataset (AUC: 0.866, 95% CI: 0.780-0.871). Calibration curves indicated good model calibration, and learning curves suggested no significant overfitting on either the training or the test set.
CONCLUSION: Our findings demonstrate that ensemble learning methods, particularly RF, exhibit superior performance in predicting HSPC progression. This study represents a preliminary step toward a predictive tool, highlighting the potential of baseline clinical data for risk stratification. Future prospective studies with larger, multi-center cohorts are warranted to validate and refine this approach for possible clinical integration.
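
The evaluation metrics named in the abstract (ACC, PRE, SPE, SEN, F1, and AUC) all follow from a binary confusion matrix and a set of predicted scores. The sketch below illustrates the standard definitions only; it is not the authors' code. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration and are not taken from the study.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, FP, TN, FN for binary labels (1 = progressed, 0 = did not)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, sensitivity, specificity, and F1 score."""
    c = confusion_counts(y_true, y_pred)
    total = c["tp"] + c["fp"] + c["tn"] + c["fn"]
    precision = _ratio(c["tp"], c["tp"] + c["fp"])
    sensitivity = _ratio(c["tp"], c["tp"] + c["fn"])  # also called recall
    specificity = _ratio(c["tn"], c["tn"] + c["fp"])
    return {
        "accuracy": _ratio(c["tp"] + c["tn"], total),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher than
    a random negative case, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, invented for illustration; not from the study cohort.
    labels = [1, 1, 1, 0, 0, 0, 1, 0]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
    predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold
    print(binary_metrics(labels, predictions))
    print("AUC:", roc_auc(labels, scores))
```

The pairwise AUC computation is quadratic in the number of cases, which is acceptable for a 20% test split of a cohort of this size; a library routine would be the usual choice for larger datasets.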
