An interpretable machine learning model for predicting brain metastasis in breast cancer



Abstract

BACKGROUND: Breast cancer is the most common malignancy worldwide, and brain metastasis severely worsens its prognosis. This study aimed to develop a machine learning model that predicts the risk of brain metastasis in breast cancer patients to assist clinical management.

METHODS: Univariate and multivariate logistic regression analyses were used to select the final variables, and eight machine learning algorithms were used for model construction. The model was trained on a cohort of 154,193 patients, internally validated on 66,084 patients, and externally validated on 765 real-world cases. Performance was evaluated using receiver operating characteristic curves (area under the curve, AUC), precision-recall curves (area under the precision-recall curve, AUPRC), decision curve analysis (DCA), and calibration curves, and the optimal model was selected based on these metrics. SHAP analysis was applied to enhance interpretability, and a web-based calculator was developed from the optimal model to facilitate clinical application.

RESULTS: Univariate logistic regression identified higher tumor grade, advanced T/N stage, advanced clinical stage, and PR positivity as risk factors, whereas radiotherapy, chemotherapy, surgery, HR+/HER2- subtype, and unilateral tumors were protective factors (P < 0.001). Multivariate analysis confirmed independent risk factors, including poorer pathological grade, N3 lymph node status, later stage, and PR positivity, and protective factors, including radiotherapy, chemotherapy, surgery, non-HR-/HER2- subtypes, and HER2 positivity. The XGBoost model achieved an AUC of 0.98 in 10-fold cross-validation, with AUCs of 0.99 and 0.97 in the internal test set and external validation set, respectively; AUPRC values were 0.933, 0.864, and 0.648. Decision curve analysis demonstrated superior net benefit compared to alternative models within the 0.1-0.8 threshold range, and calibration curves showed high concordance between predicted and observed event rates. SHAP analysis highlighted surgery as the primary protective factor, followed by stage and T classification as risk enhancers, and revealed interactions among treatment variables.

CONCLUSION: This study developed an interpretable and clinically deployable XGBoost model, accompanied by a web-based calculator, thereby advancing personalized risk stratification, early screening, and resource optimization in the management of breast cancer brain metastasis.
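The net benefit reported by decision curve analysis can be illustrated with a minimal sketch. It uses the standard DCA definition, net benefit = TP/n - (FP/n) × pt/(1 - pt), where pt is the threshold probability. The function names and the toy labels and probabilities below are hypothetical and are not taken from the study; the abstract does not describe how the authors computed their decision curves.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening on patients whose predicted risk >= threshold.

    Standard decision curve analysis definition:
        TP/n - (FP/n) * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # Count true and false positives at this risk threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: intervene on every patient regardless of risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * (threshold / (1.0 - threshold))


# Hypothetical toy data (1 = brain metastasis, 0 = none); not study data.
y_true = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.6, 0.2, 0.1, 0.4, 0.3, 0.05, 0.15, 0.25]

# Sweep a few thresholds inside the 0.1-0.8 range mentioned in the abstract.
for pt in (0.1, 0.3, 0.5, 0.8):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = net_benefit_treat_all(y_true, pt)
    print(f"pt={pt:.1f}  model={model_nb:+.3f}  treat-all={all_nb:+.3f}")
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero by definition.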
