Construction and validation of a risk prediction model for complications in patients with acute leukemia based on machine learning

Authors: Xu, Rong; Tian, Hong; Zhao, Sufang; Gu, Shulan

Journal:	Scientific Reports	Impact factor:	3.900
Year:	2025	Citation:	2025 Nov 19;15(1):40787
DOI:	10.1038/s41598-025-24478-7	Disease type:	Leukemia

Abstract

Early-phase severe complications remain a major cause of morbidity and mortality during induction chemotherapy for acute leukaemia. Existing risk scores capture only limited prognostic variance and are rarely well-calibrated for clinical decision support. This study aimed to develop and externally validate a machine-learning model that accurately predicts severe complications after induction, and to assess its clinical utility across key patient sub-groups.

We retrospectively assembled electronic-health-record data from three tertiary haematology centres (2013-2024). After exclusion of duplicates and predefined ineligible cases, 2,870 adults with newly diagnosed AML or ALL were analysed (derivation = 2,009; external validation = 861). Forty-two candidate predictors spanning demographics, comorbidity indices, baseline laboratory values, disease biology and treatment logistics were retained after multiple imputation, Winsorised z-scaling and correlation filtering. Five supervised algorithms (Elastic-Net, Random Forest, XGBoost, LightGBM and a multilayer perceptron) were trained using nested 5-fold cross-validation. Discrimination, calibration, decision-curve net benefit and SHAP-based interpretability were evaluated according to TRIPOD-AI and PROBAST-AI recommendations.

LightGBM achieved the highest mean AUROC in derivation (0.824 ± 0.008) and maintained robust performance in external validation (AUROC = 0.801, 95% CI 0.774-0.827; AUPRC = 0.628). Calibration was excellent (slope = 0.97; intercept = -0.03; Hosmer-Lemeshow p = 0.41). Decision-curve analysis showed superior net benefit over "treat-all," "treat-none," and a four-variable logistic benchmark across risk thresholds of 5-40%, potentially enabling targeted interventions for 14 additional high-risk patients per 100 at a 20% threshold, though clinical benefit requires prospective validation. AUROC remained ≥ 0.80 in the AML subgroup, in older adults and at all three centres. CRP, absolute neutrophil count, cytogenetic-risk tier, age and ferritin were the top predictors, with interpretable monotonic SHAP effects.

A rigorously validated LightGBM model provides well-calibrated, interpretable prediction of early severe complications after induction therapy for acute leukaemia and offers a foundation for risk-adapted supportive care strategies. Prospective implementation studies are warranted to confirm real-world clinical impact.
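
The decision-curve comparison mentioned in the abstract rests on the standard net-benefit formula, NB(pt) = TP/N - (FP/N) × pt/(1 - pt), where pt is the risk threshold at which a patient would be treated. The following is a minimal sketch of that calculation, not the authors' code: the function names and the ten-patient toy data are illustrative assumptions and have no connection to the study cohort.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of treating every patient whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # Count true and false positives among patients flagged as high risk.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # Odds of the threshold: the weight given to one false positive
    # relative to one true positive.
    weight = threshold / (1.0 - threshold)
    return tp / n - weight * fp / n


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Net benefit of the 'treat-all' strategy (everyone is flagged)."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Synthetic toy data (assumption, for illustration only):
# 1 = severe complication occurred, 0 = it did not.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.90, 0.60, 0.15, 0.30, 0.10, 0.10, 0.05, 0.05, 0.02, 0.02]

if __name__ == "__main__":
    # 'Treat-none' always has a net benefit of 0, so it is the implicit baseline.
    for pt in (0.05, 0.10, 0.20, 0.40):
        model_nb = net_benefit(y_true, y_prob, pt)
        all_nb = net_benefit_treat_all(y_true, pt)
        print(f"threshold={pt:.2f}  model={model_nb:.3f}  treat-all={all_nb:.3f}")
```

A model is useful at a given threshold only if its net benefit exceeds both the treat-all value and zero (treat-none). Plotting these quantities across a range of thresholds gives the decision curve.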
