Abstract
BACKGROUND: Heart failure (HF) is a severe and common complication of type 2 diabetes mellitus (T2DM), associated with increased morbidity and mortality. Although the biomarker NT-proBNP, at a cut-off value of 125 pg/mL, has demonstrated satisfactory discriminatory power for predicting HF risk in T2DM patients, its measurement remains inaccessible in most primary healthcare settings in China. This study aimed to develop and externally validate a machine learning-based nomogram for predicting the risk of elevated NT-proBNP (≥125 pg/mL) as a surrogate for HF risk in patients with T2DM.
METHODS: We retrospectively enrolled 564 T2DM patients as the development cohort and 302 from two external centers as the validation cohort. After feature selection via least absolute shrinkage and selection operator regression, five machine learning models were constructed and evaluated using 10-fold cross-validation. The optimal model was presented as a static nomogram and further deployed as an online web application for clinical use.
RESULTS: Six key predictors were identified: estimated glomerular filtration rate, age, serum albumin, hemoglobin, urine albumin-to-creatinine ratio, and the binary indicator of age ≥ 65 years. Interpretability analysis using SHapley Additive exPlanations revealed estimated glomerular filtration rate as the most influential feature. The final machine learning-based nomogram achieved AUCs of 0.806 (95% CI: 0.767-0.845) in training and 0.861 (95% CI: 0.813-0.908) in external validation, with good calibration and clinical utility. Furthermore, the nomogram scores showed a significant positive correlation with established TRS-HF(DM) risk strata, supporting its clinical relevance.
CONCLUSION: We developed and validated an interpretable machine learning-based nomogram that effectively predicts the risk of elevated NT-proBNP in T2DM patients using six routine clinical variables. This tool demonstrates robust performance and generalizability, offering a practical and accessible solution for HF risk stratification in resource-limited primary care settings in China.
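
For readers less familiar with the discrimination metric reported above, the following minimal Python sketch shows one standard way to read an AUC: the probability that a randomly chosen patient with elevated NT-proBNP receives a higher predicted risk than a randomly chosen patient without it, with ties counted as one half. It is an illustration only and not the study's analysis code; the function name `auc` and the toy scores and labels are assumptions made up for this example and are not study data.

```python
from itertools import product


def auc(scores, labels):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    scores: predicted risks, higher meaning more likely positive.
    labels: 1 for a positive case (e.g. elevated NT-proBNP), 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy values for illustration only (not taken from the study).
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
print(auc(toy_scores, toy_labels))
```

In this toy example, eight of the nine positive/negative pairs are ranked correctly, so the function returns 8/9. Read the same way, an AUC of 0.861 means that roughly 86% of such pairs are ranked correctly by the model.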