Abstract
INTRODUCTION: Blood donors may be deferred because of low hemoglobin (Hb) levels, which poses a key challenge for blood management. Most deferral prediction studies are based on European data and their accuracy leaves room for improvement, while research on Chinese donors remains limited. OBJECTIVE: To evaluate the applicability of existing low-Hb deferral prediction methods to Chinese data, and to introduce new machine learning models and hyperparameter tuning methods to optimize the prediction scheme. METHODS: This study used 26,796 whole blood donation records from Hangzhou (January 2023 to October 2025) to build machine learning classification models. Seven algorithms, including Logistic Regression and LightGBM, were evaluated, using SMOTE to correct class imbalance and Hyperband for hyperparameter tuning. Logistic regression and SHAP analysis were then applied to identify the key donor characteristics influencing prediction performance. RESULTS: At a fixed specificity of 90%, the LightGBM model achieved the highest accuracy in predicting deferral due to low Hb. The donor's historical mean Hb test value contributed most to the model's predictive performance, followed by gender, occupation, and donation type (individual versus group donation). In addition, ethnic group had a significant impact on the prediction of deferral in the Hangzhou area of China. CONCLUSIONS: Historical pre-donation test data combined with donor personal information enable reliable prediction of low-Hb deferral. The feature importance results based on Hangzhou data were consistent with previous studies, suggesting shared mechanisms and cross-regional model transferability. Moreover, hyperparameter optimization further enhances model performance.
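As an illustration of how models can be compared at a fixed specificity of 90%, the sketch below picks the lowest score threshold at which at least 90% of non-deferred donors are classified correctly and reports the sensitivity at that threshold. This is a minimal sketch and not the study's code: the function name `sensitivity_at_specificity`, the label encoding (1 = deferred for low Hb), and the toy scores are assumptions made for illustration only.

```python
def sensitivity_at_specificity(y_true, scores, target_specificity=0.90):
    """Return (threshold, sensitivity, specificity) for the lowest threshold
    whose specificity is at least target_specificity.

    Assumed encoding (hypothetical, not taken from the study):
      y_true: 1 = deferred for low Hb, 0 = not deferred
      scores: predicted deferral probability; a donor is flagged
              as a likely deferral when score >= threshold
    """
    negatives = [s for s, y in zip(scores, y_true) if y == 0]
    positives = [s for s, y in zip(scores, y_true) if y == 1]
    if not negatives or not positives:
        raise ValueError("need at least one donor in each class")

    # Candidate thresholds in ascending order. The final candidate (infinity)
    # flags nobody, so specificity reaches 1.0 and the loop always returns.
    candidates = sorted(set(scores)) + [float("inf")]
    for threshold in candidates:
        # Specificity: share of non-deferred donors scored below the threshold.
        specificity = sum(s < threshold for s in negatives) / len(negatives)
        if specificity >= target_specificity:
            # Sensitivity: share of deferred donors scored at or above it.
            sensitivity = sum(s >= threshold for s in positives) / len(positives)
            return threshold, sensitivity, specificity


# Toy data, invented purely for illustration (10 non-deferred, 4 deferred donors).
y_true = [0] * 10 + [1] * 4
scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.80,
          0.30, 0.60, 0.70, 0.90]
print(sensitivity_at_specificity(y_true, scores))
```

Because specificity can only rise and sensitivity can only fall as the threshold increases, the first threshold that meets the target is also the one with the highest sensitivity, which makes sensitivity at that point a fair basis for ranking models.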