Abstract
The 28-day compressive strength of cement is a key indicator of cement quality. To overcome the time delay inherent in manual testing, this paper proposes a fusion prediction method for 28-day cement strength, TF-XGBoost, which combines a Transformer feature extractor with an XGBoost meta-learner. The method first encodes the multi-source physicochemical variables through the Transformer embedding layer, then computes attention scores with the multi-head attention mechanism to allocate feature weights dynamically. XGBoost's gradient-boosted tree structure and regularization are then employed to improve the robustness of the strength prediction model in small-sample scenarios. Finally, the method was validated on real-world 28-day strength test data from cement plants. The results indicate that, compared with the model without feature extraction, Transformer feature extraction increased the regression model's R2 by 5.62% and reduced its RMSE by 22.33%. Furthermore, among the candidate small-sample meta-learners, XGBoost achieved the highest average R2 of 0.93 in 5-fold cross-validation (CV), and its training efficiency, robustness to noise, and handling of missing features outperformed the alternatives. Compared with other methods, TF-XGBoost achieved the highest average R2, 0.94, over 25 Monte Carlo (MC) CV runs, providing the best fit. The proposed method demonstrates higher accuracy, better generalization, and greater stability, offering a new approach to predicting the 28-day strength of cement from small samples.
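The pipeline summarized above (embed each physicochemical variable, weight it with multi-head self-attention, then pass the attended features to a tree-based meta-learner) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the random weight matrices stand in for trained Transformer parameters, the dimensions and pooling are assumed, and NumPy is used in place of a deep-learning framework.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, d_model, n_heads, d_k = 6, 8, 2, 4   # assumed toy dimensions

# Stand-ins for learned parameters (trained end-to-end in the actual method)
W_embed = rng.normal(size=(n_features, d_model))      # per-variable embedding
W_q = rng.normal(size=(n_heads, d_model, d_k))        # query projections
W_k = rng.normal(size=(n_heads, d_model, d_k))        # key projections
W_v = rng.normal(size=(n_heads, d_model, d_k))        # value projections

def softmax(s):
    e = np.exp(s - s.max(axis=-1, keepdims=True))     # numerically stable
    return e / e.sum(axis=-1, keepdims=True)

def transformer_features(x):
    """Map raw variables x (n_features,) to attended features (n_heads*d_k,)."""
    tokens = x[:, None] * W_embed                     # one token per variable
    heads = []
    for h in range(n_heads):
        Q, K, V = tokens @ W_q[h], tokens @ W_k[h], tokens @ W_v[h]
        A = softmax(Q @ K.T / np.sqrt(d_k))           # attention scores
        heads.append((A @ V).mean(axis=0))            # pool over tokens
    return np.concatenate(heads)

X = rng.normal(size=(10, n_features))                 # toy cement samples
Z = np.stack([transformer_features(x) for x in X])    # features for meta-learner
print(Z.shape)                                        # (10, 8)
```

The matrix `Z` would then be fed, together with the measured 28-day strengths, to a gradient-boosted regressor such as `xgboost.XGBRegressor`, whose regularization terms (e.g. `reg_lambda`) help in the small-sample setting described above.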