Abstract
OBJECTIVE: Among breast cancer patients who achieve pathologic node-negative status (ypN0) after neoadjuvant chemotherapy (NAC), the subset that benefits from postoperative radiotherapy (PORT) through a reduced risk of recurrence remains poorly defined. This study aimed to develop and validate an interpretable machine learning (ML) model that stratifies recurrence risk among ypN0 patients, all of whom received PORT.
METHODS: We retrospectively analyzed 1450 breast cancer patients treated between January 2017 and January 2024. All patients received NAC, underwent radical surgery that confirmed ypN0 status, and subsequently received PORT. Based on follow-up outcomes after PORT, patients were classified as 'recurrence' or 'no-recurrence'. Feature selection on 20 initial clinicopathological variables was performed using LASSO regression, stepwise logistic regression, and the Boruta algorithm; the 16 features identified by at least two of the three methods were retained. Predictive models were constructed using ten ML algorithms, with hyperparameters optimized via particle swarm optimization. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), calibration curves, decision curve analysis (DCA), and standard metrics including accuracy, sensitivity, specificity, and F1-score. The optimal model was interpreted using the SHapley Additive exPlanations (SHAP) framework.
RESULTS: The Gradient Boosting Machine (GBM) model showed the best predictive performance among the candidate models, achieving an AUC of 0.945 (95% CI: 0.917-0.972) on the test set. SHAP analysis identified breast reconstruction, perineural invasion, and the chemotherapy-to-surgery interval as the three most influential predictors. These features exerted predominantly independent effects on recurrence risk.
CONCLUSION: We developed and validated a highly accurate and interpretable ML model to stratify recurrence risk in ypN0 breast cancer patients after PORT. As a risk-stratification tool within this uniformly treated cohort, it may help identify patients who remain at high risk of recurrence despite receiving PORT and who could be candidates for intensified surveillance or escalation of adjuvant therapy. Its role in personalizing radiotherapy decisions requires prospective validation in studies that include control groups not treated with PORT.
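
The consensus rule described in METHODS (retain a feature when at least two of the three selection methods identify it) can be illustrated with a minimal Python sketch. This is not the study's code: the function name `consensus_features`, the feature identifiers, and the per-method selections below are hypothetical and chosen only to show the voting logic.

```python
from collections import Counter


def consensus_features(selections, min_votes=2):
    """Return features chosen by at least `min_votes` methods, sorted by name.

    `selections` maps a method name to the collection of features it selected.
    """
    votes = Counter()
    for selected in selections.values():
        # Convert to a set so each method votes at most once per feature.
        votes.update(set(selected))
    return sorted(name for name, count in votes.items() if count >= min_votes)


# Hypothetical selections, for illustration only (not results from the study).
selections = {
    "lasso": {"perineural_invasion", "breast_reconstruction",
              "chemo_to_surgery_interval", "age"},
    "stepwise": {"perineural_invasion", "breast_reconstruction", "tumor_grade"},
    "boruta": {"perineural_invasion", "chemo_to_surgery_interval",
               "tumor_grade", "bmi"},
}

retained = consensus_features(selections)
print(retained)
```

In this toy input, features selected by only one method (`age`, `bmi`) are dropped. Raising `min_votes` to 3 would keep only features on which all three methods agree, a stricter rule than the one the abstract describes.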