Abstract
BACKGROUND: Neoadjuvant pembrolizumab combined with chemotherapy has become a standard treatment strategy for early-stage triple-negative breast cancer (TNBC). However, real-world evidence of its effectiveness remains limited, as do models for predicting pathological complete response (pCR). METHODS: We retrospectively collected data from 248 patients with TNBC who received neoadjuvant chemotherapy (NACT) with or without pembrolizumab, followed by surgery, between May 2023 and December 2025. Patients were divided into two groups: the pembrolizumab-chemotherapy group (n = 40) and the chemotherapy-alone group (n = 208). Propensity score matching (PSM) was applied to balance baseline characteristics. Machine learning models were developed to predict pCR from routinely available clinical and pathological variables, and model interpretability was evaluated using SHapley Additive exPlanations (SHAP). RESULTS: Both before and after PSM, the pembrolizumab-chemotherapy group achieved a higher pCR rate than the chemotherapy-alone group (before PSM: 50.0% vs. 30.8%, p = 0.030; after PSM: 47.2% vs. 26.5%, p = 0.037). Least absolute shrinkage and selection operator (LASSO) regression selected six variables associated with pCR: pembrolizumab use, age, sum of diameters of target lesions (SLD), Ki67, N stage, and clinical stage. In the training cohort, these six variables were used to develop eight machine learning models to predict pCR. In the validation cohort, the multilayer perceptron (MLP) model achieved the highest area under the receiver operating characteristic curve (ROC-AUC), at 0.71. Calibration curve analysis and decision curve analysis (DCA) further indicated good calibration and clinical utility of the MLP model. CONCLUSIONS: Neoadjuvant pembrolizumab combined with chemotherapy further improved pCR rates in TNBC compared with chemotherapy alone, and the MLP-based model demonstrated good performance for predicting pCR.
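
The unmatched comparison reported in RESULTS can be checked with a short calculation. The sketch below rests on two assumptions that the abstract does not state. First, the pCR counts are back-calculated from the reported percentages and group sizes (20 of 40 for 50.0%, 64 of 208 for 30.8%). Second, the test is assumed to be a chi-square test with Yates continuity correction; the abstract does not name the test used. The function name `yates_chi_square` is illustrative only. Under these assumptions the p-value should land close to the reported 0.030.

```python
import math


def yates_chi_square(a, b, c, d):
    """Yates-corrected chi-square test for a 2x2 table [[a, b], [c, d]].

    Returns (chi2, p_value). For 1 degree of freedom, the chi-square
    survival function equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    # Continuity correction: shrink |ad - bc| by n/2, never below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# ASSUMPTION: counts inferred from the reported percentages, not given in the text.
pembro_pcr, pembro_total = 20, 40    # 50.0% pCR
chemo_pcr, chemo_total = 64, 208     # 30.8% pCR

chi2, p_value = yates_chi_square(
    pembro_pcr, pembro_total - pembro_pcr,
    chemo_pcr, chemo_total - chemo_pcr,
)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

The post-PSM comparison (47.2% vs. 26.5%) cannot be checked the same way, because the matched group sizes are not given in the abstract.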