Abstract
BACKGROUND: Hormone receptor (HR)-positive, human epidermal growth factor receptor 2 (HER2)-negative breast cancer is the most prevalent breast cancer subtype among women but responds only modestly to neoadjuvant chemotherapy (NAC). Accurately predicting NAC efficacy and recurrence risk remains challenging because conventional clinical and molecular markers have limited predictive power. Advances in digital pathology and artificial intelligence now enable quantitative pathomics analysis, offering new opportunities for precise prediction and prognostic assessment.
METHODS: This retrospective study included 162 patients with HR-positive, HER2-negative breast cancer who were treated with NAC between 2014 and 2021. Hematoxylin and eosin (H&E)-stained pretreatment biopsy slides were digitized and analyzed using Vision Transformer (ViT) and Unified Network for Image (UNI) deep learning models to extract pathomic features. Thirteen clinical variables were collected. After least absolute shrinkage and selection operator (LASSO)-based feature selection, multiple machine learning models were developed for both NAC response prediction and prognostic evaluation of recurrence. Performance was evaluated using receiver operating characteristic (ROC) curves, area under the curve (AUC), sensitivity, specificity, confusion matrices, calibration curves, and decision curve analysis (DCA). SHapley Additive exPlanations (SHAP) was used to rank feature importance for each model.
RESULTS: For NAC response prediction, the CatBoost model achieved the best performance (AUC = 0.900 in training and 0.848 in validation) when clinical and pathomics-derived variables were combined. Key predictive factors included Ki-67 expression, age, histological grade, progesterone receptor (PR) status, and prominent pathomic features. Kaplan-Meier analysis showed no significant difference in recurrence or survival outcomes between groups in this cohort, whether patients were stratified by Miller-Payne (MP) grade or by pathological complete response (pCR) status. The recurrence models, built mainly on pathomic features, predicted 1-year recurrence with an AUC of 0.907 in training and 0.769 in validation.
CONCLUSIONS: Integrating pathomic features with clinical variables via machine learning enables robust pretreatment prediction of NAC efficacy and short-term recurrence in HR-positive, HER2-negative breast cancer. This approach could offer a clinically practical tool to optimize individualized therapy and improve patient management, highlighting the translational value of AI-powered digital pathology in breast cancer care.
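The discrimination metrics named in METHODS (AUC, sensitivity, specificity, and the confusion matrix) can be illustrated with a minimal, dependency-free sketch. The function names, the 0.5 threshold, and the toy labels and scores below are assumptions made purely for illustration; they are not the study's code or data.

```python
from typing import Sequence, Tuple


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive outscores a random negative.

    A tie between a positive and a negative counts as half a win
    (the Mann-Whitney U formulation of the AUC).
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def confusion_counts(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) after calling a case positive when score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Sensitivity = tp / (tp + fn); specificity = tn / (tn + fp)."""
    return tp / (tp + fn), tn / (tn + fp)


# Toy example (made-up values, NOT study data): 1 = responder, 0 = non-responder.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

auc = roc_auc(y_true, scores)
tp, fp, tn, fn = confusion_counts(y_true, scores, threshold=0.5)
sens, spec = sensitivity_specificity(tp, fp, tn, fn)
print(f"AUC={auc:.4f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

AUC is threshold-free, whereas sensitivity and specificity depend on the chosen cutoff, which is one reason both kinds of metric are typically reported together.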