Abstract
BACKGROUND: Accurate evaluation of human epidermal growth factor receptor 2 (HER2) expression status in breast cancer enables clinicians to develop individualized treatment plans and improve patient prognosis. The purpose of this study was to assess the performance of machine learning (ML) models developed using 18F-FDG PET/CT parameters and clinicopathological features in distinguishing different levels of HER2 expression in breast cancer. METHODS: This retrospective study enrolled breast cancer patients who underwent 18F-FDG PET/CT scans prior to treatment at Lianyungang First People's Hospital (Centre 1, n=157) and the Third Affiliated Hospital of Soochow University (Centre 2, n=84). Two classification tasks were analysed: distinguishing HER2-zero expression from HER2-low/positive expression (Task 1) and distinguishing HER2-low expression from HER2-positive expression (Task 2). For each task, patients from Centre 1 were randomly divided into training and internal test sets at a 7:3 ratio, whereas patients from Centre 2 served as an external test set. The prediction models were logistic regression (LR), support vector machine (SVM), extreme gradient boosting (XGBoost) and multilayer perceptron (MLP); SHapley Additive exPlanations (SHAP) analysis was used for model interpretability. Model performance was evaluated via the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV). RESULTS: The XGBoost models exhibited the best predictive performance in both tasks. For Task 1, recursive feature elimination (RFE) selected 8 features, none of them pathological, and the XGBoost model achieved AUCs of 0.888, 0.844 and 0.759 in the training, internal test and external test sets, respectively. The top three features according to the SHAP values were the minimum tumour diameter, mean standardized uptake value (SUVmean) and CTmean.
For Task 2, 9 features were selected, including the pathological feature progesterone receptor (PR) status. The XGBoost model achieved AUCs of 0.920, 0.814 and 0.693 in the training, internal test and external test sets, respectively. The top three features according to the SHAP values were PR status, maximum tumour diameter and metabolic tumour volume (MTV). CONCLUSIONS: ML models that incorporate 18F-FDG PET/CT parameters and clinicopathological features can aid in predicting HER2 expression status in breast cancer.
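
For readers less familiar with the evaluation metrics named above, the following is a minimal sketch of how AUC, accuracy, sensitivity, specificity, PPV and NPV can be computed for a binary task such as Task 1 or Task 2. It is an illustration only and is not the authors' code: the function names `binary_metrics` and `auc`, the 0/1 label coding and the toy labels and scores are all assumptions introduced here, and none of the numbers come from the study.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Confusion-matrix metrics for binary labels coded 0 (negative) / 1 (positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num: int, den: int) -> float:
        # Return NaN instead of raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
    toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
    toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 cut-off
    print(binary_metrics(toy_labels, toy_preds))
    print("AUC:", auc(toy_labels, toy_scores))
```

The pairwise AUC shown here is quadratic in the number of cases, which is acceptable for cohorts of this size; a rank-based implementation or a library routine would be the usual choice for larger datasets.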