Abstract
BACKGROUND: The treatments and therapies given to patients undergoing breast cancer surgery strongly influence treatment costs, so a comprehensive analysis of cost determinants is needed to optimize cost management strategies. METHODS: This retrospective study analyzed data from 19,094 eligible individuals in the SHA2011 inpatient sample (2016–2020) in China. Ensemble models (gradient boosting trees) were trained using 10-fold cross-validation. Permutation importance and partial dependence analyses were used to identify the key drivers of inflation-adjusted total hospitalization costs. The candidate variables covered patient characteristics, hospital attributes, treatments, and comorbidities. RESULTS: The most influential predictor of total costs was length of stay (LOS), with each additional day increasing predicted charges by approximately 145 USD (training-model RMSLE 0.474, 95% CI 0.466–0.483). Contrary to expectations, patients with a higher drug ratio (> 30%) had lower total costs. Hospital location, number of beds, and radiotherapy also emerged as important cost factors. In models excluding LOS, the strongest predictors were drug ratio, number of beds, general hospital admission, tumor surgery admission, and radiotherapy. CONCLUSION: This study clarifies how multiple factors interact to shape healthcare expenses in breast cancer surgery patients. These findings can inform targeted cost management strategies and resource allocation, and may help improve the quality and efficiency of breast cancer care. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12913-025-13814-2.
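For readers unfamiliar with the error metric reported above, the following is a minimal sketch of how a root mean squared logarithmic error (RMSLE) is commonly computed. It assumes the usual log(1 + x) definition, which the abstract does not confirm; the function name `rmsle` and the cost values are hypothetical and are not taken from the study.

```python
import math


def rmsle(actual, predicted):
    """Root mean squared logarithmic error, assuming the log(1 + x) form."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Squared difference of log-transformed values, one term per patient.
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2
        for a, p in zip(actual, predicted)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Hypothetical hospitalization costs in USD, for illustration only.
observed = [1200.0, 3400.0, 2100.0]
perfect = list(observed)            # predictions identical to observations
doubled = [2 * c for c in observed]  # every prediction twice the true cost

error_perfect = rmsle(observed, perfect)
error_doubled = rmsle(observed, doubled)
```

Because the metric works on log-transformed costs, it penalizes relative rather than absolute errors: doubling every prediction gives a value just under ln 2, regardless of the cost scale. Under this assumed definition, an RMSLE of 0.474 would correspond roughly to predictions that differ from observed costs by a factor of about 1.6 in a root-mean-square sense.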