Abstract
BACKGROUND: This study aimed to develop a machine learning model to predict seroma risk following prepectoral breast reconstruction.

METHODS: Machine learning models for predicting seroma formation were developed from a retrospective review of institutional data on 2-stage prepectoral breast reconstruction, using two approaches. Method 1 used all preoperative patient attributes and operative details, whereas method 2 used only the variables that were statistically significant on univariate logistic regression. Six algorithms were trained under both methods: logistic regression, naive Bayes, support vector machine, k-nearest neighbors, decision tree, and random forest.

RESULTS: Chart review identified 318 breasts that underwent prepectoral reconstruction, with a seroma rate of 25.58%. Univariate analysis found that body mass index, mastectomy specimen weight, hypertension, neoadjuvant chemotherapy, and skin-sparing mastectomy were positively associated with seroma. In method 1, the decision tree had the highest accuracy (0.81) and area under the receiver operating characteristic curve (0.81). Method 2 improved model performance: the random forest achieved the best results, with an accuracy of 0.81 and an area under the receiver operating characteristic curve of 0.83. A web application was then created using the random forest model to provide real-time seroma risk predictions.

CONCLUSIONS: Machine learning models offer a valuable tool for improving clinical decision-making by accurately predicting patient-specific seroma risk in breast reconstruction. Our models outperformed traditional methods in identifying high-risk patients, allowing for tailored surgical techniques and intensified follow-up care.
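The results above are reported as accuracy and area under the receiver operating characteristic curve (AUC). As an illustration of what these two metrics measure, the following is a minimal sketch in plain Python. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 classification threshold are all assumptions made for the example.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of cases whose predicted label matches the observed label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT from the study: 1 = seroma, 0 = no seroma.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
# Hypothetical predicted seroma probabilities from some fitted model.
y_score = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8, 0.3, 0.4]
# Assumed 0.5 cut-off for turning probabilities into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"AUC      = {roc_auc(y_true, y_score):.3f}")
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds. This is why two models can share the same accuracy and still differ in AUC.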