Abstract
BACKGROUND: Predicting in vitro fertilization (IVF) pregnancy outcomes is crucial for individualized decision-making. However, because multiple factors interact in complex ways, integrating and assessing them manually is difficult. This study aimed to develop an interpretable machine learning (ML) model for predicting the probability of clinical pregnancy after fresh and frozen-thawed embryo transfer.
METHODS: This retrospective study included infertile patients undergoing fresh or frozen-thawed embryo transfer between December 2023 and June 2025. Endometrial regions of interest (ROIs) were manually segmented on mid-sagittal uterine ultrasound images. The K-means clustering algorithm was used to partition each ROI into habitat subregions. Radiomic features were extracted from the entire ROI and from each subregion. After feature selection using Mann–Whitney U tests, Pearson correlation, and minimum redundancy maximum relevance (mRMR), 11 machine learning algorithms were trained on the selected features combined with independent clinical predictors. Model hyperparameters were optimized through grid search with five-fold stratified cross-validation. The best-performing classifier was selected to construct the final predictive model, which was then validated on an independent test set. Shapley Additive Explanations (SHAP) were used for model interpretability.
RESULTS: This study included 543 patients, randomly divided into a training cohort of 380 and a test cohort of 163. ROIs were subdivided into four habitat subregions. Imaging features were extracted from the entire ROI and from each subregion. After mRMR selection, 15 habitat-subregion features were incorporated into the model. Based on the area under the ROC curve (AUC), the ExtraTrees model showed the best predictive performance on the test set (AUC: 0.766; 95% CI: 0.689–0.830; accuracy: 0.699; sensitivity: 0.739; specificity: 0.653; F1 score: 0.726). SHAP analysis indicated that embryo type and higher-order texture features from specific subregions contributed substantially to model predictions.
CONCLUSION: A habitat-based radiomics machine learning model integrating endometrial ultrasound features and clinical data effectively predicted clinical pregnancy outcomes following embryo transfer. This approach offers a non-invasive, interpretable tool with potential to support personalized embryo transfer strategies in assisted reproduction.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13048-026-02039-4.
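As a rough illustration of two steps named in METHODS (K-means habitat partitioning and grid search with five-fold stratified cross-validation for an ExtraTrees classifier), the following is a minimal sketch using scikit-learn. It is not the authors' pipeline: the per-pixel descriptors (intensity plus a 3x3 local mean), the hyperparameter grid, and the function names `habitat_labels` and `tune_extra_trees` are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold


def habitat_labels(image, roi_mask, n_habitats=4, seed=0):
    """Cluster ROI pixels into habitat subregions (1..n_habitats); 0 is background.

    Hypothetical helper: the clustering features below are an assumption.
    """
    ys, xs = np.nonzero(roi_mask)
    # Assumed per-pixel descriptors: raw intensity and a 3x3 neighbourhood mean.
    padded = np.pad(image.astype(float), 1, mode="edge")
    local_mean = np.array(
        [padded[y:y + 3, x:x + 3].mean() for y, x in zip(ys, xs)]
    )
    feats = np.column_stack([image[ys, xs].astype(float), local_mean])
    km = KMeans(n_clusters=n_habitats, n_init=10, random_state=seed).fit(feats)
    labels = np.zeros(image.shape, dtype=int)
    labels[ys, xs] = km.labels_ + 1  # shift so that 0 stays reserved for background
    return labels


def tune_extra_trees(X, y, seed=0):
    """Grid search over an assumed, illustrative grid, scored by ROC AUC."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    grid = {"n_estimators": [100, 200], "max_depth": [3, None]}
    search = GridSearchCV(
        ExtraTreesClassifier(random_state=seed), grid, scoring="roc_auc", cv=cv
    )
    return search.fit(X, y)
```

In use, radiomic features would be extracted separately from each labelled subregion, filtered, and passed as `X` together with the clinical predictors; the fitted search exposes `best_estimator_` for evaluation on the held-out test cohort.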