Abstract
OBJECTIVES: To develop a prediction model based on multidimensional clinical indicators for identifying patients at high risk of fertilization failure in conventional in vitro fertilization (c-IVF) cycles, thereby optimizing therapeutic decision-making.
METHODS: This retrospective single-center study analyzed 691 cycles (594 c-IVF, 97 rescue ICSI) performed from January 2019 to August 2024. Key parameters included female age, BMI, male semen parameters (sperm concentration, total progressive motile sperm count [TPMC], DNA fragmentation index [DFI]), and infertility duration. Three machine learning models (logistic regression, random forest, XGBoost) were developed and validated using a nested cross-validation framework with SMOTE oversampling.
RESULTS: The logistic regression model demonstrated superior predictive performance (mean AUC = 0.734 ± 0.049), significantly outperforming random forest (0.714 ± 0.034) and XGBoost (0.697 ± 0.038). Significant predictors included two protective factors (male age: OR = 0.642, 95% CI: 0.598-0.689; TPMC: OR = 0.428, 95% CI: 0.392-0.466) and two risk factors (female BMI: OR = 1.268, 95% CI: 1.191-1.351; DFI: OR = 1.362, 95% CI: 1.274-1.455). The nomogram showed moderate-to-high discriminative power (C-index = 0.722, 95% CI: 0.667-0.773) upon internal validation. Decision curve analysis confirmed clinical utility at threshold probabilities between 0.05 and 0.60.
CONCLUSIONS: The logistic regression-based prediction model exhibits robust performance in assessing c-IVF fertilization failure risk. Because the model was optimized for our center's specific clinical context, external multicenter validation is required to confirm broader clinical applicability.
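The reported odds ratios can be read on the log-odds scale used by logistic regression, where a negative coefficient marks a protective factor and a positive one a risk factor. The following is a minimal sketch of that conversion, not the study's own code. It assumes the intervals are Wald-type 95% confidence intervals that are symmetric on the log scale, which the abstract does not state; the function name `log_odds_summary` is hypothetical. The abstract also does not give the unit or scaling of each predictor, so the coefficients are interpretable only per whatever unit the authors used.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_odds_summary(odds_ratio, ci_low, ci_high):
    """Return (beta, se) on the log-odds scale.

    Assumption: the interval is a Wald-type 95% CI, i.e.
    exp(beta - z * se) to exp(beta + z * se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return beta, se


# Odds ratios and 95% CIs exactly as reported in the abstract.
reported = {
    "male age": (0.642, 0.598, 0.689),
    "TPMC": (0.428, 0.392, 0.466),
    "female BMI": (1.268, 1.191, 1.351),
    "DFI": (1.362, 1.274, 1.455),
}

summary = {name: log_odds_summary(*vals) for name, vals in reported.items()}

for name, (beta, se) in summary.items():
    direction = "protective" if beta < 0 else "risk"
    print(f"{name}: beta = {beta:+.3f}, SE = {se:.3f} ({direction})")
```

Under these assumptions, the sign of each coefficient reproduces the protective/risk grouping given in the results, and the back-calculated standard errors indicate how tightly each effect was estimated.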