Abstract
BACKGROUND: Unscheduled return visits (URVs) to emergency departments (EDs) add to the healthcare burden by consuming resources and worsening ED overcrowding. Artificial intelligence (AI) methods show potential for URV prediction, but existing studies have evaluated few algorithms and reported only moderate performance, so a comprehensive comparison of AI architectures within a single cohort is needed. OBJECTIVE: This study evaluated the predictive performance of multiple AI models for 72-h ED URVs, aiming to identify optimal risk stratification strategies for improved discharge planning and targeted interventions. METHODS: This retrospective study analyzed adult internal medicine visits to the ED of a tertiary hospital. A URV was defined as an ED revisit occurring within 72 h of the initial ED discharge time. The dataset was partitioned into training (70%) and testing (30%) sets. Four traditional machine learning algorithms (logistic regression, support vector machine, random forest, and extreme gradient boosting) and one deep learning architecture (TabNet) were developed, with hyperparameters tuned by Bayesian optimization. Model performance was assessed in terms of discrimination, calibration, clinical utility, and confusion matrices. The best-performing model then underwent feature importance analysis, systematic ablation studies, sensitivity analyses, and subgroup fairness evaluation. RESULTS: Of 143,192 visits analyzed, 24,117 (16.8%) were classified as URVs. Data were allocated to training (n = 100,235) and testing (n = 42,957) sets with consistent URV proportions. TabNet showed the best discriminative performance, with an area under the receiver operating characteristic curve (AUROC) of 0.867 (95% CI: 0.854-0.880) and a sensitivity of 0.809 (95% CI: 0.801-0.816). Decision curve analysis showed sustained clinical utility across threshold probabilities of 10-30%.
Feature importance analysis identified initial diagnoses of digestive and respiratory system diseases, patient age, P3 triage classification, and ED visit frequency as the key predictors. Subgroup analysis showed consistent performance across patient demographics and clinical characteristics. CONCLUSION: TabNet outperformed the traditional machine learning approaches in predicting 72-h ED URVs, offering potential for improved risk stratification in emergency care settings.
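
ILLUSTRATION: The clinical utility reported above comes from decision curve analysis, which summarizes a model at each threshold probability p_t as a net benefit: TP/n - (FP/n) * p_t / (1 - p_t). The sketch below is a minimal illustration of that standard formula, not the study's code. The function names and the ten-patient toy data are hypothetical and chosen only so the arithmetic is easy to follow; they are not drawn from the study cohort.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of flagging patients whose predicted risk is >= threshold.

    Standard decision curve formula: TP/n - (FP/n) * pt / (1 - pt).
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    flagged = [(y, p) for y, p in zip(y_true, y_prob) if p >= threshold]
    true_pos = sum(1 for y, _ in flagged if y == 1)
    false_pos = sum(1 for y, _ in flagged if y == 0)
    # Odds of the threshold: how many false positives one true positive is worth.
    weight = threshold / (1.0 - threshold)
    return true_pos / n - (false_pos / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that flags every patient."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data (NOT study data): 1 = return visit within 72 h.
y_true = [1, 1, 0, 0, 0, 0, 1, 0, 0, 0]
y_prob = [0.80, 0.35, 0.25, 0.05, 0.12, 0.40, 0.15, 0.08, 0.02, 0.18]

# Thresholds spanning the 10-30% range discussed in the abstract.
for pt in (0.10, 0.20, 0.30):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = net_benefit_treat_all(y_true, pt)
    print(f"pt={pt:.2f}  model={model_nb:.3f}  treat-all={all_nb:.3f}")
```

A model has clinical utility at a given threshold when its net benefit exceeds both the flag-everyone strategy and the flag-no-one strategy, whose net benefit is zero by definition.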