Abstract
Background/Objectives: The development of anti-drug antibodies (ADA) significantly diminishes the clinical efficacy of infliximab (IFX) in Crohn's disease (CD). This study aimed to develop and validate an interpretable machine learning (ML) framework for predicting ADA risk during IFX induction therapy using multidimensional clinical and laboratory data. Methods: We retrospectively analyzed 606 CD patients who initiated IFX induction between January 2023 and August 2024 at the Sixth Affiliated Hospital of Sun Yat-sen University. Predictors were selected by univariate analysis and least absolute shrinkage and selection operator (LASSO) regression, and significant features were further evaluated by multivariate logistic regression. Seven ML models were developed and compared primarily on area under the curve (AUC), F1 score, and Brier score. Model interpretability was enhanced using SHapley Additive exPlanations (SHAP). Results: Of the 606 CD patients, 145 (23.93%) developed ADA during IFX induction. Independent predictors included serum trough levels of IFX (TLI), erythrocyte sedimentation rate (ESR), history of delayed treatment, prior exposure to anti-TNF agents, and concomitant use of immunosuppressants (IMM). XGBoost outperformed the other models, with an AUC of 0.899, accuracy of 0.851, F1 score of 0.640, and Brier score of 0.102 in validation. SHAP analysis identified TLI and ESR as the most influential predictors; history of delayed treatment and prior exposure to anti-TNF agents showed moderate impact, and concomitant use of IMM was associated with a protective effect. Conclusions: We developed an interpretable ML model that effectively predicts ADA formation in CD patients undergoing IFX induction therapy, facilitating early risk stratification and personalized treatment planning. This approach integrates advanced analytics with clinical practice to support precision medicine in CD management.
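
For readers less familiar with the three evaluation metrics named above, the following minimal Python sketch shows how each can be computed from predicted ADA probabilities. It is an illustration only and not the study's code: the toy labels and probabilities are invented, AUC is assumed here to mean the area under the ROC curve, and the 0.5 decision threshold used for the F1 score is likewise an assumption. The function names are hypothetical.

```python
# Illustrative sketch only: toy data, not results from the study.

def roc_auc(y_true, y_prob):
    """ROC AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_at_threshold(y_true, y_prob, threshold=0.5):
    """F1 score after binarizing probabilities at an assumed threshold."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 1)
    fp = sum(1 for y, yh in zip(y_true, y_pred) if y == 0 and yh == 1)
    fn = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome;
    lower values indicate better calibrated, more accurate probabilities."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical example: 1 = ADA-positive, 0 = ADA-negative.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.4, 0.6, 0.1, 0.7, 0.3, 0.5]

print(f"AUC   = {roc_auc(y_true, y_prob):.3f}")
print(f"F1    = {f1_at_threshold(y_true, y_prob):.3f}")
print(f"Brier = {brier_score(y_true, y_prob):.3f}")
```

AUC measures how well the model ranks ADA-positive above ADA-negative patients, the F1 score measures classification quality at one threshold in an imbalanced cohort, and the Brier score measures the accuracy of the predicted probabilities themselves, so the three together cover discrimination, classification, and calibration.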