Abstract
Inflammatory bowel disease (IBD), including Crohn's disease and ulcerative colitis, often shows variable responses to biological therapies. Identifying the variables that best predict response to these therapies could help prioritize data collection and preprocessing efforts. This study evaluated the performance of machine learning models in predicting remission and response to vedolizumab and ustekinumab in the treatment of IBD. The goal was not to compare the two therapies but to identify the variables most influential in predicting response to each treatment. Data from 227 IBD patients treated at Virgen Macarena University Hospital (2015-2022) were analyzed. Clinical, demographic, and laboratory variables were used to develop extreme gradient boosting (XGBoost) models to predict clinical response at 26 and 52 weeks and remission at 52 weeks. Model performance was evaluated using F1 score, accuracy, precision, and recall, with fairness analyses across sex and age groups. The models achieved F1 scores of 0.842 (response at 26 weeks), 0.869 (response at 52 weeks), and 0.649 (remission at 52 weeks). The most influential predictors included leukocyte count, fecal calprotectin (FCP), C-reactive protein (CRP), and vitamin B12 levels, with higher inflammatory marker levels linked to poorer responses. Demographic subgroup analysis revealed variability in model performance due to small sample sizes. Machine learning models have potential as clinical decision support tools for personalizing IBD treatment. These findings underscore the value of leveraging real-world evidence in optimizing therapeutic strategies. Further multicenter studies are needed to validate these models and enhance their applicability across diverse populations. Taken together, these findings arise from an exploratory, hypothesis-generating study that constitutes an early step toward truly personalized biologic therapy in IBD.
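As an illustration of the evaluation metrics named above, the following minimal sketch computes accuracy, precision, recall, and F1 for a binary outcome, both overall and per demographic subgroup. It is not the study's code: the function names `binary_metrics` and `metrics_by_group`, the label coding (1 = responder), and the toy labels are all assumptions made for this example.

```python
from collections import defaultdict


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = responder, assumed coding)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    total = tp + tn + fp + fn
    # Guard every ratio against an empty denominator (common in small subgroups).
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"n": total, "accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def metrics_by_group(y_true, y_pred, groups):
    """Compute the same metrics separately for each subgroup label."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: binary_metrics(t, p) for g, (t, p) in buckets.items()}


# Toy, made-up labels purely for illustration (not study data).
y_true = [1, 1, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
sex = ["F", "F", "F", "F", "M", "M", "M", "M"]

overall = binary_metrics(y_true, y_pred)
by_sex = metrics_by_group(y_true, y_pred, sex)
```

In this toy example both subgroups have the same accuracy but different F1 scores, and each subgroup contains only four cases. This is the kind of instability that small subgroup sizes can introduce into a fairness analysis.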