Abstract
OBJECTIVE: The lack of a rapid, validated, consistent test for tracking disease activity in patients with inflammatory bowel disease (IBD) remains a major challenge. Currently used biomarkers have notable disadvantages, such as slow processing (faecal calprotectin) and lack of specificity (bloodwork). White blood cell (WBC) subsets, also known as 'the differential', are commonly obtained when evaluating IBD patients, but there is minimal evidence on how these subsets relate to disease activity. Given the interplay between immune cells, it is possible that complex patterns in WBC subsets could be used to classify IBD activity, and machine learning (ML) could be used to reveal these patterns. The aim of this study was to classify IBD activity from routine bloodwork results using an ML approach. METHODS: A total of 1458 bloodwork measurements from 108 IBD patients were included in this analysis. Disease activity was classified by the physician's global assessment score. Four ML models were trained to classify active disease or remission based on routine bloodwork metrics (complete blood count, differential, albumin, erythrocyte sedimentation rate and C reactive protein). RESULTS: The optimal model, extreme gradient boosted decision trees, achieved an area under the receiver operating characteristic curve of 0.882. Feature analysis identified neutrophils, C reactive protein and albumin as consistently important contributors to the models. Additionally, no individual biomarker performed comparably to the ML model, and medications had only a minor impact on the ML model. CONCLUSION: Classification of IBD activity can be augmented using ML analysis of commonly measured bloodwork parameters to help inform treatment plans and to improve IBD patient outcomes.
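As an illustration of the headline metric reported above, the following minimal sketch computes a receiver operating characteristic area under the curve directly from its rank-based definition: the probability that a randomly chosen active-disease sample receives a higher model score than a randomly chosen remission sample, with ties counted as one half. This is not the study's code; the function name `roc_auc` and the toy labels and scores are hypothetical and chosen only to show the calculation.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 1 (active disease) or 0 (remission)
    scores: model outputs, higher meaning "more likely active"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # active sample correctly ranked above remission
            elif p == n:
                wins += 0.5   # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.35, 0.4, 0.2, 0.1, 0.35, 0.7]
auc = roc_auc(labels, scores)
print(f"ROC AUC = {auc:.3f}")
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes. The pairwise loop is quadratic in the number of samples, which is acceptable for a sketch; library implementations typically use a sort-based calculation instead.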