Abstract
PURPOSE: Early detection of colorectal cancer (CRC) is crucial for effective treatment, but current screening methods have limitations. Colonoscopy is highly effective but has limited availability, whereas fecal immunochemical tests (FIT) are more accessible and cost-effective but suffer from low adherence. This retrospective study aimed to develop a transparent artificial-intelligence model that uses routine complete blood count (CBC) data as a cost-effective method for CRC detection. METHODS: We analyzed 28,450 individuals aged 45-75 who underwent colonoscopy within six months of a CBC test. Among them, 439 (1.8%) had CRC, 2,955 (11.8%) had advanced adenomas, and 21,662 (86.5%) had benign findings on colonoscopy. The dataset was split into training (70%) and testing (30%) sets, and the model was developed using ridge regression. RESULTS: Descriptive analysis showed significant differences between CRC cases and controls in age and in most CBC markers and CBC-derived ratios (P < 0.001), except for lymphocytes. The model, based on red cell distribution width (RDW), systemic inflammation response index (SIRI), hemoglobin, and age, achieved an AUC of 0.77 (95% CI: 0.75-0.77) for CRC, comparable to that of a deep learning model (TabPFN). Interpretability analysis showed that older age, elevated RDW and SIRI, and low hemoglobin were associated with CRC. In the subgroup with FIT results (7.25%), FIT showed higher sensitivity for CRC than the model (88% vs. 64%) but lower specificity (77% vs. 81%). CONCLUSION: Given the widespread use and accessibility of CBC testing, this approach may serve as a scalable pre-screening tool to improve CRC risk stratification and optimize resource allocation, and it illustrates how explainable AI may augment existing CRC screening programs.
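The quantities reported above (SIRI, AUC, sensitivity, specificity) can be illustrated with a minimal Python sketch. The abstract does not define SIRI, so the code assumes the commonly used definition, neutrophils × monocytes / lymphocytes. The function names, the threshold, and the toy scores and labels are hypothetical and are not taken from the study; the sketch does not reproduce the study's ridge regression model.

```python
def siri(neutrophils, monocytes, lymphocytes):
    """Assumed definition: SIRI = neutrophils * monocytes / lymphocytes (counts in 10^9/L)."""
    return neutrophils * monocytes / lymphocytes


def auc(scores, labels):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def sensitivity_specificity(scores, labels, threshold):
    """Treat a score >= threshold as positive and compare against the true labels."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data, NOT study data: 1 = CRC, 0 = benign finding.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.3, 0.6, 0.2, 0.2, 0.1]

toy_auc = auc(toy_scores, toy_labels)
toy_sens, toy_spec = sensitivity_specificity(toy_scores, toy_labels, threshold=0.5)
```

Moving the threshold trades sensitivity against specificity, which is the kind of trade-off the FIT comparison in the RESULTS reflects; the AUC summarizes discrimination across all thresholds.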