Abstract
OBJECTIVE: To develop and deploy a deep-learning system for automatic NICE classification of colorectal lesions that delivers real-time assistance, incorporates withdrawal-speed monitoring, and shortens the learning curve for endoscopists.

METHODS: A total of 2605 colonoscopic images from three hospitals were used for model training and internal testing, with an independent external set of 99 images for validation. Three CNN models (EfficientNet, ResNet50, VGG19) and three Transformer models (ViT, Swin, CvT) were fine-tuned via transfer learning and compared. Model decisions were explained with Grad-CAM, Guided Grad-CAM, and SHAP. The best model was converted to ONNX for cross-platform deployment (>50 FPS) and combined with a perceptual-hash/Hamming-distance algorithm to visualize withdrawal speed and raise alarms when it is excessive.

RESULTS: EfficientNet achieved the best performance (internal accuracy 0.910, F1 = 0.916). On the external set it reached a macro-average AUC of 0.994, with precision 0.936 and recall 0.918. The model's sensitivity and specificity for NICE 2 and NICE 3 lesions matched or exceeded those of endoscopists. After ONNX conversion, frame-level inference exceeded 50 FPS, and excessive withdrawal speed triggered yellow or red warnings.

CONCLUSION: This study is the first to integrate high-accuracy EfficientNet-based NICE classification, comprehensive explainability, and real-time withdrawal-speed monitoring into an ONNX-based multi-terminal system. The tool shows strong potential to improve the consistency of optical diagnosis for early colorectal cancer and to accelerate the training of novice endoscopists.
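The withdrawal-speed mechanism described in the methods can be illustrated with a minimal sketch: hash each downsampled grayscale frame with a perceptual average-hash, then use the Hamming distance between consecutive frame hashes as a motion proxy, mapping large distances to warnings. This is not the authors' implementation; the hash variant (aHash), the 8x8 frame size, and the yellow/red thresholds below are all illustrative assumptions.

```python
def average_hash(frame):
    """frame: 2D list of grayscale values (e.g. an 8x8 downsampled image).
    Returns a bit tuple: 1 where the pixel exceeds the frame mean (aHash)."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    return tuple(1 if p > mean else 0 for p in pixels)

def hamming(h1, h2):
    """Number of differing bits between two hashes; larger means
    more inter-frame change, i.e. faster apparent motion."""
    return sum(b1 != b2 for b1, b2 in zip(h1, h2))

def speed_alarm(dist, yellow=10, red=20):
    """Map inter-frame Hamming distance to a warning level.
    The thresholds are hypothetical placeholders, not from the paper."""
    if dist >= red:
        return "red"
    if dist >= yellow:
        return "yellow"
    return "ok"

# Two toy 8x8 "frames": an identical pair hashes to distance 0 ("ok"),
# while an inverted frame maximizes the distance ("red").
frame_a = [[(r * 8 + c) % 256 for c in range(8)] for r in range(8)]
frame_b = [[255 - (r * 8 + c) % 256 for c in range(8)] for r in range(8)]

d_same = hamming(average_hash(frame_a), average_hash(frame_a))  # 0
d_diff = hamming(average_hash(frame_a), average_hash(frame_b))  # 64
```

In a live system the same comparison would run on consecutive video frames, with the rolling distance driving the on-screen yellow/red speed indicator.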