Hybrid deep learning framework based on EfficientViT for classification of gastrointestinal diseases

基于EfficientViT的混合深度学习框架用于胃肠道疾病分类

阅读:1

Abstract

GI diseases are one of the leading causes of morbidity and mortality worldwide, and early and accurate diagnosis is considered to be very important. Traditional methods like endoscopy take time and depend majorly on the judgment of the physician. The proposed Efficient Vision Transformer (EfficientViT) is a new deep learning-based model using EfficientNetB0 in combination with the Vision Transformer (ViT) for the classification of eight different types of diseases in the GI system. EfficientViT utilizes the features of EfficientNetB0 to capture local textures and multi-scale features to achieve structural changes in the GI tract. At the same time, it includes the capacity of the ViT model to recognize the context of images of the GI tract for the detection of slight disease patterns and precursors of disease diffusion. Furthermore, we designed a dual-block in which input is divided into two parts (q1, q2) to better optimize the model q1 processed through an EfficientNet for local details and a q2 through encoder block for capturing the global dependencies, which enables EfficientViT to pay attention to multiple image regions simultaneously. We have tested the model using fivefold cross-validation and achieved an outstanding accuracy of 99.82% compared to the MobileNetV2-based model which reached 99.60%. In addition, EfficientViT demonstrated excellent precision, recall, and F1 scores. Our model, in general, outperforms existing methods, offering a promising tool for clinicians to more reliably and accurately diagnose GI diseases from endoscopic images.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。