A novel superpixel based Vision Transformer for improving interpretability in glaucoma screening



Abstract

Interpretability remains one of the major challenges in the clinical adoption of deep learning models for medical image analysis. In ophthalmology, particularly for glaucoma screening, explainable artificial intelligence (XAI) methods are essential for ensuring trust and diagnostic transparency. This study introduces the Superpixel-based Vision Transformer (SpxViT), a model designed to enhance interpretability while maintaining competitive accuracy. SpxViT replaces the traditional fixed-grid tokenization of Vision Transformers (ViTs) with a superpixel-based approach that preserves semantic boundaries within the retinal image. Two variants, SpxViT_fix and SpxViT_var, were evaluated on public and private glaucoma datasets. Results demonstrate that SpxViT achieves accuracy comparable to ViT-B/16 (91.9% vs. 92.5%) while producing more clinically consistent attention maps focused on the optic disc and cup.
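The core idea — grouping pixels into boundary-respecting superpixels and pooling each region into one transformer token, instead of slicing the image into fixed 16×16 patches — can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the function name `superpixel_tokens` and all parameters are my own, a toy SLIC-style k-means stands in for a real superpixel algorithm, and a real model would feed learned per-region embeddings (not mean colours) to the transformer.

```python
import numpy as np

def superpixel_tokens(image, n_grid=4, n_iter=5, m=0.5):
    """Toy SLIC-style clustering: group pixels into ~n_grid**2 superpixels
    using joint colour + position features, then average each region's
    colour into one token vector (hypothetical sketch, not SpxViT itself)."""
    H, W, C = image.shape
    yy, xx = np.mgrid[0:H, 0:W]
    # colour features concatenated with spatially weighted coordinates;
    # larger m trades colour fidelity for more compact, grid-like regions
    feats = np.concatenate(
        [image.reshape(-1, C),
         m * (yy.reshape(-1, 1) / H),
         m * (xx.reshape(-1, 1) / W)], axis=1)
    # seed cluster centres on a regular grid, as SLIC does
    ys = np.linspace(0, H - 1, n_grid).astype(int)
    xs = np.linspace(0, W - 1, n_grid).astype(int)
    centres = feats.reshape(H, W, -1)[np.ix_(ys, xs)].reshape(n_grid * n_grid, -1)
    for _ in range(n_iter):  # plain Lloyd (k-means) iterations
        dists = ((feats[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(axis=1)
        for k in range(len(centres)):
            members = feats[labels == k]
            if len(members):
                centres[k] = members.mean(axis=0)
    # one token per superpixel: mean colour of its member pixels
    colours = image.reshape(-1, C)
    tokens = np.stack(
        [colours[labels == k].mean(axis=0) if (labels == k).any()
         else np.zeros(C) for k in range(len(centres))])
    return tokens, labels.reshape(H, W)

img = np.random.rand(32, 32, 3)        # stand-in for a resized fundus image
tokens, labels = superpixel_tokens(img)  # tokens: (16, 3), labels: (32, 32)
```

Because each token now covers an irregular, edge-aligned region rather than a square patch, attention weights over tokens map back to anatomically coherent areas such as the optic disc and cup — which is the interpretability gain the abstract claims. How SpxViT_fix and SpxViT_var differ (presumably fixed versus variable token counts) is not detailed in the abstract.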
