Abstract
In clinical practice, MRI is widely used to diagnose neurodegenerative diseases such as Alzheimer's disease (AD). However, AD-related MRI patterns are complex and often resemble those of normal aging, so diagnosis relies on subjective clinical experience and carries a risk of misdiagnosis. This paper proposes the Cross Vision Transformer with Coordinate (CVTC) to assess AD severity and accurately annotate suspicious lesions, reducing clinicians' burden and improving diagnostic reliability. CVTC integrates scale-adaptive embedding, dynamic position bias, and long-short attention mechanisms to better capture both local and global MRI features. For annotation, the Coordinate and Feature Map Guided Mechanism (CAGM) uses pixel coordinates and feature maps to compute an importance threshold and generates lesion overlay maps for precise localization. A user interface is also provided to support intuitive exploration and validation in clinical settings. CVTC achieves 98.80% accuracy on ADNI (AD/MCI/CN) and 98.51% on the AD subtype task (EMCI/LMCI/SMC/CN). Cross-dataset validation on tumor and multiple sclerosis MRIs supports CAGM's annotation accuracy, and results on NACC (96.30%), OASIS-1 (98.16%), and pseudo-RGB datasets (92.96%) indicate strong generalization. Ablation studies and cross-validation further support the model's robustness. With a lightweight 21.850 MB design, CVTC offers an efficient and accurate tool for diverse clinical applications.