Abstract
Automated assessment of coronary artery (CA) lesions via coronary computed tomography angiography (CCTA) is essential for the diagnosis of coronary artery disease (CAD). However, current deep learning approaches face three main challenges: modeling long-range anatomical dependencies, decoupling plaque texture from stenosis geometry, and exploiting the mixed-grained annotations that are prevalent in clinical practice. To address these challenges, we propose a novel mixed-grained hierarchical geometric-semantic learning network (MG-HGLNet). Specifically, we introduce a topology-aware dual-stream encoding (TDE) module, which incorporates a bidirectional vessel Mamba (BiV-Mamba) encoder to capture global hemodynamic context and rectify the spatial distortions inherent in curved planar reformation (CPR). Furthermore, a synergistic spectral-morphological decoupling (SSD) module is designed to disentangle task-specific features: it uses frequency-domain analysis to extract plaque spectral fingerprints and a texture-guided deformable attention mechanism to refine the luminal boundary. To mitigate the scarcity of fine-grained labels, we adopt a mixed-grained supervision optimization (MSO) strategy that uses anatomy-aware dynamic prototypes and logical consistency constraints to leverage coarse branch-level labels. Extensive experiments on an in-house dataset demonstrate that MG-HGLNet achieves a stenosis grading accuracy of 92.4% and a plaque classification accuracy of 91.5%. The results suggest that our framework not only outperforms state-of-the-art methods but also maintains robust performance under weakly supervised settings, offering a promising solution for label-efficient CAD diagnosis.
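
To make the idea of a logical consistency constraint between fine-grained and coarse labels concrete, the following is a minimal sketch of one possible reading, not the formulation used in MG-HGLNet. It assumes that a branch-level stenosis grade equals the maximum grade among the segments of that branch, and that segment predictions are independent. The function names `branch_grade_distribution` and `consistency_loss` are hypothetical.

```python
import math


def branch_grade_distribution(segment_probs):
    """Distribution of the maximum grade over all segments of one branch.

    segment_probs: list of per-segment probability vectors over ordered
    grades (each vector sums to 1). Assumes independent segments (an
    illustrative assumption, not taken from the paper).
    """
    num_grades = len(segment_probs[0])
    dist = []
    prev_cdf_prod = 0.0
    for g in range(num_grades):
        # P(max grade <= g) is the product of the per-segment CDFs at g.
        cdf_prod = 1.0
        for probs in segment_probs:
            cdf_prod *= sum(probs[: g + 1])
        # P(max grade == g) = P(max <= g) - P(max <= g - 1).
        dist.append(cdf_prod - prev_cdf_prod)
        prev_cdf_prod = cdf_prod
    return dist


def consistency_loss(segment_probs, branch_label, eps=1e-8):
    """Negative log-likelihood of the coarse branch label under the
    distribution implied by the segment-level predictions."""
    dist = branch_grade_distribution(segment_probs)
    return -math.log(dist[branch_label] + eps)


if __name__ == "__main__":
    # Two segments, three ordered grades (e.g. mild, moderate, severe).
    preds = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]]
    print(branch_grade_distribution(preds))
    print(consistency_loss(preds, branch_label=2))
```

Under these assumptions, a branch annotated only with its worst grade still yields a training signal for every segment in it: the loss is small when at least one segment is predicted at the annotated grade and none is predicted above it.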