Abstract
PURPOSE: Accurate cancer subtyping is essential for precision medicine but is challenged by the computational demands of gigapixel whole-slide images (WSIs). Although transformer-based multiple instance learning (MIL) methods achieve strong performance, their quadratic complexity limits clinical deployment. We introduce LiteMIL, a computationally efficient cross-attention MIL method optimized for WSI classification. APPROACH: LiteMIL employs a single learnable query with multi-head cross-attention for bag-level aggregation of pre-extracted features. We evaluated LiteMIL against five baselines (mean pooling, max pooling, ABMIL, MAD-MIL, and TransMIL) on four TCGA datasets (breast: n = 875, kidney: n = 906, lung: n = 958, and TUPAC16: n = 821) using nested cross-validation with patient-level splitting. Systematic ablation studies evaluated multi-query variants, the number of attention heads, dropout rates, and architectural components. RESULTS: LiteMIL achieved competitive accuracy (average 83.5%), matching TransMIL, while offering substantial efficiency gains: 4.8× fewer parameters (560K versus 2.67M), 2.9× faster inference (1.6 s versus 4.6 s per fold), and 6.7× lower graphics processing unit (GPU) memory usage (1.15 GB versus 7.77 GB). LiteMIL excelled on lung (86.3% versus 85.0%) and TUPAC16 (72% versus 71.4%), and matched kidney performance (89.9% versus 89.7%). Ablation studies revealed that the benefit of multiple queries is task dependent: Q = 4 versus Q = 1 improved morphologically heterogeneous tasks (breast and lung, +1.3% each, p < 0.05) but degraded the grading task (TUPAC16: -1.6%), supporting the single-query design for focused-attention scenarios. CONCLUSIONS: LiteMIL provides a resource-efficient solution for WSI classification. The cross-attention architecture matches the performance of more complex transformers while enabling deployment on consumer GPUs.
Task-dependent design insights (a single query for sparse discriminative features, multiple queries for heterogeneous patterns) guide practical implementation. The architecture's efficiency, combined with compact features, makes LiteMIL suitable for clinical integration in settings with limited computational infrastructure.
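The aggregation step described under APPROACH can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the class name `SingleQueryCrossAttentionPool`, the feature dimension, the random initialization, and the omission of dropout, normalization, and training are all assumptions made for illustration. The sketch shows only the core idea, that one learnable query attends over N patch features, so the attention score tensor has shape (heads, 1, N) and grows linearly with N rather than quadratically as in full self-attention.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class SingleQueryCrossAttentionPool:
    """Hypothetical sketch: one learnable query pools a bag of patch features."""

    def __init__(self, dim, num_heads, num_classes, seed=0):
        assert dim % num_heads == 0, "dim must be divisible by num_heads"
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(dim)
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        # Randomly initialized stand-ins for parameters that would be learned.
        self.query = rng.normal(0.0, scale, size=(1, dim))  # the single query
        self.w_q = rng.normal(0.0, scale, size=(dim, dim))
        self.w_k = rng.normal(0.0, scale, size=(dim, dim))
        self.w_v = rng.normal(0.0, scale, size=(dim, dim))
        self.w_o = rng.normal(0.0, scale, size=(dim, dim))
        self.w_cls = rng.normal(0.0, scale, size=(dim, num_classes))

    def _split_heads(self, x):
        # (n, dim) -> (heads, n, head_dim)
        n = x.shape[0]
        return x.reshape(n, self.num_heads, self.head_dim).transpose(1, 0, 2)

    def forward(self, feats):
        """feats: (N, dim) patch features of one slide. Returns (logits, attn)."""
        q = self._split_heads(self.query @ self.w_q)  # (H, 1, d)
        k = self._split_heads(feats @ self.w_k)       # (H, N, d)
        v = self._split_heads(feats @ self.w_v)       # (H, N, d)
        # One row of scores per head: cost is linear in the number of patches.
        scores = q @ k.transpose(0, 2, 1) / np.sqrt(self.head_dim)  # (H, 1, N)
        attn = softmax(scores, axis=-1)
        context = attn @ v                                          # (H, 1, d)
        bag = context.transpose(1, 0, 2).reshape(1, -1) @ self.w_o  # (1, dim)
        logits = bag @ self.w_cls                                   # (1, C)
        return logits[0], attn[:, 0, :]


# Usage with synthetic data: 500 patches, 64-dimensional features (assumed sizes).
rng = np.random.default_rng(1)
feats = rng.normal(size=(500, 64))
pool = SingleQueryCrossAttentionPool(dim=64, num_heads=4, num_classes=2)
logits, attn = pool.forward(feats)
print(logits.shape, attn.shape)
```

Because the bag embedding is an attention-weighted sum over patches, the output does not depend on patch order, and the per-head attention weights can be inspected to see which patches contributed most to the slide-level prediction.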