Abstract
Lung CT analysis suffers from high miss rates for subcentimeter nodules, false positives caused by vascular adhesion, and insufficient multi-scale feature fusion. To address these challenges, this paper proposes MLND-IU, a three-stage detection model built on an improved U-Net++ architecture. The first stage uses an enhanced RetinaNet optimized with a dynamic focal loss to generate candidate regions with high sensitivity while mitigating class imbalance. The second stage introduces AG-UNet++ with a novel Dense Attention Bridging Module (DABM), which fuses channel attention and deformable spatial attention by tensor product across the densely connected skip pathways to strengthen feature representation for 3-5 mm nodules. The third stage applies a 3D Contextual Pyramid Module (3D-CPM) that integrates multi-slice morphological and contextual features to reduce vascular false positives. In ablation studies, the second stage improved the Dice coefficient by 21.1% over the first stage (paired t-test, p < 0.01; independent validation on LIDC-IDRI), and the third stage further reduced the false positives per scan (FP/Scan) to 1.4, an 87.3% reduction relative to the baseline. Multicenter validation on the LIDC-IDRI (n = 1,018) and DSB2017 (n = 1,595) datasets yielded a segmentation Dice coefficient of 92.7%, a sensitivity of 93.4% for nodules smaller than 6 mm (versus 68.5% for radiologists, p = 0.003), and an AUC of 0.84 for malignancy classification, a 19.2% improvement over conventional methods. With a processing time of 2.3 seconds per case, the framework offers a clinically viable approach to early lung cancer screening that improves small-nodule detection while suppressing false positives.
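For readers unfamiliar with the evaluation metrics reported above (Dice coefficient, sensitivity, and FP/Scan), the following is a minimal sketch of their standard textbook definitions. It is not the authors' evaluation code: the function names, the flattened-mask representation, and the toy inputs are assumptions made purely for illustration, and no value printed here corresponds to a result in the paper.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2 * |P and T| / (|P| + |T|) for flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * overlap / total


def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of ground-truth nodules that were detected."""
    n_nodules = true_positives + false_negatives
    if n_nodules == 0:
        raise ValueError("no ground-truth nodules to evaluate")
    return true_positives / n_nodules


def false_positives_per_scan(false_positives: int, n_scans: int) -> float:
    """Average number of false-positive detections per CT scan."""
    if n_scans <= 0:
        raise ValueError("n_scans must be positive")
    return false_positives / n_scans


def relative_reduction(baseline: float, value: float) -> float:
    """Relative reduction of `value` with respect to `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (baseline - value) / baseline


if __name__ == "__main__":
    # Toy, made-up inputs (hypothetical; not data from the paper).
    predicted_mask = [1, 1, 0, 0, 1, 0]
    reference_mask = [1, 0, 0, 0, 1, 1]
    print("Dice:", dice_coefficient(predicted_mask, reference_mask))
    print("Sensitivity:", sensitivity(true_positives=9, false_negatives=1))
    print("FP/Scan:", false_positives_per_scan(false_positives=14, n_scans=10))
    print("Relative reduction:", relative_reduction(baseline=10.0, value=2.5))
```

Under these definitions, a statement such as "an 87.3% reduction relative to the baseline" corresponds to `relative_reduction(baseline_fp_per_scan, 1.4)`, where the baseline FP/Scan value is whatever the comparison model produced on the same scans.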