Abstract
Accurate 3D reconstruction of guidewires is crucial in minimally invasive surgery and interventional procedures. Traditional biplanar X-ray reconstruction methods can achieve reasonable accuracy but expose patients and staff to high radiation doses, which limits their clinical applicability; single-view images, meanwhile, inherently lack reliable depth cues. To address these issues, this paper proposes a multimodal guidewire 3D reconstruction approach that integrates magnetic field information. The method first employs the MiDaS v3 network to estimate an initial depth map from a single image and then incorporates tri-axial magnetic field measurements to enrich and refine the spatial information. To fuse the two modalities, we design a multi-stage strategy that combines k-nearest-neighbor (KNN) matching with a cross-modal attention (cross-attention) mechanism, aligning image features with magnetic features before fusing them. The fused representation is then fed into a PointNet-based regressor that outputs the final 3D coordinates of the guidewire. On the test set, our method achieves a root-mean-square error (RMSE) of 2.045 mm, a mean absolute error (MAE) of 1.738 mm, and a z-axis MAE of 0.285 mm. These findings indicate that the proposed multimodal framework improves 3D reconstruction accuracy under single-view imaging and offers enhanced visualization support for interventional procedures.
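To make the fusion stage concrete, the following is a minimal NumPy sketch of KNN matching followed by single-head cross-attention. It is an illustration under stated assumptions, not the implementation used in this work: the function names (`knn_indices`, `cross_attention`), the use of sensor positions as the KNN matching space, the random projection matrices, and all array sizes are hypothetical, and the data are synthetic.

```python
import numpy as np


def knn_indices(query, reference, k):
    """Return, for each query point, the indices of its k nearest reference points."""
    # Pairwise squared Euclidean distances, shape (n_query, n_reference).
    d2 = ((query[:, None, :] - reference[None, :, :]) ** 2).sum(axis=-1)
    # Sort each row by distance and keep the k closest indices.
    return np.argsort(d2, axis=1)[:, :k]


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(img_feat, mag_feat, neighbors, w_q, w_k, w_v):
    """Let each image point attend over its k matched magnetic samples."""
    queries = img_feat @ w_q               # (n, d)
    keys = mag_feat[neighbors] @ w_k       # (n, k, d)
    values = mag_feat[neighbors] @ w_v     # (n, k, d)
    # Scaled dot-product scores between each query and its k keys.
    scores = np.einsum("nd,nkd->nk", queries, keys) / np.sqrt(queries.shape[-1])
    weights = softmax(scores, axis=-1)     # (n, k), each row sums to 1
    attended = np.einsum("nk,nkd->nd", weights, values)
    # Concatenate image and attended magnetic features as the fused representation.
    return np.concatenate([queries, attended], axis=-1), weights


# Synthetic stand-ins (assumptions): image points as (x, y, estimated depth),
# magnetic samples as sensor positions plus tri-axial field readings.
rng = np.random.default_rng(0)
n, m, k, d = 50, 20, 4, 8
img_points = rng.normal(size=(n, 3))
sensor_pos = rng.normal(size=(m, 3))
mag_field = rng.normal(size=(m, 3))
w_q, w_k, w_v = (rng.normal(size=(3, d)) * 0.1 for _ in range(3))

neighbors = knn_indices(img_points, sensor_pos, k)
fused, weights = cross_attention(img_points, mag_field, neighbors, w_q, w_k, w_v)
print(fused.shape)  # (50, 16)
```

In this sketch the KNN step restricts each image point to a small set of spatially nearby magnetic samples, so the attention weights only have to rank a few candidates rather than every measurement. In a trained model the projection matrices would be learned, and the fused features would be passed on to the point-wise regressor.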