Abstract
Magnetic resonance imaging (MRI), a widely adopted modality in clinical practice, enables the acquisition of multiple contrast-weighted images of the same anatomy, commonly referred to as multimodal images. Integrating these diverse modalities is crucial for enhancing model performance across a variety of medical image analysis tasks. However, in real-world clinical scenarios, it is often impractical to acquire all MRI modalities simultaneously due to factors such as patient discomfort, time constraints, and scanning costs. As a result, synthesizing missing modalities from the available ones has emerged as an effective solution. To this end, we propose HMF-MambaINR, a hierarchical multi-scale feature fusion network for cross-modality MRI synthesis. The model combines a Mamba-based selective state space model (SSM) with an implicit neural representation (INR) to capture long-range dependencies and enable continuous spatial reconstruction. A Multi-Feature Extraction Block (MFEB) captures local and global representations via multi-scale receptive fields, while a Modulation Fusion Module (MFM) adaptively fuses multi-modal features with dynamic weighting. Extensive experiments show that HMF-MambaINR surpasses state-of-the-art CNN-, Transformer-, and Mamba-based methods in synthesizing missing MRI modalities. Notably, radiologists rated the synthesized images positively for image quality, contrast, and structural contour accuracy, highlighting the potential of the proposed method as a practical tool for clinical applications.
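To make the fusion idea concrete, the sketch below illustrates adaptive multi-modal feature fusion with input-dependent weights, in the spirit of the Modulation Fusion Module (MFM) described above. It is a minimal illustration only: the class name, gating design, and tensor shapes are our assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the paper's code): fuse per-modality feature maps
# with dynamically predicted weights, as one plausible realization of the MFM.
import torch
import torch.nn as nn


class DynamicWeightFusion(nn.Module):
    """Fuses per-modality feature maps using input-dependent weights."""

    def __init__(self, channels: int, num_modalities: int):
        super().__init__()
        # Small gating network: pooled features -> one scalar weight per modality.
        self.gate = nn.Sequential(
            nn.Linear(channels * num_modalities, channels),
            nn.ReLU(inplace=True),
            nn.Linear(channels, num_modalities),
        )

    def forward(self, feats: list[torch.Tensor]) -> torch.Tensor:
        # feats: list of (B, C, H, W) feature maps, one per available modality.
        pooled = torch.cat([f.mean(dim=(2, 3)) for f in feats], dim=1)  # (B, C*M)
        weights = torch.softmax(self.gate(pooled), dim=1)               # (B, M)
        stacked = torch.stack(feats, dim=1)                             # (B, M, C, H, W)
        # Broadcast each modality's weight over its channels and spatial dims.
        return (weights[:, :, None, None, None] * stacked).sum(dim=1)   # (B, C, H, W)


if __name__ == "__main__":
    fuse = DynamicWeightFusion(channels=64, num_modalities=2)
    t1_feat, t2_feat = torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32)
    print(fuse([t1_feat, t2_feat]).shape)  # torch.Size([1, 64, 32, 32])
```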