Abstract
Incomplete or faulty MRI sequences are common in clinical practice and can impair AI-based analyses that rely on complete multi-contrast data. Whether classical generative adversarial networks (GANs) or modern diffusion and transformer-based models are more effective for clinically usable MRI synthesis remains unclear. This study evaluates cross-modality MRI synthesis on the BraTS 2019 brain tumour dataset, focusing on T1-to-T2 translation. Conceptually, modality synthesis can be viewed as representation learning that compensates for missing imaging information by reconstructing clinically relevant features from the available contrasts. We assess paired and unpaired CycleGAN models and compare them with two stronger but computationally intensive baselines, a conditional denoising diffusion probabilistic model (DDPM) and a transformer-enhanced GAN, using identical data splits and preprocessing pipelines. Inter-modality correlation between acquired T1 and T2 images was measured to estimate the achievable similarity ceiling. Paired CycleGAN achieved correlations of r≈0.92-0.93 and SSIM ≈0.90-0.92, approaching the natural T1-T2 correlation (r≈0.95) while maintaining very fast inference (<50 ms per slice). Unpaired CycleGAN achieved r≈0.74-0.78 and SSIM ≈0.82-0.85, producing clinically interpretable reconstructions without voxel-level supervision. The DDPM achieved the highest fidelity (SSIM ≈0.93-0.95, r≈0.94) but required substantially greater computational resources, while the transformer-enhanced GAN was intermediate. Qualitative analysis showed that paired CycleGAN and the DDPM best preserved tumour and tissue boundaries, whereas unpaired CycleGAN occasionally over-smoothed subtle lesions. These findings highlight the trade-off between fidelity and efficiency in cross-modality MRI synthesis: paired CycleGAN suits time-sensitive clinical workflows, while diffusion models serve as a computationally expensive upper bound on accuracy.
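The evaluation metrics named above (Pearson correlation r and SSIM) can be sketched for a single slice pair as follows. This is a minimal illustration, not the paper's actual evaluation code: the function names are hypothetical, and for brevity the SSIM here is the simplified global (single-window) form rather than the locally windowed variant typically reported in the literature.

```python
import numpy as np

def global_ssim(x, y, data_range, k1=0.01, k2=0.03):
    """Simplified global SSIM over whole slices (no sliding window).

    Assumption: a single-window SSIM for illustration; reported results
    would normally use a windowed implementation.
    """
    c1, c2 = (k1 * data_range) ** 2, (k2 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )

def pearson_r(x, y):
    """Pearson correlation between flattened slices."""
    return float(np.corrcoef(x.ravel(), y.ravel())[0, 1])

# Sanity check: a slice compared with itself yields SSIM = 1 and r = 1.
rng = np.random.default_rng(0)
ref = rng.random((64, 64))          # stand-in for a reference T2 slice
synth = ref.copy()                  # stand-in for a perfect synthesis
s = global_ssim(synth, ref, data_range=ref.max() - ref.min())
r = pearson_r(synth, ref)
```

In practice these per-slice scores would be averaged over the test split, which is how ranges such as SSIM ≈0.90-0.92 arise.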