Abstract
Remote sensing (RS) images are often degraded by atmospheric haze, which compromises both visual interpretation and downstream applications. To address this problem, we introduce CSGL-Former, a Cross-Stripes Global-Local Fusion Transformer for RS image dehazing. The model captures anisotropic long-range dependencies efficiently with cross-stripes attention (CSA) and aggregates hierarchical global semantics via a Multi-Layer Global Aggregation (MLGA) module. In the decoder, global context is adaptively blended with fine-grained local features to restore intricate textures. Finally, inspired by the atmospheric scattering model, a soft reconstruction head restores the clear image by predicting spatially varying affine parameters, which preserves content fidelity while removing haze. Trained end-to-end, CSGL-Former balances accuracy and efficiency. Extensive experiments on the RRSHID and SateHaze1K benchmarks show that it achieves state-of-the-art or highly competitive performance against representative baselines, and ablation studies validate the contribution of each proposed component.
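
The link between the atmospheric scattering model and an affine reconstruction can be sketched as follows. The symbols are not defined in the abstract and are assumptions here: $I$ is the hazy image, $J$ the clear image, $t$ the transmission map, $A$ the global atmospheric light, and $K$, $B$ are hypothetical names for the per-pixel scale and shift. The exact parameterization used by CSGL-Former is not stated in this section.

```latex
% Standard atmospheric scattering model at pixel location x,
% followed by its inversion written as an affine map of the hazy input.
\begin{align}
I(x) &= J(x)\,t(x) + A\,\bigl(1 - t(x)\bigr) \\
J(x) &= \frac{I(x) - A\,\bigl(1 - t(x)\bigr)}{t(x)}
      = \underbrace{\frac{1}{t(x)}}_{K(x)}\, I(x)
      + \underbrace{A\left(1 - \frac{1}{t(x)}\right)}_{B(x)}
\end{align}
```

Under this reading, a head that predicts a spatially varying scale $K(x)$ and shift $B(x)$ could recover $J(x) = K(x)\,I(x) + B(x)$ without estimating $t$ and $A$ separately. Whether the soft reconstruction head takes exactly this form is an assumption.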