Abstract
Joint classification of hyperspectral imagery (HSI) and light detection and ranging (LiDAR) data has attracted increasing attention in remote sensing. However, effective multimodal fusion and robust feature modeling remain challenging due to the heterogeneity of the two modalities. Graph neural networks (GNNs) are well suited to modeling non-Euclidean structures and cross-modal relations, but most existing GNN-based methods rely on supervised learning, which limits their applicability in label-scarce scenarios. We propose adaptive graph contrastive learning (AGCL), a self-supervised graph framework for joint HSI and LiDAR classification. AGCL performs adaptive graph construction through input-conditioned neighborhood selection and learns dynamic affinity matrices for flexible message passing. A hard negative mining strategy constructs informative negative samples for contrastive learning. During self-supervised pretraining, AGCL jointly optimizes intra-modal consistency, cross-modal alignment, and graph topology reconstruction without any labeled data. The learned representations are then transferred to the downstream classification task via supervised fine-tuning. Experiments on three benchmark datasets demonstrate the effectiveness of the proposed framework.
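To make the contrastive component concrete, the sketch below shows a generic InfoNCE-style loss in which each anchor is contrasted against its k most similar in-batch negatives (hard negative mining). This is an illustrative sketch under assumptions, not the paper's exact formulation: the function name, the use of in-batch negatives, and the hyperparameters `k` and `tau` are all placeholders.

```python
import numpy as np

def info_nce_hard_negatives(anchors, positives, k=4, tau=0.2):
    """Illustrative InfoNCE loss with hard negative mining.

    Each anchor i is paired with positives[i]; its denominator keeps
    only the k hardest (most similar) in-batch negatives plus the
    positive pair. Names and hyperparameters are assumptions, not
    AGCL's actual implementation.
    """
    # L2-normalize so dot products are cosine similarities
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    sim = a @ p.T / tau                  # (N, N) temperature-scaled similarities
    pos = np.diag(sim).copy()            # positive logit for each anchor
    neg = sim.copy()
    np.fill_diagonal(neg, -np.inf)       # exclude the positive pair itself
    # Hard negative mining: keep only the k largest negative logits per anchor
    hard = np.sort(neg, axis=1)[:, -k:]  # (N, k)
    logits = np.concatenate([pos[:, None], hard], axis=1)
    # Negative log-probability of the positive against its hard negatives
    log_prob = pos - np.log(np.exp(logits).sum(axis=1))
    return -log_prob.mean()
```

Mining only the hardest negatives concentrates the gradient on pairs the encoder currently confuses, which is the usual motivation for such a strategy; the batch size must exceed `k` so that enough negatives exist per anchor.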