Abstract
Oriented object detection is a fundamental yet challenging task in Artificial Intelligence of Things (AIoT)-enabled maritime surveillance, where dense visual streams must be processed in real time. Existing detectors suffer from three critical limitations: sequential attention mechanisms fail to capture coupled spatial-channel dependencies, unconstrained deformable convolutions yield unstable predictions for elongated vessels, and center-based distance metrics ignore angular alignment during sample assignment. To address these challenges, we propose JAOSD (Joint Attention-based Oriented Ship Detection), an anchor-free framework with three novel components: (1) a joint attention module that processes spatial and channel branches in parallel with coupled fusion, (2) an adaptive geometric convolution with two-stage offset refinement and spatial consistency regularization, and (3) an orientation-aware Adaptive Sample Selection strategy based on corner-aware distance metrics. Extensive experiments on three benchmarks demonstrate that JAOSD achieves state-of-the-art accuracy (94.74% mAP on HRSC2016, 92.43% AP50 on FGSD2021, and 80.44% mAP on DOTA v1.0) while maintaining real-time inference at 42.6 FPS. Cross-domain evaluation on the Singapore Maritime Dataset further confirms that the model generalizes robustly from aerial to shore-based surveillance scenarios without domain adaptation.
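To illustrate why a center-based distance can ignore angular alignment, the following minimal sketch compares it with one possible corner-aware distance. The abstract does not specify the metric used in JAOSD, so the function names, the `(cx, cy, w, h, theta)` box layout, and the mean corner-to-corner formulation are all assumptions chosen for illustration only.

```python
import math

def obb_corners(cx, cy, w, h, theta):
    """Four corners of an oriented box; theta is in radians (assumed layout)."""
    c, s = math.cos(theta), math.sin(theta)
    half = [(w / 2, h / 2), (-w / 2, h / 2), (-w / 2, -h / 2), (w / 2, -h / 2)]
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half]

def center_distance(a, b):
    """Euclidean distance between box centers; blind to orientation."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def corner_distance(a, b):
    """Hypothetical corner-aware metric: mean corner-to-corner distance,
    minimized over cyclic corner orderings so that equivalent boxes
    (e.g. rotated by 180 degrees) score zero."""
    ca, cb = obb_corners(*a), obb_corners(*b)
    best = float("inf")
    for k in range(4):
        total = sum(
            math.hypot(p[0] - cb[(i + k) % 4][0], p[1] - cb[(i + k) % 4][1])
            for i, p in enumerate(ca)
        )
        best = min(best, total / 4)
    return best

# An elongated ship-like box and a candidate sharing its center but rotated 30 degrees.
gt = (0.0, 0.0, 100.0, 10.0, 0.0)
cand = (0.0, 0.0, 100.0, 10.0, math.pi / 6)
print(center_distance(gt, cand), corner_distance(gt, cand))
```

In this sketch the center distance is zero for the rotated candidate, whereas the corner-based distance is clearly positive, which is the kind of angular sensitivity an orientation-aware assignment rule would need.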