Abstract
Ocean observation faces a basic mismatch between the vastness of the ocean and the limited coverage of ocean sensors, which leaves the acquired sensor data highly sparse. Such data are typically spatially discrete, temporally discontinuous, and vertically stratified with water depth. Existing methods rely mainly on rule-based quality control, time series modeling, or conventional graph neural networks. Building on these methods, this paper exploits topological correlations and hierarchical feature modeling and proposes a depth-aware heterogeneous spatiotemporal graph neural network (DAHSGNN) for efficient anomaly detection and data correction on sparse ocean sensor data. DAHSGNN integrates discrete observations along the depth axis through a local graph construction method and uses hierarchical feature engineering to characterize the ocean's vertical stratification. A Gaussian hidden Markov model segments the water column into layers, and a Transformer encoder guided by the water-layer probabilities extracts intra- and inter-layer trend features. A bidirectional long short-term memory (BiLSTM) deep sequence encoder then captures the local dynamic context, enabling fine-grained modeling of vertical stratification. Finally, a heterogeneous graph autoencoder reconstructs the site-level data distribution. Experiments on multiple environmental variables from the International Seabed Authority (ISA) DeepData database show that DAHSGNN generalizes well across variables, achieves higher reconstruction accuracy than baseline methods, and significantly improves anomaly detection performance.
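
To make the water-layer segmentation step concrete, the following minimal sketch decodes the most likely layer sequence for a depth-ordered profile with a Gaussian hidden Markov model. It is an illustration only, not the DAHSGNN implementation: the function names (`gaussian_logpdf`, `viterbi_layers`), the fixed layer means and standard deviations, the `stay_prob` transition parameter, and the sample temperatures are all assumptions made here. In practice the model parameters would be learned from data, and the per-layer probabilities rather than hard labels would guide the downstream encoder.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def viterbi_layers(values, means, stds, stay_prob=0.9):
    """Return the most likely layer index for each depth-ordered value.

    Assumed (hypothetical) model: one Gaussian emission per layer, a uniform
    start distribution, and a transition matrix that keeps the current layer
    with probability stay_prob and splits the rest evenly over other layers.
    """
    n_states = len(means)
    log_stay = math.log(stay_prob)
    log_move = math.log((1.0 - stay_prob) / (n_states - 1))
    log_start = -math.log(n_states)

    # Best log score of any path ending in each state at the first sample.
    score = [log_start + gaussian_logpdf(values[0], means[k], stds[k])
             for k in range(n_states)]
    back = []  # back[t][k] = best predecessor of state k at sample t + 1

    for x in values[1:]:
        new_score, pointers = [], []
        for k in range(n_states):
            candidates = [score[j] + (log_stay if j == k else log_move)
                          for j in range(n_states)]
            best = max(range(n_states), key=candidates.__getitem__)
            pointers.append(best)
            new_score.append(candidates[best] + gaussian_logpdf(x, means[k], stds[k]))
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=score.__getitem__)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Illustrative temperature profile (deg C), ordered from surface to depth.
temperatures = [18.2, 18.0, 17.9, 12.5, 9.0, 6.1, 4.2, 4.1, 4.0]
layers = viterbi_layers(temperatures, means=[18.0, 10.0, 4.0], stds=[0.5, 3.0, 0.5])
print(layers)
```

The high self-transition probability discourages rapid switching between layers, so a single noisy reading is less likely to be assigned to a layer of its own.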