Abstract
Global Digital Elevation Models (DEMs) exhibit systematic biases constrained by acquisition geometry and surface penetration. This study evaluates whether the increasing complexity of geometric deep learning (e.g., Graph Neural Networks, GNNs) is justified by performance gains over established feature-engineering paradigms (e.g., XGBoost) under the constraints of sparse altimetry supervision. We established a rigorous comparative framework across four mainstream products (ALOS World 3D, Copernicus DEM, SRTM GL1, and TanDEM-X), using Sichuan Province, China, as a representative natural laboratory. Our results reveal a fundamental scale mismatch: the ~485 m average spacing of sampled altimetry footprints dwarfs the local terrain resolution, and despite their topological complexity, hybrid GNN models fail to establish a statistically significant accuracy advantage over the systematically optimized XGBoost baseline, demonstrating RMSE parity. Mechanistically, we uncover a critical divergence in decision logic: XGBoost relies on a stable "Physics Skeleton" consistently dominated by deterministic features (terrain aspect and vegetation density), whereas the GNNs exhibit severe "Attribution Stochasticity" (ρ ≈ 0.63–0.77). The GNN component acts as a residual-dependent latent-feature learner rather than discovering universal topological laws. We conclude that for geospatial regression tasks relying on sparse supervision, "Physics Trumps Geometry": a "Feature-First" paradigm that prioritizes robust, domain-knowledge-based physical descriptors outperforms the indeterminate complexity of "black box" architectures. This study underscores the imperative of prioritizing explanatory stability over marginal accuracy gains to foster trusted Geo-AI.