Abstract
Leveraging non-terrestrial networks for edge computing is crucial to the development of 6G, the Internet of Things, and ubiquitous digitalization. In such scenarios, task attributes are often continuously distributed, yet existing research predominantly classifies tasks by qualitative thresholds and thus cannot accommodate quantitatively continuous task requirements. To address this issue, this paper models a multi-task scenario with continuously distributed attributes and proposes a three-tier cloud-edge collaborative offloading architecture comprising unmanned aerial vehicle (UAV)-based edge nodes, low Earth orbit (LEO) satellites, and ground cloud data centers. We further formulate a system-cost minimization problem that jointly accounts for UAV network load balancing and satellite energy efficiency. To solve this non-convex, multi-stage optimization problem, we develop a two-layer multi-type-agent deep reinforcement learning (TMDRL) algorithm, which categorizes agents by their functional roles in the Markov decision process and jointly optimizes task offloading and resource allocation by combining the deep Q-network (DQN) and deep deterministic policy gradient (DDPG) frameworks. Simulation results demonstrate that the proposed algorithm reduces system cost by 7.82% compared with existing baseline methods.