Abstract
Among 5G and anticipated 6G technologies, non-orthogonal multiple access (NOMA) has attracted considerable attention because of its notable throughput advantages. Nevertheless, finding a near-optimal allocation of channel and power resources that maximizes the performance of a multi-cell NOMA system is challenging. In addition, because the wireless communication environment is complex and changes dynamically, and because near-optimal labels are unavailable, conventional supervised learning methods cannot be applied directly. To address these challenges, this paper proposes MDRL-UL, a framework that integrates multi-agent deep reinforcement learning with unsupervised learning to allocate channel and power resources in a near-optimal manner. In the framework, a multi-agent deep reinforcement learning neural network (MDRLNN) performs channel allocation, while an attention-based unsupervised learning neural network (ULNN) performs power allocation. Furthermore, the joint action (JA) derived from the MDRLNN for channel allocation is fed into the ULNN as an input representation for power allocation. To maximize the energy efficiency of the multi-cell NOMA system, the expectation of the energy efficiency is used to train both the MDRLNN and the ULNN. Simulation results indicate that the proposed MDRL-UL achieves higher energy efficiency and transmission rates than the other algorithms considered.
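
To make the energy-efficiency objective mentioned above concrete, the following is a minimal sketch of how such a quantity could be evaluated for the users sharing one channel. It is not the system model of this paper, which the abstract does not specify. It assumes a textbook downlink NOMA setting with successive interference cancellation (SIC), no inter-cell interference, distinct channel gains, and a fixed circuit power. The function name `noma_energy_efficiency` and all of its parameters are hypothetical and chosen only for illustration.

```python
import math


def noma_energy_efficiency(gains, powers, noise_power=1e-9,
                           circuit_power=1.0, bandwidth=1.0):
    """Energy efficiency (sum rate / total power) for users on one channel.

    Assumed model (not taken from the paper): downlink NOMA with ideal SIC,
    distinct channel gains, and no inter-cell interference.
    """
    if len(gains) != len(powers):
        raise ValueError("gains and powers must have the same length")
    total_power = sum(powers) + circuit_power
    if total_power <= 0:
        raise ValueError("total consumed power must be positive")

    total_rate = 0.0
    for g_i, p_i in zip(gains, powers):
        # After SIC, user i has removed the signals of weaker users and is
        # still interfered by the signals intended for stronger users.
        residual = sum(p_j for g_j, p_j in zip(gains, powers) if g_j > g_i)
        sinr = p_i * g_i / (g_i * residual + noise_power)
        total_rate += bandwidth * math.log2(1.0 + sinr)

    # Ratio of achieved sum rate to consumed power.
    return total_rate / total_power


# Hypothetical two-user example: the weaker user receives more power.
ee = noma_energy_efficiency(gains=[2.0, 1.0], powers=[1.0, 3.0],
                            noise_power=1.0, circuit_power=0.0)
print(f"energy efficiency: {ee:.4f}")
```

In a learning-based scheme of the kind described, a quantity like this, averaged over channel realizations, could serve as a label-free training signal. How the expectation is actually formed and used in MDRL-UL is not stated in the abstract.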