Abstract
With the rapid development of Internet of Things (IoT) technology, Wireless Powered Communication Networks (WPCNs) have emerged as a sustainable solution for powering IoT devices. This paper proposes a Deep Q-Network (DQN)-empowered dynamic collaborative resource management scheme that addresses limitations in prior WPCN research. Traditional linear energy harvesting models introduce significant errors because Radio Frequency to Direct Current (RF-DC) conversion exhibits nonlinear saturation effects; we therefore adopt a piecewise nonlinear harvesting model and formulate a multi-objective resource allocation problem as a Markov Decision Process (MDP). The objective function maximizes network utility while balancing energy efficiency against Jain's fairness index, and a closed-loop optimization framework integrates Gaussian Process Regression (GPR) for energy-harvest prediction. Theoretical contributions include: (1) a convergence proof for Q-learning under the Robbins-Monro conditions; (2) a Lyapunov stability analysis ensuring bounded energy queue errors; and (3) an O(N) computational complexity analysis establishing scalability. Simulation results for a 30-node network demonstrate that the proposed scheme extends network lifetime by 56.4% (from 117 to 183 rounds), reduces the standard deviation of energy allocation by 48.1% (from 23.7 mJ to 12.3 mJ), accelerates convergence by 53.1% (150 vs. 320 episodes), shortens adaptation to network changes by 66.7% (5 vs. 15 rounds), and increases throughput by 33.3% (80 vs. 60 Mbps). These results provide strong support for large-scale WPCN deployment.
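Since the abstract invokes both the piecewise nonlinear RF-DC harvesting model and Jain's fairness index, a minimal sketch of the two may help fix ideas. The abstract does not specify the paper's model parameters, so every constant below (sensitivity, saturation level, logistic shape) is a hypothetical placeholder; only the piecewise structure (dead zone, nonlinear region, saturation) and the standard Jain index formula are taken as given.

```python
import numpy as np

# Hypothetical rectifier parameters; the paper's actual values are not given
# in the abstract.
P_SENS = 0.1e-3   # receiver sensitivity (W): below this, nothing is harvested
P_SAT  = 10e-3    # input power (W) at which the rectifier saturates
P_MAX  = 4.8e-3   # maximum DC output power (W) at saturation
A, B   = 1500.0, 0.003  # logistic shape parameters of the assumed RF-DC curve

def harvested_power(p_in: float) -> float:
    """Piecewise nonlinear RF-DC conversion: dead zone, logistic region, saturation."""
    if p_in <= P_SENS:
        return 0.0
    if p_in >= P_SAT:
        return P_MAX
    # Normalized logistic between the sensitivity and saturation thresholds.
    sigma = 1.0 / (1.0 + np.exp(-A * (p_in - B)))
    sigma0 = 1.0 / (1.0 + np.exp(A * B))  # logistic value at zero input
    return P_MAX * (sigma - sigma0) / (1.0 - sigma0)

def jain_fairness(x: np.ndarray) -> float:
    """Jain's fairness index: ranges from 1/N (one node gets all) to 1 (even split)."""
    return x.sum() ** 2 / (len(x) * (x ** 2).sum())

# Example with 30 nodes, matching the simulated network size (allocation values
# here are random and purely illustrative).
alloc = np.random.uniform(0.5, 1.5, size=30)  # per-node allocated energy, mJ
print(f"Jain index: {jain_fairness(alloc):.3f}")
print(f"Harvested at 5 mW input: {harvested_power(5e-3) * 1e3:.2f} mW")
```

The sketch illustrates why a linear model misleads an allocator: above the saturation threshold, extra transmit power yields no additional harvested energy, which is precisely the regime the piecewise model captures and the DQN must learn to avoid.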