Abstract
Uncrewed Aerial Vehicles (UAVs), or drones, are revolutionizing applications such as surveillance, search and rescue (SAR), precision agriculture, and disaster response. In these applications, Flying Ad-Hoc Networks (FANETs), dynamic networks of interconnected UAVs, are essential for enabling rapid and adaptive deployments. However, the limited battery life of UAVs remains a critical challenge, particularly in extended missions such as SAR, where operational longevity and reliable performance are paramount. This paper addresses two key challenges in FANET-based UAV operations: optimizing energy consumption and ensuring stable communication links. We propose a novel strategy that models energy-efficient UAV behavior by focusing on the three primary drivers of power consumption: flight dynamics, payload operations, and continuous wireless communication, which collectively account for 80-85% of a UAV's energy usage. The strategy leverages a multi-agent deep reinforcement learning (MADRL) framework that intelligently manages a dedicated dual-battery system: one battery (B1) powers flight dynamics and payload operations, while the other (B2) powers the processor, sensors, and continuous wireless communication. The MADRL agents, based on Proximal Policy Optimization (PPO), manage battery B2 by dynamically coordinating communication between UAVs according to energy usage and environmental conditions during SAR operations, thereby conserving available energy and extending the battery life of each UAV within the FANET. Experimental results demonstrate that MADRL enhances network connectivity and significantly reduces energy wastage, enabling sustained operations over longer durations. This research lays the foundation for developing energy-efficient UAV systems, which are crucial for large-scale and autonomous deployments in mission-critical scenarios.