Abstract
For multi-user extremely large-scale MIMO (XL-MIMO) systems operating in frequency division duplex mode, this paper proposes a near-field beam training scheme built on a two-phase combinatorial multi-armed bandit (MAB) framework. The scheme integrates energy-aware user scheduling with hierarchical beam training to balance communication quality against device battery levels, thereby improving system energy efficiency and extending device lifetime. In the first phase, we design an energy-aware upper confidence bound (UCB) algorithm for user scheduling that accounts for user battery levels. The algorithm balances exploration and exploitation while prioritizing users with higher achievable rates and sufficient battery. In the second phase, two UCB algorithms perform beam training for the scheduled users in two layers. The first layer performs beam scanning with a discrete Fourier transform (DFT) codebook and applies a UCB algorithm to obtain initial angle information for the scheduled users. The second layer uses this angle information to construct a candidate set of polar-domain codewords, from which another UCB algorithm selects the optimal codewords. Simulations confirm the effectiveness of the scheme and show notable achievable-rate gains for multi-user communications.
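As a minimal illustration of the first-phase idea, the following sketch ranks users by a battery-weighted UCB index and schedules the top k. It is not the algorithm of this paper: the abstract does not specify the form of the index, so the multiplicative battery weighting, the `min_battery` cutoff, the exploration constant `c`, and all function names are assumptions introduced here for illustration only.

```python
import math


def energy_aware_ucb_select(mean_rate, pulls, battery, t, k,
                            min_battery=0.2, c=2.0):
    """Return up to k user indices ranked by a battery-weighted UCB index.

    mean_rate[u]: running average of the observed rate for user u
    pulls[u]:     number of times user u has been scheduled so far
    battery[u]:   normalized battery level in [0, 1]
    t:            current round index (t >= 1)
    The weighting below is a hypothetical choice, not taken from the paper.
    """
    scores = []
    for u in range(len(mean_rate)):
        if battery[u] < min_battery:
            # Assumed rule: users below the cutoff are never scheduled.
            scores.append(float("-inf"))
        elif pulls[u] == 0:
            # Unexplored users are tried once before any exploitation.
            scores.append(float("inf"))
        else:
            # Standard UCB exploration bonus, shrinking as pulls[u] grows.
            bonus = math.sqrt(c * math.log(t) / pulls[u])
            # Assumed energy awareness: scale the optimistic rate by battery.
            scores.append(battery[u] * (mean_rate[u] + bonus))
    ranked = sorted(range(len(scores)), key=lambda u: scores[u], reverse=True)
    return [u for u in ranked[:k] if scores[u] > float("-inf")]


def update(mean_rate, pulls, user, reward):
    """Incrementally update the running mean rate of a scheduled user."""
    pulls[user] += 1
    mean_rate[user] += (reward - mean_rate[user]) / pulls[user]
```

In each round, the base station would call `energy_aware_ucb_select`, observe the achievable rate of each scheduled user, and feed it back through `update`. The same select-and-update loop could in principle be reused in the second phase with codewords as arms, although the index would then not need a battery term.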