Abstract
The autonomous coordination of multi-robot systems in complex environments remains a fundamental challenge. Current Multi-Agent Reinforcement Learning (MARL) methods often struggle to reason effectively about the dynamic, causal relationships between agents and their surroundings. To address this, we introduce the Graph-Gated Transformer (GGT), a novel neural architecture that injects explicit relational priors directly into the self-attention mechanism for multi-robot coordination. The core mechanism of the GGT dynamically constructs a Tactical Relational Graph that encodes high-priority relationships such as collision risk and cooperative intent. This graph is then used to generate an explicit attention mask, compelling the Transformer to focus its reasoning exclusively on tactically relevant entities rather than engaging in brute-force pattern matching across all perceived objects. Integrated into a Centralized Training with Decentralized Execution (CTDE) framework with QMIX, our approach demonstrates substantial improvements in high-fidelity simulations. In complex scenarios with dynamic obstacles and sensor noise, our GGT-based system achieves 95.3% coverage area efficiency with only 0.4 collisions per episode, a stark contrast to the 60.3% coverage and 20.7 collisions of standard QMIX. Ablation studies confirm that this structured, gated attention mechanism, and not merely the presence of attention, is the key to unlocking robust collective autonomy. This work establishes that explicitly constraining the Transformer's attention space with dynamic, domain-aware relational graphs is a powerful and effective architectural solution for engineering safe and intelligent multi-robot systems.
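
The graph-gating idea described above can be sketched in a few lines: the relational graph supplies a binary adjacency mask, and attention logits for non-edges are suppressed before the softmax. The function and encoding below (`graph_gated_attention`, with `adj[i, j] = 1` meaning agent `i` may attend to entity `j`) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def graph_gated_attention(Q, K, V, adj):
    """Scaled dot-product attention restricted to the edges of a relational graph.

    Q: (n_agents, d) query vectors; K, V: (n_entities, d) key/value vectors.
    adj: (n_agents, n_entities) binary mask from the relational graph
         (hypothetical encoding: 1 = edge present, attention allowed).
    """
    d = Q.shape[-1]
    logits = Q @ K.T / np.sqrt(d)                      # standard attention scores
    logits = np.where(adj.astype(bool), logits, -1e9)  # gate: mask out non-edges
    # Numerically stable softmax over each agent's allowed entities
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights
```

With this masking, each agent's attention distribution places (effectively) zero weight on entities the graph excludes, which is the constraint the ablation studies attribute the performance gain to.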