Abstract
Accurately predicting the behavior of surrounding agents is crucial for the safe operation of autonomous vehicles. The dominant approach currently relies on manually defined rules, which often fail to cover all potential scenarios. To address the rigidity of these rules and the difficulty of generalizing them to real-world driving contexts, we introduce a novel attention mechanism called "Out Way Attention". By incorporating attention exits (out ways) into the attention framework, this mechanism improves the model's capacity to adapt dynamically to varied driving situations. We also present a new trajectory prediction framework that includes a learnable proposal matrix and a permutation-invariant positional encoding. The proposal matrix aids in forecasting multimodal future trajectories for multiple interacting agents in dynamic settings. The permutation-invariant positional encoding ensures that the order in which agents observed at the same time step are processed does not influence the prediction outcomes. Integrated into the Transformer architecture, the proposed methods reduce human intervention, significantly enhance the model's adaptability, and improve training and inference efficiency. We validate our model on three public datasets: Argoverse, Trajnet++, and ETH/UCY. The results confirm that our method sustains robust performance in complex, highly dynamic environments with multiple interacting agents.