Abstract
Accurate traffic flow forecasting plays a critical role in alleviating urban road congestion. Despite the success of existing models (e.g., graph-based or attention-based methods), three key limitations persist: (1) inflexible spatial dependency modeling, where static graph structures fail to adapt to dynamic traffic patterns; (2) decoupled spatiotemporal learning, where spatial and temporal correlations are processed separately, leading to information loss; and (3) limited long-term trend awareness, as traditional attention mechanisms overlook local contextual cues (e.g., rush-hour fluctuations). To address these limitations, this paper proposes STIL-TA, a traffic flow forecasting model based on Spatiotemporal Interactive Learning and Temporal Attention. The model improves forecasting accuracy by jointly modeling the spatial and temporal characteristics of road networks. Specifically, STIL-TA consists of two key components: (1) an interactive learning module built upon interactive dynamic graph convolution, which adopts a divide-and-conquer strategy to synchronize interactions and share dynamically captured spatiotemporal features across different time periods; and (2) a temporal multi-head trend-aware self-attention mechanism, which uses local contextual information to transform the numerical sequence, enabling it to capture dynamic temporal dependencies in traffic flow and improving long-term prediction accuracy. Experimental results on four real-world traffic datasets show that STIL-TA outperforms existing approaches in forecasting accuracy.