Abstract
The rapid proliferation of artificial intelligence applications in modern data centers demands intelligent resource management strategies that can effectively handle diverse workloads across heterogeneous computing infrastructures. This paper proposes an integrated framework that combines multi-head spatial-temporal attention mechanisms for workload prediction with dynamic resource allocation algorithms optimized for heterogeneous environments. The spatial-temporal attention architecture separately models temporal evolution patterns within individual workload streams and spatial correlations across concurrent task types, enabling accurate forecasting of resource demands. The allocation framework formulates resource assignment as a multi-objective optimization problem that jointly considers performance, energy efficiency, and utilization while explicitly accounting for prediction uncertainty. Experimental evaluation on real-world cluster traces demonstrates that our approach achieves 78.4% resource utilization with only 2.3% SLA violations, reduces average task completion time by 25.8%, and decreases energy consumption by 15.1% compared to production-grade baseline methods. The framework provides practical benefits for cloud service providers and enterprise data centers seeking to maximize infrastructure efficiency while maintaining service quality guarantees. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1038/s41598-026-38622-4.