Attention-based workload prediction and dynamic resource allocation for heterogeneous computing environments

Abstract

The rapid proliferation of artificial intelligence applications in modern data centers demands intelligent resource management strategies that can effectively handle diverse workloads across heterogeneous computing infrastructures. This paper proposes an integrated framework that combines multi-head spatial-temporal attention mechanisms for workload prediction with dynamic resource allocation algorithms optimized for heterogeneous environments. The spatial-temporal attention architecture separately models temporal evolution patterns within individual workload streams and spatial correlations across concurrent task types, enabling accurate forecasting of resource demands. The allocation framework formulates resource assignment as a multi-objective optimization problem that jointly considers performance, energy efficiency, and utilization while explicitly accounting for prediction uncertainty. Experimental evaluation on real-world cluster traces demonstrates that our approach achieves 78.4% resource utilization with only 2.3% SLA violations, reduces average task completion time by 25.8%, and decreases energy consumption by 15.1% compared to production-grade baseline methods. The framework provides practical benefits for cloud service providers and enterprise data centers seeking to maximize infrastructure efficiency while maintaining service quality guarantees.

Supplementary information: The online version contains supplementary material available at 10.1038/s41598-026-38622-4.
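
To make the idea of separate temporal and spatial attention more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single attention head, identity query/key/value projections, and a sequential composition (temporal attention first, then spatial attention); the abstract specifies none of these details. The function names and the toy input are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the inputs themselves
    (identity projections); a trained model would learn these projections.
    """
    dim = len(vectors[0])
    scale = math.sqrt(dim)
    outputs = []
    for query in vectors:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in vectors]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        outputs.append([sum(w * value[i] for w, value in zip(weights, vectors))
                        for i in range(dim)])
    return outputs


def spatial_temporal_attention(x):
    """x[t][s] is the feature vector of workload stream s at time step t.

    Assumption: temporal attention runs first (within each stream),
    then spatial attention (across streams at each time step).
    """
    steps, streams = len(x), len(x[0])
    temporal = [[None] * streams for _ in range(steps)]
    for s in range(streams):
        # Attend over the history of one workload stream.
        attended = self_attention([x[t][s] for t in range(steps)])
        for t in range(steps):
            temporal[t][s] = attended[t]
    # Attend over concurrent streams at each time step.
    return [self_attention(temporal[t]) for t in range(steps)]


# Hypothetical toy input: 4 time steps, 3 workload streams, 2 features each.
x = [[[0.1 * t + 0.2 * s, 0.05 * (t + s)] for s in range(3)] for t in range(4)]
out = spatial_temporal_attention(x)
```

The output keeps the input layout (time step, stream, feature), so a forecasting head could be applied to the last time step of each stream. Factorizing attention this way compares each element with the other time steps of its own stream and with the other streams at its own time step, which is cheaper than attending jointly over every (time, stream) pair.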
