A scalable automated framework for multiply-accumulate unit design in high-performance computing applications



Abstract

Multiply-Accumulate (MAC) units are essential components in real-time Digital Signal Processing (DSP) applications such as image filtering, audio processing, and neural networks, where speed and energy efficiency are critical. This work presents a scalable, automated Hardware Description Language (HDL) generation framework for efficient MAC unit design, enabling faster integration into System-on-Chip (SoC) architectures. A Python automation script generates synthesizable Verilog code for user-defined bit-widths and computation cycles. The framework integrates a Grouping and Decomposition (GD) multiplier with a carry-skip adder, increasing computational speed and reducing energy consumption through parallelized data processing. The generated Verilog code is functionally verified using Cadence® Nclaunch, while synthesis and physical implementation are performed using the Cadence® Genus and Innovus tools across multiple technology nodes. Experimental evaluation shows that the 8-bit GD-based MAC implemented in 180 nm technology achieves 14.96× lower power, a 9.86× improvement in Power Delay Product (PDP) and Energy per Operation (EOP), and a 6.52× improvement in Energy Delay Product (EDP) compared to existing designs. Likewise, the 16-bit MAC in 90 nm technology achieves a 70.08% reduction in delay and a 91.2% improvement in EDP. Overall, the proposed framework delivers a scalable, energy-efficient, and automation-driven solution for high-performance MAC design.
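To make the idea of script-driven HDL generation concrete, the following is a minimal sketch of how a Python function might emit a parameterized Verilog MAC for a chosen bit-width and number of accumulation cycles. It is not the authors' framework: the function name `generate_mac_verilog`, the module and port names, and the accumulator-sizing rule are all assumptions made for illustration, and the datapath uses plain behavioral `*` and `+` operators rather than the GD multiplier and carry-skip adder described in the abstract.

```python
from string import Template

# Hypothetical Verilog template; module and port names are illustrative only.
# The datapath is behavioral (a * b, +), NOT the paper's GD multiplier or
# carry-skip adder.
_MAC_TEMPLATE = Template("""\
module $name #(
    parameter WIDTH = $width,
    parameter ACC_WIDTH = $acc_width
) (
    input  wire                 clk,
    input  wire                 rst,
    input  wire [WIDTH-1:0]     a,
    input  wire [WIDTH-1:0]     b,
    output reg  [ACC_WIDTH-1:0] acc
);
    // Synchronous reset, then accumulate one unsigned product per clock.
    always @(posedge clk) begin
        if (rst)
            acc <= {ACC_WIDTH{1'b0}};
        else
            acc <= acc + a * b;
    end
endmodule
""")


def generate_mac_verilog(width: int, cycles: int, name: str = "mac_unit") -> str:
    """Return Verilog source for an unsigned behavioral MAC.

    width  -- operand bit-width
    cycles -- number of products accumulated before a reset (assumed usage)
    """
    if width < 1 or cycles < 1:
        raise ValueError("width and cycles must be positive integers")
    # Each unsigned product fits in 2*width bits; summing `cycles` of them
    # needs ceil(log2(cycles)) extra guard bits to avoid overflow.
    guard_bits = (cycles - 1).bit_length()
    acc_width = 2 * width + guard_bits
    return _MAC_TEMPLATE.substitute(name=name, width=width, acc_width=acc_width)


if __name__ == "__main__":
    # Emit an 8-bit MAC sized for 16 accumulations.
    print(generate_mac_verilog(8, 16))
```

Calling `generate_mac_verilog(8, 16)` yields a module with an 8-bit operand width and a 20-bit accumulator (16 product bits plus 4 guard bits). Sizing the accumulator from the cycle count is one possible reading of "user-defined computation cycles"; the paper may define that parameter differently.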
