TY - GEN
T1 - An Efficient FPGA-based Floating Random Walk Solver for Capacitance Extraction using SDAccel
AU - Wei, Xin
AU - Yan, Changhao
AU - Zhou, Hai
AU - Zhou, Dian
AU - Zeng, Xuan
PY - 2019/5/14
Y1 - 2019/5/14
N2 - The floating random walk (FRW) algorithm is widely used for capacitance extraction of very large-scale integration (VLSI) interconnects. FRW can become both time-consuming and power-consuming as the circuit scale grows. However, its highly parallel nature motivates accelerating it with FPGAs, which have shown great potential in performance and energy efficiency compared with other computing architectures. In this paper, we propose a scalable FPGA/CPU heterogeneous framework for FRW using SDAccel. Large-scale circuits are first partitioned by the CPU into several segments, which are then sent to the FPGA one by one for random walking. The framework addresses the challenge of limited FPGA on-chip resources and combines the merits of FPGAs and CPUs by mapping separate parts of the algorithm to the suitable architecture; the FPGA bitstream is built once and for all. Several kernel optimization strategies are used to maximize FPGA performance. Moreover, the FRW algorithm we use is the naive version with walking on spheres (WOS), which is much simpler and easier to implement than the heavily optimized version with walking on cubes (WOC). The implementation on AWS EC2 F1 (Xilinx VU9P FPGA) achieves up to 6.1x performance and 42.6x energy efficiency over a quad-core CPU, and 5.2x energy efficiency over the state-of-the-art WOC implementation on an 8-core CPU.
AB - The floating random walk (FRW) algorithm is widely used for capacitance extraction of very large-scale integration (VLSI) interconnects. FRW can become both time-consuming and power-consuming as the circuit scale grows. However, its highly parallel nature motivates accelerating it with FPGAs, which have shown great potential in performance and energy efficiency compared with other computing architectures. In this paper, we propose a scalable FPGA/CPU heterogeneous framework for FRW using SDAccel. Large-scale circuits are first partitioned by the CPU into several segments, which are then sent to the FPGA one by one for random walking. The framework addresses the challenge of limited FPGA on-chip resources and combines the merits of FPGAs and CPUs by mapping separate parts of the algorithm to the suitable architecture; the FPGA bitstream is built once and for all. Several kernel optimization strategies are used to maximize FPGA performance. Moreover, the FRW algorithm we use is the naive version with walking on spheres (WOS), which is much simpler and easier to implement than the heavily optimized version with walking on cubes (WOC). The implementation on AWS EC2 F1 (Xilinx VU9P FPGA) achieves up to 6.1x performance and 42.6x energy efficiency over a quad-core CPU, and 5.2x energy efficiency over the state-of-the-art WOC implementation on an 8-core CPU.
UR - http://www.scopus.com/inward/record.url?scp=85066620523&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85066620523&partnerID=8YFLogxK
U2 - 10.23919/DATE.2019.8714992
DO - 10.23919/DATE.2019.8714992
M3 - Conference contribution
AN - SCOPUS:85066620523
T3 - Proceedings of the 2019 Design, Automation and Test in Europe Conference and Exhibition, DATE 2019
SP - 1040
EP - 1045
BT - Proceedings of the 2019 Design, Automation and Test in Europe Conference and Exhibition, DATE 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 22nd Design, Automation and Test in Europe Conference and Exhibition, DATE 2019
Y2 - 25 March 2019 through 29 March 2019
ER -