A Systolic Neural CPU Processor Combining Deep Learning and General-Purpose Computing with Enhanced Data Locality and End-to-End Performance

Yuhao Ju*, Jie Gu

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

10 Scopus citations

Abstract

Although neural network (NN) accelerators have advanced significantly in recent years, a CPU is still essential for data management and for pre- and post-processing in the commonly used heterogeneous architecture, which typically pairs an NN accelerator with a processor core and moves data between them through a direct memory access (DMA) engine. This work presents a systolic neural CPU processor (SNCPU), a unified architecture that combines deep learning and RISC-V (fifth-generation reduced instruction set computer) general-purpose computing to improve end-to-end performance on machine learning (ML) tasks relative to a conventional heterogeneous architecture with a separate CPU and accelerator. With 64%-80% logic reuse of the processing elements (PEs) and 10% area overhead, the SNCPU can be configured into ten RISC-V CPU cores. A special bi-directional dataflow and four different working modes are developed to raise the utilization of the deep NN (DNN) accelerator and to eliminate the expensive data transfer between the CPU and the DNN accelerator found in existing heterogeneous architectures. A 65-nm test chip was fabricated, demonstrating a 39%-64% performance improvement on end-to-end image classification tasks for the ImageNet, CIFAR-10, and MNIST datasets, with over 95% PE utilization and power efficiency of up to 1.8 TOPS/W.

Original language: English (US)
Pages (from-to): 216-226
Number of pages: 11
Journal: IEEE Journal of Solid-State Circuits
Volume: 58
Issue number: 1
State: Published - Jan 1 2023

Keywords

  • Bi-directional dataflow
  • CPU
  • deep neural network (DNN) accelerator
  • end-to-end performance
  • general-purpose computing on fifth-generation reduced instruction set computer (RISC-V)
  • heterogeneous architecture
  • machine learning (ML)

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
