NCPU: An embedded neural CPU architecture on resource-constrained low power devices for real-time end-to-end performance

Tianyu Jia, Yuhao Ju, Russ Joseph, Jie Gu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Scopus citations

Abstract

Machine learning inference has become an essential task for embedded edge devices, requiring the deployment of costly deep neural network accelerators onto extremely resource-constrained hardware. Although many optimization strategies have been proposed to improve the efficiency of standalone accelerators, optimizing the end-to-end performance of a computing device with heterogeneous cores remains challenging and is often overlooked, especially for low power devices. In this paper, we propose a unified reconfigurable architecture, referred to as Neural CPU (NCPU), for low-cost embedded systems. The proposed architecture is built on a binary neural network accelerator with the capability to emulate an in-order RISC-V CPU pipeline. The NCPU supports the flexible programmability of RISC-V and keeps data local to avoid costly core-to-core data transfer. A two-core NCPU SoC is designed and fabricated in a 65 nm CMOS process. Compared with the conventional heterogeneous architecture, a single NCPU achieves a 35% area reduction and a 12% energy saving at 0.4 V, which makes it suitable for low power and low-cost embedded edge devices. The NCPU design also supports smooth switching between general-purpose CPU operation and binary neural network inference, so that the cores can be fully utilized. The implemented two-core NCPU SoC achieves an end-to-end performance speed-up of 43%, or an equivalent 74% energy saving, based on use cases of real-time image classification and motion detection.

Original language: English (US)
Title of host publication: Proceedings - 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2020
Publisher: IEEE Computer Society
Pages: 1097-1109
Number of pages: 13
ISBN (Electronic): 9781728173832
State: Published - Oct 2020
Event: 53rd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2020 - Virtual, Athens, Greece
Duration: Oct 17 2020 – Oct 21 2020

Publication series

Name: Proceedings of the Annual International Symposium on Microarchitecture, MICRO
Volume: 2020-October
ISSN (Print): 1072-4451

Conference

Conference: 53rd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2020
Country/Territory: Greece
City: Virtual, Athens
Period: 10/17/20 – 10/21/20

Keywords

  • Binary neural network
  • Embedded systems
  • End-to-end performance
  • RISC-V
  • Reconfigurable architecture
  • SoC silicon validation
  • Ultra-low power device

ASJC Scopus subject areas

  • Hardware and Architecture
