NCPU: An embedded neural CPU architecture on resource-constrained low power devices for real-time end-to-end performance

Tianyu Jia, Yuhao Ju, Russ Joseph, Jie Gu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

11 Scopus citations

Abstract

Machine learning inference has become an essential task for embedded edge devices, requiring the deployment of costly deep neural network accelerators onto extremely resource-constrained hardware. Although many optimization strategies have been proposed to improve the efficiency of standalone accelerators, optimizing the end-to-end performance of a computing device with heterogeneous cores is still challenging and often overlooked, especially for low power devices. In this paper, we propose a unified reconfigurable architecture, referred to as Neural CPU (NCPU), for low-cost embedded systems. The proposed architecture is built on a binary neural network accelerator with the capability to emulate an in-order RISC-V CPU pipeline. The NCPU supports the flexible programmability of RISC-V and maintains data locally to avoid costly core-to-core data transfer. A two-core NCPU SoC is designed and fabricated in a 65nm CMOS process. Compared with the conventional heterogeneous architecture, a single NCPU achieves 35% area reduction and 12% energy saving at 0.4V, which is suitable for low power and low-cost embedded edge devices. The NCPU design also features the capability of smooth switching between general-purpose CPU operation and binary neural network inference to realize full utilization of the cores. The implemented two-core NCPU SoC achieves an end-to-end performance speed-up of 43% or an equivalent 74% energy saving based on use cases of real-time image classification and motion detection.

Original language: English (US)
Title of host publication: Proceedings - 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2020
Publisher: IEEE Computer Society
Pages: 1097-1109
Number of pages: 13
ISBN (Electronic): 9781728173832
State: Published - Oct 2020
Event: 53rd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2020 - Virtual, Athens, Greece
Duration: Oct 17 2020 to Oct 21 2020

Publication series

Name: Proceedings of the Annual International Symposium on Microarchitecture, MICRO
Volume: 2020-October
ISSN (Print): 1072-4451

Conference

Conference: 53rd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2020
Country/Territory: Greece
City: Virtual, Athens
Period: 10/17/20 to 10/21/20

Funding

We wish to thank the anonymous reviewers for their helpful feedback. We thank Kendall Kuzminskas for her contribution in preparing some testing programs. This paper is supported in part by the National Science Foundation under grant number CCF-1908488.

Keywords

  • Binary neural network
  • Embedded systems
  • End-to-end performance
  • RISC-V
  • Reconfigurable architecture
  • SoC silicon validation
  • Ultra-low power device

ASJC Scopus subject areas

  • Hardware and Architecture

