19.4 An Adaptive Clock Management Scheme Exploiting Instruction-Based Dynamic Timing Slack for a General-Purpose Graphics Processor Unit with Deep Pipeline and Out-of-Order Execution

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Cycle-by-cycle dynamic timing slack (DTS), which represents extra timing margin from the critical-path timing slack reported by the static timing analysis (STA), has been observed at both program level and instruction level. Conventional dynamic voltage and frequency scaling (DVFS) works at the program level and does not provide adequate frequency-scaling granularity for instruction-level timing management [1]. Razor-based techniques leverage error detection to exploit the DTS on a cycle-by-cycle basis [2]. However, it requires additional error-detection circuits and architecture-level co-design for error recovery [3]. Supply droop-based adaptive clocking was used to reduce timing margin under PVT variation, but does not address the instruction-level timing variation [4]. Recently, instruction-based adaptive clock schemes have been introduced to enhance a CPU's operation [5-6]. For example, instruction types at the execution stage were used to provide timing control for a simple pipeline structure. However, this scheme lacks adequate consideration for other pipeline stages whose timing may not be opcode dependent [5]. In [6], the instruction-execution sequence was evaluated at the compiler level with the timing encoded into the instruction code. The scheme considers all pipeline stages but relies on in-order execution of instructions for proper timing encoding from the compiler.

Original languageEnglish (US)
Title of host publication2019 IEEE International Solid-State Circuits Conference, ISSCC 2019
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages318-320
Number of pages3
ISBN (Electronic)9781538685310
DOIs
StatePublished - Mar 6 2019
Event2019 IEEE International Solid-State Circuits Conference, ISSCC 2019 - San Francisco, United States
Duration: Feb 17 2019Feb 21 2019

Publication series

NameDigest of Technical Papers - IEEE International Solid-State Circuits Conference
Volume2019-February
ISSN (Print)0193-6530

Conference

Conference2019 IEEE International Solid-State Circuits Conference, ISSCC 2019
CountryUnited States
CitySan Francisco
Period2/17/192/21/19

Fingerprint

Clocks
Pipelines
Error detection
Program processors
Networks (circuits)
Voltage scaling
Dynamic frequency scaling

ASJC Scopus subject areas

  • Electronic, Optical and Magnetic Materials
  • Electrical and Electronic Engineering

Cite this

Jia, T., Joseph, R. E., & Gu, J. (2019). 19.4 An Adaptive Clock Management Scheme Exploiting Instruction-Based Dynamic Timing Slack for a General-Purpose Graphics Processor Unit with Deep Pipeline and Out-of-Order Execution. In 2019 IEEE International Solid-State Circuits Conference, ISSCC 2019 (pp. 318-320). [8662389] (Digest of Technical Papers - IEEE International Solid-State Circuits Conference; Vol. 2019-February). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/ISSCC.2019.8662389
Jia, Tianyu ; Joseph, Russell E ; Gu, Jie. / 19.4 An Adaptive Clock Management Scheme Exploiting Instruction-Based Dynamic Timing Slack for a General-Purpose Graphics Processor Unit with Deep Pipeline and Out-of-Order Execution. 2019 IEEE International Solid-State Circuits Conference, ISSCC 2019. Institute of Electrical and Electronics Engineers Inc., 2019. pp. 318-320 (Digest of Technical Papers - IEEE International Solid-State Circuits Conference).
@inproceedings{e594593df8d24fbda516df556ced0f10,
title = "19.4 An Adaptive Clock Management Scheme Exploiting Instruction-Based Dynamic Timing Slack for a General-Purpose Graphics Processor Unit with Deep Pipeline and Out-of-Order Execution",
abstract = "Cycle-by-cycle dynamic timing slack (DTS), which represents extra timing margin from the critical-path timing slack reported by the static timing analysis (STA), has been observed at both program level and instruction level. Conventional dynamic voltage and frequency scaling (DVFS) works at the program level and does not provide adequate frequency-scaling granularity for instruction-level timing management [1]. Razor-based techniques leverage error detection to exploit the DTS on a cycle-by-cycle basis [2]. However, it requires additional error-detection circuits and architecture-level co-design for error recovery [3]. Supply droop-based adaptive clocking was used to reduce timing margin under PVT variation, but does not address the instruction-level timing variation [4]. Recently, instruction-based adaptive clock schemes have been introduced to enhance a CPU's operation [5-6]. For example, instruction types at the execution stage were used to provide timing control for a simple pipeline structure. However, this scheme lacks adequate consideration for other pipeline stages whose timing may not be opcode dependent [5]. In [6], the instruction-execution sequence was evaluated at the compiler level with the timing encoded into the instruction code. The scheme considers all pipeline stages but relies on in-order execution of instructions for proper timing encoding from the compiler.",
author = "Tianyu Jia and Joseph, {Russell E} and Jie Gu",
year = "2019",
month = "3",
day = "6",
doi = "10.1109/ISSCC.2019.8662389",
language = "English (US)",
series = "Digest of Technical Papers - IEEE International Solid-State Circuits Conference",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "318--320",
booktitle = "2019 IEEE International Solid-State Circuits Conference, ISSCC 2019",
address = "United States",

}

Jia, T, Joseph, RE & Gu, J 2019, 19.4 An Adaptive Clock Management Scheme Exploiting Instruction-Based Dynamic Timing Slack for a General-Purpose Graphics Processor Unit with Deep Pipeline and Out-of-Order Execution. in 2019 IEEE International Solid-State Circuits Conference, ISSCC 2019., 8662389, Digest of Technical Papers - IEEE International Solid-State Circuits Conference, vol. 2019-February, Institute of Electrical and Electronics Engineers Inc., pp. 318-320, 2019 IEEE International Solid-State Circuits Conference, ISSCC 2019, San Francisco, United States, 2/17/19. https://doi.org/10.1109/ISSCC.2019.8662389

19.4 An Adaptive Clock Management Scheme Exploiting Instruction-Based Dynamic Timing Slack for a General-Purpose Graphics Processor Unit with Deep Pipeline and Out-of-Order Execution. / Jia, Tianyu; Joseph, Russell E; Gu, Jie.

2019 IEEE International Solid-State Circuits Conference, ISSCC 2019. Institute of Electrical and Electronics Engineers Inc., 2019. p. 318-320 8662389 (Digest of Technical Papers - IEEE International Solid-State Circuits Conference; Vol. 2019-February).

Research output: Chapter in Book/Report/Conference proceedingConference contribution

TY - GEN

T1 - 19.4 An Adaptive Clock Management Scheme Exploiting Instruction-Based Dynamic Timing Slack for a General-Purpose Graphics Processor Unit with Deep Pipeline and Out-of-Order Execution

AU - Jia, Tianyu

AU - Joseph, Russell E

AU - Gu, Jie

PY - 2019/3/6

Y1 - 2019/3/6

N2 - Cycle-by-cycle dynamic timing slack (DTS), which represents extra timing margin from the critical-path timing slack reported by the static timing analysis (STA), has been observed at both program level and instruction level. Conventional dynamic voltage and frequency scaling (DVFS) works at the program level and does not provide adequate frequency-scaling granularity for instruction-level timing management [1]. Razor-based techniques leverage error detection to exploit the DTS on a cycle-by-cycle basis [2]. However, it requires additional error-detection circuits and architecture-level co-design for error recovery [3]. Supply droop-based adaptive clocking was used to reduce timing margin under PVT variation, but does not address the instruction-level timing variation [4]. Recently, instruction-based adaptive clock schemes have been introduced to enhance a CPU's operation [5-6]. For example, instruction types at the execution stage were used to provide timing control for a simple pipeline structure. However, this scheme lacks adequate consideration for other pipeline stages whose timing may not be opcode dependent [5]. In [6], the instruction-execution sequence was evaluated at the compiler level with the timing encoded into the instruction code. The scheme considers all pipeline stages but relies on in-order execution of instructions for proper timing encoding from the compiler.

AB - Cycle-by-cycle dynamic timing slack (DTS), which represents extra timing margin from the critical-path timing slack reported by the static timing analysis (STA), has been observed at both program level and instruction level. Conventional dynamic voltage and frequency scaling (DVFS) works at the program level and does not provide adequate frequency-scaling granularity for instruction-level timing management [1]. Razor-based techniques leverage error detection to exploit the DTS on a cycle-by-cycle basis [2]. However, it requires additional error-detection circuits and architecture-level co-design for error recovery [3]. Supply droop-based adaptive clocking was used to reduce timing margin under PVT variation, but does not address the instruction-level timing variation [4]. Recently, instruction-based adaptive clock schemes have been introduced to enhance a CPU's operation [5-6]. For example, instruction types at the execution stage were used to provide timing control for a simple pipeline structure. However, this scheme lacks adequate consideration for other pipeline stages whose timing may not be opcode dependent [5]. In [6], the instruction-execution sequence was evaluated at the compiler level with the timing encoded into the instruction code. The scheme considers all pipeline stages but relies on in-order execution of instructions for proper timing encoding from the compiler.

UR - http://www.scopus.com/inward/record.url?scp=85063433545&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85063433545&partnerID=8YFLogxK

U2 - 10.1109/ISSCC.2019.8662389

DO - 10.1109/ISSCC.2019.8662389

M3 - Conference contribution

T3 - Digest of Technical Papers - IEEE International Solid-State Circuits Conference

SP - 318

EP - 320

BT - 2019 IEEE International Solid-State Circuits Conference, ISSCC 2019

PB - Institute of Electrical and Electronics Engineers Inc.

ER -

Jia T, Joseph RE, Gu J. 19.4 An Adaptive Clock Management Scheme Exploiting Instruction-Based Dynamic Timing Slack for a General-Purpose Graphics Processor Unit with Deep Pipeline and Out-of-Order Execution. In 2019 IEEE International Solid-State Circuits Conference, ISSCC 2019. Institute of Electrical and Electronics Engineers Inc. 2019. p. 318-320. 8662389. (Digest of Technical Papers - IEEE International Solid-State Circuits Conference). https://doi.org/10.1109/ISSCC.2019.8662389