SCP: Synergistic cache compression and prefetching

Bhargavraj Patel, Nikos Hardavellas, Gokhan Memik

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

While processor caches cannot grow arbitrarily large due to area, power, and latency considerations, dataset sizes grow faster than Moore's Law and pressure caches to grow to accommodate the increasing working sets. Cache compression partially mitigates this problem by providing an effective cache capacity larger than the physical capacity of the cache, but the prevalent rule of thumb dictates that the miss rate falls only with the square root of the additional cache capacity. Data prefetching and streaming engines can make better use of the cache space, but sophisticated schemes typically require significant on-chip storage, and some even save part of the history they track in main memory (e.g., Spatio-Temporal Memory Streaming, STEMS), oversubscribing the already limited off-chip bandwidth. In this paper we present synergistic cache compression and prefetching (SCP), a technique that uses the cache space saved by cache compression to implement the storage arrays required by data prefetching and streaming engines. SCP outperforms cache-compression-only and data-streaming-only schemes, and approximates the performance of a combined scheme that employs both cache compression and data streaming in hardware, but without the overhead of the additional history and storage arrays for the streaming engine. Using the cache compression hardware to compress the storage arrays of a STEMS streaming engine, in addition to the data cache, allows the streaming engine to operate entirely on-chip in the space saved by compressing the cache, which obviates the need to offload parts of the history to main memory and further increases performance.
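The square-root rule of thumb mentioned in the abstract can be made concrete with a short sketch. It assumes the common formulation in which the miss rate is proportional to 1/sqrt(capacity); the function name `relative_miss_rate` and the capacity factors are illustrative choices, not taken from the paper.

```python
import math


def relative_miss_rate(capacity_factor: float) -> float:
    """Miss rate relative to the baseline cache, ASSUMING the rule of thumb
    miss_rate ~ 1 / sqrt(capacity). capacity_factor is the ratio of
    effective (compressed) capacity to physical capacity."""
    if capacity_factor <= 0:
        raise ValueError("capacity_factor must be positive")
    return 1.0 / math.sqrt(capacity_factor)


if __name__ == "__main__":
    # Hypothetical effective-capacity gains from compression.
    for factor in (1.0, 1.5, 2.0, 4.0):
        rel = relative_miss_rate(factor)
        print(f"{factor:.1f}x capacity -> {rel:.3f}x misses "
              f"({(1.0 - rel) * 100:.1f}% fewer)")
```

Under this assumption, even doubling the effective capacity removes only about 29% of the misses, which is the diminishing return that motivates spending the saved space on prefetcher storage.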

Original language: English (US)
Title of host publication: Proceedings of the 33rd IEEE International Conference on Computer Design, ICCD 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 164-171
Number of pages: 8
ISBN (Electronic): 9781467371650
DOIs: https://doi.org/10.1109/ICCD.2015.7357098
State: Published - Dec 14 2015
Event: 33rd IEEE International Conference on Computer Design, ICCD 2015 - New York City, United States
Duration: Oct 18 2015 to Oct 21 2015

Publication series

Name: Proceedings of the 33rd IEEE International Conference on Computer Design, ICCD 2015

Other

Other: 33rd IEEE International Conference on Computer Design, ICCD 2015
Country: United States
City: New York City
Period: 10/18/15 to 10/21/15

Fingerprint

Engines
Data storage equipment
Hardware
Bandwidth

Keywords

  • cache compression
  • prefetching
  • processor cache
  • spatio-temporal data streaming

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Computer Science Applications

Cite this

Patel, B., Hardavellas, N., & Memik, G. (2015). SCP: Synergistic cache compression and prefetching. In Proceedings of the 33rd IEEE International Conference on Computer Design, ICCD 2015 (pp. 164-171). [7357098] (Proceedings of the 33rd IEEE International Conference on Computer Design, ICCD 2015). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICCD.2015.7357098
@inproceedings{9b2a212fc95d4811b2cb5a424187dd3a,
title = "SCP: Synergistic cache compression and prefetching",
keywords = "cache compression, prefetching, processor cache, spatio-Temporal data streaming",
author = "Bhargavraj Patel and Nikos Hardavellas and Gokhan Memik",
year = "2015",
month = "12",
day = "14",
doi = "10.1109/ICCD.2015.7357098",
language = "English (US)",
series = "Proceedings of the 33rd IEEE International Conference on Computer Design, ICCD 2015",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "164--171",
booktitle = "Proceedings of the 33rd IEEE International Conference on Computer Design, ICCD 2015",
address = "United States",

}

TY - GEN

T1 - SCP

T2 - Synergistic cache compression and prefetching

AU - Patel, Bhargavraj

AU - Hardavellas, Nikos

AU - Memik, Gokhan

PY - 2015/12/14

Y1 - 2015/12/14

KW - cache compression

KW - prefetching

KW - processor cache

KW - spatio-Temporal data streaming

UR - http://www.scopus.com/inward/record.url?scp=84962467083&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84962467083&partnerID=8YFLogxK

U2 - 10.1109/ICCD.2015.7357098

DO - 10.1109/ICCD.2015.7357098

M3 - Conference contribution

AN - SCOPUS:84962467083

T3 - Proceedings of the 33rd IEEE International Conference on Computer Design, ICCD 2015

SP - 164

EP - 171

BT - Proceedings of the 33rd IEEE International Conference on Computer Design, ICCD 2015

PB - Institute of Electrical and Electronics Engineers Inc.

ER -
