TY - GEN
T1 - SCP
T2 - 33rd IEEE International Conference on Computer Design, ICCD 2015
AU - Patel, Bhargavraj
AU - Hardavellas, Nikos
AU - Memik, Gokhan
N1 - Funding Information:
This work is partially supported by NSF award CCF-1218768, NSF CAREER award CCF-1453853, and DoE award DE-SC0012531.
Publisher Copyright:
© 2015 IEEE.
Copyright:
Copyright 2016 Elsevier B.V., All rights reserved.
PY - 2015/12/14
Y1 - 2015/12/14
AB - While processor caches cannot grow arbitrarily large due to area, power, and latency considerations, dataset sizes grow faster than Moore's Law and pressure caches to grow to accommodate the increasing working sets. Cache compression partially mitigates this problem by providing an effective cache capacity larger than the physical capacity of the cache, but the prevalent rule of thumb dictates that the miss rate lowers by only the square root of the additional cache capacity. Data prefetching and streaming engines can offer a better utilization of the cache space, but sophisticated schemes typically require significant on-chip space, and some even save part of the history they track in main memory (e.g., Spatio-Temporal Memory Streaming, or STEMS) and oversubscribe the already limited off-chip bandwidth. In this paper we present synergistic cache compression and prefetching (SCP), a technique that utilizes the cache space saved by cache compression to implement the storage arrays required by data prefetching and streaming engines. SCP outperforms cache-compression-only and data-streaming-only schemes, and approximates the performance of a combined scheme that employs both cache compression and data streaming in hardware, but without the overhead of the additional history and storage arrays for the streaming engine. Utilizing the cache compression hardware to compress the storage arrays for a STEMS streaming engine, in addition to the data cache, allows the streaming engine to operate entirely on-chip using space saved by compressing the cache, obviating the need to offload parts of the history to main memory and further increasing performance.
KW - cache compression
KW - prefetching
KW - processor cache
KW - spatio-temporal data streaming
UR - http://www.scopus.com/inward/record.url?scp=84962467083&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84962467083&partnerID=8YFLogxK
U2 - 10.1109/ICCD.2015.7357098
DO - 10.1109/ICCD.2015.7357098
M3 - Conference contribution
AN - SCOPUS:84962467083
T3 - Proceedings of the 33rd IEEE International Conference on Computer Design, ICCD 2015
SP - 164
EP - 171
BT - Proceedings of the 33rd IEEE International Conference on Computer Design, ICCD 2015
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 18 October 2015 through 21 October 2015
ER -