TY - GEN
T1 - Memory coherence activity prediction in commercial workloads
AU - Somogyi, Stephen
AU - Wenisch, Thomas F.
AU - Hardavellas, Nikolaos
AU - Kim, Jangwoo
AU - Ailamaki, Anastassia
AU - Falsafi, Babak
PY - 2004
Y1 - 2004
N2 - Recent research indicates that prediction-based coherence optimizations offer substantial performance improvements for scientific applications in distributed shared memory multiprocessors. Important commercial applications also show sensitivity to coherence latency, which will become more acute in the future as technology scales. Therefore it is important to investigate prediction of memory coherence activity in the context of commercial workloads. This paper studies a trace-based Downgrade Predictor (DGP) for predicting last stores to shared cache blocks, and a pattern-based Consumer Set Predictor (CSP) for predicting subsequent readers. We evaluate this class of predictors for the first time on commercial applications and demonstrate that our DGP correctly predicts 47%-76% of last stores. Memory sharing patterns in commercial workloads are inherently non-repetitive; hence CSP cannot attain high coverage. We perform an opportunity study of a DGP enhanced through competitive underlying predictors, and in commercial and scientific applications, demonstrate potential to increase coverage up to 14%.
AB - Recent research indicates that prediction-based coherence optimizations offer substantial performance improvements for scientific applications in distributed shared memory multiprocessors. Important commercial applications also show sensitivity to coherence latency, which will become more acute in the future as technology scales. Therefore it is important to investigate prediction of memory coherence activity in the context of commercial workloads. This paper studies a trace-based Downgrade Predictor (DGP) for predicting last stores to shared cache blocks, and a pattern-based Consumer Set Predictor (CSP) for predicting subsequent readers. We evaluate this class of predictors for the first time on commercial applications and demonstrate that our DGP correctly predicts 47%-76% of last stores. Memory sharing patterns in commercial workloads are inherently non-repetitive; hence CSP cannot attain high coverage. We perform an opportunity study of a DGP enhanced through competitive underlying predictors, and in commercial and scientific applications, demonstrate potential to increase coverage up to 14%.
KW - coherence misses
KW - coherence prediction
KW - commercial workloads
KW - sharing patterns
KW - trace-based prediction
UR - http://www.scopus.com/inward/record.url?scp=77954446302&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77954446302&partnerID=8YFLogxK
U2 - 10.1145/1054943.1054949
DO - 10.1145/1054943.1054949
M3 - Conference contribution
AN - SCOPUS:77954446302
SN - 159593040X
SN - 9781595930408
T3 - ACM International Conference Proceeding Series
SP - 37
EP - 45
BT - Proceedings of the 3rd Workshop on Memory Performance Issues, WMPI '04, in Conjunction with the 31st International Symposium on Computer Architecture
T2 - 3rd Workshop on Memory Performance Issues, WMPI '04, in Conjunction with the 31st International Symposium on Computer Architecture
Y2 - 20 June 2004 through 20 June 2004
ER -