TY - GEN
T1 - Unconventional parallelization of nondeterministic applications
AU - Deiana, Enrico A.
AU - St Amour, Vincent
AU - Dinda, Peter A.
AU - Hardavellas, Nikos
AU - Campanoni, Simone
PY - 2018/3/19
Y1 - 2018/3/19
N2 - The demand for thread-level parallelism (TLP) on commodity processors is endless as it is essential for gaining performance and saving energy. However, TLP in today's programs is limited by dependences that must be satisfied at run time. We have found that for nondeterministic programs, some of these actual dependences can be satisfied with alternative data that can be generated in parallel, thus boosting the program's TLP. Satisfying these dependences with alternative data nonetheless produces final outputs that match those of the original nondeterministic program. To demonstrate the practicality of our technique, we describe the design, implementation, and evaluation of our compilers, autotuner, profiler, and runtime, which are enabled by our proposed C++ programming language extensions. The resulting system boosts the performance of six well-known nondeterministic and multi-threaded benchmarks by 158.2% (geometric mean) on a 28-core Intel-based platform.
KW - Dependences
KW - Parallelization
KW - Speculation
UR - http://www.scopus.com/inward/record.url?scp=85045390947&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85045390947&partnerID=8YFLogxK
U2 - 10.1145/3173162.3173181
DO - 10.1145/3173162.3173181
M3 - Conference contribution
AN - SCOPUS:85045390947
T3 - International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS
SP - 432
EP - 447
BT - ASPLOS 2018 - 23rd International Conference on Architectural Support for Programming Languages and Operating Systems
PB - Association for Computing Machinery
T2 - 23rd International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2018
Y2 - 24 March 2018 through 28 March 2018
ER -