Abstract
In the solution of large-scale numerical problems, parallel computing is becoming simultaneously more important and more difficult. The complex organization of today's multiprocessors with several memory hierarchies has forced the scientific programmer to make a choice between simple but unscalable code and scalable but extremely complex code that does not port to other architectures. This paper describes how the SMARTS runtime system and the POOMA C++ class library for high-performance scientific computing work together to exploit data parallelism in scientific applications while hiding the details of managing parallelism and data locality from the user. We present innovative algorithms, based on the macro-dataflow model, for detecting data parallelism and efficiently executing data-parallel statements on shared-memory multiprocessors. We also describe how these algorithms can be implemented on clusters of SMPs.
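The macro-dataflow idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the SMARTS or POOMA implementation, and every name in it (`build_dependence_graph`, `schedule_waves`, `EXAMPLE_TASKS`, the block labels) is hypothetical. The sketch assumes each data-parallel statement has already been split into per-block tasks with known read and write sets, derives dependences between tasks from those sets, and groups tasks into waves that could run concurrently.

```python
from collections import defaultdict


def build_dependence_graph(tasks):
    """Return {task name: set of predecessor task names}.

    `tasks` is a list of (name, reads, writes) in program order, where
    `reads` and `writes` are sets of data-block identifiers.
    """
    preds = {name: set() for name, _, _ in tasks}
    last_writer = {}                       # block -> most recent writing task
    readers_since_write = defaultdict(set)  # block -> readers since last write

    for name, reads, writes in tasks:
        for block in reads:
            if block in last_writer:                      # read-after-write
                preds[name].add(last_writer[block])
        for block in writes:
            if block in last_writer:                      # write-after-write
                preds[name].add(last_writer[block])
            preds[name] |= readers_since_write[block]     # write-after-read
        for block in reads:
            readers_since_write[block].add(name)
        for block in writes:
            last_writer[block] = name
            readers_since_write[block] = set()
        preds[name].discard(name)  # a task never depends on itself
    return preds


def schedule_waves(preds):
    """Group tasks into waves; tasks within one wave are independent."""
    remaining = {name: set(p) for name, p in preds.items()}
    waves = []
    while remaining:
        ready = sorted(n for n, p in remaining.items() if not p)
        if not ready:
            raise ValueError("dependence cycle detected")
        waves.append(ready)
        for n in ready:
            del remaining[n]
        for p in remaining.values():
            p.difference_update(ready)
    return waves


# Hypothetical example: statements `A = B + C` then `D = A * 2`,
# each split into two tasks over blocks 0 and 1.
EXAMPLE_TASKS = [
    ("s1.b0", {("B", 0), ("C", 0)}, {("A", 0)}),
    ("s1.b1", {("B", 1), ("C", 1)}, {("A", 1)}),
    ("s2.b0", {("A", 0)}, {("D", 0)}),
    ("s2.b1", {("A", 1)}, {("D", 1)}),
]

if __name__ == "__main__":
    graph = build_dependence_graph(EXAMPLE_TASKS)
    print(schedule_waves(graph))
```

In this example each task of the second statement depends only on the task that produced its own block of `A`, so a dataflow scheduler could start `s2.b0` as soon as `s1.b0` finishes, without waiting for a barrier after the whole first statement. The wave grouping is a simplification chosen for readability; a real runtime would more likely dispatch each task as soon as its predecessors complete.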
Original language | English (US) |
---|---|
Pages | 302-310 |
Number of pages | 9 |
State | Published - 1999 |
Event | Proceedings of the 1999 13th ACM International Conference on Supercomputing, ICS'99 - Rhodes, Greece; Duration: Jun 20 1999 → Jun 25 1999 |
Conference
Conference | Proceedings of the 1999 13th ACM International Conference on Supercomputing, ICS'99 |
---|---|
City | Rhodes, Greece |
Period | 6/20/99 → 6/25/99 |
ASJC Scopus subject areas
- General Computer Science