Abstract
The actual performance of programs on modern processors that employ deep memory hierarchies is closely related to the performance of the memory subsystem. Compiler optimizations aimed at improving cache locality are critical in realizing the performance potential of powerful processors. For scientific applications, several loop transformations have been shown to be useful in improving both temporal and spatial locality. Recently, there has been some work in the area of data layout optimizations, i.e., changing the memory layouts of multi-dimensional arrays from the language-defined default such as column-major storage in Fortran. These memory layout optimizations affect the spatial locality characteristics of loop nests. This paper presents a technique based on integer linear programming (ILP) that attempts to derive the best combination of loop and data layout transformations. Prior attempts to unify loop and data layout transformations for programs consisting of a sequence of loop nests have been based on heuristics not only for transformations for a single loop nest but also for the sequence in which loop nests will be considered. The ILP formulation presented here obviates the need for such heuristics. Experimental results on a MIPS R10000 based system demonstrate the benefits of this approach, and show that the use of the ILP formulation does not increase the compilation time significantly.
Original language | English (US) |
---|---|
Pages | 500-509 |
Number of pages | 10 |
DOIs | |
State | Published - 1999 |
Event | Proceedings of the 1999 13th ACM International Conference on Supercomputing, ICS'99 - Rhodes, Greece Duration: Jun 20 1999 → Jun 25 1999 |
Conference
Conference | Proceedings of the 1999 13th ACM International Conference on Supercomputing, ICS'99 |
---|---|
City | Rhodes, Greece |
Period | 6/20/99 → 6/25/99 |
ASJC Scopus subject areas
- General Computer Science