TY - GEN
T1 - A matrix-based approach to the global locality optimization problem
AU - Kandemir, M.
AU - Choudhary, A.
AU - Ramanujam, J.
AU - Banerjee, P.
N1 - Funding Information:
The work of M. Kandemir and A. Choudhary was supported in part by NSF Young Investigator Award CCR-9357840, NSF grant CCR-9509143 and Air Force Materials Command under contract F30602-97-C-0026. J. Ramanujam was supported in part by supported in part by NSF Young Investigator Award CCR-9457768 and NSF grant CCR-9210422. P. Banerjee was supported in part by NSF under grant CCR-9526325 and in part by DARPA under contract DABT-63-97-C-0035.
PY - 1998
Y1 - 1998
N2 - Global locality analysis is a technique for improving the cache performance of a sequence of loop nests through a combination of loop and data layout optimizations. Pure loop transformations are restricted by data dependences and may not be very successful in optimizing imperfectly nested loops; the impact of a data transformation on an array might be program-wide. Therefore, in this paper we argue for a combined approach which employs both loop and data transformations. The method enjoys the advantages of the most of the previous techniques for enhancing locality and is efficient. In our approach, the loop nests are processed one by one and the data layout constraints obtained from one nest are propagated for optimization of the remaining loop nests. We show that this process can be put in a simple matrix framework which can be manipulated by an optimizing compiler. The search space that we consider for possible loop transformations comprises general non-singular linear transformation matrices and the data layouts that we consider are those which can be expressed using hyperplanes. Experiments using several programs on an SGI Origin 2000 distributed-shared-memory machine demonstrate the efficacy of our approach.
AB - Global locality analysis is a technique for improving the cache performance of a sequence of loop nests through a combination of loop and data layout optimizations. Pure loop transformations are restricted by data dependences and may not be very successful in optimizing imperfectly nested loops; the impact of a data transformation on an array might be program-wide. Therefore, in this paper we argue for a combined approach which employs both loop and data transformations. The method enjoys the advantages of the most of the previous techniques for enhancing locality and is efficient. In our approach, the loop nests are processed one by one and the data layout constraints obtained from one nest are propagated for optimization of the remaining loop nests. We show that this process can be put in a simple matrix framework which can be manipulated by an optimizing compiler. The search space that we consider for possible loop transformations comprises general non-singular linear transformation matrices and the data layouts that we consider are those which can be expressed using hyperplanes. Experiments using several programs on an SGI Origin 2000 distributed-shared-memory machine demonstrate the efficacy of our approach.
UR - http://www.scopus.com/inward/record.url?scp=0037722074&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0037722074&partnerID=8YFLogxK
U2 - 10.1109/PACT.1998.727266
DO - 10.1109/PACT.1998.727266
M3 - Conference contribution
AN - SCOPUS:0037722074
T3 - Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
SP - 306
EP - 313
BT - Proceedings - 1998 International Conference on Parallel Architectures and Compilation Techniques, PACT 1998
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 1998 International Conference on Parallel Architectures and Compilation Techniques, PACT 1998
Y2 - 12 October 1998 through 18 October 1998
ER -