A layout-conscious iteration space transformation technique

Mahmut Kandemir*, J. Ramanujam, Alok Choudhary, Prithviraj Banerjee

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

23 Scopus citations

Abstract

Exploiting locality of references has become extremely important in realizing the potential performance of modern machines with deep memory hierarchies. The data access patterns of programs and the memory layouts of the accessed data sets play a critical role in determining the performance of applications running on these machines. This paper presents a cache locality optimization technique that can optimize a loop nest even if the arrays referenced have different layouts in memory. Such a capability is required for a global locality optimization framework that applies both loop and data transformations to a sequence of loop nests for optimizing locality. Our method uses a single linear algebra framework to represent both data layouts and loop transformations. It computes a nonsingular loop transformation matrix such that, in a given loop nest, data locality is exploited in the innermost loops, where it is most useful. The inverse of a nonsingular transformation matrix is built column-by-column, starting from the rightmost column. In addition, our approach can work in those cases where the data layouts of a subset of the referenced arrays is unknown; this is a key step in optimizing a sequence of loop nests and whole programs for locality. Experimental results on an SGI/Cray Origin 2000 nonuniform memory access multiprocessor machine show that our technique reduces execution times by as much as 70 percent.

Original languageEnglish (US)
Pages (from-to)1321-1336
Number of pages16
JournalIEEE Transactions on Computers
Volume50
Issue number12
DOIs
StatePublished - Dec 2001

Funding

Mahmut Kandemir and Alok Choudhary were supported in part by US National Science Foundation (NSF) Young Investigator Award CCR-9357840 and NSF grant CCR-9509143. The work of J. Ramanujam is supported in part by NSF grant CCR-0073800 and by NSF Young Investigator Award CCR-9457768. Prithviraj Banerjee is supported in part by the NSF under grant CCR-9526325 and in part by the US Defense Advanced Research Projects Agency under contract F30602-98-2-0144. A preliminary version of this paper was presented at the Workshop on Languages and Compilers for Parallel Computing, Chapel Hill, North Carolina, August 1998.

Keywords

  • Cache locality
  • Data reuse
  • Loop transformations
  • Memory layouts
  • Program optimization

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computational Theory and Mathematics

Fingerprint

Dive into the research topics of 'A layout-conscious iteration space transformation technique'. Together they form a unique fingerprint.

Cite this