It is widely acknowledged that improving parallel I/O performance is critical for widespread adoption of high performance computing. In this paper, we show that communication in out-of-core distributed memory problems may require both interprocessor communication and file I/O. Thus, in order to improve I/O performance, it is necessary to minimize the I/O costs associated with a communication step. We present three methods for performing communication in out-of-core distributed memory problems. The first method, called the generalized collective communication method, follows a loosely synchronous model; computation and communication phases are clearly separated, and communication requires permutation of data in files. The second method, called the receiver-driven in-core communication, communicates only the in-core data. The third method, called the owner-driven in-core communication, goes even one step further and tries to identify the potential future use of data (by the recipients) while it is in the senders memory. We provide performance results for two out-of-core applications: the two-dimensional FFT code, and the two-dimensional elliptic Jacobi solver.
ASJC Scopus subject areas
- Theoretical Computer Science
- Hardware and Architecture
- Computer Networks and Communications
- Artificial Intelligence