TY - JOUR
T1 - An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays
AU - Thakur, Rajeev
AU - Choudhary, Alok
PY - 1996
Y1 - 1996
N2 - A number of applications on parallel computers deal with very large data sets that cannot fit in main memory. In such applications, data must be stored in files on disks and fetched into memory during program execution. Parallel programs with large out-of-core arrays stored in files must read/write smaller sections of the arrays from/to files. In this article, we describe a method for accessing sections of out-of-core arrays efficiently. Our method, the extended two-phase method, uses collective I/O: Processors cooperate to combine several I/O requests into fewer larger granularity requests, to reorder requests so that the file is accessed in proper sequence, and to eliminate simultaneous I/O requests for the same data. In addition, the I/O workload is divided among processors dynamically, depending on the access requests. We present performance results obtained from two real out-of-core parallel applications—matrix multiplication and a Laplace’s equation solver—and several synthetic access patterns, all on the Intel Touchstone Delta. These results indicate that the extended two-phase method significantly outperformed a direct (noncollective) method for accessing out-of-core array sections.
AB - A number of applications on parallel computers deal with very large data sets that cannot fit in main memory. In such applications, data must be stored in files on disks and fetched into memory during program execution. Parallel programs with large out-of-core arrays stored in files must read/write smaller sections of the arrays from/to files. In this article, we describe a method for accessing sections of out-of-core arrays efficiently. Our method, the extended two-phase method, uses collective I/O: Processors cooperate to combine several I/O requests into fewer larger granularity requests, to reorder requests so that the file is accessed in proper sequence, and to eliminate simultaneous I/O requests for the same data. In addition, the I/O workload is divided among processors dynamically, depending on the access requests. We present performance results obtained from two real out-of-core parallel applications—matrix multiplication and a Laplace’s equation solver—and several synthetic access patterns, all on the Intel Touchstone Delta. These results indicate that the extended two-phase method significantly outperformed a direct (noncollective) method for accessing out-of-core array sections.
UR - http://www.scopus.com/inward/record.url?scp=0000049634&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0000049634&partnerID=8YFLogxK
U2 - 10.1155/1996/547186
DO - 10.1155/1996/547186
M3 - Article
AN - SCOPUS:0000049634
SN - 1058-9244
VL - 5
SP - 301
EP - 317
JO - Scientific Programming
JF - Scientific Programming
IS - 4
ER -