TY - JOUR
T1 - A high-performance distributed parallel file system for data-intensive computations
AU - Shen, Xiaohui
AU - Choudhary, Alok
N1 - Funding Information:
This research was supported in part by the Department of Energy under the Accelerated Strategic Computing Initiative (ASCI) Academic Strategic Alliance Program (ASAP) Level 2, under subcontract No. W-7405-ENG-48 from Lawrence Livermore National Laboratories. We also thank Prof. Banerjee and Prof. Taylor of the ECE department at Northwestern University for allowing us to use their students’ workstations to run our DPFS servers.
PY - 2004/10
Y1 - 2004/10
AB - One of the challenges brought by large-scale scientific applications is how to avoid remote storage access by collectively using sufficient local storage resources to hold huge amounts of data generated by the simulation while providing high-performance I/O. DPFS, a distributed parallel file system, is designed and implemented to address this problem. DPFS collects locally distributed and unused storage resources as a supplement to the internal storage of parallel computing systems to satisfy the storage capacity requirement of large-scale applications. In addition, like parallel file systems, DPFS provides striping mechanisms that divide a file into small pieces and distribute them across multiple storage devices for parallel data access. The unique feature of DPFS is that it provides three file levels, with each file level corresponding to a file striping method. In addition to the traditional linear striping method, DPFS also provides a novel multidimensional striping method that can solve performance problems of linear striping for many popular access patterns. Other issues such as load balancing and user interface are also addressed in DPFS.
KW - Data intensive applications
KW - Distributed file system
KW - I/O
KW - Parallel file system
KW - Striping
UR - http://www.scopus.com/inward/record.url?scp=4544272037&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=4544272037&partnerID=8YFLogxK
U2 - 10.1016/j.jpdc.2004.07.001
DO - 10.1016/j.jpdc.2004.07.001
M3 - Article
AN - SCOPUS:4544272037
VL - 64
SP - 1157
EP - 1167
JO - Journal of Parallel and Distributed Computing
JF - Journal of Parallel and Distributed Computing
SN - 0743-7315
IS - 10
ER -