Dynamic file striping and data layout transformation on parallel system with fluctuating I/O workload

Seung Woo Son, Saba Sehrish, Wei Keng Liao, Ron Oldfield, Alok Choudhary

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

As the number of compute cores on modern parallel machines grows into the hundreds of thousands, scalable and consistent I/O performance is becoming hard to obtain because file system performance fluctuates. This fluctuation is often caused by RAID disk rebuilds after hardware failures or by concurrent jobs competing for I/O. We present a mechanism that stripes data across a dynamically selected subset of I/O servers with the lightest workload to achieve the best I/O bandwidth available from the system. We implement this mechanism in an I/O software layer that enables memory-to-file data layout transformation and allows transparent file partitioning. File partitioning is a technique that divides data among a set of files and manages file access so that the data appears to users as a single file. Experimental results on NERSC's Hopper indicate that our approach effectively isolates I/O variation on shared systems and improves overall I/O performance significantly.
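
The two ideas in the abstract, choosing the most lightly loaded I/O servers and presenting several partition files as one logical file, can be illustrated with a minimal sketch. This is not the authors' implementation: the load metric, the round-robin striping rule, and the names `select_lightest_servers` and `locate` are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import Dict, List, Tuple


def select_lightest_servers(load: Dict[int, float], k: int) -> List[int]:
    """Return the ids of the k servers with the smallest load.

    `load` maps a server id to a hypothetical load measurement
    (lower means less busy). Ties are broken by server id.
    """
    if not 0 < k <= len(load):
        raise ValueError("k must be between 1 and the number of servers")
    return sorted(load, key=lambda s: (load[s], s))[:k]


def locate(offset: int, stripe_size: int, servers: List[int]) -> Tuple[int, int]:
    """Map a logical byte offset to (server id, offset in its partition file).

    Assumes simple round-robin striping: stripe i goes to
    servers[i % len(servers)], one partition file per server.
    """
    stripe = offset // stripe_size
    server = servers[stripe % len(servers)]
    local = (stripe // len(servers)) * stripe_size + offset % stripe_size
    return server, local


if __name__ == "__main__":
    # Hypothetical load readings for four I/O servers.
    load = {0: 0.9, 1: 0.1, 2: 0.5, 3: 0.2}
    chosen = select_lightest_servers(load, k=2)
    print(chosen)
    print(locate(9, stripe_size=4, servers=chosen))
```

Here the caller sees a single logical offset space, while `locate` decides which partition file actually holds each byte. A real system would refresh the load readings and handle restriping of existing data, which this sketch leaves out.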

Original language: English (US)
Title of host publication: 2013 IEEE International Conference on Cluster Computing, CLUSTER 2013
State: Published - Dec 1 2013
Event: 15th IEEE International Conference on Cluster Computing, CLUSTER 2013 - Indianapolis, IN, United States
Duration: Sep 23 2013 to Sep 27 2013

Publication series

Name: Proceedings - IEEE International Conference on Cluster Computing, ICCC
ISSN (Print): 1552-5244

Other

Other: 15th IEEE International Conference on Cluster Computing, CLUSTER 2013
Country: United States
City: Indianapolis, IN
Period: 9/23/13 to 9/27/13

Keywords

  • Collective I/O
  • File partitioning
  • Parallel NetCDF

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Signal Processing
