TY - JOUR
T1 - A case study for scientific I/O
T2 - Improving the FLASH astrophysics code
AU - Latham, Rob
AU - Daley, Chris
AU - Liao, Wei-keng
AU - Gao, Kui
AU - Ross, Rob
AU - Dubey, Anshu
AU - Choudhary, Alok
PY - 2012/1
Y1 - 2012/1
N2 - The FLASH code is a computational science tool for simulating and studying thermonuclear reactions. The program periodically outputs large checkpoint files (to resume a calculation from a particular point in time) and smaller plot files (for visualization and analysis). Initial experiments on BlueGene/P spent excessive time in input/output (I/O), making it difficult to do actual science. Our investigation of time spent in I/O revealed several locations in the I/O software stack where we could make improvements. Fixing data corruption in the MPI-IO library allowed us to use collective I/O, yielding an order of magnitude improvement. Restructuring the data layout provided a more efficient I/O access pattern and yielded another doubling of performance, but broke format assumptions made by other tools in the application workflow. Using new nonblocking APIs in the Parallel-NetCDF library allowed us to keep high performance and maintain backward compatibility. The I/O research community has studied a host of optimizations and strategies. Sometimes the challenge for applications is knowing how to apply these new techniques to production codes. In this case study, we offer a demonstration of how computational scientists, with a detailed understanding of their application, and the I/O community, with a wide array of approaches from which to choose, can magnify each other's efforts and achieve tremendous application productivity gains.
UR - http://www.scopus.com/inward/record.url?scp=84859611780&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84859611780&partnerID=8YFLogxK
U2 - 10.1088/1749-4699/5/1/015001
DO - 10.1088/1749-4699/5/1/015001
M3 - Article
AN - SCOPUS:84859611780
SN - 1749-4680
VL - 5
JO - Computational Science and Discovery
JF - Computational Science and Discovery
IS - 1
M1 - 015001
ER -