TY - GEN
T1 - Minimal-overhead virtualization of a large scale supercomputer
AU - Lange, John R.
AU - Pedretti, Kevin
AU - Dinda, Peter A.
AU - Bae, Chang
AU - Bridges, Patrick G.
AU - Soltero, Philip
AU - Merritt, Alexander
PY - 2011
Y1 - 2011
N2 - Virtualization has the potential to dramatically increase the usability and reliability of high performance computing (HPC) systems. However, this potential will remain unrealized unless overheads can be minimized. This is particularly challenging on large scale machines that run carefully crafted HPC OSes supporting tightly-coupled, parallel applications. In this paper, we show how careful use of hardware and VMM features enables the virtualization of a large-scale HPC system, specifically a Cray XT4 machine, with <5% overhead on key HPC applications, microbenchmarks, and guests at scales of up to 4096 nodes. We describe three techniques essential for achieving such low overhead: passthrough I/O, workload-sensitive selection of paging mechanisms, and carefully controlled preemption. These techniques are forms of symbiotic virtualization, an approach on which we elaborate.
AB - Virtualization has the potential to dramatically increase the usability and reliability of high performance computing (HPC) systems. However, this potential will remain unrealized unless overheads can be minimized. This is particularly challenging on large scale machines that run carefully crafted HPC OSes supporting tightly-coupled, parallel applications. In this paper, we show how careful use of hardware and VMM features enables the virtualization of a large-scale HPC system, specifically a Cray XT4 machine, with <5% overhead on key HPC applications, microbenchmarks, and guests at scales of up to 4096 nodes. We describe three techniques essential for achieving such low overhead: passthrough I/O, workload-sensitive selection of paging mechanisms, and carefully controlled preemption. These techniques are forms of symbiotic virtualization, an approach on which we elaborate.
KW - Experimentation
KW - Design
KW - Measurement
KW - Performance
UR - http://www.scopus.com/inward/record.url?scp=79953193078&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79953193078&partnerID=8YFLogxK
U2 - 10.1145/1952682.1952705
DO - 10.1145/1952682.1952705
M3 - Conference contribution
AN - SCOPUS:79953193078
SN - 9781450305013
T3 - Proceedings of the 2011 ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, VEE 2011
SP - 169
EP - 180
BT - Proceedings of the 2011 ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, VEE 2011
T2 - 7th ACM SIGPLAN/SIGOPS Conference on Virtual Execution Environments, VEE'11
Y2 - 9 March 2011 through 11 March 2011
ER -