TY - GEN
T1 - VMM emulation of Intel hardware transactional memory
AU - Swiech, Maciej
AU - Hale, Kyle C.
AU - Dinda, Peter A.
PY - 2014
Y1 - 2014
N2 - We describe the design, implementation, and evaluation of emulated hardware transactional memory, specifically the Intel Haswell Restricted Transactional Memory (RTM) architectural extensions for x86/64, within a virtual machine monitor (VMM). Our system allows users to investigate RTM on hardware that does not provide it, debug their RTM-based transactional software, and stress test it on diverse emulated hardware configurations, including potential future configurations that might support arbitrary length transactions. Initial performance results suggest that we are able to accomplish this approximately 60 times faster than under a full emulator. A noteworthy aspect of our system is a novel page-flipping technique that allows us to completely avoid instruction emulation, and to limit instruction decoding to only that necessary to determine instruction length. This makes it possible to implement RTM emulation, and potentially other techniques, far more compactly than would otherwise be possible. We have implemented our system in the context of the Palacios VMM. Our techniques are not specific to Palacios, and could be implemented in other VMMs.
AB - We describe the design, implementation, and evaluation of emulated hardware transactional memory, specifically the Intel Haswell Restricted Transactional Memory (RTM) architectural extensions for x86/64, within a virtual machine monitor (VMM). Our system allows users to investigate RTM on hardware that does not provide it, debug their RTM-based transactional software, and stress test it on diverse emulated hardware configurations, including potential future configurations that might support arbitrary length transactions. Initial performance results suggest that we are able to accomplish this approximately 60 times faster than under a full emulator. A noteworthy aspect of our system is a novel page-flipping technique that allows us to completely avoid instruction emulation, and to limit instruction decoding to only that necessary to determine instruction length. This makes it possible to implement RTM emulation, and potentially other techniques, far more compactly than would otherwise be possible. We have implemented our system in the context of the Palacios VMM. Our techniques are not specific to Palacios, and could be implemented in other VMMs.
UR - http://www.scopus.com/inward/record.url?scp=84903641160&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84903641160&partnerID=8YFLogxK
U2 - 10.1145/2612262.2612265
DO - 10.1145/2612262.2612265
M3 - Conference contribution
AN - SCOPUS:84903641160
SN - 9781450329507
T3 - Proceedings of the 4th International Workshop on Runtime and Operating Systems for Supercomputers, ROSS 2014 - In Conjunction with ICS 2014
BT - Proceedings of the 4th International Workshop on Runtime and Operating Systems for Supercomputers, ROSS 2014 - In Conjunction with ICS 2014
PB - Association for Computing Machinery
T2 - 4th International Workshop on Runtime and Operating Systems for Supercomputers, ROSS 2014 - In Conjunction with ICS 2014
Y2 - 10 June 2014 through 10 June 2014
ER -