TY - JOUR
T1 - VM-based shared memory on low-latency, remote-memory-access networks
AU - Kontothanassis, Leonidas
AU - Hunt, Galen
AU - Stets, Robert
AU - Hardavellas, Nikolaos
AU - Cierniak, Michal
AU - Parthasarathy, Srinivasan
AU - Meira, Wagner
AU - Dwarkadas, Sandhya
AU - Scott, Michael
PY - 1997
Y1 - 1997
N2 - Recent technological advances have produced network interfaces that provide users with very low-latency access to the memory of remote machines. We examine the impact of such networks on the implementation and performance of software DSM. Specifically, we compare two DSM systems - Cashmere and TreadMarks - on a 32-processor DEC Alpha cluster connected by a Memory Channel network. Both Cashmere and TreadMarks use virtual memory to maintain coherence on pages, and both use lazy, multi-writer release consistency. The systems differ dramatically, however, in the mechanisms used to track sharing information and to collect and merge concurrent updates to a page, with the result that Cashmere communicates much more frequently, and at a much finer grain. Our principal conclusion is that low-latency networks make DSM based on fine-grain communication competitive with more coarse-grain approaches, but that further hardware improvements will be needed before such systems can provide consistently superior performance. In our experiments, Cashmere scales slightly better than TreadMarks for applications with false sharing. At the same time, it is severely constrained by limitations of the current Memory Channel hardware. In general, performance is better for TreadMarks.
UR - http://www.scopus.com/inward/record.url?scp=0030686646&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0030686646&partnerID=8YFLogxK
U2 - 10.1145/264107.264163
DO - 10.1145/264107.264163
M3 - Conference article
AN - SCOPUS:0030686646
SN - 0884-7495
SP - 157
EP - 169
JO - Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA
JF - Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA
T2 - Proceedings of the 1997 24th Annual International Symposium on Computer Architecture
Y2 - 2 June 1997 through 4 June 1997
ER -