TY - JOUR
T1 - Thermal monitoring mechanisms for chip multiprocessors
AU - Long, Jieyi
AU - Memik, Seda Ogrenci
AU - Memik, Gokhan
AU - Mukherjee, Rajarshi
N1 - Copyright:
Copyright 2011 Elsevier B.V., All rights reserved.
PY - 2008/8/1
Y1 - 2008/8/1
N2 - With large-scale integration and increasing power densities, thermal management has become an important tool to maintain performance and reliability in modern process technologies. In the core of dynamic thermal management schemes lies accurate reading of on-die temperatures. Therefore, careful planning and embedding of thermal monitoring mechanisms into high-performance systems becomes crucial. In this paper, we propose three techniques to create sensor infrastructures for monitoring the maximum temperature on a multicore system. Initially, we extend a nonuniform sensor placement methodology proposed in the literature to handle chip multiprocessors (CMPs) and show its limitations. We then analyze a grid-based approach where the sensors are placed on a static grid covering each core and show that the sensor readings can differ from the actual maximum core temperature by as much as 12.6°C when using 16 sensors per core. Also, as large as 10.6% of the thermal emergencies are not captured using the same number of sensors. Based on this observation, we first develop an interpolation scheme, which estimates the maximum core temperature through interpolation of the readings collected at the static grid points. We show that the interpolation scheme improves the measurement accuracy and emergency coverage compared to grid-based placement when using the same number of sensors. Second, we present a dynamic scheme where only a subset of the sensor readings is collected to predict the maximum temperature of each core. Our results indicate that, we can reduce the number of active sensors by as much as 50%, while maintaining similar measurement accuracy and emergency coverage compared to the case where the entire sensor set on the grid is sampled at all times.
AB - With large-scale integration and increasing power densities, thermal management has become an important tool to maintain performance and reliability in modern process technologies. In the core of dynamic thermal management schemes lies accurate reading of on-die temperatures. Therefore, careful planning and embedding of thermal monitoring mechanisms into high-performance systems becomes crucial. In this paper, we propose three techniques to create sensor infrastructures for monitoring the maximum temperature on a multicore system. Initially, we extend a nonuniform sensor placement methodology proposed in the literature to handle chip multiprocessors (CMPs) and show its limitations. We then analyze a grid-based approach where the sensors are placed on a static grid covering each core and show that the sensor readings can differ from the actual maximum core temperature by as much as 12.6°C when using 16 sensors per core. Also, as large as 10.6% of the thermal emergencies are not captured using the same number of sensors. Based on this observation, we first develop an interpolation scheme, which estimates the maximum core temperature through interpolation of the readings collected at the static grid points. We show that the interpolation scheme improves the measurement accuracy and emergency coverage compared to grid-based placement when using the same number of sensors. Second, we present a dynamic scheme where only a subset of the sensor readings is collected to predict the maximum temperature of each core. Our results indicate that, we can reduce the number of active sensors by as much as 50%, while maintaining similar measurement accuracy and emergency coverage compared to the case where the entire sensor set on the grid is sampled at all times.
KW - Nonuniform and uniform sensor placement
KW - Thermal sensor allocation
UR - http://www.scopus.com/inward/record.url?scp=51149097921&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=51149097921&partnerID=8YFLogxK
U2 - 10.1145/1400112.1400114
DO - 10.1145/1400112.1400114
M3 - Article
AN - SCOPUS:51149097921
SN - 1544-3566
VL - 5
JO - Transactions on Architecture and Code Optimization
JF - Transactions on Architecture and Code Optimization
IS - 2
M1 - 9
ER -