TY - GEN
T1 - Time-sharing parallel applications with performance isolation and control
AU - Lin, Bin
AU - Sundararaj, Ananth I.
AU - Dinda, Peter A.
PY - 2007
Y1 - 2007
N2 - Most parallel machines, such as clusters, are space-shared in order to isolate batch parallel applications from each other and optimize their performance. However, this leads to low utilization or potentially long waiting times. We propose a self-adaptive approach to time-sharing such machines that provides isolation and allows the execution rate of an application to be tightly controlled by the administrator. Our approach combines a periodic real-time scheduler on each node with a global feedback-based control system that governs the local schedulers. We have developed an online system that implements our approach. The system takes as input a target execution rate for each application, and automatically and continuously adjusts the applications' real-time schedules to achieve those rates with proportional CPU utilization. Target rates can be dynamically adjusted. Applications are performance-isolated from each other and from other work that is not using our system. We present an extensive evaluation that shows that the system remains stable with low response times, and that our focus on CPU isolation and control does not come at the significant expense of network I/O, disk I/O, or memory isolation.
AB - Most parallel machines, such as clusters, are space-shared in order to isolate batch parallel applications from each other and optimize their performance. However, this leads to low utilization or potentially long waiting times. We propose a self-adaptive approach to time-sharing such machines that provides isolation and allows the execution rate of an application to be tightly controlled by the administrator. Our approach combines a periodic real-time scheduler on each node with a global feedback-based control system that governs the local schedulers. We have developed an online system that implements our approach. The system takes as input a target execution rate for each application, and automatically and continuously adjusts the applications' real-time schedules to achieve those rates with proportional CPU utilization. Target rates can be dynamically adjusted. Applications are performance-isolated from each other and from other work that is not using our system. We present an extensive evaluation that shows that the system remains stable with low response times, and that our focus on CPU isolation and control does not come at the significant expense of network I/O, disk I/O, or memory isolation.
UR - http://www.scopus.com/inward/record.url?scp=35948933912&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=35948933912&partnerID=8YFLogxK
U2 - 10.1109/ICAC.2007.39
DO - 10.1109/ICAC.2007.39
M3 - Conference contribution
AN - SCOPUS:35948933912
SN - 0769527795
SN - 9780769527796
T3 - Fourth International Conference on Autonomic Computing, ICAC'07
BT - Fourth International Conference on Autonomic Computing, ICAC'07
T2 - 4th International Conference on Autonomic Computing, ICAC'07
Y2 - 11 June 2007 through 15 June 2007
ER -