TY - GEN
T1 - Performance prediction for concurrent database workloads
AU - Duggan, Jennie
AU - Cetintemel, Ugur
AU - Papaemmanouil, Olga
AU - Upfal, Eli
PY - 2011
Y1 - 2011
N2 - Current trends in data management systems, such as cloud and multi-tenant databases, are leading to data processing environments that concurrently execute heterogeneous query workloads. At the same time, these systems need to satisfy diverse performance expectations. In these newly-emerging settings, avoiding potential Quality-of-Service (QoS) violations heavily relies on performance predictability, i.e., the ability to estimate the impact of concurrent query execution on the performance of individual queries in a continuously evolving workload. This paper presents a modeling approach to estimate the impact of concurrency on query performance for analytical workloads. Our solution relies on the analysis of query behavior in isolation, pairwise query interactions and sampling techniques to predict resource contention under various query mixes and concurrency levels. We introduce a simple yet powerful metric that accurately captures the joint effects of disk and memory contention on query performance in a single value. We also discuss predicting the execution behavior of a time-varying query workload through query-interaction timelines, i.e., a fine-grained estimation of the time segments during which discrete mixes will be executed concurrently. Our experimental evaluation on top of PostgreSQL/TPC-H demonstrates that our models can provide query latency predictions within approximately 20% of the actual values in the average case.
KW - concurrency
KW - query performance prediction
UR - http://www.scopus.com/inward/record.url?scp=79960005600&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79960005600&partnerID=8YFLogxK
U2 - 10.1145/1989323.1989359
DO - 10.1145/1989323.1989359
M3 - Conference contribution
AN - SCOPUS:79960005600
SN - 9781450306614
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 337
EP - 348
BT - Proceedings of SIGMOD 2011 and PODS 2011
PB - Association for Computing Machinery
T2 - 2011 ACM SIGMOD and 30th PODS 2011 Conference
Y2 - 12 June 2011 through 16 June 2011
ER -