TY - GEN
T1 - Simultaneous equation systems for query processing on continuous-time data streams
AU - Ahmad, Yanif
AU - Papaemmanouil, Olga
AU - Çetintemel, Uğur
AU - Rogers, Jennie
PY - 2008/10/1
Y1 - 2008/10/1
N2 - We introduce Pulse, a framework for processing continuous queries over models of continuous-time data, which can compactly and accurately represent many real-world activities and processes. Pulse implements several query operators, including filters, aggregates and joins, that work by solving simultaneous equation systems, which in many cases is significantly cheaper than processing a stream of tuples. As such, Pulse translates regular queries to work on continuous-time inputs, to reduce computational overhead and latency while meeting user-specified error bounds on query results. For error bound checking, Pulse uses an approximate query inversion technique that ensures the solver executes infrequently and only in the presence of errors, or no previously known results. We first discuss the high-level design of Pulse, which we fully implemented in a stream processing system. We then characterise Pulse's behavior through experiments with real data, including financial data from the New York Stock Exchange, and spatial data from the Automatic Identification System for tracking naval vessels. Our results verify that Pulse is practical and demonstrates significant performance gains for a variety of workload and query types.
UR - http://www.scopus.com/inward/record.url?scp=52649170247&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=52649170247&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2008.4497475
DO - 10.1109/ICDE.2008.4497475
M3 - Conference contribution
AN - SCOPUS:52649170247
SN - 9781424418374
T3 - Proceedings - International Conference on Data Engineering
SP - 666
EP - 675
BT - Proceedings of the 2008 IEEE 24th International Conference on Data Engineering, ICDE'08
T2 - 2008 IEEE 24th International Conference on Data Engineering, ICDE'08
Y2 - 7 April 2008 through 12 April 2008
ER -