TY - GEN
T1 - Location-Awareness in Time Series Compression
AU - Teng, Xu
AU - Züfle, Andreas
AU - Trajcevski, Goce P
AU - Klabjan, Diego
N1 - Funding Information:
X. Teng—Research supported by NSF grant III 1823267. G. Trajcevski—Research supported by NSF grants III-1823279 and CNS-1823267, and ONR grant N00014-14-1-0215.
Publisher Copyright:
© 2018, Springer Nature Switzerland AG.
PY - 2018
Y1 - 2018
AB - We present our initial findings regarding the problem of the impact that time series compression may have on similarity-queries, in the settings in which the elements of the dataset are accompanied with additional contexts. Broadly, the main objective of any data compression approach is to provide a more compact (i.e., smaller size) representation of a given original dataset. However, as has been observed in the large body of works on compression of spatial data, applying a particular algorithm “blindly” may yield outcomes that defy the intuitive expectations – e.g., distorting certain topological relationships that exist in the “raw” data [7]. In this study, we quantify this distortion by defining a measure of similarity distortion based on Kendall’s τ. We evaluate this measure, and the correspondingly achieved compression ratio for the five most commonly used time series compression algorithms and the three most common time series similarity measures. We report some of our observations here, along with the discussion of the possible broader impacts and the challenges that we plan to address in the future.
UR - http://www.scopus.com/inward/record.url?scp=85051090085&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85051090085&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-98398-1_6
DO - 10.1007/978-3-319-98398-1_6
M3 - Conference contribution
AN - SCOPUS:85051090085
SN - 9783319983974
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 82
EP - 95
BT - Advances in Databases and Information Systems - 22nd European Conference, ADBIS 2018, Proceedings
A2 - Benczur, Andras
A2 - Horvath, Tomas
A2 - Thalheim, Bernhard
PB - Springer Verlag
T2 - 22nd East-European Conference on Advances in Databases and Information Systems, ADBIS 2018
Y2 - 2 September 2018 through 5 September 2018
ER -