TY - JOUR
T1 - Experimental comparison of representation methods and distance measures for time series data
AU - Wang, Xiaoyue
AU - Mueen, Abdullah
AU - Ding, Hui
AU - Trajcevski, Goce
AU - Scheuermann, Peter I
AU - Keogh, Eamonn
N1 - Funding Information: Research supported by NSF awards 0803410 and 0808770, NSF-CNS grant 0910952.
PY - 2013/3
Y1 - 2013/3
N2 - The previous decade has brought a remarkable increase of the interest in applications that deal with querying and mining of time series data. Many of the research efforts in this context have focused on introducing new representation methods for dimensionality reduction or novel similarity measures for the underlying data. In the vast majority of cases, each individual work introducing a particular method has made specific claims and, aside from the occasional theoretical justifications, provided quantitative experimental observations. However, for the most part, the comparative aspects of these experiments were too narrowly focused on demonstrating the benefits of the proposed methods over some of the previously introduced ones. In order to provide a comprehensive validation, we conducted an extensive experimental study re-implementing eight different time series representations and nine similarity measures and their variants, and testing their effectiveness on 38 time series data sets from a wide variety of application domains. In this article, we give an overview of these different techniques and present our comparative experimental findings regarding their effectiveness. In addition to providing a unified validation of some of the existing achievements, our experiments also indicate that, in some cases, certain claims in the literature may be unduly optimistic.
KW - Distance measure
KW - Experimental comparison
KW - Representation
KW - Time series
UR - http://www.scopus.com/inward/record.url?scp=84872397385&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84872397385&partnerID=8YFLogxK
U2 - 10.1007/s10618-012-0250-5
DO - 10.1007/s10618-012-0250-5
M3 - Article
AN - SCOPUS:84872397385
SN - 1384-5810
VL - 26
SP - 275
EP - 309
JO - Data Mining and Knowledge Discovery
JF - Data Mining and Knowledge Discovery
IS - 2
ER -