TY - GEN
T1 - MAGIC
T2 - 2007 IEEE International Conference on Information Reuse and Integration, IEEE IRI-2007
AU - Albanese, Massimiliano
AU - Pugliese, Andrea
AU - Subrahmanian, V. S.
AU - Udrea, Octavian
PY - 2007
Y1 - 2007
AB - Suppose we are given a set A of activities of interest, a set O of observations, and a probability threshold p. We are interested in finding the set of all pairs (a, O'), where a ∈ A and O' ⊆ O, that minimally validate the fact that an instance of activity a occurs in O with probability p or more. The novel contribution of this paper is the notion of the multi-activity graph index (MAGIC), which can index very large numbers of observations from interleaved activities and quickly retrieve completed instances of the monitored activities. We introduce two complexity-reducing restrictions of the problem (which takes exponential time) and develop algorithms for each. We experimentally evaluate our exponential algorithm as well as the restricted algorithms on both synthetic data and a real (depersonalized) travel data set consisting of 5.5 million observations. Our experiments show that MAGIC consumes reasonable amounts of memory and can retrieve completed instances of activities in just a few seconds. We also report appropriate statistical significance results validating our experimental hypotheses.
UR - http://www.scopus.com/inward/record.url?scp=47949115789&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=47949115789&partnerID=8YFLogxK
U2 - 10.1109/IRI.2007.4296632
DO - 10.1109/IRI.2007.4296632
M3 - Conference contribution
AN - SCOPUS:47949115789
SN - 1424414997
SN - 9781424414994
T3 - 2007 IEEE International Conference on Information Reuse and Integration, IEEE IRI-2007
SP - 267
EP - 272
BT - 2007 IEEE International Conference on Information Reuse and Integration, IEEE IRI-2007
Y2 - 13 August 2007 through 15 August 2007
ER -