TY - GEN
T1 - Mining frequent patterns by differential refinement of clustered bitmaps
AU - Li, Jianwei
AU - Choudhary, Alok
AU - Jiang, Nan
AU - Liao, Wei Keng
PY - 2006
Y1 - 2006
N2 - Existing algorithms for mining frequent patterns face challenges in handling databases (a) of increasingly large sizes, (b) consisting of variable-length, irregularly-spaced data, and (c) with mixed or even unknown properties. In this paper, we propose a novel self-adaptive algorithm, D-CLUB, that addresses these issues by progressively clustering the database into condensed association bitmaps, applying a differential technique to digest and remove dense patterns, and then mining the remaining tiny bitmaps directly through fast aggregate bit operations. The bitmaps are organized into rectangular two-dimensional matrices and adaptively refined in regions that require further computation. We show that this approach not only drastically cuts down the original database size but also substantially reduces and simplifies the mining computation for a wide variety of datasets and parameters. We compare D-CLUB with various state-of-the-art algorithms and show significant performance improvement in all cases.
UR - http://www.scopus.com/inward/record.url?scp=33745474465&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33745474465&partnerID=8YFLogxK
U2 - 10.1137/1.9781611972764.26
DO - 10.1137/1.9781611972764.26
M3 - Conference contribution
AN - SCOPUS:33745474465
SN - 089871611X
SN - 9780898716115
T3 - Proceedings of the Sixth SIAM International Conference on Data Mining
SP - 294
EP - 305
BT - Proceedings of the Sixth SIAM International Conference on Data Mining
PB - Society for Industrial and Applied Mathematics
T2 - Sixth SIAM International Conference on Data Mining
Y2 - 20 April 2006 through 22 April 2006
ER -