Generating association rules from semi-structured documents using an extended concept hierarchy

Lisa Singh*, Peter Scheuermann, Bin Chen

*Corresponding author for this work

Research output: Contribution to conference › Paper › peer-review

24 Scopus citations

Abstract

Most data mining research has focused on generating rules within databases containing structured values while essentially ignoring the potentially valuable information that exists in unstructured blocks of text. This paper suggests an approach for generating association rules that relate structured data values to concepts extracted from unstructured data. Our approach involves the use of an extended concept hierarchy (ECH) to maintain parent, child, and sibling relationships between concepts. This structure allows us to generate rules that relate a given concept in the ECH and a given structured attribute value to the neighbors of the given concept in the ECH. We also describe an efficient implementation of the ECH that keeps track of concepts and pointers to documents associated with them. Experimental results on documents from the ABI/Inform Information Retrieval System are presented.
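The abstract does not give the paper's data structures or rule measures, so the following is only a minimal sketch of the idea, not the authors' implementation. It assumes each ECH node stores a concept name, links to its parent and children, and a set of document identifiers standing in for the "pointers to documents". The names `ConceptNode`, `rules_for`, `doc_attrs`, and `min_conf`, the sample concepts, and the use of standard support and confidence are all hypothetical choices made for illustration.

```python
class ConceptNode:
    """One concept in a hypothetical extended concept hierarchy (ECH)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.doc_ids = set()  # ids of documents associated with this concept

    def add_child(self, name):
        child = ConceptNode(name, parent=self)
        self.children.append(child)
        return child

    def siblings(self):
        if self.parent is None:
            return []
        return [c for c in self.parent.children if c is not self]

    def neighbors(self):
        # Neighbors are taken here to be the parent, children, and siblings.
        parent = [self.parent] if self.parent is not None else []
        return parent + self.children + self.siblings()


def rules_for(concept, attr, value, doc_attrs, min_conf=0.5):
    """Return (neighbor, support, confidence) for rules of the form
    (concept, attr=value) -> neighbor concept."""
    # Documents that mention the concept and carry the structured value.
    body = {d for d in concept.doc_ids if doc_attrs[d].get(attr) == value}
    if not body:
        return []
    rules = []
    for nb in concept.neighbors():
        both = body & nb.doc_ids
        conf = len(both) / len(body)
        if conf >= min_conf:
            rules.append((nb.name, len(both) / len(doc_attrs), conf))
    return rules


# Toy data (invented): four documents with one structured attribute.
doc_attrs = {1: {"year": 1996}, 2: {"year": 1996},
             3: {"year": 1995}, 4: {"year": 1996}}

root = ConceptNode("business")
merger = root.add_child("merger")
acquisition = root.add_child("acquisition")
takeover = merger.add_child("hostile takeover")

root.doc_ids = {4}
merger.doc_ids = {1, 2, 3}
acquisition.doc_ids = {1, 2}
takeover.doc_ids = {3}

rules = rules_for(merger, "year", 1996, doc_attrs)
neighbor_names = [n.name for n in merger.neighbors()]
```

With this toy data, the only rule that clears the threshold is (merger, year=1996) -> acquisition, with support 0.5 and confidence 1.0. Restricting candidate consequents to the neighbors of a concept keeps the search local to one part of the hierarchy, which appears to be the motivation for tracking sibling links alongside parent and child links.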

Original language: English (US)
Pages: 193-200
Number of pages: 8
State: Published - 1997
Event: Proceedings of the 1997 6th International Conference on Information and Knowledge Management, CIKM'97 - Las Vegas, NV, USA
Duration: Nov 10 1997 to Nov 14 1997

ASJC Scopus subject areas

  • General Business, Management and Accounting
