Unsupervised terminological ontology learning based on hierarchical topic modeling

Xiaofeng Zhu, Diego Klabjan, Patrick N. Bless

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Scopus citations

Abstract

In this paper, we present hierarchical relation-based latent Dirichlet allocation (hrLDA), a data-driven hierarchical topic model for extracting terminological ontologies from a large number of heterogeneous documents. In contrast to traditional topic models, hrLDA relies on noun phrases instead of unigrams, considers syntax and document structures, and enriches topic hierarchies with topic relations. Through a series of experiments, we demonstrate the superiority of hrLDA over existing topic models, especially for building hierarchies. Furthermore, we illustrate the robustness of hrLDA on noisy data sets, which are likely to occur in many practical scenarios. Our ontology evaluation results show that ontologies extracted by hrLDA are very competitive with the ontologies created by domain experts.

Original language: English (US)
Title of host publication: Proceedings - 2017 IEEE International Conference on Information Reuse and Integration, IRI 2017
Editors: Latifur Khan, Balaji Palanisamy, Chengcui Zhang, Sahra Sedigh Sarvestani
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 32-41
Number of pages: 10
ISBN (Electronic): 9781538615621
DOIs
State: Published - Nov 8 2017
Event: 18th IEEE International Conference on Information Reuse and Integration, IRI 2017 - San Diego, United States
Duration: Aug 4 2017 to Aug 6 2017

Publication series

Name: Proceedings - 2017 IEEE International Conference on Information Reuse and Integration, IRI 2017
Volume: 2017-January

Other

Other: 18th IEEE International Conference on Information Reuse and Integration, IRI 2017
Country/Territory: United States
City: San Diego
Period: 8/4/17 to 8/6/17

Funding

This work was supported in part by Intel Corporation and Semiconductor Research Corporation (SRC). We are obliged to Professor Goce Trajcevski from Northwestern University for his insightful suggestions and discussions. This work was partly conducted using the Protégé resource, which is supported by grant GM10331601 from the National Institute of General Medical Sciences of the United States National Institutes of Health.

Keywords

  • Hierarchical topic modeling
  • Knowledge acquisition
  • Ontology learning
  • Terminological ontology

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems and Management
  • Artificial Intelligence
  • Computer Science Applications
  • Information Systems
