Automatic ontology learning from domain-specific short unstructured text data

Yiming Xu, Dnyanesh Rajpathak, Ian Gibbs, Diego Klabjan

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Ontology learning is a critical task in industry: it identifies and extracts concepts reported in text so that these concepts can be used in different tasks, e.g. information retrieval. The problem is non-trivial for several reasons, and only a limited amount of prior research automatically learns a domain-specific ontology from data. In our work, we propose a two-stage classification system to automatically learn an ontology from unstructured text. The first-stage classifier separates candidate concepts into relevant and irrelevant ones, and the second-stage classifier then assigns specific classes to the relevant concepts. The proposed system is deployed as a prototype at General Motors, and its performance is validated using complaint and repair verbatim data collected from different data sources. On average, our system achieves an F1-score of 0.75, even when data distributions are vastly different.
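The two-stage flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `two_stage_classify`, the toy vocabularies, and the class labels `Part` and `Symptom` are all hypothetical, and simple keyword rules stand in for the trained classifiers. The sketch only shows the control flow, in which stage one filters candidate concepts and stage two labels the survivors.

```python
from typing import Callable, Dict, List


def two_stage_classify(
    candidates: List[str],
    is_relevant: Callable[[str], bool],
    assign_class: Callable[[str], str],
) -> Dict[str, str]:
    """Run stage 1 as a relevance filter, then stage 2 only on the survivors."""
    relevant = [c for c in candidates if is_relevant(c)]
    return {c: assign_class(c) for c in relevant}


# Toy stand-ins for trained classifiers (hypothetical vocabularies and labels).
PART_WORDS = {"engine", "brake", "battery"}
SYMPTOM_WORDS = {"noise", "leak", "stall"}


def toy_is_relevant(candidate: str) -> bool:
    # Stage 1 (assumed rule): relevant if any token is in a known vocabulary.
    tokens = set(candidate.lower().split())
    return bool(tokens & (PART_WORDS | SYMPTOM_WORDS))


def toy_assign_class(candidate: str) -> str:
    # Stage 2 (assumed rule): part words take precedence over symptom words.
    tokens = set(candidate.lower().split())
    return "Part" if tokens & PART_WORDS else "Symptom"


candidates = ["engine", "customer states", "grinding noise", "brake pedal"]
result = two_stage_classify(candidates, toy_is_relevant, toy_assign_class)
print(result)
```

Passing the two classifiers as callables keeps the stages independent, so either rule could be replaced by a trained model without changing the surrounding flow.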

Original language: English (US)
Title of host publication: KMIS
Editors: Ana Salgado, Jorge Bernardino, Joaquim Filipe
Publisher: SciTePress
Pages: 29-39
Number of pages: 11
ISBN (Electronic): 9789897584749
State: Published - 2020
Event: 12th International Conference on Knowledge Discovery and Information Retrieval, KDIR 2020 - Part of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K 2020 - Virtual, Online
Duration: Nov 2 2020 → Nov 4 2020

Publication series

Name: IC3K 2020 - Proceedings of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management
Volume: 3

Conference

Conference: 12th International Conference on Knowledge Discovery and Information Retrieval, KDIR 2020 - Part of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K 2020
City: Virtual, Online
Period: 11/2/20 → 11/4/20

Keywords

  • Classification
  • Clustering
  • Information systems
  • Ontology learning

ASJC Scopus subject areas

  • Management of Technology and Innovation
  • Strategy and Management
  • Software
