TabEL: Entity linking in web tables

Chandra Sekhar Bhagavatula*, Thanapon Noraset, Douglas C Downey

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

45 Scopus citations

Abstract

Web tables form a valuable source of relational data. The Web contains an estimated 154 million HTML tables of relational data, with Wikipedia alone containing 1.6 million high-quality tables. Extracting the semantics of Web tables to produce machine-understandable knowledge has become an active area of research. A key step in extracting the semantics of Web content is entity linking (EL): the task of mapping a phrase in text to its referent entity in a knowledge base (KB). In this paper we present TabEL, a new EL system for Web tables. TabEL differs from previous work by weakening the assumption that the semantics of a table can be mapped to pre-defined types and relations found in the target KB. Instead, TabEL enforces soft constraints in the form of a graphical model that assigns higher likelihood to sets of entities that tend to co-occur in Wikipedia documents and tables. In experiments, TabEL significantly reduces error when compared to current state-of-the-art table EL systems, including a 75% error reduction on Wikipedia tables and a 60% error reduction on Web tables. We also make our parsed Wikipedia table corpus and test datasets publicly available for future work.
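To make the idea of soft co-occurrence constraints concrete, here is a minimal sketch of collective disambiguation over the cells of one table row. It is not TabEL's actual model: the abstract describes a graphical model whose details are not given here, so this sketch swaps in a simple additive score (prior plus pairwise co-occurrence) and greedy coordinate-wise updates. All entity names, priors, and co-occurrence weights below are hypothetical values chosen for illustration.

```python
# Toy collective entity linking with soft co-occurrence preferences.
# Everything here (candidates, priors, weights) is made-up illustrative data,
# and the greedy update rule is an assumption, not the paper's inference method.

# Candidate KB entities for each mention string (hypothetical).
CANDIDATES = {
    "Chicago": ["Chicago_(city)", "Chicago_(band)"],
    "Illinois": ["Illinois_(state)"],
    "Boston": ["Boston_(city)", "Boston_(band)"],
}

# Prior preference for each entity, ignoring context (hypothetical).
PRIOR = {
    "Chicago_(city)": 0.6,
    "Chicago_(band)": 0.4,
    "Illinois_(state)": 1.0,
    "Boston_(city)": 0.45,
    "Boston_(band)": 0.55,
}

# Symmetric co-occurrence strength between entity pairs (hypothetical).
COOCCUR = {
    frozenset(["Chicago_(city)", "Illinois_(state)"]): 0.9,
    frozenset(["Chicago_(city)", "Boston_(city)"]): 0.8,
    frozenset(["Chicago_(band)", "Boston_(band)"]): 0.7,
    frozenset(["Boston_(city)", "Illinois_(state)"]): 0.5,
}


def relatedness(a, b):
    """Co-occurrence strength of two entities; 0.0 if the pair is unseen."""
    return COOCCUR.get(frozenset((a, b)), 0.0)


def local_score(entity, context):
    """Prior of the entity plus its co-occurrence with the current context."""
    return PRIOR[entity] + sum(relatedness(entity, c) for c in context)


def link_mentions(mentions, max_iters=10):
    """Assign one entity per mention, re-scoring each against the others."""
    # Start from the context-free choice: the highest-prior candidate.
    assignment = [max(CANDIDATES[m], key=PRIOR.get) for m in mentions]
    for _ in range(max_iters):
        changed = False
        for i, mention in enumerate(mentions):
            # Context is the current assignment of every other mention.
            context = assignment[:i] + assignment[i + 1:]
            best = max(CANDIDATES[mention],
                       key=lambda e: local_score(e, context))
            if best != assignment[i]:
                assignment[i] = best
                changed = True
        if not changed:  # stop once a full pass makes no change
            break
    return assignment


if __name__ == "__main__":
    print(link_mentions(["Chicago", "Illinois", "Boston"]))
```

In this toy data the prior alone would link "Boston" to the band, but the co-occurrence terms with the city and state entities pull it to the city. The constraint is soft in the sense that no type or relation from the KB is required; unrelated candidates are only scored lower, never ruled out.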

Original language: English (US)
Title of host publication: The Semantic Web – ISWC 2015 - 14th International Semantic Web Conference, Proceedings
Editors: Mathieu d’Aquin, Krishnaprasad Thirunarayan, Kavitha Srinivas, Paul Groth, Marcelo Arenas, Oscar Corcho, Markus Strohmaier, Jeff Heflin, Elena Simperl, Steffen Staab, Michel Dumontier
Publisher: Springer Verlag
Pages: 425-441
Number of pages: 17
ISBN (Print): 9783319250069
State: Published - Jan 1 2015
Event: 14th International Semantic Web Conference, ISWC 2015 - Bethlehem, United States
Duration: Oct 11 2015 to Oct 15 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9366
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 14th International Semantic Web Conference, ISWC 2015
Country: United States
City: Bethlehem
Period: 10/11/15 to 10/15/15

Keywords

  • Entity linking
  • Graphical models
  • Named entity disambiguation
  • Web tables

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
