Construction of the literature graph in semantic scholar

Waleed Ammar, Dirk Groeneveld, Chandra Bhagavatula, Iz Beltagy, Miles Crawford, Doug Downey, Jason Dunkelberger, Ahmed Elgohary, Sergey Feldman, Vu Ha, Rodney Kinney, Sebastian Kohlmeier, Kyle Lo, Tyler Murray, Hsu Han Ooi, Matthew Peters, Joanna Power, Sam Skjonsberg, Lucy Lu Wang, Chris WilhelmZheng Yuan, Madeleine Van Zuylen, Oren Etzioni

Research output: Chapter in Book/Report/Conference proceedingConference contribution

44 Scopus citations

Abstract

We describe a deployed scalable system for organizing published scientific literature into a heterogeneous graph to facilitate algorithmic manipulation and discovery. The resulting literature graph consists of more than 280M nodes, representing papers, authors, entities and various interactions between them (e.g., authorships, citations, entity mentions). We reduce literature graph construction into familiar NLP tasks (e.g., entity extraction and linking), point out research challenges due to differences from standard formulations of these tasks, and report empirical results for each task. The methods described in this paper are used to enable semantic features in www.semanticscholar.org.

Original languageEnglish (US)
Title of host publicationIndustry Papers
PublisherAssociation for Computational Linguistics (ACL)
Pages84-91
Number of pages8
ISBN (Electronic)9781948087308
StatePublished - Jan 1 2018
Event2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2018 - New Orleans, United States
Duration: Jun 1 2018Jun 6 2018

Publication series

NameNAACL HLT 2018 - 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference
Volume3

Conference

Conference2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2018
CountryUnited States
CityNew Orleans
Period6/1/186/6/18

ASJC Scopus subject areas

  • Linguistics and Language
  • Language and Linguistics
  • Computer Science Applications

Fingerprint Dive into the research topics of 'Construction of the literature graph in semantic scholar'. Together they form a unique fingerprint.

Cite this