Link prediction for annotation graphs using graph summarization

Andreas Thor*, Philip Anderson, Louiqa Raschid, Saket Navlakha, Barna Saha, Samir Khuller, Xiao Ning Zhang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

28 Scopus citations


Annotation graph datasets are a natural representation of scientific knowledge. They are common in the life sciences where genes or proteins are annotated with controlled vocabulary terms (CV terms) from ontologies. The W3C Linking Open Data (LOD) initiative and Semantic Web technologies are playing a leading role in making such datasets widely available. Scientists can mine these datasets to discover patterns of annotation. While ontology alignment and integration across datasets have been explored in the context of the Semantic Web, there is no current approach to mine such patterns in annotation graph datasets. In this paper, we propose a novel approach for link prediction; it is a preliminary task when discovering more complex patterns. Our prediction is based on a complementary methodology of graph summarization (GS) and dense subgraphs (DSG). GS can exploit and summarize knowledge captured within the ontologies and in the annotation patterns. DSG uses the ontology structure, in particular the distance between CV terms, to filter the graph and find promising subgraphs. We develop a scoring function based on multiple heuristics to rank the predictions. We perform an extensive evaluation on Arabidopsis thaliana genes.

Original language: English (US)
Title of host publication: The Semantic Web, ISWC 2011 - 10th International Semantic Web Conference, Proceedings
Number of pages: 16
Edition: PART 1
State: Published - Nov 2 2011
Event: 10th International Semantic Web Conference, ISWC 2011 - Bonn, Germany
Duration: Oct 23 2011 to Oct 27 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 7031 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 10th International Semantic Web Conference, ISWC 2011


Keywords

  • Dense subgraphs
  • Graph summarization
  • Link prediction
  • Linking Open Data
  • Ontology alignment

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

