Evaluating link prediction methods

Yang Yang*, Ryan N. Lichtenwalter, Nitesh V. Chawla

*Corresponding author for this work

Research output: Contribution to journal, Article

60 Scopus citations

Abstract

Link prediction is a popular research area with important applications in a variety of disciplines, including biology, social science, security, and medicine. The fundamental requirement of link prediction is the accurate and effective prediction of new links in networks. While there are many different methods proposed for link prediction, we argue that the practical performance potential of these methods is often unknown because of challenges in the evaluation of link prediction, which impact the reliability and reproducibility of results. We describe these challenges, provide theoretical proofs and empirical examples demonstrating how current methods lead to questionable conclusions, show how the fallacy of these conclusions is illuminated by methods we propose, and develop recommendations for consistent, standard, and applicable evaluation metrics. We also recommend the use of precision-recall threshold curves and associated areas in lieu of receiver operating characteristic curves due to complications that arise from extreme imbalance in the link prediction classification problem.
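
The recommendation to prefer precision-recall curves under extreme imbalance can be illustrated with a small constructed example. The sketch below is not taken from the paper: the ranking, the helper names `roc_auc` and `average_precision`, and the class sizes (10 true links among 10,010 candidate pairs) are all assumptions chosen for illustration. Average precision is used here as a common step-wise summary of the area under the precision-recall curve.

```python
# Hypothetical illustration (not the authors' code): one fixed ranking of
# candidate node pairs, scored by two different threshold-curve summaries.

def roc_auc(labels_ranked):
    """Area under the ROC curve for labels sorted by descending score (no ties).

    Equals the fraction of (positive, negative) pairs in which the positive
    is ranked above the negative.
    """
    pos = sum(labels_ranked)
    neg = len(labels_ranked) - pos
    negatives_seen = 0
    correctly_ordered = 0
    for label in labels_ranked:
        if label == 1:
            # Every negative not yet seen is ranked below this positive.
            correctly_ordered += neg - negatives_seen
        else:
            negatives_seen += 1
    return correctly_ordered / (pos * neg)


def average_precision(labels_ranked):
    """Mean of the precision values measured at the rank of each positive."""
    positives_seen = 0
    total = 0.0
    for rank, label in enumerate(labels_ranked, start=1):
        if label == 1:
            positives_seen += 1
            total += positives_seen / rank
    return total / positives_seen


# Assumed ranking: 100 non-links score highest, then the 10 true links,
# then the remaining 9,900 non-links.
ranking = [0] * 100 + [1] * 10 + [0] * 9900

auroc = roc_auc(ranking)
ap = average_precision(ranking)
print(f"AUROC = {auroc:.3f}, average precision = {ap:.3f}")
```

In this constructed case the area under the ROC curve is 0.99, yet every one of the top 100 predictions is wrong, which the much lower average precision makes visible.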

Original language: English (US)
Pages (from-to): 751-782
Number of pages: 32
Journal: Knowledge and Information Systems
Volume: 45
Issue number: 3
State: Published - Dec 1 2015

Keywords

  • Class imbalance
  • Link prediction and Evaluation
  • Sampling
  • Temporal effects on link prediction
  • Threshold curves

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Human-Computer Interaction
  • Hardware and Architecture
  • Artificial Intelligence
