Automatically finding relevant citations for clinical guideline development

Duy Duc An Bui*, Siddhartha Jonnalagadda, Guilherme Del Fiol

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    12 Scopus citations


    Objective: Literature database search is a crucial step in the development of clinical practice guidelines and systematic reviews. Even in the age of information technology, literature search is still conducted manually, making it costly, slow, and subject to human error. In this research, we sought to improve the traditional search approach using innovative query expansion and citation ranking approaches.

    Methods: We developed a citation retrieval system composed of query expansion and citation ranking methods. The methods are unsupervised and easily integrated on top of the PubMed search engine. To validate the system, we developed a gold standard consisting of citations that were systematically searched and screened to support the development of cardiovascular clinical practice guidelines. The expansion and ranking methods were evaluated separately and compared with baseline approaches.

    Results: Compared with the baseline PubMed expansion, the query expansion algorithm improved recall (80.2% vs. 51.5%) with a small loss in precision (0.4% vs. 0.6%). The algorithm could find all citations used to support a larger number of guideline recommendations than the baseline approach (64.5% vs. 37.2%, p < 0.001). In addition, the citation ranking approach performed better than PubMed's "most recent" ranking (average precision +6.5%, recall at k +21.1%, p < 0.001), PubMed's rank by "relevance" (average precision +6.1%, recall at k +14.8%, p < 0.001), and the machine learning classifier that identifies scientifically sound studies from MEDLINE citations (average precision +4.9%, recall at k +4.2%, p < 0.001).

    Conclusions: Our unsupervised query expansion and ranking techniques are more flexible and effective than PubMed's default search engine behavior and the machine learning classifier. Automated citation finding shows promise for augmenting traditional literature search.
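The evaluation measures named in the Results (precision, recall, average precision, recall at k) can be illustrated with a minimal sketch. This is not the authors' code: the function names and toy identifiers are hypothetical, the formulas are the standard textbook definitions, and the abstract does not state which exact variants or which value of k the study used. The ranked list is assumed to contain no duplicates.

```python
from typing import Sequence, Set, Tuple


def precision_recall(retrieved: Sequence[str], relevant: Set[str]) -> Tuple[float, float]:
    """Set-based precision and recall of a retrieved citation list."""
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of relevant citations found in the top k ranked results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of precision values at each rank where a relevant citation appears,
    normalized by the total number of relevant citations (standard definition)."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, citation_id in enumerate(ranked, start=1):
        if citation_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


# Toy example with made-up identifiers (not data from the study).
ranked = ["a", "b", "c", "d"]
gold = {"a", "c", "e"}
p, r = precision_recall(ranked, gold)
r_at_2 = recall_at_k(ranked, gold, k=2)
ap = average_precision(ranked, gold)
```

In this toy case, two of the four retrieved items are in the gold standard and one gold item is never retrieved, so recall is capped below 1 no matter how the list is ordered. That is why query expansion (which changes what is retrieved) and ranking (which changes the order) are evaluated separately.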

    Original language: English (US)
    Pages (from-to): 436-445
    Number of pages: 10
    Journal: Journal of Biomedical Informatics
    State: Published - Oct 1 2015


    • Information retrieval
    • Medical subject headings
    • Natural language processing
    • Practice guideline
    • PubMed

    ASJC Scopus subject areas

    • Health Informatics
    • Computer Science Applications

