Text mining in the biocuration workflow: applications for literature curation at WormBase, dictyBase and TAIR.

Kimberly Van Auken*, Petra Fey, Tanya Z. Berardini, Robert Dodson, Laurel Cooper, Donghui Li, Juancarlos Chan, Yuling Li, Siddhartha Basu, Hans Michael Muller, Rex Chisholm, Eva Huala, Paul W. Sternberg, WormBase Consortium

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

31 Scopus citations

Abstract

WormBase, dictyBase and The Arabidopsis Information Resource (TAIR) are model organism databases containing information about Caenorhabditis elegans and other nematodes, the social amoeba Dictyostelium discoideum and related Dictyostelids and the flowering plant Arabidopsis thaliana, respectively. Each database curates multiple data types from the primary research literature. In this article, we describe the curation workflow at WormBase, with particular emphasis on our use of text-mining tools (BioCreative 2012, Workshop Track II). We then describe the application of a specific component of that workflow, Textpresso for Cellular Component Curation (CCC), to Gene Ontology (GO) curation at dictyBase and TAIR (BioCreative 2012, Workshop Track III). We find that, with organism-specific modifications, Textpresso can be used by dictyBase and TAIR to annotate gene products to GO's Cellular Component (CC) ontology.

Original language: English (US)
Pages (from-to): bas040
Journal: Database: the journal of biological databases and curation
Volume: 2012
State: Published - 2012

ASJC Scopus subject areas

  • Information Systems
  • General Biochemistry, Genetics and Molecular Biology
  • General Agricultural and Biological Sciences
