Integrated cTAKES for concept mention detection and normalization

Hongfang Liu*, Kavishwar Wagholikar, Siddhartha Jonnalagadda, Sunghwan Sohn

*Corresponding author for this work

    Research output: Contribution to journal › Conference article

    1 Citation (Scopus)

    Abstract

    We participated in Task 1 using MedTagger, an existing system implemented in integrated cTAKES (icTAKES). Concept mention detection is based on Conditional Random Fields (CRF), and concept mention normalization is based on a greedy dictionary lookup algorithm. A distinctive feature of MedTagger, compared with other concept mention detection systems, is that it incorporates dictionary lookup results into a machine learning framework for sequential labeling. Dictionary lookup results from MedLex and semantic vectors representing distributed semantics were used as features. For concept mention detection, the precision, recall, and F-measure of our best run were 0.8, 0.573, and 0.668, respectively, under strict evaluation and 0.939, 0.766, and 0.844 under relaxed evaluation. The accuracy of our best run for concept mention normalization was 54.6% under strict mapping and 87.0% under relaxed mapping.
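The abstract names three ideas that a short example can make concrete: a greedy dictionary lookup, the use of lookup results as features for sequential labeling, and the F-measure arithmetic. The following is only a minimal sketch under stated assumptions, not MedTagger's implementation. The lexicon entries, the concept IDs, the `DICT=` feature names, and all function names are hypothetical, and the lookup is assumed to be a left-to-right longest match, which the abstract does not specify.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: token tuples mapped to placeholder concept IDs.
# These entries and IDs are illustrative only; they are not taken from MedLex.
TOY_LEXICON: Dict[Tuple[str, ...], str] = {
    ("chest", "pain"): "C_CHEST_PAIN",
    ("pain",): "C_PAIN",
    ("shortness", "of", "breath"): "C_DYSPNEA",
}


def greedy_lookup(
    tokens: List[str],
    lexicon: Dict[Tuple[str, ...], str],
    max_len: int = 5,
) -> List[Tuple[int, int, str]]:
    """Scan left to right; at each position keep the longest lexicon match."""
    matches = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shrink it.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in lexicon:
                matches.append((i, i + n, lexicon[key]))
                i += n  # greedy: skip past the matched span
                matched = True
                break
        if not matched:
            i += 1
    return matches


def dictionary_features(
    tokens: List[str], matches: List[Tuple[int, int, str]]
) -> List[str]:
    """Encode lookup spans as per-token B/I/O tags usable as labeler features."""
    feats = ["DICT=O"] * len(tokens)
    for start, end, _concept in matches:
        feats[start] = "DICT=B"
        for j in range(start + 1, end):
            feats[j] = "DICT=I"
    return feats


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    tokens = "Patient reports chest pain and shortness of breath".split()
    spans = greedy_lookup(tokens, TOY_LEXICON)
    print(spans)
    print(list(zip(tokens, dictionary_features(tokens, spans))))
    # Reported strict and relaxed precision/recall from the abstract.
    print(round(f_measure(0.8, 0.573), 3), round(f_measure(0.939, 0.766), 3))
```

In this sketch, "chest pain" is matched as one span rather than as the shorter entry "pain", which is what makes the lookup greedy. The per-token tags would be combined with other features (for example, word shape or semantic vectors) before being passed to a CRF. Plugging the reported precision and recall into `f_measure` reproduces the reported F-measures of 0.668 (strict) and 0.844 (relaxed).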

    Original language: English (US)
    Journal: CEUR Workshop Proceedings
    Volume: 1179
    State: Published - Jan 1 2013
    Event: 2013 Cross Language Evaluation Forum Conference, CLEF 2013 - Valencia, Spain
    Duration: Sep 23 2013 to Sep 26 2013

    Fingerprint

    Glossaries
    Semantics
    Labeling
    Learning systems

    Keywords

    • Conditional random fields
    • Dictionary lookup
    • Distributed semantics
    • Named entity recognition
    • Normalization

    ASJC Scopus subject areas

    • Computer Science (all)

    Cite this

    Liu, H., Wagholikar, K., Jonnalagadda, S., & Sohn, S. (2013). Integrated cTAKES for concept mention detection and normalization. CEUR Workshop Proceedings, 1179.
    Liu, Hongfang ; Wagholikar, Kavishwar ; Jonnalagadda, Siddhartha ; Sohn, Sunghwan. / Integrated cTAKES for concept mention detection and normalization. In: CEUR Workshop Proceedings. 2013 ; Vol. 1179.
    @article{30dec10e06e94c3cbb0f073aca334a75,
    title = "Integrated cTAKES for concept mention detection and normalization",
    abstract = "We participated in Task 1 using MedTagger, an existing system implemented in integrated cTAKES (icTAKES). Concept mention detection is based on Conditional Random Fields (CRF), and concept mention normalization is based on a greedy dictionary lookup algorithm. A distinctive feature of MedTagger, compared with other concept mention detection systems, is that it incorporates dictionary lookup results into a machine learning framework for sequential labeling. Dictionary lookup results from MedLex and semantic vectors representing distributed semantics were used as features. For concept mention detection, the precision, recall, and F-measure of our best run were 0.8, 0.573, and 0.668, respectively, under strict evaluation and 0.939, 0.766, and 0.844 under relaxed evaluation. The accuracy of our best run for concept mention normalization was 54.6{\%} under strict mapping and 87.0{\%} under relaxed mapping.",
    keywords = "Conditional random fields, Dictionary lookup, Distributed semantics, Named entity recognition, Normalization",
    author = "Hongfang Liu and Kavishwar Wagholikar and Siddhartha Jonnalagadda and Sunghwan Sohn",
    year = "2013",
    month = "1",
    day = "1",
    language = "English (US)",
    volume = "1179",
    journal = "CEUR Workshop Proceedings",
    issn = "1613-0073",
    publisher = "CEUR-WS",

    }

    Liu, H, Wagholikar, K, Jonnalagadda, S & Sohn, S 2013, 'Integrated cTAKES for concept mention detection and normalization', CEUR Workshop Proceedings, vol. 1179.

    Integrated cTAKES for concept mention detection and normalization. / Liu, Hongfang; Wagholikar, Kavishwar; Jonnalagadda, Siddhartha; Sohn, Sunghwan.

    In: CEUR Workshop Proceedings, Vol. 1179, 01.01.2013.

    TY - JOUR

    T1 - Integrated cTAKES for concept mention detection and normalization

    AU - Liu, Hongfang

    AU - Wagholikar, Kavishwar

    AU - Jonnalagadda, Siddhartha

    AU - Sohn, Sunghwan

    PY - 2013/1/1

    Y1 - 2013/1/1

    AB - We participated in Task 1 using MedTagger, an existing system implemented in integrated cTAKES (icTAKES). Concept mention detection is based on Conditional Random Fields (CRF), and concept mention normalization is based on a greedy dictionary lookup algorithm. A distinctive feature of MedTagger, compared with other concept mention detection systems, is that it incorporates dictionary lookup results into a machine learning framework for sequential labeling. Dictionary lookup results from MedLex and semantic vectors representing distributed semantics were used as features. For concept mention detection, the precision, recall, and F-measure of our best run were 0.8, 0.573, and 0.668, respectively, under strict evaluation and 0.939, 0.766, and 0.844 under relaxed evaluation. The accuracy of our best run for concept mention normalization was 54.6% under strict mapping and 87.0% under relaxed mapping.

    KW - Conditional random fields

    KW - Dictionary lookup

    KW - Distributed semantics

    KW - Named entity recognition

    KW - Normalization

    UR - http://www.scopus.com/inward/record.url?scp=84922041543&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84922041543&partnerID=8YFLogxK

    M3 - Conference article

    AN - SCOPUS:84922041543

    VL - 1179

    JO - CEUR Workshop Proceedings

    JF - CEUR Workshop Proceedings

    SN - 1613-0073

    ER -

    Liu H, Wagholikar K, Jonnalagadda S, Sohn S. Integrated cTAKES for concept mention detection and normalization. CEUR Workshop Proceedings. 2013 Jan 1;1179.