TY - JOUR
T1 - Agile text mining for the 2014 i2b2/UTHealth Cardiac risk factors challenge
AU - Cormack, James
AU - Nath, Chinmoy
AU - Milward, David
AU - Raja, Kalpana
AU - Jonnalagadda, Siddhartha R.
N1 - Publisher Copyright: © 2015 Elsevier Inc. Copyright 2017 Elsevier B.V., All rights reserved.
PY - 2015/12/1
Y1 - 2015/12/1
N2 - This paper describes the use of an agile text mining platform (Linguamatics' Interactive Information Extraction Platform, I2E) to extract document-level cardiac risk factors in patient records as defined in the i2b2/UTHealth 2014 challenge. The approach uses a data-driven rule-based methodology with the addition of a simple supervised classifier. We demonstrate that agile text mining allows for rapid optimization of extraction strategies, while post-processing can leverage annotation guidelines, corpus statistics and logic inferred from the gold standard data. We also show how data imbalance in a training set affects performance. Evaluation of this approach on the test data gave an F-Score of 91.7%, one percent behind the top performing system.
AB - This paper describes the use of an agile text mining platform (Linguamatics' Interactive Information Extraction Platform, I2E) to extract document-level cardiac risk factors in patient records as defined in the i2b2/UTHealth 2014 challenge. The approach uses a data-driven rule-based methodology with the addition of a simple supervised classifier. We demonstrate that agile text mining allows for rapid optimization of extraction strategies, while post-processing can leverage annotation guidelines, corpus statistics and logic inferred from the gold standard data. We also show how data imbalance in a training set affects performance. Evaluation of this approach on the test data gave an F-Score of 91.7%, one percent behind the top performing system.
KW - Clinical natural language processing
KW - Information extraction
KW - Text mining
UR - http://www.scopus.com/inward/record.url?scp=84937854995&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84937854995&partnerID=8YFLogxK
U2 - 10.1016/j.jbi.2015.06.030
DO - 10.1016/j.jbi.2015.06.030
M3 - Article
C2 - 26209007
AN - SCOPUS:84937854995
VL - 58
SP - S120
EP - S127
JO - Journal of Biomedical Informatics
JF - Journal of Biomedical Informatics
SN - 1532-0464
ER -