Profile of a dictionary compiled from scanning over one million words of surgical pathology narrative text

R. L. Wong, J. D. Reno, T. C. Hain, R. C. Platt, P. S. Gaynon, D. M. Joseph

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

An anatomic pathology natural language dictionary (LEXICON) has evolved over a nine-year period as a result of scanning over one million words of narrative text from tissue examination request forms and surgical pathology reports. The text is parsed into individual words, which are looked up in LEXICON and flagged by action codes that determine their usage in constructing a KWIC index file and an on-line database retrievable by keywords. LEXICON now resides on an IBM 370/168 system and has survived several transfers between computer systems. After each batch of narrative text is scanned, an update program is used to modify LEXICON. LEXICON now contains 24,228 medical and nonmedical terms: 24.8% are errors (misspellings), 45.9% are keywords retrievable on and off line, and 52.2% are cross-referenced to a supplementary word. A preliminary study shows that many of the “nonmedical” terms in LEXICON carry significant medical information, and that there is considerable overlap of medical words among LEXICON, SNOMED, and ICDA-8. Our LEXICON appears to be an intermediate step in the process of evolving an algorithm capable of “understanding” medical narrative text.
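
The scanning step described in the abstract (parse text into words, look each word up, act on its action code, build a KWIC index) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not list the actual action codes, the record layout, or the update procedure, so the codes `K`, `I`, and `E`, the toy `LEXICON` entries, and the `scan` function are hypothetical and are not the authors' implementation.

```python
import re
from collections import defaultdict

# Hypothetical action codes; the abstract does not name the real ones.
KEYWORD = "K"  # word is indexed and retrievable
IGNORE = "I"   # word is known but not indexed
ERROR = "E"    # misspelling, cross-referenced to a supplementary word

# Toy lexicon (assumed layout): word -> (action code, cross-reference or None)
LEXICON = {
    "biopsy": (KEYWORD, None),
    "carcinoma": (KEYWORD, None),
    "carcinma": (ERROR, "carcinoma"),
    "of": (IGNORE, None),
    "the": (IGNORE, None),
    "shows": (IGNORE, None),
}


def scan(text, lexicon, width=3):
    """Return (kwic_index, unknown_words) for one piece of narrative text.

    kwic_index maps each keyword to the contexts it occurred in;
    unknown_words are candidates for a later lexicon update.
    """
    words = re.findall(r"[a-z]+", text.lower())
    kwic = defaultdict(list)
    unknown = []
    for i, word in enumerate(words):
        entry = lexicon.get(word)
        if entry is None:
            unknown.append(word)  # not in the lexicon yet
            continue
        action, xref = entry
        if action == IGNORE:
            continue
        # A misspelling is filed under the word it is cross-referenced to.
        key = xref if action == ERROR and xref else word
        if lexicon.get(key, (None, None))[0] != KEYWORD:
            continue
        context = " ".join(words[max(0, i - width): i + width + 1])
        kwic[key].append(context)
    return dict(kwic), unknown


index, new_words = scan("Biopsy of the colon shows carcinma.", LEXICON)
print(index)
print(new_words)
```

In this sketch, words missing from the lexicon are collected and returned, loosely mirroring the role of the update program that runs after each batch; how the original system classified new words is not stated in the abstract.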

Original language: English (US)
Pages (from-to): 382-398
Number of pages: 17
Journal: Computers and Biomedical Research
Volume: 13
Issue number: 4
State: Published - 1980

ASJC Scopus subject areas

  • Medicine (miscellaneous)

