Rich text formatted EHR narratives: A hidden and ignored trove

Zexian Zeng, Yuan Zhao, Mengxin Sun, Andy H. Vo, Justin B Starren, Yuan Luo*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This study presents an approach for mining structured information from clinical narratives in Electronic Health Records (EHRs) by using Rich Text Formatted (RTF) records. RTF is adopted by many medical information management systems. These files contain rich structural information that can be extracted and interpreted, yet such information is largely ignored. We investigate multiple types of EHR narratives in the Enterprise Data Warehouse of a large multisite healthcare chain consisting of both an academic medical center and community hospitals. We focus on the RTF constructs related to tables and sections that are not available in plain-text EHR narratives. We show how to parse these RTF constructs and analyze their prevalence and characteristics across multiple types of EHR narratives. Our case study demonstrates the additional utility of features derived from RTF constructs over plain-text-oriented NLP.
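To make the idea of table constructs concrete, here is a minimal sketch of how table rows might be recovered from raw RTF. It is not the authors' parser. It assumes only the standard RTF table control words (`\trowd` opens a row definition, `\cell` closes a cell, `\row` closes a row); the function name `extract_table_rows` and the sample note are invented for illustration, and escapes such as `\'e9` or Unicode sequences are not decoded.

```python
import re

# One token is a control word (letters plus optional numeric parameter and one
# trailing space), a control symbol, a group brace, or a run of plain text.
TOKEN = re.compile(r"\\([a-zA-Z]+)(-?\d+)? ?|\\([^a-zA-Z])|([{}])|([^\\{}]+)")


def extract_table_rows(rtf):
    """Return table rows as lists of cell strings (illustrative sketch only)."""
    rows, cells, buf = [], [], []
    for match in TOKEN.finditer(rtf):
        word, _param, _symbol, _brace, text = match.groups()
        if word == "trowd":
            # A new row definition starts: discard any text seen outside a table.
            cells, buf = [], []
        elif word == "cell":
            # End of a cell: flush the buffered text into the current row.
            cells.append("".join(buf).strip())
            buf = []
        elif word == "row":
            # End of a row: store the collected cells.
            rows.append(cells)
            cells = []
        elif text:
            # Raw line breaks carry no meaning in RTF, so drop them.
            buf.append(text.replace("\r", "").replace("\n", ""))
    return rows


# Invented sample note, not taken from the study's data.
SAMPLE = (
    r"{\rtf1\ansi Labs:\par "
    r"\trowd\cellx3000\cellx6000\intbl Sodium\cell 140 mmol/L\cell\row "
    r"\trowd\cellx3000\cellx6000\intbl Potassium\cell 4.1 mmol/L\cell\row "
    r"\pard Plan: recheck\par}"
)

print(extract_table_rows(SAMPLE))
# [['Sodium', '140 mmol/L'], ['Potassium', '4.1 mmol/L']]
```

A plain-text conversion of the same note would flatten the two rows into running text, which is the kind of structure loss the abstract refers to.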

Original language: English (US)
Title of host publication: MEDINFO 2019
Subtitle of host publication: Health and Wellbeing e-Networks for All - Proceedings of the 17th World Congress on Medical and Health Informatics
Editors: Brigitte Seroussi, Lucila Ohno-Machado
Publisher: IOS Press
Pages: 472-476
Number of pages: 5
ISBN (Electronic): 9781643680026
DOI: https://doi.org/10.3233/SHTI190266
State: Published - Aug 21 2019
Event: 17th World Congress on Medical and Health Informatics, MEDINFO 2019 - Lyon, France
Duration: Aug 25 2019 to Aug 30 2019

Publication series

Name: Studies in Health Technology and Informatics
Volume: 264
ISSN (Print): 0926-9630
ISSN (Electronic): 1879-8365

Conference

Conference: 17th World Congress on Medical and Health Informatics, MEDINFO 2019
Country: France
City: Lyon
Period: 8/25/19 to 8/30/19

Fingerprint

Electronic Health Records
Health
Management Information Systems
Data warehouses
Community Hospital
Information management
Delivery of Health Care
Industry

Keywords

  • Electronic Health Records
  • Information Management
  • Natural Language Processing

ASJC Scopus subject areas

  • Biomedical Engineering
  • Health Informatics
  • Health Information Management

Cite this

Zeng, Z., Zhao, Y., Sun, M., Vo, A. H., Starren, J. B., & Luo, Y. (2019). Rich text formatted EHR narratives: A hidden and ignored trove. In B. Seroussi & L. Ohno-Machado (Eds.), MEDINFO 2019: Health and Wellbeing e-Networks for All - Proceedings of the 17th World Congress on Medical and Health Informatics (pp. 472-476). (Studies in Health Technology and Informatics; Vol. 264). IOS Press. https://doi.org/10.3233/SHTI190266
@inproceedings{72c5fb86723d452d8f3c4250603bd060,
title = "Rich text formatted EHR narratives: A hidden and ignored trove",
abstract = "This study presents an approach for mining structured information from clinical narratives in Electronic Health Records (EHRs) by using Rich Text Formatted (RTF) records. RTF is adopted by many medical information management systems. These files contain rich structural information that can be extracted and interpreted, yet such information is largely ignored. We investigate multiple types of EHR narratives in the Enterprise Data Warehouse of a large multisite healthcare chain consisting of both an academic medical center and community hospitals. We focus on the RTF constructs related to tables and sections that are not available in plain-text EHR narratives. We show how to parse these RTF constructs and analyze their prevalence and characteristics across multiple types of EHR narratives. Our case study demonstrates the additional utility of features derived from RTF constructs over plain-text-oriented NLP.",
keywords = "Electronic Health Records, Information Management, Natural Language Processing",
author = "Zexian Zeng and Yuan Zhao and Mengxin Sun and Vo, {Andy H.} and Starren, {Justin B} and Yuan Luo",
year = "2019",
month = "8",
day = "21",
doi = "10.3233/SHTI190266",
language = "English (US)",
series = "Studies in Health Technology and Informatics",
publisher = "IOS Press",
pages = "472--476",
editor = "Brigitte Seroussi and Lucila Ohno-Machado",
booktitle = "MEDINFO 2019",
volume = "264",
}

TY - GEN

T1 - Rich text formatted EHR narratives

T2 - A hidden and ignored trove

AU - Zeng, Zexian

AU - Zhao, Yuan

AU - Sun, Mengxin

AU - Vo, Andy H.

AU - Starren, Justin B

AU - Luo, Yuan

PY - 2019/8/21

Y1 - 2019/8/21

AB - This study presents an approach for mining structured information from clinical narratives in Electronic Health Records (EHRs) by using Rich Text Formatted (RTF) records. RTF is adopted by many medical information management systems. These files contain rich structural information that can be extracted and interpreted, yet such information is largely ignored. We investigate multiple types of EHR narratives in the Enterprise Data Warehouse of a large multisite healthcare chain consisting of both an academic medical center and community hospitals. We focus on the RTF constructs related to tables and sections that are not available in plain-text EHR narratives. We show how to parse these RTF constructs and analyze their prevalence and characteristics across multiple types of EHR narratives. Our case study demonstrates the additional utility of features derived from RTF constructs over plain-text-oriented NLP.

KW - Electronic Health Records

KW - Information Management

KW - Natural Language Processing

UR - http://www.scopus.com/inward/record.url?scp=85071513618&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85071513618&partnerID=8YFLogxK

U2 - 10.3233/SHTI190266

DO - 10.3233/SHTI190266

M3 - Conference contribution

C2 - 31437968

AN - SCOPUS:85071513618

T3 - Studies in Health Technology and Informatics

SP - 472

EP - 476

BT - MEDINFO 2019

A2 - Seroussi, Brigitte

A2 - Ohno-Machado, Lucila

PB - IOS Press

ER -
