Bayesian Generative Methods for Extracting and Modeling Relations in EHR Narratives

  • Luo, Yuan (PD/PI)
  • Starren, Justin (Co-Investigator)

Project: Research project

Project Details


Medicine has entered an era in which entire hospitals are progressively adopting intensive real-time patient monitoring and generating ICU-like clinical data. These rapidly growing data make the ICU a snapshot of tomorrow’s standard of care, which should benefit from computer-aided decision making. The data include numerical or coded information and, for the most part, unstructured narrative text such as physicians’ and nurses’ notes, specialists’ reports, and discharge summaries. Both types of data have been shown to be highly informative for tasks such as cohort selection, and they work best in combination. To combine them, however, specific pieces of information must be extracted from the narrative reports and coded in a formal representation. These include medical concepts such as symptoms, diseases, medications, and procedures; characteristics such as certainty, severity, and dose; assertions about these items, such as whether they pertain to the patient or a family member; and relations among these mentions, including which condition is treated by which action and with what degree of success, the time sequence and duration of events, and interpretations of laboratory test results as relations among medical concepts such as cells and antigens (e.g., “[large atypical cells] express [CD30]”). Concepts and assertions can be regarded as simple relations, and our proposal focuses on modeling narrative relations as information complementary to numeric data for predicting patient outcomes. Most existing techniques for interpreting clinical narratives either rely on hand-crafted rule systems and large medical thesauri or use machine learning to build classification or regression models from large annotated data sets. The former are difficult and laborious to generalize, whereas the latter require large volumes of human-labeled data and may result
Effective start/end date: 9/1/17 to 8/31/20


  • National Library of Medicine (1R21LM012618-02)

