Assessing the Burden of Diabetes By Type in Children, Adolescents and Young Adults (DiCAYA) – 2020 Component B (#901642)

Project: Research project

Project Details


Dr. Luo is a well-recognized expert in biomedical machine learning and Natural Language Processing (NLP). For the proposed project, Dr. Luo will be responsible for providing guidance on using NLP to mine clinical notes from Lurie Children’s Hospital. This project explores the incidence and prevalence of diabetes by type in young adults in Chicago. The proposed work will make a unique contribution to the Assessing the Burden of Diabetes By Type in Children, Adolescents and Young Adults (DiCAYA) initiative by adding NLP and machine learning methods to the project's computable phenotyping. Much of the computable phenotyping activity across DiCAYA involves queries of structured data (e.g., PCORnet data marts). We will run such queries as well, but adding NLP and machine learning will allow us to contribute substantially to the project's diabetes surveillance methods and results. Previous studies have shown that an epidemiologically significant proportion of diabetes cases (~10-20%) are missed when structured data alone are used. NLP and machine learning are powerful tools for extracting structured representations of the content of interest from notes, and for understanding associations within that content. However, these methods have not yet been fully applied in diabetes research and public health. Our application of these methods will help improve identification of prevalent and incident type 1 and type 2 diabetes cases and, by providing additional context for our project’s structured data results, will help inform our project and the DiCAYA initiative overall.
Effective start/end date: 9/30/21 to 9/29/23


  • Ann & Robert H. Lurie Children’s Hospital of Chicago (901642-NU // U18DP006694)
  • Centers for Disease Control and Prevention (901642-NU // U18DP006694)

