Predicting mortality in critically ill patients with diabetes using machine learning and clinical notes

Jiancheng Ye, Liang Yao, Jiahong Shen, Rethavathi Janarthanam, Yuan Luo*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

3 Scopus citations


Background: Diabetes mellitus is a prevalent metabolic disease characterized by chronic hyperglycemia. The avalanche of healthcare data is accelerating precision and personalized medicine. Artificial intelligence and algorithm-based approaches are becoming more and more vital to support clinical decision-making. These methods are able to augment health care providers by taking away some of their routine work and enabling them to focus on critical issues. However, few studies have used predictive modeling to uncover associations between comorbidities in ICU patients and diabetes. This study aimed to use Unified Medical Language System (UMLS) resources, involving machine learning and natural language processing (NLP) approaches to predict the risk of mortality.
Methods: We conducted a secondary analysis of Medical Information Mart for Intensive Care III (MIMIC-III) data. Different machine learning modeling and NLP approaches were applied. Domain knowledge in health care is built on the dictionaries created by experts who defined the clinical terminologies such as medications or clinical symptoms. This knowledge is valuable to identify information from text notes that assert a certain disease. Knowledge-guided models can automatically extract knowledge from clinical notes or biomedical literature that contains conceptual entities and relationships among these various concepts. Mortality classification was based on the combination of knowledge-guided features and rules. UMLS entity embedding and convolutional neural network (CNN) with word embeddings were applied. Concept Unique Identifiers (CUIs) with entity embeddings were utilized to build clinical text representations.
Results: The best configuration of the employed machine learning models yielded a competitive AUC of 0.97. Machine learning models along with NLP of clinical notes are promising to assist health care providers to predict the risk of mortality of critically ill patients.
Conclusion: UMLS resources and clinical notes are powerful and important tools to predict mortality in diabetic patients in the critical care setting. The knowledge-guided CNN model is effective (AUC = 0.97) for learning hidden features.
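For readers unfamiliar with the reported metric, the AUC can be read as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor. The following is a minimal sketch of that pairwise definition, not the authors' evaluation code; the function name `roc_auc` and the toy labels and scores are assumptions made up for illustration and are not drawn from MIMIC-III.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition.

    labels: iterable of 0/1 outcomes (1 = positive class, e.g. death).
    scores: iterable of predicted risks, same length as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = died, 0 = survived.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

In this toy case 11 of the 12 positive-negative pairs are ranked correctly. An AUC of 0.97, as reported above, would correspond to roughly 97 of every 100 such pairs being ordered correctly. The quadratic loop is chosen for readability; a rank-based computation would be preferable on a large cohort.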

Original language: English (US)
Article number: 295
Journal: BMC Medical Informatics and Decision Making
State: Published - Dec 2020


Keywords

  • Clinical notes
  • Deep learning
  • Diabetic disease
  • Entity embedding
  • ICU
  • Machine learning
  • Mortality
  • Natural language processing
  • Word embedding

ASJC Scopus subject areas

  • Health Policy
  • Health Informatics
