Developing a machine learning model to detect diagnostic uncertainty in clinical documentation

Trisha L. Marshall*, Lindsay C. Nickels, Patrick W. Brady, Ezra J. Edgerton, James J. Lee, Philip A. Hagedorn

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

Background and Objective: Diagnostic uncertainty, when unrecognized or poorly communicated, can result in diagnostic error. However, diagnostic uncertainty is challenging to study due to a lack of validated identification methods. This study aims to identify distinct linguistic patterns associated with diagnostic uncertainty in clinical documentation.

Design, Setting and Participants: This case–control study compares the clinical documentation of hospitalized children who received a novel uncertain diagnosis (UD) label during their admission with that of a set of matched controls. Linguistic analyses identified potential linguistic indicators (i.e., words or phrases) of diagnostic uncertainty, which were then manually reviewed by a linguist and clinical experts to identify those most relevant to diagnostic uncertainty. A natural language processing program categorized medical terminology into semantic types (e.g., sign or symptom), from which we identified a subset that was both reliably categorized and relevant to diagnostic uncertainty. Finally, a competitive machine learning modeling strategy utilizing the linguistic indicators and semantic types compared different predictive models for identifying diagnostic uncertainty.

Results: Our cohort included 242 UD-labeled patients and 932 matched controls, with a combined total of 3070 clinical notes. The best-performing model was a random forest, utilizing a combination of linguistic indicators and semantic types, yielding a sensitivity of 89.4% and a positive predictive value of 96.7%.

Conclusion: Expert labeling, natural language processing, and machine learning methods combined with human validation resulted in highly predictive models to detect diagnostic uncertainty in clinical documentation and represent a promising approach to detecting, studying, and ultimately mitigating diagnostic uncertainty in clinical practice.

Original language: English (US)
Pages (from-to): 405-412
Number of pages: 8
Journal: Journal of Hospital Medicine
Volume: 18
Issue number: 5
State: Published - May 2023

Funding

Dr. Marshall received funding for this work from the American Board of Medical Specialties (ABMS) in conjunction with the Gordon and Betty Moore Foundation. Drs. Lee and Nickels and Mr. Edgerton were supported by the Andrew W. Mellon Foundation, Public Knowledge program grants 1708‐04721 and 2005‐07903. The content is solely the responsibility of the authors and does not necessarily represent the official views of the ABMS, the Gordon and Betty Moore Foundation, or the Andrew W. Mellon Foundation.

ASJC Scopus subject areas

  • Internal Medicine
  • Leadership and Management
  • Fundamentals and skills
  • Health Policy
  • Care Planning
  • Assessment and Diagnosis
