Informative missingness: What can we learn from patterns in missing laboratory data in the electronic health record?

The Consortium for the Clinical Characterization of COVID-19 by EHR (4CE)

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

Background: In electronic health records, patterns of missing laboratory test results could capture patients’ course of disease as well as reflect clinicians’ concerns about possible conditions. These patterns are often understudied and overlooked. This study aims to identify informative patterns of missingness among laboratory data collected across 15 healthcare system sites in three countries for COVID-19 inpatients. Methods: We collected and analyzed demographic, diagnosis, and laboratory data for 69,939 patients with positive COVID-19 PCR tests across three countries from 1 January 2020 through 30 September 2021. We analyzed missing laboratory measurements across sites, missingness stratified by demographic variables, temporal trends of missingness, correlations between labs based on missingness indicators over time, and clustering of groups of labs based on their missingness/ordering pattern. Results: With these analyses, we identified mapping issues at seven of the 15 sites. We also identified nuances in data collection and variable definition across the sites. Temporal trend analyses may support the use of laboratory test result missingness patterns in identifying severe COVID-19 patients. Lastly, using missingness patterns, we determined relationships between various labs that reflect clinical behaviors. Conclusion: In this work, we use computational approaches to relate missingness patterns to hospital treatment capacity and highlight the heterogeneity of looking at COVID-19 over time and at multiple sites, where there might be different phases, policies, etc. Changes in missingness could suggest a change in a patient’s condition, and patterns of missingness among laboratory measurements could potentially identify clinical outcomes. This allows sites to consider missing data as informative to analyses and helps researchers identify which sites are better poised to study particular questions.
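As an illustration of the kind of analysis the Methods describe, the following is a minimal sketch of building missingness indicators per lab and correlating them between labs. It is not the authors' implementation: the lab names, the toy records, and the helper functions (`missing_indicators`, `missing_rate`, `pearson`) are all hypothetical, and a plain Pearson correlation on binary indicators is assumed here as one reasonable choice of measure.

```python
from math import sqrt

# Hypothetical toy data: one row per patient observation.
# None marks a lab that was not measured (i.e. "missing").
records = [
    {"crp": 12.0, "d_dimer": 0.8, "troponin": None},
    {"crp": None, "d_dimer": None, "troponin": None},
    {"crp": 30.5, "d_dimer": 1.9, "troponin": 0.04},
    {"crp": None, "d_dimer": None, "troponin": 0.02},
]
labs = ["crp", "d_dimer", "troponin"]  # assumed lab names, for illustration only


def missing_indicators(rows, lab):
    """Return 1 where the lab value is missing and 0 where it was recorded."""
    return [1 if row.get(lab) is None else 0 for row in rows]


def missing_rate(rows, lab):
    """Fraction of rows in which the lab was not measured."""
    indicators = missing_indicators(rows, lab)
    return sum(indicators) / len(indicators)


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


# Per-lab missingness rate.
rates = {lab: missing_rate(records, lab) for lab in labs}

# Pairwise correlation of missingness indicators: labs that tend to be
# ordered (or skipped) together have correlations close to 1.
corr = {
    (a, b): pearson(missing_indicators(records, a), missing_indicators(records, b))
    for a in labs
    for b in labs
}
```

In this toy example, `crp` and `d_dimer` are always missing together, so their indicator correlation is high, while `troponin` follows a different ordering pattern. A correlation matrix of this kind could then feed a clustering step to group labs by ordering pattern.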

Original language: English (US)
Article number: 104306
Journal: Journal of Biomedical Informatics
Volume: 139
State: Published - Mar 2023

Funding

MM is supported by National Center for Advancing Translational Sciences (NCATS) UL1 TR001857. DAH is supported by NCATS UL1TR002240. WY is supported by National Institutes of Health (NIH) T32HD040128. BJA is supported by National Heart, Lung, and Blood Institute (NHLBI) U24 HL148865. WGL is supported by National Library of Medicine (NLM) R00LM012926. YL is supported by R01LM013337. SNM is supported by NCATS 5UL1TR001857-05 and National Human Genome Research Institute (NHGRI) 5R01HG009174-04. GSO is supported by NIH grants U24CA210967 and P30ES017885. LPP is supported by NCATS CTSA Award UL1TR002366. SV is supported by NCATS UL1TR001857. GMW is supported by NCATS UL1TR002541, NCATS UL1TR000005, NLM R01LM013345, and NHGRI 3U01HG008685-05S2. ZX is supported by National Institute of Neurological Disorders and Stroke (NINDS) R01NS098023 and R01NS124882. QL is supported by National Institute of General Medical Sciences (NIGMS) R01GM124111 and NIH National Institute on Aging RF1AG063481. DLM is supported by NCATS UL1-TR001878. JHH is supported by NCATS UL1-TR001878.

Keywords

  • COVID-19
  • Electronic health records
  • Laboratory tests
  • Missing data
  • Multi-site health data

ASJC Scopus subject areas

  • Health Informatics
  • Computer Science Applications
