Privacy-Preserving Record Linkage to Identify Fragmented Electronic Medical Records in the All of Us Research Program

Abel N. Kho*, Jingzhi Yu, Molly Scannell Bryan, Charon Gladfelter, Howard S. Gordon, Shaun Grannis, Margaret Madden, Eneida Mendonca, Vesna Mitrovic, Raj Shah, Umberto Tachinardi, Bradley Taylor

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

5 Scopus citations

Abstract

As part of a national study in the United States to recruit one million Americans (All of Us Research Program) and their Electronic Health Record data, we set out to determine the degree to which care is fragmented across a sample of participating health provider organizations (HPOs). We distributed a previously validated Privacy-Preserving Record Linkage (PPRL) tool to participating sites to generate a unique set of keyed encrypted hashes for seven participating institutions across three States in the Upper Midwest of the U.S. An honest broker received the resulting encrypted hashes to identify patients with the same encrypted hashes shared across any combination of more than one institution as a proxy for patients receiving care across institutions. Out of 5,831,238 individuals, we identified 458,680 patients with data at more than one institution. Care fragmentation varied significantly by State and by Institution ranging from 6.1% up to 32.7%. Patients with fragmented care were more likely to be black (11.8% vs 10.8%), and slightly older (Median birth year 1968 vs 1969) compared with patients receiving care at only one participating institution. In contrast, patients who maintained an address in a warmer state (“snowbirds”) were the least likely to be black (7.5%) of all study groups. We identified conflicting or inconsistent demographic information in 49.1% of patients with care fragmentation compared with 5.6% of patients without care fragmentation. Privacy-preserving record linkage can be an effective means to identify populations with care fragmentation and poor data quality for focused clinical and data improvement efforts.
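The linkage flow described in the abstract (sites generate keyed hashes locally, then an honest broker compares only the hashes) can be illustrated with a minimal Python sketch. This is not the validated PPRL tool used in the study. The choice of HMAC-SHA256, the single hash over name and date of birth, the normalization rules, the key, and the sample records are all assumptions made for illustration.

```python
import hashlib
import hmac
from collections import defaultdict


def keyed_hash(secret_key: bytes, first: str, last: str, dob: str) -> str:
    """Hex HMAC-SHA256 over normalized identifiers (hypothetical field choice)."""
    normalized = "|".join(part.strip().lower() for part in (first, last, dob))
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def site_hashes(secret_key: bytes, records) -> set:
    """Run at each site: only the hashes, never the identifiers, leave the site."""
    return {keyed_hash(secret_key, *record) for record in records}


def find_fragmented(hashes_by_site: dict) -> dict:
    """Run by the honest broker: keep hashes that appear at more than one site."""
    sites_per_hash = defaultdict(set)
    for site, hashes in hashes_by_site.items():
        for h in hashes:
            sites_per_hash[h].add(site)
    return {h: sites for h, sites in sites_per_hash.items() if len(sites) > 1}


# Hypothetical key shared by the sites but withheld from the broker.
SECRET = b"shared-secret-known-to-sites-only"

# Made-up records; the first person appears at both sites with different formatting.
site_a = [("Ana", "Diaz", "1968-03-02"), ("Bo", "Lee", "1975-11-30")]
site_b = [(" ana", "DIAZ ", "1968-03-02"), ("Cy", "Roe", "1990-01-15")]

hashes_by_site = {
    "A": site_hashes(SECRET, site_a),
    "B": site_hashes(SECRET, site_b),
}
fragmented = find_fragmented(hashes_by_site)
print(f"{len(fragmented)} patient hash(es) seen at more than one site")
```

Because the broker sees only keyed hashes, it can count overlaps across institutions without learning names or birth dates. An exact-match scheme like this one misses records whose identifiers differ beyond simple formatting, which is one reason a production tool would likely generate several hashes per patient from different field combinations.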

Original language: English (US)
Title of host publication: Machine Learning and Knowledge Discovery in Databases - International Workshops of ECML PKDD 2019, Proceedings
Editors: Peggy Cellier, Kurt Driessens
Publisher: Springer
Pages: 79-87
Number of pages: 9
ISBN (Print): 9783030438869
State: Published - 2020
Event: 19th Joint European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2019 - Wurzburg, Germany
Duration: Sep 16 2019 to Sep 20 2019

Publication series

Name: Communications in Computer and Information Science
Volume: 1168 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 19th Joint European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2019
Country/Territory: Germany
City: Wurzburg
Period: 9/16/19 to 9/20/19

Keywords

  • Ecology of care
  • Privacy preservation
  • Record linkage

ASJC Scopus subject areas

  • General Computer Science
  • General Mathematics
