Efficient genome-wide, privacy-preserving Similar Patient Query based on private edit distance

Xiao Wang, Yan Huang, Yongan Zhao, Haixu Tang, Xiao Feng Wang, Diyue Bu

Research output: Chapter in Book/Report/Conference proceedingConference contribution

48 Scopus citations

Abstract

Edit distance has been proven to be an important and frequentlyused metric in many human genomic research, with Similar Patient Query (SPQ) being a particularly promising and attractive example. However, due to the widespread privacy concerns on revealing personal genomic data, the scope and scale of many novel use of genome edit distance are substantially limited. While the problem of private genomic edit distance has been studied by the research community for over a decade [5], the state-of-the-art solution [30] is far from even close to be applicable to real genome sequences. In this paper, we propose several private edit distance protocols that feature unprecedentedly high efficiency and precision. Our construction is a combination of a novel genomic edit distance approximation algorithm and new construction of private set difference size protocols. With the private edit distance based secure SPQ primitive, we propose GENSETS, a genome-wide, privacy-preserving similar patient query system. It is able to support searching large-scale, distributed genome databases across the nation. We have implemented a prototype of GENSETS. The experimental results show that, with 100 Mbps network connection, it would take GENSETS less than 200 minutes to search through 1 million breast cancer patients (distributed nation-wide in 250 hospitals, each having 4000 patients), based on edit distances between their genomes of lengths about 75 million nucleotides each.

Original languageEnglish (US)
Title of host publicationCCS 2015 - Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security
PublisherAssociation for Computing Machinery
Pages492-503
Number of pages12
ISBN (Electronic)9781450338325
DOIs
Publication statusPublished - Oct 12 2015
Event22nd ACM SIGSAC Conference on Computer and Communications Security, CCS 2015 - Denver, United States
Duration: Oct 12 2015Oct 16 2015

Publication series

NameProceedings of the ACM Conference on Computer and Communications Security
Volume2015-October
ISSN (Print)1543-7221

Conference

Conference22nd ACM SIGSAC Conference on Computer and Communications Security, CCS 2015
CountryUnited States
CityDenver
Period10/12/1510/16/15

    Fingerprint

Keywords

  • Edit distance
  • Genomic computation
  • Secure computation

ASJC Scopus subject areas

  • Software
  • Computer Networks and Communications

Cite this

Wang, X., Huang, Y., Zhao, Y., Tang, H., Wang, X. F., & Bu, D. (2015). Efficient genome-wide, privacy-preserving Similar Patient Query based on private edit distance. In CCS 2015 - Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (pp. 492-503). (Proceedings of the ACM Conference on Computer and Communications Security; Vol. 2015-October). Association for Computing Machinery. https://doi.org/10.1145/2810103.2813725