Abstract
One of the most formidable challenges electronic health records (EHRs) pose for traditional analytics is the inability to map directly (or reliably) to medical concepts or phenotypes. Among other things, EHR-based phenotyping can help identify and target patients for interventions and improve real-time clinical decisions. Existing phenotyping approaches often require labor-intensive supervision from medical experts or do not focus on generating concise and diverse phenotypes. Sparsity in phenotypes is key to making them interpretable and useful to clinicians, while diversity allows clinicians to grasp the main features of a patient population quickly. In this paper, we introduce Granite, a diversified, sparse nonnegative tensor factorization method to derive phenotypes with limited human supervision. Compared to existing high-throughput phenotyping techniques, Granite yields phenotypes with much more distinct (non-overlapping) elements that can, as an artifact, capture rare phenotypes. Moreover, the resulting concise phenotypes retain predictive powers comparable to or surpassing existing dimensionality reduction techniques. We evaluate Granite by comparing its resulting phenotypes with those generated using state-of-the-art, high-throughput methods on simulated as well as real EHR data. Our algorithm offers a promising and novel data-driven solution to rapidly characterize, predict, and manage a wide range of diseases.
Original language | English (US) |
---|---|
Title of host publication | Proceedings - 2017 IEEE International Conference on Healthcare Informatics, ICHI 2017 |
Editors | Mollie Cummins, Julio Facelli, Gerrit Meixner, Christophe Giraud-Carrier, Hiroshi Nakajima |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 214-223 |
Number of pages | 10 |
ISBN (Electronic) | 9781509048816 |
State | Published - Sep 8 2017 |
Event | 5th IEEE International Conference on Healthcare Informatics, ICHI 2017 - Park City, United States; Duration: Aug 23 2017 → Aug 26 2017 |
Other
Other | 5th IEEE International Conference on Healthcare Informatics, ICHI 2017 |
---|---|
Country/Territory | United States |
City | Park City |
Period | 8/23/17 → 8/26/17 |
Funding
The authors would like to thank Suriya Gunasekar for her input on inducing sparsity. This work was supported by NSF grant 1418504.
Keywords
- Computational phenotyping
- Data mining
- Electronic health records
- Feature extraction
- Health information management
- Tensor factorization
ASJC Scopus subject areas
- Health Informatics