TY - JOUR
T1 - Improving the phenotype risk score as a scalable approach to identifying patients with Mendelian disease
AU - Bastarache, Lisa
AU - Hughey, Jacob J.
AU - Goldstein, Jeffrey A.
AU - Bastarache, Julie A.
AU - Das, Satya
AU - Zaki, Neil Charles
AU - Zeng, Chenjie
AU - Tang, Leigh Anne
AU - Roden, Dan M.
AU - Denny, Joshua C.
N1 - Funding Information:
This work was supported by grants R01-LM010685 from the National Library of Medicine (LB, JJH, LAT, JCD), P50-GM115305 from the National Institute for General Medical Sciences (DMR), T32 CA160056 Vanderbilt Training in the Molecular and Genetic Epidemiology of Cancer (CZ), and U01 grants supporting Vanderbilt's participation in the eMERGE (Electronic Medical Records and Genomics) network (HG004603, HG006378, and HG008672). BioVU received and continues to receive support through the National Center for Research Resources (UL1-RR024975), which is now the National Center for Advancing Translational Sciences (UL1-TR000445).
Publisher Copyright:
© The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: [email protected].
PY - 2019/11/15
Y1 - 2019/11/15
N2 - Objective: The Phenotype Risk Score (PheRS) is a method to detect Mendelian disease patterns using phenotypes from the electronic health record (EHR). We compared the performance of different approaches mapping EHR phenotypes to Mendelian disease features. Materials and Methods: PheRS utilizes Mendelian disease descriptions annotated with Human Phenotype Ontology (HPO) terms. In previous work, we presented a map linking phecodes (based on International Classification of Diseases [ICD]-Ninth Revision) to HPO terms. For this study, we integrated ICD-Tenth Revision codes and lab data. We also created a new map between HPO terms using customized groupings of ICD codes. We compared the performance with cases and controls for 16 Mendelian diseases using 2.5 million de-identified medical records. Results: PheRS effectively distinguished cases from controls for all 15 positive controls and all approaches tested (P < 4 × 10⁻¹⁶). Adding lab data led to a statistically significant improvement for 4 of 14 diseases. The custom ICD groupings improved specificity, leading to an average 8% increase for precision at 100 (-2% to 22%). Eight of 10 adults with cystic fibrosis tested had PheRS in the 95th percentile prior to diagnosis. Discussion: Both phecodes and custom ICD groupings were able to detect differences between affected cases and controls at the population level. The ICD map showed better precision for the highest scoring individuals. Adding lab data improved performance at detecting population-level differences. Conclusions: PheRS is a scalable method to study Mendelian disease at the population level using electronic health record data and can potentially be used to find patients with undiagnosed Mendelian disease.
KW - Data mining
KW - Diagnosis
KW - Electronic health record
KW - Mendelian genetics
UR - http://www.scopus.com/inward/record.url?scp=85075095159&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85075095159&partnerID=8YFLogxK
U2 - 10.1093/jamia/ocz179
DO - 10.1093/jamia/ocz179
M3 - Article
C2 - 31609419
AN - SCOPUS:85075095159
SN - 1067-5027
VL - 26
SP - 1437
EP - 1447
JO - Journal of the American Medical Informatics Association
JF - Journal of the American Medical Informatics Association
IS - 12
ER -