TY - JOUR
T1 - Expression quantitative trait loci information improves predictive modeling of disease relevance of non-coding genetic variation
AU - Croteau-Chonka, Damien C.
AU - Rogers, Angela J.
AU - Raj, Towfique
AU - McGeachie, Michael J.
AU - Qiu, Weiliang
AU - Ziniti, John P.
AU - Stubbs, Benjamin J.
AU - Liang, Liming
AU - Martinez, Fernando D.
AU - Strunk, Robert C.
AU - Lemanske, Robert F.
AU - Liu, Andrew H.
AU - Stranger, Barbara Elaine
AU - Carey, Vincent J.
AU - Raby, Benjamin A.
N1 - Publisher Copyright:
Copyright © 2015 Croteau-Chonka et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2015/10/16
Y1 - 2015/10/16
N2 - Disease-associated loci identified through genome-wide association studies (GWAS) frequently localize to non-coding sequence. We and others have demonstrated strong enrichment of such single nucleotide polymorphisms (SNPs) for expression quantitative trait loci (eQTLs), supporting an important role for regulatory genetic variation in complex disease pathogenesis. Herein we describe our initial efforts to develop a predictive model of disease-associated variants leveraging eQTL information. We first catalogued cis-acting eQTLs (SNPs within 100 kb of target gene transcripts) by meta-analyzing four studies of three blood-derived tissues (n = 586). At a false discovery rate < 5%, we mapped eQTLs for 6,535 genes; these were enriched for disease-associated genes (P < 10^-4), particularly those related to immune diseases and metabolic traits. Based on eQTL information and other variant annotations (distance from target gene transcript, minor allele frequency, and chromatin state), we created multivariate logistic regression models to predict SNP membership in reported GWAS. The complete model revealed independent contributions of specific annotations as strong predictors, including evidence for an eQTL (odds ratio (OR) = 1.2-2.0, P < 10^-11) and the chromatin states of active promoters, different classes of strong or weak enhancers, or transcriptionally active regions (OR = 1.5-2.3, P < 10^-11). This complete prediction model including eQTL association information ultimately allowed for better discrimination of SNPs with higher probabilities of GWAS membership (6.3-10.0%, compared to 3.5% for a random SNP) than the other two models excluding eQTL information. This eQTL-based prediction model of disease relevance can help systematically prioritize non-coding GWAS SNPs for further functional characterization.
AB - Disease-associated loci identified through genome-wide association studies (GWAS) frequently localize to non-coding sequence. We and others have demonstrated strong enrichment of such single nucleotide polymorphisms (SNPs) for expression quantitative trait loci (eQTLs), supporting an important role for regulatory genetic variation in complex disease pathogenesis. Herein we describe our initial efforts to develop a predictive model of disease-associated variants leveraging eQTL information. We first catalogued cis-acting eQTLs (SNPs within 100 kb of target gene transcripts) by meta-analyzing four studies of three blood-derived tissues (n = 586). At a false discovery rate < 5%, we mapped eQTLs for 6,535 genes; these were enriched for disease-associated genes (P < 10^-4), particularly those related to immune diseases and metabolic traits. Based on eQTL information and other variant annotations (distance from target gene transcript, minor allele frequency, and chromatin state), we created multivariate logistic regression models to predict SNP membership in reported GWAS. The complete model revealed independent contributions of specific annotations as strong predictors, including evidence for an eQTL (odds ratio (OR) = 1.2-2.0, P < 10^-11) and the chromatin states of active promoters, different classes of strong or weak enhancers, or transcriptionally active regions (OR = 1.5-2.3, P < 10^-11). This complete prediction model including eQTL association information ultimately allowed for better discrimination of SNPs with higher probabilities of GWAS membership (6.3-10.0%, compared to 3.5% for a random SNP) than the other two models excluding eQTL information. This eQTL-based prediction model of disease relevance can help systematically prioritize non-coding GWAS SNPs for further functional characterization.
UR - http://www.scopus.com/inward/record.url?scp=84948974979&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84948974979&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0140758
DO - 10.1371/journal.pone.0140758
M3 - Article
C2 - 26474488
AN - SCOPUS:84948974979
SN - 1932-6203
VL - 10
JO - PLoS One
JF - PLoS One
IS - 10
M1 - 140758
ER -