Eukaryotic gene expression requires coordination among hundreds of transcriptional regulators. When characterizing a specific transcriptional regulator, identifying how it shares genomic-binding sites with other regulators can yield important insights into its action. As genomic data such as chromatin immunoprecipitation followed by sequencing (ChIP-Seq) are continuously being generated by individual labs, there is a demand for timely integration and analysis of these new data. We have developed an R package, GPSmatch (Genomic-binding Profile Similarity match), which calculates the Jaccard index to compare the ChIP-Seq peaks from one experiment with those from other experiments stored in a user-supplied, customizable database. GPSmatch also evaluates the statistical significance of the calculated Jaccard index using a nonparametric Monte Carlo procedure. We show that GPSmatch is suitable for identifying and ranking transcriptional regulators with shared genomic-binding profiles, which may help reveal potential mechanisms of gene regulation.
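
To make the two quantities named above concrete, the following is a minimal Python sketch of a Jaccard index between two peak sets and a Monte Carlo estimate of its significance. It is not the GPSmatch implementation or its API: the function names (`jaccard`, `shuffle_peaks`, `monte_carlo_p`) are hypothetical, and it assumes that peaks are half-open `(start, end)` intervals on a single chromosome, that the Jaccard index is measured in covered base pairs, and that the null distribution comes from placing the query peaks at uniformly random positions. The package may define any of these differently.

```python
import random


def merge(intervals):
    """Sort half-open (start, end) intervals and merge any that overlap or touch."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def covered(intervals):
    """Number of base pairs covered by a set of intervals."""
    return sum(e - s for s, e in merge(intervals))


def jaccard(a, b):
    """Base-pair Jaccard index: |A intersect B| / |A union B| (assumed definition)."""
    union = covered(list(a) + list(b))
    if union == 0:
        return 0.0
    intersection = covered(a) + covered(b) - union
    return intersection / union


def shuffle_peaks(peaks, genome_length, rng):
    """Assumed null model: keep each peak's width, draw a uniform random start."""
    shuffled = []
    for start, end in peaks:
        width = end - start
        new_start = rng.randrange(0, genome_length - width + 1)
        shuffled.append((new_start, new_start + width))
    return shuffled


def monte_carlo_p(query, reference, genome_length, n_iter=1000, seed=0):
    """Return the observed Jaccard index and an empirical one-sided p-value."""
    rng = random.Random(seed)
    observed = jaccard(query, reference)
    hits = sum(
        jaccard(shuffle_peaks(query, genome_length, rng), reference) >= observed
        for _ in range(n_iter)
    )
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_iter + 1)
```

Under these assumptions, ranking database experiments by `jaccard` and reporting the p-value from `monte_carlo_p` for each would mirror the kind of comparison the abstract describes. A realistic null model would also need to handle multiple chromosomes and unmappable regions, which this sketch omits.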
ASJC Scopus subject areas
- Computational Mathematics
- Molecular Biology
- Statistics and Probability
- Computer Science Applications
- Computational Theory and Mathematics