Central to the reconstruction of cis-regulatory networks is the identification and classification of naturally occurring transcription factor-binding sites according to the genes that they control. We have examined salient characteristics of 9-mers that occur in various orders and combinations in the proximal promoters of human genes. In a dataset derived with respect to experimentally defined transcription initiation sites, we observed in some cases a clear correspondence between highly ranked 9-mers and protein-binding sites in genomic DNA. Evaluations of the larger dataset, derived with respect to the 5′ ends of human ESTs, revealed that a subset of the highly ranked 9-mers corresponded to sites for several known transcription factor families (including CREB, ETS, EGR-1, SP1, KLF, MAZ, HIF-1, and STATs) that play important roles in the regulation of vertebrate genes. We identified several highly ranked CpG-containing 9-mers that define sites for interactions with the CREB and ETS families of proteins, and we identified potential target genes for these proteins. The results imply that CpG-containing transcription factor-binding sites regulate the expression of genes with important roles in pathways leading to cell-type-specific gene expression and in pathways controlled by complex networks of signaling systems.
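As a rough illustration of the kind of 9-mer tabulation described above, the following minimal Python sketch counts 9-mers across a set of promoter sequences, ranks them, and flags those containing a CpG dinucleotide. The abstract does not state the ranking criterion, so ranking by raw occurrence count is an assumption made here for illustration only. The function names (`count_kmers`, `rank_kmers`, `contains_cpg`) and the toy sequences are hypothetical, and the sketch scans a single strand.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def count_kmers(promoters, k=9):
    """Count every k-mer (default 9-mer) found in the given sequences.

    Windows containing characters other than A, C, G, T (e.g. N) are skipped.
    Only the given strand is scanned; reverse complements are not merged.
    """
    counts = Counter()
    for seq in promoters:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= VALID_BASES:
                counts[kmer] += 1
    return counts


def rank_kmers(counts, top=10):
    """Return the `top` k-mers, ordered by descending count.

    Ranking by raw count is an assumption of this sketch, not the
    criterion used in the study. Ties are broken alphabetically.
    """
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


def contains_cpg(kmer):
    """True if the k-mer contains a CpG dinucleotide (C followed by G)."""
    return "CG" in kmer


if __name__ == "__main__":
    # Hypothetical toy sequences, not real promoter data.
    toy_promoters = [
        "GGTGACGTCAAGCCGGAAGTGC",
        "TTTGACGTCAAGAAACCGGAAGTGA",
    ]
    counts = count_kmers(toy_promoters)
    for kmer, n in rank_kmers(counts, top=5):
        label = "CpG" if contains_cpg(kmer) else "no CpG"
        print(kmer, n, label)
```

A real analysis would also need promoter coordinates defined relative to transcription initiation sites or EST 5′ ends, as in the abstract, and a background model for judging which counts are notable; neither is attempted here.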
- Codes in human DNA
- Gene regulation
- Human genome
- Sequence context of human genomic DNA
- Transcription factor binding sites