Despite our growing knowledge that many mammalian genes generate multiple transcript variants that may encode functionally distinct protein isoforms, the transcriptomes of various tissues and their developmental stages are poorly defined. Identifying the transcriptome and its regulation in a cell/tissue is the key to deciphering the cell/tissue-specific functions of a gene. We built a genome-wide inventory of noncoding and protein-coding transcripts (transcriptomes), their promoters (promoteromes) and histone modification states (epigenomes) for developing, and adult cerebella using integrative massive-parallel sequencing and bioinformatics approach. The data consists of 61,525 (12,796 novel) distinct mRNAs transcribed by 29,589 (4792 novel) promoters corresponding to 15,669 protein-coding and 7624 noncoding genes. Importantly, our results show that the transcript variants from a gene are predominantly generated using alternative transcriptional rather than splicing mechanisms, highlighting alternative promoters and transcriptional terminations as major sources of transcriptome diversity. Moreover, H3K4me3, and not H3K27me3, defined the use of alternative promoters, and we identified a combinatorial role of H3K4me3 and H3K27me3 in regulating the expression of transcripts, including transcript variants of a gene during development. We observed a strong bias of both H3K4me3 and H3K27me3 for CpG-rich promoters and an exponential relationship between their enrichment and corresponding transcript expression. Furthermore, the majority of genes associated with neurological diseases expressed multiple transcripts through alternative promoters, and we demonstrated aberrant use of alternative promoters in medulloblastoma, cancer arising in the cerebellum. The transcriptomes of developing and adult cerebella presented in this study emphasize the importance of analyzing gene regulation and function at the isoform level.
ASJC Scopus subject areas