DESCRIPTION (provided by applicant): The nucleosome is the fundamental packing unit of chromatin, formed by the association of DNA with a histone octamer in eukaryotic chromosomes. The detailed locations of nucleosomes along genomic DNA are critical for proper gene regulation, and therefore for the health and development of humans and all other eukaryotes, yet at present we lack any methods for the in silico prediction of nucleosome positioning sequence elements in genomes. The long-term goal of this project is to develop the capability to predict nucleosome-forming propensity from DNA sequence and to correlate the genome-wide distribution of nucleosome-forming sequences with chromosome function. Towards this goal, we propose to achieve the following five aims progressively over the coming years. (1) Experimentally collect 1000 yeast nucleosome core DNA sequences and 1000 yeast di-nucleosome DNA sequences (two neighboring nucleosomes joined by linker DNA). These yeast nucleosome DNA sequences will form a new training set as well as a validation set for nucleosome positioning prediction across the yeast genome. (2) Develop and refine statistical methods to align nucleosome DNA sequences. A novel statistical model for nucleosome DNA sequence alignment was proposed in the preliminary study. We will refine this model by introducing a weighting scheme based on the importance of different periodic di-nucleotide signals, as determined from free-energy considerations. (3) Use the alignment in (2) to train a model to predict nucleosome positioning. An inhomogeneous Markov chain model and a "mixture train" model are under consideration to model the sequentially dependent structure of nucleotides. (4) Generalize the model by incorporating knowledge of the spacing properties of neighboring nucleosomes. The di-nucleosome sequences generated in (1) will provide distributional evidence for between-nucleosome spacing and will greatly facilitate prediction of genome-wide nucleosome positioning. (5) Automate the developed tools and algorithms.
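As an illustration of the inhomogeneous Markov chain named in aim (3), the sketch below estimates position-specific transition probabilities from a set of already aligned, equal-length sequences and scores a candidate sequence by its log-likelihood. It is a minimal sketch under stated assumptions, not the project's model: the first-order dependence, the pseudocount smoothing, the function names, and the toy sequences are all hypothetical choices made for this example.

```python
import math

ALPHABET = "ACGT"


def train_inhomogeneous_markov(aligned, pseudocount=1.0):
    """Estimate a first-order, position-specific Markov chain.

    `aligned` is a list of equal-length DNA strings (assumed to be aligned
    already). Returns (start, transitions), where start[a] is P(x_0 = a) and
    transitions[i - 1][p][c] is P(x_i = c | x_{i-1} = p) for i >= 1.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    # Distribution of the first nucleotide, smoothed with pseudocounts.
    start_counts = {a: pseudocount for a in ALPHABET}
    for seq in aligned:
        start_counts[seq[0]] += 1
    total = sum(start_counts.values())
    start = {a: n / total for a, n in start_counts.items()}

    # One transition matrix per position: this is what makes the chain
    # inhomogeneous (a homogeneous chain would share a single matrix).
    transitions = []
    for i in range(1, length):
        counts = {p: {c: pseudocount for c in ALPHABET} for p in ALPHABET}
        for seq in aligned:
            counts[seq[i - 1]][seq[i]] += 1
        probs = {}
        for p in ALPHABET:
            row_total = sum(counts[p].values())
            probs[p] = {c: counts[p][c] / row_total for c in ALPHABET}
        transitions.append(probs)
    return start, transitions


def log_likelihood(seq, start, transitions):
    """Log-probability of `seq` under the position-specific chain."""
    if len(seq) != len(transitions) + 1:
        raise ValueError("sequence length does not match the model")
    ll = math.log(start[seq[0]])
    for i in range(1, len(seq)):
        ll += math.log(transitions[i - 1][seq[i - 1]][seq[i]])
    return ll


# Toy data (hypothetical, far shorter than a ~147 bp nucleosome core).
toy_alignment = ["ACGT", "ACGA", "ACGT", "TCGT"]
toy_start, toy_transitions = train_inhomogeneous_markov(toy_alignment)
toy_score = log_likelihood("ACGT", toy_start, toy_transitions)
```

In practice such a score would typically be compared against a background model, and sliding the model along a genome would give a per-position propensity profile; both steps are omitted here.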
Effective start/end date: 4/30/10 → 3/31/11
- National Institute of General Medical Sciences (3R01GM075313-05S1)