Improved alignment of nucleosome DNA sequences using a mixture model.

Ji Ping Z Wang*, Jonathan Widom

*Corresponding author for this work

Research output: Contribution to journalArticle

26 Scopus citations

Abstract

DNA sequences that are present in nucleosomes have a preferential approximately 10 bp periodicity of certain dinucleotide signals, but the overall sequence similarity of the nucleosomal DNA is weak, and traditional multiple sequence alignment tools fail to yield meaningful alignments. We develop a mixture model that characterizes the known dinucleotide periodicity probabilistically to improve the alignment of nucleosomal DNAs. We assume that a periodic dinucleotide signal of any type emits according to a probability distribution around a series of 'hot spots' that are equally spaced along nucleosomal DNA with 10 bp period, but with a 1 bp phase shift across the middle of the nucleosome. We model the three statistically most significant dinucleotide signals, AA/TT, GC and TA, simultaneously, while allowing phase shifts between the signals. The alignment is obtained by maximizing the likelihood of both Watson and Crick strands simultaneously. The resulting alignment of 177 chicken nucleosomal DNA sequences revealed that all 10 distinct dinucleotides are periodic, however, with only two distinct phases and varying intensity. By Fourier analysis, we show that our new alignment has enhanced periodicity and sequence identity compared with center alignment. The significance of the nucleosomal DNA sequence alignment is evaluated by comparing it with that obtained using the same model on non-nucleosomal sequences.

Original languageEnglish (US)
Pages (from-to)6743-6755
Number of pages13
JournalNucleic acids research
Volume33
Issue number21
DOIs
StatePublished - 2005

    Fingerprint

ASJC Scopus subject areas

  • Genetics

Cite this