Efficient algorithms for model-based motif discovery from multiple sequences

Bin Fu*, Ming-Yang Kao, Lusheng Wang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

We study a natural probabilistic model for motif discovery that has been used to experimentally test the quality of motif discovery programs. In this model, there are k background sequences, and each character in a background sequence is a random character from an alphabet Σ. A motif G = g_1 g_2 ... g_m is a string of m characters. A randomly generated approximate copy of G is implanted into each background sequence. In an approximate copy b_1 b_2 ... b_m of G, each character is generated at random such that the probability that b_i ≠ g_i is at most α. In this paper, we give the first analytical proof that multiple background sequences do help in finding subtle and faint motifs.
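The model described in the abstract can be illustrated with a small generator. This is a minimal sketch under stated assumptions, not the authors' procedure: the function name `implant_motif`, the fixed sequence length `n`, the DNA alphabet, the uniform background distribution, the uniformly chosen implant position, and overwriting (rather than inserting into) the background are all hypothetical choices. The abstract only bounds the per-character mismatch probability by α; the sketch uses exactly α.

```python
import random


def implant_motif(motif, k, n, alpha, alphabet="ACGT", seed=None):
    """Return (sequences, positions): k random sequences of length n,
    each containing one noisy copy of `motif` at a recorded start index."""
    rng = random.Random(seed)
    m = len(motif)
    if n < m:
        raise ValueError("sequence length n must be at least the motif length")

    sequences, positions = [], []
    for _ in range(k):
        # Background: each character drawn uniformly from the alphabet (assumption).
        background = [rng.choice(alphabet) for _ in range(n)]

        # Approximate copy: each position mutates independently with
        # probability alpha (the abstract only requires "at most alpha").
        copy = []
        for g in motif:
            if rng.random() < alpha:
                copy.append(rng.choice([c for c in alphabet if c != g]))
            else:
                copy.append(g)

        # Implant by overwriting a uniformly chosen window (assumption).
        start = rng.randrange(n - m + 1)
        background[start:start + m] = copy

        sequences.append("".join(background))
        positions.append(start)
    return sequences, positions


if __name__ == "__main__":
    seqs, starts = implant_motif("ACGTACGT", k=3, n=40, alpha=0.2, seed=0)
    for s, p in zip(seqs, starts):
        print(p, s)
```

With `alpha=0.0` every implanted copy equals the motif exactly; raising `alpha` makes the copies fainter, which is the regime where, according to the abstract, having multiple background sequences helps.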

Original language: English (US)
Title of host publication: Theory and Applications of Models of Computation - 5th International Conference, TAMC 2008, Proceedings
Pages: 234-245
Number of pages: 12
DOIs: https://doi.org/10.1007/978-3-540-79228-4-21
State: Published - Dec 1 2008
Event: 5th International Conference on Theory and Applications of Models of Computation, TAMC 2008 - Xian, China
Duration: Apr 25 2008 → Apr 29 2008

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4978 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 5th International Conference on Theory and Applications of Models of Computation, TAMC 2008
Country: China
City: Xian
Period: 4/25/08 → 4/29/08

Fingerprint

Motif Discovery
Efficient Algorithms
Model-based
Probabilistic Model
Strings
Character
Background
Statistical Models

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Fu, B., Kao, M-Y., & Wang, L. (2008). Efficient algorithms for model-based motif discovery from multiple sequences. In Theory and Applications of Models of Computation - 5th International Conference, TAMC 2008, Proceedings (pp. 234-245). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 4978 LNCS). https://doi.org/10.1007/978-3-540-79228-4-21
Fu, Bin ; Kao, Ming-Yang ; Wang, Lusheng. / Efficient algorithms for model-based motif discovery from multiple sequences. Theory and Applications of Models of Computation - 5th International Conference, TAMC 2008, Proceedings. 2008. pp. 234-245 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{491cc1648f354fa79c3d55a923cd5b87,
title = "Efficient algorithms for model-based motif discovery from multiple sequences",
abstract = "We study a natural probabilistic model for motif discovery that has been used to experimentally test the quality of motif discovery programs. In this model, there are k background sequences, and each character in a background sequence is a random character from an alphabet Σ. A motif G = g_1 g_2 ... g_m is a string of m characters. A randomly generated approximate copy of G is implanted into each background sequence. In an approximate copy b_1 b_2 ... b_m of G, each character is generated at random such that the probability that b_i ≠ g_i is at most α. In this paper, we give the first analytical proof that multiple background sequences do help in finding subtle and faint motifs.",
author = "Bin Fu and Ming-Yang Kao and Lusheng Wang",
year = "2008",
month = "12",
day = "1",
doi = "10.1007/978-3-540-79228-4-21",
language = "English (US)",
isbn = "3540792279",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "234--245",
booktitle = "Theory and Applications of Models of Computation - 5th International Conference, TAMC 2008, Proceedings",
}

Fu, B, Kao, M-Y & Wang, L 2008, Efficient algorithms for model-based motif discovery from multiple sequences. in Theory and Applications of Models of Computation - 5th International Conference, TAMC 2008, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 4978 LNCS, pp. 234-245, 5th International Conference on Theory and Applications of Models of Computation, TAMC 2008, Xian, China, 4/25/08. https://doi.org/10.1007/978-3-540-79228-4-21

TY - GEN
T1 - Efficient algorithms for model-based motif discovery from multiple sequences
AU - Fu, Bin
AU - Kao, Ming-Yang
AU - Wang, Lusheng
PY - 2008/12/1
Y1 - 2008/12/1
N2 - We study a natural probabilistic model for motif discovery that has been used to experimentally test the quality of motif discovery programs. In this model, there are k background sequences, and each character in a background sequence is a random character from an alphabet Σ. A motif G = g_1 g_2 ... g_m is a string of m characters. A randomly generated approximate copy of G is implanted into each background sequence. In an approximate copy b_1 b_2 ... b_m of G, each character is generated at random such that the probability that b_i ≠ g_i is at most α. In this paper, we give the first analytical proof that multiple background sequences do help in finding subtle and faint motifs.
UR - http://www.scopus.com/inward/record.url?scp=70349303350&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70349303350&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-79228-4-21
DO - 10.1007/978-3-540-79228-4-21
M3 - Conference contribution
AN - SCOPUS:70349303350
SN - 3540792279
SN - 9783540792277
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 234
EP - 245
BT - Theory and Applications of Models of Computation - 5th International Conference, TAMC 2008, Proceedings
ER -

Fu B, Kao M-Y, Wang L. Efficient algorithms for model-based motif discovery from multiple sequences. In Theory and Applications of Models of Computation - 5th International Conference, TAMC 2008, Proceedings. 2008. p. 234-245. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-540-79228-4-21