Linear-time algorithms for computing maximum-density sequence segments with bioinformatics applications

Michael H. Goldwasser*, Ming-Yang Kao, Hsueh I. Lu

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

36 Scopus citations

Abstract

We study an abstract optimization problem arising from biomolecular sequence analysis. For a sequence $A$ of pairs $(a_i, w_i)$ for $i = 1, \ldots, n$ with $w_i > 0$, a segment $A(i,j)$ is a consecutive subsequence of $A$ starting at index $i$ and ending at index $j$. The width of $A(i,j)$ is $w(i,j) = \sum_{i \le k \le j} w_k$, and the density is $\left(\sum_{i \le k \le j} a_k\right) / w(i,j)$. The maximum-density segment problem takes $A$ and two values $L$ and $U$ as input and asks for a segment of $A$ with the largest possible density among those of width at least $L$ and at most $U$. When $U$ is unbounded, we provide a relatively simple $O(n)$-time algorithm, improving upon the $O(n \log L)$-time algorithm by Lin, Jiang and Chao. We then extend this result, providing an $O(n)$-time algorithm for the case when both $L$ and $U$ are specified.
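
To make the problem statement concrete, here is a minimal brute-force sketch in Python. It is not the linear-time algorithm described in the abstract; it simply checks every segment in quadratic time using prefix sums, which can serve as a reference when testing a faster implementation. The function name `max_density_segment`, the 1-based index convention in the return value, and the use of integer inputs with exact `Fraction` arithmetic are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import accumulate


def max_density_segment(pairs, L, U=None):
    """Brute-force reference (hypothetical helper, not the paper's method).

    pairs: list of (a_k, w_k) with integer a_k and integer w_k > 0.
    L, U:  lower and upper width bounds; U=None means U is unbounded.
    Returns (density, i, j) with 1-based inclusive indices, or None if
    no segment satisfies the width constraints.
    """
    n = len(pairs)
    # Prefix sums: A[k] = a_1 + ... + a_k and W[k] = w_1 + ... + w_k.
    A = [0] + list(accumulate(a for a, _ in pairs))
    W = [0] + list(accumulate(w for _, w in pairs))

    best = None
    for i in range(1, n + 1):
        for j in range(i, n + 1):
            width = W[j] - W[i - 1]
            if width < L:
                continue  # segment is still too narrow
            if U is not None and width > U:
                break  # widths only grow with j because every w_k > 0
            density = Fraction(A[j] - A[i - 1], width)
            # Strict comparison keeps the first segment found on ties.
            if best is None or density > best[0]:
                best = (density, i, j)
    return best


if __name__ == "__main__":
    example = [(1, 1), (4, 2), (-2, 1), (5, 1), (3, 2)]
    print(max_density_segment(example, L=3, U=4))
    print(max_density_segment(example, L=5))
```

Exact rational arithmetic avoids floating-point ties when comparing densities; for non-integer inputs one would likely switch to floats and accept rounding.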

Original language: English (US)
Pages (from-to): 128-144
Number of pages: 17
Journal: Journal of Computer and System Sciences
Volume: 70
Issue number: 2
State: Published - Mar 2005

Funding

A preliminary version of these results appeared under the title "Fast Algorithms for Finding Maximum-Density Segments of a Sequence with Applications to Bioinformatics," in: R. Guigó, D. Gusfield (Eds.), Proceedings of the Second Workshop on Algorithms in Bioinformatics (WABI), Lecture Notes in Computer Science, vol. 2452, Springer, Berlin, 2002, pp. 157–171.

E-mail addresses: [email protected] (M.H. Goldwasser), [email protected] (M.-Y. Kao), [email protected] (H.-I. Lu)

URLs: http://euler.slu.edu/~goldwasser, http://www.cs.northwestern.edu/~kao, http://www.iis.sinica.edu.tw/~hil

1. Supported in part by NSF Grant EIA-0112934.
2. Supported in part by NSC Grant NSC-90-2218-E-001-005.

Keywords

  • Bioinformatics
  • Density
  • Sequences

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
  • Computer Networks and Communications
  • Computational Theory and Mathematics
  • Applied Mathematics
