TY - JOUR

T1 - Linear-time algorithms for computing maximum-density sequence segments with bioinformatics applications

AU - Goldwasser, Michael H.

AU - Kao, Ming-Yang

AU - Lu, Hsueh I.

N1 - Funding Information:
A preliminary version of these results appeared under the title, “Fast Algorithms for Finding Maximum-Density Segments of a Sequence with Applications to Bioinformatics,” in: R. Guigó, D. Gusfield (Eds.), Proceedings of the Second Workshop on Algorithms in Bioinformatics (WABI), Lecture Notes in Computer Science, vol. 2452, Springer, Berlin, 2002, pp. 157–171. E-mail addresses: goldwamh@slu.edu, (M.H. Goldwasser), kao@cs.northwestern.edu (M.-Y. Kao), hil@iis.sinica.edu.tw (H.-I. Lu) URLs: http://euler.slu.edu/∼goldwasser, http://www.cs.northwestern.edu/∼kao, http://www.iis.sinica.edu.tw/∼hil 1Supported in part by NSF Grant EIA-0112934. 2Supported in part by NSC Grant NSC-90-2218-E-001-005.

PY - 2005/3

Y1 - 2005/3

N2 - We study an abstract optimization problem arising from biomolecular sequence analysis. For a sequence A of pairs (ai,wi) for i=1,...,n and wi>0, a segment A(i,j) is a consecutive subsequence of A starting with index i and ending with index j. The width of A(i,j) is w(i,j)=∑i≤k≤jwk, and the density is (∑i≤k≤jak)/w(i,j). The maximum-density segment problem takes A and two values L and U as input and asks for a segment of A with the largest possible density among those of width at least L and at most U. When U is unbounded, we provide a relatively simple, O(n)-time algorithm, improving upon the O(nlogL)-time algorithm by Lin, Jiang and Chao. We then extend this result, providing an O(n)-time algorithm for the case when both L and U are specified.

AB - We study an abstract optimization problem arising from biomolecular sequence analysis. For a sequence A of pairs (ai,wi) for i=1,...,n and wi>0, a segment A(i,j) is a consecutive subsequence of A starting with index i and ending with index j. The width of A(i,j) is w(i,j)=∑i≤k≤jwk, and the density is (∑i≤k≤jak)/w(i,j). The maximum-density segment problem takes A and two values L and U as input and asks for a segment of A with the largest possible density among those of width at least L and at most U. When U is unbounded, we provide a relatively simple, O(n)-time algorithm, improving upon the O(nlogL)-time algorithm by Lin, Jiang and Chao. We then extend this result, providing an O(n)-time algorithm for the case when both L and U are specified.

KW - Bioinformatics

KW - Density

KW - Sequences

UR - http://www.scopus.com/inward/record.url?scp=15744368022&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=15744368022&partnerID=8YFLogxK

U2 - 10.1016/j.jcss.2004.08.001

DO - 10.1016/j.jcss.2004.08.001

M3 - Article

AN - SCOPUS:15744368022

SN - 0022-0000

VL - 70

SP - 128

EP - 144

JO - Journal of Computer and System Sciences

JF - Journal of Computer and System Sciences

IS - 2

ER -