TY - JOUR
T1 - Linear-time algorithms for computing maximum-density sequence segments with bioinformatics applications
AU - Goldwasser, Michael H.
AU - Kao, Ming-Yang
AU - Lu, Hsueh I.
N1 - Funding Information:
A preliminary version of these results appeared under the title, “Fast Algorithms for Finding Maximum-Density Segments of a Sequence with Applications to Bioinformatics,” in: R. Guigó, D. Gusfield (Eds.), Proceedings of the Second Workshop on Algorithms in Bioinformatics (WABI), Lecture Notes in Computer Science, vol. 2452, Springer, Berlin, 2002, pp. 157–171. E-mail addresses: goldwamh@slu.edu, (M.H. Goldwasser), kao@cs.northwestern.edu (M.-Y. Kao), hil@iis.sinica.edu.tw (H.-I. Lu) URLs: http://euler.slu.edu/∼goldwasser, http://www.cs.northwestern.edu/∼kao, http://www.iis.sinica.edu.tw/∼hil 1Supported in part by NSF Grant EIA-0112934. 2Supported in part by NSC Grant NSC-90-2218-E-001-005.
PY - 2005/3
Y1 - 2005/3
N2 - We study an abstract optimization problem arising from biomolecular sequence analysis. For a sequence A of pairs (ai,wi) for i=1,...,n and wi>0, a segment A(i,j) is a consecutive subsequence of A starting with index i and ending with index j. The width of A(i,j) is w(i,j)=∑i≤k≤jwk, and the density is (∑i≤k≤jak)/w(i,j). The maximum-density segment problem takes A and two values L and U as input and asks for a segment of A with the largest possible density among those of width at least L and at most U. When U is unbounded, we provide a relatively simple, O(n)-time algorithm, improving upon the O(nlogL)-time algorithm by Lin, Jiang and Chao. We then extend this result, providing an O(n)-time algorithm for the case when both L and U are specified.
AB - We study an abstract optimization problem arising from biomolecular sequence analysis. For a sequence A of pairs (ai,wi) for i=1,...,n and wi>0, a segment A(i,j) is a consecutive subsequence of A starting with index i and ending with index j. The width of A(i,j) is w(i,j)=∑i≤k≤jwk, and the density is (∑i≤k≤jak)/w(i,j). The maximum-density segment problem takes A and two values L and U as input and asks for a segment of A with the largest possible density among those of width at least L and at most U. When U is unbounded, we provide a relatively simple, O(n)-time algorithm, improving upon the O(nlogL)-time algorithm by Lin, Jiang and Chao. We then extend this result, providing an O(n)-time algorithm for the case when both L and U are specified.
KW - Bioinformatics
KW - Density
KW - Sequences
UR - http://www.scopus.com/inward/record.url?scp=15744368022&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=15744368022&partnerID=8YFLogxK
U2 - 10.1016/j.jcss.2004.08.001
DO - 10.1016/j.jcss.2004.08.001
M3 - Article
AN - SCOPUS:15744368022
SN - 0022-0000
VL - 70
SP - 128
EP - 144
JO - Journal of Computer and System Sciences
JF - Journal of Computer and System Sciences
IS - 2
ER -