TY - GEN
T1 - A new heuristic for multiple sequence alignment
AU - Agrawal, Ankit
AU - Khaitan, Siddhartha Kumar
PY - 2008
Y1 - 2008
N2 - Multiple sequence alignment is an important task in bioinformatics which forms the basis of many other tasks like protein structure prediction, protein function prediction and phylogenetic analysis. An optimal multiple sequence alignment for a given set of sequences is difficult to determine by standard dynamic programming algorithms because of impractically high computational complexity. Therefore, heuristics are commonly used to reduce the alignment construction time, even though the resulting alignment may not be optimal. ClustalW is one of the most popular progressive multiple sequence alignment algorithm where the pairwise sequence alignments are done according to a static guide-tree based on the sequences alone. For a multiple sequence alignment of N sequences, each of length O(n), the ClustalW algorithm takes O(N 2n2) time. In this paper, a modification to the algorithm is proposed which reduces the time complexity of the algorithm to O(N log 2 Nn2). The proposed heuristic also makes the alignment process more dynamic as the order of sequences added to the multiple sequence alignment also depends on the already computed multiple sequence alignments.
AB - Multiple sequence alignment is an important task in bioinformatics which forms the basis of many other tasks like protein structure prediction, protein function prediction and phylogenetic analysis. An optimal multiple sequence alignment for a given set of sequences is difficult to determine by standard dynamic programming algorithms because of impractically high computational complexity. Therefore, heuristics are commonly used to reduce the alignment construction time, even though the resulting alignment may not be optimal. ClustalW is one of the most popular progressive multiple sequence alignment algorithm where the pairwise sequence alignments are done according to a static guide-tree based on the sequences alone. For a multiple sequence alignment of N sequences, each of length O(n), the ClustalW algorithm takes O(N 2n2) time. In this paper, a modification to the algorithm is proposed which reduces the time complexity of the algorithm to O(N log 2 Nn2). The proposed heuristic also makes the alignment process more dynamic as the order of sequences added to the multiple sequence alignment also depends on the already computed multiple sequence alignments.
UR - http://www.scopus.com/inward/record.url?scp=51349128709&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=51349128709&partnerID=8YFLogxK
U2 - 10.1109/EIT.2008.4554299
DO - 10.1109/EIT.2008.4554299
M3 - Conference contribution
AN - SCOPUS:51349128709
SN - 9781424420308
T3 - 2008 IEEE International Conference on Electro/Information Technology, IEEE EIT 2008 Conference
SP - 215
EP - 217
BT - 2008 IEEE International Conference on Electro/Information Technology, IEEE EIT 2008 Conference
T2 - 2008 IEEE International Conference on Electro/Information Technology, IEEE EIT 2008 Conference
Y2 - 18 May 2008 through 20 May 2008
ER -