TY - GEN
T1 - Pairwise DNA alignment with sequence specific transition-transversion ratio using multiple parameter sets
AU - Agrawal, Ankit
AU - Huang, Xiaoqiu
PY - 2008/12/1
Y1 - 2008/12/1
AB - Pairwise alignment of DNA and protein sequences is a fundamental task in bioinformatics that underlies many other bioinformatics applications. Protein sequence alignment generally receives more attention than DNA sequence alignment, and protein alignment methods can usually be applied to DNA sequences with little modification. However, alignment methods designed specifically for DNA that use sequence-specific information are highly desirable. Most existing DNA alignment programs use the common match/mismatch scoring scheme. Recently, an iterative alignment scheme using a sequence-specific transition-transversion ratio was shown to perform better than a simple match/mismatch scoring scheme. In this paper, we present a modification of the iterative approach that incorporates multiple parameter sets. Preliminary experiments indicate that using multiple parameter sets gives significantly better performance than using either a single parameter set or a simple match/mismatch scoring scheme. Sequence-specific scoring matrices have proven highly successful for protein alignment over the last decade, and the current work should be a significant step toward using sequence-specific substitution matrices for DNA sequences.
UR - http://www.scopus.com/inward/record.url?scp=62449228013&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=62449228013&partnerID=8YFLogxK
U2 - 10.1109/ICIT.2008.62
DO - 10.1109/ICIT.2008.62
M3 - Conference contribution
AN - SCOPUS:62449228013
SN - 9780769535135
T3 - Proceedings - 11th International Conference on Information Technology, ICIT 2008
SP - 89
EP - 93
BT - Proceedings - 11th International Conference on Information Technology, ICIT 2008
T2 - 11th International Conference on Information Technology, ICIT 2008
Y2 - 17 December 2008 through 20 December 2008
ER -