TY - GEN
T1 - Conservative, non-conservative and average pairwise statistical significance of local sequence alignment
AU - Agrawal, Ankit
AU - Huang, Xiaoqiu
PY - 2008/12/1
Y1 - 2008/12/1
N2 - Estimation of statistical significance of a pairwise alignment is an important problem in sequence comparison. Recently, it was shown that pairwise statistical significance does better in practice than database statistical significance in terms of retrieval accuracy of homologs. In this paper, we introduce the concept of conservative, non-conservative, and average pairwise statistical significance which can be easily derived from original pairwise statistical significance estimates and use more information specific to the sequence pair under consideration using multiple shuffle spaces. Experimental results for homology detection reveal that the proposed measures give at least comparable or significantly better retrieval accuracy than original pairwise statistical significance and database statistical significance reported by BLAST, PSI-BLAST, and SSEARCH. The use of the proposed measures is further shown to be extremely useful when using sequence-specific substitution matrices.
UR - http://www.scopus.com/inward/record.url?scp=58049169952&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=58049169952&partnerID=8YFLogxK
U2 - 10.1109/BIBM.2008.19
DO - 10.1109/BIBM.2008.19
M3 - Conference contribution
AN - SCOPUS:58049169952
SN - 9780769534527
T3 - Proceedings - IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2008
SP - 433
EP - 436
BT - Proceedings - IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2008
T2 - 2008 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2008
Y2 - 3 November 2008 through 5 November 2008
ER -