## Abstract

Homology detection is a fundamental step in sequence analysis. In recent years, pairwise statistical significance has emerged as a promising alternative to database statistical significance for homology detection. Although more accurate, it is currently very time consuming because it involves generating tens of hundreds of alignment scores to construct the empirical score distribution. This paper presents a parallel algorithm for pairwise statistical significance estimation, called MPIPairwiseStatSig, implemented in C using the MPI library. We further apply the parallelization technique to estimate non-conservative pairwise statistical significance using standard, sequence-specific, and position-specific substitution matrices, which has earlier demonstrated better sequence comparison accuracy than the original pairwise statistical significance. Distributing the most compute-intensive portions of the pairwise statistical significance estimation procedure across multiple processors has been shown to result in near-linear speed-ups for the application. The MPIPairwiseStatSig program for pairwise statistical significance estimation is available for free academic use at.
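The idea of distributing score generation can be illustrated with a minimal Python sketch. It is not the MPIPairwiseStatSig code, which is written in C with MPI, and it rests on several assumptions not stated in the abstract: the null scores come from aligning one sequence against shuffled copies of the other, the alignment is a plain Smith-Waterman with a linear gap penalty and made-up scoring values, and significance is reported as a simple empirical p-value rather than from a fitted score distribution. All function names are hypothetical. The loop over ranks stands in for processes that would run concurrently and combine their scores with a gather step.

```python
import random


def sw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
        best = max(best, max(cur))
        prev = cur
    return best


def block_range(rank, size, n):
    """Half-open range of shuffle indices owned by `rank` out of `size` ranks."""
    base, extra = divmod(n, size)
    start = rank * base + min(rank, extra)
    return start, start + base + (1 if rank < extra else 0)


def scores_for_rank(rank, size, seq1, seq2, n_shuffles, seed=0):
    """Scores of seq1 against this rank's share of shuffled copies of seq2."""
    start, stop = block_range(rank, size, n_shuffles)
    scores = []
    for i in range(start, stop):
        # Seeding per shuffle index makes the result independent of `size`.
        rng = random.Random(seed * 1_000_003 + i)
        letters = list(seq2)
        rng.shuffle(letters)
        scores.append(sw_score(seq1, "".join(letters)))
    return scores


def empirical_pvalue(seq1, seq2, n_shuffles=200, size=4, seed=0):
    """Observed score and the add-one empirical p-value from the null scores."""
    observed = sw_score(seq1, seq2)
    null_scores = []
    for rank in range(size):  # serial stand-in for concurrent ranks plus a gather
        null_scores.extend(
            scores_for_rank(rank, size, seq1, seq2, n_shuffles, seed)
        )
    hits = sum(s >= observed for s in null_scores)
    return observed, (hits + 1) / (len(null_scores) + 1)
```

Because each shuffled alignment is independent of the others, the work divides into contiguous blocks with no communication until the scores are collected, which is the property that makes near-linear speed-up plausible for this kind of workload.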

Original language | English (US) |
---|---|
Pages (from-to) | 2269-2279 |
Number of pages | 11 |
Journal | Concurrency Computation Practice and Experience |
Volume | 23 |
Issue number | 17 |
DOIs | |
State | Published - Dec 10 2011 |

## Keywords

- homologs
- MPI
- non-conservative pairwise statistical significance
- pairwise statistical significance
- parallel computing
- position-specific substitution matrix
- sequence alignment
- sequence-specific substitution matrix

## ASJC Scopus subject areas

- Software
- Theoretical Computer Science
- Computer Science Applications
- Computer Networks and Communications
- Computational Theory and Mathematics