TY - JOUR
T1 - Par-PSSE
T2 - Software for Pairwise statistical significance estimation in parallel for local sequence alignment
AU - Zhang, Yuhong
AU - Patwary, Md Mostofa Ali
AU - Misra, Sanchit
AU - Agrawal, Ankit
AU - Liao, Wei keng
AU - Qin, Zhiguang
AU - Choudhary, Alok
PY - 2012/3
Y1 - 2012/3
N2 - Pairwise statistical significance (PSS) has been recognized as a very useful method for homology detection. It helps estimate whether the output of a sequence alignment reflects an evolutionary link or has arisen merely by chance. However, pairwise statistical significance estimation (PSSE) poses a significant challenge in terms of performance and scalability, because constructing the empirical score distribution during the estimation is both computationally intensive and data intensive. This paper presents a software library for estimating pairwise statistical significance in parallel, named Par-PSSE, implemented in C++ using the OpenMP and MPI paradigms and their hybrids. Further, we apply the parallelization technique to estimate non-conservative PSS using standard, sequence-specific, and position-specific substitution matrices. These extensions have been found superior to the standard pairwise statistical significance in terms of retrieval accuracy. By distributing the compute-intensive kernels of the pairwise statistical significance estimation across multiple computational units, we achieve a speedup of up to 621.73× over the corresponding sequential implementation when using 1024 cores.
AB - Pairwise statistical significance (PSS) has been recognized as a very useful method for homology detection. It helps estimate whether the output of a sequence alignment reflects an evolutionary link or has arisen merely by chance. However, pairwise statistical significance estimation (PSSE) poses a significant challenge in terms of performance and scalability, because constructing the empirical score distribution during the estimation is both computationally intensive and data intensive. This paper presents a software library for estimating pairwise statistical significance in parallel, named Par-PSSE, implemented in C++ using the OpenMP and MPI paradigms and their hybrids. Further, we apply the parallelization technique to estimate non-conservative PSS using standard, sequence-specific, and position-specific substitution matrices. These extensions have been found superior to the standard pairwise statistical significance in terms of retrieval accuracy. By distributing the compute-intensive kernels of the pairwise statistical significance estimation across multiple computational units, we achieve a speedup of up to 621.73× over the corresponding sequential implementation when using 1024 cores.
KW - Hybrid paradigm
KW - MPI
KW - Multi-core
KW - OpenMP
KW - Pairwise statistical significance
UR - http://www.scopus.com/inward/record.url?scp=84859312008&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84859312008&partnerID=8YFLogxK
U2 - 10.4156/jdcta.vol6.issue5.24
DO - 10.4156/jdcta.vol6.issue5.24
M3 - Article
AN - SCOPUS:84859312008
SN - 1975-9339
VL - 6
SP - 200
EP - 208
JO - International Journal of Digital Content Technology and its Applications
JF - International Journal of Digital Content Technology and its Applications
IS - 5
ER -