TY - GEN
T1 - Lessons learned from exploring the backtracking paradigm on the GPU
AU - Jenkins, John
AU - Arkatkar, Isha
AU - Owens, John D.
AU - Choudhary, Alok
AU - Samatova, Nagiza F.
PY - 2011
Y1 - 2011
N2 - We explore the backtracking paradigm with properties seen as sub-optimal for GPU architectures, using as a case study the maximal clique enumeration problem, and find that the presence of these properties limit GPU performance to approximately 1.4-2.25 times a single CPU core. The GPU performance "lessons" we find critical to providing this performance include a coarse-and-fine-grain parallelization of the search space, a low-overhead load-balanced distribution of work, global memory latency hiding through coalescence, saturation, and shared memory utilization, and the use of GPU output buffering as a solution to irregular workloads and a large solution domain. We also find a strong reliance on an efficient global problem structure representation that bounds any efficiencies gained from these lessons, and discuss the meanings of these results to backtracking problems in general.
AB - We explore the backtracking paradigm with properties seen as sub-optimal for GPU architectures, using as a case study the maximal clique enumeration problem, and find that the presence of these properties limit GPU performance to approximately 1.4-2.25 times a single CPU core. The GPU performance "lessons" we find critical to providing this performance include a coarse-and-fine-grain parallelization of the search space, a low-overhead load-balanced distribution of work, global memory latency hiding through coalescence, saturation, and shared memory utilization, and the use of GPU output buffering as a solution to irregular workloads and a large solution domain. We also find a strong reliance on an efficient global problem structure representation that bounds any efficiencies gained from these lessons, and discuss the meanings of these results to backtracking problems in general.
UR - http://www.scopus.com/inward/record.url?scp=80052336166&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80052336166&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-23397-5_42
DO - 10.1007/978-3-642-23397-5_42
M3 - Conference contribution
AN - SCOPUS:80052336166
SN - 9783642233968
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 425
EP - 437
BT - Euro-Par 2011 Parallel Processing - 17th International Conference, Proceedings
T2 - 17th International Conference on Parallel Processing, Euro-Par 2011
Y2 - 29 August 2011 through 2 September 2011
ER -