The Hough transform is one of the most common methods for detecting shapes (lines, circles, etc.) in binary or gray-level images. In this paper we present several techniques for implementing Hough transform computations on a shared-memory multiprocessor and report their performance. Implementation results are obtained using fine-grain and coarse-grain parallelism; uniform, static, parameter, and dynamic partitioning schemes; uniform and nonuniform images; several image sizes; and several multiprocessor sizes. A simple analysis of all the implementations is also presented. The results show that coarse-grain parallelism generally performs better than fine-grain parallelism. In fact, for very fine-grain computations, multiprocessors perform worse than a single-processor implementation. There exists a granule size at which the best performance is achieved; granule sizes finer or coarser than this one result in worse performance. For nonuniform images, uniform partitioning does not perform well, whereas static and dynamic partitioning strategies perform well and comparably in most cases. Finally, the results show that speedups are very sensitive to locking granularity for fine-grain parallelism.
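As a rough illustration of the coarse-grain idea, the sketch below splits the edge pixels among workers, lets each worker vote into a private accumulator, and merges the accumulators at the end, so no locks are needed on a shared array. This is not the paper's implementation: the function names `hough_votes` and `parallel_hough`, the round-robin split, and the bin layout are all assumptions made for the example. Python threads are used only to show the structure; they are not expected to reproduce the reported speedups.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def hough_votes(points, n_theta, n_rho, rho_max):
    """Accumulate line votes for a list of (x, y) edge pixels (hypothetical helper)."""
    acc = [[0] * n_rho for _ in range(n_theta)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            # Map rho in [-rho_max, rho_max] onto a bin index in [0, n_rho - 1].
            r = int(round((rho + rho_max) * (n_rho - 1) / (2 * rho_max)))
            acc[t][r] += 1
    return acc


def parallel_hough(points, n_theta=180, n_rho=201, rho_max=100.0, workers=4):
    """Coarse-grain sketch: one private accumulator per worker, merged at the end."""
    # Assumed partitioning: deal edge pixels round-robin so each worker
    # receives roughly the same number of pixels.
    chunks = [points[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(
            pool.map(lambda c: hough_votes(c, n_theta, n_rho, rho_max), chunks)
        )
    # Merge step: sum the private accumulators bin by bin.
    total = [[0] * n_rho for _ in range(n_theta)]
    for part in partials:
        for t in range(n_theta):
            row, prow = total[t], part[t]
            for r in range(n_rho):
                row[r] += prow[r]
    return total
```

A fine-grain variant would instead have all workers update one shared accumulator, protecting each bin or group of bins with a lock; how many bins share a lock is the locking granularity to which the abstract says speedups are very sensitive.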
ASJC Scopus subject areas
- Theoretical Computer Science
- Hardware and Architecture
- Computer Networks and Communications
- Artificial Intelligence