Researchers from the Department of Electrical and Computer Engineering at Rice University recently published a paper titled "Workload Analysis and Efficient OpenCL-based Implementation of SIFT Algorithm on a Smartphone". Their results show a speedup of close to 1.7X for keypoint detection when using the ARM CPU together with the GPU, compared with the ARM CPU alone, while noting that CPU-to-GPU memory bandwidth remains a performance bottleneck. Using the CPU and GPU together also reduced energy consumption by 41%.
The Scale-Invariant Feature Transform (SIFT) algorithm is a common approach to detecting and extracting features from images. Recent developments in mobile processors have enabled heterogeneous computing on the CPU and GPU, which, judging by the published results, appears to overcome some of the limits of running this compute-intensive algorithm on mobile and tablet platforms.
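To give a sense of why SIFT is compute-heavy, here is a minimal sketch of the Gaussian pyramid and difference-of-Gaussians step that the paper's conclusion singles out for optimization. It is plain Python for illustration only, not the authors' OpenCL implementation. The function names are hypothetical, and the base sigma of 1.6 and three scales per octave are assumptions taken from the standard SIFT formulation rather than from this paper.

```python
import math

def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [sum(k * row[min(max(x + i - radius, 0), width - 1)]
             for i, k in enumerate(kernel))
         for x in range(width)]
        for row in image
    ]

def transpose(image):
    return [list(col) for col in zip(*image)]

def gaussian_blur(image, sigma):
    """Separable blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))

def build_octave(image, base_sigma=1.6, scales=3):
    """Blur at geometrically spaced sigmas, then subtract neighbouring levels."""
    k = 2 ** (1.0 / scales)
    blurred = [gaussian_blur(image, base_sigma * k ** i)
               for i in range(scales + 3)]
    dog = [
        [[b - a for a, b in zip(row_lo, row_hi)]
         for row_lo, row_hi in zip(lo, hi)]
        for lo, hi in zip(blurred, blurred[1:])
    ]
    return blurred, dog

def downsample(image):
    """Halve the resolution by keeping every second pixel (next octave seed)."""
    return [row[::2] for row in image[::2]]
```

Every output pixel of every blurred level is an independent weighted sum over its neighbours, which is the kind of data-parallel work that maps naturally onto a GPU, and also the kind that moves a lot of image data between processors.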
Extract from the Paper
CONCLUSION
This paper presents workload analysis and an efficient implementation of the SIFT algorithm on a mobile processor. We discuss efficient algorithm mapping to the GPU architecture based on profiling results and optimization techniques, with emphasis on optimized Gaussian pyramid generation. Experiments show that we can achieve a 1.69X speedup for keypoint detection compared to an optimized C++ reference design. The frame rates for keypoint detection and descriptor generation are improved to 8.5 FPS and 19 FPS, respectively. Meanwhile, we reduce the energy consumption by 41%. However, we notice that the limited number of compute units and the low memory bandwidth between the CPU and GPU are still bottlenecks for a high speed mobile computer vision application.
ACKNOWLEDGMENTS
This work was supported in part by Samsung, Qualcomm, Renesas Mobile, Texas Instruments, and by the US National Science Foundation under grants CNS-1265332, ECCS-1232274, and EECS-0925942.