ECE 562 Computer Architecture and Design


Dec 2, 2013



Project: Improving Feature Extraction Using SIFT on GPU


Rodrigo Savage, Wo-Tak Wu

Overview

Application:


Object tracking in real time

Challenges:


Static Scene


Moving objects


Occluding


Collision


Disappearing


Rotation


Scaling

Divide and Conquer:


Feature Extraction and Tracking

Focus on:


Feature Extraction, used SIFT


Improve an existing implementation with GPU

Scale Invariant Feature Transform (SIFT)

Input: image

Output: keypoints
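
As a rough illustration of what "keypoints" means as an output, the sketch below models one keypoint as a small record. The field names and the `Keypoint` class are hypothetical and are not SiftGPU's actual data layout; the 4 x 4 x 8 = 128 descriptor layout follows Lowe (2004).

```python
from dataclasses import dataclass
from typing import List

# Descriptor layout from Lowe (2004): a 4 x 4 grid of cells around the
# keypoint, each holding an 8-bin gradient-orientation histogram.
DESCRIPTOR_CELLS = 4
ORIENTATION_BINS = 8
DESCRIPTOR_LENGTH = DESCRIPTOR_CELLS * DESCRIPTOR_CELLS * ORIENTATION_BINS  # 128


@dataclass
class Keypoint:
    """Hypothetical container for one SIFT keypoint (illustration only)."""
    x: float                 # column position in the image, in pixels
    y: float                 # row position in the image, in pixels
    scale: float             # scale at which the keypoint was detected
    orientation: float       # dominant gradient orientation, in radians
    descriptor: List[float]  # 128-element feature vector

    def __post_init__(self) -> None:
        # Reject descriptors that do not match the 128-element layout.
        if len(self.descriptor) != DESCRIPTOR_LENGTH:
            raise ValueError(
                f"expected {DESCRIPTOR_LENGTH} descriptor values, "
                f"got {len(self.descriptor)}"
            )
```

A feature extractor in this picture takes one image and returns a list of such records, one per detected keypoint.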

GPU Implementation


Selected the GPU implementation by Sinha et al. at UNC Chapel Hill


Open-source SiftGPU available (latest V4.00, Sept. 2012)


SIFT is well suited to GPU implementation


Tens of thousands of threads handle subsets of data without communication with each other
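
The independence between threads can be sketched on the CPU. The example below is plain Python, not the SiftGPU kernel: it emulates one "thread" per output pixel for a horizontal Gaussian pass. Each call to `blur_pixel_horizontal` only reads the input image and writes one output value, so no call depends on another. All function names, the clamp-to-edge border handling, and the kernel radius are assumptions made for illustration.

```python
import math


def gaussian_kernel_1d(sigma, radius):
    """Return normalized 1-D Gaussian weights of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_pixel_horizontal(image, kernel, tx, ty):
    """Work of one emulated thread: compute output pixel (tx, ty).

    Reads only from the input image, so threads never need to communicate.
    """
    radius = len(kernel) // 2
    width = len(image[0])
    acc = 0.0
    for k, w in enumerate(kernel):
        # Clamp the sample column to the image border (an assumed policy).
        sx = min(max(tx + k - radius, 0), width - 1)
        acc += w * image[ty][sx]
    return acc


def blur_horizontal(image, kernel):
    """Emulate a kernel launch: one independent 'thread' per pixel."""
    height, width = len(image), len(image[0])
    return [[blur_pixel_horizontal(image, kernel, tx, ty)
             for tx in range(width)]
            for ty in range(height)]
```

On a GPU the two nested loops in `blur_horizontal` would be replaced by the thread grid, with `tx` and `ty` derived from the block and thread indices.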

Attempts to Speed Up


Tackled the two most time-consuming processing steps


Blurring images with Gaussian low-pass filter


Changed pixel data access pattern


Used different schemes of data partitioning


Keypoint descriptor (128-element vector) calculations


Optimize code in the kernel


Used usual optimization techniques


Changed GPU memory usage


Thread management


Experimented with kernel parameters


Maximized usage of available threads
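
The slides do not say which partitioning schemes or kernel parameters were tried. As a minimal sketch of the kind of bookkeeping involved, the helpers below compute, for an assumed rectangular block size, how many blocks are needed to cover an image (ceiling division, as in a typical CUDA launch), which pixels one block covers, and what fraction of the launched threads fall inside the image. The function names and the 100 x 100 image in the usage note are hypothetical.

```python
def grid_size(width, height, block_w, block_h):
    """Blocks needed in x and y to cover the image (ceiling division)."""
    blocks_x = (width + block_w - 1) // block_w
    blocks_y = (height + block_h - 1) // block_h
    return blocks_x, blocks_y


def tile_bounds(block_x, block_y, block_w, block_h, width, height):
    """Pixel rectangle (x0, y0, x1, y1) covered by one block.

    x1 and y1 are exclusive and clipped to the image, so edge tiles
    may be smaller than the block.
    """
    x0 = block_x * block_w
    y0 = block_y * block_h
    return x0, y0, min(x0 + block_w, width), min(y0 + block_h, height)


def thread_utilization(width, height, block_w, block_h):
    """Fraction of launched threads that map to a real pixel."""
    blocks_x, blocks_y = grid_size(width, height, block_w, block_h)
    launched = blocks_x * block_w * blocks_y * block_h
    return (width * height) / launched
```

For example, a 100 x 100 image with 16 x 16 blocks needs a 7 x 7 grid, and the last tile in each direction covers only 4 pixels, so some launched threads sit idle. Trying several block shapes with `thread_utilization` is one simple way to compare candidate kernel parameters before profiling on the device.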

Result: Reduced descriptor compute time from 73 ms to 22 ms (about 70% reduction)
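
The percentage can be checked from the two timings reported on the slide. The helper name below is hypothetical; the 73 ms and 22 ms values are the ones quoted in the result.

```python
def speedup_summary(before_ms, after_ms):
    """Return (fractional time reduction, speedup factor)."""
    reduction = (before_ms - after_ms) / before_ms
    speedup = before_ms / after_ms
    return reduction, speedup


reduction, speedup = speedup_summary(73.0, 22.0)
print(f"{reduction:.1%} less time, {speedup:.2f}x faster")
# prints "69.9% less time, 3.32x faster"
```

So the 70% figure is the reduction in descriptor compute time, which corresponds to roughly a 3.3x speedup of that step.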

Conclusion


Existing implementation is already pretty good


Hard to take full advantage of the architecture. Need a good understanding of:


Memory architecture


Thread usage


CUDA C/C++ compiler (nvcc) optimizes code in different ways. Need to experiment to gain performance


Hard to debug code running on GPU


Visual Profiler can provide valuable insights into code behavior


Backup Slides

References


SiftGPU, available at http://cs.unc.edu/~ccwu/siftgpu/


D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, November 2004.


Sudipta N. Sinha et al., “GPU-based Video Feature Tracking And Matching,” Technical Report TR 06-012, Department of Computer Science, UNC Chapel Hill, May 2006.


NVIDIA GeForce GT 640M LE


CUDA Cores: 384


Total available graphics memory: 4095 MB

Test image with keypoints

Algorithm
