# SE263 Video Analytics

2 Dec 2013

Course Project Initial Report

Presented by M. Aravind Krishnan, SERC, IISc

X. Mei and H. Ling, ICCV’09

AIM

The aim of the course project is to implement and, if possible, improve the work done by Xue Mei and Haibin Ling in visual tracking, as explained in their paper "Robust Visual Tracking using $\ell_1$ Minimization".

Here 'improve' means 'accelerate': speeding up execution using special processing hardware called Graphics Processing Units (GPUs).

OVERVIEW

1. I will begin by explaining the work done in the paper and the mathematical tools used to achieve its results:
   1. Bayesian state inference framework (the particle filter), used to predict the affine state of the object.
   2. Sparse representation of the tracking target.
   3. Non-negativity constraints.
   4. $\ell_1$ minimization.
   5. Template update.
2. This will be followed by a brief overview of Graphics Processing Units and how they can be used for general-purpose computation.
3. Finally, the parts of the algorithm best suited to execution on a GPU are proposed.

Templates

A sample (collection) of possible views of the object, whose linear combination can be used to represent the tracked object in the frame.

Two types of templates are considered in this paper: target templates and trivial templates.

Target templates deal with varying lighting conditions, poses, etc.

Trivial templates deal with occlusions, noise, background clutter, etc.

Templates continued

Target templates are used densely in the representation, and hence are few in number.

Trivial templates are used sparsely in the representation, and hence can be large in number.
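One way to write this down, using the $B$, $c$, $y$ notation of the minimization problem in the review of the algorithm: stack the target templates as columns of a matrix $T$ and use the columns of an identity matrix $I$ as trivial templates (each with a single non-zero pixel). The symbols $T$, $I$, $a$ and $e$ are notation assumed for this sketch.

```latex
y \approx Bc = \begin{bmatrix} T & I \end{bmatrix} \begin{bmatrix} a \\ e \end{bmatrix}
```

Here $a$ holds the (dense) target coefficients and $e$ the (sparse) trivial coefficients; a non-zero entry of $e$ absorbs the error at one occluded or noisy pixel.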

State of object being tracked

$x_t$ = (2D deformation parameters, 2D translation parameters)
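A small sketch of how such a state could act on image coordinates, assuming the usual 6-parameter affine form (a 2x2 deformation matrix plus a 2D translation). The class name `AffineState` and its field names are hypothetical, chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class AffineState:
    # Hypothetical field names: a11..a22 are the deformation parameters,
    # tx and ty are the 2D translation parameters.
    a11: float
    a12: float
    a21: float
    a22: float
    tx: float
    ty: float

    def apply(self, x, y):
        """Map a point from template coordinates into frame coordinates."""
        return (self.a11 * x + self.a12 * y + self.tx,
                self.a21 * x + self.a22 * y + self.ty)


# Identity deformation, no translation: points stay where they are.
identity = AffineState(1.0, 0.0, 0.0, 1.0, 0.0, 0.0)
# Identity deformation with a translation of (5, -2).
shifted = AffineState(1.0, 0.0, 0.0, 1.0, 5.0, -2.0)
```

Sampling candidate states then amounts to perturbing these six numbers.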

If $z_t$ is the observation of the object at time $t$, then the predicted distribution of the object state $x_t$ is given by the recursive computation

$$p(x_t \mid z_{1:t-1}) = \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid z_{1:t-1})\, dx_{t-1}$$

$$p(x_t \mid z_{1:t}) = \frac{p(z_t \mid x_t)\, p(x_t \mid z_{1:t-1})}{p(z_t \mid z_{1:t-1})}$$

"Filtering" refers to determining the distribution of a latent variable at a specific time, given all observations up to that time; particle filters are so named because they allow for approximate "filtering" using a set of "particles" (differently-weighted samples of the distribution). - Wikipedia
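A minimal particle filter sketch on a 1-D toy state, not the 6-parameter affine tracker. The Gaussian motion and observation models, the function name `particle_filter_step`, and all numeric values are assumptions made for illustration.

```python
import math
import random


def particle_filter_step(particles, observation, motion_std, obs_std, rng):
    """One predict / weight / resample step for a 1-D toy state."""
    # Predict: perturb each particle with Gaussian motion noise.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: Gaussian observation likelihood p(z_t | x_t), up to a constant.
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2)
               for p in predicted]
    total = sum(weights)
    if total == 0.0:
        # All likelihoods underflowed: fall back to uniform weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]
    # Estimate: weighted mean of the predicted particles.
    estimate = sum(w * p for w, p in zip(weights, predicted))
    # Resample: draw particles in proportion to their weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


rng = random.Random(0)
particles = [rng.uniform(-10.0, 10.0) for _ in range(500)]
estimate = None
for z in [2.0, 2.1, 1.9, 2.0]:  # noisy observations of a state near 2
    particles, estimate = particle_filter_step(particles, z, 0.5, 1.0, rng)
```

After a few observations the particles concentrate around the observed value, which is the approximate "filtering" described in the quote.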

$\ell_1$ minimization

Non-negativity

Optimization

Convex Optimization

Interior point method

The method uses the preconditioned conjugate gradient (PCG) algorithm to compute the search direction, and the run time is determined by the product of the total number of PCG steps required over all iterations and the cost of a PCG step.

This process can be accelerated by GPUs.
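To make the cost argument concrete, here is a small generic PCG sketch in plain Python with a Jacobi (diagonal) preconditioner. It is not the solver used in the paper; the function names and the 2x2 test system are made up for illustration. Each step is dominated by one matrix-vector product (`matvec`), which is the operation a GPU can speed up.

```python
def matvec(A, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg(A, b, tol=1e-10, max_steps=100):
    """Solve A x = b for symmetric positive definite A.

    Uses conjugate gradients with a Jacobi preconditioner.
    Returns the solution and the number of PCG steps taken.
    """
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    x = [0.0] * n
    r = list(b)  # residual b - A x for the starting point x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]
    p = list(z)
    rz = dot(r, z)
    for step in range(1, max_steps + 1):
        Ap = matvec(A, p)  # the expensive part of every step
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 < tol:
            return x, step
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_steps


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x, steps = pcg(A, b)
```

The total run time is then roughly `steps` times the cost of one `matvec`, matching the product described above.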

Algorithm for template update
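A sketch of the update rule as summarized in the review of the algorithm: if even the best-matching template is not similar enough to the newly tracked object, the least similar template is replaced. Cosine similarity and the names `cosine_similarity` and `update_templates` are assumptions for this sketch; the similarity measure is not fixed here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0.0 else 0.0


def update_templates(templates, tracked, threshold):
    """Return (templates, replaced_flag) after one update check."""
    sims = [cosine_similarity(t, tracked) for t in templates]
    if max(sims) >= threshold:
        # Some template still matches well enough: keep the set as it is.
        return [list(t) for t in templates], False
    # Otherwise replace the least similar template with the tracked object.
    worst = sims.index(min(sims))
    updated = [list(t) for t in templates]
    updated[worst] = list(tracked)
    return updated, True


templates = [[1.0, 0.0], [0.0, 1.0]]
tracked = [1.0, 1.0]
replaced_set, replaced = update_templates(templates, tracked, threshold=0.9)
kept_set, kept_flag = update_templates(templates, tracked, threshold=0.5)
```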

Review of Algorithm

1. Frame 1:
   1. Manually detect the object to be tracked.
   2. Initialize the target templates with random variations of the object.
2. Generate a set of $N$ states around the current state $x_t$, with each of the 6 affine parameters modeled as an independent Gaussian variable.
3. Represent each of the $N$ generated states as a sparse linear combination of target and trivial templates by solving the $\ell_1$ minimization problem $\min_c \|Bc - y\|_2^2 + \lambda \|c\|_1$.
4. Calculate $p(x_t \mid z_{1:t})$ by determining the Bayesian importance weights $w_i = p(z_t \mid x_t)$, which are in turn determined from the errors (residuals) in projecting the tracked object onto each of the solutions of step 3.
5. Update the templates if the highest similarity between the templates and the newly tracked object is less than a threshold, by replacing the lowest-similarity template with the newly tracked object.
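The $\ell_1$ minimization problem above can be tried on a toy example. The paper solves it with an interior point method; the sketch below swaps in ISTA (iterative soft-thresholding), a simpler proximal-gradient method, and leaves out the non-negativity constraint. The function names, the 3-pixel example and the value of $\lambda$ are assumptions for illustration.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v|."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(B, y, lam, iterations=2000):
    """Approximately minimize ||B c - y||_2^2 + lam * ||c||_1 with ISTA."""
    rows, cols = len(B), len(B[0])
    # Safe step size: the squared Frobenius norm bounds the largest
    # eigenvalue of B^T B, so 1 / (2 * frob_sq) is at most 1 / Lipschitz.
    frob_sq = sum(b * b for row in B for b in row)
    step = 1.0 / (2.0 * frob_sq)
    c = [0.0] * cols
    for _ in range(iterations):
        residual = [sum(B[i][j] * c[j] for j in range(cols)) - y[i]
                    for i in range(rows)]
        grad = [2.0 * sum(B[i][j] * residual[i] for i in range(rows))
                for j in range(cols)]
        c = [soft_threshold(c[j] - step * grad[j], step * lam)
             for j in range(cols)]
    return c


# One target template (first column, all ones) plus three trivial
# templates (identity columns). The third pixel of y is "occluded".
B = [[1.0, 1.0, 0.0, 0.0],
     [1.0, 0.0, 1.0, 0.0],
     [1.0, 0.0, 0.0, 1.0]]
y = [1.0, 1.0, 3.0]
c = ista(B, y, lam=0.1)
```

In this example the target coefficient comes out close to 1 and only the third trivial coefficient is non-zero (about 1.95), so the occluded pixel is explained by a trivial template rather than by distorting the target coefficient.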

Working of a GPU

A GPU consists of a large number of ALUs.

Banks of ALUs with shared memory are called cores.

An average CPU consists of up to 4 SIMD units.

A GPU consists of 32-128 SIMD units.

A Tesla C1060 unit available at SERC will be used to try to speed up the optimization process, and hence the whole algorithm.

The functionality of GPUs

Data Parallelism

GPUs are extremely good at executing the same instruction across large amounts of data.

E.g. vector addition, matrix-vector multiplication, BLAS routines, etc.

The major bottleneck of this algorithm is the convex optimization performed using the interior point method. It involves matrix-vector operations over the same matrix and around $N$ different vectors. This can be readily parallelized, and a large speedup can be achieved if done carefully.
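The structure being exploited can be shown in a few lines of plain Python: applying the same matrix to $N$ vectors is $N$ independent matrix-vector products, or equivalently one matrix-matrix product with the vectors stacked as columns. The helper names and the small matrices are made up for this sketch; on a GPU the batched form would map to a single matrix-matrix (GEMM-style) call.

```python
def matvec(A, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def matmul(A, X):
    """Dense matrix-matrix product A @ X for lists of rows."""
    cols = list(zip(*X))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols]
            for row in A]


A = [[1.0, 2.0], [3.0, 4.0]]
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # stands in for the N states

# N independent products: no product depends on any other.
separate = [matvec(A, x) for x in vectors]

# The same work as one product, with the vectors stacked as columns of X.
X = [list(col) for col in zip(*vectors)]
batched = matmul(A, X)
batched_columns = [list(col) for col in zip(*batched)]
```

Because no product depends on another, each one can be assigned to its own group of GPU threads.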

Architecture of GPU

Dividing the minimization algorithm among the cores of the GPU, and figuring out the optimal grid configuration.

Optimizing to perform the whole task with minimal data transfer from CPU to GPU, and running the algorithm in real time using just one kernel invocation for a long video.

Achieve a frame rate > 30 fps on the Tesla C1060.

Achieve a frame rate of 18 fps or more using the ATI Mobility Radeon HD 5650 graphics processor with 1 GB of internal memory available in my laptop. (Requires porting to OpenCL; subject to time constraints.)
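The two frame-rate targets translate directly into time budgets. The helper names below and the particle count of 100 are hypothetical, used only to show the arithmetic.

```python
def frame_budget_ms(fps):
    """Time available per frame, in milliseconds."""
    return 1000.0 / fps


def per_particle_budget_us(fps, num_particles):
    """Time available per particle, in microseconds, if the N
    minimizations were run one after another within a frame."""
    return frame_budget_ms(fps) * 1000.0 / num_particles


tesla_budget = frame_budget_ms(30)    # about 33.3 ms per frame
radeon_budget = frame_budget_ms(18)   # about 55.6 ms per frame
per_particle = per_particle_budget_us(30, 100)  # hypothetical N = 100
```

At 30 fps with 100 particles, a sequential solver would have about 333 microseconds per minimization, which is why the $N$ problems need to run in parallel.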

Thank you