
# Deformation Invariant Descriptor Training with Simultaneous Recurrent Network and Extended Kalman Filter

Paul Kim

12/04/2007

## Why Do We Need a Deformation Invariant Descriptor?

People are increasingly concerned with security.

Perfect face images are not always available, and face recognition from deformed images can be very difficult.

Using a Deformation Invariant Descriptor, we can build a face recognition system that is robust to deformation.

This Deformation Invariant Descriptor is further trained with a Simultaneous Recurrent Network/Extended Kalman Filter to cure the deformation and restore the original image.

## What Do We Need for Deformation Invariance?

Geodesic distance

Level curves

Sampling from each level curve

Geodesic-Intensity Histogram (GIH)

The similarity between two GIHs

## What Does "Deformation Invariant" Mean?

Deformation here means a one-to-one, continuous mapping.

Intensity values are deformation invariant (their positions may change).

Figure 1. A Deformed Image Example

## Our Framework for Deformation Invariance

A deformation invariant framework

Embed images as surfaces in 3D

Geodesic distance is made deformation
invariant by adjusting an embedding parameter

Build deformation invariant descriptors using
geodesic distances

## Deformation Invariance Is …

An intensity image is treated as a surface embedded in 3D space, with the third coordinate proportional to the intensity values with an aspect weight α, and the first two coordinates proportional to x-y with weight 1 − α. As α increases, the geodesic distance on the embedded surface becomes less sensitive to image deformations. In the limit α → 1, the geodesic distance is exactly deformation invariant. Based on this idea, the method uses geodesic sampling to get sample points on the embedded surface, then builds the geodesic-intensity histogram (GIH) as a local descriptor. GIH captures the spatial distribution of intensities on the embedded manifold. With α = 1, it is exactly invariant to deformation.
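As a concrete illustration of this embedding, here is a minimal numpy sketch (the function name and the toy image are illustrative, not from the slides): each pixel becomes a 3D point whose x-y coordinates are scaled by 1 − α and whose height is the intensity scaled by α.

```python
import numpy as np

def embed_image(I, alpha):
    """Embed a 2D intensity image as a surface in 3D.

    Each pixel (x, y) with intensity I[y, x] maps to the 3D point
    ((1 - alpha) * x, (1 - alpha) * y, alpha * I[y, x]).
    As alpha -> 1, distances on this surface depend mostly on intensity.
    """
    h, w = I.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    return np.stack([(1 - alpha) * xs, (1 - alpha) * ys, alpha * I], axis=-1)

# toy 3x3 image with intensities in [0, 1]
I = np.array([[0.0, 0.5, 1.0],
              [0.2, 0.5, 0.8],
              [1.0, 0.5, 0.0]])
S = embed_image(I, alpha=0.98)
print(S.shape)  # (3, 3, 3): one 3D point per pixel
```

With α = 0.98 the spatial axes are compressed by a factor of 50 relative to the intensity axis, which is why path lengths on the surface are dominated by intensity changes.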

Deformation Invariance for one dimension

Consider the
images as 1D
surfaces
embedded in a
2D space, where
intensity is
scaled by α, the
aspect weight
and x is scaled
by 1
-
α.

2.

## Geodesic Distance Is …

The geodesic distance g(p, q) between two points p and q is the length of the shortest path between them along the embedded surface, whose axes are (1 − α)x and αI.

From: Haibin Ling, Deformation Invariant Image Matching

Figure 3. A Geodesic Distance Example

## Geodesic Distance and α

Two images I1 and I2 are embedded as surfaces; the geodesic distance between matching points becomes deformation invariant for α close to 1.

Figure 4. A Matching Between Two Points with Geodesic Distance

## Image Embedding & Curve Lengths

An image I(x, y) : [0, 1]² → ℝ is embedded as the surface ((1 − α)x, (1 − α)y, αI(x, y)).

A curve (x(t), y(t)) on the image maps to a curve on the embedded surface with tangent

γ′(t) = ((1 − α)x′(t), (1 − α)y′(t), αI′(t))

and length

l(γ) = ∫ √((1 − α)² x′(t)² + (1 − α)² y′(t)² + α² I′(t)²) dt.

Taking the limit α → 1,

l(γ) → ∫ |I′(t)| dt,

which depends only on the intensity I: the curve length is deformation invariant.

## Deformation Invariant Sampling

Geodesic level curves → geodesic sampling:

1. Fast marching: get geodesic level curves with sampling gap Δ.

2. Sample along each level curve with gap Δ.

Figure 5. Geodesic Level Curves and Sampling
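The slides use fast marching to compute the level curves; as a rough stand-in, the following sketch computes geodesic distances on the embedded surface with Dijkstra's algorithm over a 4-connected pixel grid (a cruder discretization than fast marching, for illustration only). The level curves are then the iso-contours of the returned distance map, and sampling picks points at multiples of Δ along them.

```python
import heapq
import numpy as np

def geodesic_distances(I, src, alpha=0.98):
    """Geodesic distance from pixel `src` to all pixels via Dijkstra.

    Edge weights are embedded-surface segment lengths between
    4-connected neighbours: sqrt((1-alpha)^2 + alpha^2 * dI^2).
    """
    h, w = I.shape
    dist = np.full((h, w), np.inf)
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, (y, x) = heapq.heappop(heap)
        if d > dist[y, x]:
            continue  # stale heap entry
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                step = np.sqrt((1 - alpha) ** 2 +
                               alpha ** 2 * (I[ny, nx] - I[y, x]) ** 2)
                if d + step < dist[ny, nx]:
                    dist[ny, nx] = d + step
                    heapq.heappush(heap, (d + step, (ny, nx)))
    return dist

# flat image: every step costs (1 - alpha), so distance is city-block
d = geodesic_distances(np.zeros((5, 5)), (0, 0), alpha=0.5)
print(d[4, 4])  # → 4.0 (8 steps of length 0.5)
```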

## GIH (Geodesic-Intensity Histogram)

Given an interest point p, together with a sample point set P_p obtained via geodesic sampling, the GIH H_p at p is a normalized two-dimensional histogram obtained through the following steps:

1. Divide the 2D intensity-geodesic distance space into K × M bins, where K is the number of intensity intervals and M the number of geodesic distance intervals.

2. Insert all points in P_p into H_p, so that H_p(k, m) = #{ q ∈ P_p : (I(q), g(q)) ∈ B(k, m) }, where I(q) is the intensity at q, g(q) the geodesic distance at q (from p), and B(k, m) the bin corresponding to the k-th intensity interval and m-th geodesic interval.

3. Normalize each column of H_p (representing the same geodesic distance). Then normalize the whole H_p.
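The three steps above can be sketched with numpy's `histogram2d` (the function name, bin counts, and value ranges are assumptions for illustration):

```python
import numpy as np

def gih(intensities, geo_dists, K=8, M=8, g_max=1.0):
    """Geodesic-Intensity Histogram for one interest point.

    intensities, geo_dists: arrays of I(q) and g(q) over the sampled
    points q in P_p. Bins: K intensity intervals x M geodesic intervals.
    """
    H, _, _ = np.histogram2d(intensities, geo_dists,
                             bins=(K, M),
                             range=((0.0, 1.0), (0.0, g_max)))
    # normalize each geodesic-distance column, then the whole histogram
    col_sums = H.sum(axis=0, keepdims=True)
    H = np.divide(H, col_sums, out=np.zeros_like(H), where=col_sums > 0)
    total = H.sum()
    return H / total if total > 0 else H

pts_I = np.array([0.1, 0.2, 0.9, 0.5])   # I(q) for sampled points
pts_g = np.array([0.05, 0.5, 0.9, 0.3])  # g(q) for sampled points
H = gih(pts_I, pts_g)
print(H.shape)  # (8, 8); the histogram sums to 1
```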

## The Similarity Measure Between Two GIHs

Given two geodesic-intensity histograms H_p, H_q, the similarity between them is measured using the χ² distance:

χ²(p, q) = (1/2) Σ_{k=1..K} Σ_{m=1..M} [H_p(k, m) − H_q(k, m)]² / (H_p(k, m) + H_q(k, m))
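A direct translation of the χ² distance above (the `eps` guard against empty bins is an implementation choice, not from the slides):

```python
import numpy as np

def chi_square_dist(Hp, Hq, eps=1e-12):
    """Chi-square distance between two GIHs (lower = more similar)."""
    num = (Hp - Hq) ** 2
    den = Hp + Hq
    return 0.5 * np.sum(num / (den + eps))

a = np.array([[0.5, 0.0], [0.25, 0.25]])
b = np.array([[0.25, 0.25], [0.25, 0.25]])
print(chi_square_dist(a, a))  # → 0.0
print(chi_square_dist(a, b))
```

The distance is symmetric in H_p and H_q, and zero exactly when the two histograms coincide.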
## Real Example

Figure 6. Two-Point Comparison Using Geodesic Level Curves and Sampling

## Review: Deformation Invariant Framework

Image embedding of I(x, y) (α close to 1)

Deformation invariant sampling: geodesic sampling

Build deformation invariant descriptors (GIH)

## Temporary Result (1)

Figure 7. GIH calculation between the interest point and the correlation point
Temporary Result (2)

max

0.1804

min

0.092384

<0.1

4

3%

>=0.1;<0.11

13

11%

>=0.11;<0.12

14

12%

>=0.12;<0.13

11

9%

>=0.13;<0.14

6

5%

>=0.14;<0.15

12

10%

>=0.15;<0.16

11

9%

>=0.16;<0.17

19

16%

>=0.17;<0.18

30

25%

>=0.18;<0.19

1

1%

total

121

100%

## Simultaneous Recurrent Network (SRN)

An SRN is an artificial neural network that has

- input x,

- a non-linear feedforward function f(·) (e.g., an MLP),

- output z,

- feedback that copies the outputs to the inputs without time delay.

The output of the previous iteration is fed back to the network along with the external inputs to compute the output of the next iteration.

## What Is Attractive in the SRN?

The SRN is a great function approximator, or function mapper.

It works for more complicated problems, such as the maze problem, while retaining a biological aspect of neural networks (recurrence).

Figure 8. The Basic Topology of the SRN: input x and the feedback copy of z enter the feedforward network f(W, x, z), which produces output z.
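A minimal sketch of the topology in Figure 8, assuming a one-hidden-layer tanh network for f and a fixed number of feedback iterations (all sizes and names are illustrative, not from the slides):

```python
import numpy as np

def srn_forward(x, W1, W2, n_iter=10):
    """Run an SRN: an MLP whose output z is fed back without time delay.

    The hidden layer sees the concatenation [x; z]; z is iterated
    a fixed number of times toward a fixed point.
    """
    z = np.zeros(W2.shape[0])
    for _ in range(n_iter):
        h = np.tanh(W1 @ np.concatenate([x, z]))  # feedforward part f(W, x, z)
        z = np.tanh(W2 @ h)                       # new output, fed back next pass
    return z

rng = np.random.default_rng(0)
x = rng.standard_normal(3)
W1 = 0.5 * rng.standard_normal((5, 3 + 2))  # hidden x (input + feedback)
W2 = 0.5 * rng.standard_normal((2, 5))      # output x hidden
z = srn_forward(x, W1, W2)
print(z.shape)  # (2,)
```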

## Training of the SRN

All of the training methods involve some calculation of the derivatives of the error w.r.t. the weights.

The weights are adapted according to a simple formula:

newW_i,j = oldW_i,j − LR · ∂Error/∂W_i,j

where LR is the learning rate.

Figure 9. Types of SRN Training
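The update formula in code form (the learning rate and toy values are illustrative):

```python
import numpy as np

def update_weights(W, grad, lr=0.01):
    """One gradient-descent step: newW = oldW - LR * dError/dW."""
    return W - lr * grad

W = np.array([[0.2, -0.1], [0.4, 0.3]])
g = np.array([[1.0, 0.0], [0.0, 1.0]])   # stand-in error gradient
W_new = update_weights(W, g, lr=0.1)
print(W_new)  # each weight moves opposite its gradient
```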

## SRN Training, Continued

| Method | Pros | Cons |
| --- | --- | --- |
| Truncation | Simplest & least expensive | Does not represent the total impact of the weight changes |
| BPTT | Less expensive | Requires the storage of many intermediate results |
| SP | No need for intermediate storage | More complex |
| EC | Fewer iterations (approximates the derivatives in BPTT) | No guarantee of yielding exact results in equilibrium |
| FP | Dynamic BPTT | More expensive |

## SRN Training with the Kalman Filter

Kalman filters estimate the hidden state of a system based on observable measurements.

The estimation is done iteratively, with the state estimate improved with each new measurement.

In the case of the SRN, the set of weights becomes the state vector, and the network outputs become the measurement vector.

In the EKF, the state and observation models need not be linear functions but may instead be (differentiable) non-linear functions.

Figure 10. The Kalman Filter
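A sketch of one EKF step with the weights as the state vector. The toy measurement model here is linear, so the example reduces to an ordinary Kalman filter; the noise covariances and all sizes are assumptions for illustration.

```python
import numpy as np

def ekf_weight_update(w, P, y, h, H, R, Q):
    """One EKF step treating the network weights w as the state.

    h(w): network output for the current input (measurement model);
    H: Jacobian of h at w; y: target output; R, Q: noise covariances.
    """
    y_hat = h(w)
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    w_new = w + K @ (y - y_hat)         # state (weight) update
    P_new = P - K @ H @ P + Q           # covariance update
    return w_new, P_new

# toy linear "network": output = H @ w
H = np.array([[1.0, 0.5]])
w, P = np.zeros(2), np.eye(2)
R, Q = np.array([[0.01]]), 1e-6 * np.eye(2)
for _ in range(50):
    w, P = ekf_weight_update(w, P, np.array([1.0]), lambda v: H @ v, H, R, Q)
print(H @ w)  # approaches the target 1.0
```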

## Simple Trial with SRN/EKF: Maze Navigation

State: locations of the goal and obstacles with respect to the agent.

Actions: L, R, U, D.

The strategic utility J is the length of the path between any point and the goal.

Goal: find the shortest path.

Figure 11. A typical maze navigation problem
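The strategic utility J defined above (shortest-path length from each cell to the goal over the four actions) can be computed with breadth-first search; the maze layout here is illustrative, not from the slides.

```python
from collections import deque

def path_lengths_to_goal(maze, goal):
    """Shortest-path length from every free cell to the goal via BFS
    over L/R/U/D moves. '#' cells are obstacles."""
    h, w = len(maze), len(maze[0])
    J = {goal: 0}
    q = deque([goal])
    while q:
        y, x = q.popleft()
        for dy, dx in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            ny, nx = y + dy, x + dx
            if (0 <= ny < h and 0 <= nx < w and maze[ny][nx] != '#'
                    and (ny, nx) not in J):
                J[(ny, nx)] = J[(y, x)] + 1
                q.append((ny, nx))
    return J

maze = ["....",
        ".##.",
        "...."]
J = path_lengths_to_goal(maze, (0, 3))
print(J[(2, 0)])  # → 5: the agent must route around the obstacles
```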

## Training SRN/EKF with Parts of the Image & GIH

Fig. 12. Original Lena Face Image

(a) Eye Part (b) Deformed Eye (c) Binary Contoured Eye

(d) Nose part (e) Deformed Nose (f) Binary Contoured Nose

(g) Mouth part (h) Deformed Mouth (i) Binary Contoured Mouth

## Result of SRN/EKF on the Deformed Eye Part

Figure 13. The eye region (25×25 out of 256×256) to be deformed and trained

## Discussion and Future Direction

GIH shows robustness to deformation.

SRN/EKF takes the interest point area with its GIH and the target point area with its GIH as if they were a training maze and a testing maze.

With 10 iterations of the SRN for each of 50 EKF epochs, the weights appeared to converge toward the target values.

With a 63 × 63 image, the system runs out of memory; the maximum practical image size was 25 × 25, and it took about 10 minutes to compute 50 epochs of EKF.

The result shows the restoration effort from smear deformation.

The output function was redefined for our purpose from four directions (L, R, U, D) to eight directions, including the corner directions (NW, NE, SW, SE).