Deformation Invariant Descriptor Training with Simultaneous Recurrent Network and Extended Kalman Filter
Paul Kim
12/04/2007
Why Do We Need a Deformation Invariant Descriptor?
• People are increasingly concerned with security.
• Perfect face images are not always available.
• Face recognition from deformed images can be very difficult.
• Using a deformation invariant descriptor, we can build a face recognition system that is robust to deformation.
• This deformation invariant descriptor is further trained with a Simultaneous Recurrent Network/Extended Kalman Filter to undo the deformation and restore the original image.
What Do We Need for Deformation Invariance?
• Geodesic distance
• Level curves
• Sampling from each level curve
• Geodesic-intensity histogram (GIH)
• The similarity between two GIHs
What Does Deformation Invariant Mean?
• A one-to-one, continuous mapping.
• Intensity values are deformation invariant.
– (their positions may change)
Figure 1. A Deformed Image Example
Our Framework for Deformation Invariance
• A deformation invariant framework:
– Embed images as surfaces in 3D
– Make the geodesic distance deformation invariant by adjusting an embedding parameter
– Build deformation invariant descriptors using geodesic distances
Deformation Invariance is …
• An intensity image is treated as a surface embedded in 3D space, with the third coordinate proportional to the intensity values with an aspect weight α, and the first two coordinates proportional to x, y with weight 1 − α. As α increases, the geodesic distance on the embedded surface becomes less sensitive to image deformations. In the limit as α → 1, the geodesic distance is exactly deformation invariant. Based on this idea, the method uses geodesic sampling to get sample points on the embedded surface, then builds the geodesic-intensity histogram (GIH) as a local descriptor. GIH captures the spatial distribution of intensities on the embedded manifold. With α = 1, it is exactly invariant to deformation.
Deformation Invariance in One Dimension
Consider the images as 1D surfaces embedded in a 2D space, where intensity is scaled by the aspect weight α and x is scaled by 1 − α.
Geodesic Distance is …
• The length g(p, q) of the shortest path between two points p and q along the embedded surface, whose axes are (1 − α)x and αI.
From: Haibin Ling, Deformation Invariant Image Matching
Figure 3. A Geodesic Distance Example
Geodesic Distance and α
Two images I1 and I2 are embedded, and the geodesic distance between corresponding points becomes deformation invariant for α close to 1.
Figure 4. A Matching Between Two Points with Geodesic Distance Example
Image Embedding & Curve Lengths
Image: I(x, y): ℝ² → [0, 1]
Embedded surface: x̃ = (1 − α)x, ỹ = (1 − α)y, z̃ = αI(x, y)
Length of a curve (x(t), y(t), z(t)) on the embedded surface:

l = ∫ √( x̃′(t)² + ỹ′(t)² + z̃′(t)² ) dt = ∫ √( (1 − α)²x′(t)² + (1 − α)²y′(t)² + α²I′(t)² ) dt

Take the limit α → 1:

l → ∫ |I′(t)| dt

This depends only on the intensity I, and is therefore deformation invariant.
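The limiting behavior of the curve length can be checked numerically. Below is a minimal sketch for the 1D case (the sinusoidal intensity profile and grid resolution are illustrative choices, not from the slides):

```python
import numpy as np

def embedded_curve_length(x, intensity, alpha):
    """Length of a 1D image curve embedded in 2D with aspect weight alpha.

    Approximates l = integral sqrt((1-alpha)^2 x'(t)^2 + alpha^2 I'(t)^2) dt
    with finite differences.
    """
    dx = np.diff(x)
    dI = np.diff(intensity)
    return np.sum(np.sqrt((1 - alpha) ** 2 * dx ** 2 + alpha ** 2 * dI ** 2))

# Toy 1D intensity profile (an illustrative example).
x = np.linspace(0.0, 1.0, 200)
intensity = 0.5 + 0.5 * np.sin(2 * np.pi * x)

for alpha in (0.5, 0.9, 0.99, 1.0):
    print(alpha, embedded_curve_length(x, intensity, alpha))
```

With α = 1 the length equals the total variation of the intensity, Σ|ΔI|, which is unchanged by any stretching of the x axis, matching the deformation invariance claim.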
Deformation Invariant Sampling
Geodesic Level Curves → Geodesic Sampling
Geodesic Sampling
1. Fast marching: get geodesic level curves with sampling gap Δ
2. Sampling along the level curves, also with gap Δ, around the interest point p
Figure 5. Geodesic Level Curves and Sampling
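As a rough stand-in for the fast-marching step, geodesic distances on the embedded surface can be approximated with Dijkstra's algorithm on the pixel grid. This is a sketch under that simplification, not the fast-marching scheme the slides use:

```python
import heapq
import numpy as np

def geodesic_distances(img, seed, alpha=0.98):
    """Approximate geodesic distance from `seed` on the embedded surface
    ((1-alpha)x, (1-alpha)y, alpha*I) using Dijkstra over 4-neighbors."""
    h, w = img.shape
    dist = np.full((h, w), np.inf)
    dist[seed] = 0.0
    pq = [(0.0, seed)]
    while pq:
        d, (y, x) = heapq.heappop(pq)
        if d > dist[y, x]:
            continue  # stale queue entry
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                # Edge length on the embedded surface.
                step = np.hypot(1 - alpha, alpha * (img[ny, nx] - img[y, x]))
                if d + step < dist[ny, nx]:
                    dist[ny, nx] = d + step
                    heapq.heappush(pq, (d + step, (ny, nx)))
    return dist
```

Level curves with gap Δ are then the point sets {q : g(p, q) ∈ [kΔ, (k+1)Δ)}, and sample points are taken along each such curve.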
GIH (Geodesic Intensity Histogram)
Given an interest point p, together with a sample point set P_p obtained via geodesic sampling, the GIH H_p at p is a normalized two-dimensional histogram obtained through the following steps:
1. Divide the 2D intensity-geodesic distance space into K × M bins, where K is the number of intensity intervals and M the number of geodesic distance intervals.
2. Insert all points in P_p into H_p, so that H_p(k, m) = #{ q ∈ P_p : (I(q), g(q)) ∈ B(k, m) }, where I(q) is the intensity at q, g(q) the geodesic distance at q (from p), and B(k, m) the bin corresponding to the kth intensity interval and the mth geodesic interval.
3. Normalize each column of H_p (representing the same geodesic distance). Then normalize the whole H_p.
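The three steps above can be sketched as follows, assuming intensities lie in [0, 1] and geodesic distances are capped at g_max (K, M, and g_max are illustrative parameters, not values from the slides):

```python
import numpy as np

def geodesic_intensity_histogram(intensity, gdist, K=8, M=8, g_max=1.0):
    """Build a K x M GIH from sampled intensities I(q) and geodesic
    distances g(q), both given as 1D arrays over the sample set P_p."""
    H = np.zeros((K, M))
    # Step 1+2: bin each sample into the intensity/geodesic-distance grid.
    k = np.minimum((intensity * K).astype(int), K - 1)      # intensity bin
    m = np.minimum((gdist / g_max * M).astype(int), M - 1)  # distance bin
    np.add.at(H, (k, m), 1)
    # Step 3: normalize each geodesic-distance column, then the whole H.
    col = H.sum(axis=0, keepdims=True)
    H = np.divide(H, col, out=np.zeros_like(H), where=col > 0)
    s = H.sum()
    return H / s if s > 0 else H
```

Column-wise normalization before the global one gives each geodesic ring equal weight regardless of how many samples fall on it.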
The Similarity Measure Between Two GIHs
Given two geodesic-intensity histograms H_p, H_q, the similarity between them is measured using the χ² distance:

χ²(p, q) = (1/2) Σ_{k=1..K} Σ_{m=1..M} [H_p(k, m) − H_q(k, m)]² / (H_p(k, m) + H_q(k, m))
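A minimal implementation of this χ² distance (the eps guard for empty bins is an added safeguard, not part of the slides):

```python
import numpy as np

def chi_square_distance(Hp, Hq, eps=1e-12):
    """Chi-square distance between two normalized GIHs:
    X^2(p, q) = 0.5 * sum( (Hp - Hq)^2 / (Hp + Hq) ).

    Bins that are empty in both histograms contribute 0 (the numerator is
    already 0 there); eps only prevents a 0/0 division warning.
    """
    num = (Hp - Hq) ** 2
    den = np.maximum(Hp + Hq, eps)
    return 0.5 * np.sum(num / den)
```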
Real Example
Figure 6. Two-Point Comparison between p and q Using Geodesic Level Curves and Sampling
Review: Deformation Invariant Framework
1. Image embedding of I(x, y) (α close to 1)
2. Deformation invariant sampling (geodesic sampling)
3. Build deformation invariant descriptors (GIH)
Temporary Result (1)
Figure 7. GIH calculation between the interest point and the correlation point
Temporary Result (2)
max = 0.1804, min = 0.092384

Range             Count   Percent
< 0.1                 4        3%
>= 0.1,  < 0.11      13       11%
>= 0.11, < 0.12      14       12%
>= 0.12, < 0.13      11        9%
>= 0.13, < 0.14       6        5%
>= 0.14, < 0.15      12       10%
>= 0.15, < 0.16      11        9%
>= 0.16, < 0.17      19       16%
>= 0.17, < 0.18      30       25%
>= 0.18, < 0.19       1        1%
Total               121      100%
Simultaneous Recurrent Network (SRN)
• The SRN is an artificial neural network that has:
– an input x,
– a nonlinear feedforward function f(·) (e.g., an MLP),
– an output z,
– feedback that copies the outputs to the inputs without time delay.
• The output of the previous iteration is fed back to the network along with the external inputs to compute the output of the next iteration.
• What is attractive about the SRN?
– The SRN is a great function approximator, or function mapper.
– It works for more complicated problems, such as the maze problem, with biological aspects of neural networks (recurrency).
Figure 8. The Basic Topology of SRN: the input x and the fed-back output z feed the feedforward network f(W, x, z), which produces the output z.
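The relaxation loop of Figure 8 can be sketched as follows, assuming a single tanh hidden layer as the feedforward core f (layer sizes, the fixed iteration count, and the zero initialization of z are illustrative assumptions):

```python
import numpy as np

def srn_forward(W_in, W_fb, W_out, x, n_iter=10):
    """Relax an SRN: the output z is copied back to the inputs without
    time delay, and the feedforward core is iterated n_iter times."""
    z = np.zeros(W_out.shape[0])
    for _ in range(n_iter):
        h = np.tanh(W_in @ x + W_fb @ z)  # hidden layer sees input + feedback
        z = W_out @ h                     # new output, fed back next iteration
    return z
```

With small weights this map is a contraction, so z settles toward a fixed point that depends on x; the fixed iteration count stands in for an explicit convergence test.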
Training of SRN
• All the training methods involve some calculation of the derivatives of the error w.r.t. the weights.
• The weights are adapted according to a simple formula:

newW_{i,j} = oldW_{i,j} − LR · ∂Error/∂W_{i,j}

where LR = learning rate.
Figure 9. Types of SRN Training
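A sketch of the weight-update formula, with the derivative ∂Error/∂W_{i,j} approximated by central finite differences as a simple stand-in for whichever derivative scheme (truncation, BPTT, etc.) is actually used:

```python
import numpy as np

def update_weights(W, error_fn, lr=0.01, h=1e-5):
    """newW = oldW - LR * dError/dW, with the derivative approximated by
    central finite differences (illustrative; not one of the slide's
    derivative methods)."""
    grad = np.zeros_like(W)
    for i in np.ndindex(W.shape):
        orig = W[i]
        W[i] = orig + h
        e_plus = error_fn(W)
        W[i] = orig - h
        e_minus = error_fn(W)
        W[i] = orig                       # restore the perturbed weight
        grad[i] = (e_plus - e_minus) / (2.0 * h)
    return W - lr * grad
```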
SRN Training (cont'd)
• Truncation
– Pros: simplest & least expensive
– Cons: does not represent the total impact of the weight changes
• BPTT
– Pros: less expensive
– Cons: requires the storage of many intermediate results
• SP
– Pros: no need for intermediate storage
– Cons: more complex
• EC
– Pros: fewer iterations (approximates the derivatives in BPTT)
– Cons: no guarantee of yielding exact results in equilibrium
• FP
– Pros: dynamic BPTT
– Cons: more expensive
SRN Training with Kalman Filter
• Kalman filters estimate the hidden state of a system based on observable measurements.
• The estimation is done iteratively, with the state estimate improved with each new measurement.
• In the case of the SRN, the set of weights becomes the state vector, and the measured outputs become the measurement vector.
• In the EKF, the state and observation models need not be linear functions but may instead be (differentiable) nonlinear functions.
Figure 10. The Kalman Filter
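One EKF step with the SRN weights as the state vector might look like the sketch below; the random-walk state model and the choices of Q, R, and the output Jacobian are assumptions for illustration, not details from the slides:

```python
import numpy as np

def ekf_weight_step(w, P, y, h_fn, H_jac, Q, R):
    """One EKF update treating the network weights w as the state.

    y: observed target outputs; h_fn(w): network outputs for weights w;
    H_jac(w): Jacobian of the outputs w.r.t. the weights (n_out x n_w);
    P: weight covariance; Q, R: process and measurement noise covariances.
    """
    # Predict: assume the weights follow a random walk, w_k = w_{k-1} + noise.
    P = P + Q
    # Update with the measurement y.
    H = H_jac(w)
    S = H @ P @ H.T + R                   # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)        # Kalman gain
    w = w + K @ (y - h_fn(w))             # corrected weights
    P = (np.eye(len(w)) - K @ H) @ P
    return w, P
```

Iterating this step over the training measurements plays the role of the 50 EKF epochs mentioned later in the slides.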
Simple Trial with SRN/EKF: Maze Navigation
• State: the locations of the goal and obstacles with respect to the agent.
• Actions: L, R, U, D.
• The strategic utility J is the length of the path between any point and the goal.
• Find the shortest path.
Figure 11. A typical maze navigation problem
Training SRN/EKF with Parts of the Image & GIH
Figure 12. Original Lena Face Image
(a) Eye part (b) Deformed eye (c) Binary contoured eye
(d) Nose part (e) Deformed nose (f) Binary contoured nose
(g) Mouth part (h) Deformed mouth (i) Binary contoured mouth
Result of SRN/EKF on the Deformed Eye Part
Figure 13. The eye image region (25x25 out of 256x256) to be deformed and trained
Discussion and Future Direction
• GIH shows robustness to deformation.
• SRN/EKF takes the interest-point area and its GIH, and the target-point area and its GIH, as if they were a training maze and a testing maze.
• With 10 iterations of the SRN for each of 50 EKF epochs, the weights appeared to converge toward the target values.
• With a 63 x 63 image, the system runs out of memory.
• The maximum image size was 25 x 25, and it took about 10 minutes to compute 50 epochs of EKF.
• The result shows the system attempting to restore the image from the smear deformation.
• The output function was redefined for our purposes from four directions (L(←), R(→), U(↑), D(↓)) to eight directions, including the corner directions (NW, NE, SW, SE).