1

A Multimodal Approach for Face Modeling and Recognition

Advisor: 萬書言

Student: 何炳杰

2

Outline


Abstract


Introduction


3-D Face Recognition Based On Ridge Images And Iterative Closest Points


2-D Face Recognition Based On Attributed Graphs


Fusing The Information From 2-D And 3-D


Experiments And Results


3

Abstract 1/3


In this paper, we present a fully automated multimodal (3-D and 2-D) face recognition system.

For the 3-D modality, we model the facial image as a 3-D binary ridge image that contains the ridge lines on the face.

We use the maximum principal curvature to extract the locations of the ridge lines around the important facial regions on the range image (i.e., the eyes, the nose, and the mouth).

4

Abstract 2/3


For the 2-D modality, we model the face by an attributed relational graph (ARG).

Each node of the graph corresponds to a facial feature point. At each facial feature point, a set of attributes is extracted by applying Gabor wavelets to the 2-D image and assigned to the node of the graph.

5

Abstract 3/3


Finally, we fuse the matching results of the 3-D and the 2-D modalities at the score level to improve the overall performance of the system.


6


Introduction 1/5


In this paper, we present a multimodal face recognition system that fuses results from both 3-D and 2-D face recognition.

Because the 2-D and the 3-D modeling data in our system are independent of each other, the system can be employed in different face recognition scenarios, such as 2-D or 3-D face recognition individually, or multimodal face recognition.


7

Introduction 2/5

Fig. 1 illustrates a general block diagram of our system.

(Fig. 1 labels: 3-D binary ridge image, ARG)

8

Introduction 3/5


For the 3-D modality:

(i) We use the maximum principal curvature, $k_{max}$, to extract the locations of the ridge lines around the important facial regions in the range image (i.e., the eyes, nose, and mouth).

(ii) We represent the face image as a 3-D binary ridge image that contains the ridge lines on the face.

(iii) In the matching phase, instead of using the entire surface of the face, we match only the ridge lines. This reduces the computation during the matching process.
9

Introduction 4/5


For the 2-D modality, we build an attributed relational graph (ARG) using nodes at certain labeled facial points.

In order to automatically extract the locations of the facial points, we use an improved version of the active shape model (ASM).

At each node of the graph, we compute the responses of 40 Gabor filters (eight orientations by five wavelengths).

The similarity between the ARG models is employed for 2-D face recognition.

10

Introduction 5/5





In summary, the main contributions of this paper are:

presenting a fully automated algorithm for 3-D face recognition based on the ridge lines of the face;

developing a fully automated algorithm for 2-D face recognition based on attributed relational graph models;

presenting and comparing two methods for fusing 2-D and 3-D face recognition, based on the Dempster-Shafer (DS) theory of evidence and the weighted sum of scores technique;

evaluating the performance of the system using the FRGC v2.0 database.

11

3-D Face Recognition Based On Ridge Images And Iterative Closest Points 1/3

A. Ridge Images

Our goal is to extract the points lying on ridge lines and use them as the feature points on the surface.

For facial range images, these are points on the lines around the eyes, the nose, and the mouth.

In the literature [13], ridges are defined as the points at which the principal curvature of the surface attains a local positive maximum.

Intuitively, valleys are the points that trace the drainage patterns of a surface; viewed from the opposite side, they are referred to as ridges.



12


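Fig. 2 shows an example of a ridge image obtained by thresholding $k_{max}$. It is a 3-D binary image showing the locations of the ridge lines on the facial surface.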
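The thresholding step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the range image is a dense NumPy array `z[row, col]` of depth values, estimates derivatives with finite differences, and uses the standard mean and Gaussian curvature formulas for a surface z(x, y). The function name `ridge_image`, the `threshold` value, and the sign convention (positive curvature where the surface bends toward increasing z) are assumptions; negate `z` if the depth axis points the other way.

```python
import numpy as np

def ridge_image(z, threshold, spacing=1.0):
    """Binary ridge mask from a range image z[row, col] by thresholding k_max."""
    # First derivatives: np.gradient returns d/drow (y) first, then d/dcol (x).
    zy, zx = np.gradient(z, spacing)
    # Second derivatives obtained by differentiating the first derivatives again.
    zxy, zxx = np.gradient(zx, spacing)
    zyy, _ = np.gradient(zy, spacing)

    g = 1.0 + zx**2 + zy**2
    # Mean (H) and Gaussian (K) curvature of the surface (x, y, z(x, y)).
    H = ((1 + zy**2) * zxx - 2 * zx * zy * zxy + (1 + zx**2) * zyy) / (2 * g**1.5)
    K = (zxx * zyy - zxy**2) / g**2
    # Larger principal curvature; the clamp guards against tiny negative round-off.
    k_max = H + np.sqrt(np.maximum(H**2 - K, 0.0))
    return k_max > threshold
```

On a real range image the derivatives are noisy, so some smoothing before differentiation is usually needed; that step is left out of this sketch.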



13

3-D Face Recognition Based On Ridge Images And Iterative Closest Points 2/3

B. Ridge Image Matching

In this work, we use a fast ICP variant [33].

The ICP used in this paper differs from the ICP in [33] in the feature point selection phase.

We do not rely on random sampling of the points; instead, we use all of the feature points in the 3-D ridge image during the matching process.

Although random sampling of the points speeds up the matching process, it has a major effect on the accuracy of the final results (the authors' view).

14

3-D Face Recognition Based On Ridge Images And Iterative Closest Points 3/3

Before matching the ridge images, we initially align them using three extracted facial landmarks (i.e., the two inner corners of the eyes and the tip of the nose).

We use a fully automated technique to extract these facial landmarks, based on Gaussian curvature.
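The two-stage matching (coarse alignment from three landmarks, then ICP over all ridge points) can be sketched as below. This is a hedged illustration under stated assumptions, not the fast ICP variant of [33]: it uses a plain point-to-point ICP with brute-force nearest neighbours and the Kabsch least-squares rigid fit. The names `best_rigid_transform`, `align_with_landmarks`, and `icp`, and the fixed iteration count, are hypothetical; landmark coordinates are assumed to be given.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t such that R @ src_i + t ~ dst_i."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def align_with_landmarks(points, probe_landmarks, gallery_landmarks):
    """Coarse alignment from three corresponding landmarks (eye corners, nose tip)."""
    R, t = best_rigid_transform(probe_landmarks, gallery_landmarks)
    return points @ R.T + t

def icp(probe, gallery, iterations=30):
    """Point-to-point ICP that uses every probe point (no random subsampling)."""
    current = probe.copy()
    for _ in range(iterations):
        # Squared distances between every probe point and every gallery point.
        d2 = ((current[:, None, :] - gallery[None, :, :]) ** 2).sum(axis=2)
        nearest = gallery[d2.argmin(axis=1)]
        R, t = best_rigid_transform(current, nearest)
        current = current @ R.T + t
    d2 = ((current[:, None, :] - gallery[None, :, :]) ** 2).sum(axis=2)
    rms = float(np.sqrt(d2.min(axis=1).mean()))  # residual usable as a match score
    return current, rms
```

The brute-force distance matrix costs O(N*M) memory per iteration, which is acceptable for sparse ridge points; a k-d tree would be the usual replacement for larger point sets.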


15

As shown in Fig. 3, a surface that has either a peak or a pit shape has a positive Gaussian curvature value.
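The reason both shapes give a positive value follows from the standard definition of Gaussian curvature as the product of the two principal curvatures. For a range image written as a surface z(x, y) (an assumed parameterization, not notation from the slides):

```latex
K = k_1 k_2
  = \frac{z_{xx}\, z_{yy} - z_{xy}^{2}}{\left(1 + z_x^{2} + z_y^{2}\right)^{2}}
```

At a peak both principal curvatures have one sign and at a pit both have the other, so their product is positive in either case; at a saddle they have opposite signs and K is negative.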

16

Fig. 4 shows a sample range image with the three extracted facial landmarks (figure labels: eye sockets, nose tip).


17

2-D Face Recognition Based On Attributed Graphs 1/14

Elastic bunch graph matching (EBGM) represents a facial image by a labeled graph called a bunch graph, in which edges are labeled with distance information and nodes are labeled with wavelet responses bundled in jets.

In addition, bunch graphs are treated as combinatorial entities in which, for each fiducial point, a set of jets from different sample faces is combined, thus creating a highly adaptable model.


18

2-D Face Recognition Based On Attributed Graphs 2/14

In mathematics, a geometric graph is a graph in which the vertices or edges are associated with geometric objects or configurations.

Triangulation is one technique for building a geometric graph.

A Delaunay triangulation is a graph defined on a set of points in the plane by connecting two points with an edge whenever a circle exists that contains those two points and no others.


19

Delaunay triangulation

20

2-D Face Recognition Based On Attributed Graphs 3/14

In this paper, the goal is to model 2-D facial images by attributed relational graphs.

21

2-D Face Recognition Based On Attributed Graphs 4/14

A. Building the Attributed Graph

An ARG [26] consists of a set of nodes, edges, and mutual relations between them.

Let us denote the ARG by $g = (V, E, R)$, where $V = \{v_1, v_2, \ldots, v_N\}$ is the set of $N$ nodes of the graph and $E = \{e_1, e_2, \ldots, e_M\}$ is the set of $M$ edges.

The nodes of the graph represent the extracted facial features.

$R$ is a set of mutual relations between the three edges of each triangle in the Delaunay triangulation.
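As a data-structure sketch, the triple (V, E, R) can be assembled from node attributes and a list of triangles. This is an illustration under assumptions: the triangles are taken as already computed (for example by a Delaunay triangulation of the feature points), and `R` here only records which three edges belong to each triangle, since the actual relation values are defined on a figure that is not reproduced in this text. The function name `build_arg` is hypothetical.

```python
from itertools import combinations

def build_arg(node_attributes, triangles):
    """Return (V, E, R) for an attributed relational graph.

    node_attributes: one attribute vector per facial feature point.
    triangles: index triples (i, j, k) into node_attributes.
    """
    V = list(node_attributes)
    edge_index = {}   # undirected edge (a, b) with a < b -> position in E
    E = []
    R = []
    for tri in triangles:
        edge_ids = []
        for a, b in combinations(sorted(tri), 2):
            if (a, b) not in edge_index:
                edge_index[(a, b)] = len(E)
                E.append((a, b))
            edge_ids.append(edge_index[(a, b)])
        # The three edges of this triangle, between which a relation is defined.
        R.append(tuple(edge_ids))
    return V, E, R
```

Edges shared by neighbouring triangles are stored once, so M is the number of distinct edges rather than three times the number of triangles.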

22

2-D Face Recognition Based On Attributed Graphs 5/14

Mathematically, we write $R = \{ r_{ijk} \mid e_i, e_j, e_k \in D_t \}$, where $D_t$ is the set of triangles in the Delaunay triangulation.

Recall that a Delaunay triangulation $D_t(P)$ for a set of points $P$ satisfies the condition that no point in $P$ is inside the circumcircle of any triangle in $D_t(P)$.
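The empty-circumcircle condition can be checked directly. The sketch below is illustrative only: it verifies the property for a given 2-D triangulation using the standard in-circle determinant, and does not construct the triangulation. The function names are hypothetical, and points exactly on a circumcircle are treated as not inside.

```python
def orientation(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def in_circumcircle(a, b, c, p):
    """True if p lies strictly inside the circumcircle of triangle abc."""
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    # The determinant's sign depends on the triangle's orientation.
    return det * orientation(a, b, c) > 0

def is_delaunay(points, triangles):
    """Check that no point falls inside the circumcircle of any triangle."""
    for i, j, k in triangles:
        for idx, p in enumerate(points):
            if idx in (i, j, k):
                continue
            if in_circumcircle(points[i], points[j], points[k], p):
                return False
    return True
```

For four points forming a flat kite, only the triangulation that uses the short diagonal passes this check.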
23

2-D Face Recognition Based On Attributed Graphs 6/14

Here $\theta$ specifies the orientation of the wavelet, $\lambda$ is the wavelength of the sine wave, $\sigma$ is the radius of the Gaussian, $\phi$ is the phase of the sine wave, and $\gamma$ specifies the aspect ratio of the Gaussian.

The kernels of the Gabor filters are selected at eight orientations (i.e., $\theta \in \{0, \pi/8, 2\pi/8, 3\pi/8, 4\pi/8, 5\pi/8, 6\pi/8, 7\pi/8\}$) and five wavelengths (i.e., $\lambda \in \{1, \sqrt{2}, 2, 2\sqrt{2}, 4\}$).
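A bank of 8 x 5 = 40 filters and the resulting attribute vector at one node can be sketched as follows. This is a hedged illustration: the kernel is the commonly used complex Gabor form (Gaussian envelope times a complex sinusoid), which is assumed here because the kernel equation itself is not reproduced in the text. Tying sigma to the wavelength, the window half-width `half`, and the names `gabor` and `jet` are also assumptions.

```python
import cmath
import math

ORIENTATIONS = [k * math.pi / 8 for k in range(8)]
WAVELENGTHS = [1.0, math.sqrt(2), 2.0, 2 * math.sqrt(2), 4.0]

def gabor(x, y, theta, lam, sigma, phi=0.0, gamma=1.0):
    """Complex Gabor kernel value at offset (x, y)."""
    xr = x * math.cos(theta) + y * math.sin(theta)     # rotate into filter frame
    yr = -x * math.sin(theta) + y * math.cos(theta)
    envelope = math.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2))
    return envelope * cmath.exp(1j * (2 * math.pi * xr / lam + phi))

def jet(image, row, col, half=4, sigma_per_lambda=1.0):
    """Magnitudes of the 40 filter responses at pixel (row, col)."""
    magnitudes = []
    for lam in WAVELENGTHS:
        for theta in ORIENTATIONS:
            response = 0j
            for dy in range(-half, half + 1):
                for dx in range(-half, half + 1):
                    r, c = row + dy, col + dx
                    if 0 <= r < len(image) and 0 <= c < len(image[0]):
                        response += image[r][c] * gabor(
                            dx, dy, theta, lam, sigma_per_lambda * lam)
            magnitudes.append(abs(response))
    return magnitudes
```

Calling `jet(image, row, col)` on a grayscale image stored as a list of rows returns one 40-element attribute vector for the node at that pixel.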


24

2-D Face Recognition Based On Attributed Graphs 7/14

Specifically, referring to Fig. 5, the mutual relations used in this work are defined to be:





25

2-D Face Recognition Based On Attributed Graphs 8/14

B. Facial Feature Extraction

In this paper, we transform the color image into HSV space and assume that the three channels (i.e., hue, saturation, and value) are statistically independent and that the normalized first derivative of each channel along a profile line follows a multivariate Gaussian distribution.
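Under this Gaussian assumption, a natural way to score a sampled profile against the model is a weighted sum of per-channel Mahalanobis distances. The sketch below shows that idea only; the exact distance used by the authors is not reproduced in this text, so the functional form, the function name `profile_distance`, and the dictionary layout keyed by `'h'`, `'s'`, `'v'` are all assumptions.

```python
import numpy as np

def profile_distance(profiles, means, covariances, weights):
    """Weighted sum of Mahalanobis distances over the H, S and V channels.

    Each argument is a dict keyed by 'h', 's', 'v'. The weights must sum to 1.
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("channel weights must sum to 1")
    total = 0.0
    for ch in ("h", "s", "v"):
        diff = np.asarray(profiles[ch], float) - np.asarray(means[ch], float)
        # diff^T * inverse(covariance) * diff, computed without an explicit inverse.
        total += weights[ch] * float(diff @ np.linalg.solve(covariances[ch], diff))
    return total
```

In an ASM search, this distance would be evaluated for candidate positions along the profile line and the position with the smallest value kept.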


26

2-D Face Recognition Based On Attributed Graphs 9/14

The best match for a probe sample in HSV color space to a reference model is found by minimizing the distance, where:

$g_i$ is the sample profile;

$\bar{g}_i$ and $\Sigma_i$ are the mean and the covariance of the profile line of the $i^{th}$ component of the Gaussian model, respectively;

$w_i$ is the weighting factor for the $i^{th}$ component of the model, with the constraint that $w_h + w_s + w_v = 1$.
27

2-D Face Recognition Based On Attributed Graphs 10/14

C. Feature Selection

The number of feature points affects the performance of the graph representation for face recognition.

In this work, we initially extracted 75 feature points and then used a standard template to add more features at certain positions on the face, such as the cheeks and the points on the ridge of the nose.


28

2-D Face Recognition Based On Attributed Graphs 11/14

By using the standard template (Fig. 6), the total number of feature point candidates represented by the nodes of the ARG model was increased to 111 points.

29

2-D Face Recognition Based On Attributed Graphs 12/14

Fig. 7 shows a sample face in the gallery along with the candidate points for building the ARG model.

30

2-D Face Recognition Based On Attributed Graphs 13/14

D. Recognition

Assume that the ARG models of two faces, $g$ and $g'$, are given. The dissimilarity between these two ARGs is defined by

$$D(g, g') = w_v \, (1 - S_v(\cdot)) + w_r \, D_r(\cdot),$$

where $(1 - S_v(\cdot))$ and $D_r(\cdot)$ are functions that measure the differences between the nodes of the graph and the mutual relations of the corresponding triangles from the Delaunay triangulation, respectively.

$w_v$ and $w_r$ are weighting factors.
31

2-D Face Recognition Based On Attributed Graphs 14/14

The similarity measure $S_v(\cdot)$ is defined in terms of $a_j$, the magnitude of the set of 40 complex coefficients of the Gabor filter response obtained at the $j^{th}$ node of the graph.
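One way to turn the magnitude vectors $a_j$ into a node similarity, and then into an ARG dissimilarity of the form w_v (1 - S_v) + w_r D_r, is sketched below. The normalized dot product averaged over nodes is the similarity commonly used for Gabor jets in EBGM-style systems; it is assumed here and may differ from the authors' definition. The relation distance D_r is passed in as a precomputed number, and the default weights of 0.5 are placeholders.

```python
import math

def jet_similarity(a, b):
    """Normalized dot product of two magnitude vectors (1.0 for parallel vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def node_similarity(jets_a, jets_b):
    """Average jet similarity over corresponding nodes (S_v in this sketch)."""
    sims = [jet_similarity(a, b) for a, b in zip(jets_a, jets_b)]
    return sum(sims) / len(sims)

def arg_dissimilarity(jets_a, jets_b, relation_distance, w_v=0.5, w_r=0.5):
    """Combine node and relation terms: w_v * (1 - S_v) + w_r * D_r."""
    return w_v * (1.0 - node_similarity(jets_a, jets_b)) + w_r * relation_distance
```

Because the magnitudes are non-negative, the node similarity stays in [0, 1], so the node term of the dissimilarity is bounded by w_v.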
32

Fusing The Information From 2-D And 3-D 1/4

The tanh-estimators score normalization is efficient and robust and is defined as

$$s_j^n = \frac{1}{2}\left\{\tanh\left(0.01\,\frac{s_j - \mu_{GH}}{\sigma_{GH}}\right) + 1\right\},$$

where $s_j$ and $s_j^n$ are the scores before and after normalization, and $\mu_{GH}$ and $\sigma_{GH}$ are the mean and standard deviation estimates, respectively.
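As a small worked example, the commonly cited form of the tanh-estimators normalization maps a raw score into the open interval (0, 1). The sketch assumes the robust mean and standard deviation estimates are supplied by the caller; computing them is not shown, and the function name is hypothetical.

```python
import math

def tanh_normalize(score, mu_gh, sigma_gh):
    """Tanh-estimators normalization: 0.5 * (tanh(0.01 * (s - mu) / sigma) + 1)."""
    return 0.5 * (math.tanh(0.01 * (score - mu_gh) / sigma_gh) + 1.0)
```

A score equal to the estimated mean maps to exactly 0.5, and the small 0.01 factor keeps typical scores in the nearly linear part of the tanh curve.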

33

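Fusing The Information From 2-D And 3-D 2/4

Hampel estimators are based on the following influence function:

$$\psi(u) = \begin{cases} u, & 0 \le |u| < a, \\ a \, \mathrm{sign}(u), & a \le |u| < b, \\ a \, \mathrm{sign}(u) \, \dfrac{c - |u|}{c - b}, & b \le |u| < c, \\ 0, & |u| \ge c, \end{cases}$$

where sign(u) = +1 if u >= 0; otherwise, sign(u) = -1.

The Hampel influence function reduces the influence of the scores at the tails of the distribution (identified by a, b, and c).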
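The standard three-part Hampel influence function can be written directly as a piecewise function. This sketch only evaluates the function for given breakpoints a < b < c; choosing the breakpoints and iterating to the robust estimates are outside its scope, and the function name is hypothetical.

```python
def hampel_psi(u, a, b, c):
    """Hampel influence function with breakpoints a < b < c."""
    magnitude = abs(u)
    sign = 1.0 if u >= 0 else -1.0
    if magnitude < a:
        return u                      # central region: full influence
    if magnitude < b:
        return a * sign               # influence capped at a
    if magnitude < c:
        return a * sign * (c - magnitude) / (c - b)   # linear descent to zero
    return 0.0                        # extreme tail: no influence
```

Scores beyond c contribute nothing to the estimates, which is what makes the normalization robust to outliers.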





34

Fusing The Information From 2-D And 3-D 3/4

B. Fusion Techniques

The weighted sum score fusion technique is defined as

$$S_f = \sum_{j=1}^{R} w_j \, s_j^n,$$

where $w_j$ is the weight of the $j^{th}$ modality with the condition $\sum_{j=1}^{R} w_j = 1$, and $s_j^n$ is the normalized score of the $j^{th}$ modality.
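A minimal sketch of the weighted sum rule follows. The weights 0.7 (3-D) and 0.3 (2-D) in the usage note are the optimum values reported in the experiments; the two normalized scores are made-up inputs for illustration, and the function name is hypothetical.

```python
def weighted_sum_fusion(normalized_scores, weights):
    """Fuse per-modality normalized scores with weights that sum to 1."""
    if len(normalized_scores) != len(weights):
        raise ValueError("one weight is required per modality")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * s for w, s in zip(weights, normalized_scores))
```

For example, `weighted_sum_fusion([0.9, 0.6], [0.7, 0.3])` combines a 3-D score of 0.9 and a 2-D score of 0.6 as 0.7 * 0.9 + 0.3 * 0.6.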
35

Fusing The Information From 2-D And 3-D 4/4

In our case, $w_1$ and $w_2$ are the weights for the 3-D and 2-D modalities, respectively.

Another fusion algorithm that we applied to combine the results of the 2-D and 3-D face recognition is the DS theory.

Based on the Dempster rule of combination, the match scores obtained from two different techniques (i.e., two modalities in our work) can be fused.
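Dempster's rule of combination in its general form can be sketched as below. How the authors convert match scores into mass assignments is not given in this text, so the example masses over the hypothetical frame {genuine, impostor} are assumptions; only the combination rule itself is standard.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions whose focal elements are frozensets."""
    combined = {}
    conflict = 0.0
    for set_a, mass_a in m1.items():
        for set_b, mass_b in m2.items():
            intersection = set_a & set_b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b      # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Hypothetical masses for the 3-D and 2-D matchers on one comparison.
G, I = frozenset({"genuine"}), frozenset({"impostor"})
EITHER = G | I                                   # uncommitted belief
m_3d = {G: 0.8, EITHER: 0.2}
m_2d = {G: 0.6, I: 0.1, EITHER: 0.3}
fused = dempster_combine(m_3d, m_2d)
```

The fused mass on `G` can then serve as the combined match score for the pair.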
36

Experiments And Results 1/5


Fig. 8 shows the results of the verification experiment for neutral versus neutral facial images.

As the ROC curve shows (see also the second row of Table II), the 3-D modality performs better than the 2-D modality (88.5% versus 79.80% verification at 0.1% FAR), and the best verification rate of the multimodal (3-D + 2-D) fusion belongs to the DS combination rule (94.49% at 0.1% FAR).
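For readers unfamiliar with the metric, the verification rate at a fixed false accept rate (FAR) can be computed from genuine and impostor score lists as sketched here. The sketch assumes similarity scores (higher means a better match) and a strict threshold; the function name and the toy scores in the usage note are illustrative, not data from the paper.

```python
import math

def verification_rate_at_far(genuine, impostor, far):
    """Fraction of genuine scores accepted at the threshold that yields the given FAR."""
    ranked = sorted(impostor, reverse=True)
    # Number of impostor comparisons that may be accepted at this FAR.
    allowed = math.floor(far * len(ranked) + 1e-9)
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    accepted = sum(1 for score in genuine if score > threshold)
    return accepted / len(genuine)
```

With ten impostor scores and `far=0.1`, exactly one impostor may exceed the threshold, and the function reports the share of genuine scores above that same threshold.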


37

Table II

38

Experiments And Results 2/5


Fig. 9 shows the verification rate of the multimodal (3-D + 2-D) fusion, at 0.1% FAR, with respect to different weights for each modality.

Since there are only two modalities, $w_1 + w_2 = 1$, and the x axis of Fig. 9 is $w_1$.
39

Experiments And Results 3/5


As the figure shows, the optimum weights that produce the maximum fusion performance are 0.7 and 0.3 for $w_1$ and $w_2$, respectively.
40

Experiments And Results 4/5


Fig. 10 shows the average rank-one identification rate for various numbers of subjects enrolled in the database.

41

Experiments And Results 5/5


Fig. 11 shows the cumulative match characteristic (CMC)
curve for the recognition, based on ridge images, of faces
with expressions using the FRGC v2.0 database.

42

Thank you!