Automated Face Tracking and
Recognition

Curt Hesher

Anuj Srivastava

Gordon Erlebacher


Overview


Review of Past Research in Face Tracking and
Recognition


Data Acquisition and Representation


Face Tracking Using Images Generated from
Geometry


Face Recognition Using Range Images


Conclusions and Future Work

A Review of Face Tracking and
Recognition


Survey papers


Past research


Commercial implementations


Persistent challenges

Survey Papers


Nonconnectionist (Samal and Iyengar)


Approaches dealing with the relative position of
feature points (distance between eyes, corners of
the mouth, etc.) derived from certain pixel values


Connectionist (Valentin et al.)


Approaches that derive characteristics from the whole face image (e.g., PCA)


General (Chellappa et al., Barrett, Zhao et al.)


Approaches categorized as neural, statistical, and
feature based

Past Research


Start with 2D images


LDA, KDA, PCA, SVM, EBGM


Neural, statistical, feature analysis

Commercial Implementations


Numerous implementations


Statistical, neural, and feature based


Government-sponsored tests (FRVT 2000 and 2002) show accuracy between 20% and 90%, depending on the environment


Robust face recognition is still unsolved

Persistent Challenges


Variation from pose


Variation from lighting


Occlusions


Poor image quality


Techniques that begin with 2D data have been heavily researched. A new imaging modality should be explored: 3D imaging

A Novel Approach


Start with 3D data


Use the additional information present in 3D data
for tracking and recognition

Data Acquisition and Representation


Minolta Vivid 700 3D scanner



Meshes captured using 3D
camera



½ second capture time



Short capture time avoids artifacts from subject motion



Light-independent capture of geometry

Data Acquisition and Representation


Sample points on the surface of an object and
connect them via lines to form a mesh



200x200 geometry res.


400x400 texture res.


About 10K points
sampled from a face


About 40K pixels
sampled from a face

Tracking


Algorithm


Experiment


Conclusions

Algorithm


Segmentation and
recognition are not
addressed


Mesh is manually chosen


Video is manually chosen (the subject faces forward in the first frame and is at a reasonable distance from the camera)

Algorithm


Tracking through synthesis


Cost function (C) indicates likeness of estimate (E)
to target (T)



Follow the gradient of the cost function to achieve
alignment


C(x, θ) = || T − E(x, θ) ||²

∂C/∂q_i = ⟨ −2 (T − E(Q)), ∂E(Q)/∂q_i ⟩,   where Q = (x, θ)
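
The slides do not give the optimization details, so the following is only a minimal sketch of synthesis-based alignment. It assumes a hypothetical render(q) callable that synthesizes the image E from the parameter vector q, approximates the gradient of C by finite differences, and follows it with a fixed step size:

    import numpy as np

    def cost(render, q, target):
        """Squared distance between the target frame T and the image
        E(q) synthesized from the current parameter estimate q."""
        return np.sum((target - render(q)) ** 2)

    def track_frame(render, q, target, step=1e-3, eps=1e-4, iters=50):
        """Align parameters q to one video frame by gradient descent.

        render : callable mapping a parameter vector to a synthesized
                 image (hypothetical stand-in for rendering the mesh).
        q      : initial estimate, e.g. the result from the previous frame.
        """
        q = np.asarray(q, dtype=float)
        for _ in range(iters):
            grad = np.zeros_like(q)
            for i in range(q.size):        # finite-difference estimate of dC/dq_i
                dq = np.zeros_like(q)
                dq[i] = eps
                grad[i] = (cost(render, q + dq, target) -
                           cost(render, q - dq, target)) / (2 * eps)
            q = q - step * grad            # follow the gradient toward alignment
        return q

In a tracking loop the estimate returned for one frame seeds the search on the next, so small errors accumulate, which is consistent with the tracker failing after a few dozen frames.
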
Experiment


Synthetic and real target video


Synthetic target initially used to avoid nuisance variables (e.g., lighting and noise)


Parameters for tracking are
chosen manually and refined by
observation


(add video tracking example)


Successfully tracks around 20
to 50 frames before failing


Conclusions


Does not handle background clutter


Does not handle lighting variations


Computationally expensive


Principal Component Analysis of Range Images for Face Recognition



Facial Identification


Many current modalities of investigation (intra-feature distance, geometrical parameterization, reflectance)


Outstanding issues in previous modalities
(reflectance, orientation)


New modality: range imaging

What are Range Images


Range Images are generated from a mesh


Meshes captured using Minolta Vivid 700 3D
camera

Data Collected


115 persons


6 facial expressions per person


690 3D facial images


Subset of 37 persons under 6 expressions
used in current experiment


Some manual correction to data (hole
patching)

Range Image Generation


Traverse each triangle in the mesh


Orthographically project depth values onto
the range image plane
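
As an illustration of the projection step (not the authors' implementation), the sketch below assumes mesh vertices already expressed in a camera-aligned frame with z as depth, and simply scatters them into a 200x200 depth buffer:

    import numpy as np

    def range_image(vertices, width=200, height=200):
        """Orthographically project mesh vertices onto an image plane,
        keeping the closest depth value that lands in each pixel.

        vertices : (N, 3) array of (x, y, z) points, z being depth.
        Returns a (height, width) range image; unhit pixels are NaN.
        """
        v = np.asarray(vertices, dtype=float)
        # Orthographic projection: map x and y straight to pixel coordinates.
        xmin, xmax = v[:, 0].min(), v[:, 0].max()
        ymin, ymax = v[:, 1].min(), v[:, 1].max()
        cols = ((v[:, 0] - xmin) / (xmax - xmin) * (width - 1)).astype(int)
        rows = ((v[:, 1] - ymin) / (ymax - ymin) * (height - 1)).astype(int)

        img = np.full((height, width), np.nan)
        for r, c, z in zip(rows, cols, v[:, 2]):
            if np.isnan(img[r, c]) or z < img[r, c]:   # keep the nearest surface point
                img[r, c] = z
        return img

Scattering roughly 10K vertices into 40K pixels leaves holes, which is why traversing each triangle and interpolating its depth across the pixels it covers, as described above, is the way to obtain a dense range image.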

Range Image Registration


Orientation: rotation in the image plane


Translation: translation in the image plane


Depth: translation perpendicular to the image plane


Automatic Preprocessing
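
A simplified sketch of such preprocessing, under the assumption (not stated in the slides) that the point nearest the scanner, roughly the nose tip, can anchor the alignment; the face is translated so that point sits at the image center and depths are shifted so it has depth zero, while in-plane rotation correction is omitted for brevity:

    import numpy as np

    def register(img):
        """Center a range image on its nearest point (assumed nose tip)
        and shift depths so that point has depth zero.  In-plane
        rotation correction is omitted in this sketch."""
        h, w = img.shape
        filled = np.where(np.isnan(img), np.inf, img)
        r0, c0 = np.unravel_index(np.argmin(filled), img.shape)     # anchor point

        out = np.full_like(img, np.nan)
        dr, dc = h // 2 - r0, w // 2 - c0                # in-plane translation
        rs, cs = np.where(~np.isnan(img))
        rs2, cs2 = rs + dr, cs + dc
        keep = (rs2 >= 0) & (rs2 < h) & (cs2 >= 0) & (cs2 < w)
        out[rs2[keep], cs2[keep]] = img[rs[keep], cs[keep]] - img[r0, c0]   # depth shift
        return out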

Recognition using Range Images


Training data: a subset of the experimental data set is used to learn the variability in facial range images


Testing data: the remaining faces, used in attempted recognition


Dimension reduction: Principal Component Analysis (PCA) is used to reduce facial range images to 10-dimensional vectors



Twenty largest eigenvalues (above)


Three eigenvectors from the three largest eigenvalues (right)
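
A minimal sketch of the dimension reduction with NumPy, assuming registered, hole-free range images; the SVD route and the function names are illustrative rather than taken from the paper:

    import numpy as np

    def fit_pca(train_imgs, k=10):
        """Learn a k-dimensional face subspace from training range images.

        train_imgs : (n, H, W) array of registered, hole-free range images.
        Returns the mean face and the top-k principal axes (eigenvectors).
        """
        X = train_imgs.reshape(len(train_imgs), -1)      # flatten to row vectors
        mean = X.mean(axis=0)
        _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
        # Eigenvalues of the sample covariance are s**2 / (n - 1).
        return mean, Vt[:k]

    def project(img, mean, components):
        """Reduce one range image to its k PCA coefficients."""
        return components @ (img.ravel() - mean)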

Dimension Reduction

Testing: Nearest Neighbor
Algorithm


Use the Euclidean distance between coefficients (the projection of the image onto the dominant subspace: the first ten eigenvectors)


Nearest neighbor (image from training set
with most similar projection) chosen as
match
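
Matching then reduces to a one-nearest-neighbor search in the 10-dimensional coefficient space; a minimal sketch, with gallery coefficients and identity labels assumed to come from the training set:

    import numpy as np

    def identify(test_coeffs, gallery_coeffs, gallery_ids):
        """Return the identity whose training projection is closest
        (Euclidean distance) to the test projection."""
        d = np.linalg.norm(gallery_coeffs - test_coeffs, axis=1)
        return gallery_ids[int(np.argmin(d))]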

Identification Results


Correct identification

Identification Results


Incorrect identification

Identification Results


Incorrect identification

(Nearest-neighbor distances shown with the examples: 110.96, 129.52, 136.70, 114.78, 163.61, 138.37)

Identification Results

(Plot: percent of correct identifications versus number of training faces: 37, 74, 111, 148, and 185.)

Future Research


Other projection techniques (Fisher
Discrimination Method)


Joint recognition using range and texture
images