Tim Miller
CSCI 5521 - Pattern Recognition
Determining Facial Attractiveness Using Eigenfaces
Final Report
Dr. Paul Schrater
December 19, 2003

Abstract


The technique of transforming images of human faces into reduced-dimensional spaces using principal components analysis was first described by Kirby and Sirovich (6). This technique is used to produce what are called “eigenfaces” and has been used relatively successfully for automated facial recognition (9, 12, 14). For this project I extended the eigenfaces technique to predict the attractiveness of female faces. Using attractiveness ratings assigned by human subjects, novel images are rated for attractiveness on a scale of 1 to 10. In addition, new faces in any attractiveness class can be generated using the ratings obtained from subjects.

Introduction

Literature Review


The literature for facial recognition is quite extensive, with most pattern recognition techniques having been tried at some point by some research group. Similarly, the research in the area of facial attractiveness is voluminous. There is very little overlap between the two areas, although there is some research that uses data transformation techniques to manipulate the attractiveness of digitized facial images.


For the problem of facial recognition, both principal components analysis (PCA) (1, 3, 6, 9, 14) and Fisher’s linear discriminant (1) have been used to reduce data dimensionality. These two methods are also known as “Eigenfaces” and “Fisherfaces.” The most widely used method is PCA, though this is usually done on images with uniform lighting and facial expression. Alternately, some claim that using “Fisherfaces” as opposed to “Eigenfaces” allows for more variance in lighting conditions and facial expressions (1). One of the most interesting new techniques uses PCA on 3D face coordinates obtained from laser scans of a person’s entire head (3). By scanning the face with different expressions, it is possible to obtain a difference vector that represents a certain facial expression. This vector can then be added to or subtracted from the mean to manipulate the facial expression of any face. This enables the data set to be much larger than the number of scans recorded, because a certain facial expression (e.g. smile, frown) can be added to or subtracted from any face in the database to create a new face.

For actual classification a variety of methods have been used, including support vector machines (5), template matching (4), feature matching (4), Bayesian decision theory (9), Fisher’s linear discriminant (1), and Euclidean distance (14). In each paper, the method used has better results than any previously used method, and thus each paper claims that the method it uses is the best suited for facial recognition. This is partly because newer research tends to use more advanced classification schemes, leading to better results. However, it is probably also partly due to different data sets, sample sizes, and image qualities.

Among researchers who study facial attractiveness, there seems to be agreement that average faces are attractive (7, 8, 10, 11). There is disagreement, however, about whether averageness is a cause of attractiveness, and also whether average faces are the most attractive. Some propose that evolutionary forces would cause faces near the population mean to be desirable (7, 8). Others, however, say that while average faces may be attractive, they are not the most attractive (10, 11). In these papers, subjects rated a group of actual face images. Then, an average face was created from all of the face images, along with an average of only the most attractive face images (as determined by the subjects). If averageness alone made faces maximally attractive, one would expect the average face and the average of the attractive faces to have approximately equal attractiveness ratings. However, this was not the case: the average of the attractive faces was rated significantly higher than the mean of all faces. This is compelling evidence that while average faces are attractive, certain non-average characteristics can result in a more attractive face.

In much of the research done to investigate facial attractiveness, digital generation and manipulation of facial images is used to probe for indicators of attractiveness (7, 8, 10, 11). “Averaged” faces are literally just images where each pixel value is the arithmetic mean of the corresponding pixel values of some group of images (7, 8). In other experiments, facial images were “feminized” or “masculinized” by adding some linear combination of face bases that had earlier been determined to be either feminine or masculine by subjects (10, 11). It was found that feminized faces tended to rate higher than masculinized faces, giving another indicator of attractiveness besides averageness.
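The pixel-averaging described above amounts to a single array operation; a minimal sketch (not taken from any of the cited papers) might look like:

```python
import numpy as np

# The composite "averaged" face is just the per-pixel arithmetic mean of
# a stack of aligned, same-sized grayscale images.
def average_face(images):
    return np.mean(np.stack(images), axis=0)
```

This assumes the images have already been aligned and scaled so that corresponding pixels cover corresponding facial features.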

Other manipulations were done without respect to attractiveness (2). In this experiment, each facial image was connected by a line with the mean face and moved along that line. As an image moves away from the mean, it is called a caricature, because the non-average features are accentuated. Conversely, as the image is moved towards the mean, it is called an anti-caricature, because the non-average features diminish in magnitude. As a face goes past the mean, it is called an anti-face, and begins to take on some of the opposite features of the original face. This experiment showed that a “perceptual discontinuity” occurs at the mean. In other words, faces along the same line on the same side of the mean were deemed by subjects to be similar to a certain degree, while faces on the same line, but on different sides of the mean, were not deemed to be similar.
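In vector terms, the caricature line can be sketched as follows; this is an illustration of the idea, not the authors' implementation:

```python
import numpy as np

def move_along_line(face, mean_face, k):
    """Move a face along the line through the mean face. k > 1 exaggerates
    non-average features (caricature), 0 < k < 1 shrinks them
    (anti-caricature), and k < 0 crosses the mean to give an anti-face."""
    return mean_face + k * (np.asarray(face) - mean_face)
```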


The algorithm to transform images into eigenfaces is described well in Turk and Pentland (14), but I will describe it briefly here. Each image can be represented as a matrix of M × N pixels, with each element an 8-bit grayscale value. If all images are represented as such, each can be turned into a vector of length MN. Next, the average face can be found, and each image can then be represented as a difference from the average. The difference vectors can then all be used as columns in a matrix A. The eigenfaces are then the eigenvectors of the covariance matrix C = AA^T. However, since images are very high dimensional, the covariance matrix is likely quite large, and a straightforward eigenvector computation is prohibitively expensive.


This problem can be solved by noticing that, in the equation

    AA^T (A v_i) = u_i (A v_i),

the eigenvectors of C are A v_i, where v_i are the eigenvectors of A^T A, which is a square matrix with dimension equal to the number of images in the data set. Since the number of images in the data set is usually much less than the size of an image, this computation is likely to be tractable.
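The small-matrix trick above can be sketched in a few lines of NumPy; this is a minimal illustration of the computation, not the code used for this project:

```python
import numpy as np

def eigenfaces(images):
    """Compute eigenfaces from a stack of flattened face images.

    images: array of shape (M, num_pixels), one face per row.
    Returns (mean_face, faces), where each row of faces is a unit-norm
    eigenvector of the covariance matrix C = A A^T.
    """
    mean_face = images.mean(axis=0)
    A = (images - mean_face).T           # one difference image per column
    # Diagonalize the small M x M matrix A^T A instead of the huge
    # num_pixels x num_pixels covariance matrix A A^T.
    vals, vecs = np.linalg.eigh(A.T @ A)
    order = np.argsort(vals)[::-1]       # largest variance first
    vals, vecs = vals[order], vecs[:, order]
    keep = vals > 1e-10 * vals.max()     # drop near-null directions
    # Each small eigenvector v_i maps to an eigenvector A v_i of A A^T.
    faces = A @ vecs[:, keep]
    faces /= np.linalg.norm(faces, axis=0)
    return mean_face, faces.T
```

Note that because the difference images sum to zero, at most M - 1 meaningful eigenfaces come out of M images.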


Once the eigenfaces have been determined, each image in the training set can be projected onto this “face-space.” In this way, each image is represented as a linear combination of the eigenfaces. For classification, new images are also projected onto the face-space. The weights for the new image are compared to those of the training data, and based on the distance metric in use, the new image is assigned to the class it is closest to.
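Projection and nearest-neighbor assignment might be sketched as follows, here using Euclidean distance in face-space as one possible metric:

```python
import numpy as np

def project(face, mean_face, faces):
    # Weights of the face's difference from the mean on each eigenface.
    return faces @ (face - mean_face)

def classify(new_face, mean_face, faces, train_weights, train_labels):
    """Label a new face with the class of the nearest training face in
    face-space, comparing weight vectors by Euclidean distance."""
    w = project(new_face, mean_face, faces)
    dists = np.linalg.norm(train_weights - w, axis=1)
    return train_labels[int(np.argmin(dists))]
```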


For this project, the dataset used had already been converted into eigenfaces. Many of the images for which the principal components coefficients had been given were not part of the set of images on which PCA was done. As a result, these faces did not reconstruct perfectly. While they may have been suitable for machine classification, most were not suitable to be rated by humans for attractiveness. Thus, the training set for this project had to be reduced to 100 faces, many of which still did not reconstruct as well as one would like.



Methodology


The process of automatically estimating attractiveness starts with training a classifier on a training set of facial images. A total of 7 male students (3 undergraduate, 4 graduate) rated each facial image in the data set for facial attractiveness on a scale from one to ten. The attractiveness rating of each image is the average across all subjects for that image. Subjects were also given the option of not rating a given image if it was deemed too difficult to assign a rating because of a poor reconstruction of the image. This abstention was counted as a non-rating, so that it did not affect the average in either direction.
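One way to implement this averaging with abstentions, assuming a ratings matrix with one row per image and one column per subject, is:

```python
import numpy as np

# An abstention is stored as NaN so that it is excluded from that
# image's average entirely, pulling it in neither direction.
def average_ratings(ratings):
    return np.nanmean(ratings, axis=1)
```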


For classification, the vector of face attractiveness ratings was divided into six classes. The number six was determined by trial and error. Ten classes may be the most obvious division, and was the first attempt. However, the range of ratings was 2.4-6.7, so dividing into ten classes would have been splitting hairs. Six was deemed the best choice, as the histogram of the data produced with six classes best represented the normal distribution the ratings should most likely follow.
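A possible binning scheme is sketched below; the equal-width split over the observed 2.4-6.7 range is an assumption, since the exact class boundaries are not stated:

```python
import numpy as np

def to_classes(ratings, n_classes=6, lo=2.4, hi=6.7):
    """Bin average ratings into n_classes equal-width classes over [lo, hi].
    (Equal-width boundaries are a hypothetical choice for illustration.)"""
    edges = np.linspace(lo, hi, n_classes + 1)
    # digitize against the interior edges gives 0..n_classes-1; shift to 1..n_classes
    return np.digitize(ratings, edges[1:-1]) + 1
```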


A multitude of classification schemes were used to classify the facial images. These include Fisher’s linear discriminant, perceptrons, linear and non-linear support vector machines, Euclidean distance, and uniform and normal random classifiers. In the first three cases, special care was needed since there were more than two classes. For all three methods, the one-against-all approach was used. Then, each face was assigned to the class for which it had the furthest distance from the discriminating hyperplane.
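The one-against-all decision rule can be sketched as follows, interpreting "furthest distance" as the largest signed distance on the positive side of each class's hyperplane:

```python
import numpy as np

def one_vs_all_predict(hyperplanes, x):
    """hyperplanes: one row (w, b) per class from a one-against-all
    training pass. Assign x to the class whose hyperplane places it
    furthest on the positive side, i.e. largest signed distance."""
    w, b = hyperplanes[:, :-1], hyperplanes[:, -1]
    signed_dist = (w @ x + b) / np.linalg.norm(w, axis=1)
    return int(np.argmax(signed_dist))
```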

For the perceptron algorithm, it was discovered through inspection that the classifying vector did not converge on the training set, so a variation of the pocket algorithm (15) was used. Specifically, whenever a weight vector classified the training data with fewer mistakes than the previous best weight vector, the new best vector was saved. This prevented a bad weight vector from being used simply because the last update step was an overcorrection.
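A two-class sketch of this pocket variation follows; the epoch count and sample ordering are illustrative choices, not details from the project:

```python
import numpy as np

def pocket_perceptron(X, y, epochs=50, seed=0):
    """Perceptron with the pocket modification: since the plain perceptron
    need not converge on non-separable data, keep ("pocket") whichever
    weight vector has made the fewest training mistakes so far.
    X: (n, d) samples with a trailing bias column of ones; y in {-1, +1}."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    best_w, best_errors = w.copy(), len(X) + 1
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            if y[i] * (w @ X[i]) <= 0:          # misclassified sample
                w = w + y[i] * X[i]             # standard perceptron update
                errors = int(np.sum(np.sign(X @ w) != y))
                if errors < best_errors:        # save the improved vector
                    best_w, best_errors = w.copy(), errors
    return best_w
```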


Since there were only 100 data samples, the size of the data set was not sufficient to break it up into a training set and a test set. In order to get results that would still be an indicator of generalization capability, the leave-one-out method was used. Each algorithm was run 100 times, with a different face being withheld from the training set each time. The withheld face was then tested on the discriminant functions computed from the remaining faces.
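The leave-one-out loop can be sketched generically, with the training and prediction routines passed in as callables:

```python
import numpy as np

def leave_one_out_accuracy(X, y, train_fn, predict_fn):
    """Run the classifier n times, each time withholding one sample,
    training on the rest, and testing on the withheld sample."""
    n = len(X)
    correct = 0
    for i in range(n):
        mask = np.arange(n) != i            # withhold sample i
        model = train_fn(X[mask], y[mask])
        correct += int(predict_fn(model, X[i]) == y[i])
    return correct / n
```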


Attractive face generation is based on the ratings provided by subjects. In each class, the mean and standard deviation of each coefficient are computed. Then, using a Gaussian pseudo-random number generator initialized with the computed mean u and variance s^2, a new random face in that class is generated.
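The generation step above might be sketched as follows, with the class's coefficient matrix and the eigenfaces as inputs:

```python
import numpy as np

def generate_face(class_coeffs, faces, mean_face, seed=None):
    """Generate a random face in one attractiveness class.

    class_coeffs: (n, k) eigenface coefficients of that class's training
    faces. The per-coefficient mean and standard deviation parameterize
    a Gaussian; a coefficient vector drawn from it is reconstructed
    through the eigenfaces into a new image."""
    rng = np.random.default_rng(seed)
    mu = class_coeffs.mean(axis=0)
    sigma = class_coeffs.std(axis=0)
    new_coeffs = rng.normal(mu, sigma)
    return mean_face + new_coeffs @ faces
```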

Results


A summary of the results can be seen in Table 1. None of the methods used was particularly adept at guessing exactly the right class. However, all methods, with the exception of support vector machines with a quartic kernel, did much better than the random classifiers. This may not seem very impressive, but given that the number of classes is so small, a random classifier is actually likely to get a large percentage of its guesses within a few points of the actual class. The perceptron algorithm worked the best for this data set, but the difference in accuracy between the different techniques is minimal. One would expect that with a larger and better data set, the classification rates would be much higher.

Table 1.
                    Percent Correct   Percent within one   Percent within two
Euclidean                18.00%             51.00%               71.00%
Perceptron Pocket        26.00%             61.00%               80.00%
Linear SVM               20.00%             53.00%               77.00%
Quadratic SVM            20.00%             52.00%               78.00%
Cubic SVM                26.00%             46.00%               74.00%
Quartic SVM              10.00%             24.00%               48.00%
Uniform Random           13.00%             44.00%               66.00%
Gaussian Random           3.00%             18.00%               33.00%


It should also be mentioned that for all the non-linear SVM classifications, a subset of the data containing only the first 50 images was used. For some reason, the algorithm would not converge with a data size any larger than 50. This could be part of the reason why the non-linear support vector machines actually did slightly worse than the linear version: with the data set cut in half, the performance probably should be a little worse.


The results of facial generation are shown in Figure 1. These images are the result of averaging all the images in each of the six classes, where class one is rated the least attractive and class six the most attractive. This averaging was done because the technique of using a random number generator to generate a random face is not consistent enough to always give clear face images.

The subjects did not rate the generated images, but some subjective differences can be noticed. For one, there does not always seem to be much difference in appearance between classes. However, there is a noticeable difference in forehead size, with more attractive classes having bigger foreheads than less attractive classes. In addition, there seems to be a slight softening of features in the progression from less attractive to more attractive, as the face is remarkably more round in class four than in class one. Classes five and six did not reconstruct as well as the earlier classes, probably because of smaller class sizes. However, they still appear at least as attractive as the earlier classes despite the blurriness.

Future Work


To continue or improve this research, the first task would be to get better data. In the data set given, many of the faces were badly distorted by the reconstruction process. While these images may be acceptable for a classifying machine, the images shown to the subjects for rating should be original images. Also, some faces occurred in the data more than once with slightly varying poses. This is probably not inherently harmful, and it may even provide some insight into which poses or lighting conditions are most attractive. However, by duplicating some of the images in an already small dataset, the amount of meaningful variation becomes even smaller.


Given better data, it may be possible to map out an attractiveness space for female faces. Average faces are attractive, but not optimally attractive (11). It would be interesting to explore what sorts of deviations from average increase or decrease attractiveness, and to try to fit some sort of shape to attractiveness. With a large enough initial data set, one would expect many attractive faces that may be attractive for different reasons. These different attractive faces could be used to generate new images. After subjects rate these synthesized images, certain features could be picked out that are more or less conducive to attractiveness.


Another improvement to this system would be increased automation. One of the nice things about the data set used is that the images were already aligned and scaled properly. If there were a system that could take in any image, find the face, scale the image, and align the eyes, then any image could be used as input, not just those taken by one research group.




Figure 1. Average faces for Class 1 through Class 6. [Images omitted.]



References

[1] P. Belhumeur, J. Hespanha, and D. Kriegman, “Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711-720, 1997.

[2] V. Blanz, A. J. O’Toole, T. Vetter, and H. Wild, “On the other side of the mean: The perception of dissimilarity in human faces,” Perception, vol. 29, pp. 885-891, 2000.

[3] V. Blanz and T. Vetter, “A Morphable Model for the Synthesis of 3D Faces,” Computer Graphics Proceedings SIGGRAPH, pp. 187-194, 1999.

[4] R. Brunelli and T. Poggio, “Face Recognition: Features versus Templates,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 10, pp. 1042-1052, October 1993.

[5] J. Huang, V. Blanz, and B. Heisele, “Face Recognition Using Component-Based SVM Classification and Morphable Models,” Proceedings of the First International Workshop on Pattern Recognition Using Support Vector Machines, pp. 135-143, August 10, 2002.

[6] M. Kirby and L. Sirovich, “Application of the Karhunen-Loeve procedure for the characterization of human faces,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 1, pp. 103-108, 1990.

[7] J. H. Langlois and L. A. Roggman, “Attractive Faces are Only Average,” Psychological Science, vol. 1, pp. 115-121, 1990.

[8] J. H. Langlois, L. A. Roggman, and L. Musselman, “What is Average and What is Not Average About Attractive Faces,” Psychological Science, vol. 5, pp. 214-220, 1994.

[9] B. Moghaddam, W. Wahid, and A. Pentland, “Beyond Eigenfaces: Probabilistic Matching for Face Recognition,” The 3rd International Conference on Automatic Face and Gesture Recognition, April 1998.

[10] D. I. Perrett, K. J. Lee, I. Penton-Voak, D. Rowland, S. Yoshikawa, D. M. Burt, S. P. Henzi, D. L. Castles, and S. Akamatsu, “Effects of Sexual Dimorphism on Facial Attractiveness,” Nature, vol. 394, no. 6696, pp. 884-887, 1998.

[11] D. I. Perrett, K. A. May, and S. Yoshikawa, “Facial Shape and Judgements of Female Attractiveness,” Nature, vol. 368, no. 6468, pp. 239-242, 1994.

[12] P. J. Phillips, H. Moon, P. Rauss, and S. Rizvi, “The FERET September 1996 Database and Evaluation Procedure,” International Conference on Audio- and Video-based Biometric Person Authentication, March 12-14, 1997.

[13] H. Rowley, S. Baluja, and T. Kanade, “Neural Network-Based Face Detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 1, January 1998.

[14] M. Turk and A. Pentland, “Eigenfaces for Recognition,” Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991.

[15] R. Duda, P. Hart, and D. Stork, Pattern Classification, 2nd Ed., Wiley, 2001.