MIT/LL Person Identification System

MIT Lincoln Laboratory

Kevin Brady

MIT Lincoln Laboratory

kbrady@LL.MIT.EDU


CLEAR Evaluation

Baltimore, Maryland

8-9 May 2007

CLEAR Evaluation

Person Identification Task

System Challenges



Uncontrolled Environment



Limited Train / Test data



Visual (Face Images)


Registration


Resolution


Pose (non-frontal faces)


Scoring


Mismatch in number of train / test exemplars



Audio


Microphone array channel handling

Objective



Multimodal Person Identification



Improve overall performance



Address channel-specific shortfalls

Audio Identification Architecture

[Diagram: blocks labelled Beamformer, SAD, Gaussian Mixture Model (2048 Mixtures with a UBM), and Support Vector Machine.]

SAD: Speech Activity Detection

UBM: Universal Background Model
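
The slide names the front-end stages (beamformer, speech activity detection) but not how they were implemented. As a rough illustration only, the sketch below assumes an integer-sample delay-and-sum beamformer and a simple energy-threshold SAD; the function names, frame sizes, and the 15 dB margin are hypothetical choices, not values from the presentation.

```python
import numpy as np


def delay_and_sum(channels, delays):
    """Align each microphone channel by an integer sample delay, then average.

    channels: array of shape (n_mics, n_samples)
    delays:   per-channel delay in samples (assumed known here; a real
              system would have to estimate them)
    Note: np.roll wraps samples around the ends, which is acceptable for a
    sketch but not for production use.
    """
    channels = np.asarray(channels, dtype=float)
    aligned = [np.roll(ch, -int(d)) for ch, d in zip(channels, delays)]
    return np.mean(aligned, axis=0)


def energy_sad(signal, frame_len=400, hop=160, margin_db=15.0):
    """Flag a frame as speech when its energy is within margin_db of the loudest frame."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )
    # Small constant avoids log10(0) on digitally silent frames.
    energy_db = 10.0 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    return energy_db > (energy_db.max() - margin_db)
```

Only the frames flagged by `energy_sad` would then be passed on to feature extraction and the GMM or SVM classifiers.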

Audio Identification Results

[Figure: Correct ID (%), on a 0 to 100 scale, by Train / Test Condition (seconds): 15s/1s, 15s/5s, 15s/10s, 15s/20s, 30s/1s, 30s/5s, 30s/10s, 30s/20s. Series: GMM, SVM, GMM-SVM.]

Audio GMM Identification Results

[Figure, two panels: Correct ID (%), on a 50 to 100 scale, by Train / Test Condition (seconds): 30s/5s, 30s/10s, 30s/20s.]

Submitted GMM System

2048 Mixture Models

Universal Background Model

Better suited for open set verification

32 Mixture Models

No Universal Background Model

Better suited for closed set ID
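
The contrast on this slide is between two decision rules: open-set verification, where a speaker model is compared against a background model, and closed-set identification, where the best-scoring enrolled speaker wins. The sketch below illustrates both with diagonal-covariance GMMs. It is a minimal illustration under assumptions: model training and any UBM adaptation are omitted, the models are passed in as plain `(weights, means, variances)` tuples, and the threshold is a hypothetical parameter.

```python
import numpy as np


def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (n_frames, dim); weights: (n_mix,); means, variances: (n_mix, dim)
    """
    frames = np.asarray(frames, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    diff = frames[:, None, :] - means[None, :, :]
    log_comp = (
        np.log(weights)[None, :]
        - 0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
        - 0.5 * np.sum(diff ** 2 / variances[None, :, :], axis=2)
    )
    # Numerically stable log-sum-exp over the mixture components.
    peak = log_comp.max(axis=1, keepdims=True)
    log_frame = peak[:, 0] + np.log(np.sum(np.exp(log_comp - peak), axis=1))
    return float(log_frame.mean())


def closed_set_id(frames, speaker_models):
    """Closed set: choose the enrolled speaker whose GMM scores highest."""
    scores = {
        name: gmm_avg_loglik(frames, *model)
        for name, model in speaker_models.items()
    }
    return max(scores, key=scores.get), scores


def open_set_verify(frames, speaker_model, ubm, threshold=0.0):
    """Open set: accept only if the speaker-vs-UBM log-likelihood ratio exceeds a threshold."""
    llr = gmm_avg_loglik(frames, *speaker_model) - gmm_avg_loglik(frames, *ubm)
    return llr > threshold, llr
```

In the closed-set rule no background model is needed, because the test speaker is assumed to be one of the enrolled identities.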

Visual Identification Architecture

[Diagram labels: Face Registration; Distance from Face Space; Scale, Location; Improved Face Registration; LDA + Kernel Eigenfaces; Scoring; Train Exemplars]


Only manually registered data in which both eyes were visible was used



Feature spaces evaluated


Eigenfaces


Fisherfaces


Kernel Eigenfaces


Gaussian Kernel


LDA + Kernel Eigenfaces
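
Kernel eigenfaces with a Gaussian kernel amount to kernel PCA applied to face image vectors. The sketch below shows one way this could look in NumPy. It rests on assumptions: each face is a flattened, registered image vector, and the kernel width `gamma` and the number of components are hypothetical parameters, since the slides do not state the values used. The LDA stage is not shown.

```python
import numpy as np


def gaussian_kernel(a, b, gamma):
    """k(x, y) = exp(-gamma * ||x - y||^2) for every pair of rows of a and b."""
    sq = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-gamma * np.maximum(sq, 0.0))


def fit_kernel_pca(train, n_components, gamma):
    """Eigen-decompose the centred Gram matrix of the training faces."""
    n = train.shape[0]
    k = gaussian_kernel(train, train, gamma)
    one = np.full((n, n), 1.0 / n)
    k_centred = k - one @ k - k @ one + one @ k @ one
    eigvals, eigvecs = np.linalg.eigh(k_centred)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale so each principal axis has unit norm in the kernel feature space.
    alphas = eigvecs / np.sqrt(eigvals)
    return {"train": train, "k": k, "alphas": alphas, "gamma": gamma}


def project(model, faces):
    """Project face vectors onto the kernel principal components."""
    k_new = gaussian_kernel(faces, model["train"], model["gamma"])
    k = model["k"]
    n = k.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    one_m = np.full((faces.shape[0], n), 1.0 / n)
    # Centre the new kernel rows consistently with the training Gram matrix.
    k_new_c = k_new - one_m @ k - k_new @ one_n + one_m @ k @ one_n
    return k_new_c @ model["alphas"]
```

The projected coordinates would then serve as the feature vectors handed to the scoring stage.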



Scoring


Normalized Cross Correlation


Maximum score used to assign identity
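
The scoring rule above can be sketched in a few lines: compute a normalized cross correlation between every test exemplar and every train exemplar, and let the single highest score decide the identity. The mean-subtracted form of the correlation is an assumption, since the slide does not say which variant was used, and the `gallery` layout is hypothetical.

```python
import numpy as np


def ncc(a, b):
    """Normalized cross correlation of two feature vectors (mean-subtracted variant)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def identify(test_features, gallery):
    """Return the identity owning the single best train / test score.

    test_features: iterable of feature vectors from the test segment
    gallery:       dict mapping identity -> iterable of train feature vectors
    """
    best_name, best_score = None, -np.inf
    for name, train_features in gallery.items():
        for t in test_features:
            for g in train_features:
                score = ncc(t, g)
                if score > best_score:
                    best_name, best_score = name, score
    return best_name, best_score
```

Because only the maximum matters, an identity with many train exemplars has more chances to produce one high score, which is one way the train / test exemplar mismatch can influence results.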



Visual Identification Results

[Figure: Correct ID (%), on a 0 to 100 scale, by Train / Test Condition (seconds): 15s/1s, 15s/5s, 15s/10s, 15s/20s, 30s/1s, 30s/5s, 30s/10s, 30s/20s. Series: Eigenfaces, Fisherfaces, Kernel Eigenfaces, LDA + Kernel Eigenfaces.]

CLEAR Evaluation

Mismatch in Train / Test Face Exemplars

[Figure: Face Recognition Scores*, plotted over Face Train Images and Face Test Images, showing train / test scores for the same target on a 0.1 to 0.9 scale with Good Match and Poor Match marked. Annotation: few train or test images.]

* Normalized Cross Correlation Scores using LDA + Kernel Eigenvectors on the 30 s train / 20 s test development set


Primary Identification Results

[Figure: Correct ID (%), on a 0 to 100 scale, by Train / Test Condition (seconds): 15s/1s, 15s/5s, 15s/10s, 15s/20s, 30s/1s, 30s/5s, 30s/10s, 30s/20s. Series: Visual, Audio, Audio-Visual.]

Concluding Remarks


Encouraged by effectiveness of audio-visual modalities in recent CLEAR evaluation


Audio: System design provided suboptimal performance


Visual: Kernel methods provide significant gains over linear methods


Multimodal:


Provided performance improvement over either modality


Addressed audio-channel performance shortfalls




Addressing System Challenges


Visual


Addressing non-frontal poses at low resolution


How to combine scores over various train / test exemplars




Audio-Visual


How to combine information


Features


Scores
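
Of the two options listed, combining at the score level is the simpler to illustrate. The slides do not say how the submitted audio-visual system fused its modalities, so the following is only a sketch of one common approach under assumptions: each modality yields one score per enrolled identity, scores are z-normalized per modality, and `audio_weight` is a hypothetical tuning parameter.

```python
import numpy as np


def zscore(scores):
    """Rescale one modality's scores to zero mean and unit variance."""
    scores = np.asarray(scores, dtype=float)
    spread = scores.std()
    if spread == 0:
        return np.zeros_like(scores)
    return (scores - scores.mean()) / spread


def fuse_scores(audio_scores, visual_scores, audio_weight=0.5):
    """Weighted sum of normalized per-identity scores.

    Returns the index of the winning identity and the fused score vector.
    """
    fused = (
        audio_weight * zscore(audio_scores)
        + (1.0 - audio_weight) * zscore(visual_scores)
    )
    return int(np.argmax(fused)), fused
```

Normalizing first matters because audio log-likelihoods and visual correlation scores live on very different scales; without it, one modality would dominate the sum regardless of the weight.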