Secure contracts signed by mobile Phone
IST-2002-506883
Jacques Koreman, NTNU
Andrew Morris, Spinvox
International Workshop on Verbal and Nonverbal Communication Behaviours
Vietri sul Mare, 29-31 March 2007
Multi-modal Biometric Verification for Small and Very Small Devices
Overview
• Background and application: SecurePhone
• Multimodal biometric recognition
  – face, voice, signature: natural
• For small devices: PDA
  – Good performance, short verification time
  – Security problem
• For very small devices: SIM card
  – Global features to run on slow CPU
  – Short verification time, acceptable performance
• Conclusion
  – Further improvements by glottal feature fusion?
  – Relevance for COST2102
Background: SecurePhone project
• Duration: 01.01.2004 – 30.11.2006
• Aim: “a mobile phone with biometric authentication and e-signature support for dealing secure transactions on the fly”
• SecurePhone consortium:
  – Management
  – Research
  – Implementation
  – Exploitation
Financing: EU 6th framework IST
SecurePhone
[Figure: SecurePhone system diagram. Labels: video camera, touch screen, microphone; three data capture blocks; biometric preprocessor; biometric recogniser; PIN number; SIM card; e-signature manager; GPRS/UMTS.]
Multimodal biometric recogniser
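[Figure: “biometric recogniser” diagram. Modalities: Face, Voice, Signature, with three GMMs. Feature labels: Haar LL4 wavelets (later: HL4; wavelet sub-bands LL, HL, LH, HH), MFCCs, geometric features. Inputs: user profile, world model. Outcomes: reject user; accept user, release private key.]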
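An accept/reject decision against a user profile and a world model is commonly implemented as a GMM log-likelihood-ratio test. The following is a minimal sketch of that general technique, not the SecurePhone implementation: the diagonal covariances, the function names `diag_gmm_loglik` and `verify`, and the threshold of 0 are all assumptions made for illustration.

```python
import numpy as np


def diag_gmm_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D); weights: (K,); means and variances: (K, D).
    """
    frames = np.asarray(frames, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    diff = frames[:, None, :] - means[None, :, :]                       # (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)   # (K,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)   # (T, K)
    log_weighted = np.log(weights) + log_comp

    # Numerically stable log-sum-exp over the K mixture components.
    peak = log_weighted.max(axis=1, keepdims=True)
    log_frame = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return float(log_frame.mean())


def verify(frames, user_gmm, world_gmm, threshold=0.0):
    """Return (log-likelihood ratio, accept flag) for one modality.

    user_gmm and world_gmm are (weights, means, variances) tuples.
    The threshold is a hypothetical operating point, not a value from the slides.
    """
    llr = diag_gmm_loglik(frames, *user_gmm) - diag_gmm_loglik(frames, *world_gmm)
    return llr, llr > threshold
```

Every frame is scored against every Gaussian of both models, which is why the frame count and the mixture count dominate the run time.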
PDA: fusion results for PDAtabase
DET curves/result table for 5-digit (left), 10-digit (middle) and phrase prompts (right)

Modality        5-digit   10-digit   Phrase
Voice           7.21      3.24       5.54
Face            28.40     27.55      28.33
Signature       8.01 (single value given)
Fusion (mean)   2.39      1.54       2.30
Fusion (sd)     0.96      0.83       1.85
Marcos Faundez-Zanuy: Face recognition: an unsolved problem
From small to very small devices: problem
• Biometric data cannot be stored or processed on the PDA, because impostors could steal biometric data.
• Therefore storage and processing must be on the SIM card, which self-destroys when tampered with physically.
• Instead of a few seconds on the PDA, verification on the SIM card takes one hour!
• Bottleneck: large number of comparisons in voice and signature verification (for client model and UBM)
  – for large number of frames per prompt
  – for large number of Gaussian mixtures in GMM
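The bottleneck described above is a product of three factors: frames per prompt, Gaussians per GMM, and the two models scored (client model and UBM). The arithmetic can be sketched as follows; the frame and mixture counts are hypothetical values chosen only to show the scaling, not figures from the SecurePhone system.

```python
def gaussian_evaluations(n_frames, n_mixtures, n_models=2):
    """Count Gaussian density evaluations for one verification attempt.

    n_models=2 reflects scoring against both the client model and the UBM.
    """
    return n_frames * n_mixtures * n_models


# Hypothetical sizes, for illustration only (not taken from the slides).
baseline = gaussian_evaluations(n_frames=500, n_mixtures=32)
halved_frames = gaussian_evaluations(n_frames=250, n_mixtures=32)
print(baseline, halved_frames)  # prints 32000 16000
```

Because the count is a plain product, halving either the frame rate or the number of mixtures only halves the work.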
From small to very small devices: solution
• Reducing the frame rate or the number of GMM mixtures cannot reduce the processing time by a sufficient order of magnitude
• Drastic solution: globalised features (idea taken from static signature representations)
  – Means (cf. Long-Term Average Spectrum for voice) and standard deviations per vector parameter across all frames; also greatly reduced number of Gaussians required for modelling the vectors
  – To counteract the effect of averaging, compute globalised features for subparts of the signal
Marcos Faundez-Zanuy: “Open your mind: sometimes a simple solution can give a good result” (and sometimes you cannot get around it)
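The globalised features described above (per-parameter means and standard deviations across all frames, optionally computed for subparts of the signal) can be sketched as follows. This is an illustrative sketch, not the project's code: the function name `globalised_features`, the equal-length split via `np.array_split`, and the use of the population standard deviation are assumptions.

```python
import numpy as np


def globalised_features(frames, n_parts=1, use_sd=True):
    """Collapse a (T, D) frame sequence into one fixed-length vector.

    For each of n_parts consecutive subparts, append the per-parameter mean
    and, if use_sd is True, the per-parameter standard deviation.
    Output length is n_parts * D * (2 if use_sd else 1).
    """
    frames = np.asarray(frames, dtype=float)
    if frames.ndim != 2 or frames.shape[0] < n_parts:
        raise ValueError("need a (T, D) array with at least n_parts frames")

    pieces = []
    for part in np.array_split(frames, n_parts, axis=0):
        pieces.append(part.mean(axis=0))
        if use_sd:
            pieces.append(part.std(axis=0))  # population sd (ddof=0), an assumption
    return np.concatenate(pieces)
```

With this representation a prompt is scored as a single vector instead of one vector per frame, so the number of Gaussian evaluations no longer grows with the prompt length.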
PDA results
Global feat.  Means   Means   Means   Means   Means   Means   Means   Means
              only    only    only    only    + sd    + sd    + sd    + sd
#Gauss.       1       2       4       8       1       2       4       8
Voice         28.20   30.08   30.36   32.08   22.78   22.55   24.41   25.71
Face          32.26   31.78   29.06   29.19   32.26   31.78   29.06   29.19
Signature     37.26   29.28   27.15   26.25   28.34   26.60   21.27   19.21
Fused         17.95   17.16   14.83   15.01   13.68   12.35   10.05   10.31

EER (percent) for globalised means (columns 2-5) and means plus standard deviations (columns 6-9)
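The results are reported as equal error rates (EER), the operating point at which the false acceptance rate equals the false rejection rate. A minimal sketch of how an EER can be estimated from verification scores is given below; the function name `equal_error_rate`, the threshold sweep over observed scores, and the convention that higher scores favour acceptance are assumptions, not details taken from the slides.

```python
import numpy as np


def equal_error_rate(genuine, impostor):
    """Estimate the EER (in percent) from genuine and impostor scores.

    Sweeps every observed score as a threshold (accept if score >= threshold)
    and returns the mean of FAR and FRR where the two are closest.
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)

    best_gap, eer = np.inf, 0.0
    for threshold in np.sort(np.concatenate([genuine, impostor])):
        far = np.mean(impostor >= threshold)  # impostors wrongly accepted
        frr = np.mean(genuine < threshold)    # clients wrongly rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return 100.0 * eer
```

With few scores the FAR and FRR curves are step functions, so this estimate is coarse; it is meant only to make the metric concrete.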
SIM card results
EER (percent) for globalised means (columns 2-5) and means plus standard deviations (columns 6-9) for voice and signature divided into two equal subparts

Global feat.  Means   Means   Means   Means   Means   Means   Means   Means
              only    only    only    only    + sd    + sd    + sd    + sd
#Gauss.       1       2       4       8       1       2       4       8
Voice         22.13   21.09   20.87   21.86   20.88   19.72   17.68   18.49
Face          32.26   31.78   29.06   29.19   32.26   31.78   29.06   29.19
Signature     38.29   27.58   22.58   17.86   28.14   22.16   17.59   16.45
Fused         12.89   12.48   10.49   9.32    12.56   10.48   8.28    9.15
Improvement needed
• Performance drop:
  – PDA EER 2.39% (meanwhile improved to 0.9%)
  – SIM EER 10.05% (8.28% for two equal subparts)
• Performance can be improved if we do not constrain the GMM models to be the same across all modalities
• Otherwise: use of complementary features within a modality
  – Face: simple face geometric variables
  – Voice: parameter values of LF model fitted to glottal flow derivative, obtained from inverse filtering of the microphone signal
Relevance to this COST action
• Interest in glottal flow derivative for speaker recognition stems from
  – expected complementarity to MFCC representation of spectrum
  – applicability in applications which use very little training data (as in SecurePhone, for user-friendliness)
• But can also be useful for other classification problems, like “the recognition of emotional states, gesture, speech and facial expressions, in anticipation of the implementation of useful application such as intelligent avatars and interactive dialog systems” (quote from aims website of this workshop)
Last night’s addendum: speech & gestures
• Source signal parameters can also be used together with other spectral parameters as well as F0, duration and loudness measures to signal prominence.
• In speech, these signals can be used differently across languages (syllable-timed vs. stress-timed) and speakers (German Research Council “rhythm project” led by Bill Barry, Saarland University, to which NTNU contributes with Norwegian database recordings and analyses).
• Prominence is also signalled by the extent/size as well as the acceleration of gestures.
• To what extent do gestures and speech signal parameters correlate? When are they used as complementary/alternative strategies for signalling prominence?
Thank you for your attention.