IT and SLT Characterizations of Secured Biometric Authentication Systems

Natalia A. Schmid^a and Harry Wechsler^b
^a West Virginia University, Morgantown, WV 26508; ^b George Mason University, Fairfax, VA 22030
ABSTRACT
This paper provides an information theoretical description of biometric systems at the system level. A number of
basic models to characterize the performance of biometric systems are presented. All models compare the performance
of an automatic biometric recognition system against the performance of an ideal biometric system that knows the
correct decisions. The correct decision can be visualized as an input to a new decision system, and the decision by
an automatic recognition system is the output of this decision system. The problem of performance evaluation for
a biometric recognition system is formulated as (1) the problem of finding the maximum information that the
output of the system has about the input, and (2) the problem of finding the maximum distortion that the output
can experience with respect to the input of the system to guarantee a bounded average probability of recognition
error. The first formulation brings us to the evaluation of the capacity of binary asymmetric and M-ary channels.
The second formulation falls under the scope of rate-distortion theory. We further describe the problem of physical
signature authentication used to authenticate a biometric acquisition device and state the problem of secured
biometric authentication as the problem of joint biometric and physical signature authentication. One novelty
of this work is in restating the problem of secured biometric authentication as the problem of finding the capacity
and the rate-distortion curve for a secured biometric authentication system. Another novelty is in the application
of transductive methods from statistical learning theory to estimate the conditional error probabilities of the
system. This set of parameters is used to optimize the system performance.
Keywords: system capacity, biometric systems, physical signatures, detection, binary channel, M-ary channel,
decision error, rate-distortion, transduction
1. INTRODUCTION
In recent years biometrics has drawn the attention of research groups ranging from computer vision to physics
and statistics. Many new biometric modalities and algorithms have been developed and their number continues
to grow. Most important among the challenges to be met are system reliability together with robustness to image
variability and adversarial learning. The bounds on what can be achieved in practice, however, are not known.
This situation is quite different from information and communication theory, where Claude Shannon predicted
the bounds of transmission for a signal over a channel with Gaussian noise more than 60 years ago. The codes to
achieve the limits were designed 50 years after the theoretical results were stated. Many biometric systems have
claimed top performance, but what can actually be achieved in practice is not known. A full-fledged information
theoretical treatment of biometric recognition systems has yet to be developed. The developments available are
few and include achievable rates [1] and the capacity of biometric recognition systems [2-4].
This paper proposes novel means to evaluate the limits of biometric authentication systems. Towards that end
we suggest two basic frameworks that are drawn from Information Theory (IT). The first framework evaluates
the capacity (achievable rate) of a biometric authentication system at the system level. Here the system links the
decision of an ideal biometric authentication system (input to the system) with the decision by an automatic
recognition device (output of the system). Under the first framework, the problem of biometric authentication
can be restated as the problem of finding the capacity of a binary asymmetric channel (BAC) or an M-ary channel,
depending on the type of the system (one-to-one match or many-to-one match). If, in addition to biometric
authentication, we involve a means to authenticate the biometric device used to acquire biometric data, we
add a security component to the biometric authentication system. It is shown here that the secured biometric
authentication system can be reduced to a BAC, and its performance can be numerically evaluated.
Further author information: (Send correspondence to Natalia A. Schmid or to Harry Wechsler.)
Natalia A. Schmid: E-mail: Natalia.Schmid@mail.wvu.edu, Telephone: 1 (304) 293-9136
Harry Wechsler: E-mail: wechsler@gmu.edu, Telephone: 1 (703) 993-1710
The second framework is a rate-distortion framework, where the decisions made by a binary or an M-ary
biometric authentication system are distorted versions of the decisions made by an ideal authentication system.
The distortion function is a zero/one loss function that assigns zero cost to a correct decision
and a cost of one to each incorrect decision. The problem of finding the maximum deviation of the decision
made by an automatic biometric authentication system from the decision by the ideal biometric authentication
system, under the constraint of a bounded average probability of authentication error, is stated as the problem
of finding a rate-distortion function. Again, involving a physical signature authentication component leads to a
secured biometric authentication system. It is shown here that for the same value of the average recognition
error, the secured biometric system can sustain a larger loss of information at the system level compared to
the unsecured biometric system.
In theory, channel optimization is performed with respect to the probability of the input to a channel, and
solving a rate-distortion problem involves an optimization with respect to the conditional error probabilities of
the system relating inputs and outputs. In practice, conditional error probabilities and prior probabilities of the
input are not known and have to be evaluated using a training data set. For example, these probabilities can be
computed using methods of Classical Detection and Estimation Theory or by using modern methods of Statistical
Learning Theory (SLT) that involve transductive approaches. In this paper we suggest a transductive method
to estimate conditional error probabilities.
The rest of the paper is organized as follows. Sec. 2 presents known information theoretical models that
allow a biometric authentication system to be treated at the system level as a binary asymmetric or an M-ary
channel. Sec. 3 presents results and developments related to rate-distortion theory. Sec. 4 suggests using
transductive methods from SLT to estimate unknown parameters of the proposed models. A short summary of models
and developments is provided in Sec. 5.
2. SYSTEM CAPACITY
2.1 Biometric authentication system
The relationship between capacity and probability of error of a threshold based detector device was previously
analyzed by Amblard et al. [5] Their paper describes the problem of designing noise enhanced detectors from two
different perspectives: communications theory (viewing the system as a communication channel) and detection
theory (designing an optimal binary receiver). The experimental results indicate that joint optimality (to ensure
reliable communication and optimal detector design) is not possible for a threshold based device. Placed in
the information theory framework, a detection system can be depicted as a binary communication channel with
asymmetric cross error probabilities, which in biometrics terminology are the False Accept Rate (FAR) and the
False Reject Rate (FRR). Assume that the values of the probability of correct biometric authentication,
$1 - \mathrm{FRR} = p$, and the probability of false accept, $\mathrm{FAR} = 1 - q$, are preset. Then the capacity
of the binary asymmetric channel is given by:
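$$C = H_b\big(\pi^* (1 - \mathrm{FRR}) + (1 - \pi^*)\,\mathrm{FAR}\big) - \pi^* H_b(1 - \mathrm{FRR}) - (1 - \pi^*)\, H_b(\mathrm{FAR}), \tag{1}$$
where
$$\pi^* = \frac{1 - \beta(1 - \mathrm{FRR}, \mathrm{FAR})\,\mathrm{FAR}}{\beta(1 - \mathrm{FRR}, \mathrm{FAR})\,(1 - \mathrm{FRR} - \mathrm{FAR})}, \tag{2}$$
$$\beta(1 - \mathrm{FRR}, \mathrm{FAR}) = 2^{\frac{H_b(1 - \mathrm{FRR}) - H_b(\mathrm{FAR})}{1 - \mathrm{FRR} - \mathrm{FAR}}} + 1, \tag{3}$$
and $H_b(P)$ is the binary entropy given by $H_b(P) = -P \log(P) - (1 - P) \log(1 - P)$.
The total probability of authentication error for this channel is
$$P^*(\mathrm{error}) = (1 - \pi^*)\,\mathrm{FAR} + \pi^*\,\mathrm{FRR}. \tag{4}$$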
[Figure 1: left panel, channel diagram; right panel, surface plot of CAPACITY over the axes p and 1-q, each from 0 to 1.]
Figure 1. The left panel shows a diagram of a biometric verification channel. The right panel displays the capacity of a
BAC as a function of the probability of detection and the probability of false alarm.
However, it can be easily shown (see Ref. 5 for a detailed illustration) that $P^*(\mathrm{error})$ is always larger
than the minimum probability of error that can be achieved by the decision making system.
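Eqs. (1)-(4) can be evaluated directly. The following is a minimal sketch, assuming base-2 logarithms (so that capacity is measured in bits, consistent with the factor 2 in Eq. (3)); the function names are illustrative and not part of the original development.

```python
import math


def binary_entropy(x: float) -> float:
    """Binary entropy H_b(x) in bits, with H_b(0) = H_b(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def bac_capacity(p: float, far: float):
    """Evaluate Eqs. (1)-(4) for a BAC with p = 1 - FRR and far = FAR.

    Returns (capacity, optimal prior pi*, error probability at pi*).
    Assumes p != far; otherwise the exponent in Eq. (3) is undefined.
    """
    hp, hf = binary_entropy(p), binary_entropy(far)
    beta = 2.0 ** ((hp - hf) / (p - far)) + 1.0            # Eq. (3)
    pi_star = (1.0 - beta * far) / (beta * (p - far))      # Eq. (2)
    out_one = pi_star * p + (1.0 - pi_star) * far          # P[Y = 1] at pi*
    capacity = (binary_entropy(out_one)
                - pi_star * hp - (1.0 - pi_star) * hf)     # Eq. (1)
    p_error = (1.0 - pi_star) * far + pi_star * (1.0 - p)  # Eq. (4)
    return capacity, pi_star, p_error


# Operating points taken from the illustration in Sec. 2.3.1.
c1, prior1, err1 = bac_capacity(0.97, 0.01)
c2, prior2, err2 = bac_capacity(0.90, 0.05)
print(f"C1 = {c1:.4f}, pi* = {prior1:.4f}, P*(error) = {err1:.4f}")
print(f"C2 = {c2:.4f}, pi* = {prior2:.4f}, P*(error) = {err2:.4f}")
```

Under the base-2 assumption, the two capacities agree with the values 0.8624 and 0.6209 quoted in Sec. 2.3.1.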
To summarize, from a system perspective (at the decision level) a biometric authentication system can be
viewed as a decision device as well as a binary asymmetric channel with two error probabilities, FAR and FRR.
The input to the BAC is a random variable (denote it by $X$). From the system level perspective, the random
variable $X$ is the ideal knowledge (ideal decision) available to a system designer when biometric data are labeled.
The labeled set is typically used to build an adaptive biometric system with a number of unknown parameters
estimated from the labeled (also called training) data. The input $X$ takes value 1 when two templates presented
for matching belong to the same class. It takes value 0 when the two templates are from two different classes.
The output of the binary channel is also a binary random variable (denote it by $Y$), which represents the decision
made by an automatic biometric verification system.
Channels are traditionally characterized by a single number measure called capacity. For a communication
channel, capacity is related to the maximum number of classes that the receiver on the far end of a
communication channel can recognize with the probability of recognition error approaching zero as the length of
transmitted messages increases. The meaning of capacity is slightly different for authentication (biometric
authentication in particular) systems. The capacity of a biometric authentication system can be interpreted (in
the particular setting summarized above) as the maximum amount of information that the input and output can
have in common. That is, the capacity measures how well the automatic system mimics the recognition ability
of the ideal system. Fig. 1 displays a block diagram of a BAC (left panel) and the plot of the capacity of the
BAC as a function of the probability of correct decision, $p$, and FAR, denoted here as $1 - q$. Note that the
points where the capacity is equal to one are the points where the automatic and the ideal systems are in perfect
agreement.
We now restate the problem of biometric authentication to add a security component, which can be naturally
involved in the process of biometric authentication. The new design will include an additional decision making
device. This device authenticates the nature of biometric data based on a physical signature of the sensor used
to acquire biometric data.
2.2 Secured Biometric Authentication
Secure biometrics is one of the top priority topics in the field of biometric-based authentication. Security of
biometric signatures or systems is often provided by performing encryption, watermarking, encoding, or through
involving cancelable biometric signatures [6]. Most of these methods require the application of some type of lossy
transformation to biometric data, after which the original signals cannot be entirely recovered.
In recent years, digital forensics grew into a separate research field. Digital forensics covers a large number of
topics unified by a single theme: establishing the authenticity of data. This task is often restated as the problem
of establishing the authenticity of the device used to acquire data. Each electronic, mechanical, or magnetic device,
each substance and material is characterized by a unique physical signature, such as graininess and structure of
wood or other surfaces, particles of paint, concentration of particles in a chemical composition, magnetic noise
on the magnetic stripe of credit cards, and other physical signatures. For CCD and CMOS electronic cameras
the physical signature (also known as the camera fingerprint) is due to imperfections in the production of the
optics of a particular camera and thus it is a noise. This noise is known as photo-response nonuniformity (PRNU).
PRNU can be extracted using relatively uniform and not so bright portions of a provided image (see Refs. 7, 8).
Camera physical signatures are traditionally modeled as discrete space random processes. Thus physical
signatures are treated as realizations of a discrete random process. In general, the PRNU process cannot be
described by a simple statistical model. However, the correlation-based test statistic designed for device
authentication is relatively well modeled as a Gaussian random variable. This simplifies performance analysis of
the device signature authentication significantly.
Involving PRNU or another physical signature of an acquisition sensor is a natural solution to improve the security
of a biometric authentication system. It can be done by concatenating the biometric signature (which could be raw
data or an extracted informative and descriptive representation) with a physical signature of the acquisition device.
For example, PRNU can be easily extracted from submitted images of a biometric.
Here we will demonstrate the change in the performance (the mutual information between input and output
of an authentication system at the system level) of a biometric authentication system due to the involvement of
a physical signature of an optical device. Details will be developed for the case where the biometric and physical
signatures of the acquisition device are independent or weakly dependent.
2.2.1 Physical signature authentication
In the problem of authenticating a physical signature, two different realizations of sensor noise are compared.
The problem of deciding if signatures belong to the same sensor or to two different sensors is traditionally stated
as a binary hypothesis testing problem. If we assume that signatures of different sensors are realizations of
independent and identically distributed random processes (signatures of a finite length are realizations of random
vectors), then the authentication procedure can be described by the following two hypotheses. We introduce $H_1$
and $H_0$, which indicate that two physical signatures, one extracted from the claimed image and the other one
extracted from a query image, have a signal in common, or do not have a common signal, respectively. Thus, under
the hypothesis $H_1$, two signatures are realizations of the same stochastic process and thus have a joint
distribution. Under the hypothesis $H_0$, two signatures are independent and identically distributed. The
probability distribution of the two signatures under $H_0$ has a product of marginals form. Thus, the joint
probability distribution is tested against the product of marginals. Associated with this system are two
conditional error probabilities: the probability of false alarm $\mathrm{FAR} = 1 - s$ and the probability of
missed detection $\mathrm{FRR} = 1 - r$. Given $r$ and $s$, the probability of error and the capacity of the system
can be evaluated by analogy with the approach of the previous section. The asymptotic analysis of optimally
designed physical signature authentication systems can be found in earlier publications by one of the authors [9, 10].
2.2.2 Joint biometric and device signature authentication
Assume that the physical signature of a device and the biometric signature used for identification are independent.
Due to complex processing, enhancement and the distinct nature of physical and biometric signatures, this
assumption is often valid in practice. Joint design depends on statistical models of physical and biometric
signatures. When the two signatures are independent, the correct decision about genuine identity requires that both
the biometric signature and the physical signature of the device be authenticated correctly. Denote by
$1 - \mathrm{FRR}_1 = p$ the probability of correct decision for the biometric authentication system and by
$1 - \mathrm{FRR}_2 = r$ the probability of correct decision for the device signature authentication system. The
conditional probability that the biometric signature and the physical signature are both authenticated correctly,
provided that both signatures are genuine, is $(1 - \mathrm{FRR}_1)(1 - \mathrm{FRR}_2) = pr$. Let $X_1$ be the
input random variable to the biometric authentication system. This is the true (ideal) state of the system. Let
$X_2$ be the input to the physical signature authentication system. Let the outputs of the two systems be denoted
by $Y_1$ and $Y_2$, respectively. The outputs represent the states of practical automatic systems designed to
perform biometric authentication and device physical signature authentication. The channel with joint states is
shown in Fig. 2.
Figure 2. A diagram of the joint biometric-physical signature authentication channel.
If the capacity of each individual channel is known (for example, $C_1$ and $C_2$) and the channels are independent,
then the capacity of the joint channel is the sum of the two capacities. Let $C$ be the capacity of the joint
channel, then
$$C = C_1 + C_2.$$
This is a classical result that can be derived by following the guidelines in Ref. 11.
2.3 BAC Perspective
It is interesting to note that, from the secure biometric authentication point of view, the events
$(X_1, X_2) = (1, 0)$, $(X_1, X_2) = (0, 1)$ and $(X_1, X_2) = (0, 0)$ constitute an error in secured authentication.
Since a biometric authentication system does not differentiate among the three errors, the joint channel in Fig. 2
can be reduced to a BAC with the two joint states $(1, 1)$ and $(1, 0) \cup (0, 1) \cup (0, 0)$ replaced by the
states 1 and 0. Assuming the independence of the biometric and physical signature authentication systems and
keeping the notation introduced in the previous sections, the probability of correct recognition is the conditional
probability $pr$. The probability that the authentication system decides in favor of 0, given that the true state
is 1, is the conditional probability
$P[(Y_1, Y_2) \in \{(1, 0), (0, 1), (0, 0)\} \mid (X_1, X_2) = (1, 1)] = 1 - pr$. Let the prior probability for the
input of the biometric authentication system to be in state 1 be $\pi_1$ and the prior probability for the input to
the physical signature authentication system to take the value 1 be $w_1$. Then the probability of false accept and
the probability of correct reject can also be evaluated:
$$\begin{aligned}
&P[(Y_1, Y_2) = (1, 1) \mid (X_1, X_2) \in \{(1, 0), (0, 1), (0, 0)\}] \\
&\quad = \left[ 1 - \frac{P[(Y_1, Y_2) = (1, 1) \mid (X_1, X_2) = (1, 1)]\, P[(X_1, X_2) = (1, 1)]}{P[(Y_1, Y_2) = (1, 1)]} \right] \frac{P[(Y_1, Y_2) = (1, 1)]}{1 - P[(X_1, X_2) = (1, 1)]} \\
&\quad = \frac{1}{1 - \pi_1 w_1} \Big( p(1 - s)\pi_1(1 - w_1) + (1 - q)\,r\,(1 - \pi_1) w_1 + (1 - q)(1 - s)(1 - \pi_1)(1 - w_1) \Big).
\end{aligned} \tag{5}$$
The block diagram of the binary channel is shown in Fig. 1 with the transition probabilities replaced by $pr$ and
the expression (5).
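The reduction of the joint channel to a BAC can be checked by enumerating the joint input states. The sketch below is illustrative only: it assumes independent inputs and independent channels, the function name is hypothetical, and the uniform priors in the usage line are chosen purely as an example.

```python
from itertools import product


def reduced_transitions(p, q, r, s, pi1, w1):
    """Collapse the joint channel of Fig. 2 into a binary channel.

    Returns (P[accept | genuine], alpha), where alpha is Eq. (5): the
    probability that both systems output 1 although at least one of
    the two inputs is 0.
    """
    # P[Y_i = 1 | X_i = x] for the biometric and the device channel.
    accept_bio = {1: p, 0: 1.0 - q}
    accept_dev = {1: r, 0: 1.0 - s}
    prior_bio = {1: pi1, 0: 1.0 - pi1}
    prior_dev = {1: w1, 0: 1.0 - w1}

    # Accumulate P[Y = (1,1), X = (x1,x2)] over the three non-genuine states.
    false_accept_mass = 0.0
    for x1, x2 in product((0, 1), repeat=2):
        if (x1, x2) == (1, 1):
            continue
        false_accept_mass += (accept_bio[x1] * accept_dev[x2]
                              * prior_bio[x1] * prior_dev[x2])

    alpha = false_accept_mass / (1.0 - pi1 * w1)
    return p * r, alpha


# Transition probabilities from Sec. 2.3.1; uniform priors are an assumption.
pr, alpha = reduced_transitions(p=0.97, q=0.99, r=0.90, s=0.95, pi1=0.5, w1=0.5)
print(f"P[accept | genuine] = {pr:.4f}, alpha = {alpha:.4f}")
```

The enumeration reproduces the three terms in the last line of Eq. (5), one per non-genuine joint state.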
[Figure 3: surface plot of MUTUAL INFORMATION (0 to 0.8) over the axes PRIOR, BAC1 and PRIOR, BAC2, each from 0 to 1.]
Figure 3. The plot of the capacity of a secured biometric system as a function of $\pi_1$ and $w_1$, the prior
probabilities for the biometric verification system and the physical signature authentication system.
The mutual information between the input and output of this binary channel, $I(X; Y) = H(Y) - H(Y \mid X)$,
can be easily evaluated and the optimization problem is stated as follows:
$$\max_{\pi_1, w_1} \left\{ H_b(\beta) - \pi_1 w_1 H_b(pr) - (1 - \pi_1 w_1) H_b(\alpha) \right\}, \tag{6}$$
where
$$\alpha = P[Y = (1, 1) \mid X \in \{(1, 0), (0, 1), (0, 0)\}] \tag{7}$$
and
$$\beta = P[Y = (1, 1)] = pr\,\pi_1 w_1 + p(1 - s)\pi_1(1 - w_1) + (1 - q)\,r\,(1 - \pi_1) w_1 + (1 - q)(1 - s)(1 - \pi_1)(1 - w_1). \tag{8}$$
This optimization problem can be readily solved numerically.
2.3.1 Illustration
Assume that the transition probabilities characterizing a biometric authentication system and a physical signature
authentication system are set to be $1 - \mathrm{FRR}_1 = p = 0.97$, $\mathrm{FAR}_1 = 1 - q = 0.01$,
$1 - \mathrm{FRR}_2 = r = 0.9$, and $\mathrm{FAR}_2 = 1 - s = 0.05$. Thus the accuracy of the physical signature
authentication system is lower compared to the biometric system. Using (1), the capacities of the two systems are
$C_1 = 0.8624$ and $C_2 = 0.6209$. We found the optimal prior probabilities for each of the systems and substituted
them in (6) without performing optimization. The mutual information between the input and the output of the secured
biometric authentication system with these predefined prior probabilities was found to be 0.5293. Note that in this
case the automatic system loses some of its ability to recognize biometric samples compared to the case of the
unsecured biometric authentication system. The capacity of the secured biometric system was obtained by numerically
optimizing (6) with respect to $\pi_1$ and $w_1$. It amounts to 0.6820. The plot of the capacity as a function of
the two prior probabilities for the same set of transition probabilities is displayed in Fig. 3.
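The two numbers in this illustration can be approximated with a short script. This is a sketch under stated assumptions: base-2 logarithms, a plain grid search with an arbitrary step of 0.005 in place of whatever optimizer was actually used, and illustrative function names.

```python
import math


def hb(x: float) -> float:
    """Binary entropy in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def optimal_prior(a: float, b: float) -> float:
    """Capacity-achieving prior of a single BAC, Eqs. (2)-(3), a = 1 - FRR, b = FAR."""
    beta = 2.0 ** ((hb(a) - hb(b)) / (a - b)) + 1.0
    return (1.0 - beta * b) / (beta * (a - b))


def secured_mutual_information(pi1, w1, p=0.97, q=0.99, r=0.90, s=0.95):
    """Objective of Eq. (6) for the reduced (secured) binary channel."""
    genuine = pi1 * w1                                  # P[X = (1,1)]
    # P[Y = (1,1), X != (1,1)]: the bracketed sum in Eq. (5).
    false_mass = (p * (1 - s) * pi1 * (1 - w1)
                  + (1 - q) * r * (1 - pi1) * w1
                  + (1 - q) * (1 - s) * (1 - pi1) * (1 - w1))
    if genuine >= 1.0:
        return 0.0                                      # deterministic input
    beta = p * r * genuine + false_mass                 # Eq. (8)
    alpha = false_mass / (1.0 - genuine)                # Eq. (7)
    return hb(beta) - genuine * hb(p * r) - (1.0 - genuine) * hb(alpha)


# Priors that are optimal for each channel taken alone, plugged into Eq. (6).
mi_fixed = secured_mutual_information(optimal_prior(0.97, 0.01),
                                      optimal_prior(0.90, 0.05))

# Coarse grid search over both priors.
grid = [i / 200 for i in range(201)]
capacity, best_pi1, best_w1 = max(
    (secured_mutual_information(a, b), a, b) for a in grid for b in grid
)
print(f"I at individually optimal priors: {mi_fixed:.4f}")
print(f"grid-search capacity: {capacity:.4f} at pi1 = {best_pi1}, w1 = {best_w1}")
```

A finer grid or a gradient-based optimizer could replace the grid search; the coarse grid is used here only to keep the sketch dependency-free.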
2.4 Identification Channel
By analogy with the binary authentication system above, an authentication system with $M$ classes can be viewed
as an M-ary channel with the conditional probability $p$ of a correct decision and with the conditional probability
$(1 - p)/(M - 1)$ of an error. All conditional error probabilities are assumed to be equal. With this notation (see
Fig. 4), the capacity of this channel is
$$C = p \log M - H_b(p) - (1 - p) \log \frac{M - 1}{M}, \tag{9}$$
which is achieved by selecting the uniform distribution on the output.
[Figure 4: left panel, channel diagram; right panel, surface plot of CAPACITY OF M-ARY SYSTEM over the axes p, CORRECT DECISION (0 to 1) and M, NUMBER OF CLASSES (0 to 60).]
Figure 4. The left panel presents a block diagram of an identification channel at the system level. The right panel
shows the plot of the M-ary capacity as a function of the number of classes and the probability of correct
authentication.
The capacity of an M-ary biometric authentication system has an interpretation similar to the capacity of an
M-ary communication channel. This is the relationship between the capacity, the maximum number of classes to
recognize with vanishing probability of error, and the length of codewords that are used to encode the label of a
class. Equation (9) can be used to predict the value of the capacity for a given $p$ and $M$.
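As an illustration of how Eq. (9) might be used for prediction, the sketch below evaluates it for a few values of $p$ and $M$, assuming base-2 logarithms; the helper names and the chosen values are illustrative.

```python
import math


def hb(x: float) -> float:
    """Binary entropy in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def m_ary_capacity(p: float, m: int) -> float:
    """Eq. (9): capacity in bits of the symmetric M-ary channel (m >= 2)."""
    return p * math.log2(m) - hb(p) - (1.0 - p) * math.log2((m - 1) / m)


for m in (2, 10, 60):
    row = [round(m_ary_capacity(p, m), 4) for p in (0.5, 0.9, 0.99, 1.0)]
    print(f"M = {m:2d}: {row}")
```

Expanding the last logarithm shows that Eq. (9) equals $\log M - H_b(p) - (1 - p)\log(M - 1)$, so the capacity reaches $\log M$ at $p = 1$ and vanishes at $p = 1/M$, where the decision carries no information about the class.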
3. RATE-DISTORTION FRAMEWORK
In some cases, the optimization of the amount of information between input and output of an authentication
channel has to be performed considering limited resources. For example, we may be interested in finding the
"worst case average decision" that can be made by an automatic authentication system at the system level (due
to distortions in data or due to imperfect design of the system) under the condition that the average probability
of authentication error attained by the system is below a provided value.
3.1 Binary Problem
Trading off the information between the input and output random variables against the distortions due to the query
image differing from the enrolled images is another approach to characterize the limits of biometric
systems. Given an upper bound, $D$, on the average probability of error that an authentication system can sustain,
we would like to find the maximum dissimilarity between the decisions of the automatic and ideal systems under
this constraint. Here we use the mutual information between the decisions made by the two systems as an inverse
measure of dissimilarity: the smaller the mutual information, the larger the dissimilarity. Thus, seeking the
maximum average dissimilarity is reduced to seeking the minimum mutual information. Constrained optimization of the
information between the binary input and output of an authentication system under the constraint of a bounded
average probability of authentication error is a classical rate-distortion problem. We will briefly state the
problem below and discuss its solution.
Let $X$ be a decision made by an ideal authentication system. Let $\hat{X}$ be a decision made by an automatic
(nonideal) authentication system. At the system level, $X$ and $\hat{X}$ are the input and output of a BAC, as
described in earlier sections. The notation $\hat{X}$ is introduced to indicate that $\hat{X}$ is a distorted
version of $X$ (in information theory this notation is used to indicate a lossy compressed representation of $X$).
The question is how much $X$ can be distorted such that the average probability of authentication error is smaller
than $D$. Denoting the mutual information between the input and output of the system as $I(X; \hat{X})$, the
problem of "trading maximum input-output dissimilarity (rate in IT terms) and the average probability of error of
the system (distortion in IT terms)" becomes:
$$\min_{p(\hat{x} \mid x):\; E[d(X, \hat{X})] \le D} I(X; \hat{X}), \tag{10}$$
where
$$d(x, \hat{x}) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad \text{and} \quad E[d(X, \hat{X})] = \sum_{x, \hat{x} \in \{0, 1\}} d(x, \hat{x})\, p(x, \hat{x}),$$
with $p(x, \hat{x})$ being the joint distribution of $X$ and $\hat{X}$.
Assuming that the conditional probabilities of error are not equal, the constrained optimization (10) can be
approached by involving the method of Lagrange multipliers:
$$J(p(\hat{x} \mid x), \mu) = \sum_{x, \hat{x} \in \{0, 1\}} p(x, \hat{x}) \log \frac{p(x, \hat{x})}{p(x)\,p(\hat{x})} + \mu \left( E[d(X, \hat{X})] - D \right), \tag{11}$$
where $J$ is the new function (Lagrangian) to optimize and $\mu$ is the Lagrange multiplier.
Using notation similar to that used in the earlier sections for the conditional error probabilities,
that is, setting $p(\hat{X} = 1 \mid X = 1)$ to $p$ and $p(\hat{X} = 0 \mid X = 0)$ to $q$, and optimizing with
respect to $p$ and $q$ results in
$$\frac{1}{2} \log \frac{p\,(1 - p + q)}{(1 - p)(1 + p - q)} - \frac{\mu}{2} = 0,$$
$$\frac{1}{2} \log \frac{q\,(1 + p - q)}{(1 - q)(1 - p + q)} - \frac{\mu}{2} = 0,$$
where $X$ is assumed to be uniformly distributed and
$$\frac{1 - p}{2} + \frac{1 - q}{2} = D.$$
There is no closed form solution developed for this problem. The problem can be solved numerically.
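One simple way to carry out the numerical solution is a line search along the constraint. The sketch below assumes a uniform input, base-2 logarithms, and the equality constraint stated above; it replaces the Lagrangian conditions by a direct search and is not necessarily the procedure used by the authors.

```python
import math


def hb(x: float) -> float:
    """Binary entropy in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def mutual_information(p: float, q: float) -> float:
    """I(X; X_hat) in bits for uniform X, p = P[1 | 1], q = P[0 | 0]."""
    out_one = 0.5 * p + 0.5 * (1.0 - q)        # P[X_hat = 1]
    return hb(out_one) - 0.5 * hb(p) - 0.5 * hb(q)


def binary_rate(d: float, steps: int = 2000):
    """Minimize I over (p, q) with (1 - p)/2 + (1 - q)/2 = d, 0 <= d <= 1.

    Returns (minimal mutual information, p, q).
    """
    # Range of p that keeps q = 2 - 2d - p inside [0, 1].
    lo, hi = max(0.0, 1.0 - 2.0 * d), min(1.0, 2.0 - 2.0 * d)
    best = None
    for i in range(steps + 1):
        p = lo + (hi - lo) * i / steps
        q = 2.0 - 2.0 * d - p
        value = mutual_information(p, q)
        if best is None or value < best[0]:
            best = (value, p, q)
    return best


rate, p_opt, q_opt = binary_rate(0.01)
print(f"D = 0.01: rate = {rate:.4f} at p = {p_opt:.4f}, q = {q_opt:.4f}")
```

Because the mutual information is convex in the transition probabilities and symmetric under exchanging $p$ and $q$, the search settles at $p = q = 1 - D$ with rate $1 - H_b(D)$, the classical rate-distortion function of a uniform binary source under Hamming distortion.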
3.2 Secured Biometric Authentication
Consider again a joint channel composed of two independent BACs. Assume that the total probability
of error of the joint system is bounded by a value $D$. Since a secured biometric system has to make only a
binary decision, the joint channel can be mapped into a BAC. The use of the mapping leads to the
block diagram in Fig. 1 with the transition probabilities replaced by $pr$ and by the expression (5), and
with the output of the channel $Y$ replaced by an approximation to the input of the channel, $\hat{X}$. The total
average probability of authentication error (the distortion) is evaluated by assuming that the penalty is zero for
making a correct decision and that the penalty is one for making a wrong decision. Then the average probability of
error (distortion in IT terms) is given by
$$E[d(X, \hat{X})] = (1 - pr)\,\pi_1 w_1 + p(1 - s)\pi_1(1 - w_1) + (1 - q)\,r\,(1 - \pi_1) w_1 + (1 - q)(1 - s)(1 - \pi_1)(1 - w_1). \tag{12}$$
The dissimilarity between the decisions by the ideal and by an automatic system is measured in terms of the
"rate":
$$I(X; \hat{X}) = H(\hat{X}) - H(\hat{X} \mid X),$$
where
$$H(\hat{X}) = H_b(\beta) \quad \text{and} \quad H(\hat{X} \mid X) = \pi_1 w_1 H_b(pr) + (1 - \pi_1 w_1) H_b(\alpha).$$
Finding the maximum dissimilarity between the decisions of the ideal and an automatic authentication system
under the condition that the total average probability of error is below $D$ is a constrained optimization problem:
$$\min_{(p, q, s, r):\; (1 - pr)\pi_1 w_1 + (1 - \pi_1 w_1)\alpha \le D} \left\{ H_b(\beta) - \pi_1 w_1 H_b(pr) - (1 - \pi_1 w_1) H_b(\alpha) \right\}. \tag{13}$$
[Figure 5: two plots of RATE (0 to 1) versus DISTORTION, D (0 to 0.5). Left panel legend: RATE-DISTORTION CURVE, BIOMETRIC OPERATING POINT, DEVICE FINGERPRINTING POINT, SECURED BIOMETRIC POINT.]
Figure 5. Left panel shows the rate-distortion curve for a binary verification system. The points marked with a
star and a circle are the operating points of the biometric verification system and of the physical signature
authentication system, respectively. The point indicated by a box is the operating point of the secured biometric
system that assumes the same transition probabilities as the original unsecured biometric verification system and
the original physical signature authentication system. Right panel shows the rate-distortion function for the
secured verification system.
If the unsecured biometric authentication system as well as the physical authentication system are both
symmetric, $p = q$ and $r = s$, then the relationship between the average dissimilarity in decisions of ideal and
automatic secured authentication systems and the average probability of error (rate-distortion function in IT
terms) is straightforward to derive. For the symmetric systems with uniform priors, $\pi_1 = w_1 = 1/2$, we have
$\beta = 1/4$ and $\alpha = (1 - pr)/3$. The expression to be optimized is given by
$$\min_{p, r:\; \frac{1 - pr}{2} \le D} \left\{ H_b\!\left(\frac{1}{4}\right) - \frac{1}{4} H_b(pr) - \frac{3}{4} H_b\!\left(\frac{1 - pr}{3}\right) \right\}. \tag{14}$$
This is a function of the product $pr$ only. Therefore, taking the constraint with equality, $pr = 1 - 2D$ and the
minimal mutual information between the input and the output of the secured biometric authentication system is
$$I(X; \hat{X}) = H_b\!\left(\frac{1}{4}\right) - \frac{1}{4} H_b(1 - 2D) - \frac{3}{4} H_b\!\left(\frac{2D}{3}\right).$$
The rate-distortion function corresponding to this case is shown on the right panel in Fig. 5. The left panel in
Fig. 5 shows the rate-distortion function for the unsecured biometric authentication system.
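The values of $\beta$, $\alpha$ and the distortion in the symmetric case follow from Eqs. (5), (8) and (12). The short derivation below is a sketch that assumes $p = q$, $r = s$ and uniform priors $\pi_1 = w_1 = 1/2$, the setting used in Sec. 3.3.

```latex
\begin{align*}
\beta &= \tfrac{1}{4}\bigl[\,pr + p(1-r) + (1-p)\,r + (1-p)(1-r)\,\bigr] = \tfrac{1}{4},\\[2pt]
\alpha &= \frac{\beta - \tfrac{1}{4}\,pr}{1 - \tfrac{1}{4}} = \frac{1 - pr}{3},\\[2pt]
E[d(X,\hat{X})] &= \tfrac{1}{4}(1 - pr) + \tfrac{3}{4}\,\alpha = \frac{1 - pr}{2}.
\end{align*}
```

Setting the last expression equal to $D$ gives $pr = 1 - 2D$ and $\alpha = 2D/3$, which are the arguments of the entropy terms in the rate expression.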
3.3 Example
To illustrate the difference between the rate-distortion curves for the case of unsecured and secured biometric
authentication systems, we assume that all prior probabilities (for the unsecured biometric authentication system
and for the physical signature authentication system) are equal, that is, $\pi_1 = w_1 = 1/2$. The upper bound on
the average probability of error $D$ can be varied between 0 and 1. We set it to be $D_1 = 0.01$ for the unsecured
biometric authentication system and $D_2 = 0.05$ for the physical signature authentication system. The optimal
values of the probabilities $p = q$ and $r = s$ for this case are $1 - D_i$, $i = 1, 2$. The optimal rates for the
two cases are 0.9192 and 0.7136, respectively. Using these probability values, $p = 0.99$ and $r = 0.95$, for the
secured biometric authentication system, $\beta = 0.25$ and $\alpha = 0.0198$. The average probability of error
(distortion in IT terms) for this case is 0.0297, and the mutual information between the decisions by the ideal and
automatic biometric authentication systems (rate in IT terms) is 0.6247. Three operating points, (0.01, 0.9192) for
the unsecured biometric authentication system, (0.05, 0.7136) for the physical signature authentication system and
(0.0297, 0.6247) for the secured biometric authentication system, are shown on the left panel in Fig. 5. Note that
for the same value of the average error probability, $D$, the secured biometric system can sustain a larger average
dissimilarity between the decisions by the ideal and an automatic biometric authentication system compared to the
unsecured biometric authentication system. Note that a smaller value of the mutual information between the
decisions of the ideal and automatic authentication systems corresponds to a larger (in value) average
dissimilarity between the decisions by the ideal and automatic systems.
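The operating points of this example can be recomputed in a few lines. The sketch assumes base-2 logarithms, uniform priors, and the classical binary rate $1 - H_b(D)$ for the two symmetric single systems; the variable names are illustrative.

```python
import math


def hb(x: float) -> float:
    """Binary entropy in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def secured_rate(d: float) -> float:
    """Rate of the symmetric secured system with pr = 1 - 2D."""
    return hb(0.25) - 0.25 * hb(1.0 - 2.0 * d) - 0.75 * hb(2.0 * d / 3.0)


p, r = 0.99, 0.95                      # 1 - D_1 and 1 - D_2
rate_bio = 1.0 - hb(1.0 - p)           # unsecured biometric system
rate_dev = 1.0 - hb(1.0 - r)           # physical signature system
alpha = (1.0 - p * r) / 3.0            # symmetric case of Eq. (5)
d_sec = (1.0 - p * r) / 2.0            # distortion of the secured system
rate_sec = secured_rate(d_sec)
rate_unsecured_same_d = 1.0 - hb(d_sec)

print(f"biometric: (D, R) = (0.01, {rate_bio:.4f})")
print(f"device:    (D, R) = (0.05, {rate_dev:.4f})")
print(f"secured:   (D, R) = ({d_sec:.4f}, {rate_sec:.4f}), alpha = {alpha:.4f}")
print(f"unsecured rate at the secured distortion: {rate_unsecured_same_d:.4f}")
```

The last line compares the two curves at the same distortion: the secured rate is the smaller of the two, which is the sense in which the secured system tolerates a larger dissimilarity. Small differences in the fourth decimal place relative to the quoted values can arise from rounding $\alpha$ before evaluating the entropy.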
3.4 Identification System
Assume that $X$ is the decision of an ideal biometric authentication system and $\hat{X}$ is the distorted version
of $X$, the decision of an automatic biometric authentication system. For an M-ary authentication system, $X$ and
$\hat{X}$ take values in $\{1, 2, \ldots, M\}$. They can be interpreted as labels assigned to each class by an
ideal or by an automatic (nonideal) biometric authentication system.
The distortion $d(\cdot, \cdot)$ in this case is a measure of pairwise dissimilarity between the decision made by
an automatic system and the decision made by the ideal biometric authentication system. The measure of
dissimilarity is zero if the automatic and the ideal decisions agree. Otherwise, the measure of dissimilarity takes
value one. All pairwise dissimilarities can be placed in an $M \times M$ matrix filled with all ones except the
diagonal entries, which are filled with zeros. Then the average dissimilarity (distortion in IT terms) is
$$E[d(X, \hat{X})] = \sum_{x, \hat{x} = 1}^{M} d(x, \hat{x})\, p(\hat{x} \mid x)\, p(x), \tag{15}$$
where we use $p(x)$ to denote the prior probability of the state $X = x$ of the ideal biometric authentication
system and use $p(\hat{x} \mid x)$ to denote the conditional probability of the automatic system being in state
$\hat{X} = \hat{x}$ given that the true state (the state of the ideal system) is $X = x$. The expression (15) is
the expression for the average probability of error.
Finding the minimal average amount of information that has to be retained in the decision $\hat{X}$ about the
decision $X$ to ensure that the total average probability of error is bounded by $D$ is stated as a constrained
optimization problem:
$$\min_{p(\hat{x} \mid x)\,:\; E[d(X,\hat{X})] \le D} I(X;\hat{X}). \tag{16}$$
The solution to this problem can be found by invoking Fano's inequality (see Cover and Thomas [11] for
details). The transition probability that solves this problem under the condition of a uniform input is given
by
$$p(\hat{x} \mid x) = \begin{cases} 1 - D, & \text{when } \hat{x} = x, \\ \dfrac{D}{M-1}, & \text{when } \hat{x} \neq x, \end{cases} \tag{17}$$
which produces the following rate-distortion function:
$$R(D) = \begin{cases} \log M - H_b(D) - D \log(M-1), & \text{when } 0 \le D \le 1 - \dfrac{1}{M}, \\ 0, & \text{otherwise}, \end{cases} \tag{18}$$
where $H_b(\cdot)$ is the binary entropy.
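The relation between Eqs. (17) and (18) can be checked numerically. The following sketch, with hypothetical function names and base-2 logarithms (an assumption, since the text does not fix the base), evaluates R(D) and compares it with the mutual information of the channel in Eq. (17) driven by a uniform input.

```python
import math


def binary_entropy(p):
    """Binary entropy H_b(p) in bits, with H_b(0) = H_b(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def rate_distortion(d, m):
    """R(D) of Eq. (18) in bits for m >= 2 equiprobable classes."""
    if 0.0 <= d <= 1.0 - 1.0 / m:
        return math.log2(m) - binary_entropy(d) - d * math.log2(m - 1)
    return 0.0


def mutual_information_uniform(d, m):
    """I(X; X_hat) in bits for a uniform input and the channel of Eq. (17)."""
    # Every row of the channel is a permutation of the same probabilities,
    # so the output is uniform and I = log M - H(one row).
    row = [1.0 - d] + [d / (m - 1)] * (m - 1)
    h_row = -sum(p * math.log2(p) for p in row if p > 0.0)
    return math.log2(m) - h_row


# Hypothetical example: M = 4 classes, tolerated error probability D = 0.1.
r = rate_distortion(0.1, 4)
i = mutual_information_uniform(0.1, 4)
```

At D = 0 the function returns log M (every decision must be preserved), and it falls to zero at D = 1 - 1/M, the error rate of guessing a label uniformly at random.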
4. A SLT FRAMEWORK TO ESTIMATE FAR AND FRR
When performing IT analysis, we assume that the conditional error probabilities, FAR and FRR, and the probability
of the occurrence of zeros and ones in the ideal system are known. In practice these parameters are not
known and have to be estimated from observed labeled data. Furthermore, when only a small amount of
labeled observed data is available, estimating parameters such as FAR and FRR and then substituting the estimates into
the expression for the capacity or for the rate-distortion function may not be desirable. A function or expression
with estimated parameters in it becomes a plug-in estimate. These estimates are suboptimal. To use
a small amount of data effectively, a plug-in estimate has to be replaced by a transductive estimate (or any other type of
local estimate). Here we provide a brief overview of a transductive approach to the estimation of parameters or
functions and illustrate its principle of operation by estimating FAR and FRR.
We will first consider a number of performance and discrepancy measures that are further used to introduce
the transductive approach to parameter estimation. These measures include strangeness and p-value.
4.1 Strangeness and p-values
Suppose that a small amount of labeled data is available. Assume that the data are collected from a number of
classes in a set $Y$. If decisions are about binary authentication, then the labels take only two values, 1 or 0. The
strangeness measures the lack of typicality in a data sample with respect to its true or putative (assumed) label
and the labels for all the other data samples. Formally, the strangeness measure $\lambda_i$ is the (likelihood) ratio of
the sum of the $k$ nearest neighbor (k-nn) distances $d$ from the same class $y$ to the sum of the $k$ nearest
neighbor (k-nn) distances from all the other classes:
$$\lambda_i = \frac{\sum_{j=1}^{k} d^{\,y}_{ij}}{\sum_{j=1}^{k} d^{\,Y \setminus y}_{ij}}, \tag{19}$$
where the set-theoretic notation $Y \setminus y$ indicates that all classes except the class $y$ are involved in the evaluation
of a distance.
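A minimal sketch of the strangeness ratio in Eq. (19) follows. It assumes Euclidean distance between feature vectors (the text does not fix the metric) and a reference set that does not contain the sample being scored; the function name and the toy data are hypothetical.

```python
import math


def strangeness(sample, label, data, labels, k=1):
    """Strangeness of `sample` under the putative `label`, as in Eq. (19).

    Numerator: sum of the k smallest distances to reference samples of class `label`.
    Denominator: sum of the k smallest distances to samples of all other classes.
    """
    same = sorted(math.dist(sample, x) for x, y in zip(data, labels) if y == label)
    other = sorted(math.dist(sample, x) for x, y in zip(data, labels) if y != label)
    return sum(same[:k]) / sum(other[:k])


# Toy two-class reference set (made-up points).
data = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
labels = [0, 0, 1, 1]
sample = (1.0, 0.0)

s_label0 = strangeness(sample, 0, data, labels, k=1)  # sample sits near class 0
s_label1 = strangeness(sample, 1, data, labels, k=1)  # wrong putative label
```

The sample lies close to class 0, so its strangeness under label 0 is small and its strangeness under label 1 is large, matching the interpretation that a small value marks a typical sample.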
The smaller the strangeness, the larger its typicality and the more probable its (putative) label $y$ is. The
strangeness facilitates both feature selection (similar to Markov blankets) and variable selection (dimensionality
reduction). One finds empirically that the strangeness, classification margin, sample and hypothesis margin, posteriors,
and odds are all related via a monotonically non-decreasing function, with a small strangeness amounting
to a large margin.
The likelihood-like definitions for strangeness are intimately related to discriminative methods. The p-values
suggested next compare (rank) the strangeness values to determine the credibility and confidence in the putative
classifications (labeling) made. The p-values bear resemblance to their counterparts from statistics but are
not the same [12]. P-values are determined according to the relative rankings of putative authentications against
each one of the classes known to the library data using the strangeness. The standard p-value construction
shown below, where $l$ is the cardinality of the training set $T$, constitutes a valid randomness (deficiency) test
approximation [12] for some putative label $y$ hypothesis:
$$p_y(e) = \frac{\#\{i : \lambda_i \ge \lambda^{y}_{\text{new}}\}}{l + 1}. \tag{20}$$
P-values are used to assess the extent to which the biometric data supports or discredits the null hypothesis
$H_0$ (for some specific authentication). When the null hypothesis is rejected for each identity class known, one
declares that the test image lacks mates in the gallery and therefore the identity query is answered with "none
of the above." This corresponds to forensic exclusion with rejection, characteristic of open set recognition, with
authentication implemented using the Open Set Transduction Confidence Machine (TCM) - k-nearest neighbor (k-nn) [13].
TCM facilitates outlier detection in general and impostor detection in particular.
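The ranking behind Eq. (20) can be sketched in a few lines. This is a literal reading of the formula in which only the training strangeness values are counted in the numerator; a common conformal-prediction convention also counts the new sample itself, which adds one to the numerator. The function name and the numbers are hypothetical.

```python
def p_value(training_strangeness, new_strangeness):
    """Eq. (20): share of strangeness values at least as large as the new one.

    training_strangeness holds the l values computed on the training set;
    new_strangeness is the value of the new sample under a putative label y.
    """
    l = len(training_strangeness)
    count = sum(1 for s in training_strangeness if s >= new_strangeness)
    return count / (l + 1)


# Made-up strangeness values for l = 4 training samples.
training = [0.2, 0.5, 0.9, 1.4]

p_typical = p_value(training, 0.5)  # new sample is about as typical as the rest
p_strange = p_value(training, 2.0)  # new sample is stranger than all of them
```

A sample that is stranger than every training sample receives the smallest possible p-value, which is the situation in which the putative label would be rejected.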
4.2 Open Set Transduction Confidence Machine (TCM)
The strangeness is computed for each validation biometric sample under all its putative class labels $a$, $a \in \{1, \ldots, A\}$.
Assuming $N$ validation biometric samples from each class, one derives $N$ positive strangeness values for each
class $a$ and $N(A-1)$ negative strangeness values. The positive and negative strangeness values correspond to
the cases when the putative labels of the validation and training samples are the same or not, respectively. Similar
labels, if recognized as such, correspond to Hits, and different labels, if mistaken as similar, correspond to False
Positives. The strangeness values are ranked for all the $NA$ cases and p-values are derived accordingly.
4.3 Impostor (Intrusion or Outlier) Detection
Similar to semi-supervised learning, changing the class assignments (characteristic of impostor behavior) provides
the bias needed to determine the rejection threshold required to make an authentication inference or to decline
making one. Towards that end, using Open Set TCM, one re-labels the training exemplars, one at a time,
with all the (impostor) putative labels except the one originally assigned to it. The peak-to-side ratio (PSR),
$\mathrm{PSR} = (p_{\max} - p_{\min}) / p_{\text{stdev}}$, describes the characteristics of the resulting p-value distribution and determines,
using cross validation, the [a priori] threshold used to identify (infer) impostors. The PSR values found for
impostors are low because impostors do not mate and their relative strangeness is high (and p-value low).
Impostors are deemed outliers and are thus rejected.
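The PSR statistic defined above is simple to compute from the p-values that one sample obtains under all putative labels. The sketch below assumes the population standard deviation for the denominator (the text does not say which estimator is used) and returns 0 for a perfectly flat distribution; both choices, the function name, and the example values are assumptions.

```python
import statistics


def peak_to_side_ratio(p_values):
    """PSR = (p_max - p_min) / p_stdev over the p-values of one sample."""
    spread = statistics.pstdev(p_values)  # population standard deviation (assumed)
    if spread == 0.0:
        return 0.0  # flat distribution: no peak at all (assumed convention)
    return (max(p_values) - min(p_values)) / spread


# Made-up p-values over 10 putative labels.
genuine = [0.9] + [0.1] * 9   # one clear mate in the gallery
impostor = [0.2, 0.3] * 5     # no label stands out

psr_genuine = peak_to_side_ratio(genuine)
psr_impostor = peak_to_side_ratio(impostor)
```

A single dominant p-value produces a higher PSR than a distribution without a peak, so a cross-validated threshold placed between the two regimes would reject the second sample as an impostor.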
4.3.1 Implications for biometric verification systems
In practical biometric systems the conditional error probabilities such as FAR and FRR are unknown and have
to be estimated using observed biometric data. The values of the estimated probabilities depend on the system
design (encoder and matcher) and on the amount of data available. The estimates can be further plugged into
the expression for the capacity to estimate the amount of information that the designed biometric verification
system and the ideal system have in common. Thus, capacity is a measure of "goodness" of a designed system.
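A plug-in capacity estimate of this kind can be sketched numerically. The code below treats the verification system as a binary asymmetric channel and maximizes the mutual information over the prior by grid search instead of using a closed-form capacity expression. The labeling (X = 1 for a genuine claim, X_hat = 1 for an accept), the function names, and the example error rates are assumptions made for illustration.

```python
import math


def binary_entropy(p):
    """Binary entropy H_b(p) in bits, with H_b(0) = H_b(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def mutual_information(q, far, frr):
    """I(X; X_hat) in bits when P(X = 1) = q.

    Assumed labeling: FAR = P(X_hat = 1 | X = 0), FRR = P(X_hat = 0 | X = 1).
    """
    p_accept = (1.0 - q) * far + q * (1.0 - frr)
    noise = (1.0 - q) * binary_entropy(far) + q * binary_entropy(frr)
    return binary_entropy(p_accept) - noise  # H(X_hat) - H(X_hat | X)


def plug_in_capacity(far, frr, grid=10000):
    """Maximize I(X; X_hat) over the prior q on a uniform grid."""
    return max(mutual_information(i / grid, far, frr) for i in range(grid + 1))


# Hypothetical estimated error rates.
capacity = plug_in_capacity(far=0.01, frr=0.05)
```

With FAR = FRR the channel is symmetric and the result reduces to 1 - H_b(FAR); with error-free decisions it reaches 1 bit, the full information in a binary ideal decision.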
4.4 Estimation of FAR and FRR
FAR and FRR are estimated using the typicality of biometric samples and the rankings for each of their $N$ putative
assignments. The active learning solution proposed here is similar to that used for choosing the best examples
for biometric training (learning). The solution is driven by Open Set Transductive Confidence Machines (TCM)
using strangeness and p-values. The p-values provide a measure of diversity and disagreement in opinion regarding
the putative label of a biometric sample when it is assigned all the labels available. Let $p_i$ be the p-values obtained
for a particular example $x_{n+1}$ using all possible labels $i = 1, \ldots, N$. Sort the sequence of p-values in descending
order so that the first two p-values, say $p_j$ and $p_k$, are the two highest p-values, with labels $j$ and $k$, respectively.
The label assigned to the unknown example is $j$, with a p-value of $p_j$. This value defines the credibility of the
classification. If $p_j$ (credibility) is not high enough, the prediction is rejected under the open set recognition
scenario. The difference between the two p-values can be used as a confidence value on the prediction, if one is
contemplated. Note that the smaller the confidence, the larger the ambiguity regarding the proposed label and
the more likely false accepts and false rejects are. We consider three possible cases of the p-values $p_j$ and $p_k$,
assuming $p_j > p_k$:
1. $p_j$ is high and $p_k$ is low. Prediction $j$ has high credibility and a high confidence value;
2. $p_j$ is high and $p_k$ is high. Prediction $j$ has high credibility but a low confidence value;
3. $p_j$ is low and $p_k$ is low. Prediction $j$ has low credibility and a low confidence value.
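The three cases can be made concrete with a short sketch that extracts the credibility and the confidence from a set of p-values and maps them to a case number. The thresholds separating "high" from "low" are hypothetical placeholders; in the scheme described here they would be set by cross validation.

```python
def credibility_and_confidence(p_values):
    """Return (predicted label j, credibility p_j, confidence p_j - p_k).

    p_values maps each putative label to its p-value; at least two labels are needed.
    """
    ranked = sorted(p_values.items(), key=lambda item: item[1], reverse=True)
    (label_j, p_j), (_, p_k) = ranked[0], ranked[1]
    return label_j, p_j, p_j - p_k


def prediction_case(credibility, confidence, min_credibility=0.5, min_confidence=0.3):
    """Map a prediction to Case 1, 2, or 3 using hypothetical thresholds."""
    if credibility < min_credibility:
        return 3  # low credibility; confidence cannot exceed credibility
    return 1 if confidence >= min_confidence else 2


# Made-up p-values for three biometric samples.
clear = credibility_and_confidence({"a": 0.9, "b": 0.1, "c": 0.05})
ambiguous = credibility_and_confidence({"a": 0.8, "b": 0.75, "c": 0.1})
unknown = credibility_and_confidence({"a": 0.1, "b": 0.08, "c": 0.02})

cases = [prediction_case(c[1], c[2]) for c in (clear, ambiguous, unknown)]
```

The second sample illustrates why Case 2 is hard to catch: its credibility passes the threshold, and only the small gap between the two best p-values reveals the ambiguity.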
High uncertainty in prediction occurs for both Case 2 and Case 3 and leads to misclassification errors, with
those corresponding to Case 2 harder to avoid (because of their assumed high credibility). For both Cases
2 and 3, uncertainty of prediction occurs when $p_j \approx p_k$. The confidence $I(x_{n+1}) = p_j - p_k$ indicates the
quality of the authentication information possessed by the biometric samples. As $I(x_{n+1})$ approaches 0, we become
more uncertain about classifying the example, and errors become more likely. One
tabulates for all biometric samples their possible errors, weighted according to both their credibility and contextual
confidence. The larger the credibility, the larger the weight; the smaller the confidence, the larger the weight too.
Thresholds similar to those derived for Open Set TCM are set, confusion matrices accrue "errors" over $NA$
putative authentications, and FAR and FRR are estimated accordingly. Further extensions can incorporate
some error analysis according to the diversity of the biometric population encountered and the characteristics of
pattern specific error inhomogeneities (PSEI).
The scheme proposed above, similar to Query by Transduction [12], has solid theoretical underpinnings, with
p-values mapped to posterior probabilities. This is based on the fact that (1) the Kullback-Leibler (KL) divergence
can be interpreted as the expected discrimination information between the null and alternative statistical
hypotheses; and (2) there are connections between KL divergence and Shannon information [14]. The scheme proposed above
for FAR and FRR estimation can be expanded to both labeled and unlabeled biometric samples if label propagation
using spectral clustering precedes the computation of confidence values and the (weighted) tabulation of
confusion matrices.
5. SUMMARY
Two IT frameworks to analyze the performance of biometric authentication systems at the system level were introduced.
A joint secured biometric channel was designed by combining the biometric authentication system and the device
physical signature authentication system into a single BAC channel. Its capacity and rate-distortion function
were evaluated. Since prior probabilities and conditional error probabilities characterizing biometric and physical
signature authentication systems are not available in practice, a transductive method was suggested to estimate
these parameters by using a small amount of training data. These parameters can then be used to estimate
the capacity and rate-distortion function of real biometric authentication systems.
REFERENCES
[1] Westover, M. B. and O'Sullivan, J. A., "Achievable rates for pattern recognition," IEEE Transactions on
Information Theory 54(1), 299-320 (2008).
[2] Willems, F., Kalker, T., Goseling, J., and Linnartz, J.-P., "On the capacity of a biometrical identification
system," Proc. of IEEE Int. Symp. on Information Theory, 82 (2003).
[3] Schmid, N. A. and O'Sullivan, J. A., "Performance prediction methodology for biometric system using large
deviations approach," IEEE Trans. on Signal Processing: Supplement on Secure Media 52(10), 3036-3045
(2004).
[4] Schmid, N. A. and Nicolo, F., "Recognition capacity of biometric systems under global PCA- and ICA-based
encoding," IEEE Trans. on Information Forensics and Security 3(3), 512-528 (2008).
[5] Amblard, P. O., Michel, O. J. J., and Morfu, S., "Revisiting the asymmetric binary channel: joint noise-enhanced
detection and information transmission through threshold devices," in [Noise in Complex Systems
and Stochastic Dynamics III], Proc. SPIE 5845, 50-60 (2005).
[6] Wechsler, H., [Reliable Face Recognition Methods: System Design, Implementation and Evaluation], Springer
US, New York (2007, Ch. 14).
[7] Fridrich, J., "Digital image forensics," IEEE Signal Processing Magazine 26(2), 26-37 (2009).
[8] Filler, T., Fridrich, J., and Goljan, M., "Using sensor pattern noise for camera model identification," in
[Proc. IEEE ICIP], 1296-1299 (2008).
[9] O'Sullivan, J. A. and Schmid, N. A., "Large deviations for performance analysis of signature authentication,"
in [IEEE Int. Symp. on Information Theory], 176 (1997).
[10] O'Sullivan, J. A. and Schmid, N. A., "Performance analysis of physical signature authentication," IEEE
Trans. on Inform. Theory 47(7), 3034-3039 (2001).
[11] Cover, T. M. and Thomas, J. A., [Elements of Information Theory], Wiley-Interscience, Hoboken, New Jersey
(2006, second edition).
[12] Ho, S. S. and Wechsler, H., "Query by transduction," IEEE Trans. on Pattern Analysis and Machine
Intelligence 30(9), 1557-1571 (2008).
[13] Li, F. and Wechsler, H., "Open set face recognition using transduction," IEEE Trans. on Pattern Analysis
and Machine Intelligence 27(11), 1686-1697 (2005).
[14] Ho, S. S. and Wechsler, H., "A martingale framework for detecting changes in the data generating model
in data streams," IEEE Trans. on Pattern Analysis and Machine Intelligence (2010, to appear).