IT and SLT Characterizations of Secured Biometric Authentication Systems

Natalia A. Schmid^a and Harry Wechsler^b

^a West Virginia University, Morgantown, WV 26508; ^b George Mason University, Fairfax, VA 22030

ABSTRACT

This paper provides an information theoretical description of biometric systems at the system level. A number of basic models to characterize performance of biometric systems are presented. All models compare performance of an automatic biometric recognition system against performance of an ideal biometric system that knows correct decisions. The correct decision can be visualized as an input to a new decision system, and the decision by an automatic recognition system is the output of this decision system. The problem of performance evaluation for a biometric recognition system is formulated as (1) the problem of finding the maximum information that the output of the system has about the input, and (2) the problem of finding the maximum distortion that the output can experience with respect to the input of the system to guarantee a bounded average probability of recognition error. The first formulation brings us to evaluation of the capacity of binary asymmetric and M-ary channels. The second formulation falls under the scope of rate-distortion theory. We further describe the problem of physical signature authentication used to authenticate a biometric acquisition device and state the problem of secured biometric authentication as the problem of joint biometric and physical signature authentication. One novelty of this work is in restating the problem of secured biometric authentication as the problem of finding the capacity and the rate-distortion curve for a secured biometric authentication system. Another novelty is in the application of transductive methods from statistical learning theory to estimate the conditional error probabilities of the system. This set of parameters is used to optimize the system performance.

Keywords: system capacity, biometric systems, physical signatures, detection, binary channel, M-ary channel, decision error, rate-distortion, transduction

1. INTRODUCTION

In recent years biometrics has drawn the attention of research groups ranging from computer vision to physics and statistics. Many new biometric modalities and algorithms have been developed and their number continues to grow. Most important among the challenges to be met are system reliability together with robustness to image variability and adversarial learning. The bounds on what can be achieved in practice, however, are not known. This situation is quite different from information and communication theory, where Claude Shannon predicted the bounds of transmission for a signal over a channel with Gaussian noise more than 60 years ago. The codes to achieve the limits were designed 50 years after the theoretical results were stated. Many biometric systems have claimed top performance, but what can actually be achieved in practice is not known. A full-fledged information theoretical treatment of biometric recognition systems has yet to be developed. The developments available are not many and include achievable rates [1] and capacity of biometric recognition systems [2-4].

This paper proposes novel means to evaluate the limits of biometric authentication systems. Towards that end we suggest two basic frameworks that are drawn from Information Theory (IT). The first framework evaluates the capacity (achievable rate) of a biometric authentication system at a system level. Here the system links the decision of an ideal biometric authentication system (input to the system) with the decision by an automatic recognition device (output of the system). Under the first framework, the problem of biometric authentication can be restated as the problem of finding the capacity of a binary asymmetric channel (BAC) or an M-ary channel, depending on the type of the system (one-to-one match or many-to-one match). If, in addition to biometric authentication, we involve a means to authenticate the biometric device used to acquire biometric data, we add a security component to the biometric authentication system. It is shown here that the secured biometric authentication system can be reduced to a BAC, and its performance can be numerically evaluated.

Further author information: (Send correspondence to Natalia A. Schmid or to Harry Wechsler.)
Natalia A. Schmid: E-mail: Natalia.Schmid@mail.wvu.edu, Telephone: 1 (304) 293-9136
Harry Wechsler: E-mail: wechsler@gmu.edu, Telephone: 1 (703) 993-1710

The second framework is a rate-distortion framework, where the decisions made by a binary or an M-ary biometric authentication system are distorted versions of the decisions made by an ideal authentication system. The distortion function is a zero/one loss function that assigns zero cost to a correct decision and a cost of one to each incorrect decision. The problem of finding the maximum deviation of the decision made by an automatic biometric authentication system from the decision by the ideal biometric authentication system, under the constraint of a bounded average probability of authentication error, is stated as the problem of finding a rate-distortion function. Again, involving a physical signature authentication component leads to a secured biometric authentication system. It is shown here that for the same value of the average recognition error, the secured biometric system can sustain a larger loss of information at the system level compared to the unsecured biometric system.

In theory, channel optimization is performed with respect to the probability of the input to a channel, and solving a rate-distortion problem involves an optimization with respect to conditional error probabilities of the system relating inputs and outputs. In practice, conditional error probabilities and prior probabilities of the input are not known and have to be evaluated using a training data set. For example, these probabilities can be computed using methods of Classical Detection and Estimation Theory or by using modern methods of Statistical Learning Theory (SLT) by involving transductive approaches. In this paper we suggest a transductive method to estimate conditional error probabilities.

The rest of the paper is organized as follows. Sec. 2 presents known information theoretical models that allow a biometric authentication system to be treated at the system level as a binary asymmetric channel or an M-ary channel. Sec. 3 presents results and developments related to rate-distortion theory. Sec. 4 suggests using transductive methods from SLT to estimate unknown parameters of the proposed models. A short summary of models and developments is provided in Sec. 5.

2. SYSTEM CAPACITY

2.1 Biometric authentication system

The relationship between capacity and probability of error of a threshold based detector device was previously analyzed by Amblard et al. [5] Their paper describes the problem of designing noise enhanced detectors from two different perspectives: communications theory (viewing the system as a communication channel) and detection theory (designing an optimal binary receiver). The experimental results indicate that joint optimality (to ensure reliable communication and optimal detector design) is not possible for a threshold based device. Placed in the information theory framework, a detection system can be depicted as a binary communication channel with asymmetric cross error probabilities (we will use biometrics terminology) False Accept Rate (FAR) and False Reject Rate (FRR). Assume that the values of the probability of correct biometric authentication, $1-\mathrm{FRR} = p$, and the probability of false accept, $\mathrm{FAR} = 1-q$, are preset. Then the capacity of the binary asymmetric channel is given by:

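$$C = H_b\big(\pi^*(1-\mathrm{FRR}) + (1-\pi^*)\,\mathrm{FAR}\big) - \pi^* H_b(1-\mathrm{FRR}) - (1-\pi^*)\,H_b(\mathrm{FAR}), \tag{1}$$

where

$$\pi^* = \frac{1 - \beta(1-\mathrm{FRR},\mathrm{FAR})\,\mathrm{FAR}}{\beta(1-\mathrm{FRR},\mathrm{FAR})\,(1-\mathrm{FRR}-\mathrm{FAR})}, \tag{2}$$

$$\beta(1-\mathrm{FRR},\mathrm{FAR}) = 2^{\frac{H_b(1-\mathrm{FRR}) - H_b(\mathrm{FAR})}{1-\mathrm{FRR}-\mathrm{FAR}}} + 1, \tag{3}$$

and $H_b(P)$ is the binary entropy given by $H_b(P) = -P\log(P) - (1-P)\log(1-P)$.

The total probability of authentication error for this channel is

$$P^*(\mathrm{error}) = (1-\pi^*)\,\mathrm{FAR} + \pi^*\,\mathrm{FRR}. \tag{4}$$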

[Figure 1: left panel, block diagram of the channel; right panel, surface plot of CAPACITY over the axes $p$ and $1-q$.]

Figure 1. The left panel shows a diagram of a biometric verification channel. The right panel displays the capacity of a BAC as a function of the probability of detection and the probability of false alarm.

However, it can be easily shown (see Ref. 5 for detailed illustration) that $P^*(\mathrm{error})$ is always larger than the minimum probability of error that can be achieved by the decision making system.
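Equations (1)-(4) are simple to evaluate numerically. The following is a minimal sketch, not the authors' implementation: the function names `binary_entropy` and `bac_capacity` are hypothetical, and logarithms are taken to base 2, an assumption that is consistent with the power of 2 in (3). It assumes $1-\mathrm{FRR} \neq \mathrm{FAR}$, since (3) is undefined otherwise.

```python
import math


def binary_entropy(x: float) -> float:
    """Binary entropy H_b(x) in bits, with 0*log(0) taken as 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def bac_capacity(frr: float, far: float):
    """Evaluate Eqs. (1)-(4) for a binary asymmetric channel.

    Returns (capacity, optimal prior pi*, total error P*(error)).
    Assumes 1 - FRR != FAR, otherwise Eq. (3) is undefined.
    """
    p = 1.0 - frr  # probability of correct authentication
    exponent = (binary_entropy(p) - binary_entropy(far)) / (p - far)
    beta = 2.0 ** exponent + 1.0                          # Eq. (3)
    pi_star = (1.0 - beta * far) / (beta * (p - far))     # Eq. (2)
    capacity = (binary_entropy(pi_star * p + (1.0 - pi_star) * far)
                - pi_star * binary_entropy(p)
                - (1.0 - pi_star) * binary_entropy(far))  # Eq. (1)
    p_error = (1.0 - pi_star) * far + pi_star * frr       # Eq. (4)
    return capacity, pi_star, p_error


# Operating point used later in Sec. 2.3.1: 1 - FRR = 0.97, FAR = 0.01.
c1, pi1, e1 = bac_capacity(frr=0.03, far=0.01)
print(f"C = {c1:.4f}, pi* = {pi1:.4f}, P*(error) = {e1:.4f}")
```

With the transition probabilities of Sec. 2.3.1, this sketch should return capacities close to the values $C_1 = 0.8624$ and $C_2 = 0.6209$ quoted there.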

To summarize, from a system perspective (at the decision level) a biometric authentication system can be viewed as a decision device as well as a binary asymmetric channel with two error probabilities, FAR and FRR.

The input to the BAC is a random variable (denote it by $X$). From the system level perspective, the random variable $X$ is the ideal knowledge (ideal decision) available to a system designer when biometric data are labeled. The labeled set is typically used to build an adaptive biometric system with a number of unknown parameters estimated from the labeled (also called training) data. The input $X$ takes value 1 when two templates presented for matching belong to the same class. It takes value 0 when the two templates are from two different classes. The output of the binary channel is also a binary random variable (denote it by $Y$), which represents the decision made by an automatic biometric verification system.

Channels are traditionally characterized by a single-number measure called capacity. For a communication channel, capacity is related to the maximum number of classes that the receiver on the far end of a communication channel can recognize with the probability of recognition error approaching zero as the length of transmitted messages increases. The meaning of capacity is slightly different for authentication (biometric authentication in particular) systems. The capacity of a biometric authentication system can be interpreted (in the particular setting summarized above) as the maximum amount of information that the input and output can have in common. That is, the capacity measures how well the automatic system mimics the recognition ability of the ideal system. Fig. 1 displays a block-diagram of a BAC (left panel) and the plot of the capacity of the BAC as a function of the probability of correct decision, $p$, and FAR, denoted here as $1-q$. Note that the points where the capacity is equal to one are the points where the automatic and the ideal systems are in perfect agreement.

We now restate the problem of biometric authentication to add a security component, which can be naturally involved in the process of biometric authentication. The new design will include an additional decision making device. This device authenticates the nature of biometric data based on a physical signature of the sensor used to acquire biometric data.

2.2 Secured Biometric Authentication

Secure biometrics is one of the top priority topics in the field of biometric-based authentication. Security of biometric signatures or systems is often provided by performing encryption, watermarking, encoding, or through involving cancelable biometric signatures. [6] Most of these methods require the application of some type of lossy transformation to biometric data, after which the original signals cannot be entirely recovered.

In recent years, digital forensics grew into a separate research field. Digital forensics covers a large number of topics unified by a single theme: establishing authenticity of data. This task is often restated as the problem of establishing the authenticity of the device used to acquire data. Each electronic, mechanical, or magnetic device, and each substance and material, is characterized by a unique physical signature such as graininess and structure of wood or other surfaces, particles of paint, concentration of particles in a chemical composition, magnetic noise on the magnetic stripe of credit cards, and other physical signatures. For CCD and CMOS electronic cameras the physical signature (also known as the camera fingerprint) is due to imperfections in the production of the optics of a particular camera and thus it is a noise. This noise is known as photo-response nonuniformity (PRNU). PRNU can be extracted using relatively uniform and not so bright portions of a provided image (see Refs. 7, 8). Camera physical signatures are traditionally modeled as discrete space random processes. Thus physical signatures are treated as realizations of a discrete random process. In general, the PRNU process cannot be described by a simple statistical model. However, the correlation-based test statistic designed for device authentication is relatively well modeled as a Gaussian random variable. This significantly simplifies performance analysis of the device signature authentication.

Involving PRNU or another physical signature of an acquisition sensor is a natural solution to improve security of a biometric authentication system. It can be done by concatenating the biometric signature (which could be raw data or an extracted informative and descriptive representation) with a physical signature of the acquisition device. For example, PRNU can be easily extracted from submitted biometric images.

Here we will demonstrate the change in the performance (the mutual information between input and output of an authentication system at the system level) of a biometric authentication system due to the involvement of a physical signature of an optical device. Details will be developed for the case where biometric and physical signatures of the acquisition device are independent or weakly dependent.

2.2.1 Physical signature authentication

In the problem of authenticating a physical signature, two different realizations of sensor noise are compared. The problem of deciding if signatures belong to the same sensor or to two different sensors is traditionally stated as a binary hypothesis testing problem. If we assume that signatures of different sensors are realizations of independent and identically distributed random processes (signatures of a finite length are realizations of random vectors), then the authentication procedure can be described by the following two hypotheses. We introduce $H_1$ and $H_0$, which indicate that two physical signatures, one extracted from the claimed image and the other one extracted from a query image, have a signal in common, or do not have a common signal, respectively. Thus, under the hypothesis $H_1$, two signatures are realizations of the same stochastic process and thus have a joint distribution. Under the hypothesis $H_0$, two signatures are independent and identically distributed. The probability distribution of the two signatures under $H_0$ has a product of marginals form. Thus, the joint probability distribution is tested against the product of marginals. Associated with this system are two conditional error probabilities: the probability of false alarm $\mathrm{FAR} = 1-s$ and the probability of missed detection $\mathrm{FRR} = 1-r$. Given $r$ and $s$, the probability of error and the capacity of the system can be evaluated by analogy with the approach of the previous section. The asymptotic analysis of optimally designed physical signature authentication systems can be found in earlier publications by one of the authors. [9, 10]
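As a rough illustration of how $r$ and $s$ could be obtained when the test statistic is modeled as Gaussian, the sketch below computes FAR and FRR of a single-threshold test. Everything specific in it is a hypothetical assumption made for illustration and not a model taken from the paper: the equal-variance Gaussian model, the means, the threshold value, and the function names `gaussian_cdf` and `error_rates`.

```python
import math


def gaussian_cdf(z: float) -> float:
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def error_rates(threshold: float, mean_h0: float, mean_h1: float, sigma: float):
    """FAR and FRR of the test 'decide H1 if statistic > threshold'.

    Hypothetical model: the statistic is N(mean_h0, sigma^2) under H0
    and N(mean_h1, sigma^2) under H1 (equal variances are assumed).
    """
    far = 1.0 - gaussian_cdf((threshold - mean_h0) / sigma)  # FAR = 1 - s
    frr = gaussian_cdf((threshold - mean_h1) / sigma)        # FRR = 1 - r
    return far, frr


# Illustrative numbers only: unit variance, means 0 and 3, midpoint threshold.
far, frr = error_rates(threshold=1.5, mean_h0=0.0, mean_h1=3.0, sigma=1.0)
r, s = 1.0 - frr, 1.0 - far
print(f"FAR = {far:.4f}, FRR = {frr:.4f}, r = {r:.4f}, s = {s:.4f}")
```

Moving the threshold trades FAR against FRR, which is what makes the resulting channel asymmetric in general.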

2.2.2 Joint biometric and device signature authentication

Assume that the physical signature of a device and the biometric signature used for identification are independent. Due to complex processing, enhancement and the distinct nature of physical and biometric signatures, this assumption is often valid in practice. Joint design depends on statistical models of physical and biometric signatures. When the two signatures are independent, the correct decision about genuine identity requires that both biometric signatures and physical signatures of the device be authenticated correctly. Denote by $1-\mathrm{FRR}_1 = p$ the probability of correct decision for the biometric authentication system and by $1-\mathrm{FRR}_2 = r$ the probability of correct decision for the device signature authentication system. The conditional probability that the biometric signature is authenticated correctly and the physical signature is authenticated correctly, provided that both are genuine, is $(1-\mathrm{FRR}_1)(1-\mathrm{FRR}_2) = pr$. Let $X_1$ be the input random variable to the biometric authentication system. This is the true (ideal) state of the system. Let $X_2$ be the input to the physical signature authentication system. Let the outputs of the two systems be denoted by $Y_1$ and $Y_2$, respectively. The outputs represent the states of practical automatic systems designed to perform biometric authentication and device physical signature authentication. The channel with joint states is shown in Fig. 2.

Figure 2. A diagram of the joint biometric-physical signature authentication channel.

If the capacity of each individual channel is known (for example, $C_1$ and $C_2$) and the channels are independent, then the capacity of the joint channel is the sum of the two capacities. Let $C$ be the capacity of the joint channel, then

$$C = C_1 + C_2.$$

This is a classical result that can be derived by following a few guidelines from Ref. 11.

2.3 BAC Perspective

It is interesting to note that from the secure biometric authentication point of view the events $(X_1,X_2) = (1,0)$, $(X_1,X_2) = (0,1)$ and $(X_1,X_2) = (0,0)$ constitute an error in secured authentication. Since a biometric authentication system does not differentiate among the three errors, the joint channel in Fig. 2 can be reduced to a BAC with the two joint states $(1,1)$ and $(1,0)\cup(0,1)\cup(0,0)$ replaced by the states 1 and 0. Assuming the independence of the biometric and physical signature authentication systems and keeping the notation introduced in the previous sections, the probability of correct recognition is the conditional probability $pr$. The probability that the authentication system decides in favor of 0, given that the true state is 1, is the conditional probability $P[(Y_1,Y_2) \in \{(1,0),(0,1),(0,0)\} \mid (X_1,X_2) = (1,1)] = 1 - pr$. Let the prior probability for the input of the biometric authentication system to be in state 1 be $\pi_1$ and the prior probability for the input to the physical signature authentication system to take the value 1 be $w_1$. Then the probability of false accept and the probability of correct reject can also be evaluated:

$$
\begin{aligned}
&P[(Y_1,Y_2) = (1,1) \mid (X_1,X_2) \in \{(1,0),(0,1),(0,0)\}] \\
&\quad = \left[1 - \frac{P[(Y_1,Y_2) = (1,1) \mid (X_1,X_2) = (1,1)]\,P[(X_1,X_2) = (1,1)]}{P[(Y_1,Y_2) = (1,1)]}\right] \frac{P[(Y_1,Y_2) = (1,1)]}{1 - P[(X_1,X_2) = (1,1)]} \\
&\quad = \frac{1}{1-\pi_1 w_1}\Big(p(1-s)\pi_1(1-w_1) + (1-q)r(1-\pi_1)w_1 + (1-q)(1-s)(1-\pi_1)(1-w_1)\Big).
\end{aligned} \tag{5}
$$

The block-diagram of the binary channel is shown in Fig. 1 with the transition probabilities replaced by $pr$ and the expression (5).

[Figure 3: surface plot of MUTUAL INFORMATION over the axes PRIOR, BAC1 and PRIOR, BAC2.]

Figure 3. The plot of the capacity of a secured biometric system as a function of $\pi_1$ and $w_1$, the prior probabilities for the biometric verification system and the physical signature authentication system.

The mutual information between the input and output of this binary channel, $I(X;Y) = H(Y) - H(Y|X)$, can be easily evaluated and the optimization problem is stated as follows:

$$\max_{\pi_1, w_1} \big\{ H_b(\beta) - \pi_1 w_1 H_b(pr) - (1-\pi_1 w_1) H_b(\alpha) \big\}, \tag{6}$$

where

$$\alpha = P[Y = (1,1) \mid X \in \{(1,0),(0,1),(0,0)\}] \tag{7}$$

and

$$\beta = P[Y = (1,1)] = pr\,\pi_1 w_1 + p(1-s)\pi_1(1-w_1) + (1-q)r(1-\pi_1)w_1 + (1-q)(1-s)(1-\pi_1)(1-w_1). \tag{8}$$

This optimization problem can be readily solved numerically.
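One simple way to carry out this numerical optimization is an exhaustive grid search over the two priors. The sketch below is an assumption-laden illustration rather than the authors' procedure: the function names are hypothetical, logarithms are base 2, and the grid resolution is an arbitrary choice. It evaluates the objective of (6) using (5), (7) and (8).

```python
import math


def binary_entropy(x: float) -> float:
    """Binary entropy H_b(x) in bits, with 0*log(0) taken as 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def secured_mutual_information(pi1, w1, p, q, r, s):
    """Objective of Eq. (6) for priors pi1, w1 and transition probabilities."""
    prior_11 = pi1 * w1  # P[(X1, X2) = (1, 1)]
    # Probability of output (1,1) jointly with each of the three other inputs.
    leak = (p * (1 - s) * pi1 * (1 - w1)
            + (1 - q) * r * (1 - pi1) * w1
            + (1 - q) * (1 - s) * (1 - pi1) * (1 - w1))
    beta = p * r * prior_11 + leak  # Eq. (8)
    # Eqs. (5), (7); alpha is irrelevant when the input (1,1) is certain.
    alpha = leak / (1.0 - prior_11) if prior_11 < 1.0 else 0.0
    return (binary_entropy(beta)
            - prior_11 * binary_entropy(p * r)
            - (1.0 - prior_11) * binary_entropy(alpha))


def secured_capacity(p, q, r, s, steps=200):
    """Grid search of Eq. (6) over pi1, w1 in [0, 1] (boundaries included)."""
    best = (0.0, 0.0, 0.0)
    for i in range(steps + 1):
        for j in range(steps + 1):
            pi1, w1 = i / steps, j / steps
            value = secured_mutual_information(pi1, w1, p, q, r, s)
            if value > best[0]:
                best = (value, pi1, w1)
    return best


# Transition probabilities of the illustration in Sec. 2.3.1.
p, q, r, s = 0.97, 0.99, 0.90, 0.95
capacity, pi1_opt, w1_opt = secured_capacity(p, q, r, s)
print(f"capacity = {capacity:.4f} at pi1 = {pi1_opt:.3f}, w1 = {w1_opt:.3f}")
```

A grid search is crude but makes no smoothness assumptions and includes the boundary of the unit square, where the maximum may lie; a gradient-based optimizer could replace it if higher precision is needed.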

2.3.1 Illustration

Assume that the transition probabilities characterizing a biometric authentication system and a physical signature authentication system are set to be $1-\mathrm{FRR}_1 = p = 0.97$, $\mathrm{FAR}_1 = 1-q = 0.01$, $1-\mathrm{FRR}_2 = r = 0.9$, and $\mathrm{FAR}_2 = 1-s = 0.05$. Thus the accuracy of the physical signature authentication system is lower compared to the biometric system. Using (1), the capacities of the two systems are $C_1 = 0.8624$ and $C_2 = 0.6209$. We found the optimal prior probabilities for each of the systems and substituted them in (6) without performing optimization. The mutual information between the input and the output of the secured biometric authentication system with predefined prior probabilities was found to be 0.5293. Note that in this case the automatic system loses part of its ability to recognize biometric samples compared to the case of the unsecured biometric authentication system. The capacity of the secured biometric system was obtained by numerically optimizing (6) with respect to $\pi_1$ and $w_1$. It amounts to 0.6820. The plot of the capacity as a function of the two prior probabilities for the same set of the transition probabilities is displayed in Fig. 3.

2.4 Identification Channel

By analogy with the binary authentication system above, an authentication system with $M$ classes can be viewed as an M-ary channel with the conditional probability $p$ of a correct decision and with the conditional probability $(1-p)/(M-1)$ of an error. All conditional error probabilities are assumed to be equal. With this notation (see Fig. 4), the capacity of this channel is

$$C = p\log M - H_b(p) - (1-p)\log\frac{M-1}{M}, \tag{9}$$

which is achieved by selecting the uniform distribution on the input, which in turn makes the output uniform.

[Figure 4: left panel, block diagram of the channel; right panel, surface plot of CAPACITY OF M-ARY SYSTEM over the axes p, CORRECT DECISION and M, NUMBER OF CLASSES.]

Figure 4. The left panel presents a block-diagram of an identification channel at a system level. The right panel shows the plot of the M-ary capacity as a function of the number of classes and the probability of correct authentication.

The capacity of an M-ary biometric authentication system has an interpretation similar to the capacity of an M-ary communication channel. This is the relationship between the capacity, the maximum number of classes to recognize with vanishing probability of error, and the length of codewords that are used to encode the label of a class. Equation (9) can be used to predict the value of the capacity for a given $p$ and $M$.
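Equation (9) can be evaluated directly. The sketch below assumes base-2 logarithms (capacity in bits) and $M \geq 2$; the function names are hypothetical.

```python
import math


def binary_entropy(x: float) -> float:
    """Binary entropy H_b(x) in bits, with 0*log(0) taken as 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def mary_capacity(p: float, m: int) -> float:
    """Eq. (9): capacity in bits of the symmetric M-ary channel (m >= 2)."""
    return (p * math.log2(m)
            - binary_entropy(p)
            - (1.0 - p) * math.log2((m - 1) / m))


# Capacity for a few class counts at a fixed probability of correct decision.
for m in (2, 10, 50):
    print(m, round(mary_capacity(0.9, m), 4))
```

Two sanity checks follow from (9): a perfect system ($p = 1$) attains $\log M$, and a system that guesses uniformly at random ($p = 1/M$) has zero capacity.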

3. RATE-DISTORTION FRAMEWORK

In some cases, the optimization of the amount of information between input and output of an authentication channel has to be performed considering limited resources. For example, we may be interested in finding the "worst case average decision" that can be made by an automatic authentication system at the system level (due to distortions in data or due to imperfect design of the system) under the condition that the average probability of authentication error attained by the system is below a provided value.

3.1 Binary Problem

Trading off the information between the input and output random variables against the distortions due to a query image differing from the enrolled images is another approach to characterize the limits of biometric systems. Given an upper bound, $D$, on the average probability of error that an authentication system can sustain, we would like to find the maximum dissimilarity between the decisions of the automatic and ideal systems under this constraint. Here we use the mutual information between the decisions made by the two systems as an (inverse) measure of dissimilarity. Thus, seeking the maximum average dissimilarity is reduced to seeking the minimum mutual information. Constrained optimization of the information between binary input and output of an authentication system under the constraint of a bounded average probability of authentication error is a classical rate-distortion problem. We will briefly state the problem below and discuss its solution.

Let $X$ be a decision made by an ideal authentication system. Let $\hat{X}$ be a decision made by an automatic (nonideal) authentication system. At the system level, $X$ and $\hat{X}$ are the input and output of a BAC, as described in earlier sections. The notation $\hat{X}$ is introduced to indicate that $\hat{X}$ is a distorted version of $X$ (in information theory this notation is used to indicate a lossy compressed representation of $X$). The question is how much $X$ can be distorted such that the average probability of authentication error is smaller than $D$. Denoting the mutual information between the input and output of the system as $I(X;\hat{X})$, the problem of "trading maximum input-output dissimilarity (rate in IT terms) and the average probability of error of the system (distortion in IT terms)" becomes:

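$$\min_{p(\hat{x}|x):\;E[d(X,\hat{X})] \le D} I(X;\hat{X}), \tag{10}$$

where

$$d(x,\hat{x}) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad \text{and} \quad E[d(X,\hat{X})] = \sum_{x,\hat{x} \in \{0,1\}} d(x,\hat{x})\,p(x,\hat{x}),$$

with $p(x,\hat{x})$ being the joint distribution of $X$ and $\hat{X}$.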

Assuming that the conditional probabilities of error are not equal, the constrained optimization (10) can be approached by involving the method of Lagrange multipliers:

$$J(p(\hat{x}|x),\mu) = \sum_{x,\hat{x} \in \{0,1\}} p(x,\hat{x}) \log\frac{p(x,\hat{x})}{p(x)p(\hat{x})} + \mu\big(E[d(X,\hat{X})] - D\big), \tag{11}$$

where $J$ is the new function (Lagrangian) to optimize and $\mu$ is the Lagrange multiplier.

Using notation similar to that used in the earlier sections for the conditional error probabilities, that is, setting $p(\hat{X} = 1 \mid X = 1)$ to $p$ and $p(\hat{X} = 0 \mid X = 0)$ to $q$, and optimizing with respect to $p$ and $q$ results in

$$\frac{1}{2}\log\frac{p(1-p+q)}{(1-p)(1+p-q)} - \frac{\mu}{2} = 0,$$

$$\frac{1}{2}\log\frac{q(1+p-q)}{(1-q)(1-p+q)} - \frac{\mu}{2} = 0,$$

where $X$ is assumed to be uniformly distributed and

$$\frac{1-p}{2} + \frac{1-q}{2} = D.$$

There is no closed form solution developed for this problem. The problem can be solved numerically.
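A direct numerical treatment is a one-dimensional scan along the constraint. The sketch below is one possible approach and not the authors' solver; it assumes a uniform $X$, base-2 logarithms, $0 < D \le 1/2$, and that the distortion constraint is met with equality at the minimum. The function names and the step count are hypothetical choices.

```python
import math


def binary_entropy(x: float) -> float:
    """Binary entropy H_b(x) in bits, with 0*log(0) taken as 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def mutual_information(p: float, q: float) -> float:
    """I(X; X_hat) in bits for uniform X, p = P(1|1), q = P(0|0)."""
    p_hat_one = 0.5 * (p + 1.0 - q)  # P(X_hat = 1)
    return (binary_entropy(p_hat_one)
            - 0.5 * binary_entropy(p)
            - 0.5 * binary_entropy(q))


def binary_rate(d: float, steps: int = 1000):
    """Minimize I over (p, q) on the line (1-p)/2 + (1-q)/2 = d.

    Returns (minimum information, p, q). Assumes 0 < d <= 0.5.
    """
    best = None
    for k in range(steps + 1):
        miss = 2.0 * d * k / steps     # 1 - p
        false_accept = 2.0 * d - miss  # 1 - q
        value = mutual_information(1.0 - miss, 1.0 - false_accept)
        if best is None or value < best[0]:
            best = (value, 1.0 - miss, 1.0 - false_accept)
    return best


for d in (0.01, 0.05):
    rate, p_opt, q_opt = binary_rate(d)
    print(f"D = {d}: rate = {rate:.4f}, p = {p_opt:.4f}, q = {q_opt:.4f}")
```

For $D = 0.01$ and $D = 0.05$ the scan should reproduce the rates 0.9192 and 0.7136 quoted in Sec. 3.3, where $p = q = 1 - D$ is stated to be optimal.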

3.2 Secured Biometric Authentication

Consider again a joint channel composed of two independent BACs. Assume that the total probability of error of the joint system is bounded by a value $D$. Since a secured biometric system has to make only a binary decision, the joint channel can be mapped into a BAC. The use of the mapping leads to the block-diagram in Fig. 1 with the conditional error probabilities replaced by $pr$ and by the expression (5) and with the output of the channel $Y$ replaced by an approximation to the input of the channel, $\hat{X}$. The total average probability of authentication error (the distortion) is evaluated by assuming that the penalty is zero for making a correct decision and that the penalty is one for making a wrong decision. Then the average probability of error (distortion in IT terms) is given by

$$E[d(X,\hat{X})] = (1-pr)\pi_1 w_1 + p(1-s)\pi_1(1-w_1) + (1-q)r(1-\pi_1)w_1 + (1-q)(1-s)(1-\pi_1)(1-w_1). \tag{12}$$

The dissimilarity between the decisions by the ideal and by an automatic system is measured in terms of the "rate":

$$I(X;\hat{X}) = H(\hat{X}) - H(\hat{X} \mid X),$$

where

$$H(\hat{X}) = H_b(\beta) \quad \text{and} \quad H(\hat{X} \mid X) = \pi_1 w_1 H_b(pr) + (1-\pi_1 w_1) H_b(\alpha).$$

Finding the maximum dissimilarity between the decisions of the ideal and an automatic authentication system under the condition that the total average probability of error is below $D$ is a constrained optimization problem:

$$\min_{(p,q,s,r):\;(1-pr)\pi_1 w_1 + (1-\pi_1 w_1)\alpha \le D} \big\{ H_b(\beta) - \pi_1 w_1 H_b(pr) - (1-\pi_1 w_1) H_b(\alpha) \big\}. \tag{13}$$

[Figure 5: two plots of RATE versus DISTORTION, D. Left panel legend: RATE-DISTORTION CURVE, BIOMETRIC OPERATING POINT, DEVICE FINGERPRINTING POINT, SECURED BIOMETRIC POINT.]

Figure 5. Left panel shows the rate-distortion curve for a binary verification system. The points marked with a star and a circle are the operating points of the biometric verification system and of the physical signature authentication system, respectively. The point indicated as a box is the operating point of the secured biometric system that assumes the same transition probabilities as the original unsecured biometric verification system and as the original physical signature authentication system. Right panel shows the rate-distortion function for the secured verification system.

If the unsecured biometric authentication system and the physical signature authentication system are both symmetric, $p = q$ and $r = s$, then the relationship between the average dissimilarity in decisions of ideal and automatic secured authentication systems and the average probability of error (rate-distortion function in IT terms) is straightforward to derive. For the symmetric systems with uniform priors, $\pi_1 = w_1 = 1/2$, we have $\beta = 1/4$ and $\alpha = (1-pr)/3$. The expression to be optimized is given by

$$\min_{p,r:\;\frac{1-pr}{2} \le D} \left\{ H_b\!\left(\frac{1}{4}\right) - \frac{1}{4} H_b(pr) - \frac{3}{4} H_b\!\left(\frac{1-pr}{3}\right) \right\}. \tag{14}$$

This is a function of the product $pr$ only. Therefore, with the constraint met with equality, $pr = 1-2D$ and the minimal mutual information between the input and the output of the secured biometric authentication system is

$$I(X;\hat{X}) = H_b\!\left(\frac{1}{4}\right) - \frac{1}{4} H_b(1-2D) - \frac{3}{4} H_b\!\left(\frac{2D}{3}\right).$$

The rate-distortion function corresponding to this case is shown on the right panel in Fig. 5. The left panel in Fig. 5 shows the rate-distortion function for the unsecured biometric authentication system.
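The closed-form expression above is easy to tabulate. The sketch below evaluates it under the same assumptions (symmetric systems, priors equal to 1/2, base-2 logarithms, $0 \le D \le 1/2$). For comparison it also evaluates $1 - H_b(D)$, the standard rate-distortion function of a uniform binary source under zero/one loss; using that formula for the unsecured curve is an assumption of this sketch, although it agrees with the rates quoted in Sec. 3.3. The function names are hypothetical.

```python
import math


def binary_entropy(x: float) -> float:
    """Binary entropy H_b(x) in bits, with 0*log(0) taken as 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def secured_rate(d: float) -> float:
    """Minimal mutual information after Eq. (14), with pr = 1 - 2D."""
    return (binary_entropy(0.25)
            - 0.25 * binary_entropy(1.0 - 2.0 * d)
            - 0.75 * binary_entropy(2.0 * d / 3.0))


def unsecured_rate(d: float) -> float:
    """Assumed unsecured curve: R(D) = 1 - H_b(D) for a uniform binary source."""
    return 1.0 - binary_entropy(d)


# Distortion of the secured system built from p = 0.99 and r = 0.95.
d = (1.0 - 0.99 * 0.95) / 2.0
print(f"D = {d:.5f}: secured = {secured_rate(d):.4f}, "
      f"unsecured = {unsecured_rate(d):.4f}")
```

At equal distortion the secured curve lies below the unsecured one, which is the sense in which the secured system tolerates a larger dissimilarity between ideal and automatic decisions.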

3.3 Example

To illustrate the difference between the rate-distortion curves for the case of unsecured and secured biometric authentication systems, we assume that all prior probabilities (for the unsecured biometric authentication system and for the physical signature authentication system) are equal, that is, $\pi_1 = w_1 = 1/2$. The upper bound on the average probability of error $D$ can be varied between 0 and 1. We set it to be $D_1 = 0.01$ for the unsecured biometric authentication system and $D_2 = 0.05$ for the physical signature authentication system. The optimal values of the probabilities $p = q$ and $r = s$ for this case are $1 - D_i$, $i = 1, 2$. The optimal rates for the two cases are 0.9192 and 0.7136, respectively. Using these probability values, $p = 0.99$ and $r = 0.95$, for the secured biometric authentication system gives $\beta = 0.25$ and $\alpha = 0.0198$. The average probability of error (distortion in IT terms) for this case is 0.0297, and the mutual information between the decisions by the ideal and automatic biometric authentication systems (rate in IT terms) is 0.6247. Three operating points, (0.01, 0.9192) for the unsecured biometric authentication system, (0.05, 0.7136) for the physical signature authentication system and (0.0297, 0.6247) for the secured biometric authentication system, are shown on the left panel in Fig. 5. Note that for the same value of the average error probability, $D$, the secured biometric system can sustain a larger average dissimilarity between the decisions by the ideal and an automatic biometric authentication system compared to the unsecured biometric authentication system. Note that a smaller value of the mutual information between the decisions of the ideal and automatic authentication systems corresponds to a larger (in value) average dissimilarity between the decisions by the ideal and automatic systems.

3.4 Identification System

Assume that $X$ is the decision of an ideal biometric authentication system and $\hat{X}$ is the distorted version of $X$, the decision of an automatic biometric authentication system. For an M-ary authentication system, $X$ and $\hat{X}$ take values in $\{1, 2, \ldots, M\}$. They can be interpreted as labels assigned to each class by an ideal or by an automatic (nonideal) biometric authentication system.

The distortion $d(\cdot, \cdot)$ in this case is a measure of pairwise dissimilarity between the decision made by an automatic system and the decision made by the ideal biometric authentication system. The measure of dissimilarity is zero if the automatic and the ideal decisions agree. Otherwise, the measure of dissimilarity takes value one. All pairwise dissimilarities can be placed in an $M \times M$ matrix filled with all ones except the diagonal entries, which are filled with zeros. Then the average dissimilarity (distortion in IT terms) is

$$E[d(X, \hat{X})] = \sum_{x, \hat{x} = 1}^{M} d(x, \hat{x})\, p(\hat{x} \mid x)\, p(x), \tag{15}$$

where we use $p(x)$ to denote the prior probability of the state $X = x$ of the ideal biometric authentication system and $p(\hat{x} \mid x)$ to denote the conditional probability that the automatic system is in state $\hat{X} = \hat{x}$ given that the true state (the state of the ideal system) is $X = x$. Expression (15) is the expression for the average probability of error.
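To make the reading of (15) as an error probability concrete, the following minimal sketch evaluates the double sum with the 0/1 dissimilarity for a hypothetical three-class system and compares it with one minus the probability of a correct decision. The prior and the transition matrix are illustrative values only, not data from this paper.

```python
def average_distortion(prior, channel):
    """Evaluate (15) with d(x, x_hat) = 0 if x == x_hat, else 1.

    prior[x] is p(x); channel[x][x_hat] is p(x_hat | x).
    """
    m = len(prior)
    total = 0.0
    for x in range(m):
        for x_hat in range(m):
            d = 0.0 if x == x_hat else 1.0
            total += d * channel[x][x_hat] * prior[x]
    return total

# Hypothetical prior and transition probabilities (each row sums to one).
prior = [0.5, 0.3, 0.2]
channel = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.00, 0.20, 0.80],
]

distortion = average_distortion(prior, channel)
# Probability of error computed directly as 1 - P(correct decision).
p_error = 1.0 - sum(prior[x] * channel[x][x] for x in range(len(prior)))
print(distortion, p_error)
```

Both quantities agree (0.15 for these values, up to floating-point rounding), since the off-diagonal terms of the sum are exactly the error events.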

Finding the minimal average amount of information that has to be retained in the decision $\hat{X}$ about the decision $X$ to ensure that the total average probability of error is bounded by $D$ is stated as a constrained optimization problem:

$$\min_{p(\hat{x} \mid x)\,:\,E[d(X, \hat{X})] \leq D} I(X; \hat{X}). \tag{16}$$

The solution to this problem can be easily found by invoking Fano's inequality (see Cover and Thomas [11] for details). The transition probability that solves this problem under the condition of a uniform input is given by

$$p(\hat{x} \mid x) = \begin{cases} 1 - D, & \text{when } \hat{x} = x, \\ \dfrac{D}{M - 1}, & \text{when } \hat{x} \neq x, \end{cases} \tag{17}$$

which produces the following rate-distortion function:

$$R(D) = \begin{cases} \log M - H_b(D) - D \log(M - 1), & \text{when } 0 \leq D \leq 1 - \dfrac{1}{M}, \\ 0, & \text{otherwise}, \end{cases} \tag{18}$$

where $H_b(\cdot)$ is the binary entropy.
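The relation between (17) and (18) can be checked numerically. The sketch below, which assumes base-2 logarithms and $M \geq 2$, evaluates $R(D)$ from (18) and, independently, the mutual information $I(X; \hat{X})$ of a uniform input passed through the transition probabilities in (17). The function names are illustrative.

```python
import math

def binary_entropy(d: float) -> float:
    """Binary entropy H_b(d) in bits, with H_b(0) = H_b(1) = 0."""
    if d <= 0.0 or d >= 1.0:
        return 0.0
    return -d * math.log2(d) - (1.0 - d) * math.log2(1.0 - d)

def rate_distortion(d: float, m: int) -> float:
    """R(D) from (18) in bits, for an M-ary uniform source with M >= 2."""
    if d < 0.0 or d > 1.0 - 1.0 / m:
        return 0.0
    return math.log2(m) - binary_entropy(d) - d * math.log2(m - 1)

def mutual_information(d: float, m: int) -> float:
    """I(X; X_hat) for a uniform input and the transition probabilities (17)."""
    total = 0.0
    for x in range(m):
        for x_hat in range(m):
            cond = 1.0 - d if x == x_hat else d / (m - 1)
            if cond > 0.0:
                joint = cond / m          # p(x) = 1/m
                marginal = 1.0 / m        # output is uniform by symmetry
                total += joint * math.log2(cond / marginal)
    return total

m, d = 4, 0.1
print(rate_distortion(d, m), mutual_information(d, m))
```

For $M = 4$ and $D = 0.1$ the two values agree, and $R(D)$ reaches zero at $D = 1 - 1/M$, where the automatic decision carries no information about the ideal one.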

4. A SLT FRAMEWORK TO ESTIMATE FAR AND FRR

When performing IT analysis, we assume that the conditional error probabilities, FAR and FRR, or the probability of the occurrence of zeros and ones in the ideal system are known. In practice these parameters are not known and have to be estimated from observed labeled data. Furthermore, when only a small amount of labeled observed data is available, estimating parameters such as FAR and FRR and then substituting the estimates in the expression for the capacity or for the rate-distortion function may not be desirable. A function or expression with estimated parameters in it becomes a plug-in estimate. These estimates are suboptimal. To effectively use a small amount of data, a plug-in estimate has to be replaced by a transductive estimate (or any other type of local estimate). Here we provide a brief overview of a transductive approach to estimation of parameters or functions and illustrate its principle of operation by estimating FAR and FRR.

We will first consider a number of performance and discrepancy measures that are further used to introduce the transductive approach for parameter estimation. These measures include strangeness and p-value.

4.1 Strangeness and p-values

Suppose that a small amount of labeled data is available. Assume that the data are collected from a number of classes in a set $Y$. If decisions are about binary authentication, then the labels take only two values, 1 or 0. The strangeness measures the lack of typicality in a data sample with respect to its true or putative (assumed) label and the labels for all the other data samples. Formally, the strangeness measure $\lambda_i$ is the (likelihood) ratio of the sum of the $k$ nearest neighbor ($k$-nn) distances $d$ from the same class $y$ to the sum of the $k$ nearest neighbor ($k$-nn) distances from all the other classes:

$$\lambda_i = \frac{\sum_{j=1}^{k} d_{ij}^{y}}{\sum_{j=1}^{k} d_{ij}^{Y \setminus y}}, \tag{19}$$

where the set-theoretic notation $Y \setminus y$ indicates that all classes except the class $y$ are involved in the evaluation of the distance.
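The ratio in (19) can be sketched as follows. This is a minimal illustration that assumes Euclidean distance, samples given as coordinate tuples, and a reference set that does not contain the sample being scored; none of these choices is fixed by the text, and the function name is hypothetical.

```python
import math

def strangeness(sample, label, data, labels, k=1):
    """Ratio (19): sum of the k smallest distances to samples of class `label`
    divided by the sum of the k smallest distances to all other classes.

    Assumes `sample` is not itself in `data` and that the other-class
    distances are nonzero.
    """
    same = sorted(math.dist(sample, x) for x, y in zip(data, labels) if y == label)
    other = sorted(math.dist(sample, x) for x, y in zip(data, labels) if y != label)
    return sum(same[:k]) / sum(other[:k])

# Hypothetical two-class data in the plane.
data = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels = [0, 0, 1, 1]
query = (0.0, 0.5)

s0 = strangeness(query, 0, data, labels, k=1)  # putative label 0
s1 = strangeness(query, 1, data, labels, k=1)  # putative label 1
print(s0, s1)
```

The query lies close to class 0, so its strangeness under putative label 0 is well below one, while the value under label 1 is well above one.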

The smaller the strangeness of a sample, the larger its typicality and the more probable its (putative) label $y$ is. The strangeness facilitates both feature selection (similar to Markov blankets) and variable selection (dimensionality reduction). One finds empirically that the strangeness, classification margin, sample and hypothesis margin, posteriors, and odds are all related via a monotonically non-decreasing function, with a small strangeness amounting to a large margin.

The likelihood-like definitions for strangeness are intimately related to discriminative methods. The p-values suggested next compare (rank) the strangeness values to determine the credibility and confidence in the putative classifications (labeling) made. The p-values bear resemblance to their counterparts from statistics but are not the same [12]. P-values are determined according to the relative rankings of putative authentications against each one of the classes known to the library data using the strangeness. The standard p-value construction shown below, where $l$ is the cardinality of the training set $T$, constitutes a valid randomness (deficiency) test approximation [12] for some putative label $y$ hypothesis:

$$p_y(e) = \frac{\#\{i : \lambda_i \geq \lambda_{new}^{y}\}}{l + 1}. \tag{20}$$
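A direct reading of (20) is sketched below. Whether the new sample's own strangeness is included in the count is not stated in this passage; the sketch assumes the count runs over the $l$ training values only, and the function name is hypothetical.

```python
def p_value(train_strangeness, new_strangeness):
    """Equation (20): number of training strangeness values that are at least
    as large as the strangeness of the new sample, divided by l + 1."""
    count = sum(1 for s in train_strangeness if s >= new_strangeness)
    return count / (len(train_strangeness) + 1)

# Hypothetical strangeness values for l = 4 training samples.
train = [0.2, 0.5, 0.9, 1.5]
print(p_value(train, 0.4))  # typical sample: three of four values are larger
print(p_value(train, 2.0))  # very strange sample: no training value is larger
```

A sample that is stranger than every training sample receives the smallest p-value, which is what makes rejection of all putative labels possible.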

P-values are used to assess the extent to which the biometric data supports or discredits the null hypothesis $H_0$ (for some specific authentication). When the null hypothesis is rejected for each identity class known, one declares that the test image lacks mates in the gallery and therefore the identity query is answered with "none of the above." This corresponds to forensic exclusion with rejection, characteristic of open set recognition, with authentication implemented using the Open Set Transduction Confidence Machine (TCM) with k-nearest neighbor (k-nn) [13]. TCM facilitates outlier detection, in general, and impostor detection, in particular.

4.2 Open Set Transduction Confidence Machine (TCM)

The strangeness is computed for each validation biometric sample under all its putative class labels $a$, $a \in \{1, \ldots, A\}$. Assuming $N$ validation biometric samples from each class, one derives $N$ positive strangeness values for each class $a$, and $N(A - 1)$ negative strangeness values. The positive and negative strangeness values correspond to the cases when the putative labels of the validation and training samples are the same or not, respectively. Similar labels, if recognized as such, correspond to Hits, and different labels, if mistaken as similar, correspond to False Positives. The strangeness values are ranked for all the $NA$ cases and p-values are derived accordingly.

4.3 Impostor (Intrusion or Outliers) Detection

Similar to semi-supervised learning, changing the class assignments (characteristic of impostor behavior) provides the bias needed to determine the rejection threshold required to make an authentication inference or to decline making one. Towards that end, using Open Set TCM, one re-labels the training exemplars, one at a time, with all the (impostor) putative labels except the one originally assigned to it. The peak-to-side ratio (PSR), $\mathrm{PSR} = (p_{max} - p_{min}) / p_{stdev}$, describes the characteristics of the resulting p-value distribution and determines, using cross validation, the a priori threshold used to identify (infer) impostors. The PSR values found for impostors are low because impostors do not mate and their relative strangeness is high (and p-value low). Impostors are deemed outliers and are thus rejected.
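The PSR formula above can be illustrated with a short sketch. It assumes that $p_{stdev}$ is the population standard deviation of the p-values over all putative labels (the text does not say which convention is used), and the p-value lists are hypothetical.

```python
import math
import statistics

def peak_to_side_ratio(p_values):
    """PSR = (p_max - p_min) / p_stdev over the p-values of all putative labels.

    Assumption: p_stdev is the population standard deviation. Returns 0.0
    when all p-values are identical, i.e. there is no peak at all.
    """
    spread = statistics.pstdev(p_values)
    if spread == 0.0:
        return 0.0
    return (max(p_values) - min(p_values)) / spread

# Hypothetical p-values over 20 putative labels.
genuine = [0.9] + [0.05] * 19   # one clear peak: the sample has a mate
impostor = [0.04, 0.06] * 10    # no dominant peak: the sample has no mate

psr_genuine = peak_to_side_ratio(genuine)
psr_impostor = peak_to_side_ratio(impostor)
print(psr_genuine, psr_impostor)
```

With these values the genuine sample yields the larger PSR, so a threshold placed between the two values, chosen by cross validation as described above, would reject the impostor.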

4.3.1 Implications for biometric verification systems

In practical biometric systems the conditional error probabilities such as FAR and FRR are unknown and have to be estimated using observed biometric data. The values of the estimated probabilities depend on the system design (encoder and matcher) and on the amount of data available. The estimates can be further plugged into the expression for the capacity to estimate the amount of information that the designed biometric verification system and the ideal system have in common. Thus, capacity is a measure of "goodness" of a designed system.

4.4 Estimation of FAR and FRR

FAR and FRR are estimated using the typicality of biometric samples and the rankings for each of their putative $N$ assignments. The active learning solution proposed here is similar to that used for choosing the best examples for biometric training (learning). The solution is driven by Open Set Transductive Confidence Machines (TCM) using strangeness and p-values. The p-values provide a measure of diversity and disagreement in opinion regarding the putative label of a biometric sample when it is assigned all the labels available. Let $p_i$ be the p-values obtained for a particular example $x_{n+1}$ using all possible labels $i = 1, \ldots, N$. Sort the sequence of p-values in descending order so that the first two p-values, say, $p_j$ and $p_k$, are the two highest p-values with labels $j$ and $k$, respectively. The label assigned to the unknown example is $j$ with a p-value of $p_j$. This value defines the credibility of the classification. If $p_j$ (credibility) is not high enough, the prediction is rejected under the open set recognition scenario. The difference between the two p-values can be used as a confidence value on the prediction, if one is contemplated. Note that the smaller the confidence, the larger the ambiguity regarding the proposed label and the more likely the false accepts and false rejects are. We consider three possible cases of p-values, $p_j$ and $p_k$, assuming $p_j > p_k$:

1. $p_j$ is high and $p_k$ is low. Prediction $j$ has high credibility and a high confidence value;

2. $p_j$ is high and $p_k$ is high. Prediction $j$ has high credibility but a low confidence value;

3. $p_j$ is low and $p_k$ is low. Prediction $j$ has low credibility and a low confidence value.
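The credibility, the confidence, and the three cases listed above can be sketched as follows. The cut-off `high` that separates "high" from "low" p-values is a hypothetical parameter (the text sets thresholds by cross validation), and the labels and p-values are illustrative.

```python
def top_two(p_values):
    """Return (label j, p_j, p_k): the best label and the two highest p-values.

    p_values maps each putative label to its p-value.
    """
    ranked = sorted(p_values.items(), key=lambda item: item[1], reverse=True)
    (label_j, p_j), (_label_k, p_k) = ranked[0], ranked[1]
    return label_j, p_j, p_k

def case_number(p_j, p_k, high=0.5):
    """Map (p_j, p_k) with p_j >= p_k to Case 1, 2, or 3 of the list above.

    The cut-off `high` is a hypothetical choice, not a value from the text.
    """
    if p_j >= high and p_k < high:
        return 1  # high credibility, high confidence
    if p_j >= high and p_k >= high:
        return 2  # high credibility, low confidence
    return 3      # low credibility, low confidence

# Hypothetical p-values for one test sample under three putative labels.
p_values = {"alice": 0.80, "bob": 0.10, "carol": 0.05}
label, credibility, runner_up = top_two(p_values)
confidence = credibility - runner_up
print(label, credibility, confidence, case_number(credibility, runner_up))
```

Here the sample is assigned the label with the highest p-value, its credibility is that p-value, and its confidence is the gap to the second-highest p-value, which places it in Case 1.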

High uncertainty in prediction occurs for both Case 2 and Case 3 and leads to misclassification errors, with those corresponding to Case 2 harder to avoid (because of their assumed high credibility). For both Cases 2 and 3, uncertainty of prediction occurs when $p_j \approx p_k$. The confidence $I(x_{n+1}) = p_j - p_k$ indicates the quality of authentication information possessed by the biometric samples. As $I(x_{n+1})$ approaches 0, we become more uncertain about classifying the example, and errors become more likely. One tabulates for all biometric samples their possible errors weighted according to both their credibility and contextual confidence. The larger the credibility, the larger the weight; the smaller the confidence, the larger the weight too. Thresholds similar to those derived for Open Set TCM are set, confusion matrices accrue "errors" over $NA$ putative authentications, and FAR and FRR are estimated accordingly. Further extensions can incorporate some error analysis according to the diversity of the biometric population encountered and the characteristic of pattern specific error inhomogeneities (PSEI).
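The final tabulation step can be sketched in its simplest, unweighted form. The weighting of errors by credibility and confidence described above is not specified in enough detail here to reproduce, so the sketch below only counts threshold decisions; the trial data, the threshold, and the function name are hypothetical.

```python
def estimate_far_frr(trials, threshold):
    """Unweighted FAR and FRR from (credibility, is_genuine) pairs.

    A putative authentication is accepted when its credibility is at least
    `threshold`. FAR is the accepted fraction of impostor trials; FRR is the
    rejected fraction of genuine trials.
    """
    genuine = [c for c, is_genuine in trials if is_genuine]
    impostor = [c for c, is_genuine in trials if not is_genuine]
    far = sum(1 for c in impostor if c >= threshold) / len(impostor)
    frr = sum(1 for c in genuine if c < threshold) / len(genuine)
    return far, frr

# Hypothetical credibility values for genuine and impostor putative labels.
trials = [
    (0.90, True), (0.70, True), (0.20, True), (0.10, True),
    (0.60, False), (0.30, False), (0.10, False), (0.05, False),
]
far, frr = estimate_far_frr(trials, threshold=0.5)
print(far, frr)
```

For these values one of four impostor trials is accepted and two of four genuine trials are rejected. A weighted variant would replace the unit counts with weights derived from credibility and confidence.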

The scheme proposed above, similar to Query by Transduction [12], has solid theoretical underpinnings with p-values mapped to posterior probabilities. This is based on (1) the interpretation of the Kullback-Leibler (KL) divergence as the expected discrimination information between the null and alternative statistical hypotheses, and (2) the connections between KL divergence and Shannon information [14]. The scheme proposed above for FAR and FRR estimation can be expanded to both labeled and unlabeled biometric samples if label propagation using spectral clustering precedes the computation of confidence values and the (weighted) tabulation of confusion matrices.

5. SUMMARY

Two IT frameworks to analyze performance of biometric authentication systems at the system level were introduced. A joint secured biometric channel was designed by combining the biometric authentication system and the device physical signature authentication system into a single BAC. Its capacity and rate-distortion function were evaluated. Since prior probabilities and conditional error probabilities characterizing biometric and physical signature authentication systems are not available in practice, a transductive method was suggested to estimate these parameters by using a small amount of training data. These parameters can then be used to estimate the capacity and rate-distortion function of real biometric authentication systems.

REFERENCES

[1] Westover, M. B. and O'Sullivan, J. A., "Achievable rates for pattern recognition," IEEE Transactions on Information Theory 54(1), 299-320 (2008).
[2] Willems, F., Kalker, T., Goseling, J., and Linnartz, J.-P., "On the capacity of a biometrical identification system," Proc. of IEEE Int. Symp. on Information Theory, 82 (2003).
[3] Schmid, N. A. and O'Sullivan, J. A., "Performance prediction methodology for biometric system using large deviations approach," IEEE Trans. on Signal Processing: Supplement on Secure Media 52(10), 3036-3045 (2004).
[4] Schmid, N. A. and Nicolo, F., "Recognition capacity of biometric systems under global PCA- and ICA-based encoding," IEEE Trans. on Information Forensics and Security 3(3), 512-528 (2008).
[5] Amblard, P. O., Michel, O. J. J., and Morfu, S., "Revisiting the asymmetric binary channel: joint noise-enhanced detection and information transmission through threshold devices," in [Noise in Complex Systems and Stochastic Dynamics III], Proc. SPIE 5845, 50-60 (2005).
[6] Wechsler, H., [Reliable Face Recognition Methods: System Design, Implementation and Evaluation], Springer US, New York (2007, Ch. 14).
[7] Fridrich, J., "Digital image forensics," IEEE Signal Processing Magazine 26(2), 26-37 (2009).
[8] Filler, T., Fridrich, J., and Goljan, M., "Using sensor pattern noise for camera model identification," in [Proc. IEEE ICIP], 1296-1299 (2008).
[9] O'Sullivan, J. A. and Schmid, N. A., "Large deviations for performance analysis of signature authentication," in [IEEE Int. Symp. on Information Theory], 176 (1997).
[10] O'Sullivan, J. A. and Schmid, N. A., "Performance analysis of physical signature authentication," IEEE Trans. on Inform. Theory 47(7), 3034-3039 (2001).
[11] Cover, T. M. and Thomas, J. A., [Elements of Information Theory], Wiley-Interscience, Hoboken, New Jersey (2006, second edition).
[12] Ho, S. S. and Wechsler, H., "Query by transduction," IEEE Trans. on Pattern Analysis and Machine Intelligence 30(9), 1557-1571 (2008).
[13] Li, F. and Wechsler, H., "Open set face recognition using transduction," IEEE Trans. on Pattern Analysis and Machine Intelligence 27(11), 1686-1697 (2005).
[14] Ho, S. S. and Wechsler, H., "A martingale framework for detecting changes in the data generating model in data streams," IEEE Trans. on Pattern Analysis and Machine Intelligence (2010, to appear).
