# Linear discriminant analysis (LDA)

Katarina Berta

bk113255m@student.etf.rs

Introduction

Fisher's Linear Discriminant Analysis

Paper from 1936.

Statistical technique for classification

LDA = two classes

MDA = multiple classes

Used in statistics, pattern recognition, and machine learning

Purpose

Discriminant analysis classifies objects into two or more groups according to a linear combination of features.

Feature selection: which set of features can best determine the group membership of an object? (dimension reduction)

Classification: what is the classification rule or model that best separates those groups?

Method (1)

[Figure: Passed and Not passed samples, good separation]

Method (2)

Maximize the between-class scatter: difference of mean values (m1 - m2)

Minimize the within-class scatter: covariance

[Figure labels: Min, Min, Max]
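
The two goals above are usually combined into a single ratio, Fisher's criterion. The sketch below states it for two classes; the symbols w (projection direction) and S_W (within-class scatter matrix) are assumptions introduced here for illustration and do not appear on the slides.

```latex
% Between-class separation of the projected means over
% within-class scatter of the projected data
J(w) = \frac{\left(w^{T}(m_1 - m_2)\right)^{2}}{w^{T} S_W\, w},
\qquad
w^{*} \propto S_W^{-1}(m_1 - m_2)
```

Maximizing J(w) makes the numerator (difference of projected means) large and the denominator (projected within-class scatter) small, and the maximizing direction has the same form as W = S*(m1 - m2) used in the worked example.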

Formula

$\Sigma_{y=0} = \Sigma_{y=1} = \Sigma$ (equal covariances)

Bayes' theorem

Idea: x = object; i, j = classes (groups)

Derivation: probability density functions, normally distributed; QDA (quadratic discriminant analysis)

[Formula labels: mean value, covariance, FLD]
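
As a sketch of how the ingredients listed above fit together, the following derivation assumes Gaussian class densities with means m_i, m_j and a shared covariance Σ, and uses P(i) for the prior of class i. It is the standard textbook route, not necessarily the exact formula shown on the original slide.

```latex
% Assign x to class i rather than class j when P(i | x) > P(j | x).
% Bayes' theorem: P(i | x) \propto p(x | i) P(i).
% With p(x | i) = N(x; m_i, \Sigma) and a shared \Sigma,
% the quadratic terms x^T \Sigma^{-1} x cancel in the log-ratio:
\ln\frac{P(i \mid x)}{P(j \mid x)}
  = x^{T}\Sigma^{-1}(m_i - m_j)
  - \tfrac{1}{2}(m_i + m_j)^{T}\Sigma^{-1}(m_i - m_j)
  + \ln\frac{P(i)}{P(j)}
```

The right-hand side is linear in x. Without the equal-covariance assumption the quadratic terms do not cancel, which is the QDA case.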

Example

Factory for high quality chip rings

Training set:

| Curvature | Diameter | Quality Control Result |
|---|---|---|
| 2.95 | 6.63 | Passed |
| 2.53 | 7.79 | Passed |
| 3.57 | 5.65 | Passed |
| 3.16 | 5.47 | Passed |
| 2.58 | 4.46 | Not Passed |
| 2.16 | 6.22 | Not Passed |
| 3.27 | 3.52 | Not Passed |

Normalization of data

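Training data:

| X1 | X2 | class |
|---|---|---|
| 2.95 | 6.63 | 1 |
| 2.53 | 7.79 | 1 |
| 3.57 | 5.65 | 1 |
| 3.16 | 5.47 | 1 |
| 2.58 | 4.46 | 0 |
| 2.16 | 6.22 | 0 |
| 3.27 | 3.52 | 0 |

Average:

| X1 | X2 |
|---|---|
| 2.888 | 5.676 |

Mean corrected data:

| X1o | X2o | class |
|---|---|---|
| 0.060 | 0.951 | 1 |
| -0.357 | 2.109 | 1 |
| 0.679 | -0.025 | 1 |
| 0.269 | -0.209 | 1 |
| -0.305 | -1.218 | 0 |
| -0.732 | 0.547 | 0 |
| 0.386 | -2.155 | 0 |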
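
The normalization step can be sketched in a few lines of NumPy. The function name `mean_correct` is a hypothetical helper, not something from the slides, and because the slide values are printed rounded, the output may differ from them slightly in the last decimal.

```python
import numpy as np

# Training set from the example: columns are curvature (X1) and diameter (X2).
X = np.array([
    [2.95, 6.63], [2.53, 7.79], [3.57, 5.65], [3.16, 5.47],  # class 1 (Passed)
    [2.58, 4.46], [2.16, 6.22], [3.27, 3.52],                # class 0 (Not Passed)
])

def mean_correct(X):
    """Return the column averages and the data with those averages subtracted."""
    average = X.mean(axis=0)
    return average, X - average

average, X_o = mean_correct(X)
print(np.round(average, 3))  # overall average of X1 and X2
print(np.round(X_o, 3))      # mean corrected data (X1o, X2o)
```

After this step every column of `X_o` averages to zero, which is what the covariance step relies on.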

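Covariance

Covariance for class i, from the mean corrected data $x_i^o$ of the $n_i$ objects in that class:

$$C_i = \frac{(x_i^o)^T x_i^o}{n_i}$$

Covariance class 1 (C1):

$$C_1 = \begin{bmatrix} 0.166 & -0.192 \\ -0.192 & 1.349 \end{bmatrix}$$

Covariance class 2 (C2):

$$C_2 = \begin{bmatrix} 0.259 & -0.286 \\ -0.286 & 2.142 \end{bmatrix}$$

One entry of the covariance matrix C, with N the total number of objects:

$$C(r,s) = \frac{1}{N}\sum_i n_i\, C_i(r,s)$$

Covariance matrix C:

$$C = \begin{bmatrix} 0.206 & -0.233 \\ -0.233 & 1.689 \end{bmatrix}$$

Inverse of the covariance matrix C (S):

$$S = C^{-1} \approx \begin{bmatrix} 5.75 & 0.79 \\ 0.79 & 0.70 \end{bmatrix}$$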
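
A minimal sketch of this step, assuming the convention that matches the numbers on the slide: each class covariance is computed from the globally mean corrected data and divided by the class size n_i, which is not the same as the sample covariance returned by `np.cov`. The function name `pooled_covariance` is hypothetical.

```python
import numpy as np

X = np.array([
    [2.95, 6.63], [2.53, 7.79], [3.57, 5.65], [3.16, 5.47],
    [2.58, 4.46], [2.16, 6.22], [3.27, 3.52],
])
y = np.array([1, 1, 1, 1, 0, 0, 0])  # 1 = Passed, 0 = Not Passed

def pooled_covariance(X, y):
    """Return per-class covariances and their weighted average C."""
    X_o = X - X.mean(axis=0)                 # subtract the overall average
    n = len(X)
    class_cov = {}
    C = np.zeros((X.shape[1], X.shape[1]))
    for label in np.unique(y):
        rows = X_o[y == label]
        cov_i = rows.T @ rows / len(rows)    # covariance for class i
        class_cov[int(label)] = cov_i
        C += len(rows) / n * cov_i           # weight by the share of objects in the class
    return class_cov, C

class_cov, C = pooled_covariance(X, y)
S = np.linalg.inv(C)                         # inverse covariance matrix
print(np.round(C, 3))
print(np.round(S, 3))
```

Because the inputs here are the rounded training values, the printed matrices may differ from the slide in the second or third decimal.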

Mean values

| | N | P(i) | m(X1) | m(X2) | |
|---|---|---|---|---|---|
| Class 1 | 4 | 0.571 | 3.05 | 6.38 | m1 |
| Class 2 | 3 | 0.429 | 2.67 | 4.73 | m2 |
| Sum | 7 | | 5.72 | 11.12 | m1 + m2 |
| | | | 0.38 | 1.65 | m1 - m2 |

N: number of objects

P(i): prior probability

m1: mean value vector of class 1 (m(x1), m(x2))

m2: mean value vector of class 2 (m(x1), m(x2))

S: inverse covariance matrix, $S \approx \begin{bmatrix} 5.75 & 0.79 \\ 0.79 & 0.70 \end{bmatrix}$

$$W = S\,(m_1 - m_2) = \begin{bmatrix} 3.487916 \\ 1.456612 \end{bmatrix}$$

$$W_0 = \ln\frac{P(1)}{P(2)} - \frac{1}{2}(m_1 + m_2)^T W = -17.7856$$
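
The computation of W and W0 can be sketched end to end as follows. The function name `fit_two_class_lda` is hypothetical, and the covariance convention (global mean correction, division by class size) is an assumption chosen to match the slides. Since the code starts from the rounded training values, its results agree with the slide only to about two decimals.

```python
import numpy as np

X = np.array([
    [2.95, 6.63], [2.53, 7.79], [3.57, 5.65], [3.16, 5.47],
    [2.58, 4.46], [2.16, 6.22], [3.27, 3.52],
])
y = np.array([1, 1, 1, 1, 0, 0, 0])  # 1 = class 1 (Passed), 0 = class 2 (Not Passed)

def fit_two_class_lda(X, y):
    """Return (W, W0) so that score = x @ W + W0 is positive for class 1."""
    X_o = X - X.mean(axis=0)                      # mean corrected data
    n = len(X)
    C = np.zeros((X.shape[1], X.shape[1]))
    for label in (1, 0):
        rows = X_o[y == label]
        C += len(rows) / n * (rows.T @ rows / len(rows))  # weighted class covariance
    S = np.linalg.inv(C)                          # inverse covariance matrix
    m1, m2 = X[y == 1].mean(axis=0), X[y == 0].mean(axis=0)
    p1, p2 = np.mean(y == 1), np.mean(y == 0)     # prior probabilities P(1), P(2)
    W = S @ (m1 - m2)
    W0 = np.log(p1 / p2) - 0.5 * (m1 + m2) @ W
    return W, W0

W, W0 = fit_two_class_lda(X, y)
print(np.round(W, 3), round(float(W0), 3))
```

The term with the priors shifts the decision threshold toward the less frequent class; with equal priors it vanishes and the threshold sits midway between the projected class means.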

Result

$$\text{score} = X\,W + W_0, \qquad W = \begin{bmatrix} 3.487916 \\ 1.456612 \end{bmatrix}$$

| X1 | X2 | score | class |
|---|---|---|---|
| 2.95 | 6.63 | 2.149 | 1 |
| 2.53 | 7.79 | 2.380 | 1 |
| 3.57 | 5.65 | 2.887 | 1 |
| 3.16 | 5.47 | 1.189 | 1 |
| 2.58 | 4.46 | -2.285 | 0 |
| 2.16 | 6.22 | -1.203 | 0 |
| 3.27 | 3.52 | -1.240 | 0 |
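
The score column is a single matrix-vector product. The sketch below plugs in W and W0 exactly as printed on the slides; the variable name `slide_scores` is introduced here only for comparison. Because the printed W, W0 and training values are rounded, the computed scores agree with the table to roughly two decimals rather than exactly.

```python
import numpy as np

W = np.array([3.487916, 1.456612])   # weights as printed on the slide
W0 = -17.7856                        # offset as printed on the slide

X = np.array([
    [2.95, 6.63], [2.53, 7.79], [3.57, 5.65], [3.16, 5.47],
    [2.58, 4.46], [2.16, 6.22], [3.27, 3.52],
])

scores = X @ W + W0                      # one score per training object
classes = np.where(scores > 0, 1, 0)     # positive score -> class 1, otherwise 0

slide_scores = np.array([2.149, 2.380, 2.887, 1.189, -2.285, -1.203, -1.240])
print(np.round(scores, 3))
print(classes)
print(np.round(scores - slide_scores, 3))  # small rounding differences
```

All seven training objects land on the correct side of zero, so the rule reproduces the class column of the table.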

[Figure legend: Not Passed, Passed]

Prediction

New chip: curvature = 2.81, diameter = 5.46

score = X*W + W0, with W = S*(m1 - m2)

If score > 0 then class 1, else class 2

score = -0.036 => class 2 (Not Passed, labelled 0 in the tables)

Prediction: will not pass

Prediction correct!
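
The decision rule for a new object can be sketched as a small function. `classify` is a hypothetical helper, and W and W0 are taken as printed on the slides, so the score comes out close to, but not exactly, the -0.036 shown above.

```python
import numpy as np

W = np.array([3.487916, 1.456612])   # weights as printed on the slide
W0 = -17.7856                        # offset as printed on the slide

def classify(x, W, W0):
    """Return the discriminant score and the class chosen by the sign rule."""
    score = float(np.dot(x, W) + W0)
    label = "class 1 (Passed)" if score > 0 else "class 2 (Not Passed)"
    return score, label

new_chip = np.array([2.81, 5.46])    # curvature, diameter
score, label = classify(new_chip, W, W0)
print(round(score, 3), label)
```

The score is negative but very close to zero, so this chip lies near the decision boundary; small changes in the measurements or in the fitted weights could flip the predicted class.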

[Figure legend: Not Passed, Passed]

Pros & Cons

Cons

Old algorithm

Newer algorithms give much better prediction

Pros

Simple

Fast and portable

Still beats some algorithms (e.g. logistic regression) when its assumptions are met

Good to use when beginning a project

Conclusion

Fisherface is one of the best algorithms for face recognition

Often used for dimension reduction

Good for the beginning of data mining projects

Though old, still worth trying

