Clustering Algorithms

Johannes Blömer

WS 2012/13

Introduction

Clustering: techniques for data management and analysis that classify/group a given set of objects into categories/subgroups, or clusters.

Clusters are homogeneous subgroups of objects such that the similarity between objects in one subgroup is larger than the similarity between objects from different subgroups.

Goals

1. find structures in a large set of objects/data

2. simplify large data sets

Example (figures omitted)

How do we measure similarity/dissimilarity of objects?

How do we measure quality of clustering?

Application areas

1. information retrieval

2. data mining

3. computer graphics

4. data compression

5. bioinformatics

6. machine learning

7. statistics

8. pattern recognition

Goals of this course

- different models for clustering

- many important clustering heuristics, including agglomerative clustering, Lloyd's algorithm, and the EM algorithm

- the limitations of these heuristics

- improvements to these heuristics

- various theoretical results about clustering, including NP-hardness results and approximation algorithms

- general techniques to improve the efficiency of heuristics and approximation algorithms, e.g. dimension reduction techniques.

Organization

Information about this course

http://www.cs.uni-paderborn.de/fachgebiete/ag-bloemer/lehre/2012/ws/clusteringalgorithms.html

Here you find

- announcements

- handouts

- slides

- literature

- lecture notes (will be written and appear as the course progresses)

Prerequisites

- design and analysis of algorithms

- basic complexity theory

- probability theory and stochastics

- some linear algebra

Tutorials

There are two tutorials:

- Thursday, 1-2 p.m., room F2.211 (new)

- Friday, 1-2 p.m., room F1.110

Objects

- objects described by d different features

- features continuous or binary

- objects described as elements of R^d or {0,1}^d

- objects from M ⊆ R^d or M ⊆ {0,1}^d

Distance functions

Definition 1.1
D: M × M → R is called a distance function, if for all x, y, z ∈ M

- D(x, y) = D(y, x) (symmetry)

- D(x, y) ≥ 0 (positivity).

D is called a metric, if in addition,

- D(x, y) = 0 ⇔ x = y (reflexivity)

- D(x, z) ≤ D(x, y) + D(y, z) (triangle inequality)
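The axioms above can be checked numerically on a finite sample of points. A minimal Python sketch (not part of the course material; the helper name `check_metric_axioms` is made up for illustration):

```python
import itertools

def check_metric_axioms(D, points, tol=1e-12):
    """Check symmetry, positivity, D(x,x) = 0, and the triangle
    inequality for a candidate distance D on a finite sample."""
    for x, y in itertools.product(points, repeat=2):
        assert abs(D(x, y) - D(y, x)) <= tol        # symmetry
        assert D(x, y) >= -tol                      # positivity
    for x in points:
        assert abs(D(x, x)) <= tol                  # one half of reflexivity
    for x, y, z in itertools.product(points, repeat=3):
        assert D(x, z) <= D(x, y) + D(y, z) + tol   # triangle inequality
    return True

# The l_1 (Manhattan) distance on R^2 passes all checks on this sample.
l1 = lambda x, y: sum(abs(a - b) for a, b in zip(x, y))
print(check_metric_axioms(l1, [(0, 0), (1, 2), (-3, 1)]))  # → True
```

Note that such a check on a finite sample can only falsify the axioms, never prove them for all of M.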

Examples

Example 1.2 (Euclidean distance)
M = R^d,

D_{l_2}(x, y) = ‖x − y‖_2 = ( Σ_{i=1}^d |x_i − y_i|^2 )^{1/2},

where x = (x_1, ..., x_d) and y = (y_1, ..., y_d).

Examples

Example 1.3 (Minkowski distances, l_p-norms)
M = R^d, p ≥ 1,

D_{l_p}(x, y) = ‖x − y‖_p = ( Σ_{i=1}^d |x_i − y_i|^p )^{1/p}.

Example 1.4 (maximum distance)
M = R^d,

D_{l_∞}(x, y) = ‖x − y‖_∞ = max_{1 ≤ i ≤ d} |x_i − y_i|.
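Examples 1.2-1.4 are all instances of one family: the maximum distance is the limit of the l_p distances as p → ∞. A small Python sketch (illustrative only):

```python
def minkowski(x, y, p):
    """l_p distance: (sum_i |x_i - y_i|^p)^(1/p) for p >= 1;
    p = float('inf') gives the maximum distance max_i |x_i - y_i|."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if p == float('inf'):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

x, y = (1.0, 2.0, 3.0), (4.0, 6.0, 3.0)
print(minkowski(x, y, 2))             # Euclidean: sqrt(9 + 16) = 5.0
print(minkowski(x, y, 1))             # l_1: 3 + 4 = 7.0
print(minkowski(x, y, float('inf')))  # maximum distance: 4.0
```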

Examples

Example 1.5 (Pearson correlation)
M = R^d,

D_Pearson(x, y) = (1/2) · ( 1 − Σ_{i=1}^d (x_i − x̄)(y_i − ȳ) / √( Σ_{i=1}^d (x_i − x̄)^2 · Σ_{i=1}^d (y_i − ȳ)^2 ) ),

where x̄ = (1/d) Σ x_i and ȳ = (1/d) Σ y_i.
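The factor 1/2 maps the correlation coefficient from [−1, 1] to a distance in [0, 1]: perfectly correlated vectors get distance 0, anti-correlated vectors get distance 1. A Python sketch of Example 1.5 (illustrative only):

```python
from math import sqrt

def pearson_distance(x, y):
    """D_Pearson(x, y) = (1 - r(x, y)) / 2, where r is the
    Pearson correlation coefficient of x and y."""
    d = len(x)
    mx, my = sum(x) / d, sum(y) / d
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sqrt(sum((xi - mx) ** 2 for xi in x)
               * sum((yi - my) ** 2 for yi in y))
    return 0.5 * (1 - num / den)

print(pearson_distance((1, 2, 3), (2, 4, 6)))  # perfectly correlated → 0.0
print(pearson_distance((1, 2, 3), (3, 2, 1)))  # anti-correlated → 1.0
```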

Examples

Example 1.6 (Mahalanobis divergence)
A ∈ R^{d×d} positive definite, i.e. xᵀAx > 0 for x ≠ 0; M = R^d,

D_A(x, y) = (x − y)ᵀ A (x − y)
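With A equal to the identity matrix, the Mahalanobis divergence reduces to the squared Euclidean distance; other positive definite A reweight the coordinate directions. A Python sketch of Example 1.6 (illustrative only, A given as a list of rows):

```python
def mahalanobis(x, y, A):
    """D_A(x, y) = (x - y)^T A (x - y) for positive definite A."""
    diff = [a - b for a, b in zip(x, y)]
    # A (x - y), computed row by row
    Adiff = [sum(A[i][j] * diff[j] for j in range(len(diff)))
             for i in range(len(diff))]
    return sum(d * v for d, v in zip(diff, Adiff))

identity = [[1, 0], [0, 1]]
print(mahalanobis((0, 0), (3, 4), identity))  # squared Euclidean: 25
A = [[2, 0], [0, 1]]                          # weights the first coordinate
print(mahalanobis((0, 0), (3, 4), A))         # 2*9 + 1*16 = 34
```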

Example 1.7 (Itakura-Saito divergence)
M = R^d_{≥0},

D_IS(x, y) = Σ ( x_i/y_i − ln(x_i/y_i) − 1 ),

where by definition 0 ln(0) = 0.

Examples

Example 1.8 (Kullback-Leibler divergence)
M = S^d := { x ∈ R^d : ∀i: x_i ≥ 0, Σ x_i = 1 },

D_KLD(x, y) = Σ x_i ln(x_i/y_i),

where by definition 0 ln(0) = 0.

Example 1.9 (generalized KLD)
M = R^d_{≥0},

D_KLD(x, y) = Σ ( x_i ln(x_i/y_i) − (x_i − y_i) ).
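On probability vectors (Σ x_i = Σ y_i = 1) the correction term Σ (x_i − y_i) vanishes, so the generalized KLD of Example 1.9 coincides with D_KLD of Example 1.8. A Python sketch with the 0 ln(0) := 0 convention (illustrative only):

```python
from math import log

def gen_kld(x, y):
    """Generalized KLD: sum_i x_i ln(x_i / y_i) - (x_i - y_i),
    with the convention 0 ln(0) := 0."""
    total = 0.0
    for xi, yi in zip(x, y):
        if xi > 0:                     # skip the 0 ln(0) = 0 term
            total += xi * log(xi / yi)
        total -= xi - yi
    return total

p, q = (0.5, 0.5), (0.9, 0.1)
print(gen_kld(p, p))   # divergence of a point from itself: 0.0
print(gen_kld(p, q))   # positive for p != q
```

Note that the generalized KLD is not symmetric in general, so it is a divergence rather than a distance function in the sense of Definition 1.1.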

Similarity functions

Definition 1.10
S: M × M → R is called a similarity function, if for all x, y, z ∈ M

- S(x, y) = S(y, x) (symmetry)

- 0 ≤ S(x, y) ≤ 1 (positivity).

S is called a metric, if in addition,

- S(x, y) = 1 ⇔ x = y (reflexivity)

- S(x, y) · S(y, z) ≤ ( S(x, y) + S(y, z) ) · S(x, z) (triangle inequality)

Examples

Example 1.11 (cosine similarity)
M = R^d,

S_CS(x, y) = xᵀy / (‖x‖ · ‖y‖).
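Cosine similarity measures only the angle between two vectors and ignores their lengths. A Python sketch of Example 1.11 (illustrative only):

```python
from math import sqrt

def cosine_similarity(x, y):
    """S_CS(x, y) = x^T y / (||x|| ||y||)."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y)))

print(cosine_similarity((1, 0), (0, 1)))  # orthogonal vectors → 0.0
print(cosine_similarity((1, 2), (2, 4)))  # parallel vectors → similarity 1
```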

Similarity for binary features

Let x, y ∈ {0,1}^d, then

n_{b,b'}(x, y) := |{ 1 ≤ i ≤ d : x_i = b, y_i = b' }|

and for w ∈ R_{>0}

S_w(x, y) := ( n_00(x, y) + n_11(x, y) ) / ( n_00(x, y) + n_11(x, y) + w · (n_01(x, y) + n_10(x, y)) ).

Popular: w = 1, 2, 1/2.

Example 1.12 (matching coefficient)
w = 1: S_mc(x, y) = ( n_00(x, y) + n_11(x, y) ) / d.

Similarity for binary features

S_w(x, y) := n_11(x, y) / ( n_11(x, y) + w · (n_01(x, y) + n_10(x, y)) )

Popular: w = 1, 2, 1/2.

Example 1.13 (Jaccard coefficient)
w = 1: S_Jaccard(x, y) = n_11(x, y) / ( n_11(x, y) + n_01(x, y) + n_10(x, y) ).
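The difference between Examples 1.12 and 1.13 is whether shared zeros count as agreement: the matching coefficient rewards positions where both vectors are 0, the Jaccard coefficient ignores them. A Python sketch of both (illustrative only):

```python
def counts(x, y):
    """n_{b,b'}(x, y): number of positions i with x_i = b and y_i = b'."""
    n = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for xi, yi in zip(x, y):
        n[(xi, yi)] += 1
    return n

def matching_coefficient(x, y):
    """w = 1: (n_00 + n_11) / d."""
    n = counts(x, y)
    return (n[(0, 0)] + n[(1, 1)]) / len(x)

def jaccard(x, y):
    """w = 1: n_11 / (n_11 + n_01 + n_10); shared zeros are ignored."""
    n = counts(x, y)
    return n[(1, 1)] / (n[(1, 1)] + n[(0, 1)] + n[(1, 0)])

x = (1, 1, 0, 0, 0)
y = (1, 0, 0, 0, 1)
print(matching_coefficient(x, y))  # (2 + 1) / 5 = 0.6
print(jaccard(x, y))               # 1 / (1 + 1 + 1) ≈ 0.333
```

For sparse binary data (e.g. word occurrence vectors of documents), the Jaccard coefficient is usually preferred, since otherwise the many shared zeros dominate the score.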
