# Clustering II

AI and Robotics

Nov 25, 2013

Clustering II

CMPUT 466/551

Nilanjan Ray

Mean-shift Clustering

Will show slides from:
http://www.wisdom.weizmann.ac.il/~deniss/vision_spring04/files/mean_shift/mean_shift.ppt

Spectral Clustering

Let’s visit a serious issue with K-means.

K-means tries to find compact, hyper-ellipsoid-like structures. What if the clusters are compact but not ellipsoid-like? K-means fails.

What can we do? Spectral clustering can be a remedy here.

[Figure: 2-D scatter plot of non-convex clusters; both axes range from -150 to 150]
Basic Spectral Clustering

Form a similarity matrix $w_{ij}$ for all pairs of observations $i$, $j$.

This defines a dense graph with the data points as the vertex set. Edge strength is given by $w_{ij}$, the similarity between the $i$th and $j$th observations.

Clustering can be conceived as a partitioning of the graph into connected components, where within a component the edge weights are large, whereas across components they are low.

Basic Spectral Clustering…

Form the Laplacian of this graph:

$$L = G - W,$$

where $G$ is a diagonal matrix with entries $g_i = \sum_{j=1}^{N} w_{ij}$.

$L$ is positive semi-definite and has a constant eigenvector (all 1’s) with zero eigenvalue.

Find the $m$ smallest eigenvectors $Z = [z_1\; z_2\; \cdots\; z_m]$ of $L$, ignoring the constant eigenvector.

Cluster (say by K-means) the $N$ observations with features as the rows of matrix $Z$.
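The steps above can be sketched in Python with NumPy (a minimal sketch; the function name `spectral_embed` and the choice of `numpy.linalg.eigh` are mine, not from the slides):

```python
import numpy as np

def spectral_embed(W, m):
    """Embed N observations using the m smallest non-constant
    eigenvectors of the graph Laplacian L = G - W."""
    G = np.diag(W.sum(axis=1))      # diagonal degree matrix, g_i = sum_j w_ij
    L = G - W                       # graph Laplacian
    vals, vecs = np.linalg.eigh(L)  # eigenvalues returned in ascending order
    # Skip the constant eigenvector (eigenvalue 0); keep the next m.
    return vecs[:, 1:m + 1]
```

The rows of the returned matrix $Z$ are then clustered, e.g. with K-means.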
Why Spectral Clustering Works

The graph cut cost for a label vector $f$:

$$f^T L f = \sum_{i=1}^{N} g_i f_i^2 - \sum_{i=1}^{N}\sum_{j=1}^{N} f_i f_j w_{ij} = \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}\,(f_i - f_j)^2.$$

So, a small value of $f^T L f$ will be obtained if pairs of points with large adjacencies $w_{ij}$ get the same labels.

In reality, we only have weak and strong edges, so we look for eigenvectors with small eigenvalues.
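The identity relating the quadratic form to the pairwise cut cost is easy to verify numerically (a quick sanity check, not from the slides):

```python
import numpy as np

# Numerical check of f^T L f = (1/2) * sum_ij w_ij (f_i - f_j)^2
rng = np.random.default_rng(0)
A = rng.random((5, 5))
W = (A + A.T) / 2                   # symmetric similarity matrix
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(axis=1)) - W      # Laplacian L = G - W
f = rng.random(5)
lhs = f @ L @ f
rhs = 0.5 * sum(W[i, j] * (f[i] - f[j]) ** 2
                for i in range(5) for j in range(5))
# lhs and rhs agree up to floating-point error
```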
The constant eigenvector corresponding to the 0 eigenvalue is actually a trivial solution: it puts all $N$ observations into a single cluster.

If a graph has $K$ connected components, the nodes of the graph can be reordered so that $L$ is block diagonal with $K$ diagonal blocks, and $L$ has a zero eigenvalue with multiplicity $K$, one for each connected component. The corresponding eigenvectors are indicator variables identifying these connected components.

Choose eigenvectors corresponding to small eigenvalues and cluster them into $K$ classes.

Insight 1: a small cut cost $f^T L f$ means the labels in $f$ respect the strong edges.

Insight 2: each connected component contributes a zero eigenvalue whose eigenvector indicates its members.

Combining Insights 1 and 2: cluster using the eigenvectors of $L$ with the smallest eigenvalues.

A Tiny Example: A Perfect World

W = [1.0000  0.5000  0       0
     0.5000  1.0000  0       0
     0       0       1.0000  0.8000
     0       0       0.8000  1.0000];

We observe two classes, each with 2 observations. W is a perfect block-diagonal matrix here.

Laplacian L:

L = [ 0.5000  -0.5000   0        0
     -0.5000   0.5000   0        0
      0        0        0.8000  -0.8000
      0        0       -0.8000   0.8000];

Eigenvalues of L: 0, 0, 1, 1.6

Eigenvectors corresponding to the two 0 eigenvalues:
[-0.7071  -0.7071  0  0] and [0  0  -0.7071  -0.7071]
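These numbers are easy to reproduce (a NumPy sketch of the MATLAB computation shown in the slide):

```python
import numpy as np

# The "perfect world" block-diagonal similarity matrix from the slide
W = np.array([[1.0, 0.5, 0.0, 0.0],
              [0.5, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.8],
              [0.0, 0.0, 0.8, 1.0]])
L = np.diag(W.sum(axis=1)) - W   # Laplacian of the similarity graph
vals, vecs = np.linalg.eigh(L)   # ascending eigenvalues: 0, 0, 1, 1.6
```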

The Real World Tiny Example

W = [1.0000  0.5000  0.0500  0.1000
     0.5000  1.0000  0.0800  0.0400
     0.0500  0.0800  1.0000  0.8000
     0.1000  0.0400  0.8000  1.0000]

L = [ 0.6500  -0.5000  -0.0500  -0.1000
     -0.5000   0.6200  -0.0800  -0.0400
     -0.0500  -0.0800   0.9300  -0.8000
     -0.1000  -0.0400  -0.8000   0.9400]

[V,D] = eig(L)

Eigenvectors:
V = [0.5000   0.4827  -0.7169   0.0557
     0.5000   0.5170   0.6930  -0.0498
     0.5000  -0.5027   0.0648   0.7022
     0.5000  -0.4970  -0.0409  -0.7081]

Eigenvalues:
D = diag(0.0000, 0.2695, 1.1321, 1.7384)

Notice that eigenvalue 0 has a constant eigenvector.

The next eigenvalue, 0.2695, has an eigenvector that clearly indicates the class memberships.
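In NumPy the same computation looks like this (a sketch; thresholding the second eigenvector by sign is one simple way to read off the two classes, not something the slides prescribe):

```python
import numpy as np

# Noisy similarity matrix from the slide (weak cross-cluster edges)
W = np.array([[1.00, 0.50, 0.05, 0.10],
              [0.50, 1.00, 0.08, 0.04],
              [0.05, 0.08, 1.00, 0.80],
              [0.10, 0.04, 0.80, 1.00]])
L = np.diag(W.sum(axis=1)) - W
vals, vecs = np.linalg.eigh(L)      # like MATLAB's [V, D] = eig(L)
fiedler = vecs[:, 1]                # eigenvector of the 2nd smallest eigenvalue
labels = (fiedler > 0).astype(int)  # splitting by sign recovers the classes
```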

Normalized Graph Cut for Image Segmentation

Similarity between pixels $i$ and $j$, where $I$ denotes pixel intensity and $X$ denotes pixel location:

$$W(i,j) = \exp\!\left(-\frac{|I(i)-I(j)|^2}{\sigma_I^2}\right)\exp\!\left(-\frac{|X(i)-X(j)|^2}{\sigma_X^2}\right) \;\text{ for } |X(i)-X(j)| < 25, \text{ and } 0 \text{ otherwise.}$$

[Figure: a cell image, roughly 60-by-50 pixels]
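Building this affinity matrix for a small grayscale image can be sketched as follows (the function name and the sigma values are illustrative assumptions; only the spatial radius 25 comes from the slide):

```python
import numpy as np

def ncut_weights(img, sigma_I=0.1, sigma_X=4.0, r=25.0):
    """Pixel-affinity matrix W(i, j) combining intensity and
    location similarity, zeroed beyond spatial radius r."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    X = np.column_stack([ys.ravel(), xs.ravel()]).astype(float)
    I = img.ravel().astype(float)
    dX2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # |X(i)-X(j)|^2
    dI2 = (I[:, None] - I[None, :]) ** 2                  # |I(i)-I(j)|^2
    W = np.exp(-dI2 / sigma_I**2) * np.exp(-dX2 / sigma_X**2)
    W[dX2 >= r**2] = 0.0            # keep only edges with |X(i)-X(j)| < r
    return W
```

The resulting W then feeds the same Laplacian/eigenvector machinery as before; for real images a sparse representation would be preferable.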

NGC Example

(a) A blood cell image. (b) Eigenvector corresponding to the second smallest eigenvalue. (c) Binary labeling via Otsu’s method. (d) Eigenvector corresponding to the third smallest eigenvalue. (e) Ternary labeling via k-means clustering.

[Figure: panels (a)-(e), each roughly 60-by-50 pixels]

Demo: NCUT.m