Clustering II

CMPUT 466/551

Nilanjan Ray

Mean-shift Clustering

Will show slides from:
http://www.wisdom.weizmann.ac.il/~deniss/vision_spring04/files/mean_shift/mean_shift.ppt
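Since the linked slides are external, here is a minimal sketch of the core mean-shift update for intuition, assuming a Gaussian kernel; the function name, bandwidth h, and iteration cap are illustrative choices, not taken from the linked slides (requires MATLAB R2016b+ for the implicit expansion in X - x):

% Illustrative mean-shift sketch: repeatedly move a point to the
% kernel-weighted mean of the data until it settles on a local mode.
% X: N-by-d data matrix; x: 1-by-d starting point; h: kernel bandwidth.
function x = mean_shift_mode(X, x, h)
    for iter = 1:100
        d2 = sum((X - x).^2, 2);       % squared distances to all points
        k  = exp(-d2 / (2*h^2));       % Gaussian kernel weights
        xn = (k' * X) / sum(k);        % weighted mean = shifted point
        moved = norm(xn - x);
        x = xn;                        % accept the shift
        if moved < 1e-6, break; end    % converged to a mode
    end
end

Running this from every data point and grouping points that converge to the same mode yields a mean-shift clustering.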

Spectral Clustering


Let's visit a serious issue with K-means.

K-means seeks compact, hyper-ellipsoid-like structures.

What if the clusters are not compact, ellipsoid-like structures? K-means fails.

What can we do? Spectral clustering can be a remedy here.


[Figure: scatter plot of non-compact clusters; both axes span -150 to 150]
Basic Spectral Clustering


Form a similarity matrix $w_{ij}$ for all pairs of observations $i, j$.

This is a dense graph with the data points as the vertex set. Edge strength is given by $w_{ij}$, the similarity between the $i$-th and $j$-th observations.

Clustering can be conceived as a partitioning of the graph into connected components, where within a component the edge weights are large, whereas across components they are low.



Basic Spectral Clustering…


Form the Laplacian of this graph: $L = G - W$, where $G$ is a diagonal matrix with entries

$$g_i = \sum_{j=1}^{N} w_{ij}.$$

$L$ is positive semi-definite and has a constant eigenvector (all 1's) with zero eigenvalue.

Find the $m$ smallest eigenvectors $Z = [z_1\; z_2\; \cdots\; z_m]$ of $L$, ignoring the constant eigenvector.

Cluster (say by K-means) the $N$ observations with features given by the rows of the matrix $Z$.
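A minimal MATLAB sketch of these steps, assuming a Gaussian similarity with bandwidth sigma; the function name and parameter choices are illustrative, and pdist2/kmeans come from the Statistics and Machine Learning Toolbox:

% Basic spectral clustering sketch: similarity W, Laplacian L = G - W,
% smallest non-constant eigenvectors, then K-means on their rows.
% X: N-by-d data; k: number of clusters; sigma: similarity bandwidth.
function labels = spectral_cluster(X, k, sigma)
    W = exp(-pdist2(X, X).^2 / (2*sigma^2));   % w_ij similarities
    G = diag(sum(W, 2));                       % g_i = sum_j w_ij
    L = G - W;                                 % graph Laplacian
    [V, E] = eig(L);
    [~, order] = sort(diag(E));                % ascending eigenvalues
    Z = V(:, order(2:k+1));                    % m = k smallest, skipping the constant one
    labels = kmeans(Z, k);                     % cluster the rows of Z
end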
Why Spectral Clustering Works

The graph cut cost for a label vector $f$:

$$f^T L f \;=\; \sum_{i=1}^{N} g_i f_i^2 \;-\; \sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij} f_i f_j \;=\; \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}\,(f_i - f_j)^2.$$
So, a small value of $f^T L f$ will be obtained if pairs of points with large adjacencies $w_{ij}$ receive the same labels.

In reality, we only have weak and strong edges, so look for small eigenvalues.
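The identity above is easy to check numerically in MATLAB on made-up data (the implicit expansion in (f - f') needs R2016b+):

% Verify f'*L*f == 0.5 * sum_ij w_ij (f_i - f_j)^2 on a random graph.
W = rand(5);  W = (W + W') / 2;            % symmetric similarity (illustrative)
L = diag(sum(W, 2)) - W;                   % Laplacian L = G - W
f = randn(5, 1);                           % arbitrary label vector
lhs = f' * L * f;
rhs = 0.5 * sum(sum(W .* (f - f').^2));    % (f - f')(i,j) = f_i - f_j
fprintf('difference = %g\n', lhs - rhs);   % ~0 up to round-off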
Insight 1: The constant eigenvector corresponding to the 0 eigenvalue is actually a trivial solution that suggests putting all $N$ observations into a single cluster.

Insight 2: If a graph has $K$ connected components, the nodes of the graph can be reordered so that $L$ is block diagonal with $K$ diagonal blocks, and $L$ will have a zero eigenvalue with multiplicity $K$, one for each connected component. The corresponding eigenvectors will be indicator variables identifying these connected components.

Combining Insights 1 and 2: Choose the eigenvectors corresponding to small eigenvalues and cluster them into $K$ classes.

A Tiny Example: A Perfect World

W = [ 1.0000   0.5000   0        0
      0.5000   1.0000   0        0
      0        0        1.0000   0.8000
      0        0        0.8000   1.0000 ];

We observe two classes, each with 2 observations, here. W is a perfect block-diagonal matrix.

Laplacian L:

L = [ 0.5000  -0.5000   0        0
     -0.5000   0.5000   0        0
      0        0        0.8000  -0.8000
      0        0       -0.8000   0.8000 ];

Eigenvalues of L: 0, 0, 1, 1.6


Eigenvectors corresponding to the two 0 eigenvalues:
[-0.7071  -0.7071  0  0] and [0  0  -0.7071  -0.7071]
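This example can be reproduced in a few lines of MATLAB (eigenvector signs may flip relative to the slide):

% Block-diagonal W: the Laplacian gets one zero eigenvalue per block,
% with indicator-like eigenvectors identifying the two components.
W = [1.0 0.5 0   0  ;
     0.5 1.0 0   0  ;
     0   0   1.0 0.8;
     0   0   0.8 1.0];
L = diag(sum(W, 2)) - W;            % matches the L shown above
[V, D] = eig(L);
disp(diag(D)')                      % 0, 0, 1, 1.6 (up to ordering)
disp(V(:, abs(diag(D)) < 1e-10))    % the two zero-eigenvalue eigenvectors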

The Real World Tiny Example

W = [ 1.0000   0.5000   0.0500   0.1000
      0.5000   1.0000   0.0800   0.0400
      0.0500   0.0800   1.0000   0.8000
      0.1000   0.0400   0.8000   1.0000 ]

L = [ 0.6500  -0.5000  -0.0500  -0.1000
     -0.5000   0.6200  -0.0800  -0.0400
     -0.0500  -0.0800   0.9300  -0.8000
     -0.1000  -0.0400  -0.8000   0.9400 ]

[V, D] = eig(L)

Eigenvectors:

V = [ 0.5000   0.4827  -0.7169   0.0557
      0.5000   0.5170   0.6930  -0.0498
      0.5000  -0.5027   0.0648   0.7022
      0.5000  -0.4970  -0.0409  -0.7081 ]

Eigenvalues:

diag(D) = [ 0.0000   0.2695   1.1321   1.7384 ]

Notice that eigenvalue 0 has a constant eigenvector.

The next eigenvalue, 0.2695, has an eigenvector that clearly indicates the class memberships.
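Continuing from the eig call above, the sign of that second eigenvector already separates the two classes; a minimal follow-on:

% Label each observation by the sign of the second-smallest eigenvector.
[~, order] = sort(diag(D));          % ensure ascending eigenvalue order
fiedler = V(:, order(2));            % eigenvector for eigenvalue 0.2695
labels = (fiedler > 0) + 1           % gives [2 2 1 1]' (up to a sign flip)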


Normalized Graph Cut for Image Segmentation













Similarity:

$$W(i,j) = \exp\!\left(-\frac{|I(i)-I(j)|^2}{\sigma_I^2}\right)\exp\!\left(-\frac{|X(i)-X(j)|^2}{\sigma_X^2}\right)\ \text{for } |X(i)-X(j)| < 25,\ \text{and } 0\ \text{otherwise},$$

where $I(i)$ is the intensity and $X(i)$ the pixel location of pixel $i$.

[Figure: a cell image, roughly 60-by-50 pixels]
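A minimal sketch of building this W for a small grayscale image and extracting the second-smallest eigenvector; this is an assumed re-implementation for illustration, not the course's NCUT.m, and it uses the unnormalized Laplacian from the earlier slides rather than the normalized-cut generalized eigenproblem (im, sigmaI, and sigmaX are assumed inputs; pdist2 and eigs(..., 'smallestabs') need a recent MATLAB with the Statistics Toolbox):

% Build the similarity above for a grayscale image im (values in [0,1]).
[nr, nc] = size(im);
[cx, cy] = meshgrid(1:nc, 1:nr);
X = [cy(:) cx(:)];                           % pixel locations
I = im(:);                                   % pixel intensities
Dx2 = pdist2(X, X).^2;                       % squared spatial distances
W = exp(-(I - I').^2 / sigmaI^2) ...         % intensity affinity
 .* exp(-Dx2 / sigmaX^2) ...                 % spatial affinity
 .* (Dx2 < 25^2);                            % zero beyond radius 25
G = diag(sum(W, 2));
[V, E] = eigs(sparse(G - W), 3, 'smallestabs');
v2 = reshape(V(:, 2), nr, nc);               % second-smallest eigenvector, as an image

Thresholding v2 (e.g., with Otsu's method) gives a binary segmentation, as in the example on the next slide.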

NGC (Normalized Graph Cut) Example

(a) A blood cell image. (b) Eigenvector corresponding to the second-smallest eigenvalue. (c) Binary labeling via Otsu's method. (d) Eigenvector corresponding to the third-smallest eigenvalue. (e) Ternary labeling via k-means clustering.

[Figure: panels (a)-(e) as described above, each roughly 60-by-50 pixels]
Demo: NCUT.m
