Clustering Algorithms


Padhraic Smyth

Department of Computer Science


CS 175, Fall 2007


Timeline



Today


Discussion of project presentations and final report


Overview of clustering algorithms and how they can be used with image data




Tuesday, December 4th


No lecture (out of town)




Thursday, December 6th: Student Presentations


About 4 minutes per student, plus questions




Wednesday, December 12th: Final project reports due at 12 noon to EEE


Instructions on format provided on the class Web site



Project Presentations


Thursday next week


Each student will make a 4-minute presentation, plus 1 minute for
questions from the professor and/or students


14 students * 5 minutes = 70 minutes, plus setup time



IMPORTANT:


We will go in a fixed order (alphabetical by last name; see next slide)


Be here on time!


Your slides must be uploaded BEFORE 9am THURSDAY (day of
presentations)


PowerPoint or PDF is acceptable




Order of Student Presentations

1. Austgen
2. Duran
3. Hall
4. Hooper
5. Kong
6. Lipeles
7. Newton
8. Nguyen (Nam)
9. Nguyen (Son)
10. Nilsen
11. Salanga
12. Schmitt
13. Sheldon
14. Rodriguez


Guidelines for Presentations


Your slides should at least contain the following elements:


Clear statement of what task/problem you are addressing


Outline of the technical approach you are taking


You will not have time to go into details


Provide a high-level description of your methods


e.g., a figure or flow chart


Show an example (e.g., of template matching, edge map, etc.)


Describe your results so far


Visual examples, tables of accuracy numbers, etc.


It's OK if your project is not yet finished: describe what you have



General tips


Speak clearly and loudly; face the audience

Practice beforehand; know what is in your slides

Be creative; use figures rather than text where possible




Presentation Grading


5% of your grade



You will get at least 2.5% for just showing up




Remainder of the grade will depend on


How much work/effort did you put into your slides?


How clear are your slides and your presentation?


Creativity, e.g., a clever way to illustrate visually how your feature-extractor/detector/classifier is working



Questions?


Final Project Reports


Due at noon, Wednesday, December 12th (to EEE)



See class Web page for detailed instructions


You will submit a 5 to 10 page report and your code



Worth 35% of your total grade


Make sure to spend time on the report


Much of your grade will depend on how well and how clearly your report is
written


Lower grades will go to poorly written reports and/or poorly executed projects

A high grade will need a well-written report AND a well-executed project



If your system is not performing accurately, don’t panic! Carefully describe
what you did, and try to identify why your system is not performing
accurately (look at errors, etc). If you write a good report and document what
you did, you can still get a high grade.



Normalized Template Matching


Putting 2 Vectors on the same Scale


Two vectors x and y



mx = mean(x), sx = standard_deviation(x)



Let x' = (x - mx)/sx


x’ values now have mean 0 and standard deviation 1


Why?


Mean(x') = mean( (x - mx)/sx ) = (mean(x) - mx)/sx = 0


Same type of argument for standard deviation



Can apply the same normalization to y to get y’


y' = (y - my)/sy
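As a concrete MATLAB illustration (a minimal sketch; the vectors here are made-up examples):

% two example vectors
x = [2 4 6 8 10];
y = [100 200 300 400 500];

% normalize each to mean 0, standard deviation 1
xprime = (x - mean(x)) ./ std(x);
yprime = (y - mean(y)) ./ std(y);

% check: mean(xprime) is 0 (up to rounding) and std(xprime) is 1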







Applying this Idea to Template Matching


x = template -> x' = normalized template



y = patch of image being matched to template


y’ = normalized patch



Normalized template matching


Replace template x by normalized template x’


Has mean pixel intensity 0 and standard deviation 1


Only needs to be done once (at start of function)



Replace each image patch y by normalized image patch y’


Has mean pixel intensity 0 and standard deviation 1


Likely to lead to better matching


However: the patch normalization has to be done at every patch in the image (so will slow down the template-matching code)

[Figures: normalized template matching examples]


Modified Template-Matching Code

% Reshape template into a row vector
reshtemp = reshape(template, 1, tmrows*tmcols);

% Remove mean of template and divide by standard deviation:
reshtemp = (reshtemp - mean(reshtemp)) ./ std(reshtemp);
% Template now has mean 0 and standard deviation 1

.......

for x = 1:xspan
    for y = 1:yspan

        % Take the piece of the image where the template is
        bite = image(y:y+tmrows-1, x:x+tmcols-1);

        % Reshape into a row vector
        reshbite = reshape(bite, 1, tmrows*tmcols);

        % Remove mean of "bite" and divide by standard deviation
        % (skip constant patches to avoid dividing by zero):
        sbite = std(reshbite);
        if sbite > 0
            reshbite = (reshbite - mean(reshbite)) ./ sbite;
        end
        % Patch now has mean 0 and standard deviation 1
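The slide's code fragment ends here; a hypothetical continuation (not from the original slides) would score each patch with the dot product of the two normalized vectors:

        % hypothetical continuation: normalized correlation score
        % (a larger score = a better match at this location)
        score(y, x) = sum(reshtemp .* reshbite);
    end
end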






Clustering Algorithms


Unsupervised Learning or Clustering


In “supervised learning” each data point has a class label



in many problems there are no class labels


this is “unsupervised learning”



human learning: how do we form categories of objects?


Humans are good at creating groups/categories/clusters from
data



in image analysis finding groups in data is very useful

e.g., can find pixels with similar intensities -> automatically finds regions in images

e.g., can find images that are similar -> can automatically find classes/clusters of images


Example: Data in 2 Clusters

[Figure: scatter plot of data, Feature 1 vs. Feature 2]


The Clustering Problem


Let x = (x1, x2, ..., xd) be a d-dimensional feature vector



Let D be a set of x vectors, D = { x(1), x(2), ..., x(N) }



Given data D, group the N vectors into K groups such that the
grouping is “optimal”



One definition of “optimal”:

Let mean_k be the mean (centroid) of the kth group

Let d_i be the distance from vector x(i) to the closest mean

so each data point x(i) is assigned to one of the K means



Optimal Clustering


Let mean_k be the mean (centroid) of the kth cluster

mean_k is the average vector of all vectors x “assigned to” cluster k

mean_k = (1/n) Σ x(i), where the sum is over the x(i) assigned to cluster k (and n is the number of such vectors)



One definition of “optimal”:

Let d_i be the distance from vector x(i) to the closest mean

so each data point x(i) is assigned to one of the K means

Q_k = quality of cluster k = Σ d_i, where the sum is over the x(i) assigned to cluster k

the Q_k’s measure how “compact” each cluster is

We want to minimize the total sum of the Q_k’s




The Total Squared Error Objective Function


Let d_i = distance from feature vector x(i) to the closest mean
        = squared Euclidean distance between x(i) and mean_k

Now Q_k = sum of squared distances for points in cluster k

Total Squared Error (TSE):

TSE = Σ Q_k, where the sum is over all K clusters (and each Q_k is itself a sum)

TSE measures how “compact” a clustering is
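A minimal MATLAB sketch of this computation (the variable names here are illustrative, not from the class code):

% D: N x d data matrix, means: K x d matrix of cluster centers,
% membership: N x 1 vector of cluster assignments (values 1..K)
TSE = 0;
for i = 1:size(D, 1)
    diff = D(i,:) - means(membership(i), :);
    TSE = TSE + sum(diff.^2);   % squared Euclidean distance d_i
end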





Example: Data in 2 Clusters

[Figure: scatter plot of data, Feature 1 vs. Feature 2]


“Compact” Clustering: Low TSE

[Figure: two compact clusters, with Cluster Center 1 and Cluster Center 2 marked]


“Compact” Clustering: Low TSE

[Figure: compact clustering, with Cluster Center 1 and Cluster Center 2 marked]

Here we have 2 clusters, and TSE = Q_1 + Q_2


“Non-Compact” Clustering: High TSE

[Figure: poorly placed Cluster Center 1 and Cluster Center 2]

TSE = Q_1 + Q_2 would be much higher now: so we want to find the cluster centers that minimize TSE



The Clustering Problem


Let D be a set of x vectors, D = { x(1), x(2), ..., x(N) }



Fix a value for K, e.g., K = 2



Find the locations of the K means that minimize the TSE


no direct solution


Exhaustive search: how many possible clusterings of N objects into K subsets?

O(K^N) -> way too many to search directly



can use an iterative greedy search algorithm to minimize TSE


The K-means Algorithm for Clustering

Inputs: data D, with N feature vectors


K = number of clusters

Outputs: K mean vectors (centers of K clusters)


memberships for each of the N feature vectors








The K-means Algorithm for Clustering


kmeans(D, K)

    choose K initial means randomly (e.g., pick K points randomly from D)
    means_are_changing = 1

    while means_are_changing

        % assign each point to a cluster
        for i = 1:N
            membership[x(i)] = cluster with mean closest to x(i)
        end

        % update the means
        for k = 1:K
            mean_k = average of vectors x(i) assigned to cluster k
        end

        % check for convergence
        if (new means are the same as old means) then means_are_changing = 0

    end
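A runnable MATLAB sketch of this pseudocode (written for this writeup; the class's own Kmeans.m may differ in details):

function [means, membership] = kmeans_sketch(D, K)
% D: N x d data matrix, K: number of clusters
[N, d] = size(D);
perm = randperm(N);
means = D(perm(1:K), :);           % K random data points as initial means
membership = zeros(N, 1);
means_are_changing = 1;
while means_are_changing
    % assign each point to the cluster with the closest mean
    for i = 1:N
        dists = sum((means - repmat(D(i,:), K, 1)).^2, 2);
        [mindist, membership(i)] = min(dists);
    end
    % update the means
    newmeans = means;
    for k = 1:K
        pts = D(membership == k, :);
        if ~isempty(pts)
            newmeans(k,:) = mean(pts, 1);
        end
    end
    % check for convergence
    means_are_changing = any(newmeans(:) ~= means(:));
    means = newmeans;
end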





Original Data (2 dimensions)

[Figure: scatter plot of the data]

Initial Cluster Centers for K-means (K=2)

[Figure: Initial Cluster Centers at Iteration 1]

Update Memberships (Iteration 1)

[Figure: Updated Memberships and Boundary at Iteration 1 (X Variable vs. Y Variable)]

Update Cluster Centers at Iteration 2

[Figure: Updated Cluster Centers at Iteration 2]

Update Memberships (Iteration 2)

[Figure: Updated Memberships and Boundary at Iteration 2]

Update Cluster Centers at Iteration 3

[Figure: Updated Cluster Centers at Iteration 3]

Update Memberships (Iteration 3)

[Figure: Updated Memberships and Boundary at Iteration 3]

Update Cluster Centers at Iteration 4

[Figure: Updated Cluster Centers at Iteration 4]

Updated Memberships (Iteration 4)

[Figure: Updated Memberships and Boundary at Iteration 4]

Comments on the K-means Algorithm


Time Complexity


per iteration = O(KNd)



Can prove that TSE decreases (or converges) at each iteration



Does it find the global minimum of TSE?


No, not necessarily


in a sense it is doing “steepest descent” from a random initial
starting point


thus, results will be sensitive to the starting point


in practice, we can run it from multiple starting points and pick
the solution with the lowest TSE (the most “compact” solution)
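A sketch of this multiple-restart strategy (assuming the kmeans_sketch function above and a hypothetical compute_TSE helper implementing the TSE loop shown earlier):

% run K-means from several random starting points, keep the best
best_TSE = Inf;
for run = 1:10
    [m, memb] = kmeans_sketch(D, K);
    t = compute_TSE(D, m, memb);   % hypothetical helper: sum of squared distances
    if t < best_TSE
        best_TSE = t;
        best_means = m;
        best_membership = memb;
    end
end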



Clustering Pixels in an Image


We can use K-means to cluster pixel intensities in an image into K clusters


this provides a simple way to “segment” an image into K regions of
similar “compact” image intensities


more automated than manual thresholding of an image



How to do this?


Size(image pixel matrix) = m x n

convert to a vector with (m x n) rows and 1 column

this is a 1-dimensional feature vector of pixel intensities

run the k-means algorithm with input = vector of intensities

assign each pixel the “grayscale” of the cluster it is assigned to (see the sketch after these steps)



Note: with color images we can use a 3-dimensional feature vector per pixel, i.e., R, G, B values at each pixel
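A minimal sketch of this recipe (assuming the kmeans_sketch function above; img and K are illustrative names):

% reshape an m x n grayscale image into an (m*n) x 1 intensity vector
[m, n] = size(img);
pix = double(reshape(img, m*n, 1));

% cluster the intensities into K groups
[means, membership] = kmeans_sketch(pix, K);

% display each pixel with the mean intensity of its cluster
segmented = reshape(means(membership), m, n);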




Clustering in RGB (color) space

K-means clustering of RGB (3-value) pixel color intensities, K = 11 segments
(courtesy of David Forsyth, UC Berkeley)

[Figure: two panels, “Image” and “Clusters on color”]


Example: Original Image

[Figure: original grayscale image]

Segmentation with K-means: K = 2


Segmentation with K=3


Segmentation with K=5

Note: what K-means is doing in effect is finding 4 threshold intensities (based on the data)
and assigning each intensity to 1 of 5 “bins” (clusters) based on these thresholds


Another Image Example

[Figure: original grayscale image]

Segmentation with K=2


Segmentation with K=3


Segmentation with K=8

(with pseudocolor display)

colormap('hsv')


Using pixel clustering for region finding


How could you use K-means in your project?

K-means puts pixels into K groups based on intensity similarities

The result is a set of regions in an image, where each region is relatively homogeneous in terms of pixel intensity

=> K-means can be used as a simple technique for region-finding



Note that K-means clustering knows nothing about the spatial aspects of the image

Other region-finding algorithms can operate spatially (more on this in a later lecture)








Note that regions and edges are “duals”

edges <-> regions

So one could find regions given edges

Or one could find edges given regions (e.g., boundaries between clusters of pixels produced by K-means)


Clustering Images


We can also cluster sets of images into groups

now each vector = a full image (dimensions 1 x (m x n))

N images of size m x n

convert to a matrix with N rows and (m x n) columns

just use image_to_matrix.m

call kmeans with D = this matrix (see the sketch below)

kmeans is now clustering in an (m x n)-dimensional space

kmeans will group the images into K groups
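A sketch of this setup (assuming the kmeans_sketch function above and N same-size grayscale images in a cell array; the class's image_to_matrix.m performs a similar conversion):

% build an N x (m*n) data matrix, one row per image
N = numel(images);
[m, n] = size(images{1});
D = zeros(N, m*n);
for i = 1:N
    D(i,:) = reshape(double(images{i}), 1, m*n);
end

% cluster the images into K groups
[means, membership] = kmeans_sketch(D, K);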


Example: First 5 Individuals, K = 2

[Figure: face images grouped into Cluster 1 and Cluster 2]


Example: Second 5 Individuals, K = 2

[Figure: face images grouped into Cluster 1 and Cluster 2]


All Individuals, Happy Faces, K=5


MATLAB Code


Code for k-means on the class Web page: kmeans_clustering.zip

Kmeans.m
  Does the basic clustering

Segmentimage.m
  Uses k-means to cluster pixel intensities

Clusterimage.m
  Uses k-means to cluster images




Summary



Clustering

automated methods to assign feature vectors to K clusters

K-means algorithm

With images, can use K-means to

Cluster pixels into groups of pixels

Cluster images into groups of images






Timeline



Tuesday, December 4th


No lecture (out of town)




Thursday, December 6th: Student Presentations


About 3-5 slides, 4 minutes per student, plus questions


IMPORTANT: upload your slides to EEE before 9am Thursday!




Wednesday, December 12th: Final project reports due at 12 noon to EEE


Instructions on format provided on the class Web site