CS 175: Clustering
Clustering Algorithms
Padhraic Smyth
Department of Computer Science
CS 175, Fall 2007

Timeline
• Today
  – Discussion of project presentations and final report
  – Overview of clustering algorithms and how they can be used with image data
• Tuesday December 4th
  – No lecture (out of town)
• Thursday Dec 6th: Student Presentations
  – About 4 minutes per student, + questions
• Wednesday Dec 12th: Final project reports due 12 noon to EEE
  – Instructions on format provided on the class Web site

Project Presentations
• Thursday next week
  – Each student will make a 4 minute presentation + 1 minute for questions from the professor and/or students
  – 13 students * 5 minutes = 65 minutes + setup time
• IMPORTANT:
  – We will go in a fixed order (alphabetical by last name; see next slide)
  – Be here on time!
  – Your slides must be uploaded BEFORE 9am THURSDAY (day of presentations)
  – Powerpoint or PDF is acceptable

Order of Student Presentations
1. Austgen
2. Duran
3. Hall
4. Hooper
5. Kong
6. Lipeles
7. Newton
8. Nguyen (Nam)
9. Nguyen (Son)
10. Nilsen
11. Salanga
12. Schmitt
13. Sheldon
14. Rodriguez

Guidelines for Presentations
• Your slides should at least contain the following elements:
  – Clear statement of what task/problem you are addressing
  – Outline of the technical approach you are taking
    • You will not have time to go into details
    • Provide a high-level description of your methods, e.g., a figure or flow chart
    • Show an example (e.g., of template matching, edge map, etc.)
  – Describe your results so far
    • Visual examples, tables of accuracy numbers, etc.
    • It's OK if your project is not yet finished: describe what you have
• General tips
  – Speak clearly and loudly, and face the audience
  – Practice beforehand, and know what is in your slides
  – Be creative: use figures rather than text where possible

Presentation Grading
• 5% of your grade
• You will get at least 2.5% for just showing up
• Remainder of the grade will depend on
  – How much work/effort did you put into your slides?
  – How clear are your slides and your presentation?
  – Creativity, e.g., a clever way to illustrate visually how your feature extractor/detector/classifier is working
• Questions?

Final Project Reports
• Due noon Wednesday December 12th (to EEE)
• See class Web page for detailed instructions
  – You will submit a 5 to 10 page report and your code
• Worth 35% of your total grade
  – Make sure to spend time on the report
  – Much of your grade will depend on how well and how clearly your report is written
  – Lower grades will go to
    • Poorly written reports and/or poorly executed projects
  – A high grade will need
    • Well-written report AND well-executed project
  – If your system is not performing accurately, don't panic! Carefully describe what you did, and try to identify why your system is not performing accurately (look at errors, etc.). If you write a good report and document what you did, you can still get a high grade.

Normalized Template Matching

Putting 2 Vectors on the Same Scale
• Two vectors x and y
• mx = mean(x), sx = standard_deviation(x)
• Let x' = (x - mx)/sx
  – x' values now have mean 0 and standard deviation 1
  – Why?
    • mean(x') = mean( (x - mx)/sx ) = (mean(x) - mx)/sx = 0
    • Same type of argument for standard deviation
• Can apply the same normalization to y to get y'
  – y' = (y - my)/sy
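
The normalization above can be checked numerically. The following is a minimal sketch in Python (chosen here instead of the course's Matlab so that it runs with the standard library only); the function name `normalize` and the sample vector are assumptions made for illustration.

```python
from statistics import mean, pstdev

def normalize(values):
    """Return (v - mean) / std for every v; assumes std > 0."""
    m = mean(values)
    # Population standard deviation (divide by N). This is an assumption:
    # Matlab's std divides by N-1 by default, which changes only the scale.
    s = pstdev(values)
    return [(v - m) / s for v in values]

x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # made-up example vector
x_norm = normalize(x)

# After normalization the mean is 0 and the standard deviation is 1.
print(mean(x_norm), pstdev(x_norm))
```

The same function applied to y gives y', so both vectors end up on the same scale.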

Applying this Idea to Template Matching
• x = template -> x' = normalized template
• y = patch of image being matched to template -> y' = normalized patch
• Normalized template matching
  – Replace template x by normalized template x'
    • Has mean pixel intensity 0 and standard deviation 1
    • Only needs to be done once (at start of function)
  – Replace each image patch y by normalized image patch y'
    • Has mean pixel intensity 0 and standard deviation 1
• Likely to lead to better matching
• However: the patch normalization has to be done at every patch in the image (so will slow down the template-matching code)
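
One way to see why normalization is likely to help: after normalizing, a patch that is just a brighter, higher-contrast copy of the template matches it perfectly. The sketch below is in Python rather than the course's Matlab, and the match score it uses (the mean product of the two normalized vectors, i.e. a correlation) is an assumption; the course code may score matches differently. All names and pixel values are made up.

```python
from statistics import mean, pstdev

def normalize(values):
    """Subtract the mean and divide by the standard deviation."""
    m, s = mean(values), pstdev(values)
    if s == 0:                      # constant patch: only remove the mean
        return [v - m for v in values]
    return [(v - m) / s for v in values]

def match_score(template, patch):
    """Mean product of the normalized vectors (assumed score, in [-1, 1])."""
    t, p = normalize(template), normalize(patch)
    return sum(a * b for a, b in zip(t, p)) / len(t)

template = [10, 20, 30, 20, 10, 40]             # flattened template
brighter = [2 * v + 50 for v in template]       # same pattern, new lighting
unrelated = [40, 10, 20, 30, 20, 10]            # different pattern

score_same = match_score(template, brighter)    # close to 1
score_other = match_score(template, unrelated)  # clearly lower
```

Without the normalization step, the brighter patch would look very different from the template even though the underlying pattern is identical.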

Modified Template-Matching Code
% Reshape template
reshtemp = reshape(template,1,tmrows*tmcols);
% Remove mean of template and divide by standard deviation:
reshtemp = (reshtemp - mean(reshtemp))./std(reshtemp);
% Template now has mean 0 and standard deviation 1
…….
for x=1:xspan
  for y=1:yspan
    % Take a piece of the image where the template is
    bite = image(y:y+tmrows-1,x:x+tmcols-1);
    % Reshape
    reshbite = reshape(bite,1,tmrows*tmcols);
    % Remove mean of "bite" and divide by standard deviation
    % (skipped for a constant patch, to avoid dividing by zero):
    sbite = std(reshbite);
    if sbite>0
      reshbite = (reshbite - mean(reshbite))./sbite;
    end
    % "bite" now has mean 0 and standard deviation 1 (unless it was constant)

Clustering Algorithms

Unsupervised Learning or Clustering
• In “supervised learning” each data point had a class label
• In many problems there are no class labels
  – this is “unsupervised learning”
  – human learning: how do we form categories of objects?
• Humans are good at creating groups/categories/clusters from data
  – in image analysis finding groups in data is very useful
    • e.g., can find pixels with similar intensities -> automatically finds regions in images
    • e.g., can find images that are similar -> can automatically find classes/clusters of images

Example: Data in 2 Clusters
[Figure: scatter plot of the data; axes: Feature 1, Feature 2]

The Clustering Problem
• Let x = (x_1, x_2, …, x_d) be a d-dimensional feature vector
• Let D be a set of x vectors,
  – D = { x(1), x(2), …, x(N) }
• Given data D, group the N vectors into K groups such that the grouping is “optimal”
• One definition of “optimal”:
  – Let mean_k be the mean (centroid) of the kth group
  – Let d_i be the distance from vector x(i) to the closest mean
    • so each data point x(i) is assigned to one of the K means

Optimal Clustering
• Let mean_k be the mean (centroid) of the kth cluster
  – mean_k is the average vector of all vectors x “assigned to” cluster k
  – mean_k = (1/n_k) Σ x(i),
  – where the sum is over the x(i) assigned to cluster k, and n_k is the number of such vectors
• One definition of “optimal”:
  – Let d_i be the distance from vector x(i) to the closest mean
    • so each data point x(i) is assigned to one of the K means
• Q_k = quality of cluster k = Σ d_i,
  – where the sum is over the x(i) assigned to cluster k
  – the Q_k's measure how “compact” each cluster is
• We want to minimize the total sum of the Q_k's

The Total Squared Error Objective Function
• Let d_i = distance from feature vector x(i) to the closest mean
          = squared Euclidean distance between x(i) and mean_k
• Now Q_k = sum of squared distances for points in cluster k
• Total Squared Error (TSE)
  – TSE = Total Squared Error = Σ Q_k
    • where the sum is over all K clusters (and each Q_k is itself a sum)
  – TSE measures how “compact” a clustering is
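
The TSE definition translates almost directly into code. This is a small sketch in Python (used here in place of the course's Matlab); the function names, the four data points, and the two candidate sets of means are made up for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def total_squared_error(data, means):
    """TSE: every point contributes its squared distance to the closest mean."""
    return sum(min(squared_distance(x, m) for m in means) for x in data)

data = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]

compact_means = [(1.0, 1.5), (8.5, 8.0)]      # centers sit inside the groups
non_compact_means = [(1.0, 8.0), (9.0, 1.0)]  # centers sit far from the data

tse_compact = total_squared_error(data, compact_means)          # 1.0
tse_non_compact = total_squared_error(data, non_compact_means)  # 183.0
```

Summing over points with a `min` over means is equivalent to summing the Q_k's, because each point belongs to the cluster of its closest mean.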

“Compact” Clustering: Low TSE
[Figure: scatter plot with Cluster Center 1 and Cluster Center 2 marked; axes: Feature 1, Feature 2]
Here we have 2 clusters, and TSE = Q_1 + Q_2

“Non-Compact” Clustering: High TSE
[Figure: scatter plot with Cluster Center 1 and Cluster Center 2 marked; axes: Feature 1, Feature 2]
TSE = Q_1 + Q_2 would be much higher now, so we want to find the cluster centers that minimize TSE

The Clustering Problem
• Let D be a set of x vectors,
  – D = { x(1), x(2), …, x(N) }
• Fix a value for K, e.g., K = 2
• Find the locations of the K means that minimize the TSE
  – no direct solution
• Exhaustive search: how many possible clusterings of N objects into K subsets?
  – O(K^N) -> way too many to search directly
  – can use an iterative greedy search algorithm to minimize TSE
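
To get a feel for the K^N growth, a short Python calculation helps (Python is used here instead of the course's Matlab). The helper name is made up. It counts labeled assignments, which is an upper bound on the number of distinct clusterings, since relabeling the clusters gives the same grouping.

```python
def count_assignments(n_points, k_clusters):
    """Number of ways to give each of N points one of K cluster labels."""
    return k_clusters ** n_points

small = count_assignments(10, 2)    # 1024: still searchable
large = count_assignments(100, 2)   # 2**100, about 1.27e30: hopeless
print(small, large)
```

Even with only K = 2 clusters and N = 100 points, exhaustive search is out of reach, which motivates the iterative algorithm.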

The K-means Algorithm for Clustering
Inputs:  data D, with N feature vectors
         K = number of clusters
Outputs: K mean vectors (centers of K clusters)
         memberships for each of the N feature vectors

The K-means Algorithm for Clustering
kmeans(D, K)
  choose K initial means randomly (e.g., pick K points randomly from D)
  means_are_changing = 1
  while means_are_changing
    % assign each point to a cluster
    for i = 1:N
      membership[x(i)] = cluster with mean closest to x(i)
    end
    % update the means
    for k = 1:K
      mean_k = average of vectors x(i) assigned to cluster k
    end
    % check for convergence
    if (new means are the same as old means)
      means_are_changing = 0   % halt
    else
      means_are_changing = 1
    end
  end
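
The pseudocode can be turned into a short runnable sketch. The version below is in Python with the standard library only (not the course's Matlab, and not the course's own kmeans code). The handling of an empty cluster and the `seed` parameter are assumptions added so that the example is well defined and repeatable.

```python
import random

def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def kmeans(data, k, seed=0):
    """Return (means, memberships) for a list of equal-length tuples."""
    rng = random.Random(seed)
    means = rng.sample(data, k)              # pick K points randomly from D
    while True:
        # assign each point to the cluster whose mean is closest
        memberships = [min(range(k), key=lambda j: squared_distance(x, means[j]))
                       for x in data]
        # update each mean to the average of the points assigned to it
        new_means = []
        for j in range(k):
            members = [x for x, c in zip(data, memberships) if c == j]
            if members:
                new_means.append(tuple(sum(col) / len(members)
                                       for col in zip(*members)))
            else:
                new_means.append(means[j])   # empty cluster: keep old mean (assumption)
        # check for convergence
        if new_means == means:
            return means, memberships
        means = new_means

data = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
means, memberships = kmeans(data, k=2)
```

On this toy data the two returned means are the centroids of the lower-left and upper-right groups of three points.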

Original Data (2 dimensions)
[Figure: scatter plot of the original 2-dimensional data]

Initial Cluster Centers for K-means (K=2)
[Figure: Initial Cluster Centers at Iteration 1]

Update Memberships (Iteration 1)
[Figure: Updated Memberships and Boundary at Iteration 1; axes: X Variable, Y Variable]

Update Cluster Centers at Iteration 2
[Figure: Updated Cluster Centers at Iteration 2]

Update Memberships (Iteration 2)
[Figure: Updated Memberships and Boundary at Iteration 2; axes: X Variable, Y Variable]

Update Cluster Centers at Iteration 3
[Figure: Updated Cluster Centers at Iteration 3]

Update Memberships (Iteration 3)
[Figure: Updated Memberships and Boundary at Iteration 3; axes: X Variable, Y Variable]

Update Cluster Centers at Iteration 4
[Figure: Updated Cluster Centers at Iteration 4]

Updated Memberships (Iteration 4)
[Figure: Updated Memberships and Boundary at Iteration 4; axes: X Variable, Y Variable]

Comments on the K-means Algorithm
• Time Complexity
  – per iteration = O(KNd)
• Can prove that TSE never increases from one iteration to the next (it decreases until the algorithm converges)
• Does it find the global minimum of TSE?
  – No, not necessarily
  – in a sense it is doing “steepest descent” from a random initial starting point
  – thus, results will be sensitive to the starting point
    • in practice, we can run it from multiple starting points and pick the solution with the lowest TSE (the most “compact” solution)
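
The sensitivity to the starting point is easy to reproduce on scalar data. The sketch below is in Python (instead of the course's Matlab); the 1-dimensional helper `kmeans_1d` and the data are made up, and the two starting points are listed by hand so that the example is repeatable. In practice they would be drawn at random.

```python
def kmeans_1d(values, start_means):
    """Run K-means on scalars from the given starting means; return (means, tse)."""
    means = list(start_means)
    while True:
        clusters = [[] for _ in means]
        for v in values:
            nearest = min(range(len(means)), key=lambda j: (v - means[j]) ** 2)
            clusters[nearest].append(v)
        # an empty cluster keeps its old mean (assumption)
        new_means = [sum(c) / len(c) if c else m for c, m in zip(clusters, means)]
        if new_means == means:
            tse = sum((v - means[j]) ** 2
                      for j, c in enumerate(clusters) for v in c)
            return means, tse
        means = new_means

values = [0.0, 1.0, 10.0, 11.0, 50.0, 51.0]          # three obvious groups
starts = [[50.0, 51.0, 0.0],                         # poor start: two means in one group
          [0.0, 10.0, 50.0]]                         # good start: one mean per group

runs = [kmeans_1d(values, s) for s in starts]
best_means, best_tse = min(runs, key=lambda run: run[1])   # keep the lowest TSE
```

The first start converges to a local minimum with TSE = 101.0, while the second reaches TSE = 1.5, so picking the run with the lowest TSE recovers the compact solution.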

Clustering Pixels in an Image
• We can use K-means to cluster pixel intensities in an image into K clusters
  – this provides a simple way to “segment” an image into K regions of similar “compact” image intensities
  – more automated than manual thresholding of an image
• How to do this?
  – Size(image pixel matrix) = m x n
  – convert to a vector with (m x n) rows and 1 column
    • each row is one pixel, described by a 1-dimensional feature (its intensity)
  – run the K-means algorithm with input = vector of intensities
  – assign each pixel the “grayscale” of the cluster it is assigned to
  – Note: with color images we can use a 3-dimensional feature vector per pixel, i.e., R, G, B values at each pixel
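
The steps above can be sketched end to end on a tiny image. This is an illustrative Python version (not the course's Matlab Segmentimage code); the function names, the 3 x 3 image, and the choice of the darkest and brightest pixel as starting means are all assumptions.

```python
def kmeans_1d(values, start_means):
    """Basic K-means on scalar intensities; returns (means, labels)."""
    means = list(start_means)
    while True:
        labels = [min(range(len(means)), key=lambda j: (v - means[j]) ** 2)
                  for v in values]
        new_means = []
        for j, old in enumerate(means):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_means.append(sum(members) / len(members) if members else old)
        if new_means == means:
            return means, labels
        means = new_means

def segment_image(image, start_means):
    """Replace every pixel by the mean intensity of the cluster it belongs to."""
    n_cols = len(image[0])
    pixels = [p for row in image for p in row]        # (m x n) values, one per pixel
    means, labels = kmeans_1d(pixels, start_means)
    flat = [means[lab] for lab in labels]
    return [flat[i:i + n_cols] for i in range(0, len(flat), n_cols)]  # back to m x n

image = [[10, 12, 200],
         [11, 198, 202],
         [9, 10, 201]]
darkest = min(min(row) for row in image)
brightest = max(max(row) for row in image)
segmented = segment_image(image, [darkest, brightest])   # K = 2
```

Each dark pixel becomes 10.4 and each bright pixel becomes 200.25, the two cluster means, which is the K = 2 segmentation of this toy image.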

Clustering in RGB (color) space
[Figure: two panels, “Image” and “Clusters on color”]
K-means clustering of RGB (3 value) pixel color intensities, K = 11 segments (courtesy of David Forsyth, UC Berkeley)

Example: Original Image
[Figure: original image]

Segmentation with K-means: K = 2
[Figure: segmented image, K = 2]

Segmentation with K=3
[Figure: segmented image, K = 3]

Segmentation with K=5
[Figure: segmented image, K = 5]
Note: what K-means is doing in effect is finding 4 threshold intensities (based on the data) and assigning each intensity to 1 of 5 “bins” (clusters) based on these thresholds
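
For 1-dimensional intensities, the thresholds mentioned in the note are the midpoints between neighboring cluster means, because a pixel switches clusters exactly where it becomes closer to the next mean. A small Python sketch (Python rather than the course's Matlab; the five cluster means are hypothetical values, not results from the slides):

```python
from bisect import bisect_right

def thresholds_from_means(means):
    """Midpoints between consecutive sorted means: K means give K-1 thresholds."""
    ordered = sorted(means)
    return [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]

def bin_index(intensity, thresholds):
    """Index of the bin (cluster) that an intensity falls into."""
    return bisect_right(thresholds, intensity)

means = [20.0, 70.0, 120.0, 180.0, 240.0]       # hypothetical K = 5 cluster means
thresholds = thresholds_from_means(means)       # [45.0, 95.0, 150.0, 210.0]

print(bin_index(100, thresholds))               # 2, i.e. the cluster with mean 120
```

An intensity that lies exactly on a threshold is equally close to both means; this sketch places it in the upper bin, which is an arbitrary tie-breaking choice.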

Another Image Example
[Figure: original image]

Segmentation with K=2
[Figure: segmented image, K = 2]

Segmentation with K=3
[Figure: segmented image, K = 3]

Segmentation with K=8
[Figure: segmented image, K = 8, with pseudocolor display: colormap('hsv')]

Using pixel clustering for region finding
• How could you use K-means in your project?
  – K-means puts pixels into K groups based on intensity similarities
  – The result is a set of regions in an image, where each region is relatively homogeneous in terms of pixel intensity
  – => K-means can be used as a simple technique for region-finding
  – Note that K-means clustering knows nothing about the spatial aspects of the image
    • Other region-finding algorithms can operate spatially (more on this in a later lecture)
• Note that regions and edges are “duals” (edges <-> regions)
  – So one could find regions given edges
  – Or one could find edges given regions (e.g., boundaries between clusters of pixels produced by K-means)
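
Going from regions to edges can be as simple as marking the places where neighboring pixels carry different cluster labels. The Python sketch below (not the course's Matlab) shows one way to do this; the function name, the choice of checking only the right and lower neighbors, and the label image are assumptions made for illustration.

```python
def boundary_mask(labels):
    """Mark pixels whose right or lower neighbor belongs to a different cluster."""
    rows, cols = len(labels), len(labels[0])
    mask = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            right_differs = c + 1 < cols and labels[r][c] != labels[r][c + 1]
            below_differs = r + 1 < rows and labels[r][c] != labels[r + 1][c]
            if right_differs or below_differs:
                mask[r][c] = 1
    return mask

# Cluster label of each pixel, e.g. as produced by K-means with K = 2
labels = [[0, 0, 1],
          [0, 1, 1],
          [0, 0, 1]]
edges = boundary_mask(labels)
# edges == [[0, 1, 0],
#           [1, 1, 0],
#           [0, 1, 0]]
```

The marked pixels trace the boundary between the two clusters, which is the edge map implied by the regions.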

Clustering Images
• We can also cluster sets of images into groups
  – now each vector = a full image (dimensions 1 x (m x n))
  – N images of size m x n
    • convert to a matrix with N rows and (m x n) columns
      – just use image_to_matrix.m
  – call kmeans with D = this matrix
    • kmeans is now clustering in an (m x n) dimensional space
  – kmeans will group the images into K groups
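
The data preparation for clustering whole images is just flattening. The sketch below is in Python rather than the course's Matlab; `images_to_matrix` is a hypothetical stand-in written for this example and is not the course's image_to_matrix.m, whose exact behavior is not shown here. The three 2 x 2 images are made up.

```python
def images_to_matrix(images):
    """Flatten each m x n image (a list of rows) into one row of length m*n."""
    return [[pixel for row in image for pixel in row] for image in images]

def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

images = [
    [[0, 0], [0, 10]],          # dark image
    [[0, 5], [0, 10]],          # dark image, slightly different
    [[200, 210], [220, 200]],   # bright image
]
D = images_to_matrix(images)    # N = 3 rows, m*n = 4 columns

near = squared_distance(D[0], D[1])   # 25: the two dark images are close
far = squared_distance(D[0], D[2])    # 168600: dark vs. bright is far
```

K-means applied to the rows of D would use exactly these distances, so similar images end up in the same cluster.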

Example: First 5 Individuals, K = 2
[Figure: images grouped into Cluster 1 and Cluster 2]

Example: 2nd 5 Individuals, K = 2
[Figure: images grouped into Cluster 1 and Cluster 2]

All Individuals, Happy Faces, K=5
[Figure: clustering result]

Matlab Code
• Code for k-means on Web page: kmeans_clustering.zip
  – Kmeans.m
    • Does the basic clustering
  – Segmentimage.m
    • Uses k-means to cluster pixel intensities
  – Clusterimage.m
    • Uses k-means to cluster images

Summary
• Clustering
  – automated methods to assign feature vectors to K clusters
  – K-means algorithm
  – With images, can use K-means to
    • Cluster pixels into groups of pixels
    • Cluster images into groups of images

Timeline
• Tuesday December 4th
  – No lecture (out of town)
• Thursday Dec 6th: Student Presentations
  – About 3-5 slides, 4 minutes per student + questions
  – IMPORTANT: upload your slides to EEE before 9am Thursday!
• Wednesday Dec 12th: Final project reports due 12 noon to EEE
  – Instructions on format provided on the class Web site