HELSINKI UNIVERSITY OF TECHNOLOGY 1.8.2002

Laboratory of Computer and Information Science

T-61.195 Special Assignment 1

Clustering Algorithms: Basics and

Visualization

Jukka Kainulainen

47942F

Clustering Algorithms: Basics and Visualization

Jan Jukka Kainulainen

HUT

jkainula@cc.hut.fi

Abstract

This paper discusses clustering algorithms, which are important in many fields of science. The paper presents the basic concepts and an implementation of a few popular algorithms. The problem of visualizing the result is also discussed, and a simple solution for a dimension-limited case is provided.

1 INTRODUCTION

Clustering is one solution to the problem of unsupervised learning, where class labeling information for the data is not available. Clustering is a method in which data is divided into groups (clusters) that seem to make sense. Clustering algorithms are usually fast and quite simple. They need no prior knowledge of the data and form a solution by comparing the given samples to each other and to the clustering criterion. The simplicity of the algorithms is also a drawback: the results may vary greatly when different kinds of clustering criteria are used, so nonsense solutions are unfortunately also possible. With some algorithms, the order in which the original samples are presented can also make a great difference to the result.

Despite the drawbacks, clustering is used in many fields of science, including machine vision, life and medical sciences, and information science. One reason for this is the fact that intelligent beings, humans included, are known to use the idea of clustering in many brain functions.

This paper introduces the basics of clustering and the concepts needed to understand and implement the algorithms. A couple of algorithms are implemented and compared. The problem of visualizing the result is also discussed, and one solution is provided in the popular OpenGL framework.

The second chapter introduces the basic concepts the reader should be aware of to read this paper efficiently. The third chapter introduces the most popular algorithms and discusses their characteristics. The fourth chapter complements the survey of clustering algorithms by introducing some special algorithms. Chapter five then provides an implementation of four algorithms discussed in chapter three, and chapter six provides a way to visualize the result. Finally, chapter seven runs some tests on the implemented algorithms.

2 BASIC CONCEPTS

When classifying different kinds of samples, a way to represent a sample mathematically is needed. From now on we assume that the features are represented in a feature vector, a vector collecting the different features of the sample. That is, with l features x_i, the feature vector is of the form

x = [x_1, x_2, …, x_l]^T

where T denotes transposition and the x_i are typically real numbers. The selection of these features is often very hard because there usually are a lot of features from which the most representative ones should be selected. This matters because the computational complexity of the classification (clustering) algorithm grows with every feature selected. Feature selection and the reduction of the dimensionality of the data are beyond the scope of this document. For additional information about these tasks, a good place to start is Theodoridis (1999).

2.1 Linear Independence, Vector Space

A set of n vectors is said to be linearly independent if the equation

k_1 x_1 + k_2 x_2 + … + k_n x_n = 0    (2.1)

implies k_i = 0 for all i. If a nonzero solution can be found, then the vectors are said to be linearly dependent. If the vectors are linearly dependent, then at least one of them can be expressed as a linear combination of the others.

Now, given m vectors x_n with l components (as before), we can form the set V of all linear combinations of these vectors. V is called the span of these vectors. The maximum number of linearly independent vectors in V is called the dimension of V. Clearly, if the given m vectors are linearly independent, then the dimension is m and the vectors x_n are called a basis for V.

An n-dimensional vector space R^n is the space of all vectors with n (real) numbers as components. The symbol R itself refers to a single dimension of real numbers. Thus for n = 3 we get R^3, for which a basis is, for example, the vectors

x_1 = [1, 0, 0]; x_2 = [0, 1, 0]; x_3 = [0, 0, 1]

so every vector in R^3 can be expressed with these three basis vectors. A more comprehensive examination of vector spaces can be found in almost any engineering mathematics book, for example, Kreyszig (1993).

2.2 Data Normalization

The data used is often scaled to be within a certain range; neural networks, for example, often require this. A classical way to normalize the N available data vectors uses the mean value

x̄_k = (1/N) Σ_{i=1}^{N} x_ik    (2.2)

where k denotes the feature, and the variance

σ_k² = (1/(N − 1)) Σ_{i=1}^{N} (x_ik − x̄_k)²    (2.3)

Now, to obtain zero mean and unit variance,

x̂_ik = (x_ik − x̄_k) / σ_k    (2.4)
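As an illustration of equations 2.2–2.4, the normalization of a single feature over the N samples could be coded as follows (a sketch; the function name is illustrative and not part of the implementation described in chapter 5):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Normalize one feature over the N available samples to zero mean and
// unit variance, following equations 2.2-2.4 (note the 1/(N-1) factor
// in the variance of equation 2.3).
std::vector<double> normalizeFeature(const std::vector<double>& x) {
    const std::size_t N = x.size();
    double mean = 0.0;
    for (double v : x) mean += v;
    mean /= static_cast<double>(N);                 // equation 2.2

    double var = 0.0;
    for (double v : x) var += (v - mean) * (v - mean);
    var /= static_cast<double>(N - 1);              // equation 2.3
    const double sigma = std::sqrt(var);

    std::vector<double> result(N);
    for (std::size_t i = 0; i < N; ++i)
        result[i] = (x[i] - mean) / sigma;          // equation 2.4
    return result;
}
```

After this transformation every feature contributes on a comparable scale, which is exactly why neural networks often require it.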

2.3 Definition of a Cluster

Now, let us define some basic concepts of clusters mathematically. Let X be a set of data, that is,

X = {x_1, x_2, …, x_n}.

A so-called m-clustering of X is its partition into m parts (clusters) C_1, …, C_m, so that

1. None of the clusters is empty: C_i ≠ Ø
2. Every sample belongs to a cluster
3. Every sample belongs to a single cluster (crisp clustering): C_i ∩ C_j = Ø, i ≠ j

Naturally, it is assumed that vectors in cluster C_i are in some way more similar to each other than to the vectors in other clusters. Figure 1 illustrates a couple of different kinds of clusters: compact, linear and circular.

Figure 1: A couple of different kinds of clusters

2.4 Number of Possible Clusterings

Naturally, the best way to apply clustering would be to identify all possible clusterings and select the most suitable one. Unfortunately, due to limited time and the large number of feature vectors, this isn't usually possible. If we let S(N, m) denote the number of possible clusterings of N vectors into m groups, it is true that

1. S(N, 1) = S(N, N) = 1
2. S(N, m) = 0, if m > N

where the second statement comes from the definition that no cluster may be empty. Actually, it can be shown that the solution for S is given by the Stirling numbers of the second kind:

S(N, m) = (1/m!) Σ_{i=0}^{m} (−1)^i (m choose i) (m − i)^N    (2.5)

It is quite clear that the solution of equation 2.5 grows rapidly with N, and if the number of desired clusters m is not initially available, many different values must be tested and a brute-force solution becomes impossible. A more efficient solution must be found.
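Equation 2.5 is easy to evaluate for small N and m, which makes the rapid growth of S(N, m) concrete (a sketch; the function name is illustrative):

```cpp
#include <cassert>
#include <cmath>

// Stirling numbers of the second kind via the closed form of equation 2.5:
// S(N, m) = (1/m!) * sum_{i=0}^{m} (-1)^i * C(m, i) * (m - i)^N
long long stirling2(int N, int m) {
    if (m > N) return 0;            // no clustering may leave a cluster empty
    double fact = 1.0;              // m!
    for (int i = 1; i <= m; ++i) fact *= i;
    double sum = 0.0;
    double binom = 1.0;             // C(m, 0)
    for (int i = 0; i <= m; ++i) {
        double sign = (i % 2 == 0) ? 1.0 : -1.0;
        sum += sign * binom * std::pow(static_cast<double>(m - i), N);
        binom = binom * (m - i) / (i + 1);   // C(m, i+1) from C(m, i)
    }
    return static_cast<long long>(sum / fact + 0.5);
}
```

For example S(4, 2) = 7 and S(5, 3) = 25, while S(100, 5) is already of the order of 10^67, which shows why exhaustive enumeration is hopeless.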

2.5 Proximity Measure

When clustering is applied, a way to measure the similarities and dissimilarities between the samples is needed. A formal way to define a dissimilarity measure (DM) d (informally, a distance) on X is the following:

d : X × X → R

there exists d_0 in R : −∞ < d_0 ≤ d(x, y) < +∞, for all x, y in X    (2.6)

d(x, x) = d_0, for all x in X    (2.7)

d(x, y) = d(y, x), for all x, y in X.    (2.8)

If in addition the following is valid:

d(x, y) = d_0, if and only if x = y    (2.9)

d(x, z) ≤ d(x, y) + d(y, z), for all x, y, z in X (triangle inequality)    (2.10)

then d is called a metric. A similarity measure (SM) is defined correspondingly; see Theodoridis (1999) for details. For example, the Euclidean distance d_2 is a metric dissimilarity measure with d_0 = 0: the minimum possible distance between any two vectors is 0, and the distance equals 0 only when the two vectors are the same. The triangle inequality is also known to hold for the Euclidean distance. Another well-known metric DM is the Manhattan norm.

The inner product x^T y between two vectors, on the other hand, is a similarity measure. In particular, if the vectors x and y have length one, then the lower and upper limits for the inner product are −1 and +1.
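The two dissimilarity measures and the similarity measure mentioned above can be written out directly (an illustrative sketch):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Euclidean distance d_2: a metric dissimilarity measure with d_0 = 0.
double euclidean(const std::vector<double>& x, const std::vector<double>& y) {
    double s = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)
        s += (x[i] - y[i]) * (x[i] - y[i]);
    return std::sqrt(s);
}

// Manhattan norm: another metric dissimilarity measure.
double manhattan(const std::vector<double>& x, const std::vector<double>& y) {
    double s = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)
        s += std::fabs(x[i] - y[i]);
    return s;
}

// Inner product x^T y: a similarity measure, bounded by [-1, +1]
// when both vectors have unit length.
double inner(const std::vector<double>& x, const std::vector<double>& y) {
    double s = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)
        s += x[i] * y[i];
    return s;
}
```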

When we extend the above formulas (2.6–2.10) to hold for subsets (U) of X, we get a proximity measure ς on U as a function

ς : U × U → R.

A typical case where proximity between subsets is needed is when a single vector x is measured against a cluster C. Typically a distance to y, the representative of C, is chosen. The representative can be chosen so that the value is, for example, maximized or minimized. If a single vector representative is chosen from C, the method used is called a global clustering criterion; if all the vectors in C have an effect on the representative, a local clustering criterion is being used.

The representative of C can also be a curve or a hyperplane in the dimension of x. For example, in figure 2 different kinds of representatives are chosen for the clusters of figure 1. The first cluster is represented by a point, the second by a line and the third by two hyperspheres (the inner and the outer).

Figure 2: Different representatives for different clusters

3 POPULAR CLUSTERING ALGORITHMS

As stated in chapter 2.4, calculating all possible combinations of the feature vectors is not generally possible. Clustering algorithms provide means to make a sensible division into small clusters using only a fraction of the work needed to calculate all possible combinations. These algorithms usually fall into one of the categories of the subchapters below (Theodoridis, 1999).

3.1 Sequential Algorithms

Sequential algorithms are straightforward and fast methods that produce a single clustering. Usually the feature vectors are presented to the algorithm once or a few times. The final result is typically dependent on the order of presentation, and the resulting clusters are often compact and hyperellipsoidally shaped.

3.1.1 Basic Sequential Algorithmic Scheme

A very basic clustering algorithm that is easy to understand is the basic sequential algorithmic scheme (BSAS). In the basic form, vectors are presented only once and the number of clusters is not known a priori. What is needed is the dissimilarity measure d(x, C), a threshold of dissimilarity Θ and the maximum number of clusters allowed, q.

The idea is to assign every newly presented vector to an existing cluster or to create a new cluster for the sample, depending on the distance to the already defined clusters. In pseudocode the algorithm works as follows:

1. m = 1; C_m = {x_1}  // Init first cluster = first sample
2. for every sample x from 2 to N
   a. find the cluster C_k such that d(x, C_k) is minimal
   b. if d(x, C_k) > Θ AND (m < q)
      i. m = m + 1; C_m = {x}  // Create a new cluster
   c. else
      i. C_k = C_k + {x}  // Add sample to the nearest cluster
      ii. Update the representative if needed
3. end algorithm

As can be seen, the algorithm is simple but still quite efficient. Different choices for the distance function lead to different results, and unfortunately the order in which the samples are presented can also have a great effect on the final result. What is also very important is a correct value for Θ. This value has a direct effect on the number of clusters formed: if Θ is too small, unnecessary clusters are created, and if too large a value is chosen, fewer clusters than required are formed.

One detail is that if q is not defined, the algorithm decides the number of clusters on its own. This might be wanted under some circumstances, but when dealing with limited resources a limited q is usually chosen. Also, it should be noted that BSAS can be used with a similarity function simply by replacing the min function with max.
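The scheme above can be sketched compactly in C++. Here d(x, C) is taken to be the Euclidean distance to the cluster mean (a single, global representative); all names are illustrative rather than part of the implementation in chapter 5:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Cluster { std::vector<std::vector<double>> members; };

// Euclidean distance from sample x to the mean of cluster c.
static double distToMean(const std::vector<double>& x, const Cluster& c) {
    const std::size_t dim = x.size();
    std::vector<double> mean(dim, 0.0);
    for (const auto& m : c.members)
        for (std::size_t d = 0; d < dim; ++d) mean[d] += m[d];
    double s = 0.0;
    for (std::size_t d = 0; d < dim; ++d) {
        mean[d] /= static_cast<double>(c.members.size());
        s += (x[d] - mean[d]) * (x[d] - mean[d]);
    }
    return std::sqrt(s);
}

// BSAS: one pass, dissimilarity threshold theta, at most q clusters.
std::vector<Cluster> bsas(const std::vector<std::vector<double>>& X,
                          double theta, std::size_t q) {
    std::vector<Cluster> clusters;
    clusters.push_back(Cluster{{X[0]}});        // first cluster = first sample
    for (std::size_t i = 1; i < X.size(); ++i) {
        std::size_t best = 0;
        double mindist = distToMean(X[i], clusters[0]);
        for (std::size_t k = 1; k < clusters.size(); ++k) {
            double d = distToMean(X[i], clusters[k]);
            if (d < mindist) { mindist = d; best = k; }
        }
        if (mindist > theta && clusters.size() < q)
            clusters.push_back(Cluster{{X[i]}});    // create a new cluster
        else
            clusters[best].members.push_back(X[i]); // assign to the nearest
    }
    return clusters;
}
```

Recomputing the mean on every distance query is wasteful but keeps the sketch close to the pseudocode; a real implementation would cache the representative.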

There exists a modification of BSAS called modified BSAS (MBSAS), which runs through the samples twice. It overcomes the drawback that the final cluster of a single sample is decided before all the clusters have been created. The first phase of the algorithm creates the clusters (just like step 2b in BSAS) and assigns only a single sample to each cluster. The second phase then runs through the remaining samples and classifies them into the created clusters (step 2c in BSAS).

3.1.2 A Two-Threshold Sequential Scheme

The major drawback of BSAS and MBSAS is the dependence on the order of the samples as well as on the correct value of Θ. These drawbacks can be diminished by using two threshold values, Θ_1 and Θ_2. Distances less than the first value Θ_1 denote that two samples most likely belong together, and distances greater than Θ_2 denote that the samples do not belong to the same cluster. Values between these two lie in a so-called gray area and are reevaluated at a later stage of the algorithm.

Letting clas(x) be a boolean stating whether a sample has been classified or not, and assuming no bound on the number of clusters, the two-threshold sequential scheme (TTSAS) can be described in pseudocode:

1. m = 0
2. for all x: clas(x) = false
3. prev_change = 0; cur_change = 0; exists_change = 0
4. while there exists some sample not classified
   a. for every x from 1 to N
      i. if clas(x) = false AND x is the first in this while pass AND exists_change = 0
         1. m = m + 1  // Create a new cluster
         2. C_m = {x}; clas(x) = true
         3. cur_change = cur_change + 1
      ii. else if clas(x) = false
         1. find the minimal d(x, C_k)
         2. if d(x, C_k) < Θ_1
            a. C_k = C_k + {x}; clas(x) = true  // Add to a cluster
            b. cur_change = cur_change + 1
         3. else if d(x, C_k) > Θ_2
            a. m = m + 1  // Create a new cluster
            b. C_m = {x}; clas(x) = true
            c. cur_change = cur_change + 1
      iii. else  // clas(x) = true
         1. cur_change = cur_change + 1
   b. exists_change = |cur_change − prev_change|
   c. prev_change = cur_change; cur_change = 0

The variable exists_change checks whether at least one vector was classified during the current pass of the while loop. If no sample has been classified, the first unclassified sample is used to form a new cluster (step 4.a.i), and this guarantees that at most N passes of the while loop are performed. In practice the number of passes should naturally be much less than N, but in theory this is an O(N²) algorithm; see Weiss (1997) for additional information on performance classifications.

3.2 Hierarchical Algorithms

Hierarchical algorithms are further divided into agglomerative and divisive schemes. These algorithms rely on ideas from matrix and graph theory to produce either a decreasing or an increasing number of clusters at each time step, thus producing a hierarchy of clusterings.

A problem of its own is the choice of a proper clustering from this hierarchy. One solution is to track the lifetime of all clusters and search for clusters that have a large lifetime. This involves subjectivity and might not be suitable in many cases. Another approach is to measure the self-similarity of clusters by calculating distances between vectors in the same cluster and comparing them to some threshold value. As can be deduced, the problem is difficult overall and no generally correct solution exists.

3.2.1 Agglomerative Algorithms 1: Matrix Theory

Agglomerative algorithms start from an initial clustering where every sample vector has its own cluster; that is, initially there exist N clusters with a single element in each of them. At every step of the algorithm two of these clusters are joined together, resulting in one cluster less. This is continued until only a single cluster exists.

One simple agglomerative algorithm from which many other algorithms are derived is the general agglomerative scheme (GAS), defined as follows:

1. t = 0; C_i = {x_i}, i = 1, …, N  // Initial clustering
2. while more than one cluster is left
   a. t = t + 1
   b. among all the clusters find the pair minimizing d(C_i, C_j) (or maximizing d(C_i, C_j) if d denotes similarity)
   c. create C_q = C_i + C_j, and replace clusters C_i and C_j with it

It should be clear that this creates a hierarchy of N clusterings. The disadvantage here is that once two vectors are assigned to a single cluster, there is no way for them to get separated at a later stage. Further, it is quite easy to see that this is an O(N³) algorithm, not suitable for a large N.

There are algorithms such as the matrix updating algorithmic scheme (MUAS) and its single and complete link variations that can all be seen as special cases of GAS. These algorithms are based on matrix theory, and the general idea is to use the pattern matrix D(X) and the dissimilarity matrix P(X) to hold the information needed in a GAS-like updating scheme. The pattern matrix is an N × l matrix whose ith row is the transposed ith vector of X. The dissimilarity matrix is simply an N × N matrix whose element (j, k) equals the dissimilarity d(x_j, x_k).
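Building the dissimilarity matrix P(X) is straightforward; a sketch using the Euclidean distance as the dissimilarity (names are illustrative):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Build the N x N dissimilarity matrix P(X) whose element (j, k) is the
// Euclidean distance between vectors j and k.
std::vector<std::vector<double>>
dissimilarityMatrix(const std::vector<std::vector<double>>& X) {
    const std::size_t N = X.size();
    std::vector<std::vector<double>> P(N, std::vector<double>(N, 0.0));
    for (std::size_t j = 0; j < N; ++j)
        for (std::size_t k = j + 1; k < N; ++k) {
            double s = 0.0;
            for (std::size_t d = 0; d < X[j].size(); ++d)
                s += (X[j][d] - X[k][d]) * (X[j][d] - X[k][d]);
            P[j][k] = P[k][j] = std::sqrt(s);  // symmetric, zero diagonal
        }
    return P;
}
```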

3.2.2 Agglomerative Algorithms 2: Graph Theory

The other form of agglomerative algorithms comprises those based on graph theory. A graph G is an ordered pair G = (V, E), where V = {v_i, i = 1, …, N} is a set of points (nodes) and E is a set of edges, denoted e_ij or (v_i, v_j), connecting some of these points. If the order of the points (v_i, v_j) is not meaningful, the graph is undirected; otherwise it is directed. If no cost is associated with the edges, the graph is said to be unweighted, and if any of the edges has a cost, the graph is weighted. A more thorough introduction to graphs can be found, for example, in Ilkka (1997).

Small graphs can be illustrated easily by drawing them as in figure 3. The first graph, with 5 nodes, is complete (all points are connected to each other) and unweighted. The second graph is a subgraph of the first one, and the third is a path 1, …, 5 with weights assigned to each edge.

Figure 3: Different kinds of graphs.

In clustering algorithms we consider the graph nodes to be sample vectors from X. Clusters are formed by connecting these nodes together with edges. The basic algorithm in this case is known as the graph theory-based algorithmic scheme (GTAS). It is, again, very similar to GAS. The difference is in step 2b, which now becomes

min g_h(k)(C_i, C_j) (or max g_h(k)(C_i, C_j) if g denotes similarity)

where g_h(k)(C_i, C_j) is defined in terms of proximity and the property h(k) of the subgraphs. This property can differ depending on the desired result. In other words, clusters (subgraphs) are formed based on the distance and on whether the resulting subgraph has the property h(k) or is complete.

3.2.3 Divisive algorithms

Divisive algorithms are the opposite of the agglomerative ones: they start with a single cluster containing the entire set X and divide it in stages. The final clustering contains N clusters, one sample in each. The idea is to find the division that maximizes the dissimilarity. As an example, the generalized divisive scheme (GDS) may be defined as

1. t = 0; C_0 = {X}
2. while each vector is not yet in its own distinct cluster
   a. t = t + 1
   b. for i = 1 to t
      i. among all possible pairs of clusters find the maximal g(C_i, C_j)
   c. form a new clustering by separating the pair (C_i, C_j)

It is easy to see that this is computationally very demanding, and in practice simplifications are required. One way to speed up the process goes as follows. Let C_i be a cluster to be split into C_1 and C_2. Initially set C_1 = Ø and C_2 = C_i. Now, find the vector in C_2 whose average dissimilarity with the other vectors is maximal and move it to C_1. Next, for each remaining vector x (in C_2), compute its average dissimilarity with C_1 and C_2. If for every x the dissimilarity with C_2 is smaller than with C_1, then stop (we have found the division). Otherwise move the x that maximizes the similarity with C_1 and minimizes the similarity with C_2 into C_1. This process is continued until a division has been found, and it can be viewed as step 2.b.i of GDS.
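The splitting heuristic just described might be sketched as follows (illustrative names; ties and degenerate cases are handled naively):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

static double dist(const std::vector<double>& a, const std::vector<double>& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        s += (a[i] - b[i]) * (a[i] - b[i]);
    return std::sqrt(s);
}

// Average dissimilarity of x with the vectors of cluster C.
static double avgDist(const std::vector<double>& x,
                      const std::vector<std::vector<double>>& C) {
    if (C.empty()) return 0.0;
    double s = 0.0;
    for (const auto& y : C) s += dist(x, y);
    return s / static_cast<double>(C.size());
}

// Split a cluster into out1 and out2: seed out1 with the vector of maximal
// average dissimilarity, then keep moving the vector that most prefers
// out1 until every remaining vector prefers out2.
void splitCluster(std::vector<std::vector<double>> C2,
                  std::vector<std::vector<double>>& out1,
                  std::vector<std::vector<double>>& out2) {
    std::vector<std::vector<double>> C1;
    std::size_t seed = 0; double best = -1.0;
    for (std::size_t i = 0; i < C2.size(); ++i) {
        double a = avgDist(C2[i], C2);
        if (a > best) { best = a; seed = i; }
    }
    C1.push_back(C2[seed]); C2.erase(C2.begin() + seed);

    bool moved = true;
    while (moved && C2.size() > 1) {
        moved = false;
        std::size_t pick = 0; double gain = 0.0;
        for (std::size_t i = 0; i < C2.size(); ++i) {
            // Positive gain: x is closer (on average) to C1 than to C2.
            double g = avgDist(C2[i], C2) - avgDist(C2[i], C1);
            if (g > gain) { gain = g; pick = i; moved = true; }
        }
        if (moved) { C1.push_back(C2[pick]); C2.erase(C2.begin() + pick); }
    }
    out1 = C1; out2 = C2;
}
```

Each call produces one binary split; applying it repeatedly to the resulting clusters yields the divisive hierarchy.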

3.3 Cost Function Optimization Algorithms

A third genre of clustering algorithms comprises those based on a cost function. Cost function algorithms use a cost function J to define the sensibility of their solution. Usually the number of desired clusters is known beforehand, and differential calculus is used to optimize the cost function iteratively. Bayesian philosophy (Theodoridis, 1999) is also often applied. This category also includes fuzzy clustering algorithms, where a vector belongs to a cluster up to a certain degree. As a glance at cost function optimization algorithms, some of the theory of fuzzy clustering is discussed in the subchapter below.

3.3.1 Fuzzy Clustering

Fuzzy schemes have attracted a lot of interest and research in recent years. What is characteristic and unique to fuzzy schemes is that a sample belongs simultaneously to many categories. A fuzzy clustering is defined by a set of functions u_j : X → A, j = 1, …, m, with A = [0, 1]. A hard clustering scheme can be defined by setting A = {0, 1}.

Let Θ_j be a parameterized representative of cluster j, so that Θ = [Θ_1^T, …, Θ_m^T]^T, and let U be an N × m matrix with element (i, j) denoting u_ij = u_j(x_i). Then we can define a cost function of the form

J_q(U, Θ) = Σ_{i=1}^{N} Σ_{j=1}^{m} u_ij^q d(x_i, Θ_j)    (3.1)

which is to be minimized with respect to Θ and U. The parameter q is called the fuzzyfier. The constraint is that each sample belongs to the clusters at a total rate of 1:

Σ_{j=1}^{m} u_ij = 1, i = 1, …, N.    (3.2)

Minimizing J with respect to U (see Theodoridis (1999) for details) leads to

u_rs = 1 / Σ_{j=1}^{m} (d(x_r, Θ_s) / d(x_r, Θ_j))^{1/(q−1)}, r = 1, …, N, s = 1, …, m    (3.3)

With respect to Θ we take the gradient of (3.1) and obtain

∂J_q(U, Θ)/∂Θ_j = Σ_{i=1}^{N} u_ij^q ∂d(x_i, Θ_j)/∂Θ_j = 0, j = 1, …, m.    (3.4)

When combined, these two do not give a general closed-form solution. One solution is to use, for example, an algorithm known as the generalized fuzzy algorithmic scheme (GFAS) to iteratively estimate U and Θ; see Theodoridis (1999).

Finally, if for a point representative we use, for example, a common function d of the form

d(x_i, Θ_j) = (x_i − Θ_j)^T A (x_i − Θ_j)    (3.5)

then substituting this into (3.4) yields

∂J_q(U, Θ)/∂Θ_j = 2 Σ_{i=1}^{N} u_ij^q A (Θ_j − x_i) = 0

which is used in GFAS to obtain new representatives at each time step.
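Equation 3.3 in particular is simple to implement: given the distances from one sample to all m representatives, the memberships follow directly (a sketch; it assumes all distances are nonzero and q > 1):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Membership update of equation 3.3: d holds the distances d(x_r, Theta_j)
// from one sample to every cluster representative, q is the fuzzyfier.
// The returned memberships sum to 1, satisfying constraint 3.2.
std::vector<double> memberships(const std::vector<double>& d, double q) {
    const std::size_t m = d.size();
    std::vector<double> u(m);
    for (std::size_t s = 0; s < m; ++s) {
        double sum = 0.0;
        for (std::size_t j = 0; j < m; ++j)
            sum += std::pow(d[s] / d[j], 1.0 / (q - 1.0));
        u[s] = 1.0 / sum;
    }
    return u;
}
```

With q = 2 and distances 1 and 3, for example, the sample belongs to the nearer cluster with membership 0.75 and to the farther one with 0.25.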

4 OTHER ALGORITHMS

There are also other algorithms that do not belong to the groups mentioned in chapter 3. These include, for example, genetic algorithms, stochastic relaxation methods, competitive learning algorithms and morphological transformation techniques. Also, some graph theory algorithms are used, for example, algorithms based on the minimum spanning tree (MST); see Ilkka (1997). The following subchapter gives a quick tour of the ideas of competitive learning.

4.1 Competitive Learning

Competitive learning algorithms form a wide branch of algorithms used in many fields of science. What these algorithms actually do is clustering. They typically use a set of representatives w_j (like Θ in the previous chapter) which are moved around in the space R^n to match (represent) regions that include a relatively large number of samples. Every time a new sample is introduced, the representatives compete with each other and the winner is updated (moved). The other representatives can be updated at a slower rate, left alone or punished (moved away from the sample).

One of the most basic competitive algorithms is the generalized competitive learning scheme (GCLS), defined as

1. t = 0  // Time = 0
2. m = m_init  // Number of clusters
3. while NOT convergence AND (t < t_max)
   a. t = t + 1
   b. present a new sample x and determine the winner w_j
   c. if (x is NOT similar to w_j) AND (m < m_max)
      i. m = m + 1  // New cluster
      ii. w_m = x
   d. else  // Update parameters
      i. if winner: w_j(t) = w_j(t−1) + η h(x, w_j(t−1))
      ii. if not winner: w_j(t) = w_j(t−1) + η′ h(x, w_j(t−1))
4. The clusters are ready and represented by the w_j. Assign each sample to the cluster whose representative is the closest.

Parameters η and η′ are called learning rate parameters. They control the rate of change of the winners and losers. The function h is some function usually dependent on distance. Convergence can be assessed, for example, by calculating the total change in the vectors w_j and comparing it to a selected threshold value.
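A single competitive update step (3.b and 3.d of the scheme above, without the cluster-creation branch) might look like this; as illustrative simplifications, the losers are simply left alone and h(x, w) is taken to be the plain difference x − w:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One competitive update: find the winning representative (nearest to x
// in squared Euclidean distance) and move it toward the sample by
// learning rate eta. Losing representatives are left untouched here.
std::size_t updateWinner(std::vector<std::vector<double>>& w,
                         const std::vector<double>& x, double eta) {
    std::size_t winner = 0; double best = -1.0;
    for (std::size_t j = 0; j < w.size(); ++j) {
        double s = 0.0;
        for (std::size_t d = 0; d < x.size(); ++d)
            s += (x[d] - w[j][d]) * (x[d] - w[j][d]);
        if (best < 0.0 || s < best) { best = s; winner = j; }
    }
    for (std::size_t d = 0; d < x.size(); ++d)
        w[winner][d] += eta * (x[d] - w[winner][d]);  // move toward x
    return winner;
}
```

Repeated over many presented samples (with a decreasing η), the representatives drift toward the dense regions they end up representing.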

4.1.1 The Self-Organizing Map

By generalizing the definition of a representative and defining a neighborhood of representatives Q_j for each w_j, we obtain a model called the (Kohonen) self-organizing map (SOM). As time t increases, the neighborhood shrinks and concentrates around the representative. All representatives in the neighborhood of the winner are updated at each time step. What is important is the fact that the neighborhood is independent of the actual distances in space and is defined in terms of the indices j. That is, the geometrical distance in space is not the metric of the neighborhood.

The self-organizing map and its properties are formally defined in terms of a neural network in Haykin (1999). The original model from the article by Kohonen (1982) is the most popular and general model in use today. The Kohonen model is illustrated in figure 4. The input layer is connected to a two-dimensional array of neurons from which the winner is chosen. The weight of the winner (and its neighborhood) is then updated at each time step as in GCLS.

Figure 4: The Kohonen SOM model
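The index-based neighborhood can be made concrete with a small sketch: two grid positions belong to the same neighborhood purely by their index distance, and the radius shrinks as time t grows (the names and the linear shrinking rule are illustrative assumptions, not from the SOM literature):

```cpp
#include <cassert>
#include <cstdlib>

// SOM-style neighborhood on a 2D grid of representatives: membership is
// decided purely by the index distance |r1-r2| + |c1-c2|, never by
// distances in feature space, and the radius shrinks one step per time
// step until only the winner itself remains.
bool inNeighborhood(int r1, int c1, int r2, int c2, int t, int radius0) {
    int radius = radius0 - t;        // shrink over time
    if (radius < 0) radius = 0;      // eventually only the winner itself
    return std::abs(r1 - r2) + std::abs(c1 - c2) <= radius;
}
```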

4.2 Closing on Algorithm Presentation

This and the preceding chapter introduced the most general algorithms used in clustering. Every algorithm presented is basic in nature and should not be very hard to understand or implement. The presentation is not complete, and many algorithms of value are omitted due to the limitations set for the length of this document. To close, figure 5 gives an overview of the families of clustering algorithms mentioned above. The hierarchy is neither absolute nor complete, and other kinds of divisions could also be made. It is provided to clarify the different branches in the way the algorithms operate.

Figure 5: The family of clustering algorithms

5 THE IMPLEMENTATION

This chapter presents an implementation of four of the above algorithms. The implementation works in the Microsoft Windows environment and is coded in the C++ programming language using the Microsoft Visual Studio .NET compiler. C++ was chosen for this task because it is an industry standard, it produces efficient programs, and because of its power of expression. All the relevant code is provided in the appendix and is available, along with the application, from the author.

What is needed before the actual implementation is a way to store and handle the vectors used as containers for the features. For this a vector container class was created. From here on it is assumed that there are no more than three features to deal with. This limitation exists because only three dimensions (3D) can easily be projected onto a 2D display. The vector class (Vector3) has these three components, for which all the normal vector arithmetic is implemented as class operators. The next chapter provides an efficient way to display these vectors and their clusterings with the OpenGL API (application program interface), often used in professional 3D graphics.

The base class for all clustering algorithms was named CClustAlgorithm. It provides some virtual functions all algorithms must implement. This way there is a unified base from which all the algorithms must be derived. The most important function (Java has methods, C++ has functions) to implement is

virtual ClList* Clusterize(const VList* vectors, CCluster* empty) const;

It returns an std::list of clusters formed from the given vectors. The empty cluster parameter exists so that different kinds of subclasses can be used: Clusterize always creates the same kind of objects as is given to it. That way one can, for example, use clusters with different kinds of representatives.

The CCluster class represents a single cluster. It holds a list of all the vectors belonging to it and a representative (mean value). It can be used as a base class if other kinds of representatives are needed. Vectors can be added and removed, and distances to other clusters and individual vectors can be calculated. Also, other clusters can be included into a cluster, thus forming a union. This functionality is enough for an efficient implementation of the algorithms in the subchapters below.

5.1 MBSAS

The implementation of the MBSAS algorithm is straightforward. Initialization includes creating an initial cluster and adding the first element of the list to it. After that, the "create clusters" pass is performed. One iterator goes through all the samples and another iterator goes through the list of clusters (for every sample). The cluster with minimal distance is retrieved and the distance is compared to the given threshold value. If the distance is greater than the threshold and we may still create a new cluster (iq is the maximum amount), we do so. As code, the "create clusters" pass is

    for (; iter != tmplist.end(); iter++) {
        tmp = *iter;
        float mindist = FLT_MAX;
        // Find the minimum distance
        for (iter2 = ClusterList->begin(); iter2 != ClusterList->end();
             iter2++) {
            float dist = (*iter2)->Distance(tmp);
            if (dist < mindist)
                mindist = dist;
        }
        // Create a new cluster?
        if ((mindist > fTheta) && ((int)ClusterList->size() < iq)) {
            CCluster* newclust = empty->GetNewCluster();
            newclust->AddVector(tmp);
            ClusterList->push_back(newclust);
        }
    }

Now, all the samples already assigned to a cluster are removed from the list. After that, a second pass is made to classify all the rest. Each sample is added to the cluster with minimal distance, just as above, but new clusters are no longer created.

The implementation works well and is (together with TTSAS) the fastest of the algorithms provided here. The quality of the result depends on a correct choice of the threshold value, as can be seen in chapter 7.

5.2 TTSAS

In the TTSAS algorithm the list of vectors is first transformed into a normal array of vectors. This might not be necessary, but it makes it easy to build another array holding the information about whether each sample is classified or not (clas). The implementation here does not limit the number of clusters like the MBSAS implementation above. What is done is exactly what chapter 3.1.2 describes. Part 4.a.i goes like

    if (!clas[i] && existsChange == 0 && !gotOne) {
        // Let's make sure the while ends at some point :)
        CCluster* clust = empty->GetNewCluster();
        clust->AddVector(tmplist[i]);
        ClusterList->push_back(clust);
        clas[i] = true;
        curChange++; numDone++; gotOne = true;
    }

Next, if the sample is not classified (4.a.ii), we search for the minimum distance cluster and, based on the two threshold values, create a new cluster, add the sample to an existing cluster or just leave it for a later pass:

    if (mindist < fTheta1) { // found the same kind
        minclust->AddVector(tmplist[i]);
        clas[i] = true;
        curChange++; numDone++;
    }
    else if (mindist > fTheta2) { // need to create a new one
        CCluster* clust = empty->GetNewCluster();
        clust->AddVector(tmplist[i]);
        ClusterList->push_back(clust);
        clas[i] = true;
        curChange++; numDone++;
    }

All this is done for the entire list, pass by pass, until every sample belongs to a cluster. The first if guarantees that no more than N passes are made (the algorithm stays O(N²)). Naturally, in practice, the number of passes is smaller. The result is, again, dependent on the threshold values. The speed of this implementation is on par with the MBSAS implementation or a little better.

5.3 GAS

The GAS algorithm is also quite short and easy to implement. Notice that the entire hierarchy of clusterings is not saved in this implementation. First, an initial clustering of N clusters with one sample in every cluster is created. Then, while there are more clusters than the suggested amount, the two that are closest to each other are sought and combined:

// Seek the two clusters that have min distance (slow)...

for (iter2 = ClusterList->begin(); iter2 != ClusterList->end();

iter2++) {

iter2++; // Temporarily advance so that iter3 below starts one past iter2

for (iter3 = iter2--; iter3 != ClusterList->end(); iter3++) {

float dist = (*iter2)->Distance(*iter3);

if (dist < mindist) {

mindist = dist; minclust1 = *iter2; minclust2 = *iter3;

}

}

}

// ...and combine them

if (minclust2 != NULL) {

minclust1->Include(*minclust2);

ClusterList->remove(minclust2);


delete minclust2;

}

The only real problem here is the slowness of the algorithm. It takes a lot of time to go through all the levels of clustering, which confirms the expectation that this algorithm is quite slow compared to the two algorithms above. On the other hand, there are no threshold values that would need to be carefully selected, as can be seen in chapter 7, and thus this algorithm is easier and safer to use. If the number of clusters is relatively large compared to the number of samples, this algorithm should be more competitive since fewer iteration steps are then performed.
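The merge loop can be sketched on one-dimensional data. This is an illustrative miniature, not the CGAS class, but it makes the cost argument concrete: exactly N − q merges are performed, and each merge searches all cluster pairs.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative agglomerative sketch: start from one cluster per sample and
// repeatedly merge the two clusters with the nearest means until only `q`
// clusters remain.
std::vector<std::vector<float>> gas(const std::vector<float>& data,
                                    std::size_t q) {
    std::vector<std::vector<float>> clusters;
    for (float x : data) clusters.push_back({x});
    auto mean = [](const std::vector<float>& c) {
        float s = 0.f;
        for (float x : c) s += x;
        return s / c.size();
    };
    while (clusters.size() > q) {
        std::size_t a = 0, b = 1;
        float best = HUGE_VALF;
        // Search every pair of clusters for the minimal squared mean distance.
        for (std::size_t i = 0; i + 1 < clusters.size(); ++i)
            for (std::size_t j = i + 1; j < clusters.size(); ++j) {
                float d = mean(clusters[i]) - mean(clusters[j]);
                if (d * d < best) { best = d * d; a = i; b = j; }
            }
        // Merge cluster b into a, as CCluster::Include does in the appendix.
        clusters[a].insert(clusters[a].end(),
                           clusters[b].begin(), clusters[b].end());
        clusters.erase(clusters.begin() + b);
    }
    return clusters;
}
```

Stopping the while loop early, as soon as q clusters remain, is what makes the algorithm competitive when q is close to N.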

5.4

GDS

The general divisive scheme is even more demanding than GAS. That is why the general version of the algorithm was not implemented; instead an optimized version is presented (somewhat like the version in 3.2.3). The optimization is based on the assumption that an outlier sample (the one farthest away from the medium) is very likely to belong to a different cluster than the one it is currently in. Also, as with GAS, this algorithm is not driven to the end but is interrupted as soon as the suggested number of clusters is found. This actually helps a lot if the number of clusters is relatively small.

Initially a single cluster including all samples is created. Then, while there are fewer than the desired number of clusters, the cluster with the farthest outlier is selected as the one to be divided in two. The outlier is moved to a cluster of its own. Then the distance to the new cluster is calculated for every other vector in the old cluster and measured against the distance to the cluster's own representative. The vector closest to the new cluster is selected, and if it is nearer to the new cluster than to the old one it is moved to the new cluster. This is continued until no vector is moved during one pass (all vectors in the old cluster are nearer to their own representative than to the new one):

while (foundOne) {

foundOne = false; // Let's see if we find any

fVector3 vect(0, 0, 0);

maxdist = FLT_MAX;

// Go through all the samples in the old cluster

for (iter = maxclust->GetVectors().begin();

iter != maxclust->GetVectors().end(); iter++) {

// Dist. to the new clust.

maxdist2 = newclust->Distance(*iter);

if (maxclust->Distance(*iter) > maxdist2 && maxdist2 < maxdist) {

foundOne = true;

maxdist = maxdist2;

vect = *iter;

}

}

if (foundOne) { // We did find one sample?

newclust->AddVector(vect);

maxclust->RemoveVector(vect);

}

}

The improvements presented significantly diminish the time needed for a single pass: instead of always examining all possible combinations, we select a single vector and attach similar vectors to it from the old cluster. Due to the optimizations, the runtime of this algorithm is closer to MBSAS and TTSAS than to GAS. The quality of the result is generally close to that of GAS.
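One divisive step can be written out as an illustrative miniature on scalars (not the CGDS code itself): peel the outlier into a new cluster, then repeatedly move over the sample nearest to the new cluster as long as it is closer to the new mean than to the old one.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

float mean_of(const std::vector<float>& c) {
    float s = 0.f;
    for (float x : c) s += x;
    return s / c.size();
}

// One divisive step: split `old_c` around its outlier, returning the
// remaining old cluster and the newly formed one.
std::pair<std::vector<float>, std::vector<float>>
split_once(std::vector<float> old_c) {
    // Find the outlier: the sample farthest from the mean.
    float m = mean_of(old_c);
    std::size_t out = 0;
    for (std::size_t i = 1; i < old_c.size(); ++i)
        if (std::fabs(old_c[i] - m) > std::fabs(old_c[out] - m)) out = i;
    std::vector<float> new_c{old_c[out]};
    old_c.erase(old_c.begin() + out);
    // Move over the sample nearest to the new cluster, while it is
    // closer to the new mean than to the old one.
    bool found = true;
    while (found) {
        found = false;
        float best = HUGE_VALF;
        std::size_t pick = 0;
        float mn = mean_of(new_c), mo = mean_of(old_c);
        for (std::size_t i = 0; i < old_c.size(); ++i) {
            float dn = std::fabs(old_c[i] - mn);
            if (std::fabs(old_c[i] - mo) > dn && dn < best) {
                found = true; best = dn; pick = i;
            }
        }
        if (found) {
            new_c.push_back(old_c[pick]);
            old_c.erase(old_c.begin() + pick);
        }
    }
    return std::make_pair(old_c, new_c);
}
```

For the samples {0, 1, 2, 10, 11, 13} a single step peels 13 off as the outlier and then pulls 11 and 10 over, leaving {0, 1, 2} behind.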

6

THE VISUALIZATION ENGINE

The visualization of clusterings is problematic if the number of features is large. This chapter provides a way to display three-element vectors and their clusterings with the OpenGL 1.2 API. The engine can also be seen as a tutorial on using OpenGL in the Microsoft Windows environment to create 3D visualizations. It consists of a single class CGLRenderer, which acts as a wrapper between OpenGL and the rest of the program. The user, or programmer, needs no knowledge of OpenGL when using other parts of the code.

This class provides the basic functions needed to initialize OpenGL, draw elements to the screen, move and rotate the camera and display text on the screen. The most important functions of CGLRenderer are a function to draw a list of clusters

void DrawClusters(const ClList* list);

and the two functions that move and rotate the viewpoint (camera) in the coordinate system, allowing the viewer to move freely in space:

void MoveCamera(float advance, float sideways, float up = 0.f);

void RotateCamera(float xrot, float yrot);
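The body of MoveCamera is not listed in the appendix, but a typical implementation translates the eye position along the camera's own orthonormal frame (the direction, left and up vectors that RotateCamera maintains). The following is a hedged sketch with illustrative type and member names, not the engine's actual code:

```cpp
// Illustrative sketch of a MoveCamera-style update. The Vec3 type and the
// member names (pos, dir, left, up) are assumptions for illustration only.
struct Vec3 {
    float x, y, z;
    Vec3 operator+(Vec3 o) const { return Vec3{x + o.x, y + o.y, z + o.z}; }
    Vec3 operator*(float s) const { return Vec3{x * s, y * s, z * s}; }
};

struct Camera {
    Vec3 pos{0, 0, 0};
    Vec3 dir{0, 0, -1};   // view direction
    Vec3 left{-1, 0, 0};  // "left" vector
    Vec3 up{0, 1, 0};     // up vector
    // Translate the eye point along the camera's own frame, so that
    // "advance" always means "toward whatever the viewer is looking at".
    void Move(float advance, float sideways, float upward = 0.f) {
        pos = pos + dir * advance + left * sideways + up * upward;
    }
};
```

Because the frame vectors are re-orthonormalized after every rotation, this kind of update lets the viewer fly freely without the axes drifting apart.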

If selected, the engine draws gray bounding spheres around the clusters so that it is easier to see the limits of a single cluster (key V). Figure 6 shows a typical view of five separate clusters in space. The gray spheres represent the borders of the different clusters, while the colorful dots inside the spheres are individual samples. The three white lines represent the coordinate axes. The user can change the viewpoint and view direction with the mouse and mouse buttons; any viewpoint and any view direction may be chosen, as this is not limited by the application.

The user can change the algorithm to show by pressing the number keys. If the selected algorithm is disabled, no clusters are shown (just the coordinate axes). The text display in the bottom left corner of the screen shows the current algorithm, the number of found clusters and the viewer position (the display is not visible in the resized images of this document). If the user minimizes the application it goes into a so-called idle mode, where it consumes much less processor time, allowing other applications to run more efficiently.


Figure 6: Typical view of five clusters

6.1

Data Generation and Algorithm Initialization

Data can be generated with a simple generator class CDataGenerator. It has static functions that can be called from anywhere to get data generated with a simple random function. The data generation setup screen is illustrated in figure 7. The user inputs the number of clusters and vectors, and the data generator produces such a data set. The compactness is a value indicating how compact the clusters should be. The file section is reserved for future use, when data could be read from a file generated, for example, with Matlab. The user should note that after this dialog the algorithms are also reinitialized with the given number of clusters. If other parameters are wanted, the algorithm setup dialog in figure 8 must be used.

After either of these dialogs the algorithms are run through with the data. If the number of sample vectors is great, the response to the user might be slow. This is due to the fact that the application is single threaded, meaning that while an algorithm is being run the display is not updated.

Algorithms and their parameters are initialized with another dialog, illustrated in figure 8. From this dialog the GAS algorithm can also be disabled if a large amount of data needs to be clustered. When all the parameters have been filled in, the user presses OK and the data is reclustered using the new parameters for the algorithms.


Figure 7: The data generation setup.

Figure 8: The algorithm setup.

7

THE TESTS

This chapter presents some results of running the algorithms with different parameters and different numbers of samples and clusters. No solid statistical analysis was made of the results. A more thorough analysis is beyond the scope of this basic document; the results are provided as a guideline for general interest in the behavior of the algorithms.

First the speed of the algorithms was analyzed against the number of samples. Table 1 shows the result (the number of clusters was kept at 20). The tests were run on an AMD Athlon Classic 700MHz with 512kB of L2 cache and 512MB of memory on Windows XP. The GAS algorithm was not run on the three largest data sets, because it would have taken a lot of time and the O(N³) behavior of the algorithm can already be seen from the smaller sets. The performance of the other algorithms seems to be proportional to N², where one pass of GDS is relatively long compared to MBSAS or TTSAS. Notice that due to the way the runtime was measured, times below 100ms cannot be considered accurate.

Table 1. Runtime for different amounts of data

Algorithm  N=1000  N=2000  N=4000  N=8000  N=16000  N=32000
MBSAS      10ms    10ms    20ms    70ms    310ms    960ms
TTSAS      0ms     0ms     20ms    60ms    320ms    900ms
GDS        80ms    360ms   1.5s    7.1s    37.4s    231.6s
GAS        12.4s   185.1s  0.4h    -       -        -
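As a sanity check on the claimed N² trend, the measured times can be compared pairwise: under quadratic behavior, doubling N should roughly quadruple the runtime. The helper below is illustrative, with the times copied from Table 1; only the rows above the 100ms accuracy limit are worth checking.

```cpp
// Check whether doubling the sample count multiplied the runtime by a factor
// near four, as O(N^2) behavior predicts. The loose bounds allow for the
// measurement noise mentioned in the text.
bool roughly_quadratic(float t_n, float t_2n) {
    float r = t_2n / t_n; // expected near 4 for an O(N^2) algorithm
    return r > 2.f && r < 8.f;
}
```

For MBSAS, 310ms/70ms ≈ 4.4 and 960ms/310ms ≈ 3.1, both close to the factor of four that a quadratic algorithm predicts.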

7.1

The Effect of Parameters

The effect of the parameters was also examined with randomly selected sets of data.

First, figure 9 shows the result of incorrectly chosen parameters for TTSAS.

Figure 9: The meaning of correct parameters.

The figure on the left shows how TTSAS with incorrectly large theta values (1000, 2000) misclassifies a part of the blue cluster into the red cluster (the cluster on the left). When the theta values are lowered to a more reasonable level (500, 1500), TTSAS creates the correct clusters in the picture on the right. The value 500 corresponds roughly to a distance of 20 in space. This problem does not emerge with GAS or GDS since they have no dissimilarity parameters.
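The correspondence between a theta value and a distance in space follows from the fact that the Distance() functions in the appendix return squared distances, so a threshold of 500 admits samples within about √500 ≈ 22 units of a cluster mean, consistent with the "roughly 20" figure above:

```cpp
#include <cmath>

// A squared-distance threshold theta corresponds to a Euclidean radius of
// sqrt(theta) around the cluster representative.
float threshold_radius(float theta) { return std::sqrt(theta); }
```

This also explains why the theta values must be rescaled whenever the spread of the generated data changes.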

Figure 10 illustrates the case where an incorrect number of clusters was guessed and given to the algorithms as a parameter (four instead of five).

Figure 10: The meaning of the correct number of clusters

The top leftmost image shows the behavior of MBSAS: the blue cluster (on the left) is incorrectly large when compared to the top rightmost image of TTSAS. The result of TTSAS is most likely the correct one, since it does not use the number-of-clusters parameter at all. The behavior of GAS, at the bottom left, is similar to MBSAS, while GDS creates a big red cluster (on the right) including the samples of the green cluster. GDS probably differs from GAS because the red cluster of GDS is more compact than the blue cluster of GAS, making its outlier closer to the average value for the GDS algorithm.

8

CONCLUSION

This paper discussed the problem of clustering and the most popular algorithms of that particular field of science. The basic concepts needed to understand the functionality of the algorithms were discussed in chapter two. Chapter three provided an overview of the most popular algorithms and their behavior. Chapter four completed the tour with a couple of special-purpose algorithms. Chapter five included an implementation of four of the discussed algorithms, and chapter six presented a way to visualize the product of the algorithms with OpenGL. Finally, in chapter 7 the algorithms were run on the framework and different kinds of parameters and samples were considered. This paper was meant as an introduction to the algorithms; a further purpose was to create an efficient way to display the clusterings of individual data elements.

If something remains to be done, it is a more thorough analysis of the behavior of the algorithms. Especially cases where the samples are badly balanced or in some particular order should be generated and analyzed. The application could easily be complemented with a screenshot feature to automatically generate printable figures of the current display. It would also be easy to add a feature to read the sample data from a file.

Finally, a complete set of the source code is available from the author on request. The appendix provides all the commented key elements of the code. The code is provided for educational purposes. A complete listing would be unacceptably long to include in this document and would serve no purpose.

REFERENCES

Haykin, S. 1999. Neural Networks a Comprehensive Foundation. Second Edition.

Prentice Hall. 842 p.

Ilkka, S. 1997. Diskreettiä Matematiikkaa. Fifth Edition. Otatieto. 165 p.

Kohonen, T. 1982. Self-Organized Formation of Topologically Correct Feature Maps.

Biological Cybernetics. vol 43. pp. 59-69.

Kreyszig, E. 1993. Advanced Engineering Mathematics. Seventh Edition. Wiley. 1271 p.

Theodoridis, S.; Koutroumbas, K. 1999. Pattern Recognition. Academic Press. 625 p.

Weiss, M. 1997. Data Structures and Algorithm Analysis in C. Second Edition. Addison-Wesley. 511 p.

APPENDIX

/**

* Cluster.h

*

* Provides a basic cluster class including sample vectors.

*

* Commercial use without written permission from the author is forbidden.

* Other use is allowed provided that this original notice is included

* in any form of distribution.

* @author Jukka Kainulainen 2002 jkainula@cc.hut.fi

*/

/**

* The definition of a single cluster. A cluster has a list of vectors

* belonging to it. Vectors can be added or removed to/from a cluster.

* Other classes can be derived from this class, for example ones with
* other kinds of representatives. This class uses a medium representative.

*

*/

class CCluster

{

protected:

/** The vectors belonging to this cluster */

VList Vectors;

/** The medium representative */

fVector3 Medium;

/** The current outlier sample */


fVector3 Outlier;

/** The distance of the outlier from representative */

float fOutlierDist;

/** Is the outlier valid */

bool bOutlierValid;

/** Updates the representative value */

virtual void UpdateMedium();

/** Updates the outlier sample (the one farthest from the representative) */

virtual void UpdateOutlier();

public:

CCluster(void);

/** Is this cluster empty? */

bool IsEmpty() const;

/** Add a vector to this cluster */

virtual void AddVector(const fVector3& vec);

/** Remove a vector from this cluster */

virtual void RemoveVector(const fVector3& vec);

/** Includes all vectors from the given cluster to this one also. */

virtual void Include(CCluster& cluster);

/** Returns a reference to the list of vectors */

virtual const VList& GetVectors() const;

/** Returns the representative vector */

virtual fVector3 GetRepresentative() const

{ return Medium; }

/** Returns the outlier vector */

virtual fVector3 GetOutlier();

/** Returns the outlier distance */

virtual float GetOutlierDist();

/** Returns a new cluster object. Override for different clusters. */

virtual CCluster* GetNewCluster()

{ return new CCluster(); }

/** Returns the squared distance to given vector. */

virtual float Distance(const fVector3& vec);

/** Returns the squared distance to another cluster */

virtual float Distance(const CCluster* clust);

/** Clears and deletes this cluster */

virtual ~CCluster(void);

};

/** A list of cluster pointers */

typedef list<CCluster*> ClList;

// From Cluster.cpp

void CCluster::Include(CCluster& cluster)

{

VList::iterator i;

for (i = cluster.Vectors.begin(); i != cluster.Vectors.end(); ++i) {

Vectors.push_back(*i);

}

UpdateMedium();

bOutlierValid = false;

// The caller should remove the elements from the other one...

}

const VList& CCluster::GetVectors() const

{

return Vectors;

}

float CCluster::Distance(const fVector3& vec)

{

return Medium.SquaredDistance(vec);

}

float CCluster::Distance(const CCluster* clust)

{

return Medium.SquaredDistance(clust->GetRepresentative());

}


/**

* ClustAlgorithm.h

*

* Commercial use without written permission from the author is forbidden.

* Other use is allowed provided that this original notice is included

* in any form of distribution.

* @author Jukka Kainulainen 2002 jkainula@cc.hut.fi

*/

#pragma once

#include "Cluster.h"

/**

* Provides a base class for all clustering algorithms.

*/

class CClustAlgorithm

{

public:

CClustAlgorithm() {};

virtual void SetParameters(float theta, int q = 0) = 0;

/**

* Creates a clustering from vectors using the given empty cluster class.

* You can give different kinds of cluster subclasses as parameter.

*

*/

virtual ClList* Clusterize(const VList* vectors, CCluster* empty) const = 0;

/** Returns the number of clusters searched for, 0 if not in use */

virtual int GetClusters()

{ return 0; }

/** Returns the theta parameter used, 0 if not in use */

virtual float GetTheta()

{ return 0.f; }

virtual float GetTheta2()

{ return 0.f; }

/** Returns the name of the algorithm */

virtual const char* GetName() const = 0;

virtual ~CClustAlgorithm(void) {};

};

// From MBSAS.cpp

ClList* CMBSAS::Clusterize(const VList* vectors, CCluster* empty) const

{

int numVectors = 0;

int clusters = 0;

fVector3 tmp;

if ((vectors == NULL) || (empty == NULL))

return NULL;

if ((int)vectors->size() < 1)

return NULL;

// This is not the optimum way....

VList tmplist = *vectors;

ClList* ClusterList = new ClList();

VList::iterator iter;

ClList::iterator iter2;

// Fill with the initial vector

iter = tmplist.begin();

empty->AddVector(*iter);

iter++;

ClusterList->push_back(empty);

// 'Create the clusters' pass

for (; iter != tmplist.end(); iter++) {

tmp = *iter;

float mindist = FLT_MAX;

// Find the minimum distance

for (iter2 = ClusterList->begin(); iter2 != ClusterList->end(); iter2++) {

float dist = (*iter2)->Distance(tmp);

if (dist < mindist)

mindist = dist;


}

// Create a new cluster?

if ((mindist > fTheta) && ((int)ClusterList->size() < iq)) {

CCluster* newclust = empty->GetNewCluster();

newclust->AddVector(tmp);

ClusterList->push_back(newclust);

}

}

// Now we have to remove the already taken samples...

for (iter2 = ClusterList->begin(); iter2 != ClusterList->end(); iter2++) {

tmp = (*iter2)->GetRepresentative(); // Representative is the only one

tmplist.remove(tmp);

}

// And then we classify the rest...

for (iter = tmplist.begin(); iter != tmplist.end(); iter++) {

tmp = *iter;

float mindist = FLT_MAX;

CCluster* minclust = NULL;

// Find the minimum distance cluster...

for (iter2 = ClusterList->begin(); iter2 != ClusterList->end(); iter2++) {

float dist = (*iter2)->Distance(tmp);

if (dist < mindist) {

mindist = dist;

minclust = *iter2;

}

}

minclust->AddVector(tmp); // ...and add to it

}

return ClusterList;

}

// From TTSAS.cpp...

ClList* CTTSAS::Clusterize(const VList* vectors, CCluster* empty) const

{

if ((vectors == NULL) || (empty == NULL))

return NULL;

if ((int)vectors->size() < 1)

return NULL;

// We'll do this the old way...

bool* clas = new bool[vectors->size()];

fVector3* tmplist = new fVector3[vectors->size()];

VList::const_iterator iter;

int i = 0;

for (iter = vectors->begin(); iter != vectors->end(); iter++, i++) {

tmplist[i] = *iter;

clas[i] = false;

}

ClList* ClusterList = new ClList();

int numVectors = (int)vectors->size();

int numDone = 0; // Number of classified samples

int existsChange = 0; // Classified something new during last pass

int curChange = 0; // Current number of classified samples

int prevChange = 0; // Number of classified samples during last pass

float mindist = FLT_MAX;

CCluster* minclust = NULL;

ClList::iterator iter2;

while (numDone < numVectors) {

bool gotOne = false;

for (i = 0; i < numVectors; i++) {

if (!clas[i] && existsChange == 0 && !gotOne) {

// Let's make sure the while ends at some point :)

CCluster* clust = empty->GetNewCluster();

clust->AddVector(tmplist[i]);

ClusterList->push_back(clust);

clas[i] = true;

curChange++; numDone++; gotOne = true;

}

else if (clas[i] == 0) {

mindist = FLT_MAX;

minclust = NULL;

// Find the minimum distance cluster...

for (iter2 = ClusterList->begin(); iter2 != ClusterList->end(); iter2++) {


float dist = (*iter2)->Distance(tmplist[i]);

if (dist < mindist) {

mindist = dist;

minclust = *iter2;

}

}

if (mindist < fTheta1) { // found the same kind

minclust->AddVector(tmplist[i]);

clas[i] = true;

curChange++; numDone++;

}

else if (mindist > fTheta2) { // need to create a new one

CCluster* clust = empty->GetNewCluster();

clust->AddVector(tmplist[i]);

ClusterList->push_back(clust);

clas[i] = true;

curChange++; numDone++;

}

}

else // clas == 1

curChange++;

}

existsChange = abs(curChange - prevChange);

prevChange = curChange; curChange = 0;

}

delete empty;

delete[] clas;

delete[] tmplist;

return ClusterList;

}

// From GDS.cpp...

ClList* CGDS::Clusterize(const VList* vectors, CCluster* empty) const

{

if ((vectors == NULL) || (empty == NULL))

return NULL;

if ((int)vectors->size() < 1)

return NULL;

// Create the initial clustering...

ClList* ClusterList = new ClList();

VList::const_iterator iter;

CCluster* tmp = empty->GetNewCluster();

for (iter = vectors->begin(); iter != vectors->end(); iter++)

tmp->AddVector(*iter);

ClusterList->push_back(tmp);

float maxdist, maxdist2;

CCluster* maxclust;

ClList::iterator iter2;

while ((int)ClusterList->size() < iq) {

maxdist = 0;

maxclust = NULL;

// Find the cluster that has maximal outlier element...

for (iter2 = ClusterList->begin(); iter2 != ClusterList->end(); iter2++)

if ((*iter2)->GetOutlierDist() > maxdist) {

maxdist = (*iter2)->GetOutlierDist();

maxclust = *iter2;

}

// Move the outlier to a new cluster

CCluster* newclust = empty->GetNewCluster();

newclust->AddVector(maxclust->GetOutlier());

maxclust->RemoveVector(maxclust->GetOutlier());

ClusterList->push_back(newclust);

bool foundOne = true;

// While we found a vector more similar to the new cluster...

while (foundOne) {

foundOne = false; // Let's see if we find any

fVector3 vect(0, 0, 0);

maxdist = FLT_MAX;

// Go through all the samples in the old cluster

for (iter = maxclust->GetVectors().begin(); iter != maxclust->GetVectors().end(); iter++) {

maxdist2 = newclust->Distance(*iter); // Dist. to the new clust.

if (maxclust->Distance(*iter) > maxdist2 && maxdist2 < maxdist) {

foundOne = true;

maxdist = maxdist2; // The closest one to the new cluster

vect = *iter;

}


}

if (foundOne) { // We did find one sample?

newclust->AddVector(vect);

maxclust->RemoveVector(vect);

}

}

}

delete empty;

return ClusterList;

}

// From GAS.cpp...

ClList* CGAS::Clusterize(const VList* vectors, CCluster* empty) const

{

if ((vectors == NULL) || (empty == NULL))

return NULL;

if ((int)vectors->size() < 1)

return NULL;

// Create the initial clustering...

ClList* ClusterList = new ClList();

VList::const_iterator iter;

for (iter = vectors->begin(); iter != vectors->end(); iter++) {

CCluster* tmp = empty->GetNewCluster();

tmp->AddVector(*iter);

ClusterList->push_back(tmp);

}

ClList::iterator iter2;

ClList::iterator iter3;

float mindist;

CCluster* minclust1;

CCluster* minclust2;

while ((int)ClusterList->size() > iq) {

mindist = FLT_MAX;

minclust1 = NULL;

minclust2 = NULL;

// Seek the two clusters that have min distance (slow)...

for (iter2 = ClusterList->begin(); iter2 != ClusterList->end(); iter2++) {

iter2++; // Temporarily advance so that iter3 below starts one past iter2

for (iter3 = iter2--; iter3 != ClusterList->end(); iter3++) {

float dist = (*iter2)->Distance(*iter3);

if (dist < mindist) {

mindist = dist; minclust1 = *iter2; minclust2 = *iter3;

}

}

}

// ...and combine them

if (minclust2 != NULL) {

minclust1->Include(*minclust2);

ClusterList->remove(minclust2);

delete minclust2;

}

}

delete empty;

return ClusterList;

}

// From GLRenderer.cpp

void CGLRenderer::DrawVolumes(const ClList* list)

{

int c = 0;

glColor4f(0.5f, 0.5f, 0.5f, 0.4f);

glEnable(GL_BLEND);

glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

for (ClList::const_iterator iter = list->begin(); iter != list->end(); iter++, c+=3){

glPushMatrix();

float dist = (*iter)->GetOutlierDist();

fVector3 out = (*iter)->GetRepresentative();

glTranslatef(out.x, out.y, out.z);

glutWireSphere(sqrt(dist), 8, 8);

glutSolidSphere(sqrt(dist), 8, 8);

glPopMatrix();

}

glDisable(GL_BLEND);

}


void CGLRenderer::RotateCamera(float xrot, float yrot)

{

// Around the up vector...

vdir = vdir.Rotate(vup, 3.1415f * yrot / 180.f);

vleft = vleft.Rotate(vup, 3.1415f * yrot / 180.f);

vup = vdir.Cross(vleft); // Just to make sure we don't get messed up

// Around the "left" vector...

vup = vup.Rotate(vleft, 3.1415f * xrot / 180.f);

vdir = vdir.Rotate(vleft, 3.1415f * xrot / 180.f);

vleft = vup.Cross(vdir);

vdir.Normalize();

vleft.Normalize();

vup.Normalize();

}
