Fuzzy Clustering Techniques: Fuzzy C-Means and Fuzzy Min-Max Clustering Neural Networks
Benjamin James Bush
SSIE 617 Term Paper, Fall 2012
1 INTRODUCTION
Data clustering is a data processing strategy which aims to organize a collection of data points (hereafter simply called points) into groups. Traditionally, the data set is partitioned so that each point belongs to one and only one cluster. However, unless the data is very highly clustered, it is often the case that some points do not completely belong to any one cluster. With the arrival of fuzzy clustering, these points could be assigned a set of membership degrees, one for each cluster, instead of being artificially pigeonholed as belonging to only one.

The volume of literature available on fuzzy clustering is immense; a general review of the literature is outside the scope of this term paper. This paper discusses only two approaches to fuzzy clustering: the ubiquitous Fuzzy C-Means clustering algorithm and the less well known but interesting Fuzzy Min-Max Clustering Neural Network. These approaches are discussed in sections 2 and 3, respectively. In section 4 I will briefly discuss several applications which use the fuzzy clustering techniques covered here.
2 THE FUZZY C-MEANS (FCM) CLUSTERING ALGORITHM
Fuzzy C-Means, also known as Fuzzy K-Means and Fuzzy ISODATA, is one of the oldest and most ubiquitous fuzzy clustering algorithms. FCM is a generalization of the K-Means clustering algorithm, which is a simple and widely used method for finding crisp clusters. Understanding FCM's crisp ancestor is instructive and is discussed below.
2.1 K-MEANS CLUSTERING
The "K" in K-Means refers to the fact that in K-Means clustering, the number of clusters is decided before the process begins. The "Means" in K-Means refers to the fact that each cluster is characterized by the mean of all the points that belong to the cluster. Thus, in K-Means clustering our goal is literally to find K means, thereby giving us the K clusters we seek. In particular, the means we seek are those which minimize the cost function depicted in the following figure:
Figure 1: Cost function minimized during K-Means clustering. Equation taken from [1]. Annotations by me.
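For reference, the K-Means cost function can be written in standard notation (the symbols here follow common usage and may differ from the annotations in the original figure):

```latex
J \;=\; \sum_{k=1}^{K} \;\sum_{x_i \in C_k} \lVert x_i - c_k \rVert^2
```

where $C_k$ is the set of points assigned to cluster $k$ and $c_k$ is the centroid of that cluster. K-Means seeks the $K$ centroids that minimize $J$.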
The process is initialized by picking K different "centroids" at random from the space in which the points are embedded. From here, the K-Means process can be divided into two phases:

Phase 1: Form Clusters. Each centroid is associated with a different cluster. To form these clusters, each point in the data set is evaluated in turn. When evaluated, a point is assigned to the cluster corresponding to the closest centroid.

Phase 2: Move Centroids. Each of the centroids is now moved to the position obtained by taking the mean of each of the points in the cluster associated with the centroid.

These two phases are repeated in turn until convergence is reached (i.e. until the value of the cost function stops decreasing significantly). It should be noted that there is no guarantee that the cost function will be minimized; the outcome depends on initial conditions. A flow chart is provided below to aid the reader's understanding of the algorithm.
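The two phases can be sketched in plain Python (an illustrative sketch, not code from the paper; the function and variable names are my own):

```python
import random

def k_means(points, k, iters=100, seed=0):
    """Cluster points (tuples) into k crisp clusters by alternating the two phases."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize centroids at k random data points
    clusters = []
    for _ in range(iters):
        # Phase 1: form clusters -- assign each point to the nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: dist2(p, centroids[j]))
            clusters[nearest].append(p)
        # Phase 2: move centroids -- each centroid becomes the mean of its cluster.
        moved = [mean(cl) if cl else centroids[j] for j, cl in enumerate(clusters)]
        if moved == centroids:  # convergence: nothing changed this round
            break
        centroids = moved
    return centroids, clusters

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(cluster):
    """Componentwise mean of a nonempty list of points."""
    n = len(cluster)
    return tuple(sum(p[d] for p in cluster) / n for d in range(len(cluster[0])))
```

Because the centroids are seeded at random, different seeds can converge to different local minima of the cost function, mirroring the caveat above.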
Figure 2: A flow chart summarizing the K-Means Clustering Algorithm.
Understanding of the K-Means clustering algorithm can be further enhanced by viewing a series of animated GIF images produced by Andrey A. Shabalin, available at http://shabal.in/visuals.html. Key frames from the animation are provided below for the reader's convenience. Visually inspecting these key frames in conjunction with the above flow chart can be very instructive.

Figure 3: Key frames from an animation on K-Means clustering by Andrey A. Shabalin, PhD.
2.2 FUZZY C-MEANS CLUSTERING (FCM)
FCM is a generalization of K-Means. While K-Means assigns each point to one and only one cluster, FCM allows clusters to be fuzzy sets, so that each point belongs to all clusters to varying degrees, with the following restriction: the sum of all membership degrees for any given data point is equal to 1.

The cost function used in FCM (shown in Figure 4) is very similar to the one used by K-Means, but there are some key differences: the inner sum contains a term for each data point in the set, and each of these terms is weighted by a membership degree raised to the power of a fuzziness exponent.
Figure 4: Cost function for FCM. Figure adapted from [1]. Annotations by me. Compare with Figure 1.
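In standard notation (again, the symbols may differ from the figure's annotations), the FCM cost function is:

```latex
J_m \;=\; \sum_{j=1}^{C} \sum_{i=1}^{N} u_{ij}^{\,m} \,\lVert x_i - c_j \rVert^2 ,
\qquad \text{subject to} \quad \sum_{j=1}^{C} u_{ij} = 1 \;\text{ for each } i
```

where $u_{ij}$ is the membership degree of point $x_i$ in cluster $j$, $c_j$ is the $j$-th centroid, and $m > 1$ is the fuzziness exponent. Restricting all $u_{ij}$ to $\{0, 1\}$ recovers the K-Means cost.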
Applying the method of Lagrange multipliers to minimize the above cost function yields the following necessary (but not sufficient) constraints [1]:
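In standard notation (with $u_{ij}$ the membership of point $x_i$ in cluster $j$, $c_j$ the $j$-th centroid, and $m$ the fuzziness exponent), these constraints are:

```latex
c_j \;=\; \frac{\sum_{i=1}^{N} u_{ij}^{\,m}\, x_i}{\sum_{i=1}^{N} u_{ij}^{\,m}},
\qquad
u_{ij} \;=\; \frac{1}{\sum_{k=1}^{C} \left( \lVert x_i - c_j \rVert \,/\, \lVert x_i - c_k \rVert \right)^{2/(m-1)}}
```

The first fixes each centroid at the membership-weighted mean of all the points; the second assigns memberships inversely according to relative distance from each centroid.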
Like K-Means, FCM is initialized by choosing a fixed number of centroids at random. Also like K-Means, FCM after initialization is divided into two phases:

Phase 1: Form Clusters. Each centroid is associated with a different fuzzy cluster. To form these clusters, each point in the data set is evaluated in turn. When evaluated, a point is assigned a membership degree with respect to each cluster. The numerical value of these degrees is given by the second of the above constraints.

Phase 2: Move Centroids. Each of the centroids is now moved to the position obtained via the first of the above constraints.

The reader should verify that the flow chart for FCM provided below closely resembles the flow chart for K-Means above. Note also the incorporation of the aforementioned constraints.
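An illustrative Python sketch of this loop (the update rules are the standard FCM necessary conditions; the names and defaults are my own):

```python
import random

def fcm(points, c, m=2.0, iters=100, eps=1e-6, seed=0):
    """Fuzzy C-Means: returns (centroids, memberships) for c fuzzy clusters."""
    rng = random.Random(seed)
    n, dims = len(points), len(points[0])
    # Initialize membership degrees at random, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() for _ in range(c)]
        s = sum(row)
        u.append([r / s for r in row])
    cents = []
    for _ in range(iters):
        # Move centroids: weighted mean of all points, with weights u_ij ** m.
        cents = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            tot = sum(w)
            cents.append(tuple(sum(w[i] * points[i][d] for i in range(n)) / tot
                               for d in range(dims)))
        # Update memberships: u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)).
        new_u = []
        for p in points:
            d = [max(dist(p, cj), 1e-12) for cj in cents]  # guard against division by zero
            new_u.append([1.0 / sum((d[j] / d[k]) ** (2.0 / (m - 1.0))
                                    for k in range(c)) for j in range(c)])
        done = max(abs(new_u[i][j] - u[i][j]) for i in range(n) for j in range(c)) < eps
        u = new_u
        if done:
            break
    return cents, u

def dist(a, b):
    """Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
```

Each row of the returned membership matrix sums to 1, satisfying the restriction stated in section 2.2.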
Figure 5: A flow chart summarizing the FCM Clustering Algorithm. Compare to Figure 2.
It is instructive to visualize the fuzzy clusters produced by FCM. For this purpose, it is convenient to use a one dimensional data set, as in Figure 6 below.
Figure 6: Three fuzzy clusters produced by FCM on a 1-dimensional data set. Figure taken from [2].
MATLAB's fcmdemo command provides a great way to interact with FCM using 2-dimensional data.¹ One can run FCM on several preloaded data sets or provide a custom data file. The number of clusters can be varied, as can the fuzziness exponent and the stopping criteria. Once FCM has finished running, one can directly view and manipulate each of the fuzzy clusters. Screenshots follow on the next page.
¹ MATLAB's fcmdemo depends on the Fuzzy Logic Toolbox, which is available for purchase from MathWorks at the following URL: http://www.mathworks.com/products/fuzzy-logic/index.html. The laptops in the Enginet classrooms at Binghamton University already have the Fuzzy Logic Toolbox installed. To start the demo, simply enter the command fcmdemo into the MATLAB command window.
Figure 7: The main window of fcmdemo after running it on data set 2 with C = 3 and m = 2.

Figure 8: Membership function plots after running fcmdemo with fuzziness exponent m = 1.5.

Figure 9: Membership function plots after running fcmdemo with fuzziness exponent m = 4. Compare with Figure 8.

3 FUZZY MIN-MAX CLUSTERING NEURAL NETWORKS (FMMCNN)
FCM requires that the number of clusters be specified in advance. However, the number of clusters that should be used is not always clear, as the figure below illustrates.
Figure 10: A data set (top) can be clustered into 4 (bottom left) or 2 (bottom right) clusters.
There are many fuzzy clustering techniques which will automatically determine the number of clusters that should be used. Among them is the Fuzzy Min-Max Clustering Neural Network (FMMCNN), which we discuss in this section.

3.1 HYPERBOX FUZZY SETS
The fuzzy clusters used in an FMMCNN are called hyperbox fuzzy sets. A hyperbox fuzzy set has a hyperbox core, so that every point that lies within the hyperbox is given a membership degree of 1. The membership function of the hyperbox fuzzy set then decays linearly as one moves further away from the hyperbox core. A system-wide parameter controls the rate of this decay.
A hyperbox is completely defined by its min point and its max point. The min point is a vector whose components provide a series of lower bounds, one for each dimension, which must be surpassed to remain within the hyperbox. For example, suppose we have a 2-dimensional hyperbox with min point <5, 20>. Then for a data point <x, y> to lie within the hyperbox, it is necessary that x ≥ 5 and y ≥ 20. Analogously, the max point provides a series of upper bounds for each dimension, which must be respected to remain within the hyperbox.
A formal definition of the membership function associated with a hyperbox fuzzy set is shown on the next page. To gain a more practical / intuitive understanding of hyperbox fuzzy sets, contour plots can be generated and manipulated using a Mathematica notebook created by me. With this notebook, one can control the position of the min and max points, as well as the gamma membership decay parameter. The notebook can also plot one dimensional hyperbox fuzzy sets, thereby revealing that hyperbox fuzzy sets can be thought of as generalized symmetric trapezoidal fuzzy numbers. The notebook is available from my website at the following URL: http://www.benjaminjamesbush.com/fuzzyclustering. Screenshots are given on the following page for the reader's convenience.
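As a rough Python sketch (my own simplification, not the Mathematica notebook: it takes the minimum over per-dimension trapezoids, whereas the definition in [3] averages per-dimension terms), the membership function might look like:

```python
def hyperbox_membership(x, v, w, gamma=1.0):
    """Membership of point x in the hyperbox with min point v and max point w.

    Full membership (1.0) inside the box; outside, membership decays linearly
    at rate gamma per unit of distance, clipped to stay within [0, 1].
    """
    m = 1.0
    for xi, vi, wi in zip(x, v, w):
        shortfall = max(0.0, vi - xi)  # how far x falls below the lower bound
        excess = max(0.0, xi - wi)     # how far x exceeds the upper bound
        m = min(m, max(0.0, 1.0 - gamma * (shortfall + excess)))
    return m
```

With min point <5, 20> and max point <10, 30> as in the 2-dimensional example above, the point <6, 25> lies inside the box and gets membership 1, while membership falls off linearly for points outside. In one dimension this is exactly a symmetric trapezoidal fuzzy number with plateau [v, w] and slopes controlled by gamma.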
Figure 11: Membership function of a Hyperbox Fuzzy Set. Adapted from [3].²
Figure 12: Manipulating hyperbox fuzzy sets in Mathematica. One dimensional (top) and two dimensional (bottom).
² [3] contains some typographical errors. They have been corrected in Figure 11.
3.2 FUZZY MIN-MAX NEURAL NETWORKS
A major advantage of using hyperbox fuzzy sets for fuzzy clustering is the fact that they can easily be implemented as 2-layer artificial neural networks. The following figure illustrates how this is done.
Figure 13: A hyperbox fuzzy set implemented as a 2-layer artificial neural network.
The input layer contains one node per dimension of the space in which the data points are embedded. Each input node is connected to the output node via a pair of weighted links, whose weights are the corresponding component values of the max point and min point, respectively. Implementing a clustering system in this way allows for the development of massively parallel systems that can quickly calculate the membership values for incoming data.
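The parallel, weight-matrix character of the network can be suggested with a small vectorized sketch (my own illustration; it uses a simplified min-over-dimensions membership in place of the exact definition in [3], with the min points V and max points W playing the role of the link weights):

```python
import numpy as np

def memberships(X, V, W, gamma=1.0):
    """Membership degrees of n points in b hyperboxes, computed in one pass.

    X: (n, d) array of points; V, W: (b, d) arrays of min and max points.
    Returns an (n, b) matrix whose (i, j) entry is the membership of point i
    in hyperbox j.
    """
    shortfall = np.maximum(0.0, V[None, :, :] - X[:, None, :])  # below the lower bounds
    excess = np.maximum(0.0, X[:, None, :] - W[None, :, :])     # above the upper bounds
    per_dim = np.clip(1.0 - gamma * (shortfall + excess), 0.0, 1.0)
    return per_dim.min(axis=2)  # the worst-violated dimension decides membership
```

Because every membership degree is produced by the same few array operations, a parallel implementation can evaluate all hyperboxes for all incoming points simultaneously, which is the point made above.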
3.3 EVOLVING FUZZY CLUSTERS
Another advantage of hyperbox fuzzy sets is the relative simplicity with which they can be expressed. As previously mentioned, a hyperbox fuzzy set can be completely represented by a min point and a max point. This makes it very easy to design evolutionary algorithms which can be used to evolve sets of hyperbox fuzzy sets for use within fuzzy min-max clustering neural networks. One such algorithm was published by Fogel and Simpson in [3] and is outlined in the flow chart on the next page.
For their fitness function, Fogel and Simpson use the minimum description length (MDL), which is in some sense an optimal compromise between fitting the data and using the smallest possible number of clusters. For more information on the MDL, see [4].
Figure 14: Flow chart summarizing the evolutionary algorithm used in [3].

4 APPLICATIONS
Fuzzy clustering is becoming an important data processing technique in many scientific fields. While the use of FCM is widespread, fuzzy min-max clustering neural networks are harder to come by. Below I list a few interesting applications which I encountered in the literature.
GENETICS
Gasch and Eisen used FCM to find clusters of yeast genes [5].
POLITICS
Teran and Meier designed a fuzzy system that used FCM to simplify the complex political landscape and recommend candidates to voters based on fuzzy data obtained from surveys [6].
RADIOLOGY
John, Innocent and Barnes used a fuzzy min-max clustering neural network to group x-ray images of the tibia into clusters [7].
INDUSTRIAL ENGINEERING
Dobado et al. used a fuzzy min-max clustering neural network to group parts into part families, an important step in the formation of cells for cellular manufacturing [8].
APPENDIX: MATHEMATICA CODE
The following Mathematica code can be used to create interactive plots of hyperbox fuzzy sets in one
and two dimensions. The code has been tested on Mathematica 8.
WORKS CITED

[1] J. S. R. Jang, C.-T. Sun, and E. Mizutani, Neuro-fuzzy and soft computing: a computational approach to learning and machine intelligence. Prentice Hall, 1997.

[2] Matteo Matteucci. (2012, May) A Tutorial on Clustering Algorithms: Fuzzy C-Means Clustering. [Online]. http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/cmeans.html

[3] D. B. Fogel and P. K. Simpson, "Evolving Fuzzy Clusters," in IEEE International Conference on Neural Networks, 1993.

[4] Peter Grünwald. (2008, August) Videolectures.net: MDL Tutorial. [Online]. http://videolectures.net/icml08_grunwald_mld/

[5] A. P. Gasch and M. B. Eisen, "Exploring the conditional coregulation of yeast gene expression through fuzzy k-means clustering," Genome Biology, vol. 3, no. 11, October 2002.

[6] L. Teran and A. Meier, "A Fuzzy Recommender System for eElections," in Electronic Government and the Information Systems Perspective, 2010, pp. 62-76.

[7] R. I. John, P. R. Innocent, and M. R. Barnes, "Neuro-fuzzy clustering of radiographic tibia image data using type 2 fuzzy sets," Information Sciences, vol. 125, no. 1-4, pp. 65-82, June 2000.

[8] D. Dobado, S. Lozano, J. M. Bueno, and J. Larrañeta, "Cell formation using a Fuzzy Min-Max neural network," International Journal of Production Research, vol. 40, no. 1, pp. 93-107, November 2010.