A spectral image clustering algorithm based on ant colony optimization

Luca Ashok (a) and David W. Messinger (b)
(a) Department of Mathematics, University of Rochester, Rochester, NY
(b) Center for Imaging Science, Rochester Institute of Technology, Rochester, NY
ABSTRACT
Ant Colony Optimization (ACO) is a computational method used for optimization problems. The ACO algorithm uses virtual ants to create candidate solutions that are represented by paths on a mathematical graph. We develop an algorithm using ACO that takes a multispectral image as input and outputs a cluster map denoting a cluster label for each pixel. The algorithm does this through identification of a series of one dimensional manifolds on the spectral data cloud via the ACO approach, and then associates pixels to these paths based on their spectral similarity to the paths. We apply the algorithm to multispectral imagery to divide the pixels into clusters based on their representation by a low dimensional manifold estimated by the best fit "ant path" through the data cloud. We present results from application of the algorithm to a multispectral Worldview-2 image and show that it produces useful cluster maps.
Keywords: spectral clustering, multispectral, hyperspectral, ant colony optimization
1. INTRODUCTION
A common task in the field of remote sensing is developing an algorithm to cluster a multispectral or hyperspectral image [1]. A clustering algorithm uses either the spatial or spectral information of pixels to assign them to different groups based on similarities between pixels. Ideally, within each cluster the spectral differences between pixels will be minimal. Since the solution space for potential clusterings is so large, clustering algorithms generally find locally optimal approximations by use of iterative processes. The algorithm presented in this paper uses a less conventional type of algorithm called Ant Colony Optimization (ACO).
Manolakis, et al. (2003) describe the standard models used in developing quantitative algorithms for processing spectral imagery. The methods essentially fall into three categories: statistical models, linear mixing models, and linear subspace models. Each family of algorithms has its own assumptions about the data, yet they have led to advances in spectral image processing for algorithmic tasks such as target detection, anomaly detection, and change detection. However, there are other types of data models available, and recently graph based models have been developed that leverage the spectral structures in the data with minimal assumptions. Bachmann, et al. (2005) estimate the low dimensional manifold that describes the inherently nonlinear structures in hyperspectral imagery over near-shore water scenes. The coordinates of the pixels in the new spectral domain, relative to the manifold, are then used in a spectral clustering algorithm that accurately estimates material features in this challenging imaging environment. Basener, et al. (2007, 2009) describe how graph based methods can be used to develop anomaly detection algorithms for spectral imagery. The Topological Anomaly Detection (TAD) algorithm has been shown to perform very well, even in difficult, cluttered scenes. Mercovich, et al. (2011a) show how a graph can be used to describe a spectral image, and then manipulated through iterative edge cutting to produce a cluster map on high resolution multispectral imagery over cluttered scenes. Here, the method is termed Maximized Modularity and the graph is iteratively cut until each respective cluster is well represented by a random cluster with the associated edge distribution. Albano, et al. (2012) also focus on spectral image clustering, but through the use of "traditional" algorithms after application of the Commute Time Distance Transformation to the spectral image. This data transformation preserves and enhances the structures in the data based on both their spectral similarity and the connectivity of the graph.
Further author information: (Send correspondence to D.W.M.)
D.W.M.: E-mail: messinger@cis.rit.edu, Telephone: 585-475-4538
Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XVIII,
edited by Sylvia S. Shen, Paul E. Lewis, Proc. of SPIE Vol. 8390, 83901P ∙ © 2012 SPIE
CCC code: 0277-786X/12/$18 ∙ doi: 10.1117/12.919082
ACO comes from a family of Swarm Intelligence based algorithms that are modeled after the behavior of animals which solve problems collectively as a group. In the ACO algorithm [8], virtual ants optimize over a mathematical graph in order to find a "best path." Potential solutions to the optimization problem are represented as paths on a graph. Ants are guided by the work of previous ants by use of pheromones.
ACO has been used with great success to solve other optimization problems such as the traveling salesman problem, scheduling problems, and vehicle routing problems. For example, in the traveling salesman problem, a salesman must visit n cities and must find the shortest route that will take him to every city exactly once. As ants construct candidate solutions, the best solutions are rewarded with extra pheromones on the edges which comprise those paths. During the creation of solutions, ants are more likely to follow the edges on which there are more pheromones. Additionally, different ACO models use different forms of "pheromone evaporation" in which pheromone levels are globally reduced every iteration in an attempt to put more focus on recent solutions. In this way the ants communicate via pheromones to guide future ants towards more promising areas of the solution space.
Although it may seem less obvious how to apply ACO to clustering, we considered the idea promising since we were used to modeling spectral data as a mathematical graph. In mathematics, a graph is simply a collection of nodes and edges, and such a graph is required in every ACO algorithm. We find that ACO can be used to effectively divide an image into clusters. ACO allows us to create clusters by finding paths of closely located points in the data cloud. We found ACO to cluster reasonably well on both multispectral and hyperspectral images. Therefore we see the possibility that ACO can be a viable method for analyzing spectral data.
In Section 2 a formal explanation of the clustering algorithm is given, including the construction of the graph used in the algorithm. In Section 3 we present the results of experiments using the presented algorithm, and finally in Section 4 we discuss our conclusions.
2. ALGORITHM DESCRIPTION
2.1 General Algorithm Outline
If you consider the pixels of an image in n dimensions, where n is the number of bands in the spectral image, you can imagine a "data cloud." Our goal is to use ACO to find paths along points in the cloud where the data are dense. We later assign a neighborhood around each path to a cluster. The way we accomplish this is through careful construction of a graph and a particular application of ACO to that graph. ACO is an optimization algorithm, so we need to specify some metric over which to optimize. We start from the darkest point of the cloud and let ants create paths in search of the one that travels the furthest.
Once we have suciently optimized in this respect,we change the variable that we optimize over.We note
the ending point of the furthest traveling path and instruct the ants to nd the shortest distance (Euclidean)
from our starting point to our ending point.Once a shortest path is found,we label all the points within a
neighborhood of the shortest path as one cluster.We next choose the darkest point that has not yet been labeled
and continue the process until there are no unlabeled points left in the image.We dene the darkest point to be
the point that when considered in n dimensions has the shortest Euclidean distance from the origin.
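The "darkest point" selection described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `darkest_unlabeled`, the list-of-tuples pixel layout, and the use of `None` to mark unlabeled pixels are all assumptions made for the example.

```python
import math

def darkest_unlabeled(pixels, labels):
    """Return the index of the unlabeled pixel closest to the origin, or None.

    pixels: list of spectra (one tuple of band values per pixel)  [assumed layout]
    labels: list with a cluster id per pixel, None if unlabeled   [assumed layout]
    """
    best_index, best_norm = None, math.inf
    for i, spectrum in enumerate(pixels):
        if labels[i] is not None:
            continue  # already assigned to a cluster
        norm = math.sqrt(sum(v * v for v in spectrum))  # Euclidean distance from origin
        if norm < best_norm:
            best_index, best_norm = i, norm
    return best_index

# Toy two-band example: pixel 2 is darkest overall but is already labeled.
pixels = [(3.0, 4.0), (1.0, 1.0), (0.5, 0.2), (6.0, 8.0)]
labels = [None, None, 0, None]
start = darkest_unlabeled(pixels, labels)
```

In this toy example the function skips the labeled pixel and selects pixel 1, which would then serve as the starting point of the next path.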
2.2 Graph Construction
In order to apply ACO to any optimization problem you must first start with a graph. As pointed out by Mercovich, et al. (2011b), there are many approaches that could be used. The graph must be constructed in such a way that solutions to the optimization can be represented by paths on the graph. Not only this, but the edges and sections of the paths must be meaningful, such that when ants add pheromones to more optimal solutions it leads future ants in the right direction. The most natural way to construct the graph is to create a node for each pixel in the image. Nodes are then connected based on the similarities of their spectral information.
When building the graph, one of the things to avoid, at least in this application, is allowing for paths that pass through the entire graph, since such paths could connect two separate high density areas of data; i.e., we desire to create a disconnected graph. To achieve this, we only use a certain fraction of the pixels in the image when creating the graph; we use the most "dense" points in the data cloud. We define the density of pixel A as the number of pixels whose Euclidean distance from A is within $r_1$. Both the fraction of points used and the radius used to determine "density" are parameters in our algorithm.
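As a rough illustration of the density filter, the sketch below counts neighbors within a radius by brute force and keeps the densest fraction of pixels. The function names, the inclusive comparison (`<=`), the exclusion of the pixel itself from its own count, and the rounding used to turn the fraction into a count are assumptions for the example, not details given in the paper.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two spectra."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def local_density(pixels, r1):
    """For each pixel, count the other pixels within distance r1 (brute force, O(n^2))."""
    return [
        sum(1 for j, q in enumerate(pixels) if j != i and euclidean(p, q) <= r1)
        for i, p in enumerate(pixels)
    ]

def densest_subset(pixels, r1, fraction):
    """Indices of the top `fraction` of pixels ranked by local density."""
    density = local_density(pixels, r1)
    keep = max(1, int(round(fraction * len(pixels))))  # rounding rule is an assumption
    order = sorted(range(len(pixels)), key=lambda i: density[i], reverse=True)
    return sorted(order[:keep])

# Three tightly grouped pixels and one isolated pixel.
pixels = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
densities = local_density(pixels, 0.5)
kept = densest_subset(pixels, 0.5, 0.75)
```

With a radius of 0.5 and a fraction of 0.75, the isolated pixel is dropped and only the three grouped pixels would be used to build the path-creation graph.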
Now that we have found a subset of pixels which have a higher density, we construct a new graph using only these points. This new graph will be used solely for creating ant paths. We will then refer back to the old graph (which contains every pixel) when assigning a neighborhood to each ant path for clustering purposes.
The next decision to make in creating the graph is choosing how to connect nodes. There are many viable schemes here, but we choose a simple scheme in which a radius $r_2$ is chosen and all pixels within a Euclidean distance $r_2$ of each other are connected. We choose to ignore spatial information in the construction of our graph (and all other parts of the algorithm).
A nal step is needed when constructing the graph.In this nal step we make provisions in order to avoid
ants retracing their own steps (going in loops).Using the same notion of darker and lighter pixels as described
earlier,we make it so that in the ACO algorithm ants can only move from a darker pixel to a lighter pixel while
constructing a path on the graph.In other words we use a directed graph when constructing ant paths.
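The edge rule and the darker-to-lighter direction constraint can be sketched together as follows. This is a minimal sketch under stated assumptions: the adjacency-list representation, the function name `build_directed_graph`, and the handling of ties (two pixels with exactly equal norm get no edge in either direction) are choices made for the example rather than details from the paper.

```python
import math

def norm(p):
    """Distance from the origin; smaller means 'darker'."""
    return math.sqrt(sum(v * v for v in p))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_directed_graph(points, r2):
    """Map each node to a list of (neighbor, edge_length) pairs.

    An edge i -> j exists only if the two points are within r2 of each other
    and point i is strictly darker than point j, so no path can loop back.
    """
    graph = {i: [] for i in range(len(points))}
    for i, p in enumerate(points):
        for j, q in enumerate(points):
            if i == j:
                continue
            d = euclidean(p, q)
            if d <= r2 and norm(p) < norm(q):
                graph[i].append((j, d))
    return graph

points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (10.0, 10.0)]
graph = build_directed_graph(points, 2.0)
```

Because every edge strictly increases the distance from the origin, the resulting graph is acyclic, which is what prevents ants from retracing their steps. In the toy data the far-away point has no edges at all, illustrating how a small radius leaves the graph disconnected.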
2.3 Basic Ant Colony Optimization Setup
For every edge on the graph, both a pheromone level and a heuristic value are stored in an array. The pheromone level gives ants information based on how past ants have fared on this edge, while the heuristic value is a predetermined metric imposed for the specific problem which helps guide ants in the right direction. Ants start at some given starting point, and at each point during their tour construction they apply a decision rule to choose the next edge. The decision rule depends on the pheromone and heuristic values for the potential edges along with two parameters $\alpha$ and $\beta$, and is written as

$$p^k_{xy} = \frac{\tau_{xy}^{\alpha}\,\eta_{xy}^{\beta}}{\sum_{z} \tau_{xz}^{\alpha}\,\eta_{xz}^{\beta}} \qquad (1)$$

where $p^k_{xy}$ is the probability of ant $k$ moving from point x to point y, $\tau_{xy}$ is the pheromone value on the edge connecting x and y, and $\eta_{xy}$ is the heuristic value on the same edge. In this case, the heuristic value for each edge is simply the length of the edge. The summation is done over all potential edges stemming from x. $\alpha$ and $\beta$ are parameters which apply various weights to the pheromone and heuristic information. The step chosen is based on a random draw from the probability distribution for all potential edges given by eqn. 1.
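A small Python sketch of the decision rule in eqn. 1 may help make the weighting concrete. The dictionary-based storage of pheromone and heuristic values, the function names, and the toy numbers are assumptions for illustration only.

```python
import random

def transition_probabilities(candidates, pheromone, heuristic, alpha, beta):
    """Eqn. 1: weight each candidate edge by pheromone^alpha * heuristic^beta, then normalize."""
    weights = [(pheromone[e] ** alpha) * (heuristic[e] ** beta) for e in candidates]
    total = sum(weights)
    return [w / total for w in weights]

def choose_next(candidates, probabilities, rng):
    """Random draw of one candidate edge from the probability distribution."""
    return rng.choices(candidates, weights=probabilities, k=1)[0]

# Two edges leaving node "x"; equal pheromone, but edge ("x", "b") is three times longer.
candidates = [("x", "a"), ("x", "b")]
pheromone = {("x", "a"): 1.0, ("x", "b"): 1.0}
heuristic = {("x", "a"): 1.0, ("x", "b"): 3.0}  # heuristic = edge length, as in the text

probs = transition_probabilities(candidates, pheromone, heuristic, alpha=1.0, beta=1.0)
step = choose_next(candidates, probs, random.Random(0))
```

With equal pheromone levels and alpha = beta = 1, the longer edge receives probability 0.75 and the shorter one 0.25. A node with no outgoing edges has no candidates, so a caller would need to end the ant's path before invoking the rule.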
Eventually ants come to a point where an ending condition (defined by the user) is met and the ant path ends. During every iteration of the algorithm, k ants begin and end paths along the graph according to the decision rule. After the paths have been created, pheromones are updated accordingly. For every edge crossed by an ant a pheromone update rule is applied based on the quality of that ant's solution, among other variables. First, though, a metric must be created to determine the quality of an ant's solution. The pheromone update rule is as follows:

$$\Delta\tau^k_{xy} = \begin{cases} Q/L_k & \text{if ant } k \text{ uses curve } xy \text{ in its tour} \\ 0 & \text{otherwise} \end{cases} \qquad (2)$$

where Q is a constant and $L_k$ relates to the quality of the solution involving that edge xy, i.e., its length.
Additionally, pheromones are evaporated at the end of every iteration j in order to put more emphasis on recent paths, using the following evaporation rule

$$\tau^{(j)}_{xy} = (1 - \rho)\,\tau^{(j-1)}_{xy} \qquad (3)$$

where $\rho$ is the evaporation rate and superscripts here denote iteration number.
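The deposit rule (eqn. 2) and the evaporation rule (eqn. 3) can be sketched as two small functions. The names `deposit` and `evaporate`, the edge labels, and the order in which the two are called in the example are assumptions; the text only states that evaporation happens at the end of every iteration.

```python
def evaporate(pheromone, rho):
    """Eqn. 3: scale every pheromone value by (1 - rho)."""
    for edge in pheromone:
        pheromone[edge] *= (1.0 - rho)

def deposit(pheromone, tours, qualities, Q):
    """Eqn. 2: ant k adds Q / L_k to every edge used in its tour."""
    for tour, L_k in zip(tours, qualities):
        for edge in tour:
            pheromone[edge] += Q / L_k

# Three edges, two ants. Ant 0 used edges "ab" and "bc" (L = 2); ant 1 used "ac" (L = 4).
pheromone = {"ab": 1.0, "bc": 1.0, "ac": 1.0}
tours = [["ab", "bc"], ["ac"]]
qualities = [2.0, 4.0]

evaporate(pheromone, rho=0.5)          # every edge drops to 0.5
deposit(pheromone, tours, qualities, Q=1.0)
```

After this toy iteration the edges used by ant 0 each hold 0.5 + 1/2 = 1.0, while the edge used by ant 1 holds 0.5 + 1/4 = 0.75, so the ant with the smaller L_k leaves the stronger trail.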
2.4 Ant Colony System
Ant Colony System (ACS) is a specic version of ACO which was implemented in our algorithm.There are a
few minor changes to the ant decision rule,pheromone update rule,and evaporation rule.For details on the
implementation of ACS the reader is encouraged to see Dorigo & Stutzle (2004).
An extra parameter q is given in the ACS model. With probability (1 - q), the normal decision rule is used (eqn. 1). With probability q, the same decision rule is used; however, the ant chooses the y such that $p_{xy}$ is maximized, rather than the result of a random draw. This enforces that some of the time the ant chooses the locally best next step, which is not necessarily the best step to choose for the optimal path.
In ACS, pheromones are only evaporated on the edges which ants have crossed. This leads ants towards the less explored areas of the graph. Additionally, the best-so-far path is stored and pheromones are added to its edges every iteration. This helps ants find a local optimum around the best-so-far area of the graph. The new pheromone update rules can be summarized as follows:
• update for when an edge xy is crossed:

$$\tau_{xy} \leftarrow (1 - \xi)\,\tau_{xy} + \xi\,\tau_0 \qquad (4)$$

where $\xi$ is an evaporation parameter and $\tau_0$ is the parameter determining starting pheromone values for every edge.
• update for best-so-far path:

$$\tau_{xy} \leftarrow (1 - \rho)\,\tau_{xy} + \rho / C_{bs} \qquad (5)$$

where $\rho$ is another evaporation parameter and $C_{bs}$ is a measure given to the quality of the "best-so-far" path found, i.e., its length, or reciprocal thereof.
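The two ACS pheromone updates (eqns. 4 and 5) can be sketched as below. The parameter names `xi` and `rho` for the two evaporation parameters, and the dictionary keyed by edge label, are assumptions made for the example; the numeric values are arbitrary.

```python
def local_update(pheromone, edge, xi, tau0):
    """Eqn. 4: applied to an edge when an ant crosses it."""
    pheromone[edge] = (1.0 - xi) * pheromone[edge] + xi * tau0

def global_update(pheromone, best_path_edges, rho, C_bs):
    """Eqn. 5: applied every iteration to the edges of the best-so-far path."""
    for edge in best_path_edges:
        pheromone[edge] = (1.0 - rho) * pheromone[edge] + rho / C_bs

pheromone = {"ab": 1.0, "bc": 1.0}

# An ant crosses "ab": its pheromone is pulled toward the initial value tau0.
local_update(pheromone, "ab", xi=0.1, tau0=0.5)

# "bc" lies on the best-so-far path, whose quality measure C_bs is 4.
global_update(pheromone, ["bc"], rho=0.2, C_bs=4.0)
```

In this toy case the crossed edge moves from 1.0 to 0.95 and the best-path edge moves from 1.0 to 0.85. Because the local update drags a crossed edge toward the initial value, a heavily used edge becomes less attractive within an iteration, which is consistent with the exploration effect described in the text.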
2.5 Implementation of ACS
In our algorithm, the general idea is to create a series of paths on a graph. For each path on the graph we will have to use ACS twice: once to find an ending point, and once to find the shortest path between the starting and ending points. We change the metric we are optimizing over and the heuristic values on the graph whenever we switch between the two phases of ACS. As stated above, the first step in the algorithm is to identify long paths through the most dense portions of the data cloud in the spectral domain. Consequently, the local density for each pixel is estimated and only those points of highest density are considered in the first step of the algorithm. Once the long paths are identified, all the data points are considered in order to correctly classify as many pixels as possible via spectral proximity to the identified paths.
The rst step is to nd the point in the graph that is not yet assigned to a cluster and whose corresponding
pixel is darkest in the image.In other words,its Euclidean distance from the origin is smallest.To nd the
ending point of the path under consideration,we set the heuristic value of each edge to the length of the edge.
We have all paths start at our chosen starting point,and we use a total path distance metric which is simply
the Euclidean distance between starting and ending points.The metric and heuristic are set such that the ants
\prefer"to take large steps to get as far as possible in the spectral space.
After running ACS for several iterations using the above heuristics and metric, we use the end node of the longest path as our ending point. Now, given starting and ending points for the path, we switch our heuristics and metric to optimize for the shortest path between the two points. The metric is simply the length the path would be if it were plotted in the spectral cloud, and the heuristic is again the length of each edge. However, now we are searching for the shortest possible path, so the ants "prefer" taking short steps to get to the ending point.
Again ACS is run for several iterations. Afterward, all of the points located on the best path are noted. Then all points on the graph connected to points on the best path are assigned to one cluster (i.e., the neighborhood pixels of all pixels on a given path are collected and associated to one another as a cluster). Finally we find the next "darkest unassigned pixel" so we can start the same process again. The algorithm ends when every point on the graph has been assigned to a cluster.
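The final assignment step can be sketched as labeling every pixel that lies within some radius of a point on the best path. The paper describes neighbors as points "connected" to the path on the graph; treating that as a Euclidean radius test, leaving already-labeled pixels untouched, and the function name `assign_cluster` are all assumptions of this sketch.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign_cluster(pixels, labels, path, radius, cluster_id):
    """Label every unlabeled pixel lying within `radius` of any pixel on the path.

    path: indices (into `pixels`) of the points on the best path  [assumed layout]
    """
    for i, p in enumerate(pixels):
        if labels[i] is not None:
            continue  # assumption: earlier cluster assignments are kept
        if any(euclidean(p, pixels[k]) <= radius for k in path):
            labels[i] = cluster_id
    return labels

pixels = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 0.5), (9.0, 9.0)]
labels = [None] * len(pixels)
labels = assign_cluster(pixels, labels, path=[0, 1, 2], radius=1.0, cluster_id=0)
```

Here the three path pixels and the nearby off-path pixel join cluster 0, while the distant pixel stays unlabeled and would be a candidate starting point (or an unassigned pixel) in a later pass.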
The clustering algorithm is implemented as the following series of steps, also shown in Figure 5:
• Given a graph with nodes and edges, find the "darkest" unassigned pixel in the set:
Figure 1. First step: identify starting point for first path.
• We let ants construct paths in search of the path which goes farthest:
Figure 2. Second step: ants go as far as they can, find end point for this path.
• We next optimize for the shortest path from start to end:
Figure 3. Third step: ants find optimal path between current start and end points.
• Finally we assign all points in the shortest path and all neighbors to one cluster:
Figure 4. Fourth step: associate neighborhood pixels to the cluster.
The algorithm flow chart is shown in Figure 5.
Figure 5. Flowchart of the Ant Colony Optimization clustering algorithm.
Table 1. Free parameters in the ACS algorithm implemented here.
Parameter        Description
$r_1$            radius for estimating point density
$r_2$            radius for edge creation on graph
N                fraction of pixels used in path creation graph
k                number of ants used per iteration
$\alpha$         weight factor on pheromones
$\beta$          weight factor on heuristic values
$\rho$, $\xi$    pheromone evaporation parameters
$\tau_0$         initial pheromone value
q                ACS probability parameter
Q                pheromone update rule scaling
We note here that one limitation of this approach is that there are many free parameters required to adequately use ACO & ACS to identify paths in the spectral space. Table 1 shows the free parameters required by the Ant Colony System as implemented here. Parameter values for the results shown below were derived empirically.
By construction, it is likely that the ACO clustering algorithm leaves some pixels of the image unassigned to any cluster. This occurs because only the most "dense" points in the cloud are used when creating ant paths, and sometimes the less dense points are so far from the ant paths that they end up unassigned. In the cluster maps shown below these pixels are colored black. If having unassigned pixels is a problem, there are many ways to assign them. One potential way would be to take all the pixels that are unassigned and treat them as if they belonged to a new image. You could then run the ACO algorithm on this subset of the image and create clusters for these pixels. Alternatively, a fuzzy clustering scheme could be used to assign these pixels in a probabilistic manner based on their proximity to clusters.
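One possible form of the proximity-based probabilistic assignment is sketched below. The paper does not specify a fuzzy scheme, so everything here is an assumption: membership is taken to be inversely proportional to the distance from the pixel to the nearest labeled member of each cluster, and the function name `soft_memberships` and the small constant `eps` are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def soft_memberships(pixel, pixels, labels, eps=1e-9):
    """Return {cluster_id: membership} for one unassigned pixel.

    Membership is proportional to 1 / (distance to the nearest labeled pixel
    of that cluster), normalized to sum to one. This weighting is an assumption.
    """
    nearest = {}
    for p, lab in zip(pixels, labels):
        if lab is None:
            continue
        d = euclidean(pixel, p)
        if lab not in nearest or d < nearest[lab]:
            nearest[lab] = d
    weights = {lab: 1.0 / (d + eps) for lab, d in nearest.items()}
    total = sum(weights.values())
    return {lab: w / total for lab, w in weights.items()}

pixels = [(0.0, 0.0), (4.0, 0.0), (1.0, 0.0)]
labels = [0, 1, None]
memberships = soft_memberships(pixels[2], pixels, labels)
```

The unassigned pixel sits at distance 1 from cluster 0 and distance 3 from cluster 1, so it receives memberships of roughly 0.75 and 0.25. A hard label could then be taken as the cluster with the largest membership if a single assignment is needed.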
3. EXPERIMENTAL RESULTS
3.1 Application to a Multispectral Image
We applied the presented algorithm to a 100 × 100 pixel area of interest of a DigitalGlobe Worldview-2 8 band image. Figure 6 shows the true color representation of the full image along with the area of interest to which the clustering algorithm was applied. The image is largely a forested region and the area of interest contains trees, exposed soils, structures, and other man made features.
Figure 6. (a) Example Worldview-2 image shown in RGB. (b) Area of interest.
To better understand what the algorithm is doing, we show the actual ant paths identified for this image (as projected into two spectral bands) using various amounts of the data in Figure 7. Figure 7(a) shows the data in the two bands, where color represents estimated density per pixel in the spectral domain (lighter blue implies higher density). We note that the data are filtered by density in the early steps of the algorithm such that we only identify paths through the most dense portions of the data cloud in the spectral domain. Figures 7(b) - 7(d) show the cluster paths determined when using 40%, 50%, or 60% of the data, respectively, based on spectral density. Here we see that as more data are considered in the path identification step, more paths are uniquely found and "gaps" in the data are gradually filled. However, we also note that using only 40% of the data the primary structures in the cloud are correctly identified by the ACS algorithm. Note that several paths, when viewed in this projection, "overlap" and could be identified as a single cluster that is not well modeled by a one dimensional manifold. An algorithm to cluster the paths together, or alternatively group the resulting cluster maps, was not implemented at this time.
Figure 8 shows the cluster maps for three separate clusterings. The three cluster maps show the differences seen when using 40%, 50% or 60% of the data cloud when generating the original cluster paths. These results show that the algorithm is successful in producing a spatially interpretable cluster map using only a collection of one dimensional manifolds and the spectral pixels associated to those paths through the spectral cloud. Even using only 40% of the data to build the paths, the algorithm correctly separates out the vegetation from the bare soil and several of the man-made structures. The addition of more data to the path creation step only improves the cluster separability in the final cluster map. This result indicates that relatively low dimensional spectral imagery, in this case the 8 spectral bands of Worldview-2, can be well modeled through a collection of one dimensional manifolds as derived directly from the data through an Ant Colony Optimization technique. These one-dimensional manifolds can then be used to separate individual materials in the spectral domain to produce cluster maps.
Figure 7. (a) Two band projection of the WV2 data used. Resulting paths when using (b) 40%, (c) 50%, and (d) 60% of the data in the path construction algorithm.
Figure 8. (a) RGB of area of interest. (b) - (d) Cluster maps after running the ACO clustering algorithm. Results shown for using (b) 40%, (c) 50%, and (d) 60% of the data when building the paths.
4. SUMMARY AND CONCLUSIONS
This paper has presented a novel approach to modeling multispectral imagery through the use of low dimensional manifolds as derived from the data cloud via the Ant Colony Optimization (ACO) approach to identifying paths on graphs. Here, we represent the multispectral data as a graph and implement a variant of ACO, the Ant Colony System (ACS), to identify "best" paths through the data in the spectral domain. Here, "best" means long paths through the densest portions of the data. Each path is considered as the best representation of a unique material cluster in the data, and other pixels in the data cloud are associated to the cluster based on their proximity, in the spectral domain, to the path. The algorithm has been applied to a high resolution multispectral image collected with the Worldview-2 platform.
The resulting cluster maps show that, perhaps surprisingly, the low dimensional model of the data is generally good and related to material separability in the spectral domain. This is surprising as previous work [10] has shown that while natural materials are generally well represented by low dimensional manifolds, they are not one dimensional. Bachmann, et al. (2005) have used a locally linear manifold estimation to cluster hyperspectral imagery in the nonlinear scenarios of near shore water remote sensing with success, but this approach used the ISOMAP algorithm to identify the manifold coordinates for each pixel to produce the clustering. Here, we do not identify the manifold itself, but only leverage the identification of a low dimensional (i.e., one) path through the data to which we can associate pixels assuming they are of similar material.
While the Ant Colony Optimization/Ant Colony System methodology presented here for clustering spectral imagery may not be practical in application to large imagery (due to the computational complexity of the method and the large number of free parameters), it is hoped that the notion of using low dimensional manifolds to understand and model multispectral imagery will inspire new classes of algorithms for analysis of high resolution remotely sensed spectral imagery.
ACKNOWLEDGMENTS
The authors wish to thank DigitalGlobe for the imagery used in this study.
REFERENCES
1. J. R. Schott, Remote Sensing: the Image Chain Approach, Oxford University Press, 2nd ed., May 2007.
2. D. Manolakis, D. Marden, and G. Shaw, "Hyperspectral image processing for automatic target detection applications," Lincoln Laboratory Journal 14(1), pp. 79-116, 2003.
3. C. L. Bachmann, T. L. Ainsworth, and R. A. Fusina, "Exploiting manifold geometry in hyperspectral imagery," IEEE Transactions on Geoscience and Remote Sensing 43(3), 2005.
4. B. Basener, E. Ientilucci, and D. Messinger, "Anomaly detection using topology," in Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XIII, S. Shen, ed., 6565, SPIE, 2007.
5. B. Basener and D. Messinger, "Enhanced detection and visualization of anomalies in spectral imagery," in Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XV, S. Shen, ed., 7334, SPIE, 2009.
6. R. Mercovich, A. Harkin, and D. Messinger, "Automatic clustering of multispectral imagery by maximization of the graph modularity," in Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XVII, S. Shen, ed., 8048, SPIE, 2011.
7. J. Albano, D. Messinger, and S. Rotman, "Commute time distance transformation applied to spectral imagery and its utilization in material clustering," submitted to Optical Engineering, 2012.
8. M. Dorigo and T. Stützle, Ant Colony Optimization, MIT Press, 1st ed., 2004.
9. R. Mercovich, J. Albano, and D. Messinger, "Techniques for the graph representation of spectral imagery," in WHISPERS 2011, IEEE GRSS, 2011.
10. A. Schlamm, D. Messinger, and B. Basener, "Geometric estimation of the inherent dimensionality of single and multi-material clusters in hyperspectral imagery," Journal of Applied Remote Sensing 3, 2009.