A spectral image clustering algorithm based on ant colony optimization

Luca Ashok^a and David W. Messinger^b

^a Department of Mathematics, University of Rochester, Rochester, NY

^b Center for Imaging Science, Rochester Institute of Technology, Rochester, NY

ABSTRACT

Ant Colony Optimization (ACO) is a computational method used for optimization problems. The ACO algorithm uses virtual ants to create candidate solutions that are represented by paths on a mathematical graph. We develop an algorithm using ACO that takes a multispectral image as input and outputs a cluster map denoting a cluster label for each pixel. The algorithm does this through identification of a series of one dimensional manifolds on the spectral data cloud via the ACO approach, and then associates pixels to these paths based on their spectral similarity to the paths. We apply the algorithm to multispectral imagery to divide the pixels into clusters based on their representation by a low dimensional manifold estimated by the best fit "ant path" through the data cloud. We present results from application of the algorithm to a multispectral Worldview-2 image and show that it produces useful cluster maps.

Keywords: spectral clustering, multispectral, hyperspectral, ant colony optimization

1. INTRODUCTION

A common task in the field of remote sensing is producing an algorithm to cluster a multi- or hyperspectral image [1]. A clustering algorithm uses either the spatial or spectral information of pixels to assign them to different groups based on similarities between pixels. Ideally, within each cluster the spectral differences between pixels will be minimal. Since the solution space for potential clusterings is so large, clustering algorithms generally find locally optimal approximations by use of iterative processes. The algorithm presented in this paper uses a less conventional type of algorithm called Ant Colony Optimization (ACO).

Manolakis, et al. (2003) describe the standard models used in developing quantitative algorithms for processing spectral imagery. The methods essentially fall into three categories: statistical models, linear mixing models, and linear subspace models. Each family of algorithms has its own assumptions about the data, yet they have led to advances in spectral image processing for algorithmic tasks such as target detection, anomaly detection, and change detection. However, there are other types of data models available, and recently graph based models have been developed that leverage the spectral structures in the data with minimal assumptions. Bachmann, et al. (2005) estimate the low dimensional manifold that describes the inherently nonlinear structures in hyperspectral imagery over near-shore water scenes. The coordinates of the pixels in the new spectral domain, relative to the manifold, are then used in a spectral clustering algorithm that accurately estimates material features in this challenging imaging environment. Basener, et al. (2007, 2009) describe how graph based methods can be used to develop anomaly detection algorithms for spectral imagery. The Topological Anomaly Detection (TAD) algorithm has been shown to perform very well, even in difficult, cluttered scenes. Mercovich, et al. (2011a) show how a graph can be used to describe a spectral image, and then manipulated through iterative edge cutting to produce a cluster map on high resolution multispectral imagery over cluttered scenes. Here, the method is termed Maximized Modularity and the graph is iteratively cut until each respective cluster is well represented by a random cluster with the associated edge distribution. Albano, et al. (2012) also focus on spectral image clustering, but through the use of "traditional" algorithms after application of the Commute Time Distance Transformation to the spectral image. This data transformation preserves and enhances the structures in the data based on both their spectral similarity and the connectivity of the graph.

Further author information: (Send correspondence to D.W.M.)

D.W.M.: E-mail: messinger@cis.rit.edu, Telephone: 585-475-4538

Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XVIII,

edited by Sylvia S. Shen, Paul E. Lewis, Proc. of SPIE Vol. 8390, 83901P ∙ © 2012 SPIE

CCC code: 0277-786X/12/$18 ∙ doi: 10.1117/12.919082

Proc. of SPIE Vol. 8390 83901P-1

Downloaded from SPIE Digital Library on 25 Jun 2012 to 129.21.58.44. Terms of Use: http://spiedl.org/terms

ACO comes from a family of Swarm Intelligence based algorithms that are modeled after the behavior of animals which solve problems collectively as a group. In the ACO algorithm [8], virtual ants optimize over a mathematical graph in order to find a "best path." Potential solutions to the optimization problem are represented as paths on a graph. Ants are guided by the work of previous ants by use of pheromones.

ACO has been used with great success to solve other optimization problems such as the traveling salesman problem, scheduling problems, and vehicle routing problems. For example, in the traveling salesman problem, a salesman must visit n cities and must find the shortest route that will take him to every city exactly once. As ants construct candidate solutions, the best solutions are treated with extra pheromones on the edges which comprise those paths. During the creation of solutions, ants are more likely to follow the edges on which there are more pheromones. Additionally, different ACO models use different forms of "pheromone evaporation" in which pheromone levels are globally reduced every iteration in an attempt to put more focus on recent solutions. In this way the ants communicate via pheromones to guide future ants towards more promising areas of the solution space.

Although it may seem less obvious how to apply ACO to clustering, we considered the idea promising since we were used to modeling spectral data as a mathematical graph. In mathematics, a graph is simply a collection of nodes and edges, and such a graph is required in every ACO algorithm. We find that ACO can be used to effectively divide an image into clusters. ACO allows us to create clusters by finding paths of closely located points in the data cloud. We found ACO to cluster reasonably well on both multispectral and hyperspectral images. Therefore we see the possibility that ACO can be a viable method in analyzing spectral data.

In Section 2 a formal explanation of the clustering algorithm is given, including the construction of the graph used in the algorithm. In Section 3 we present the results of experiments using the presented algorithm, and finally in Section 4 we discuss our conclusions.

2. ALGORITHM DESCRIPTION

2.1 General Algorithm Outline

If you consider the pixels of an image in n dimensions, where n is the number of bands in the spectral image, you can imagine a "data cloud." Our goal is to use ACO to find paths along points in the cloud where the data are dense. We later assign a neighborhood around each path to a cluster. The way we accomplish this is through careful construction of a graph and particular ACO application to that graph. ACO is an optimization algorithm, so we need to specify some metric over which to optimize. We start from the darkest point of the cloud and let ants create paths in search of the one that travels the furthest.

Once we have sufficiently optimized in this respect, we change the variable that we optimize over. We note the ending point of the furthest traveling path and instruct the ants to find the shortest distance (Euclidean) from our starting point to our ending point. Once a shortest path is found, we label all the points within a neighborhood of the shortest path as one cluster. We next choose the darkest point that has not yet been labeled and continue the process until there are no unlabeled points left in the image. We define the darkest point to be the point that, when considered in n dimensions, has the shortest Euclidean distance from the origin.

2.2 Graph Construction

In order to apply ACO to any optimization problem you must first start with a graph. As pointed out by Mercovich, et al. (2011b), there are many approaches that could be used. The graph must be constructed in such a way that solutions to the optimization can be represented by paths on the graph. Not only this, but the edges and sections of the paths must be meaningful such that when ants add pheromones to more optimal solutions it leads future ants in the right direction. The most natural way to construct the graph is to create a node for each pixel in the image. Nodes are then connected based on the similarities of their spectral information.

When building the graph, one of the things to avoid, at least in this application, is allowing paths that pass through the entire graph, since such a path could connect two separate high density areas of data together; i.e., we desire to create an unconnected graph. In order to achieve this, we only use a certain fraction of the pixels in the image when creating the graph; we use the most "dense" points in the graph. We define the density of pixel A as the number of pixels whose Euclidean distance is within $r_1$ of A. Both the fraction of points used and the radius used to determine "density" are parameters in our algorithm.
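The density filter described above can be sketched as follows. This is a minimal illustration, assuming the pixels are stored as rows of a NumPy array; the function name `densest_subset`, the choice not to count a pixel as its own neighbor, and the tie-breaking order are assumptions rather than details given in the text.

```python
import numpy as np

def densest_subset(pixels, r1, fraction):
    """Return (indices of the densest `fraction` of pixels, density of every pixel).

    Density of a pixel = number of other pixels within Euclidean distance r1.
    """
    pixels = np.asarray(pixels, dtype=float)
    # Pairwise Euclidean distances (O(n^2) memory; fine for small image chips).
    diff = pixels[:, None, :] - pixels[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Subtract 1 so a pixel does not count itself (assumption).
    density = (dist <= r1).sum(axis=1) - 1
    n_keep = max(1, int(round(fraction * len(pixels))))
    # argsort is ascending, so the densest pixels are the last n_keep entries.
    order = np.argsort(density, kind="stable")
    keep = np.sort(order[-n_keep:])
    return keep, density
```

The pairwise distance matrix keeps the sketch short; for a full scene a spatial index or chunked computation would be needed instead.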

Now that we have found a subset of pixels which have a higher density, we construct a new graph using only these points. This new graph will be used solely for creating ant paths. We will then refer back to the old graph (which contains every pixel) when assigning a neighborhood to each ant path for clustering purposes.

The next decision to make in creating the graph is choosing how to connect nodes. There are many viable schemes here, but we choose a simple scheme in which a radius $r_2$ is chosen and all pixels within $r_2$ Euclidean distance of each other are connected. We choose to ignore spatial information in the construction of our graph (and all other parts of the algorithm).

A final step is needed when constructing the graph. In this final step we make provisions in order to avoid ants retracing their own steps (going in loops). Using the same notion of darker and lighter pixels as described earlier, we make it so that in the ACO algorithm ants can only move from a darker pixel to a lighter pixel while constructing a path on the graph. In other words, we use a directed graph when constructing ant paths.
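The edge rule and the darker-to-lighter restriction can be combined in one short sketch. It assumes the retained high-density points are rows of a NumPy array and that "brightness" is the Euclidean distance from the origin, as defined in Section 2.1. The name `build_directed_graph` and the treatment of equal-brightness pairs (no edge in either direction) are assumptions made for illustration.

```python
import numpy as np

def build_directed_graph(points, r2):
    """Adjacency lists with an edge i -> j iff dist(i, j) <= r2 and j is brighter than i.

    Each entry is (neighbor index, edge length).
    """
    points = np.asarray(points, dtype=float)
    brightness = np.linalg.norm(points, axis=1)  # distance from the origin
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    n = len(points)
    edges = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(n):
            # Strict inequality: ants may only step from darker to lighter points.
            if i != j and dist[i, j] <= r2 and brightness[i] < brightness[j]:
                edges[i].append((j, float(dist[i, j])))
    return edges
```

Because every edge strictly increases brightness, the resulting directed graph has no cycles, which is what prevents an ant from looping.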

2.3 Basic Ant Colony Optimization Setup

For every edge on the graph, both a pheromone level and a heuristic level are stored in an array. The pheromone level gives ants information based on how past ants have fared on this edge, while the heuristic value is a predetermined metric imposed for the specific problem which helps guide ants in the right direction. Ants start at some given starting point, and at each point during their tour construction they implement a decision rule on which edge to choose. The decision rule is dependent on the pheromone and heuristic values for potential edges along with two parameters $\alpha$ and $\beta$, and is written as

$$p^{k}_{xy} = \frac{\tau_{xy}^{\alpha}\,\eta_{xy}^{\beta}}{\sum_{z}\tau_{xz}^{\alpha}\,\eta_{xz}^{\beta}} \qquad (1)$$

where $p^{k}_{xy}$ is the probability of moving from point $x$ to point $y$, $\tau_{xy}$ is the pheromone value on the edge connecting $x$ and $y$, and $\eta_{xy}$ is the heuristic value on the same edge. In this case, the heuristic value for each edge is simply the length of the edge. The summation is done over all potential edges stemming from $x$ (indexed here by $z$). $\alpha$ and $\beta$ are parameters which apply various weights to the pheromone and heuristic information. The step chosen is based on a random draw from the probability distribution for all potential edges given by eqn. 1.
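As a small illustration of the decision rule in eqn. 1, the sketch below computes the transition probabilities for the edges leaving one node and draws the next node at random. The function names and the list-based layout of `tau` and `eta` (one entry per candidate edge) are assumptions made for this example, not the authors' implementation.

```python
import random

def transition_probabilities(tau, eta, alpha, beta):
    """Eq. (1): probabilities over the candidate edges leaving the current node."""
    weights = [(t ** alpha) * (h ** beta) for t, h in zip(tau, eta)]
    total = sum(weights)
    return [w / total for w in weights]

def choose_next(candidates, tau, eta, alpha, beta, rng=random):
    """Random draw of the next node from the distribution of eq. (1)."""
    probs = transition_probabilities(tau, eta, alpha, beta)
    return rng.choices(candidates, weights=probs, k=1)[0]
```

For example, with two candidate edges carrying equal pheromone, heuristic values 1 and 3, and $\alpha = \beta = 1$, the probabilities are 0.25 and 0.75, so the longer edge is chosen three times as often.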

Eventually ants come to a point where an ending condition (defined by the user) is met and the ant path ends. During every iteration of the algorithm, $k$ ants begin and end paths along the graph according to the decision rule. After the paths have been created, pheromones are updated accordingly. For every edge crossed by an ant a pheromone update rule is applied based on the quality of that ant's solution among other variables. First, though, a metric must be created to determine the quality of an ant's solution. The pheromone update rule is as follows:

$$\Delta\tau^{k}_{xy} = \begin{cases} Q/L_{k} & \text{if ant } k \text{ uses edge } xy \text{ in its tour} \\ 0 & \text{otherwise} \end{cases} \qquad (2)$$

where $Q$ is a constant and $L_{k}$ relates to the quality of the solution involving that edge $xy$, i.e., its length.

Additionally, pheromones are evaporated at the end of every iteration $j$ in order to put more emphasis on recent paths, using the following evaporation rule

$$\tau^{j}_{xy} = (1-\rho)\,\tau^{(j-1)}_{xy} \qquad (3)$$

where $\rho$ is the evaporation rate and superscripts here denote iteration number.
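The deposit and evaporation rules of eqns. 2 and 3 can be sketched as below. The dictionary keyed by `(x, y)` edge tuples, the function names, and the choice to sum the contributions of all ants that cross the same edge are assumptions for illustration; the text does not specify a data layout.

```python
def deposit(tau, tours, lengths, Q):
    """Eq. (2): ant k adds Q / L_k to every edge (x, y) used in its tour."""
    for tour, L_k in zip(tours, lengths):
        for edge in zip(tour[:-1], tour[1:]):
            tau[edge] = tau.get(edge, 0.0) + Q / L_k
    return tau

def evaporate(tau, rho):
    """Eq. (3): scale every pheromone value by (1 - rho) at the end of an iteration."""
    for edge in tau:
        tau[edge] *= (1.0 - rho)
    return tau
```

With both edges of the tour 0 -> 1 -> 2 starting at 1.0, one ant with $L_k = 2$ and $Q = 1$ raises each edge to 1.5, and evaporation with $\rho = 0.5$ then lowers each to 0.75.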


2.4 Ant Colony System

Ant Colony System (ACS) is a specific version of ACO which was implemented in our algorithm. There are a few minor changes to the ant decision rule, pheromone update rule, and evaporation rule. For details on the implementation of ACS the reader is encouraged to see Dorigo & Stützle (2004).

An extra parameter $q$ is given in the ACS model. With probability $(1 - q)$, the normal decision rule is used (eqn. 1). With probability $q$, the same decision rule is used, however the ant chooses the $y$ such that $p_{xy}$ is maximized, not the result of a random draw. This enforces that some of the time, the ant chooses the best next step, which is not necessarily the best step to choose for the optimal path.
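A minimal sketch of this modified decision rule follows, assuming the same per-edge lists of pheromone and heuristic values used to evaluate eqn. 1. The function name `acs_choose_next` and the `rng` argument are illustrative assumptions.

```python
import random

def acs_choose_next(candidates, tau, eta, alpha, beta, q, rng=random):
    """With probability q take the edge maximizing eq. (1); otherwise draw at random."""
    weights = [(t ** alpha) * (h ** beta) for t, h in zip(tau, eta)]
    if rng.random() < q:
        # Greedy step: the normalizing sum in eq. (1) does not change the arg-max.
        best = max(range(len(candidates)), key=weights.__getitem__)
        return candidates[best]
    # Otherwise the ordinary random draw from the distribution of eq. (1).
    return rng.choices(candidates, weights=weights, k=1)[0]
```

Setting `q=0` recovers the basic rule of Section 2.3, while `q=1` makes every step greedy.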

In ACS pheromones are only evaporated on the edges that ants have crossed. This leads ants towards the less explored areas of the graph. Additionally, the best-so-far path is stored and pheromones are added to its edges every iteration. This helps ants find a local optimum around the best-so-far area of the graph. The new pheromone update rules can be summarized as follows:

Update for when an edge $xy$ is crossed:

$$\tau_{xy} \leftarrow (1-\xi)\,\tau_{xy} + \xi\,\tau_{0} \qquad (4)$$

where $\xi$ is an evaporation parameter and $\tau_{0}$ is the parameter determining starting pheromone values for every edge.

Update for the best-so-far path:

$$\tau_{xy} \leftarrow (1-\rho)\,\tau_{xy} + \rho/C_{bs} \qquad (5)$$

where $\rho$ is another evaporation parameter and $C_{bs}$ is a measure given to the quality of the "best-so-far" path found, i.e., its length, or reciprocal thereof.
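The two ACS updates of eqns. 4 and 5 can be sketched as below. The dictionary keyed by edge tuples and the function names are assumptions; `C_bs` is whatever quality measure is chosen for the best-so-far path.

```python
def local_update(tau, edge, xi, tau0):
    """Eq. (4): applied to an edge each time an ant crosses it."""
    tau[edge] = (1.0 - xi) * tau[edge] + xi * tau0
    return tau

def global_update(tau, best_path, rho, C_bs):
    """Eq. (5): reinforce every edge of the best-so-far path."""
    for edge in zip(best_path[:-1], best_path[1:]):
        tau[edge] = (1.0 - rho) * tau[edge] + rho / C_bs
    return tau
```

Note that the local update pulls a crossed edge back toward the initial value $\tau_0$, which is what nudges later ants toward less explored edges, while the global update touches only the best-so-far path.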

2.5 Implementation of ACS

In our algorithm, the general idea is to create a series of paths on a graph. For each path on the graph we will have to use ACS twice: once to find an ending point, and once to find the shortest path between the starting and ending points. We change the metric we are optimizing over and the heuristic values on the graph whenever we switch between the two phases of ACS. As stated above, the first step in the algorithm is to identify long paths through the most dense portions of the data cloud in the spectral domain. Consequently, the local density for each pixel is estimated and only those points of highest density are considered in the first step of the algorithm. Once the long paths are identified, all the data points are considered in order to correctly classify as many pixels as possible via spectral proximity to the identified paths.

The first step is to find the point in the graph that is not yet assigned to a cluster and whose corresponding pixel is darkest in the image. In other words, its Euclidean distance from the origin is smallest. To find the ending point of the path under consideration, we set the heuristic value of each edge to the length of the edge. We have all paths start at our chosen starting point, and we use a total path distance metric which is simply the Euclidean distance between starting and ending points. The metric and heuristic are set such that the ants "prefer" to take large steps to get as far as possible in the spectral space.

After running ACS for several iterations using the above heuristics and metric, we use the end node of the longest path as our ending point. Now, given starting and ending points for the path, we switch our heuristics and metric to optimize for the shortest path between the two points. The metric is simply the length the path would be if it were plotted in the spectral cloud, and the heuristic is again the length of each edge. However, now we are searching for the shortest possible path, so the ants "prefer" taking short steps to get to the ending point.

Again ACS is run for several iterations. Afterward, all of the points located on the best path are noted. Then all points on the graph connected to points on the best path are assigned to one cluster (i.e., the neighborhood pixels of all pixels on a given path are collected and associated to one another as a cluster). Finally we find the next "darkest unassigned pixel" so we can start the same process again. The algorithm ends when every point on the graph has been assigned to a cluster.
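The outer loop described in this section can be sketched as follows. To keep the example short, the two ACS phases are passed in as callables (`find_end` and `find_path`), which are hypothetical stand-ins and not part of the original description. The sketch also assumes that a point already assigned to an earlier cluster keeps its label and that the starting point always joins its own cluster; neither detail is stated in the text.

```python
import numpy as np

def cluster_by_paths(points, neighbors, find_end, find_path):
    """Outer loop of the path-based clustering.

    points    : (n, bands) array of spectra
    neighbors : dict mapping each node to the nodes connected to it
    find_end  : callable(start) -> end node (stand-in for the first ACS phase)
    find_path : callable(start, end) -> list of nodes (stand-in for the second ACS phase)
    """
    points = np.asarray(points, dtype=float)
    brightness = np.linalg.norm(points, axis=1)  # distance from the origin
    labels = np.full(len(points), -1)            # -1 marks "unassigned"
    cluster = 0
    while (labels == -1).any():
        unassigned = np.flatnonzero(labels == -1)
        # Darkest unassigned point = smallest distance from the origin.
        start = int(unassigned[np.argmin(brightness[unassigned])])
        end = find_end(start)
        path = find_path(start, end)
        # The path itself plus every node connected to a node on the path.
        members = set(path) | {start}
        for node in path:
            members.update(neighbors[node])
        for node in members:
            if labels[node] == -1:  # earlier assignments are kept (assumption)
                labels[node] = cluster
        cluster += 1
    return labels
```

Since the starting point is always labeled, each pass assigns at least one point and the loop terminates.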


The clustering algorithm is implemented as the following series of steps, also shown in Figure 5:

Given a graph with nodes and edges, find the "darkest" unassigned pixel in the set:

Figure 1. First step: identify starting point for first path.

We let ants construct paths in search of the path which goes farthest:

Figure 2. Second step: ants go as far as they can, find end point for this path.

We next optimize for the shortest path from start to end:

Figure 3. Third step: ants find optimal path between current start and end points.

Finally we assign all points in the shortest path and all neighbors to one cluster:

Figure 4. Fourth step: associate neighborhood pixels to the cluster.

The algorithm flow chart is shown in Figure 5.


Figure 5. Flowchart of the Ant Colony Optimization clustering algorithm.

Table 1. Free parameters in the ACS algorithm implemented here.

Parameter        Description
$r_1$            radius for estimating point density
$r_2$            radius for edge creation on graph
$N$              fraction of pixels used in path creation graph
$k$              number of ants used per iteration
$\alpha$         weight factor on pheromones
$\beta$          weight factor on heuristic values
$\rho$, $\xi$    pheromone evaporation parameters
$\tau_0$         initial pheromone value
$q$              ACS probability parameter
$Q$              pheromone update rule scaling

We note here that one limitation of this approach is that there are many free parameters required to adequately use ACO & ACS to identify paths in the spectral space. Table 1 shows the free parameters required by the Ant Colony System as implemented here. Parameter values for the results shown below were derived empirically.

By construction, it is likely that the ACO clustering algorithm leaves some pixels of the image unassigned to any cluster. This occurs because only the most "dense" points in the cloud are used when creating ant paths, and sometimes the less dense points are so far from the ant paths that they end up unassigned. In the cluster maps shown below these pixels are colored black. If having unassigned pixels is a problem, there are many ways to assign them. One potential way would be to take all the pixels that are unassigned and treat them as if they belonged to a new image. You could then run the ACO algorithm on this subset of the image and create clusters for these pixels. Alternatively, a fuzzy clustering scheme could be used to assign these pixels in a probabilistic manner based on their proximity to clusters.
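One possible form of the fuzzy assignment mentioned above is sketched here. It is purely illustrative: the memberships are taken to be proportional to the inverse squared distance to each cluster mean, which is an assumption of this example and not a scheme described in the paper. The function name `soft_assign` and the label value -1 for unassigned pixels are likewise hypothetical.

```python
import numpy as np

def soft_assign(pixels, labels, eps=1e-12):
    """Membership of each unassigned pixel (label -1) in each existing cluster.

    Weights are proportional to 1 / d^2, where d is the distance to the cluster mean.
    """
    pixels = np.asarray(pixels, dtype=float)
    labels = np.asarray(labels)
    cluster_ids = np.unique(labels[labels >= 0])
    means = np.stack([pixels[labels == c].mean(axis=0) for c in cluster_ids])
    todo = np.flatnonzero(labels == -1)
    # Squared distance from every unassigned pixel to every cluster mean.
    d2 = ((pixels[todo, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
    w = 1.0 / (d2 + eps)  # eps guards against division by zero
    membership = w / w.sum(axis=1, keepdims=True)  # each row sums to 1
    return todo, cluster_ids, membership
```

Since the clusters here are built around one dimensional paths rather than compact blobs, the distance to the nearest point on the path would likely be a better proximity measure than the distance to the mean; the mean is used only to keep the sketch short.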

3. EXPERIMENTAL RESULTS

3.1 Application to a Multispectral Image

We applied the presented algorithm to a 100 × 100 pixel area of interest of a DigitalGlobe Worldview-2 8 band image. Figure 6 shows the true color representation of the full image along with the area of interest to which the clustering algorithm was applied. The image is largely a forested region, and the area of interest contains trees, exposed soils, structures, and other man-made features.

Figure 6. (a) Example Worldview-2 image shown in RGB. (b) Area of interest.

To better understand what the algorithm is doing, we show the actual ant paths identified for this image (as projected into two spectral bands) using various amounts of the data in Figure 7. Figure 7(a) shows the data in the two bands, where color represents estimated density per pixel in the spectral domain (lighter blue implies higher density). We note that the data are filtered by density in the early steps of the algorithm such that we only identify paths through the most dense portions of the data cloud in the spectral domain. Figures 7(b) - 7(d) show the cluster paths determined when using 40%, 50%, or 60% of the data, respectively, based on spectral density. Here we see that as more data are considered in the path identification step, more paths are uniquely found and "gaps" in the data are gradually filled. However, we also note that using only 40% of the data the primary structures in the cloud are correctly identified by the ACS algorithm. Note that several paths, when viewed in this projection, "overlap" and could be identified as a single cluster that is not well modeled by a one dimensional manifold. An algorithm to cluster the paths together, or alternatively group the resulting cluster maps, was not implemented at this time.

Figure 8 shows the cluster maps for three separate clusterings. The three cluster maps show the differences seen when using 40%, 50% or 60% of the data cloud when generating the original cluster paths. These results show that the algorithm is successful in producing a spatially interpretable cluster map using only a collection of one dimensional manifolds and the spectral pixels associated to those paths through the spectral cloud. Even using only 40% of the data to build the paths, the algorithm correctly separates out the vegetation from the bare soil and several of the man-made structures. The addition of more data to the path creation step only improves the cluster separability in the final cluster map. This result indicates that relatively low dimensional spectral imagery, in this case the 8 spectral bands of Worldview-2, can be well modeled through a collection of one dimensional manifolds as derived directly from the data through an Ant Colony Optimization technique. These one-dimensional manifolds can then be used to separate individual materials in the spectral domain to produce cluster maps.

Figure 7. (a) Two band projection of the WV2 data used. Resulting paths when using (b) 40%, (c) 50%, and (d) 60% of the data in the path construction algorithm.

Figure 8. (a) RGB of area of interest. (b) - (d) Cluster maps after running the ACO clustering algorithm. Results shown for using (b) 40%, (c) 50%, and (d) 60% of the data when building the paths.

4. SUMMARY AND CONCLUSIONS

This paper has presented a novel approach to modeling multispectral imagery through the use of low dimensional manifolds as derived from the data cloud via the Ant Colony Optimization (ACO) approach to identifying paths on graphs. Here, we represent the multispectral data as a graph and implement a variant of ACO, the Ant Colony System (ACS), to identify "best" paths through the data in the spectral domain. Here, "best" means long paths through the densest portions of the data. Each path is considered as the best representation of a unique material cluster in the data, and other pixels in the data cloud are associated to the cluster based on their proximity, in the spectral domain, to the path. The algorithm has been applied to a high resolution multispectral image collected with the Worldview-2 platform.

The resulting cluster maps show that, perhaps surprisingly, the low dimensional model of the data is generally good and related to material separability in the spectral domain. This is surprising as previous work [10] has shown that while natural materials are generally well represented by low dimensional manifolds, they are not one dimensional. Bachmann, et al. (2005) have used a locally linear manifold estimation to cluster hyperspectral imagery in the nonlinear scenarios of near shore water remote sensing with success, but this approach used the ISOMAP algorithm to identify the manifold coordinates for each pixel to produce the clustering. Here, we do not identify the manifold itself, but only leverage the identification of a low dimensional (i.e., one) path through the data to which we can associate pixels assuming they are of similar material.

While the Ant Colony Optimization / Ant Colony System methodology presented here for clustering spectral imagery may not be practical in application to large imagery (due to the computational complexity of the method and the large number of free parameters), it is hoped that the notion of using low dimensional manifolds to understand and model multispectral imagery will inspire new classes of algorithms for analysis of high resolution remotely sensed spectral imagery.

ACKNOWLEDGMENTS

The authors wish to thank DigitalGlobe for the imagery used in this study.

REFERENCES

1. J. R. Schott, Remote Sensing: the Image Chain Approach, Oxford University Press, 2nd ed., May 2007.

2. D. Manolakis, D. Marden, and G. Shaw, "Hyperspectral image processing for automatic target detection applications," Lincoln Laboratory Journal 14(1), pp. 79-116, 2003.

3. C. L. Bachmann, T. L. Ainsworth, and R. A. Fusina, "Exploiting manifold geometry in hyperspectral imagery," IEEE Transactions on Geoscience and Remote Sensing 43(3), 2005.

4. B. Basener, E. Ientilucci, and D. Messinger, "Anomaly detection using topology," in Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XIII, S. Shen, ed., 6565, SPIE, 2007.

5. B. Basener and D. Messinger, "Enhanced detection and visualization of anomalies in spectral imagery," in Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XV, S. Shen, ed., 7334, SPIE, 2009.

6. R. Mercovich, A. Harkin, and D. Messinger, "Automatic clustering of multispectral imagery by maximization of the graph modularity," in Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XVII, S. Shen, ed., 8048, SPIE, 2011.

7. J. Albano, D. Messinger, and S. Rotman, "Commute time distance transformation applied to spectral imagery and its utilization in material clustering," submitted to Optical Engineering, 2012.

8. M. Dorigo and T. Stützle, Ant Colony Optimization, MIT Press, 1st ed., 2004.

9. R. Mercovich, J. Albano, and D. Messinger, "Techniques for the graph representation of spectral imagery," in WHISPERS 2011, IEEE GRSS, 2011.

10. A. Schlamm, D. Messinger, and B. Basener, "Geometric estimation of the inherent dimensionality of single and multi-material clusters in hyperspectral imagery," Journal of Applied Remote Sensing 3, 2009.
