
The Kohonen Class of Neural
Networks

Karl Lalonde

Institute of Atmospheric Sciences

South Dakota School Of Mines and Technology


Pattern Recognition and Classification


Supervised


Show the computer something and tell it what it is.

Classification occurs partly before system training to select
training information. Additional classification and
verification occurs after system training by a human
problem domain expert.


Unsupervised


Show the computer something and let it sense what it is.

Classification occurs after system training by a problem
domain expert, usually by identifying data drawn to a
“cluster center.” Unsupervised methods are often equated
with data clustering.

Why cluster?

Because sometimes you just need
a way to group stuff.

When we use unsupervised learning

How do you learn without any language skills?

When we use unsupervised learning

How do you cope without linguistic skills or memory?

How do you find your way around a new place?

When we use unsupervised learning

How do you crack a dead language?

Etruscan


Linear B (Greek)


Linear A (still undeciphered)


More Importantly….

How would you teach these guys to do the same things?

Introducing the Kohonen Self
Organizing Map (KSOM)

Kohonen Self Organizing Map

Developed by this guy
(Teuvo Kohonen) at U of
Helsinki in the early
1980s.

Based on work by this guy (Christoph von der Malsburg) at Ruhr-Universität Bochum in the mid-1970s.

Biological Justification for the KSOM

The Kohonen and Malsburg models are based on studies of learning in the V1, V2, V4, and MT areas of the cerebral cortex. These correspond to the "Brodmann areas", specifically areas 17 through 19.

Biological Justification for the KSOM

The structure consists of a “map” of structured cortical
“hypercolumns” containing pyramidal neural cells.

Biological Justification:
Vision and Learning

(at least as I understand it)



Input stimulus goes to the rods and cones of the eye, and gets converted for processing.

Vision and Learning

(at least as I understand it)

Input Layer

And then it hits the cortex.

Unsensitized pyramidal cells

Vision and Learning

(at least as I understand it)

Input Layer

A cell in cortical sheet is stimulated!

Vision and Learning

(at least as I understand it)

Input Layer

And responds accordingly.


Vision and Learning

(at least as I understand it)

Input Layer

As do others in the immediate area or “neighborhood”

Vision and Learning

(at least as I understand it)

Input Layer

Different inputs ….

Vision and Learning

(at least as I understand it)

Input Layer

….impact different areas of the cortex.

Vision and Learning

(at least as I understand it)


Resulting in a map with clusters of neurons that respond to their respective stimuli.

Kohonens: The Basic Idea


Make a two dimensional array, or map, and
randomize it.


Present training data to the map and let the cells
on the map compete to win in some way.
Euclidean distance is usually used.


Stimulate the winner and some friends in the
“neighborhood”.


Do this a bunch of times.


The result is a two-dimensional “weight” map.

Kohonen SOMs: Details

Make a two-dimensional array, or map, and randomize it.




Often a 20 by 20 rectangular or hexagonal map is used, with each cell
having the same structure as the presented feature vectors.
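As a rough sketch, such a map could be declared as a plain three-dimensional array. The 20-by-20 size and the KSOLayer name come from these slides; NumFeatures here is an assumed placeholder for however many components the feature vectors have.

    // A 20-by-20 rectangular map; each cell holds one weight per feature,
    // i.e. the same structure as a presented feature vector.
    const int X = 20;               // map width  (from the slide)
    const int Y = 20;               // map height (from the slide)
    const int NumFeatures = 8;      // assumption: depends on the data set

    double KSOLayer[ X ][ Y ][ NumFeatures ];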

Kohonen SOMs: Details

Make a two-dimensional array, or map, and randomize it.

Randomizing or initializing consists of setting every cell component to 0.500 +/- a small random perturbation or offset.

Kohonen SOMs: Details

Make a two-dimensional array, or map, and randomize it.

for (i = 0; i < X; i++)
    for (j = 0; j < Y; j++)
        for (k = 0; k < NumFeatures; k++)
        {
            KSOLayer[ i ][ j ][ k ] = 0.5;

            val1 = rand() % 100;    // get 0 to 99
            val1 -= 50.0;           // now get roughly -50 to 50
            val1 /= 500.0;          // finally get roughly -0.10 to 0.10

            val2 = rand() % 100;    // get 0 to 99
            val2 -= 50.0;           // now get roughly -50 to 50
            val2 /= 500.0;          // finally get roughly -0.10 to 0.10

            KSOLayer[ i ][ j ][ k ] += (val1 * val2);   // Multiply for a small perturbation (at most +/- 0.01)
        }

Kohonen SOMs: Details


Present training data to the map and let the
cells on the map compete to win in some
way. Similarity measures such as Euclidean
distance are generally used.



Usually the cell with closest distance to the
presented training vector is called the
‘winner’.

Feature Vector Presentation


GetTrainVector();                               // get the vector to present

Smallest = 1000;                                // default smallest value
SmallestPositionX = SmallestPositionY = 0;

for (x = 0; x < MapSizeX; x++)                  // For each node in the map
    for (y = 0; y < MapSizeY; y++)
    {
        Distance = 0;

        for (z = 0; z < NumberFeatures; z++)
        {
            D = TrainVector[ z ] - KSOLayer[ x ][ y ][ z ];
            D *= D;
            Distance += D;
        }

        Distance = sqrt( Distance );

        if (Distance <= Smallest)               // If new position has smallest distance
        {
            Smallest = Distance;                // Keep it and the new distance
            SmallestPositionX = x;
            SmallestPositionY = y;
        }
    }

UpdateWeights( SmallestPositionX, SmallestPositionY );  // Update Winning Node Wts.


Kohonen SOMs: Update Details


Stimulate the winner and some friends in
the “neighborhood”.



The following weight update rule is used (or some variant of it):

$w_{ijk}(t+1) = w_{ijk}(t) + \left[\,\alpha \,\Theta_{ij}(t)\cdot\left(x_k(t) - w_{ijk}(t)\right)\right]$

where

w(t+1) are the new weights; w(t) are the old weights

i, j are node coordinates; k is the feature vector dimension

x(t) is the presented feature (training) vector

α is a learning “constant”

Θ_ij(t) is a spatial (non-numeric) neighborhood function
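As a minimal sketch of this rule in code, assuming a Gaussian is chosen for Θ_ij(t) (the slides do not fix that choice, and the simplified update shown later uses a flat neighborhood instead). Sigma, d2, and theta are assumed helper variables; the rest reuses the names from the earlier snippets.

    // One update pass over the whole map, with a Gaussian neighborhood of width Sigma
    for (i = 0; i < MapSizeX; i++)
        for (j = 0; j < MapSizeY; j++)
        {
            // Squared grid distance from node (i, j) to the winning node
            d2 = (i - SmallestPositionX) * (i - SmallestPositionX)
               + (j - SmallestPositionY) * (j - SmallestPositionY);

            // Gaussian neighborhood value, Theta_ij(t)
            theta = exp( -d2 / (2.0 * Sigma * Sigma) );

            // w_ijk(t+1) = w_ijk(t) + Alpha * Theta * (x_k(t) - w_ijk(t))
            for (k = 0; k < NumFeatures; k++)
                KSOLayer[ i ][ j ][ k ] += Alpha * theta * (TrainVector[ k ] - KSOLayer[ i ][ j ][ k ]);
        }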

Update Strategy

Kohonen SOMs: Update Details

Stimulate the winner and some friends in the “neighborhood”.



(Set the High and Low neighborhood variables based on proximity to the map edge; one way to do this is sketched after the loop below)

//
// Now update ... for each node in the 'hood (yo)
//
for (i = LowX; i <= HighX; i++)
    for (j = LowY; j <= HighY; j++)
        for (k = 0; k < NumFeatures; k++)
        {
            work = TrainVector[ k ] - KSOLayer[ i ][ j ][ k ];

            KSOLayer[ i ][ j ][ k ] += Alpha * work;
        }
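One way to set those bounds, as a sketch: LowX, HighX, LowY, HighY, and TheHood are the variables used elsewhere in these slides, while the clamping itself is my assumption about how the edge handling might look.

    // Clamp the square neighborhood of radius TheHood around the winning node
    // so that it never runs off the edge of the map.
    LowX  = SmallestPositionX - TheHood;   if (LowX  < 0)             LowX  = 0;
    HighX = SmallestPositionX + TheHood;   if (HighX > MapSizeX - 1)  HighX = MapSizeX - 1;

    LowY  = SmallestPositionY - TheHood;   if (LowY  < 0)             LowY  = 0;
    HighY = SmallestPositionY + TheHood;   if (HighY > MapSizeY - 1)  HighY = MapSizeY - 1;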

In other words….

Find the closest node to the presented feature vector and
some in the neighborhood, and stimulate them by making
them a little more like the presented feature vector.


Kohonen SOMs: Details


Do this a bunch of times.



The number of training epochs can vary between 500 and 100,000, depending on the rate of map structural convergence. A learning constant, α, is introduced to control the rate of adaptation of the self-organizing map. This value usually ranges from 0.01 to 0.50.



The learning value is slowly decreased over the training
period to focus the training of the presented feature
vectors.
You also slowly collapse the size of the
neighborhood as another tool to focus the knowledge
adaptation.


Kohonen SOMs: Details


Alpha     = 0.2;
AlphaIncr = Alpha / NumberTrainingEpochs;
TheHood   = 10;
HoodDrop  = 1000;

(Train loop starts here)

//
// Every epoch, update the gain constant using whatever strategy
//
if (Alpha >= 0.01)
    Alpha -= AlphaIncr;

//
// Shrink the neighborhood every HoodDrop epochs, but never below 1
// Note: Epochs is the epoch counter, starting at 0
//
if ( (Epochs % HoodDrop == 0) && (TheHood != 1) )
    TheHood--;

(Train loop ends here)

Kohonen SOMs: Results


The result is a two-dimensional map of weights, hopefully with areas or individual cells that attract objects with similar feature signatures, including objects not drawn from the original training set.

A Trained KSOM

When Training is Finished, Classify


Take all of the test data from the original data set.


Find the closest map node for each, and record the position on the map that was closest (sketched below).


Create a cluster map, and analyze the groupings.


Assign classes to the groups.


Create a visually “classed” map.


Analyze the errors.



(In the future, we’ll look at this process using satellite imagery)
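A minimal sketch of the first two steps above, reusing the distance search from the training slides. GetTestVector, TestVector, NumberTestVectors, and HitCount are assumed names, not from the original deck.

    // For each test vector, find the closest map node (same search as in training,
    // but without updating any weights) and tally a hit at that map position.
    for (n = 0; n < NumberTestVectors; n++)
    {
        GetTestVector( n );                     // load TestVector[] (assumed helper)

        Smallest = 1000;
        SmallestPositionX = SmallestPositionY = 0;

        for (x = 0; x < MapSizeX; x++)
            for (y = 0; y < MapSizeY; y++)
            {
                Distance = 0;

                for (z = 0; z < NumberFeatures; z++)
                {
                    D = TestVector[ z ] - KSOLayer[ x ][ y ][ z ];
                    Distance += D * D;
                }

                Distance = sqrt( Distance );

                if (Distance <= Smallest)
                {
                    Smallest = Distance;
                    SmallestPositionX = x;
                    SmallestPositionY = y;
                }
            }

        HitCount[ SmallestPositionX ][ SmallestPositionY ]++;   // builds the cluster map
    }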



Kohonen-style network training is also called “Competitive Learning”

During a single training presentation, the cell with the closest distance to the presented training vector can be considered the ‘winner’, and is thought to have successfully competed for that honor, hence the term “competitive learning”.

“Competitive Learning”

There are two types of competitive learning: hard and soft.

Hard competitive learning is essentially a “winner-take-all” scenario: the winning neuron is the only one which receives any training response (sketched below).

(Closer to most supervised learning models?)
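As a sketch, the hard case is just the earlier neighborhood update with the neighborhood collapsed to the single winning node; the variable names reuse the earlier slides.

    // Hard competitive learning: only the winning node moves toward the input.
    for (k = 0; k < NumFeatures; k++)
        KSOLayer[ SmallestPositionX ][ SmallestPositionY ][ k ] +=
            Alpha * (TrainVector[ k ] - KSOLayer[ SmallestPositionX ][ SmallestPositionY ][ k ]);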

“Competitive Learning”


Soft competitive learning is essentially a “share with your
neighbors” scenario.



This is actually closer to the real cortical sheet model.



This is obviously what the KSOM and other unsupervised
connectionist methods use.

Other Computer Science Issues

(Making the “Perfect Kohonen”)


Different neighborhood functions


Different learning rules


Gaussian


ViSOM (Yin)


Normalization (two common options are sketched after this list)


Vector Magnitude


Division by largest number


Anything that works for the problem


Similarity measures


Other properties and things to be aware of
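A minimal sketch of the two normalization options named above, written as stand-alone helpers; the function names are assumptions, not from the slides.

    #include <cmath>

    // 1) Normalize by vector magnitude (scale the feature vector to unit length).
    void NormalizeByMagnitude(double *Vec, int NumFeatures)
    {
        double mag = 0.0;

        for (int k = 0; k < NumFeatures; k++)
            mag += Vec[ k ] * Vec[ k ];
        mag = sqrt( mag );

        if (mag > 0.0)
            for (int k = 0; k < NumFeatures; k++)
                Vec[ k ] /= mag;
    }

    // 2) Normalize by the largest component (every feature ends up in [-1, 1]).
    void NormalizeByLargest(double *Vec, int NumFeatures)
    {
        double largest = 0.0;

        for (int k = 0; k < NumFeatures; k++)
            if (fabs( Vec[ k ] ) > largest)
                largest = fabs( Vec[ k ] );

        if (largest > 0.0)
            for (int k = 0; k < NumFeatures; k++)
                Vec[ k ] /= largest;
    }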

The real neighborhood function

Which just happens to look like ….

The “Mexican Hat” Wavelet Function.
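One common form of the Mexican hat (Ricker) function, written here as a neighborhood weight in terms of the grid distance d from the winning node; the exact expression is my assumption, since the slide only names the shape:

$\Theta(d) = \left(1 - \dfrac{d^2}{\sigma^2}\right)\exp\!\left(-\dfrac{d^2}{2\sigma^2}\right)$

Nearby nodes (small d) receive a positive, excitatory update; nodes somewhat farther out receive a small negative, inhibitory one; and distant nodes are essentially unaffected.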

Neighborhood Update

(figure: the winning node at the center of its update neighborhood)

Different Neighborhood Update Rules

(figures: the winning node under different neighborhood update rules)

Similarity measures


Issues with Euclidean distance

Need a fairly uniform feature space

Other similarity/dissimilarity metrics (two of these are sketched below)

The rest of the Minkowski metric family

Tanimoto Distance

Chebyshev Distance

Fractal Distance metrics

Home-grown Distance metrics

Text-specific Similarity metrics
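A minimal sketch of two of the alternatives listed above, as stand-alone helpers over the same flat feature vectors used elsewhere in the slides; the function names are assumptions.

    #include <cmath>

    // Minkowski distance of order p (p = 2 reduces to Euclidean, p = 1 to city-block).
    double MinkowskiDistance(const double *A, const double *B, int NumFeatures, double p)
    {
        double sum = 0.0;

        for (int k = 0; k < NumFeatures; k++)
            sum += pow( fabs( A[ k ] - B[ k ] ), p );

        return pow( sum, 1.0 / p );
    }

    // Chebyshev distance: the largest single-feature difference.
    double ChebyshevDistance(const double *A, const double *B, int NumFeatures)
    {
        double largest = 0.0;

        for (int k = 0; k < NumFeatures; k++)
            if (fabs( A[ k ] - B[ k ] ) > largest)
                largest = fabs( A[ k ] - B[ k ] );

        return largest;
    }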

Properties


Topological Preservation


KSOMs are topology-representing networks (TRNs)


This means that if training signals are close to each other in the
real world, they will be close to each other on a trained KSOM.


TRNs are also biologically justified



Retinotopic mapping between the retina and the cortex


Somatosensory mapping (the homunculus)

(figure: mapping from the input space onto the KSOM)

Properties


Topological Preservation


The Initial Neighborhood Size is Important!



Start large and collapse slowly at first, depending on
the number of training vectors being used and the
number of real world vectors your training set actually
represents. Neighborhood size is one of the primary
controls of the quality of the topological preservation in
the KSOM.

Applications: General


Oil and Gas exploration.


Satellite image analysis.


Data mining.


Document Organization


Stock price prediction (Zorin, 2003).


Technique analysis in football and other sports (Bartlett, 2004).


Spatial reasoning in GIS applications


As a pre-clustering engine for other ANN architectures

Over 3,000 journal-documented applications, at last count.

Resource Acknowledgements


Cottrell, M., Hammer, B., Hasenfuß, A., Villmann, T., WSOM 2005 presentation

Bombarda, F., Neuroanatomy, http://www.neuroanatomy.hpg.ig.com.br/

Shun-ichi Amari, Dynamics of Excitation and Self-Organization in Neural Fields



Next time….

Satellite Image Analysis

(Questions?)