Neural Edelmanism and Neural Darwinism

Oct 24, 2013

The Evolutionary Review






Chrisantha Fernando



For many years Gerald Edelman's theory of Neural Darwinism confused me [1]. Having won a Nobel Prize for discovering the structure of antibodies, Edelman was better placed than almost anyone to appreciate that the adaptive immune system works by a kind of natural selection called somatic selection. In somatic selection, B-cells that produce antibodies that bind a foreign antigen more tightly can outcompete (replicate faster than) other B-cells whose antibodies bind the antigen less well. Edelman went on to propose that a similar process takes place in the brain for neuronal groups (neurons connected to each other by synapses). He argued that, like antibodies, neuronal groups also compete with each other, binding not antigens, as antibodies do, but stimuli. Neuronal groups that bind stimuli best can obtain more reward, with re-entrant (reciprocal) connections between groups allowing the winning groups to remould the losing groups.



But is this natural selection? Is Edelman's theory entitled to its title of "Neural Darwinism"? It depends which great evolutionary biologist you ask. According to one definition given by John Maynard Smith (JMS) [2], Edelman's neuronal groups would only be units of evolution if re-entrant connections between groups really allowed the replication of information between groups, so that the losing group came to resemble the winning group. I have not been able to find evidence of such a mechanism in Edelman's theory: Edelman has not shown how one neuronal group could really transmit information to another. But according to a broader definition of natural selection given by the theoretical biologist George Price [3], one does not require explicit multiplication of neuronal groups, just a redistribution of resources (competition) between groups, for a process to be called natural selection. Michod has pointed out that Edelman's neuronal groups are Darwinian entities in this somewhat broader Priceian sense [4].






This paper describes how Eörs Szathmáry and I solved the problem of how neuronal groups could really be units of evolution in the JMS sense, and not just in the weaker Priceian sense. The principles involved point to a fundamental link between the origin of life and the origin of human cognition [5]. Our debugging of Edelman's theory may permit the unification of two fields of human endeavour, evolutionary biology and neuroscience. Both fields aim to explain open-ended adaptation, but up to now have done so in relative isolation.



Along with Gerald Edelman and William Calvin [6], we propose that an underlying process of natural selection takes place in the brain. But we differ in claiming that populations of neuronal groups (replicators) can undergo natural selection as defined by JMS, i.e. with true replication of information, and not just in the broader sense described by Price. In our formulation, neuronal replicators are patterns of connectivity or patterns of activity in the brain that can make copies of themselves to nearby brain regions, with generation times of seconds to minutes [7,8]. We believe neuronal units of evolution can undergo natural selection in the brain itself, contributing to adaptive thought and action [9,10]; populations of good ideas evolve overnight. Their fitness is determined by the same dopamine-based rewards that have been proposed in other neural theories of reinforcement learning, including Edelman's. Unlike Edelman and Calvin, however, we propose several viable mechanisms for replication of neuronal units of evolution. But why is replication so important for natural selection? And what makes JMS's formulation of natural selection more powerful than Price's?


Let's examine both definitions of natural selection more closely. Definitions are never right or wrong, only helpful or unhelpful. JMS defined a unit of evolution as any entity that has the following properties [2]. The first property is multiplication: the entity produces copies of itself that can make further copies of themselves; one entity produces two, two entities produce four, four entities produce eight, in a process known as autocatalytic growth. Most living things are capable of autocatalytic growth, but there are some exceptions; for example, sterile worker ants and mules do not multiply, and so, whilst being alive, they are not units of evolution.

The second requirement is variation, i.e. there must be multiple possible kinds of entity.

Some things are capable of autocatalytic growth and yet do not vary; for example, fire can grow exponentially, for it is the macroscopic phenomenon arising from an autocatalytic reaction, yet fire does not accumulate adaptations by natural selection.

The third requirement is that there must be heredity, i.e. like begets like, so that offspring resemble their parents. Doron Lancet proposed that prior to nucleotides and gene-based heredity, clumps of lipid molecules called composomes could be capable of undergoing natural selection [11]. But recently we have shown that whilst composomes can multiply and possess variation, they do not have stable heredity, i.e. like occasionally produces very much unlike (the mutation bias being too strong in certain directions), and so unfortunately such systems cannot after all evolve by natural selection [12]. Later we will see that Edelman's neuronal groups may fall into this final category.
If units of evolution of different types have different probabilities of producing offspring, i.e. if they have differential fitness, and if these probabilities are independent of the frequencies of other entities, the average fitness of the population will be maximised, and there will be survival of the fittest.


George Price gave a more general and more inclusive definition of natural selection [3]. He said that a trait (any measurable value) would increase in frequency to the extent that the probability of that trait being present in the next generation was positively correlated with the trait itself, counterbalanced by that trait's variability (the tendency of that trait to change between generations for any reason, e.g. due to mutation).
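Price's definition can be written as a short identity. The sketch below (my own toy numbers and variable names, not taken from the paper) checks that the change in a population's mean trait decomposes exactly into a selection term, the covariance between fitness and trait, plus a transmission-bias term covering change for any other reason, such as mutation:

```python
# Toy illustration of the Price equation:
#   w_bar * dz_bar = Cov(w, z) + E(w * dz)
# The covariance term is "selection"; the second term is the
# transmission bias discussed in the text. All values are invented.

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

z  = [1.0, 2.0, 3.0]   # parental trait values
w  = [0.5, 1.0, 1.5]   # relative fitness (offspring numbers)
zp = [1.1, 2.0, 2.9]   # mean trait of each parent's offspring

w_bar = mean(w)
dz = [b - a for a, b in zip(z, zp)]  # transmission change per parent

# Left-hand side: actual change in the population's mean trait.
z_bar_next = sum(wi * zi for wi, zi in zip(w, zp)) / sum(w)
lhs = z_bar_next - mean(z)

# Right-hand side: selection term plus transmission-bias term.
rhs = (cov(w, z) + mean([wi * d for wi, d in zip(w, dz)])) / w_bar

print(round(lhs, 6), round(rhs, 6))  # the two sides agree
```

The identity holds for any numbers, which is why Price's definition is a test one can apply to a system rather than a recipe for building one.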

Notice that JMS's definition is algorithmic: it tells you roughly how to make the natural-selection cake. Price's definition is statistical: it tells you whether something is a natural-selection cake or not, i.e. whether it is the kind of cake that can undergo the accumulation of adaptation, or survival of the fittest. It is important to note that both these definitions were intended for use in the debate that raged over group selection [13], because there it was essential to formally define what a legitimate evolvable substrate was. Here we use them to understand neuronal group selection.





Let's take some search algorithms and see if they satisfy these two very different definitions of natural selection. From a computer science perspective, natural selection is a search algorithm that generates and selects entities for solving a desired problem, in which the quality of an entity's solution is correlated with the probability of transmission of that entity, i.e. with its fitness. Genetic algorithms work in this way. They are computer programs that implement natural selection as defined by JMS and Price [14]. Many other algorithms for finding adaptations exist, such as hill-climbing, simulated annealing, temporal-difference learning, and random search. We ask: do these satisfy either of the definitions of natural selection?


A classification of search algorithms shows that natural selection as defined by JMS really does have some special properties that are often overlooked, because we take its implementation in the biosphere for granted and because we have erroneously come to equate the models of natural selection that evolutionary biologists use with natural selection itself. The table below shows my classification. Systems undergoing natural selection appear on the right.


Solitary search:
  (Stochastic) hill-climbing

Parallel search:
  Independent hill-climbers

Parallel search with competition (Price):
  1. Competitive learning
  2. Reinforcement learning
  3. Synaptic selectionism
  4. Neural Edelmanism

Parallel search with competition and information transmission (JMS):
  1. Genetic natural selection
  2. Adaptive immune system
  3. Genetic algorithms
  4. Didactic receptive fields
  5. Neuronal replicators

Table 1. A classification of search (generate-and-test) algorithms.


The left-hand column of Table 1 shows the simplest class of search algorithm, solitary search, in which at most two candidate units are maintained at one time. Hill-climbing is an example of a solitary search algorithm in which a variant of the unit (candidate solution) is produced and tested at each 'generation'. If the offspring solution's quality exceeds that of its parent, then the offspring replaces the parent. If it does not, the offspring is destroyed and the parent produces another correlated offspring. Such an algorithm can get stuck on local optima. Figure 1 shows this algorithm implemented by a robot on an actual hilly landscape. The robot carries a windmill, and its aim is to get to the highest peak. Let's assume for now that wind speed increases with altitude. The robot moves randomly to a point on a radius a few metres away, measures the wind speed, and stays there if this wind speed is higher than the previous wind speed it measured. If it is not higher, it goes back to its previous location. To do this, it must have some memory of the previous location.
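The robot's behaviour can be sketched in a few lines. The hillside, step radius and step budget below are my own toy choices; the point is only that the loop needs nothing more than a memory of the previous position:

```python
import random

# A minimal sketch of the hill-climbing robot, assuming a hypothetical
# one-dimensional hillside with a single peak at x = 4. At each
# 'generation' the robot tries a random nearby point and stays only if
# the measured "wind speed" (here, just altitude) is higher.

def altitude(x):
    return 10.0 - abs(x - 4.0)  # toy hillside, peak at x = 4

def hill_climb(start, steps=1000, radius=0.5, seed=0):
    rng = random.Random(seed)
    x = start
    for _ in range(steps):
        candidate = x + rng.uniform(-radius, radius)
        if altitude(candidate) > altitude(x):
            x = candidate  # stay at the better location
        # otherwise: return to the remembered previous location
    return x

print(round(hill_climb(0.0), 1))  # ends near the peak at x = 4
```

On a landscape with several peaks the same loop stalls on whichever peak it reaches first, which is exactly the local-optimum problem described in the text.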




Figure 1. Imagine a robot on a mountainous landscape whose task it is to reach the highest peak. One can imagine, for example, that it holds a windmill which it wishes to rotate at the highest speed possible, and the higher up it is the faster its windmill will rotate. If it behaves according to hill-climbing, it starts from a random position (1), moves to a nearby location (2), and tests whether that location is higher than its original location by measuring the speed of its windmill. If the wind speed is faster, it remains there, but if the wind speed is not faster (shown in the unnumbered circles) the robot moves back to the previous location. The robot may get stuck on a peak that is not the highest peak (a local optimum).

A robot (not shown) behaving according to stochastic hill-climbing does the same, except that it accepts the new position with a certain probability even if it is slightly lower than the original position. By this method stochastic hill-climbing can sometimes avoid getting stuck on a local optimum, but it can also occasionally lose the peak it is on, because memory is kept only of the immediately preceding position.



Stochastic hill-climbing and simulated annealing are examples of solitary search where there is a certain probability of accepting a worse-quality offspring. This balances exploration and exploitation and can reduce the chances of getting stuck on local optima; the cost, however, is potentially losing the currently optimal peak.



But one can ask: isn't solitary search actually a kind of natural selection, according to Price and according to JMS? Is it not natural selection with a population size of two, in which one individual replaces the other based on which is the fitter? Or does the fact that it can be implemented without explicit multiplication, by use of pointers and memory, or by a robot moving on a hillside, mean that it is not an example of natural selection according to JMS? See Figure 2, which shows two other implementations of hill-climbing, this time not on a hillside but in a system of physical discrete registers that can be in binary states.






Figure 2. Two implementations of hill-climbers, both trying to maximize the number of 0s in a string. The hill-climber on the left stores a solution, represented here as a binary string; it modifies the solution at some position in the string, and stores this modification. After assessing the new solution and comparing its quality with the original, it either keeps the modification or erases it. The hill-climber on the right replicates the entire solution to a separate location in memory. Only the hill-climber on the right has true multiplication as defined by JMS.


The search dynamics shown by both machines in Figure 2 are identical, and would be capable of the accumulation of adaptations according to Price's formulation of natural selection, as there is covariance between a trait and fitness. According to JMS's definition, the implementation on the right, involving the explicit multiplication (replication) of a unit, would constitute natural selection, but the implementation on the left, using pointers and memory, would not. However, notice that this distinction is between implementations of the same algorithm, both of which are indistinguishable in terms of search performance (although the system on the left uses fewer resources).
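The claim that the two machines have identical search dynamics can be checked directly. In this sketch (my own minimal encoding of Figure 2, not code from the paper) both hill-climbers maximize the number of 0s in a bit string and are driven by the same sequence of random flip positions; one mutates in place and undoes bad changes, the other replicates the whole string before testing:

```python
import random

# Pointer-and-memory version vs. replicating version of the same
# hill-climbing algorithm. Given identical random choices, their
# trajectories are identical; only the right-hand machine multiplies.

def quality(bits):
    return bits.count(0)  # number of 0s is the quality to maximize

def climb_in_place(bits, flips):
    bits = list(bits)
    for i in flips:
        before = quality(bits)
        bits[i] ^= 1              # modify one position in place
        if quality(bits) <= before:
            bits[i] ^= 1          # erase the modification
    return bits

def climb_by_copying(bits, flips):
    parent = list(bits)
    for i in flips:
        child = list(parent)      # replicate the entire solution
        child[i] ^= 1
        if quality(child) > quality(parent):
            parent = child        # offspring replaces parent
    return parent

rng = random.Random(1)
start = [rng.randint(0, 1) for _ in range(16)]
flips = [rng.randrange(16) for _ in range(100)]

a = climb_in_place(start, flips)
b = climb_by_copying(start, flips)
print(a == b)  # True: identical search dynamics
```

Both accept a flip exactly when it strictly improves quality, so the final strings always match; the difference is purely one of implementation.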
What about the robot on the hillside? Here the brain of the robot may store merely the path back to the previous position, and so the implementation of hill-climbing in that spatially embodied case may require no replication of an explicitly stored entity (i.e. a position representation) at all. Therefore, we can say that the phenomenon of hill-climbing can be implemented either with or without explicit replication, and therefore may or may not involve natural selection as defined by JMS. However, in all cases, the phenomenon of hill-climbing must accord with the principle of natural selection according to Price in order to accumulate adaptations.



Notice that in Figure 2 (right) at most two memory slots are available, which can contain a maximum of two candidate solutions at any one time. A slot is simply a material organization or substance that can be reconfigured into the form of a unit or candidate solution, for example a piece of memory in a computer or the organic molecules constituting an organism. What happens if many more slots are available? How should one best use them? In terms of Figure 1, this is equivalent to a hillside now inhabited by many robots, rather than just one. Now our aim is that at least one robot finds the highest peak. Notice that a slightly different aim would have been to maximize the total wind collected by the windmills of all the robots.


The simplest algorithm for these robots to follow would be for each one to behave completely independently of the others and not communicate with them at all. Each of them behaves exactly like the robot in Figure 1. In terms of the implementations shown in Figure 2, this multiple-robot version of search (simple parallel search) could be achieved simply by having multiple instances of the hill-climbing machinery, whether of the replicating kind or the pointer-and-memory kind; it doesn't matter which.





So, if we have many slots or robots available, it is possible just to let many of the solitary searches run at the same time, i.e. in parallel. However, can you see that this would be wasteful whatever the implementation? If one pair of slots (or a robot) became stuck on a local optimum, then there would be no way of reusing that pair of slots (or the robot). Whereas, if being stuck on a local optimum could be detected, then random reinitialization of the stuck slot pair would be a possibility (or, in the robot example, moving the stuck robot randomly to a new position). Even so, one could expect only a linear speed-up in the time taken to find a global optimum (the highest peak). It is difficult to imagine why anyone would want to do something like this given all those slots, and all those robots. This is the parallel search described in the second column of Table 1, and it is not surprising that not many algorithms fall into this class.



A cleverer way to use the extra slots would be to allow competition between slots for search resources, and by resources I mean the generate-and-test step of producing a variant and assessing its quality. In the case of robots, a step is moving a robot to a new position and reading the wind speed there. Such an assessment step is often the constraining factor in time and processing costs. If such steps were biased so that the currently higher-quality solutions (robots) did proportionally more of the search, then search would be biased towards the higher-quality solutions. This is known as competitive learning, because candidate solutions compete with each other for reward and exploration opportunities.

If the robots are programmed such that the amount of exploration they do increases with altitude, then those at higher altitudes do more exploration, and this may allow a faster discovery of the global optimum. No robot communicates with any other robot. If the robots utilize a common power supply, then they are competing with each other for exploration resources. This is an example of parallel search with resource competition, shown in column 3 of Table 1. It requires no natural selection as defined by JMS, i.e. it requires no explicit multiplication of information.
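This column-3 scheme can be sketched as follows; the hillside and parameters are my own toy choices. No robot communicates with another, but each generate-and-test step is allocated with probability proportional to current altitude, so fitter robots do proportionally more of the search:

```python
import random

# Parallel search with resource competition (column 3 of Table 1).
# Each step, one robot is chosen in proportion to its current
# "wind speed" (altitude) and performs one hill-climbing move of its
# own. No information ever passes between robots.

def altitude(x):
    return 10.0 - abs(x - 4.0)  # toy hillside, single peak at x = 4

def competitive_search(n_robots=5, budget=2000, radius=0.5, seed=0):
    rng = random.Random(seed)
    xs = [rng.uniform(-5.0, 5.0) for _ in range(n_robots)]
    for _ in range(budget):
        # Bias the next generate-and-test step towards fitter robots.
        weights = [altitude(x) for x in xs]
        i = rng.choices(range(n_robots), weights=weights)[0]
        candidate = xs[i] + rng.uniform(-radius, radius)
        if altitude(candidate) > altitude(xs[i]):
            xs[i] = candidate
    return max(altitude(x) for x in xs)

print(round(competitive_search(), 2))  # best robot ends near the peak
```

Resources (steps) are redistributed towards the fit, but a badly placed robot can never acquire a well-placed robot's position; it can only earn or lose search steps.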


Several algorithms fall into the above category. Reinforcement learning algorithms are examples of parallel search with competition [15]. Such algorithms have been proposed as an explanation for learning in the brain, and work in the following way. If the response produced by the firing of a synapse is positively correlated with reward, and if this reward strengthens the synapse, which increases its subsequent probability of firing, then the conditions of Price's definition of natural selection have been fulfilled. This is because there is a positive correlation between the trait (i.e. the response produced by firing the synapse) and the subsequent probability of that response occurring again. Similarly, a negative correlation between synaptic firing and reward reduces the subsequent probability of firing. Sebastian Seung calls these hedonistic synapses [16].

A single hedonistic synapse is equivalent to a single allele in genetic terms. If there is an array of such synapses emanating from the same neuron, and there is competition for chemical resources from the cell body of the neuron, then these synapses are equivalent to multiple genetic alleles competing for resources from the cell body, and the situation is almost mathematically equivalent to the Nobel Prize winner Manfred Eigen's replicator equations [17], in which the total population size of replicators is kept fixed and there is a well-defined number of possible distinct variants [9]. Eigen's equations are a popular model used by evolutionary biologists to model evolution.
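The fixed-population dynamics described here can be illustrated with a standard discrete-time replicator update (a generic textbook form, not code from [17]): variant frequencies grow in proportion to fitness and are renormalized, so resources are redistributed among a fixed set of variants without anything ever being copied between slots:

```python
# Discrete-time replicator dynamics in the spirit of Eigen's
# equations: shares grow in proportion to fitness and are
# renormalized so the total "population size" stays constant.
# No variant copies its identity into another slot; only the
# resource shares x_i change.

def replicator_step(x, f):
    grown = [xi * fi for xi, fi in zip(x, f)]  # growth ~ fitness
    total = sum(grown)                         # mean fitness x N
    return [g / total for g in grown]          # renormalize

x = [0.25, 0.25, 0.25, 0.25]  # equal initial shares of 4 variants
f = [1.0, 1.1, 1.2, 1.5]      # fixed fitness of each variant

for _ in range(100):
    x = replicator_step(x, f)

print([round(xi, 3) for xi in x])  # the fittest variant nears fixation
```

After a hundred rounds virtually all of the fixed resource pool sits in the fittest variant's slot, which is the survival-of-the-fittest behaviour Price's definition captures.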


Notice that there is a subtle difference between the competition between synapses described above and the robot example given for parallel search with competition. To use the robot analogy, each synapse is like a robot stuck in a particular location on the hillside, unable to move. Those with higher wind speeds are allowed to build larger windmills. In this simple kind of synaptic selectionism the system only exploits the variation that exists at the beginning. It is not even as powerful as the case we first considered as parallel search with competition, in which the robots with the greatest wind speeds are able to do more exploration: here, variation is limited to the variation that was produced at the very beginning, and robots cannot move up hills, just increase (or decrease) the size of their windmills.


So do such systems of parallel search with competition between synaptic slots really exhibit natural selection? Not according to the definition of JMS, because there is no replicator; there is no copying of solutions from one slot to another slot, and no information is transmitted between synapses. Resources are simply redistributed between synapses (i.e. synapses are strengthened or weakened, in the same way that the stationary robots increase or decrease the size of their windmills). Traits (responses) are not copied between slots. Instead, adaptation arises by the mechanism proposed by Price, because there is covariance between traits (here, the reward obtained by a particular synaptic response, or the amount of wind collected by a particular robot windmill) and their probability of subsequent activation, determined by changing the synaptic weight, or changing the size of a robot windmill. According to Price, if this covariance is maintained there is survival of the fittest synapse or windmill. Such a process of synaptic selectionism has been proposed by the neuroscientist Jean-Pierre Changeux [18].


A surprising consequence is that Eigen's replicator equations can be run without any system having to undergo natural selection as defined by JMS. That is, they can model JMS-type natural selection without there being any real replicators implementing them. But they do always exhibit natural selection as described by Price, and could of course serve as models of systems undergoing natural selection as defined by JMS. Nothing needs to multiply when Eigen's equations are run, but they emulate the consequences of multiplication that would occur in a system undergoing natural selection according to JMS.



In short, synaptic selection algorithms can best be understood as competitive learning between synapses (slots): they satisfy Price's criteria for adaptation by natural selection, but use a different recipe to achieve this from the one proposed by JMS. Instead of explicit multiplication of replicators, i.e. a process where matter at one site reconfigures matter at another site (i.e. where traits are explicitly copied), both Hebbian learning and Eigen's replicator equations model the effects of multiplication. The recipe in the case of synapses emanating from a single neuron involves encoding the information (trait) in the location of the synapse, and allowing matter (fitness) to be redistributed between synapses. This is a very different recipe from how JMS pictured natural selection working at the genetic, organismal and group levels in the biosphere. Synapses compete for growth resources, but it is their connections that encode information.

Thus the synaptic selectionism of Changeux [18] is a sound form of Darwinian dynamics as defined by Price and Eigen, but is not the same class of implementation of natural selection as defined by JMS. In fact, no modification of the responses encoded by each individual synapse is possible. Each synapse is a slot that signifies one fixed solution, and the relative probability of a slot being active is modified by competition.

Notice that there is no transmission of information between slots, in fact no communication between slots at all; in other words, the response arising from activating synapse A does not become the response arising from activating synapse B. In terms of the robot analogy, a robot that is doing well does not call other robots to join it. So, selectionism is an example of parallel search with competition. It is natural selection in the Price sense, but not in the JMS sense.


It is at this stage that I think Edelman took Changeux's (and later Seung's) ideas of natural selection acting at the level of the synapse a step too far. The third Nobel Prize winner in our story, Francis Crick, who worked down the corridor from Gerald Edelman at the Salk Institute, disliked Edelman's Neural Darwinism so much that he called it Neural Edelmanism [19]. The reason was that Edelman had identified no replicators in the brain, and so there was no unit of evolution as required by JMS. However, Edelman had satisfied the definitions of natural selection given by Price and Eigen.

Edelman proposed competition between neuronal groups (a neuronal group is Edelman's implementation of a slot) for synaptic resources, but he failed to explain how the particular pattern of synaptic weights that constitutes the function of one group could be copied from one group to another. This leaves no mechanism by which a synaptic-pattern-dependent trait could be inherited between neuronal groups.



In the best paper to formulate Edelman's theory of neuronal group selection, Izhikevich shows that there is no mechanism by which functional variations in synaptic connectivity patterns can be inherited (transmitted) between neuronal groups [20]. Edelman does satisfy Price if a neuronal group is doing no more than a single synapse does in Changeux's theory, i.e. encoding a particular response. However, he does not satisfy Price if he wishes to claim that the trait in question is a transmissible pattern of synaptic strengths within a neuronal group, because Edelman cannot show there is covariance between such a trait and the number of groups in which that trait is found across generations. There is no communication of solutions between group-based slots, and no information transfer, just as there is no information transfer between synapses.

Edelman's mechanism appears to be a mechanism of competitive learning between neuronal-group slots, which only has a Darwinian interpretation according to Price if a neuronal group is nothing more than a synapse in Changeux's model, i.e. with a fixed response function, without copying of response functions between groups. Therefore, Francis Crick was right in a sense.

Neural Edelmanism falls into the third column of my classification of search algorithms, as competitive learning, which, if interpreted as the same theory as Changeux's and Seung's, can satisfy Price's phenomenological definition of natural selection but never JMS's. This is because natural selection as defined by JMS requires information transmission between slots, i.e. multiplication (replication), whereas Price's definition requires only covariance between a response encoded by a neuronal group and the probability of a change in frequency of that response.





This leads us to the final column in Table 1. Here is a radically different way of utilizing multiple slots that extends the algorithmic capacity of the competitive learning algorithms above. In this case I allow not only competition between slots for generate-and-test cycles, but also allow slots to pass information (traits/responses) between each other; see Figure 3.







Figure 3 shows the robots on the hillside again, but this time those robots at higher altitudes can recruit robots at lower altitudes to come and join them. This is equivalent to replication of robot locations: the currently best location can be copied to other slots. There is transmission of information between slots. Note that replication is always of information (patterns), i.e. reconfiguration by matter of other matter. This is one of the reasons the cybernetician and anthropologist Gregory Bateson called evolution a mental process [21,22].


This means that the currently higher-quality slots have not only a greater chance of being varied and tested, but can also copy their traits to other slots that do not have such good-quality traits. This permits the redistribution of information between material slots. Notice that the synaptic slot system did not have this capability: if one synapse location produced response A, it was not possible for other synaptic locations to come to produce response A, even if response A was associated with higher reward. We will see shortly a real case in the brain where such copying of response properties is possible between slots, and which is therefore clear evidence for natural selection in the brain, not just of the Priceian type but of the JMS type.


Crucially, such a system of parallel search, competition and information transmission between slots does satisfy JMS's definition of natural selection: the configuration of a unit of evolution (slot) can reconfigure other material slots. It also satisfies Price's definition. Some eponymists might wish to say that this is a full Darwinian population. But it is better to show that there are some algorithmic advantages compared to a competitive learning system without information transmission that satisfies only Price's formulation of natural selection.
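A minimal sketch of this fourth-column scheme, with the hillside and parameters my own toy choices. Robots hill-climb in parallel, but every ten steps the best robot's position is copied over the worst robot's position, so information is transmitted between slots (the replication step JMS's definition requires):

```python
import random

# Parallel search with competition AND information transmission
# (column 4 of Table 1). Periodically the best robot "recruits" the
# worst: its location (the trait) is replicated into the other slot.

def altitude(x):
    return 10.0 - abs(x - 4.0)  # toy hillside, single peak at x = 4

def search_with_recruitment(n_robots=5, budget=400, radius=0.5, seed=0):
    rng = random.Random(seed)
    xs = [rng.uniform(-5.0, 5.0) for _ in range(n_robots)]
    for step in range(budget):
        for i in range(n_robots):  # each robot hill-climbs as before
            candidate = xs[i] + rng.uniform(-radius, radius)
            if altitude(candidate) > altitude(xs[i]):
                xs[i] = candidate
        if step % 10 == 0:  # recruitment: best replicates over worst
            best = max(xs, key=altitude)
            worst = min(range(n_robots), key=lambda i: altitude(xs[i]))
            xs[worst] = best
    return xs

xs = search_with_recruitment()
print(all(altitude(x) > 9.0 for x in xs))  # whole population converges
```

The copying step is what lets the entire population gather around the current best region of the search space, freeing all slots for further exploration from there.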


The critical advantage of JMS's definition over Price's is that multiple search points can be recruited to the region of the search space that is currently the best. In terms of evolutionary theory, a solution can reach fixation and then utilize all the search resources available for further exploration. This allows the entire population (of robots) to acquire the response characteristics (locations) of the currently best unit (robot), and therefore allows the accumulation of adaptations. Once one peak has been reached by all the robots, they can then all be in a position to do further exploration to find even higher peaks. Adaptations can accumulate.
In many real-world problems there is never a global optimum; rather, further mountain ranges remain to be explored after a plateau has been reached. For example, there is no end to science. Not every system that satisfies Price's definition of natural selection can have these special properties.


It may come as a surprise that there are already well-recognised processes in the brain that are known, to a limited extent, to implement natural selection as defined by JMS (and Price), and that have the same algorithmic characteristics as Figure 3. Figure 4 shows a recent experiment by Young et al., which showed that receptive fields in the primary visual cortex of cats can replicate to adjacent neurons, a process they called "didactic transfer".






Figure 4. Adapted from Young: the orientation selectivity of simple cells in the visual cortex can be copied between cells. Simple cells in the visual cortex have orientation selectivity, which means they respond optimally to bars of a particular orientation presented to the visual field. The arrows in each cell show the direction of a bar that maximally stimulates that cell. The orientation selectivity can be copied between cells. In fact, the fitness of an orientation-selective response is the extent to which stimulation at the retina activates the cell with such a response. If the region of retina supplying the cells in the inner circle is cut out, then those cells receive no inputs, and they increase their sensitivity to activation by horizontal connections from the adjacent cells. This process can copy the orientation selectivity of adjacent cells onto the central cells.



If a region of the retina is removed, the cortical neurons that normally receive input from that region are less active, and so they become more sensitive to being activated by their adjacent active neighbours. With a special type of plasticity called spike-time-dependent plasticity (STDP), nearly all the neurons in the silenced region take on the orientation selectivity of a neuron adjacent to that region. In this case the trait is the orientation selectivity, and the fitness is the change in the proportion of cells with that orientation selectivity. The unit of evolution is the receptive field, which multiplies and has hereditary variation.
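Price’s decomposition can be illustrated with a toy calculation. The numbers below are made up purely for illustration (they are not data from Young’s experiment): the trait z codes each receptive field’s orientation preference, and the fitness w is the number of adjacent cells it copies itself to.

```python
import numpy as np

# Hypothetical population of receptive fields: trait z is the preferred
# orientation (coded 1.0 for "horizontal", 0.0 for "vertical"), and
# fitness w is the number of adjacent cells each field copies itself to.
z = np.array([1.0, 1.0, 0.0, 0.0, 0.0])
w = np.array([3.0, 2.0, 1.0, 1.0, 1.0])

w_bar = w.mean()
# Selection term of the Price equation: cov(w, z) / w_bar.
# (Equivalently: np.cov(w, z, bias=True)[0, 1] / w_bar.)
selection = np.mean(w * z) / w_bar - z.mean()
print(round(selection, 3))  # → 0.225
```

The same covariance term applies whatever the units of evolution happen to be, which is what makes Price’s definition broader than JMS’s.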


However, an important term in Price’s formulation is the bias due to transmission, e.g. mutation (the rate of which is too large in the GARD model), but this includes any other factor that alters the trait or the fitness of the trait. It seems that the capacity of STDP in horizontal connections to continue to make copy after copy with sufficient fidelity may be low in this case, although this has yet to be tested; but if so, then covariance between fitness and orientation selectivity cannot be maintained across many neurons.


Another limitation of Young’s system of didactic receptive fields is that it is capable of only limited heredity, which means that all possible orientations could be exhaustively encoded. The situation is analogous to pre-genetic inheritance in the origin of life. Prior to the origin of nucleotides and template replication, e.g. DNA and RNA replication, natural selection may have utilized attractor-based heredity in the form of autocatalytic chemical reaction networks [23]. These were capable of only limited information transmission. However, the origin of symbolic information in the form of strings of nucleotides permitted unlimited heredity. To see this, imagine how long it would take to generate all possible strings of DNA only 100 nucleotides in length. There are 4 nucleotides, A, C, G, and T, so this gives 4^100 possibilities. If each string could be made in one second, it would still take about 5 × 10^52 years to make them all. The universe is only 433 × 10^15 seconds old according to Wikipedia. The capacity to encode symbolic (digital) information allows a far greater number of states, and therefore strategies and responses, to be encoded.
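The combinatorics can be checked in a few lines; at one string per second, the total comes to roughly 5 × 10^52 years:

```python
# Number of distinct DNA strings of length 100 over the alphabet {A, C, G, T}.
n_strings = 4 ** 100

seconds_per_year = 365 * 24 * 3600
years = n_strings / seconds_per_year
print(f"{years:.1e}")  # → 5.1e+52

# Age of the universe in seconds (~433 x 10^15), for comparison:
age_universe_s = 433e15
print(n_strings / age_universe_s)  # vastly more strings than elapsed seconds
```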


This brings us back, then, to what Edelman failed to explain in his neuronal group selection. Is there a way in which a neuronal pattern could transmit an unlimited amount of information to another neuronal pattern? The neuronal replicator hypothesis Eörs Szathmáry and I proposed in 2008 claims that the origin of human language and of unlimited, open-ended thought and problem solving in humans arose because of a major transition in evolution that bears a resemblance to the origin of nucleotides in the origin of life and the origin of the adaptive immune system. We propose that the capacity of the brain to evolve unlimited-heredity neuronal replicators by neuronal natural selection allows truly open-ended creative thought.

Penn and Povinelli have given a politically incorrect but convincing argument that human cognition is indeed qualitatively distinct from that of all other animals, in that we can reason about unobserved hidden causes, and the abstract relations between such entities, whereas no other animal can [24]. We propose that this cognitive sophistication involved evolution at the genetic level of the neuronal capacity not only for competitive learning and Pricean evolution, as described by Changeux, Edelman, and Seung, but for information transmission between higher-order units of neuronal evolution, and thus natural selection as described by JMS.



We proposed a plausible neuronal basis for the replication of higher-order units of neuronal evolution above the synaptic level (neuronal groups). The method allows a pattern of synaptic connections to be copied from one such unit to another, as shown in Figure 5 [7].



Figure 5. Our proposed mechanism for copying patterns of synaptic connections between neuronal groups. The pattern of connectivity from the lower layer is copied to the upper layer. See text.


In the brain there are many topographic maps. These are pathways of parallel connections that preserve adjacency relationships, and they can act to establish a one-to-one (or at least a few-to-few) transformation between neurons in distinct regions of the brain. In addition there is a kind of synaptic plasticity called spike-time-dependent plasticity (STDP), the same kind of plasticity that Young used to explain the copying of receptive fields. It works rather like Hebbian learning. Donald Hebb said that neurons that fire together wire together, which means that the synapse connecting neuron A to neuron B gets stronger if A and B fire at the same time [25].
However, it has recently been discovered that there is an asymmetric form of Hebbian learning (STDP): if the pre-synaptic neuron A fires before the post-synaptic neuron B, the synapse is strengthened, but if pre-synaptic neuron A fires after post-synaptic neuron B, the synapse is weakened. Thus STDP, in an unsupervised manner, i.e. without an explicit external teacher, reinforces potential causal relationships. It is able to guess which synapses were causally implicated in a pattern of activation.
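The asymmetric learning window just described can be sketched as a pair-based update rule. The amplitudes and time constants below are illustrative placeholders, not measured values:

```python
import math

# Pair-based STDP: weight change as a function of the spike-time
# difference dt = t_post - t_pre (parameters are illustrative).
A_PLUS, A_MINUS = 0.01, 0.012      # potentiation / depression amplitudes
TAU_PLUS, TAU_MINUS = 20.0, 20.0   # time constants in milliseconds

def stdp_dw(dt_ms: float) -> float:
    if dt_ms > 0:    # pre fired before post: strengthen (potentially causal)
        return A_PLUS * math.exp(-dt_ms / TAU_PLUS)
    elif dt_ms < 0:  # pre fired after post: weaken (anti-causal)
        return -A_MINUS * math.exp(dt_ms / TAU_MINUS)
    return 0.0

print(stdp_dw(10.0) > 0, stdp_dw(-10.0) < 0)  # → True True
```

Because the change is positive only when the presynaptic spike precedes the postsynaptic one, repeated application favours exactly those synapses that could have caused the postsynaptic firing.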



If a neuronal circuit exists in layer A in Figure 5, and is externally stimulated at random to make its neurons spike, then due to the topographic map from layer A to layer B, neurons in layer B will experience similar spike pattern statistics to those in layer A. If there is STDP in layer B between weakly connected neurons, then this layer becomes a kind of causal inference machine that observes the spike input from layer A and tries to produce a circuit with the same connectivity, or at least one capable of generating the same pattern of correlations. One problem with this mechanism is that there are many possible patterns of connectivity that generate the same spike statistics when a circuit is randomly externally stimulated to spike. As the circuit size gets larger, the number of possible equivalent circuits grows, due to the many possible paths that activity can take through a circuit within a layer. This can be prevented by limiting the amount of horizontal spread of activity permissible within a layer. If this is done, and some simple error-correction neurons are added, we found it was possible to evolve a fairly large network to obtain a particular desired pattern of connectivity. The network with connectivity closest to the desired connectivity was allowed to replicate itself to other circuits.
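A toy simulation of the copying idea (a drastic simplification, not the model of [7]: one-step propagation, a perfect one-to-one map, and made-up learning rates) shows STDP recovering a chain of connections in a second layer:

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer A: a small directed chain whose topology we want to copy.
n = 6
W_A = np.zeros((n, n))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]:
    W_A[i, j] = 1.0

# Layer B starts with weak all-to-all connections.
W_B = np.full((n, n), 0.1)
np.fill_diagonal(W_B, 0.0)

A_PLUS = A_MINUS = 0.02  # STDP step sizes (illustrative)

for _ in range(2000):
    # Randomly stimulate one neuron in A; its targets fire one step later.
    pre = np.zeros(n)
    pre[rng.integers(n)] = 1.0
    post = (W_A.T @ pre > 0.5).astype(float)

    # A one-to-one topographic map relays both spike volleys into layer B,
    # so B sees the same pre/post timing statistics as A. STDP in B:
    # pre-before-post potentiates, post-before-pre depresses.
    W_B += A_PLUS * np.outer(pre, post) - A_MINUS * np.outer(post, pre)
    np.clip(W_B, 0.0, 1.0, out=W_B)

# Threshold B's weights to read out the copied topology.
copied = (W_B > 0.5).astype(float)
print(np.array_equal(copied, W_A))  # → True
```

In a real circuit, of course, activity takes many paths and many connectivities produce the same correlations, which is exactly the degeneracy problem discussed above.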



At the moment, neurophysiologists would struggle to observe the connectivity patterns in microcircuits of this size and to undertake a similar experiment in slices or neuronal cultures; however, I think the day is not far off when it becomes possible to identify the mechanisms we propose. These are not the only kinds of neuronal replication that are possible, but this is the closest to turning Edelman’s theory into something that is truly Darwinian as defined by John Maynard Smith.



Acknowledgements: Thanks to Eörs Szathmáry, Phil Husbands, Simon McGregor, and Yasha Hartberg for discussions about the manuscript.


1. Edelman GM (1987) Neural Darwinism: The Theory of Neuronal Group Selection. New York: Basic Books.
2. Maynard Smith J (1986) The Problems of Biology. Oxford, UK: Oxford University Press.
3. Price GR (1970) Selection and covariance. Nature 227: 520-521.
4. Michod RE (1988) Darwinian selection in the brain. Evolution 43: 694-696.
5. Maynard Smith J, Szathmáry E (1995) The Major Transitions in Evolution. Oxford: Oxford University Press.
6. Calvin WH (1996) The Cerebral Code. Cambridge, MA: MIT Press.
7. Fernando C, Karishma KK, Szathmáry E (2008) Copying and evolution of neuronal topology. PLoS ONE 3: e3775.
8. Fernando C, Goldstein R, Szathmáry E (2010) The neuronal replicator hypothesis. Neural Computation 22: 2809-2857.
9. Fernando C, Szathmáry E (2009) Chemical, neuronal and linguistic replicators. In: Pigliucci M, Müller G, editors. Towards an Extended Evolutionary Synthesis. Cambridge, MA: MIT Press. pp. 209-249.
10. Fernando C, Szathmáry E (2009) Natural selection in the brain. In: Glatzeder B, Goel V, von Müller A, editors. Toward a Theory of Thinking. Berlin: Springer. pp. 291-340.
11. Segrè D, Lancet D, Kedem O, Pilpel Y (1998) Graded Autocatalysis Replication Domain (GARD): kinetic analysis of self-replication in mutually catalytic sets. Origins Life Evol Biosphere 28: 501-514.
12. Vasas V, Szathmáry E, Santos M (2010) Lack of evolvability in self-sustaining autocatalytic networks constraints metabolism-first scenarios for the origin of life. Proc Natl Acad Sci U S A.
13. Okasha S (2006) Evolution and the Levels of Selection. Oxford: Oxford University Press.
14. Holland JH (1975) Adaptation in Natural and Artificial Systems. Ann Arbor: University of Michigan Press.
15. Sutton RS, Barto AG (1998) Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press.
16. Seung SH (2003) Learning in spiking neural networks by reinforcement of stochastic synaptic transmission. Neuron 40: 1063-1073.
17. Eigen M (1971) Selforganization of matter and the evolution of biological macromolecules. Naturwissenschaften 58: 465-523.
18. Changeux JP (1985) Neuronal Man: The Biology of Mind. Princeton: Princeton University Press.
19. Crick FHC (1989) Neuronal Edelmanism. Trends Neurosci 12: 240-248.
20. Izhikevich EM, Gally JA, Edelman GM (2004) Spike-timing dynamics of neuronal groups. Cereb Cortex 14: 933-944.
21. Bateson G (1979) Mind and Nature: A Necessary Unity. Bantam Books.
22. Bateson G (1972) Steps to an Ecology of Mind: Collected Essays in Anthropology, Psychiatry, Evolution, and Epistemology. University of Chicago Press.
23. Szathmáry E (2006) The origin of replicators and reproducers. Philos Trans R Soc Lond B Biol Sci 361: 1761-1776.
24. Penn DC, Holyoak KJ, Povinelli DJ (2008) Darwin's mistake: explaining the discontinuity between human and nonhuman minds. Behav Brain Sci 31: 109-130.
25. Hebb DO (1949) The Organization of Behaviour. John Wiley & Sons.