Ficici_Solution_Concepts_Dissertation_Notesx

Dec 1, 2013

“Solution Concepts in Coevolutionary Algorithms” (Dissertation) by Sevan Gregory Ficici

1.4 Foundations



Evolutionary algorithms typically have the following steps:

  o  Initialize population
  o  Evaluate each member of the population and assign a rating
  o  If halting criterion is met, then stop; otherwise…
  o  Select population members for breeding according to their ratings
  o  Generate “offspring” from selected “parents” with variation operators
  o  Insert offspring into population, and go to Step 2
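
The six steps above can be sketched as a short loop. This is only an illustration, not the dissertation's algorithm: the bit-string genome, tournament selection, one-point crossover, bit-flip mutation, full generational replacement, and the `target` halting score are all assumptions chosen to keep the example small.

```python
import random

def evolve(fitness, target, genome_len=8, pop_size=20, generations=50, seed=0):
    """Minimal generational evolutionary algorithm on bit strings (a sketch)."""
    rng = random.Random(seed)
    # Step 1: initialize population
    pop = [[rng.randint(0, 1) for _ in range(genome_len)] for _ in range(pop_size)]
    for _ in range(generations):
        # Step 2: evaluate each member and assign a rating
        rated = [(fitness(ind), ind) for ind in pop]
        # Step 3: halting criterion (assumed here: rating reaches `target`)
        best_score, best = max(rated, key=lambda r: r[0])
        if best_score >= target:
            return best

        # Step 4: select parents according to their ratings (binary tournament)
        def pick():
            a, b = rng.sample(rated, 2)
            return a[1] if a[0] >= b[0] else b[1]

        # Step 5: generate offspring with variation operators
        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, genome_len)
            child = p1[:cut] + p2[cut:]                      # one-point crossover
            child = [1 - g if rng.random() < 1.0 / genome_len else g
                     for g in child]                         # bit-flip mutation
            offspring.append(child)
        # Step 6: insert offspring (full replacement) and go back to Step 2
        pop = offspring
    return max(pop, key=fitness)
```

For example, `evolve(sum, target=8)` searches for the all-ones string under a "count the ones" rating.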



In coevolution, an individual is evaluated by seeing how it interacts with other individuals in its own population or in other populations, depending on the search problem

  o  In a single population of size n, there will be n*(n-1)/2 pairwise interactions in total (each individual interacts with the other n-1 individuals)
  o  With one population of size n and another population of size m, there will be n*m interactions



This is called “complete mixing”
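
A small sketch of complete mixing may help check the interaction counts. The function names and the toy game `higher_wins` are hypothetical; `play(x, y)` is assumed to return one payoff for each participant.

```python
from itertools import combinations, product

def mix_single(pop, play):
    """Complete mixing in one population: every unordered pair meets once."""
    scores = [0] * len(pop)
    n_games = 0
    for i, j in combinations(range(len(pop)), 2):
        pay_i, pay_j = play(pop[i], pop[j])
        scores[i] += pay_i
        scores[j] += pay_j
        n_games += 1
    return scores, n_games

def mix_two(pop_a, pop_b, play):
    """Complete mixing between two populations: every cross pair meets once."""
    scores_a = [0] * len(pop_a)
    scores_b = [0] * len(pop_b)
    n_games = 0
    for i, j in product(range(len(pop_a)), range(len(pop_b))):
        pay_a, pay_b = play(pop_a[i], pop_b[j])
        scores_a[i] += pay_a
        scores_b[j] += pay_b
        n_games += 1
    return scores_a, scores_b, n_games

def higher_wins(x, y):
    """Toy game (an assumption): the larger number wins one point."""
    return (1, 0) if x > y else (0, 1)
```

With four individuals, `mix_single` plays 4*3/2 = 6 games; with populations of sizes 3 and 2, `mix_two` plays 3*2 = 6 games.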



A constant may be added to an individual’s fitness in order to make the value nonnegative



Potter introduces cooperative coevolution (not cooperative game theory)

  o  Aims to solve a difficult problem X by coevolving an effective set of solutions to a decomposition of X; if X is decomposed into n sub-problems, then n reproductively isolated populations are coevolved to “cooperatively” solve the problem X
  o  The less the sub-problems interact with each other, the more effective cooperative coevolution will be
  o  Has the ability to dynamically adjust the problem’s decomposition
  o  Requires all populations to adhere to a pre-specified interface that governs interaction between components
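
The decomposition idea can be illustrated with a rough sketch. It is not Potter's published algorithm: the one-number-per-population encoding, the use of each population's current best as the collaborator, the Gaussian mutation, and the elitist truncation step are all assumptions made for brevity. The "interface" here is simply that every population contributes one coordinate of a vector that `objective` (to be minimized) can evaluate.

```python
import random

def cooperative_coevolution(objective, n_parts=3, pop_size=10,
                            generations=40, seed=0):
    """Sketch: one reproductively isolated population per sub-problem."""
    rng = random.Random(seed)
    pops = [[rng.uniform(-5, 5) for _ in range(pop_size)]
            for _ in range(n_parts)]
    # One representative (collaborator) per population
    reps = [pop[0] for pop in pops]
    for _ in range(generations):
        for k in range(n_parts):
            def score(x, k=k):
                # Build a complete solution from x plus the other representatives
                return objective(reps[:k] + [x] + reps[k + 1:])
            # Variation and selection happen inside population k only
            children = [x + rng.gauss(0, 0.5) for x in pops[k]]
            pops[k] = sorted(pops[k] + children, key=score)[:pop_size]
            reps[k] = pops[k][0]
    return reps
```

On a fully separable objective such as `lambda v: sum(x * x for x in v)` the sub-problems do not interact at all, which is the most favorable case described above.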



Competitive coevolution: coevolution applied to a zero-sum game

  o  Example: Iterated prisoner’s dilemma



Both “styles” of coevolution use multiple, reproductively isolated populations; both can use similar patterns of inter-population interaction, similar diversity maintenance schemes, and so on

The most salient difference between cooperative and competitive coevolution resides primarily in the game-theoretic properties of the domains to which these algorithms are applied



Assumptions when performing coevolutionary optimization

  o  We are initially ignorant of the gamut of behaviors available to an evolving agent
  o  We are initially ignorant of the outcomes obtained by the possible behaviors
  o  We treat an evolving individual as a “black box”
  o  We cannot definitively establish the identity of a behavior exhibited by an evolving agent without exhaustive testing
  o  We cannot assume that individuals with different genotypes must behave differently



In this dissertation, coevolutionary algorithms perform optimization, and the notion of optimality is specified by a solution concept



k-armed bandit problem: We have k slot machines and N coins; each of these slot machines has a different expected rate of reward that is unknown to us. Our task is to apportion our N coins amongst the k machines to optimize our expected cumulative return. Thus, we have finite resources with which to explore the rates of return of the k machines and exploit the machine with the highest observed return.
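
One simple way to trade exploration against exploitation is an epsilon-greedy policy, sketched below. The notes do not name any particular policy, so this is a swapped-in textbook technique; the Bernoulli rewards, `epsilon`, and the function name are assumptions.

```python
import random

def epsilon_greedy(reward_probs, n_coins, epsilon=0.1, seed=0):
    """Spend n_coins on k machines; returns (total reward, pulls per machine)."""
    rng = random.Random(seed)
    k = len(reward_probs)
    pulls = [0] * k
    wins = [0] * k
    total = 0
    for _ in range(n_coins):
        if rng.random() < epsilon:
            arm = rng.randrange(k)            # explore: pick a machine at random
        else:
            # exploit: highest observed mean return (unplayed machines count as 0)
            arm = max(range(k),
                      key=lambda a: wins[a] / pulls[a] if pulls[a] else 0.0)
        reward = 1 if rng.random() < reward_probs[arm] else 0
        pulls[arm] += 1
        wins[arm] += reward
        total += reward
    return total, pulls
```

Here `reward_probs` plays the role of the hidden rates of return; the policy only ever sees the observed `wins` and `pulls`.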



When we apply evolution to a static multi-objective problem, then the solution that is delivered is typically the Pareto front, which is a set of non-dominated feasible members of a trade-off surface; these individuals are either in the evolving population or in an archive of some sort
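
Non-domination can be made concrete with a short sketch. It assumes every objective is to be maximized and that candidates are given as tuples of objective values; the function names are illustrative.

```python
def dominates(a, b):
    """a dominates b: no worse on every objective, strictly better on one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For the candidates `[(1, 5), (2, 4), (3, 3), (2, 2), (1, 1), (3, 1)]`, the last three are each dominated by `(3, 3)`, so the front is the first three.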



The solution may be an individual, a group of individuals from different populations, a state of a population, or some other form

A behavior complex represents various types of strategy collections



Chapter 3: A Taxonomy of Issues and Research



Reasons to use a coevolutionary algorithm for machine learning

  o  Make more efficient use of finite computational power by focusing evaluation effort on the most relevant tests, for example those that best distinguish the quality of potential solutions
  o  Some domains intrinsically require coevolution, such as games
  o  Some domains require less (human-supplied) inductive bias when using coevolution than when using other search methods
  o  Some domains are “open ended”, having an infinite number of possible behaviors



Efficiency: work here focuses on minimal sorting networks and cellular automata for density classification

  o  Hillis and 16-input sorting networks: tries to competitively coevolve test-case samples such that they remain appropriate to the abilities of the evolving networks as they improve
      o  Finds better results, including a solution that has one more comparison-exchange operation than the currently known minimal network by Green
      o  Juille uses a portion of Hillis’s best solution to improve the minimal network
  o  Juille and Pollack’s majority function: discover an automaton rule that will cause the automaton to converge to a state of all ones if the initial condition (IC) has more ones than zeros, and converge to a state of all zeros otherwise
      o  Paredis was the first to attempt to use coevolution with the majority problem
      o  Coevolve rules with actual initial conditions rather than density classes
      o  Uses lifetime fitness evaluation (LTFE) to integrate multiple scores over multiple fitness evaluations
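
To make the majority (density classification) task concrete, here is a minimal sketch of how a candidate rule could be scored on a single initial condition. The ring topology, radius 3, step limit, and the `local_majority` baseline rule are assumptions for illustration; none of them is a rule or setting taken from the cited work.

```python
def step(cells, rule, radius=3):
    """One synchronous update of a one-dimensional binary automaton on a ring."""
    n = len(cells)
    return [rule(tuple(cells[(i + d) % n] for d in range(-radius, radius + 1)))
            for i in range(n)]

def classifies_correctly(rule, ic, max_steps=200, radius=3):
    """True if the automaton ends in all ones (majority of ones in the IC)
    or all zeros (otherwise)."""
    target = 1 if sum(ic) * 2 > len(ic) else 0
    cells = list(ic)
    for _ in range(max_steps):
        nxt = step(cells, rule, radius)
        if nxt == cells:          # fixed point reached
            break
        cells = nxt
    return all(c == target for c in cells)

def local_majority(neighborhood):
    """Hypothetical baseline rule: copy the majority of the neighborhood."""
    return 1 if sum(neighborhood) * 2 > len(neighborhood) else 0
```

The baseline handles uniform initial conditions but freezes on `[1] * 10 + [0] * 11`, where two large blocks simply persist; this is the kind of test case that separates weak rules from strong ones.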



Intrinsically Interactive Domains

Van Valen’s Red-Queen Effect: if we simply monitor population fitness values (whether mean or maximum), we cannot reliably detect coevolutionary progress

  o  For example, if a strong individual interacts with superior individuals, then the strong individual will appear weak. On the other hand, a mediocre individual interacting with weak individuals will appear to be strong
  o  Several methods to detect and monitor progress relate to memory mechanisms
      o  They prevent evolutionary “forgetting” by maintaining a history (or “memory”). These operate by collecting the most fit individuals over evolutionary time (typically the most fit in each generation) and playing them against each other
      o  Miller and Cliff’s current individual vs. ancestral opponents (CIAO)
      o  Floreano and Nolfi’s master tournament (MT)
      o  Stanley and Miikkulainen’s dominance tournament (DT)
          o  Zero-sum symmetric “robot duel” game
          o  Adds an individual to the collection if and only if it defeats all other individuals already in the collection
          o  Ensures no intransitive cycle
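
The admission rule of the dominance tournament, as summarized above, fits in a few lines. This sketch assumes a caller-supplied `beats(a, b)` predicate and a list of per-generation champions; both names are hypothetical.

```python
def dominance_tournament(champions, beats):
    """Keep a champion only if it defeats every individual already kept."""
    collection = []
    for champ in champions:
        if all(beats(champ, old) for old in collection):
            collection.append(champ)
    return collection
```

With Rock-Paper-Scissors as the game and champions arriving in the order rock, paper, scissors, rock, only rock and paper are admitted: scissors fails against rock, so the cycle never closes inside the collection.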



Loss of Gradient and Disengagement

  o  Coevolution entails two search problems
      o  Primary search problem concerns the domain of interest
          o  Example: Cellular automaton research. Find an optimal automaton rule.
      o  Secondary search problem concerns the discovery of interactions that will allow us to search the primary domain effectively and recognize solutions
          o  Example: Cellular automaton research. Find appropriate automaton initial conditions
      o  Loss of gradient: If no member in the set of interactions can distinguish any two members of the current set of evolving individuals, then we have a loss of gradient in the primary search effort
          o  In single populations, gradient loss implies that all individuals receive the same fitness
          o  When the primary and secondary search problems involve separate populations, then a loss of gradient means that the populations have become disengaged
      o  Examples:
          o  Juille and Pollack coevolve cellular automaton rules with automaton initial conditions for a density classification task
          o  Generators (no input, one output) and predictors (one input, one output): a predictor guesses what the generator will output
          o  Three-population framework: generators evolve to be predictable to “friendly” predictors and simultaneously unpredictable to “hostile” predictors

  o  Algorithmic remedies
      o  Example: Phantom parasite. If a strong individual a1 in population p1 beats every individual in population p2, then a1 loses to the phantom parasite; if an individual b1 in population p1 loses to some individual in population p2, then b1 beats the phantom parasite. This puts a1 at a disadvantage and prevents a1 from taking over population p1
      o  Example: Moderating parasite virulence. Cartlidge and Bullock seek to discount the fitness of individuals who attain perfect scores against the opposing population, preventing them from taking over and causing disengagement
      o  Example: Paredis and Olsson’s approach was to slow down reproduction for the stronger population, so the weaker population has time to adapt
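
Two of the ideas in this section can be sketched over a 0/1 outcome matrix, where `outcomes[i][j]` is 1 if evolving individual i beats test (or parasite) j. The matrix layout and function names are assumptions, and the phantom-parasite scoring follows only the description in these notes (one extra game that perfect scorers lose and everyone else wins).

```python
def has_gradient(outcomes):
    """Gradient exists if at least one test tells two individuals apart."""
    return any(len(set(column)) > 1 for column in zip(*outcomes))

def phantom_parasite_scores(outcomes):
    """Wins against real parasites plus the result of one phantom game."""
    scores = []
    for row in outcomes:
        perfect = all(row)
        # A perfect scorer loses to the phantom; all others beat it
        scores.append(sum(row) + (0 if perfect else 1))
    return scores
```

For `[[1, 1, 1], [1, 1, 0], [0, 0, 0]]` the phantom game lifts the second individual to the same score as the perfect first one (3 each), which removes the perfect scorer's selective edge.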



Intransitivity, Cycling, and the Red-Queen

  o  Cycling population dynamics: caused by intransitive superiority structures
      o  Example: Rock-Paper-Scissors (RPS)
          o  Nash equilibrium strategy: choose rock, paper, and scissors with equal probability
      o  Example: Matching pennies game. P1 wins if P1 and P2 both choose heads or both choose tails; P2 wins otherwise
          o  Nash equilibrium strategy: both players choose each pure strategy with probability one-half
  o  Van Valen’s Red-Queen Effect: To maintain a level of fitness in a dynamic environment, a species must continuously evolve. It also refers to an evolutionary “arms race” between two competing species, where each species forces the other to become increasingly competent at certain behaviors
  o  Examples:
      o  Paredis described cyclic dynamics on the majority function
      o  Juille and Pollack described cyclic dynamics on the majority function
      o  The author used coevolution for a time-series prediction task
      o  Miller discusses cyclic dynamics in pursuit and evasion contests
      o  Nolfi and Floreano discuss cyclic dynamics in pursuit and evasion contests
  o  Algorithmic Remedies
      o  Nolfi and Floreano show that the effect of intransitivity can be diminished by adding various static obstacles to the environment that affect agent fitness
      o  Bullock implements a diffuse selection pressure by evolving multiple, reproductively isolated populations and having each agent interact with members of each population, creating greater genetic and behavioral diversity, which broadens selection pressure and dilutes the effect of intransitivity
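
The two Nash equilibria mentioned above can be checked numerically. The sketch below assumes a two-player zero-sum game given as the row player's payoff matrix (+1 win, -1 loss, 0 draw, an assumed encoding) and tests that neither player can gain by switching to any pure strategy.

```python
from fractions import Fraction

# Row player's payoffs; rows and columns are rock, paper, scissors
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]
# Matching pennies: P1 (row) wins on a match; rows and columns are heads, tails
PENNIES = [[1, -1],
           [-1, 1]]

def row_payoffs(payoff, col_mix):
    """Expected payoff of each pure row strategy against the column mix."""
    return [sum(a * q for a, q in zip(row, col_mix)) for row in payoff]

def col_payoffs(payoff, row_mix):
    """Expected payoff of each pure column strategy (zero-sum: negated)."""
    return [-sum(p * payoff[i][j] for i, p in enumerate(row_mix))
            for j in range(len(payoff[0]))]

def is_equilibrium(payoff, row_mix, col_mix):
    """No pure-strategy deviation improves on either player's mixed payoff."""
    rp = row_payoffs(payoff, col_mix)
    cp = col_payoffs(payoff, row_mix)
    row_value = sum(p * v for p, v in zip(row_mix, rp))
    col_value = sum(q * v for q, v in zip(col_mix, cp))
    return max(rp) <= row_value and max(cp) <= col_value
```

Uniform mixing passes the check in both games, while a pure profile such as "both always play rock" fails because paper is a profitable deviation.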



Forgetting: Process of trait loss

  o  “Trait” refers to any measurable aspect of behavior
  o  Causes of trait loss:
      o  Selected against: individuals with the trait are less fit, on average, than individuals without the trait
      o  Trait is not strongly acted upon by selection pressure and is left to drift according to biases in the variational operators
      o  Trait is selected for, but is difficult to maintain: the variational operators are strongly biased against it, making offspring likely to lack the trait
      o  These causes eventually lead to a population at some later point in time where no individual has the particular trait
  o  An example of trait loss becomes an instance of forgetting when, at some later point in time, the population has
      o  No individual having a trait x
      o  Some individual that would gain an increase in fitness value if the trait x were obtained
  o  This suggests an intransitive structure is at work
  o  Focusing: When a trait is forgotten due to drift, selection pressure has become too narrow
  o  Examples:
      o  Cliff and Miller discuss the role of intransitivity in forgetting in pursuit-and-evasion contests
      o  Floreano and Nolfi use a shallow “Hall of Fame” memory to help stabilize cycling, but still obtain forgetting due to intransitivity
      o  Watson and Pollack provide vivid illustrations of how forgetting ensues from genetic drift in numbers games

  o  Algorithmic Remedies:
      o  Pollack and Blair’s work suggests that the game of backgammon naturally provides diverse selection pressures and is therefore resistant to evolutionary forgetting
      o  Boyd proves that contrite tit-for-tat (in a game that contains mistakes) prevents forgetting of important skills
      o  Memory mechanisms maintain a collection of “good” individuals, thus encapsulating a wider range of phenotypes than is typically found in the evolving population at any one moment
          o  What is the solution concept (what to remember)?
              o  When a domain forces mutual exclusivity between certain traits, or when an evolutionary representation (genotype) cannot simultaneously encode all desired traits
          o  Almost all memory mechanisms in the literature are instances of a general “best of generation” (BOG) model where
              o  The most fit individual in each of the m most recent generations is retained by the memory mechanism
              o  L of the m retained individuals are sampled without replacement for use in testing individuals in the current generation
          o  Stanley and Miikkulainen propose that their dominance tournament can be adapted for use as a memory mechanism by retaining the most fit individual of the current generation only if it beats all the individuals previously retained by the memory


Fitness Deception Obscures Solutions

  o  Coordination Game: symmetric two-player variable-sum game where both players must play the same pure strategy to receive maximal payoff
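
One way fitness can mislead in a coordination game is shown by the toy calculation below. The payoff values are hypothetical (the notes give none): coordinating on strategy A pays 2, coordinating on B pays 1, and miscoordinating pays 0. Fitness is taken to be the expected payoff against the current population mix.

```python
from fractions import Fraction

# Hypothetical symmetric payoffs: rows are my strategy (A, B),
# columns are the opponent's strategy (A, B)
PAYOFF = [[2, 0],
          [0, 1]]

def expected_payoffs(share_a):
    """Expected payoff of playing A and of playing B when a fraction
    share_a of the population plays A."""
    mix = [share_a, 1 - share_a]
    return [sum(p * q for p, q in zip(row, mix)) for row in PAYOFF]
```

When only a quarter of the population plays A, `expected_payoffs(Fraction(1, 4))` gives 1/2 for A and 3/4 for B, so B looks fitter even though everyone playing A would earn more.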



Diversity Maintenance and Teaching

  o  Maintaining genetic or phenotypic diversity is a general antidote to all of the common pathologies
  o  Several methods have been reported to maintain genetic or phenotypic diversity