# Ficici_Solution_Concepts_Dissertation_Notesx


1 Dec 2013


“Solution Concepts in Coevolutionary Algorithms” (Dissertation) by Sevan Gregory Ficici

1.4 Foundations

Evolutionary algorithms typically have the following steps:

1. Initialize the population
2. Evaluate each member of the population and assign a rating
3. If the halting criterion is met, then stop; otherwise…
4. Select population members for breeding according to their ratings
5. Generate “offspring” from selected “parents” with variation operators
6. Insert offspring into the population, and go to Step 2
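The steps above can be sketched as a short loop. This is only an illustration: truncation selection, mutation-only variation, a fixed generation budget as the halting criterion, and the toy fitness function are all assumptions made here, not choices taken from the dissertation.

```python
import random

def evolve(fitness, init, mutate, pop_size=20, generations=50, seed=0):
    """Minimal generational loop; all operator choices here are assumptions."""
    rng = random.Random(seed)
    # Step 1: initialize population
    population = [init(rng) for _ in range(pop_size)]
    for _ in range(generations):  # Step 3: halt after a fixed generation budget
        # Step 2: evaluate each member and rank by rating
        ranked = sorted(population, key=fitness, reverse=True)
        # Step 4: select the better half as parents (truncation selection)
        parents = ranked[: pop_size // 2]
        # Step 5: generate offspring with a variation operator (mutation only)
        offspring = [mutate(rng.choice(parents), rng)
                     for _ in range(pop_size - len(parents))]
        # Step 6: insert offspring; parents survive, so the best is never lost
        population = parents + offspring
    return max(population, key=fitness)

# Hypothetical toy problem: maximize -(x - 3)^2, whose optimum is x = 3.
best = evolve(
    fitness=lambda x: -(x - 3.0) ** 2,
    init=lambda rng: rng.uniform(-10.0, 10.0),
    mutate=lambda x, rng: x + rng.gauss(0.0, 0.5),
)
print(best)
```

Note that the fitness function here is fixed and external; in coevolution the rating in Step 2 would instead come from interactions with other evolving individuals.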

In coevolution, an individual is evaluated by how it interacts with other individuals in the same population or in other populations, depending on the search problem

o In a single population of size n, there are n*(n-1)/2 pairwise interactions (each individual interacts with the n-1 others)
o With one population of size n and another population of size m, there are n*m interactions

This is called “complete mixing”
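A small sketch of complete mixing may help make the interaction counts concrete: pairing every individual with every other one in a single population gives n*(n-1)/2 pairwise interactions, and pairing two populations gives n*m. The `play` function and the numeric "strength" individuals are hypothetical stand-ins for a real game.

```python
from itertools import combinations, product

def complete_mixing_single(population, play):
    """Each unordered pair meets once; returns (scores, interaction count)."""
    scores = [0.0] * len(population)
    interactions = 0
    for i, j in combinations(range(len(population)), 2):
        pay_i, pay_j = play(population[i], population[j])
        scores[i] += pay_i
        scores[j] += pay_j
        interactions += 1
    return scores, interactions

def complete_mixing_two(pop_a, pop_b, play):
    """Each member of pop_a meets each member of pop_b once."""
    scores_a = [0.0] * len(pop_a)
    scores_b = [0.0] * len(pop_b)
    interactions = 0
    for i, j in product(range(len(pop_a)), range(len(pop_b))):
        pay_a, pay_b = play(pop_a[i], pop_b[j])
        scores_a[i] += pay_a
        scores_b[j] += pay_b
        interactions += 1
    return scores_a, scores_b, interactions

def play(x, y):
    """Hypothetical game: the larger number wins one point, ties split it."""
    if x > y:
        return 1.0, 0.0
    if y > x:
        return 0.0, 1.0
    return 0.5, 0.5

scores, count = complete_mixing_single([3, 1, 2, 5], play)
print(scores, count)
```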

A constant may be added to an individual’s fitness in order to make the value nonnegative

Potter introduces cooperative coevolution (not cooperative game theory)

o Aims to solve a difficult problem X by coevolving an effective set of solutions to a decomposition of X; if X is decomposed into n sub-problems, then n reproductively isolated populations are coevolved to “cooperatively” solve the problem X
o The less the sub-problems interact with each other, the more effective cooperative coevolution will be
o Has the ability to dynamically adjust the problem’s decomposition
o Requires all populations to adhere to a pre-specified interface that governs interaction between components

Competitive coevolution: coevolution applied to a zero-sum game

o Example: Iterated prisoner’s dilemma

Both “styles” of coevolution use multiple, reproductively isolated populations; both can use similar patterns of inter-population interaction, similar diversity maintenance schemes, and so on

The most salient difference between cooperative and competitive coevolution resides primarily in the game-theoretic properties of the domains to which these algorithms are applied

Assumptions when performing coevolutionary optimization

o We are initially ignorant of the gamut of behaviors available to an evolving agent
o We are initially ignorant of the outcomes obtained by the possible behaviors
o We treat an evolving individual as a “black box”
o We cannot definitively establish the identity of a behavior exhibited by an evolving agent without exhaustive testing
o We cannot assume that individuals with different genotypes must behave differently

In this dissertation, coevolutionary algorithms perform optimization, and the notion of optimality is specified by a solution concept

k-armed bandit problem: We have k slot machines and N coins; each of these slot machines has a different expected rate of reward that is unknown to us. Our task is to apportion our N coins amongst the k machines to optimize our expected cumulative return. Thus, we have finite resources with which to explore the rates of return of the k machines and exploit the machine with the highest observed return.
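One simple way to see the explore/exploit trade-off is an epsilon-greedy allocation of the N coins. This policy is an assumption chosen here for brevity, not something the notes attribute to the dissertation; the Bernoulli (win or lose) machines and the parameter names are likewise hypothetical.

```python
import random

def epsilon_greedy(reward_probs, n_coins, epsilon=0.1, seed=0):
    """Spend n_coins on k Bernoulli machines; returns (total reward, pulls per machine)."""
    rng = random.Random(seed)
    k = len(reward_probs)
    pulls = [0] * k
    wins = [0] * k
    total = 0

    def observed_rate(arm):
        # Machines never tried are treated as having rate 0.0 (an assumption).
        return wins[arm] / pulls[arm] if pulls[arm] else 0.0

    for _ in range(n_coins):
        if rng.random() < epsilon:
            arm = rng.randrange(k)                 # explore: try a random machine
        else:
            arm = max(range(k), key=observed_rate)  # exploit: best observed so far
        reward = 1 if rng.random() < reward_probs[arm] else 0
        pulls[arm] += 1
        wins[arm] += reward
        total += reward
    return total, pulls

total, pulls = epsilon_greedy([0.2, 0.5, 0.8], n_coins=1000)
print(total, pulls)
```

The true `reward_probs` are visible only to the simulator; the policy sees nothing but the observed wins, which mirrors the "unknown to us" condition in the problem statement.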

When we apply evolution to a static multi-objective problem, then the solution that is delivered is typically the Pareto front, which is a set of non-dominated feasible members of a trade-off surface; these individuals are either in the evolving population or in an archive of some sort
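The non-dominated set can be computed directly from its definition. The sketch below assumes every objective is to be maximized and that candidates are given as tuples of objective values; both are assumptions for illustration.

```python
def dominates(a, b):
    """a dominates b if a is no worse on every objective and better on at least one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(points):
    """Keep the points that no other point dominates (quadratic-time filter)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

candidates = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1), (3, 1)]
print(pareto_front(candidates))
```

Here `(2, 2)`, `(1, 1)`, and `(3, 1)` are each dominated by another candidate, while `(1, 5)`, `(2, 4)`, and `(3, 3)` trade one objective against the other and so remain on the front.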

The solution may be an individual, a group of individuals from different populations, a state of a population, or some other form

Behavior complex represents various types of strategy collections

Chapter 3: A Taxonomy of Issues and Research

Reasons to use a coevolutionary algorithm for machine learning

o Make more efficient use of finite computational power by focusing evaluation effort on the most relevant tests, for example those that best distinguish the quality of potential solutions
o Some domains intrinsically require coevolution, such as games
o Some domains require less (human-supplied) inductive bias when using coevolution than when using other search methods
o Some domains are “open ended”, having an infinite number of possible behaviors

Efficiency: work here focuses on minimal sorting networks and on cellular automata for density classification

o Hillis and 16-input sorting networks: competitively coevolves test-case samples such that they remain appropriate to the abilities of the evolving networks as they improve
    - Finds better results, including a solution that has one more comparison-exchange operation than the currently known minimal network by Green
    - Juille uses a portion of Hillis’s best solution to improve the minimal network
o Juille and Pollack’s majority function: discover an automaton rule that will cause the automaton to converge to a state of all ones if the initial condition (IC) has more ones than zeros, and to converge to a state of all zeros otherwise
    - Paredis was the first to attempt to use coevolution with the majority problem
    - Coevolves rules with actual initial conditions rather than density classes
    - Uses lifetime fitness evaluation (LTFE) to integrate multiple scores over multiple fitness evaluations
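To illustrate why test-case selection matters for sorting networks, the sketch below checks a network exhaustively using the zero-one principle (a network that sorts every binary input sorts every input). For 16 inputs that is 2^16 test cases per network, which is the cost that coevolving a small sample of test cases tries to avoid. The 4-input network used here is a standard textbook example, not one of the networks discussed above.

```python
from itertools import product

def apply_network(network, values):
    """Run comparison-exchange operations (i, j), leaving the smaller value at i."""
    v = list(values)
    for i, j in network:
        if v[i] > v[j]:
            v[i], v[j] = v[j], v[i]
    return v

def failed_tests(network, n_inputs):
    """Return every binary input the network fails to sort (zero-one principle)."""
    return [bits for bits in product((0, 1), repeat=n_inputs)
            if apply_network(network, bits) != sorted(bits)]

# Five comparison-exchange operations suffice for four inputs.
four_input = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]
print(failed_tests(four_input, 4))        # no failing test cases
print(failed_tests(four_input[:-1], 4))   # dropping an operation exposes failures
```

The inputs returned by `failed_tests` are exactly the test cases that distinguish a defective network from a correct one, which is the kind of test a coevolving population is meant to discover.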

Intrinsically Interactive Domains

Van Valen’s Red-Queen Effect: if we simply monitor population fitness values (whether mean or maximum), we cannot reliably detect coevolutionary progress

o For example, if a strong individual interacts with superior individuals, then the strong individual will appear weak. On the other hand, a mediocre individual interacting with weak individuals will appear to be strong
o Several methods to detect and monitor progress relate to memory mechanisms
    - These prevent evolutionary “forgetting” by maintaining a history (or “memory”). They operate by collecting the most fit individuals over evolutionary time (typically the most fit in each generation) and playing them against each other
    - Miller and Cliff’s current individual ancestral opponents (CIAO)
    - Floreano and Nolfi’s master tournament (MT)
    - Stanley and Miikkulainen’s dominance tournament (DT)
        o Zero-sum symmetric “robot duel” game
        o Adds an individual to the collection if and only if it defeats all others already in the collection
        o Ensures no intransitive cycle
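The admission rule of the dominance tournament can be sketched in a few lines. This follows only the rule as stated in these notes (admit a champion if and only if it defeats everything already admitted); the `beats` function and the example players are hypothetical, and details of the original method beyond that rule are not reproduced.

```python
def dominance_tournament(champions, beats):
    """Scan generation champions in order; admit one only if it beats all admitted so far."""
    dominant = []
    for candidate in champions:
        if all(beats(candidate, member) for member in dominant):
            dominant.append(candidate)
    return dominant

# Transitive toy game: the larger number wins.
print(dominance_tournament([3, 1, 4, 4, 2, 6], lambda a, b: a > b))

# Intransitive toy game: rock-paper-scissors.
WINS_AGAINST = {"rock": "scissors", "paper": "rock", "scissors": "paper"}
rps_beats = lambda a, b: WINS_AGAINST[a] == b
print(dominance_tournament(["rock", "paper", "scissors", "rock"], rps_beats))
```

In the rock-paper-scissors run, scissors beats paper but not rock, so it is never admitted; each admitted member beats all earlier members, which is why the collection cannot contain a cycle.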

t and Disengagement

o Coevolution entails two search problems
    - The primary search problem concerns the domain of interest
        - Example: cellular automaton research. Find an optimal automaton rule.
    - The secondary search problem concerns the discovery of interactions that will allow us to search the primary domain effectively and recognize solutions
        - Example: cellular automaton research. Find appropriate automaton initial conditions
    - Loss of gradient: if no member of the set of interactions can distinguish any two members of the current set of evolving individuals (they all receive the same fitness), then we have a loss of gradient in the primary search effort
    - When the primary and secondary search problems involve separate populations, a loss of gradient means that the populations have become disengaged
    - Examples:
        - Juille and Pollack coevolve cellular automaton rules with automaton initial conditions for a density classification task
        - Generators (no input, one output) and predictors (one input, one output): a predictor guesses what the generator will output
        - Three-population framework: generators evolve to be predictable to “friendly” predictors and simultaneously unpredictable to “hostile” predictors

o Algorithmic remedies
    - Example: Phantom parasite. If a strong individual a1 in population p1 beats every individual in population p2, then a1 loses to the phantom parasite; if an individual b1 in population p1 loses to some individual in population p2, then b1 beats the phantom parasite. This puts a1 at a disadvantage and prevents a1 from taking over population p1
    - Example: Moderating parasite virulence. Cartlidge and Bullock seek to discount the fitness of individuals who attain perfect scores against the opposing population, preventing them from taking over and causing disengagement
    - Example: Paredis and Olsson’s approach was to slow down reproduction for the stronger population, so that the weaker population has time to adapt
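Loss of gradient and the phantom parasite remedy can both be expressed over a win/loss matrix. The sketch assumes `outcomes[i][j]` is 1 when individual i of population p1 beats individual j of population p2 and 0 otherwise, and that fitness is a simple win count; both are assumptions made for illustration.

```python
def has_gradient(outcomes):
    """True if at least one opponent (column) separates two individuals (rows)."""
    return any(len(set(column)) > 1 for column in zip(*outcomes))

def scores_with_phantom(outcomes):
    """Win counts plus one extra game against the phantom parasite.

    An individual with a perfect record loses to the phantom;
    every other individual beats it.
    """
    scores = []
    for row in outcomes:
        wins = sum(row)
        beats_phantom = 0 if wins == len(row) else 1
        scores.append(wins + beats_phantom)
    return scores

engaged = [[1, 1, 1],   # a1: beats everyone in p2
           [1, 1, 0],   # b1: loses once
           [0, 0, 0]]
disengaged = [[1, 1], [1, 1]]  # every individual gets the same result everywhere

print(has_gradient(engaged), has_gradient(disengaged))
print(scores_with_phantom(engaged))
```

With the phantom included, a1 and b1 tie on score, so the perfect scorer no longer has a selective edge over the near-perfect one.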

Intransitivity, Cycling, and the Red-Queen

o Cycling population dynamics: caused by intransitive superiority structures
    - Example: Rock-Paper-Scissors (RPS)
        - Nash equilibrium strategy: choose rock, paper, and scissors with equal probability
    - Example: Matching pennies game. P1 wins if P1 and P2 both choose heads or both choose tails; P2 wins otherwise
        - Nash equilibrium strategy: both players choose each pure strategy with probability one-half
o Van Valen’s Red-Queen Effect: to maintain a level of fitness in a dynamic environment, a species must continuously evolve. It also refers to an evolutionary “arms race” between two competing species, where each species forces the other to become increasingly competent at certain behaviors

o Examples:
    - Paredis described cyclic dynamics on the majority function
    - Juille and Pollack described cyclic dynamics on the majority function
    - Author used coevolution for a time-
    - Miller discusses cyclic dynamics in pursuit-and-evasion contests
    - Nolfi and Floreano discuss cyclic dynamics in pursuit-and-evasion contests

o Algorithmic Remedies
    - Nolfi and Floreano show that the effect of intransitivity can be diminished by adding various static obstacles to the environment that affect agent fitness
    - Bullock implements a diffuse selection pressure by evolving multiple, reproductively isolated populations and having each agent interact with members of each population, creating greater genetic and behavioral diversity, which broadens selection pressure and dilutes the effect of intransitivity
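The intransitive structure of Rock-Paper-Scissors and the equilibria mentioned above can be checked numerically. The payoff matrices use the usual +1 win, -1 loss, 0 tie convention from the row player's point of view, which is an assumption about scaling only.

```python
# Rows: own move, columns: opponent's move, in the order rock, paper, scissors.
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]

# Matching pennies for P1 (heads, tails): P1 wins when the choices match.
PENNIES = [[1, -1],
           [-1, 1]]

def expected_payoffs(payoff, opponent_mix):
    """Expected payoff of each pure strategy against a mixed opponent."""
    return [sum(p * q for p, q in zip(row, opponent_mix)) for row in payoff]

def best_response(payoff, opponent_mix):
    """Index of the pure strategy with the highest expected payoff."""
    values = expected_payoffs(payoff, opponent_mix)
    return max(range(len(values)), key=values.__getitem__)

# Against the equal-probability mix every pure strategy earns the same,
# so no deviation is profitable.
print(expected_payoffs(RPS, [1 / 3, 1 / 3, 1 / 3]))
print(expected_payoffs(PENNIES, [0.5, 0.5]))

# Best responses to pure strategies chase each other in a cycle.
move = 0  # start from rock
for _ in range(3):
    pure = [1 if i == move else 0 for i in range(3)]
    move = best_response(RPS, pure)
    print(move)
```

Starting from rock, the best responses run paper, scissors, rock, which is the kind of cycle a population can follow when each generation adapts only to the current opponents.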

Forgetting: Process of trait loss

o “Trait” refers to any measurable aspect of behavior
o Causes of trait loss:
    - The trait is selected against: individuals with the trait are less fit, on average, than individuals without the trait
    - The trait is not strongly acted upon by selection pressure and is left to drift according to biases in the variational operators
    - The trait is selected for, but is difficult to maintain: the variational operators are strongly biased against it, making offspring likely to lack the trait
    - These causes eventually lead to a population, at some later point in time, in which no individual has the particular trait
o An example of trait loss becomes an instance of forgetting when, at some later point in time,
    - No individual in the population has trait x
    - Some individual would gain an increase in fitness if it obtained trait x
o This suggests an intransitive structure is at work
o Focusing: when a trait is forgotten due to drift, selection pressure has become too narrow

o Examples:
    - Cliff and Miller discuss the role of intransitivity in forgetting in pursuit-and-evasion contests
    - Floreano and Nolfi use a shallow “Hall of Fame” memory to help stabilize cycling, but still obtain forgetting due to intransitivity
    - Watson and Pollack provide vivid illustrations of how forgetting ensues from genetic drift in numbers games

o Algorithmic Remedies:
    - Pollack and Blair’s work suggests that the game of backgammon naturally provides diverse selection pressures and is therefore resistant to evolutionary forgetting
    - Boyd proves that contrite tit-for-tat (in play that contains mistakes) prevents forgetting of important skills
    - Memory mechanisms maintain a collection of “good” individuals, thus encapsulating a wider range of phenotypes than is typically found in the evolving population at any one moment

What is the solution concept (what to remember)?

o When a domain forces mutual exclusivity between certain traits, or when an evolutionary representation (genotype) cannot simultaneously encode all desired traits
    - Almost all memory mechanisms in the literature are instances of a general “best of generation” (BOG) model where
        - The most fit individual in each of the m most recent generations is retained by the memory mechanism
        - L of the m retained individuals are sampled without replacement for use in testing individuals in the current generation
    - Stanley and Miikkulainen propose that their dominance tournament can be adapted for use as a memory mechanism by retaining the most fit individual of the current generation only if it beats all the individuals previously retained by the memory
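The “best of generation” model can be sketched as a bounded queue plus a sampler. The class and method names are hypothetical, and the behavior when fewer than L individuals have been retained so far (sample whatever is available) is an assumption the notes do not specify.

```python
import random
from collections import deque

class BestOfGenerationMemory:
    """Retain the champion of each of the m most recent generations."""

    def __init__(self, m, l, seed=0):
        if l > m:
            raise ValueError("cannot sample more testers than are retained")
        self.retained = deque(maxlen=m)  # oldest champion drops out automatically
        self.l = l
        self.rng = random.Random(seed)

    def update(self, population, fitness):
        """Call once per generation with the evaluated population."""
        self.retained.append(max(population, key=fitness))

    def sample_testers(self):
        """Draw L retained champions without replacement (fewer if memory is not full)."""
        count = min(self.l, len(self.retained))
        return self.rng.sample(list(self.retained), count)

memory = BestOfGenerationMemory(m=3, l=2)
for generation in ([1, 2, 3], [4, 5], [0, 9], [7, 8]):
    memory.update(generation, fitness=lambda x: x)  # toy fitness: the value itself
print(list(memory.retained))
print(memory.sample_testers())
```

Setting m to 1 and L to 1 reduces this to testing against the previous champion only, while a large m approaches a full hall of fame; the dominance-tournament variant would replace the unconditional `append` with an admission test.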

Fitness Deception Obscures Solutions

o Coordination Game: symmetric two-player variable-sum game where both players must play the same pure strategy to receive maximal payoff

Diversity Maintenance and Teaching

o Maintaining genetic and phenotypic diversity is a general antidote to all of the common pathologies
o Several methods have been reported to maintain genetic and phenotypic diversity