Evolutionary Robotics


A TUTORIAL

Stefano Nolfi

Neural Systems & Artificial Life

National Research Council

Roma, Italy

nolfi@ip.rm.cnr.it

Dario Floreano

Microengineering Dept.

Swiss Federal Institute of Technology

Lausanne, Switzerland

dario.floreano@epfl.ch


The method

fitness function

genotype-to-phenotype mapping
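Sketched below, under illustrative assumptions (population size, truncation selection, and Gaussian mutation are not specified in the tutorial), is the overall loop these two ingredients plug into; build_controller stands for the genotype-to-phenotype mapping and evaluate_on_robot for the fitness function, both hypothetical callbacks.

import random

def evolve(build_controller, evaluate_on_robot,
           pop_size=100, genome_len=50, generations=100):
    """Minimal evolutionary-robotics loop (a sketch, not the tutorial's code)."""
    population = [[random.uniform(-1.0, 1.0) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Genotype-to-phenotype mapping followed by fitness evaluation
        scored = [(evaluate_on_robot(build_controller(g)), g) for g in population]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # Truncation selection: the best 20% reproduce with Gaussian mutation
        parents = [g for _, g in scored[:pop_size // 5]]
        population = [[w + random.gauss(0.0, 0.1) for w in random.choice(parents)]
                      for _ in range(pop_size)]
    return scored[0]  # (fitness, genotype) of the best individual last evaluated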


Behavior-Based Robotics & ER

[Figure: in behavior-based robotics [Brooks, 1986] the designer decomposes the control system into layers connecting sensors to actuators (locomote, avoid hitting things, explore, manipulate the world, build maps); in evolutionary robotics the layers between sensors and actuators are left unspecified (marked "?")]


Learning Robotics & ER

[Figure: learning robotics requires a desired output or teaching signal for the mapping from sensors to motors]

[Kodjabachian & Meyer, 1999]


Artificial Life & ER

[Menczer and Belew, 1997]

[Floreano and Mondada 1994]


How to Evolve Robots

evolution in the real world

[Floreano and Nolfi, 1998]

evolution in simulation + test on the real robot

[Nolfi, Floreano, Miglino, Mondada 1994]


Evolution in the Real World

mechanical robustness

energy supply

analysis

[Robot photos © K-Team SA]

[Floreano and Mondada, 1994]


Evolution in Simulation

Different physical sensors and actuators may perform differently
because of slight differences in their electronics or mechanics.

Physical sensors deliver uncertain values, and commands to actuators have uncertain effects.

The body of the robot and the environment should be accurately reproduced in the simulation.

[Figure: sampled activation of the 4th and 8th IR sensors]

[Nolfi, Floreano, Miglino and Mondada 1994; Miglino, Lund, Nolfi, 1995]
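One way to address these points, in the spirit of the sampling technique cited above, is to build the simulated sensor from activations measured on the physical robot and to add noise at read time. The sketch below assumes a hypothetical lookup-table layout; it is not the original implementation.

import random

class SampledIRSensor:
    """Simulated infrared sensor based on activations sampled on the real robot."""

    def __init__(self, samples, noise=0.05):
        # samples[(distance_bin, angle_bin)] -> activation measured on the real sensor
        self.samples = samples
        self.noise = noise

    def read(self, distance_bin, angle_bin):
        base = self.samples.get((distance_bin, angle_bin), 0.0)
        # Gaussian noise keeps the controller from relying on exact values,
        # mimicking the uncertainty of the physical device.
        value = base + random.gauss(0.0, self.noise)
        return min(1.0, max(0.0, value))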


Designing the Fitness Function

FEE functions, which describe how the controller should work (functional), rate the system on the basis of several variables and constraints (explicit), and employ precise external measuring devices (external), are appropriate for optimizing a set of parameters for a complex but well-defined control problem in a well-controlled environment.

BII functions, which rate only the behavioral outcome of an evolved controller (behavioral) and rely on few variables and constraints (implicit) that can be computed on-board (internal), are suitable for developing adaptive robots capable of autonomous operation in partially unknown and unpredictable environments without human intervention.

[Floreano et al., 2000]
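As an illustration of the BII end of this spectrum, here is a sketch of a behavioral, implicit, internal fitness computed only from on-board quantities, loosely modeled on the navigation fitness used by Floreano and Mondada (1994); the exact form and normalization are assumptions.

import math

def internal_fitness(left_speed, right_speed, max_proximity):
    """Behavioral/implicit/internal fitness for one time step (a sketch).

    left_speed, right_speed: normalized wheel speeds in [-1, 1]
    max_proximity: highest proximity-sensor activation in [0, 1]
    """
    v = (abs(left_speed) + abs(right_speed)) / 2.0            # reward motion
    dv = abs(left_speed - right_speed) / 2.0                  # penalize turning on the spot
    return v * (1.0 - math.sqrt(dv)) * (1.0 - max_proximity)  # penalize staying near obstacles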


Genetic Encoding

Evolvability

Expressive power

Compactness

Simplicity

[Gruau, 1994, Nolfi and Floreano 2000]
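For concreteness, a sketch of the simplest option, a direct encoding in which each gene is one connection weight (network sizes are illustrative): it scores well on simplicity and expressive power, but the genotype grows with the network, which is the compactness problem that indirect encodings such as Gruau's cellular encoding address.

import numpy as np

def direct_decode(genotype, n_sensors=8, n_motors=2):
    """Direct genotype-to-phenotype mapping (a sketch): one gene per weight
    of a fully connected sensor-to-motor network, plus one bias per motor."""
    genes = np.asarray(genotype, dtype=float)
    assert genes.size == (n_sensors + 1) * n_motors
    return genes.reshape(n_motors, n_sensors + 1)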


Adaptation is more Powerful than
Decomposition and Integration

The main strategy followed to develop mobile robots has been that of Divide and Conquer:

1) divide the problem into a list of hopefully simpler sub-problems

2) build a set of modules or layers able to solve each sub-problem

3) integrate the modules so as to solve the whole problem

Unfortunately, it is not clear how a desired behavior should be broken down.


Proximal and Distal Descriptions of
Behaviors

[Nolfi, 1997]


Discrimination Task (1)

[Figure: decomposition and integration of the task into sub-behaviors connecting sensors to actuators: explore, avoid, approach, discriminate between walls and cylinders, discriminate between small and large cylinders]

[Nolfi, 1996, 1999]


Discrimination Task (2)

[Figure: sub-behaviors (explore, avoid, approach, discriminate) between sensors and actuators]

[Nolfi, 1996]


Discrimination Task (3)

[Scheier, Pfeifer, and Kuniyoshi, 1998]

Evolved robots act so as to select sensory patterns that are easy to discriminate


The Importance of Self-organization

Operating a decomposition at the level of the distal description of behavior does not necessarily simplify the challenge.

By allowing individuals to self-organize, artificial evolution tends to find simple solutions that exploit the interaction between the robot and the environment and between the different internal mechanisms of the control system.

[Nolfi, 1996,1997]


Modularity and Behaviors

Is modularity useful in ER?

What is the relation between self-organized neural modules and behaviors?

[Nolfi, 1997]


The Garbage Collecting Task (1)

[Nolfi, 1997]


The Garbage Collecting Task (2)

There is no correspondence between self-organized neural modules and sub-behaviors.

Modular neural controllers able to self-organize outperform other architectures.

[Nolfi, 1997]


Evolving “complex” behaviors

Bootstrap problem: selecting individuals directly for their ability to solve a task only works for simple tasks

Incremental evolution: starting with a simplified version of the task and then progressively increasing its complexity (see the sketch after this list)

Including in the selection criterion a reward for sub-components of the desired behavior

Starting with a simplified version of the task and then progressively increasing its complexity by modifying the selection criterion
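A minimal sketch of such an incremental scheme, assuming the changing selection criterion can be expressed as an ordered list of fitness functions; evolve_one_generation is a hypothetical helper that runs one generation and returns the updated population together with its best score.

def incremental_evolution(stages, evolve_one_generation, success_threshold=0.8):
    """Incremental evolution (a sketch): evolve on a simplified task and switch
    to a harder selection criterion once the population performs well enough."""
    population = None  # evolve_one_generation is assumed to create it when None
    for fitness_fn in stages:
        best_score = 0.0
        while best_score < success_threshold:
            population, best_score = evolve_one_generation(population, fitness_fn)
    return population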


Visually-Guided Robots

[Cliff et al. 1993; Harvey et al. 1994]


Learning & Evolution: Interactions


Different time scales, different mechanisms, similar effects



Learning Advantages in Evolution
[Nolfi & Floreano, 1999]
:


Adapt to changes that occur faster than a generation


Extract information that might channel the course of evolution


Help and guide evolution


Reduce genetic complexity and increase population diversity



Learning Costs in Evolution
[Mayley, 1997]
:


Delay in the ability to achieve fit behaviors


Increased unreliability (learning wrong things)


Physical damage, energy waste, tutoring



Baldwin effect
[Baldwin, 1896; Morgan, 1896; Waddington, 1942]


Hinton & Nowlan model [1987]


Learning samples the space surrounding the individual


Fitness landscape is smoothed and evolution becomes faster


Baldwin effect (assimilation of features normally «learnt»)



Model constraints:


Learning task and evolutionary task are the same


Learning is a random process


Environment is static


Genotype and phenotype spaces are correlated

[Figure: genotype with fixed alleles (0, 1) and learnable alleles (?), e.g. 00?11???0111?0?1?0?1]

Fitness = correct combination of weights
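A runnable sketch of the model summarized above: 20 genes with fixed (0, 1) and learnable (?) alleles, random guessing of the learnable alleles during life, and a fitness bonus proportional to the learning trials left after the correct combination is found. The 1000-trial lifetime and the 1 + 19n/1000 reward follow the usual description of Hinton and Nowlan (1987), but treat the constants as assumptions.

import random

TARGET_LEN = 20   # number of "connections", as in the genotype string above
TRIALS = 1000     # lifetime learning trials (assumed)

def fitness(genotype, target=(1,) * TARGET_LEN, trials=TRIALS):
    """Evaluate one genotype in the Hinton & Nowlan style (a sketch)."""
    # Any wrong fixed allele makes the single good combination unreachable.
    if not all(g == t for g, t in zip(genotype, target) if g != '?'):
        return 1.0
    unknown = [i for i, g in enumerate(genotype) if g == '?']
    for t in range(trials):
        guess = list(genotype)
        for i in unknown:
            guess[i] = random.choice((0, 1))            # learning as random guessing
        if tuple(guess) == target:
            return 1.0 + 19.0 * (trials - t) / trials   # earlier success, higher fitness
    return 1.0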


Different Tasks [Nolfi, Elman, Parisi, 1994]

- Evolving for food
- Learning predictions
- Learning mechanism = BP
- Increased speed & fitness
- Genetic assimilation


Perspectives on Landscape

Correlated landscapes [Parisi & Nolfi, 1996]

Relearning effects to compensate for mutations [Harvey, 1997] (this may hold only in a few cases)

[Figure: weight-space diagram with points A, B1, B2, C, P, Q. A = weights evolved for food finding; C = weights trained for prediction; B1, B2 = new positions after mutation; fitness is higher when closer to A]


Evolutionary Reinforcement Learning


Evolving both action and
evaluation connection
strengths
[Ackley & Littman, 1991]


Action module modifies
weights during lifetime
using CRBP


ERL achieves better performance than evolution alone or reinforcement learning alone

Baldwin effect

Method validated on mobile robots [Meeden, 1996]
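To make the action-module update more concrete, here is a rough sketch of the idea behind CRBP (complementary reinforcement back-propagation) as used by Ackley and Littman: with a positive evaluation the network is trained toward the action it just produced, with a negative one toward its complement. This is a paraphrase of the idea, not the original algorithm; the produced targets would feed one back-propagation step.

import numpy as np

def crbp_targets(output_probs, reinforcement_positive):
    """Back-propagation targets in the CRBP spirit (a sketch).
    output_probs: firing probabilities of the stochastic binary output units."""
    actions = (output_probs > np.random.rand(*output_probs.shape)).astype(float)
    # Positive evaluation: reinforce what was done; negative: push toward the complement.
    return actions if reinforcement_positive else 1.0 - actions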


Evolutionary Auto-teaching


All weights genetically
encoded, but one half
teaches the other half
using Delta rule
[Nolfi & Parisi,
1991]


Individuals can live in one
of two environments,
randomly determined at
birth


Learning individuals adapt
strategy to environment
and display higher fitness

[Figure: performance of learning vs. no-learning individuals]
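A sketch of the auto-teaching idea described above: both halves of the weights come from the genotype, but one half produces a teaching signal used to adjust the other half with the delta rule during the robot's lifetime. Layer sizes, the logistic activation, and the learning rate are illustrative assumptions.

import numpy as np

class AutoTeachingNet:
    """Auto-teaching controller (a sketch): the 'teaching' weights stay fixed,
    the 'motor' weights are adjusted online toward the self-generated targets."""

    def __init__(self, genotype, n_in=8, n_out=2, lr=0.2):
        n = n_in * n_out
        self.w_motor = np.array(genotype[:n], dtype=float).reshape(n_out, n_in)
        self.w_teach = np.array(genotype[n:2 * n], dtype=float).reshape(n_out, n_in)
        self.lr = lr

    def step(self, sensors):
        x = np.asarray(sensors, dtype=float)
        motor = 1.0 / (1.0 + np.exp(-self.w_motor @ x))   # motor outputs
        teach = 1.0 / (1.0 + np.exp(-self.w_teach @ x))   # self-generated teaching signal
        # Delta rule: move the motor outputs toward the teaching outputs.
        error = teach - motor
        self.w_motor += self.lr * np.outer(error * motor * (1.0 - motor), x)
        return motor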


Evolution of Learning Mechanisms (1)


Encoding learning rules, NOT learning weights
[Floreano & Mondada, 1994]


Weights always initialized to random values


Different weights can use different rules within same network


Adaptive method can be applied to node encoding
(short genotypes)

Genetically-determined encoding, per synapse: sign, strength

Adaptive encoding, per synapse: sign, learning rule (Hebb, postsynaptic, presynaptic, covariance), learning rate
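For reference, one common formulation of the four synaptic rules named above, with pre- and postsynaptic activations and weights in [0, 1]; the equations follow later descriptions by Floreano and Urzelai and should be read as an illustration, not as the original encoding.

import math

def hebbian_update(rule, w, pre, post, rate):
    """Return the updated synaptic strength under one of the four rules (a sketch)."""
    if rule == "hebb":            # strengthen when pre and post are both active
        dw = (1 - w) * pre * post
    elif rule == "postsynaptic":  # also weaken when post fires without pre
        dw = w * (-1 + pre) * post + (1 - w) * pre * post
    elif rule == "presynaptic":   # also weaken when pre fires without post
        dw = w * pre * (-1 + post) + (1 - w) * pre * post
    elif rule == "covariance":    # track the correlation between pre and post
        f = math.tanh(4 * (1 - abs(pre - post)) - 2)
        dw = (1 - w) * f if f > 0 else w * f
    else:
        raise ValueError("unknown rule: " + rule)
    return min(1.0, max(0.0, w + rate * dw))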


Sequential task & unpredictable change


Faster and better results
[Floreano & Urzelai, 2000]


Automatic decomposition of
sequential task


Synapses continuously
change


Evolved robots adapt online to unpredictable change [Urzelai & Floreano, 2000]:

Illumination

From simulations to robots

Environmental layout

Different robotic platform

Lesions to motor gears [Eggenberger et al., 1999]

[Figure: genetically-determined vs. adaptive controllers]


Summary


Learning is very useful for robotic evolution:


accelerates and boosts evolutionary performance


can cope with fast changing environments


can adapt to unpredictable sources of change



Lamarckian evolution (inheriting learned properties) may provide short-term gains [Lund, 1999], but it does not display all the advantages listed above [Sasaki & Tokoro, 1997, 1999]



Distinction between learning and adaptation
[Floreano & Urzelai, 2000]
:


Adaptation does not necessarily develop and capitalize upon new skills and knowledge


Learning is an incremental process whereby new skills and
knowledge are gradually acquired and integrated


Competitive Co-evolution

Fitness of each population depends on the fitness of the opponent population. Examples:

Predator-prey

Host-parasite

It may increase adaptive power by producing an evolutionary arms race [Dawkins & Krebs, 1979]

More complex solutions may incrementally emerge as each population tries to win over the opponent

It may be a solution to the bootstrap problem

Fitness function plays a less important role

Continuously changing fitness landscape may help to prevent stagnation in local minima [Hillis, 1990]



Co-evolutionary Pitfalls

The same set of solutions may be discovered over and over again. This cycling behavior may end up in very simple solutions.
Solution: retain the best individuals of the last few generations (the Hall of Fame extends this to all generations).

Whereas in conventional evolution the fitness landscape is static and fitness is a monotonic function of progress, in competitive co-evolution the fitness landscape can be modified by the competitor and the fitness function is no longer an indicator of progress.
Solution: Master Fitness (after evolution, test each best individual against all best opponents), CIAO graphs (test each best individual against all previous best opponents).
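A minimal sketch of the Hall-of-Fame evaluation described above; compete is a hypothetical function returning the individual's score for one tournament against a single opponent. A CIAO graph would be built from the same ingredients, recording every best-against-previous-best score in a matrix instead of averaging.

import random

def evaluate_with_hall_of_fame(individual, hall_of_fame, compete, n_opponents=5):
    """Score an individual against best opponents retained from earlier
    generations rather than only against the current opponent population."""
    if not hall_of_fame:
        return 0.0
    opponents = random.sample(hall_of_fame, min(n_opponents, len(hall_of_fame)))
    return sum(compete(individual, opp) for opp in opponents) / len(opponents)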


Examples of Co-evolutionary Agents

Simulated predator-prey [Cliff & Miller, 1997]: distance-based fitness, hundreds of generations, CIAO method, evolution of sensors

Ball-catching agents [Sims, 1994]: distance-based fitness, rare good results


Co-evolutionary Robots

Energetically autonomous

Predator-prey scenario

Time-based fitness (see the sketch after this list)

Controllers downloaded to increase reaction speed

Retain last best 5 controllers for testing individuals

Predators = vision + proximity

Prey = proximity + faster

Predator genotype longer

Prey has initial position advantage

[Floreano, Nolfi, & Mondada, 1998]
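A minimal sketch of the time-based fitness listed above, assuming a fixed tournament length and a symmetric normalization (both assumptions): the predator is rewarded for catching the prey quickly, the prey for surviving as long as possible.

def time_based_fitness(time_to_contact, max_time):
    """Fitness of one predator-prey tournament (a sketch).
    time_to_contact: steps until the predator touches the prey,
                     or max_time if it never does."""
    prey_fitness = time_to_contact / max_time    # prey: survive as long as possible
    predator_fitness = 1.0 - prey_fitness        # predator: catch as soon as possible
    return predator_fitness, prey_fitness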


Co-evolutionary Results

Predators do not attempt

to minimize distance

Prey maximize distance



Increasing Environmental Complexity


…prevents premature cycling
[Nolfi & Floreano, 1999]


Summary


Competitive co-evolution is challenging because:


Fitness landscape is continuously changing


Hard to monitor progress online


Cycling local minima



When the environment is sufficiently complex, or the Hall-of-Fame method is used, the system develops increasingly more complex solutions



It can work and capitalize on very implicit, internal, and
behavioral fitness functions by exploring a large range of
behaviors triggered by opponents



When co-evolving adaptive mechanisms, prey resort to random actions whereas predators adapt online to the prey strategy and achieve better performance [Floreano & Nolfi, 1997]


Evolvable Hardware


Evolution of electronic circuits
http://www.cogs.susx.ac.uk/users/adrianth/EHW_groups.html



Evolution of body morphologies (including sensors)



Why evolve hardware?


Hardware choice constrains environmental interactions and
the course of evolution


Evolved solutions can be more efficient than those designed
by humans


Develop new adaptive materials with self-configuration and self-repair abilities






Evolutionary Control Circuits



Thompson’s
unconstrained

evolution



Xilinx, family 6000, overwrite global
synchronization



Tone reproduction



Robot control



Fitness landscape studies (very rugged,
neutral networks)

Evolvable Hardware

Module for Khepera

http://www.aai.ca


Evolutionary Control Circuits



Keymeulen: evolution of vision-based controllers



Find ball while avoiding obstacles



Constrained evolution, entirely


on physical robot



De Garis: CAM Brain, composed


of tens of Xilinx FPGAs, 6000 family



Growth of neural circuits using CA


with evolved rules



Aims to evolve a brain for a kitten robot.


Pitfall: speed limited by sensory-motor loop.


Evolutionary Morphologies



Evolution of Lego Structures [Funes et al., 1997]



Bridges



Cranes



Extended to objects and robot bodies



see
www.demo.cs.brandeis.edu



Example of an evolved crane [Funes et al., 1997]


Co-evolutionary Morphologies

Karl Sims, 1994

Komosinski & Ulatowski, 1999

http://www.frams.poznan.pl

Effect of doubling sensor range on body/wheel size
[Lund et al., 1997]


Suggestions for Further Research


Encoding and mapping of control systems


Exploration of alternative building blocks


Integration of growth, learning, and maturation


Incremental and open-ended evolution

Morphology and sensory co-evolution

Application to large-scale circuits

User-directed evolution


Comparison with other adaptive techniques


Further readings:

Nolfi, S. & Floreano, D. Evolutionary Robotics: The Biology, Technology, and Intelligence of Self-Organizing Machines. MIT Press, October 2000

Husbands, P. & Meyer, J.-A. (Eds.) Evolutionary Robotics: Proceedings of the 1st European Workshop. Springer-Verlag, 1998

Gomi, T. (Ed.) Evolutionary Robotics. Volume series: I (1997), II (1998), III (2000), AAI Books.


Evorobot Simulator

Sources, binaries, and documentation files freely
available at:
http://gral.ip.rm.cnr.it/evorobot/simulator.html

[Nolfi, 2000]