Introduction to Genetic Algorithms


2

Genetic Algorithms


What are they?


Evolutionary algorithms that make use of
operations like mutation, recombination, and
selection


Uses?


Difficult search problems


Optimization problems


Machine learning


Adaptive rule-bases

3

Theory of Evolution


Every organism has unique attributes that
can be transmitted to its offspring


Offspring are unique and have attributes from
each parent


Selective breeding can be used to manage
changes from one generation to the next


Nature applies certain pressures that cause
individuals to evolve over time

4

Evolutionary Pressures


Environment


Creatures must work to survive by finding
resources like food and water


Competition


Creatures within the same species compete with
each other on similar tasks


Rivalry


Different species affect each other by direct
confrontation (e.g. hunting) or indirectly by fighting
for the same resources

5

Natural Selection


Creatures that are not good at completing
tasks like hunting have fewer chances of
having offspring


Creatures that are successful in completing
basic tasks are more likely to transmit their
attributes to the next generation since there
will be more creatures born that can survive
and pass on these attributes

6

Genetics


Genome (class)


Sequence of genes describing the overall
structure of the genetic code for a particular species


Genomics


Study of the meaning of the genes for a particular
species


Alleles


Values that can be assigned to a given gene


Genotype (instance)


Sequence of alleles

7

Physical Properties


Phenetics


Study of physical properties and morphology of
creatures independent of genetic information


Phenome


General structure of a creature's body and attributes


Phenotype


Particular instance of phenome realized as a
unique creature


Product of the genotype and environmental forces

8

Conversions


In the real world, mapping between genotypes and phenotypes is hard


In AI work it can be done by defining a
convenient function or even designing
encodings by hand


It is often easier to adapt genetic operators to
work with the evolutionary data structure
used to represent the phenotype than to
encode and decode phenotypes

9

Genetic Algorithmic Process


Potential solution for problem domains are
encoded using machine representation (e.g.
bit strings) that supports variation and
selection operations


Mating and mutation operations produce new
generation of solutions from parent encodings


Fitness function judges the individuals that
are “best” suited (e.g. most appropriate
problem solution) for “survival”
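As a minimal sketch of the encoding idea (the bit-string genome and its length here are assumptions for illustration, not part of the original slides), a candidate solution can be stored as a list of bits that the variation and selection operations below manipulate directly:

```python
import random

GENOME_LENGTH = 8  # assumed genome length, purely for illustration

def random_genotype(length=GENOME_LENGTH):
    """Create one candidate solution as a bit string (a list of 0/1 genes)."""
    return [random.randint(0, 1) for _ in range(length)]

def decode(genotype):
    """Map the bit-string genotype to a phenotype; here simply an integer."""
    return int("".join(map(str, genotype)), 2)

candidate = random_genotype()
print(candidate, "->", decode(candidate))
```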

10

Initialization


Initial population must be a representative
sample of the search space


Random initialization can be a good idea (if
the sample is large enough)


The random number generator must not be biased


Can reuse or seed population with existing
genotypes based on algorithms or expert
opinion or previous evolutionary cycles
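A hedged sketch of this step, reusing the hypothetical `random_genotype` helper from the previous sketch; the optional `seeds` argument stands in for expert-provided or previously evolved genotypes:

```python
def initialize_population(size, seeds=None):
    """Build a random initial population, optionally seeded with existing
    genotypes (expert designs or survivors of a previous evolutionary cycle)."""
    population = [list(genotype) for genotype in (seeds or [])]
    while len(population) < size:
        population.append(random_genotype())  # relies on an unbiased RNG
    return population

population = initialize_population(20)
```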

11

Evaluation


Each member of the population can be seen
as candidate solution to a problem


The fitness function determines the quality of
each solution


The fitness function takes a phenotype and
returns a floating point number as its score


It is problem dependent so can be very simple


It can be a bottleneck if it is not carefully thought out (there are no magic ways to create them)
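Since the fitness function is entirely problem dependent, the example below is only an assumed stand-in (the classic OneMax score, counting 1 bits); the key point is the interface: phenotype in, floating-point score out:

```python
def fitness(phenotype):
    """Return a floating-point quality score; higher is better.
    This toy example rewards candidates containing more 1 bits."""
    return float(sum(phenotype))
```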

12

Selection


Want to give preference to “better” individuals
to add to mating pool


If the entire population ends up being selected, it may be desirable to conduct a tournament to order the individuals in the population


Would like to keep the best in the mating pool
and drop the worst (elitism)


Elitism is a trade-off with search space completeness
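One way to realize this preference (the tournament size and elite count below are arbitrary assumptions) is tournament selection combined with a small amount of elitism:

```python
import random

def select_mating_pool(population, fitness_fn, pool_size, tournament_size=3, elite=2):
    """Fill the mating pool by tournaments, copying the best few individuals
    straight in (elitism) and biasing the rest toward fitter contenders."""
    ranked = sorted(population, key=fitness_fn, reverse=True)
    pool = [list(ind) for ind in ranked[:elite]]          # keep the best as-is
    while len(pool) < pool_size:
        contenders = random.sample(population, tournament_size)
        pool.append(list(max(contenders, key=fitness_fn)))
    return pool
```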

13

Crossover


In sexual reproduction the genetic codes of
both parents are combined to create offspring


Asexual reproduction (no crossover) has no impact on the mating pool


Would like to keep 60/40 split between parent
contributions


95/5 splits negate the benefits of crossover

14

Crossover


If we have selected two strings



A = 11111 and B = 00000


We might choose a uniformly random site
(e.g. position 3) and trade bits


This would create two new strings



A’ = 11100 and B’ = 00011


These new strings might then be added to the
mating pool if they are “fit”
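A sketch of that operation on plain Python strings; choosing the site uniformly at random reproduces the example above whenever the cut falls after position 3:

```python
import random

def single_point_crossover(a, b):
    """Cut two equal-length strings at one random site and swap the tails."""
    site = random.randint(1, len(a) - 1)
    return a[:site] + b[site:], b[:site] + a[site:]

# Forcing the cut after position 3 recreates the slide's example:
a, b = "11111", "00000"
print(a[:3] + b[3:], b[:3] + a[3:])  # 11100 00011
```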

15

Mutation


Mutations happen at the genome level (rarely, and usually not good) and at the genotype level (better for the GA process)


Mutation is important for maintaining diversity
in the genetic code


In humans, mutation was responsible for the
evolution of intelligence


Example: The occasional (low probability) alteration of a bit position in a string
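A minimal sketch of such a low-probability bit alteration; the 1% rate is an assumption, not a recommended value:

```python
import random

def mutate(genotype, rate=0.01):
    """Flip each bit independently with a small probability (assumed 1%)."""
    return [1 - gene if random.random() < rate else gene for gene in genotype]
```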

16

Operators


Selection and mutation


When used together they give us a genetic algorithm equivalent to a parallel, noise-tolerant, hill-climbing algorithm


Selection, crossover, and mutation


Provide an insurance policy against losing population diversity and avoid some of the pitfalls of ordinary “hill climbing”

17

Replacement


Determine when to insert new offspring into
the mating pool and which individuals to drop
out based on fitness


Steady state evolution calls for the same number of individuals in the population, so each new offspring is processed one at a time and fit individuals can remain in the population a long time


In generational evolution, the offspring are
placed into a new population with all other
offspring (genetic code only survives in kids)

18

Genetic Algorithm

Set time t = 0

Initialize population P(t)

While termination condition not met


Evaluate fitness of each member of P(t)


Select members from P(t) based on fitness


Produce offspring from the selected pairs


Replace members of P(t) with better offspring


Set time t = t + 1
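Pulling the outline together, a compact generational sketch under the earlier assumptions (bit-string genotypes, toy OneMax fitness, fixed generation count as the termination condition); it illustrates the loop above rather than any particular production implementation:

```python
import random

def genetic_algorithm(pop_size=30, genome_len=8, generations=50, mutation_rate=0.02):
    fitness = lambda g: float(sum(g))                        # toy fitness function
    population = [[random.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]                  # initialize P(0)
    for t in range(generations):                             # termination condition
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]                     # select based on fitness
        offspring = []
        while len(offspring) < pop_size:                     # produce offspring
            mom, dad = random.sample(parents, 2)
            cut = random.randint(1, genome_len - 1)
            child = mom[:cut] + dad[cut:]
            child = [1 - b if random.random() < mutation_rate else b for b in child]
            offspring.append(child)
        population = offspring                               # replace P(t) with P(t+1)
    return max(population, key=fitness)

print(genetic_algorithm())
```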

19

Why use genetic algorithms?


They can solve hard problems


Easy to interface genetic algorithms to
existing simulations and models


GA’s are extensible


GA’s are easy to hybridize


GA’s work by sampling, so populations can
be sized to detect differences with specified
error rates


Use little problem specific code

20

Traveling Salesman Problem


To use a genetic algorithm to solve the traveling
salesman problem we could begin by creating a
population of candidate solutions


We need to define mutation, crossover, and
selection methods to aid in evolving a solution
from this population


At random, pick two solutions and combine them to create a child solution; a fitness function is then used to rank the solutions

21

Traveling Salesman Problem


For crossover we might take two paths (P1 and P2), break them at arbitrary points, and define new solutions Left1+Right2 and Left2+Right1


For mutation we might randomly switch two cities in an existing path
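A sketch of both operators on lists of city indices; note that the naive break-and-join crossover can repeat or drop cities, which is exactly the problem raised on the next slides:

```python
import random

def tsp_crossover(p1, p2):
    """Break both parent paths at one arbitrary point and rejoin the halves.
    Children may repeat or omit cities; a fix is sketched further below."""
    cut = random.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def tsp_mutation(path):
    """Randomly switch two cities in an existing path."""
    i, j = random.sample(range(len(path)), 2)
    mutated = list(path)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated
```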

22

Evolve Algorithm for TSP


Set up initial population


For G generations


Create M mutations and add them to the population


Subject mutations to population constraints and
determine their relative fitness


Create C crossovers and add them to the population


Subject crossovers to population constraints and
determine their relative fitness

23

Solving TSP using GA

Steps:

1.
Create group of random tours


Stored as sequence of numbers (parents)

2.
Choose 2 of the better solutions


Combine and create new sequences (children)


Problems here:


City 1 repeated in Child 1


City 5 repeated in Child 2

24

Modifications Needed


Algorithm must not allow repeated cities


Also, order must be considered


12345 is the same tour as 32154


Based upon these considerations, a
computer model for N cities can be
created


Gets quite detailed
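The slides do not spell the modification out, so the sketch below is only one common way to respect both constraints (an order-crossover-style operator): it copies a slice from one parent and fills the remaining positions in the other parent's order, so no city is ever repeated:

```python
import random

def order_crossover(p1, p2):
    """Copy a random slice of parent 1, then fill the rest with parent 2's
    cities in their original order, skipping cities already in the child."""
    size = len(p1)
    start, end = sorted(random.sample(range(size), 2))
    child = [None] * size
    child[start:end + 1] = p1[start:end + 1]
    remaining = [city for city in p2 if city not in child]
    for i in range(size):
        if child[i] is None:
            child[i] = remaining.pop(0)
    return child

print(order_crossover([1, 2, 3, 4, 5], [3, 2, 1, 5, 4]))  # never repeats a city
```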

Genetic Algorithm Example

[Figure: two parent tours, Parent A and Parent B, each visiting cities A through E in a different order]

Genetic Algorithm Example

[Figure: Combined Path — the two parent tours joined into a single sequence of cities]

Genetic Algorithm Example

[Figure: Child — the child tour produced from the combined path]

Mutations

Chance of 1 in 50 to introduce a mutation to the next
generation (the child if it replaces a parent, or the first
parent)

[Figure: mutation example — tour E-B-F-D-G-A-C becomes E-A-G-D-F-B-C after altering the segment between two randomly chosen points R1 and R2]

29

Premature Convergence


Occasionally a gene takes over because it is so
much fitter than all others (genetic drift)


If this is the best solution, that may be OK (if not
you may never find the optimal solution if this
happens too soon)


In large populations, genetic drift is less likely to happen


Using higher mutation rates can combat genetic
drift

30

Premature Convergence


High levels of randomness are not always helpful to a GA


To prevent genetic drift


You might have several small populations and cross-breed individuals from them


Take game of life approach, pretend individuals
live on 2D grid and only allow breeding between
neighbors (spatial organizational structure)

31

Slow Convergence


Some GAs will simply fail to converge


Similar to plateau problem in hill climbing
(need to add noise to fitness values to make
them converge)


Can increase elitism to encourage fitter
individuals to spread their genes (at the risk
of premature convergence)


Increasing level of random mutations
sometimes helps

32

Parameters


Require lots of parameters (mutation rate,
crossover type, population size, fitness
scaling policy)


Can make use of a hierarchy of GA’s with a
master GA setting the parameters for an
ordinary GA


Parameterless GAs have default values chosen for their parameters so that human interaction is not needed for fine tuning

33

Domain Knowledge


GAs do not exploit domain knowledge unless the knowledge engineer (KE) designs special policies and operators


During initialization there can be a bias
toward certain genotypes selected by the
domain expert


Can use gene dependent mutation rates and
heuristic crossover split points


The choice of representation can affect the
size and search efficiency of the problem
space


34

GA Strengths


Do well at avoiding local minima and can often find near-optimal solutions, since search is not restricted to small areas of the search space


Easy to extend by creating custom operators


Perform well for global optimizations


Work required to choose representations and conversion routines is acceptable

35

GA Weaknesses


Do not take advantage of domain knowledge


Not very efficient at local optimization (fine
tuning solutions)


The randomness inherent in GAs makes them hard to predict (solutions can take a long time to stumble upon)


Require entire populations to work (takes lots of time and memory) and may not work well for real-time applications

36

Evolvee


Uses existing representations (like Neural Net)


Realism is relatively poor


Simple tasks (e.g. attack behaviors) do not pose any problems for it



37

Actions and Parameters


Limited action set needed


Look

parameter: direction


Single value: up, ahead, down


Move

parameter: weights


Vector (projectile, collision point, impact location)


Fire

parameter:


Jump

parameter:

38

Sequences


Contained in simple arrays of actions and
times


Times can be associated with actions in two
ways


Time offset relative to previous action


Absolute time since start of sequence


The order of sequences in an array is not
important (this allows symmetric solutions but
avoids the cost of sorting actions before
evolution is complete)

39

Random Generation


Time offsets will be randomly generated values within the maximum sequence length


Action type can be encoded as a symbol
randomly chosen from set of all possible
actions


Parameter values are action specific and need to be chosen, within valid ranges, after the action is selected

40

Random Generation


The length of all action sequences can also be generated randomly (with a maximum upper bound)


The sequences of actions will be housed in a
dynamic array


Start time of first action in a sequence can be
reset to zero
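A sketch of how such sequences might be represented and randomly generated; the action set, parameter ranges, and bounds below are illustrative assumptions rather than the system's actual values:

```python
import random

ACTIONS = ["look", "move", "fire", "jump"]   # assumed action set
MAX_SEQUENCE_TIME = 2.0                      # assumed maximum sequence length (seconds)
MAX_ACTIONS = 8                              # assumed upper bound on actions per sequence

def random_action():
    """One (absolute_time, action, parameter) entry with action-specific parameters."""
    action = random.choice(ACTIONS)
    if action == "look":
        param = random.choice(["up", "ahead", "down"])
    else:
        param = [random.random() for _ in range(3)]   # e.g. a vector of weights
    return (random.uniform(0.0, MAX_SEQUENCE_TIME), action, param)

def random_sequence():
    """A dynamic array of actions whose earliest start time is reset to zero."""
    actions = [random_action() for _ in range(random.randint(1, MAX_ACTIONS))]
    earliest = min(t for (t, _, _) in actions)
    return [(t - earliest, action, param) for (t, action, param) in actions]
```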

41

Crossover


Simple one point crossover


Randomly split two move sequences from parents and swap sub-arrays to create two new children


Fairly easy to program using arrays
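A minimal sketch, assuming the list-of-tuples representation from the previous sketch; each parent gets its own cut point since sequences may differ in length:

```python
import random

def sequence_crossover(parent1, parent2):
    """Randomly split two move sequences and swap the sub-arrays."""
    cut1 = random.randint(1, len(parent1))
    cut2 = random.randint(1, len(parent2))
    return parent1[:cut1] + parent2[cut2:], parent2[:cut2] + parent1[cut1:]
```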

42

Mutation


A low probability mutation might be to change
the length of a sequence


Empty spaces can be filled with random action


Excess actions are simply ignored


A low probability mutation might be to replace
individual actions within existing sequences


Gene storage time follows normal distribution
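A sketch of the two low-probability mutations just described, reusing the hypothetical `random_action` and `MAX_ACTIONS` from the earlier sketch; the rates are arbitrary:

```python
import random

def mutate_sequence(seq, length_rate=0.05, replace_rate=0.05):
    """Occasionally resize the sequence, occasionally replace individual actions."""
    mutated = list(seq)
    if random.random() < length_rate:
        new_length = random.randint(1, MAX_ACTIONS)
        mutated = mutated[:new_length]                 # excess actions simply ignored
        while len(mutated) < new_length:
            mutated.append(random_action())            # empty spaces filled randomly
    return [random_action() if random.random() < replace_rate else action
            for action in mutated]
```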

43

Evolution


Population size will remain constant


Evolution happens on request


If an individual with unassigned fitness exists, choose it; otherwise choose two parents for crossover/mutation with probabilities proportional to their fitness


Individuals are removed from the population
using random selection based on inverse
fitness


To diversify the population remove the poorer of
two similar behaviors
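A hedged sketch of one such steady-state update, assuming fitness scores are kept alongside the individuals and that `breed` wraps the crossover/mutation operators; the roulette-style choices are illustrative:

```python
import random

def evolve_step(population, fitnesses, breed):
    """Pick two parents with probability proportional to fitness, create one
    offspring, and overwrite an individual chosen by inverse fitness so the
    population size stays constant."""
    parents = random.choices(population, weights=fitnesses, k=2)
    child = breed(parents[0], parents[1])
    worst_bias = [max(fitnesses) - f + 1e-9 for f in fitnesses]  # poorer = more removable
    victim = random.choices(range(len(population)), weights=worst_bias, k=1)[0]
    population[victim] = child
    return population
```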

44

Object for Defensive Tactics


In combat game terms, defensive tactics is the sequence of actions carried out by an object to protect itself when it comes under attack


This is a natural choice for learning behavior by genetic
algorithm, because the object is in a highly competitive
situation with a survival mandate


It should be possible to decide on the fittest behaviors and
select for them in the evolving sequence of actions


To keep things simple, we will focus on only two behaviors


dodging enemy fire and rocket jumping


But the method could be extended to include other defensive moves, such as weaving and seeking cover


45

Computing Fitness

Rocket Jumping


Assign rewards only for upward movement
when object is not touching the floor, to avoid
rewarding running up the stairs


Reward high jumps a lot more than lower jumps

46

Computing Fitness

Dodging Fire


Provide 0 reward when hit and high reward
when object escapes with no damage


Must include distance of dodging movement
away from point of impact to avoid rewarding
“standing still”


Damage to object must also be measured
and subtracted from fitness value


Use time as a 4th dimension to resolve ties
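A hedged sketch of how such per-episode rewards might be computed; every field name and weight below is an assumption for illustration, not the game's actual interface:

```python
def jump_fitness(height_gain_while_airborne, touched_floor):
    """Reward upward movement only while off the floor, favouring high jumps."""
    if touched_floor:
        return 0.0                          # e.g. running up stairs earns nothing
    return height_gain_while_airborne ** 2  # higher jumps rewarded much more

def dodge_fitness(was_hit, distance_from_impact, damage_taken):
    """No reward when hit; otherwise reward movement away from the impact point."""
    if was_hit:
        return 0.0
    return max(0.0, 10.0 * distance_from_impact - damage_taken)
```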

47

For the Game


Make use of genetic algorithm


Learn its jumping and dodging behaviors
during the game


Fitness function provides rewards on a per
jump or per dodge basis

48

Evaluation


Learns to jump fairly quickly


Multiple jumps are no problem


Dodging behavior is also learned quickly


Any balanced combination of vector weights (estimated point of impact, closest collision point, projectile attributes) that causes movement to safety works well


Approach is sub-optimal but acceptable

49

Evaluation


Continuous fitness values are more helpful to
the genetic algorithm than Boolean success
indicators


Scheme reveals how well it is possible to
evolve behaviors using genetic operators


The representation is better suited to
modeling sequences than either decision
trees or fuzzy rules


Representation is incompatible with rule-based schemes

50

Related Technologies


Genetic Programming


Existing programs are combined to breed
new programs


Artificial Life


Using cellular automata to simulate
population growth