# Introduction to Genetic Algorithms

AI and Robotics

Oct 24, 2013

2

Genetic Algorithms

What are they?

Evolutionary algorithms that make use of
operations like mutation, recombination, and
selection

Uses?

Difficult search problems

Optimization problems

Machine learning

-
bases

3

Theory of Evolution

Every organism has unique attributes that
can be transmitted to its offspring

Offspring are unique and have attributes from
each parent

Selective breeding can be used to manage
changes from one generation to the next

Nature applies certain pressures that cause
individuals to evolve over time

4

Evolutionary Pressures

Environment

Creatures must work to survive by finding
resources like food and water

Competition

Creatures within the same species compete with
each other

Rivalry

Different species affect each other by direct
confrontation (e.g. hunting) or indirectly by fighting
for the same resources

5

Natural Selection

Creatures that are not good at completing
tasks like hunting have fewer chances of
having offspring

Creatures that are successful in completing
basic tasks are more likely to transmit their
attributes to the next generation since there
will be more creatures born that can survive
and pass on these attributes

6

Genetics

Genome (class)

Sequence of genes describing the overall
structure of the genetic code for a particular species

Genomics

Study of the meaning of the genes for a particular
species

Alleles

Values that can be assigned to a given gene

Genotype (instance)

Sequence of alleles

7

Physical Properties

Phenetics

Study of physical properties and morphology of
creatures independent of genetic information

Phenome

General structure of a creature's body and attributes

Phenotype

Particular instance of phenome realized as a
unique creature

Product of genotype and environmental forces

8

Conversions

In the real world, mapping between genotypes
and phenotypes is hard

In AI work it can be done by defining a
convenient function or even designing
encodings by hand

It is often easier to adapt genetic operators to
work with the evolutionary data structure
used to represent the phenotype than to
encode and decode phenotypes

9

Genetic Algorithmic Process

Potential solutions for problem domains are
encoded using a machine representation (e.g.
bit strings) that supports variation and
selection operations

Mating and mutation operations produce a new
generation of solutions from parent encodings

The fitness function judges which individuals
are “best” suited (e.g. most appropriate
problem solution) for “survival”

10

Initialization

Initial population must be a representative
sample of the search space

Random initialization can be a good idea (if
the sample is large enough)

The random number generator must not be biased

Can also seed the population with existing
genotypes taken from algorithms, expert
opinion, or previous evolutionary cycles
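The initialization ideas above can be sketched in Python as follows. This is only an illustration under stated assumptions: a bit-string genotype is assumed, and `init_population`, `seeds`, and the other names are hypothetical, not taken from the slides.

```python
import random

def init_population(size, genome_length, seeds=(), rng=None):
    """Build a population of bit-string genotypes (lists of 0/1).

    `seeds` holds optional existing genotypes (for example from an expert
    or an earlier run); the remaining slots are filled uniformly at random.
    """
    rng = rng or random.Random()
    # Copy the seeded genotypes first, never exceeding the population size.
    population = [list(s) for s in seeds][:size]
    # Fill the rest of the population with random bit strings.
    while len(population) < size:
        population.append([rng.randint(0, 1) for _ in range(genome_length)])
    return population
```

Passing a seeded `random.Random` instance makes a run repeatable, which helps when comparing parameter settings.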

11

Evaluation

Each member of the population can be seen
as candidate solution to a problem

The fitness function determines the quality of
each solution

The fitness function takes a phenotype and
returns a floating point number as its score

It is problem dependent, so it can be very simple

It can be a bottleneck if it is not carefully thought
out (there are no magic ways to create them)
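As a rough illustration of a fitness function that takes a phenotype and returns a floating point score, the sketch below decodes a bit-string genotype into an integer and scores its distance from a target. The toy problem, `decode`, `fitness`, and `target` are all assumptions made for this example.

```python
def decode(genotype):
    """Map a bit-string genotype (list of 0/1) to its phenotype: an integer."""
    value = 0
    for bit in genotype:
        value = value * 2 + bit
    return value

def fitness(phenotype, target=10):
    """Score a phenotype as a float; higher is better, 0.0 is a perfect match."""
    return float(-abs(phenotype - target))
```

For instance, `fitness(decode([1, 0, 1, 0]))` evaluates the genotype `1010`, whose phenotype is 10.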

12

Selection

Want to give preference to “better” individuals

If the entire population ends up being selected, it
may be desirable to conduct a tournament to
order the individuals in the population

Would like to keep the best in the mating pool
and drop the worst (elitism)

Trade-off with search space
completeness
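One way to combine a tournament with elitism is sketched below. The tournament size `k`, the function names, and the parallel `fitnesses` list are assumptions for illustration; the slides do not prescribe a particular scheme.

```python
import random

def tournament_select(population, fitnesses, k=3, rng=None):
    """Pick k distinct individuals at random and return the fittest of them."""
    rng = rng or random.Random()
    contenders = rng.sample(range(len(population)), k)
    best = max(contenders, key=lambda i: fitnesses[i])
    return population[best]

def elites(population, fitnesses, n=1):
    """Return the n fittest individuals, to be carried over unchanged."""
    ranked = sorted(range(len(population)),
                    key=lambda i: fitnesses[i], reverse=True)
    return [population[i] for i in ranked[:n]]
```

A larger `k` raises selection pressure, while a larger `n` keeps more of the current best at the cost of exploring less of the search space.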

13

Crossover

In sexual reproduction the genetic codes of
both parents are combined to create offspring

Asexual crossover has no impact on the
mating pool

Would like to keep 60/40 split between parent
contributions

95/5 splits negate the benefits of crossover

14

Crossover

If we have selected two strings

A = 11111 and B = 00000

We might choose a uniformly random site
(e.g. position 3) and trade bits

This would create two new strings

A’ = 11100 and B’ = 00011

These new strings might then be added to the
mating pool if they are “fit”
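The bit-trading step in this example can be written as a one-point crossover. The function name and the convention that `site` counts the leading bits each string keeps are assumptions of this sketch.

```python
def one_point_crossover(a, b, site):
    """Swap the tails of two equal-length strings after the first `site` bits."""
    if len(a) != len(b) or not 0 <= site <= len(a):
        raise ValueError("strings must match in length and site must be in range")
    return a[:site] + b[site:], b[:site] + a[site:]

# Reproduces the example: A = 11111, B = 00000, site 3.
a_new, b_new = one_point_crossover("11111", "00000", 3)
print(a_new, b_new)  # prints: 11100 00011
```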

15

Mutation

Mutations happen at the genome level (rarely
and not good) and the genotype level (better
for the GA process)

Mutation is important for maintaining diversity
in the genetic code

In humans, mutation was responsible for the
evolution of intelligence

Example: the occasional (low probability)
alteration of a bit position in a string
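A minimal sketch of this kind of bit-flip mutation, assuming a genotype stored as a list of 0/1 values; the default `rate` of 0.01 is an arbitrary illustrative value, not one given in the slides.

```python
import random

def mutate(genotype, rate=0.01, rng=None):
    """Return a copy of the genotype with each bit flipped with probability `rate`."""
    rng = rng or random.Random()
    return [1 - bit if rng.random() < rate else bit for bit in genotype]
```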

16

Operators

Selection and mutation

When used together, they give the genetic algorithm
equivalent of a parallel, noise-tolerant
hill-climbing algorithm

Selection, crossover, and mutation

Provide an insurance policy against losing
population diversity and help avoid some of the
pitfalls of ordinary “hill climbing”

17

Replacement

Determine when to insert new offspring into
the mating pool and which individuals to drop
out based on fitness

Steady-state evolution keeps the number of
individuals in the population constant, so
each new offspring is processed one at a time
and fit individuals can remain a long time

In generational evolution, the offspring are
placed into a new population with all other
offspring (genetic code only survives in kids)

18

Genetic Algorithm

Set time t = 0

Initialize population P(t)

While termination condition not met

Evaluate fitness of each member of P(t)

Select members from P(t) based on fitness

Produce offspring from the selected pairs

Replace members of P(t) with better offspring

Set time t = t + 1
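The loop above might look like the following in Python. This is a sketch under several assumptions that are not in the slides: the toy problem is OneMax (maximise the number of 1 bits), selection is a binary tournament, and replacement is generational with one elite carried over.

```python
import random

def run_ga(genome_length=20, pop_size=30, generations=50,
           mutation_rate=0.02, rng=None):
    """Generational GA for the OneMax toy problem; returns the best genotype."""
    rng = rng or random.Random()
    fitness = sum  # fitness of a bit list is its count of 1s

    # Initialize population P(0) with random bit strings.
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(pop_size)]

    for t in range(generations):
        # Evaluate fitness of each member of P(t).
        scores = [fitness(ind) for ind in population]
        if max(scores) == genome_length:  # termination condition
            break

        def pick():
            # Binary tournament: the fitter of two random members wins.
            i, j = rng.sample(range(pop_size), 2)
            return population[i] if scores[i] >= scores[j] else population[j]

        # Elitism: the current best survives unchanged.
        offspring = [population[scores.index(max(scores))][:]]
        while len(offspring) < pop_size:
            p1, p2 = pick(), pick()
            site = rng.randint(1, genome_length - 1)   # one-point crossover
            child = p1[:site] + p2[site:]
            child = [1 - b if rng.random() < mutation_rate else b
                     for b in child]                   # bit-flip mutation
            offspring.append(child)
        population = offspring  # P(t + 1)

    return max(population, key=fitness)
```

Calling `run_ga(rng=random.Random(0))` gives a repeatable run; because of the elite copy, the best fitness never decreases from one generation to the next.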

19

Why use genetic algorithms?

They can solve hard problems

Easy to interface genetic algorithms to
existing simulations and models

GAs are extensible

GAs are easy to hybridize

GAs work by sampling, so populations can
be sized to detect differences with specified
error rates

Use little problem specific code

20

Traveling Salesman Problem

To use a genetic algorithm to solve the traveling
salesman problem we could begin by creating a
population of candidate solutions

We need to define mutation, crossover, and
selection methods to aid in evolving a solution
from this population

At random pick two solutions and combine them
to create a child solution, then a fitness function
is used to rank the solutions

21

Traveling Salesman Problem

For crossover we might take two paths (P1
and P2), break them at arbitrary points, and
define new solutions Left1+Right2 and
Left2+Right1

For mutation we might randomly switch two
cities in an existing path
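Both operators can be sketched in a few lines of Python, assuming a path is a list of city labels. The function names are hypothetical, and for simplicity both parents are cut at the same index.

```python
import random

def split_crossover(p1, p2, cut):
    """Return Left1+Right2 and Left2+Right1 for a shared cut point.

    Note: the children are not guaranteed to be valid tours; a city can
    appear twice while another is missing.
    """
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def swap_mutation(path, rng=None):
    """Return a copy of the path with two randomly chosen cities switched."""
    rng = rng or random.Random()
    i, j = rng.sample(range(len(path)), 2)
    mutated = list(path)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated
```

For example, `split_crossover([1, 2, 3, 4, 5], [5, 4, 3, 2, 1], 2)` returns `([1, 2, 3, 2, 1], [5, 4, 3, 4, 5])`, which shows how repeated cities can arise from this naive crossover.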

22

Evolve Algorithm for TSP

Set up initial population

For G generations

Create M mutations and add them to the population

Subject mutations to population constraints and
determine their relative fitness

Create C crossovers and add them to the population

Subject crossovers to population constraints and
determine their relative fitness

23

Solving TSP using GA

Steps:

1. Create group of random tours

Stored as sequence of numbers (parents)

2. Choose 2 of the better solutions

Combine and create new sequences (children)

Problems here:

City 1 repeated in Child 1

City 5 repeated in Child 2

24

Modifications Needed

Algorithm must not allow repeated cities

Also, order must be considered

12345 is same as 32154

Based upon these considerations, a
computer model for N cities can be
created

Gets quite detailed
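Two small helpers illustrate these modifications. The crossover below is a simplified order-preserving scheme chosen for this sketch (it is not taken from the slides): it keeps a slice of one parent and fills the other positions with the missing cities in the order the second parent visits them, so no city repeats. `canonical` treats a tour as a closed loop, so rotations and reversals compare equal. All names are hypothetical.

```python
def ordered_crossover(p1, p2, start, end):
    """Keep p1[start:end] in place; fill the rest in p2's visiting order."""
    kept = p1[start:end]
    rest = [city for city in p2 if city not in kept]
    return rest[:start] + kept + rest[start:]

def canonical(tour):
    """Normalise a closed tour so rotations and reversals compare equal."""
    tour = list(tour)
    i = tour.index(min(tour))
    forward = tour[i:] + tour[:i]                 # rotate smallest city to front
    backward = [forward[0]] + forward[:0:-1]      # same loop, opposite direction
    return min(forward, backward)

print(ordered_crossover([1, 2, 3, 4, 5], [5, 4, 3, 2, 1], 1, 3))  # prints: [5, 2, 3, 4, 1]
print(canonical("12345") == canonical("32154"))                   # prints: True
```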

Genetic Algorithm Example

[Figure: Parent A and Parent B, two tours over cities A to E]

Genetic Algorithm Example

[Figure: Combined Path]

Genetic Algorithm Example

[Figure: Child]

Mutations

Chance of 1 in 50 to introduce a mutation to the next
generation (the child if it replaces a parent, or the first
parent)

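[Figure: tour E B F D G A C with positions R1 and R2 marked, and tour E A G D F B C, in which the cities from B to A appear in reverse order]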
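A possible reading of this slide, offered as an assumption: R1 and R2 are two randomly chosen positions, and the mutation reverses the cities between them. That reading is consistent with the two tours shown (E B F D G A C becoming E A G D F B C), but the slides do not state it. The function names are hypothetical.

```python
import random

def reverse_segment(tour, r1, r2):
    """Reverse the cities between positions r1 and r2 (both inclusive)."""
    lo, hi = sorted((r1, r2))
    return tour[:lo] + tour[lo:hi + 1][::-1] + tour[hi + 1:]

def maybe_mutate(tour, chance=1 / 50, rng=None):
    """With the given chance, reverse a random segment; otherwise copy the tour."""
    rng = rng or random.Random()
    if rng.random() >= chance:
        return list(tour)
    r1, r2 = rng.sample(range(len(tour)), 2)
    return reverse_segment(list(tour), r1, r2)

print("".join(reverse_segment(list("EBFDGAC"), 1, 5)))  # prints: EAGDFBC
```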

29

Premature Convergence

Occasionally a gene takes over because it is so
much fitter than all others (genetic drift)

If this is the best solution, that may be OK (if not
you may never find the optimal solution if this
happens too soon)

In large populations, genetic drift is less likely to
happen

Using higher mutation rates can combat genetic
drift

30

Premature Convergence

High levels of randomness are not always

To prevent genetic drift

You might have several small populations and
cross-breed individuals from them

Take a game-of-life approach: pretend individuals
live on a 2D grid and only allow breeding between
neighbors (spatial organizational structure)

31

Slow Convergence

Some GAs will simply fail to converge

Similar to plateau problem in hill climbing
(need to add noise to fitness values to make
them converge)

Can increase elitism to encourage fitter
individuals to spread their genes (at the risk
of premature convergence)

Increasing level of random mutations
sometimes helps

32

Parameters

Require lots of parameters (mutation rate,
crossover type, population size, fitness
scaling policy)

Can make use of a hierarchy of GAs with a
master GA setting the parameters for an
ordinary GA

Parameterless GAs have default values
chosen for parameters so that human
interaction is not needed for fine tuning

33

Domain Knowledge

GAs do not exploit domain knowledge unless
the knowledge engineer (KE) designs special
policies and operators

During initialization there can be a bias
toward certain genotypes selected by the
domain expert

Can use gene dependent mutation rates and
heuristic crossover split points

The choice of representation can affect the
size and search efficiency of the problem
space

34

GA Strengths

Do well at avoiding local minima and can
often find near-optimal solutions since
search is not restricted to small search areas

Easy to extend by creating custom operators

Perform well for global optimizations

Work required to choose representations
and conversion routines is acceptable

35

GA Weaknesses

Do not take advantage of domain knowledge

Not very efficient at local optimization (fine
tuning solutions)

Randomness inherent in GAs makes them hard
to predict (solutions can take a long time to
stumble upon)

Require entire populations to work (takes lots
of time and memory) and may not work well
for real-time applications

36

Evolvee

Uses existing representations (like Neural Net)

Realism is relatively poor

Simple tasks (e.g. attack behaviors) do
not pose any problems for it

37

Actions and Parameters

Limited action set needed

Look

parameter: direction

Move

parameter: weights

Vector (projectile, collision point, impact location)

Fire

parameter:

Jump

parameter:

38

Sequences

Contained in simple arrays of actions and
times

Times can be associated with actions in two
ways

Time offset relative to previous action

Absolute time since start of sequence

The order of sequences in an array is not
important (this allows symmetric solutions but
avoids the cost of sorting actions before
evolution is complete)
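The two timing conventions can be related with a short sketch. It assumes a sequence is a list of `(action, time)` pairs; the helper names are hypothetical. Sorting is deferred until a sequence actually has to be executed, in line with the note about avoiding the cost of sorting during evolution.

```python
from itertools import accumulate

def to_absolute(sequence):
    """Convert (action, offset from previous action) pairs to absolute times."""
    times = accumulate(offset for _, offset in sequence)
    return [(action, t) for (action, _), t in zip(sequence, times)]

def in_time_order(sequence):
    """Sort (action, absolute time) pairs by time; only needed at execution."""
    return sorted(sequence, key=lambda pair: pair[1])

relative = [("look", 0.0), ("jump", 0.5), ("fire", 0.25)]
print(to_absolute(relative))  # prints: [('look', 0.0), ('jump', 0.5), ('fire', 0.75)]
```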

39

Random Generation

Time offset will be a randomly generated
value within the maximum sequence length

Action type can be encoded as a symbol
randomly chosen from set of all possible
actions

Parameter values are action specific; they
need to be chosen after the action is selected
and must fall within the given ranges

40

Random Generation

The length of all action sequences can also
be generated randomly (with a maximum
upper bound)

The sequences of actions will be housed in a
dynamic array

Start time of first action in a sequence can be
reset to zero
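Random generation of a sequence could be sketched as below. Everything specific here is a placeholder: each action is given a single numeric parameter with a made-up range (the slides do not specify the Fire or Jump parameters), times are stored as absolute values, and `ACTIONS`, `random_sequence`, and the default limits are hypothetical.

```python
import random

# Hypothetical (low, high) range of one placeholder parameter per action.
ACTIONS = {
    "look": (0.0, 360.0),
    "move": (-1.0, 1.0),
    "fire": (0.0, 1.0),
    "jump": (0.0, 1.0),
}

def random_sequence(max_actions=8, max_time=5.0, rng=None):
    """Return a random list of (time, action, parameter) tuples."""
    rng = rng or random.Random()
    length = rng.randint(1, max_actions)       # random length, bounded above
    seq = []
    for _ in range(length):
        action = rng.choice(sorted(ACTIONS))   # pick the action symbol first
        low, high = ACTIONS[action]            # then its action-specific range
        seq.append((rng.uniform(0.0, max_time), action, rng.uniform(low, high)))
    # Reset the start time of the earliest action to zero.
    first = min(t for t, _, _ in seq)
    return [(t - first, a, p) for t, a, p in seq]
```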

41

Crossover

Simple one point crossover

Randomly split two move sequences from
parents and swap sub-arrays to create two
new children

Fairly easy to program using arrays

42

Mutation

A low probability mutation might be to change
the length of a sequence

Empty spaces can be filled with random action

Excess actions are simply ignored

A low probability mutation might be to replace
individual actions within existing sequences

Gene storage time follows normal distribution

43

Evolution

Population size will remain constant

Evolution happens on request

If an individual without an assigned fitness exists, choose it;
otherwise choose two parents with probabilities
proportional to their fitness for crossover/mutation

Individuals are removed from the population
using random selection based on inverse
fitness

To diversify the population remove the poorer of
two similar behaviors
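Fitness-proportional parent choice and inverse-fitness removal can be sketched with `random.Random.choices`. The sketch assumes non-negative fitness values with at least one positive; the function names and the small `eps` that guards against division by zero are assumptions.

```python
import random

def pick_parents(population, fitnesses, rng=None):
    """Choose two parents with probability proportional to fitness.

    Sampling is with replacement, so the same individual can be picked twice.
    """
    rng = rng or random.Random()
    return rng.choices(population, weights=fitnesses, k=2)

def pick_victim(fitnesses, rng=None, eps=1e-9):
    """Return the index of an individual to remove, favouring low fitness."""
    rng = rng or random.Random()
    weights = [1.0 / (f + eps) for f in fitnesses]
    return rng.choices(range(len(fitnesses)), weights=weights, k=1)[0]
```

Removing one individual for each new offspring keeps the population size constant, as the slide requires.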

44

Object for Defensive Tactics

In combat game terms, defensive tactics is the sequence
of actions carried out by an object to protect itself when it
comes under attack

This is a natural choice for learning behavior by genetic
algorithm, because the object is in a highly competitive
situation with a survival mandate

It should be possible to decide on the fittest behaviors and
select for them in the evolving sequence of actions

To keep things simple, we will focus on only two behaviors:
dodging enemy fire and rocket jumping

But the method could be extended to include other
defensive moves, such as weaving and seeking cover

45

Computing Fitness

Rocket Jumping

Assign rewards only for upward movement
when object is not touching the floor, to avoid
rewarding running up the stairs

Reward high jumps a lot more than lower
jumps
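One way these two rules could be turned into a score is sketched below. The input format (one `(height, on_floor)` sample per game tick) and the choice to square the total airborne gain, so that high jumps earn far more than low ones, are assumptions of this example rather than the scheme used in the slides.

```python
def jump_fitness(samples):
    """Score a jump from a list of (height, on_floor) samples, one per tick."""
    gain = 0.0
    for (h0, _), (h1, on_floor) in zip(samples, samples[1:]):
        # Count upward movement only while the object is off the floor,
        # so running up stairs earns nothing.
        if not on_floor and h1 > h0:
            gain += h1 - h0
    # Squaring rewards high jumps much more than low ones.
    return gain ** 2
```

With this scoring, a climb of 2 units while airborne scores 4.0, while the same climb made on stairs scores 0.0.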

46

Computing Fitness

Dodging Fire

Provide 0 reward when hit and high reward
when object escapes with no damage

Must include distance of dodging movement
away from point of impact to avoid rewarding
“standing still”

Damage to object must also be measured
and subtracted from fitness value

Use time as a 4th dimension to resolve ties

47

For the Game

Make use of genetic algorithm

Learn its jumping and dodging behaviors
during the game

Fitness function provides rewards on a per
jump or per dodge basis

48

Evaluation

Learns to jump fairly quickly

Multiple jumps are no problem

Dodging behavior is also learned quickly

Any balanced combination of vector weights
(estimated point of impact, closest collision
point, projectile attributes) that causes
movement to safety works well

Approach is sub-optimal but acceptable

49

Evaluation

Continuous fitness values are more helpful to
the genetic algorithm than Boolean success
indicators

Scheme reveals how well it is possible to
evolve behaviors using genetic operators

The representation is better suited to
modeling sequences than either decision
trees or fuzzy rules

Representation is incompatible with rule-based
schemes

50

Related Technologies

Genetic Programming

Existing programs are combined to breed
new programs

Artificial Life

Using cellular automata to simulate
population growth