Genetic algorithms
Genetic Algorithms in a slide
Premise
Evolution worked once (it produced us!), so it might work
again
Basics
Pool of solutions
Mate existing solutions to produce new solutions
Mutate current solutions for long-term diversity
Cull population
Originator
John Holland
Seminal work
Adaptation in Natural and Artificial Systems (1975), which
introduced the main GA concepts
Introduction
Computing pioneers (especially in AI) looked to natural
systems as guiding metaphors
Evolutionary computation
Any biologically motivated computing activity simulating
natural evolution
Genetic Algorithms are one form of this activity
Original goals
Formal study of the phenomenon of adaptation
John Holland
An optimization tool for engineering problems
Main idea
Take a population of candidate solutions to a given problem
Use operators inspired by the mechanisms of natural genetic
variation
Apply selective pressure toward certain properties
Evolve a more fit solution
Why evolution as a metaphor
Ability to efficiently guide a search through a large solution
space
Ability to adapt solutions to changing environments
“Emergent” behavior is the goal
“The hoped-for emergent behavior is the design of
high-quality solutions to difficult problems and the
ability to adapt these solutions in the face of a
changing environment”
Melanie Mitchell, An Introduction to Genetic Algorithms
Evolutionary terminology
Abstractions imported from biology
Chromosomes, Genes, Alleles
Fitness, Selection
Crossover, Mutation
GA terminology
In the spirit, but not the letter, of biology
GA chromosomes are strings of genes
Each gene has a number of alleles; i.e., settings
Each chromosome is an encoding of a solution to a
problem
A population of such chromosomes is operated on by a GA
Encoding
A data structure for representing candidate solutions
Often takes the form of a bit string
Usually has internal structure; i.e., different parts of the
string represent different aspects of the solution
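As a minimal sketch of an encoding with internal structure, the snippet below packs two parameters into one 8-bit chromosome. The field layout (4 bits each for two hypothetical parameters `x` and `y`) is an assumption made for illustration; the slides do not prescribe one.

```python
# Hypothetical layout: bits 0-3 encode x, bits 4-7 encode y (each in 0..15).
FIELD_BITS = 4


def encode(x: int, y: int) -> str:
    """Pack two small integers into one bit-string chromosome."""
    return format(x, f"0{FIELD_BITS}b") + format(y, f"0{FIELD_BITS}b")


def decode(chromosome: str) -> tuple[int, int]:
    """Recover the two parameters from their fields of the chromosome."""
    return int(chromosome[:FIELD_BITS], 2), int(chromosome[FIELD_BITS:], 2)


chromosome = encode(5, 12)
print(chromosome, decode(chromosome))  # 01011100 (5, 12)
```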
Crossover
Mimics biological recombination
Some portion of genetic material is swapped between
chromosomes
Typically the swapping produces an offspring
Mechanism for the dissemination of “building blocks”
(schemas)
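One common form of this operator is one-point crossover on bit strings, sketched below. The function name and the use of a seeded `random.Random` are illustrative choices, not taken from the slides.

```python
import random


def one_point_crossover(parent_a: str, parent_b: str, rng: random.Random) -> tuple[str, str]:
    """Swap the tails of two equal-length bit strings at a random cut point."""
    assert len(parent_a) == len(parent_b) >= 2
    cut = rng.randint(1, len(parent_a) - 1)  # cut strictly inside the string
    return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]


child_1, child_2 = one_point_crossover("11111111", "00000000", random.Random(0))
print(child_1, child_2)
```

Each child keeps a contiguous prefix from one parent and a suffix from the other, which is why schemas whose fixed bits sit close together tend to survive the cut.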
Mutation
Selects a random locus (gene location) with some
probability and alters the allele at that locus
The intuitive mechanism for the preservation of variety in the
population
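A minimal sketch of mutation on a bit string follows. It assumes the common per-locus variant, in which every bit is flipped independently with a small probability `rate`; the slides do not say which variant is intended.

```python
import random


def mutate(chromosome: str, rate: float, rng: random.Random) -> str:
    """Flip each bit independently with probability `rate`."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[bit] if rng.random() < rate else bit for bit in chromosome)


print(mutate("01010101", 0.1, random.Random(0)))
```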
Fitness
A measure of the goodness of the organism
Expressed as the probability that the organism will live
another cycle (generation)
Basis for the natural selection simulation
Organisms are selected to mate with probabilities
proportional to their fitness
Probabilistically, better solutions have a better chance of
conferring their building blocks to the next generation (cycle)
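Selection with probability proportional to fitness is often called roulette-wheel selection. The sketch below uses the standard library's `random.Random.choices`; the function name `select_parent` is illustrative.

```python
import random


def select_parent(population, fitnesses, rng):
    """Pick one member with probability f(x) / sum of all fitnesses.

    Fitnesses must be non-negative and not all zero.
    """
    return rng.choices(population, weights=fitnesses, k=1)[0]


rng = random.Random(0)
picks = [select_parent(["a", "b"], [3.0, 1.0], rng) for _ in range(4000)]
print(picks.count("a") / len(picks))  # close to 0.75
```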
A Simple GA
Generate initial population
do
    Calculate the fitness of each member
    // simulate another generation
    do
        Select parents from current population
        Perform crossover and add offspring to the new population
    while new population is not full
    Merge new population into the current population
    Mutate current population
while not converged
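The loop above can be sketched as runnable Python on a toy problem. Everything specific here is an assumption for illustration: the OneMax objective (count the 1 bits), the parameter values, reading "merge" as keeping the best members of parents plus offspring, and sparing the current best from mutation.

```python
import random


def fitness(chromosome):
    """Toy objective (OneMax): count the 1 bits."""
    return sum(chromosome)


def run_ga(length=20, pop_size=30, mutation_rate=0.01, max_generations=200, seed=0):
    rng = random.Random(seed)
    # Generate initial population
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(max_generations):
        # Calculate the fitness of each member
        scores = [fitness(c) for c in population]
        if max(scores) == length:  # converged: the optimum is present
            break
        offspring = []
        while len(offspring) < pop_size:
            # Fitness-proportionate selection; +1 keeps every weight positive
            mom, dad = rng.choices(population, weights=[s + 1 for s in scores], k=2)
            cut = rng.randint(1, length - 1)  # one-point crossover
            offspring.append(mom[:cut] + dad[cut:])
        # Merge: keep the best pop_size of parents and offspring together
        population = sorted(population + offspring, key=fitness, reverse=True)[:pop_size]
        # Mutate everyone except the current best (index 0)
        for chromosome in population[1:]:
            for i in range(length):
                if rng.random() < mutation_rate:
                    chromosome[i] ^= 1
    best = max(population, key=fitness)
    return best, fitness(best)


best, score = run_ga()
print(score, best)
```

Because the best member is never mutated and never culled, the best fitness cannot decrease from one generation to the next in this sketch.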
How do GAs work
The structure of a GA is relatively simple to comprehend, but
the dynamic behavior is complex
Holland has done significant work on the theoretical
foundations of GAs
“GAs work by discovering, emphasizing, and recombining
good ‘building blocks’ of solutions in a highly parallel fashion.”
Melanie Mitchell, paraphrasing John Holland
Using formalism
Notion of a building block is formalized as a schema
Schemas are propagated or destroyed according to the
laws of probability
Schema
A template, much like a regular expression, describing a set
of strings
The set of strings represented by a given schema
characterizes a set of candidate solutions sharing a property
This property is the encoded equivalent of a building block
Example
0 or 1 represents a fixed bit
Asterisk represents a “don’t care”
11****00 is the set of all solutions encoded in 8 bits,
beginning with two ones and ending with two zeros
Solutions in this set all share the same variants of the
properties encoded at these loci
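Membership in a schema is easy to check position by position. The helper below is a small sketch; the name `matches` is illustrative.

```python
from itertools import product


def matches(schema: str, chromosome: str) -> bool:
    """True if the chromosome is an instance of the schema ('*' = don't care)."""
    return len(schema) == len(chromosome) and all(
        s == "*" or s == c for s, c in zip(schema, chromosome)
    )


print(matches("11****00", "11010100"))  # True
print(matches("11****00", "01010100"))  # False

# Four free positions, so the schema describes 2**4 = 16 strings.
instances = ["".join(bits) for bits in product("01", repeat=8)
             if matches("11****00", "".join(bits))]
print(len(instances))  # 16
```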
Schema qualifiers
Length
The inclusive distance between the two fixed bits in a schema
which are furthest apart (counted this way, the length of the
previous example is 8; the defining length as usually defined,
last fixed position minus first, is 7)
Order
The number of fixed bits in a schema (the order of the
previous example is 4)
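Both qualifiers can be computed directly from the schema string. The sketch below gives the inclusive span used on this slide and, separately, the convention common in the GA literature (last fixed position minus first), which is one less. The function names are illustrative.

```python
def order(schema: str) -> int:
    """Number of fixed (non-'*') positions."""
    return sum(ch != "*" for ch in schema)


def inclusive_length(schema: str) -> int:
    """Span from the first to the last fixed position, counting both ends."""
    fixed = [i for i, ch in enumerate(schema) if ch != "*"]
    return fixed[-1] - fixed[0] + 1 if fixed else 0


def defining_length(schema: str) -> int:
    """Conventional d(H): last fixed position minus first fixed position."""
    fixed = [i for i, ch in enumerate(schema) if ch != "*"]
    return fixed[-1] - fixed[0] if fixed else 0


print(order("11****00"))             # 4
print(inclusive_length("11****00"))  # 8
print(defining_length("11****00"))   # 7
```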
Not just sum of the parts
GAs explicitly evaluate and operate on whole solutions
GAs implicitly evaluate and operate on building blocks
Existing schemas may be destroyed or weakened by
crossover
New schemas may be spliced together from existing
schemas
Crossover includes no notion of a schema, only of the
chromosomes
Why do they work
Schemas can be destroyed or conserved
So how are good schemas propagated through generations?
Conserved (good) schemas confer higher fitness on the
offspring inheriting them
Fitter offspring are probabilistically more likely to be
chosen to reproduce
Approximating schema dynamics
Let H be a schema with at least one instance present in the
population at time t
Let m(H, t) be the number of instances of H at time t
Let x be an instance of H and f(x) be its fitness
Under fitness-proportionate selection, the expected number of
offspring of x is f(x)/f(pop), where f(pop) is the average
fitness of the population at time t
To know E(m(H, t +1)) (the expected number of instances of
schema H at the next time unit), sum f(x)/f(pop) for all x in H
GA never explicitly calculates the average fitness of a
schema, but schema proliferation depends on its value
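The sum described above can be worked through on a tiny population. This sketch takes f(pop) to be the mean fitness of the population and ignores crossover and mutation; the population and the fitness function (number of 1 bits) are made up for illustration.

```python
def matches(schema, chromosome):
    """True if the chromosome is an instance of the schema ('*' = don't care)."""
    return all(s == "*" or s == c for s, c in zip(schema, chromosome))


def expected_instances(population, fitness, schema):
    """E(m(H, t+1)) under selection alone: sum of f(x)/f(pop) over x in H."""
    mean_fitness = sum(fitness(x) for x in population) / len(population)
    return sum(fitness(x) / mean_fitness for x in population if matches(schema, x))


population = ["1100", "1010", "0011", "0001"]


def ones(x):
    """Illustrative fitness: number of 1 bits."""
    return x.count("1")


# Mean fitness is (2 + 2 + 2 + 1) / 4 = 1.75; H = 1*** has two instances of fitness 2.
print(expected_instances(population, ones, "1***"))  # 4 / 1.75, about 2.29
```

The same value results from multiplying m(H, t) = 2 by the ratio of the schema's average fitness (2) to the population's average fitness (1.75), which is the quantity the GA never computes explicitly.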
Approximating schema dynamics
Approximation can be refined by taking into account the
operators
Schemas of long defining length are less likely to survive
crossover
Offspring are less likely to be instances of such
schemas
Schemas of higher order are less likely to survive
mutation
Effects can be used to bound the approximate rates at
which schemas proliferate
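These two effects are usually quantified as follows in textbook statements of the Schema Theorem: under one-point crossover applied with probability p_c, a schema of defining length d(H) in strings of length l survives with probability at least 1 - p_c * d(H) / (l - 1), and under per-bit mutation with probability p_m, a schema of order o(H) survives with probability (1 - p_m)^o(H). The symbols p_c and p_m and the function names below do not appear in the slides, and d(H) here is the conventional defining length (last fixed position minus first).

```python
def crossover_survival_bound(defining_length, string_length, p_crossover):
    """Lower bound on the probability a schema survives one-point crossover."""
    return 1.0 - p_crossover * defining_length / (string_length - 1)


def mutation_survival(order, p_mutation):
    """Probability that none of the schema's fixed bits is mutated."""
    return (1.0 - p_mutation) ** order


# 11****00 spans the whole string (d = 7, o = 4); 11****** is compact (d = 1, o = 2).
for name, d, o in [("11****00", 7, 4), ("11******", 1, 2)]:
    print(name,
          round(crossover_survival_bound(d, 8, p_crossover=0.7), 3),
          round(mutation_survival(o, p_mutation=0.01), 4))
```

With these illustrative rates the long schema is only guaranteed to survive crossover 30% of the time, while the compact one is guaranteed 90%.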
Implications
Instances of short, low-order schemas whose average fitness
tends to stay above the mean will increase exponentially
Changing the semantics of the operators can change the
selective pressures toward different types of schemas
Theoretical Foundations
Empirical observation
GAs can work
Goal
Learn how to best use the tool
Strategy
Understand the dynamics of the model
Develop performance metrics in order to quantify success
Theoretical Foundations
Issues surrounding the dynamics of the model
What laws characterize the macroscopic behavior of GAs?
How do microscopic events give rise to this macroscopic
behavior?
Theoretical Foundation
Holland’s motivation
Construct a theoretical framework for adaptive systems as
seen in nature
Apply this framework to the design of artificial adaptive
systems
Issues in performance evaluation
According to what criteria should GAs be evaluated?
What does it mean for a GA to do well or poorly?
Under what conditions is a GA an appropriate solution
strategy for a problem?
Theoretical Foundation
Holland’s observations
An adaptive system must persistently identify, test, and
incorporate structural properties hypothesized to give
better performance in some environment
Adaptation is impossible in a sufficiently random
environment
Theoretical Foundation
Holland’s intuition
A GA is capable of modeling the necessary tasks in an
adaptive system
It does so through a combination of explicit computation
and implicit estimation of state combined with incremental
change of state in directions motivated by these
calculations
Theoretical Foundation
Holland’s assertion
The ‘identify and test’ requirement is satisfied by the
calculation of the fitnesses of various schemas
The ‘incorporate’ requirement is satisfied by implication of
the Schema Theorem
Theoretical Foundation
How does a GA identify and test properties?
A schema is the formalization of a property
A GA explicitly calculates fitnesses of individuals and
thereby schemas in the population
It implicitly estimates fitnesses of hypothetical individuals
sharing known schemas
In this way it efficiently manages information regarding
the entire search space
Theoretical Foundation
How does a GA incorporate observed good properties into the
population?
Implication of the Schema Theorem
Short, low-order, higher than average fitness schemas
will receive exponentially increasing numbers of
samples over time
Theoretical Foundation
Lemmas to the Schema Theorem
Selection focuses the search
Crossover combines good schemas
Mutation is the insurance policy
Theoretical Foundation
Holland’s characterization
Adaptation in natural systems is framed by a tension
between exploration and exploitation
Any move toward the testing of previously unseen
schemas or of those with instances of low fitness takes
away from the wholesale incorporation of known high
fitness schemas
But without exploration, schemas of even higher fitness
cannot be discovered
Theoretical Foundation
Goal of Holland’s first offering
The original GA was proposed as an “adaptive plan” for
accomplishing a proper balance between exploration and
exploitation
Theoretical Foundation
GA does in fact model this
Given certain assumptions, the balance is achieved
A key assumption is that the observed and actual
fitnesses of schemas are correlated
This assumption creates a stumbling block to which we
will return
Traveling Salesperson Problem
Initial Population for TSP
(5,3,4,6,2)
(2,4,6,3,5)
(4,3,6,5,2)
(2,3,4,6,5)
(4,3,6,2,5)
(3,4,5,2,6)
(3,5,4,6,2)
(4,5,3,6,2)
(5,4,2,3,6)
(4,6,3,2,5)
(3,4,2,6,5)
(3,6,5,1,4)
Select Parents
(5,3,4,6,2)
(2,4,6,3,5)
(4,3,6,5,2)
(2,3,4,6,5)
(4,3,6,2,5)
(3,4,5,2,6)
(3,5,4,6,2)
(4,5,3,6,2)
(5,4,2,3,6)
(4,6,3,2,5)
(3,4,2,6,5)
(3,6,5,1,4)
Try to pick the better ones.
Create Offspring (1 point)
(5,3,4,6,2)
(2,4,6,3,5)
(4,3,6,5,2)
(2,3,4,6,5)
(4,3,6,2,5)
(3,4,5,2,6)
(3,5,4,6,2)
(4,5,3,6,2)
(5,4,2,3,6)
(4,6,3,2,5)
(3,4,2,6,5)
(3,6,5,1,4)
Offspring:
(3,4,5,6,2)
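The offspring (3,4,5,6,2) is consistent with a one-point cut after the third city. The slide does not say which parents or cut point were used, so the pairing below is an assumption chosen to reproduce that tour. The second call shows why a plain one-point cut is risky for tours: it can repeat a city.

```python
def one_point_crossover(parent_a, parent_b, cut):
    """Head of parent_a followed by the tail of parent_b."""
    return parent_a[:cut] + parent_b[cut:]


def is_valid_tour(tour):
    """A tour must not visit any city twice."""
    return len(set(tour)) == len(tour)


child = one_point_crossover((3, 4, 5, 2, 6), (5, 3, 4, 6, 2), cut=3)
print(child, is_valid_tour(child))  # (3, 4, 5, 6, 2) True

bad = one_point_crossover((5, 3, 4, 6, 2), (2, 4, 6, 3, 5), cut=3)
print(bad, is_valid_tour(bad))      # (5, 3, 4, 3, 5) False
```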
Create More Offspring
(5,3,4,6,2)
(2,4,6,3,5)
(4,3,6,5,2)
(2,3,4,6,5)
(4,3,6,2,5)
(3,4,5,2,6)
(3,5,4,6,2)
(4,5,3,6,2)
(5,4,2,3,6)
(4,6,3,2,5)
(3,4,2,6,5)
(3,6,5,1,4)
Offspring:
(3,4,5,6,2)
(5,4,2,6,3)
Mutate
(5,3,4,6,2)
(2,4,6,3,5)
(4,3,6,5,2)
(2,3,4,6,5)
(4,3,6,2,5)
(3,4,5,2,6)
(3,5,4,6,2)
(4,5,3,6,2)
(5,4,2,3,6)
(4,6,3,2,5)
(3,4,2,6,5)
(3,6,5,1,4)
Mutate
(5,3,4,6,2)
(2,4,6,3,5)
(4,3,6,5,2)
(2,3,4,6,5)
(2,3,6,4,5)
(3,4,5,2,6)
(3,5,4,6,2)
(4,5,3,6,2)
(5,4,2,3,6)
(4,6,3,2,5)
(3,4,2,6,5)
(3,6,5,1,4)
Offspring:
(3,4,5,6,2)
(5,4,2,6,3)
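The change from (4,3,6,2,5) to (2,3,6,4,5) is what a swap of two positions produces. A minimal sketch of such a swap mutation follows; the function names are illustrative, and swapping is only one of several mutations that keep a tour valid.

```python
import random


def swap_mutation(tour, i, j):
    """Exchange the cities at positions i and j; the result is still a permutation."""
    mutated = list(tour)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return tuple(mutated)


def random_swap_mutation(tour, rng):
    """Swap two distinct, randomly chosen positions."""
    i, j = rng.sample(range(len(tour)), 2)
    return swap_mutation(tour, i, j)


print(swap_mutation((4, 3, 6, 2, 5), 0, 3))  # (2, 3, 6, 4, 5)
print(random_swap_mutation((4, 3, 6, 2, 5), random.Random(0)))
```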
Eliminate
(5,3,4,6,2)
(2,4,6,3,5)
(4,3,6,5,2)
(2,3,4,6,5)
(2,3,6,4,5)
(3,4,5,2,6)
(3,5,4,6,2)
(4,5,3,6,2)
(5,4,2,3,6)
(4,6,3,2,5)
(3,4,2,6,5)
(3,6,5,1,4)
Tend to kill off the worst ones.
Offspring:
(3,4,5,6,2)
(5,4,2,6,3)
Integrate
(5,3,4,6,2)
(2,4,6,3,5)
(2,3,6,4,5)
(3,4,5,2,6)
(3,5,4,6,2)
(4,5,3,6,2)
(5,4,2,3,6)
(4,6,3,2,5)
(3,4,2,6,5)
(3,6,5,1,4)
Offspring:
(3,4,5,6,2)
(5,4,2,6,3)
Restart
(5,3,4,6,2)
(2,4,6,3,5)
(2,3,6,4,5)
(3,4,5,2,6)
(3,5,4,6,2)
(4,5,3,6,2)
(5,4,2,3,6)
(4,6,3,2,5)
(3,4,2,6,5)
(3,6,5,1,4)
(3,4,5,6,2)
(5,4,2,6,3)
Genetic Algorithms
Facts
Very robust but slow
Can make simulated annealing seem fast
In the limit, optimal
Other GA-TSP Possibilities
Ordinal Representation
Partially-Mapped Crossover
Edge Recombination Crossover
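Of these, partially-mapped crossover (PMX) is compact enough to sketch. The version below is one common formulation, not necessarily the one the lecture had in mind: the child copies a segment from the first parent, takes every other position from the second parent, and resolves duplicates by following the mapping the segment defines. The parents and cut points in the example are arbitrary.

```python
def pmx(parent_a, parent_b, start, end):
    """Partially-mapped crossover: keep parent_a[start:end], fill the rest from parent_b."""
    child = [None] * len(parent_a)
    child[start:end] = parent_a[start:end]
    segment = set(parent_a[start:end])
    position_in_a = {city: i for i, city in enumerate(parent_a)}
    outside = list(range(start)) + list(range(end, len(parent_a)))
    for i in outside:
        city = parent_b[i]
        # If the city is already in the copied segment, follow the mapping
        # a -> b at the same position until a city outside the segment is found.
        while city in segment:
            city = parent_b[position_in_a[city]]
        child[i] = city
    return tuple(child)


child = pmx((1, 2, 3, 4, 5, 6, 7, 8), (3, 7, 5, 1, 6, 8, 2, 4), start=3, end=6)
print(child)  # (3, 7, 8, 4, 5, 6, 2, 1)
```

Unlike a plain one-point cut, the result is always a valid permutation when both parents are permutations of the same cities.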
Problem
Operators are not sufficiently exploiting the proper
“building blocks” used to create new solutions.
Genetic Algorithms
Some ideas
Parallelism
Punctuated equilibria
Jump starting
Problem-specific information
Synthesize with simulated annealing
Perturbation operator
Heuristic H
Let T be the optimal tour.
Length(MST) < Length(T)
Heuristic H
Tour T’
Tour T’’
Perturbation of points
Perturbation of a Point
Mutation Operator
Points are perturbed according to a normal distribution centered
on the original location, with a standard deviation that is a
function of the original interpoint distances.
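A minimal sketch of such a perturbation follows. The slides do not say which function of the interpoint distances sets the standard deviation, so this version assumes a fixed fraction `scale` of each point's nearest-neighbor distance; that choice and the names are hypothetical.

```python
import math
import random


def perturb_points(points, scale, rng):
    """Jitter each 2-D point with Gaussian noise.

    Assumption: sigma = scale * distance to the point's nearest neighbor.
    """
    perturbed = []
    for i, (x, y) in enumerate(points):
        nearest = min(math.dist((x, y), q) for j, q in enumerate(points) if j != i)
        sigma = scale * nearest
        perturbed.append((rng.gauss(x, sigma), rng.gauss(y, sigma)))
    return perturbed


cities = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
print(perturb_points(cities, scale=0.1, rng=random.Random(0)))
```

Tying sigma to local spacing keeps tightly clustered points from being scattered as far as isolated ones.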
Crossover Operator
Characteristics of Operators
Perturbed points tend to stay close to original locations, hence
distances remain reasonable.
Small shifts in point position can have an effect on the MST,
hence we see many different solutions.
% Improvement Over EC
Average Improvement: 32.1%
% Improvement Over H
Average Improvement: 15.1%