Genetic Algorithms (GAs)
Exercise
Kokkoras F., Paraskevopoulos K.
International Hellenic University
In this Exercise
1. Theory (in brief)
2. Things you have to consider
3. GAs and Matlab
4. Part #1 (fitness function, variables, representation, plots)
5. Part #2 (population diversity – size – range, fitness scaling)
6. Part #3 (selection, elitism, mutation)
7. Part #4 (global vs. local minima)

Duration: 120 min
1. Theory (in brief) (5 min)

A Genetic Algorithm is an optimization technique based on the theory of evolution. Instead of searching for a solution to a problem in the "state space" (as traditional search algorithms do), a GA works in the "solution space" and builds (or better, "breeds") new, hopefully better, solutions based on existing ones. The general idea behind GAs is that we can build a better solution if we somehow combine the "good" parts of other solutions (schemata theory), just like nature does by combining the DNA of living beings. The overall idea of a GA is depicted in Figure 1 (you should refer to the theory for the full details).
Figure 1: The outline of a Genetic Algorithm
2. Things you have to consider (and be aware of) (5 min)

The first thing you must do in order to use a GA is to decide whether it is possible to automatically build solutions to your problem. For example, in the Traveling Salesman Problem, every route that passes through the cities in question is potentially a solution, although probably not the optimal one. You must be able to do that because a GA requires an initial population P of solutions.
Then you must decide what "gene" representation you will use. You have a few alternatives, such as binary, integer, double, permutation, etc., with binary and double being the most commonly used, since they are the most flexible.

After having selected the representation, you must decide, in order:

- the method to select parents from the population P (Cost Roulette Wheel, Stochastic Universal Sampling, Rank Roulette Wheel, Tournament Selection, etc.)
- the way these parents will "mate" to create descendants (too many methods to mention here – just note that your available options are a result of the representation decided earlier)
- the mutation method (optional but useful – again, options are representation dependent)
- the method you will use to populate the next generation (Pi+1) (age based, quality based, etc. – you will probably use elitism as well)
- the algorithm's termination condition (number of generations, time limit, acceptable quality threshold, improvement stall, etc. – a combination of these is commonly used)
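The decisions listed above map directly onto the steps of a generic GA loop. A minimal Python sketch (illustrative only – the tournament selection, averaging of parents, and parameter values here are simplified stand-ins, not the toolbox's implementation):

```python
import random

def genetic_algorithm(fitness, init, crossover, mutate,
                      pop_size=20, generations=100, elite_count=2,
                      crossover_fraction=0.8):
    """Minimal GA loop: selection, crossover, mutation, elitism."""
    population = [init() for _ in range(pop_size)]
    for _ in range(generations):
        # Rank the population: lower fitness is better (minimization).
        population.sort(key=fitness)
        next_gen = population[:elite_count]          # elitism
        while len(next_gen) < pop_size:
            if random.random() < crossover_fraction:
                # Tournament selection of two parents, then mate them.
                p1 = min(random.sample(population, 2), key=fitness)
                p2 = min(random.sample(population, 2), key=fitness)
                next_gen.append(crossover(p1, p2))
            else:
                # The rest of the children are produced by mutation.
                next_gen.append(mutate(random.choice(population)))
        population = next_gen
    return min(population, key=fitness)
```

Note that crossover and mutation are passed in as functions, mirroring the fact that the available operators depend on the representation chosen earlier.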
3. GAs and Matlab (10 min)

Figure 2: GAs in Matlab's Optimization Toolbox
Matlab provides an optimization toolbox that includes a GA-based solver. You start the toolbox by typing optimtool in Matlab's command line and pressing Enter. As soon as the optimization window appears, you select the solver ga – Genetic Algorithm, and you are ready to go. Matlab does not provide every method available in the literature at every step, but it does have a lot of options for fine-tuning and also provides hooks for customization. The user should program (by writing m-files) any extended functionality required.
Take your time and explore the window. If you mess up the settings, close the window and run the toolbox again before you proceed. Matlab R2008a (v.7.6) was used for this tutorial. Earlier versions are OK, as long as the proper toolbox is present and installed.
4. Part #1 (fitness function, variables, representation, plots) (15 min)

The first thing you have to do is to provide the fitness function, that is, the function that calculates the quality of each member of the population (or, in plain mathematics, the function you have to optimize). Let's use one provided by Matlab: type @rastriginsfcn in the proper field and set the Number of variables to 2.
The representation used is defined in the Options > Population section. The default selection, Double Vector, is fine.

To get an idea of what we are looking for, check the equation of this function and its plot on the right:

Ras(x,y) = 20 + x^2 + y^2 - 10(cos(2πx) + cos(2πy))

We want to find the absolute minimum, which is 0, at (0,0).
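For reference, Rastrigin's function is straightforward to evaluate yourself. A Python sketch of the same formula (the toolbox ships its own implementation as @rastriginsfcn):

```python
import math

def rastrigin(x, y):
    """Rastrigin's function: global minimum of 0 at (0, 0)."""
    return (20 + x**2 + y**2
            - 10 * (math.cos(2 * math.pi * x) + math.cos(2 * math.pi * y)))
```

The many local minima around (0, 0) are what make this function a popular stress test for optimizers.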
Note that by default, only minimization is supported. If, for example, you want to maximize the function f1(x,y), then build and minimize the following custom function: f2(x,y) = -f1(x,y).
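A tiny sketch of this negation trick (Python; f1 here is a made-up function with a known maximum, used only for illustration):

```python
# Hypothetical function we want to MAXIMIZE: peak value 5.0 at x = 2.
f1 = lambda x: 5.0 - (x - 2.0) ** 2

# The GA minimizes, so we hand it the negated function instead.
f2 = lambda x: -f1(x)

# The x that minimizes f2 is exactly the x that maximizes f1.
```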
Although you are ready to run, let's ask for some plots, so we will be able to better figure out what happens. Go to the Options section, scroll down to Plot functions, and check the Best Fitness and Distance checkboxes.
Now you are ready (the default settings for everything else are adequate). Press the Start button. The algorithm starts, the plots pop up, and soon you have the results at the bottom left of the window. The best fitness function value (the smallest one, since we minimize) and the termination condition met are printed, together with the solution (Final Point – it is very close to (0,0)). Since the method is stochastic, don't expect to be able to reproduce any result found in a different run.
Now check the two plots on the left. It is obvious that the population converges, since the average distance between individuals (solutions) is reduced as the generations pass. This distance is a measure of the diversity of a population. It is hard to avoid convergence, but keeping it low or postponing its appearance is better. Having diversity in the population allows the GA to search the solution space more thoroughly.
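The "average distance between individuals" can be made concrete with a short sketch (Python; a plain mean pairwise Euclidean distance is assumed here as the diversity measure, which may differ in detail from what the toolbox plots):

```python
import itertools, math

def average_distance(population):
    """Mean Euclidean distance over all pairs of individuals.
    0 means every individual is identical (no diversity left)."""
    pairs = list(itertools.combinations(population, 2))
    dist = lambda a, b: math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    return sum(dist(a, b) for a, b in pairs) / len(pairs)
```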
Check also the fitness value as it gradually gets smaller. This is required; it is an indication that optimization takes place. Not only was the fitness value of the best individual reduced, but the mean (average) fitness of the population was also reduced (that is, in terms of the fitness value, the whole population was improved – we have better solutions in the population at the end).
All the above together are a good indication that the GA did its job well, but we are really happy only because we know where the solution is (at (0,0), with fitness value 0). Note however that the nature of the GA prevents it from finding the exact best solution (0,0). It can get very close to this value, but landing exactly on (0,0) is hard and could happen only by luck. This is OK, since in this kind of problem (optimization) we are happy even with a good (and not the perfect) solution. If not, the hybrid function option should be used (not discussed here).
Generally speaking, getting the best results from the GA requires experimentation with the different options. Let's see how some of these affect the performance of the GA.
5. Part #2 (population diversity – size – range, fitness scaling) (35 min)
The performance of a GA is affected by the diversity of the initial population. If the average distance between individuals is large, the diversity is high; if the average distance is small, the diversity is low. You should experiment to get the right amount of diversity: if it is too high or too low, the genetic algorithm might not perform well. We will demonstrate this in the following.
By default, the Optimization Tool creates a random initial population using a creation function. You can limit this by setting the Initial range field in the Population options. Set it to [1; 1.1]. By this we actually make it harder for the GA to search equally well in the whole solution space. We do not prevent it, though: the genetic algorithm can find the solution even if it does not lie in the initial range, provided that the populations have enough diversity.
Note: The Initial range only restricts the range of the points in the initial population by specifying lower and upper bounds. Subsequent generations can contain points whose entries do not lie in the initial range. If you want to bound all the individuals in all generations to a range, then use the Lower and Upper bound fields in the Constraints panel, on the left.
Leave the rest of the settings as in Part #1, except Options > Stopping Criteria > Stall Generations, which should be set to 100. This will let the algorithm run for 100 generations, providing us with better results (and plots). Now click the Start button.
The GA returns a best fitness function value of approximately 2 and displays the plots in the figure on the right. The upper plot, which displays the best fitness at each generation, shows little progress in lowering the fitness value (black dots). The lower plot shows the average distance between individuals at each generation, which is a good measure of the diversity of a population. For this setting of initial range, there is too little diversity for the algorithm to make progress. The algorithm was trapped in a local minimum due to the initial range restriction!
Next, set Initial range to [1; 100] and run the algorithm again. The GA returns a best fitness value of approximately 3.3 and displays the following plots. This time, the genetic algorithm makes progress, but because the average distance between individuals is so large, the best individuals are far from the optimal solution. Note though that if we let the GA run for more generations (by setting Generations and Stall Generations in Stopping Criteria to 200), it will eventually find a better solution.

Note: If you try this, please restore the settings to their initial values (default and 100, respectively) before you proceed.
Finally, set Initial range to [1; 2] and run the GA. This returns a best fitness value of approximately 0.012 and displays the plots that follow. The diversity in this case is better suited to the problem, so the genetic algorithm returns a much better result than in the previous two cases.
In all the examples above, we had the Population Size (Options > Population) set to 20 (the default). This value determines the size of the population at each generation. Increasing the population size enables the genetic algorithm to search more points and thereby obtain a better result. However, the larger the population size, the longer the genetic algorithm takes to compute each generation. Note though that you should set Population Size to be at least the value of Number of variables, so that the individuals in each population span the space being searched. You can experiment to find settings for Population Size that return good results without taking a prohibitive amount of time to run.
Finally, another parameter that affects the diversity of the population (remember, it's vital to have good diversity in the population) is Fitness Scaling (in Options). If the fitness values vary too widely (Figure 3), the individuals with the lowest values (recall that we minimize) reproduce too rapidly, taking over the population pool too quickly and preventing the GA from searching other areas of the solution space. On the other hand, if the values vary only a little, all individuals have approximately the same chance of reproduction and the search will progress very slowly.
Figure 3: Raw fitness values (lower is better) vary too widely on the left. Scaled values (right) do not alter the selection advantage of the good individuals (except that now bigger is better). They just reduce the diversity we have on the left. This prevents the GA from converging too early.
Fitness Scaling adjusts the fitness values (producing the scaled values) before the selection step of the GA. This is done without changing the ranking order; that is, the best individual based on the raw fitness value remains the best in the scaled ranking as well. Only the values are changed, and thus the probability of an individual getting selected for mating by the selection procedure. This prevents the GA from converging too fast, which allows the algorithm to better search the solution space.
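A sketch of one common rank-based scheme in this spirit (Python; the 1/sqrt(rank) form is one well-known choice for rank scaling, but treat the exact formula here as illustrative, not as the toolbox's definition):

```python
import math

def rank_scale(raw_fitness):
    """Scale raw (minimization) fitness values by rank: the best
    individual gets 1/sqrt(1), the second best 1/sqrt(2), and so on.
    Ranking order is preserved, but BIGGER is now better, and the
    spread between values is compressed."""
    order = sorted(range(len(raw_fitness)), key=lambda i: raw_fitness[i])
    scaled = [0.0] * len(raw_fitness)
    for rank, i in enumerate(order, start=1):
        scaled[i] = 1.0 / math.sqrt(rank)
    return scaled
```

Because the scaled values depend only on rank, a single outlier with an extremely good raw value can no longer dominate the mating pool.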
6. Part #3 (selection, elitism, mutation) (35 min)

We continue this GA tutorial using Rastrigin's function. Use the following settings, leaving everything else at its default value (Fitness function: @rastriginsfcn, Number of Variables: 2, Initial Range: [1; 20], Plots: Best Fitness, Distance).
The Selection panel in Options controls the Selection Function, that is, how individuals are selected to become parents. Note that this mechanism works on the scaled values, as described in the previous section. Most well-known methods are available (uniform, roulette and tournament). An individual can be selected more than once as a parent, in which case it contributes its genes to more than one child.
Figure 4: Stochastic Uniform selection method. For 6 parents we step along the selection line with steps equal to 15/6.
The default selection option, Stochastic Uniform, lays out a line (Figure 4) in which each parent corresponds to a section of the line of length proportional to its scaled value. The algorithm moves along the line in steps of equal size. At each step, the algorithm allocates a parent from the section it lands on.
For example, assume a population of 4 individuals with scaled values 7, 4, 3 and 1. The individual with the scaled value of 7 is the best and should contribute its genes more than the rest. We create a line of length 1+3+4+7=15. Now, let's say that we need to select 6 individuals as parents. We step over this line in steps of 15/6 and select the individual whose section we land in (Figure 4).
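The worked example above can be coded directly (a Python sketch; the start parameter fixes the otherwise random offset of the first pointer, so that the example is reproducible):

```python
import random

def stochastic_uniform(scaled_values, n_parents, start=None):
    """Lay the individuals on a line (section length = scaled value)
    and pick n_parents with equally spaced pointers."""
    total = sum(scaled_values)
    step = total / n_parents
    if start is None:
        start = random.uniform(0, step)   # random offset of first pointer
    picks, edge, i = [], scaled_values[0], 0
    for k in range(n_parents):
        pointer = start + k * step
        while pointer >= edge:            # advance to the section we land in
            i += 1
            edge += scaled_values[i]
        picks.append(i)
    return picks
```

With scaled values 7, 4, 3 and 1, six pointers spaced 15/6 apart select the best individual several times, just as the figure suggests.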
The Reproduction panel in Options controls how the GA creates the next generation. Here you specify the amount of elitism and the fraction of the next generation's population that is generated through mating (the rest is generated by mutation). The options are:

Elite Count: the number of individuals with the best fitness values in the current generation that are guaranteed to survive to the next generation. These individuals are called elite children. The default value of Elite count is 2.
Try to solve Rastrigin's problem by changing only this parameter. Try values of 10, 3 and 1. You will get results like those depicted in Figure 5. It is obvious that you should keep this value low: 1 (or 2, depending on the population size) is OK. (Why?)
Figure 5: Elite count 10 (left), 3 (middle) and 1 (right). Too much elitism results in early convergence, which can make the search less effective.
Crossover Fraction: the fraction of individuals in the next generation, other than elite children, that are created by crossover (that is, mating). The rest are generated by mutation. A crossover fraction of 1 means that all children other than elite individuals are crossover children. A crossover fraction of 0 means that all children are mutation children. The following example shows that neither of these extremes is an effective strategy for optimizing a function.
You will now change the problem (you had better restart the optimization toolbox to have everything set to default values). You will optimize this function:

f(x1, x2, ..., x10) = |x1| + |x2| + ... + |x10|

Use the following settings:
o Fitness Function: @(x) sum(abs(x))
o Number of variables: 10
o Initial range: [-1; 1]
o Plots: Best fitness and Distance
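The same fitness function written out in Python, for reference (in the toolbox, @(x) sum(abs(x)) is an anonymous Matlab function applied to the 10-element vector x):

```python
def fitness(x):
    """f(x1..x10) = |x1| + |x2| + ... + |x10|; minimum 0 at the origin."""
    return sum(abs(xi) for xi in x)
```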
Run the example with the default value of 0.8 for Crossover fraction, in the Options > Reproduction panel. This returns the best fitness value of approximately 0.25 and displays plots
like those in Figure 6 (left). Note though that for another fitness function, a different setting for Crossover fraction might yield the best result.

Figure 6: Plots for Crossover fraction set to 0.8 (left) and 1 (right).
To see how the genetic algorithm performs when there is no mutation, set Crossover fraction to 1.0 and click Start. This returns the best fitness value of approximately 1.1 and displays plots similar to the one in Figure 6 (right). In this case, the algorithm selects genes from the individuals in the initial population and recombines them. The algorithm cannot create any new genes because there is no mutation. The algorithm generates the best individual that it can using these genes at generation number ~15, where the best fitness plot becomes level. After this, it creates new copies of the best individual, which are then selected for the next generation. By generation number ~19, all individuals in the population are the same, namely the best individual. When this occurs, the average distance between individuals is 0. Since the algorithm cannot improve the best fitness value after generation ~15, it terminates because the average change in the fitness function is less than what is set in the termination conditions.
To see how the genetic algorithm performs when there is no crossover, set Crossover fraction to 0 and click Start. This returns the best fitness value of approximately 2.7 and displays plots like the one on the right. In this case, all children are generated through mutation. The random changes that the algorithm applies never improve on the fitness value of the best individual of the first generation. While the algorithm improves the individual genes of other individuals, as you can see in the upper plot by the decrease in the mean value of the fitness function, these improved genes are never combined with the genes of the best individual because there is no crossover. As a result, the best fitness plot is level and the algorithm stalls at generation number 50.
7. Part #4 (global vs. local minima) (15 min)
Optimization algorithms sometimes return a local minimum instead of the global one, that is, a point where the function value is smaller than at nearby points, but possibly greater than at a distant point in the solution space. The genetic algorithm can sometimes overcome this deficiency with the right settings. As an example, consider the following function, which has the plot depicted on the right:
f(x) = -exp(-(x/20)^2)               if x ≤ 20
f(x) = -exp(-1) + (x - 20)(x - 22)   if x > 20

The function has two local minima, one at x = 0, where the function value is -1, and the other at x = 21, where the function value is about -1.37. Since the latter value is smaller, the global minimum occurs at x = 21.
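A Python sketch of a two_min function with the stated behaviour (reconstructed to match the two minima described in the text; the authoritative definition is the M-file shown in the picture):

```python
import math

def two_min(x):
    """Two local minima: f(0) = -1 and f(21) ≈ -1.37 (the global one)."""
    if x <= 20:
        return -math.exp(-(x / 20.0) ** 2)
    return -math.exp(-1) + (x - 20) * (x - 22)
```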
Let us now see how we can define custom fitness functions. Go to Matlab and select File > New > M-File. Define the function in the editor window as shown in the picture. Then save the file to your desktop using the suggested filename two_min (do not change it!). Now, in Matlab's main toolbar, set the current directory to the Desktop. This way, your M-file will be visible to Matlab.
In the Optimization Toolbox, set Fitness function to @two_min, Number of variables to 1, Stopping criteria > Stall Generations to 100, and click Start. The genetic algorithm returns a point very close to the local minimum at x = 0.
The problem here is the default initial range of [0; 1] (in the Options > Population panel). This range is not large enough to explore points near the global minimum at x = 21. One way to make the GA explore a wider range of points (that is, to increase the diversity of the populations) is to increase the Initial range. It does not have to include the point x = 21, but it must be large enough that the algorithm will be able to generate individuals near x = 21.
So, set Initial range to [0; 15] and click Start once again. Now the GA returns a point very close to 21.