Chapter 7. Evolutionary computation
In this chapter, we consider the field of evolutionary computation, including genetic
algorithms, evolution strategies and genetic programming, and their applications to
machine learning.
7.1 Introduction, or can evolution be intelligent?
Evolutionary computation simulates evolution on a computer. The result of such a simulation is a series of optimization algorithms, usually based on a simple set of rules. Optimization iteratively improves the quality of solutions until an optimal, or at least feasible, solution is found. All these techniques simulate evolution by using the processes of selection, mutation and reproduction.
7.2 Simulation of natural evolution
Evolution can be seen as a process leading to the maintenance or increase of a population’s ability to survive and reproduce in a specific environment. This ability is called evolutionary fitness. The goal of evolution is to generate a population of individuals with increasing fitness.
But how is a population with increasing fitness generated? P.220 shows a simple explanation based on a population of rabbits. (Faster rabbits cope better with the challenge posed by their environment, namely foxes.)
Can we simulate the process of natural evolution in a computer?
We will start with the genetic algorithm (GA), as most of the other evolutionary algorithms can be viewed as variations of GAs.
Holland’s GA (1975) can be represented by a sequence of procedural steps for moving from one population of artificial ‘chromosomes’ to a new population. Each gene in a chromosome is represented by 0 or 1, as shown in Figure 7.1.
Two mechanisms link a GA to the problem it is solving: encoding and evaluation. Encoding is carried out by representing chromosomes as strings of ones and zeros. An evaluation function is used to measure the chromosome’s performance, or fitness.
7.3 Genetic algorithms
A basic GA can be represented as in Figure 7.2. Its major steps are:
Step 1: Represent the problem variable domain as a chromosome of a fixed length, and choose the size of the chromosome population $N$, the crossover probability $p_c$ and the mutation probability $p_m$.
Step 2: Define the fitness function to measure the performance of an individual chromosome.
Step 3: Randomly generate an initial population of chromosomes of size $N$: $x_1, x_2, \ldots, x_N$
Step 4: Calculate the fitness of each individual chromosome: $f(x_1), f(x_2), \ldots, f(x_N)$
Step 5: Select a pair of chromosomes for mating from the current population.
Step 6: Create a pair of offspring chromosomes by applying the genetic operators, crossover and mutation.
Step 7: Place the created offspring chromosomes in the new population.
Step 8: Repeat from Step 5 until the size of the new chromosome population becomes equal to the size of the initial population, $N$.
Step 9: Replace the initial (parent) chromosome population with the new (offspring) population.
Step 10: Go to Step 4, and repeat the process until the termination criterion is satisfied.
Each iteration is called a generation. The entire set of generations is called a run.
Are any conventional termination
criteria used in genetic algorithms?
A common practice is to terminate a GA after a specified number of generations
and then examine the best chromosomes in the population. If no satisfactory solution
is found, the GA is restarted.
A simple example: find the maximum value of the function $(15x - x^2)$, where parameter $x$ varies between 0 and 15. Assume $x$ takes only integer values; a chromosome can then be built with only four bits (genes), from 0000 for $x = 0$ to 1111 for $x = 15$.
Suppose the size of the chromosome population $N$ is 6, the crossover probability $p_c = 0.7$, and the mutation probability $p_m = 0.001$. The fitness function is:
$f(x) = 15x - x^2$
The initial population is shown in Table 7.1, the fitness function is illustrated in Figure 7.3(a), and the final solutions are shown in Figure 7.3(b).
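The ten steps and the example parameters can be tied together in a short sketch. It is only one possible reading of the procedure: the number of generations, the random seed, the use of `random.choices` for fitness-proportionate selection, and the tracking of a best-so-far chromosome are all assumptions that do not come from the text.

```python
import random

N, PC, PM, BITS = 6, 0.7, 0.001, 4        # population size, p_c, p_m, genes

def fitness(chrom):
    x = int(chrom, 2)                      # decode the bit string to an integer
    return 15 * x - x * x

def mutate(chrom, rng):
    # flip each gene independently with probability PM
    return "".join("10"[int(g)] if rng.random() < PM else g for g in chrom)

def run_ga(generations=50, seed=0):        # both defaults are assumed values
    rng = random.Random(seed)
    pop = ["".join(rng.choice("01") for _ in range(BITS)) for _ in range(N)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        # selection weights proportional to fitness; the small constant keeps
        # the weights valid even if every chromosome has fitness 0
        weights = [fitness(c) + 1e-9 for c in pop]
        new_pop = []
        while len(new_pop) < N:
            a, b = rng.choices(pop, weights=weights, k=2)
            if rng.random() < PC:          # one-point crossover
                cut = rng.randint(1, BITS - 1)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            new_pop += [mutate(a, rng), mutate(b, rng)]
        pop = new_pop[:N]                  # offspring replace the parents
        best = max([best] + pop, key=fitness)
    return best, fitness(best)
```

Calling `run_ga()` returns a 4-bit chromosome and its fitness. The largest possible fitness is 56, reached at x = 7 (0111) and x = 8 (1000), although a population of six is not guaranteed to find it in every run.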
How can we keep the size of the chromosome population constant and at the same time improve its average fitness?
One of the most commonly used chromosome selection techniques is roulette wheel selection (shown in Figure 7.4). Each chromosome is given a segment of the wheel proportional to its share of the total fitness. To select a chromosome for mating, a random number is generated in the interval [0, 100], and the chromosome whose segment spans the random number is selected.
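Roulette wheel selection can be sketched as follows. The function name and the `rng` parameter are hypothetical, and the sketch assumes non-negative fitness values with a positive total.

```python
import random

def roulette_select(population, fitnesses, rng=random):
    """Pick one chromosome with probability proportional to its fitness."""
    total = sum(fitnesses)                 # assumed to be greater than zero
    spin = rng.uniform(0, 100)             # random number in [0, 100]
    upper = 0.0
    for chrom, fit in zip(population, fitnesses):
        upper += 100.0 * fit / total       # segment width = fitness share in percent
        if spin < upper:
            return chrom
    return population[-1]                  # guard against floating-point round-off
```

With fitness values 1 and 3, for instance, the first chromosome owns the segment from 0 to 25 and the second the segment from 25 to 100, so the second is selected about three times as often.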
How does the crossover operator work?
First, the crossover operator randomly chooses a crossover point where two parent chromosomes ‘break’, and then exchanges the chromosome parts after that point. As a result, two new offspring are created (shown in Figure 7.5).
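A minimal sketch of one-point crossover on bit strings is given below. Returning the parents unchanged when no crossover takes place is an assumption of this sketch, as are the function and parameter names.

```python
import random

def one_point_crossover(parent1, parent2, pc=0.7, rng=random):
    """Return two offspring; with probability 1 - pc the parents are copied."""
    if rng.random() >= pc:
        return parent1, parent2
    point = rng.randint(1, len(parent1) - 1)   # break point between two genes
    return (parent1[:point] + parent2[point:],
            parent2[:point] + parent1[point:])
```

Because the two tails are only exchanged, every gene position in the offspring still carries the same pair of values that the parents had at that position.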
After selection and crossover, the average fitness of the chromosome population has improved and gone from 36 to 42.
What does mutation represent?
Mutation, which is rare in nature, represents a change in the gene. It may lead to a significant improvement in fitness, but more often has rather harmful results. Its role is to provide a guarantee that the search algorithm is not trapped on a local optimum. The sequence of selection and crossover operations may stagnate at any homogeneous set of solutions. Mutation is equivalent to a random search, and aids us in avoiding loss of genetic diversity.
How does the mutation operator work?
The mutation operator flips a randomly selected gene in a chromosome. For example, the chromosome X1’ might be mutated in its second gene, and the chromosome X2 in its third gene, as shown in Figure 7.5 above.
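Bit-flip mutation can be sketched in a few lines. Applying the mutation probability independently to each gene is an assumption about how the operator is realized; the function name is hypothetical.

```python
import random

def mutate(chromosome, pm=0.001, rng=random):
    """Flip each gene independently with probability pm."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[g] if rng.random() < pm else g for g in chromosome)
```

With `pm=0.001` most calls return the chromosome unchanged, which matches the idea that mutation is a rare event.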
To find the maximum of the ‘peak’ function of two variables, where parameters x and y vary between -3 and 3, the first step is to represent the problem variables as a chromosome. In other words, we represent parameters x and y as a concatenated binary string.
How is decoding done?
First, a chromosome, that is, a string of 16 bits, is partitioned into two 8-bit strings. Then they are converted from binary to decimal.
Each variable is handled by 8 bits, that is, by integers in the range from 0 to $(2^8 - 1)$, which is mapped to the interval from -3 to 3 (range = 6). One integer step therefore corresponds to $6 / (256 - 1) = 0.0235294$, so the decimal value 138, for example, decodes to:
$x = (138)_{10} \times 0.0235294 - 3 = 0.2470588$
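The decoding rule can be checked with a small sketch. The 16-bit string used in the usage note is an illustrative assumption; only its first half, 10001010 (decimal 138), corresponds to the value worked out in the text.

```python
def decode(chromosome, lo=-3.0, hi=3.0, bits=8):
    """Split a 2*bits chromosome into two integers and map each to [lo, hi]."""
    step = (hi - lo) / (2 ** bits - 1)       # 6 / 255 = 0.0235294...
    x_int = int(chromosome[:bits], 2)        # first 8 bits encode x
    y_int = int(chromosome[bits:], 2)        # last 8 bits encode y
    return x_int * step + lo, y_int * step + lo
```

For example, `decode("1000101000111011")` gives x close to 0.2470588, in agreement with the calculation above, and a y value decoded from the second half, 00111011 (decimal 59).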
Figure 7.6(a) shows the initial locations of the chromosomes on the surface and contour plot of the ‘peak’ function. After the first generation, the population begins to converge on the peak containing the maximum, as shown in Figure 7.6(b). Figure 7.6(c) shows the final chromosome generation. If we increase the mutation rate to 0.01 and rerun the GA, the population might now converge on the chromosomes shown in Figure 7.6(d).
What is a performance graph?
Figure 7.7(a) and (b) show plots of the best and average values of the fitness function across 100 generations. The mutation operator allows a GA to explore the landscape in a random manner. Mutation may lead to significant improvement in the population fitness, but more often decreases it.
To ensure diversity and at the same time to reduce the harmful effects of mutation, we can increase the size of the chromosome population. Figure 7.8 shows the performance graphs for 20 generations of 60 chromosomes.
7.4 Why genetic algorithms work
GA techniques have a solid theoretical foundation, which is based on the Schema Theorem. A schema is a set of bit strings of ones, zeros and asterisks, where each asterisk can assume either value 1 or 0. For example, the schema 1**0 stands for a set of 4-bit strings. Each string in this set begins with 1 and ends with 0. These strings are called instances of the schema.
What is the relationship between a schema and a chromosome?
For example, the schema H = 1**0 matches the following set of 4-bit chromosomes:
1110
1100
1010
1000
Each chromosome here begins with 1 and ends with 0. These chromosomes are said to be instances of the schema H.
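The matching rule between a schema and a chromosome can be expressed directly in code. The helper names `matches` and `instances` are hypothetical.

```python
from itertools import product

def matches(schema, chromosome):
    """True if every defined bit of the schema agrees with the chromosome."""
    return len(schema) == len(chromosome) and all(
        s == "*" or s == c for s, c in zip(schema, chromosome))

def instances(schema):
    """List every bit string obtained by filling each asterisk with 0 or 1."""
    options = [("0", "1") if s == "*" else (s,) for s in schema]
    return ["".join(bits) for bits in product(*options)]
```

A schema with k asterisks has 2^k instances, so `instances("1**0")` lists the four chromosomes given above.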
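The number of defined bits (non-asterisks) in a schema is called the order. If GAs use a technique that makes the probability of reproduction proportional to chromosome fitness, then according to the Schema Theorem, we can predict the presence of a given schema in the next chromosome generation. Let $m_H(i)$ be the number of instances of the schema $H$ in generation $i$, and $f_H(i)$ be the average fitness of these instances. We want to calculate the number of instances in the next generation, $m_H(i+1)$. The expected number of offspring of a chromosome $x$ in the next generation is:
$$m_x(i+1) = \frac{f_x(i)}{\bar{f}(i)}$$
where $f_x(i)$ is the fitness of chromosome $x$ and $\bar{f}(i)$ is the average fitness of the $N$ chromosomes in generation $i$. Summing over the instances of $H$, the expected number of instances in the next generation is:
$$m_H(i+1) = \frac{\sum_{x \in H} f_x(i)}{\bar{f}(i)}$$
Since the average fitness of the schema $H$ is
$$f_H(i) = \frac{\sum_{x \in H} f_x(i)}{m_H(i)}$$
we obtain
$$m_H(i+1) = \frac{f_H(i)}{\bar{f}(i)}\, m_H(i)$$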
Thus, a schema with above-average fitness will indeed tend to occur more frequently in the next generation of chromosomes, and a schema with below-average fitness will tend to occur less frequently.
How about effects caused by crossover and mutation?
Crossover and mutation can both create and destroy instances of a schema. Here we will consider only destructive effects, that is, effects that decrease the number of instances of the schema.
What is the defining length of a schema?
The distance between the outermost defined bits of a schema is called the defining length. For example, the defining length of ****1000 is 3, and that of *1**00 is 4.
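Both schema measures, the order and the defining length, are easy to compute. The following sketch uses hypothetical function names and represents a schema as a string over 0, 1 and *.

```python
def order(schema):
    """Number of defined (non-asterisk) bits."""
    return sum(1 for s in schema if s != "*")

def defining_length(schema):
    """Distance between the outermost defined bits (0 if none are defined)."""
    defined = [i for i, s in enumerate(schema) if s != "*"]
    return defined[-1] - defined[0] if defined else 0
```

Applied to the two schemata above, `defining_length` returns 3 and 4, while their orders are 4 and 3 respectively.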
If crossover takes place within the defining length, the schema H can be destroyed and offspring that are not instances of H can be created. The probability that the schema H will survive crossover can be defined as:
$$P_H^{(c)} = 1 - p_c \left( \frac{l_d}{l - 1} \right)$$
where $p_c$ is the crossover probability, and $l$ and $l_d$ are, respectively, the length and the defining length of the schema H. Clearly, the probability of survival under crossover is higher for schemata with a short defining length than for those with a long one.
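Let $p_m$ be the mutation probability for any bit of the schema H, and $n$ be the order of the schema H. Then $(1 - p_m)$ represents the probability that a bit will not be mutated, and thus the probability that the schema will survive mutation is:
$$P_H^{(m)} = (1 - p_m)^n$$
It is also clear that the probability of survival under mutation is higher for low-order schemata than for high-order ones. Taking reproduction together with the destructive effects of crossover and mutation, we obtain:
$$m_H(i+1) \ge \frac{f_H(i)}{\bar{f}(i)}\, m_H(i) \left[ 1 - p_c \left( \frac{l_d}{l - 1} \right) \right] (1 - p_m)^n$$
where $\bar{f}(i)$ is the average fitness of generation $i$. The expression is a lower bound because only destructive effects are counted. This result is known as the Schema Theorem.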
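To get a feel for the Schema Theorem, the estimate can be evaluated numerically. The function below is a sketch with hypothetical parameter names; it combines the reproduction ratio with the two survival probabilities described above, and the numbers in the usage note are made up for illustration.

```python
def expected_instances(m_h, f_h, f_avg, p_c, p_m, length, l_d, n):
    """Schema Theorem estimate of the number of instances of H in generation i+1."""
    reproduction = m_h * f_h / f_avg                    # fitness-proportionate growth
    survive_crossover = 1.0 - p_c * l_d / (length - 1)  # crossover misses the schema
    survive_mutation = (1.0 - p_m) ** n                 # no defined bit is flipped
    return reproduction * survive_crossover * survive_mutation
```

For instance, with 3 instances of a 4-bit schema of order 2 and defining length 3, schema fitness 50 against a population average of 40, and p_c = 0.7, p_m = 0.001, the estimate is about 1.12 instances. The long defining length makes crossover the dominant destructive effect here.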
Genetic algorithms are a very powerful tool, but need to be applied intelligently. For example, coding the problem as a bit string may change the nature of the problem being investigated. In other words, there is a danger that the coded representation becomes a problem that is different from the one we wanted to solve.
7.5 Case study: maintenance scheduling with genetic algorithms
Scheduling problems are complex and difficult to solve.
Why are scheduling problems so difficult?
First, scheduling belongs to the class of NP-complete problems. Second, scheduling problems involve a competition for limited resources; as a result, they are complicated by many constraints. The key to the success of the GA lies in defining a fitness function that incorporates all these constraints. As an example, we discuss maintenance scheduling in modern power systems.
A typical process of GA development includes the following steps:
1. Specify the problem, define constraints and optimum criteria.
2. Represent the problem domain as a chromosome.
3. Define a fitness function to evaluate the chromosome’s performance.
4. Construct the genetic operators.
5. Run the GA and tune its parameters.
Step 1: Specify the problem, define constraints and optimum criteria.
This is probably the most important step in developing a GA. The purpose of maintenance scheduling is to find the sequence of outages of power units over a given period of time such that the security of the power system is maximized.
The security margin is determined by the system’s net reserve. The net reserve is defined as the total installed generating capacity of the system minus the power lost due to scheduled outages and minus the maximum load forecast during the maintenance period. For instance, if the total capacity is 150 MW (= 150,000 kW), a unit of 20 MW is scheduled for maintenance, and the maximum load is predicted to be 100 MW, then the net reserve will be 30 MW (= 150 - 20 - 100). Maintenance scheduling must ensure that sufficient net reserve is provided for secure power supply during any maintenance period.
Suppose there are seven units to be maintained in four equal intervals. The maximum loads expected during these intervals are 80, 90, 65 and 70 MW. The unit capacities and their maintenance requirements are presented in Table 7.2.
The net reserve of the power system must be greater than or equal to zero at any interval. The optimum criterion here is that the net reserve must be at the maximum during any maintenance period.
Step 2: Represent the problem domain as a chromosome.
Our job is to represent a complete schedule as a chromosome of a fixed length. The schedule of a single unit can be represented as a 4-bit string, where each bit corresponds to a maintenance interval (1 if the unit is maintained in that interval, 0 otherwise). Thus, a complete maintenance schedule for our problem, with seven units, can be represented as a 28-bit chromosome (see Figure 7.9).
A better approach is to change the chromosome syntax. The smallest indivisible part of our chromosome is a 4-bit string. So, we produce a pool of 4-bit genes for each unit, and a chromosome takes one gene from each unit’s pool.
Step 3: Define a fitness function to evaluate the chromosome’s performance.
For our problem we apply a fairly simple function concerned with constraint violations and the net reserve at each interval. For the schedule in Figure 7.9, we compute the net reserve at each interval. If the net reserve at any interval is negative, the schedule is illegal.
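The evaluation of a 28-bit schedule might be sketched as follows. The interval loads come from the text, but the unit capacities are hypothetical stand-ins for Table 7.2 (chosen to sum to 150 MW), and scoring a legal schedule by its smallest net reserve is an assumption about the fitness function rather than the definition used in the case study.

```python
LOADS = [80, 90, 65, 70]                     # MW per interval, from the text
CAPACITIES = [20, 15, 35, 40, 15, 15, 10]    # MW per unit, hypothetical values

def net_reserves(chromosome):
    """chromosome: 28-bit string, 4 bits per unit, 1 = unit under maintenance."""
    total = sum(CAPACITIES)
    reserves = []
    for t, load in enumerate(LOADS):
        lost = sum(cap for u, cap in enumerate(CAPACITIES)
                   if chromosome[4 * u + t] == "1")
        reserves.append(total - lost - load)
    return reserves

def fitness(chromosome):
    """0 for an illegal schedule, otherwise the lowest net reserve (assumed rule)."""
    lowest = min(net_reserves(chromosome))
    return 0 if lowest < 0 else lowest
```

With these assumed capacities, taking the 35 MW and 40 MW units out of service together in the second interval leaves a net reserve of 150 - 75 - 90 = -15 MW, so that schedule is illegal and scores 0.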
Step 4: Construct the genetic operators.
Constructing genetic operators is challenging, and we must experiment to make crossover and mutation work correctly. Figure 7.10(a) shows an example of the crossover application during a run of the GA. The mutation operator randomly selects a 4-bit gene in a chromosome and replaces it with a gene randomly selected from the corresponding pool (see Figure 7.10(b)).
Step 5: Run the GA and tune its parameters.
First, we must choose the population size and the number of generations to be run. What are the best parameters? Only experimentation can give us the answer.
Figure 7.11(a) presents performance graphs and the best schedule created by 50 generations of 20 chromosomes, and Figure 7.11(b) those for 100 generations.
Figure 7.12(a) presents performance graphs for a population of 100 chromosomes with mutation rate 0.001, and Figure 7.12(b) those for mutation rate 0.01.
7.6 Evolution strategies
Another approach to simulating natural evolution was proposed in Germany in the early 1960s for solving technical optimization problems. Unlike GAs, evolution strategies use only a mutation operator.
How do we implement an evolution strategy?
In its simplest form, termed a (1 + 1)-evolution strategy, one parent generates one offspring per generation by applying normally distributed mutation. The steps are:
Step 1: Choose the number of parameters $N$ to represent the problem, and then determine a feasible range for each parameter:
$\{x_{1\min}, x_{1\max}\}, \{x_{2\min}, x_{2\max}\}, \{x_{3\min}, x_{3\max}\}, \ldots, \{x_{N\min}, x_{N\max}\}$
Define a standard deviation for each parameter and the function to be optimized.
Step 2: Randomly select an initial value for each parameter from the respective feasible range: $x_1, x_2, \ldots, x_N$
Step 3: Calculate the solution associated with the parent parameters: $X = f(x_1, x_2, \ldots, x_N)$
Step 4: Create a new (offspring) parameter by adding a normally distributed random variable $a$ with mean zero and pre-selected deviation $\delta$ to each parent parameter:
$x'_i = x_i + a(0, \delta), \quad i = 1, 2, \ldots, N$
Step 5: Calculate the solution associated with the offspring parameters: $X' = f(x'_1, x'_2, \ldots, x'_N)$
Step 6: Compare the solutions. If the solution for the offspring is better than that for the parent, replace the parent with the offspring. Otherwise, keep the parent parameters.
Step 7: Go to Step 4, and repeat the process until a satisfactory solution is reached.
The (1 + 1)-evolution strategy can be represented as a block diagram, shown in Figure 7.13.
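The seven steps translate almost line by line into code. The sketch below maximizes a function `f`. The fixed iteration budget, the seed, and the clamping of offspring to the feasible range are assumptions of this sketch, and all names are hypothetical.

```python
import random

def one_plus_one_es(f, bounds, sigmas, iterations=1000, seed=0):
    """Maximize f with a (1 + 1)-evolution strategy; returns (parameters, value)."""
    rng = random.Random(seed)
    parent = [rng.uniform(lo, hi) for lo, hi in bounds]        # Step 2
    parent_value = f(*parent)                                  # Step 3
    for _ in range(iterations):                                # Step 7
        # Step 4: add zero-mean Gaussian noise to every parameter at once
        child = [x + rng.gauss(0.0, s) for x, s in zip(parent, sigmas)]
        # assumption: offspring are clamped to the feasible range
        child = [min(max(x, lo), hi) for x, (lo, hi) in zip(child, bounds)]
        child_value = f(*child)                                # Step 5
        if child_value > parent_value:                         # Step 6
            parent, parent_value = child, child_value
    return parent, parent_value
```

A call such as `one_plus_one_es(lambda x, y: -(x - 1) ** 2 - (y + 2) ** 2, [(-3, 3), (-3, 3)], [0.3, 0.3])` searches for the maximum of a simple quadratic. Because the parent is replaced only by a better offspring, the returned value never falls below that of the initial random parent.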
Why do we vary all the parameters simultaneously when generating a new solution?
An evolution strategy here reflects the nature of a chromosome. In fact, a single gene may simultaneously affect several characteristics of the living organism. On the other hand, a single characteristic of an individual may be determined by the simultaneous interactions of several genes. Experiments also suggest that the simplest version of evolution strategies, which uses a single-parent, single-offspring search, works best.
What are the differences between genetic algorithms and evolution strategies?
Genetic algorithms use crossover and mutation, whereas evolution strategies use only mutation. In addition, when we use an evolution strategy we do not need to represent the problem in a coded form.
Which method works best?
GAs are capable of more general applications, but the hardest part of applying a GA is coding the problem. Which method works best is application-dependent.
7.7 Genetic programming
Genetic programming offers a solution through the evolution of computer programs by methods of natural selection. In fact, genetic programming is an extension of the conventional genetic algorithm, but the goal of genetic programming is not just to evolve a bit-string representation of some problem but the computer code that solves the problem.
How does genetic programming work?
According to Koza, genetic programming searches the space of possible computer programs for a program that is highly fit for solving the problem at hand. The programming language used should permit a computer program to be manipulated as data and newly created data to be executed as a program. For these reasons, LISP was chosen as the main language for genetic programming.
What is LISP?
LISP, which was developed by John McCarthy in the late 1950s, has become one of the standard languages for artificial intelligence. LISP has a highly symbol-oriented structure. Its basic structures are atoms and lists. For example, in the list (- (* A B) C), the atoms are A, B and C, and the functions are - and *.
Both atoms and lists are called symbolic expressions, or S-expressions. Figure 7.14 shows the tree corresponding to the S-expression (- (* A B) C).
This tree has five points, each of which represents either a function or a terminal.
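The idea of a program that is also data can be imitated in other languages. As a rough sketch, the S-expression (- (* A B) C) is written below as a nested Python tuple. The helper names and the choice of Python in place of LISP are assumptions made for illustration.

```python
import operator

FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(expr, env):
    """Evaluate a nested-tuple S-expression; env maps terminals to values."""
    if isinstance(expr, tuple):
        func, *args = expr
        return FUNCTIONS[func](*(evaluate(a, env) for a in args))
    return env[expr]

def count_points(expr):
    """Count the functions and terminals in the expression tree."""
    if isinstance(expr, tuple):
        return 1 + sum(count_points(a) for a in expr[1:])
    return 1

tree = ("-", ("*", "A", "B"), "C")        # (- (* A B) C)
```

With A = 2, B = 3 and C = 4 the expression evaluates to 2 * 3 - 4 = 2, and `count_points(tree)` confirms the five points: two functions and three terminals.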
How do we apply genetic programming to a problem?
Before applying genetic programming to a problem, we must accomplish five preparatory steps:
1. Determine the set of terminals.
2. Select the set of primitive functions.
3. Define the fitness function.
4. Decide on the parameters for controlling the run.
5. Choose the method for designating a result of the run.
The Pythagorean Theorem helps us to illustrate these preparatory steps, and demonstrate the potential of genetic programming. The fitness cases for the Pythagorean Theorem are represented by the samples of right triangles in Table 7.3.
To find the relationship between a, b and c, the steps are:
Step 1: Determine the set of terminals. In our example, the terminals are the inputs a and b.
Step 2: Select the set of primitive functions. We use the four standard arithmetic operations, +, -, * and /, and one mathematical function, sqrt.
Step 3: Define the fitness function. The fitness of a computer program can be measured by the error between the actual result produced by the program and the correct result given by the fitness case.
Step 4: Decide on the parameters for controlling the run. These include the population size and the maximum number of generations to be run.
Step 5: Choose the method for designating a result of the run. We designate the best-so-far generated program as the result of a run.
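The fitness measure of Step 3 can be sketched as a sum of absolute errors over the fitness cases. The triangles below are well-known integer right triangles used as stand-ins, since the actual samples of Table 7.3 are not reproduced here. Treating a program that fails on any case as infinitely bad is an assumption of this sketch.

```python
import math

# (a, b, c) samples of right triangles; assumed stand-ins for Table 7.3
FITNESS_CASES = [(3, 4, 5), (5, 12, 13), (8, 15, 17), (7, 24, 25)]

def total_error(program, cases=FITNESS_CASES):
    """Sum of |program(a, b) - c| over all fitness cases; lower is fitter."""
    error = 0.0
    for a, b, c in cases:
        try:
            error += abs(program(a, b) - c)
        except (ValueError, ZeroDivisionError):
            return math.inf                # program failed on this case
    return error
```

A candidate such as `lambda a, b: a + b` accumulates an error of 18 on these cases, while `lambda a, b: math.sqrt(a * a + b * b)` reaches an error of zero.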
Just as a fitter chromosome is more likely to be selected for reproduction, so a fitter computer program is more likely to survive by copying itself into the next generation.
Is the crossover operator capable of operating on computer programs?
Two offspring programs are composed by recombining randomly chosen parts of their parents. For example, crossover of the two parent S-expressions
(/ (- (sqrt (+ (* a a) (- a b))) a) (* a b)) and (+ (- (sqrt (- (* b b) a)) b) (sqrt (/ a b)))
is shown in Figure 7.15.
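Subtree crossover can be sketched on nested tuples that mirror the S-expressions above. Choosing the crossover point uniformly among all points of each parent is an assumption, and the helper names are hypothetical.

```python
import random

def paths(tree, prefix=()):
    """Yield the index path of every point (function or terminal) in the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def get(tree, path):
    """Return the subtree found by following the index path."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path swapped for new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(p1, p2, rng=random):
    """Exchange one randomly chosen subtree between the two parents."""
    a = rng.choice(list(paths(p1)))
    b = rng.choice(list(paths(p2)))
    return replace(p1, a, get(p2, b)), replace(p2, b, get(p1, a))
```

Unlike bit-string crossover, the offspring generally differ in size from their parents, although the total number of points in the pair is preserved.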
Is mutation used in genetic programming?
A mutation operator can randomly change any function or any terminal in the LISP S-expression. Under mutation, a function can only be replaced by a function and a terminal can only be replaced by a terminal. Figure 7.16 explains the basic concept of mutation in genetic programming.
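The rule that functions replace functions and terminals replace terminals can be sketched on nested tuples. Restricting a replacement function to the same number of arguments and applying a per-point mutation rate are assumptions that keep the mutated expression well formed. All names are hypothetical.

```python
import random

FUNCTIONS = {"+": 2, "-": 2, "*": 2, "/": 2, "sqrt": 1}   # name -> number of arguments
TERMINALS = ["a", "b"]

def mutate(tree, rate=0.1, rng=random):
    """Visit every point; with probability rate swap it for one of the same kind."""
    if isinstance(tree, tuple):
        name, *args = tree
        if rng.random() < rate:
            # a function may only become another function of the same arity
            same_arity = [f for f, n in FUNCTIONS.items() if n == len(args)]
            name = rng.choice(same_arity)
        return (name, *(mutate(a, rate, rng) for a in args))
    # a terminal may only become another terminal
    return rng.choice(TERMINALS) if rng.random() < rate else tree
```

Since a replacement is drawn from the whole set, it can coincide with the original symbol. The shape of the tree never changes under this operator.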
The steps executed in genetic programming are shown in Figure 7.17.
Figure 7.18 shows the fitness history of the best S-expression in a population of 500 computer programs for the Pythagorean Theorem.
What are the main advantages of genetic programming compared to genetic algorithms?
The fundamental difficulty of GAs lies in the problem representation, that is, in the fixed-length coding. A poor representation limits the power of a GA. Genetic programming uses high-level building blocks of variable length. Their size and complexity can change during breeding. However, when genetic programming is scaled up to larger problems, extensive computer run times may be needed.
7.8 Summary