# Genetic Algorithms Without Parameters

Oct 23, 2013

1.2. Review.

Holland [28] introduced the Genetic Algorithm (GA) as a method intended to be efficient, easy to use, and applicable to a wide range of optimization problems. The performance of a GA applied to a given optimization problem is affected by a number of factors. One of the most important is the strategy for setting its parameters, such as the crossover rate, mutation rate, and population size. The parameter values are problem dependent; therefore, the setting of these parameters must be chosen carefully for a given problem. Finding parameter settings that work well for a particular problem is not a trivial task, and it is time consuming.

Two methods are used to find the best parameter values: parameter tuning and parameter control. Parameter tuning involves finding a good value for each parameter before the GA run and fixing it for all generations and all chromosomes in the population. De Jong [16] empirically found static parameter values that were good for the classes of test functions he used (the test functions are numeric problems). The values he found were: population size equal to 50, probability of crossover equal to 0.6, and probability of mutation equal to 0.001. Grefenstette [20] used a GA to find a set of static parameter values on the same test functions used by De Jong [16]. The values he obtained were: population size equal to 30, crossover probability equal to 0.95, and mutation probability equal to 0.01. These parameter values give reasonable performance for the studied class of test functions.

The second method of finding a set of parameter values suitable for a given optimization problem is parameter control. In this method one starts with certain initial parameter values. These values are then changed either adaptively, using feedback information during the GA run, or according to a preset formula. Parameter control was realized early by Rechenberg [30] in his 1/5 success rule. According to this rule the ratio of successful mutations to all mutations should be 1/5 [30]. If the ratio is greater than 1/5 the mutation step size should be increased; if it is less than 1/5 the mutation step size should be decreased [30].
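As an illustration of feedback-driven control, the following minimal sketch applies the 1/5 success rule in its usual statement: the step size grows when more than one in five mutations succeed and shrinks when fewer do. The function name `one_fifth_rule` and the adjustment constant 0.85 are assumptions made for this example; they are not taken from the text.

```python
def one_fifth_rule(sigma, successes, trials, factor=0.85):
    """Return an updated mutation step size under the 1/5 success rule.

    sigma     -- current mutation step size
    successes -- number of mutations that improved the parent
    trials    -- total number of mutations observed
    factor    -- assumed adjustment constant in (0, 1); hypothetical value
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    ratio = successes / trials
    if ratio > 0.2:
        return sigma / factor  # many successes: take larger steps
    if ratio < 0.2:
        return sigma * factor  # few successes: take smaller steps
    return sigma               # exactly 1/5: leave the step size alone
```

In practice the success ratio would be measured over a window of recent mutations and the rule applied once per window.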

Eiben, Hinterding, and Michalewicz [17] classified parameter control into three classes: deterministic, adaptive, and self-adaptive, which are described in more detail below.

In the deterministic approach the parameters are changed according to a heuristic formula which usually depends on a time schedule and uses no feedback from the GA run. Fogarty [18] changed the mutation probability in a GA by decreasing it exponentially over generations. The schedule starts with an initial mutation probability and halves it in each subsequent generation, adding a baseline value to it. The baseline value keeps the mutation probability from becoming too small in later generations:

$$p_m(t) = \frac{1}{240} + \frac{0.11375}{2^t}$$

where $t$ is the generation number and 1/240 is the baseline value used.

Hesser and Manner [24] derived a general formula to calculate the probability of mutation on the basis of the generation number and certain constant parameters that differ for each problem. Unfortunately, these constants are not easy to calculate for some optimization problems:

$$p_m(t) = \sqrt{\frac{\alpha}{\beta}} \cdot \frac{\exp(-\gamma t / 2)}{\lambda \sqrt{l}}$$

where $\alpha$, $\beta$, and $\gamma$ are the constant parameters, and $\lambda$, $l$, and $t$ are the population size, the string length of the chromosome, and the generation number respectively.

Muhlenbein [29] and Back [5] found experimentally that the optimal mutation rate is $1/l$ ($l$ is the string length of the chromosome) for the (1+1) algorithm. In the (1+1) algorithm a single parent produces one offspring by means of mutation, and the better of the two survives to the next generation. Back [9], [10] then suggested a time-dependent mutation, where the mutation probability is decreased over the generations on the basis of the generation number. The dynamics of the mutation probability are controlled by the maximum number of generations, which is set before the GA run:

$$p_m(t) = \left(2 + \frac{l - 2}{T - 1}\, t\right)^{-1}$$

where $l$, $T$, and $t$ are the string length of the chromosome, the maximum number of generations, and the number of the current generation.

Back reported that excellent experimental results were obtained for hard combinatorial optimization problems [10]. Fogarty [18] and Back [5], [10] investigated mutation probability control when no crossover was used. Gabriela et al. [19] stated that a small, fixed mutation probability gives better performance if crossover is used. They showed experimentally that a time-dependent variant of the mutation probability might improve GA performance if no crossover is used, which agrees with the conclusions of [18], [5], and [10].

In adaptive parameter control, feedback information on how well the search is going is extracted and used to control the values and the direction of change of the parameters. Adaptive parameters were first used by Rechenberg [30] in his 1/5 success rule for controlling the mutation step size in evolution strategies. Bryant and Julstrom [13] used the contribution of a parameter value to creating new chromosomes with a fitness better than the median of the population at that time. This contribution is recorded as credit for that parameter value, and the credit assigned to the parameter value controls whether it is decreased or increased.

Schlierkamp-Voosen and Muhlenbein [33] adapted the population sizes using competing subpopulations. In this method, multiple populations with different sizes are run at the same time. After each generation the subpopulation which has the best fitness is stored in a quality record. After a number of generations the population with the highest quality record is increased in size. All others are decreased according to their quality record. In a similar fashion, Hinterding [26] ran three populations simultaneously. These populations had an initial size ratio of 1:2:4. After a certain time interval the size of each population was halved, doubled, or maintained depending on its best fitness. Using a slightly different approach, Lobo and Harik [23] ran a population with a small size and doubled it at genotype convergence (all chromosomes in the population have the same genotype). To accelerate the sometimes tedious genotype convergence, they ran several populations simultaneously. According to Annunziato and Pizzuti [3], the environment contains useful information that can be used to stabilize or control the parameter strategy. The parameter values should be changed according to how a chromosome in the population interacts with the others. The rate of change of the parameter values is controlled by the current environment.

Self-adaptive control is the third approach described in the literature. It was first used by Schwefel [34] in the evolution strategy, where he tried to control the mutation step size. Each chromosome in the population is combined with its own mutation variance as part of the chromosome structure, and this mutation variance is subjected to mutation, recombination, and selection along with the solution parameters. Back [12] extended the work of Schwefel [34] to GAs. He added extra bits at the end of each chromosome in the population to control the mutation and crossover probabilities. Initially the mutation and crossover probability bits were purely random, like the solution parameters. These bits were subjected to mutation and crossover as well; better parameter values gave better chromosomes, and in the end the better values dominated the population. Another way to self-adapt the parameter values, described by Srinivas and Patnaik [38], is to assign mutation and crossover probabilities to each chromosome on the basis of its fitness and the properties of the environment at that time. Arabas, Michalewicz, and Mulawka [2] used the age of the chromosomes to control the population size. Each created chromosome was assigned a lifetime that controls how many generations it survives before being destroyed. The lifetime of a chromosome was determined by its fitness and the current state of the search.

Back's [10], [12] approaches for deterministic and self-adaptive parameter control, as well as the methodologies of Lobo and Harik [23] and of Annunziato and Pizzuti [3], were used by us and are described in detail further below.

2. Test Genetic Algorithms

A traditional steady-state genetic algorithm (TGA) serves as a benchmark for the other variants. It uses Gray coding, a mutation rate $P_m = 1/l$, a crossover rate $P_c = 0.9$, a population size $N = 60$, uniform crossover, and bit-flip mutation. In TGA the worst chromosome is deleted to make room for children. The chromosome length is 200 bits for all test functions except $f_5$, for which it is 60 bits: the ten function variables (dimension n = 10) take 10 x 20 = 200 bits (10 x 6 = 60 for $f_5$), and 2 x 10 = 20 bits are appended at the end of the chromosome for the two self-adaptive parameters $P_m$ and $P_c$. The GA terminates when the optimum is found or the maximum number of fitness evaluations is reached. The maximum number of evaluations is 500000 for $f_1$, $f_2$, $f_3$, and $f_4$ and 200000 for $f_5$ [12].
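The encoding described above can be sketched as follows: a chromosome of 10 x 20 bits is split into ten Gray-coded fields, and each field is decoded to a real value. The linear mapping onto the interval from -5 to 5 (the variable range given in the test-function section) and all function names are assumptions made for this example.

```python
def gray_to_int(bits):
    """Convert a Gray-coded bit list (most significant bit first) to an integer."""
    value = 0
    current = 0
    for b in bits:
        current ^= b                    # running XOR recovers the plain binary bit
        value = (value << 1) | current
    return value


def decode_variable(bits, low=-5.0, high=5.0):
    """Map one Gray-coded field linearly onto [low, high] (assumed mapping)."""
    max_int = (1 << len(bits)) - 1
    return low + (high - low) * gray_to_int(bits) / max_int


def decode_chromosome(chromosome, n_vars=10, bits_per_var=20):
    """Split the solution part of a chromosome into n_vars decoded variables."""
    return [
        decode_variable(chromosome[i * bits_per_var:(i + 1) * bits_per_var])
        for i in range(n_vars)
    ]
```

For the six-bit function, `bits_per_var=6` would be used instead; any bits appended for self-adaptive parameters lie beyond the first `n_vars * bits_per_var` positions and are ignored by `decode_chromosome`.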

2.2. Self-adaptive crossover, mutation probability, and population size of genetic algorithm

Following the self-adaptive approach of Back [12], the mutation rate $P_m$ is encoded in extra bits at the end of each chromosome in the population, within a range between 0.001 and 0.25, and likewise the crossover rate $P_c$, within a range between 0 and 1. Mutation then takes place in two steps. First, only the bits that encode the mutation rate are mutated and immediately decoded to establish a new mutation rate. Second, the new mutation rate is used to mutate the main bits (those that encode the solution). For reproduction, two chromosomes are selected by tournament selection. The bits that encode the crossover rate in each selected chromosome are decoded to establish its crossover rate $P_c$. A random number $r$ below 1 is compared with the $P_c$ of the selected chromosome; if $r$ is lower than the chromosome's $P_c$, the chromosome is ready to mate. If both selected chromosomes are ready to mate, two children are created by uniform crossover, then mutated and inserted in the population. If both selected chromosomes refuse to mate, two children are created by mutation only. If one of the selected chromosomes is willing to mate and the other is not, one child is created by mutating the unwilling chromosome; the willing chromosome is put on hold, and the next parent selection round picks only one other parent.
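The two-step mutation and the mating test can be sketched as below. Two details are assumptions made for this example, since the text does not specify them: the rate bits are read as plain binary and mapped linearly onto their range, and in the first step the rate bits are mutated with the rate they currently encode. All names are hypothetical.

```python
import random


def decode_rate(bits, low, high):
    """Map a plain-binary bit list linearly onto [low, high] (assumed encoding)."""
    as_int = int("".join(map(str, bits)), 2)
    return low + (high - low) * as_int / (2 ** len(bits) - 1)


def flip_bits(bits, rate, rng):
    """Bit-flip mutation: each bit is inverted independently with probability rate."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]


def self_adaptive_mutation(solution_bits, pm_bits, rng=random):
    """Mutate the rate bits first, decode them, then mutate the solution bits."""
    # Step 1: mutate only the rate bits, using the currently encoded rate
    # (an assumption), and decode the result immediately.
    old_rate = decode_rate(pm_bits, 0.001, 0.25)
    new_pm_bits = flip_bits(pm_bits, old_rate, rng)
    new_rate = decode_rate(new_pm_bits, 0.001, 0.25)
    # Step 2: mutate the solution bits with the freshly decoded rate.
    new_solution = flip_bits(solution_bits, new_rate, rng)
    return new_solution, new_pm_bits, new_rate


def ready_to_mate(pc_bits, rng=random):
    """A chromosome is willing to mate when a random r in [0, 1) is below its P_c."""
    return rng.random() < decode_rate(pc_bits, 0.0, 1.0)
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible.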

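The number of generations a chromosome stays alive, as proposed by Arabas [2], is used to self-adapt the population size. Every newly created chromosome is assigned an age, the "Remaining Life Time" (RLT), according to its fitness by a bi-linear scheme:

$$RLT(i) = \begin{cases} MinLT + \eta \, \dfrac{WorstFit - fit(i)}{WorstFit - AvgFit} & \text{if } fit(i) \text{ is no better than } AvgFit \\[2ex] \tfrac{1}{2}(MinLT + MaxLT) + \eta \, \dfrac{AvgFit - fit(i)}{AvgFit - BestFit} & \text{otherwise} \end{cases} \qquad \eta = \tfrac{1}{2}(MaxLT - MinLT)$$

where $MinLT$ and $MaxLT$ are the minimum and maximum remaining life time, set to 1 and 11 respectively; $WorstFit$ and $BestFit$ are the worst and best chromosome fitness in the population; $AvgFit$ is the average fitness of the population; and $fit(i)$ is the fitness of the $i$-th chromosome.

At each cycle the RLT of each chromosome in the population is decremented by one. If the RLT of a chromosome reaches zero, it is removed from the population.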
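A minimal sketch of the lifetime bookkeeping follows. It assumes the bi-linear scheme interpolates linearly from the minimum lifetime at the worst fitness, through the midpoint at the average fitness, to the maximum lifetime at the best fitness; the function names and the handling of a fully converged population are assumptions made for this example.

```python
def remaining_lifetime(fit, worst, best, avg, min_lt=1, max_lt=11):
    """Bi-linear RLT allocation (assumed form), valid for min- or maximization."""
    eta = 0.5 * (max_lt - min_lt)
    if worst == best:
        # All chromosomes are equally fit: assumed fallback to the midpoint.
        return 0.5 * (min_lt + max_lt)
    # A chromosome is better than average when it lies further from the
    # worst fitness than the average does.
    better_than_avg = abs(fit - worst) > abs(avg - worst)
    if not better_than_avg:
        return min_lt + eta * (worst - fit) / (worst - avg)
    return 0.5 * (min_lt + max_lt) + eta * (avg - fit) / (avg - best)


def age_population(population):
    """Decrement the RLT of every (chromosome, rlt) pair; drop those reaching zero."""
    return [(c, rlt - 1) for c, rlt in population if rlt - 1 > 0]
```

With the values 1 and 11 from the text, a chromosome of exactly average fitness receives a lifetime of 6 cycles.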

2.3. Adaptive Genetic Algorithm by Reproduction and Competition

Annunziato and Pizzuti [3] introduced a dynamic environment determined by reproduction and competition rules among chromosomes. The adaptation of the parameter probabilities is controlled by the environment. The environment is constrained by the maximum population size, which is set before the GA run. The adaptive rules of the environment are driven by a probability called the population density or meeting probability (in this section $P_m$ denotes the meeting probability, not the mutation rate), defined as:

$$P_m = \frac{N}{N_{max}}$$

where $N$ and $N_{max}$ are the current population size and the maximum population size respectively.

When two chromosomes meet they can either create children by crossover (bisexual reproduction) or fight (compete) for natural resources, in which case the stronger kills the weaker. Otherwise the current chromosome can differentiate (mono-sexual reproduction, or mutation) to create a child. The crossover rate and competition rate are defined as:

$$P_r = 1 - P_m$$

where $P_r$ is the crossover rate and $P_m$ is the meeting probability, and

$$P_{comp} = 1 - P_r$$

where $P_{comp}$ is the competition rate and $P_r$ is the crossover rate.

Initially a random number, limited by the maximum population size, is generated for the initial population size. Then at each iteration we select the $i$-th chromosome of the population, for $i$ from 1 to the population size, and randomly select another chromosome from the population. A random number $r$ below 1 is compared with the meeting probability $P_m$; if $r$ is below $P_m$, an interaction happens. If an interaction happens, another random number $r_1$ below 1 is compared with $P_r$. If $r_1$ is below $P_r$, two children are created by uniform crossover and immediately inserted in the population. If $r_1$ is above $P_r$, competition takes place and the weaker chromosome is removed from the population. If no meeting happens, one child is created by mutation and immediately inserted in the population.
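One cycle of this environment can be sketched as below. The sketch assumes that the meeting probability is the current population size divided by the maximum size, that the crossover rate is one minus the meeting probability, that lower fitness is better, and that mutation flips each bit with probability 1/l. None of these details, nor the function names, should be read as the authors' exact implementation.

```python
import random


def uniform_crossover(a, b, rng):
    """Create two children by swapping each gene pair with probability 0.5."""
    child1, child2 = [], []
    for x, y in zip(a, b):
        if rng.random() < 0.5:
            x, y = y, x
        child1.append(x)
        child2.append(y)
    return child1, child2


def bit_flip(a, rng):
    """Mutate with an assumed per-bit rate of 1/l."""
    rate = 1.0 / len(a)
    return [g ^ 1 if rng.random() < rate else g for g in a]


def environment_cycle(population, fitness, n_max, rng=random):
    """Visit each chromosome once; reproduce, compete, or mutate (minimization)."""
    pop = list(population)
    i = 0
    for _ in range(len(pop)):            # one visit per chromosome present at start
        if len(pop) < 2 or i >= len(pop):
            break
        current = pop[i]
        j = rng.choice([k for k in range(len(pop)) if k != i])
        p_meet = len(pop) / n_max        # population density
        p_repro = 1.0 - p_meet           # assumed crossover rate
        if rng.random() < p_meet:        # the two chromosomes meet
            if rng.random() < p_repro:   # bisexual reproduction
                pop.extend(uniform_crossover(current, pop[j], rng))
            else:                        # competition: the weaker one is removed
                loser = i if fitness(current) > fitness(pop[j]) else j
                pop.pop(loser)
                if loser <= i:
                    i -= 1               # keep the index aligned after removal
        else:                            # no meeting: mono-sexual reproduction
            pop.append(bit_flip(current, rng))
        i += 1
    return pop
```

Because the density rises as the population grows, crowded populations favor competition and sparse ones favor reproduction, which is how the population size regulates itself in this scheme.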

2.4. Adaptive population size of genetic algorithm

To overcome the population size problem and drift velocity, Lobo [23] suggests establishing a race among populations of various sizes. Multiple populations with different sizes run simultaneously. Lobo's method gives priority to the smaller populations by giving them more function evaluations. Initially population 1 runs for 4 generations, then population 2 runs for 1 generation, then population 1 for 3 more generations, then population 2 for 1 generation, and so on. At any time, if the average fitness of a smaller population is lower than that of a larger one, the smaller population is destroyed. The crossover rate $P_c$ and mutation rate $P_m$ are the same for every population and are set before the GA run: $P_c = 0.9$ and $P_m = 1/l$, where $l$ is the chromosome length.

2.5. Deterministic mutation rate of genetic algorithm

In this algorithm we use the formula of Back [10], which depends on time (the generation counter) and on the chromosome length, to change the mutation rate. The generation counter is bounded by the maximum number of generations. The mutation rate is defined as:

$$p_m(t) = \left(2 + \frac{l - 2}{T - 1}\, t\right)^{-1}$$

where $l$, $t$, and $T$ are the chromosome length, the current generation, and the maximum number of generations respectively.

At each iteration one chromosome is selected as the parent by tournament selection. One child is then created by mutation, and the better of parent and child survives to the next generation. No crossover is used to create children.
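The procedure can be sketched as follows, assuming the time-dependent schedule p_m(t) = (2 + (l - 2) t / (T - 1))^-1, which falls from 1/2 in the first generation to 1/l in the last. The tournament size of 2, minimization of the fitness, and the function names are assumptions made for this example.

```python
import random


def mutation_rate(t, length, max_gen):
    """Deterministic schedule: 1/2 at t = 0, decreasing to 1/length at t = max_gen - 1."""
    return 1.0 / (2.0 + (length - 2.0) * t / (max_gen - 1.0))


def one_step(population, fitness, t, max_gen, rng=random, tournament_size=2):
    """Select a parent by tournament, mutate it, and keep the better of the two."""
    # Tournament selection over indices so the parent can be replaced in place.
    candidates = rng.sample(range(len(population)), tournament_size)
    i = min(candidates, key=lambda k: fitness(population[k]))
    parent = population[i]
    rate = mutation_rate(t, len(parent), max_gen)
    child = [b ^ 1 if rng.random() < rate else b for b in parent]
    if fitness(child) < fitness(parent):  # minimization (assumed)
        population[i] = child
    return population
```

Since a child only replaces its parent when it is strictly better, the best fitness in the population never deteriorates under this step.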

3. Test Functions

To evaluate the performance of the parameter-less GAs, we used the same test functions used by Back [12]. For selecting the test functions Back followed the guidelines reported by Whitley [41] and Back [11]. The test functions should:

1) Contain problems resistant to hill-climbing,

2) Contain nonlinear, non-separable problems,

3) Contain scalable functions,

4) Have a canonical form,

5) Include a few unimodal functions for comparison of efficiency (convergence velocity),

6) Include a few multimodal functions of different complexity with a large number of local optima,

7) Include multimodal functions with irregular arrangement of local optima,

8) Contain high-dimensional functions, because these are better representatives of real-world applications.

Ten variables are used (dimension n = 10) for each test function, each constrained to the interval between -5 and 5, $x_i \in [-5, 5]$. A two-dimensional domain is used to draw the surface of each function except function five. To draw the surface of function five we implemented a program that generates a random binary string several times and counts the number of 1's in the string. The number of 1's in the string determines which part of function five must be used. At the end we have a matrix of values that represents the surface amplitude.

The test functions that suit the rules above are:

$f_1$ is the sphere model after De Jong [16]. It is continuous, convex, and unimodal. The function has its global optimum at the point zero.

$f_2$ is the generalized Rosenbrock [31] function. The function has its global optimum inside a long, narrow, parabolic-shaped flat valley. Finding the valley is trivial; however, convergence to the global optimum is difficult.

$f_3$ is the generalized Ackley [1], Back [7] function. It is a multimodal function with its global minimum located at the origin, with a function value of zero.

$f_4$ is the generalized Rastrigin [39], Hoffmeister [27] function. It is a non-linear multimodal function. This function is a fairly difficult problem due to its large search space and its large number of local minima.

$f_5$ is the fully deceptive six-bit function of Deb [15].

All functions have dimension n = 10 and use 20 bits/variable except $f_5$, which uses 6 bits/variable.
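For orientation, the four numeric functions can be written in their common textbook forms as below. These are assumed to match the variants used here, but the exact constants of the study are not recoverable from the text, so the definitions should be treated as a sketch. The deceptive six-bit function is omitted because its definition is not given.

```python
import math


def sphere(x):
    """Sphere model: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def rosenbrock(x):
    """Generalized Rosenbrock function, minimum 0 at (1, ..., 1)."""
    return sum(
        100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (x[i] - 1.0) ** 2
        for i in range(len(x) - 1)
    )


def ackley(x):
    """Generalized Ackley function, minimum 0 at the origin."""
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(2.0 * math.pi * v) for v in x) / n
    return -20.0 * math.exp(-0.2 * math.sqrt(mean_sq)) - math.exp(mean_cos) + 20.0 + math.e


def rastrigin(x):
    """Generalized Rastrigin function, minimum 0 at the origin."""
    return 10.0 * len(x) + sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) for v in x)
```

Each function takes a list of ten real values in the interval from -5 to 5 when used with the dimension stated above.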

References

[1] Ackley, D. H., A connectionist machine for genetic hill climbing, Kluwer, Boston, 1987.

[2] Arabas, J., Michalewicz, Z., & Mulawka, J., GAVaPS - A genetic algorithm with varying population size, Proceedings of the 1st IEEE Conference on Evolutionary Computation, IEEE Press, 1994.

[3] Annunziato, M., & Pizzuti, S., Adaptive parameterization of evolutionary algorithms driven by reproduction and competition, Proceedings of ESIT2000, pp. 246-256, Aachen, Germany.

[4] Back, T., Self-adaptation in genetic algorithms, In F. J. Varela and P. Bourgine (Eds.), Proceedings of the First European Conference on Artificial Life, pp. 263-271, The MIT Press, Cambridge, MA, 1992.

[5] Back, T., The interaction of mutation rate, selection, and self-adaptation within a genetic algorithm, In R. Manner and B. Manderick (Eds.), Parallel Problem Solving from Nature, pp. 85-94, Elsevier, Amsterdam, 1992.

[6] Back, T., Optimal mutation rates in genetic search, In Forrest, S. (Ed.), Proceedings of the Fifth International Conference on Genetic Algorithms, pp. 2-8, San Mateo, CA: Morgan Kaufmann, 1993.

[7] Back, T., & Schwefel, H.-P., An overview of evolutionary algorithms for parameter optimization, Evolutionary Computation, Vol. 1, No. 1, pp. 1-23, 1993.

[8] Back, T., & Schwefel, H.-P., Evolution strategies I: Variants and their computational implementation, In Winter, G., Periaux, J., Galan, M., & Cuesta, P. (Eds.), Genetic Algorithms in Engineering and Computer Science (Chapter 6, pp. 11-126), Chichester: John Wiley and Sons, 1995.

[9] Back, T., & Schutz, M., Intelligent mutation rate control in canonical genetic algorithm, Proceedings of the International Symposium on Methodologies for Intelligent Systems, pp. 158-167, 1996.

[10] Back, T., Evolutionary Algorithms in Theory and Practice, Oxford University Press, 1996.

[11] Back, T., & Michalewicz, Z., Test landscapes, In Back, T., Fogel, D. B., & Michalewicz, Z. (Eds.), Handbook of Evolutionary Computation, Chapter B2.7, pp. 14-20, Institute of Physics Publishing and Oxford University Press, New York, 1997.

[12] Back, T., Eiben, A. E., & Van der Vaart, N. A., An empirical study on GAs without parameters, In Schoenauer, M., Deb, K., Rudolph, G., Yao, X., Lutton, E., Merelo, J. J., and Schwefel, H.-P. (Eds.), Parallel Problem Solving from Nature PPSN V, Lecture Notes in Computer Science Vol. 1917, pp. 315-324, 2000.

[13] Bryant, A., & Julstrom, What have you done for me lately? Adapting operator probabilities in a steady-state genetic algorithm, Proceedings of the Sixth International Conference on Genetic Algorithms, pp. 81-7, Morgan Kaufmann, 1995.

[14] Davis, L., Adapting operator probabilities in genetic algorithms, In Schaffer, J. D. (Ed.), Proceedings of the Third International Conference on Genetic Algorithms, pp. 16-69, San Mateo, CA: Morgan Kaufmann, 1989.

[15] Deb, K., Deceptive Landscape, In Back, T., Fogel, D. B., & Michalewicz, Z. (Eds.), Handbook of Evolutionary Computation, Institute of Physics Publishing & Oxford University Press, New York, 1997.

[16] De Jong, K. A., An analysis of the behavior of a class of genetic adaptive systems, Doctoral dissertation, University of Michigan, Ann Arbor, University Microfilms No. 76-9381, 1975.

[17] Eiben, A. E., Hinterding, R., & Michalewicz, Z., Parameter control in evolutionary algorithms, IEEE Transactions on Evolutionary Computation, Vol. 3, No. 2, pp. 124-41, 1999.

[18] Fogarty, T., & Terence, C., Varying the probability of mutation in the genetic algorithm, Proceedings of the Third International Conference on Genetic Algorithms, pp. 104-109, Morgan Kaufmann, 1989.

[19] Gabriela, O., Inman, H., & Hilary, B., On recombination and optimal mutation rates, Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-99), pp. 488-495, Morgan Kaufmann, San Francisco, CA, 1999.

[20] Grefenstette, J. J., Optimization of control parameters for genetic algorithms, In Sage, A. P. (Ed.), IEEE Transactions on Systems, Man, and Cybernetics, Volume SMC-16-1, pp. 122-128, New York: IEEE, 1986.

[21] Goldberg, D. E., Genetic algorithms in search, optimization, and machine learning, Addison-Wesley Publishing Company, Inc., 1989.

[22] Goldberg, D. E., Sizing populations for serial and parallel genetic algorithms, Proceedings of the Third International Conference on Genetic Algorithms and Their Applications, pp. 70-79, Morgan Kaufmann, 1989.

[23] Harik, G. R., & Lobo, F. G., A parameter-less genetic algorithm, In Banzhaf, W., Daida, J., Eiben, A. E., Garzon, M. H., Honavar, V., Jakiela, M., & Smith, R. E. (Eds.), GECCO-99: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 258-267, San Francisco, CA: Morgan Kaufmann, 1999.

[24] Hesser, J., & Manner, R., Towards an optimal mutation probability in genetic algorithms, Proceedings of the 1st Parallel Problem Solving from Nature, pp. 23-32, Springer, 1991.

[25] Hinterding, R., Gaussian mutation and self-adaption for numeric genetic algorithms, In Proceedings of the IEEE International Conference on Evolutionary Computation, pp. 384-389, 1995.

[26] Hinterding, R., Michalewicz, Z., & Peachy, T. C., Self-adaptive genetic algorithm for numeric functions, Proceedings of the Fourth International Conference on Parallel Problem Solving from Nature, pp. 420-429, Lecture Notes in Computer Science, Springer-Verlag, 1996.

[27] Hoffmeister, F., & Back, T., Genetic algorithms and evolution strategies: Similarities and differences, In Schwefel, H.-P., & Manner, R. (Eds.), Parallel Problem Solving from Nature - PPSN 1 (Lecture Notes in Computer Science, Vol. 496), Springer-Verlag, Berlin, 1991.

[28] Holland, J. H., Adaptation in natural and artificial systems, Ann Arbor, MI: University of Michigan Press, 1975.

[29] Muhlenbein, H., How genetic algorithms really work: I. Mutation and hill climbing, Parallel Problem Solving from Nature - PPSN II, pp. 15-2, 1992.

[30] Rechenberg, I., Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution, Frommann, 1973.

[31] Rosenbrock, H. H., An automatic method for finding the greatest or least value of a function, The Computer Journal, Vol. 3, No. 3, pp. 175-184, 1960.

[32] Schaffer, J. D., Caruana, R. A., Eshelman, L. J., & Das, R., A study of control parameters affecting online performance of genetic algorithms for function optimization, Proceedings of the Third International Conference on Genetic Algorithms and Their Applications, pp. 51-60, Morgan Kaufmann, 1989.

[33] Schlierkamp-Voosen, D., & Muhlenbein, H., Adaptation of population sizes by competing subpopulations, Proceedings of the International Conference on Evolutionary Computation (ICEC'96), Nagoya, Japan, pp. 330-335, 1996.

[34] Schwefel, H.-P., Numerische Optimierung von Computer-Modellen mittels der Evolutionsstrategie, Volume 26 of Interdisciplinary Systems Research, Birkhauser, Basel, 1997.

[35] Schwefel, H.-P., Collective phenomena in evolutionary system, In Preprints of the 31st Annual Meeting of the International Society for General System Research, Budapest, Vol. 2, pp. 1025-1033, 1987.

[36] Smith, J., & Fogarty, T., Self-adaptation of mutation rates in a steady state genetic algorithm, Proceedings of the Third IEEE Conference on Evolutionary Computation, IEEE Press, 1996.

[37] Smith, R., Adaptively resizing populations: An algorithm and analysis, Proceedings of the Fifth International Conference on Genetic Algorithms, p. 653, Morgan Kaufmann, 1989.

[38] Srinivas, M., & Patnaik, L. M., Adaptive probabilities of crossover and mutation in genetic algorithms, IEEE Transactions on Systems, Man and Cybernetics, Vol. 24, No. 4, pp. 17-26, 1994.

[39] Torn, A., & Zilinskas, A., Global Optimization, Lecture Notes in Computer Science, Vol. 350, Springer-Verlag, Berlin, 1989.

[40] Van der Vaart, N. A. L., Towards totally self-adjusting genetic algorithms: Let nature sort out, Master's Thesis, Leiden University, 1999.

[41] Whitley, L. D., Mathias, K. E., Rana, S., & Dzubera, J., Building better test functions, In Eshelman, L. J. (Ed.), Proceedings of the Sixth International Conference on Genetic Algorithms, pp. 239-246, Morgan Kaufmann, San Francisco, California, 1995.