Genetic Algorithms




By: Chase Simmons



Abstract


Genetic algorithms are the result of combining concepts from biology with computer science in order to solve practical problems in many different fields. They utilize the powerful concept of natural evolution as a foundation for finding good solutions to optimization and many other types of problems. When properly set up, genetic algorithms are very effective and efficient even when the search space is extremely vast. Because of this they have become increasingly popular in a wide variety of fields, both for research and for solving real-world problems. Using the techniques of mutation, crossover, inheritance, and selection, a program is able to generate generation after generation of potential solutions. Each generation builds upon the successes of the previous generation, gradually working toward an optimal solution. This process is repeated until there is a satisfactory solution, one that excels in all of the particular fitness areas that are desired.






Introduction


Genetic algorithms are a specific type, or subgroup, of evolutionary algorithms and solve problems by applying genetic operators such as selection, mutation, and crossover to strings of numbers that represent solutions. The algorithm itself is based upon fundamental modern biology concepts and knowledge, in particular the idea of natural evolution and the field of genetics that supports it. Charles Darwin's idea of natural selection is very much the basis for how a genetic algorithm works overall and is the groundwork for how we think of natural evolution today, with each generation generally being more fit than the last. [6]



The genetic operators that make up the inner workings of the algorithm are based upon the concepts of Gregor Mendel, who is known as the father of genetics. Many parallels can be drawn between the biological process of natural evolution and the genetic algorithm [8]. The genetic algorithm works with a population of candidate solutions from the solution space, where each solution can be viewed as a gene or chromosome in biological terms. A genetic algorithm first starts with an initial population of candidate solutions to work from and goes through an emulated process of evolution to generate a new generation of candidate solutions. By repeating this process, after a number of generations the candidate solutions in the last generation will be closer to the global maximum or ideal solution. [2]



As with many evolutionary algorithms, genetic algorithms excel when it comes to optimization problems, but they can be used in a large variety of other applications and in practically any industry. The genetic algorithm was invented by the computer scientist John Holland, who is known as the father of genetic algorithms. Holland's student David Goldberg has made significant contributions to genetic algorithms by releasing various papers and books showing real-world uses as well as implementations of genetic algorithms. Goldberg's work has led to many of the modern applications of genetic algorithms today. One of Goldberg's real-world examples was his dissertation, which used genetic algorithms to optimize gas-pipeline transmission and control. [6]



Overview of Genetic Algorithm Process


The genetic algorithm gives the programmer a rather large amount of freedom in implementing its various parts. In fact, a big part of using genetic algorithms is tweaking the various settings, such as mutation rate and generation size, since you may have a general idea of what you think will work but are not quite sure. This is often the case when using genetic algorithms, because in many cases you have a huge search space, which is the set of all possible solutions, and traditional methods will not work for the problem. However, a general process is followed when implementing genetic algorithms. Before anything can take place you must have a problem that is fit for a genetic algorithm and have established a way to encode solutions to that particular problem.


Generating an initial population is the first step in any genetic algorithm. The initial population is usually generated randomly or based upon some user-defined parameters, but it could make use of a seed so that you start from a specific, known initial population. In some cases starting with a specific population is very useful. One instance where you may want to use a predefined initial population is when it contains the best solutions from a previous run of the genetic algorithm and you are trying to improve upon a solution you already have.



The initial population is where the genetic algorithm starts from, and to be able to continue on, each individual candidate solution must be evaluated using a user-defined fitness function. The fitness function, also known as a cost function, is used to determine how well a candidate solution solves a specific problem or how well the solution meets the criteria being searched for. This same fitness function will be used to evaluate every single candidate solution that is created. A number is usually assigned to the candidate solution, indicating how successfully the solution meets the criteria. Without the fitness function the genetic algorithm would have no way to determine the fitness of each individual solution.


Once an initial population has been generated and there is a fitness function available to evaluate each solution, selection can take place. Selection is used to determine which solutions to select as mates or to carry over into the next generation. There are a number of ways to perform selection, and it is up to the programmer or user to choose a method. Although certain selection schemes will directly take certain individuals into the next generation, selection is generally used to select solutions for mating, which occurs in the form of crossover, also known as genetic recombination. [4]


Generating the next generation, and all successive generations, generally occurs in two steps: mating and mutation. Chromosomal crossover occurs during the sexual reproduction of most living creatures. It determines what mix of the parents' traits a child gets, and the genetic algorithm mimics this process to generate the children of the next generation. This crossover process can be viewed as mating for the solutions. As with selection, crossover and creation of the next generation can be done in a number of ways, as long as the specified number of solutions is generated for the next generation. While crossover gives us the children that make up the next generation, it is important that mutation occurs in order to introduce new traits into the solutions that may not have existed before. In nature, mutations can occur at nearly any time, from sexual reproduction prior to birth to the cellular reproduction that occurs on a daily basis in living creatures. With genetic algorithms, mutations are applied after a new generation has been created, based upon a defined mutation rate. The mutation rate is the chance for each trait of a solution to change and is usually quite low, since a very high mutation rate would mean losing most traits inherited from the parents, or just completely random solutions. After mating has occurred to create a new generation and mutation has been applied to these children, creation of a new generation is complete. [4]



Figure 1: Flowchart of the genetic algorithm process (Generate Initial Population, Evaluate Solutions, Select Mates, Mating, Mutation, Convergence Check, Done), where evaluating solutions is the fitness evaluation of each solution in that generation and the convergence check determines whether a termination condition is true.


Using the fitness function, each solution in the new generation must then be evaluated and assigned a fitness value, just as was done with the previous generation. Once each solution has been evaluated, the algorithm determines whether it should continue on and create another generation or terminate based upon some defined criteria. The genetic algorithm may be set up to halt once a specific fitness value is reached, some condition is present in a solution, or the generation is converging towards one solution. Another termination method is to stop after a specific number of generations or period of time. If no termination criteria are met, the process of selection will be performed on the current generation. These selected solutions will then go through mating and mutation in order to create a completely new generation. This new generation will then go through evaluation as well in order to determine whether the algorithm should continue or terminate. This process of selection, mating, mutation, and evaluation can occur as many times as needed in order to reach a certain stopping criterion. [6]
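To make the overall loop concrete, the following Python sketch wires together the steps shown in Figure 1. It is only an illustration of the general process: the helper functions (create_individual, fitness, select_mates, crossover, mutate) and the parameter values are hypothetical placeholders for the components discussed in the sections that follow, not a fixed or recommended interface.

```python
import random

def run_genetic_algorithm(create_individual, fitness, select_mates, crossover, mutate,
                          population_size=50, max_generations=200, target_fitness=None):
    """Generic GA loop following Figure 1 (all helper functions are assumed/hypothetical)."""
    # Generate the initial population, here purely at random.
    population = [create_individual() for _ in range(population_size)]
    best_score, best = None, None

    for generation in range(max_generations):
        # Evaluate every candidate solution with the fitness function.
        scored = [(fitness(individual), individual) for individual in population]
        best_score, best = max(scored, key=lambda pair: pair[0])

        # Convergence check: stop once a satisfactory fitness value is reached.
        if target_fitness is not None and best_score >= target_fitness:
            break

        # Selection fills the breeding pool from the current generation.
        parents = select_mates(scored, population_size)

        # Mating (crossover) followed by mutation builds the next generation.
        next_population = []
        while len(next_population) < population_size:
            mother, father = random.sample(parents, 2)
            for child in crossover(mother, father):
                next_population.append(mutate(child))
        population = next_population[:population_size]

    return best, best_score
```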



Encoding


Before any process such as evaluation by a fitness function or crossover can occur, a solution must first be encoded. Encoding is the process of representing a solution to a problem as digital data. An encoded solution can be viewed as a chromosome, which is able to undergo crossover and be evaluated for fitness. When programming a genetic algorithm, figuring out how to encode a solution is one of the first steps that must occur, since nearly all of the other processes depend on how the solution is encoded. Encoding is one of the areas where the programmer has the largest amount of work to do, since they need to determine what data needs to be encoded for solutions. All relevant traits for a solution must be captured in the encoded solutions. If any relevant traits are left out, the algorithm will not work as well and may produce unusable data. Storing lots of unrelated data that has no impact on the solution will cause bloating of the solution and slow the algorithm, especially when evaluating solution fitness. [6]



The simplest and most common encoding method is to just use a binary bit string to represent a specific solution. A solution can be encoded in many different ways, as long as the encoding method makes sense for the problem and captures all relevant traits of the solution. Nearly any data structure can be used when encoding a solution, even real numbers and programmer-created data types. The types used to encode a solution will of course have an effect on how the mating, mutation, and evaluation functions are implemented. [5]



Another simple type of encoding is permutation encoding, which is a string of numbers that represents a sequence. This method is commonly used for encoding solutions to ordering problems where each number has a meaning. It could be used for the traveling salesman problem, where each number would represent a city. The traveling salesman problem is to find the shortest path that visits each city exactly once and ends at the same city that was the starting point of the path, given a list of cities and their respective distances from each other. So 1 2 3 4 5 could be a solution under this encoding scheme. For the traveling salesman problem each number would represent a city and the order of the numbers would be the order in which the cities are visited, so in this case city 1 is visited first, followed by city 2, and finally city 5 last. This method works well for representing the traveling salesman problem and other ordering problems, but it does require more work when it comes to implementing the crossover and mutation functions, since when crossover and mutation are performed, the result must be a solution that makes sense for the ordering problem and forms a real sequence. For instance, having duplicates of a number in a solution for the traveling salesman problem violates the rules, since each city can only be visited once for this problem, and having duplicates means it is not even an actual solution to the problem. [4]
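As a small illustration of permutation encoding, the sketch below assumes a hypothetical five-city instance of the traveling salesman problem; it builds random tours and checks that a candidate is a valid permutation, i.e. visits every city exactly once.

```python
import random

NUM_CITIES = 5  # hypothetical problem size, matching the "1 2 3 4 5" example above

def random_tour(num_cities=NUM_CITIES):
    """Permutation encoding: a tour is simply an ordering of the city numbers."""
    tour = list(range(1, num_cities + 1))
    random.shuffle(tour)
    return tour

def is_valid_tour(tour, num_cities=NUM_CITIES):
    """A valid traveling salesman solution visits each city exactly once."""
    return sorted(tour) == list(range(1, num_cities + 1))

print(random_tour())                    # e.g. [3, 1, 5, 2, 4]
print(is_valid_tour([1, 2, 3, 4, 5]))   # True
print(is_valid_tour([1, 2, 2, 4, 5]))   # False: city 2 appears twice
```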





Fitness Function Evaluation


The fitness function is what evaluates each individual solution and assigns a fitness value to it. Designing the fitness function is often the hardest part of implementing a genetic algorithm, since it is what guides the genetic algorithm in selection, which determines what traits the next generation is going to have. The fitness function is what controls the results of the genetic algorithm and can be very complicated when working with big solutions. Without the fitness function the genetic algorithm would just be a blind search; the fitness function is what guides the algorithm. When creating the fitness function, if certain traits are desired, they must be properly represented and weighted by the function, otherwise the algorithm may not work towards solutions that have these traits. For the traveling salesman problem, the fitness function is quite straightforward: it would just evaluate the total distance traveled for the solution and assign that value to the solution, where a lower value is better. [4]
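For the traveling salesman example, such a cost function might look like the sketch below. The distance matrix is made up for illustration and cities are indexed from zero; lower totals mean a fitter tour.

```python
# Hypothetical symmetric distance matrix for a four-city example; distances[i][j]
# is the distance between city i and city j (cities indexed from 0 here).
distances = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]

def tour_length(tour, distances):
    """Cost function for the traveling salesman problem: total length of the round trip."""
    total = 0
    for i in range(len(tour)):
        # The final leg wraps around to the starting city to close the loop.
        total += distances[tour[i]][tour[(i + 1) % len(tour)]]
    return total

print(tour_length([0, 1, 2, 3], distances))  # 10 + 35 + 30 + 20 = 95
```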



The fitness function is not limited to simple numeric calculations; it can be as complicated as needed. In some cases the solution may represent a model or structure, and the fitness function is actually a simulation into which the model is fed, with the results of the simulation becoming the assigned fitness value. Fitness functions can be set up to evaluate solutions based upon multiple attributes and give the solution a value for each attribute, so that more detailed, attribute-specific decisions can be made.


Since genetic algorithms usually work with large solution spaces and often end up running through thousands of generations, it is important for the fitness function to be as fast as possible. For big problems, a fitness function may take a matter of hours to completely evaluate a solution, especially in areas like engineering, where complex simulations may be used. To help speed fitness evaluations up, the fitness of a solution may be approximated. Approximation is also very useful when unsure of an exact model or method to evaluate fitness.



Selection


Selection is the process of choosing individuals from the current generation to move into the breeding pool for the next generation, or directly into the next generation, depending upon the model being used. Prior to performing selection, each individual solution will be assigned a fitness value using the fitness function. The selection process uses these values to make decisions and choose solutions. There are various selection methods to choose from, since just choosing the fittest solutions is not always very helpful. Choosing only the fittest solutions can result in a uniform population in very early generations of the genetic algorithm. This may cause highly desired traits that only less fit solutions had to be discarded. This would result in having to rely on mutation to bring these desired traits back into the solutions, which could take a long time depending on the size and complexity of a trait.


One commonly used selection method is fitness proportionate selection, which is also known as roulette selection. This selection strongly applies the concept of survival of the fittest, where fitter solutions have a higher chance of being selected. This type of selection can be performed by normalizing the fitness values of an entire generation so that the sum is equal to 1 and then sorting the population by these fitness values, so that the most fit solution covers the range from 0.0 to (individual solution fitness / total fitness of the entire generation) and the next most fit solution follows. A random number can then be chosen between 0 and 1 to choose an individual solution, and this can be done as many times as needed. [4]
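A minimal sketch of the roulette scheme just described is shown below. It assumes fitness values are non-negative and that higher means fitter; the (fitness, individual) pairing is an illustrative convention rather than a required data structure.

```python
import random

def roulette_select(scored_population):
    """Fitness proportionate (roulette) selection over (fitness, individual) pairs.
    Assumes non-negative fitness values where higher is better."""
    total_fitness = sum(score for score, _ in scored_population)
    r = random.uniform(0, total_fitness)  # spin the wheel
    running = 0.0
    for score, individual in scored_population:
        running += score
        # The individual whose slice of the wheel contains r is the one selected.
        if running >= r:
            return individual
    return scored_population[-1][1]  # guard against floating point rounding

# Example: fill a breeding pool of 10 from a scored generation.
# breeding_pool = [roulette_select(scored) for _ in range(10)]
```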



Figure 2: A roulette selection arrangement, where F would have a value of 1 when these values are normalized and r is a random value between 0 and the total fitness, which in this case selects B.


Another common selection method is tournament selection, where a group of a specified size is created from the current generation of solutions. The groups are filled with individual solutions that are chosen at random, by roulette selection, or by some other means. Once a group is full of solutions, the solution with the highest fitness in the group is declared the winner and moves into the breeding pool. Using this method you can change the group size to influence the proportion of weak and strong solutions that move into the breeding pool. A large group size will generally result in only the fittest being selected, while a smaller group lets less fit solutions have a chance as well.
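Tournament selection is equally short to sketch; the tournament_size parameter below is the knob the text describes for trading off pressure toward the fittest against giving weaker solutions a chance.

```python
import random

def tournament_select(scored_population, tournament_size=3):
    """Pick `tournament_size` candidates at random and return the fittest of the group.
    A larger group favors the fittest; a smaller group gives weaker solutions a chance."""
    group = random.sample(scored_population, tournament_size)
    best_score, best_individual = max(group, key=lambda pair: pair[0])
    return best_individual
```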



Roulette selection and tournament selection generally allow less fit solutions to be selected occasionally. There are also many selection methods that do not give the less fit solutions a chance; one such method is elitism selection, in which only the absolute fittest solutions are chosen. Some other similar selection methods are choosing randomly from a top percentage or simply truncating to the top fourth and selecting from it. Solutions could also be selected completely at random. If the fitness function ranks the solutions in multiple areas, the selection function can then perform selection based upon more than one area, which is known as multi-objective selection.


Crossover


When the breeding pool is filled with the selected solutions, crossover can then take place. Crossing over takes place in the sexual reproduction of animals during prophase I of meiosis. This is the main part of sexual reproduction that determines what mix of traits a child gets from each parent, by exchanging genetic data between the parents. Genetic algorithms replicate this process to create a child of the next generation. While performing crossover is easier when working with a binary string solution, there are ways to perform crossover for both real values and any other data type. There are many different ways that crossover can be applied, but the simplest is one-point crossover. With single-point crossover you have two parents and you choose a point at random. At this point the data is then switched to create two children of the next generation. Two-point crossover is another method, where two points are chosen at random instead of one. If more than two points are to be selected, multi-point crossover can be used, with as many points chosen as wanted. [6]
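For bit strings, or any solution stored as a list, one-point and two-point crossover can be sketched as follows; the cut points are chosen at random and the corresponding segments of the two parents are swapped.

```python
import random

def one_point_crossover(parent_a, parent_b):
    """Swap everything after a single random cut point, producing two children."""
    point = random.randint(1, len(parent_a) - 1)
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

def two_point_crossover(parent_a, parent_b):
    """Swap the segment between two random cut points, producing two children."""
    first, second = sorted(random.sample(range(1, len(parent_a)), 2))
    child_a = parent_a[:first] + parent_b[first:second] + parent_a[second:]
    child_b = parent_b[:first] + parent_a[first:second] + parent_b[second:]
    return child_a, child_b

# Example with binary bit strings stored as lists:
# one_point_crossover([0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1])
```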




Figure 3: Illustrates one-point and two-point crossover with two parents and the resulting children.


If crossover based on a number of points does not provide sufficient mixing, uniform crossover can be used. Uniform crossover makes use of a mixing ratio to give a child a percentage of each parent's data. Due to the high amount of mixing, a large amount of exploration occurs, since many different combinations of traits happen very quickly compared to one-point crossover.


Figure 4: Illustrates what a uniform crossover could look like with a mixing ratio of 0.5, where each parent would then contribute 50% of the data to each child.


Besides these methods of crossover there are many alternatives that can be used, as long as the child that is created gets a mix of more than one parent's traits. There are even some forms of crossover that use three parents. Sometimes special forms of crossover are required, in particular when working with ordering problems such as the traveling salesman problem. If a simple one-point crossover method were used for this problem, the child solution could very easily end up with duplicate cities, thus creating an invalid solution for the traveling salesman problem. So the programmer would need to create an order-based crossover method, or check and repair the solution if a normal crossover method is used. [4]
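One common order-preserving operator is order crossover (often abbreviated OX), sketched below as an example of the kind of special-purpose crossover the text mentions; it is one possible choice rather than the only one. A slice of the first parent is kept and the remaining positions are filled with the second parent's cities in order, so no city is duplicated.

```python
import random

def order_crossover(parent_a, parent_b):
    """Order crossover (OX) for permutation-encoded tours: keep a random slice of one
    parent and fill the remaining positions from the other parent in order, so the
    child is always a valid permutation with no duplicate cities."""
    size = len(parent_a)
    start, end = sorted(random.sample(range(size), 2))

    child = [None] * size
    child[start:end] = parent_a[start:end]          # copy a slice of the first parent
    kept = set(parent_a[start:end])

    # Fill the empty positions with the second parent's cities in their original order.
    fill = iter(city for city in parent_b if city not in kept)
    for i in range(size):
        if child[i] is None:
            child[i] = next(fill)
    return child
```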



Mutation


Mutation is used to introduce new traits into the solutions as well as to help maintain the diversity of the solutions. The amount of mutation that can occur is controlled by a mutation rate, which is the chance for each trait of a solution to be changed. A solution may have as few as zero mutations, given a low mutation rate and chance, or every trait could be mutated with a 100% mutation rate, which would result in a completely random solution. Mutation also helps to move the solutions out of local optimums when the solutions in past generations have been relatively similar and are not at a global optimum.




Figure 5: A mutation creates a solution that is away from the local optimum and has a much higher fitness value than the others, making it likely that the next generation will move away from the local optimum.


Mutation can be done easily when the solution is encoded as a binary bit string. When a trait needs to be mutated, as determined by the mutation rate and a random variable, and is represented by some bit, flipping the bit is all that has to be done. One method that can be used for mutating a real number is to randomly change the value to the upper or lower bound of that number. Uniform mutation is a common way to handle float and integer traits. For uniform mutation, a value is randomly chosen between specified bounds, giving the programmer a large amount of control over each trait. [4]
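The mutation operators described here amount to very small functions in practice. The sketch below shows bit-flip mutation for a binary string and uniform mutation for real-valued traits; the mutation rate and bounds are illustrative values, not recommendations.

```python
import random

def bit_flip_mutation(bits, mutation_rate=0.01):
    """Flip each bit independently with probability `mutation_rate`."""
    return [1 - bit if random.random() < mutation_rate else bit for bit in bits]

def uniform_mutation(values, bounds, mutation_rate=0.01):
    """Replace each real-valued trait, with probability `mutation_rate`, by a random
    value drawn between that trait's (low, high) bounds."""
    mutated = []
    for value, (low, high) in zip(values, bounds):
        if random.random() < mutation_rate:
            mutated.append(random.uniform(low, high))
        else:
            mutated.append(value)
    return mutated

# Example: mutate a 10-bit solution with a 5% per-bit mutation rate.
# bit_flip_mutation([0, 1, 1, 0, 1, 0, 0, 1, 1, 0], mutation_rate=0.05)
```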


If mutations were not implemented in a genetic algorithm, no new traits other than the ones originally held by the initial generation would ever be introduced. Solutions would also converge very quickly, and as a result the algorithm would not explore nearly as much of the solution space, perhaps missing the very best solution. If solutions become too similar to each other, evolution will slow, and since the algorithm may be set up to terminate once solutions converge with each other, it could terminate early, before a good solution is reached. [6]



Summary


Genetic algorithms are best used for optimization problems or when there is no fully described algorithm and no known solution to the problem. The full strength of genetic algorithms is most apparent when working with a large solution space, since the algorithm explores so well. Because of this, genetic algorithms often find solutions that would not even have been considered by traditional methods. If a problem is too complex, or not enough is known about the solution space for an analytical solution, a genetic algorithm may very well be able to solve the problem. Compared to many traditional problem solving methods, genetic algorithms are very flexible and allow for a lot of tweaking. Run-time tweaking of the various rates and selection methods can even be implemented to give the user a large degree of control while the algorithm is running.


One of the biggest strengths that genetic algorithms have is that they are intrinsically parallel, while most analytical solutions are not. If you have a generation size of 20, once all 20 of the children have been generated using crossover and mutation, the fitness function can be run for each child independently and concurrently. So the time to evaluate a generation is only the time of the slowest fitness evaluation for that generation, not the average fitness function time multiplied by the number of solutions in the generation. So for a generation size of 20, if programmed properly to make full use of parallel computing, it would take about 1/20th of the time to do the fitness evaluations compared to doing one fitness evaluation after another on a single core. Today, as parallel computing becomes increasingly common and computers have an increasing number of cores, this strength is even more apparent. [1]
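Because each child's fitness is independent of every other child's, the evaluation step parallelizes naturally. The sketch below uses Python's standard concurrent.futures process pool to score a generation concurrently; the fitness function is a placeholder and, as with any process pool, it must be an importable (picklable) function.

```python
from concurrent.futures import ProcessPoolExecutor

def evaluate_generation(population, fitness):
    """Evaluate every candidate concurrently. Each fitness call is independent, so
    with enough cores the wall-clock time approaches that of the single slowest
    evaluation rather than the sum of all of them."""
    with ProcessPoolExecutor() as pool:
        scores = list(pool.map(fitness, population))
    return list(zip(scores, population))

# Example (hypothetical): score a generation of 20 children produced by crossover
# and mutation, where `tour_length_fitness` is a module-level fitness function.
# scored = evaluate_generation(children, tour_length_fitness)
```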



While genetic algorithms excel at solving some problems, they have a few weaknesses to be aware of. A genetic algorithm relies upon being designed very well and having its parameters tweaked in order to be successful. If the encoding method or fitness function has any errors, the algorithm is likely to produce insignificant results. Also, if the algorithm is not carefully set up so that the mutation method and rate work as desired, the algorithm can get stuck in a local optimum. Similarly, the children of a generation may converge to a similar solution prematurely because of a bad mutation and crossover implementation or choice. While some methods of crossover and mutation are generally better than others, there is no definitive best method; the best method depends upon the problem that is being solved and the specific implementation. For big problems the complexity of the fitness function can be extremely high, making it very hard to create a proper fitness function.


By utilizing the power of natural evolution, genetic algorithms make a great problem solving tool when properly implemented. While the genetic algorithm does not promise to find the absolute optimal solution, it often finds solutions very close to optimal, and especially ones that would be ignored by traditional means. Genetic algorithms have gone from being first brought to the attention of computer scientists by John Holland in the mid-1970s to being utilized in multiple fields today, ranging from electrical engineering to financial forecasting [3][7].




References


[1] Affenzeller, M. (2009). Genetic algorithms and genetic programming: modern concepts and practical applications. Boca Raton: CRC Press.

[2] Alba, E., & Dorronsoro, B. (2008). Cellular genetic algorithms. Berlin: Springer.

[3] Altshuler, E., & Linden, D. (1997). Design of a wire antenna using a genetic algorithm. Journal of Electronic Defense, 20(7), 50-52. Retrieved March 15, 2012 from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.510&rep=rep1&type=pdf

[4] Buckland, M. (2002). AI techniques for game programming. Cincinnati, Ohio: Premier Press.




[5] Gen, M., & Cheng, R. (2000). Genetic algorithms and engineering optimization. New York: Wiley.

[6] Haupt, R. L., & Haupt, S. E. (2004). Practical genetic algorithms (2nd ed.). Hoboken, N.J.: John Wiley. Retrieved March 13, 2012 from http://thegrovelibrary-ng.com/admin/a2/b31xxx/c42kk/Practical%20Genetic%20Algorithms%20-%20Randy%20L.%20Haupt,%20Sue%20Ellen%20Haupt.pdf

[7] Mahfoud, S., & Mani, G. (1996). Financial forecasting using genetic algorithms. Applied Artificial Intelligence, 10(6), 543-566. Retrieved March 15, 2012 from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.86.9698&rep=rep1&type=pdf

[8] Mitchell, M. (1998). An introduction to genetic algorithms. Cambridge, Mass.: MIT Press.