Genetic Algorithms
By: Chase Simmons
Abstract
Genetic algorithms are the result of combining concepts from biology and computer science in order to solve practical problems in many different fields. They utilize the powerful concept of natural evolution as a foundation for finding good solutions to optimization problems and many other types of problems. When properly configured, genetic algorithms are effective and efficient even when the search space is extremely vast. Because of this, they have become increasingly popular in a wide variety of fields, both for research and for solving real-world problems. Using the techniques of mutation, crossover, inheritance, and selection, a program is able to generate generation after generation of potential solutions. Each generation builds upon the successes of the previous generation, gradually working toward an optimal solution. This process is repeated until there is a satisfactory solution that meets all of the desired fitness criteria.
Introduction
Genetic algorithms are a specific type, or subgroup, of evolutionary algorithms that solve problems by applying genetic operators such as selection, mutation, and crossover to strings of numbers that represent solutions. The algorithm itself is based upon fundamental concepts from modern biology, in particular the idea of natural evolution and the field of genetics that supports it. Charles Darwin's idea of natural selection is very much the basis for how a genetic algorithm works overall, and it is the groundwork for how we think of natural evolution today, with each generation generally being more fit than the last. [6] The genetic operators that make up the inner workings of the algorithm are based upon the concepts of Gregor Mendel, who is known as the father of genetics.

Many parallels can be drawn between the biological process of natural evolution and the genetic algorithm [8]. The genetic algorithm works with a population of candidate solutions from the solution space, where each solution can be viewed as a gene or chromosome in biological terms. A genetic algorithm first starts with an initial population of candidate solutions to work from and goes through an emulated process of evolution to generate a new generation of candidate solutions. By repeating this process over a number of generations, the candidate solutions in the last generation will be closer to the global maximum, or ideal solution. [2]

As with many evolutionary algorithms, genetic algorithms excel when it comes to optimization problems, but they can be used in a large variety of other applications and in practically any industry. The genetic algorithm was invented by the computer scientist John Holland, who is known as the father of genetic algorithms. Holland's student David Goldberg has made significant contributions to genetic algorithms by releasing various papers and books showing real-world uses as well as implementations of genetic algorithms. Goldberg's work has led to many of the modern applications of genetic algorithms today. One of Goldberg's real-world examples was his dissertation, which used genetic algorithms to optimize gas pipeline transmission and control. [6]
Overview of the Genetic Algorithm Process
The genetic algorithm gives the programmer a rather large amount of freedom in implementing its various components. In fact, a big part of using genetic algorithms is tweaking the various settings, such as the mutation rate and generation size, since you may have a general idea of what you think will work but are not quite sure. This is often the case when using genetic algorithms, because in many cases you have a huge search space (the set of all possible solutions) and traditional methods will not work for the problem. However, a general process is followed when implementing genetic algorithms.
Before anything can take place, you must have a problem that is fit for a genetic algorithm and an established way to encode solutions to that particular problem. Generating an initial population is the first step in any genetic algorithm. The initial population is usually generated randomly or based upon some user-defined parameters, but it could make use of a seed so that you start with a specific known initial population.
In some cases starting with a specific population is very useful. One instance where you may want to use a predefined initial population is when it is the best result from a previous run of the genetic algorithm and you are trying to improve upon a solution you already have.
The initial population is where the genetic algorithm starts from, and to be able to continue on, each individual candidate solution must be evaluated using a user-defined fitness function. The fitness function, also known as a cost function, is used to determine how well a candidate solution solves a specific problem or how well it meets the criteria being searched for. This same fitness function will be used to evaluate every single candidate solution that is created. A number is usually assigned to each candidate solution, indicating how well the solution meets the criteria. Without the fitness function, the genetic algorithm would have no way to determine the fitness of each individual solution.
Once an initial population has been generated and there is a fitness function available to evaluate each solution, selection can take place. Selection is used to determine which solutions to select as mates or to carry over into the next generation. There are a number of ways to perform selection, and it is up to the programmer or user to choose a method. Although certain selection schemes will take certain individuals directly into the next generation, generally selection is used to select solutions for mating, which occurs in the form of crossover, also known as genetic recombination. [4]
Generating the next generation, and all successive generations, generally occurs in two steps: mating and mutation. Chromosomal crossover occurs during the sexual reproduction of most living creatures. It determines what mix of the parents' traits a child gets, so the genetic algorithm mimics this process to generate the children of the next generation. This crossover process can be viewed as mating for the solutions.
As with selection, crossover and creation of the next generation can be done in a number of ways, as long as the specified number of solutions is generated for the next generation. While crossover gives us the children that make up the next generation, it is important that mutation occurs in order to introduce new traits into the solutions that may not have existed before. In nature, mutations can occur at nearly any time, from sexual reproduction prior to birth to the cellular reproduction that occurs on a daily basis in living creatures. With genetic algorithms, mutations are applied after a new generation has been created, based upon a defined mutation rate. The mutation rate is the chance for each trait of a solution to change and is usually quite low, since a very high mutation rate would mean losing most traits inherited from the parents, or producing completely random solutions.
After mating has occurred to create a new generation and mutation has been applied to these children, creation of a new generation is complete. [4]
Figure 1: Flowchart of the genetic algorithm process, where evaluating solutions means fitness evaluation of each solution in that generation, and the convergence check determines whether a termination condition is true.
Using the fitness function, each solution in the new generation must then be evaluated and assigned a fitness value, just as was done with the previous generation. Once each solution has been evaluated, the algorithm determines whether it should continue on and create another generation or terminate, based upon some defined criteria. The genetic algorithm may be set up to halt once a specific fitness value is reached, some condition is present in a solution, or the generation is converging toward one solution. Another termination method is to stop after a specific number of generations or after a set period of time.

[Figure 1: Generate Initial Population → Evaluate Solutions → Select Mates → Mating → Mutation → Convergence Check → Done]
If no termination criteria are met, the process of selection will be performed on the current generation. The selected solutions will then go through mating and mutation in order to create a completely new generation. This new generation will then go through evaluation as well, to determine whether the algorithm should continue or terminate. This process of selection, mating, mutation, and evaluation can occur as many times as needed to reach a certain stopping criterion. [6]
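To make the loop just described concrete, here is a minimal Python sketch. The specific choices are assumptions for illustration, not from the text: a toy "count the 1-bits" fitness function, bit-string solutions, tournament selection of size 3, one-point crossover, and per-bit mutation. A real implementation would swap in problem-specific pieces for each of these.

```python
import random

random.seed(1)       # fixed seed so the sketch is reproducible

GENOME_LEN = 20      # bits per solution
POP_SIZE = 30        # solutions per generation
MUTATION_RATE = 0.01 # per-bit chance of flipping
GENERATIONS = 100

def fitness(genome):
    # Toy fitness: count of 1-bits (the "OneMax" problem).
    return sum(genome)

def select(population):
    # Tournament selection of size 3: fittest of a random group wins.
    group = random.sample(population, 3)
    return max(group, key=fitness)

def crossover(a, b):
    # One-point crossover producing two children.
    point = random.randint(1, GENOME_LEN - 1)
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(genome):
    # Flip each bit independently with probability MUTATION_RATE.
    return [bit ^ 1 if random.random() < MUTATION_RATE else bit
            for bit in genome]

# Generate the initial population randomly.
population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]

for generation in range(GENERATIONS):
    # Termination check: stop once a perfect solution exists.
    if fitness(max(population, key=fitness)) == GENOME_LEN:
        break
    # Selection, mating, and mutation build the next generation.
    next_gen = []
    while len(next_gen) < POP_SIZE:
        child1, child2 = crossover(select(population), select(population))
        next_gen += [mutate(child1), mutate(child2)]
    population = next_gen

print(fitness(max(population, key=fitness)))  # best fitness found
```

Each block of the sketch corresponds to a stage of the Figure 1 flowchart; only the fitness function and encoding need to change for a different problem.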
Encoding
Before any process such as evaluation by a fitness function or crossover can occur, a solution must first be encoded. Encoding is the process of representing a solution to a problem as digital data. An encoded solution can be viewed as a chromosome, which is able to undergo crossover and be evaluated for fitness. When programming a genetic algorithm, figuring out how to encode a solution is one of the first steps that must occur, since nearly all of the other processes depend on how the solution is encoded. Encoding is also one of the areas where the programmer has the largest amount of work to do, since they need to determine what data needs to be encoded for solutions. All relevant traits for a solution must be captured in the encoded solutions. If any relevant traits are left out, the algorithm will not work as well and may produce unusable data. On the other hand, storing lots of unrelated data that has no impact on the solution will cause bloating of the solution and slow the algorithm, especially when evaluating solution fitness. [6]
The simplest and most common encoding method is to use a binary bit string to represent a specific solution. A solution can be encoded in many different ways, as long as the encoding method makes sense for the problem and captures all relevant solution traits. Nearly any data structure can be used when encoding a solution, even real numbers and programmer-created data types. The types used to encode a solution will of course have an effect on how the mating, mutation, and evaluation functions are implemented. [5]
Another simple type of encoding is permutation encoding, which uses a string of numbers to represent a sequence. This method is commonly used for encoding solutions to ordering problems where each number has a meaning. It could be used for the traveling salesman problem, where each number would represent a city. The traveling salesman problem is to find the shortest path that visits each city exactly once and ends at the same city that was the starting point of the path, given a list of cities and their respective distances from each other. So 1 2 3 4 5 could be a solution under this encoding scheme. For the traveling salesman problem, each number would represent a city, and the order of the numbers would be the order in which the cities are visited; in this case city 1 is visited first, followed by city 2, and finally city 5 last. This method works great for representing the traveling salesman problem and other ordering problems, but it does require more work when it comes to implementing the crossover and mutation functions. When crossover and mutation are performed, the result must be a solution that makes sense for the ordering problem and is a real sequence. For instance, having duplicates of a number in a solution for the traveling salesman problem violates the rules, since each city can only be visited once for this problem, and having duplicates means it is not even an actual solution to the problem. [4]
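To make permutation encoding concrete, a tour can be stored as a list of city numbers, and its validity checked by confirming that every city appears exactly once. The five-city size here is an arbitrary choice for illustration:

```python
NUM_CITIES = 5

def is_valid_tour(tour):
    # A valid permutation-encoded tour visits each city exactly once,
    # so sorting it must yield exactly the cities 1..NUM_CITIES.
    return sorted(tour) == list(range(1, NUM_CITIES + 1))

print(is_valid_tour([1, 2, 3, 4, 5]))  # True: every city appears once
print(is_valid_tour([1, 2, 2, 4, 5]))  # False: city 2 is duplicated
```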
Fitness Function Evaluation
The fitness function is what evaluates each individual solution and assigns a fitness value to it. Designing the fitness function is often the hardest part of implementing a genetic algorithm, since it is what guides the genetic algorithm in selection, which determines what traits the next generation will have. The fitness function is what controls the results of the genetic algorithm and can be very complicated when working with big solutions. Without the fitness function, the genetic algorithm would just be a blind search; the fitness function is what guides the algorithm. When creating the fitness function, if certain traits are desired, they must be properly represented and weighted by the function, otherwise the algorithm may not work toward solutions that have those traits. For the traveling salesman problem, the fitness function is quite straightforward: it would just evaluate the total distance traveled for the solution and assign that value to the solution, where a lower value is better. [4]
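The traveling salesman fitness function just described can be written directly: sum the distances along the tour, including the leg back to the starting city. The distance matrix below is made-up illustration data:

```python
# Hypothetical symmetric distance matrix; distances[i][j] is the
# distance between city i and city j (0-indexed).
distances = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]

def tour_length(tour):
    # Total distance of the tour, wrapping around to the starting city.
    total = 0
    for i in range(len(tour)):
        total += distances[tour[i]][tour[(i + 1) % len(tour)]]
    return total

print(tour_length([0, 1, 3, 2]))  # 2 + 4 + 3 + 9 = 18; lower is fitter
```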
The fitness function is not limited to simple numeric calculations and can be as complicated as needed. In some cases the solution may represent a model or structure, and the fitness function is actually a simulation in which the model is run, with the results of the simulation becoming the assigned fitness value. Fitness functions can also be set up to evaluate solutions based upon multiple attributes and give the solution a value for each attribute, so that more detailed, specific decisions can be made. Since genetic algorithms usually work with large solution spaces and often end up running through thousands of generations, it is important for the fitness function to be as fast as possible. For big problems, a fitness function may take a matter of hours to completely evaluate a solution, especially in areas like engineering, where complex simulations may be used. To help speed fitness evaluations up, the fitness of a solution may be approximated. Approximation is also very useful when unsure of an exact model or method to evaluate fitness.
Selection
Selection is the process of choosing individuals from the current generation to move into the breeding pool for the next generation, or directly into the next generation, depending upon the model being used. Prior to performing selection, each individual solution will be assigned a fitness value using the fitness function. The selection process uses these values to make decisions and choose solutions. There are various selection methods to choose from, since just choosing the fittest solutions is not always very helpful. Choosing only the fittest solutions can result in a uniform population in very early generations of the genetic algorithm. This may cause highly desired traits that only less fit solutions had to be discarded. It would then be necessary to rely on mutation to bring these desired traits back into the solutions, which could take a long time depending on the size and complexity of a trait.
One commonly used selection method is fitness proportionate selection, also known as roulette selection. This selection strongly applies the concept of survival of the fittest, where fitter solutions have a higher chance of being selected. This type of selection can be performed by normalizing the fitness values of an entire generation so that their sum is equal to 1, and then sorting the population by these fitness values, so that the most fit solution occupies the range from 0.0 to (individual solution fitness / total fitness of the entire generation), with the next most fit solution following. A random number can then be chosen between 0 and 1 to select an individual solution, and this can be done as many times as needed. [4]
Figure 2: A roulette selection arrangement, where F would have a value of 1 once these values are normalized, and r is a random value between 0 and the total fitness, here selecting B.
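The roulette scheme above can be sketched in Python: normalize the fitness values so the slices sum to 1, then spin by drawing a random number and walking the cumulative slices. (The population and fitness values below are made up for illustration; Python's `random.choices` with `weights` performs an equivalent weighted draw.)

```python
import random

def roulette_select(population, fitnesses):
    total = sum(fitnesses)          # used to normalize the slices to sum to 1
    spin = random.random()          # random point between 0 and 1
    cumulative = 0.0
    for solution, fit in zip(population, fitnesses):
        cumulative += fit / total   # this solution's slice of the wheel
        if spin <= cumulative:
            return solution
    return population[-1]           # guard against floating-point round-off

# Fitter solutions ("A", fitness 50) are chosen more often than
# weaker ones ("C", fitness 20) over many spins.
pool = [roulette_select(["A", "B", "C"], [50, 30, 20]) for _ in range(1000)]
```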
Another common selection method is tournament selection, where a group of a specified size is created from the current generation of solutions. The groups are filled with individual solutions chosen at random, by roulette selection, or by some other means. Once a group is full of solutions, the solution with the highest fitness in the group is declared the winner and moves into the breeding pool. Using this method, you can change the group size to influence the proportion of weak and strong solutions that move into the breeding pool. A large group size will generally result in only the fittest being selected, while a smaller group lets less fit solutions have a chance as well.
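Tournament selection is only a few lines; the population and fitness values here are hypothetical illustration data. With a group size equal to the whole population, the overall fittest solution always wins, which demonstrates the effect of group size described above:

```python
import random

def tournament_select(population, fitnesses, group_size):
    # Draw a random group of indices, then the fittest member wins.
    group = random.sample(range(len(population)), group_size)
    winner = max(group, key=lambda i: fitnesses[i])
    return population[winner]

population = ["A", "B", "C", "D", "E"]
fitnesses = [10, 40, 25, 5, 20]

# A group size covering the whole population always selects "B",
# the fittest; smaller groups give weaker solutions a chance.
print(tournament_select(population, fitnesses, group_size=5))  # B
```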
Roulette selection and tournament selection generally allow less fit solutions to be selected occasionally. There are also many selection methods that do not give less fit solutions a chance; one such method is elitism selection, in which only the absolute fittest solutions are chosen. Some other similar selection methods are choosing randomly from the top percentage, or simply truncating to the top fourth and selecting those. Solutions could be selected completely at random as well. If the fitness function ranks the solutions in multiple areas, the selection function can then perform selection based upon more than one area, which is known as multi-objective selection.
Crossover
When the breeding pool is filled with the selected solutions, crossover can then take place. Crossing over takes place during the sexual reproduction of animals, during prophase I of meiosis. This is the main part of sexual reproduction that determines what mix of traits a child gets from each parent, by exchanging genetic data between the parents. Genetic algorithms replicate this process to create the children of the next generation. While performing crossover is easier when working with a binary string solution, there are ways to perform crossover for real values and any other data type.
There are many different ways that crossover can be applied, but the simplest is one-point crossover. With single-point crossover you have two parents and you choose a point at random. At this point the data is then swapped to create two children of the next generation. Two-point crossover is another method, where two points are chosen at random instead of one. If more than two points are to be selected, multi-point crossover can be used, with as many points chosen as wanted. [6]
Figure 3: Illustrates one-point and two-point crossover with two parents and the resulting children.
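The one-point and two-point operations shown in Figure 3 reduce to list slicing for bit-string parents. This is a minimal sketch; the all-zeros and all-ones parents are chosen only to make the swapped segments visible:

```python
import random

def one_point_crossover(a, b):
    # Swap everything after a single random cut point.
    point = random.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]

def two_point_crossover(a, b):
    # Swap the segment between two random cut points.
    p1, p2 = sorted(random.sample(range(1, len(a)), 2))
    return (a[:p1] + b[p1:p2] + a[p2:],
            b[:p1] + a[p1:p2] + b[p2:])

parent1 = [0, 0, 0, 0, 0, 0, 0, 0]
parent2 = [1, 1, 1, 1, 1, 1, 1, 1]
child1, child2 = one_point_crossover(parent1, parent2)
print(child1, child2)  # e.g. [0, 0, 0, 1, 1, 1, 1, 1] and its mirror
```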
If crossover based on a number of points does not provide sufficient mixing, uniform crossover can be used. Uniform crossover makes use of a mixing ratio to give a child a percentage of each parent's data. Due to the high amount of mixing, a large amount of exploration occurs, since lots of different combinations of traits happen very quickly compared to one-point crossover.
Figure 4: Illustrates what uniform crossover could look like with a mixing ratio of 0.5, where each parent contributes 50% of the data to each child.
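Uniform crossover amounts to a per-trait coin flip weighted by the mixing ratio; with a ratio of 0.5, as in Figure 4, each parent contributes on average half of each child's data. A sketch:

```python
import random

def uniform_crossover(a, b, mixing_ratio=0.5):
    # For each position, child 1 takes parent b's trait with probability
    # mixing_ratio, otherwise parent a's; child 2 takes the other trait.
    child1, child2 = [], []
    for trait_a, trait_b in zip(a, b):
        if random.random() < mixing_ratio:
            child1.append(trait_b)
            child2.append(trait_a)
        else:
            child1.append(trait_a)
            child2.append(trait_b)
    return child1, child2

c1, c2 = uniform_crossover([0] * 10, [1] * 10)
```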
Besides these methods of crossover, there are many alternatives that can be used, as long as the child that is created gets a mix of more than one parent's traits. There are even some forms of crossover that use three parents. Sometimes special forms of crossover are required, in particular when working with ordering problems such as the traveling salesman problem. If a simple one-point crossover method were used for this problem, the child solution could very easily end up with duplicate cities, thus creating an invalid solution for the traveling salesman problem. The programmer would need to create an order-based crossover method, or check and repair the solution if using a normal crossover method. [4]
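One common order-based scheme is ordered crossover (OX): copy a random slice from the first parent, then fill the remaining positions with the second parent's cities in the order they appear, skipping any already present. This guarantees the child is a valid permutation. A sketch producing one child:

```python
import random

def ordered_crossover(parent1, parent2):
    size = len(parent1)
    start, end = sorted(random.sample(range(size), 2))
    child = [None] * size
    child[start:end] = parent1[start:end]   # copy a slice from parent1
    # Fill the gaps with parent2's cities, in parent2's order,
    # skipping cities already copied from parent1.
    fill = [city for city in parent2 if city not in child[start:end]]
    for i in range(size):
        if child[i] is None:
            child[i] = fill.pop(0)
    return child

child = ordered_crossover([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])
print(sorted(child))  # [1, 2, 3, 4, 5] -- still a valid tour
```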
Mutation
Mutation is used to introduce new traits into the solutions as well as to help maintain the diversity of the solutions. The amount of mutation that can occur is controlled by a mutation rate, which is the chance for each trait of a solution to be changed. A solution may have as few as zero mutations, given a low mutation rate and chance, or every trait could be mutated with a 100% mutation rate, which would result in a completely random solution. Mutation also helps to move the solutions out of local optima when the solutions in past generations have been relatively similar and are not at a global optimum.
Figure 5: A mutation creates a solution that is away from the local optimum and has a much higher fitness value than the others, making it likely that the next generation will move away from the local optimum.
Mutation can be done easily when the solution is encoded as a binary bit string. When a trait represented by some bit needs to be mutated, as determined by the mutation rate and a random variable, flipping the bit is all that has to be done. One method that can be used for mutating a real number is to randomly change the value to the upper or lower bound of the number. Uniform mutation is a common way to handle float and integer traits. For uniform mutation, a value is randomly chosen between specified bounds, giving the programmer a large amount of control over each trait. [4]
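Both mutation styles just described can be sketched in a few lines; the rate and bounds below are arbitrary illustration values:

```python
import random

MUTATION_RATE = 0.05   # per-trait chance of mutating

def bit_flip_mutation(genome):
    # Each bit flips independently with probability MUTATION_RATE.
    return [bit ^ 1 if random.random() < MUTATION_RATE else bit
            for bit in genome]

def uniform_mutation(traits, lower, upper):
    # Each real-valued trait is replaced by a random value within its
    # bounds with probability MUTATION_RATE.
    return [random.uniform(lower, upper)
            if random.random() < MUTATION_RATE else t
            for t in traits]

mutated = bit_flip_mutation([0, 1, 0, 1, 1, 0, 0, 1])
```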
If mutation were not implemented in a genetic algorithm, no new traits other than the ones originally held by the initial generation would ever be introduced. Solutions would also converge very quickly, and as a result the algorithm would not explore nearly as much of the solution space, perhaps missing the very best solution. If solutions become too similar to each other, evolution will slow; and since the algorithm may be set up to terminate once solutions converge with each other, it could terminate early, before a good solution is reached. [6]
Summary
Genetic algorithms are best used for optimization problems, or when there is no fully described algorithm and no known solution to the problem. The full strength of genetic algorithms is most apparent when working with a large solution space, since the algorithm explores so well. Because of this, genetic algorithms often find solutions that would not even have been considered by traditional methods. If a problem is too complex, or not enough is known about the solution space for an analytical solution, a genetic algorithm may very well be able to solve the problem. Compared to many traditional problem-solving methods, genetic algorithms are very flexible and allow for a lot of tweaking. Run-time tweaking of the various rates and selection methods can even be implemented to give the user a large degree of control even while the algorithm is running.
One of the biggest strengths that genetic algorithms have is that they are intrinsically parallel, while most analytical solutions are not. If you have a generation size of 20, once all 20 of the children have been generated using crossover and mutation, the fitness function can be run for each child independently and concurrently. So the time to evaluate a generation is only the time of the slowest fitness evaluation for that generation, not the average fitness evaluation time multiplied by the number of solutions in the generation. For a generation size of 20, if programmed properly to make full use of parallel computing, it would take about 1/20th of the time to do the fitness evaluations compared to doing one fitness evaluation after another on a single core. Today, as parallel computing becomes increasingly common and computers have an increasing number of cores, this strength is even more apparent. [1]
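This intrinsic parallelism maps directly onto standard library tools; for example, Python's `concurrent.futures` can evaluate every child's fitness concurrently. The sketch below uses a deliberately slow stand-in fitness function and a thread pool (a real CPU-bound fitness function would use `ProcessPoolExecutor` instead, since threads only overlap while waiting):

```python
import concurrent.futures
import time

def slow_fitness(genome):
    # Stand-in for an expensive evaluation (e.g. a simulation).
    time.sleep(0.1)
    return sum(genome)

generation = [[i % 2] * 10 for i in range(20)]  # 20 candidate solutions

# Sequential evaluation would take roughly 20 * 0.1 s; with one worker
# per child, the whole generation takes about as long as the slowest
# single evaluation.
with concurrent.futures.ThreadPoolExecutor(max_workers=20) as pool:
    fitnesses = list(pool.map(slow_fitness, generation))

print(fitnesses)
```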
While genetic algorithms excel at solving some problems, they have a few weaknesses to be aware of. A genetic algorithm relies upon being designed very well, with well-tuned parameters, to be successful. If the encoding method or fitness function has any errors, the algorithm is likely to produce insignificant results. Also, if the algorithm is not carefully set up so that the mutation method and rate work as desired, the algorithm can get stuck in a local optimum. Similarly, the children of a generation may converge to a similar solution prematurely because of a bad mutation or crossover implementation or choice. While some methods of crossover and mutation are generally better than others, there is no definitive best method; the best method depends upon the problem being solved and the specific implementation. For big problems, the complexity of the fitness function can be extremely high, making it very hard to create a proper fitness function.
By utilizing the power of natural evolution, genetic algorithms make a great problem-solving tool when properly implemented. While a genetic algorithm does not promise to find the absolute optimal solution, it often finds solutions very close to optimal, and especially ones that would be ignored by traditional means. Genetic algorithms have gone from first being brought to the attention of computer scientists by John Holland in the mid-1970s to being utilized in multiple fields today, ranging from electrical engineering to financial forecasting [3][7].
References
[1] Affenzeller, M. (2009). Genetic algorithms and genetic programming: modern concepts and practical applications. Boca Raton: CRC Press.
[2] Alba, E., & Dorronsoro, B. (2008). Cellular genetic algorithms. Berlin: Springer.
[3] Altshuler, E., & Linden, D. (1997). Design of a wire antenna using a genetic algorithm. Journal of Electronic Defense, 20(7), 50-52. Retrieved March 15, 2012 from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.128.510&rep=rep1&type=pdf
[4] Buckland, M. (2002). AI techniques for game programming. Cincinnati, Ohio: Premier Press.
[5] Gen, M., & Cheng, R. (2000). Genetic algorithms and engineering optimization. New York: Wiley.
[6] Haupt, R. L., & Haupt, S. E. (2004). Practical genetic algorithms (2nd ed.). Hoboken, N.J.: John Wiley. Retrieved March 13, 2012 from http://thegrovelibrary-ng.com/admin/a2/b31xxx/c42kk/Practical%20Genetic%20Algorithms%20-%20Randy%20L.%20Haupt,%20Sue%20Ellen%20Haupt.pdf
[7] Mahfoud, S., & Mani, G. (1996). Financial forecasting using genetic algorithms. Applied Artificial Intelligence, 10(6), 543-566. Retrieved March 15, 2012 from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.86.9698&rep=rep1&type=pdf
[8] Mitchell, M. (1998). An introduction to genetic algorithms. Cambridge, Mass.: MIT Press.