By: Chase Simmons
Genetic algorithms are the result of combining biology concepts with computer science in order to solve practical problems in many different fields, finding good solutions to optimization problems and many other types of problems. When properly set up, genetic algorithms are effective and efficient even when the search space is extremely large. Because of this, they have become increasingly popular in a wide variety of fields, both for research and for solving real-world problems.
Using the techniques of crossover, mutation, and selection, a program is able to generate generation after generation of potential solutions, each building upon the successes of the previous one and gradually working toward an optimal solution. This process is repeated until a satisfactory solution is found, one that properly meets all of the particular fitness criteria.
Genetic algorithms are a specific
type or subgroup of
evolutionary algorithms and solve
problems by using
operators such as selection, mutation and crossover
on strings of
numbers that represent solutions
. The algorithm itself is
based upon fundamental modern
biology concepts and knowledge, in particular the idea of natural evolution and the field of
genetics that supports it.
The theory of natural selection from Charles Darwin is very much the basis for how a genetic algorithm operates overall and is the groundwork for how we think of natural evolution, with each generation generally being more fit than the last.
The genetic operators that make up the inner workings of the algorithm are based upon the concepts of Gregor Mendel, who is known as the father of genetics. Many parallels can be drawn between the biological process of natural evolution and the genetic algorithm: a genetic algorithm works with a population of candidate solutions from the solution space, and each solution can be viewed as a gene or chromosome in biological terms.
A genetic algorithm first starts with an initial population of candidate solutions to work from and goes through the process of evolution to generate a new generation of candidates. After a number of generations of this process, the candidate solutions in the last generation will be closer to the global maximum, the ideal solution. As with many evolutionary algorithms, genetic algorithms excel when it comes to optimization problems, but they can be used in a large variety of other applications and in practically any industry.
The genetic algorithm was invented by the computer scientist John Holland, who is known as the father of genetic algorithms. Holland's student David Goldberg has made significant contributions to genetic algorithms by releasing various papers showing real-world uses of genetic algorithms as well as implementations. Goldberg's work has led to many of the modern applications of genetic algorithms today. One of Goldberg's real-world examples was his dissertation, which used genetic algorithms to optimize gas pipeline operation and control.
The genetic algorithm gives the programmer a rather large amount of freedom in how the various areas are implemented. In fact, a big part of working with genetic algorithms is tweaking the various settings, such as the mutation rate and generation size, since you may have a general idea of what you think will work but are not quite sure. This is often the case when using genetic algorithms, because in many cases you have a huge search space, which is the set of all possible solutions, and traditional methods will not work for the problem. Whatever the settings, however, a general process will be followed.
Before anything can take place you must have a problem that is fit for a genetic algorithm and have a way to encode solutions. Generating an initial population is the first step in any genetic algorithm. The initial population is usually generated randomly or based upon some user-defined parameters, but it could also make use of a seed, so you are starting with a specific known initial population. In some cases starting with a specific population is very useful. One case where you may want to use a predefined initial population is when it contains the best solution from a previous run of the algorithm and you are trying to improve upon a solution you already have.
The initial population is where the genetic algorithm starts from, and for it to continue on, each individual candidate solution must be evaluated using a user-defined fitness function. The fitness function, also known as a cost function, is used to determine how well a candidate solution solves a specific problem or how well the solution meets the criteria being searched for. This same fitness function will be used to evaluate every single candidate solution that is created. A number is usually assigned to the candidate solution, indicating how successfully or how well the solution meets the criteria. Without the fitness function, the genetic algorithm would have no way to determine the fitness of each individual.
Once an initial population has been generated and there is a fitness function available to evaluate it, selection can take place. Selection is used to determine which solutions to select as mates or to carry over into the next generation. There are a number of ways to perform selection, and it is up to the programmer or user to choose a method. Although certain selection schemes will directly take certain individuals into the next generation, generally selection is used to select solutions for mating, which occurs in the form of crossover, also known as recombination. The creation of the next and all successive generations generally occurs in two steps: mating and mutation.
Chromosomal crossover occurs during sexual reproduction of most living creatures and determines the mix of the parents' traits that a child gets, so the genetic algorithm mimics this process to generate the children of the next generation. The crossover process can be viewed as mating for the solutions. As with selection, the creation of the next generation can be done in a number of ways, as long as the specified number of solutions is generated for the next generation. While crossover gives traits to the children that make up the next generation, it is important that mutation occurs in order to introduce new traits into the solutions that may not have existed prior. In nature, mutation can occur at nearly any point, from sexual reproduction prior to birth to the cellular reproduction that occurs on a daily basis in living creatures. With genetic algorithms, mutations are applied after a new generation has been created, based upon a defined mutation rate. The mutation rate is the chance for a solution to change and is usually quite low, since a very high mutation rate would mean losing most traits inherited from the parents and producing just completely random solutions. Once mating has occurred to create a new generation and mutation has been applied to these children, the creation of a new generation is complete.
Figure 1: Flowchart of the genetic algorithm process, where evaluation is the evaluation of each solution in that generation and the convergence check determines whether a termination condition is true.
Using the fitness function, each solution in the new generation must then be evaluated and assigned a fitness value, as was done with the previous generation. Once each solution has been evaluated, the algorithm determines whether it should continue on and create another generation or terminate, based upon some defined criteria. The genetic algorithm may be set up to halt once a specific fitness value is reached, once some condition is present in a solution, or once the generation is converging towards one solution. Another termination method is to stop after a specific number of generations or period of time.
If the termination criteria are not met, the process of selection will be performed on the generation. These selected solutions will then go through mating and mutation in order to create a completely new generation. This new generation will then go through evaluation as well, in order to determine whether the algorithm should continue or terminate. This process of selection, mating, mutation, and evaluation can occur as many times as needed in order to reach a certain stopping criterion.
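The whole cycle just described can be condensed into a short sketch. Everything concrete below is an assumption chosen for illustration rather than a fixed part of the algorithm: a bit-string encoding, two-way tournament selection, single-point crossover, per-bit mutation, and a toy "OneMax" fitness function that simply counts 1 bits.

```python
import random

def run_ga(fitness, length=20, pop_size=30, generations=50,
           crossover_rate=0.9, mutation_rate=0.02):
    """Minimal generational GA over bit strings (illustrative sketch only)."""
    # Step 1: random initial population of bit strings.
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = [(fitness(ind), ind) for ind in pop]

        # Selection: simple two-way tournament into the breeding pool.
        def select():
            a, b = random.sample(scored, 2)
            return a[1] if a[0] >= b[0] else b[1]

        children = []
        while len(children) < pop_size:
            p1, p2 = select(), select()
            if random.random() < crossover_rate:
                # Single-point crossover: splice the parents at a random point.
                point = random.randint(1, length - 1)
                child = p1[:point] + p2[point:]
            else:
                child = p1[:]
            # Mutation: each bit flips independently with a small probability.
            child = [bit ^ 1 if random.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        pop = children
    return max(pop, key=fitness)

# OneMax: fitness is simply the number of 1 bits in the solution.
best = run_ga(fitness=sum)
```

A real application would swap in its own encoding, fitness function, and termination check in place of the fixed generation count used here.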
Before any process such as evaluation by a fitness function or crossover can occur, a solution must first be encoded. Encoding is the process of representing a solution to a problem in a digital form. An encoded solution can be viewed as a chromosome, which is able to undergo crossover and be evaluated for fitness.
When programming a genetic algorithm, figuring out how to encode a solution is one of the first steps that must occur, since nearly all of the other processes depend on how the solution is encoded. Encoding is one of the areas where the programmer has the largest amount of work to do, since they need to determine what data needs to be encoded for solutions. All relevant traits for a solution must be captured in the encoded solutions. If relevant traits are left out, the algorithm will not work as well and may produce unusable data. Including lots of unrelated data that has no impact will bloat the solutions and slow the algorithm, especially when evaluating solution fitness. The simplest and most common encoding method is to just use a binary bit string to represent a solution.
A solution can be encoded in many different ways, as long as the method makes sense for the problem and captures all relevant solution traits. Nearly any data structure can be used when encoding a solution, even real numbers and programmer-created data types. The types used to encode a solution will of course have an effect on how the crossover, mutation, and evaluation functions are implemented.
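As a small illustration of binary encoding, the sketch below maps a single real-valued trait onto a fixed-length bit string. The range [0.0, 10.0] and the 16-bit precision are arbitrary choices for the example, not part of any standard scheme.

```python
# Sketch of binary encoding/decoding for a single real-valued trait.
BITS = 16            # chromosome length (arbitrary example precision)
LO, HI = 0.0, 10.0   # allowed range of the trait (arbitrary example bounds)

def encode(x):
    """Map a real value in [LO, HI] to a BITS-long bit string."""
    level = round((x - LO) / (HI - LO) * (2**BITS - 1))
    return format(level, f"0{BITS}b")

def decode(bits):
    """Map a bit string back to a real value in [LO, HI]."""
    return LO + int(bits, 2) / (2**BITS - 1) * (HI - LO)

chromosome = encode(7.25)   # a 16-character string of 0s and 1s
value = decode(chromosome)  # recovers ~7.25, up to quantization error
```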
Another simple type of encoding is permutation encoding, which is a string of numbers that represents a sequence. This method is commonly used for encoding solutions to ordering problems, where each number has a meaning. It could be used for the traveling salesman problem, where each number would represent a city. The traveling salesman problem is to find the shortest path that visits each city exactly once and ends at the same city that was the starting point of the path, given a list of cities and their respective distances from each other. So 1 2 3 4 5 could be a solution under this encoding scheme.
For the traveling salesman problem, each number would represent a city and the order of the numbers would be the order in which the cities are visited, so in this case city 1 is visited first, followed by city 2, and finally city 5 last. This method works great for representing the traveling salesman ordering problem and other ordering problems, but it does require more work when it comes to implementing the crossover and mutation functions, since when crossover and mutation are performed, the result must be a permutation that makes sense for the ordering problem and is a real sequence. For instance, having duplicates of a number in a solution for the traveling salesman problem violates the rules, since each city can only be visited once for this problem, and having duplicates means it is not an actual solution to the problem.
The fitness function is what evaluates each individual solution and assigns a fitness value to it. Designing the fitness function is often the hardest part of implementing a genetic algorithm, since it is what guides the genetic algorithm in selection, which determines what traits the next generation is going to have. The fitness function is what is used to control the results of the genetic algorithm and can be very complicated when working with big solutions. Without the fitness function, the genetic algorithm would just be a blind search; the fitness function is what guides the algorithm.
When creating the fitness function, if certain traits are desired, they must be properly represented and weighted by the function, otherwise the algorithm may not work towards solutions that have these traits. For the traveling salesman problem, the fitness function is quite straightforward: it would just evaluate the total distance traveled for the solution and assign that value to the solution, where a lower value is better.
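A minimal version of that traveling salesman fitness function might look like the following, where the 4-city distance matrix is invented purely for the example:

```python
# TSP fitness as described above: total tour length, lower is better.
# The symmetric 4-city distance matrix is a made-up example.
DIST = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]

def tour_length(tour):
    """Sum the leg distances, returning to the starting city at the end."""
    total = 0
    for i in range(len(tour)):
        a = tour[i]
        b = tour[(i + 1) % len(tour)]  # wraps back to the start
        total += DIST[a][b]
    return total

cost = tour_length([0, 1, 3, 2])  # 2 + 4 + 3 + 9 = 18
```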
Fitness functions are not limited to simple numeric calculations, but can be as complicated as needed. In some cases the solution may represent a model or structure, and the fitness function is actually a simulation where the model is put in and the results of the simulation are the assigned fitness value. Fitness functions can be set up to evaluate solutions based upon multiple attributes and give the solution values for each attribute, so more detailed, specific decisions can be made.
Since genetic algorithms usually work with large solution spaces and often end up running through thousands of generations, it is important to have the fitness function be as fast as possible. For big problems, a fitness function may take a matter of hours to completely evaluate a solution, especially in areas like engineering, where complex simulations may be used. To speed things up, the fitness of a solution may be approximated. Approximation is also very useful when unsure of an exact model or method for evaluating fitness.
Selection is the process of choosing individuals from the current generation to move into the breeding pool for the next generation, or directly into the next generation, depending upon the model being used. Prior to performing selection, each individual solution will be assigned a fitness value using the fitness function. The selection process uses these values to make decisions and choose solutions. There are various selection methods to choose from, since just choosing the fittest solutions is not always very helpful. In very early generations of the genetic algorithm, choosing only the fittest solutions can cause highly desired traits that only lesser-fit solutions may have to be discarded, and can result in having to rely on mutation to bring these desired traits back into the solutions, which could take a long time depending on the size and complexity of a trait.
One commonly used selection method is fitness proportionate selection, which is also known as roulette selection. This selection strongly applies the concept of survival of the fittest, where fitter solutions have a higher chance of being selected. This type of selection can be performed by normalizing the fitness values of a generation so that their sum is equal to 1, and then sorting the population by these fitness values, so that the most fit solution will go from 0.0 to (individual solution fitness / total fitness of the entire generation) and the next most fit solution will follow. A random number can then be chosen between 0 and 1 to choose an individual solution, and this can be done as many times as needed.
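One common way to implement roulette selection is with a running cumulative sum rather than an explicit sort, which is equivalent to the normalized arrangement described above. This sketch assumes all fitness values are non-negative:

```python
import random

def roulette_select(population, fitnesses):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    r = random.uniform(0, total)   # spin the wheel
    running = 0.0
    for individual, f in zip(population, fitnesses):
        running += f
        if running >= r:
            return individual
    return population[-1]          # guard against floating-point rounding

pop = ["A", "B", "C"]
picked = roulette_select(pop, [1.0, 3.0, 6.0])  # "C" wins most often
```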
Figure 2: A roulette selection arrangement, where F would have a value of 1 once these values are normalized, and r is a random value between 0 and the total fitness, here selecting B.
Another common selection method is tournament selection, where a group of a specified size is created from the current generation of solutions. The groups are filled with individual solutions that are chosen at random, by roulette selection, or by some other means. Once a group is full of solutions, the solution with the highest fitness in the group is declared the winner and moves into the breeding pool. Using this method, you can change the group size to influence the balance of weak and strong solutions that move into the breeding pool. A large group size will generally result in only the fittest being selected, while a smaller group lets less fit solutions have a chance as well.
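Tournament selection as described can be sketched in a few lines; the group size and the population used here are arbitrary example values:

```python
import random

def tournament_select(population, fitness, group_size=3):
    """Draw a random group and return its fittest member."""
    group = random.sample(population, group_size)
    return max(group, key=fitness)

# Larger groups favor the strong; a group_size of 1 is pure random selection.
pop = [[1, 0, 1], [0, 0, 0], [1, 1, 1], [0, 1, 0]]
winner = tournament_select(pop, fitness=sum, group_size=4)
# With the group covering the whole population, the overall fittest wins.
```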
Roulette selection and tournament selection generally allow less fit solutions to be selected occasionally. There are also many selection methods that do not give the less fit solutions a chance. One such method is elitism selection, in which only the absolute fittest solutions are chosen. Some other similar selection methods are choosing randomly from a top percentage, or just truncating to the top fourth and selecting from it. Solutions could be selected at complete random as well. If the fitness function ranks the solutions in multiple areas, the selection function can then perform selection based upon more than one area, which is known as multi-objective selection. Once the breeding pool is filled with the selected solutions, crossover can take place.
Crossing over takes place during sexual reproduction in animals, during prophase I of meiosis. This is the main part of sexual reproduction that determines what mix of traits a child gets from each parent, by exchanging genetic data between the parents. Genetic algorithms replicate this process to create the children of the next generation. While performing crossover is easier when working with a binary string solution, there are ways to perform crossover for real values and any other data type.
There are many different ways that crossover can be applied, but the simplest is single-point crossover. With single-point crossover you have two parents and you choose a point at random; the data on either side of this point is then switched to create two children of the next generation. Two-point crossover is another method, where two points are chosen at random instead of one. If more than two points are to be selected, multi-point crossover can be used, with as many points chosen as wanted.
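Assuming solutions encoded as lists, single-point and two-point crossover might be sketched as follows:

```python
import random

def single_point_crossover(p1, p2):
    """Swap the tails after one random cut point, producing two children."""
    point = random.randint(1, len(p1) - 1)
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def two_point_crossover(p1, p2):
    """Swap the middle segment between two random cut points."""
    a, b = sorted(random.sample(range(1, len(p1)), 2))
    return (p1[:a] + p2[a:b] + p1[b:],
            p2[:a] + p1[a:b] + p2[b:])

c1, c2 = single_point_crossover([0] * 8, [1] * 8)
```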
Figure 3: Single-point and two-point crossover with two parents and the resulting children.
If crossover based on a number of points does not provide sufficient mixing, uniform crossover can be used. Uniform crossover makes use of a mixing ratio to give a child a percentage of each parent's traits. Due to the high amount of mixing, a large amount of exploration occurs, since lots of different combinations happen very quickly compared to one- or two-point crossover.
Figure 4: Illustrates what a uniform crossover could look like with a mixing ratio of 0.5, where
each parent would then
contribute 50% of the data to each child.
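Uniform crossover can be sketched as below, where the mixing ratio is the probability that each position is taken from the first parent:

```python
import random

def uniform_crossover(p1, p2, mixing_ratio=0.5):
    """Each position independently comes from p1 with probability mixing_ratio."""
    child = []
    for g1, g2 in zip(p1, p2):
        child.append(g1 if random.random() < mixing_ratio else g2)
    return child

child = uniform_crossover([0] * 10, [1] * 10)  # a random blend of both parents
```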
Besides these methods of crossover, there are many alternatives that can be used, as long as the child that is created gets a mix of more than one parent's traits. There are even some forms of crossover that use three parents. Sometimes special forms of crossover are required, in particular when working with ordering problems such as the traveling salesman problem. If a simple one-point crossover method were used for this problem, the child solution could very easily end up with duplicates of cities, thus creating an invalid solution for the traveling salesman problem. So the programmer would need to create an order-based crossover method, or check and repair the solution if they used a normal crossover method.
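One well-known order-based method is order crossover (OX). The following is a simplified sketch of the idea: it preserves a slice from one parent and fills the remaining positions in the other parent's order, so the child is always a valid permutation with no duplicated cities.

```python
import random

def order_crossover(p1, p2):
    """Simplified order crossover: keep a slice of p1, fill the rest in p2's order."""
    n = len(p1)
    a, b = sorted(random.sample(range(n), 2))
    child = [None] * n
    child[a:b] = p1[a:b]                  # copy a slice from parent 1
    kept = set(child[a:b])
    # Remaining cities, taken in the order they appear in parent 2.
    fill = [city for city in p2 if city not in kept]
    for i in range(n):
        if child[i] is None:
            child[i] = fill.pop(0)
    return child

child = order_crossover([0, 1, 2, 3, 4], [4, 3, 2, 1, 0])
# child is always a valid tour: no duplicated or missing cities
```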
Mutation is used to introduce new traits into the solutions as well as to help maintain the diversity of the solutions. The amount of mutation that can occur is controlled by a mutation rate, which is the chance for each trait of a solution to be changed. A solution may have as few as zero mutations with a low mutation rate and some luck, or every trait could be mutated with a 100% mutation rate, which would result in a completely random solution. Mutation also helps to move the solutions out of local optimums when the solutions in past generations have been relatively similar and are not at a global optimum.
Figure 5: A mutation creates a solution that is away from the local optimum and has a much higher fitness value than the others, making it likely that the next generation will move away from the local optimum.
Mutation can be done easily when the solution is encoded as a binary bit string. When a trait needs to be mutated, as determined by the mutation rate and a random variable, and is represented by some bit, flipping the bit is all that has to be done. One method that can be used for mutating a real number is to randomly change the value to the upper or lower bound of the number. Uniform mutation is a common way to handle float and integer traits. For uniform mutation, a value is randomly chosen between specified bounds, giving the programmer a large amount of control over the mutation.
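Both bit-flip mutation and uniform mutation can be sketched briefly; the rates and bounds here are arbitrary example values:

```python
import random

def bit_flip_mutate(bits, rate=0.01):
    """Flip each bit independently with probability `rate`."""
    return [b ^ 1 if random.random() < rate else b for b in bits]

def uniform_mutate(values, rate, lo, hi):
    """Replace each value with a uniform random draw from [lo, hi] at `rate`."""
    return [random.uniform(lo, hi) if random.random() < rate else v
            for v in values]

mutated = bit_flip_mutate([0] * 16, rate=1.0)  # a rate of 1.0 flips every bit
```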
If mutations were not implemented in a genetic algorithm, no new traits other than the ones originally held by the initial generation would ever be introduced. Solutions would also converge very quickly, and as a result the algorithm would not explore nearly as much of the solution space, perhaps missing the very best solution. If solutions become too similar to each other, evolution will slow, and since the algorithm may be set up to terminate once solutions converge with each other, it could terminate early, before a good solution is found.
Genetic algorithms are best used for optimization problems, or when there is no fully described algorithm and no known solution to the problem. The full strength of genetic algorithms is most apparent when working with a large solution space, since the algorithm explores so widely. Because of this, genetic algorithms often find solutions that would not have been found by traditional methods. If a problem is too complex, or not enough is known about the solution for an analytical approach, a genetic algorithm may very well be able to solve the problem.
Compared to many traditional problem-solving methods, genetic algorithms are very flexible and allow for a lot of tweaking. Run-time tweaking of the various rates and selection methods can even be implemented to give the user a large degree of control even while the algorithm is running. One of the biggest strengths that genetic algorithms have is that they are naturally parallelizable, while most analytical solutions are not. If you have a generation size of 20, once all 20 of the children have been generated using crossover and mutation, the fitness function can be run for each child independently, and the time to evaluate a generation is only the time of the slowest fitness evaluation for that generation, not the average fitness function time multiplied by the number of solutions in the generation. So for a generation size of 20, if programmed properly to make full use of parallel computing, evaluating the generation would take roughly one twentieth of the time compared to doing one evaluation after another on a single core. Today, as parallel computing becomes increasingly common and computers have an increasing number of cores, this strength is even more apparent.
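A sketch of evaluating a generation in parallel with Python's standard library is below. The fitness function is a made-up stand-in, and for a truly CPU-bound fitness a ProcessPoolExecutor would be the better choice than the thread pool shown here:

```python
import concurrent.futures
import math

def slow_fitness(x):
    """Stand-in for an expensive evaluation (e.g. a simulation)."""
    return sum(math.sqrt(i) for i in range(1000)) + x

generation = list(range(20))  # placeholder "children" of one generation

# Evaluate every child of the generation concurrently; for CPU-bound
# fitness functions, swap in ProcessPoolExecutor to use multiple cores.
with concurrent.futures.ThreadPoolExecutor(max_workers=20) as pool:
    fitnesses = list(pool.map(slow_fitness, generation))
```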
While genetic algorithms excel at solving some problems, they have a few weaknesses to be aware of. The genetic algorithm relies upon being designed very well, and upon tweaking of the parameters, to be successful. If the encoding method or fitness function has any errors, the algorithm is likely to produce insignificant results. Also, if the algorithm is not carefully set up so that the mutation method and rate are working as desired, the algorithm can get stuck in a local optimum. Similarly, the children of a generation may converge to a similar solution prematurely because of a bad mutation and crossover implementation or choice. While some methods of crossover and mutation are generally better than others, there is no definitive best method; the best method depends upon the problem that is being solved and the specific implementation. For big problems, the complexity of the fitness function can be extremely high, making it very hard to evaluate solutions in a reasonable amount of time.
By utilizing the power of natural evolution, genetic algorithms make a great problem-solving tool when properly implemented. While the genetic algorithm does not promise to find the absolute optimal solution, it often finds solutions very close to optimal, and especially ones that would be ignored by traditional means. Genetic algorithms have gone from being first brought to the attention of computer scientists by John Holland in the mid-1970s to being utilized in multiple fields today, ranging from electrical engineering to financial forecasting.