# Genetic algorithms (pdf)


2 October 2013


Brandon Andrews

What are genetic algorithms?

3 steps

Applications to Bioinformatics

Invented and published in 1975 by John Holland

Cells have DNA, which defines their properties

Reproduction crosses DNA from both parents, merging their properties

During this step random mutations can occur

A test of the fitness of the organism is performed

Scores the organism against others based on criteria for survival

Essentially evolution

Selection step

Based on the calculated fitness

Reproduction step

Mutations

Strategies for crossing

Termination step

When the goal is met

1) Generate random properties (chromosomes) for N entities

2) Calculate their fitness and discard the ones that fall below the threshold

Can be determined through a simulation

3) Randomly cross over pairs that survive the selection step

Also randomly choose properties and mutate them. This could be as simple as jittering them

4) Go to step 2 until a goal is reached

Return the best set of properties
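
The four steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the presenter's implementation: the toy fitness function (number of 1-bits), the "keep the better half" threshold, the parameter defaults, and the names `fitness`, `crossover`, `mutate`, and `run_ga` are all made up for the example.

```python
import random

def fitness(chromosome):
    # Assumed toy fitness: the number of 1-bits (to be maximized).
    return sum(chromosome)

def crossover(a, b, rng):
    # One-point crossover at a random cut position.
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(chromosome, rate, rng):
    # Flip each bit independently with probability `rate`.
    return [1 - g if rng.random() < rate else g for g in chromosome]

def run_ga(n=20, length=16, mutation_rate=0.05, max_generations=200, seed=0):
    rng = random.Random(seed)
    # 1) Generate random chromosomes for N entities.
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(n)]
    for _ in range(max_generations):
        # 2) Calculate fitness and discard the weaker half (assumed threshold).
        population.sort(key=fitness, reverse=True)
        survivors = population[: n // 2]
        # 4) Stop once the goal is reached (here: every bit is set).
        if fitness(survivors[0]) == length:
            break
        # 3) Randomly cross pairs of survivors and mutate the children.
        children = []
        while len(survivors) + len(children) < n:
            a, b = rng.sample(survivors, 2)
            children.append(mutate(crossover(a, b, rng), mutation_rate, rng))
        population = survivors + children
    # Return the best set of properties found.
    return max(population, key=fitness)

best = run_ga()
print(best, fitness(best))
```

Survivors are carried over unchanged in this sketch, so the best fitness never gets worse from one generation to the next; that is a design choice of the example, not a requirement of genetic algorithms.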

Could be anything

The goal is normally to minimize or maximize the fitness function after each step

How often crossovers happen (the crossover probability)

0% means no crossover occurs and both parents are simply moved to the next step

100% means all of the parents are crossed and only their children are moved to the next step

The idea is that, hopefully, the good properties of both parents are merged, or the good parent is preserved completely if it has no flaws that can be fixed via a crossing pair
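
A small Python sketch of how a crossover probability could be applied when building the next step. The names `one_point_crossover` and `next_generation`, the pairing of neighbouring parents, and the string chromosomes are assumptions made for illustration.

```python
import random

def one_point_crossover(a, b, rng):
    # Swap the tails of two parents at a random cut, giving two children.
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def next_generation(parents, crossover_rate, rng):
    """Pair up neighbouring parents (assumes an even count) and cross
    each pair with probability `crossover_rate`."""
    offspring = []
    for a, b in zip(parents[0::2], parents[1::2]):
        if rng.random() < crossover_rate:
            # Crossed: only the children move on.
            offspring.extend(one_point_crossover(a, b, rng))
        else:
            # Not crossed: both parents move on unchanged.
            offspring.extend([a, b])
    return offspring

rng = random.Random(1)
parents = ["AAAA", "BBBB", "CCCC", "DDDD"]
print(next_generation(parents, 0.0, rng))  # the parents, unchanged
print(next_generation(parents, 1.0, rng))  # only children
```

Values between the two extremes keep some parents intact while still mixing others, which is the trade-off described above.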

The probability that part of the chromosome is changed after a crossing

0% if none of it is changed

Not useful, since variety is needed to approach the best solution; otherwise you’re stuck with the first generated properties

100% if all of it is changed

Not useful, since it negates the point of crossing at all and essentially turns the algorithm into a random search

The concept is to stop the algorithm from halting at a local maximum. The mutations have a chance to generate small improvements
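
As an illustration of the mutation probability, the sketch below jitters each property with a given probability. The real-valued chromosome, the `jitter` width, and the function name `mutate` are assumptions for the example.

```python
import random

def mutate(chromosome, mutation_rate, rng, jitter=0.1):
    """Nudge each gene by up to +/- `jitter` with probability `mutation_rate`."""
    return [
        g + rng.uniform(-jitter, jitter) if rng.random() < mutation_rate else g
        for g in chromosome
    ]

rng = random.Random(42)
genes = [0.5, 1.0, 1.5, 2.0]
print(mutate(genes, 0.0, rng))   # identical to the input
print(mutate(genes, 0.25, rng))  # each gene has a 25% chance of being jittered
print(mutate(genes, 1.0, rng))   # every gene is jittered
```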

When the expected error is low

Sometimes it’s hard to calculate an error since the solution isn’t known

Or when the results stop decreasing (or stop increasing, depending on the problem) for a few iterations
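
Both termination criteria can be sketched as one check. The function name `should_stop`, the `patience` window, and the `tolerance` are hypothetical; the history is assumed to hold the best error per iteration, with lower being better (for a maximization problem, negate the values first).

```python
def should_stop(history, target_error=None, patience=5, tolerance=1e-9):
    """Decide whether to terminate, given the best error of each iteration."""
    if not history:
        return False
    # Criterion 1: the error is low enough (only usable if a target is known).
    if target_error is not None and history[-1] <= target_error:
        return True
    # Criterion 2: no improvement during the last `patience` iterations.
    if len(history) > patience:
        best_before = min(history[:-patience])
        best_recent = min(history[-patience:])
        return best_recent >= best_before - tolerance
    return False

print(should_stop([0.5, 0.01], target_error=0.05))       # True: target reached
print(should_stop([5, 4, 3, 3, 3, 3, 3, 3], patience=5))  # True: stalled
print(should_stop([10, 9, 8, 7, 6, 5, 4, 3], patience=5)) # False: still improving
```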

It might be obvious, but genetic algorithms by design give approximate solutions, since they attempt to optimize toward a solution

The result is only as good as the fitness function, the number of iterations, and the crossing and mutation probabilities

Multiple Sequence Alignment

Initial generation: random generation of an alignment based on the alignments of the given sequences

Authors do not agree on the initial size of the population

Selection via a tournament-style pairing, crossing the possible alignments

The fitness function

“Sum of pair” Objective Function (everyone uses a different one)

The survival rate is different for each alignment

Sum all alignment scores together and take a percentage for each alignment

Basically, better alignments have a higher chance of surviving
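
A rough Python sketch of a sum-of-pairs score and the resulting survival percentages. Since every author uses a different objective function, the match, mismatch, and gap values here are arbitrary assumptions, as are the function names and the shift that keeps negative scores usable as percentages.

```python
import random
from itertools import combinations

def pair_score(x, y, match=1, mismatch=-1, gap=-2):
    # Assumed scoring scheme; real tools often use a substitution matrix.
    if x == "-" and y == "-":
        return 0
    if x == "-" or y == "-":
        return gap
    return match if x == y else mismatch

def sum_of_pairs(alignment):
    """Score an alignment given as a list of equal-length gapped strings."""
    return sum(
        pair_score(x, y)
        for column in zip(*alignment)
        for x, y in combinations(column, 2)
    )

def survival_percentages(scores):
    """Each alignment's share of the summed scores."""
    # Shift so every value is positive (an assumption to cope with negative scores).
    shifted = [s - min(scores) + 1 for s in scores]
    total = sum(shifted)
    return [s / total for s in shifted]

alignments = [
    ["AC-GT", "ACGGT", "AC-GT"],
    ["ACGT-", "ACGGT", "A-CGT"],
]
scores = [sum_of_pairs(a) for a in alignments]
chances = survival_percentages(scores)
print(scores)   # [8, -5]
print(chances)  # the first alignment gets the larger share

# Better alignments are more likely to be drawn as survivors.
survivors = random.Random(0).choices(alignments, weights=chances, k=2)
```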

Reproduction

Crossing uses a “one-point crossover”

Takes the first half of the first alignment and crosses it with the second half of the second alignment

AB|CD and EF|GH -> ABGH

Or “point-to-point crossover”

Random index is chosen

ABC|D and EFG|H -> ABCH
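
The two crossing strategies above can be reproduced on plain strings. The function names are hypothetical, and this simplification ignores the bookkeeping a real alignment crossover needs to keep every sequence complete.

```python
import random

def crossover_at(first, second, cut):
    # Keep `first` up to the cut and `second` from the cut onward.
    return first[:cut] + second[cut:]

def one_point_crossover(first, second):
    # Cut both parents in the middle.
    return crossover_at(first, second, len(first) // 2)

def point_to_point_crossover(first, second, rng):
    # Cut both parents at a randomly chosen index.
    return crossover_at(first, second, rng.randrange(1, len(first)))

print(one_point_crossover("ABCD", "EFGH"))  # ABGH
print(crossover_at("ABCD", "EFGH", 3))      # ABCH (the random index happened to be 3)
print(point_to_point_crossover("ABCD", "EFGH", random.Random(0)))
```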

Mutation

Remove or insert a gap into the alignment
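
A minimal sketch of the gap mutation on a single row of an alignment. The function name `mutate_gaps` and the 50/50 choice between removing and inserting are assumptions; a full implementation would also have to re-pad the other rows so that the alignment stays rectangular.

```python
import random

def mutate_gaps(row, rng):
    """Remove a randomly chosen existing gap, or insert a gap at a random position."""
    gap_positions = [i for i, c in enumerate(row) if c == "-"]
    if gap_positions and rng.random() < 0.5:
        i = rng.choice(gap_positions)
        return row[:i] + row[i + 1:]
    i = rng.randrange(len(row) + 1)
    return row[:i] + "-" + row[i:]

rng = random.Random(7)
row = "AC-GT"
mutated = mutate_gaps(row, rng)
print(row, "->", mutated)
```

The residues themselves are never changed, only the gap placement, so the underlying sequence is preserved.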

Obitko, M. (1998). Genetic Algorithms. Retrieved from http://www.obitko.com/tutorials/genetic-algorithms/

A. (2008). Applications of genetic