# Genetic Algorithms


24 Oct 2013


Overview

Genetic Algorithms: a gentle introduction

What are GAs

How do they work? Why?

Critical issues

Use in Data Mining

GAs and statistics

decile performance maximization

multi-objective models

Natural Genetics to AI

Computational models inspired by biological evolution

survival of the fittest

reproduction through cross-breeding

Genetic Algorithms

Population-based search (parallel)

simultaneous search from multiple points in search space

useful in complex, unstructured search spaces

(less prone to local failures)

Population members: potential solutions

Population of solutions evolves from one generation to the next

Genetic Algorithms

Search objective

Fitness score for population members (fitness function)

Survival of the fittest

selection

Generating new solutions

“Mating” and reproduction of individuals

(crossover, mutation)

Basic Operation

Selection

Recombination

Crossover

Mutation

Generation t

Generation t+1
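
The cycle on this slide (selection, recombination by crossover, mutation, generation t to t+1) can be sketched as a short loop. This is a minimal illustration rather than a reference implementation: the toy fitness `one_max`, single-point crossover, and every parameter value (`pop_size`, `length`, `generations`, `rate`, `seed`) are assumptions chosen for the example, not taken from the slides.

```python
import random

def one_max(bits):
    """Toy fitness (assumed for illustration): number of 1-bits."""
    return sum(bits)

def select(population, weights, rng):
    """Pick one parent with probability proportional to its weight."""
    return rng.choices(population, weights=weights, k=1)[0]

def crossover(a, b, rng):
    """Single-point crossover producing two offspring."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

def evolve(pop_size=30, length=20, generations=40, rate=0.01, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # +1 keeps every selection weight positive, even for all-zero strings
        weights = [one_max(ind) + 1 for ind in population]
        next_gen = []
        while len(next_gen) < pop_size:
            p1 = select(population, weights, rng)
            p2 = select(population, weights, rng)
            for child in crossover(p1, p2, rng):
                next_gen.append(mutate(child, rate, rng))
        population = next_gen[:pop_size]  # generation t+1 replaces generation t
    return max(population, key=one_max)

best = evolve()
print(one_max(best), best)
```

Each pass through the outer loop builds generation t+1 entirely from offspring of generation t; other replacement schemes would change only the last line of the loop body.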

GAs: Parallel Search

[Figure: fitness plotted against x, showing a hill climber and points marked X]

GAs: Basic Principles

Representation of individuals

String of parameters (genes): chromosome

e.g. optimize a function F(p, q, r, s, t)

Population members: p q r s t

genotype and phenotype

Binary representation?

Population members as bit strings

F(p, q, r, s, t) as:

1 0 0 1 1 0 1 0 1 1 0 1 1 0 0 1 1 0 1 0

p q r s t

early theory in terms of binary strings (schema theorem)

unnecessary perversity?
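
To make the genotype/phenotype distinction concrete, the sketch below decodes the 20-bit string from this slide into five parameter values. It assumes the bits are split evenly (four per parameter) and mapped linearly onto a range `[lo, hi]`; the slide states neither, so `decode`, `n_params`, `lo`, and `hi` are hypothetical names and choices.

```python
def decode(bits, n_params=5, lo=0.0, hi=1.0):
    """Map a bit-string genotype to a list of real-valued parameters.

    Assumes equal-width genes and a linear mapping of each gene's
    unsigned integer value onto [lo, hi].
    """
    width = len(bits) // n_params
    values = []
    for i in range(n_params):
        gene = bits[i * width:(i + 1) * width]
        as_int = int("".join(str(b) for b in gene), 2)
        values.append(lo + (hi - lo) * as_int / (2 ** width - 1))
    return values

# The bit string shown on the slide, grouped four bits per parameter
chromosome = [1, 0, 0, 1,  1, 0, 1, 0,  1, 1, 0, 1,  1, 0, 0, 1,  1, 0, 1, 0]
p, q, r, s, t = decode(chromosome, lo=0.0, hi=15.0)
print(p, q, r, s, t)  # 9.0 10.0 13.0 9.0 10.0
```

With `lo=0` and `hi=15` each 4-bit gene decodes to its plain integer value, which makes the mapping easy to check by hand.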

GAs: Basic Principles

Survival of the fittest (fitness function)

numerical “figure of merit”/utility measure of an individual

tradeoff among multiple evaluation criteria

efficient evaluation

GAs: Basic Principles

Iterative search

population evolves over generations

Convergence

progression towards uniformity in population

premature convergence?

(local optima)

Typical GA Run

[Figure: best and average fitness plotted against generations]

Operators: Selection

Fitness-proportionate selection (f_i / f_avg, individual fitness over mean population fitness)

number of reproductive trials for individuals

Selection

Roulette-wheel selection (stochastic sampling with replacement)

wheel spaced in proportion to fitness values

N (pop size) spins of the wheel

Stochastic universal sampling

N equally spaced pins on wheel

single turn of the wheel
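
The two sampling schemes on this slide can be sketched side by side. Both functions return indices into the population; the function names, the use of `bisect` over cumulative fitness, and the example fitness values are assumptions made for illustration. Fitness values are assumed non-negative with a positive sum.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def roulette_wheel(fitnesses, n, rng):
    """n independent spins: each spin lands in a slot sized by fitness."""
    cumulative = list(accumulate(fitnesses))
    total = cumulative[-1]
    last = len(fitnesses) - 1
    return [min(bisect_right(cumulative, rng.random() * total), last)
            for _ in range(n)]

def stochastic_universal_sampling(fitnesses, n, rng):
    """One spin with n equally spaced pointers around the wheel."""
    cumulative = list(accumulate(fitnesses))
    total = cumulative[-1]
    step = total / n
    start = rng.random() * step  # single random offset for all pointers
    last = len(fitnesses) - 1
    return [min(bisect_right(cumulative, start + k * step), last)
            for k in range(n)]

rng = random.Random(1)
fitnesses = [1, 2, 3, 4]
print(roulette_wheel(fitnesses, 4, rng))
print(stochastic_universal_sampling(fitnesses, 4, rng))
```

Because the pointers in stochastic universal sampling are equally spaced, each individual receives either the floor or the ceiling of its expected number of trials, whereas independent roulette spins can deviate further from that expectation.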

Selection

Premature convergence

Fitness scaling

f' = f - (2 * avg - max)

Ranked fitness

Elitism

Steady-state selection

Demetic grouping
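
The fitness-scaling rule on this slide subtracts (2 * avg - max) from every raw fitness. One common reading, assumed here rather than stated on the slide, is that the adjusted values are then divided by their own average to give reproductive trials, which caps the best individual at about two trials. The sketch below follows that reading; the function name `scaled_trials`, the clamping of negative adjusted values to zero, and the sample fitness values are all assumptions.

```python
def scaled_trials(fitnesses):
    """Expected reproductive trials after subtracting (2*avg - max)."""
    n = len(fitnesses)
    avg = sum(fitnesses) / n
    best = max(fitnesses)
    offset = 2 * avg - best
    # Assumption: negative adjusted fitness is clamped to zero
    adjusted = [max(f - offset, 0.0) for f in fitnesses]
    adj_avg = sum(adjusted) / n
    if adj_avg == 0:
        return [1.0] * n  # all individuals equal: one trial each
    return [a / adj_avg for a in adjusted]

raw = [1, 2, 3, 10]
raw_avg = sum(raw) / len(raw)
print([f / raw_avg for f in raw])  # unscaled trials: best gets 2.5
print(scaled_trials(raw))          # scaled trials: best gets 2.0
```

In this example the dominant individual drops from 2.5 expected trials to 2.0 and the weaker ones gain, which is the intended brake on premature convergence. When clamping takes effect the cap is no longer exactly two.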

Operators: Crossover

Parent 1: axpsqvqbtpihd

Parent 2: qzxxaycgbtphw

crossover sites

Offspring 1: azpsavcbtpphd

Offspring 2: qxxxqyqgbtihw

(Uniform crossover)

combining good building blocks
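
The parent and offspring strings on this slide are consistent with uniform crossover in which the genes at positions 2, 5, 7 and 11 (counting from 1) are exchanged. The sketch below reproduces them with an explicit swap mask; the function name and the mask representation are assumptions, and in practice the mask would normally be drawn at random for each mating.

```python
def uniform_crossover(parent1, parent2, mask):
    """Exchange gene i between the parents wherever mask[i] is True."""
    child1, child2 = [], []
    for g1, g2, swap in zip(parent1, parent2, mask):
        child1.append(g2 if swap else g1)
        child2.append(g1 if swap else g2)
    return "".join(child1), "".join(child2)

p1 = "axpsqvqbtpihd"
p2 = "qzxxaycgbtphw"
swap_positions = {1, 4, 6, 10}  # zero-based crossover sites inferred from the slide
mask = [i in swap_positions for i in range(len(p1))]

o1, o2 = uniform_crossover(p1, p2, mask)
print(o1)  # azpsavcbtpphd
print(o2)  # qxxxqyqgbtihw
```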

Operators: Mutation

alters each gene with small probability

Before: x 1 y x 0 y 0 y y 0 x y x y

After: x 1 y x 0 y 1 y y 0 x x x y
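
A minimal sketch of per-gene mutation follows. Each gene is replaced, with small probability `rate`, by a different symbol from the alphabet; for a binary alphabet this is a bit flip. The function signature and the rate used in the example are assumptions for illustration.

```python
import random

def mutate(chromosome, rate, alphabet, rng):
    """Alter each gene independently with probability `rate`."""
    result = []
    for gene in chromosome:
        if rng.random() < rate:
            # Replace with a different symbol so the gene really changes
            alternatives = [a for a in alphabet if a != gene]
            result.append(rng.choice(alternatives))
        else:
            result.append(gene)
    return result

rng = random.Random(0)
original = [1, 0, 0, 1, 1, 0, 1, 0]
print(mutate(original, rate=0.1, alphabet=(0, 1), rng=rng))
```

With a low rate most offspring pass through unchanged, so mutation acts as a background source of variation rather than the main search operator.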

Non-Binary Representations

Integer, real-number, order-based, rules, ...

Binary or Real-valued?

real representations give faster, more consistent, more accurate results

High-level representation

intuitive, can utilize specialized operators

effective search over complex spaces

Real-valued representation

Parent1: 3.45 0.56 6.78 0.976 2.5

Parent2: 0.98 1.06 4.20 0.34 1.8

Offspring1: 3.22 0.56 6.78 0.65 2.12

Offspring2: 1.43 1.06 4.20 0.41 1.93

(Arithmetic crossover)
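
Arithmetic crossover replaces selected genes with weighted averages of the parents' values. The sketch below uses a single mixing weight `alpha` and complementary weights for the two children, which is one common formulation and an assumption here. It blends the first, fourth and fifth genes as the slide appears to, but it does not attempt to reproduce the slide's offspring values, which do not correspond to a single `alpha`.

```python
def arithmetic_crossover(parent1, parent2, alpha, positions=None):
    """Blend the genes at `positions`; copy the rest unchanged.

    child1 gets alpha*a + (1-alpha)*b, child2 the complementary mix.
    """
    if positions is None:
        positions = range(len(parent1))
    child1, child2 = list(parent1), list(parent2)
    for i in set(positions):
        a, b = parent1[i], parent2[i]
        child1[i] = alpha * a + (1 - alpha) * b
        child2[i] = (1 - alpha) * a + alpha * b
    return child1, child2

parent1 = [3.45, 0.56, 6.78, 0.976, 2.5]
parent2 = [0.98, 1.06, 4.20, 0.34, 1.8]
c1, c2 = arithmetic_crossover(parent1, parent2, alpha=0.9, positions=(0, 3, 4))
print(c1)
print(c2)
```

Blended genes always fall between the two parental values, so this operator explores the interior of the region spanned by the parents rather than jumping outside it.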

High-level representation

[Figure: Parent1 and Parent2 recombined into Offspring1 and Offspring2]

High-level representation

Generalize/Specialize

Tree-structured representation (GP)

[Figure: parse tree with root /, whose children are * and 5; * has children x and log; log has child y]

(x * log(y)) / 5

Automated learning of programs (originally)

parse tree expressions

Non-linear interaction terms

Function set: internal nodes

{+, -, *, /, log}

terminal set: leaf nodes

{constants, variables}
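
A parse tree built from this function set and terminal set can be represented as nested tuples and evaluated recursively. The sketch below encodes the slide's expression (x * log(y)) / 5. The tuple encoding, the `FUNCTIONS` table, and the choice of natural logarithm are assumptions; division and log are left unprotected, so inputs such as y <= 0 raise an error.

```python
import math

# Function set: internal nodes (natural log is an assumption)
FUNCTIONS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
    "log": lambda a: math.log(a),
}

def evaluate(node, env):
    """Evaluate a tree: tuples are function nodes, strings are variables,
    anything else is a constant."""
    if isinstance(node, tuple):
        op, *args = node
        return FUNCTIONS[op](*(evaluate(arg, env) for arg in args))
    if isinstance(node, str):
        return env[node]
    return node

# (x * log(y)) / 5
tree = ("/", ("*", "x", ("log", "y")), 5)
value = evaluate(tree, {"x": 2.0, "y": math.e})
print(value)
```

Genetic operators on this representation work on subtrees: crossover swaps a subtree of one parent with a subtree of the other, and mutation replaces a subtree with a newly generated one.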

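Tree-structured representation

Representing complex patterns

[Figure: parse tree with root if, whose children are AND, 0 and +; AND has children < (y, 7) and > (x, 2); + has children * (2, x) and y]

If (y < 7) and (x > 2) then 0 else 2x + y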

Genetic search: Issues

Coding scheme and fitness function are critical

the “art” in GA design!

General mechanism so robust that, within reasonable margins, parameter settings are not critical.

Representation to match problem, domain

utilizing domain knowledge

problem-specific crossover, mutation, selection

Flexibility in fitness function formulation

Genetic search: Issues

Stochastic search

initial populations, probabilistic operators

multiple runs with different random streams

Initializing population with known solutions

seeding initial population with solutions from multiple, independent runs

Genetic search: Issues

Guarantees optimality? No.

But...

especially useful where traditional approaches fail

Parallelizable for large data

multi-processor, networked machines

Using GAs?

When to use a GA?

How long does it take?

Will it perform better?

Using GAs

population size

mutation, crossover rates

how many generations

multiple runs

Is it a “black-box”?

Data characteristics

Fitness function

GA parameters

GA Application Examples

Function optimizers

difficult, discontinuous, multi-modal, noisy functions

Combinatorial optimization

layout of VLSI circuits, factory scheduling, traveling salesman problem

Design and Control

bridge structures, neural networks, communication networks design; control of chemical plants, pipelines

GA Application Examples

Machine learning

classification rules, economic modeling, scheduling strategies

Portfolio design, optimized trading models, direct