Lecture 7: Deep Learning - Genetic Algorithms


Deep Learning

Roger S. Gaborski


Visual System


Visual cortex is defined in terms of hierarchical regions: V1, V2, V3, V4, V5, MST

Some regions may be bypassed, depending on the features being extracted

The visual input becomes more abstract as the signals are processed by individual regions



Multiple Stages of Processing

[Diagram: INPUT DATA → Layer 1 → Layer 2 → Layer 3 → OUTPUT DATA, with features extracted at each layer; the level of abstraction increases toward the output]

Traditional Neural Networks


Typically 2 layers, one hidden layer and one output layer


Uncommon to have more than 3 layers


INPUT and TARGET training data


Backward Error Propagation (BEP) becomes ineffective with more than 3 layers

[Diagram: two-layer network trained with INPUT and TARGET VALUE; source: www.nd.com]

Multilayer Neural Network


Build a 6-layer feed-forward neural network


Train with common training algorithm


RESULT: Failure

[Diagram: INPUT DATA → Layer 1 → Layer 2 → Layer 3 → OUTPUT DATA, with the unknown targets of the intermediate layers marked "? ? ? ?"]

Deep Belief Networks


Need an approach that will allow training layer by layer

BUT I don't know the output of each layer

Hinton (2006), "A fast learning algorithm for deep belief networks"

Restricted Boltzmann Machine: a single layer of hidden neurons not connected to each other

Fast algorithm that can find parameters even for deep networks (Contrastive Divergence Learning)




Can Evolutionary Algorithms be used to evolve a network?




One Layer Example of Neural Network Architecture

[Diagram: 600 input neurons connected to 400 hidden neurons through weight matrix W (600 x 400); INPUT VECTOR v feeds forward to FEATURE VECTOR h]

v[1x600] * W[600x400] = h[1x400]


How Do We Find Weight Matrix W


We need a ‘measure of error’


One approach is to reconstruct the input vector v using the following equation:

v_reconstruct[1x600] = h[1x400] * W^T[400x600]



The difference between the reconstructed v and the original v is a measure of error:

Err = Σ (v_reconstruct − v)²
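As a sketch, the whole round trip can be wrapped as a fitness function for a candidate W (the function name reconstruction_error is a hypothetical helper, not from the slides; lower error = fitter):

function Err = reconstruction_error(W, v)
    % Score a candidate weight matrix by how well the input survives
    % the round trip through the hidden layer (lower is better)
    h = v * W;                          % feed forward   [1 x 400]
    v_reconstruct = h * W';             % feed backward  [1 x 600]
    Err = sum((v_reconstruct - v).^2);  % sum of squared differences
end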



SUMMARY: Goal


Propagate input vector to hidden units (feed forward)

Propagate features extracted by hidden layer back to input neurons (feed backward)

Goal: input vector and reconstructed input vector equivalent (Input = Reconstructed Input)

Use an evolutionary strategy approach to find W

This approach allows for any type of activation function and any network topology



Use Evolutionary Algorithm to Find W

[Diagram: 600 input neurons connected to 400 hidden neurons through weight matrix W (600 x 400)]

Evolutionary Strategy ES(lambda+mu)


Lambda: size of population

Mu: fittest individuals in population, selected to create new population

Let lambda = 20, mu = 5

Each selected fittest individual will create lambda/mu children (20/5 = 4)

The size of the new population will be lambda + mu = 25


Population


Randomly create the first population of potential W solutions:

Current_population(:,:,k) = 0.1*randn([num_v, num_h]);  % for k = 1..lambda

Evaluate and rank each weight matrix W in the population

Select the mu fittest weight matrices. These will be used to create children (new potential solutions)

Create a new population from the mu fittest weight matrices plus lambda/mu children for each

Population increases from lambda to lambda+mu, but the error is monotonically decreasing (the mu fittest are retained each generation)

Keep track of the fittest W matrix (a sketch of the full loop follows below)
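A minimal sketch of the ES(lambda+mu) loop described on this and the previous slide, using the hypothetical reconstruction_error helper above; Gaussian mutation with a fixed step size sigma is an assumption, since the slides do not specify the variation operator:

num_v = 600; num_h = 400;
lambda = 20;  mu = 5;                  % population size, parents kept
n_child = lambda/mu;                   % 4 children per parent
sigma = 0.01;                          % mutation step size (assumed)
V = rand(50, num_v);                   % illustrative training vectors

% Randomly create the first population of candidate W matrices
pop = 0.1*randn(num_v, num_h, lambda);

for epoch = 1:5000
    % Evaluate: total reconstruction error of each W over the training set
    n = size(pop, 3);
    err = zeros(1, n);
    for k = 1:n
        for i = 1:size(V, 1)
            err(k) = err(k) + reconstruction_error(pop(:,:,k), V(i,:));
        end
    end

    % Rank and keep the mu fittest as parents (retaining them keeps the
    % best error monotonically decreasing)
    [~, idx] = sort(err);
    parents = pop(:,:,idx(1:mu));

    % New population: the mu parents plus lambda mutated children
    pop = cat(3, parents, zeros(num_v, num_h, lambda));   % lambda+mu = 25
    c = mu;
    for p = 1:mu
        for j = 1:n_child
            c = c + 1;
            pop(:,:,c) = parents(:,:,p) + sigma*randn(num_v, num_h);
        end
    end
end

best_W = pop(:,:,1);   % fittest W from the last ranking (parents sorted first)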


Final Selection of W Matrix

[Diagram: 600 input neurons (data) connected to 400 hidden neurons (h) through the best weight matrix W, in terms of smallest reconstruction error]

Examples from the Simple Digit Problem

Results for Binary Digit Problem, Epochs = 50

[Image: BEST W AFTER 50 EPOCHS]

50 Epochs


Sample Results for Digit Problem, Epochs = 500

[Image: BEST W AFTER 500 EPOCHS]

Sample Results for Digit Problem, Epochs = 5000

[Image: BEST W AFTER 5000 EPOCHS]

Results for Digit Problem, Epochs = 50,000

[Image: BEST W AFTER 50,000 EPOCHS]


Repeat Process with Second W using 400 Features as Input

[Diagram: 600 input neurons → best weight matrix W (smallest reconstruction error) → 400 features → weight matrix W2 → 300 features]

Evolve W2 using the 400 features as input
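A sketch of this stacking step, assuming best_W from the first layer and a hypothetical evolve_W wrapper around the ES loop above (neither name is from the slides):

H  = V * best_W;              % [n x 600] data -> [n x 400] feature rows
W2 = evolve_W(H, 400, 300);   % evolve a 400x300 matrix on those features
H2 = H * W2;                  % [n x 300] second-level features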


Face Recognition


The same approach is used to recognize faces

[Diagram: 625 input neurons connected to 400 hidden neurons through weight matrix W (625 x 400)]

5000 Epochs, lambda=20, mu = 5

[Image: 20 faces in training data; grayscale, 25x25 pixels]

5000 Epochs, Typical Results


Successfully Reconstruct Images from Features


Random Data


Face Classifier

[Diagram: 625 input neurons → hidden layers with matrices W1, W2, …, W → FEATURES → matrix V → two output neurons (FACE / NON-FACE); train on faces and non-faces]
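A sketch of the classifier stage under the same assumptions: images flow through the evolved matrices (W1, W2 from the sketches above) to produce features, and a final matrix V maps the features to two output neurons; how V itself is trained is not stated on the slide, so the random V here is only a placeholder:

X = rand(40, 625);               % 40 example images, 25x25 grayscale, flattened
F = X * W1 * W2;                 % features from the evolved weight matrices
V = 0.1*randn(size(F, 2), 2);    % feature-to-output matrix (placeholder)
scores = F * V;                  % [40 x 2] output neuron activations
[~, class] = max(scores, [], 2); % 1 = FACE, 2 = NON-FACE (by convention)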


Face Detection

Note: Face not in original training data

[Image: detection result; red mark at upper left-hand corner of face]