
Multilayer Perceptron

Neural Networks with One and More Layers


The association problem


$\xi$ - input to the network, of length $N_I$, i.e., $\{\xi_k;\ k = 1, 2, \dots, N_I\}$

$O$ - output, of length $N_o$, i.e., $\{O_i;\ i = 1, 2, \dots, N_o\}$

$\zeta$ - desired output, i.e., $\{\zeta_i;\ i = 1, 2, \dots, N_o\}$

$w$ - weights in the network, i.e., $w_{ik}$ is the weight between input $\xi_k$ and output $O_i$

$T$ - threshold value for an output unit to be activated

$g$ - function converting the input to output values between 0 and 1.
Special case: the threshold function, $g(x) = \theta(x) = 1$ if $x > 0$ and $0$ otherwise.

Given an input pattern $\xi$, we would like the output $O$ to be the desired
one, $\zeta$. Indeed, we would like this to hold for a set of $p$ input patterns
and desired output patterns $(\xi^\mu, \zeta^\mu)$, $\mu = 1, \dots, p$. The inputs and
outputs may be continuous or Boolean.
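
To make the notation concrete, here is a minimal sketch of a single-layer network computing $O$ from $\xi$ with the threshold function; the specific numbers are illustrative, not from the slides.

```python
import numpy as np

def forward(xi, w, T):
    """O_i = g(sum_k w_ik * xi_k - T_i), with g the threshold function."""
    h = w @ xi - T                 # net input to each output unit
    return (h > 0).astype(float)   # g(x) = theta(x)

xi = np.array([1.0, 0.0])          # input pattern, N_I = 2
w = np.array([[0.5, 0.5]])         # weights w_ik, N_o = 1
T = np.array([0.25])               # threshold per output unit
print(forward(xi, w, T))           # -> [1.]
```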


The geometric view of the weights


For the Boolean case, we want

$$O_i = \theta\Big(\sum_k w_{ik}\,\xi_k - T_i\Big) = \zeta_i .$$

The boundary between the positive and negative sides of the threshold is defined
by $\sum_k w_{ik}\,\xi_k = T_i$, which gives a plane (hyperplane) perpendicular to the weight vector $w_i$.

The solution is to find the hyperplane that separates all the
inputs according to the desired classification.

For example: the Boolean function AND.

[Figure: the four inputs of AND separated by a hyperplane (a line in two dimensions)]
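
As a hedged sketch of the AND example: with weights $w = (1, 1)$ and threshold $T = 1.5$ (one standard choice, not taken from the slides), the separating line is $\xi_1 + \xi_2 = 1.5$.

```python
import numpy as np

w = np.array([1.0, 1.0])   # the hyperplane is perpendicular to w
T = 1.5                    # separating line: xi_1 + xi_2 = 1.5

for xi in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    O = 1 if w @ np.array(xi, dtype=float) - T > 0 else 0
    print(xi, "->", O)     # only (1, 1) falls on the positive side
```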


Learning: Steepest descent on weights


The optimal set of weights minimizes the following cost:

$$E(w) = \frac{1}{2} \sum_{\mu, i} \big(\zeta_i^\mu - O_i^\mu\big)^2$$

The steepest descent method will find a local minimum via

$$\Delta w_{ik} = -\eta\, \frac{\partial E}{\partial w_{ik}}$$

or

$$\Delta w_{ik} = \eta \sum_\mu \big(\zeta_i^\mu - O_i^\mu\big)\, g'(h_i^\mu)\, \xi_k^\mu ,$$

where the update can be done one pattern at a time, $\eta$ is the "learning
rate", and $h_i^\mu = \sum_k w_{ik}\,\xi_k^\mu - T_i$ is the net input to output unit $i$.
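
A minimal sketch of the per-pattern update, assuming a differentiable sigmoid $g$ (introduced on the next slide) so that $g'$ exists; variable names are illustrative.

```python
import numpy as np

def g(h, beta=1.0):
    """Sigmoid activation: a smooth version of the threshold function."""
    return 1.0 / (1.0 + np.exp(-2.0 * beta * h))

def train_step(w, T, xi, zeta, eta=0.5, beta=1.0):
    """One steepest-descent update on a single pattern (the delta rule)."""
    h = w @ xi - T                          # net input h_i
    O = g(h, beta)                          # output O_i
    delta = eta * (zeta - O) * 2.0 * beta * O * (1.0 - O)   # uses g' = 2*beta*g*(1-g)
    return w + np.outer(delta, xi), T - delta   # thresholds move opposite to weights
```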



Analysis of Learning Weights


The steepest descent rule

$$\Delta w_{ik} = \eta \sum_\mu \big(\zeta_i^\mu - O_i^\mu\big)\, g'(h_i^\mu)\, \xi_k^\mu$$

produces changes in the weight vector only in the direction
of each pattern vector $\xi^\mu$. Thus, components of the weight vector
perpendicular to the input patterns are left unchanged: if a vector is
perpendicular to all input patterns, adding it to the weights changes no
$h_i^\mu$, so it does not affect the solution.

For the sigmoid $g(h) = 1/(1 + e^{-2\beta h})$, the derivative is
$g'(h) = 2\beta\, g(h)\big(1 - g(h)\big)$, which is
largest when $|h|$ is small. Since $O = g(h)$, the largest
changes occur for units "in doubt" (close to the threshold value).

[Figure: the sigmoid activation $g(h)$, rising from 0 to 1]
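
A quick numeric check of this claim, with $\beta = 1$ (value chosen for illustration):

```python
import numpy as np

beta = 1.0
h = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
g = 1.0 / (1.0 + np.exp(-2.0 * beta * h))
gprime = 2.0 * beta * g * (1.0 - g)
print(gprime)   # ~[0.005, 0.210, 0.500, 0.210, 0.005]: largest at h = 0
```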


Limitations of the Perceptron


Many problems, as simple as the XOR problem, cannot be
solved by the perceptron (no hyperplane can separate the inputs).

[Figure: the four XOR inputs and a line that fails to separate them, labeled "Not a solution"]
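
A small sketch making this concrete: an exhaustive search over a grid of weights and thresholds (grid chosen for illustration) finds no single-unit perceptron computing XOR, while AND is found immediately.

```python
import itertools
import numpy as np

patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor = [0, 1, 1, 0]
and_ = [0, 0, 0, 1]

def solvable(targets, grid=np.linspace(-2, 2, 21)):
    """True if some (w1, w2, T) thresholds every pattern correctly."""
    for w1, w2, T in itertools.product(grid, repeat=3):
        outs = [1 if w1 * x1 + w2 * x2 - T > 0 else 0 for x1, x2 in patterns]
        if outs == targets:
            return True
    return False

print(solvable(and_))  # True: a separating line exists
print(solvable(xor))   # False: no line separates XOR
```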


Multilayer Neural Network



$V^L$ - input of layer $L$ to layer $L+1$ (with $V^0 = \xi$)

$w^L$ - weights connecting layer $L$ to layer $L+1$

$T^L$ - threshold values for the units at layer $L$

Thus, the output of a two-layer network is written as

$$O_i = g\Big(\sum_j w_{ij}^1\, g\Big(\sum_k w_{jk}^0\,\xi_k - T_j^1\Big) - T_i^2\Big)$$

The cost to optimize over all the weights is given by

$$E(w) = \frac{1}{2} \sum_{\mu, i} \big(\zeta_i^\mu - O_i^\mu\big)^2$$
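
A minimal sketch of this two-layer forward pass, following the notation above (array shapes and names are illustrative):

```python
import numpy as np

def g(h):
    """Threshold activation g(x) = theta(x)."""
    return (h > 0).astype(float)

def two_layer(xi, w0, T1, w1, T2):
    """O_i = g(sum_j w1_ij * g(sum_k w0_jk * xi_k - T1_j) - T2_i)."""
    V = g(w0 @ xi - T1)     # hidden-layer values V^1, fed to the next layer
    return g(w1 @ V - T2)   # output layer
```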


Properties and How it Works


With one input layer, one output layer, one or more hidden
layers, and enough units in each layer, any classification
problem can be solved.

Example: the XOR problem.

Later we address the generalization problem (performance on new examples).

[Figure: a two-layer network solving XOR, with input layer L=0, hidden layer L=1, and output layer L=2]
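
A hedged sketch of the XOR example, with one standard choice of weights (hidden units computing OR and NAND, whose AND is XOR); these particular values are not from the slides.

```python
import numpy as np

def g(h):
    return (h > 0).astype(float)   # threshold activation

# Hidden units compute OR and NAND; their AND is XOR.
w0 = np.array([[ 1.0,  1.0],     # OR:   active when xi_1 + xi_2 > 0.5
               [-1.0, -1.0]])    # NAND: active when xi_1 + xi_2 < 1.5
T1 = np.array([0.5, -1.5])
w1 = np.array([[1.0, 1.0]])      # AND of the two hidden units
T2 = np.array([1.5])

for xi in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    V = g(w0 @ np.array(xi, dtype=float) - T1)   # hidden layer L=1
    O = g(w1 @ V - T2)                           # output layer L=2
    print(xi, "->", int(O[0]))                   # reproduces XOR
```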


Learning: Steepest descent on weights
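
The equations on this slide did not survive extraction; what follows is a minimal sketch of the standard back-propagation updates for the two-layer network above (sigmoid $g$, per-pattern updates), under the assumption that this is what the slide derived.

```python
import numpy as np

def g(h):
    return 1.0 / (1.0 + np.exp(-2.0 * h))   # sigmoid with beta = 1

def backprop_step(xi, zeta, w0, T1, w1, T2, eta=0.5):
    """One steepest-descent step on E = 0.5 * sum_i (zeta_i - O_i)^2."""
    # Forward pass.
    V = g(w0 @ xi - T1)                    # hidden layer
    O = g(w1 @ V - T2)                     # output layer
    # Backward pass: delta terms, using g' = 2*g*(1-g) for beta = 1.
    d2 = (zeta - O) * 2.0 * O * (1.0 - O)          # output deltas
    d1 = (w1.T @ d2) * 2.0 * V * (1.0 - V)         # hidden deltas
    # Gradient steps (thresholds move opposite to weights).
    w1 += eta * np.outer(d2, V); T2 -= eta * d2
    w0 += eta * np.outer(d1, xi); T1 -= eta * d1
    return w0, T1, w1, T2
```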


Learning Threshold Values
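
The body of this slide was lost in extraction; presumably it covers updating the thresholds by steepest descent as well. One common formulation (an assumption here, not taken from the slide) treats each threshold as an ordinary weight attached to a constant input of $-1$, so the same delta rule applies:

```python
import numpy as np

# Hypothetical illustration: absorb the threshold T into the weights by
# appending a constant -1 input, so h = w @ xi - T = w_aug @ xi_aug.
xi = np.array([1.0, 0.0])
w = np.array([0.5, 0.5]); T = 0.75
xi_aug = np.append(xi, -1.0)   # extended input
w_aug = np.append(w, T)        # T becomes just another learnable weight
assert np.isclose(w @ xi - T, w_aug @ xi_aug)
```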