Neural networks

Chapter 20, Section 5
Outline

♦ Brains
♦ Neural networks
♦ Perceptrons
♦ Multilayer perceptrons
♦ Applications of neural networks
Brains

10^11 neurons of > 20 types, 10^14 synapses, 1 ms–10 ms cycle time

Signals are noisy "spike trains" of electrical potential
[Figure: schematic of a neuron: cell body (soma) with nucleus, dendrites, axon with axonal arborization, and synapses with an axon from another cell]
McCulloch–Pitts "unit"

Output is a "squashed" linear function of the inputs:

\[ a_i \leftarrow g(\mathit{in}_i) = g\Big(\sum_j W_{j,i}\, a_j\Big) \]

[Figure: unit diagram: input links carry activations a_j through weights W_{j,i} (including the bias weight W_{0,i} on the fixed input a_0 = −1) into the input function Σ, giving in_i; the activation function g produces the output a_i = g(in_i) on the output links]
A gross oversimplification of real neurons, but its purpose is
to develop understanding of what networks of simple units can do
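To make the unit concrete, here is a minimal Python sketch (not from the slides; the helper name unit_output is invented for illustration):

```python
def unit_output(weights, inputs, g):
    """A McCulloch-Pitts unit: a "squashed" linear function of the inputs.

    weights[0] is the bias weight W_0 on the fixed input a_0 = -1;
    g is the activation function (step, sigmoid, ...).
    """
    in_i = -weights[0] + sum(w * a for w, a in zip(weights[1:], inputs))
    return g(in_i)
```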
Activation functions

[Figure: (a) a step function and (b) a sigmoid function, each plotting g(in_i) against in_i]
(a) is a step function or threshold function

(b) is a sigmoid function 1/(1 + e^{−x})

Changing the bias weight W_{0,i} moves the threshold location
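Both activation functions are easy to state in code; a small sketch (the convention that the step fires at exactly zero is an assumption):

```python
import math

def step(x):
    # hard threshold at zero
    return 1.0 if x >= 0 else 0.0

def sigmoid(x):
    # smooth squashing of the input into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))
```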
Implementing logical functions

AND: W_0 = 1.5, W_1 = 1, W_2 = 1
OR:  W_0 = 0.5, W_1 = 1, W_2 = 1
NOT: W_0 = −0.5, W_1 = −1

McCulloch and Pitts: every Boolean function can be implemented
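A quick self-contained check of the weights above (the helper unit is invented for illustration; W_0 is the bias weight on the fixed input a_0 = −1):

```python
def step(x):
    return 1.0 if x >= 0 else 0.0

def unit(weights, inputs):
    # weights[0] is the bias weight W_0 on the fixed input a_0 = -1
    return step(-weights[0] + sum(w * a for w, a in zip(weights[1:], inputs)))

AND, OR, NOT = [1.5, 1, 1], [0.5, 1, 1], [-0.5, -1]
for a1 in (0, 1):
    assert unit(NOT, [a1]) == float(not a1)
    for a2 in (0, 1):
        assert unit(AND, [a1, a2]) == float(a1 and a2)
        assert unit(OR, [a1, a2]) == float(a1 or a2)
```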
Network structures

Feed-forward networks:
– single-layer perceptrons
– multi-layer perceptrons

Feed-forward networks implement functions, have no internal state

Recurrent networks:
– Hopfield networks have symmetric weights (W_{i,j} = W_{j,i});
g(x) = sign(x), a_i = ±1; holographic associative memory
– Boltzmann machines use stochastic activation functions,
≈ MCMC in Bayes nets
– recurrent neural nets have directed cycles with delays
⇒ have internal state (like flip-flops), can oscillate etc.
Feed-forward example

[Figure: network with input units 1 and 2, hidden units 3 and 4, output unit 5, connected by weights W_{1,3}, W_{1,4}, W_{2,3}, W_{2,4}, W_{3,5}, W_{4,5}]
Feed-forward network = a parameterized family of nonlinear functions:

\[ a_5 = g(W_{3,5}\, a_3 + W_{4,5}\, a_4)
      = g\big(W_{3,5}\, g(W_{1,3}\, a_1 + W_{2,3}\, a_2)
            + W_{4,5}\, g(W_{1,4}\, a_1 + W_{2,4}\, a_2)\big) \]

Adjusting weights changes the function: do learning this way!
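The nested expression above translates directly into code; a sketch assuming a sigmoid g and ignoring bias weights for brevity (the weight values are arbitrary):

```python
import math

def g(x):
    # sigmoid activation, one common choice for g
    return 1.0 / (1.0 + math.exp(-x))

def a5(a1, a2, W):
    # W maps (source, destination) pairs to weights
    a3 = g(W[1, 3] * a1 + W[2, 3] * a2)
    a4 = g(W[1, 4] * a1 + W[2, 4] * a2)
    return g(W[3, 5] * a3 + W[4, 5] * a4)

example_W = {(1, 3): 0.5, (2, 3): -0.5, (1, 4): 1.0,
             (2, 4): 1.0, (3, 5): 2.0, (4, 5): -1.0}
print(a5(1.0, 0.0, example_W))
```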
Single-layer perceptrons

[Figure: input units wired directly to output units by weights W_{j,i}; surface plot of perceptron output over inputs x1 and x2]
Output units all operate separately: no shared weights

Adjusting weights moves the location, orientation, and steepness of cliff
Expressiveness of perceptrons

Consider a perceptron with g = step function (Rosenblatt, 1957, 1960)

Can represent AND, OR, NOT, majority, etc., but not XOR

Represents a linear separator in input space:

\[ \sum_j W_j\, x_j > 0 \quad\text{or}\quad \mathbf{W} \cdot \mathbf{x} > 0 \]
[Figure: input-space plots of (a) x1 AND x2 and (b) x1 OR x2, each separable by a line; (c) x1 XOR x2, which no single line can separate]
Minsky & Papert (1969) pricked the neural network balloon
Perceptron learning

Learn by adjusting weights to reduce error on training set

The squared error for an example with input x and true output y is

\[ E = \tfrac{1}{2}\,\mathit{Err}^2 \equiv \tfrac{1}{2}\,\big(y - h_{\mathbf{W}}(\mathbf{x})\big)^2 \]

Perform optimization search by gradient descent:

\[ \frac{\partial E}{\partial W_j}
   = \mathit{Err} \times \frac{\partial \mathit{Err}}{\partial W_j}
   = \mathit{Err} \times \frac{\partial}{\partial W_j}
     \Big(y - g\big(\textstyle\sum_{j=0}^n W_j\, x_j\big)\Big)
   = -\mathit{Err} \times g'(\mathit{in}) \times x_j \]

Simple weight update rule:

\[ W_j \leftarrow W_j + \alpha \times \mathit{Err} \times g'(\mathit{in}) \times x_j \]

E.g., +ve error ⇒ increase network output
⇒ increase weights on +ve inputs, decrease on −ve inputs
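The update rule fits in a few lines of Python; a sketch with a sigmoid g and per-example updates (these choices, and the learning rate, are assumptions the rule itself does not fix):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_perceptron(examples, n_inputs, alpha=0.1, epochs=100):
    """Gradient-descent perceptron learning.

    examples: list of (x, y) pairs with y in {0, 1};
    W[0] is the bias weight on the fixed input x_0 = -1.
    """
    W = [0.0] * (n_inputs + 1)
    for _ in range(epochs):
        for x, y in examples:
            xs = [-1.0] + list(x)          # prepend the fixed bias input
            in_ = sum(w * xj for w, xj in zip(W, xs))
            out = sigmoid(in_)
            err = y - out                  # Err = y - h_W(x)
            gprime = out * (1.0 - out)     # g'(in) for the sigmoid
            for j in range(len(W)):
                W[j] += alpha * err * gprime * xs[j]
    return W
```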
Perceptron learning contd.

Perceptron learning rule converges to a consistent function
for any linearly separable data set
[Figure: proportion correct on test set vs. training set size, comparing perceptron and decision tree, on MAJORITY with 11 inputs and on the RESTAURANT data]
Perceptron learns majority function easily, DTL is hopeless

DTL learns restaurant function easily, perceptron cannot represent it
Multilayer perceptrons

Layers are usually fully connected;
numbers of hidden units typically chosen by hand

[Figure: layered network: input units (activations a_k) feed hidden units (a_j) through weights W_{k,j}, which feed output units (a_i) through weights W_{j,i}]
Expressiveness of MLPs

All continuous functions w/ 2 layers, all functions w/ 3 layers
[Figure: two surface plots of h_W(x1, x2): a ridge and a bump]
Combine two opposite-facing threshold functions to make a ridge

Combine two perpendicular ridges to make a bump

Add bumps of various sizes and locations to fit any surface

Proof requires exponentially many hidden units (cf DTL proof)
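The ridge-and-bump construction can be written out directly; a sketch using sigmoid units (the gains 5 and 10 and the thresholds are illustrative choices, not from the slides):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def ridge(u):
    # two opposite-facing soft thresholds along u make a ridge over (-1, 1)
    return sigmoid(5 * (u + 1)) + sigmoid(-5 * (u - 1)) - 1.0

def bump(x1, x2):
    # two perpendicular ridges, squashed once more, make a bump
    return sigmoid(10 * (ridge(x1) + ridge(x2) - 1.5))

print(bump(0, 0))   # near 1 inside the bump
print(bump(3, 3))   # near 0 outside
```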
Back-propagation learning

Output layer: same as for single-layer perceptron,

\[ W_{j,i} \leftarrow W_{j,i} + \alpha \times a_j \times \Delta_i \]

where \( \Delta_i = \mathit{Err}_i \times g'(\mathit{in}_i) \)

Hidden layer: back-propagate the error from the output layer:

\[ \Delta_j = g'(\mathit{in}_j) \sum_i W_{j,i}\, \Delta_i\,. \]

Update rule for weights in hidden layer:

\[ W_{k,j} \leftarrow W_{k,j} + \alpha \times a_k \times \Delta_j\,. \]

(Most neuroscientists deny that back-propagation occurs in the brain)
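The two update rules are enough for a working pass; a sketch for one hidden layer, sigmoid activations, and no bias weights (all simplifying assumptions):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def backprop_step(x, y, W_in, W_out, alpha=0.1):
    """One back-propagation update.

    W_in[j][k]:  weight from input k to hidden unit j
    W_out[i][j]: weight from hidden unit j to output unit i
    """
    # forward pass
    a_h = [sigmoid(sum(w * xk for w, xk in zip(row, x))) for row in W_in]
    a_o = [sigmoid(sum(w * aj for w, aj in zip(row, a_h))) for row in W_out]

    # output layer: Delta_i = Err_i * g'(in_i), with g'(in) = a(1 - a)
    d_out = [(yi - ai) * ai * (1 - ai) for yi, ai in zip(y, a_o)]

    # hidden layer: Delta_j = g'(in_j) * sum_i W_{j,i} Delta_i
    d_hid = [aj * (1 - aj) * sum(W_out[i][j] * d_out[i]
                                 for i in range(len(W_out)))
             for j, aj in enumerate(a_h)]

    # weight updates: W <- W + alpha * (upstream activation) * Delta
    for i, di in enumerate(d_out):
        for j, aj in enumerate(a_h):
            W_out[i][j] += alpha * aj * di
    for j, dj in enumerate(d_hid):
        for k, xk in enumerate(x):
            W_in[j][k] += alpha * xk * dj
```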
Back-propagation derivation

The squared error on a single example is defined as

\[ E = \tfrac{1}{2} \sum_i (y_i - a_i)^2, \]

where the sum is over the nodes in the output layer.

\[ \frac{\partial E}{\partial W_{j,i}}
   = -(y_i - a_i)\,\frac{\partial a_i}{\partial W_{j,i}}
   = -(y_i - a_i)\,\frac{\partial g(\mathit{in}_i)}{\partial W_{j,i}} \]
\[ = -(y_i - a_i)\, g'(\mathit{in}_i)\,\frac{\partial \mathit{in}_i}{\partial W_{j,i}}
   = -(y_i - a_i)\, g'(\mathit{in}_i)\,\frac{\partial}{\partial W_{j,i}}
     \Big(\sum_j W_{j,i}\, a_j\Big) \]
\[ = -(y_i - a_i)\, g'(\mathit{in}_i)\, a_j
   = -a_j\, \Delta_i \]
Back-propagation derivation contd.

\[ \frac{\partial E}{\partial W_{k,j}}
   = -\sum_i (y_i - a_i)\,\frac{\partial a_i}{\partial W_{k,j}}
   = -\sum_i (y_i - a_i)\,\frac{\partial g(\mathit{in}_i)}{\partial W_{k,j}} \]
\[ = -\sum_i (y_i - a_i)\, g'(\mathit{in}_i)\,\frac{\partial \mathit{in}_i}{\partial W_{k,j}}
   = -\sum_i \Delta_i\,\frac{\partial}{\partial W_{k,j}}
     \Big(\sum_j W_{j,i}\, a_j\Big) \]
\[ = -\sum_i \Delta_i\, W_{j,i}\,\frac{\partial a_j}{\partial W_{k,j}}
   = -\sum_i \Delta_i\, W_{j,i}\,\frac{\partial g(\mathit{in}_j)}{\partial W_{k,j}} \]
\[ = -\sum_i \Delta_i\, W_{j,i}\, g'(\mathit{in}_j)\,
     \frac{\partial \mathit{in}_j}{\partial W_{k,j}}
   = -\sum_i \Delta_i\, W_{j,i}\, g'(\mathit{in}_j)\,
     \frac{\partial}{\partial W_{k,j}}\Big(\sum_k W_{k,j}\, a_k\Big) \]
\[ = -\sum_i \Delta_i\, W_{j,i}\, g'(\mathit{in}_j)\, a_k
   = -a_k\, \Delta_j \]
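The derivation is easy to sanity-check numerically: the analytic gradient −a_k Δ_j should match a finite-difference estimate of ∂E/∂W_{k,j}. A self-contained sketch (the network shapes, inputs, and tolerance are arbitrary choices):

```python
import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, W_in, W_out):
    a_h = [sigmoid(sum(w * xk for w, xk in zip(row, x))) for row in W_in]
    a_o = [sigmoid(sum(w * aj for w, aj in zip(row, a_h))) for row in W_out]
    return a_h, a_o

def error(x, y, W_in, W_out):
    _, a_o = forward(x, W_in, W_out)
    return 0.5 * sum((yi - ai) ** 2 for yi, ai in zip(y, a_o))

random.seed(0)
x, y = [0.3, -0.7], [1.0]
W_in = [[random.uniform(-1, 1) for _ in x] for _ in range(2)]
W_out = [[random.uniform(-1, 1) for _ in range(2)]]

# analytic gradient for W_in[0][0]: dE/dW_{k,j} = -a_k * Delta_j
a_h, a_o = forward(x, W_in, W_out)
d_out = [(yi - ai) * ai * (1 - ai) for yi, ai in zip(y, a_o)]
d_h0 = a_h[0] * (1 - a_h[0]) * sum(W_out[i][0] * d_out[i]
                                   for i in range(len(d_out)))
analytic = -x[0] * d_h0

# central finite-difference estimate of the same partial derivative
eps = 1e-6
W_in[0][0] += eps
e_plus = error(x, y, W_in, W_out)
W_in[0][0] -= 2 * eps
e_minus = error(x, y, W_in, W_out)
numeric = (e_plus - e_minus) / (2 * eps)

assert abs(analytic - numeric) < 1e-6
```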
Back-propagation learning contd.

At each epoch, sum gradient updates for all examples and apply

Training curve for 100 restaurant examples: finds exact fit
[Figure: training curve: total error on training set vs. number of epochs]
Typical problems: slow convergence, local minima
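The per-epoch (batch) scheme is a small variation on the per-example updates sketched earlier; a minimal outline (the gradient helper and the dict-of-weights representation are assumptions, not from the slides):

```python
def train_batch(examples, W, gradient, alpha=0.1, epochs=400):
    """Batch gradient descent: sum gradients over all examples each epoch,
    then apply one combined weight update.

    gradient(x, y, W) is assumed to return dE/dW as a dict keyed like W.
    """
    for _ in range(epochs):
        total = {key: 0.0 for key in W}
        for x, y in examples:
            g = gradient(x, y, W)        # per-example gradient
            for key in W:
                total[key] += g[key]     # summed over the training set
        for key in W:
            W[key] -= alpha * total[key] # one update per epoch
    return W
```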
Back-propagation learning contd.

Learning curve for MLP with 4 hidden units:
[Figure: proportion correct on test set vs. training set size on the RESTAURANT data, for decision tree and multilayer network]
MLPs are quite good for complex pattern recognition tasks,
but resulting hypotheses cannot be understood easily
Handwritten digit recognition
Summary

Most brains have lots of neurons; each neuron ≈ linear-threshold unit (?)

Perceptrons (one-layer networks) insufficiently expressive

Multi-layer networks are sufficiently expressive; can be trained by gradient
descent, i.e., error back-propagation

Many applications: speech, driving, handwriting, fraud detection, etc.

Engineering, cognitive modelling, and neural system modelling
subfields have largely diverged