Neural Networks



Background

- Neural Networks can be:
  - Biological models
  - Artificial models
- The desire is to produce artificial systems capable of sophisticated computations similar to the human brain.

Biological analogy and some main ideas


- The brain is composed of a mass of interconnected neurons.
- Each neuron is connected to many other neurons.
- Neurons transmit signals to each other.
- Whether a signal is transmitted is an all-or-nothing event (the electrical potential in the cell body of the neuron is thresholded).
- Whether a signal is sent depends on the strength of the bond (synapse) between two neurons.

How Does the Brain Work? (1)

NEURON

- The cell that performs information processing in the brain.
- The fundamental functional unit of all nervous system tissue.
- Each consists of: SOMA, DENDRITES, AXON, and SYNAPSE.

How Does the Brain Work? (2)

Brain vs. Digital Computers (1)

- Computers require hundreds of cycles to simulate the firing of a single neuron.
- The brain can fire all of its neurons in a single step: parallelism.
- Serial computers require billions of cycles to perform some tasks that the brain completes in less than a second, e.g. face recognition.


Definition of Neural Network

A Neural Network is a system composed of many simple processing elements operating in parallel which can acquire, store, and utilize experiential knowledge.

Artificial Neural Network?

Neurons vs. Units (1)


- Each element of a NN is a node, called a unit.
- Units are connected by links.
- Each link has a numeric weight.

Neurons vs. Units (2)

- A real neuron is far more complex than our simplified model, the unit; it involves chemistry, biochemistry, and quantum effects.

Computing Elements

A typical unit:

Planning in building a Neural Network

Decisions must be taken on the following:

- The number of units to use.
- The type of units required.
- The connections between the units.
- How the NN learns a task.

Issues to be discussed

- Initializing the weights.
- Use of a learning algorithm.
- A set of training examples.
- Encoding the examples as inputs.
- Converting outputs into meaningful results.

Neural Network Example

A very simple, two-layer, feed-forward network with two inputs, two hidden nodes, and one output node.
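A minimal sketch of this example network in Python with NumPy is shown below; the weight values are arbitrary placeholders (not taken from the slides), and biases are omitted for brevity. It also illustrates the linear/non-linear split described on the next two slides.

```python
import numpy as np

def sigmoid(x):
    # Non-linear activation: squashes a weighted sum into (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

# Arbitrary illustrative weights: 2 inputs -> 2 hidden nodes -> 1 output node.
W_hidden = np.array([[0.5, -0.3],   # weights into hidden node 1
                     [0.2,  0.8]])  # weights into hidden node 2
W_output = np.array([1.0, -0.6])    # weights into the output node

def forward(inputs):
    hidden_sum = W_hidden @ inputs         # linear: weighted sum of the inputs
    hidden_act = sigmoid(hidden_sum)       # non-linear: activation level
    return sigmoid(W_output @ hidden_act)  # same two steps for the output node

print(forward(np.array([1.0, 0.0])))       # about 0.57 for these weights
```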

Simple Computations in this network

- There are 2 types of components: linear and non-linear.
- Linear: the input function calculates the weighted sum of all inputs.
- Non-linear: the activation function transforms this sum into an activation level.

Calculations

- Input function (linear): in_i = Σ_j W_i,j * Input_j
- Activation function g (non-linear): Output_i = g(in_i)

A Computing Unit

Now in more detail, but for a particular model only: a unit.

Activation Functions
- Use different functions to obtain different models.
- The 3 most common choices are:
  1) Step function
  2) Sign function
  3) Sigmoid function
- An output of 1 represents the firing of a neuron down the axon.
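A short sketch of these three choices in Python; the behaviour exactly at x = 0 varies between texts, so the conventions below are just one common choice.

```python
import math

def step(x, threshold=0.0):
    # Step function: output 1 (the neuron fires) only when x reaches the threshold.
    return 1 if x >= threshold else 0

def sign(x):
    # Sign function: output +1 or -1 depending on the sign of x.
    return 1 if x >= 0 else -1

def sigmoid(x):
    # Sigmoid function: smooth, differentiable squashing of x into (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

for x in (-2.0, 0.0, 2.0):
    print(x, step(x), sign(x), round(sigmoid(x), 3))
```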

Step Function Perceptrons

3 Activation Functions

Standard structure of an artificial neural network

- Input units
  - represent the input as a fixed-length vector of numbers (user defined)
- Hidden units
  - calculate thresholded weighted sums of the inputs
  - represent intermediate calculations that the network learns
- Output units
  - represent the output as a fixed-length vector of numbers

Representations

- Logic rules: if color = red ^ shape = square then +
- Decision trees: a tree
- Nearest neighbor: training examples
- Probabilities: a table of probabilities
- Neural networks: inputs in [0, 1]

Neural networks can be used for all of these representations; many variants exist.

Notation

Notation (cont.)

Operation of individual units

Output_i = f(W_i,j * Input_j + W_i,k * Input_k + W_i,l * Input_l)

where f(x) is a threshold (activation) function, for example:

- the sigmoid, f(x) = 1 / (1 + e^(-x))
- a step function
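The same formula as a small runnable sketch; the weight and input values are made-up placeholders.

```python
import math

def f(x):
    # Sigmoid threshold (activation) function.
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical weights from units j, k, l into unit i, and their input values.
W_i = {"j": 0.4, "k": -0.2, "l": 0.7}
Input = {"j": 1.0, "k": 0.5, "l": 0.0}

# Output_i = f(W_i,j * Input_j + W_i,k * Input_k + W_i,l * Input_l)
Output_i = f(sum(W_i[u] * Input[u] for u in W_i))
print(Output_i)   # about 0.574
```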

Artificial Neural Networks

Perceptron Learning Theorem


Recap: a perceptron (threshold unit) can learn anything that it can represent (i.e. anything separable with a hyperplane).

The Exclusive OR problem

A perceptron cannot represent Exclusive OR, since XOR is not linearly separable.
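An illustrative (not rigorous) check in Python: a brute-force scan over a grid of weights and thresholds finds a single step-function unit that reproduces AND, but none that reproduces XOR.

```python
import itertools

def perceptron(w1, w2, threshold):
    # A single step-function unit over two binary inputs.
    return lambda x1, x2: 1 if w1 * x1 + w2 * x2 >= threshold else 0

def representable(target):
    # Scan a coarse grid of weights and thresholds; illustrative only, not a proof.
    grid = [i / 2.0 for i in range(-8, 9)]            # -4.0 .. 4.0 in steps of 0.5
    cases = [(0, 0), (0, 1), (1, 0), (1, 1)]
    for w1, w2, t in itertools.product(grid, repeat=3):
        unit = perceptron(w1, w2, t)
        if all(unit(x1, x2) == target(x1, x2) for x1, x2 in cases):
            return True
    return False

AND = lambda a, b: a & b
XOR = lambda a, b: a ^ b

print("AND representable:", representable(AND))   # True
print("XOR representable:", representable(XOR))   # False (no linear separator)
```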


Properties of architecture

- No connections within a layer
- No direct connections between input and output layers
- Fully connected between layers
- Often more than 3 layers
- The number of output units need not equal the number of input units
- The number of hidden units per layer can be more or less than the number of input or output units
- Each unit is a perceptron
- Bias is often included as an extra weight
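A sketch of how such an architecture might be laid out as weight matrices, with the bias treated as an extra weight on a constant input of 1; the layer sizes here are arbitrary examples.

```python
import numpy as np

# Arbitrary layer sizes: 3 input units, hidden layers of 5 and 4 units, 2 output units.
layer_sizes = [3, 5, 4, 2]

rng = np.random.default_rng(0)
# One weight matrix per pair of adjacent layers (fully connected between layers,
# no connections within a layer). The "+ 1" column is the bias weight.
weights = [rng.normal(size=(n_out, n_in + 1))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

for W in weights:
    print(W.shape)   # (5, 4), (4, 6), (2, 5)
```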


Conceptually: Forward Activity - Backward Error


Backpropagation learning algorithm ‘BP’

- Solution to the credit assignment problem in the MLP. Rumelhart, Hinton and Williams (1986) (though actually invented earlier in a PhD thesis relating to economics).
- BP has two phases:
  - Forward pass phase: computes the ‘functional signal’, feed-forward propagation of input pattern signals through the network.
  - Backward pass phase: computes the ‘error signal’, propagates the error backwards through the network starting at the output units (where the error is the difference between the actual and desired output values).

Forward Propagation of Activity

- Step 1: Initialize the weights at random and choose a learning rate η.
- Until the network is trained, for each training example (i.e. input pattern and target output(s)):
  - Step 2: Do a forward pass through the net (with fixed weights) to produce the output(s), i.e., in the forward direction, layer by layer (as sketched in the code below):
    - Inputs applied
    - Multiplied by weights
    - Summed
    - ‘Squashed’ by the sigmoid activation function
    - Output passed to each neuron in the next layer
  - Repeat the above until the network output(s) are produced.
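A minimal sketch of Steps 1 and 2 in Python with NumPy, assuming sigmoid units and the bias folded in as an extra weight; the 2-2-1 layer shape is just an example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward_pass(weights, pattern):
    # Step 2: propagate one input pattern through the net, layer by layer.
    activation = np.asarray(pattern, dtype=float)
    for W in weights:                            # in the forward direction
        extended = np.append(activation, 1.0)    # inputs applied (plus bias input)
        summed = W @ extended                    # multiplied by weights and summed
        activation = sigmoid(summed)             # 'squashed' by the sigmoid
    return activation                            # network output(s)

# Step 1: initialise the weights at random (the learning rate is used in Step 3).
eta = 0.5
rng = np.random.default_rng(1)
weights = [rng.normal(size=(2, 3)),   # 2 hidden units, each with 2 inputs + bias
           rng.normal(size=(1, 3))]   # 1 output unit, with 2 hidden inputs + bias
print(forward_pass(weights, [0.0, 1.0]))
```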


Step 3: Back-propagation of error

‘Back-prop’ algorithm summary (with Maths!)

‘Back-prop’ algorithm summary (with NO Maths!)

MLP/BP: A worked example


Worked example: Forward Pass


Worked example: Forward Pass


Worked example: Backward Pass


Worked example: Update Weights using the Generalized Delta Rule (BP)
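The worked-example numbers on these slides are in the figures and are not reproduced here, so the following is a generic sketch of the generalised delta rule for a 2-2-1 sigmoid network with squared error; the learning rate eta = 0.5 and the random starting weights are placeholders, not values from the slides.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bp_update(W_hidden, W_output, x, target, eta=0.5):
    # Forward pass (bias handled as an extra weight on a constant input of 1).
    x_ext = np.append(x, 1.0)
    hidden = sigmoid(W_hidden @ x_ext)
    hidden_ext = np.append(hidden, 1.0)
    output = sigmoid(W_output @ hidden_ext)

    # Backward pass: error signals (deltas) for sigmoid units with squared error.
    delta_out = (target - output) * output * (1.0 - output)
    delta_hidden = hidden * (1.0 - hidden) * (W_output[:, :-1].T @ delta_out)

    # Generalised delta rule: delta_w = eta * delta * (activation feeding that weight).
    W_output += eta * np.outer(delta_out, hidden_ext)
    W_hidden += eta * np.outer(delta_hidden, x_ext)
    return output

rng = np.random.default_rng(2)
W_hidden, W_output = rng.normal(size=(2, 3)), rng.normal(size=(1, 3))
print(bp_update(W_hidden, W_output, np.array([1.0, 0.0]), np.array([1.0])))
```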


Similarly for all the weights w_ij:


Verification that it works


Training

- This was a single iteration of back-prop.
- Training requires many iterations with many training examples, or epochs (one epoch is an entire presentation of the complete training set).
- It can be slow!
- Note that computation in the MLP is local (with respect to each neuron).
- A parallel implementation is also possible.

Training and testing data

- How many examples? The more the merrier!
- Use disjoint training and testing data sets:
  - learn from the training data, but evaluate performance (generalization ability) on unseen test data.
- Aim: minimize the error on the test data.
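A minimal sketch of the disjoint-split idea in Python; the 80/20 ratio is an arbitrary common choice, not something specified in the slides.

```python
import random

def train_test_split(examples, test_fraction=0.2, seed=0):
    # Shuffle once, then keep the training and test sets disjoint.
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]   # (train, test)

examples = [([i], i % 2) for i in range(10)]      # toy (input, target) pairs
train_set, test_set = train_test_split(examples)
print(len(train_set), len(test_set))              # 8 2
```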


More resources

- Binary Logic Unit in an example
  http://www.cs.usyd.edu.au/~irena/ai01/nn/5.html
- MultiLayer Perceptron Learning Algorithm
  http://www.cs.usyd.edu.au/~irena/ai01/nn/8.html