Modeling with neural networks
Garrison W. Cottrell
Gary's Unbelievable Research Unit (GURU)
Computer Science and Engineering Department
Temporal Dynamics of Learning Center
Institute for Neural Computation
UCSD
Ways to understand how the brain works
• Behavioral measures
  • Choices
  • Reaction times
  • Eye movements
• Brain imaging
  • PET
  • fMRI
  • MEG
  • EEG
  • NIRS
  • DTI
• Neural recording
  • Single cell recording
  • Multicell recording
  • Optical imaging
  • Voltage-sensitive dyes
  • Optogenetics
  • ECoG
• Modeling
  • Neural networks
  • Bayesian models
  • Abstract mathematical models
Why model?
• Models rush in where theories fear to tread.
• Models can be manipulated in ways people cannot.
• Models can be analyzed in ways people cannot.
Models rush in where theories fear to tread
• Theories are high-level descriptions of the processes underlying behavior.
• They are often not explicit about the processes involved.
• They are difficult to reason about if no mechanisms are explicit; they may be too high-level to make explicit predictions.
• Theory formation itself is difficult.
Models rush in where theories fear to tread
• Using machine learning techniques, one can often build a working model of a task for which we have no theories or algorithms (e.g., expression recognition).
• A working model provides an “intuition pump” for how things might work, especially if they are “neurally plausible” (e.g., the development of face processing; Dailey and Cottrell).
• A working model may make unexpected predictions (e.g., the Interactive Activation Model and SLNT).
Your first neural net:
The Interactive Activation Model, a model of reading from print
• Word level
• Letter level
• Feature level
Operation of the model
[Two figure slides showing the model in operation]

Example of data accounted for: the pseudoword effect
[Two figure slides of pseudoword-effect data]
Example of data predicted
• What about non-pronounceable non-words like SLNT?
• SLNT has a lot of friends at the word level.
• The model predicts that there should be a superiority effect for SLNT.
• They tested this in UCSD Psychology sophomores and got the predicted effect.
Summary
• Why model?
• Models make assumptions explicit.
• Models (because they are run on a computer and can be highly non-linear) can make unexpected predictions.
• While no model is “correct”, the more data a model predicts, the more we “believe” that model…
Models can be manipulated in ways people cannot
• We can see the effects of variations in cortical architecture (e.g., split (hemispheric) vs. non-split models (Shillcock and Monaghan word perception model)).
• We can see the effects of variations in processing resources (e.g., variations in the number of hidden units in the Plaut et al. models).
Models can be manipulated in ways people cannot
• We can see the effects of variations in environment (e.g., what if our parents were cans, cups, or books instead of humans? I.e., is there something special about face expertise versus visual expertise in general? (Sugimoto and Cottrell; Joyce and Cottrell)).
• We can see variations in behavior due to different kinds of brain damage within a single “brain” (e.g., Juola and Plunkett; Hinton and Shallice).
Models can be analyzed in ways people cannot
In the following, I specifically refer to neural network models.
• We can do single-unit recordings.
• We can selectively ablate and restore parts of the network, even down to the single-unit level, to assess the contribution to processing.
• We can measure the individual connections, e.g., the receptive and projective fields of a unit.
• We can measure responses at different layers of processing (e.g., which level accounts for a particular judgment: perceptual, object, or categorization? (Dailey et al., J. Cog. Neuro., 2002)).
How (I like) to build Cognitive Models
• In a domain where there is a lot of data and controversy!
• I like to be able to relate them to the brain, so “neurally plausible” models are preferred: neural nets.
• The model should be a working model of the actual task, rather than a cartoon version of it.
• Of course, the model should nevertheless be simplifying (i.e., it should be constrained to the essential features of the problem at hand):
  • Do we really need to model the (supposed) translation invariance and size invariance of biological perception?
  • As far as I can tell, NO!
• Then, take the model “as is” and fit the experimental data: no fitting parameters is to be preferred over 1, 2, or 3.
The other way (I like) to build Cognitive Models
• In domains where there is little data and much mystery.
• Use them as exploratory models in domains where there is little direct data (e.g., no single-cell recordings in infants or undergraduates) to suggest what we might find if we could get the data. These can then serve as “intuition pumps.”
• Examples:
  • Why we might get specialized face processors
  • Why those face processors get recruited for other tasks
A few giants
Frank Rosenblatt invented the perceptron:
• One of the first neural networks to learn by supervised training
• Still in use today!
A few giants
Dave E. Rumelhart, with Geoff Hinton and Ron Williams, invented back-propagation.
• Many had invented back-propagation before; few could appreciate as deeply as Dave did what they had when they discovered it.
A few giants
• Hal White was a theoretician of neural networks.
• Hal White’s paper with Max Stinchcombe, “Multilayer feedforward networks are universal approximators,” is his second most-cited paper, at 8,114 cites.
A few giants
• In yet another paper (in Neural Computation, 1989), he wrote:
“The premise of this article is that learning procedures used to train artificial neural networks are inherently statistical techniques. It follows that statistical theory can provide considerable insight into the properties, advantages, and disadvantages of different network learning methods…”
This was one of the first papers to make the connection between neural networks and statistical models, and thereby put them on a sound statistical foundation.
What is backpropagation, and why is/was it important?
• We have billions and billions of neurons that somehow work together to create the mind.
• These neurons are connected by 10^14–10^15 synapses, which we think encode the “knowledge” in the network; too many for us to explicitly program them in our models.
• Rather, we need some way to indirectly set them via a procedure that will achieve some goal by changing the synaptic strengths (which we call weights).
• This is called learning in these systems.
Learning: A bit of history
• Frank Rosenblatt studied a simple version of a neural net called a perceptron:
  • A single layer of processing
  • Binary output
  • Can compute simple things like (some) boolean functions (OR, AND, etc.)
Learning: A bit of history
[Figure slides: perceptron diagram with net input and output]
Learning: A bit of history
• Rosenblatt (1962) discovered a learning rule for perceptrons called the perceptron convergence procedure.
• Guaranteed to learn anything computable (by a two-layer perceptron).
• Unfortunately, not everything was computable (Minsky & Papert, 1969).
Perceptron Learning Demonstration
• Output activation rule:
  • First, compute the net input to the output unit:
    net = Σᵢ wᵢxᵢ
  • Then, compute the output as:
    If net > θ then output = 1, else output = 0
[Figure: perceptron diagram with net input and output]
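The activation rule above can be sketched in a few lines of Python; the function and variable names here are my own, not from the slides.

```python
# Sketch of the perceptron output rule: net = sum_i w_i * x_i, and the
# unit outputs 1 when net exceeds the threshold theta, else 0.

def perceptron_output(weights, inputs, theta):
    net = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net > theta else 0

# Example: weights of 1.0 each with theta = 1.5 implement logical AND.
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron_output([1.0, 1.0], list(x), 1.5))
```

Only the (1, 1) input drives the net input (2.0) above the threshold, so the unit behaves as an AND gate.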
Perceptron Learning Demonstration
• Output activation rule:
  • First, compute the net input to the output unit:
    net = Σᵢ wᵢxᵢ
  • If net > θ then output = 1, else output = 0
• Learning rule:
  • If output is 1 and should be 0, then lower weights to active inputs and raise the threshold (θ).
  • If output is 0 and should be 1, then raise weights to active inputs and lower the threshold (θ).
  • (“Active input” means xᵢ = 1, not 0.)
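A minimal sketch of one step of this learning rule; the names and the step size are my own choices, not from the slides.

```python
# One step of the perceptron learning rule. Only weights to active
# inputs (x_i = 1) change, and the threshold moves in the opposite
# direction from the weights.

def perceptron_learn_step(weights, theta, inputs, teacher, step=0.1):
    net = sum(w * x for w, x in zip(weights, inputs))
    output = 1 if net > theta else 0
    if output == 1 and teacher == 0:      # fired, but should not have
        weights = [w - step * x for w, x in zip(weights, inputs)]
        theta += step
    elif output == 0 and teacher == 1:    # should have fired, but did not
        weights = [w + step * x for w, x in zip(weights, inputs)]
        theta -= step
    return weights, theta

# One step on a pattern the unit wrongly ignores:
w, t = perceptron_learn_step([0.0, 0.0], 0.0, [1, 1], teacher=1)
print(w, t)  # weights to the active inputs rise; the threshold falls
```

When the output already matches the teacher, nothing changes: this is the error-correction character of the rule.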
Characteristics of perceptron learning
• Supervised learning: gave it a set of input-output examples for it to model the function (a teaching signal).
• Error-correction learning: only correct it when it is wrong.
• Random presentation of patterns.
• Slow! Learning on some patterns ruins learning on others.
Perceptron Learning Made Simple
• Output activation rule:
  • First, compute the net input to the output unit:
    net = Σᵢ wᵢxᵢ
  • If net > θ then output = 1, else output = 0
• Learning rule:
  • If output is 1 and should be 0, then lower weights to active inputs and raise the threshold (θ).
  • If output is 0 and should be 1, then raise weights to active inputs and lower the threshold (θ).
Perceptron Learning Made Simple
• Learning rule:
  • If output is 1 and should be 0, then lower weights to active inputs and raise the threshold (θ).
  • If output is 0 and should be 1, then raise weights to active inputs and lower the threshold (θ).
• Learning rule:
    wᵢ(t+1) = wᵢ(t) + α·(teacher − output)·xᵢ
    (α is the learning rate)
Perceptron Learning Made Simple
• Learning rule:
  • If output is 1 and should be 0, then lower weights to active inputs and raise the threshold (θ).
  • If output is 0 and should be 1, then raise weights to active inputs and lower the threshold (θ).
• Learning rule:
    wᵢ(t+1) = wᵢ(t) + α·(teacher − output)·xᵢ
    (α is the learning rate)
• This is known as the delta rule because learning is based on the delta (difference) between what you did and what you should have done:
    δ = (teacher − output)
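The delta rule as written maps directly to code. This sketch uses my own names, with α spelled out as alpha:

```python
# Delta rule: the weight change is the learning rate alpha times
# delta = (teacher - output), times the input.

def delta_rule_update(weights, inputs, teacher, output, alpha=0.1):
    delta = teacher - output          # the "delta" the rule is named for
    return [w + alpha * delta * x for w, x in zip(weights, inputs)]

# The unit output 0 when it should have output 1, so weights to the
# active input are raised; the inactive input's weight is untouched.
print(delta_rule_update([0.5, 0.5], [1, 0], teacher=1, output=0))
```

Note that when teacher equals output, delta is 0 and no weight changes: the same error-correction behavior as the verbal rule above.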
Problems with perceptrons
• The learning rule comes with a great guarantee: anything a perceptron can compute, it can learn to compute.
• Problem: lots of things were not computable, e.g., XOR (Minsky & Papert, 1969).
• Minsky & Papert said:
  • If you had hidden units, you could compute any boolean function.
  • But no learning rule exists for such multilayer networks, and we don’t think one will ever be discovered.
Problems with perceptrons
[Figure slide]
Aside about perceptrons
• They didn’t have hidden units, but Rosenblatt assumed nonlinear preprocessing!
• Hidden units compute features of the input.
• The nonlinear preprocessing is a way to choose features by hand.
• Support Vector Machines essentially do this in a principled way, followed by a (highly sophisticated) perceptron learning algorithm.
Enter Rumelhart, Hinton, & Williams (1985)
• Discovered a learning rule for networks with hidden units.
• Works a lot like the perceptron algorithm:
  • Randomly choose an input-output pattern.
  • Present the input, and let activation propagate through the network.
  • Give the teaching signal.
  • Propagate the error back through the network (hence the name back propagation).
  • Change the connection strengths according to the error.
Enter Rumelhart, Hinton, & Williams (1985)
• The actual algorithm uses the chain rule of calculus to go downhill in an error measure with respect to the weights.
• The hidden units must learn features that solve the problem.
[Figure: a layered network; activation flows from INPUTS through Hidden Units to OUTPUTS, and error propagates back down]
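The steps above can be sketched as a minimal back-propagation loop. This is not the original implementation; the layer sizes, learning rate, random seed, and NumPy formulation are my own illustrative choices.

```python
# Minimal backprop on XOR: a 2-4-1 sigmoid network trained by gradient
# descent. The chain rule carries the output error back to the hidden layer.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros(4)   # input  -> hidden
W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros(1)   # hidden -> output

losses = []
for _ in range(5000):
    h = sigmoid(X @ W1 + b1)                 # forward: activation flows up
    out = sigmoid(h @ W2 + b2)
    losses.append(float(np.mean((out - y) ** 2)))
    d_out = (out - y) * out * (1 - out)      # error at the output layer
    d_h = (d_out @ W2.T) * h * (1 - h)       # chain rule back to hidden layer
    W2 -= h.T @ d_out; b2 -= d_out.sum(0)    # downhill step (learning rate 1)
    W1 -= X.T @ d_h;   b1 -= d_h.sum(0)

print("outputs:", out.ravel().round(2), "final loss:", round(losses[-1], 4))
```

The targets are [0, 1, 1, 0]; with these settings the loss falls as training proceeds, and the hidden units end up computing features that make XOR linearly separable at the output.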
XOR
• Here, the hidden units learned AND and OR: two features that, when combined appropriately, can solve the problem.
[Figure: back-propagation learning turns a random network into an XOR network whose hidden units compute AND and OR]
XOR
But, depending on initial conditions, there are an infinite number of ways to do XOR; backprop can surprise you with innovative solutions.
[Figure: back-propagation learning, from a random network to an XOR network with AND and OR hidden units]
Why is/was this wonderful?
• Efficiency
• Learns internal representations
• Learns internal representations
• Learns internal representations
• Generalizes to recurrent networks
Hinton’s Family Trees example
• Idea: Learn to represent relationships between people that are encoded in a family tree:
[Figure: family trees]
Hinton’s Family Trees example
• Idea 2: Learn distributed representations of concepts.
[Figure of the network: inputs are localist people and localist relations; hidden layers learn features of these entities useful for solving the task; outputs are localist]
Localist: one unit “ON” to represent each item.
People hidden units: Hinton diagram
[Figure: Hinton diagram of the people hidden units]
• What is unit 1 encoding?
People hidden units: Hinton diagram
[Figure: Hinton diagram]
• What is unit 2 encoding?
People hidden units: Hinton diagram
[Figure: Hinton diagram]
• What is unit 6 encoding?
People hidden units: Hinton diagram
When all three are on, these units pick out Christopher and Penelope; other combinations pick out other parts of the trees.
Relation units
What does the lower middle one code?
Lessons
• The network learns features in the service of the task; i.e., it learns features on its own.
• This is useful if we don’t know what the features ought to be.
• Can explain some human phenomena.
Thanks to funders, GURONS, and you!
Questions?