NEURAL NETWORK MODELS OF CONDITIONING AND ACTION










NEURAL NETWORK MODELS OF CONDITIONING AND ACTION

THE TWELFTH SYMPOSIUM ON MODELS OF BEHAVIOR AT HARVARD UNIVERSITY







Friday and Saturday, June 2 and 3, 1989


William James Hall, Harvard University


33 Kirkland Street


Cambridge, Massachusetts 02138








Welcome to the Society's Twelfth Annual Symposium. The suggested donation to the Society for Quantitative Analysis of
Behavior is $45 per person. Students with ID receive a $15 discount.










Program Committee:




Michael L. Commons


Stephen Grossberg


John E. R. Staddon












We thank the Dare Association, Inc., the Center for Adaptive Systems, the Department of Mathematics, Boston University, and the Department of Psychology, Harvard University, for their support. Typesetting was done at PC Genius, Woburn, MA, with a Newton-286 PC and a NEC laser printer.

© 1989 Society for Quantitative Analysis of Behavior







NEURAL NETWORK MODELS OF CONDITIONING AND ACTION





THURSDAY, JUNE 1, 1989 - Reception


FRIDAY, JUNE 2, 1989


8:30 AM


1. Daniel L. Alkon, Thomas P. Vogl, Kim T. Blackwell, & David Tam, Pattern Recognition and Storage by an Artificial Network Derived from Biological Systems.

Laboratory of Biophysics, Marine Biological Laboratory, Woods Hole, MA 02543

The DYSTAL (Dynamically Stable Associative Learning) model is an artificial modifiable neural network based on observed features of biological neural systems in the mollusk Hermissenda and the rabbit hippocampus. In the DYSTAL network, synaptic weight modification depends on (1) convergence of modifiable "collateral" and unmodifiable "flow-through" inputs, (2) temporal pairing of these inputs, and (3) past activity of elements receiving the inputs. Modification is independent of element output. As a consequence, DYSTAL shows (1) linear scaling of computational complexity with network size, (2) exceptionally rapid learning without an external "teacher," and (3) the ability to independently associate different ensembles of inputs, serving both as an associator and a classifier of input patterns.
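The three conditions above can be sketched as a single update rule. The following is a hypothetical reconstruction from the abstract alone, not the published DYSTAL implementation; the learning rate and the form of the decay term are invented for illustration.

```python
def dystal_step(w, collateral, flow_through, prior_activity, lr=0.5):
    """Hypothetical DYSTAL-style update for one synaptic weight.

    Change requires (1) convergence of a modifiable collateral input
    and an unmodifiable flow-through input, (2) their temporal pairing,
    and (3) recent activity of the receiving element.  The element's
    own output never enters the rule.
    """
    paired = collateral * flow_through           # conditions (1) and (2)
    gate = 1.0 if prior_activity > 0 else 0.0    # condition (3)
    return w + lr * gate * (paired - w * collateral)

w = 0.0
for _ in range(10):  # repeated pairing at a recently active element
    w = dystal_step(w, collateral=1.0, flow_through=1.0, prior_activity=1.0)
print(round(w, 3))   # -> 0.999: the weight saturates toward 1
```

Because the rule never consults the element's output, learning needs no external teacher, which is the source of the speed claim in the abstract.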


9:30 AM


2. John H. Byrne, Analysis and Simulation of Cellular and Network Properties Contributing to Learning and Memory in Aplysia.

Department of Neurobiology and Anatomy, The University of Texas, 7046 Medical School Main Building, P.O. Box 20708, Houston, TX 77225

A goal common to both neurobiologists and adaptive network theorists is to understand events occurring within single elements and networks that contribute to learning and memory. Our approach is to analyze empirically the properties of neurons and neural circuits that mediate simple forms of learning in Aplysia and to develop mathematical models of these neuronal elements and networks. A single neuron-like adaptive element, which reflects the subcellular properties of sensory neurons, simulates many features of nonassociative learning and classical conditioning. Moreover, relatively simple neural networks exhibit some higher-order features of classical conditioning. These results help to provide insights into the relative contributions of subcellular and network properties to simple forms of learning.


10:30 AM - Break

10:45 AM


3. William B. Levy, Synaptic Modification Rules in Hippocampal Learning.

Department of Neurological Surgery, University of Virginia, School of Medicine, Box 420, Charlottesville, VA 22908

Neural networks are presented which simulate some behavioral effects of frontal lobe damage. On some cognitive tasks, frontal lesions cause perseveration of formerly rewarding choices of action. On other tasks, frontal lesions cause approach to objects just because they are novel. Both effects can be explained by weakening of signals between sensory and reinforcement loci. The networks that reproduce these data incorporate neural design principles developed for other purposes by Grossberg and his co-workers. These design principles include adaptive resonance between two layers of sensory processing; attentional gating of synapses between layers; and competition among gated dipoles containing on-units and off-units.


11:45 AM - Group Photograph of all Presenters and Co-presenters

12:00 PM - Luncheon





FRIDAY, JUNE 2


1:00 PM


4. Gail A. Carpenter, Recognition Learning by a Hierarchical ART Network Modulated by Reinforcement Feedback.

Center for Adaptive Systems, Boston University, 111 Cummington Street, Boston, MA 02215

Adaptive resonance architectures are neural networks that self-organize stable pattern recognition codes in real time in response to arbitrary sequences of input patterns. Top-down learned expectations and matching mechanisms are critical for code stability. A parallel search scheme realizes a form of hypothesis discovery, testing, learning, and recognition. A matching criterion determines whether an exemplar will be accepted in a given category. Reinforcement feedback can modulate the search process by altering this matching criterion. A depletable chemical transmitter model within the bottom-up and top-down adaptive filters implements a robust, flexible search process and attentional shifts due to reinforcement feedback.
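The matching criterion at the heart of this search can be shown in a few lines. The sketch below uses binary ART-1-style matching with a plain vigilance parameter `rho`; treating reinforcement feedback as simply raising or lowering `rho` is a stand-in for the transmitter-gated mechanism the abstract describes.

```python
import numpy as np

def vigilance_accepts(input_vec, template, rho):
    """ART-style matching criterion: accept a category only if the
    fraction of the input matched by the top-down template (an
    elementwise-min overlap) meets the vigilance level rho."""
    match = np.minimum(input_vec, template).sum() / input_vec.sum()
    return match >= rho

I = np.array([1, 1, 0, 1])
template = np.array([1, 1, 0, 0])
print(vigilance_accepts(I, template, rho=0.5))   # True: 2/3 of I is matched
print(vigilance_accepts(I, template, rho=0.9))   # False: raised vigilance forces search
```

Raising vigilance after negative reinforcement makes the same exemplar fail the match and drives the search toward a finer category, which is the modulation the talk proposes.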


2:00 PM



5. Stephen Grossberg, Neural Dynamics of Reinforcement Learning, Selective Attention, and Adaptive Timing.

Center for Adaptive Systems, Program in Cognitive and Neural Systems, Department of Mathematics, Boston University, 111 Cummington Street, Boston, MA 02215

A real-time neural network model is described in which reinforcement helps to focus attention upon and organize learning of those environmental events and contingencies that have predicted behavioral success in the past. Computer simulations of the model reproduce properties of attentional blocking, inverted-U in learning as a function of interstimulus interval, primary and secondary excitatory and inhibitory conditioning, anticipatory conditioned responses, attentional focusing by conditioned motivational feedback, and limited-capacity short-term memory processing. An explanation is offered of how forgetting is actively controlled, and, in particular, why conditioned responses extinguish when a conditioned excitor is presented alone but do not extinguish when a conditioned inhibitor is presented alone. These explanations invoke associative learning between sensory representations and drive, or emotional, representations (in the form of conditioned reinforcer and incentive motivational learning), between sensory representations and learned expectations of future sensory events, and between sensory representations and learned motor commands.


3:00 PM - Break

3:15 PM




6. Daniel S. Levine, Simulations of Conditioned Perseveration and Novelty Preference from Frontal Lobe Damage.

Department of Mathematics, University of Texas at Arlington, Arlington, TX 76019

Work with Stephen Grossberg has shown that blocking and interstimulus interval effects can be simulated by a neural network that combines sensory and drive representations in a positive feedback loop. In addition, the network includes competition between the sensory representations for limited-capacity attentional resources. In current work, this circuit and the Grossberg-Schmajuk READ circuit are extended to model unblocking, extinction, and conditioned inhibition. The expanded circuit includes both sensory and motivational dipoles, and mismatch-activated modulation of short-term memory decay.







4:15 PM


7. Nestor A. Schmajuk, Neural Dynamics of Hippocampal Modulation of Classical Conditioning.

Department of Psychology, Northwestern University, 102 Swift Hall, Evanston, IL 60208

This paper describes hippocampal participation in classical conditioning in terms of Grossberg's (1975) attentional theory. It proposes that the hippocampus controls the competition among sensory representations for a limited-capacity short-term memory, and that it stores incentive motivation associations. Based on this hypothesis, the model predicts that hippocampal lesions impair phenomena that depend on the competition for short-term memory, such as blocking and overshadowing. The model also predicts that hippocampal long-term potentiation facilitates the acquisition of classical conditioning by increasing the stored values of incentive motivation associations. Finally, the model describes hippocampal neural activity as proportional to the strength of the associations between conditioned and unconditioned stimuli.




SATURDAY, JUNE 3


8:30 AM


8. John W. Moore, Implementing Connectionist Algorithms for Classical Conditioning in the Brain.

Department of Psychology, University of Massachusetts at Amherst, Amherst, MA 01003

Simple connectionist models of learning that conform to the Widrow-Hoff rule can be parameterized and extended to describe real-time features of classical conditioning. These features include the dependence of learning on the moment-to-moment status of input to the computational system and on the desired topography of its output. Using the classically conditioned nictitating membrane response (NMR) of the rabbit as a prototypal system, my coworkers and I have devised models which successfully meet these real-time criteria. Two models and neural network architectures are described. The first consists of a single neuron-like processor with learning rules based on the Sutton-Barto model. The second consists of two neuron-like units with input based on a tapped-delay-line representation of stimuli. Using anatomical and physiological data, both network models can be aligned with brain stem and cerebellar circuits involved in classical NMR conditioning. These models and their implementation in the brain have testable empirical consequences.
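The Widrow-Hoff (LMS) rule that these models start from fits in a few lines; the real-time Sutton-Barto extension adds eligibility traces and moment-to-moment input changes, which this sketch omits. The learning rate and the two-cue compound example are illustrative choices, not the authors' parameters.

```python
def widrow_hoff(weights, x, target, lr=0.1):
    """One LMS step: adjust each weight along its input in proportion
    to the prediction error (target minus current summed output)."""
    y = sum(w * xi for w, xi in zip(weights, x))
    error = target - y
    return [w + lr * error * xi for w, xi in zip(weights, x)]

w = [0.0, 0.0]
for _ in range(100):                  # repeated pairings of a two-cue compound
    w = widrow_hoff(w, x=[1.0, 1.0], target=1.0)
print([round(v, 2) for v in w])       # -> [0.5, 0.5]: the cues share the prediction
```

Because the cues share one error signal, the total prediction converges to the target while each cue carries only part of it, the property that lets this family of rules reproduce blocking-type phenomena.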


9:30 AM


9. Russell M. Church, A Connectionist Model of Scalar Timing Theory.

Department of Psychology, Brown University, Providence, RI 02912

An information-processing version of Scalar Timing Theory has been useful for explaining the ability of animals to estimate the duration of an interval. Unfortunately, this version contains some cognitive activities that are difficult to duplicate with known biological mechanisms. A new connectionist version of Scalar Timing Theory is now under development that requires only the standard assumptions about the operations of neural networks. It deals with many of the facts of duration discrimination and also with the closely related cognitive capacities of rate and number discrimination.
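The information-processing version referred to above is usually described as a pacemaker-accumulator with multiplicative memory noise. The sketch below is that standard account, with invented rate and noise parameters; it is not the connectionist version under development.

```python
import random

def estimate_interval(duration, rate=20.0, memory_noise=0.15):
    """Pacemaker-accumulator sketch of Scalar Timing Theory: pulses
    accumulate during the interval, and the remembered count carries
    multiplicative noise, so the spread of estimates grows in
    proportion to the interval (the scalar property)."""
    pulses = sum(1 for _ in range(int(duration * rate * 10))
                 if random.random() < 0.1)        # roughly Poisson pulse train
    remembered = pulses * random.gauss(1.0, memory_noise)
    return remembered / rate                      # decode back to seconds

random.seed(1)
estimates = [estimate_interval(4.0) for _ in range(200)]
mean = sum(estimates) / len(estimates)
print(3.5 < mean < 4.5)   # estimates center on the true 4-second interval
```

The multiplicative memory term is what the cognitive version buys: it yields timing errors proportional to the interval, which a connectionist replacement must also reproduce.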


10:30 AM - Break

10:45 AM


10. William S. Maki & Adel M. Abunawass, A Connectionist Approach to Conditional Discrimination: Learning, Short-Term Memory, and Attention.

Departments of Psychology and Computer Science, North Dakota State University, Fargo, ND 58105-5075

Our investigations occupy a niche situated between computational models of the neurobiology of conditioning and parallel distributed processing models of human learning and cognition. We have been studying an adaptive network model of a complex discrimination (matching-to-sample, or MTS). MTS has been used for years to study a variety of cognitive processes in animal behavior, and the results of that research should provide a valuable source of constraints on connectionist models. MTS reduces to the exclusive-or problem (XOR) and, like the XOR, is learned by a multilayer network using an error back-propagation algorithm (the "generalized delta rule"). The network model exhibits some interesting effects that are shown by real organisms; for example, the network's performance of delayed MTS improves with delay training ("rehearsal"), and its matching accuracy is impaired when compound samples are presented ("shared attention"). Details of these and related simulations provide the grounds for a critical evaluation of the strengths and weaknesses of the model.
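Since MTS reduces to XOR, the core of such a model can be illustrated by training a small multilayer network on XOR with the generalized delta rule. This is a generic back-propagation sketch (layer sizes, learning rate, and seed are arbitrary), not the authors' network.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)   # sample/comparison pairs
T = np.array([[0], [1], [1], [0]], float)               # XOR targets

W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros(4)         # a 2-4-1 sigmoid network
W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros(1)
sig = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(10000):                 # generalized delta rule, batch updates
    h = sig(X @ W1 + b1)               # hidden layer
    y = sig(h @ W2 + b2)               # output layer
    d2 = (y - T) * y * (1 - y)         # output error signal
    d1 = (d2 @ W2.T) * h * (1 - h)     # error back-propagated to hidden units
    W2 -= h.T @ d2; b2 -= d2.sum(0)
    W1 -= X.T @ d1; b1 -= d1.sum(0)

print((y > 0.5).astype(int).ravel())
```

The hidden layer is essential: XOR (and hence MTS) is not linearly separable, so a single-layer Widrow-Hoff unit cannot learn it.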


11:45 AM - Luncheon

SATURDAY, JUNE 3

1:00 PM--5:15 PM

1:00 PM






11. Michael L. Commons, John T. Bailey, Charla C. Griffey, James E. Mazur, & Eric V. Bing, Models of Acquisition and Preference.

Department of Psychiatry, Harvard Medical School, Massachusetts Mental Health Center, 74 Fenwood Road, Boston, MA 02115-6196

To develop learning algorithms, acquisition of response allocation leading to steady-state matching (Herrnstein, 1970) was examined. Pigeons were exposed to random-ratio probabilistic schedules (Bailey and Mazur, submitted). A single choice-peck turned off both keys, possibly earning access to grain, and started an ITI. The data analysis examines the observed rate of acquisition and compares it to a number of nonlinear regression predictions and simulations. For the regressions, both obtained and programmed reinforcement served as parameters. Myerson and Miezin's kinetic model, along with Herrnstein and Vaughan's melioration model, were tested.


2:00 PM


12. John E. R. Staddon, Simple Parallel Model for Operant Learning with Application to a Class of Inference Problems.

Department of Psychology, Duke University, Durham, NC 27706

Instrumental (operant) learning has two aspects: the development of an appropriate repertoire of responses, and the selection of effective members from the repertoire. There has been little theoretical progress towards understanding how repertoires come to be organized, but there is a substantial literature showing how reward and punishment affect the selection of responses. We show how several well-studied examples of adaptive and maladaptive effects of reinforcement (reversibility, delay of reinforcement, contingency, superstition, and instinctive drift) can be derived from a theoretical model with only short-term memory (STM), in which a behavioral repertoire is considered as an ensemble of simple integrators.

We show how the STM model can be extended to incorporate long-term effects, which allows it to accommodate classical conditioning and performance on a task, transitive inference, normally thought to require "higher" cognitive processes.
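The STM idea above, a repertoire as an ensemble of simple integrators with reinforcement selecting among them, can be sketched directly. The decay rate, noise level, and winner-take-all response rule here are illustrative guesses, not the model's actual equations.

```python
import random

def integrate(V, chosen, reward, decay=0.8):
    """Leaky-integrator update: every response strength decays toward
    zero, and the emitted response is pushed up by its payoff."""
    return [decay * v + (reward if i == chosen else 0.0)
            for i, v in enumerate(V)]

random.seed(0)
V = [0.1, 0.1, 0.1]                    # strengths of three candidate responses
for _ in range(50):
    # emit the momentarily strongest response (tiny noise breaks ties)
    choice = max(range(3), key=lambda i: V[i] + random.gauss(0, 0.01))
    reward = 1.0 if choice == 2 else 0.0   # only response 2 is reinforced
    V = integrate(V, choice, reward)
print(V.index(max(V)))                 # the reinforced response dominates
```

Because the integrators have no long-term store, removing reinforcement lets all strengths decay back toward zero, which is how a pure-STM model captures reversibility.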


2:25 PM


13. Alliston K. Reid, Computational Models of Instrumental and Scheduled Performance.

Departments of Computer Science and Psychology, Eastern Oregon State College, La Grande, OR 97850

There are striking similarities between the credit/blame assignment problem in machine reinforcement learning and the accurate detection of contingencies by animals when first exposed to simple reward schedules. The credit assignment problem is especially difficult to solve when many decisions may be made before receiving reward or punishment (learning with a critic), such as in chess. Barto, Sutton, and Anderson (1983) have demonstrated an algorithm that appropriately assigns credit and blame to individual components of the input vector that is produced by the various concurrent sources of feedback. This presentation explores possible modifications to their algorithm in order to extend the technique to instrumental learning. The approach may allow differential sensitivity to various concurrent feedback sources available in most operant learning situations, such as time since last reward or time since last response. I believe that this approach will allow an analysis of the sources of feedback used during learning and in steady-state schedule performance.


2:50 PM - Break

3:05 PM



14. Stephen Jose Hanson, Behavioral Diversity, Hypothesis Testing, and the Stochastic Delta Rule.

Cognitive Science Laboratory, Princeton University, Princeton, NJ 08542

Biological systems engaged in "strategic" behavior--foraging, avoiding threats, sexual selection--tend to become "activated" or aroused. Concomitant with the observable consequences of arousal in terms of the rate and vigor of activity (Killeen, 1975) are changes in behavioral diversity (Staddon & Simmelhag, 1971; Hanson, 1980). The dynamics of variability in behavior during challenge can serve at least three purposes: (1) it allows the organism to entertain multiple hypotheses in a static environment, (2) it maintains a prediction history (in terms of activity variability) which is local, recent, and cheap, and (3) it allows the organism to revoke or revise strategies which may not lead to globally optimal outcomes. Problem solving and learning in many different contexts can be seen as a tension between directed search and random variation and combination of previously attempted strategies. In environments with a large number of constraints and variables to negotiate, directed search is intractable and must be amended with more local or more heuristic strategies, which may be reflected in the judicious introduction of noise inherent in biological or neural systems.

A connectionist implementation of a simple learning rule is discussed in the context of behavioral diversity and its relation to problem solving and learning. It is shown that the simple principles suggested by natural variability in behavior and neural systems provide for a heuristic search mechanism that solves problems fast and efficiently. The same mechanism can be shown to provide insights into problem solving and learning and the nature of the exploitation and generation of learning strategies.
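The stochastic delta rule treats each weight as a distribution rather than a point value: the weight actually used on a trial is sampled, the mean follows an ordinary delta-rule update, and the variance shrinks as learning proceeds, so the injected noise anneals itself. A one-weight sketch follows; the variance schedule and learning rate are illustrative, not Hanson's published settings.

```python
import random

def sdr_step(mu, sigma, x, target, lr=0.2, decay=0.95):
    """One stochastic-delta-rule step for a single linear weight:
    sample the working weight from N(mu, sigma), apply a delta-rule
    update to the mean, and shrink the standard deviation."""
    w = random.gauss(mu, sigma)          # noisy weight drives this trial
    error = target - w * x
    return mu + lr * error * x, sigma * decay

random.seed(0)
mu, sigma = 0.0, 1.0                     # broad initial search
for _ in range(100):
    mu, sigma = sdr_step(mu, sigma, x=1.0, target=2.0)
print(round(mu, 1), sigma < 0.01)        # mean homes in on 2.0 as noise anneals
```

Early in training the large variance lets the system entertain many "hypotheses" about the weight; late in training the annealed noise leaves a conventional delta rule, mirroring the shift from diverse to stereotyped behavior described above.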


4:05 PM


15. Richard S. Sutton, Time-Derivative Models of Pavlovian Reinforcement.

Computer and Intelligent Systems Laboratory, GTE Laboratories Incorporated, Waltham, MA 02254

Several computational models of classical conditioning have been proposed that model reinforcement as the time derivative of the output of a neuron-like element (Sutton & Barto, 1981; Moore et al., 1986; Klopf, 1988; Gelperin, Hopfield & Tank, 1985; Tesauro, 1986). I explore the ideas behind these models from the point of view of animal learning theory and argue that their use of time derivatives leads to difficulties in reproducing the empirical ISI dependency. I present a modified time-derivative theory of reinforcement and show that Sutton & Barto's TD model, which uses this refinement, solves the ISI problems.
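A minimal illustration of the time-derivative family under discussion: a TD-style model in which reinforcement is the discounted change in prediction and an eligibility trace bridges the CS-US interval. The stimulus timing, discount, and learning rate are invented for the demo; this is the generic TD formulation, not the specific refinement presented in the talk.

```python
def run_trial(w, cs_on, us_at, n_steps, gamma=0.95, lr=0.1, lam=0.9):
    """One conditioning trial for a single CS weight under a TD-style
    rule: the error is r(t) + gamma*V(t) - V(t-1), a discounted time
    derivative of the prediction, and an eligibility trace lets the
    US reach back across the interstimulus interval."""
    v_prev, trace = 0.0, 0.0
    for t in range(n_steps):
        x = 1.0 if cs_on <= t < us_at else 0.0   # CS spans the ISI
        r = 1.0 if t == us_at else 0.0           # US as primary reinforcement
        v = w * x                                # current prediction
        delta = r + gamma * v - v_prev           # time-derivative error
        w += lr * delta * trace                  # credit flows to traced inputs
        trace = lam * trace + x
        v_prev = v
    return w

w = 0.0
for _ in range(30):                              # forward CS-US pairings
    w = run_trial(w, cs_on=2, us_at=6, n_steps=10)
print(w > 0.5)                                   # CS acquires a positive prediction
```

The trace parameter `lam` is what shapes the model's ISI curve: how fast the trace decays determines how much credit a CS onset receives at each CS-US separation.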








REGISTRATION FEE BY MAIL

(Paid by check to Society for Quantitative Analysis of Behavior)

(Postmarked by April 30, 1989)

Name



Title



Affiliation



Address




Telephone(s)



E-mail Address



( ) Regular $35

( ) Full-time Student $25

School



Graduate Date



Print Faculty Name



Faculty Signature





PREPAID 10-COURSE CHINESE BANQUET ON JUNE 2

( ) $20 (add to pre-registration fee check)



Mail registration form and check to:


Dr. Michael L. Commons, Society for Quantitative Analysis of Behavior,


234 Huron Avenue, Cambridge, MA 02138





REGISTRATION FEE AT THE MEETING

Regular $45

Full-Time Student $30 (Students must show active student ID)

On-Site Registration: 5:00 PM--8:00 PM, June 1, at the RECEPTION in Room 1550, William James Hall, 33 Kirkland Street, and 7:30 AM--8:30 AM, June 2, in the LOBBY of William James Hall.



Registration by mail before April 30, 1989, is recommended, as seating is limited.



HOUSING INFORMATION


Rooms have been reserved in the name of the symposium ("Models of Behavior") for the Friday and Saturday nights at:


Best Western Homestead Inn, 220 Alewife Brook Parkway, Cambridge, MA 02138.

Single: $71; Double: $80.

Call (617) 491-1890 or (800) 528-1234 and ask for the Group Sales desk.


Reserve your room as soon as possible. The hotel will not hold them past May 1. Because of Harvard and MIT graduation
ceremonies, space will fill up rapidly. Other nearby hotels:


Howard Johnson's Motor Lodge
777 Memorial Drive, Cambridge, MA 02139
(617) 492-7777; (800) 654-2000
Single: $115--$135; Double: $115--$135

Suisse Chalet
211 Concord Turnpike Parkway, Cambridge, MA 02140
(617) 661-7800; (800) 258-1980
Single: $48.70; Double: $52.70