Modeling Language Acquisition
with Neural Networks

A preliminary research plan


Steve R. Howell

Presentation Overview


Goals & challenges


Previous & related research


Model Overview


Implementation details

Project Goals


Model two aspects of human language
(grammar and semantics)


Use single neural network performing word
prediction


Use orthographic representation


Use small but functional word corpus


e.g. child’s basic functional vocabulary?

Challenges


Need a network architecture capable of modeling both grammar and semantics


What if phonology is required?


Computational limitations

Previous Research

Previous Research


Elman (1990)



Mozer (1987)



Seidenberg & McClelland (1989)



Landauer et al. (LSA)



Rao & Ballard (1997)

Elman (1990)


Simple recurrent network (context units)


No built-in representational constraints


Predicts next input from current plus context


Discovers word boundaries in a continuous stream of phonemes (see the sketch below)

Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14, 179-211.

FOR MORE INFO...
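Elman's architecture is simple enough to sketch directly. Below is a minimal illustrative SRN in Python/numpy (my reconstruction, not Elman's code; all sizes and names are placeholders): the hidden layer receives the current input plus a copy of its own previous state, and the network is trained to predict the next input. For brevity it uses a truncated delta-rule update rather than full backpropagation through time.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 5, 8  # placeholder sizes

W_xh = rng.normal(0, 0.1, (n_hid, n_in))   # input -> hidden
W_hh = rng.normal(0, 0.1, (n_hid, n_hid))  # context (previous hidden) -> hidden
W_hy = rng.normal(0, 0.1, (n_in, n_hid))   # hidden -> predicted next input

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One-hot toy sequence standing in for a phoneme/letter stream
seq = [rng.integers(n_in) for _ in range(100)]
lr = 0.1
context = np.zeros(n_hid)  # Elman's "context units"

for t in range(len(seq) - 1):
    x = np.eye(n_in)[seq[t]]
    target = np.eye(n_in)[seq[t + 1]]

    hidden = sigmoid(W_xh @ x + W_hh @ context)  # current input + context
    pred = sigmoid(W_hy @ hidden)                # prediction of next input

    # Simple delta-rule updates (truncated: no backprop through time)
    err = target - pred
    d_out = err * pred * (1 - pred)
    d_hid = (W_hy.T @ d_out) * hidden * (1 - hidden)
    W_hy += lr * np.outer(d_out, hidden)
    W_xh += lr * np.outer(d_hid, x)
    W_hh += lr * np.outer(d_hid, context)

    context = hidden  # copy hidden state into context for the next step
```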

Mozer (1987): BLIRNET


Interesting model of spatially parallel
processing of independent words


Possible input representation: letter triples


e.g. money = **M, *MO, MON, …, EY*, Y** (see sketch below)


Encodes beginnings and ends of words well,
as well as relative letter position, important to
fit human relative-position priming data


Mozer, M. C. (1987). Early parallel processing in reading: A connectionist approach. In M. Coltheart (Ed.), Attention and Performance, 12: The psychology of reading.

FOR MORE INFO...
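As a concrete illustration of the triple scheme (my reconstruction, not Mozer's code), the following Python function pads a word with boundary markers and slides a three-character window across it:

```python
def letter_triples(word, boundary="*", pad=2):
    """Boundary-padded letter triples, e.g. MONEY ->
    ['**M', '*MO', 'MON', 'ONE', 'NEY', 'EY*', 'Y**']."""
    s = boundary * pad + word.upper() + boundary * pad
    return [s[i:i + 3] for i in range(len(s) - 2)]

print(letter_triples("money"))
```

With pad=1 the same function yields the contiguous-triple style that (as I read it) the Seidenberg & McClelland slide below uses, where the word-space symbol counts as one position.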

Seidenberg & McClelland (1989)


Model of word pronunciation


Again, relevant for the input representations used in its orthographic input: triples


MAKE = **MA, MAK, AKE, KE** (** = word space)


Distributed coding scheme for triples =
distributed internal lexicon

Seidenberg, M. S. & McClelland, J. L. (1989). A distributed, developmental model of word recognition and naming. Psychological Review, 96, 523-568.

FOR MORE INFO...
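The "distributed coding scheme for triples" can be illustrated very loosely (this is a toy stand-in, not the original model's encoding) by hashing each triple into a fixed-length binary vector, so that every word becomes a pattern of activity rather than a single localist entry:

```python
import numpy as np

def triple_vector(word, dim=200):
    """Map a word's boundary-padded triples into one fixed-length
    binary vector -- a toy stand-in for a distributed lexicon."""
    s = "*" + word.upper() + "*"
    vec = np.zeros(dim)
    for i in range(len(s) - 2):
        # Deterministic polynomial hash of the triple into a unit index
        idx = sum(ord(c) * 31 ** k for k, c in enumerate(s[i:i + 3])) % dim
        vec[idx] = 1.0
    return vec

print(int(triple_vector("make").sum()))  # a few active units code the word
```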

Landauer et al.


“LSA”: a statistical model of semantic learning


Very large word corpus


Significant computation required


Good performance


Data set apparently proprietary

FOR MORE INFO...

Don’t call them, they’ll call you.
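For reference, the core of LSA can be sketched in a few lines (a toy version at nothing like the scale Landauer et al. used; the corpus here is a placeholder): build a word-by-document count matrix, take a truncated SVD, and compare words by cosine similarity in the reduced space.

```python
import numpy as np

docs = ["the cat sat on the mat",
        "the dog sat on the log",
        "stocks fell and bonds rose"]          # toy corpus (placeholder)
vocab = sorted({w for d in docs for w in d.split()})
X = np.array([[d.split().count(w) for d in docs] for w in vocab], float)

U, S, Vt = np.linalg.svd(X, full_matrices=False)
k = 2                                          # truncated dimensionality
word_vecs = U[:, :k] * S[:k]                   # word vectors in LSA space

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

i, j = vocab.index("cat"), vocab.index("dog")
print(cos(word_vecs[i], word_vecs[j]))         # related words score higher
```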

Rao & Ballard (1997)


Basis for present network


Algorithms based on extended Kalman Filter


Internal state variable is output of
input*weights


(Internal state)*(transpose of feedforward
weights) feeds back to predict next input

Rao, R.P.N. & Ballard, D.H. (1997). Dynamic model of visual
recognition predicts neural response properties in the visual
cortex.
Neural Computation, 9, 721-763.

FOR MORE INFO...
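Stripped of the Kalman-filter machinery, the feedforward/feedback loop the slide describes can be sketched as follows (a gross simplification of Rao & Ballard's actual algorithm; the names and the Hebbian-style learning rule are mine, not theirs):

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_state = 16, 4
W = rng.normal(0, 0.1, (n_state, n_in))   # feedforward weights

lr = 0.05
for _ in range(200):
    x = rng.normal(size=n_in)             # stand-in for an input frame
    r = W @ x                             # internal state = input * weights
    x_pred = W.T @ r                      # state * W^T feeds back as prediction
    err = x - x_pred                      # prediction error drives learning
    W += lr * np.outer(r, err)            # toy update rule, not the EKF
```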

Model Overview

Model Overview


Architecture as in Rao & Ballard


Recurrent structure excellent for
temporal variability


Starting with single layer network


Moving to multi-layer Rao & Ballard net

Model Overview (cont’d)



High-level input representations


First layer of net performs word prediction from letters


Second layer adds word prediction from previous words


Words predict next words: simple grammar?

Model Overview (cont’d)



Additional higher levels should add
larger temporal range of context


Words in a large temporal range around the current word help to predict it

Implies semantic linkage?


Analogous to LSA “Bag of words”
approach at these levels
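To make the proposed stacking concrete, here is one way the two lowest levels might be wired (purely my reading of the plan, with placeholder shapes and random stand-in inputs): level 1 predicts the current word from its letters, and level 2 predicts the next word from a window of recent words.

```python
import numpy as np

rng = np.random.default_rng(2)
n_letters, n_words, window = 26, 50, 4    # placeholder sizes

W1 = rng.normal(0, 0.1, (n_words, n_letters))          # letters -> word
W2 = rng.normal(0, 0.1, (n_words, n_words * window))   # prev words -> next word

letters = rng.random(n_letters)            # stand-in letter-feature input
word = W1 @ letters                        # level 1: word prediction from letters

prev_words = rng.random(n_words * window)  # concatenated recent word vectors
next_word = W2 @ prev_words                # level 2: next-word prediction
# Higher levels would widen the temporal window further, moving from
# word-order (grammar-like) regularities toward bag-of-words (semantic) ones.
```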

Possible Advantages


Lower-level units learn grammar


Higher-level units learn semantics


Combines grammar-learning methods with “bag of words” approach


Possible modification for language generation

Disadvantages


Complex mathematical implementation


Unclear how well higher levels will actually
perform


As yet unclear how to modify the net for
language generation

Implementation Details

Implementation Challenges


Locating basic functional vocabulary of
English language


600-800 words?


Compare to children’s language
learning/usage, not adults


Locating child data?

Model Evaluation (basic)


Test grammar learning as per Elman


Test semantic regularities as for LSA
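One simple way to operationalize both tests (my suggestion; the plan does not specify measures): score grammar learning by next-word prediction error on held-out sentences, Elman-style, and score semantic learning by cosine similarity between learned word vectors, LSA-style.

```python
import numpy as np

def next_word_error(pred, target):
    """Grammar test: mean squared next-word prediction error (Elman-style)."""
    return float(np.mean((pred - target) ** 2))

def semantic_similarity(v1, v2):
    """Semantics test: cosine between learned word vectors (LSA-style)."""
    return float(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-12))
```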

Model Evaluation (optimistic)

If generative modifications possible:


Ability to output words/phrases
semantically linked to input?


ElizaNet?


Child Turing Test?


Human judges compare model output to
real children's output for same input?

Current Status


Continue reviewing previous research


Working through implementation details
of Rao & Ballard algorithm


Considering different types of high-level input representations


Need to develop/acquire basic English
vocabulary/grammar data set

Thank you.

Questions and comments are
sincerely welcomed. Thoughts on
any of the questions raised herein
will be extremely valuable.

FOR MORE INFO...

Please see my web page at:
http://www.the-wire.com/~showell/