Natural Language Processing and Cognitive Science


NLPCS 2012


University of Wroclaw, School of Economics

Wroclaw, Poland

June 28, 2012



Amy Neustein, Ph.D.

Founder and CEO

Linguistic Technology Systems

www.lingtechsys.com

Goals:


Broaden the definition of Cognitive Science (CS) to make its application to NLP timeless, as opposed to fashionable (e.g. neurolinguistics, psycholinguistics, semiotics, social psychology, theoretical linguistics, etc.)

Use a generic definition of CS (an interdisciplinary field of study concerned with how information is represented, processed and transformed) to encompass certain disciplines that are traditionally outside the scope of cognitive studies.




The Role of Cognitive Science in Natural Language Processing

Selecting apposite methods for EACH task at hand rather than broadly applying methods (individually or collectively) of human language study

Understanding both the method AND its origins when applying different disciplines to solve natural language problems



What Constitutes a Sensible Interdisciplinary Approach to NLP?

RELATED DISCIPLINES FOR THE STUDY OF HUMAN LANGUAGE

Discourse Analysis and Computational Linguistics

Sociology, Sociolinguistics, and Conversation Analysis

Linguistic Philosophy and Speech Act Theory

Artificial Intelligence, Soft Computing, and Argument-Based Computing

Cognitive Psychology and Psycholinguistics

EXAMPLES OF CHALLENGING NLP TASKS THAT MAY BENEFIT FROM A COGNITIVE SCIENCE APPROACH

Machine Translation of Web Pages (and cross-lingual text mining in which material is extracted from a specific portion of the text) in Under-Resourced Languages

Design of Spoken Dialog Systems that Must Adjust to Ambiguous Customer Requests or to Complaints that do not Contain Standard Keywords used to Express Anger/Frustration

Design of Electronic Dictionaries and Search Engines that Conform to Dynamic (rather than static) Referential Practices

Intelligent Tutoring Systems that Can Reach Deeper levels of Understanding by Responding to Human Emotions and Disengagement Behavior

Computer-Aided Bilingual Instruction for Hearing-Impaired Primary School Students that use Situated Learning Techniques


NLP PROBLEMS POSED BY UNDER-RESOURCED LANGUAGES WHICH CANNOT BE SOLVED STATISTICALLY

Under-resourced languages pose several problems for computational modeling:

1) They produce small quantities of parallel corpus data;

2) Speech recognition accuracy is compromised by small quantities of parallel corpus data (this is especially true for computational (statistical) models that have come to depend on large amounts of corpus data to perform with a high level of recognition accuracy); and

3) Performing disambiguation of sense-meaning of words (among other NLP tasks) can be hampered by limited parallel corpus data.


COGNITIVE STRATEGIES FOR WORD SENSE DISAMBIGUATION (WSD)

A Lexically Sensitive Model of WSD (Kwong 2012): How Does it Work and Why is it Important?

Given that intrinsic properties of words are closely related to our cognition, a lexically sensitive model of WSD presents one possible solution to the ambiguities found in natural language.

Such a model would separate words, or more accurately word senses, into fairly distinct groups (sense types) according to their responses to disambiguation, based on different knowledge sources.

Such sense types go beyond simple linguistic categories such as POS because they are more likely to be semantic and perceptual.

SEPARATING WORDS INTO SENSE TYPES ACHIEVES DISAMBIGUATION

By separating words into sense types, the knowledge pertaining to information susceptibility of target words (the relation between the intrinsic properties of a word and the effectiveness of various types of lexico-semantic knowledge to characterize and disambiguate it) can help fine-tune WSD systems and inform the optimal combination of knowledge sources for disambiguation.

USING PSYCHOLINGUISTIC EVIDENCE TO RESOLVE AMBIGUITIES PRESENT IN NATURAL LANGUAGE

Kwong (2012) demonstrates that a lexically sensitive model for WSD, one that combines both a cognitive and computational perspective, will better inform automatic systems with psycholinguistic evidence instead of “resting entirely and helplessly with specific machine learning algorithms and their feature selection mechanisms” (p. 92).


AN INTERDISCIPLINARY APPROACH TO THE DESIGN OF SPOKEN DIALOG SYSTEMS

Bel-Enguix and Jimenez-Lopez propose Conversational Grammar Systems (CGS) to model dialog as inter-action: “a sequence of acts performed by two or more agents in a common environment” (2008: 209).

The authors drew from the conversation analytic literature, which they combined with their knowledge of computational linguistics, formal language theory, and speech act theory, maintaining that because the “investigation and modeling of human language is clearly an interdisciplinary task…methods for language technology have to come from different disciplines” (p. 219).

THE PROBLEM OF METAPHORS FOR
SPOKEN LANGUAGE UNDERSTANDING

Barnden (2008) explored the problem presented by the use of metaphors when performing both text- and speech-based NLP tasks, pointing out that while the problem of metaphor may be viewed “as a peripheral problem (perhaps mostly to do with poetry and other literary language) it is in fact a pervasive feature of mundane language…” (p. 121).

Barnden employed an interdisciplinary approach, augmenting discourse analysis with conversation analysis by drawing on the research of Paul Drew and Elizabeth Holt (1998), who showed how speakers employ the art of metaphor to achieve topic transition in conversation.

HOW AN INTERDISCIPLINARY APPROACH AIDS THE DIALOG MANAGER

HERE IS AN ILLUSTRATION OF HOW AN INTERDISCIPLINARY APPROACH ASSISTS THE DIALOG MANAGER IN GAUGING WHAT IS HAPPENING AT EACH TURN:

ANALYZING THE TURN-TAKING FEATURES OF CONCESSIVE CONNECTORS

Popescu, Caelen, and Burileanu (2009) discuss the importance of a dialog manager’s correct reading of a “concessive connector” (Moeschler and Reboul, 1994), or what might be seen as a clue word (“so,” “anyway,” “now,” “but,” “although”), that connects the various utterances (or context space, Reichman, 1985) that comprise a multi-utterance speaking turn.

Computational linguists, in the absence of conversation analysts who study in painstaking detail the turn-taking features of talk-in-interaction (and how speakers demonstrate, through the design of their speaking turns, their understanding and interpretation of each other’s social actions), can misinterpret the meaning/function of the concessive connector, which may in fact NOT serve its literal meaning of “connecting” parts of a multi-utterance speaking turn (Neustein, 2001; 2004; 2007; 2011).

GIVING DIALOG MANAGER TWO CONTRASTIVE SCENARIOS FOR CONCESSIVE CONNECTOR

Scenario 1: The current speaker, immediately after producing the concessive connector (“so”), displays a “holding” silence (Jefferson, 1983; 1986): an abrupt silence accompanied by marked inhalation that indicates the speaker’s intent to “hold” the turn and continue speaking.

Scenario 2: The current speaker, unlike in the first scenario, immediately after producing a concessive connector (“so”), displays a “trail off” silence: a gradual silence accompanied by exhalation, the kind of silence that provides a clear transition relevance place for the other speaker to begin to speak.

AIDING THE DIALOG MANAGER BY ADDING CONVERSATION ANALYSIS TO COMPUTATIONAL LINGUISTICS

The two scenarios show that a concessive connector or clue word does not always serve the purpose of connecting the utterances of a multi-utterance turn because, as demonstrated in the second scenario, the speaker’s intent may be to yield his turn to the next speaker rather than to continue speaking.

This is amply demonstrated by the speaker’s use of a clue word followed by a “trail off” silence, which indicates the speaker’s intent to relinquish his turn to the next speaker. Dialog managers must be able to recognize some of the formal properties of conversational interaction, such as the difference between a “holding” silence and a “trail off” silence, to gauge what is happening at each turn.
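
A minimal sketch of how a dialog manager might act on the difference between the two silences. It assumes an upstream prosody module already labels the silence after a clue word by onset ("abrupt" or "gradual") and breath ("inhalation" or "exhalation"). The `Silence` class, the label strings, and the `turn_signal` function are hypothetical names introduced for illustration; they are not taken from the cited work.

```python
from dataclasses import dataclass

# Clue words listed in the talk as examples of concessive connectors.
CLUE_WORDS = {"so", "anyway", "now", "but", "although"}


@dataclass
class Silence:
    onset: str   # "abrupt" or "gradual" (hypothetical label from a prosody module)
    breath: str  # "inhalation" or "exhalation" (hypothetical label)


def turn_signal(last_word: str, silence: Silence) -> str:
    """Classify the silence after a clue word as "hold", "yield", or "unknown"."""
    if last_word.lower() not in CLUE_WORDS:
        return "unknown"
    if silence.onset == "abrupt" and silence.breath == "inhalation":
        return "hold"    # "holding" silence: the speaker intends to continue
    if silence.onset == "gradual" and silence.breath == "exhalation":
        return "yield"   # "trail off" silence: transition relevance place
    return "unknown"


print(turn_signal("so", Silence("abrupt", "inhalation")))
print(turn_signal("so", Silence("gradual", "exhalation")))
```

The same clue word ("so") leads to opposite decisions depending on the silence that follows it, which is the point the two scenarios make. Anything that matches neither scenario is left as "unknown" rather than guessed.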

LIMITATIONS OF NATURAL LANGUAGE UNDERSTANDING METHODS FOR PARSING OF CALL CENTER DIALOG

Since it would be practically impossible to construct grammars that could cover all spontaneous utterances, including all concomitant disfluencies, robust parsing of spontaneous speech has proven to be a practical alternative to the crafting of rule-based grammars (Pieraccini, 2012, p. 162).

Using statistical modeling, “conceptual HMMs can find the most probable concepts represented by a sequence of words, just as acoustic HMMs can find the most probable phonemes for a given sequence of acoustic observations” (p. 164).

Reduced to sequences of words and their associated probabilities depending on context, robust parsing methods, however, can be severely hampered when keywords are not found in the dialog (Neustein, 2001; 2006; 2007).

ANGRY UTTERANCES THAT ELUDE THE NATURAL LANGUAGE UNDERSTANDING MODULE

Keywords associated with anger and frustration:

“cancel my account”

“give me a supervisor”

“I’m switching to X (competitor)”

When keywords are absent from the dialog, the Spoken Language Understanding module fails to identify angry/frustrated customers.
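
The failure mode can be illustrated with a toy keyword spotter. This is only a sketch of plain substring matching, not a description of any deployed Spoken Language Understanding module. The function name `keyword_flags_anger` is hypothetical, and the open-ended keyword “I’m switching to X (competitor)” is reduced to its fixed prefix as a simplifying assumption.

```python
# Keyword phrases from the slide. The open-ended "I'm switching to X (competitor)"
# is reduced to its fixed prefix, which is an assumption of this sketch.
ANGER_KEYWORDS = ("cancel my account", "give me a supervisor", "i'm switching to")


def keyword_flags_anger(utterance: str) -> bool:
    """Return True if the utterance contains one of the standard anger keywords."""
    text = utterance.lower().replace("\u2019", "'")  # normalize curly apostrophes
    return any(keyword in text for keyword in ANGER_KEYWORDS)


print(keyword_flags_anger("Just cancel my account."))               # True
print(keyword_flags_anger("Well! This is absolutely ridiculous!"))  # False
```

The second utterance is plainly irate, yet it contains none of the standard keywords, so a spotter of this kind lets it pass unflagged.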

ADDING CONVERSATION ANALYSIS TO THE DESIGN OF SPOKEN DIALOG SYSTEMS TO IDENTIFY SPEAKER STATE

GOALS:

To devise a parsing method that builds conversation analysis into Spoken Dialog Systems

To base this parsing method on a statistical language modeling approach to understanding natural language dialog in lieu of rule-based grammars that anticipate all constructions of spontaneous utterances and their associated disfluencies

GOALS, continued

To build a BNF table (built upon more elemental units) consisting of a set of non-terminals (context-free grammatical units and their related prosodic features) for which there is a corresponding list of interchangeable terminals (words, phrases, or a whole utterance) (Neustein, 2007)

To build this multi-tiered BNF table with an elaborate incremental design of complex grammatical units that capture the kind of speaker state data (angry/frustrated) that elude natural language systems that search for standard keywords (e.g. “cancel my account”) (Neustein, 2006; 2011)
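
One way to picture such a table is as two small lookup structures. The sketch below is a hypothetical layout, not the published SPA table. Tier 1 maps non-terminals to interchangeable terminals (the phrases are ones quoted elsewhere in this talk), and tier 2 defines a larger unit over tier-1 non-terminals. The composition rule for `VERY_ANGRY_COMPLAINT` and all identifiers are assumptions made for illustration, and prosodic features are left out entirely.

```python
# Tier 1: non-terminals mapped to interchangeable terminals (lowercased).
# The table layout is hypothetical; prosodic features are omitted.
TIER_1 = {
    "EXAGGERATIVE_QUALIFIER": ["absolutely unbelievable", "absolutely ridiculous"],
    "EXCLAMATION_WITH_PROSODY": ["well!"],
    "DECLARATIVE_ASSERTION": ["i intend to take this much further"],
}

# Tier 2: a larger unit built from tier-1 non-terminals (hypothetical rule).
TIER_2 = {
    "VERY_ANGRY_COMPLAINT": {"EXAGGERATIVE_QUALIFIER", "DECLARATIVE_ASSERTION"},
}


def tier_1_matches(utterance: str) -> set:
    """Return the tier-1 non-terminals with a terminal found in the utterance."""
    text = utterance.lower()
    return {
        non_terminal
        for non_terminal, terminals in TIER_1.items()
        if any(terminal in text for terminal in terminals)
    }


def tier_2_matches(found: set) -> set:
    """Return the tier-2 units whose required tier-1 parts are all present."""
    return {unit for unit, parts in TIER_2.items() if parts <= found}


found = tier_1_matches("Absolutely unbelievable! I intend to take this much further")
print(sorted(found))
print(sorted(tier_2_matches(found)))
```

Because terminals under one non-terminal are interchangeable, adding a new phrasing only means extending a list. The higher tier never needs to know which wording was used.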

NEW NLU METHOD FOR BUILDING MULTI-TIERED TABLE OF SPEAKER-STATE PARSING STRUCTURES

SEQUENCE PACKAGE ANALYSIS (or SPA) constitutes a new NLU method for classifying speaker state (Neustein, 2001; 2004; 2006; 2011)

SPA algorithms identify in spoken language dialog (and blogs, tweets, and other social media) the conversational sequence patterns of natural language dialog that reflect elusive, sometimes confounding, human emotions

SPA draws from the field of conversation analysis, a rigorous, empirically-based method of recording and transcribing verbal interaction (using highly refined transcription symbols to identify linguistic and paralinguistic features) to study how speakers demonstrate, through the design of their speaking turn, their understanding and interpretation of each other’s social actions



HOW DOES SPA WORK?

SPA relies more on the sequence package (a series of related turns and turn construction units or parts of turns that are discretely packaged as a sequence of conversational interaction) in its entirety, as the primary unit of analysis, than on isolated syntactic parts

By marking sequence package boundaries and specifying package properties, the SPA-enhanced mining program gives the software downstream the contextual indicia (the precise location points in the flow of interactive dialog, signifying the different conversational activities and phases of the dialog) needed to interpret the rest of the data stream reliably.

By parsing dialog for its relevant sequence packages (those that are discretely packaged as a sequence of conversational interaction), the SPA-designed natural language interface extracts important data, including emotional content on speaker state, by looking at the sequential order and frequency of the totality of the context-free grammatical components that make up each sequence package

ILLUSTRATION OF ANGRY CALLER IN THE ABSENCE OF STANDARD KEYWORDS THAT SIGNIFY AN ANGRY CALLER

Note: Punctuation symbols below are acoustic and not grammatical: question marks appear mid-sentence to indicate an upward query at that location point in the dialog; exclamatory marker is used to indicate a rise in inflection

Caller: Absolutely unbelievable! What is your? name

Agent: Mr. Smith

Caller: Well! I intend to take this much further…This is just absolutely ridiculous!
SEQUENCE PACKAGES AND CORRESPONDING ANGER INDEX

Absolutely Unbelievable! <Exaggerative Qualifier> (8)

What is your? name <Identification Request with Inflection> (non sequitur; accusatory tone as indicated by displaced (mid-sentence) inflection) (9)

Well! <Exclamation with Prosody> (7)

I intend to take this much further… <Declarative Assertion> (9)

This is absolutely ridiculous! <Exaggerative Qualifier> (8)

Total Score for Customer Anger Index: 41
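
The total can be reproduced by summing the per-package values. The sketch below only encodes the five labeled packages and their scores as listed on the slide. How each individual value is assigned is not specified here, so the scores are treated as given inputs. The tuple layout and the `anger_index` function are hypothetical names used for illustration.

```python
# (utterance, parsing structure, anger value) exactly as listed on the slide.
SEQUENCE_PACKAGES = [
    ("Absolutely Unbelievable!", "Exaggerative Qualifier", 8),
    ("What is your? name", "Identification Request with Inflection", 9),
    ("Well!", "Exclamation with Prosody", 7),
    ("I intend to take this much further…", "Declarative Assertion", 9),
    ("This is absolutely ridiculous!", "Exaggerative Qualifier", 8),
]


def anger_index(packages) -> int:
    """Sum the anger values of all sequence packages in a call."""
    return sum(score for _utterance, _label, score in packages)


print(anger_index(SEQUENCE_PACKAGES))  # 41
```

None of the five utterances contains a standard keyword, yet the summed index is high because every package in the sequence contributes to it.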

ACCRETION OF MORE ELEMENTAL PARSING FEATURES IN ANGRY CALLER EXAMPLE

A “very angry complaint” is illustrated on the BNF table as the natural accretion of its more elemental parsing features:

assertions

exaggerations

declarations

SALIENCE VALUE ATTRIBUTED TO PARSING STRUCTURES

Note: For the purposes of this illustration, I am not addressing the smaller POS grammatical units that make up the larger parsing structures, such as exaggerative qualifiers or exclamation with prosody, since it is a given that a spoken language system would identify the smaller units that make up these larger parsing structures.

Descriptors (“absolutely unbelievable,” “absolutely ridiculous”) have “high salience value” (they co-occur with the emotion class “anger” or “surprise,” as opposed to a low salience value ascribed to more neutral words, such as “continue” or “yes”); yet there are still no “catch” phrases or standard keywords in the dialog to signify an irate caller

DESIGN OF DYNAMIC ELECTRONIC DICTIONARIES AND DYNAMIC SEARCH ENGINES TO CONFORM TO HUMAN LANGUAGE USE

Language Use is Not a Static Process: The meaning of words continually evolves as words derive their meaning from their contextual usage and in turn “reflexively” re-define context through their use (e.g., political usage of “transparency” or “bailout”)

Query-Based Search is Not a Static Process: Users do not engage in what is known as “precise search,” which presupposes that users know exactly what to look for: a precise paper, knowing its title, authors, and major theme. It is, therefore, not unusual for users to input search terms that are different from the index terms used by the system (Kboubi, et al. 2012)

ELECTRONIC DICTIONARIES

Electronic dictionaries “promise dynamic, proactive search via multiple criteria (meaning, sound, related words) and via diverse access routes. Navigation (supported by our understanding of the mental lexicon and an integration of these findings into the design of electronic dictionaries) takes place in a huge conceptual lexical space, and the results are displayable in a multitude of forms (e.g. as trees, as lists, as graphs, or sorted alphabetically, by topic, by frequency)” (Zock and Rapp, 2012).

DYNAMIC AND FLEXIBLE SEARCH ENGINES

Kboubi, et al. (2012) Propose Alternative Search Types:

“thematic search” (allowing users to navigate the corpus according to a particular theme)

“connotative search” (allowing users to discover the associated and similar concepts to their target concepts)

“exploratory search” (allowing users to ‘consult’ with the corpus so that they will derive a better idea of what they were not able to initially define)

APPLYING CONVERSATION ANALYSIS TO STUDY WEB USERS’ DYNAMIC SEARCH PROCESS

Moore (2012) showed that web searchers display interactional competencies found in conversational dialog in the course of their query formulations produced during an Internet search:

First, they formulate their queries using names for the entity that occupies their online search

Second, they resort to generic descriptions when they don’t know the name of the entity in question

Third, they use the newly learned name (uncovered during a generic search) as opposed to generic descriptions in all subsequent searches

ONLINE SEARCH PROBLEMS AND THEIR CONVERSATIONAL DIALOG COUNTERPART

REFERENCE GENERATION: In the process of performing reference generation the speaker must move from his own “egocentric point of view…to the listener’s position” so that the referring expressions are aimed at the listener’s frame of reference and cognitive state (Zock, et al., 2012). In conversation speakers display a preference for recognitionals [referring expressions that are recognized/understood by the other speaker] as “stronger than the preference for minimization” [use of a single reference form, usually a name] (Sacks and Schegloff, 1979).

SIGNS OF TROUBLE: When listeners fail to recognize the referring expression (e.g., the name of a person or object), speakers tend to become verbose, giving multiple descriptions in an attempt to gain recognition over concision. “[W]hen [online] search queries are verbose due to their descriptive nature, they can be taken as signs of interactional trouble and of a knowledge gap on the part of the user…The occurrence of verbose queries suggests that the user chose to relax the preference for minimization [usually a single reference form]…After a few failed queries… [users] often formulated their queries as grammatical questions thereby increasing query length” (Moore, 2012).

ARRIVING AT NATURAL LANGUAGE SOLUTIONS TO WEB SEARCH CHALLENGES

Break online search sessions into segments. Pay close attention to midway points in users’ online search sessions, where long/verbose queries (“kitchen-sink” queries) signifying trouble are most likely to occur during difficult tasks (Moore 2012)

Look for indications of what Moore (2012) calls “kitchen-sink” queries (in the absence of proper entity names, users progressively pile on additional descriptive words and phrases until they become unwieldy)

Determine whether web searches that fail to bring up useful results are caused by scarcity of available information on the web OR by users’ failure to use a correct entity name
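
A rough sketch of how “kitchen-sink” queries might be spotted within a session. It assumes a query counts as a kitchen-sink query when it reaches a minimum word count and is longer than the query before it, reflecting the progressive piling on of descriptive words. The threshold `min_words`, the function name, and the sample session are all hypothetical; none of them comes from Moore (2012).

```python
def kitchen_sink_indices(queries, min_words=6):
    """Return positions of queries that are long and longer than the previous one."""
    flagged = []
    previous_length = 0
    for index, query in enumerate(queries):
        length = len(query.split())
        # Flag only queries that both exceed the threshold and keep growing.
        if length >= min_words and length > previous_length:
            flagged.append(index)
        previous_length = length
    return flagged


# Hypothetical session: the user does not know the entity name.
session = [
    "large pot",
    "large pot for oven",
    "large heavy pot with lid for oven stews",
]
print(kitchen_sink_indices(session))  # [2]
```

Word count alone cannot tell whether the web lacks the information or the user lacks the entity name. A flag of this kind only marks where in the session to look.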

A NATURAL LANGUAGE HEURISTIC FOR IMPROVING ONLINE SEARCH RESULTS

Similar to web-based applications for teaching foreign speakers how to perform reference generation (Zock et al., 2012), users can be trained to use the correct search item in the following way:

Using both images of entity descriptors and text descriptions that provide the best alternate match for the user’s incorrect entity name, users can be taught the correct entity name for future online search purposes (e.g., “large pot” vs. “casserole pot”)

Users can be presented with a pop-up link labeled ‘I don’t know what it’s called’ (Moore 2012) when they first begin to demonstrate signs of trouble (e.g., repetitive “kitchen-sink” queries). This link may be used as a standard query option (Moore 2012) or it may be activated only if users display search difficulties.
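
The option of activating the ‘I don’t know what it’s called’ link only when users display search difficulties could be sketched as a simple trigger. The rule used here, a run of consecutive long queries, is an assumption made for illustration. The parameters `min_words` and `repeats`, the function name, and the sample queries are hypothetical.

```python
def should_offer_name_help(queries, min_words=6, repeats=2):
    """Return True once `repeats` consecutive queries have at least `min_words` words."""
    streak = 0
    for query in queries:
        if len(query.split()) >= min_words:
            streak += 1
        else:
            streak = 0  # a short query breaks the run of verbose queries
        if streak >= repeats:
            return True
    return False


struggling = [
    "large pot",
    "big heavy pot with a lid for the oven",
    "what is the big heavy pot with a lid called",
]
print(should_offer_name_help(struggling))                       # True
print(should_offer_name_help(["large pot", "casserole pot"]))  # False
```

Setting `repeats` to 1 would make the link appear at the first verbose query. Higher values wait for repeated signs of trouble.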

REFERENCES

Barnden, J. A., 2008. Challenges in natural language understanding: the case of metaphor (commentary). International Journal of Speech Technology, 11(3-4): 121-123.

Bel-Enguix, G., Jimenez-Lopez, M.D., 2008. Modelling dialog as inter-action. International Journal of Speech Technology, 11(3-4): 209-221.

Drew, P., Holt, E., 1998. Figures of speech: Figurative expressions and the management of topic transition in conversation. Language in Society, 27(4): 495-522.

Kwong, O., 2012. New Perspectives on Computational and Cognitive Strategies for Word Sense Disambiguation. SpringerBriefs, Series in Speech Technology (Neustein, A., Ed.) Springer-Verlag, Berlin Heidelberg New York.

Jefferson, G., 1983. On a failed hypothesis: “Conjunctionals” as overlap vulnerable. Tilburg Papers in Language and Literature, 28: 29-33.

Jefferson, G., 1986. Notes on ‘latency’ in overlap onset. Human Studies, 9(2-3): 153-183.

Kboubi, F., Habacha Chaibi, A., and BenAhmed, M., 2012. Semantic visualization and navigation in textual corpus. International Journal of Information Sciences and Techniques, 2(1): 53-63.

Moeschler, J., Reboul, A., 1994. Dictionnaire Encyclopedique de pragmatique. Seuil, Paris.

Moore, R. J., forthcoming. A Name is Worth a Thousand Pictures: Referential Practice in Search Engine Interactions. In A. Neustein and J. M. Markowitz (Eds.), Machine Talk: The Next Generation of Natural Language Processing and Speech Technology. Springer-Verlag, Berlin Heidelberg New York.

Neustein, A., 2001. Using sequence package analysis to improve natural language understanding. International Journal of Speech Technology, 4(1): 31-44.

Neustein, A., 2004. Sequence Package Analysis: a new natural language understanding method for performing data mining of help-line calls and doctor-patient interviews. In B. Sharp (Ed.), Proceedings of first international workshop on natural language understanding and cognitive science, ICEIS 2004, University of Portugal, (April 13) pp. 64-74.

Neustein, A., 2006. Sequence Package Analysis: A new natural language understanding method for improving human response in critical systems. International Journal of Speech Technology, 9(3-4): 109-120.

Neustein, A., 2007. Sequence Package Analysis: A new method for intelligent mining of patient dialog, blogs and help-line calls. Journal of Computers, 2(10): 45-51.

Neustein, A., 2011. Sequence Package Analysis and Soft Computing: Introducing a new hybrid method to adjust to the fluid and dynamic nature of human speech. In E. Corchado, V. Snasel, J. Sedano, A.E. Hassanien, J.L. Calvo, and D. Slezak (Eds.) Soft Computing Models in Industrial and Environmental Applications, 6th International Conference SOCO2011: Advances in Intelligent and Soft Computing, Volume 87, Springer-Verlag, Berlin Heidelberg New York, pp. 1-10.

Pieraccini, R., 2012. The Voice in the Machine. MIT Press, Cambridge, Mass.

Popescu, V., Caelen, J., and Burileanu, D., 2009. A constraint satisfaction approach to context-sensitive utterance generation in multi-party dialogue systems. International Journal of Speech Technology, 12(2-3): 95-112.

Sacks, H., Schegloff, E. A., 1979. Two preferences in the organization of reference to persons in conversation and their interaction. In G. Psathas (Ed.), Everyday Language: Studies in Ethnomethodology. Irvington Publishers, Inc., New York, pp. 15-21.

Zock, M., Lapalme, G., and Yousfi-Monod, M., 2012. Learn to speak like normal people do: The case of object descriptions. In B. Sharp and M. Zock (Eds.), Proceedings of 9th International Workshop on Natural Language Processing and Cognitive Science. Wroclaw University of Economics, Wroclaw, Poland (June 28, 2012).

Zock, M., Rapp, R., 2012. Cognitive Aspects of the Lexicon (CogALex-III). Workshop in conjunction with the 24th International Conference on Computational Linguistics, Mumbai, India (December 8-15, 2012).