
24 Oct 2013


Natural Language Processing in
Augmentative and Alternative
Communication



Yael Netzer

Outline


Natural Language Processing


Augmentative and Alternative Communication


My work: generation of messages


What does the process look like


What is needed


NLP in AAC


Word prediction


Message generation


IR methods


Natural Language Processing

Applications and models that use linguistic
knowledge, or that provide linguistic
knowledge (POS taggers, parsers, etc.)


Language applications:


Machine translation


Text summarization


Information retrieval/extraction


Human-computer interfaces.

Alternative and Augmentative
Communication


AAC Users


Congenital diseases e.g. cerebral palsy


Progressive diseases, e.g. amyotrophic lateral sclerosis (ALS, Lou
Gehrig's disease)


Trauma e.g. head injury


Cognitive disabilities vs. physical disabilities (each requires
different methods and assumptions).


Slow rate of conversation


Speech rate: 150-200 wpm; skilled typist: 60 wpm

Speech prosthesis users: 10-15 wpm


Each ‘key stroke’ may consume a lot of energy.


Trade off between conversation rate and cohesion of utterances.
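To make the gap concrete, here is a back-of-the-envelope calculation using the rates quoted above. The range midpoints and the 20-word utterance length are assumptions chosen only for illustration.

```python
# Rates in words per minute, taken from the figures quoted above.
RATES_WPM = {
    "speech": 175.0,            # midpoint of 150-200 wpm (assumed)
    "skilled typist": 60.0,
    "speech prosthesis": 12.5,  # midpoint of 10-15 wpm (assumed)
}


def seconds_for(words: int, wpm: float) -> float:
    """Seconds needed to produce `words` words at `wpm` words per minute."""
    return words * 60.0 / wpm


if __name__ == "__main__":
    utterance_length = 20  # hypothetical utterance length
    for mode, wpm in RATES_WPM.items():
        print(f"{mode}: {seconds_for(utterance_length, wpm):.0f} s")
```

Under these assumptions a speech prosthesis user needs about a minute and a half for an utterance that a speaker produces in a few seconds, which is why every saved keystroke matters.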


AAC Techniques


Simple pointing on boards and letter charts



Portable keyboard devices



Computer-based systems using single-switch
access for severely impaired subjects.


Symbols or letters



Various symbol systems (Blissymbolics) /sets (PCS).


Pre-stored phrases accessible via grid or iconic
buttons.


AAC and NLP


Common issues:


Text generation


Speech recognition


Text to speech synthesis


Information retrieval.


Three workshops and one special issue of a journal
(Natural Language Engineering).

Our framework


Natural language generation


Content planning


Surface realization


Lexical choice


Syntactic realization


Morphological processing.


FUF/SURGE, HUGG


Lexicon (Jing et al.)

Desired Scenario



Content planning -> Lexicalization -> Syntactic realization -> Vocalization

Examples


ME / TO SEE / CAT / TO EAT


I saw the cat eating.


CAT / TO EAT / TO SEE / ME


The cat ate and I saw it


The cat that ate saw me.


Blissymbolics


Invented by Charles Bliss (1965) as a written
universal language.


Adapted by Canadian speech therapists in the
early 70s: a successful alternative to verbal communication.


Consists of approx. 100 basic symbols.

The language now consists of more than 2000 complex ‘words’.


Not so easy to learn but…


Enables good novel/personal expressions


Good basis for literacy


Adults like to use it too.

Bliss Example

[Figure: the Bliss symbol for "afraid", composed from the symbols for "sad" and "future"]

Example: Minspeak / PCS lexicon

apple, food, eat

house + food = grocery

rainbow + apple = red

Me See Cat Eat

Cat Eat See Me (Hebrew version)

Syntax: Ambiguity


ME / SEE / CAT / TO EAT


I saw the cat eating.


CAT / TO EAT / SEE / ME


The cat ate and I saw it


The cat that ate saw me.


However, users of AAC don’t usually obey the
word order of spoken language:


go+girl+house or girl+house+go or house+go+girl

Two+bed+sleep+boy+one+girl+white+bed+brown+bed (the boy and the girl are sleeping in two
beds, one in a white bed and the other in a brown bed).
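Because the symbol order is free, an interpreter cannot rely on position to assign roles. The following is a minimal sketch, not the system described in these slides, that assigns roles by semantic type instead; the mini-lexicon and the role names are hypothetical.

```python
# Hypothetical mini-lexicon: each symbol is tagged with the role it can fill.
LEXICON = {
    "go": "action",
    "girl": "agent",
    "house": "place",
}


def interpret(sequence: str) -> dict:
    """Map a '+'-separated symbol sequence to a role frame, ignoring order."""
    frame = {}
    for raw in sequence.split("+"):
        symbol = raw.strip().lower()
        role = LEXICON[symbol]
        if role in frame:
            # Two symbols compete for one role: type information is not enough.
            raise ValueError(f"two candidates for role {role!r}")
        frame[role] = symbol
    return frame


if __name__ == "__main__":
    for seq in ("go+girl+house", "girl+house+go", "house+go+girl"):
        print(seq, "->", interpret(seq))
```

All three orderings yield the same frame. The approach breaks down when two symbols can fill the same role, as in the ME / SEE / CAT / EAT example, where both participants are possible agents and order or context has to decide.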

Pragmatics


Where the situation is taking place

Who the hearer is

Good morning vs. Hi

Open the window vs. Can you please open the window?


Gestures (facial, body)


Pragmatics: Ambiguity

Do you want to eat?

I want to eat.

Contextual Resources


The context of what is being said: what
was already said before, and which
referential expressions can be used.



In a restaurant you can talk about “the
menu”


In front of a computer, “the menu” is a set
of commands.


Textual Context: Previous
Utterances

someone

a person

a woman

a mother

a female parent

Lucy

she

etc…



I met Lucy.

She looks great
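The choice between "Lucy" and "she" can be sketched with a very simple rule: use the name on first mention and the pronoun afterwards. This is only an illustrative assumption, not the referring-expression method used in the work presented here; the `Entity` class and the `{ref}` placeholder are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entity:
    name: str
    pronoun: str


def refer(entity: Entity, already_mentioned: set) -> str:
    """Use the pronoun only if the entity was mentioned in earlier utterances."""
    return entity.pronoun if entity.name in already_mentioned else entity.name


def realize(entity: Entity, templates: List[str]) -> List[str]:
    """Fill the {ref} slot of each template, tracking what was already said."""
    mentioned = set()
    sentences = []
    for template in templates:
        sentence = template.format(ref=refer(entity, mentioned))
        sentences.append(sentence[0].upper() + sentence[1:])
        mentioned.add(entity.name)
    return sentences


if __name__ == "__main__":
    lucy = Entity(name="Lucy", pronoun="she")
    print(realize(lucy, ["I met {ref}.", "{ref} looks great."]))
```

A realistic generator would also have to choose among the other options on the scale above (a woman, a mother, and so on) and handle several competing referents.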

Generating from Symbols: Issues


Syntactic ambiguity


Contextual ambiguity


No strict rules for use of symbols


Individual codes, conventions, abbreviations.

Textual: how one word affects the choice of
another, word ordering, fluency.

Practical: enhancing the communication rate without
limiting expressive ability
(efficient keyboard setup, word prediction, structure
prediction).

An Overview of the Architecture



Content planning -> Lexicalization -> Syntactic realization -> Vocalization

Task: parsing
Requirements: world knowledge, lexical information
Output: well-formed input for the syntactic realizer
Tools: Vaillant's conceptual graphs parsing, lexicon for verbs (Jing et al.), Bliss lexicon

Task: generating well-formed sentences
Tools: SURGE/HUGG

Lexicon


Mapping concepts/symbols to words

Compositional vs. non-compositional


Organization of symbols for efficient
retrieval.


(POS, semantic connections)


Available lexical knowledge


Syntactic structure, irregularities etc.

Methodology

Test interaction of different aspects


Word/symbol/structure prediction

With more specific questions:


Concepts to words


Referential expression generation


Pragmatic considerations.

Semantic Network (Vaillant)


Blissymbols Grid

Word -> Symbols
Symbols -> Symbol | Symbol near Symbols
Symbol -> FeaturedComponent | Symbol++FeaturedComponent
FeaturedComponent -> (Atomic)Position+Size+Direction
Atomic -> Pictographs | Arbitrary
Pictographs -> protection, house, circle, plus, pointer^, arrow, room, body, legs, chair, water, wheel, feeling
Arbitrary -> Articles | Numerals | Math-sign | Bliss-arbitrary
Position -> Vertical and Horizontal
Vertical -> VerticalPosition Spacing @ VerticalSigner
VerticalPosition -> right, left, centralized
VerticalSigner -> skyline, midline, earthline
Horizontal -> HorizontalPosition Spacing @ HorizontalSigner
HorizontalPosition -> above, under, centralized
HorizontalSigner -> leftline, middleline, rightline
Spacing -> zero | one | two ;; the distance between the constituents
Size -> full, half, quarter
Direction -> as-is, horizontal, vertical, left, right, upside-down | Direction-Direction
Bliss-arbitrary -> action, enclosure, multiplication, evaluation, nature, horizontal-line, vertical-line,
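One way to hold symbols described by this grammar in a program is as plain data types. The sketch below is an assumption about a possible encoding, not the representation used in the actual system: positions are reduced to their reference lines, compound directions are omitted, and all class and field names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple

# Atomic shapes as listed in the grammar. "pointer^" is written "pointer" here,
# and the Bliss-arbitrary list is cut off in the source, so it may be incomplete.
PICTOGRAPHS = {"protection", "house", "circle", "plus", "pointer", "arrow", "room",
               "body", "legs", "chair", "water", "wheel", "feeling"}
BLISS_ARBITRARY = {"action", "enclosure", "multiplication", "evaluation", "nature",
                   "horizontal-line", "vertical-line"}


class Size(Enum):
    FULL = "full"
    HALF = "half"
    QUARTER = "quarter"


class Direction(Enum):
    AS_IS = "as-is"
    HORIZONTAL = "horizontal"
    VERTICAL = "vertical"
    LEFT = "left"
    RIGHT = "right"
    UPSIDE_DOWN = "upside-down"


@dataclass(frozen=True)
class Position:
    # Simplification (assumption): only the reference lines are stored,
    # not the relative position and spacing terms of the grammar.
    vertical_signer: str    # skyline | midline | earthline
    horizontal_signer: str  # leftline | middleline | rightline


@dataclass(frozen=True)
class FeaturedComponent:
    atomic: str
    position: Position
    size: Size = Size.FULL
    direction: Direction = Direction.AS_IS

    def __post_init__(self):
        if self.atomic not in PICTOGRAPHS | BLISS_ARBITRARY:
            raise ValueError(f"unknown atomic shape: {self.atomic!r}")


@dataclass(frozen=True)
class Symbol:
    # Symbol -> FeaturedComponent | Symbol++FeaturedComponent
    components: Tuple[FeaturedComponent, ...]

    def plus(self, component: FeaturedComponent) -> "Symbol":
        """Return a new symbol with one more component superimposed."""
        return Symbol(self.components + (component,))
```

For example, `Symbol((FeaturedComponent("house", pos),)).plus(FeaturedComponent("feeling", pos, Size.HALF))` builds a two-component symbol, where `pos` is any `Position`; whether that combination is a meaningful Bliss word is not checked.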



SURGE


Work left to do:


Integration


Evaluation



Symbols-to-utterances corpus

Keystroke savings
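Keystroke savings are commonly reported as the percentage of keystrokes the user did not have to make, relative to typing every character of the final text. The slides do not state which formula is intended, so the definition below is an assumption, and the function name is hypothetical.

```python
def keystroke_savings(keystrokes_used: int, text_length: int) -> float:
    """Percentage of keystrokes saved relative to typing the text character by character."""
    if text_length <= 0:
        raise ValueError("text_length must be positive")
    return 100.0 * (1.0 - keystrokes_used / text_length)


if __name__ == "__main__":
    # Hypothetical session: a 100-character message entered with 50 selections.
    print(f"{keystroke_savings(50, 100):.1f}% saved")
```

With symbol input the baseline is less obvious than with letters, since one symbol selection may stand for a whole word or phrase, which is part of why a symbols-to-utterances corpus is needed for evaluation.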


Previous work on NLP in AAC


Word prediction


Message Generation


Text simplification

Word Prediction


Simple non-linguistic methods: possibly up to 50% savings of keystrokes.

Improvement is required: including syntactic/semantic knowledge in the
prediction process, using machine learning
methods based on corpus analysis.

Methods:

Frequency-based models (bi/tri-grams)

Grammatical and conceptual modeling to predict well-formed
utterances, such as the use of POS tags.
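A frequency-based bigram predictor can be sketched in a few lines. This is a toy illustration of the general idea, not any of the systems cited in these slides; the training sentences and function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List


def train_bigrams(sentences: Iterable[str]) -> Dict[str, Counter]:
    """Count how often each word follows each other word ('<s>' marks sentence start)."""
    model: Dict[str, Counter] = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split()
        for prev, cur in zip(tokens, tokens[1:]):
            model[prev][cur] += 1
    return model


def predict(model: Dict[str, Counter], prev_word: str, prefix: str = "", k: int = 3) -> List[str]:
    """Return up to k likely next words after prev_word that start with prefix."""
    candidates = model.get(prev_word.lower(), Counter())
    matching = [(w, c) for w, c in candidates.items() if w.startswith(prefix)]
    # Most frequent first; ties broken alphabetically for a stable order.
    matching.sort(key=lambda wc: (-wc[1], wc[0]))
    return [w for w, _ in matching[:k]]


if __name__ == "__main__":
    corpus = ["i want to eat", "i want to sleep", "i want water", "do you want to eat"]
    model = train_bigrams(corpus)
    print(predict(model, "want"))             # candidates after "want"
    print(predict(model, "to", prefix="e"))   # narrowed by the typed prefix
```

Each accepted prediction replaces the remaining keystrokes of the word. Adding POS tags or other grammatical knowledge, as proposed above, would amount to re-ranking or filtering these candidates.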


Word Prediction



KOMBE project: hand-written syntactic
rules.


Carlberger


Different languages?

Message Generation


Language generation from reduced input


Telegraphic text


[Cushler, Badman, Demasco and McCoy]



think red hammer break John =>


I think that the red hammer was broken by John.



Cogeneration [Copestake]


Construction of full sentences from templates.


PVI


Main assumption: order of word choice implies
topicalization and should be considered.






The common architecture



Iconic/telegraphic input

Semantic parser: identification of the predicator, unification of arguments

Lexical choice

Syntactic realization: closed-class word selection, linearization, morphology

Cogeneration approach


Situation-based approach.

A set of pre-defined templates:

Topic of discussion: <>


Participants: <>

Time of discussion: <> (optional)

You know <participants> talking about <topic>


Prefer:

You know we were talking at breakfast about buying a desk lamp.

Over the ambiguous:

You know we were talking about buying a desk lamp at breakfast.

Templates without cogeneration:

You know us talking about buy desk lamp breakfast.
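The difference between plain slot filling and a filled template that has been grammatically adjusted can be illustrated with the desk lamp example above. This sketch is not Copestake's cogeneration method; the lookup tables stand in for real morphological and syntactic processing, and every name in it is hypothetical.

```python
# Hypothetical lookup tables standing in for real grammatical knowledge.
SUBJECT_FORM = {"us": "we were"}  # object pronoun -> subject pronoun plus auxiliary
GERUND = {"buy": "buying"}        # base verb -> gerund


def raw_fill(participants: str, topic: str, time: str = "") -> str:
    """Concatenate the slot fillers as entered, with no grammatical adjustment."""
    parts = ["You know", participants, "talking about", topic, time]
    return " ".join(p for p in parts if p) + "."


def adjusted_fill(participants: str, topic_verb: str, topic_object: str, time: str = "") -> str:
    """Inflect the fillers and attach the time phrase to 'talking' to avoid ambiguity."""
    subject = SUBJECT_FORM.get(participants, participants)
    gerund = GERUND.get(topic_verb, topic_verb + "ing")
    time_phrase = f"at {time} " if time else ""
    return f"You know {subject} talking {time_phrase}about {gerund} a {topic_object}."


if __name__ == "__main__":
    print(raw_fill("us", "buy desk lamp", "breakfast"))
    print(adjusted_fill("us", "buy", "desk lamp", "breakfast"))
```

Placing the time phrase directly after "talking" is what produces the preferred reading: the talking happened at breakfast, not the buying.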

PVI


Paradigmatic dimension: icons are organized in
taxemes, which are further grouped into semantic domains.

Syntagmatic dimension: build a case structure of
predicative concepts.

Meaning of an icon: the features that distinguish it
from the other icons.

Semantic analysis: reconstructing the meaning of
the icon sequence by building a semantic network.

Lexical choice: assuming there is no one-to-one (bijective)
mapping between icons and words.


Generation

Message Selection Systems


Discourse structure


Talk:About (Univ. of Dundee)

The user selects from pre-stored sentences.


The sentences are indexed using rhetorical
structure assumptions.



Language Simplification and
Language Understanding


PSET project


[Carroll et al.]


Intended for aphasic readers with lexical
or syntactic impairments.

Syntactic simplification:

Passive to active

Lexical simplification: look up
synonyms and use the most frequent one.
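The lexical simplification step, looking up synonyms and keeping the most frequent one, can be sketched as follows. The synonym table and the frequency counts are invented for illustration and do not come from the PSET project.

```python
# Hypothetical resources: a tiny synonym table and made-up corpus frequencies.
SYNONYMS = {
    "purchase": ["buy", "acquire"],
    "residence": ["home", "dwelling"],
}
FREQUENCY = {
    "buy": 900, "acquire": 120, "purchase": 200,
    "home": 800, "dwelling": 30, "residence": 90,
}


def simplify_word(word: str) -> str:
    """Return the most frequent word among the word itself and its synonyms."""
    candidates = [word] + SYNONYMS.get(word, [])
    # max() keeps the first maximum, so the original word wins ties.
    return max(candidates, key=lambda w: FREQUENCY.get(w, 0))


def simplify(text: str) -> str:
    """Apply word-level simplification to a whitespace-tokenized text."""
    return " ".join(simplify_word(w) for w in text.split())


if __name__ == "__main__":
    print(simplify("they purchase a residence"))
```

A real system would also need word sense disambiguation and morphological handling, since a synonym that is frequent overall may not fit the sense or inflection used in the sentence.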


To sum up


NLP can be naturally and effectively
integrated into AAC systems.


Relaxations: user feedback is available on
the spot.


Data collection IS an issue here.


The aim: more flexible, expressive
tools with an enhanced communication rate.


Possibly, combined approaches.