McCoy - University of Delaware

23 Feb. 2014

Kathy McCoy

Artificial Intelligence

Natural Language Processing

Applications for People with Disabilities

Primary Research Areas


Natural Language Generation: problem of choice.

Deep Generation: structure and content of coherent text

Surface Generation: particularly using TAG (multilingual generation and machine translation)

Discourse Processing

Second Language Acquisition

Applications for people with disabilities affecting their ability to communicate

Projects


Augmentative Communication: Center for Applied Science and Engineering in Rehabilitation (ASEL)

Word Prediction and Contextual Information (Keith Trnka, (Jay McCaw), Chris Pennington, Debbie Yarrington)

ICICLE: CALL system for teaching English as a second language to ASL natives (Rashida Davis, Charlie Greenbacker)

Text Skimming: for someone who is blind to skim a document to find an answer to a question (Debbie Yarrington)

Generating Textual Summaries of Graphs (Sandee Carberry, Seniz Demir)

Developing Intelligent Communication Aids for People with Disabilities

Kathleen F. McCoy

Computer and Information Sciences & Center for Applied Science and Engineering in Rehabilitation
University of Delaware

Augmentative Communication


An intervention that gives a non-speaking person an
alternative means to communicate


User Population


May have severe motor impairments


Unable to speak


Unable to write


Cannot use sign language


May have cognitive impairments and/or developmental
disabilities


May be too young to have developed literacy skills


Row-Column Scanning

Row-Column Scanning II

Language Representation: Words

Still Need to Spell!

Predicting Fringe Vocabulary


Word Prediction of Spelled Words (infrequent
context-specific words)

Methods


Statistical NLP Methods


Learning from the context of the individual


Other Contextual Clues


Geographic Location, Time of Day, Conversational
Partner, Topic of Conversation, Style of the
Document

Prediction Example

Trigram Model: $P(w \mid h) = P(w \mid w_{-2}\, w_{-1})$
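A trigram predictor of this kind can be sketched in a few lines. The sketch below is only illustrative: it assumes unsmoothed maximum-likelihood counts, a toy three-sentence corpus, and hypothetical helper names (`train_trigrams`, `predict`); the slides do not describe how the actual system estimates or smooths its probabilities.

```python
from collections import Counter, defaultdict

def train_trigrams(sentences):
    """Count how often each word follows each two-word history."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # Pad with start markers so the first words also have a history.
        words = ["<s>", "<s>"] + sentence.lower().split()
        for w2, w1, w in zip(words, words[1:], words[2:]):
            counts[(w2, w1)][w] += 1
    return counts

def predict(counts, w2, w1, prefix="", n=5):
    """Rank candidate words by the estimate of P(w | w2 w1).

    Only words starting with the letters typed so far (prefix) are kept.
    """
    followers = counts.get((w2, w1), Counter())
    total = sum(followers.values())
    ranked = [(w, c / total) for w, c in followers.most_common()
              if w.startswith(prefix)]
    return ranked[:n]

# Toy corpus, invented for illustration only.
corpus = [
    "i want to go home",
    "i want to eat lunch",
    "i want to go outside",
]
model = train_trigrams(corpus)
print(predict(model, "want", "to"))              # candidates after "want to"
print(predict(model, "want", "to", prefix="e"))  # narrowed after typing "e"
```

An unseen history returns an empty list here; a practical predictor would back off to shorter histories instead.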

Can we do better??


Intuitively, not all words occur with
equal likelihood during a conversation.


The topic of the conversation affects the words
that will occur.


E.g., when talking about baseball: ball, bases, pitcher,
bat, triple, …


How often do these same words occur in your
algorithms class?

Topic Modeling


Goal: Automatically identify the topic of the
conversation and increase the probability of
related words and decrease probability of
unrelated words.


Questions


Topic Representation


Topic Identification


Topic Application


Topic Language Model Use

Topic Modeling Approach

Topic Identification


Topic Identification

Topic Application


How do we use those similarity scores?


Essentially weight the contribution of each topic
by the amount of similarity that topic has with
the current conversation.
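One way to read this weighting is as a mixture of per-topic language models. The sketch below assumes cosine similarity over word counts as the similarity score and unigram topic models; both are assumptions made for illustration, as are the toy counts and the function names. The slides do not state which similarity measure or model order the system uses.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def topic_weights(conversation_words, topic_counts):
    """Score each topic against the conversation so far; normalise to sum to 1."""
    convo = Counter(conversation_words)
    sims = {t: cosine(convo, counts) for t, counts in topic_counts.items()}
    total = sum(sims.values())
    if total == 0:
        # No evidence for any topic yet: fall back to equal weights.
        return {t: 1 / len(sims) for t in sims}
    return {t: s / total for t, s in sims.items()}

def mixed_probability(word, weights, topic_counts):
    """P(word) as a similarity-weighted sum of per-topic unigram estimates."""
    p = 0.0
    for topic, weight in weights.items():
        counts = topic_counts[topic]
        p += weight * counts[word] / sum(counts.values())
    return p

# Toy topic counts, invented for illustration only.
topics = {
    "baseball": Counter({"ball": 4, "pitcher": 3, "bat": 2, "the": 6}),
    "algorithms": Counter({"sort": 4, "graph": 3, "the": 6, "ball": 1}),
}
weights = topic_weights(["the", "pitcher", "threw", "the", "ball"], topics)
print(weights)
print(mixed_probability("bat", weights, topics))
```

With this conversation the baseball topic receives the larger weight, so "bat" becomes more probable than it would be under equal weighting.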

Results Using Topics

Current Work


What happens with significantly larger corpora?


What other kinds of tuning to the user can we
do:


Recency


Style



Does keystroke savings translate into
communication rate enhancement?
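For reference, keystroke savings is usually reported as the percentage of keystrokes avoided relative to typing every character. The slides do not define it, so the function below is a sketch of that common definition, with hypothetical parameter names and made-up numbers.

```python
def keystroke_savings(chars_without_prediction, keystrokes_with_prediction):
    """Percentage of keystrokes saved compared with typing every character."""
    if chars_without_prediction <= 0:
        raise ValueError("chars_without_prediction must be positive")
    saved = chars_without_prediction - keystrokes_with_prediction
    return 100.0 * saved / chars_without_prediction

# Made-up example: a 200-character message entered with 110 keystrokes
# (letters typed plus prediction selections).
print(keystroke_savings(200, 110))
```

The metric counts key presses only; it says nothing about the time spent scanning the prediction list, which is why the communication-rate question remains open.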

Text Skimming

Debra Yarrington, Kathleen McCoy

Problem:


Blind and dyslexic individuals cannot skim text


Example: “What’s the syntax for calling a function with
template parameters?” (skimming through code)


“Why was Ayers Rock renamed?”


“What type of tree produces leaves with three distinct
shapes?”


“Where can I find more information about Portugal?”


People who cannot read text rely on


screen readers (Jaws, Window-Eyes)


braille output



more difficult to come by


extremely bulky to carry around

Example of Jaws Output at 400 wpm



“What psychological and philosophical significance should we attach to recent efforts at
computer simulations of human cognitive capacities? In answering this question, I find it useful to
distinguish what I will call "strong" AI from "weak" or "cautious" AI (Artificial Intelligence).
According to weak AI, the principal value of the computer in the study of the mind is that it gives
us a very powerful tool. For example, it enables us to formulate and test hypotheses in a more
rigorous and precise fashion. But according to strong AI, the computer is not merely a tool in the
study of the mind; rather, the appropriately programmed computer really is a mind, in the sense
that computers given the right programs can be literally said to understand and have other
cognitive states. In strong AI, because the programmed computer has cognitive states, the
programs are not mere tools that enable us to test psychological explanations; rather, the
programs are themselves the explanations.



I have no objection to the claims of weak AI, at least as far as this article is concerned. My
discussion here will be directed at the claims I have defined as those of strong AI, specifically the
claim that the appropriately programmed computer literally has cognitive states and that the
programs thereby explain human cognition. When I hereafter refer to AI, I have in mind the
strong version, as expressed by these two claims.



I will consider the work of Roger Schank and his colleagues at Yale (Schank & Abelson
1977), because I am more familiar with it than I am with any other similar claims, and because it
provides a very clear example of the sort of work I wish to examine. But nothing that follows
depends upon the details of Schank's programs. The same arguments would apply to Winograd's
SHRDLU (Winograd 1973), Weizenbaum's ELIZA (Weizenbaum 1965), and indeed any Turing
machine simulation of human mental phenomena.”

Proposed Solution:


A system that takes a question and a document
or a few documents, and returns a small set of
text links where potential answers to the
question might be found


In order to accomplish this, we will potentially
use:


Techniques used in existing Question Answering
systems


Data collected from skimming text with an eye
tracking device
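As a rough baseline for the input and output described above, the sketch below ranks passages by how many question words they contain and returns the positions of the best few. It is a deliberately simple stand-in, not the proposed system: the stopword list, the function names, and the sample passages are all invented for illustration, and it uses neither Question Answering techniques nor eye-tracking data.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not from the slides).
STOPWORDS = {"a", "about", "can", "for", "i", "in", "is", "its", "of",
             "the", "to", "was", "what", "where", "why", "with"}

def tokens(text):
    """Lowercase words with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower())
            if w not in STOPWORDS]

def rank_passages(question, passages, top_n=3):
    """Return indices of the passages sharing the most words with the question."""
    question_words = set(tokens(question))
    scored = []
    for index, passage in enumerate(passages):
        counts = Counter(tokens(passage))
        overlap = sum(c for word, c in counts.items() if word in question_words)
        if overlap:
            scored.append((overlap, index))
    # Highest overlap first; earlier passages win ties.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for _, index in scored[:top_n]]

# Placeholder passages written for this example only.
passages = [
    "The rock rises above the surrounding plain.",
    "Visitors can walk around the base of the rock.",
    "The rock was renamed to restore its traditional name.",
]
print(rank_passages("Why was Ayers Rock renamed?", passages, top_n=2))
```

The returned indices play the role of the "text links": places in the document where a screen reader could start reading.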

Example

Gaze Plot

Hot Spots

[Figure: gaze plot and hot spots recorded while a reader skimmed a passage on Romanesque and Gothic church architecture and sculpture]


Current Directions


Have collected eye-tracking data from close to 100
people (on several documents each)


Analysis quite interesting: enough data to find
patterns in where the skimmers are looking.


Analyzing data with “text tiling methods” to pick out
places in the text where the “same thing” is being
discussed.



Incorporate question extraction techniques


How to present this to the user?
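The "text tiling" idea mentioned above can be illustrated with a much-simplified segmenter: compare the vocabulary on either side of each sentence gap and place a boundary where the similarity dips. This is a sketch in the spirit of TextTiling, not the full algorithm and not necessarily what the project uses; the window size, the below-average local-minimum rule, and the sample sentences are assumptions.

```python
import math
import re
from collections import Counter

def bag(sentences):
    """Word counts for a block of sentences."""
    return Counter(w for s in sentences for w in re.findall(r"[a-z]+", s.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def gap_scores(sentences, window=2):
    """Similarity of the blocks on either side of each sentence gap."""
    scores = []
    for gap in range(1, len(sentences)):
        left = bag(sentences[max(0, gap - window):gap])
        right = bag(sentences[gap:gap + window])
        scores.append(cosine(left, right))
    return scores

def boundaries(sentences, window=2):
    """Indices of sentences that start a new segment.

    A gap is a boundary when its score is a local minimum and below average.
    """
    scores = gap_scores(sentences, window)
    if not scores:
        return []
    mean = sum(scores) / len(scores)
    result = []
    for i, score in enumerate(scores):
        before = scores[i - 1] if i > 0 else math.inf
        after = scores[i + 1] if i + 1 < len(scores) else math.inf
        if score < mean and score <= before and score <= after:
            result.append(i + 1)
    return result

# Sample sentences written for this example only.
sentences = [
    "Romanesque churches have round arches",
    "Round arches and thick walls mark Romanesque churches",
    "Gothic cathedrals use pointed windows",
    "Pointed windows fill Gothic cathedrals with light",
]
print(boundaries(sentences))
```

Each segment found this way is a stretch of text about the "same thing", which could then be matched against the question or the eye-tracking patterns.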

Modeling the Acquisition of English in the ICICLE System

Kathleen F. McCoy

Department of Computer and Information Sciences

University of Delaware

People


Current People


Rashida Davis


Charlie Greenbacker



Others


Chris Pennington, Dan Blanchard, Mike Bloodgood, Greg
Silber, Meghan Boyle, Mohamed Mostagir, Stephanie Baker,
Heejong Yi, David Derman


Graduates: Matthew Huenerfauth, Jill Janofsky, Lisa
Masterman Michaud, Litza Stark, David Schneider

The ICICLE Project

Interactive Computer Identification and Correction of Language Errors


Interactive writing tutor for native signers of
American Sign Language (ASL)


Purpose: analyze student-written English texts
and provide individualized feedback and
instruction on grammar

The ICICLE Project


system provides student with tutorial
instruction on the errors


student has opportunity to make
corrections and request re-analysis

The ICICLE System


student provides piece of text


system analyzes text for grammatical
errors


Cycle of user input, system response

Current Implementation

the student
enters text here

the system shows
which sentences
have errors

explanations
shown here

Writing From Deaf Students


Literacy is a serious issue for the Deaf population.


Lots of variation in level of acquisition.


Marked Differences from writing of hearing peers.


Dropped be: She really pretty.


Missing Possessives: She age is 13.


Subject/verb agreement, plural markers, determiners: She
really like go with friend to mall.

Work on ICICLE


Previous work focused on developing grammar
and mal-rules and modeling the user’s level of
acquisition (so different analyses can be found
depending on it)

Current Work


Tutorial Responses


Probabilistic Parsing


need help!


NEED SYSTEM HELP!!!!!

What Mal-Rules do We Use?


Beginner: Over-application of auxiliary IS,
missing simple present morphology:
She teaches piano on Tuesdays.


Intermediate: Botched progressive tense:
She is teaching piano on Tuesdays.


Advanced: Botched passive voice:
She is taught piano on Tuesdays.


“She is teach piano on Tuesdays.”
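The three readings of “She is teach piano on Tuesdays.” can be pictured as candidate analyses, with the user's level of acquisition deciding which one to prefer. The lookup below is a hypothetical illustration of that idea only; the class, the level labels as keys, and the selection rule are assumptions, and ICICLE's actual user model is not described in these slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Analysis:
    level: str      # learner level at which this error is most plausible
    mal_rule: str   # description of the mal-rule that explains the error
    intended: str   # sentence the student most likely meant

# Candidate analyses of "She is teach piano on Tuesdays." taken from the slide.
CANDIDATES = [
    Analysis("beginner",
             "over-application of auxiliary IS, missing simple present morphology",
             "She teaches piano on Tuesdays."),
    Analysis("intermediate",
             "botched progressive tense",
             "She is teaching piano on Tuesdays."),
    Analysis("advanced",
             "botched passive voice",
             "She is taught piano on Tuesdays."),
]

def preferred_analysis(user_level, candidates=CANDIDATES):
    """Pick the analysis that matches the user's modeled level of acquisition."""
    for analysis in candidates:
        if analysis.level == user_level:
            return analysis
    raise ValueError(f"no analysis for level {user_level!r}")

print(preferred_analysis("intermediate").intended)
```

The same input sentence therefore leads to different tutorial feedback for different students.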

