Automated Identification of Preposition Errors

Joel Tetreault

Educational Testing Service

ECOLT

October 29, 2010


Outline


Computational Linguistics (CL) and Natural
Language Processing (NLP)


NLP at ETS (automated scoring)


Automated Preposition Error Detection






Linguistics

Homer
talked to
Marge

D’oh!


Computational Linguistics

Homer
talked to
Marge

D’oh!


Want computers to
understand language

Computational Linguistics

Homer
talked to
Marge

D’oh!


9omer2
ta3lked
4 6arge1

Computational Linguistics vs. NLP


Computational Linguistics (CL):


Computers understanding language


Modeling how people communicate


Natural Language Processing (NLP):


Applications on the computer side


Natural: refers to languages spoken by people (English, Swahili) vs. artificial languages (C++)


Take CL theories and implement them into tools


CL and NLP often conflated

Computational Linguistics Space



Computer Science: learning algorithms



Linguistics: formal grammars



Psychology: human processing modeling





CL

Computational Linguistics Space





CL

Artificial Intelligence

Intelligent Machines



Perfect speech recognition



Perfect language understanding



Perfect speech synthesis



Perfect discourse modeling



Intention Recognition



World Knowledge



(Vision)


Real World Applications of NLP


Spelling and Grammar correction/detection


MS Word, e-rater


Machine Translation


Google and Bing Translate


Opinion Mining


Extract sentiment of demographic from blogs and
social media


Speech Recognition and Synthesis


Automatic Document Summarization

NLP at ETS: Motivation


Millions of GRE and TOEFL tests taken each
year


Tests move to more natural assessment


Fewer multiple choice questions


Tests have essay component


Problem:


Thousands of raters required


Costly and time-consuming

NLP at ETS


Use NLP techniques to automatically score essays (e-rater)


Other scoring tools which use NLP:


Criterion: online writing feedback


SpeechRater: automatic speaking assessment


C-Rater: content scoring of short answers


Plagiarism Detection

E-rater (Automated Essay Scoring)


First deployed in 1999 for GMAT Writing
Assessment


Operational for the GRE and TOEFL as well as
a collection of smaller assessments


System Performance (5 point essay scale):


E-rater/Human agreement: 75% exact, 98% exact or adjacent (within 1 point)


Comparable to two humans
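
Exact and adjacent agreement can be computed as in the minimal sketch below. The score lists are invented for illustration and are not ETS data, and `agreement` is a hypothetical helper name.

```python
def agreement(human, machine, tolerance=0):
    """Fraction of essays where the two scores differ by at most `tolerance`."""
    if len(human) != len(machine) or not human:
        raise ValueError("score lists must be non-empty and the same length")
    matches = sum(1 for h, m in zip(human, machine) if abs(h - m) <= tolerance)
    return matches / len(human)

# Hypothetical scores on a 5-point scale (illustration only, not real data).
human_scores = [3, 4, 2, 5, 3, 1, 4, 4]
erater_scores = [3, 4, 3, 5, 3, 1, 4, 2]

exact = agreement(human_scores, erater_scores)                   # 6 of 8 match exactly
adjacent = agreement(human_scores, erater_scores, tolerance=1)   # 7 of 8 within 1 point
print(f"exact: {exact:.3f}, exact or adjacent: {adjacent:.3f}")
```

The same function applied to two human raters gives the human/human baseline that the system is compared against.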



E-rater (Automated Essay Scoring)


Massive collection of 50+ weighted features
organized into 5+ high level features


Each feature is represented by a module:


Simple: collection of manual rules and/or regular
expressions


More complex: NLP (Natural Language Processing)
statistical system is behind the feature


Combined using linear regression
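
A minimal sketch of combining feature scores with a linear model. The feature names, weights, and intercept are hypothetical placeholders, not e-rater's actual values; in practice the weights would be fitted by linear regression against human-scored essays.

```python
def essay_score(feature_values, weights, intercept):
    """Linear combination: intercept + sum of (weight * feature value)."""
    return intercept + sum(weights[name] * value
                           for name, value in feature_values.items())

# Hypothetical weights, as if already fitted on human-scored essays.
weights = {"grammar": 1.5, "usage": 1.0, "mechanics": 0.5}
intercept = 1.0

# Hypothetical feature values for one essay (each module outputs one number).
features = {"grammar": 0.8, "usage": 0.6, "mechanics": 1.0}

score = essay_score(features, weights, intercept)
print(round(score, 2))
```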


E-rater Features

Grammar:

Sentence fragments, garbled words

Subject-Verb Agreement: “the motel are”

Verb form: “They are need to distinguish”

Pronoun Errors: “Them are my reasons …”

Usage:

Incorrect article/preposition

Confused Word: affect vs. effect

Faulty Comparison: “It is more big”

Double negatives: “He don’t have no candy.”

E-rater Features

Mechanics:

Spelling

Punctuation

Capitalization

Missing hyphens, apostrophes

Style:

Sentence length, word repetition

Passives

Discourse:

Discourse sequences

RST & Syntactic structures (contrast, elaboration, antithesis, etc.)

How to Game the System



Word Salad Detector





Unusually Short / Off-Topic Essays

“Quick The the over brown dogs fox. Jumped. Lazy”

Skfhdorla;sf [ e’skas as,fr’r ;/., fkrasa


“I don’t know how to explain this
question because I took a nap. Sorry.”

“I THINK EVERYONE SHOULD BE ABLE
TO WEAR WHATEVER THE HELL THEY
WANT TO WEAR.”

NLP for English Language Learners


Increasing need for tools for instruction in
English as a Second Language (ESL)


300 million ESL learners in China alone


10% of US students learn English as a second
language


Teachers now burdened with teaching classes
with wildly varying levels of English fluency


Assessments for EFL Teacher Proficiency

NLP for English Language Learners


Other Interest:


Microsoft Research (ESL Assistant)


Publishing/Assessment Companies (Cambridge, Oxford,
Pearson)


Universities


Objective


Research Goal: develop NLP tools to
automatically provide feedback to ESL
learners about grammatical errors



Preposition Error Detection


Selection Error (“They arrived to the town.”)

Extraneous Use (“They came to outside.”)


Omitted (“He is fond this book.”)

Motivation


Preposition usage is one of the most difficult aspects of English for non-native speakers


[Dalgish ’85]


18% of sentences from ESL essays
contain a preposition error


Our data: 8-10% of all prepositions in TOEFL essays are used incorrectly

Why are prepositions hard to master?


Prepositions are problematic because they
can perform so many complex roles


Preposition choice in an adjunct is constrained by its object (“on Friday”, “at noon”)

Prepositions are used to mark the arguments of a predicate (“fond of beer.”)

Phrasal Verbs (“give in to their demands.”)

“give in” = “acquiesce, surrender”

Why are prepositions hard to master?


Multiple prepositions can appear in the same context:

“When the plant is horizontal, the force of the gravity causes the sap to move __ the underside of the stem.”

Choices   Source
to        Writer
on        System
toward    Rater 1
onto      Rater 2

Preposition Error Detection


In NLP: computer system learns from lots and
lots of data


Training Phase: Create a “model” of the problem area


Face detection


Credit Card Usage


Translating from Chinese to English


Testing Phase: Use model to classify new cases

Baseball Feature Example


Predict the outcome of the baseball game


Look at all the games where both teams
played each other:


For each game (event), use features:


Win/loss records before game


Home field advantage


Players’ prior performance


Train learning algorithm


Baseball Feature Example

Event    Winner         Location       Prior Isotopes Win Streak   Prior Capital City Win Streak
Game 1   Isotopes       Springfield    0                           3
Game 2   Capital City   Springfield    4                           0
Game 3   Capital City   Capital City   2                           0
Game 4   Isotopes       Springfield    2                           1
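
The four games can be fed to a learning algorithm as in the sketch below. A simple perceptron stands in for "the learning algorithm" here; that choice, the numeric feature encoding, and the extra test game are assumptions made for illustration. Four events are far too few to learn anything meaningful, so this only shows the mechanics of the training and testing phases.

```python
# Each game is one training event: (features, label).
# Features: [played in Springfield (1/0), prior Isotopes streak, prior Capital City streak]
# Label: 1 if the Isotopes won, 0 if Capital City won.
games = [
    ([1, 0, 3], 1),  # Game 1
    ([1, 4, 0], 0),  # Game 2
    ([0, 2, 0], 0),  # Game 3
    ([1, 2, 1], 1),  # Game 4
]

def predict(weights, bias, features):
    """Classify one event: 1 if the weighted sum is positive, else 0."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0

def train_perceptron(events, max_epochs=100):
    """Training phase: adjust weights until every event is classified correctly."""
    weights = [0] * len(events[0][0])
    bias = 0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in events:
            error = label - predict(weights, bias, features)
            if error != 0:
                mistakes += 1
                weights = [w + error * x for w, x in zip(weights, features)]
                bias += error
        if mistakes == 0:
            break
    return weights, bias

weights, bias = train_perceptron(games)

# Testing phase: classify a new, hypothetical game played in Springfield.
new_game = [1, 1, 2]
print("Isotopes win" if predict(weights, bias, new_game) else "Capital City wins")
```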

Building a Model of Preposition Usage


Prepositions are influenced by:


Words in the local context, and how they interact
with each other (lexical)


Syntactic structure of context


Semantic interpretation


Get computer to understand correct usage:


Encode these influences as “features”


Train computer algorithm on millions of examples
of correct usage with the associated features



Deriving the Features


Derived using NLP tools


Tokenizing


“He is fond of beer . ”


Part-of-Speech Tagging

He_PRP is_BE fond_VB of_PREP beer_NN ._.

Chunking / Parsing

{NP He_PRP} {VP is_BE fond_VB} of_PREP {NP beer_NN} ._.
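
A toy sketch of the first two steps, tokenizing and tagging. The regular expression and the hand-written lexicon are assumptions for illustration; the lexicon simply reuses the tags shown in the example above, whereas a real tagger is a trained statistical model. Chunking is omitted.

```python
import re

def tokenize(sentence):
    """Split into word tokens and single-character punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)

# Toy lexicon covering only the example sentence (hypothetical, hand-written).
TOY_LEXICON = {
    "He": "PRP", "is": "BE", "fond": "VB",
    "of": "PREP", "beer": "NN", ".": ".",
}

def tag(tokens):
    """Look up each token's tag; unknown words get the placeholder tag UNK."""
    return [(token, TOY_LEXICON.get(token, "UNK")) for token in tokens]

tokens = tokenize("He is fond of beer.")
tagged = tag(tokens)

print(" ".join(tokens))
print(" ".join(f"{word}_{pos}" for word, pos in tagged))
```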




Feature Overview


System uses a minimum of 25 features


Lexical, syntactic, semantic sources


Head words before and after preposition


Words in the local context (+/- 2 words)


Part of Speech (POS) of words above


Combination Features


Parse Features



Preposition Feature Example

Event    Prep   Prior Verb   Prior Noun   Following Word   POS of Following Word
Prep 1   of     fond         <none>       beer             NN
Prep 2   at     arrive       <none>       the              Det
Prep 3   with   <none>       car          the              Det

1. He is fond of beer.

2. The train will arrive at the Springfield Station.

3. The car with the broken wheel is in the shop.
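
A minimal sketch of deriving these feature rows from tagged sentences. The backward scan for the nearest verb or noun is a simplification assumed here (the actual system uses chunker and parser output), and the hand-assigned tags, including MD and JJ, are illustrative.

```python
VERB_TAGS = {"VB"}
NOUN_TAGS = {"NN"}

def extract_features(tagged, prep_index):
    """Build one event for the preposition at `prep_index` in a list of (word, tag) pairs."""
    prep = tagged[prep_index][0]
    prior_verb = prior_noun = "<none>"
    # Scan left for the nearest verb or noun; whichever comes first fills its slot.
    for word, pos in reversed(tagged[:prep_index]):
        if pos in VERB_TAGS:
            prior_verb = word
            break
        if pos in NOUN_TAGS:
            prior_noun = word
            break
    if prep_index + 1 < len(tagged):
        following_word, following_pos = tagged[prep_index + 1]
    else:
        following_word, following_pos = "<none>", "<none>"
    return {
        "prep": prep,
        "prior_verb": prior_verb,
        "prior_noun": prior_noun,
        "following_word": following_word,
        "following_pos": following_pos,
    }

# Hand-tagged versions of the three example sentences (tags are illustrative).
sentence_1 = [("He", "PRP"), ("is", "BE"), ("fond", "VB"), ("of", "PREP"),
              ("beer", "NN"), (".", ".")]
sentence_2 = [("The", "Det"), ("train", "NN"), ("will", "MD"), ("arrive", "VB"),
              ("at", "PREP"), ("the", "Det"), ("Springfield", "NN"),
              ("Station", "NN"), (".", ".")]
sentence_3 = [("The", "Det"), ("car", "NN"), ("with", "PREP"), ("the", "Det"),
              ("broken", "JJ"), ("wheel", "NN"), ("is", "BE"), ("in", "PREP"),
              ("the", "Det"), ("shop", "NN"), (".", ".")]

events = [
    extract_features(sentence_1, 3),  # "of"
    extract_features(sentence_2, 4),  # "at"
    extract_features(sentence_3, 2),  # "with"
]
for event in events:
    print(event)
```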



Flagging Errors


Train learning algorithm on millions of events


develop model (classifier)


Testing (flagging errors)


Derive features


Replace writer’s preposition with all other
prepositions, classifier outputs score for each
preposition


Compare top scoring preposition to score of
writer’s preposition


Thresholds

“He is fond with beer”

FLAG AS ERROR

Thresholds

“My sister usually gets home by 3:00”

FLAG AS OK
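
The thresholding step can be sketched as follows. The classifier scores and the threshold value are invented for illustration; `flag_preposition` is a hypothetical helper, and the real system's scores come from the trained model.

```python
def flag_preposition(writer_prep, scores, threshold=0.5):
    """Return (is_error, suggestion) given classifier scores per candidate preposition."""
    best_prep = max(scores, key=scores.get)
    # A preposition the classifier never scored is treated as score 0.
    gap = scores[best_prep] - scores.get(writer_prep, 0.0)
    # Flag only when another preposition beats the writer's choice by a wide margin.
    if best_prep != writer_prep and gap > threshold:
        return True, best_prep
    return False, None

# Hypothetical scores for "He is fond ___ beer": "of" dominates.
fond_scores = {"of": 0.85, "with": 0.05, "in": 0.04, "at": 0.03, "by": 0.03}

# Hypothetical scores for "My sister usually gets home ___ 3:00": several are plausible.
home_scores = {"at": 0.40, "by": 0.35, "around": 0.15, "in": 0.10}

print(flag_preposition("with", fond_scores))  # large gap: flag as error
print(flag_preposition("by", home_scores))    # small gap: flag as OK
```

Raising the threshold makes the system flag fewer cases, trading recall for precision.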

Performance


Evaluation corpus of 5600 TOEFL essays (8200
prepositions)


Each preposition manually annotated


Recall = 0.19 ; Precision = 0.84


1/5 of errors are flagged


84% of flagged errors are indeed errors


Precision is favored over recall to reduce false positives


State of the Art performance
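
For reference, a sketch of how precision and recall are computed from error counts. The counts below are hypothetical, chosen only so the results round to the reported figures; they are not the actual counts from the evaluation.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Precision: share of flags that are real errors. Recall: share of real errors flagged."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall

# Hypothetical counts (illustration only):
# 84 correctly flagged errors, 16 false alarms, 358 missed errors.
precision, recall = precision_recall(
    true_positives=84, false_positives=16, false_negatives=358
)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```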

Conclusions


Presented an overview of:


NLP


NLP at ETS


One feature (Prepositions) in e-rater


Future Directions


Use of large scale corpora (WWW)


L1-specific models


Train on error-annotated data

Plugs


ETS/NLP Publications:


http://ets.org/research/erater.html


5th Workshop on Innovative Use of NLP for Educational Applications (NAACL-10)


http://www.cs.rochester.edu/u/tetreaul/naacl-bea5.html


Plugs


“Automated Grammatical Error Detection for
Language Learners”


Leacock et al., 2010


Synthesis Series


Thanks!






Joel Tetreault: JTetreault@ets.org