GENERALIZING SEMANTIC RELATIONS

Nov 25, 2013

December 7

Research group meeting

祭都援炉 (マットエンロ)

Up until now: Getting to know NLP

“Speech and Language Processing” (Jurafsky & Martin)

Papers:

On-Demand Information Extraction (Sekine)
Learning First-Order Horn Clauses From Web Text [Sherlock] (Schoenmackers 2010)
Coupled Semi-Supervised Learning for Information Extraction [NELL] (Carlson)
Identifying Relations for Open Information Extraction [ReVerb] (Fader)
Relation Acquisition using Word Classes and Partial Patterns (De Saeger)
Interpretation as Abduction (Hobbs)
An ILP Formulation of Abductive Inference for Discourse Interpretation (Inoue)
Learning Dependency-Based Compositional Semantics (Liang)

Motivation

Ultimate goal: inference
Inference requires knowledge
Large-scale databases of semantic relations have been created from web text
  ReVerb (Fader et al., 2011)
  relation(arg1, arg2) tuples acquired from large-scale Web data
  Over 14.5 million semantic relations released to the public

Problem

Many different ways to express equivalent meaning
Consider the resides relation
Table shows counts of ReVerb relations containing live or reside
Relations in red should be generalized to:
  reside(<PERSON>,<PLACE>)
We aim to generalize through semantic clustering

Frequency | Relation
27,383 | lives in
10,315 | live in
8,653 | lived in
5,185 | currently resides in
4,002 | currently lives in
3,310 | now lives in
1,933 | resides in
1,548 | is a resident of
1,468 | live on
1,308 | now resides in
1,191 | has lived in
1,055 | resided in
876 | lives on
590 | lived on
531 | live at
515 | still lives in
461 | can live up to
456 | is a lifelong resident of
444 | was a resident of
413 | live for
382 | must be residents of
332 | lives with
332 | lived for

Semantic Clustering Goals

1. Semantic Relations Dictionary
   Mapping from ReVerb’s specific instances to a generalized semantic placeholder that looks like
   <Generalized-Rel>(<Arg1 Type>,<Arg2 Type>)
2. Method of mapping real-world relation instances to generalized semantic form
   Can be accomplished with a semantic similarity function
   Clustering and generalizing relations
   Looking up new relations from text
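
One possible shape for the semantic relations dictionary, as a minimal sketch. The class name GeneralizedRelation, the relation_dictionary variable, and the add_entry helper are hypothetical names chosen for illustration; the two example entries are the generalized forms that appear on the lookup slides of this deck.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneralizedRelation:
    """A generalized entry: <Generalized-Rel>(<Arg1 Type>,<Arg2 Type>)."""
    name: str
    arg1_type: str
    arg2_type: str

    def __str__(self):
        return f"{self.name}(<{self.arg1_type}>,<{self.arg2_type}>)"


# Surface relation phrase -> candidate generalized relations (hypothetical layout).
relation_dictionary = defaultdict(list)


def add_entry(surface_phrase, generalized):
    """Register a generalized relation for a surface phrase, skipping duplicates."""
    if generalized not in relation_dictionary[surface_phrase]:
        relation_dictionary[surface_phrase].append(generalized)


resides = GeneralizedRelation("resides", "Living Thing", "Place")
nourished_by = GeneralizedRelation("nourished_by", "Living Thing", "Nourishment")

for phrase in ("lives in", "resides in", "lives on"):
    add_entry(phrase, resides)
add_entry("lives on", nourished_by)

for entry in relation_dictionary["lives on"]:
    print(entry)
```

Keeping a list of candidates per phrase leaves room for phrases with more than one sense, which a similarity function then has to choose between at lookup time.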

Semantic Similarity

Ontological: Similarity based on arguments’ hierarchy of semantic types
Lexical: Similarity based on lexical features of relation
Contextual: Similarity based on surrounding text

Ontological Similarity (Clustering)

1. Matt resides in Sendai
2. Eric lives in Japan

Should these be clustered together? (Yes!)
  Matching arg1 type <Person>
  Matching arg2 type <Place>
High ontological similarity means a good chance of clustering
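
The clustering example above can be sketched with a toy type hierarchy. Everything in this block is an assumption made for illustration: the PARENT hierarchy, the types assigned to Sendai and Japan, and the choice of a path-based score (1 divided by 1 plus the number of edges between two types). It is not the similarity function used in the project.

```python
# Hypothetical type hierarchy: child type -> parent type.
PARENT = {
    "Person": "Living Thing",
    "Living Thing": "Entity",
    "City": "Place",
    "Country": "Place",
    "Place": "Entity",
    "Food": "Nourishment",
    "Nourishment": "Entity",
}


def ancestors(type_name):
    """Return the type followed by its ancestors up to the root."""
    chain = [type_name]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def onto_sim(a, b):
    """Path-based similarity: 1 / (1 + edges on the path between a and b)."""
    up_a, up_b = ancestors(a), ancestors(b)
    for steps_a, node in enumerate(up_a):
        if node in up_b:
            return 1.0 / (1 + steps_a + up_b.index(node))
    return 0.0


def tuple_sim(types1, types2):
    """Average the similarity of the arg1 types and of the arg2 types."""
    return (onto_sim(types1[0], types2[0]) + onto_sim(types1[1], types2[1])) / 2


matt = ("Person", "City")      # Matt resides in Sendai
eric = ("Person", "Country")   # Eric lives in Japan
snack = ("Person", "Food")     # Eric lives on donuts

print(tuple_sim(matt, eric), tuple_sim(matt, snack))
```

With this toy hierarchy the two residence tuples score higher against each other than against the donut tuple, which is the behaviour the slide asks for.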

Ontological Similarity (Lookup)

1. Matt lives on a farm => ???
2. Eric lives on donuts => ???

Are these the same semantic relation? (NO!)
Multiple entries in dictionary for lives_on:
  resides(<Living Thing>,<Place>)
  nourished_by(<Living Thing>,<Nourishment>)
Use argument type similarity testing to differentiate between senses of lives_on

Ontological Similarity (Lookup cont.)

Which version will ontological similarity suggest we return for each example?

1. Matt lives on a farm
   <Person> lives on<?> <Place>
   resides(Matt,a_farm)
2. Eric lives on donuts
   <Person> lives on<?> <Food>
   nourished_by(Eric,donuts)

onto_sim(<Food>,<Nourishment>) is greater than onto_sim(<Food>,<Place>), so we know Eric is nourished_by donuts
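
The lookup walked through above can be sketched as picking the dictionary sense whose argument types are closest to the observed ones. This is a sketch under stated assumptions: the PARENT hierarchy and the SENSES table are toy data, lookup is a hypothetical helper, and onto_sim here is a Wu-Palmer-style depth measure picked only as one example of an ontological similarity function.

```python
# Hypothetical type hierarchy: child type -> parent type.
PARENT = {
    "Person": "Living Thing",
    "Living Thing": "Entity",
    "Place": "Entity",
    "Nourishment": "Entity",
    "Food": "Nourishment",
}

# Dictionary senses for the surface phrase, taken from the slides.
SENSES = {
    "lives on": [
        ("resides", "Living Thing", "Place"),
        ("nourished_by", "Living Thing", "Nourishment"),
    ],
}


def path_to_root(type_name):
    """Return the type followed by its ancestors up to the root."""
    chain = [type_name]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def onto_sim(a, b):
    """Wu-Palmer-style score: 2 * depth(lcs) / (depth(a) + depth(b)), root depth 1."""
    chain_a, chain_b = path_to_root(a), path_to_root(b)
    common = [node for node in chain_a if node in chain_b]
    if not common:
        return 0.0
    lcs_depth = len(path_to_root(common[0]))  # lowest common ancestor
    return 2 * lcs_depth / (len(chain_a) + len(chain_b))


def lookup(phrase, arg1_type, arg2_type):
    """Return the name of the sense whose argument types best match."""
    def score(sense):
        _, sense_arg1, sense_arg2 = sense
        return onto_sim(arg1_type, sense_arg1) + onto_sim(arg2_type, sense_arg2)
    return max(SENSES[phrase], key=score)[0]


print(lookup("lives on", "Person", "Place"))  # Matt lives on a farm
print(lookup("lives on", "Person", "Food"))   # Eric lives on donuts
```

Because both senses share the same arg1 type, the decision in this toy setup is made entirely by the arg2 comparison, as in the slide.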

Lexical Similarity

Use relationship features to score similarity
  N-gram overlap, bag-of-words, …
  Weighting content/function words differently
  etc.

Lexical Similarity

Correctly groups together
  lives at
  live in
But erroneously clusters
  lives for
  lives with
And doesn’t cluster
  resides in
  (relying on ontological sim. for that)
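
The behaviour described above can be reproduced with a small bag-of-words score, shown here as a sketch. The FUNCTION_WORDS set, the LEMMAS table, the function_weight value, and the lexical_sim function are all assumptions made for illustration; the measure is a weighted Jaccard overlap in which function words count for less than content words.

```python
# Hypothetical word lists, just large enough for the examples from the table.
FUNCTION_WORDS = {"in", "on", "at", "for", "with", "of", "a", "is", "was", "has", "to"}
LEMMAS = {"lives": "live", "lived": "live", "resides": "reside", "resided": "reside"}


def tokens(phrase):
    """Lowercase, split on whitespace, and map inflected forms to a lemma."""
    return {LEMMAS.get(word, word) for word in phrase.lower().split()}


def lexical_sim(phrase1, phrase2, function_weight=0.2):
    """Weighted Jaccard overlap; function words contribute function_weight each."""
    a, b = tokens(phrase1), tokens(phrase2)

    def weight(word):
        return function_weight if word in FUNCTION_WORDS else 1.0

    union = sum(weight(word) for word in a | b)
    shared = sum(weight(word) for word in a & b)
    return shared / union if union else 0.0


for pair in [("lives at", "live in"), ("lives for", "lives with"), ("lives in", "resides in")]:
    print(pair, round(lexical_sim(*pair), 3))
```

The first pair scores high, as wanted. The second pair scores equally high even though the meanings differ, and the third pair scores low even though the meanings match, so a lexical score alone cannot separate these cases.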



Contextual Similarity


How similar is the surrounding text?


To answer this, we need original text


Will have to hunt down sentences on the web


Time consuming


Feasible?
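
If the original sentences can be recovered, one simple way to compare surrounding text is cosine similarity over bag-of-words vectors. The sketch below assumes that approach; the helper names context_vector and cosine are hypothetical, and the three sentences are made-up examples, not ReVerb data.

```python
import math
from collections import Counter


def context_vector(sentence, relation_phrase):
    """Bag of words for the sentence with the relation phrase itself removed."""
    text = sentence.lower().replace(relation_phrase.lower(), " ")
    words = [word.strip(".,!?") for word in text.split()]
    return Counter(word for word in words if word)


def cosine(u, v):
    """Cosine similarity between two word-count vectors."""
    dot = sum(u[word] * v[word] for word in u if word in v)
    norm_u = math.sqrt(sum(count * count for count in u.values()))
    norm_v = math.sqrt(sum(count * count for count in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Made-up sentences for illustration only.
farm = context_vector("Matt lives on a farm near the city.", "lives on")
house = context_vector("Eric resides in a house near the city.", "resides in")
donuts = context_vector("Eric lives on donuts and coffee.", "lives on")

print(cosine(farm, house), cosine(farm, donuts))
```

In this toy case the two residence sentences share context words while the donut sentence shares none with the farm sentence, even though it uses the same relation phrase.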

Other Issues

Word tense
  Does lived in belong with lives in?
Detection of conflicting polarity
  x (Acesulfame_Potassium does_not_promote tooth_decay)
  x (Conservatives should_not_promote democracy)
  x (Website must_not_promote hate)
  x? (Environmentalists are_not_alone_in_promoting renewable_energy)
Semantic type coverage problems
  Use lexical similarity-based lookup for semantic type too?
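
For the polarity issue above, a first-pass check could simply look for negation words inside the relation name. This is a naive sketch: NEGATION_TOKENS and has_negation are hypothetical, and the relation names are the four from the slide.

```python
# Hypothetical list of negation markers.
NEGATION_TOKENS = {"not", "no", "never"}


def has_negation(relation_name):
    """True if any underscore-separated token of the relation is a negation word."""
    return any(token in NEGATION_TOKENS for token in relation_name.lower().split("_"))


relations = [
    "does_not_promote",
    "should_not_promote",
    "must_not_promote",
    "are_not_alone_in_promoting",  # contains "not" but still asserts promotion
]

for relation in relations:
    print(relation, has_negation(relation))
```

All four names are flagged, including are_not_alone_in_promoting, where the negation does not reverse the meaning of promote. A token check of this kind can therefore only propose candidates; deciding the actual polarity needs more than the presence of a negation word.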

Progress Report

Set up git repository
Implemented:
  Wrapper for ReVerb (data lookup)
  WordNet type-lookup
  Sherlock type-lookup
  Ontological similarity
Made a slideshow for the research group meeting
  (The one you are looking at right now)

Plan!!!!!

Finish similarity score
  Selecting a WordNet ontological similarity function
    (Over 5 different evaluations already exist)
  Implement lexical similarity
    (Should already be in NLTK somewhere)
  Implementing contextual similarity
    (Prepare for the hunt!)
Selecting & implementing a clustering method
Test on ReVerb data
  First on Wikipedia
  Then on ClueWeb