Natural Language Processing

7 Nov 2013

Natural Language Processing
Assignment

Final Presentation

Varun Suprashanth, 09005063
Tarun Gujjula, 09005068
Asok Ramachandran, 09005072

Part 1 : POS Tagger

Tasks Completed

Implementation of Viterbi.

Unigram and bigram assumptions.

Five-fold evaluation.

Per-POS accuracy.

Confusion matrix.
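The Viterbi implementation listed above can be sketched in a few lines of Python. This is a minimal sketch: the toy corpus, its counts, and the unsmoothed-plus-epsilon probabilities are illustrative stand-ins, not the corpus or estimates used in the assignment.

```python
from collections import defaultdict

# Toy annotated corpus (illustrative only; tag names follow BNC-style labels).
corpus = [
    [("the", "AT0"), ("dog", "NN1"), ("barks", "VVZ")],
    [("the", "AT0"), ("old", "AJ0"), ("dog", "NN1"), ("sleeps", "VVZ")],
    [("old", "AJ0"), ("dogs", "NN2"), ("bark", "VVB")],
]

# MLE counts for bigram transitions P(t_i | t_{i-1}) and emissions P(w_i | t_i).
trans = defaultdict(lambda: defaultdict(int))
emit = defaultdict(lambda: defaultdict(int))
for sent in corpus:
    prev = "^"
    for w, t in sent:
        trans[prev][t] += 1
        emit[t][w] += 1
        prev = t
    trans[prev]["$"] += 1

def prob(table, given, x, smooth=1e-6):
    # Relative frequency with a tiny floor so unseen events are not impossible.
    total = sum(table[given].values())
    return (table[given][x] + smooth) / (total + smooth) if total else smooth

def viterbi(words):
    tags = list(emit.keys())
    # best[t] = (score of best tag path ending in t, that path)
    best = {t: (prob(trans, "^", t) * prob(emit, t, words[0]), [t]) for t in tags}
    for w in words[1:]:
        nxt = {}
        for t in tags:
            s, path = max(
                ((best[p][0] * prob(trans, p, t), best[p][1]) for p in tags),
                key=lambda sp: sp[0],
            )
            nxt[t] = (s * prob(emit, t, w), path + [t])
        best = nxt
    _, path = max(
        ((best[t][0] * prob(trans, t, "$"), best[t][1]) for t in tags),
        key=lambda sp: sp[0],
    )
    return path

print(viterbi(["the", "old", "dog", "barks"]))  # → ['AT0', 'AJ0', 'NN1', 'VVZ']
```

Working in probability space is fine at this toy scale; on real sentences the products underflow, so a practical implementation would sum log probabilities instead.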



[Chart: per-POS accuracy for the bigram assumption]

Screenshot of confusion matrix:

            AJ0  AJ0-AV0  AJ0-NN1  AJ0-VVD  AJ0-VVG  AJ0-VVN   AJC   AJS    AT0    AV0  AV0-AJ0   AVP
AJ0        2899       20       32        1        3        3     0     0     18     35       27     1
AJ0-AV0      31       18        2        0        0        0     0     0      0      1       15     0
AJ0-NN1     161        0      116        0        0        0     0     0      0      0        1     0
AJ0-VVD       7        0        0        0        0        0     0     0      0      0        0     0
AJ0-VVG       8        0        0        0        2        0     0     0      1      0        0     0
AJ0-VVN       8        0        0        3        0        2     0     0      1      0        0     0
AJC           2        0        0        0        0        0    69     0      0     11        0     0
AJS           6        0        0        0        0        0     0    38      0      2        0     0
AT0         192        0        0        0        0        0     0     0   7000     13        0     0
AV0         120        8        2        0        0        0    15     2     24   2444       29    11
AV0-AJ0      10        7        0        0        0        0     0     0      0     16       33     0
AVP          24        0        0        0        0        0     0     0      1     11        0   737
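The per-POS accuracy and the confusion matrix above can both be derived from parallel gold/predicted tag sequences. A generic sketch (the tag data below is illustrative):

```python
from collections import Counter, defaultdict

def confusion_matrix(gold, pred):
    """cm[g][p] counts how often gold tag g was predicted as tag p."""
    cm = defaultdict(Counter)
    for g, p in zip(gold, pred):
        cm[g][p] += 1
    return cm

def per_pos_accuracy(cm):
    """For each gold tag, the fraction of its tokens tagged correctly."""
    return {t: cm[t][t] / sum(cm[t].values()) for t in cm}

gold = ["AT0", "AJ0", "NN1", "AJ0", "NN1", "VVZ"]
pred = ["AT0", "AJ0", "NN1", "NN1", "NN1", "VVZ"]
cm = confusion_matrix(gold, pred)
acc = per_pos_accuracy(cm)
print(acc["AJ0"])  # → 0.5 (one of the two AJ0 tokens was mis-tagged NN1)
```

The diagonal of the matrix holds the correct decisions, so per-POS accuracy is just each row's diagonal entry over its row sum.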

Part 2 : Discriminative vs. Generative

Problem Statement

Generate unigram parameters of P(t_i | w_i). You already have the annotated corpus.

Compute the argmax of P(T|W); do not invert through Bayes' theorem.

Compare the unigram performance of (2) with the HMM-based system.


Tasks Completed

Generated unigram parameters of P(t_i | w_i).

Computed the argmax of P(T|W).

Compared the unigram performance of the HMM-based system with the above.

Better results were produced by the generative model in cases of ambiguous sentences.
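The two unigram scoring rules can be sketched side by side. The toy counts below are illustrative. Note that with pure maximum-likelihood estimates the two argmaxes pick the same tag token by token, since argmax_t c(w,t)/c(w) and argmax_t (c(w,t)/c(t))·(c(t)/n) both reduce to argmax_t c(w,t); the differences reported above come from the HMM's transition model, not the unigram factors alone.

```python
from collections import defaultdict

# Toy annotated corpus (illustrative; the assignment used a real tagged corpus).
corpus = [
    [("the", "AT0"), ("saw", "NN1")],
    [("they", "PNP"), ("saw", "VVD"), ("it", "PNP")],
    [("they", "PNP"), ("saw", "VVD"), ("the", "AT0"), ("dog", "NN1")],
]

cwt = defaultdict(lambda: defaultdict(int))  # c(word, tag)
ct = defaultdict(int)                        # c(tag)
for sent in corpus:
    for w, t in sent:
        cwt[w][t] += 1
        ct[t] += 1
n = sum(ct.values())

def tag_discriminative(w):
    # argmax_t P(t|w) = argmax_t c(w,t) / c(w); c(w) is constant over t.
    return max(cwt[w], key=lambda t: cwt[w][t])

def tag_generative(w):
    # argmax_t P(w|t) * P(t) = argmax_t (c(w,t)/c(t)) * (c(t)/n)
    return max(cwt[w], key=lambda t: (cwt[w][t] / ct[t]) * (ct[t] / n))

print(tag_discriminative("saw"), tag_generative("saw"))  # → VVD VVD
```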


Discriminative

argmax_T P(T|W) = argmax_T P(T_{1,N} | W_{1,N})

                = argmax_{T_1} P(T_1 | W_{1,N}) . argmax_{T_2} P(T_2 | W_{1,N}) . ... . argmax_{T_N} P(T_N | W_{1,N})



Assuming word-tag pairs to be independent,

argmax_T P(T|W) = argmax_T ∏_i P(T_i | W_i)




Precision: 0.896788
F-measure: 0.896788

Per-PoS Accuracy

[Chart: per-PoS accuracy for each tag, AJ0 through VVZ-NN2]
Generative

argmax_T P(T|W) = argmax_T P(W|T) . P(T)

Assuming the unigram assumption and word-tag pairs to be independent,

argmax_T P(T|W) = argmax_T ∏_i P(W_i | T_i) . P(T_i)



Part 3 : Analysis of Corpora Using Word Prediction

Tasks Completed

Predicted the next word on the basis of the patterns occurring in both corpora.

The first corpus had untagged-word sentences; the second had tagged-word sentences.

The corpus with the tagged words gives better results for word prediction.

Untagged Corpus

argmax_w P(W | W_{1,N}) = argmax_w c(W_{1,N} . W) / c(W_{1,N})

where c() is the count.



By the bigram assumption,

argmax_w P(W | W_{1,N}) = argmax_w c(W_N . W) / c(W_N)


By the trigram assumption,

argmax_w P(W | W_{1,N}) = argmax_w c(W_{N-1} . W_N . W) / c(W_{N-1} . W_N)












Tagged Corpus

argmax_w P(W | W_{1,N}, T_{1,N})
    = argmax_w Σ_i c(<w_1,T_1>, <w_2,T_2>, ..., <w_N,T_N>, <w,T_i>) / c(<w_1,T_1>, <w_2,T_2>, ..., <w_N,T_N>)



Using the bigram assumption,

argmax_w P(W | W_{1,N}, T_{1,N}) = argmax_w Σ_i c(<w_N,T_N>, <w,T_i>) / c(<w_N,T_N>)


Using the trigram assumption,

argmax_w P(W | W_{1,N}, T_{1,N}) = argmax_w Σ_i c(<w_{N-1},T_{N-1}>, <w_N,T_N>, <w,T_i>) / c(<w_{N-1},T_{N-1}>, <w_N,T_N>)


Examples

Example 1:
Tagged context:   TO0_to VBI_be CJC_or XX0_not TO0_to  →  VBI_be
Untagged context: to be or not to  →  The

Example 2:
Tagged context:   AJ0_complete CJC_and AJ0_utter  →  NN1_contempt
Untagged context: complete and utter  →  Loud

Example 3:
Tagged context:   PNQ_who VBZ_is DPS_your AJ0-NN1_favourite  →  NN1_gardening
Untagged context: who is your favourite  →  is

Results

Raw-text LM: word prediction accuracy 13.21%

POS-tagged-text LM: word prediction accuracy 15.53%
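The count-based predictors above can be sketched for the bigram case. The two toy corpora below reuse Example 1 and are illustrative only; a tagged corpus is handled identically by treating each tag_word pair as one token.

```python
from collections import defaultdict

def train_bigram(sentences):
    """Build the c(x, y) table used by argmax_y c(x_N . y) / c(x_N)."""
    counts = defaultdict(lambda: defaultdict(int))
    for sent in sentences:
        for a, b in zip(sent, sent[1:]):
            counts[a][b] += 1
    return counts

def predict_next(counts, context):
    # The denominator c(x_N) is constant over y, so comparing raw counts suffices.
    last = context[-1]
    if not counts[last]:
        return None
    return max(counts[last], key=lambda y: counts[last][y])

# Raw corpus vs. the same corpus with POS tags attached to each word (toy data).
raw = [["to", "be", "or", "not", "to", "be"]]
tagged = [["TO0_to", "VBI_be", "CJC_or", "XX0_not", "TO0_to", "VBI_be"]]

print(predict_next(train_bigram(raw), ["not", "to"]))             # → be
print(predict_next(train_bigram(tagged), ["XX0_not", "TO0_to"]))  # → VBI_be
```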


Part 4 : A-star Implementation

Problem Statement

The goal is to see which algorithm is better for POS tagging, Viterbi or A*.

Look upon the column of POS tags above all the words as forming the state-space graph.

The start state S is '^' and the goal state G is '$'.

Your job is to come up with a good heuristic. One possibility is that the heuristic value h(N), where N is a node on a word W, is the product of the distance of W from '$' and the least arc cost in the state-space graph.

g(N) is the cost of the best path found so far to W from '^'.

Run A* with this heuristic and see the result.

Compare the result with Viterbi.


A-Star Implementation

Precision: 0.937254
F-measure: 0.937254

[Chart: per-PoS accuracy for each tag, AJ0 through VVZ-NN2, for the A-star tagger]
Screenshot of confusion matrix (tag labels were not captured in this extraction; judging by the diagonal, the row/column order matches the Part 1 matrix, with six additional unlabelled rows):

             AJ0  AJ0-AV0  AJ0-NN1  AJ0-VVD  AJ0-VVG  AJ0-VVN   AJC   AJS    AT0    AV0  AV0-AJ0    AVP
AJ0        12836       58      187        9       13       28     0     0    240    110       52      7
AJ0-AV0       98       44        3        0        0        0     0     0      0      5       26      0
AJ0-NN1      357        1      377        0        2        0     0     0      1      0        1      0
AJ0-VVD       33        0        0        2        0        1     0     0      7      0        0      0
AJ0-VVG       33        0        2        0       29        0     0     0      4      0        0      0
AJ0-VVN       42        0        0        5        0       15     0     0      5      0        0      0
AJC            4        0        0        0        0        0   403     0      3     38        0      0
AJS            4        0        0        0        0        0     0   214      0     18        0      0
AT0            1        0        0        0        0        0     0     0  23454     55        0      0
AV0           82       11        2        0        0        0    58    11     99   9198       68     42
AV0-AJ0       34       12        0        0        0        0     0     0      0     69       75      0
AVP            4        0        0        0        0        0     0     0      1     38        0   1533
?              0        0        0        0        0        0     0     0      0      5        0     72
?              0        0        0        0        0        0     0     0      1     15        0      0
?              0        0        0        0        0        0     0     0      0      0        0      0
?              0        0        0        0        0        0     0     0      0      3        0      0
?              0        0        0        0        0        0     0     0      1    109        0      0
?              0        0        0        0        0        0     0     0      0      0        0      0

Heuristics

h = g * (N - n) / n

where N is the length of the sentence, and n is the index of the current word in the sentence.
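An A* search over the tag lattice using the h = g * (N - n) / n heuristic above can be sketched as follows. The transition and emission counts are toy data, arc costs are negative log probabilities, and for brevity this sketch omits the final transition into '$' (it is the same for every complete path here).

```python
import heapq
import math

# Toy counts (illustrative). Costs are -log probabilities, so the best tag
# sequence is the cheapest path from '^' through one tag per word.
trans = {"^": {"AT0": 2, "AJ0": 1}, "AT0": {"NN1": 1, "AJ0": 1},
         "AJ0": {"NN1": 2}, "NN1": {"VVZ": 2}, "VVZ": {"$": 2}}
emit = {"AT0": {"the": 2}, "AJ0": {"old": 2}, "NN1": {"dog": 2}, "VVZ": {"barks": 1}}

def cost(table, given, x, smooth=1e-6):
    row = table.get(given, {})
    total = sum(row.values())
    p = (row.get(x, 0) + smooth) / (total + smooth) if total else smooth
    return -math.log(p)

def astar(words):
    N = len(words)
    # A state is (position n, tag at position n); h = g * (N - n) / n.
    frontier = [(0.0, 0.0, (0, "^"), [])]  # (f, g, state, tag path)
    best_g = {}
    while frontier:
        f, g, (n, t), path = heapq.heappop(frontier)
        if (n, t) in best_g and best_g[(n, t)] <= g:
            continue  # already reached this state more cheaply
        best_g[(n, t)] = g
        if n == N:
            return path  # first goal state popped wins
        for t2 in emit:
            g2 = g + cost(trans, t, t2) + cost(emit, t2, words[n])
            n2 = n + 1
            h = g2 * (N - n2) / n2 if n2 else 0.0
            heapq.heappush(frontier, (g2 + h, g2, (n2, t2), path + [t2]))
    return None

print(astar(["the", "old", "dog", "barks"]))  # → ['AT0', 'AJ0', 'NN1', 'VVZ']
```

Note that this heuristic is not guaranteed admissible (it can overestimate the remaining cost), so unlike Viterbi, A* with it may in principle return a suboptimal path; that trade-off is exactly what the comparison in this part examines.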

A-star vs. Viterbi


Part 5 : YAGO

Problem Statement

Take as input two words and show A PATH between them, listing all the concepts that are encountered on the way.

For example, in the path from 'bulldog' to 'cheshire cat', one would presumably encounter 'bulldog - dog - mammal - cat - cheshire cat'.

Similarly for 'VVS Laxman' and 'Hyderabad', or 'Tendulkar' and 'Tennis' (you will be surprised!!).
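The path query can be sketched as a breadth-first search over a concept graph. The tiny hand-made graph below only stands in for YAGO; the real assignment would query the YAGO knowledge base itself.

```python
from collections import deque

# Hand-made stand-in for a fragment of YAGO's concept taxonomy (illustrative).
edges = {
    "bulldog": ["dog"],
    "dog": ["bulldog", "mammal"],
    "mammal": ["dog", "cat"],
    "cat": ["mammal", "cheshire cat"],
    "cheshire cat": ["cat"],
}

def find_path(src, dst):
    """Breadth-first search: a shortest concept path between two words."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in edges.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no path in the graph

print(find_path("bulldog", "cheshire cat"))
# → ['bulldog', 'dog', 'mammal', 'cat', 'cheshire cat']
```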


Part 6 : Parser Projection

Example

English: Dhoni is the captain of India.

Hindi: dhoni bhaarat ke kaptaan hai.



Hindi
-
parse
:


[













[ [
dhoni
]
NN
]
NP













[

[[[
bhaarat
]
NNP
]
NP

[
ke
]
P

]
PP

[
kaptaan
]
NN
]
NP

[
hai
]
VBZ

]
VP




]
S



English parse:

[
  [ [Delhi]NN ]NP
  [ [is]VBZ [ [the]ART [capital]NN ]NP [ [of]P [ [India]NNP ]NP ]PP ]VP
]S
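The core projection step can be sketched as mapping a source-side parse onto target words through a word alignment. Everything below is a hand-made toy: the nested-list parse, the alignment dictionary, and the simplification that word order is not adjusted (real Hindi projection also has to reorder constituents).

```python
# English parse of the example sentence as nested [label, children...] lists.
eng_parse = ["S", ["NP", ["NN", "Dhoni"]],
                  ["VP", ["VBZ", "is"],
                         ["NP", ["ART", "the"], ["NN", "captain"]],
                         ["PP", ["P", "of"], ["NP", ["NNP", "India"]]]]]

# Hand-made English -> Hindi word alignment (illustrative).
align = {"Dhoni": "dhoni", "is": "hai", "captain": "kaptaan",
         "of": "ke", "India": "bhaarat"}

def project(node):
    """Replace each leaf with its aligned target word; drop unaligned leaves."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        w = children[0]
        return [label, align[w]] if w in align else None
    kids = [p for c in children if (p := project(c)) is not None]
    return [label] + kids if kids else None

print(project(eng_parse))
```

Here the unaligned article "the" simply disappears from the projected tree, which mirrors the fact that Hindi has no article in this sentence; misaligned words would instead project structure onto the wrong tokens, the noise source noted in the conclusions.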





Problems and Conclusions

Many idioms in English are translated literally, even though they mean something else.

E.g. phrases like "break a leg", "He lost his head", "French kiss", "Flip the bird".

Noise because of misalignments.


Natural Language Toolkit

The Natural Language Toolkit, or more commonly NLTK, is a suite of libraries and programs for symbolic and statistical natural language processing (NLP) for the Python programming language.

NLTK includes graphical demonstrations and sample data.

It is accompanied by extensive documentation, including a book that explains the underlying concepts behind the language processing tasks supported by the toolkit. It provides lexical resources such as WordNet.

It has a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning.


[EOF]