Research Issues on Language Processing


Abstract


This paper describes initiatives in the field of language processing as related to Artificial Intelligence. The three papers described below bring to light some of the work that currently attempts to give machines the ability to handle written and spoken language in a manner similar to the human mind.


1. PAPER I

This paper argues that there has been little cooperation between Natural Language Processing (NLP) and Plan Recognition (PR) [1]. In order to parse a set of observations into an explanation, both PR and NLP must specify the patterns of observations they are willing to accept, or the rules that govern how the observations can be combined. In PR this specification takes the form of a library of plans, while in NLP it takes the form of a grammar.

The authors show that the plan libraries used by PR systems are equivalent to the grammars used by NLP systems.

One of the tools that allow such integration is the class of Mildly Context-Sensitive Grammars (MCSGs), which have a number of properties that make them attractive for NLP, including polynomial-time parsing algorithms. These properties also make them attractive for adoption by the PR community.
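The correspondence between plan libraries and grammars can be illustrated with a small sketch. The Python below is not the authors' formalism: it uses a plain context-free grammar rather than an MCSG, and the plan library, the goal names, and the helpers `explains` and `recognize` are all hypothetical. It treats each plan as a grammar production and asks which goals derive an observed sequence of actions, assuming the library contains no left-recursive plans.

```python
from functools import lru_cache

# Hypothetical plan library written as grammar productions: each goal
# (a nonterminal) expands to an ordered sequence of sub-goals or
# observable actions (terminals, i.e. names that are not keys here).
PLAN_LIBRARY = {
    "MakeTea": [("BoilWater", "steep")],
    "BoilWater": [("fill_kettle", "switch_on")],
    "MakeToast": [("slice_bread", "toast")],
}


def explains(goal, observations):
    """Return True if `goal` derives exactly the observed action sequence."""
    obs = tuple(observations)

    @lru_cache(maxsize=None)
    def derives(symbol, i, j):
        # Terminal: an observable action must match a single observation.
        if symbol not in PLAN_LIBRARY:
            return j == i + 1 and obs[i] == symbol
        # Nonterminal: some plan for this goal must cover obs[i:j].
        return any(seq_derives(rhs, i, j) for rhs in PLAN_LIBRARY[symbol])

    @lru_cache(maxsize=None)
    def seq_derives(rhs, i, j):
        # An empty right-hand side derives only the empty span.
        if not rhs:
            return i == j
        head, rest = rhs[0], rhs[1:]
        # Try every split point between the head and the remaining symbols.
        return any(derives(head, i, k) and seq_derives(rest, k, j)
                   for k in range(i + 1, j + 1))

    return derives(goal, 0, len(obs))


def recognize(observations):
    """List every goal in the library that explains the observations."""
    return [goal for goal in PLAN_LIBRARY if explains(goal, observations)]
```

Calling `recognize(["fill_kettle", "switch_on", "steep"])` plays the role of parsing a sentence: the observations are the words, and the recognized goal is the root of the parse.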


2. PAPER II


The second paper describes SegGen, an algorithm for linear text segmentation on general corpora [2]. The goal is to segment texts into thematically homogeneous parts.

SegGen is an evolutionary algorithm that evaluates segmentations of the whole text rather than setting boundaries on text topics, as is commonly done. The method maintains two populations: an external archive that memorizes individuals w.r.t. both criteria, and a current population. Individuals are selected from these two populations in order to generate new individuals by means of genetic operators. These new individuals are then used to update the set of potential segmentations of the text.
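The archive-plus-population loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not SegGen itself: the text does not name the two criteria or the genetic operators, so the cohesion and dissimilarity scores, the Jaccard word overlap, the one-point crossover, the single-bit mutation, and every function name here are hypothetical choices. An individual is a list of 0/1 flags, one per gap between consecutive sentences, and the input is assumed to contain at least two sentences.

```python
import random


def similarity(a, b):
    """Jaccard word overlap between two sentences (a stand-in measure)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def segments(sentences, boundaries):
    """Cut after sentence i wherever boundaries[i] == 1."""
    out, current = [], [sentences[0]]
    for sent, cut in zip(sentences[1:], boundaries):
        if cut:
            out.append(current)
            current = []
        current.append(sent)
    out.append(current)
    return out


def objectives(sentences, boundaries):
    """Return (cohesion, dissimilarity); both are to be maximized."""
    segs = segments(sentences, boundaries)
    # Assumed criterion 1: similarity of neighbours inside a segment.
    inner = [similarity(a, b) for s in segs for a, b in zip(s, s[1:])]
    # Assumed criterion 2: dissimilarity across each boundary.
    outer = [1 - similarity(l[-1], r[0]) for l, r in zip(segs, segs[1:])]
    cohesion = sum(inner) / len(inner) if inner else 0.0
    dissim = sum(outer) / len(outer) if outer else 0.0
    return cohesion, dissim


def dominates(p, q):
    """p is at least as good as q on every criterion and differs from q."""
    return all(x >= y for x, y in zip(p, q)) and p != q


def update_archive(archive, candidates, score):
    """Keep only individuals that no other individual dominates."""
    pool = {tuple(c) for c in archive + candidates}
    return [list(c) for c in sorted(pool)
            if not any(dominates(score(d), score(c)) for d in pool)]


def evolve(sentences, generations=30, pop_size=12, seed=0):
    rng = random.Random(seed)
    n = len(sentences) - 1  # number of candidate boundary positions

    def score(ind):
        return objectives(sentences, ind)

    population = [[rng.randint(0, 1) for _ in range(n)]
                  for _ in range(pop_size)]
    archive = update_archive([], population, score)
    for _ in range(generations):
        parents = archive + population  # select from both populations
        children = []
        for _ in range(pop_size):
            a, b = rng.choice(parents), rng.choice(parents)
            cut = rng.randrange(1, n) if n > 1 else 0
            child = a[:cut] + b[cut:]       # one-point crossover
            flip = rng.randrange(n)         # single-bit mutation
            child[flip] = 1 - child[flip]
            children.append(child)
        population = children
        archive = update_archive(archive, children, score)
    return archive
```

The returned archive holds several candidate segmentations that trade one criterion against the other; choosing a single final segmentation from it is a separate step that this sketch leaves out.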

According to the authors, the experiments show that their approach appears to yield a more accurate segmentation of the texts than existing incremental methods.


3. PAPER III

In the third paper, the authors propose an unsupervised Word Sense Disambiguation (WSD) algorithm, which is based on generating Spreading Activation Networks (SANs) from the senses of a thesaurus and the relations between them [3]. Word Sense Disambiguation aims to assign to every word of a document the most appropriate meaning among those offered by a lexicon or a thesaurus.

The authors propose a four-step algorithm. In step 1, the text is fragmented into sentences, and relevant words are selected. Step 2 builds a SAN. In step 3, for every word node, the last active sense node is stored. Finally, step 4 assigns to each word the sense corresponding to the sense node stored in the previous step.
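The general idea of spreading activation over sense nodes can be sketched in a few lines. This is a simplified stand-in, not the authors' algorithm: the text does not specify the firing dynamics behind the "last active sense node" rule, so the sketch instead picks, for each word, the sense with the highest accumulated activation. The toy thesaurus, the relation weights, the `decay` and `rounds` parameters, and the function names are all hypothetical.

```python
# Hypothetical toy thesaurus: each word maps to its candidate senses.
SENSES = {
    "bank": ["bank#finance", "bank#river"],
    "deposit": ["deposit#money", "deposit#sediment"],
    "loan": ["loan#money"],
}

# Hypothetical undirected relations between senses, with made-up weights.
RELATIONS = {
    ("bank#finance", "deposit#money"): 0.9,
    ("bank#finance", "loan#money"): 0.9,
    ("deposit#money", "loan#money"): 0.8,
    ("bank#river", "deposit#sediment"): 0.7,
}


def neighbours(sense):
    """Yield (related sense, weight) pairs for one sense node."""
    for (a, b), weight in RELATIONS.items():
        if a == sense:
            yield b, weight
        elif b == sense:
            yield a, weight


def disambiguate(words, rounds=3, decay=0.5):
    """Assign to each word the sense with the highest final activation."""
    # Network construction: keep only the senses of the selected words.
    candidates = {w: SENSES[w] for w in words}
    # Each word node initially splits one unit of activation among its senses.
    activation = {s: 1.0 / len(ss) for ss in candidates.values() for s in ss}
    for _ in range(rounds):
        incoming = {s: 0.0 for s in activation}
        for sense, level in activation.items():
            for target, weight in neighbours(sense):
                if target in incoming:
                    # Activation weakens as it crosses a relation edge.
                    incoming[target] += decay * weight * level
        activation = {s: activation[s] + incoming[s] for s in activation}
    # Final assignment: one sense per word.
    return {w: max(ss, key=lambda s: activation[s])
            for w, ss in candidates.items()}
```

With the words `["bank", "deposit", "loan"]`, the financial senses reinforce one another through their shared relations, so they end up more active than the river and sediment senses.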

According to the authors, experiments on Senseval data matched the best WSD results that have been reported on the same data.


REFERENCES



[1] C. W. Geib and M. Steedman, "On Natural Language Processing and Plan Recognition," presented at International Joint Conferences on Artificial Intelligence (IJCAI), Hyderabad, India, 2007, pp. 1612-1617.

[2] S. Lamprier, T. Amghar, B. Levrat, and F. Saubion, "SegGen: a Genetic Algorithm for Linear Text Segmentation," presented at International Joint Conferences on Artificial Intelligence (IJCAI), 2007, pp. 1647-1652.

[3] G. Tsatsaronis, M. Vazirgiannis, and I. Androutsopoulos, "Word Sense Disambiguation with Spreading Activation Networks Generated from Thesauri," presented at International Joint Conferences on Artificial Intelligence (IJCAI), 2007, pp. 1725-1730.




Ray Dos Santos


rdossant@vt.edu