An Evaluation Tool for Natural Language Processing Systems

Audrey N. Mbeje

Department of Computer Science
Ball State University


November 09, 2000

Contents

I. Introduction
   Problem Description
   Significance of the Study

II. Definition of Terms
   Computational Linguistics
   Context

III. Literature Review

IV. Methodology

V. Anticipated Results

VI. Time Schedule

VII. Deliverables

VIII. Future Research & Conclusion

Problem Description

Problem Background:

Human interactive discourse poses many challenges for natural language processing (NLP) systems. One of the main challenges is representing the speaker’s intended meaning in its context. Thus the focus of current NLP research has been to develop technology that enables the computer to understand news events in the context in which they occur in the real world.

The evolving technology, however, is linguistically oriented and pays less attention to the quality of the software. Additionally, it does not reflect uniform principles of software evaluation.



Goal:

The goal of the proposed study is to improve the quality of natural language processing technology by assessing newly developed NLP systems for linguistic and technical quality before they are deployed.

We propose an NLP system evaluation tool that provides both linguistic and software quality assurance. The proposed study is based on the assumption that progress in developing NLP technology depends on using evaluation methods that better model both speakers’ natural discourse and software quality.

Significance of the Study

The study will benefit the theory of natural language processing, particularly the research area concerned with context in NLP systems.

The study proposes integrating linguistic principles and software design principles in NLP system evaluation, which would contribute to current progress in NLP technology.

The proposed tool will improve NLP system usability by offering quality assurance for the reliability and validity of the software, both technically and linguistically.

Definition of Terms

1. Computational Linguistics:

- A discipline between linguistics and computer science that is concerned with the computational aspects of the human language faculty.

- Belongs to the cognitive sciences, specifically artificial intelligence (AI).

- Has two components: applied and theoretical.




Definition of Terms (cont’d)

- In the applied component, the interest is in the practical outcome of modeling human language use. The goal is to create software products that have some knowledge of human language.

- The theoretical component deals with formal theories about the linguistic knowledge that a human needs for generating and understanding language.

(The proposed evaluation tool is intended for the applied component of CL.)


Definition of Terms (cont’d)

2. Context:

- A rough definition of the term:

- We say that an utterance x presupposes a fact y if uttering x only makes sense when the context (e.g., world knowledge or an earlier utterance in the same conversation) provides enough information to conclude that y is the case. Consider example 2a.

2a. Mary’s husband is out of town.

The noun phrase “Mary’s husband” presupposes that Mary is married.

Computational linguists are concerned with making NLP systems understand such contextual information.
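
The presupposition definition above can be illustrated with a minimal sketch. It assumes, purely for illustration, that facts are plain strings and that the context is a set of such facts; the names `Utterance` and `unmet_presuppositions` are hypothetical and do not come from the proposal or from any existing NLP system.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Utterance:
    """An utterance x together with the facts y it presupposes (assumed representation)."""
    text: str
    presupposes: Set[str] = field(default_factory=set)


def unmet_presuppositions(utterance: Utterance, context: Set[str]) -> Set[str]:
    """Return the presupposed facts that the context does not provide.

    An empty result means the utterance "makes sense" in this context
    in the rough sense of the definition above.
    """
    return utterance.presupposes - context


# Example 2a, with the presupposed fact written out by hand.
example_2a = Utterance(
    text="Mary's husband is out of town.",
    presupposes={"Mary is married"},
)

# Context that already contains the needed fact: nothing is missing.
print(unmet_presuppositions(example_2a, {"Mary is married"}))  # set()

# Empty context: the presupposition is not supported.
print(unmet_presuppositions(example_2a, set()))  # {'Mary is married'}
```

A real system would have to derive the presupposed facts from the text itself rather than receive them by hand, which is the hard part the slide refers to.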


Literature Review

Much research on the problem of in-depth story understanding by computer was performed starting in the 1970s.

In the 1990s, interest shifted toward information extraction and word sense disambiguation.

The end of the 1990s marked another shift in focus, back to in-depth story understanding by computer.







McCarthy (1990) discusses the problem of getting the computer to understand the following text from the New York Times:

A 61-year old furniture salesman was pushed down the shaft of a freight elevator yesterday in his downtown Brooklyn store by two robbers while a third attempted to crush him with the elevator car because they were dissatisfied with the $1,200 they had forced him to give them. The buffer springs at the bottom of the shaft prevented the car from crushing the salesman, John J. Hug, after he was pushed from the first floor to the basement. The car stopped about 12 inches above him as he flattened himself at the bottom of the pit.

(Mueller, 1999)


McCarthy’s concern went beyond mere word sense disambiguation and information extraction. He suggested that the computer should be able to answer such contextual questions as:

Who was in the store when the events began?

Who had the money at the end?

What would have happened if Mr. Hug had not flattened himself at the bottom of the pit? etc.



Literature Review (cont’d)

Current research on contextual understanding is concerned with problems such as the one stated above.

Several NLP systems have been proposed whose orientation is mainly linguistic.

This study proposes an evaluation tool for such NLP systems that integrates linguistic principles with technical principles, namely speed.

Methodology

- Create an algorithm simulating aspects of the human language faculty, namely speed and the ability to decode contextual discourse.

- Evaluate the NLP systems for context decoding and speed using existing evaluation technology.







Methodology (cont’d)

- Run the same tests using the proposed tool.

- Compare the results.

Note: Before its implementation, the proposed evaluation tool will itself be evaluated for validity and reliability using an outside researcher’s evaluation tool.
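
As a rough illustration of what evaluating a system "for context decoding and speed" could look like, the sketch below scores a system on a list of question and expected-answer pairs and records the mean time per question. Everything here is an assumption made for illustration: the proposal does not specify a scoring rule, so exact-match accuracy is used as a stand-in, and `evaluate`, `EvalResult`, and `toy_system` are hypothetical names.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumption: a system under test is any callable (passage, question) -> answer.
NLPSystem = Callable[[str, str], str]


@dataclass
class EvalResult:
    accuracy: float      # fraction of questions answered correctly (context decoding)
    mean_seconds: float  # average wall-clock time per question (speed)


def evaluate(system: NLPSystem, passage: str,
             items: List[Tuple[str, str]]) -> EvalResult:
    """Score a system on contextual questions using exact-match answers."""
    if not items:
        raise ValueError("at least one (question, expected answer) pair is required")
    correct = 0
    elapsed = 0.0
    for question, expected in items:
        start = time.perf_counter()
        answer = system(passage, question)
        elapsed += time.perf_counter() - start
        if answer.strip().lower() == expected.strip().lower():
            correct += 1
    return EvalResult(correct / len(items), elapsed / len(items))


def toy_system(passage: str, question: str) -> str:
    """Hypothetical stand-in system that handles a single presupposition question."""
    if "married" in question.lower() and "husband" in passage.lower():
        return "yes"
    return "unknown"


items = [
    ("Is Mary married?", "yes"),
    ("Where is Mary's husband?", "out of town"),
]
result = evaluate(toy_system, "Mary's husband is out of town.", items)
print(result.accuracy)  # 0.5
```

Under these assumptions, the comparison step would amount to running the same passage and question set through an existing evaluation method and through the proposed tool, then comparing the two sets of scores.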


Anticipated Results

The proposed tool should effectively evaluate NLP systems for context and speed.

Time Schedule

August - November: Proposal Writing & Presentation

November - December: Proposal Review

January - March: Literature Review

April - July: Data Gathering; Evaluation Tool Design; Evaluation Tool Testing

August - November: Thesis Writing & Defense


Deliverables

1. Natural Language Processing Evaluation Tool

2. Research Presentation at a Conference

3. Research Publication

Conclusion and Future Research

Computing the context of a natural language discourse is an essential task for a natural language processing system.

The proposed evaluation tool for NLP systems will have the potential to be modified to incorporate new design principles for improved usability.

The End
