Natural Language Processing Course

Oct 24, 2013



Amirkabir University of Technology

Computer Engineering Faculty


Natural Language Processing

Course

Dr. Ahmad Abdollahzadeh

2

Session Agenda



Artificial Intelligence



Natural Language Processing



History of NLP



Applications of NLP


3

AI Concepts and Definitions



Encompasses Many Definitions


AI Involves Studying Human Thought Processes


Representing Thought Processes on Machines

4

Artificial Intelligence


Behavior by a machine that, if performed
by a human being, would be considered
intelligent


“…study of how to make computers do
things at which, at the moment, people
are better” (Rich and Knight [1991])


Theory of how the human mind works (Mark Fox)

Decision Support Systems and Intelligent Systems, Efraim Turban and Jay E. Aronson

6th ed, Copyright 2001, Prentice Hall, Upper Saddle River, NJ

5

AI Objectives


Make machines smarter (primary goal)


Understand what intelligence is (Nobel Laureate purpose)


Make machines more useful (entrepreneurial purpose)


(Winston and Prendergast [1984])

6

Signs of Intelligence



Learn or understand from experience


Make sense out of ambiguous or contradictory messages


Respond quickly and successfully to new situations


Use reasoning to solve problems

7

More Signs of Intelligence


Deal with perplexing situations


Understand and infer in ordinary, rational ways


Apply knowledge to manipulate the environment


Think and reason


Recognize the relative importance of different elements in a situation

8

Turing Test for Intelligence

A computer can be considered to be smart only when a human interviewer,
“conversing” with both an unseen human being and an unseen computer,
cannot determine which is which

9

Symbolic Processing



Use Symbols to Represent Problem Concepts


Apply Various Strategies and Rules to Manipulate these Concepts

10

AI Represents Knowledge as Sets of Symbols

A symbol is a string of characters that stands for some real-world concept


Examples


Product


Defendant


0.8


Chocolate

11

Symbol Structures
(Relationships)


(DEFECTIVE product)


(LEASED-BY product defendant)


(EQUAL (LIABILITY defendant) 0.8)


tastes_good (chocolate).
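
The structures above can be written down directly in a general-purpose language. The following is a minimal Python sketch, assuming nested tuples stand in for the parenthesised structures. The helper names `relation_of`, `arguments_of` and `to_text` are hypothetical and are not part of the course material.

```python
# Symbols are represented as plain strings; numbers stand for themselves.
# A symbol structure is a tuple whose first element names the relationship.
defective = ("DEFECTIVE", "product")
leased_by = ("LEASED-BY", "product", "defendant")
liability = ("EQUAL", ("LIABILITY", "defendant"), 0.8)
tastes_good = ("tastes_good", "chocolate")


def relation_of(structure):
    """Return the relationship name (the head of the structure)."""
    return structure[0]


def arguments_of(structure):
    """Return the arguments the relationship applies to."""
    return structure[1:]


def to_text(structure):
    """Render a structure back into parenthesised notation."""
    if isinstance(structure, tuple):
        return "(" + " ".join(to_text(part) for part in structure) + ")"
    return str(structure)
```

Because structures nest, a relationship such as EQUAL can take another structure, here (LIABILITY defendant), as one of its arguments.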

12


AI Programs Manipulate Symbols to Solve Problems


Symbols and Symbol Structures Form Knowledge Representation


Artificial Intelligence Deals Primarily with Symbolic, Nonalgorithmic
Problem-Solving Methods

13

AI Computing



Based on symbolic representation and manipulation


A symbol is a letter, word, or number representing objects, processes,
and their relationships


Objects can be people, things, ideas, concepts, events, or statements of fact


Creates a symbolic knowledge base


14

AI Computing (cont’d)


Manipulates symbols to generate advice


AI reasons or infers with the knowledge base
by search and pattern matching


Hunts for answers (via algorithms)
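
As a rough illustration of reasoning by search and pattern matching, the sketch below assumes that facts are stored as tuples of symbols and that a query may contain variables written with a leading `?`. The knowledge base contents and the function names `match` and `query` are invented for this example.

```python
# A toy symbolic knowledge base: each fact is a tuple of symbols.
KNOWLEDGE_BASE = [
    ("DEFECTIVE", "product"),
    ("LEASED-BY", "product", "defendant"),
    ("tastes_good", "chocolate"),
]


def is_variable(term):
    """Variables are strings that start with a question mark."""
    return isinstance(term, str) and term.startswith("?")


def match(pattern, fact):
    """Return variable bindings if pattern matches fact, else None."""
    if len(pattern) != len(fact):
        return None
    bindings = {}
    for p, f in zip(pattern, fact):
        if is_variable(p):
            # A variable must bind consistently within one pattern.
            if bindings.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return bindings


def query(pattern, knowledge_base=KNOWLEDGE_BASE):
    """Search every fact and collect the bindings of those that match."""
    results = []
    for fact in knowledge_base:
        bindings = match(pattern, fact)
        if bindings is not None:
            results.append(bindings)
    return results
```

For example, `query(("LEASED-BY", "?what", "?who"))` searches the whole knowledge base and binds `?what` and `?who` to the symbols in the one matching fact. The search here is a plain linear scan; larger systems would index the facts.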

15

Major AI Areas



Expert Systems


Natural Language Processing


Speech Understanding


Robotics and Sensory Systems


Computer Vision and Scene Recognition


Intelligent Computer-Aided Instruction


Neural Computing

16

Additional AI Areas


News Summarization


Language Translation


Fuzzy Logic


Genetic Algorithms


Intelligent Software Agents

17

Natural Language?


Natural language is the language we write and speak in
everyday social interaction.


There are of course many varieties of natural language



It is quite possible to argue that the spoken and the
written forms of the language are different and may be
largely independent.


There are systems of vocabulary, syntax and semantics
which can be observed (or similarly discovered) and
recorded.


Those working in NLP also would claim (or at least
hope) that it is possible to "automate" these
descriptions to produce useful systems that are based
on these descriptions.


18

Natural Language Processing
(NLP)


Natural language processing concerns the development of
computational models of aspects of human language
processing, such as:




Reading and interpreting a textbook



Writing a letter



Holding a conversation



Translating a document



Searching for useful information



Such models are useful both for writing computer programs that
perform tasks involving language processing and for developing
a better understanding of human communication.

19

Other Titles


The most common titles, apart from Natural Language Processing, include:



Automatic Language Processing



Computational Linguistics



Natural Language Understanding



20

Computational Linguistics


This is the application of computers to the scientific
study of human language.


This definition suggests that there are connections
with Cognitive Science, that is to say, the study of
how humans produce and understand language.


Historically, Computational Linguistics has been
associated with work in Generative Linguistics and
formerly included the study of formal languages (e.g.
finite state automata) and programming languages.


The computer is used as a tool on which models can
be developed and evaluated, for instance
implementations of theories of child language
acquisition.


21

Natural Language Understanding


This title distinguishes a particular approach to Natural
Language Processing.


The people using this title tend to lay much
emphasis on the meaning of the language being
processed, in particular getting the computer to
respond to the input in an apparently intelligent
fashion.


At one time, those who belonged to the Natural
Language Understanding camp avoided the use of
any syntactic processing, but textbooks that bear this
title now include significant sections on syntactic
processing, which suggests that the edge of the title
has been rather blunted. (For instance, see
Allen (1987), part 1.)


22

NLP History (1)


The first recognisable NLP application was a
dictionary look-up system developed at
Birkbeck College, London in 1948.



NLP from 1966-1980


Augmented Transition Networks


The Augmented Transition Network (ATN) is a piece of
searching software that is capable of using very powerful
grammars to process syntax.


Case Grammar





The significance of the proposal for NLP is that it contributed a
relatively easily implementable theory which could contribute
much semantic information with little processing effort. It also
contributed to the solution of one of the intractable problems of
Machine Translation: the translation of prepositions.
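
To give a feel for why case grammar was considered easy to implement, here is a small sketch in which a verb carries a case frame and prepositions are mapped to case roles. The role names, the preposition table and the example sentence are assumptions chosen for illustration and are not taken from the lecture. The input is assumed to be already segmented, so no parsing is shown.

```python
# Hypothetical mapping from prepositions to case roles (illustrative only).
PREPOSITION_ROLES = {
    "with": "INSTRUMENT",
    "in": "LOCATION",
    "to": "GOAL",
    "from": "SOURCE",
}


def build_case_frame(verb, subject, direct_object, prepositional_phrases):
    """Fill a flat case frame from pre-segmented sentence parts.

    prepositional_phrases is a list of (preposition, noun_phrase) pairs.
    """
    frame = {"VERB": verb, "AGENT": subject, "OBJECT": direct_object}
    for preposition, noun_phrase in prepositional_phrases:
        role = PREPOSITION_ROLES.get(preposition)
        if role is not None:
            frame[role] = noun_phrase
    return frame


# "John opened the door with a key."
frame = build_case_frame("open", "John", "the door", [("with", "a key")])
```

Once a preposition has been resolved to a role such as INSTRUMENT, a translation system can pick whatever preposition the target language uses for that role instead of translating the word "with" directly. A one-to-one table like this is a simplification, since many prepositions can signal more than one role.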

23

NLP History (2)


NLP from 1966-1980


Semantic representations



Schank and his co-workers introduced the notion of Conceptual
Dependency, a method of expressing language in terms of
semantic primitives. Systems were written which included no
syntactic processing.


Quillian's work on memory introduced the idea of the semantic
network, which has been used in varying forms for knowledge
representation in many systems.


William Woods used the idea of procedural semantics to act as an
intermediate representation between a language processing
system and a database system.


The key systems were:



SHRDLU


LUNAR: A database interface system that used ATNs and Woods'
Procedural Semantics.


LIFER/LADDER: One of the most impressive of NLP systems. It was
designed as a natural language interface to a database of information
about US Navy ships.
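
The semantic network idea mentioned on this slide can be sketched as a labelled graph. The following is a minimal illustration, assuming `is_a` links are used for inheritance. The node names, link labels and the `lookup` function are invented for the example and do not reproduce Quillian's system.

```python
# Each node maps link labels to target nodes or values.
NETWORK = {
    "canary": {"is_a": "bird", "colour": "yellow"},
    "bird": {"is_a": "animal", "can": "fly"},
    "animal": {"can": "breathe"},
}


def lookup(node, link, network=NETWORK):
    """Find a property on node, inheriting along is_a links if needed."""
    visited = set()
    while node in network and node not in visited:
        visited.add(node)
        properties = network[node]
        if link in properties:
            return properties[link]
        # Not found locally: climb to the more general concept, if any.
        node = properties.get("is_a")
    return None
```

Here `lookup("canary", "can")` finds nothing on the canary node itself and inherits the value stored on the more general bird node. The `visited` set only guards against accidental cycles in the links.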

24

NLP History (3)


NLP from 1980-1990


- Grammar Formalisms


NLP from 1990-now


- Multilinguality and Multimodality

25

NLP Applications


Applications can be classified in different
ways, e.g. medium/modality; depth of
analysis; degree of interaction




Text-based applications



NL Understanding



Dialogue Systems



Multimodal


26

Text-based Applications

Processing of written texts such as books, news, papers, reports:



Finding appropriate documents on certain topics from a text
database



Extracting information from messages, articles, Web pages, etc.



Translating documents from one language to another



Text summarisation


Note: Not all such applications require NLP


Keyword-based techniques can suffice for identifying particular
subject areas, e.g. legal, financial, etc.
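
As the note says, simple keyword matching can be enough to route a document to a subject area without any deeper language processing. A minimal sketch follows, assuming hand-picked keyword lists. The keywords and the function name `classify_by_keywords` are hypothetical.

```python
import re

# Hand-picked keyword sets per subject area (illustrative assumptions).
SUBJECT_KEYWORDS = {
    "legal": {"court", "defendant", "liability", "contract"},
    "financial": {"bank", "loan", "interest", "shares"},
}


def classify_by_keywords(text):
    """Return the subject area whose keywords occur most often, or None."""
    words = re.findall(r"[a-z]+", text.lower())
    scores = {
        subject: sum(1 for word in words if word in keywords)
        for subject, keywords in SUBJECT_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

The approach counts surface words only, so it cannot tell "interest" in a loan from "interest" in a hobby. That limitation is where the deeper analysis discussed on the following slides comes in.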

27

NL Understanding


Other kinds of request require a deeper level of analysis


Find me all articles concerning car accidents involving more than
two cars in Malta during the first half of 2001


Here the system must extract enough information to determine
whether the article meets the criterion defined by the query.


A crucial characteristic of an understanding system is that it can
compute some representation of the information that can be used
for later inference


A crucial question for an NLP system is how much understanding
is necessary to achieve the purpose of the system.
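
The Malta query gives a concrete idea of what "some representation of the information that can be used for later inference" might look like. The sketch below assumes an extraction step has already turned each article into a structured record; that step is the hard part and is not shown. The `AccidentReport` fields and the sample records are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class AccidentReport:
    """Hypothetical structured record extracted from one article."""
    title: str
    country: str
    cars_involved: int
    occurred_on: date


def matches_query(report):
    """Car accidents with more than two cars in Malta, first half of 2001."""
    return (
        report.country == "Malta"
        and report.cars_involved > 2
        and date(2001, 1, 1) <= report.occurred_on <= date(2001, 6, 30)
    )


# Invented sample records, for illustration only.
reports = [
    AccidentReport("Pile-up on the coast road", "Malta", 3, date(2001, 3, 14)),
    AccidentReport("Two-car collision", "Malta", 2, date(2001, 2, 2)),
    AccidentReport("Motorway crash", "Italy", 4, date(2001, 5, 9)),
    AccidentReport("Late-year pile-up", "Malta", 5, date(2001, 11, 20)),
]
selected = [r.title for r in reports if matches_query(r)]
```

With the records in this form, the query reduces to three comparisons. A keyword search for "car", "accident" and "Malta" could not check the number of cars or the date range.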

28

Dialogue-based Applications

Dialogue-based applications involve man-machine communication



NL database query systems



Automated customer services, e.g. banking services



General NL mediated problem solving systems


Some of the differences between dialogue and text-based systems:


Language used is less formal


System needs to act proactively in order to maintain smooth conversation


Use of acknowledgements and clarification sub-dialogues



30

Multimodal Applications

Involve two or more modalities of communication



Text



Speech



Gesture



Image




Text to speech


Speech to text


Multimodal document generation


Spoken translation systems


Spoken dialogue systems