Harvesting Semantic Content from the Web
Eduard Hovy
3-4:30, Tuesday, October 14, 2008, Lecture Hall 1
Fall 2008 PLATO Royalty Lecture Series [1]




Abstract:


Research in natural language processing (NLP) over the past fifteen years has
produced impressive practical results using statistical methods. But increasingly there are
signs that continued quality improvement in language processing applications (including QA,
summarization, information extraction, opinion mining, and machine translation) requires
deeper and richer representations, possibly even (shallow) semantics of text meaning.
Although theories of semantics (formal and informal) abound, no one has yet built a resource
of semantic symbols that effectively supports NLP, that is empirically based, and that has
been validated through human-agreement scores. Can this be done?


This talk describes the harvesting of semantic knowledge from the web, and the
reformulation of that knowledge into the Omega ontology, to support various NLP
applications. We will explore a series of increasingly detailed experiments in knowledge
harvesting and organization: from fully automated, through partly automated, ending with
work requiring manual annotation. The first two make extensive use of the web; the third is
part of the OntoNotes project, a large collaborative effort to build a manually annotated
corpus of one million words of English, Chinese, and Arabic text, with an accompanying
ontology for the senses of nouns and verbs.


Throughout the lecture, we will touch on some problematic aspects of the semantics
and semantic representations that must support robust large-scale reasoning and other
applications. We will see examples of cases where traditional formal semantics simply does
not work, and where what does work instead looks woefully simplistic.


The Speaker:


Eduard Hovy leads the Natural Language Research Group at the Information
Sciences Institute of the University of Southern California, and is Deputy Director of the
Intelligent Systems Division, as well as a research associate professor of the Computer
Science Department of USC and Advisory Professor of the Beijing University of Posts and
Telecommunications. He completed a Ph.D. in Computer Science (Artificial Intelligence) at
Yale University in 1987, and his research focuses on information extraction, automated text
summarization, the semi-automated construction of large lexicons and ontologies, machine
translation, question answering, and digital government. He is the author or co-editor of
five books and over 180 technical articles. Dr. Hovy regularly serves in an advisory
capacity to funders of NLP research in the US and EU. In 2001 Dr. Hovy served as President
of the Association for Computational Linguistics (ACL) and in 2001-03 as President of the
International Association of Machine Translation (IAMT); he currently serves as President
of the Digital Government Society of North America (DGSNA). He regularly co-teaches a
course in the Master’s Degree Program in Computer Science at the University of Southern
California, as well as occasional short courses on machine translation and other topics at
universities and conferences. He has served on the Ph.D. and M.S. committees for students
from USC, Carnegie Mellon University, Taiwan National U, the Universities of Toronto,
Karlsruhe, Pennsylvania, Stockholm, Waterloo, Nijmegen, Pretoria, and Ho Chi Minh City.
http://www.isi.edu/natural-language/nlp-at-isi.html and http://www.isi.edu/~hovy.html

[1] This Lecture Series is sponsored by Evergreen’s PLATO Royalty Fund, a fund established
with royalties from computer assisted instruction (CAI) software written by Evergreen
faculty John Aikin Cushing and students in the early 1980s for the Control Data PLATO
system.




Companion Reading for the Lecture, and Readings for Week 6 Data & Information Seminar:

1. Deepak Ravichandran and Eduard Hovy, “Learning Surface Text Patterns for a Question
   Answering System.”
   http://www.isi.edu/natural-language/projects/webclopedia/pubs/02ACL-patterns.pdf


2. “OntoNotes: The 90% Solution.”
   http://www.isi.edu/natural-language/people/hovy/papers/06HLT-NAACL-OntoNotes-short.pdf


3. Soo-Min Kim and Eduard Hovy, “Identifying and Analyzing Judgment Opinions.”
   http://www.isi.edu/natural-language/people/hovy/papers/06HLT-JudgmentOpinion.pdf



Optional associated readings:

1. Especially the Abstract and Sections 1-3, 7: Chin-Yew Lin and Eduard Hovy, “The
   Automated Acquisition of Topic Signatures for Text Summarization.”
   http://www.isi.edu/natural-language/people/hovy/papers/00linhovy.pdf


2. Donghui Feng and Eduard Hovy, “Handling Biographical Questions with Implicature.”
   http://www.isi.edu/natural-language/people/hovy/papers/05HLT-bio-questions-implic.pdf


3. Ken Barker, Bhalchandra Agashe, et al., “Learning by Reading: A Prototype System,
   Performance Baseline and Lessons Learned.”
   http://www.isi.edu/natural-language/people/hovy/papers/07AAAI-mobius.pdf