




© Paul Buitelaar: TALN 07, Toulouse, June 2007

NLP in a Data-Driven Approach to the Ontology Life-Cycle


Paul Buitelaar

Competence Center Semantic Web & Language Technology Lab

DFKI GmbH

Saarbrücken, Germany


Overview

Part I: Ontologies and the Semantic Web
  Why Ontologies?
  The Semantic Web

Part II: The Ontology Life-Cycle
  Ontology Search
    OntoSelect
  Ontology Population
    SOBA offline
  Knowledge Retrieval
    SOBA online
  Ontology Learning
    OntoLT (RelExt, ISOLDE)

Part III: Ontologies and the Lexicon
  A Lexicon Model for Multilingual Ontologies
    LingInfo


Part I


Ontologies and the Semantic Web


Ontologies - An example

[Figure: example ontology (F-Logic similar).
  Classes: Geographical Entity (GE), Natural GE, Inhabited GE, country, city, river, mountain
  Instances: Germany, Berlin, Stuttgart, Neckar, Zugspitze
  Relations: is-a, instance_of, flow_through, located_in, capital_of
  Attributes: length (km) 367, height (m) 2962]

Design: Philipp Cimiano
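A minimal sketch of how such an example ontology could be held as subject-predicate-object triples and queried. The class, instance and relation names come from the slide; the individual edges (which class sits under which, which instance belongs to which class, which attribute belongs to which instance) are an assumed reading of the figure, and the helper names `subclasses` and `instances_of` are hypothetical.

```python
# Assumed reading of the slide figure: the labels are from the slide,
# the exact edges are NOT recoverable from it and are filled in by assumption.
TRIPLES = [
    ("Natural GE", "is-a", "Geographical Entity"),
    ("Inhabited GE", "is-a", "Geographical Entity"),
    ("river", "is-a", "Natural GE"),
    ("mountain", "is-a", "Natural GE"),
    ("country", "is-a", "Inhabited GE"),
    ("city", "is-a", "Inhabited GE"),
    ("Germany", "instance_of", "country"),
    ("Berlin", "instance_of", "city"),
    ("Stuttgart", "instance_of", "city"),
    ("Neckar", "instance_of", "river"),
    ("Zugspitze", "instance_of", "mountain"),
    ("Berlin", "capital_of", "Germany"),
    ("Stuttgart", "located_in", "Germany"),
    ("Neckar", "flow_through", "Stuttgart"),
    ("Neckar", "length (km)", 367),
    ("Zugspitze", "height (m)", 2962),
]


def subclasses(cls):
    """Return cls plus every class below it in the is-a hierarchy."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in TRIPLES:
            if p == "is-a" and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def instances_of(cls):
    """Instances of cls, including instances of its subclasses."""
    classes = subclasses(cls)
    return {s for s, p, o in TRIPLES if p == "instance_of" and o in classes}
```

Asking for `instances_of("Inhabited GE")` follows the is-a links down to country and city before collecting instances, which is the kind of inference the explicit class hierarchy makes possible.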


Why Ontologies?


Provide explicit and formal context for:



Interpretation



Integration



Sharing


Why Ontologies? - Interpretation

[Figure: two graphs, with node labels A, B, C, C1 and X, Y, Z, Y1, Z1 and link labels i, j, k, l]


Why Ontologies? - Integration

[Figure: graph with node labels A, B, C, B1, C1 and link labels i, j]


Why Ontologies? - Sharing

[Figure: graph with node labels X, Y, Z, Y1, Z1 and link labels k, l]


Ontologies and the Semantic Web




Brief intro to the Semantic Web


Web Consists of Uninterpreted Data

[Figure: the Web layer, containing Text, DBs, Images, Tables]


Interpretation through Markup - Categories

[Figure layers: Web, Markup]


Interpretation through Markup - User Tags

[Figure layers: Web 2.0, Markup]


Formal Interpretation - Knowledge Markup

[Figure layers: Knowledge Markup, Ontologies, Semantic Web (Web 3.0)]


… turns the Web into a Knowledge Base

[Figure layers: Knowledge Markup, Ontologies]

<rdf:Description rdf:about="KB_100308_Individual_16">
  <rdf:type rdf:resource="http://www.lehigh.edu/univ-bench.owl#Director"/>
  <ub:title>PhD</ub:title>
  <ub:age>51</ub:age>
  <ub:headOf>KB_100308_Individual_19</ub:headOf>
</rdf:Description>
<rdf:Description rdf:about="KB_100308_Individual_19">
  <rdf:type rdf:resource="http://www.lehigh.edu/univ-bench.owl#Program"/>
</rdf:Description>
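To illustrate what "turning the Web into a Knowledge Base" buys, the following sketch reads the RDF fragment into triples with the Python standard library and answers a simple lookup. Two things are assumptions, since the slide does not show them: the enclosing `rdf:RDF` element, and the namespace bound to the `ub` prefix (the univ-bench URL from the `rdf:type` lines is reused here). The helper names `triples` and `objects` are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: the slide does not declare the "ub" prefix; the URL from the
# rdf:type lines is reused so that the fragment becomes parseable.
UB_NS = "http://www.lehigh.edu/univ-bench.owl#"

SNIPPET = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:ub="{UB_NS}">
  <rdf:Description rdf:about="KB_100308_Individual_16">
    <rdf:type rdf:resource="http://www.lehigh.edu/univ-bench.owl#Director"/>
    <ub:title>PhD</ub:title>
    <ub:age>51</ub:age>
    <ub:headOf>KB_100308_Individual_19</ub:headOf>
  </rdf:Description>
  <rdf:Description rdf:about="KB_100308_Individual_19">
    <rdf:type rdf:resource="http://www.lehigh.edu/univ-bench.owl#Program"/>
  </rdf:Description>
</rdf:RDF>"""


def triples(rdf_xml):
    """Flatten rdf:Description blocks into (subject, predicate, object) tuples."""
    root = ET.fromstring(rdf_xml)
    result = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            predicate = prop.tag.split("}", 1)[1]  # local name without namespace
            obj = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            result.append((subject, predicate, obj))
    return result


def objects(kb, subject, predicate):
    """All objects stored for a given subject and predicate."""
    return [o for s, p, o in kb if s == subject and p == predicate]


KB = triples(SNIPPET)
```

With the data in this form, a question such as "what does individual 16 head, and what kind of thing is that?" becomes two lookups: `objects(KB, "KB_100308_Individual_16", "headOf")` followed by `objects(KB, "KB_100308_Individual_19", "type")`.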




… that enables Semantic Web Services

[Figure layers: Knowledge Markup, Ontologies, Semantic Web Services]


… and Intelligent Man-Machine Interfaces

[Figure layers: Knowledge Markup, Ontologies, Semantic Web Services, Intelligent Man-Machine Interface]


Part II


The Ontology Life-Cycle


Ontology Life Cycle

Create/Select: Development and/or Selection
Populate: Knowledge Base Generation
Validate: Consistency Checks
Evolve: Extension, Modification
Maintain: Usability Tests
Deploy: Knowledge Retrieval


NLP in the Ontology Life Cycle

Text-Driven Ontology Search

Ontology Population from Text

Ontology Learning from Text

NL Interaction with KBs


OntoSelect

Ontology Library and Ontology Search Service
  http://olp.dfki.de/OntoSelect
  OntoSelect monitors the web for ontologies (automatic updates and indexing)
  Ontology browse and search (by keyword, by document, by topic)
  Class, property and (multilingual) label browse and search
  Ontology publishing (submit your ontology)
  Statistics on formats (mostly OWL), languages (mostly EN), frequently used labels, ontology publishing

Selected ontologies may be used in:
  Knowledge extraction/markup in Semantic Web applications
  Semantic tagging in Natural Language Processing

Paul Buitelaar, Thomas Eigner, Thierry Declerck. OntoSelect: A Dynamic Ontology Library with Support for Ontology Selection. In: Proc. of the Demo Session at the International Semantic Web Conference, Hiroshima, Japan, Nov. 2004.


Multilinguality

Distribution of languages in 170 ontologies with multilingual labels, out of the 1420 ontologies collected as of June 2007 (~12%)


Ontology Search with OntoSelect

“Find the background knowledge that fits your NLP task …”

Keyword, topic, document-specific ontology search

Relevance criteria address ontology content and structure:
  Coverage - Term Matching
    How many of the terms in a text collection are covered by labels for classes and properties?
  Structure - Properties Relative to Classes
    How detailed is the knowledge structure that the ontology represents?
  Connectedness - Number of Included Ontologies
    Is the ontology connected to other ontologies and how well established are these?
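The three relevance criteria can be sketched as simple scoring functions. This is an illustration under stated assumptions, not OntoSelect's actual scoring: the `OntologyInfo` record, the toy ontologies and the way the three scores are combined (sorting by coverage first, then structure, then connectedness) are all hypothetical, and the "how well established" part of connectedness is left out.

```python
from dataclasses import dataclass, field


@dataclass
class OntologyInfo:
    """Hypothetical summary of an indexed ontology."""
    name: str
    class_labels: set
    property_labels: set
    imports: list = field(default_factory=list)  # included ontologies


def coverage(onto, terms):
    """Fraction of the text's terms matched by a class or property label."""
    labels = {label.lower() for label in onto.class_labels | onto.property_labels}
    wanted = {term.lower() for term in terms}
    return len(wanted & labels) / len(wanted) if wanted else 0.0


def structure(onto):
    """Number of properties relative to the number of classes."""
    if not onto.class_labels:
        return 0.0
    return len(onto.property_labels) / len(onto.class_labels)


def connectedness(onto):
    """Number of included ontologies."""
    return len(onto.imports)


def rank(ontologies, terms):
    """Assumed combination: compare coverage, then structure, then connectedness."""
    return sorted(
        ontologies,
        key=lambda o: (coverage(o, terms), structure(o), connectedness(o)),
        reverse=True,
    )


# Toy data, invented for the example.
soccer = OntologyInfo("soccer", {"Player", "Goalkeeper", "Match"},
                      {"playsFor", "scoredBy"}, ["sports"])
geo = OntologyInfo("geo", {"City", "River"}, {"locatedIn"})
terms = ["goalkeeper", "match", "referee", "city"]
ranking = [o.name for o in rank([geo, soccer], terms)]
```

Here the soccer ontology covers two of the four terms and the geography ontology one, so the soccer ontology is ranked first for this text.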


Related Services

Manually maintained ontology libraries
  Protégé Ontology Library (Stanford Univ., USA)
  DAML Ontology Library (DAML.org)
    No longer maintained
  SchemaWeb Directory (schemaweb.info)
    No longer active?

Semantic Web search engines
  SWOOGLE (Univ. of Maryland, USA)
  Watson (Knowledge Media Institute, UK)


NLP in the Ontology Life Cycle

Text-Driven Ontology Search

Ontology Population from Text

Ontology Learning from Text

NL Interaction with KBs


SmartWeb Ontology-based Annotation (SOBA)

Generate a Knowledge Base from Web documents on the Soccer World Cup, to be used for knowledge-based QA in the SmartWeb mobile dialogue system
  http://www.smartweb-projekt.de/

Ontology-based wrapping of HTML tables
Ontology-based Information Extraction
  Named-Entity Recognition/Classification, Event Extraction
Ontology-based Information Extraction on image captions
  NE/Event extraction for linking images to ontology classes
Ontology Population
  Instantiation of classes with extracted NEs and events

Paul Buitelaar, Philipp Cimiano, Anette Frank, Stefania Racioppa. SOBA: SmartWeb Ontology-based Annotation. In: Proc. of the Demo Session at the International Semantic Web Conference, Athens GA, USA, Nov. 2006.













SOBA - offline

[Figure: SOBA offline architecture. Components shown: Web; Web Interface; Crawler (HTML); Monitor; SmartWeb corpus (XML) with Text, Images, Tables and Image Captions; Textual Data and Semi-Struct. Data (XML); Linguistic Annotation & Information Extraction; Linguistic Annotation & IE Results (XML); Class Instantiation; SWIntO & OntoBroker (F-Logic); KB (RDF)]



KB Generation

semistruct#Uruguay_vs_Bolivien_29_Maerz_2000_19:30:sportevent#LeagueFootballMatch
[ externalRepresentation@(de) ->> "Uruguay vs. Bolivien (29. Maerz 2000 19:30)";
  dolce#"HAPPENS-AT" -> semistruct#"29. Maerz 2000 19:30_interval";
  sportevent#heldIn -> semistruct#"Montevideo_Centenario_29Maerz_2000_19_30_Stadium";
  sportevent#team1Result -> 1;
  sportevent#team2Result -> 0;
  sportevent#attendance -> 49811;
  sportevent#team1 -> semistruct#"Uruguay_vs_Bolivien_29Maerz_2000_19:30_Uruguay_MatchTeam";
  sportevent#team2 -> semistruct#"Uruguay_vs_Bolivien_29Maerz_2000_19:30_Bolivien_MatchTeam";

semistruct#Uruguay_vs_Bolivien_29_Maerz_2000_19:30
[ sportevent#matchEvents -> soba#ID11].

soba#ID11:sportevent#Ban
[ sportevent#commitedBy -> semistruct#Uruguay_vs_Bolivivien_(…)_Luis_CRISTALDO].

semistruct#Uruguay_vs_Bolivien_29_Maerz_2000_19:30
[ sportevent#matchEvents -> soba#ID25].

soba#ID25:sportevent#Foul
[ sportevent#commitedBy -> semistruct#Uruguay_vs_Bolivien_Luis_CRISTALDO].

mediainst#ID67:media#Picture
[ media#URL -> "http://fifaworldcup.yahoo.com/06/de/photos/124155.jpg";
  media#shows -> ID25].
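As a rough illustration of the class instantiation step, the sketch below serializes extracted attribute values into a frame in the same F-Logic-like notation as the facts above. It is not SOBA's code: the function names `fl_value` and `fl_frame` and the object identifier `semistruct#match_1` are hypothetical, and only the attribute names and values are taken from the slide.

```python
def fl_value(value):
    """Write numbers bare and everything else as a quoted string."""
    if isinstance(value, (int, float)):
        return str(value)
    return '"' + str(value) + '"'


def fl_frame(object_id, class_name, attributes):
    """Render one instance as 'id:class [ attr -> value; ... ].'"""
    lines = [f"{name} -> {fl_value(value)}" for name, value in attributes.items()]
    return f"{object_id}:{class_name}\n[ " + ";\n  ".join(lines) + "]."


# Values as they might come out of an ontology-based table wrapper.
frame = fl_frame(
    "semistruct#match_1",  # hypothetical identifier
    "sportevent#LeagueFootballMatch",
    {
        "sportevent#team1Result": 1,
        "sportevent#team2Result": 0,
        "sportevent#attendance": 49811,
    },
)
print(frame)
```

Keeping the rendering separate from the extraction means the same extracted values could be written out in another target format, such as RDF.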




OBIE Results

Results over 50 manually annotated match reports

SWIntO class + attribute    manually                extracted
(65 classes)                annotated   extracted   OK          Precision   Recall   F-measure

Sample of good results (F-measure > 0.6)
ShowingYellowRedCard        23          15          13          0.867       0.565    0.684
  penalized_player          22          2           1           0.500       0.045    0.083
Block                       26          13          12          0.923       0.462    0.615
  committed_by              17          1           1           1.000       0.059    0.111
  committed_on              8           3           3           1.000       0.375    0.545
FreeKick                    69          36          32          0.889       0.464    0.610
  committed_by              52          11          11          1.000       0.212    0.349

Sample of average results (0.3 < F-measure < 0.6)
CornerKick                  53          24          20          0.833       0.377    0.519
  committed_by              14          4           4           1.000       0.286    0.444
Header                      58          34          23          0.676       0.397    0.500
  committed_by              55          7           6           0.857       0.109    0.194
  committed_on              11          3           3           1.000       0.273    0.429
Cross                       95          31          25          0.806       0.263    0.397
  committed_by              77          5           5           1.000       0.065    0.122
  committed_on              26          2           2           1.000       0.077    0.143

Sample of bad results (F-measure < 0.3)
BallDeflection              35          5           4           0.800       0.114    0.200
  committed_by              34          2           2           1.000       0.059    0.111

MACRO AVERAGE
  on types:       P 0.51 / R 0.23 / F 0.31
  on attributes:  P 0.38 / R 0.06 / F 0.11

MICRO AVERAGE
  on types:       P 0.72 / R 0.26 / F 0.38
  on attributes:  P 0.88 / R 0.06 / F 0.12
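The per-class figures follow from the three counts in each row: precision is extracted OK over extracted, recall is extracted OK over manually annotated, and the F-measure is their harmonic mean. The sketch below recomputes them and shows the difference between macro averaging (mean of per-class scores) and micro averaging (scores over pooled counts). It uses only the seven class rows listed in the sample, so its averages are not expected to match the reported ones, which cover all 65 classes; the function names are hypothetical.

```python
def prf(annotated, extracted, correct):
    """Precision, recall and F-measure from raw counts."""
    precision = correct / extracted if extracted else 0.0
    recall = correct / annotated if annotated else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Class rows from the sample: (manually annotated, extracted, extracted OK)
ROWS = {
    "ShowingYellowRedCard": (23, 15, 13),
    "Block": (26, 13, 12),
    "FreeKick": (69, 36, 32),
    "CornerKick": (53, 24, 20),
    "Header": (58, 34, 23),
    "Cross": (95, 31, 25),
    "BallDeflection": (35, 5, 4),
}


def macro_average(rows):
    """Average the per-class scores, so every class counts equally."""
    scores = [prf(*counts) for counts in rows.values()]
    return tuple(sum(score[i] for score in scores) / len(scores) for i in range(3))


def micro_average(rows):
    """Pool the counts first, so frequent classes weigh more."""
    annotated = sum(counts[0] for counts in rows.values())
    extracted = sum(counts[1] for counts in rows.values())
    correct = sum(counts[2] for counts in rows.values())
    return prf(annotated, extracted, correct)


p, r, f = prf(*ROWS["ShowingYellowRedCard"])
print(f"{p:.3f} {r:.3f} {f:.3f}")
```

Micro averaging is dominated by frequent classes such as Cross and FreeKick, while macro averaging gives rare classes the same weight, which is one reason the two sets of averages in the table differ.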


SOBA - online



NLP in the Ontology Life Cycle

Text-Driven Ontology Search

Ontology Population from Text

Ontology Learning from Text

NL Interaction with KBs


Ontology Learning (from Text)

Layers of ontology learning, each with an example:
  Terms: referee, trainer, goalkeeper, ...
  (Multilingual) Synonyms: {Torwart, doelman, goalkeeper, ...}
  Concept Formation (Classes): $c := \mathrm{GOALKEEPER} := \langle i(c), |c|, \mathrm{Ref}_c(c) \rangle$
  Class Taxonomy: $\mathrm{GOALKEEPER} \leq_c \mathrm{PLAYER}$, $\mathrm{PLAYER} \leq_c \mathrm{PERSON}$
  Relations: CROSS (domain: PLAYER, range: PLAYER)
  Relation Taxonomy: $\mathrm{CROSS} \leq_R \mathrm{ASSIST}$
  Axioms: disjoint (PLAYER, REFEREE)

Philipp Cimiano. Ontology Learning and Population from Text: Algorithms, Evaluation and Applications. Springer, 2006.

Paul Buitelaar, Philipp Cimiano, Bernardo Magnini. Ontology Learning from Text: An Overview. In: Paul Buitelaar, Philipp Cimiano, Bernardo Magnini (eds.) Ontology Learning from Text: Methods, Evaluation and Applications. Frontiers in Artificial Intelligence and Applications Series, Vol. 123, IOS Press, July 2005.
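The upper layers (class taxonomy and axioms) are what make a learned ontology usable for inference. The sketch below encodes the taxonomy and disjointness examples from the slide and derives two consequences. The entry placing REFEREE under PERSON is an assumption added for the example, and the function names are hypothetical.

```python
# Taxonomy from the slide; REFEREE -> PERSON is an assumption for illustration.
SUBCLASS_OF = {
    "GOALKEEPER": "PLAYER",
    "PLAYER": "PERSON",
    "REFEREE": "PERSON",
}

# Axiom from the slide: disjoint (PLAYER, REFEREE)
DISJOINT = [frozenset({"PLAYER", "REFEREE"})]


def ancestors(cls):
    """All superclasses of cls, nearest first."""
    result = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result


def is_a(sub, sup):
    """True if sub equals sup or lies below it in the taxonomy."""
    return sub == sup or sup in ancestors(sub)


def violates_disjointness(asserted_classes):
    """True if the asserted classes, with their ancestors, hit a disjoint pair."""
    expanded = set()
    for cls in asserted_classes:
        expanded.add(cls)
        expanded.update(ancestors(cls))
    return any(pair <= expanded for pair in DISJOINT)
```

From the two taxonomy statements it follows that a GOALKEEPER is a PERSON, and from the axiom that nothing can be both a GOALKEEPER and a REFEREE, although neither fact is stated directly.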


OntoLT

NLP-based Protégé Plug-In
http://olp.dfki.de/OntoLT/OntoLT.htm

Paul Buitelaar, Daniel Olejnik, Michael Sintek. A Protégé Plug-In for Ontology Extraction from Text Based on Linguistic Analysis. In: Proceedings of the 1st European Semantic Web Symposium (ESWS), Heraklion, Greece, May 2004.



Linguistic Patterns + Statistical Relevance


Class Candidate Extraction


Generate Ontology Fragments


RelExt - Relation Extraction

Extend SmartWeb Ontology with ‘Event Relations’
  “Ballack shoots the ball in the net.” > Relation: Shoot(Domain: FootballPlayer, Range: BallObject)

Verb-argument observations from text:
  flanken (to cross)       SUBJ: FOOTBALLPLAYER “Klasnic”
  flanken (to cross)       DOBJ: FOOTBALLPLAYER “Klose”
  ...
  beschimpfen (to insult)  SUBJ: FOOTBALLPLAYER “Klasnic”
  ...

Resulting relation:
  Relation: flanken(Domain: FootballPlayer, Range: FootballPlayer)

Alexander Schutz, Paul Buitelaar. RelExt: A Tool for Relation Extraction in Ontology Extension. In: Proc. of the 4th International Semantic Web Conference, Galway, Ireland, Nov. 2005.
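The step from verb-argument observations to a relation with domain and range can be sketched as follows. This is a simplified stand-in, not RelExt itself: it picks the most frequent class per argument slot by raw counts, where a real system would apply a statistical relevance measure, and the function name and the `min_count` parameter are hypothetical.

```python
from collections import Counter, defaultdict

# (verb, grammatical function, ontology class of the argument) as on the slide
OBSERVATIONS = [
    ("flanken", "SUBJ", "FootballPlayer"),
    ("flanken", "DOBJ", "FootballPlayer"),
    ("beschimpfen", "SUBJ", "FootballPlayer"),
]


def propose_relations(observations, min_count=1):
    """Map each verb to (domain, range) when both SUBJ and DOBJ classes are seen."""
    slots = defaultdict(lambda: {"SUBJ": Counter(), "DOBJ": Counter()})
    for verb, function, cls in observations:
        if function in ("SUBJ", "DOBJ"):
            slots[verb][function][cls] += 1

    relations = {}
    for verb, counts in slots.items():
        if not counts["SUBJ"] or not counts["DOBJ"]:
            continue  # without a subject or object class there is no domain or range
        domain, domain_count = counts["SUBJ"].most_common(1)[0]
        range_, range_count = counts["DOBJ"].most_common(1)[0]
        if min(domain_count, range_count) >= min_count:
            relations[verb] = (domain, range_)
    return relations


print(propose_relations(OBSERVATIONS))
```

With the observations above only flanken yields a relation, since beschimpfen has no observed object class from which to derive a range.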


ISOLDE - Web-based Taxonomy Extraction

Extracting Taxonomies from Wikipedia & Web Dictionaries

[Figure labels: Class Candidate, Lexical Relation, Related Class Candidate, Web Resource, Taxonomy, Class, Equivalence]

Nicolas Weber, Paul Buitelaar. Web-based Ontology Learning with ISOLDE. In: Proc. of the Workshop on Web Content Mining with Human Language at the International Semantic Web Conference, Athens GA, USA, Nov. 2006.



Part III


Ontologies and the Lexicon


NLP in the Ontology Life Cycle

Ontology Search

Ontology Population

Ontology Learning

KB Retrieval

(Multilingual) Lexicon


Multilinguality in Ontologies

[Figure: university ontology.
  Classes: University, School, Student, Staff, Campus
  Relations: is_part_of, studies_at, works_at, located_at
  Terms attached to School: has_German_term Fakultät, has_US-English_term School, has_Dutch_term Faculteit]


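Towards Lexicalized Ontologies

[Figure: the same university ontology, with terms modelled as objects.
  Classes: University, School, Student, Staff, Campus, Term
  Relations: is_part_of, studies_at, works_at, located_at, has_term
  Instances of Term (instance_of): Fakultät (language DE), faculteit (language NL), school (language EN-US)]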


LingInfo - Lexicon Model for Ontologies

[Figure: a DomainClass is linked by hasLingInfo to Term-1, an instance (instanceOf) of LingInfo. Term-1 has hasOrthographicForm (term), hasLang (XX) and hasMorphSynInfo (WordForm-1).]

Paul Buitelaar, Michael Sintek, Malte Kiesel. A Lexicon Model for Multilingual/Multimedia Ontologies. In: Proceedings of the 3rd European Semantic Web Conference, Budva, Montenegro, June 2006.


LingInfo - Lexicon Model for Ontologies

Multilingual Terms

[Figure: the class SCHOOL (“school”) is linked by hasLingInfo to Term-1, an instance (instanceOf) of LingInfo. Term-1: hasOrthographicForm fakulteitsgebouw (“department building”), hasLang NL.]


LingInfo - Lexicon Model for Ontologies

Morpho-Syntactic Info

[Figure: the class SCHOOL (“school”) is linked by hasLingInfo to Term-1, an instance (instanceOf) of LingInfo. Term-1: hasOrthographicForm fakulteitsgebouw (“department building”), hasLang NL, hasMorphSynInfo WordForm-1. WordForm-1: hasPoS N.]


LingInfo - Lexicon Model for Ontologies

Decomposition

[Figure: the class SCHOOL (“school”) is linked by hasLingInfo to Term-1, an instance (instanceOf) of LingInfo. Term-1: hasOrthographicForm fakulteitsgebouw (“department building”), hasLang NL, hasMorphSynInfo WordForm-1. WordForm-1: hasPoS N, hasStem Term-2, hasStem Term-3. Term-2: hasOrthographicForm fakulteit (“department”, “school”). Term-3: hasOrthographicForm gebouw (“building”).]


Mapping Lexical to Semantic Structure

Decomposition

[Figure: the class SCHOOL (“school”) is linked by hasLingInfo to Term-1, an instance (instanceOf) of LingInfo. Term-1: hasOrthographicForm fakulteitsgebouw (“department building”), hasLang NL, hasMorphSynInfo WordForm-1. WordForm-1: hasPoS N, hasStem Term-2, hasStem Term-3. Term-2: hasOrthographicForm fakulteit (“department”, “school”). Term-3: hasOrthographicForm gebouw (“building”).]


Mapping Lexical to Semantic Structure

Decomposition

[Figure: as on the previous slide, with a second class. The classes SCHOOL (“school”) and BUILDING each have a hasLingInfo link to an instance (instanceOf) of LingInfo. Term-1: hasOrthographicForm fakulteitsgebouw (“department building”), hasLang NL, hasMorphSynInfo WordForm-1. WordForm-1: hasPoS N, hasStem Term-2, hasStem Term-3. Term-2: hasOrthographicForm fakulteit (“department”, “school”). Term-3: hasOrthographicForm gebouw (“building”).]


Mapping Lexical to Semantic Structure

Decomposition

[Figure: as on the previous slide, now also showing the relation isLocatedAt alongside the classes SCHOOL (“school”) and BUILDING, each of which has a hasLingInfo link to an instance (instanceOf) of LingInfo. Term-1: hasOrthographicForm fakulteitsgebouw (“department building”), hasLang NL, hasMorphSynInfo WordForm-1. WordForm-1: hasPoS N, hasStem Term-2, hasStem Term-3. Term-2: hasOrthographicForm fakulteit (“department”, “school”). Term-3: hasOrthographicForm gebouw (“building”).]
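The decomposition example can be written down as a small data structure, which makes the mapping from lexical to semantic structure concrete. This is a sketch under assumptions rather than the LingInfo implementation: the Python class and field names are hypothetical renderings of the property names on the slides, the language tag on the two stems is assumed, and attaching gebouw to BUILDING is an assumed reading of the figure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WordForm:
    pos: str                                                  # hasPoS
    stems: List["LingInfo"] = field(default_factory=list)     # hasStem


@dataclass
class LingInfo:
    orthographic_form: str                                    # hasOrthographicForm
    lang: Optional[str] = None                                # hasLang
    morph_syn_info: Optional[WordForm] = None                 # hasMorphSynInfo


@dataclass
class DomainClass:
    name: str
    ling_info: List[LingInfo] = field(default_factory=list)   # hasLingInfo


# Term-2 and Term-3; the NL tag on the stems is an assumption.
fakulteit = LingInfo("fakulteit", "NL")
gebouw = LingInfo("gebouw", "NL")
# Term-1: the compound, a noun decomposed into its two stems.
compound = LingInfo("fakulteitsgebouw", "NL", WordForm("N", [fakulteit, gebouw]))

school = DomainClass("SCHOOL", [compound])
building = DomainClass("BUILDING", [gebouw])  # assumed attachment


def evoked_classes(term, classes):
    """Classes linked to the term itself or to one of its stems."""
    parts = [term]
    if term.morph_syn_info:
        parts.extend(term.morph_syn_info.stems)
    return [
        c.name
        for c in classes
        if any(part is info for part in parts for info in c.ling_info)
    ]


print(evoked_classes(compound, [school, building]))
```

Because the compound is stored together with its stems, a lookup for fakulteitsgebouw reaches both SCHOOL and BUILDING, which is the precondition for relating the two classes semantically.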


Mapping Lexical to Semantic Structure

Pred-Arg Structure

[Figure: the class CALL is linked by hasLingInfo to Term-1, an instance (instanceOf) of LingInfo, and has the relation hasAgent to PERSON. Term-1: hasOrthographicForm call, hasLang EN, hasMorphSynInfo WordForm-1. WordForm-1: hasPoS V, hasArg Arg-1, hasArg Arg-2. Arg-1: hasGramFunc SUBJ, hasPhraseType NP, mapsTo hasAgent-1.]


Mapping Lexical to Semantic Structure

Pred-Arg Structure

Coercion/Bridging: “The Heller school called. They wanted to know ...”

[Figure: the class CALL is linked by hasLingInfo to Term-1, an instance (instanceOf) of LingInfo, and has the relation hasAgent to PERSON. Also shown: ORGANIZATION with the relation worksFor, and SCHOOL with the relation isa. Term-1: hasOrthographicForm call, hasLang EN, hasMorphSynInfo WordForm-1. WordForm-1: hasPoS V, hasArg Arg-1, hasArg Arg-2. Arg-1: hasGramFunc SUBJ, hasPhraseType NP, mapsTo hasAgent-1.]


Acknowledgements

OntoSelect
  Thomas Eigner, Michael Velten (DFKI)
SOBA
  Anette Frank (DFKI, now at Univ. Heidelberg), Stefania Racioppa (DFKI), Philipp Cimiano (AIFB) and others ...
OntoLT
  Michael Sintek (DFKI), Daniel Olejnik (now at IDS Scheer) and others ...
RelExt
  Alexander Schutz (now at DERI Galway)
ISOLDE
  Nicolas Weber (now at KnowCenter, Graz)
LingInfo
  Michael Sintek, Massimo Romanelli (DFKI), Vanessa Micelli (European Media Lab, Germany) and others ...