CSIT600f: Introduction to Semantic Web


Dickson K.W. Chiu

PhD, SMIEEE


Text: Antoniou & van Harmelen, A Semantic Web Primer


Ref: Ivan Herman, Tutorial on Semantic Web Technology


Dickson Chiu 2006

CSIT600b s1

Towards a Semantic Web


WWW is an impressive success:


amount of available information (> 1 Giga-page)


number of human users (> 200 Mega-user)


The current Web represents information using


natural language (English, Hungarian, Chinese,…)


graphics, multimedia, page layout


Humans can process this easily


can deduce facts from partial information


can create mental associations


are used to various kinds of sensory information


(well, sort of… people with disabilities may have serious problems on the Web with rich media!)


Need for understanding Web info


Tasks often require combining data from across the Web:


hotel and travel information may come from different sites


searches in different digital libraries


etc.


Again, humans combine this information easily


even if different terminologies are used!


However…


However: machines are ignorant!


partial information is unusable


difficult to make sense of, e.g., an image


drawing analogies automatically is difficult


difficult to combine information


is <foo:creator> the same as <bar:author>?


how to combine different XML hierarchies?





Example: Searching


The best-known example…


Google et al. are great, but there are too many false hits


adding descriptions to resources should improve this


Where we are Today: the Syntactic Web

[Hendler & Miller 02]


The Syntactic Web is



A hypermedia, a digital library


A library of documents (called web pages) interconnected by a hypermedia of links


A database, an application platform


A common portal to applications accessible through web pages, and presenting their results as web pages


A platform for multimedia


BBC Radio 4 anywhere in the world!


Peer-to-peer sharing (BT, edonkey, PPLive, …)


A naming scheme


Unique identity for those documents



A place where computers do the presentation (easy) and people do the linking and interpreting (hard).


Why not get computers to do more of the hard work?



Hard Work Using the Syntactic Web



Finding the image of something


Find pictures that contain red birds with blue background


Complex queries involving background knowledge


Find information about animals that use sonar but are neither bats nor dolphins



Locating information in data repositories


Travel enquiries


Prices of goods and services


Results of human genome experiments


Finding and using web services



Visualise surface interactions between two proteins


Delegating complex tasks to web agents



Book me a holiday next weekend somewhere warm, not too far away, and where they speak French or English
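
The sonar query on this slide shows why background knowledge matters: a keyword search cannot answer it, but a lookup over explicit class membership can. A minimal sketch in Python, assuming a tiny hand-made knowledge base (the animal names, the `IS_A` table and the `USES_SONAR` set are illustrative assumptions, not data from the slides):

```python
# Hypothetical toy knowledge base: each animal points to its broader class.
IS_A = {
    "little_brown_bat": "bat",
    "bottlenose_dolphin": "dolphin",
    "oilbird": "bird",
    "common_shrew": "shrew",
    "house_sparrow": "bird",
}

# Hypothetical facts about which animals echolocate.
USES_SONAR = {"little_brown_bat", "bottlenose_dolphin", "oilbird", "common_shrew"}


def sonar_users_excluding(excluded_classes):
    """Return animals that use sonar and do not belong to any excluded class."""
    return sorted(
        animal
        for animal in USES_SONAR
        if IS_A.get(animal) not in excluded_classes
    )


result = sonar_users_excluding({"bat", "dolphin"})
print(result)
```

The query works only because class membership is stated explicitly; nothing in the page text "little brown bat" tells a machine that it names a bat.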


What is the Problem?

Consider a typical web page:


Markup consists of:


rendering information (e.g., font size and colour)


Hyper-links to related content


Semantic content is accessible to humans but not (easily) to computers



What information can we see


WWW2002

The eleventh international world wide web conference

Sheraton waikiki hotel

Honolulu, hawaii, USA

7-11 may 2002

1 location 5 days learn interact

Registered participants coming from

australia, canada, chile denmark, france, germany, ghana, hong kong, india,
ireland, italy, japan, malta, new zealand, the netherlands, norway,
singapore, switzerland, the united kingdom, the united states, vietnam,
zaire

Register now

On the 7th May Honolulu will provide the backdrop of the eleventh international world wide web conference. This prestigious event


Speakers confirmed

Tim berners-lee

Tim is the well known inventor of the Web,


Ian Foster

Ian is the pioneer of the Grid, the next generation internet





Information a machine may see


WWW2002

The eleveth iteratioal world wide web
coferece

Sherato waikiki hotel

Hoolulu, hawaii, USA

7-11 may 2002

1 locatio 5 days lear iteract

Registered participats comig from

australia, caada, chile demark, frace,
germay, ghaa, hog kog, idia, irelad,
italy, japa, malta, ew zealad, the
etherlads, orway, sigapore, switzerlad,
the uited kigdom, the uited states,
vietam, zaire

Register ow

O the 7
th

May Hoolulu will provide the
backdrop of the eleveth iteratioal world
wide web coferece This prestigious evet 

Speakers cofirmed

Tim berers
-
lee

Tim is the well kow ivetor of the Web, 

Ia Foster

Ia is the pioeer of the Grid, the ext
geeratio iteret 




Solution: XML markup with “meaningful” tags?

<name>WWW2002 The eleveth iteratioal world wide webco</name>

<location>Sherato waikiki hotel Hoolulu, hawaii, USA</location>


How about


<conf>WWW2002 The eleveth iteratioal world wide webco</conf>

<place>Sherato waikiki hotel Hoolulu, hawaii, USA</place>


Then how about


<会议>WWW2002 The eleveth iteratioal world wide webco</会议>

<地点>Sherato waikiki hotel Hoolulu, hawaii, USA</地点>

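The point of the three markup variants can be sketched in Python with the standard `xml.etree.ElementTree` module. The `<event>` wrapper element and the shortened element content are assumptions added to make each fragment a well-formed document; the tag names are the ones from the slide. A program written against one tag vocabulary finds nothing in the other two, even though a human reads all three as the same facts:

```python
import xml.etree.ElementTree as ET

# Three encodings of the same facts, differing only in tag vocabulary.
DOCS = [
    "<event><name>WWW2002</name><location>Honolulu</location></event>",
    "<event><conf>WWW2002</conf><place>Honolulu</place></event>",
    "<event><会议>WWW2002</会议><地点>Honolulu</地点></event>",
]


def find_location(xml_text, tag="location"):
    """Return the text of the first child element named `tag`, or None."""
    element = ET.fromstring(xml_text).find(tag)
    return None if element is None else element.text


# Only the first document uses the tag name the program expects.
locations = [find_location(doc) for doc in DOCS]
print(locations)
```

Tag names are only labels: XML fixes the syntax, but agreeing on what `location`, `place` and `地点` mean needs something outside the markup.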

What Is Needed?


A resource should provide information about itself


also called “metadata”


metadata should be in a machine-processable format


agents should be able to “reason” about (meta)data


metadata vocabularies should be defined


What Is Needed (Technically)?


To make metadata machine-processable, we need:


unambiguous names for resources (URIs)


a common data model for expressing metadata (RDF)


and ways to access the metadata on the Web


common vocabularies (Ontologies)


The “Semantic Web” is a metadata-based infrastructure for reasoning on the Web


It extends the current Web (and does not replace it)
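
As a rough illustration of the first two requirements, the RDF data model can be pictured as a set of (subject, predicate, object) triples whose names are URIs. The sketch below uses plain Python tuples rather than an RDF library; the `http://example.org/` namespace and the `match` helper are assumptions made for this example, while the Dublin Core element namespace is the published one.

```python
DC = "http://purl.org/dc/elements/1.1/"   # Dublin Core element namespace
EX = "http://example.org/"                # hypothetical namespace for this sketch

# Each statement is one (subject, predicate, object) triple.
triples = {
    (EX + "primer", DC + "title", "A Semantic Web Primer"),
    (EX + "primer", DC + "creator", "Grigoris Antoniou"),
    (EX + "primer", DC + "creator", "Frank van Harmelen"),
}


def match(subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


creators = [o for _, _, o in match(predicate=DC + "creator")]
print(creators)
```

Because the predicate is a full URI, any other source that uses the same URI is unambiguously talking about the same property.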


Adding Semantics


External agreement on meaning of annotations


E.g., Dublin Core (http://dublincore.org/)


Agree on the meaning of a set of annotation tags


Problems with this approach


Inflexible


Limited number of things can be expressed


Use Ontologies to specify meaning of annotations


Ontologies provide a vocabulary of terms


New terms can be formed by combining existing ones


Meaning (semantics) of such terms is formally specified


Can also specify relationships between terms in multiple ontologies
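
The last point, relating terms across ontologies, is what answers the earlier question of whether `<foo:creator>` is the same as `<bar:author>`. A minimal sketch, assuming hypothetical namespace URIs for the `foo:` and `bar:` prefixes and a hand-written equivalence table (a real ontology language would state this equivalence declaratively instead):

```python
# Hypothetical namespaces standing in for the foo: and bar: prefixes.
FOO = "http://example.org/foo#"
BAR = "http://example.org/bar#"

# Data published by two sources using different vocabularies.
triples = [
    ("doc1", FOO + "creator", "Alice"),
    ("doc2", BAR + "author", "Bob"),
]

# Explicit statement that the two properties mean the same thing.
EQUIVALENT = {BAR + "author": FOO + "creator"}


def normalize(triple):
    """Rewrite the predicate to its canonical equivalent, if one is declared."""
    s, p, o = triple
    return (s, EQUIVALENT.get(p, p), o)


merged = [normalize(t) for t in triples]
creators = [o for _, p, o in merged if p == FOO + "creator"]
print(creators)
```

Once the relationship is stated, a single query over `foo:creator` covers both sources without either of them changing its data.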


History of the Semantic Web


The Web was invented by Tim Berners-Lee (amongst others), a physicist working at CERN


TBL’s original vision of the Web was much more ambitious than the reality of the existing (syntactic) Web:


“... a goal of the Web was that, if the interaction between person and hypertext could be so intuitive that the machine-readable information space gave an accurate representation of the state of people's thoughts, interactions, and work patterns, then machine analysis could become a very powerful management tool, seeing patterns in our work and facilitating our working together through the typical problems which beset the management of large organizations.”


TBL (and others) have since been working towards realising this vision, which has become known as the Semantic Web


E.g., article in May 2001 issue of Scientific American


Berners-Lee’s Architecture


[Figure: layered architecture diagram. Recoverable labels: Data Exchange; Semantics+reasoning; Relational Data; several layers marked “?” or “???”]


Relationship between layers is not clear


OWL DL extends “DL subset” of RDF


A Spectrum of Ontology

[Figure: spectrum of ontology kinds, from least to most expressive: Catalog/ID; Terms/glossary; Thesauri (“narrower term” relation); Informal is-a; Formal is-a; Formal instance; Frames (properties); Value Restrs.; Disjointness, Inverse, part-of…; General Logical constraints]



Ontology: Origins and History


Ontology in Philosophy: a philosophical discipline, a branch of philosophy that deals with the nature and the organization of reality


Science of Being (Aristotle, Metaphysics, IV, 1)


studies being or existence as well as the basic categories thereof


trying to find out what entities and what types of entities exist


has strong implications for the conceptions of reality.


Ontology in Linguistics

[Figure: meaning triangle with corners Form (“Tank”), Concept and Referent; edge labels “activates”, “Relates to” and “Stands for”; a “?” annotation]

[Ogden, Richards, 1923]



Ontology in Computer Science


An ontology is an engineering artifact [Neches91]:


defines basic terms and relations comprising the vocabulary of a topic area


the rules for combining terms and relations to define extensions to the vocabulary


An explicit specification of a conceptualization [Gruber93]


Formal specification of a shared conceptualization (of a certain domain) [Borst 97]:


Shared understanding of a domain of interest


Formal and machine manipulable model of a domain of interest


Structure of an Ontology

Ontologies typically have two distinct components:

1. Names for important concepts in the domain


Elephant is a concept whose members are a kind of animal


Herbivore is a concept whose members are exactly those animals who eat only plants or parts of plants


Adult_Elephant is a concept whose members are exactly those elephants whose age is greater than 20 years

2. Background knowledge/constraints on the domain


Adult_Elephants weigh at least 2,000 kg


All Elephants are either African_Elephants or Indian_Elephants


No individual can be both a Herbivore and a Carnivore
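
The two components can be sketched in Python as a classification step (deriving `Adult_Elephant` from its definition) and a constraint check. This is only an illustration under simplifying assumptions: the `Individual` class and both function names are invented for the sketch, the African/Indian constraint is left out for brevity, and a real ontology reasoner would treat the constraints as axioms to infer from rather than as validation rules.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    name: str
    concepts: set = field(default_factory=set)   # asserted concept names
    age_years: int = 0
    weight_kg: float = 0.0


def classify(ind):
    """Component 1: derive Adult_Elephant membership from its definition."""
    derived = set(ind.concepts)
    if "Elephant" in derived and ind.age_years > 20:
        derived.add("Adult_Elephant")
    return derived


def violations(ind):
    """Component 2: check background constraints; one message per breach."""
    concepts = classify(ind)
    problems = []
    if "Adult_Elephant" in concepts and ind.weight_kg < 2000:
        problems.append("Adult_Elephant weighs less than 2,000 kg")
    if {"Herbivore", "Carnivore"} <= concepts:
        problems.append("Herbivore and Carnivore are disjoint")
    return problems


# Hypothetical individuals used only to exercise the checks.
dumbo = Individual("dumbo", {"Elephant", "Herbivore"}, age_years=25, weight_kg=1500)
odd = Individual("odd", {"Herbivore", "Carnivore"})
print(violations(dumbo))
print(violations(odd))
```

Nobody asserted that `dumbo` is an `Adult_Elephant`; the membership follows from the concept definition, and the weight constraint then applies to it.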



Ontology Elements


Concepts (classes) + their hierarchy


Concept properties (slots / attributes)


Property restrictions (type, cardinality, domain, etc.)


Relations between concepts (disjoint, equality, etc.)


Instances


E-R diagram / UML diagram ???


Note: “Property”, “Slot”, “Relation”, “Relationtype”, “Attribute” and “Semantic link type” are used roughly interchangeably
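
The elements listed on this slide can be put side by side in a few lines of Python. Everything named here (the concepts, the `has_mother` property, the individuals `clyde` and `ella`) is a hypothetical example, and plain dictionaries stand in for what an ontology language would express declaratively:

```python
# Concept hierarchy: child concept -> parent concept (hypothetical example).
SUBCLASS_OF = {"Elephant": "Herbivore", "Herbivore": "Animal", "Carnivore": "Animal"}

# Property restrictions: property -> (domain concept, maximum cardinality).
PROPERTIES = {"has_mother": ("Animal", 1)}

# Instances: individual -> (direct concept, {property: [values]}).
INSTANCES = {
    "clyde": ("Elephant", {"has_mother": ["ella"]}),
    "ella": ("Elephant", {}),
}


def ancestors(concept):
    """Return the concept plus all of its superclasses, nearest first."""
    chain = [concept]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def check(individual):
    """Return True if every property value respects its domain and cardinality."""
    concept, values = INSTANCES[individual]
    for prop, vals in values.items():
        domain, max_card = PROPERTIES[prop]
        if domain not in ancestors(concept) or len(vals) > max_card:
            return False
    return True


print(ancestors("Elephant"))
print(check("clyde"))
```

The domain check passes for `clyde` only because the hierarchy makes every `Elephant` an `Animal`, which is the kind of inference an E-R or UML diagram does not perform by itself.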


A Semantic Web: First Steps


Extend existing rendering markup with semantic markup


Metadata annotations that describe content/function of web accessible resources


Use Ontologies to provide vocabulary for annotations


Formal specification is accessible to machines


A prerequisite is a standard web ontology language


Need to agree on a common syntax before we can share semantics


Syntactic web based on standards such as HTTP and HTML

Make web resources more accessible to automated processes


Another Example: Automatic Assistant


Your own personal (digital) automatic assistant


knows about your preferences


builds up knowledge base using your past


can combine the local knowledge with remote services:


hotel reservations, airline preferences


dietary requirements


medical conditions


calendaring


etc


It communicates with remote information (i.e., on the Web!)




Example: Database Integration


Databases are very different in structure, in content


Lots of applications require managing several databases


after company mergers


combination of administrative data for e-Government


biochemical, genetic, pharmaceutical research


etc.


Most of these data are now on the Web


The semantics of the data(bases) should be known


how these semantics are mapped onto internal structures is immaterial


Example: Digital Libraries


It is a bit like the search example


It means catalogs on the Web


librarians have known how to do that for centuries


goal is to have this on the Web, World-wide


extend it to multimedia data, too


But it is more: software agents should also be librarians!


help you in finding the right publications


Example: Semantics of Web Services


Web services technology is great


But if services are ubiquitous, the issue of searching comes up, for example:


“find me the most elegant Schrödinger equation solver”


what does it mean to be


“elegant”?


“most elegant”?


mathematicians ask these questions all the time…


It is necessary to characterize the service


not only in terms of input and output parameters…


…but also in terms of its semantics


How Simple Ontologies Help


not as costly to build and, potentially more importantly, many are available


provide a controlled vocabulary


website organization and navigation support


support expectation setting (e.g. user interface)


“umbrella” structures from which to extend content (e.g., UNSPSC)


searching support


sense disambiguation support (e.g., terms belong to different categories)


Deborah McGuinness. Ontologies Come of Age. The Semantic Web: Why, What and How, MIT Press, 2001.


How Structured Ontologies Help


more structure => more power


consistency checking


completion (of unspecified attributes and relations)


interoperability support


validation and verification testing (can even encode entire test suites)


structured, comparative, and customized search


“intelligence” in application, e.g., system configuration support



Benefits of Semantic Web


Communication between people


Interoperability between software agents


Reuse of domain knowledge


Make domain knowledge explicit


Analyze domain knowledge



The Semantic Web is Not



“Artificial Intelligence on the Web”


although it uses elements of logic…


… it is much more down-to-Earth (we will see later)


it is all about properly representing and characterizing metadata


of course: AI systems may use the metadata of the SW


but it is a layer way above it


“A purely academic research topic”


SW is out of the university labs now


lots of applications exist already (see examples later)


big players of the industry use it (Sun, Adobe, HP, IBM,…)


of course, much is still to be done!


Building an ontology is not a goal in itself