Publishing to the Semantic Web

21 Oct 2013

Dr Owen Conlan

Dr Alexander O’Connor

The Long Road for Semantic Web

1992: The World Wide Web (HTTP, HTML)
1998/9: The Semantic Web Vision (RDF, Semantic Web Stack)
2003: Web 2.0 / The Social Web (Tagging, Crowd Sourcing)
2006: The Web of Data (Linked Data) (VoID, DBpedia)
????: Semantic Web (Reasoning, Logic, Rules, Trust)

Tim Berners-Lee driven


“Linked Data uses a small slice of the technologies that make up the Semantic Web”


Treat Schemas as Vocabularies


Reuse existing schemas

The Linked Data Movement


Community project with W3C support, started in early 2007

Idea: take existing (open) data sets and make them available on the Web in RDF

Interlink them with other data sets

Linking Open Data Project

A Pretty (Scary) Diagram

DBpedia


Transforming Wikipedia into a knowledge base

Structure from
   Infoboxes
   HTML (titles)
   Categories, Links
   other languages, redirects, disambiguations, etc.

Uses: as a controlled vocab, as an ontology

Check out: http://dbpedia.org/page/Trinity_College,_Dublin
(or Google “Trinity College Dublin dbpedia”)


Publish structured data in RDF on the web using URIs and shared vocabularies, rather than the traditional Semantic Web focus on ontologies and inference

Lowers barriers to entry

Fosters widespread adoption

Mature tools, techniques, patterns

Linking Data


Formulated by Tim Berners-Lee (2006):

1. Use URIs as names for things
2. Use HTTP URIs so that people/apps can look up these names
3. When someone/an app looks up a URI, provide useful information
4. Include links to other URIs so that they can discover more things

This is not an unambiguous specification, just a set of principles.

Linked Data Principles


The Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed, collaborative, hypermedia information systems. It is a generic, stateless, protocol which can be used for many tasks beyond its use for hypertext, such as name servers and distributed object management systems, through extension of its request methods, error codes and headers. A feature of HTTP is the typing and negotiation of data representation, allowing systems to be built independently of the data being transferred. [RFC2616]

http://what.is.http.org/ask?question
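
To make the "typing and negotiation of data representation" in the RFC 2616 definition a little more concrete, here is a minimal sketch that builds (but never sends) an HTTP request carrying an Accept header. The URI is borrowed from the example.uk examples elsewhere in these slides, and the q-values are an illustrative assumption rather than anything the slides prescribe.

```python
from urllib.request import Request

# Hypothetical resource URI, taken from the example.uk examples in these slides.
uri = "http://example.uk/people/dave-smith"

# The Accept header carries the "typing": the media types the client understands.
# The q-value (an illustrative choice) ranks RDF/XML above HTML.
accept = "application/rdf+xml, text/html;q=0.5"

request = Request(uri, headers={"Accept": accept}, method="GET")

# Nothing is sent over the network; we only inspect the request object.
print(request.get_method())          # the HTTP request method
print(request.get_header("Accept"))  # what a server would negotiate on
```

A server that supports content negotiation would use this header to decide which representation (RDF/XML or HTML) to return for the same URI.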


A Uniform Resource Identifier (URI) is a compact sequence of characters that identifies an abstract or physical resource [RFC3986]

Syntax: URI = scheme “:” hier-part [“?” query] [“#” fragment]

Example: [Figure: example URI]

Note: scheme not the same as protocol
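
The generic syntax above can be explored with Python's standard library. This is a small sketch, not a full RFC 3986 parser: `uri_components` is a hypothetical helper name, and the first URI is made up for illustration. The second URI (a `urn:` example from RFC 3986) shows why a scheme is not the same thing as a protocol: it has a scheme but no network protocol behind it.

```python
from urllib.parse import urlsplit


def uri_components(uri: str) -> dict:
    """Split a URI into the parts named in the generic syntax.

    hier-part is rebuilt as "//authority" + path when an authority is present,
    otherwise it is just the path.
    """
    parts = urlsplit(uri)
    authority = "//" + parts.netloc if parts.netloc else ""
    return {
        "scheme": parts.scheme,
        "hier-part": authority + parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


# Hypothetical HTTP URI with all four components.
http_uri = uri_components("http://example.uk/people?name=dave#dave-smith")
print(http_uri)

# A URN has a scheme ("urn") but is not tied to any transfer protocol.
urn = uri_components("urn:oasis:names:specification:docbook:dtd:xml:4.1.2")
print(urn["scheme"], urn["hier-part"])
```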

URIs


Linked data needs dereferenceable URIs (ones we can use HTTP to retrieve a description of that resource)

But we cannot serialise people or things over the internet (yet?) => we publish RDF documents on the web that describe them

A real-world object != a document about that object

e.g. creation-date for you != creation-date for your web-page

Identifying Linked Data Resources


URI that identifies a real-world object != URI that identifies a document about that object

Can make statements about the object and can make statements about the document describing it

How do we link these 2 URIs together?

Identifying Linked Data Resources


303 Redirect (e.g. http://example.uk/people/dave-smith)

Used for large, dynamic data sets

Flexible because redirection can be separately configured for each resource
   e.g. can store data in multiple files or DB. Can change this at deployment/run-time.

Typically used for resource descriptions in large data-sets

URI styles for Linked Data


Fragment (e.g. http://example.uk/people#dave-smith)

Used for small, static data sets

Reduced number of HTTP round-trips => reduced latency
   A single HTTP request retrieves the entire document
   May transmit unnecessary data across the web

Used for RDFa (defined via RDFa “about=” attribute)

Typically used for vocabulary definitions

URI styles for Linked Data

1. Create URIs for concept/thing and documents
   e.g. http://biglynx.co.uk/people/dave-smith (URI identifying the person Dave Smith)
   http://biglynx.co.uk/people/dave-smith.rdf (URI for RDF/XML document describing Dave Smith)
   http://biglynx.co.uk/people/dave-smith.html (URI for HTML document describing Dave Smith)

2. Use HTTP redirects/content negotiation to access the desired resource description for the specific user agent

   1. Client sends an HTTP GET request on a URI identifying an object
   2. Server recognizes the URI and answers with HTTP 303 (See Other), sending the URI of a description of the object
   3. Client sends an HTTP GET request on the new URI
   4. Server sends the document from the new URI

303 Redirect Approach
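
The four-step exchange can be sketched without a real web server. Everything below is a toy model under stated assumptions: `toy_server` and `dereference` are hypothetical names, the document bodies are placeholders, and the Accept header is matched by exact string, whereas a real server would parse media ranges and q-values.

```python
from typing import Dict, List, Tuple

# URI identifying the person (a non-information resource), from the slides.
THING = "http://biglynx.co.uk/people/dave-smith"

# Which description document each media type is redirected to.
DESCRIPTIONS: Dict[str, str] = {
    "application/rdf+xml": THING + ".rdf",
    "text/html": THING + ".html",
}

# Placeholder bodies standing in for the real documents.
DOCUMENTS: Dict[str, str] = {
    THING + ".rdf": "<rdf:RDF>...</rdf:RDF>",
    THING + ".html": "<html>...</html>",
}


def toy_server(uri: str, accept: str) -> Tuple[int, Dict[str, str], str]:
    """Toy stand-in for an HTTP server: returns (status, headers, body)."""
    if uri == THING:
        # Step 2: the thing itself cannot be sent, so answer 303 See Other
        # with the URI of a description chosen by content negotiation.
        location = DESCRIPTIONS.get(accept, DESCRIPTIONS["text/html"])
        return 303, {"Location": location}, ""
    if uri in DOCUMENTS:
        # Step 4: an ordinary document is returned directly.
        return 200, {}, DOCUMENTS[uri]
    return 404, {}, ""


def dereference(uri: str, accept: str, max_hops: int = 5) -> Tuple[int, List[str], str]:
    """Steps 1 and 3: issue GET requests, following 303 redirects."""
    visited = [uri]
    for _ in range(max_hops):
        status, headers, body = toy_server(uri, accept)
        if status != 303:
            return status, visited, body
        uri = headers["Location"]
        visited.append(uri)
    raise RuntimeError("too many redirects")


status, visited, body = dereference(THING, "application/rdf+xml")
print(status, visited)
```

Asking for `text/html` instead would end at the `.html` document, which is how one URI for the person can serve both browsers and Linked Data clients.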


The picture below shows how dereferencing an HTTP URI identifying a non-information resource plays together with content negotiation:

[Figure: dereferencing an HTTP URI for a non-information resource with content negotiation]


Simples…

Huh?

1. Assign a URI to the RDF document defining the concepts
   e.g. http://biglynx.co.uk/vocab/sme (document URI)

2. Assign fragment identifiers to concepts within the document
   e.g. http://biglynx.co.uk/vocab/sme#SmallMediumEnterprise
   http://biglynx.co.uk/vocab/sme#Team

3. Use HTTP requests to get the description
   1. Client truncates the fragment URI to just refer to the document
   2. Client sends an HTTP GET to request the document
   3. Server sends back the full document
   4. Linked data application now inspects the triples to find the fragment

Fragment Approach
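
The client side of the fragment approach can be sketched as follows. This is a toy under stated assumptions: the `WEB` dictionary stands in for real HTTP, `http_get` and `describe` are hypothetical helper names, and the triples (including the label text) are invented for illustration. Only the biglynx.co.uk URIs come from the slides.

```python
from urllib.parse import urldefrag

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
VOCAB = "http://biglynx.co.uk/vocab/sme"

# Toy "web": document URI -> (subject, predicate, object) triples it contains.
# The triples are made up for this sketch.
WEB = {
    VOCAB: [
        (VOCAB + "#SmallMediumEnterprise", RDF_TYPE, RDFS + "Class"),
        (VOCAB + "#SmallMediumEnterprise", RDFS + "label", "Small or Medium-sized Enterprise"),
        (VOCAB + "#Team", RDF_TYPE, RDFS + "Class"),
    ]
}


def http_get(document_uri):
    """Stand-in for steps 2 and 3: request the document, receive all of it."""
    return WEB[document_uri]


def describe(fragment_uri):
    # Step 1: the fragment never reaches the server, so strip it client-side.
    document_uri, _fragment = urldefrag(fragment_uri)
    # Steps 2 and 3: one request returns the whole vocabulary document.
    triples = http_get(document_uri)
    # Step 4: keep only the triples that describe the requested concept.
    return [t for t in triples if t[0] == fragment_uri]


team = describe(VOCAB + "#Team")
print(team)
```

Note that describing `#Team` still downloads the triples for every other concept in the document, which is the "may transmit unnecessary data" trade-off mentioned for fragment URIs.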


Class

<!-- http://www.pizza.com/ontologies/pizza.owl#ThinAndCrispyBase -->
<owl:Class rdf:about="&pizza;ThinAndCrispyBase">
    <rdfs:subClassOf rdf:resource="&pizza;PizzaBase"/>
</owl:Class>

Property

<!-- http://www.pizza.com/ontologies/pizza.owl#hasIngredient -->
<owl:ObjectProperty rdf:about="&pizza;hasIngredient">
    <rdf:type rdf:resource="&owl;TransitiveProperty"/>
    <owl:inverseOf rdf:resource="&pizza;isIngredientOf"/>
</owl:ObjectProperty>

Now we can refer to stuff

5 Steps to Publishing Linked Data

1. Understand the Principles
2. Understand your Data
3. Choose URIs for Things in your Data
4. Set up Your Infrastructure
5. Link to other Data Sets

A. Use URIs as names for things
   Anything, not just documents
   You are not your homepage
   Information resources (can be transmitted electronically) and non-information resources (cannot be transmitted electronically, e.g. a person!)

B. Use HTTP URIs
   Globally unique names, distributed ownership
   Allows people to look up those names

Step 1: Understanding the Principles

C. Provide useful information in RDF when someone looks up a URI
   We can include RDF triple statements!

D. Include RDF links to other URIs
   To enable discovery of related information, e.g. via “follow your nose” browsing
   Relationship Links: to add context
   Identity Links: for URI aliases in other sources
   Vocabulary Links: to enable self-description

Step 1: Understanding the Principles (cont.)
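
The three kinds of RDF link can be written down as plain triples. The sketch below serialises them as N-Triples lines using only string formatting. The predicates (`foaf:based_near`, `owl:sameAs`, `rdfs:subClassOf`) are real terms, but the pairing of each predicate with a link type is an illustrative choice, the example.org targets are hypothetical, and `to_ntriples` is a made-up helper that only handles URI objects, not literals.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
OWL = "http://www.w3.org/2002/07/owl#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

DAVE = "http://biglynx.co.uk/people/dave-smith"

links = [
    # Relationship link: adds context by pointing at a related thing elsewhere.
    (DAVE, FOAF + "based_near", "http://dbpedia.org/resource/Berlin"),
    # Identity link: states that another source's URI is an alias (hypothetical target).
    (DAVE, OWL + "sameAs", "http://example.org/staff/dave-smith"),
    # Vocabulary link: relates a local term to another vocabulary (hypothetical target).
    ("http://biglynx.co.uk/vocab/sme#SmallMediumEnterprise",
     RDFS + "subClassOf",
     "http://example.org/vocab#Company"),
]


def to_ntriples(triples):
    """Serialise URI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


output = to_ntriples(links)
print(output)
```

A client doing “follow your nose” browsing would dereference the object URI of any of these statements to discover more data.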


What are the key things in your data?


People


Places


Events


Books

Films

Musicians

This is why domain expertise is critically important

Step 2: Understand your Data


What vocabularies can be used to describe these?

Principles:
   Reuse, don’t reinvent
   Mix liberally

Examples:
   foaf -- Friend-of-a-Friend ontology
   geonames -- GeoNames ontology
   skos -- Simple Knowledge Organization System

ckan.net

Step 2: Understand your Data (cont.)

Step 2: Common Vocabularies


bibo -- Bibliographic ontology
cc -- Creative Commons ontology
damltime -- Time Zone ontology
doap -- Description of a Project ontology
event -- Event ontology
foaf -- Friend-of-a-Friend ontology
frbr -- Functional Requirements for Bibliographic Records
geo -- Geo wgs84 ontology
geonames -- GeoNames ontology
mo -- Music Ontology
opencyc -- OpenCyc knowledge base
owl -- Web Ontology Language
pim_contact -- PIM (personal information management) Contacts ontology
po -- Programmes Ontology (BBC)
rss -- RDF Site Summary (RSS 1.0) ontology
sioc -- Semantically-Interlinked Online Communities ontology
sioc_types -- SIOC extension
skos -- Simple Knowledge Organization System
umbel -- Upper Mapping and Binding Exchange Layer ontology
wordnet -- WordNet lexical ontology
yandex_foaf -- FOAF (Friend-of-a-Friend) Yandex extension ontology



Use HTTP URIs

Keep out of other people’s namespaces
   Create own URI and include alias information

Abstract away from implementation details:
   http://dbpedia.org/resource/Berlin
   Is better than this:
   http://www4.wiwisss.fu-berlin.de:2020/demos/dbpedia/cgi-bin/resource.php?id=/Berlin

Use Natural Keys within URIs:
   Need to ensure the uniqueness of URIs
   Useful to base them on some existing primary key
   Whenever possible, use a key that is meaningful within the domain of the data set, e.g. use the ISBN as part of the URI of a book

Step 3: Choosing URIs


Common patterns for URIs:
   http://dbpedia.org/resource/Berlin (Thing)
   http://dbpedia.org/data/Berlin (RDF)
   http://dbpedia.org/page/Berlin (HTML)

Or use the file name extension:
   http://biglynx.co.uk/people/dave-smith
   http://biglynx.co.uk/people/dave-smith.html
   http://biglynx.co.uk/people/dave-smith.rdf

Step 3: Choosing URIs (cont.)
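
As a rough illustration of the file-name-extension pattern combined with natural keys, the sketch below mints the three related URIs from an existing key. `mint_uris` is a hypothetical helper, the slug rules (lower-casing, spaces to hyphens) are an assumption rather than a standard, and the ISBN shown is a placeholder, not a real book.

```python
from urllib.parse import quote


def mint_uris(base: str, collection: str, natural_key: str) -> dict:
    """Build thing/RDF/HTML URIs from a natural key.

    The key is normalised (lower-case, spaces to hyphens) and percent-encoded
    so that it is safe to use as a single path segment.
    """
    slug = quote(natural_key.strip().lower().replace(" ", "-"), safe="-")
    thing = f"{base.rstrip('/')}/{collection}/{slug}"
    return {
        "thing": thing,          # identifies the real-world object
        "rdf": thing + ".rdf",   # RDF/XML document describing it
        "html": thing + ".html", # HTML document describing it
    }


person = mint_uris("http://biglynx.co.uk", "people", "Dave Smith")
print(person)

# Placeholder ISBN used as the natural key for a book (hypothetical base URI).
book = mint_uris("http://example.org/", "books", "978-0-00-000000-2")
print(book["thing"])
```

Keeping implementation details such as script names and ports out of `base` is what lets the publisher change the backend later without changing any URI.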


Describe the Data-set!
   e.g. dataset name, authorship, updates, licensing terms, crawler support, SPARQL endpoint location, ...
   Vocabulary of Interlinked Datasets (VoID)
   A little later…

Pick a Publication Pattern
   Is your input data: queryable, structured or text?
   What is the data volume?
   Is it static or dynamic?

Test it

Step 4: Set up Your Infrastructure

Step 4: Set up Your Infrastructure (cont.)


Popular predicates for linking
   owl:sameAs
   foaf:depiction
   foaf:homepage
   foaf:topic
   foaf:based_near
   foaf:maker / foaf:made
   foaf:page
   foaf:primaryTopic
   rdfs:seeAlso

Step 5: Linking
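
The predicates above are written as prefixed names. A small sketch of how they expand to full URIs follows. The three namespace URIs are the standard ones for OWL, FOAF and RDFS, while `expand` is a hypothetical helper name.

```python
# Standard namespace URIs for the prefixes used in the list of linking predicates.
PREFIXES = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(prefixed_name: str) -> str:
    """Expand a prefixed name such as 'foaf:homepage' to a full URI."""
    prefix, _, local = prefixed_name.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return PREFIXES[prefix] + local


for name in ["owl:sameAs", "foaf:primaryTopic", "rdfs:seeAlso"]:
    print(name, "->", expand(name))
```

Prefixes are matched case-sensitively here, which is also how RDF syntaxes treat them, so `Foaf:homepage` and `foaf:homepage` would not be interchangeable.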


VoID (from "Vocabulary of Interlinked Datasets") is an RDF-based schema to describe linked datasets

A dataset is a collection of data, published and maintained by a single provider, available as RDF, and accessible, for example, through dereferenceable HTTP URIs or a SPARQL endpoint

http://semanticweb.org/wiki/VoiD

Step 5: Linking (cont.)
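
A minimal sketch of what a VoID description might contain, emitted as N-Triples. The terms `void:Dataset`, `void:sparqlEndpoint` and `void:exampleResource` and the `dcterms:title` property are real, but the dataset URI, title and endpoint are hypothetical, `void_description` is a made-up helper, and the title is assumed to contain no characters that need escaping.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
VOID = "http://rdfs.org/ns/void#"
DCTERMS = "http://purl.org/dc/terms/"


def void_description(dataset_uri, title, sparql_endpoint, example_resource):
    """Return N-Triples lines describing a dataset with a few VoID terms.

    Assumes `title` contains no quotes, backslashes or newlines.
    """
    return "\n".join([
        f"<{dataset_uri}> <{RDF_TYPE}> <{VOID}Dataset> .",
        f'<{dataset_uri}> <{DCTERMS}title> "{title}" .',
        f"<{dataset_uri}> <{VOID}sparqlEndpoint> <{sparql_endpoint}> .",
        f"<{dataset_uri}> <{VOID}exampleResource> <{example_resource}> .",
    ])


# All four argument values below are hypothetical.
description = void_description(
    "http://biglynx.co.uk/datasets/people",
    "Big Lynx people",
    "http://biglynx.co.uk/sparql",
    "http://biglynx.co.uk/people/dave-smith",
)
print(description)
```

Publishing such a description alongside the data tells crawlers and other consumers where the dataset can be queried and what a typical resource looks like.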

1. Understand the Principles
2. Understand your Data
3. Choose URIs for Things in your Data
4. Set up Your Infrastructure
5. Link to other Data Sets

Thank you!

Owen.Conlan@scss.tcd.ie

1. http://linkeddata.org
2. Debugging Semantic Web sites with cURL, http://dowhatimean.net/2007/02/debugging-semantic-web-sites-with-curl
3. Linked Data Tutorial, http://www.slideshare.net/mediasemanticweb/linked-data-michael-hausenblas-2009-03-05
4. Linked Data Applications, M Hausenblas, DERI Technical Report 2009
5. Linked Data: Evolving the Web into a Global Data Space, Tom Heath, Christian Bizer, http://linkeddatabook.com

References