Semantic Web and Linked Data

Uploaded: 22 Oct 2013

By

Harsh Pareek 07005007
Raman Sharma 07005010
Sumit Somani 07005012
Shiv Shankar 07005026

Outline

• Motivation
• Semantic Web: History and development
• Linked Open Data
• Linked Open Data Technologies
• DBpedia: An example of LOD
• Accessing LOD

Motivation


• Limitations of NLP
• “In 2003, President of US ordered Iraq invasion. George believed it to be a great decision.”
• How do we know that “George” here refers to George Bush, and that he was then President of the US?
• We know it from world knowledge.
• The Semantic Web supplies this missing world knowledge and so helps in processing the language. In this case, it resolves the co-reference.

Motivation


• Query: “List all phones which have a battery life of 12 hours and cost less than Rs. 10000”
• This data may not be explicitly present on the web.
• But the information on the web is enough to answer this query. Lack of structured data is the bottleneck.
• We need to represent information about phones in a database-like format and run SQL-like queries over it.
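
A minimal sketch of what the phone query looks like once the data is structured. The table name, column names and the three phones are made up for illustration; only the query conditions come from the slide.

```python
import sqlite3

# In-memory database standing in for structured phone data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE phone (model TEXT, battery_hours INTEGER, price_rs INTEGER)"
)

# Hypothetical rows, not real products.
conn.executemany(
    "INSERT INTO phone VALUES (?, ?, ?)",
    [
        ("Phone A", 12, 8500),
        ("Phone B", 12, 14000),
        ("Phone C", 8, 6000),
    ],
)

# "Battery life of 12 hours and cost less than Rs. 10000"
rows = conn.execute(
    "SELECT model FROM phone "
    "WHERE battery_hours = 12 AND price_rs < 10000 "
    "ORDER BY model"
).fetchall()

matching_models = [row[0] for row in rows]
print(matching_models)
```

Once the facts sit in columns, the price comparison is a numeric test, so a page never has to contain the literal string "10000".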

Motivation


• Query: “List all phones which have a battery life of 12 hours and cost less than Rs. 10000”
• Many valid pages may not contain the word “10000”, but we should be able to infer that the price is < 10000.
• The Semantic Web could be used to build such query systems.

Motivation

Original doctor appointment: Thu 9:00-10:00 am

New constraint: Thu 9:30 am onwards

• If we store data on the web in a semantic format, a machine could detect the conflict and search for the doctor's other free slots or for similar doctors.
• Tim Berners-Lee calls this “optimization”. The Semantic Web could make the web smarter, usable by machines and more accurate.
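
A rough sketch of the conflict check a machine could run if both calendars were machine-readable. It assumes the new constraint means "busy from 9:30 am to the end of the day"; the list of free slots is hypothetical.

```python
from datetime import time


def overlaps(start_a, end_a, start_b, end_b):
    """Two half-open intervals overlap if each starts before the other ends."""
    return start_a < end_b and start_b < end_a


def find_alternative(free_slots, busy):
    """Return the first free slot that does not overlap the busy interval."""
    for start, end in free_slots:
        if not overlaps(start, end, *busy):
            return (start, end)
    return None


appointment = (time(9, 0), time(10, 0))   # Thu 9:00-10:00 am, from the slide
busy = (time(9, 30), time(23, 59))        # "9:30 am onwards" (assumed reading)

# Hypothetical free slots published by the doctor.
doctor_free_slots = [
    (time(9, 15), time(10, 15)),
    (time(8, 0), time(9, 0)),
]

has_conflict = overlaps(*appointment, *busy)
alternative = find_alternative(doctor_free_slots, busy)
print(has_conflict, alternative)
```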


Semantic Web

“The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation.” [1]

Courtesy: Berners-Lee, T., Hendler, J., Lassila, O. (2001). The Semantic Web. Scientific American 284(5):34-43

Pic Courtesy: wikipedia.org, www.mechanicsnationalbank.com


Semantic Web: History



Courtesy: http://novaspivack.typepad.com/nova_spivacks_weblog/2007/10/web-30----the-a.html

Ontology


• An ontology is a formal representation of knowledge as a set of concepts within a domain and the relationships among those concepts.
• We need ontologies for:
    • Sharing a common understanding of information
    • Reuse of domain knowledge
    • Making domain assumptions explicit

Ontology


Camera Ontology

Courtesy: Minsoo Kim, Minkoo Kim: Developing Protégé Plug-in: OWL Ontology Visualization using Social Network. JIPS 4(2): 61-66 (2008)

Ontology


Examples:

• WordNet
• FOAF (Friend of a Friend)
• Gene Ontology
• Geopolitical Ontology
• ThoughtTreasure ontology
• Cyc
• Jamendo
• Customer Complaint Ontology

Courtesy: http://en.wikipedia.org/wiki/Ontology_(information_science)#Examples_of_published_ontologies

From Ontology to Linked Data


• But ontologies are domain specific.
• To meet semantic search requirements, we have to use all ontologies together.
• How can we use all the available ontologies?
• The answer is to link all of them together, making a meta-ontology.

Courtesy: http://en.wikipedia.org/wiki/Ontology_(information_science)#Examples_of_published_ontologies

Linked Open Data


A way of linking these ontologies so as to

• encourage reuse
• reduce redundancy
• maximize inter-connectedness
• enable network effects to add value to data

Linked Open Data Technology (1/2)

• URI (Uniform Resource Identifier) -> The unique name by which something is referred to
• HTTP (Hypertext Transfer Protocol) -> Provides the basic access mechanism for lookup over the WWW
• RDF (Resource Description Framework) -> Data format to describe relationships among entities
• OWL (Web Ontology Language) -> Provides a common understanding of concepts, aiding in reasoning

Linked Open Data Technology (2/2)

• Use URIs as unique names for things
    • anything, not just web pages
    • all kinds of information resources
• Use HTTP URIs
    • provides globally unique names
    • allows using the existing web for lookup
• Encode useful information in RDF
    • when servicing a URI lookup
• Include RDF links to other URIs
    • enable discovery of related information
• Encode further information using OWL
    • enable reasoning about information across domains
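
A toy sketch of the "lookup, then follow links" idea behind these principles. A Python dict stands in for HTTP lookups, and every URI is a hypothetical example.org name, so this shows the shape of the mechanism rather than a real client.

```python
# Stand-in for the web: URI -> triples returned when that URI is looked up.
# Every URI here is hypothetical (example.org), not a real dataset.
EX = "http://example.org/"
WEB = {
    EX + "phone/x1": [
        (EX + "phone/x1", EX + "vocab#madeBy", EX + "company/acme"),
        (EX + "phone/x1", EX + "vocab#batteryHours", "12"),
    ],
    EX + "company/acme": [
        (EX + "company/acme", EX + "vocab#locatedIn", EX + "place/pune"),
    ],
}


def dereference(uri):
    """Pretend HTTP lookup: return the RDF triples served for a URI."""
    return WEB.get(uri, [])


def discover(start_uri, max_hops=2):
    """Follow RDF links outward from start_uri and collect the URIs reached."""
    seen = {start_uri}
    frontier = [start_uri]
    for _ in range(max_hops):
        next_frontier = []
        for uri in frontier:
            for _subject, _predicate, obj in dereference(uri):
                # Objects that are URIs are links; plain strings are literals.
                if obj.startswith("http://") and obj not in seen:
                    seen.add(obj)
                    next_frontier.append(obj)
        frontier = next_frontier
    return seen


print(sorted(discover(EX + "phone/x1")))
```

Starting from one name, the links in the returned triples lead to related resources, which is the "discovery of related information" bullet in miniature.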


RDF - OWL

<rdf:Description rdf:about="subject">
    <predicate rdf:resource="object" />
    <predicate>literal value</predicate>
</rdf:Description>

Courtesy: http://www.linkeddatatools.com/introducing-rdf


RDF - OWL

Courtesy: http://www.linkeddatatools.com/introducing-rdf-part-2

RDF - OWL: An Example (1/3)

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:feature="http://www.linkeddatatools.com/clothing-features#">

    <rdf:Description rdf:about="http://www.linkeddatatools.com/clothes#t-shirt">
        <feature:size>12</feature:size>
        <feature:color rdf:resource="http://www.linkeddatatools.com/colors#white" />
    </rdf:Description>

</rdf:RDF>

Courtesy: http://www.linkeddatatools.com/introducing-rdf-part-2
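
To make the subject / predicate / object reading concrete, here is a small sketch that turns the T-shirt description into triples using only the Python standard library. It handles just the subset of RDF/XML used on this slide (rdf:Description with rdf:about, and properties that are either literals or rdf:resource links). The `Triple` type and helper names are illustrative.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """\
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:feature="http://www.linkeddatatools.com/clothing-features#">
  <rdf:Description rdf:about="http://www.linkeddatatools.com/clothes#t-shirt">
    <feature:size>12</feature:size>
    <feature:color rdf:resource="http://www.linkeddatatools.com/colors#white"/>
  </rdf:Description>
</rdf:RDF>
"""


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str
    obj_is_literal: bool


def expand(tag):
    """Turn ElementTree's '{namespace}local' form into a plain URI string."""
    return tag[1:].replace("}", "", 1) if tag.startswith("{") else tag


def to_triples(rdf_xml):
    """Extract one triple per property element of each rdf:Description."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            resource = prop.get(f"{{{RDF_NS}}}resource")
            if resource is not None:
                # Object is another resource, named by its URI.
                triples.append(Triple(subject, expand(prop.tag), resource, False))
            else:
                # Object is a literal value taken from the element text.
                text = (prop.text or "").strip()
                triples.append(Triple(subject, expand(prop.tag), text, True))
    return triples


triples = to_triples(RDF_XML)
for t in triples:
    print(t)
```

A full RDF/XML parser covers many more constructs (typed nodes, blank nodes, datatypes); a dedicated RDF library would be the usual choice outside a sketch like this.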

RDF - OWL: An Example (2/3)

Courtesy: NeonTool

RDF - OWL: An Example (3/3)

<owl:Class rdf:ID="SpaceTimeThing">
    <rdfs:label xml:lang="en">things in our time and space</rdfs:label>
    <rdfs:comment xml:lang="en">A specialisation of #$SpatialThing and #$TemporalThing. A collection of things that physically exist in our universe.</rdfs:comment>
    <rdfs:subClassOf rdf:resource="#SpatialThing" />
    <rdfs:subClassOf rdf:resource="#TemporalThing" />
</owl:Class>

Courtesy: http://www.qrst.de/ontology/owl.xml
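
One kind of reasoning that rdfs:subClassOf enables is inheriting superclasses transitively. The sketch below encodes the two subClassOf statements from the slide as a dict and adds one hypothetical class, `Planet`, to show a two-step inference. It is a hand-rolled illustration, not an OWL reasoner.

```python
# Class -> direct superclasses.
SUBCLASS_OF = {
    # From the slide: SpaceTimeThing is a subclass of both of these.
    "SpaceTimeThing": {"SpatialThing", "TemporalThing"},
    # Hypothetical extra class, added only for illustration.
    "Planet": {"SpaceTimeThing"},
}


def superclasses(cls, subclass_of):
    """Return every class reachable from cls via subClassOf links."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_subclass(cls, ancestor, subclass_of):
    """True if cls equals ancestor or inherits from it, directly or not."""
    return cls == ancestor or ancestor in superclasses(cls, subclass_of)


print(sorted(superclasses("Planet", SUBCLASS_OF)))
print(is_subclass("Planet", "TemporalThing", SUBCLASS_OF))
```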

Pic Courtesy: http://www.pctechs.biz, www.thedoublethink.com


Critique

• Aka Semantic Modelling
• Requires human intelligence
• Difficult to be done by machines

Linked Open Data

Courtesy: http://linkeddata.org/

DBpedia


• Wikipedia contains structured information such as
    • "infobox" tables
    • categorisation information
    • images
    • geo-coordinates
    • links to external Web pages
• DBpedia lets us treat Wikipedia as a database which can be queried

Courtesy: http://en.wikipedia.org/wiki/DBpedia

Infobox

Courtesy: http://en.wikipedia.org/wiki/Sachin_Tendulkar

DBpedia


Contains:

• 3.4 million things
• Abstracts in up to 92 different languages
• 1,460,000 links to images
• 5,543,000 links to external web pages
• 4,887,000 external links into other RDF datasets
• 565,000 Wikipedia categories


How to access Linked Data

• Querying DBpedia
• Offline: Linked Open Data Crawl
• Billion Triple Challenge Dataset


SPARQL

PREFIX dbprop: <http://dbpedia.org/property/>
PREFIX db: <http://dbpedia.org/resource/>

SELECT ?who ?work ?genre WHERE {
    db:Tokyo_Mew_Mew dbprop:illustrator ?who .
    ?work dbprop:author ?who .
    OPTIONAL { ?work dbprop:genre ?genre } .
}

SPARQL

Courtesy: http://dbpedia.org/sparql
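
A sketch of how the Tokyo Mew Mew query could be sent to the endpoint from a program, using only the Python standard library. It follows the standard SPARQL protocol (a GET request with a `query` parameter, JSON results requested through the Accept header). Whether the endpoint responds today as it did when the deck was written is an assumption, so the network call is kept in a separate function that is not invoked here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

ENDPOINT = "http://dbpedia.org/sparql"  # endpoint named on the slide

QUERY = """\
PREFIX dbprop: <http://dbpedia.org/property/>
PREFIX db: <http://dbpedia.org/resource/>

SELECT ?who ?work ?genre WHERE {
    db:Tokyo_Mew_Mew dbprop:illustrator ?who .
    ?work dbprop:author ?who .
    OPTIONAL { ?work dbprop:genre ?genre } .
}
"""


def build_request(endpoint, query):
    """Build a SPARQL-protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def run_query(endpoint, query, timeout=30):
    """Send the query and return the list of result bindings (needs network)."""
    with urlopen(build_request(endpoint, query), timeout=timeout) as response:
        return json.load(response)["results"]["bindings"]


request = build_request(ENDPOINT, QUERY)
print(request.full_url)
```

Calling `run_query(ENDPOINT, QUERY)` would return one dict per result row, keyed by variable name (`who`, `work` and, where present, `genre`).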

Document Web vs Linked Data

Web of Linked Documents

• A global filesystem
• Human usage
• Primary objects: documents
• Links between documents
• Low degree of structure
• Implicit semantics of content and links

Web of Linked Data

• A global database
• Machine interpretation
• Primary objects: entities or things
• Links between entities
• High degree of structure
• Explicit semantics of content and links

Conclusion


• Imposing structure and standards on available information increases its usability and value.
• As the Semantic Web spreads, it would become priceless, allowing machines to analyze all the data on the Web: the content, links, and even transactions between people and computers.
• Searching over all of linked data is possible but, at the current stage, not effective. As the structure becomes larger and more widely accepted, it would become easier.
• Ontology creation still requires human intelligence.
    • But by the "bolstering human intelligence" definition of AI, we could win the battle.

References


• Berners-Lee, T., Hendler, J., Lassila, O. (2001). The Semantic Web. Scientific American 284(5):34-43
• Christian Bizer, Tom Heath, Tim Berners-Lee. Linked Data - The Story So Far. IJSWIS
• http://linkeddata.org/

Further Reading


• NLP and the Semantic Web: http://www.csc.villanova.edu/~nlp/pres1/presentation.pdf
• Proceedings of the NLP4SW conference: http://www.dcs.shef.ac.uk/~diana/courses/lrec-nlp-semweb-tutorial.html

Questions?

EXTRA

Falcon demo

Ontology Learning


• Semantic annotation
    • annotate in the texts all mentions of instances relating to concepts in the ontology
• Ontology learning
    • automatically derive an ontology from texts
• Ontology population
    • given an ontology, populate the concepts with instances derived automatically from a text

Ontology Learning: Hearst Patterns [1992]

• such NP as {NP,}* {or|and} NP
    • “such games as baseball and cricket”
• NP {,NP}* {,} {and|or} other NP
    • “rabbits and other animals”
    • But, “rabbits and other pets”
• NP {,} including {NP,}* {or|and} NP
    • “fruits including apples and pears”
• NP {,} especially {NP,}* {or|and} NP
    • “Europeans, especially Italians”
    • But, “US Presidents, especially democrats”
• Extended by newer systems such as KnowItAll
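
A simplified sketch of the patterns above as regular expressions. The big assumption is that a noun phrase (NP) is a single word; a real system would run a chunker or parser first. The function and variable names are illustrative.

```python
import re

# Simplifying assumption: a noun phrase (NP) is a single alphabetic word.
NP = r"[A-Za-z]+"
SEP = r"(?:\s*,\s*|\s+)"  # either a comma or plain whitespace
# "A", "A and B", "A, B and C", "A, B, or C"
NP_LIST = rf"{NP}(?:\s*,\s*{NP})*(?:{SEP}(?:and|or)\s+{NP})?"

PATTERNS = [
    # such NP as {NP,}* {or|and} NP
    re.compile(rf"\bsuch\s+(?P<hyper>{NP})\s+as\s+(?P<hypos>{NP_LIST})", re.I),
    # NP {,NP}* {,} {and|or} other NP
    re.compile(rf"\b(?P<hypos>{NP_LIST}){SEP}(?:and|or)\s+other\s+(?P<hyper>{NP})", re.I),
    # NP {,} including|especially {NP,}* {or|and} NP
    re.compile(rf"\b(?P<hyper>{NP}){SEP}(?:including|especially)\s+(?P<hypos>{NP_LIST})", re.I),
]

# Splits a matched NP list back into its individual items.
ITEM_SPLIT = re.compile(r"\s*,\s*(?:(?:and|or)\s+)?|\s+(?:and|or)\s+")


def extract_hyponyms(text):
    """Return (hyponym, hypernym) pairs found by the patterns, lowercased."""
    pairs = []
    for pattern in PATTERNS:
        for match in pattern.finditer(text):
            hypernym = match.group("hyper").lower()
            for item in ITEM_SPLIT.split(match.group("hypos")):
                if item:
                    pairs.append((item.lower(), hypernym))
    return pairs


for example in [
    "such games as baseball and cricket",
    "rabbits and other animals",
    "fruits including apples and pears",
    "Europeans, especially Italians",
]:
    print(example, "->", extract_hyponyms(example))
```

The same code would extract (rabbits, pets) from “rabbits and other pets”. The pattern alone cannot tell a stable is-a relation from a context-dependent one, which is the caveat the "But" examples point at.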

NLP for Semantic Web

So how does natural language processing fit in?

• The Semantic Web requires machine-interpretable semantics in order to process textual information on the internet.
• Natural language processing is vital to the success of the Semantic Web because it is the method of communication between humans and software agents.
• Parsing, knowledge representation, information extraction, disambiguation, term recognition and semantic analysis are used in many Semantic Web technologies.

NLP for Semantic Web


• Linked Open Data is mostly academic and volunteer work.
• Converting the current snapshot of the web to the Semantic Web requires effort and time.
• And this disregards the fact that the Web is growing at a very high rate.
• Semi-automated mechanisms using NLP techniques are required to keep up with the increasing content.

Semantic Web for NLP


• Entity disambiguation
• Word sense disambiguation using ontologies
• Adds context to information
• Allows using a richer lexicon
• Uses world knowledge
• E.g. “Senator Green gave the green light for the green bill in parliament”
• E.g. “Moses led the Jews to the banks of Jordan”

Semantic Web for NLP


• Question answering
• “Sir Edward Heath died from pneumonia”
    • Sir Edward Heath -> UK Prime Minister -> politician
    • Died from -> killed by
    • Pneumonia -> disease
• “Has a politician died of a lung disease?”
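
A small sketch of how the generalisation chains above let a stored fact answer the more abstract question. Two things are assumptions added for illustration: the intermediate step pneumonia -> lung disease (the slide goes straight to "disease"), and treating "died of" and "killed by" as paraphrases of "died from".

```python
# The single fact from the slide, as a (subject, relation, object) triple.
FACTS = [("Sir Edward Heath", "died from", "pneumonia")]

# Term -> more general term.
IS_A = {
    "Sir Edward Heath": "UK Prime Minister",
    "UK Prime Minister": "politician",
    "pneumonia": "lung disease",   # assumed intermediate step, not on the slide
    "lung disease": "disease",
}

# Paraphrases mapped onto the relation used in FACTS (partly assumed).
RELATION_SYNONYMS = {"died of": "died from", "killed by": "died from"}


def generalizations(term):
    """Return the term followed by each successively more general term."""
    chain = [term]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    return chain


def answer(subject_type, relation, object_type):
    """Return stored facts that are instances of the abstract question."""
    relation = RELATION_SYNONYMS.get(relation, relation)
    return [
        fact
        for fact in FACTS
        if fact[1] == relation
        and subject_type in generalizations(fact[0])
        and object_type in generalizations(fact[2])
    ]


# "Has a politician died of a lung disease?"
print(answer("politician", "died of", "lung disease"))
```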

Would Web Search + NLP win Jeopardy?

Source: Stephen Wolfram’s Blog (http://blog.stephenwolfram.com/2011/01/jeopardy-ibm-and-wolframalpha/)