Semantic Web - University of Wisconsin-Platteville


Semantic Web



Josh Fleck

Department of Computer Science

University of Wisconsin-Platteville

Platteville, WI 53818

fleckjo@uwplatt.edu



Abstract


The Semantic Web aims to help users find, share, and combine information more easily. It is a
system that uses formats such as RDF and OWL to capture the meaning of data and convert it into
something that a machine can easily process. With the Semantic Web, applications can more easily
analyze data and respond to more complex requests from users. Instead of the user having to find
information about a subject on multiple sites, an application using the Semantic Web would be
able to condense all of the relevant information on the Web for the user, saving them a lot of
time.






Introduction



The Semantic Web was created by the World Wide Web Consortium (W3C) to encourage the use of
common formats on the Web. It describes the properties of data and also the relationships
between data. It tries to organize and structure the current Web’s documents into a logical
group or web of data. According to the W3C, “The Semantic Web provides a common framework that
allows data to be shared and reused across application, enterprise, and community boundaries”
[1]. There is a lot of information on the Web, and the Semantic Web looks to combine that
information and give meaning to it instead of leaving it all separate.



What is meant by “semantic” in the Semantic Web is not that computers are going to understand
the meaning of anything, but that the logical pieces of meaning can be mechanically manipulated
by a machine to useful ends [9]. Many parts are needed to make the Semantic Web work, as can be
seen in the Semantic Web Stack in Figure 1. Rather than just creating a collection of datasets,
the Semantic Web creates relationships between data, called Linked Data.


Figure 1 shows the complexity of the Semantic Web. I will be covering the basics, including
URIs, XML, RDF, RDFS, querying, and ontologies.







Figure 1: The Semantic Web Stack [7]



Linked Data


Linked Data is what the Semantic Web is all about. It involves using the Web to create links
between data from different sources. This could be anything from linking databases from
different companies to consolidating information at a single company. According to Bizer,
Linked Data refers to data published on the Web in such a way that it is machine-readable, its
meaning is explicitly defined, it is linked to other external data sets, and it can in turn be
linked to from external data sets [6]. Linked Data allows applications to retrieve large amounts
of related data from the Web. To create Linked Data, common formats such as the Resource
Description Framework (RDF) and Uniform Resource Identifiers (URIs) must be used so that
applications can easily retrieve the related data.




RDF



RDF is a standard, commonly written in XML, that is used for describing information about
resources on the Web. According to W3C, RDF is intended for representing metadata about Web
resources, such as the title, author, and modification date of a Web page, copyright and
licensing information about a Web document, or the availability schedule for some shared
resource [1]. RDF is intended to be used by computer applications rather than read by users. By
putting information into RDF, applications can more easily find and process information that is
stored on the Web.







Figure 2: Triple Example


RDF is similar to an entity-relationship diagram, except that it is in code form. It puts the
information about a resource into subject-predicate-object expressions called triples. The
subject represents the resource, the predicate represents a characteristic of the resource and
expresses the connection between the subject and the object, and the object represents the
value of that characteristic. For example, “The sky has the color blue” is expressed in RDF as
a triple: a subject denoting “the sky”, a predicate denoting “has the color”, and an object
denoting “blue” [2]. Figure 2 above reinforces the triple structure.
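
The triple structure can also be sketched in a few lines of Python. This is only an illustration
and not an RDF library: the Triple type, the objects helper, and the example.org URIs are all
assumptions made for the example.

```python
from collections import namedtuple

# A triple is one subject-predicate-object statement.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Hypothetical URIs under example.org, used only for illustration.
EX = "http://example.org/"

graph = {
    Triple(EX + "sky", EX + "hasColor", "blue"),
    Triple(EX + "grass", EX + "hasColor", "green"),
}

def objects(graph, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return sorted(t.object for t in graph
                  if t.subject == subject and t.predicate == predicate)

sky_colors = objects(graph, EX + "sky", EX + "hasColor")
print(sky_colors)
```

Storing the statements as a set means that repeating the same triple adds nothing, which mirrors
the way an RDF graph is a set of statements.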



XML


According to W3C standards, the XML format is used to encode RDF. In an RDF/XML document there
are two types of XML nodes: 1) resource XML nodes and 2) property XML nodes. Resource XML nodes
are the subjects and objects of statements, and they usually are rdf:Description tags that have
an rdf:about attribute on them giving the URI of the resource they represent [9]. Property nodes
are the predicates of the triple statements.



Figure 3: Table Example [10]


For a table such as the one in Figure 3, all of the information has to be broken down into
triples in order to put it into RDF/XML. In that example, the title of the album would be the
subject; the values for artist, country, company, price, and year would be the objects; and each
predicate is the connection between the album and one of those characteristics. Figure 4 shows
how such a table would be stored in RDF/XML.





Figure 4: XML Example [20]

Using XML, the description of each album is stored in tags. For the first album, Bob Dylan is
stored as the artist, 1985 is stored as the year, and so on. Notice the rdf:Description
statement in the code, which represents a resource node. Stored this way, the properties of an
album are easy for a program to analyze and pull out for use.
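
To make that claim concrete, here is a minimal sketch of a program that pulls triples out of a
small RDF/XML document using only Python's standard library. The document below is not the one
in Figure 4; the cd namespace and the album URI are hypothetical, and the parser handles only
the simple case of rdf:Description nodes whose properties hold plain text values.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical document: the cd namespace and album URI are made up for illustration.
RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:cd="http://example.org/cd#">
  <rdf:Description rdf:about="http://example.org/cd/EmpireBurlesque">
    <cd:artist>Bob Dylan</cd:artist>
    <cd:year>1985</cd:year>
  </rdf:Description>
</rdf:RDF>"""

def extract_triples(rdf_xml):
    """Turn each property node under an rdf:Description into a triple."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for resource in root.findall("{%s}Description" % RDF_NS):
        subject = resource.get("{%s}about" % RDF_NS)
        for prop in resource:
            # ElementTree writes a tag as "{namespace}local"; joining the two
            # parts gives the predicate URI.
            namespace, local = prop.tag[1:].split("}")
            triples.append((subject, namespace + local, prop.text))
    return triples

triples = extract_triples(RDF_XML)
for triple in triples:
    print(triple)
```

A full RDF/XML parser has to deal with many more syntax forms, so a real application would more
likely rely on an existing RDF library than on a hand-written reader like this one.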



RDFS


RDF Schema (RDFS) is a set of classes with certain properties using the RDF extensible
knowledge representation language, providing basic elements for the description of ontologies,
otherwise called RDF vocabularies, intended to structure RDF resources. These resources can be
saved in a triplestore and reached with the query language SPARQL [7]. The core components of
RDFS are listed below, with descriptions taken from the W3C website [5].



Classes




rdfs:Resource: All things described by RDF are called resources, and are instances of the
class rdfs:Resource

rdfs:Class: The class of resources that are RDF classes

rdfs:Literal: The class of literal values such as strings and integers

rdfs:Datatype: The class of datatypes

rdf:XMLLiteral: The class of XML literal values

rdf:Property: The class of RDF properties


Properties




rdfs:range: Used to declare the class or datatype of the object in a triple

rdfs:domain: Used to declare the class of the subject in a triple

rdf:type: Used to state that a resource is an instance of a class

rdfs:subClassOf: Used to state that all instances of one class are also instances of another

rdfs:subPropertyOf: Used to state that all resources related by one property are also related
by another

rdfs:label: Used to provide a human-readable version of a resource’s name

rdfs:comment: Used to provide a human-readable description of a resource
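
The effect of rdfs:domain, rdfs:range, and rdfs:subClassOf can be sketched as a small inference
loop. This is an illustration under simplifying assumptions rather than a complete RDFS
reasoner: the ex: names are hypothetical, terms are written as short prefixed strings instead of
full URIs, and only three inference rules are applied.

```python
RDF_TYPE = "rdf:type"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"
SUBCLASS = "rdfs:subClassOf"

# Hypothetical ex: vocabulary, written as prefixed names for brevity.
graph = {
    ("ex:artist", DOMAIN, "ex:Album"),
    ("ex:artist", RANGE, "ex:Musician"),
    ("ex:Musician", SUBCLASS, "ex:Person"),
    ("ex:EmpireBurlesque", "ex:artist", "ex:BobDylan"),
}

def infer_types(graph):
    """Apply the domain, range, and subClassOf rules until nothing new appears."""
    inferred = set(graph)
    changed = True
    while changed:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                if p2 == DOMAIN and s2 == p:
                    new.add((s, RDF_TYPE, o2))   # subject is in the domain class
                if p2 == RANGE and s2 == p:
                    new.add((o, RDF_TYPE, o2))   # object is in the range class
                if p == RDF_TYPE and p2 == SUBCLASS and s2 == o:
                    new.add((s, RDF_TYPE, o2))   # instance of the superclass too
        changed = not new <= inferred
        inferred |= new
    return inferred

closure = infer_types(graph)
for triple in sorted(closure - graph):
    print(triple)
```

Nothing in the original data says that ex:BobDylan is a person; that statement follows only
from the schema, which is the kind of meaning RDFS adds on top of plain triples.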



URI


A URI is a string of characters that identifies a resource. It is used as an identifier so that
interactions with the resource can take place. A URI can be classified as a locator (URL), as a
name (URN), or as both. As their names suggest, a name provides a resource’s identity, while a
locator provides an address for getting to that resource. In the Semantic Web, a URI identifies
the resource that the data describes.


URIs should not be confused with URLs. Although a URI can be a URL, not all URIs are locations.
Even though URIs may look like web addresses, there may not be a website at that address.
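
The difference between a name and a locator can be seen by looking at the scheme of a URI. The
sketch below uses Python's standard urllib.parse module; the classify_uri helper, its short list
of locator schemes, and the example identifiers are assumptions made for illustration only.

```python
from urllib.parse import urlparse

def classify_uri(uri):
    """Roughly label a URI as a name (URN) or a locator (URL) by its scheme."""
    scheme = urlparse(uri).scheme.lower()
    if scheme == "urn":
        return "name"
    if scheme in ("http", "https", "ftp"):
        return "locator"
    return "unknown"

# Hypothetical identifiers; nothing needs to exist at the http address.
book = classify_uri("urn:isbn:0451450523")
album = classify_uri("http://example.org/cd/EmpireBurlesque")
print(book, album)
```

The function never contacts the network. An http URI is labeled a locator because of its form,
even if no page is served at that address.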



Ontology Languages


Developers who want to define the terms, data, and relationships of their data use an ontology.
In computer science, an ontology is a model for describing the world that includes types,
properties, and relationship types [4]. In other words, an ontology is a set of definitions,
created by developers, that can vary in complexity. The Web Ontology Language (OWL) is the most
common standard built on top of RDF for defining ontologies. OWL is a W3C standard that is
commonly written in XML. It is used to define and process the meanings of data and
relationships on the Web.


In order to use OWL, an ontology must first be created. This is done by defining classes and
properties and providing information about them. Listed below are the common components of
ontologies, with short explanations of each [4].






Individuals: instances or objects (the basic or "ground level" objects)

Classes: sets, collections, concepts, classes in programming, types of objects, or kinds of
things

Attributes: aspects, properties, features, characteristics, or parameters that objects (and
classes) can have

Relations: ways in which classes and individuals can be related to one another

Function terms: complex structures formed from certain relations that can be used in place of
an individual term in a statement

Restrictions: formally stated descriptions of what must be true in order for some assertion to
be accepted as input

Rules: statements in the form of an if-then (antecedent-consequent) sentence that describe the
logical inferences that can be drawn from an assertion in a particular form

Axioms: assertions (including rules) in a logical form that together comprise the overall
theory that the ontology describes in its domain of application. This definition differs from
that of "axioms" in generative grammar and formal logic. In those disciplines, axioms include
only statements asserted as a priori knowledge. As used here, "axioms" also include the theory
derived from axiomatic statements

Events: the changing of attributes or relations
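
A few of these components can be sketched with plain Python data structures. This is not OWL,
and every name in it is hypothetical; it only shows how classes, individuals, attributes,
relations, and one if-then rule fit together.

```python
# Classes and their parent classes (hypothetical names).
classes = {"Album": None, "Person": None, "Musician": "Person"}

# Individuals, each an instance of one class, with attributes.
individuals = {
    "EmpireBurlesque": {"class": "Album", "year": 1985},
    "BobDylan": {"class": "Musician"},
}

# Relations between individuals, stored as (subject, relation, object).
relations = {("EmpireBurlesque", "hasArtist", "BobDylan")}

def is_a(individual, cls):
    """True if the individual's class is cls or a descendant of cls."""
    current = individuals[individual]["class"]
    while current is not None:
        if current == cls:
            return True
        current = classes[current]
    return False

def apply_rule(relations):
    """Rule: if X hasArtist Y, then Y isArtistOf X."""
    derived = {(o, "isArtistOf", s) for s, r, o in relations if r == "hasArtist"}
    return relations | derived

all_relations = apply_rule(relations)
print(is_a("BobDylan", "Person"), sorted(all_relations))
```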



Query Languages


SPARQL is a query language designed for RDF. Its queries contain a set of triple patterns called
a basic graph pattern. Triple patterns are like RDF triples except that each of the subject,
predicate, and object may be a variable. A basic graph pattern matches a subgraph of the RDF
data when the RDF terms from that subgraph can be substituted for the variables [8]. An example
of a simple query to find the title of a book is shown in Figure 5 below.



Figure 5: SPARQL example [8]
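
The matching of a basic graph pattern against RDF data can be sketched in Python. This is a toy
matcher and not a SPARQL engine: the function names are assumptions, and the data and query are
modeled on the introductory book-title example in the SPARQL specification [8].

```python
def is_variable(term):
    """SPARQL variables start with '?'."""
    return term.startswith("?")

def match_pattern(pattern, triple, binding):
    """Extend binding so pattern equals triple, or return None on mismatch."""
    binding = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_variable(p_term):
            if binding.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return binding

def match_bgp(patterns, graph):
    """Return every variable binding that satisfies all triple patterns."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [extended
                     for binding in solutions
                     for triple in graph
                     for extended in [match_pattern(pattern, triple, binding)]
                     if extended is not None]
    return solutions

# Roughly equivalent to:
#   SELECT ?title WHERE { <http://example.org/book/book1> dc:title ?title . }
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
graph = [("http://example.org/book/book1", DC_TITLE, "SPARQL Tutorial")]
patterns = [("http://example.org/book/book1", DC_TITLE, "?title")]

titles = [solution["?title"] for solution in match_bgp(patterns, graph)]
print(titles)
```

Each solution is a mapping from variable names to RDF terms. When a variable appears in more
than one pattern, it must take the same value in all of them, which is how patterns are joined.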





How Is RDF Used?


RDF could be used in a situation where there are multiple products and multiple reviewers of
those products, and each vendor has its own database. No vendor is going to want to put all of
its competitors’ products and reviews into its own database, so a user would either have to look
at all of the sites individually or rely on RDF to bring the information together. If all of the
vendors were using RDF, the user could simply search for the product they are looking for, and a
list of products from the different vendors could come up for comparison. None of the vendors
would be responsible for naming conventions or data formats, because the meaning is stored in
RDF.


An application could then be written to retrieve all of this information and present it to
customers. Rather than having the customer go through each vendor’s site or application to view
its products, one application or website could present all of the information together. Although
such an application would still have to be written, it would be a lot easier than keeping a
central database and constantly updating it from all of the vendors.
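
A minimal sketch of such an application is shown below. It assumes that two vendors publish
triples about the same product URI; the vendors, URI, property names, and values are all
hypothetical, and merging is done as a simple set union of triples.

```python
# Hypothetical vendor data; every URI and value here is made up.
PRODUCT = "http://example.org/product/bike-42"

vendor_a = {
    (PRODUCT, "ex:name", "Trail Bike"),
    (PRODUCT, "ex:review", "Great value"),
}
vendor_b = {
    (PRODUCT, "ex:color", "red"),
    (PRODUCT, "ex:review", "Sturdy frame"),
}

def merge(*graphs):
    """Combine graphs by taking the set union of their triples."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged

def describe(graph, subject):
    """Group every statement about one subject by predicate."""
    description = {}
    for s, p, o in graph:
        if s == subject:
            description.setdefault(p, []).append(o)
    return {p: sorted(values) for p, values in description.items()}

combined = describe(merge(vendor_a, vendor_b), PRODUCT)
print(combined)
```

Because both vendors use the same URI for the product, their statements line up without any
shared database schema, which is the point of the scenario above.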



Future of the Web


The Semantic Web is important for the future of the Web because it gives information on the Web
meaning, which will make it easier for computers to process and integrate information on the
Internet without needing a human to gather all of the data. Automated software will be able to
store and share information gathered throughout the Web, which will lead to greater efficiency
for users. This happens by gathering information from different sources, combining it, and then
presenting it to users in a meaningful way. Some examples of the data that could be presented
are car prices from different sellers, information on medicines, plane schedules, dates of
events, and computer updates.



Currently, tagging systems are used on some websites, such as YouTube, which allow the user to
search for a term and then bring up all of the videos with relevant tags. The problem with this
is that those tags only work within that website and not across all of the other websites on
the Internet that may use tags.


Most of the Web is currently in HTML, which has several limitations. It cannot attach meaning to
any of the information that it holds. For example, if you had a website that sells items such as
bikes, with details including price, brand, and color, there is no way to show that there is a
relationship between one of the details of the bike and the bike itself. In HTML you can put the
details together on the screen, but it will be hard for a machine to pull out that information
without a program written specifically to parse that website. HTML describes documents and the
links between them, while RDF and OWL can describe the meaning inside a document.





Disadvantages of Semantic Web


The biggest problem with the Semantic Web is that most of the current Web is not compatible with
it. Most of the Web is in HTML, while the Semantic Web uses RDF on top of XML. Also, RDF is not
an easy language for developers to understand. It has a steep learning curve because it was
developed by people with academic backgrounds in logic and artificial intelligence [10].



Another problem is that creating content could be more time-consuming, because the developer
would need to produce two different formats for all of their data: one for users to see and
another for computers. This issue could be avoided if an application generated the
computer-readable format automatically from the format that is meant for users [7].


Censorship could also be a potential problem. Currently, text-analyzing methods can be bypassed
by using different words or images. Because the Semantic Web stores meaning, an application
could block content much more easily, which would make censorship a lot easier [7].



Real-Life Applications of Semantic Web

Although the Semantic Web is not yet widespread, its use is growing. One example is a digital
music archive in Norway. Semantic Web technology is primarily used for enclosing the enormous
amounts of metadata on music tracks available within the archives so that a larger amount of the
content will be used in broadcasting, potentially providing the broadcaster with an advantage
over the competition by being better informed and more interesting [11].

Another example was developed at the University of Texas Health Science Center in Houston to
try to respond to health problems more quickly. The system is called SAPPHIRE (Situational
Awareness and Preparedness for Public Health Incidences using Reasoning Engines), and it
integrates data from health care providers, hospitals, and literature. SAPPHIRE receives reports
every 10 minutes on emergency room cases, patients’ symptoms, health records, and clinicians’
notes from other hospitals in the area. It is successful because it can pull information from
many different places, and because of its success there is demand for more Semantic Web
integration in health care [3].



Conclusion


In conclusion, the Semantic Web aims to turn the current unstructured data on the Web into a web
of data. It can help users find, share, and combine information more easily by storing the
meaning of the data. Instead of making users go to multiple sites, the Semantic Web allows an
application developer to easily pull data from multiple sites that use the RDF format. With a
web of data, information on the Web would be condensed and the Web would be much more efficient
for everyone.





References

1. Christian Bizer, T. H.-L. (2009). Linked Data - The Story So Far. Retrieved March 25, 2012,
from Tom Heath: http://tomheath.com/papers/bizer-heath-berners-lee-ijswis-linked-data.pdf

2. Lee Feigenbaum, I. H. (2009, January 19). The Semantic Web in Action. Retrieved March 25,
2012, from Scientific American:
http://people.cs.kuleuven.be/~danny.deschreye/SemanticWebAction.pdf

3. Mirhaji, P. (2007, March). Semantic Web Use Cases and Case Studies. Retrieved March 25, 2012,
from W3C: www.w3.org/2001/sw/sweo/public/UseCases/UniTexas

4. Ontology (information science) - Wikipedia, the free encyclopedia. (n.d.). Retrieved April
15, 2012, from Wikipedia, the free encyclopedia:
en.wikipedia.org/wiki/Ontology_(information_science)

5. RDF Schema. (n.d.). Retrieved April 15, 2012, from W3C:
www.w3.org/TR/rdf-schema/#ch_properties

6. Semantic Web. (n.d.). Retrieved March 25, 2012, from W3C:
http://www.w3.org/standards/semanticweb/

7. Semantic Web - Wikipedia, the free encyclopedia. (2012). Retrieved March 8, 2012, from
Wikipedia, the free encyclopedia: http://en.wikipedia.org/wiki/Semantic_Web

8. SPARQL Query Language for RDF. (2008, January 15). Retrieved March 25, 2012, from W3C:
http://www.w3.org/TR/rdf-sparql-query/

9. Tauberer, J. (2008, January). What is RDF and what is it good for? Retrieved March 25, 2012,
from Resource Description Framework: http://www.rdfabout.com/intro/

10. The Semantic Web. (n.d.). Retrieved March 25, 2012, from w3schools.com:
www.w3schools.com/web/web_semantic.asp

11. Tønnesen, D. R. (2007, September). Case Study: A Digital Music Archive (DMA) for the
Norwegian National Broadcaster (NRK) using Semantic Web techniques. Retrieved March 25, 2012,
from W3C: http://www.w3.org/2001/sw/sweo/public/UseCases/NRK/