Intro to Semantic Web

Oct 22, 2013

Intro to Semantic Web Programming

From Semantic Web Programming by J. Hebeler et al.

Semantic Web


Represents data in formats amenable to automated processing, integration, and reasoning.

Data is king, and it provides even greater value when connected with other data sources to create a linked data web.

Semantic Web standards include RDF, OWL, and SPARQL.

The Linking Open Data Initiative (3/09 and 2/10):

http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html

http://blog.ted.com/2010/03/08/the_year_open_d/

2008 Article: http://tomheath.com/papers/bizer-heath-berners-lee-ijswis-linked-data.pdf



Introduction


Semantics offer the leverage to make more information better rather than overwhelmingly worse.

Offers a new approach to extremely tough but lucrative challenges that employ vast amounts of information and services.

Awareness of the Semantic Web is required for any solution that depends on dynamic information and service resources.

Part 1: Chapters 1 and 2: “Introducing Semantic Web Programming”

Chapter 2 implements “Hello Semantic Web World”.

Two main areas drive a Semantic Web App: Knowledge Representation and Application Integration.


Introduction (Continued)

Part 2: Chapters 3 to 7: “Foundations of Semantic Web Programming” details Knowledge Representation

Ch. 3: “Modeling Information” shows data modeling with RDF.

Ch. 4: “Incorporating Semantics” creates an Ontology knowledge model with RDFS and OWL.

Ch. 5: “Modeling Knowledge in the Real World” exercises an ontology via application frameworks and reasoners, such as Pellet.

Ch. 6: “Discovering Information” extracts useful information from a knowledge model with SPARQL.

Ch. 7: “Adding Rules” shows the use of SWRL, a W3C Member Submission.

Introduction (Continued)

Part 3: Chapters 8 to 11: “Building Semantic Web Applications”

Integrates the knowledge base with an application that acts upon it.

Provides a solid programming base for the Semantic Web.

Ch. 8: “Applying a Programming Framework” explores the Jena Semantic Web framework.

Ch. 9: “Combining Information” ports info from sources such as relational databases, web services, and other formats.

Ch. 10: “Aligning Information” aligns data with ontological concepts to unify disparate information.

Ch. 11: “Sharing Information” outputs the info in many formats, viz., RDFa, microformats, SPARQL endpoints, etc.

Uses the FriendTracker App to directly show the programming concepts.

Introduction (Continued)


Part 4: Chapters 12 to 15: “Expanding Semantic Web Programming”

Ch. 12: “Developing and Using Semantic Services” adds semantics to services to allow them to participate in the Semantic Web.

Ch. 13: “Managing Space and Time” adds space and time considerations to the knowledge representation.

Ch. 14: “Applying Patterns and Best Practices” shows architectural patterns for constructing various Semantic Web Apps.

Ch. 15: “Moving Forward” discusses the future; focuses on four critical evolving areas: ontology management, advanced integration and distribution, advanced reasoning, and visualization.


Website: http://semwebprogramming.org




Chapter 1: Preparing to Program a Semantic Web of Data


Make usable sense out of large, distributed information found throughout the WWW.

Objectives of the chapter:

Form a useful, pragmatic definition of the Semantic Web

Identify the major components and relate them

Outline how the Semantic Web impacts Apps

Discuss myths and hype

Explore the origin and foundation of the Semantic Web

Find real-world applications

Semantic Web Programming: Chapter 1 Map


Semantic Web (SW)


Origin


Foundation


DL and Graph Theory


Components


SW


Statements, Ontology, Instance, and Language


SW Tools


Frameworks, IDE, Reasoner, and KB


Features


Expressiveness, Inference, Integration, and Unique



Programming


Examples


Impacts


Data centric, Sharing Data, Dynamic Data, Expressive Data


Roadblocks


Myths and Hype

Definition: Semantic Web

A web of data described and linked in ways to establish context or semantics that adhere to defined grammar and language constructs.

No formal standard for such programmed semantics; also, aggregation, sharing, and validation are not easy. E.g., Building

SW addresses semantics through standardized connections to related info by labeling data uniquely and addressably.

Figure 1-2: Isolation vs. the Semantic Web


Feature | WWW | Semantic Web
Fundamental Component | Unstructured Content | Formal Statements
Primary Audience | Humans | Applications
Links | Indicate Location | Indicate Location and Meaning
Primary Vocabulary | Formatting Instructions | Semantics and Logic
Logic | Informal/nonstandard | Description Logic

Semantic Web

SW statements allow the definition and organization of info to form rich expressions, simplify integration and sharing, enable inference, and allow meaningful information extraction, while the info remains distributed, dynamic, and diverse.

In summary, SW improves your App’s ability to effectively utilize large amounts of diverse info on the scale of the WWW.

SW Relationships


Include definitions, associations, aggregations, and restrictions

Figure 1-3: Example Graph

Your own “good” personal secretary? Family, Friends, Associates, Suppliers, Distributors, Employees, Bank, ..

Establish both concepts (e.g., a Person has a birth date) and instances (e.g., John is a friend of Bill)

The former defines an ontology; the latter, instance data

Statements can be asserted or inferred; the former are created directly, while the latter need a reasoner to infer additional statements logically (dashed lines)
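To make the asserted/inferred distinction concrete, here is a minimal Python sketch. It is not taken from the book: the plain-string names and the assumption that `friendOf` is symmetric are illustrative only, and a real reasoner does far more than this.

```python
# Each statement is a (subject, predicate, object) triple.
# These three statements are asserted, i.e. created directly.
asserted = {
    ("John", "friendOf", "Bill"),
    ("John", "type", "Person"),
    ("Bill", "type", "Person"),
}

# Assumption for this sketch: friendOf is a symmetric relationship.
SYMMETRIC = {"friendOf"}


def infer(statements):
    """Return statements implied by symmetry that were not asserted."""
    implied = {(o, p, s) for (s, p, o) in statements if p in SYMMETRIC}
    return implied - statements


inferred = infer(asserted)
print(inferred)
```

The inferred set corresponds to the dashed lines in the figure: nobody wrote those statements down, yet they follow logically from the asserted ones.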

SW


SW statements employ a SW vocabulary and language to identify different types of statements and relationships.

SW offers several languages; the choice balances the user’s needs for performance, integration, and expressiveness.

The statements come in two forms: knowledge bases and files. KBs offer dynamic, extensible storage like a relational DB; files contain static statements.

Example: http://www.geonames.org/ontology

http://en.wikipedia.org/wiki/GeoNames

Can enhance existing data sources (relational DB, web page, web service) and applications (standalone desktop App, mission-critical enterprise App, and large-scale web app/svc).

The FOAF Project: Pg. 29-30

Source: http://www.geonames.org/ontology/ontology_v2.2.1.rdf

<?xml version="1.0" encoding="UTF-8" ?>
<rdf:RDF xml:base="http://www.geonames.org/ontology"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#"
    xmlns:gn="http://www.geonames.org/ontology#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">


Same: in Turtle Format


Links to Discuss


GeoNames Feature Codes: http://www.geonames.org/statistics/total.html

Linked data: http://linkeddata.org/

Semantic Web Blog: http://www.geospatialsemanticweb.com/2006/10/14/geonames-ontology-in-owl

Web Services Overview: http://www.geonames.org/export/ws-overview.html

Collective Intelligence: Algorithms of the Intelligent Web, Marmanis and Babenko, 2009 (Thanks to Luis Atencio)



Source: http://www.geonames.org/ontology/ontology_v2.2.1.rdf

<gn:Code rdf:about="http://www.geonames.org/ontology#T.RK">
    <skos:definition xml:lang="en">a conspicuous, isolated rocky mass</skos:definition>
    <skos:inScheme rdf:resource="http://www.geonames.org/ontology#T" />
    <skos:prefLabel xml:lang="en">rock</skos:prefLabel>
</gn:Code>
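As a rough illustration of how an RDF/XML element like the one above maps onto subject-predicate-object statements, the following Python sketch reads it with the standard library's `xml.etree.ElementTree`. It is not a full RDF/XML parser: it assumes the simple "one described resource with flat child properties" shape shown here, and it wraps the element in an `rdf:RDF` root carrying only the namespaces the snippet needs.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The gn:Code element from the GeoNames ontology, wrapped in a minimal root.
DOC = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#"
    xmlns:gn="http://www.geonames.org/ontology#">
  <gn:Code rdf:about="http://www.geonames.org/ontology#T.RK">
    <skos:definition xml:lang="en">a conspicuous, isolated rocky mass</skos:definition>
    <skos:inScheme rdf:resource="http://www.geonames.org/ontology#T" />
    <skos:prefLabel xml:lang="en">rock</skos:prefLabel>
  </gn:Code>
</rdf:RDF>"""


def to_triples(xml_text):
    """Turn each child property of each described resource into a triple."""
    triples = []
    for node in ET.fromstring(xml_text):
        subject = node.get(f"{{{RDF}}}about")
        for prop in node:
            # ElementTree writes tags as "{namespace}local"; join them into a URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            # The object is either a linked resource or a literal value.
            obj = prop.get(f"{{{RDF}}}resource") or (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples


triples = to_triples(DOC)
for triple in triples:
    print(triple)
```

Each printed tuple is one statement about the feature code `T.RK`; the sketch ignores the implied type statement and language tags, which a real RDF library would keep.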


Source: http://www.geonames.org/ontology/ontology_v2.2.1.rdf

<gn:Code rdf:about="http://www.geonames.org/ontology#S.HTL">
    <skos:definition xml:lang="en">a building providing lodging and/or meals for the public</skos:definition>
    <skos:inScheme rdf:resource="http://www.geonames.org/ontology#S" />
    <skos:prefLabel xml:lang="en">hotel</skos:prefLabel>
</gn:Code>


Source: http://www.geonames.org/ontology/ontology_v2.2.1.rdf

<gn:Class rdf:about="http://www.geonames.org/ontology#S">
    <rdfs:comment xml:lang="en">spot, building, farm, ...</rdfs:comment>
</gn:Class>

<gn:Class rdf:about="http://www.geonames.org/ontology#T">
    <rdfs:comment xml:lang="en">mountain, hill, rock, ...</rdfs:comment>
</gn:Class>


Source: http://en.wikipedia.org/wiki/GeoNames

GeoNames is a geographical database available and accessible through various Web services, under a Creative Commons attribution license.

Database and web services

The GeoNames database contains over 10,000,000 geographical names corresponding to over 7,500,000 unique features. [1] All features are categorized into one of nine feature classes and further subcategorized into one of 645 feature codes. Beyond names of places in various languages, data stored include latitude, longitude, elevation, population, administrative subdivision, and postal codes. All coordinates use the World Geodetic System 1984 (WGS84).

Those data are accessible free of charge through a number of Web services and a daily database export. [2] The Web services include direct and reverse geocoding, finding places through postal codes, finding places next to a given place, and finding Wikipedia articles about neighbouring places.

Wiki interface

The core of the GeoNames database is provided by official public sources, the quality of which may vary. Through a wiki interface, users are invited to manually edit and improve the database by adding or correcting names, moving existing features, adding new features, etc.

Semantic Web integration

Each GeoNames feature is represented as a Web resource identified by a stable URI. This URI provides access, through content negotiation, either to the HTML wiki page or to an RDF description of the feature, using elements of the GeoNames ontology. [3] This ontology describes the GeoNames feature properties using the Web Ontology Language, the feature classes and codes being described in the SKOS language. Through Wikipedia article URLs linked in the RDF descriptions, GeoNames data are linked to DBpedia data and other RDF Linked Data.
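Content negotiation, as described above, means the client states which representation it wants through the HTTP `Accept` header. The Python sketch below only builds such a request and does not send it. The feature URI is a hypothetical placeholder rather than a real GeoNames identifier, and `application/rdf+xml` is the standard media type for RDF/XML.

```python
from urllib.request import Request

# Hypothetical placeholder URI; substitute the stable URI of a real feature.
FEATURE_URI = "http://example.org/feature/123/"


def rdf_request(feature_uri):
    """Build a request that asks for the RDF description, not the HTML page."""
    return Request(feature_uri, headers={"Accept": "application/rdf+xml"})


req = rdf_request(FEATURE_URI)
print(req.full_url, req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual fetch; a browser requesting the same URI with an HTML `Accept` header would be served the wiki page instead.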

Comparison of Relational DB and KB

Feature | Relational Database | Knowledge Base
Structure | Schema | Ontology Statements
Data | Rows | Instance Statements
Admin Language | DDL | Ontology Statements
Query Language | SQL | SPARQL
Relationships | Foreign Keys | Multidimensional
Logic | External to DB, with triggers | Formal Logic Statements
Uniqueness | Key for table | URI
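One way to read the comparison above is to picture a single relational row re-expressed as instance statements. The following Python sketch does that under stated assumptions: the `http://example.org/` base URI, the table name, and the column names are all hypothetical, and real mapping tools handle types, foreign keys, and vocabularies far more carefully.

```python
BASE = "http://example.org/"  # hypothetical namespace for this sketch


def row_to_triples(table, key_column, row):
    """Map one row to statements: the table key becomes part of the subject URI."""
    subject = f"{BASE}{table}/{row[key_column]}"
    return [
        (subject, f"{BASE}{table}#{column}", value)
        for column, value in row.items()
        if column != key_column and value is not None
    ]


row = {"id": 7, "name": "John", "birth_date": "1970-01-01"}
triples = row_to_triples("person", "id", row)
for triple in triples:
    print(triple)
```

The key that was unique only within its table is now a URI that is unique everywhere, which is the "Uniqueness" row of the comparison.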

Major Programming Components


SW Statement: A triple (subject-predicate-object); thousands of these simple 3-tuples are combined in an App.

URI: a unique name across the entire Internet. Each SW statement uses URIs to name its components. Could include a URL, an abstract URN, or an IRI (Internationalized Resource Identifier).

SW Languages: Have a set of keywords to instruct the SW tools.

An Ontology: Has statements that define concepts, relationships, and constraints. Similar to a DB schema or OO class diagram. Use the many existing ontologies for better quality and speed.

Instance Data: Statements about specific instances, not a generic concept. This forms the bulk of the SW; an ontology containing the concept ‘person’ may be used by millions of instances of ‘person’.

Example: The NCES site: http://nces.ed.gov/
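The components above can be sketched in a few lines of Python. This is an illustration, not the book's code: the `http://example.org/people#` namespace and the names in it are hypothetical, while the `rdf:type` and `rdfs:Class` URIs are the standard W3C ones.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One SW statement: subject, predicate, object, each named by a URI."""
    subject: str
    predicate: str
    object: str


EX = "http://example.org/people#"  # hypothetical namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_CLASS = "http://www.w3.org/2000/01/rdf-schema#Class"

# Ontology statement: defines the concept 'Person'.
ontology = [Triple(EX + "Person", RDF_TYPE, RDFS_CLASS)]

# Instance statements: describe specific individuals using that concept.
instances = [
    Triple(EX + "John", RDF_TYPE, EX + "Person"),
    Triple(EX + "Bill", RDF_TYPE, EX + "Person"),
    Triple(EX + "John", EX + "friendOf", EX + "Bill"),
]


def instances_of(cls, statements):
    """List the subjects declared to be of the given class."""
    return sorted(
        t.subject for t in statements
        if t.predicate == RDF_TYPE and t.object == cls
    )


print(instances_of(EX + "Person", instances))
```

One ontology statement serves any number of instance statements, which is why instance data forms the bulk of the SW.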


Programming Concepts (Continued)

Construction tools: Construct and integrate a SW through ontology/instance creation and import. Include GUI-based SW editors to see and explore, and APIs.

Interrogation tools: Navigate through the SW to return a requested response, e.g., with a query language.

Reasoners: Add inference to the SW; logical additions to gain classification and realization (same as relationship). Leverage assertions.

Rule Engines: Support inference typically beyond what can be deduced from description logic. Merge ontologies; count and string searches.

SW Frameworks: Package all the tools above into an integrated flow. We use open source alternatives for both GUI and API, so you can start right away.
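A toy version of two of these tools can be written in plain Python. The sketch below is not how Jena or any other framework implements them: `match` mimics a single triple pattern with wildcards (the basic idea behind a query language), and `may_know` is a hypothetical hand-written rule of the kind a rule engine might apply. All names and data are made up for illustration.

```python
kb = {
    ("John", "friendOf", "Bill"),
    ("Bill", "friendOf", "Mary"),
    ("John", "type", "Person"),
}


def match(statements, s=None, p=None, o=None):
    """Interrogation: return statements matching a pattern; None is a wildcard."""
    return [
        t for t in statements
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def may_know(statements):
    """Hypothetical rule: X friendOf Y and Y friendOf Z implies X mayKnow Z."""
    friends = match(statements, p="friendOf")
    return {
        (a, "mayKnow", d)
        for (a, _, b) in friends
        for (c, _, d) in friends
        if b == c and a != d
    }


print(sorted(match(kb, s="John")))
print(may_know(kb))
```

A framework bundles pieces like these with storage and import/export so that construction, interrogation, and inference work over the same set of statements.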

Impacts on Programming


Web Data-Centric: Place data at its center; data is key

Semantic Data: Place meaning directly within the data vs. in programming instructions

Data Integration/Sharing: Access and share rich info resources throughout the WWW

Dynamic Data: Enable dynamic, run-time changes to the structure and contents of your info


Add comments on each later on

Roadblocks, Myths, and Hype


Roadblocks: web-centric development; accepting dirty, conflicting data; and dynamic addition of new data.

SW perspective: data-centric programming and distributed information programming

Traditional approach: ETL (extract, transform, and load) does not scale; has a single point of failure

WWW is full of data: good, bad, and wrong data. To fix it, we need to know the truth, which is not easy. SW can define reputation and manage conflicting info.

Remaining open to new data is not an easy perspective. SW maintains a nimble, agile view of data.

Myths


Pursuit of one big info model: SW supports a multitude of distributed info sources with a multitude of perspectives. You and your solution need to have that perspective.

One view: Allow any view for info analysis. Get started quickly and adapt/evolve. Encourages agility and decoupled, modular design. Look for existing SW sources and use them.

Acceptance: New technologies scare people, as they should. Change also scares people.

Hype


Hype doomed AI; SW does incorporate excellent AI research of the past and offers a useful improvement in leveraging info. SW is an evolutionary step in making info work harder for us.

Hype may make complex technology look too simple. Tough, challenging problems demand complex technical solutions. SW reduces unnecessary complexity to focus on necessary complexity, that of managing the info and knowledge we produce.

Hype can gloss over the inherent challenges in using a new technology. Do not expect a perfect world of tools, frameworks, and the SW itself. You may hit some bumps, but hang in there; it will be worth it.

SW Origins


Scientific American article by Tim Berners-Lee et al. discussed SW for machine readability, easy info integration, info inference, unique naming, and rich representations, …

Graph Theory: 1736/Euler; nodes and relationships. Graphs imply the answer mathematically; brute-force exhaustive analysis otherwise.

Description Logic: 1980s; rules to construct valid, useful knowledge representations that are decidable and can actually produce an answer. Derived from first-order logic and from AI research. Info is not tacit (if/else); the externalized form reveals the info for verification, integration, reasoning, and interrogation. Relationships go beyond inheritance to seek dependable logic.
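The Graph Theory point, that a graph can imply the answer without exhaustive search, is well illustrated by Euler's 1736 bridge problem. The Python sketch below is an added illustration, not material from the slides: the land-mass labels A to D are arbitrary, and the check relies on the known result that a connected graph has a walk crossing every edge exactly once only if zero or two nodes have odd degree.

```python
from collections import Counter

# The seven bridges of Königsberg as edges between four land masses.
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]


def odd_degree_nodes(edges):
    """Return the nodes touched by an odd number of edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(node for node, d in degree.items() if d % 2)


def has_euler_walk(edges):
    """Assumes the graph is connected; applies the odd-degree criterion."""
    return len(odd_degree_nodes(edges)) in (0, 2)


print(odd_degree_nodes(BRIDGES))
print(has_euler_walk(BRIDGES))
```

Counting degrees answers the question in one pass over the edges, whereas the brute-force alternative would enumerate every possible ordering of the seven bridges.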

Links


Haiti: http://topics.nytimes.com/top/news/international/countriesandterritories/haiti/index.html

http://nces.ed.gov/

SEC and XBRL: http://www.cpa2biz.com/Content/media/PRODUCER_CONTENT/Newsletters/Articles_2010/CorpFin/SEC_XBRL.jsp