© 2008 Hewlett-Packard Development Company, L.P.

The information contained herein is subject to change without notice

Enterprise Information Integration using Semantic Web Technologies:
RDF as the Lingua Franca


David Booth, Ph.D.

HP Software

Semantic Technology Conference, 20-May-2008


In collaboration with Steve Battle, HP Labs


Latest version of these slides:

http://dbooth.org/2008/stc/slides.ppt

2

Disclaimer


This work reflects research and is presented for discussion purposes only. No product commitment whatsoever is expressed or implied. Furthermore, views expressed herein are those of the author and do not necessarily reflect those of HP.

3

Outline


PART 0: The problem


PART 1: RDF: The lingua franca for information exchange


Why

1. Focus on semantics
2. Easier data integration
3. Easier to bridge other formats/models
4. Looser coupling


How

1. RDF message semantics
2. REST-based SPARQL endpoints
3. XML with GRDDL transformations
4. Aggregators


PART 2: POC: A SPARQL adaptor for UCMDB


What is UCMDB


SPARQL adaptor

4

PART 0


The problem


5

Problem 1: Integration complexity


Multiple producers/consumers need to share data


Tight coupling hampers independent versioning

[Diagram labels, tools: Provisioning, Discovery, Change Management, Compliance Management, Incident Management, Release Management, Monitoring, Ticketing, Source Control]

[Diagram labels, roles: Release Managers, Unix System Administrators, Networking Engineers, Networking Administrators, Compliance Managers, Windows System Administrators, Operation Centers, Storage Administrators]

6

Problem 2: Babelization


Proliferation of data models (XML schemas, etc.)


Parsing issues influence data models


No consistent semantics


Data chaos



Tower of Babel, Abel Grimmer (1570-1619)

7

PART 1


RDF: The lingua franca for information exchange


8

Why?

Four reasons . . .

9

Why?

1. Focus on semantics


XML:


Schema is focused on how to serialize


Constrains more than the model


Parent/child and sibling relationships are not named


Are their semantics documented? E.g., does sibling order matter?


RDF:


One URI per concept


Syntax independent



Who cares about syntax?

10

Why?

2. Easier data integration


11

Why?

2. Easier data integration


Blue App has model

12

Why?

2. Easier data integration


Red App has model










Need to integrate Red & Blue models

13

Why?

2. Easier data integration


Step 1: Merge RDF


Same nodes (URIs) join automatically
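A minimal sketch of the merge step, assuming each graph is held as a plain Python set of (subject, predicate, object) tuples. All URIs and property names below are hypothetical, and blank nodes (which a real RDF merge must keep distinct) are ignored. It shows that merging is just set union, and that a node named by the same URI in both graphs ends up with the properties from both.

```python
# Triples are (subject, predicate, object) tuples; a graph is a set of them.
# All URIs below are hypothetical, for illustration only.
BLUE = "http://example.org/blue#"
RED = "http://example.org/red#"
SERVER = "http://example.org/id/server42"  # the node both apps talk about

blue_graph = {
    (SERVER, BLUE + "hostName", "alpha"),
    (SERVER, BLUE + "locatedIn", "http://example.org/id/rack7"),
}
red_graph = {
    (SERVER, RED + "owner", "http://example.org/id/alice"),
}


def merge(*graphs):
    """Merge graphs by taking the union of their triples."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Return every (predicate, object) pair known about one subject."""
    return {(p, o) for (s, p, o) in graph if s == subject}


merged = merge(blue_graph, red_graph)
# The shared URI needs no join key or mapping table: describe(merged, SERVER)
# now contains the Blue properties and the Red property together.
```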

14

Why?

2. Easier data integration


Step 2: Add relationships and rules


(Relationships are also RDF)

15

Why?

2. Easier data integration


Step 3: Define Green model


(Making use of Red & Blue models)
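Steps 2 and 3 can be sketched as follows, assuming the same tuple-based graphs. The class URIs (a Red "Customer", a Blue "Client", a Green "Customer") are invented for illustration, and the single rule used here is the standard RDFS subclass rule rather than whatever rules the original demo used. The relationships are themselves triples, and the Green model is derived without touching the Red or Blue data.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
# Hypothetical class URIs for the three models.
RED_CUSTOMER = "http://example.org/red#Customer"
BLUE_CLIENT = "http://example.org/blue#Client"
GREEN_CUSTOMER = "http://example.org/green#Customer"

# Merged Red and Blue data (Step 1).
data = {
    ("http://example.org/id/acme", RDF_TYPE, RED_CUSTOMER),
    ("http://example.org/id/initech", RDF_TYPE, BLUE_CLIENT),
}
# Step 2: relationships between the models, expressed as RDF triples.
relationships = {
    (RED_CUSTOMER, RDFS_SUBCLASS, GREEN_CUSTOMER),
    (BLUE_CLIENT, RDFS_SUBCLASS, GREEN_CUSTOMER),
}


def infer_types(graph):
    """Apply one rule until nothing new appears:
    (x type C) and (C subClassOf D)  =>  (x type D)."""
    graph = set(graph)
    while True:
        new = {
            (x, RDF_TYPE, d)
            for (x, p1, c) in graph if p1 == RDF_TYPE
            for (c2, p2, d) in graph if p2 == RDFS_SUBCLASS and c2 == c
        }
        if new <= graph:
            return graph
        graph |= new


# Step 3: the Green model is a view over the inferred graph.
green_view = infer_types(data | relationships)
green_customers = sorted(
    s for (s, p, o) in green_view if p == RDF_TYPE and o == GREEN_CUSTOMER
)
```

Because inference only adds triples, every original Red and Blue triple is still present in `green_view`, which is why the existing apps see no difference.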

16

Why?

2. Easier data integration


What the Blue app sees:


No difference!

17

Why?

2. Easier data integration


What the Red app sees


No difference!

18

Why?

3. RDF helps bridge other formats/models


Producers and consumers may use different formats/models


Rules can specify transformations


Inference engine finds path to desired result model

[Diagram labels: RDF Model Transform; A1, A2, A3; B1, B2; C1, C2; X, Y, Z; Ontologies & Rules (three times)]
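One way to picture "finding a path to the desired result model" is as a search over available transformations. The sketch below is an assumption-laden stand-in: a real inference engine chains rules rather than running an explicit graph search, and the model names and rule-set names are hypothetical. It only illustrates that a consumer's model can be reached through intermediate models when no direct transformation exists.

```python
from collections import deque

# Hypothetical transformations: (source model, target model) -> rule set name.
TRANSFORMS = {
    ("X", "Y"): "x_to_y rules",
    ("Y", "Z"): "y_to_z rules",
    ("X", "W"): "x_to_w rules",
}


def find_path(source, target, transforms=TRANSFORMS):
    """Breadth-first search for the shortest chain of transformations."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        model, path = queue.popleft()
        if model == target:
            return path
        for (src, dst), name in transforms.items():
            if src == model and dst not in seen:
                seen.add(dst)
                queue.append((dst, path + [name]))
    return None  # no chain of transformations reaches the target model


path_x_to_z = find_path("X", "Z")
```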

19

Why?

4. Looser coupling


Without breaking consumers:


Ontologies can be mixed and extended


Triples can be added


Producer & consumer can be versioned more independently

20

Example of looser coupling


RedCust and GreenCust ontologies added


Blue app is not affected

[Diagram labels: Consumer (Blue app), Producer]
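The looser-coupling example can be sketched in a few lines, assuming the consumer selects triples by the predicates it understands. The namespaces and property names are hypothetical. The producer adds triples from a second ontology, and the Blue app's view of the data is unchanged.

```python
BLUE = "http://example.org/blue#"        # hypothetical namespaces
REDCUST = "http://example.org/redcust#"
HOST = "http://example.org/id/host1"


def blue_app_view(graph):
    """The Blue app reads only predicates from the ontology it knows."""
    return {(s, p, o) for (s, p, o) in graph if p.startswith(BLUE)}


# Version 1 of the producer's output.
v1 = {(HOST, BLUE + "hostName", "alpha")}
# Version 2 adds triples from another ontology; nothing is removed or renamed.
v2 = v1 | {(HOST, REDCUST + "customer", "http://example.org/id/acme")}

unchanged = blue_app_view(v1) == blue_app_view(v2)
```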

21

How?

Four ways . . .

22

How?

1. RDF message semantics


Interface contract specifies RDF, regardless of serialization


RDF pins the semantics

Consumer

Producer

RDF

23

How?

2. REST-based SPARQL endpoints


Consumer

Producer

SPARQL

RDF

HTTP

24

REST-based SPARQL endpoints


Why REST:


HTTP is ubiquitous


Simpler than SOAP-based Web services (WS*)


Looser process coupling
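To illustrate how little is needed on the consumer side, here is a sketch that builds (but does not send) a SPARQL query request as a plain HTTP GET, following the SPARQL protocol convention of passing the query text in a `query` parameter. The endpoint URL is hypothetical, and the example query is a generic one, not one from the proof of concept.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint URL


def build_sparql_request(query, endpoint=ENDPOINT):
    """Build an HTTP GET request carrying a SPARQL query; nothing is sent."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the SPARQL XML results format.
    return Request(url, headers={"Accept": "application/sparql-results+xml"})


req = build_sparql_request("SELECT ?s WHERE { ?s ?p ?o } LIMIT 5")
# Passing req to urllib.request.urlopen would send it to a live endpoint.
```

Any HTTP client can issue the same request, which is the process-level decoupling the slide refers to.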

25

REST-based SPARQL endpoints


Why SPARQL:


One endpoint supports multiple data needs


Each consumer gets what it wants


Insulates consumers from internal model changes


Inferencing transforms data to consumer's desired model


Looser data coupling

26

How?

3. XML with GRDDL transformations


GRDDL is a W3C standard


GRDDL permits RDF to be "gleaned" from XML


XML document or schema specifies desired GRDDL transformation


GRDDL transformation produces RDF from XML document


Mostly intended for getting microformat and other data/metadata from HTML pages

27

Using GRDDL for XML document semantics


Each XML format can be viewed as a custom serialization of RDF!


GRDDL transformation produces semantics of the XML document


Helps bridge XML and RDF worlds


Same XML document can be consumed by:


Legacy XML app


RDF app


App interface contract can specify RDF


Serializations can vary


Semantics are pinned by RDF
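The idea can be sketched as follows. In real GRDDL the XML document names a transformation (typically XSLT) through a `grddl:transformation` attribute, and a GRDDL-aware agent fetches and runs it. Python's standard library has no XSLT processor, so this sketch substitutes a local Python function for the stylesheet. The document format, the stylesheet URI and the vocabulary are all hypothetical.

```python
import xml.etree.ElementTree as ET

GRDDL_NS = "http://www.w3.org/2003/g/data-view#"  # the GRDDL namespace
EX = "http://example.org/hosts#"                   # hypothetical vocabulary

DOC = """<hosts xmlns:grddl="http://www.w3.org/2003/g/data-view#"
       grddl:transformation="http://example.org/hosts2rdf.xsl">
  <host id="h1"><name>alpha</name></host>
  <host id="h2"><name>beta</name></host>
</hosts>"""


def hosts_to_triples(root):
    """Stand-in for the XSLT: emit one triple per <host>/<name>."""
    return {
        ("http://example.org/id/" + h.get("id"), EX + "hostName", h.findtext("name"))
        for h in root.findall("host")
    }


# Maps a transformation URI to local code; a real GRDDL agent would
# dereference the URI and run the stylesheet it finds there.
TRANSFORMS = {"http://example.org/hosts2rdf.xsl": hosts_to_triples}


def glean(xml_text):
    """Read the transformation named by the document and apply it."""
    root = ET.fromstring(xml_text)
    uri = root.get("{%s}transformation" % GRDDL_NS)
    return TRANSFORMS[uri](root)


triples = glean(DOC)
```

A legacy XML app can keep parsing `DOC` as before, while an RDF app works from `triples`.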

28

Using GRDDL for XML document semantics

See: http://dbooth.org/2007/rdf-and-soa/rdf-and-soa-paper.htm

[Diagram labels: Client, Normalize to RDF, Serialize as XML/other/RDF, RDF Engine / Store, Service, Core App Processing]

29

How?

4. Aggregators


Gets data from multiple sources


Provides data to consumers



[Diagram labels: Aggregator; A1, A2, A3; B1, B2; C1, C2; X, Y, Z; Ontologies & Rules (three times); SPARQL]

30

Aggregator


Conceptual component


Not necessarily a separate physical service


Handles mechanics of getting data


Different adaptors for different sources


REST, WS*, Relational, XML, etc.


Diverse data models


Might do caching and query distribution (federation)


Provides model transformation


Plug in ontologies and inference rules as needed
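A rough sketch of the aggregator as a conceptual component, under several assumptions: adaptors expose a `fetch()` method returning triples, rules are plain functions from a graph to extra triples, and the cache is never refreshed. The class names, the prefixed names (`src:dnsName`, `app:hostName`) and the rule are hypothetical and are not taken from the proof of concept.

```python
class StaticAdaptor:
    """Hypothetical adaptor wrapping one source. A real adaptor would call
    REST, WS*, SQL or parse XML, then return the data as triples."""

    def __init__(self, triples):
        self._triples = set(triples)

    def fetch(self):
        return set(self._triples)


class Aggregator:
    def __init__(self, adaptors, rules=()):
        self.adaptors = list(adaptors)
        self.rules = list(rules)   # each rule maps a graph to a set of extra triples
        self._cache = None         # filled on first use, never refreshed here

    def graph(self):
        if self._cache is None:
            merged = set()
            for adaptor in self.adaptors:
                merged |= adaptor.fetch()
            for rule in self.rules:  # single pass, applied in order
                merged |= rule(merged)
            self._cache = merged
        return self._cache

    def objects(self, predicate):
        """Values of one predicate, in the consumer's vocabulary."""
        return sorted(o for (s, p, o) in self.graph() if p == predicate)


def dns_name_to_host_name(graph):
    """Hypothetical rule: restate src:dnsName as the consumer's app:hostName."""
    return {(s, "app:hostName", o) for (s, p, o) in graph if p == "src:dnsName"}


aggregator = Aggregator(
    adaptors=[
        StaticAdaptor({("id:h1", "src:dnsName", "alpha")}),
        StaticAdaptor({("id:h2", "app:hostName", "beta")}),
    ],
    rules=[dns_name_to_host_name],
)
host_names = aggregator.objects("app:hostName")
```

The consumer asks for `app:hostName` only; which source used which vocabulary is handled by the plugged-in rule.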

31

PART 2


Proof-of-Concept: A SPARQL adaptor for UCMDB


32

IT Service Management (ITSM)


Manage IT environment


Configuration Management Database (CMDB) is central

33

The HP Universal CMDB (UCMDB)

CMDB : Configuration Management DB

Goal:


Maintain a comprehensive and current record of all configuration items (CIs) and their relationships

34

Example: host information



Host properties

One particular host machine

http://cmdb.mercury.com#nt.35014541

35

SPARQL adaptor


Uses existing SOAP interface to UCMDB


Enables SPARQL queries


Results can be RDF


No model transformation (yet)

[Diagram labels: SPARQL adaptor, SOAP interface, HP UCMDB]


36

Architecture of SPARQL adaptor

[Diagram labels: SPARQL adaptor, SOAP interface, HP UCMDB, Database; SPARQL, compile, TQL, submit, XML, lift, RDF; CMDB metadata, export, OWL; Design Time, Run Time]

37

UCMDB ontology


The HP UCMDB ontology defines CI types and relationship hierarchies.


Derived automatically from HP UCMDB metadata.

38

Jena-based implementation


Jena, ARQ, Joseki developed at HP Labs*.

Jena: Semantic Web toolkit (RDF, OWL, inference)

ARQ: Query engine (SPARQL query algebra, evaluation)

Joseki: SPARQL server (SPARQL protocol)

* http://www.hpl.hp.com/semweb/

39

Query returning a table

host_name
"ILDTRD129"^^<http://www.w3.org/2001/XMLSchema#string>
"JONI"^^<http://www.w3.org/2001/XMLSchema#string>
"MBADIR-IL"^^<http://www.w3.org/2001/XMLSchema#string>

Select the names of host servers on the network with addresses from 192.168.81.0

SELECT ?host_name
WHERE {
  [ a object:network ] attr:network_netaddr "192.168.81.0" ;
                       link:member [ a object:host ;
                                     attr:host_dnsname ?host_name ] }
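The bracketed `[ ... ]` terms in the query are anonymous nodes that behave like unnamed variables, so the query is equivalent to five triple patterns. The sketch below spells those patterns out and evaluates them with a deliberately tiny pattern matcher over an in-memory toy graph. The node identifiers (`n1`, `h1`, ...), the second network and the matcher itself are invented for illustration. The host names reuse the values shown in the result table, and this is not how the adaptor evaluates queries (it compiles SPARQL to TQL).

```python
def match(patterns, graph, binding=None):
    """Yield variable bindings that satisfy every triple pattern.
    Terms starting with '?' are variables; everything else must match exactly."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in graph:
        candidate = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                if candidate.setdefault(term, value) != value:
                    break  # variable already bound to something else
            elif term != value:
                break      # constant does not match
        else:
            yield from match(rest, graph, candidate)


# The slide's query with each [ ... ] replaced by a named variable.
PATTERNS = [
    ("?net", "a", "object:network"),
    ("?net", "attr:network_netaddr", "192.168.81.0"),
    ("?net", "link:member", "?host"),
    ("?host", "a", "object:host"),
    ("?host", "attr:host_dnsname", "?host_name"),
]

# Toy data; node identifiers are made up.
GRAPH = {
    ("n1", "a", "object:network"),
    ("n1", "attr:network_netaddr", "192.168.81.0"),
    ("n1", "link:member", "h1"),
    ("n1", "link:member", "h2"),
    ("n1", "link:member", "h3"),
    ("h1", "a", "object:host"), ("h1", "attr:host_dnsname", "ILDTRD129"),
    ("h2", "a", "object:host"), ("h2", "attr:host_dnsname", "JONI"),
    ("h3", "a", "object:host"), ("h3", "attr:host_dnsname", "MBADIR-IL"),
    # A host on a different network, which must not be returned.
    ("n2", "a", "object:network"),
    ("n2", "attr:network_netaddr", "10.0.0.0"),
    ("n2", "link:member", "h4"),
    ("h4", "a", "object:host"), ("h4", "attr:host_dnsname", "OTHER"),
}

host_names = sorted(b["?host_name"] for b in match(PATTERNS, GRAPH))
```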

40

Query returning an RDF subgraph

Describe a network (192.168.81.0) with host servers containing a DB.


41

Example RDF result set



Database

Host

42

Outline


PART 0: The problem


PART 1: RDF: The lingua franca for information exchange


Why

1. Focus on semantics
2. Easier data integration
3. Easier to bridge other formats/models
4. Looser coupling


How

1. RDF message semantics
2. REST-based SPARQL endpoints
3. XML with GRDDL transformations
4. Aggregators


PART 2: POC: A SPARQL adaptor for UCMDB


What is UCMDB


SPARQL adaptor

Questions?