Semantic Web and the Grid


Oct 21, 2013 (3 years and 9 months ago)



Brian Matthews


euroCRIS seminar 2004


Contents


A Changing Environment for Research


The Semantic Web


The Grid


The Semantic Grid


What does that mean for CRIS and OA?


Conclusion


A Future Environment for Research


OA and CRIS as drivers for the management
and access to information


Need for shared metadata and exchange
mechanisms


Central control impossible/undesirable


a loosely coupled federated approach


based on common interchange and access standards


W3C, GGF, IETF, OASIS, EuroCRIS, WfMC etc



Changes in technology


resource discovery


enables access


Two leading technology opportunities


Semantic Web and the GRID


The Semantic Web

Adding machine readable information about the web, to the web.

The Web is chaotic: why are resources linked?

"Imagine a library where all the books have the same text on the cover, and the only catalogues are compiled by photocopying the books, cutting up the copies, and arranging the words in the order of frequency." Johan Hjelm

Google is great at returning all the pages on the web that mention "Tim Berners-Lee"

But what about returning those pages written by Tim Berners-Lee?

The Semantic Web adds well-defined meaning to describe the Web (Metadata).

"The Semantic Web is an extension of the current web in which the information is given well-defined meaning, better enabling computers and people to work in cooperation"

Tim Berners-Lee, James Hendler and Ora Lassila, The Semantic Web, Scientific American, May 2001
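
The difference between pages that mention a name and pages written by that person can be sketched with a toy set of triples. This is only an illustration: the URLs and statements are made up, and a plain Python list stands in for a real RDF store. The property names dc:creator and dc:subject are borrowed from Dublin Core.

```python
# Toy triple store: (subject, predicate, object). All URLs and data are made up.
triples = [
    ("http://example.org/page1", "dc:creator", "Tim Berners-Lee"),
    ("http://example.org/page1", "dc:title", "Notes on the Web"),
    ("http://example.org/page2", "dc:creator", "Someone Else"),
    ("http://example.org/page2", "dc:subject", "Tim Berners-Lee"),
]

def pages_mentioning(name):
    """Keyword-style search: any statement whose value contains the name."""
    return sorted({s for s, _, o in triples if name in o})

def pages_written_by(name):
    """Metadata search: only statements that say who the creator is."""
    return sorted({s for s, p, o in triples if p == "dc:creator" and o == name})

mentions = pages_mentioning("Tim Berners-Lee")
authored = pages_written_by("Tim Berners-Lee")
```

Here the keyword search finds both pages, while the metadata search finds only the page whose creator is stated to be Tim Berners-Lee.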




Add Meaning to Resources


Semantic Web: A Layered Architecture

Basic Syntax of the Web

Language of triples for describing resources

Formalism for defining and sharing vocabularies

Reasoning over statements about resources

“The Web of Trust”
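
As a rough sketch of how the middle layers fit together, the example below keeps a few triples about a resource, a small vocabulary of subclass statements, and a function that reasons over both. The ex: names are hypothetical, and the single rule shown (an instance of a subclass is also an instance of the superclass) is only one of the inferences a real reasoner would make.

```python
# Triples describing a resource (hypothetical data).
facts = {
    ("ex:paper42", "rdf:type", "ex:JournalArticle"),
}

# A shared vocabulary, reduced here to subclass statements.
vocabulary = {
    ("ex:JournalArticle", "rdfs:subClassOf", "ex:Publication"),
    ("ex:Publication", "rdfs:subClassOf", "ex:ResearchOutput"),
}

def infer_types(facts, vocabulary):
    """Apply the subclass rule repeatedly until no new statement appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for s, p, cls in list(known):
            if p != "rdf:type":
                continue
            for sub, rel, sup in vocabulary:
                new = (s, "rdf:type", sup)
                if rel == "rdfs:subClassOf" and sub == cls and new not in known:
                    known.add(new)
                    changed = True
    return known

inferred = infer_types(facts, vocabulary)
```

After inference, ex:paper42 is also known to be an ex:Publication and an ex:ResearchOutput, even though neither statement was asserted directly.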


Machine Readable Meaning


Meaning becomes machine readable, so software agents can use it for:

Improving searches (indexing, cataloguing)

Conveying information on the usage of the resource (access control, IPR)

Conveying information on the actors involved (user preferences, device profiles, privacy preferences)

Giving third party opinions on the content of another site (rating services, brokering)

Essentially, Metadata of all kinds


Progress so far


A lot more than you might think!


Base standards are now mature:


RDF, RDF Schema, OWL


many others reaching maturity:


Many shared vocabularies emerging


DC, DMoz, Prism, FOAF, VCard, SKOS, RSS….


Lots of RDF out there!


Mozilla, Adobe, RSS,


Still a lot of work to do


reasoning, trust, provenance, tools,


But we are getting there!


Example: SKOS


Community effort led by CCLRC/W3C

A vocabulary to represent thesauri

Thesauri are heavily used in the library community, but traditionally locked up in institutional databases

Allow people to share controlled vocabularies for cataloguing resources

Examples

GEMET: environmental data

GCL: e-Government

English Heritage

W3C glossary

[Figure: Users, CRIS portal, Query distributor and collator, Thesaurus Service, CRIS 1, CRIS 2]
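
One way a portal could use a shared thesaurus is to expand a query term into its narrower terms before distributing the query to each CRIS. The sketch below assumes a tiny, made-up thesaurus fragment and two made-up catalogues. The function names are hypothetical, and the dictionary of narrower terms only imitates the kind of broader/narrower links SKOS records.

```python
# Hypothetical thesaurus fragment: term -> narrower terms.
narrower = {
    "pollution": ["air pollution", "water pollution"],
    "air pollution": ["smog"],
}

def expand(term):
    """Return the term followed by all of its narrower terms, depth first."""
    result = [term]
    for child in narrower.get(term, []):
        result.extend(expand(child))
    return result

# Two made-up CRIS catalogues, each mapping a record id to its keyword.
cris_1 = {"r1": "smog", "r2": "soil"}
cris_2 = {"r7": "water pollution"}

def portal_search(term, sources):
    """Distribute the expanded query to every source and collate the hits."""
    wanted = set(expand(term))
    return sorted(rid for src in sources for rid, kw in src.items() if kw in wanted)

hits = portal_search("pollution", [cris_1, cris_2])
```

A search for "pollution" then also returns records catalogued only under "smog" or "water pollution", which a plain keyword match would miss.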








Example: Simile


Project of MIT + HP Labs + W3C


Publishing digital library information onto the semantic
web.


Make semantic interoperability of metadata a reality for
digital libraries by:


providing reusable software for browsing, searching and mapping
heterogeneous metadata


using semantic web technologies


identifying issues, gaps and best practices


allow libraries to share information


Provide semantic web browser, and RDF based datasets


for art history information


combined from different sources


Using SKOS as the thesaurus format.



OA within the Semantic Web


Semantic Web and OA



Semantic web provides an underlying
mechanism to support OA:


common metadata


data exchange mechanism


searching and browsing across web


query language and logic


interoperability


loose coupling.


Can also support CRIS this way.


CERIF in OWL (Lopatenko)


And also Data Sets


CCLRC Metadata format


also in RDF Schema


But that is not the only main technology change


The Grid

The Grid provides an environment that enables software applications to integrate instruments, displays, computational and information resources that are managed by diverse organisations in widespread locations.

Provide access to a global distributed computing environment

via authentication, authorisation, negotiation, security

Identify and allocate appropriate resources

interrogate information services -> resource discovery

enquire current status/loading via monitoring tools

decide strategy, e.g. move data or move application

(co-)allocate resources -> process flow

Schedule tasks and analyse results

ensure required application code is available on remote machine

transfer or replicate data and update catalogues

monitor execution and resolve problems as they occur

retrieve and analyse results, e.g. using local visualization

So far typically in large-scale science and engineering.
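
The "identify and allocate appropriate resources" step can be sketched as a small selection function. Everything here is an assumption made for illustration: the Resource fields, the load threshold and the site names are hypothetical, and real Grid middleware obtains this information from its information and monitoring services instead of a hard-coded list.

```python
from dataclasses import dataclass

@dataclass
class Resource:
    name: str
    load: float      # current loading, 0.0 (idle) to 1.0 (full)
    has_data: bool   # is the input data already held at this site?
    has_app: bool    # is the application code already installed here?

def choose(resources, max_load=0.8):
    """Pick the least loaded usable resource and list what must be moved to it."""
    usable = [r for r in resources if r.load <= max_load]
    if not usable:
        return None
    best = min(usable, key=lambda r: r.load)
    moves = []
    if not best.has_data:
        moves.append("move data")
    if not best.has_app:
        moves.append("move application")
    return best.name, moves

sites = [
    Resource("siteA", 0.9, True, True),
    Resource("siteB", 0.3, False, True),
    Resource("siteC", 0.5, True, False),
]
decision = choose(sites)
```

With these made-up figures the least loaded site is siteB, which already has the application, so the strategy is to move the data there.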


To make this happen you need . . .

agreed protocols (cf WWW -> W3C)

defined application programming interfaces (APIs)

existence of directories for both system and application

distributed data management

availability of current status of resources

monitoring tools

accepted authentication procedures and policies

network traffic management

provided by Grid-based toolkits and services



GRID History



mid 90s

Globus

The GRID Bible

Based on “traditional” protocols (IETF)

Taken up by e-Science

Standardised via GGF

Now converging with Web

Web Services - WSRF



Example: NASA IPG

[Figure: computer simulations; real-time collection; multi-source data analysis; desktop & VR clients with shared controls; Unitary Plan Wind Tunnel; archival storage]


Example: DataGrid


LHC will produce several PBs of data per year for at least 10 years from 2005.

Data analysis will be carried out by farms of 1000s of commodity processors (the “computing fabric”) in each of about 10 regional Tier1 centres (RAL is UK Tier1)

Each Tier1 centre will need to hold several PBs of raw data and results of physics analysis

Strong focus on middleware and testbeds, open source


What Next? The Semantic Grid

[Figure: diagram relating WEB, GRID, Semantic Web and Semantic Grid, labelled "distributed computation" and "machine readable semantics"; thanks to Dave de Roure]


What Next? The Semantic Grid


Current GRID is “hand-crafted”

users have to know a lot about the available resources

users have to “write scripts” to use the GRID

Add machine readable semantics (metadata)

The Semantic GRID

[Figure: diagram relating WEB, GRID, Semantic Web and Semantic Grid, labelled "distributed computation" and "machine readable semantics"; thanks to Dave de Roure]

“the GRID is an application of the Semantic Web” (de Roure, Goble)


But what does that mean?


more automation


more negotiation


more autonomy


more self-monitoring and control

use of autonomous agents

Will make the Grid much more like the electricity Grid


You don’t need to know where the stuff comes from.



Semantic Grid Example: myGrid

Major UK e-Science project

Bio-informatics

In-silico experimentation

www.mygrid.org.uk

Based on a GRID architecture

Uses Semantic Web Tools for

Workflow and service discovery

Prior to and during enactment

Semantic registration

Workflow assembly

Semantic service typing of inputs and outputs

Provenance of workflows and other entities

Experimental metadata glue

Use of RDF, RDFS, DAML+OIL/OWL

Instance store, ontology server, reasoner

Materialised vs at point of delivery reasoning.

myGrid Information Model

About to join them to work on workflow
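
Semantic typing of service inputs and outputs can help with workflow assembly: one service can follow another if the type it produces is, according to an ontology, a kind of the type the next service accepts. The sketch below is not the myGrid implementation. The type hierarchy, service names and function names are all hypothetical, and a plain dictionary stands in for an ontology server and reasoner.

```python
# Hypothetical type hierarchy: child type -> parent type.
parent = {"ProteinSequence": "Sequence", "DNASequence": "Sequence"}

def is_a(t, target):
    """True if type t equals target or is a descendant of it."""
    while t is not None:
        if t == target:
            return True
        t = parent.get(t)
    return False

# Hypothetical registered services: name -> (input type, output type).
services = {
    "fetch_protein": ("AccessionId", "ProteinSequence"),
    "align": ("Sequence", "Alignment"),
    "render_tree": ("Tree", "Image"),
}

def can_follow(first, second):
    """second can consume first's output if that output is a kind of second's input."""
    return is_a(services[first][1], services[second][0])

def next_steps(first):
    """Discover the services that could be chained after the given one."""
    return sorted(n for n in services if n != first and can_follow(first, n))

candidates = next_steps("fetch_protein")
```

Here "align" is offered after "fetch_protein" because a ProteinSequence is a kind of Sequence, even though the two type names differ, while "render_tree" is not offered.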


What does this mean for CRIS & OA?

[Figure: portal with knowledge-assisted user interface; Digital Curation Facility; scientific datasets, publications and CRIS, each with metadata; publish and validate; GRIDs; Ambient, Pervasive Access]

The Semantic Grid is what makes this work!


Example: Validation


Validate results from paper


need to access paper (OA)


need to link to data (and metadata)


need to access analysis and visualisation tools


need common metadata and access to resources
across Grid.


[Figure: Data Portal, Pub Portal, Grid middleware, and four nodes (DA 1, DA 2, IR 1, IR 2), each with local data and local metadata]
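
The validation example depends on being able to follow metadata links from a paper to its data and on to the tools needed to analyse it. A minimal sketch of that chain, assuming made-up identifiers, field names and locations (none of them taken from a real repository or archive), might look like this:

```python
# Hypothetical metadata, as might be held by a repository and a data archive.
publications = {
    "paper:1": {"title": "Example paper", "open_access": True, "datasets": ["data:9"]},
    "paper:2": {"title": "Another paper", "open_access": False, "datasets": []},
}
datasets = {
    "data:9": {"location": "da1.example.org/data/9", "tools": ["plotter"]},
}

def validation_plan(paper_id):
    """Collect what is needed to re-check a paper's results, or say what is missing."""
    paper = publications[paper_id]
    if not paper["open_access"]:
        return {"ok": False, "missing": "access to paper"}
    if not paper["datasets"]:
        return {"ok": False, "missing": "link to data"}
    found = [datasets[d] for d in paper["datasets"] if d in datasets]
    if len(found) != len(paper["datasets"]):
        return {"ok": False, "missing": "dataset metadata"}
    return {
        "ok": True,
        "locations": [d["location"] for d in found],
        "tools": sorted({t for d in found for t in d["tools"]}),
    }

plan = validation_plan("paper:1")
```

Each early return corresponds to one of the needs listed on the slide: without open access to the paper, a link to the data, or common metadata, the chain breaks before validation can start.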


Example: Science as a process


Within a Grid environment

[Figure: process steps: Submit proposal, Prepare experiment, Generate results, Analyse results, Write report; metadata labels joined by "+": Provenance, metadata, access conditions, data description, data location, Related material; systems shown: CRIS, DA, IR]

Collecting the metadata can then become part of the experimental support environment
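
If metadata collection is built into the support environment, each step of the scientific process can contribute its own fields while a provenance trail is recorded automatically. The sketch below assumes a fixed list of steps taken from the slide. The handler mechanism, field names and values are hypothetical.

```python
STEPS = ["submit proposal", "prepare experiment", "generate results",
         "analyse results", "write report"]

def run_process(handlers):
    """Run each step in order; a handler returns the metadata that step contributes."""
    record = {"provenance": []}
    for step in STEPS:
        contributed = handlers.get(step, lambda: {})()
        record.update(contributed)
        record["provenance"].append({"step": step, "fields": sorted(contributed)})
    return record

# Made-up handlers for two of the steps; the other steps contribute nothing here.
handlers = {
    "submit proposal": lambda: {"access_conditions": "public after 1 year"},
    "generate results": lambda: {
        "data_location": "da.example.org/run/1",
        "data_description": "raw counts",
    },
}
record = run_process(handlers)
```

The resulting record holds both the accumulated metadata and a note of which step supplied which fields, without the researcher filling in a separate form afterwards.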


Example: the Nature of a Publication


Traditional publication as continuous text, with
static graphs and images


Change the notion of the content of the
publication


hypertext


include active components


links to simulations, visualisations


a much more dynamic document


a multimedia presentation


How will publishers cope?


How will publication archives cope?



So how to achieve this?


Resource discovery

good metadata

common formats

standards

Resource negotiation

for data and services

Quality of service guarantees

Policies and contracts

Security and trust

Provenance

Monitoring and payment

Work flow

Reasoning tools

Autonomous agents

Autonomic systems

Links to legacy

especially database systems

querying systems

Collaborative working environments

Design methods


Progress


Moving quite fast on this from many different directions

e-Science

Next Generation Grid Report

FP6/7

Semantic Grid at GGF

OA initiatives

Digital Curation a major concern

A really exciting opportunity to pull it all together



Conclusions


Semantic Grid and Open Access


enables


enabling


CRIS as an information coordinator


Archiving and curation


need to archive much more


data, programs, visualisation and analysis tools, formats,
calibrations, versions, OS ……


Workflow a key component


Metadata collection and maintenance is a big
problem.


B.M.Matthews@rl.ac.uk