OGSA-DAI - Center for Computation & Technology


31 Jan 2013


http://www.ogsadai.org.uk

Introduction to OGSA-DAI

The OGSA-DAI Team

info@ogsadai.org.uk


The OGSA-DAI Project

A generic framework for integrating data access and computation

Uniform interface to relational, XML, flat file data resources

Using the grid to take specific classes of computation nearer to the data

Kit of parts for building tailored access and integration applications

Investigations to inform DAIS-WG


One reference implementation for DAIS


Releases publicly available NOW




Project Partners

Powered by ….

Funded by the Grid Core Programme


Project Membership

[Organisation chart. Roles shown: Principal Investigators; Project Manager; Programme Management Board Chair; Technical Review Board Chair; Research Team; IBM Dissemination Team; EPCC Team; IBM Development Team.]

Names shown: Charaka, Charaka, Mike, Ally, Amy, Mario, Malcolm, Kostas, Norman, Paul, Neil, Andy, Simon, Brian, Dave, Patrick, Neil


Project Status


Current release 4.0


Globus Toolkit 3.2 compliant


Platform and language independent


Java 1.4


Document model


Work concentrated on data access


Wraps data resources without hiding underlying data model

Provide base for higher-level services


Distributed Query Processing (DQP)


Data federation services


Supported Data Resources

Relational: MySQL, DB2, Oracle, PostgreSQL, SQLServer

XML: Xindice, eXist (?)

Other: Files


Web Service Architecture

[Diagram: Service Registry, Service Consumer, Service Provider; arrow labelled Bind.]


OGSA-DAI Service Architecture

[Diagram: DAISGR, Service Consumer, GDSF, GDS; arrow labelled Bind.]


OGSA-DAI Services


OGSA-DAI uses three main service types


DAISGR (registry) for discovery


GDSF (factory) to represent a data resource


GDS (data service) to access a data resource








This will change

[Diagram: DAISGR, GDSF, GDS, Data Resource; arrows labelled "locates" and "creates".]


GDSF and GDS


Grid Data Service Factory (GDSF)


Represents a data resource


Persistent service


Currently static (no dynamic GDSFs)


Cannot instantiate new services to represent other/new databases


Exposes capabilities and metadata


May register with a DAISGR


Grid Data Service (GDS)


Created by a GDSF


Generally transient service


Required to access data resource


Holds the client session
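
The split between a persistent factory and the transient services it creates can be sketched in a few lines. This is an illustrative model only, not the OGSA-DAI API: the class names, the handle format and the lifetime parameter are all assumptions made for the example.

```python
import itertools
import time


class GridDataService:
    """Transient service for one data resource; holds the client session."""

    def __init__(self, handle, resource_name, lifetime_s):
        self.handle = handle
        self.resource_name = resource_name
        self.expires_at = time.monotonic() + lifetime_s
        self.session = {}        # session state lives on the GDS, not the factory
        self.destroyed = False

    def is_alive(self):
        # Alive until explicitly destroyed or until its lifetime runs out.
        return not self.destroyed and time.monotonic() < self.expires_at

    def destroy(self):
        self.destroyed = True


class GridDataServiceFactory:
    """Persistent service representing one statically configured data resource."""

    def __init__(self, resource_name, metadata):
        self.resource_name = resource_name
        self.metadata = dict(metadata)   # exposed capabilities and metadata
        self._ids = itertools.count(1)

    def create_service(self, lifetime_s=60.0):
        # Hypothetical handle format: "<resource>/gds-<n>".
        handle = f"{self.resource_name}/gds-{next(self._ids)}"
        return GridDataService(handle, self.resource_name, lifetime_s)


factory = GridDataServiceFactory("frogs", {"type": "relational", "activities": ["sqlQuery"]})
gds = factory.create_service(lifetime_s=30.0)
gds.session["user"] = "analyst"
```

The factory never changes which resource it represents, which mirrors the "currently static" point above; only the data services come and go.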


DAISGR


DAI Service Group Registry (DAISGR)


Persistent service


Based on OGSI ServiceGroups


GDSFs may register with DAISGR


Clients access DAISGR to discover


Resources


Services (may need specific capabilities)


Support a given portType or activity
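
Discovery by capability might look roughly like the following. This is a toy in-memory registry, assuming each factory registers a small metadata dictionary; the `gsh://` handles, the metadata keys and the activity names are made up for illustration and are not taken from OGSA-DAI.

```python
class Registry:
    """Toy stand-in for a DAISGR: factories register, clients search."""

    def __init__(self):
        self._entries = {}   # handle -> metadata

    def register_service(self, handle, metadata):
        self._entries[handle] = dict(metadata)

    def find(self, resource_type=None, activity=None):
        """Return handles whose metadata matches every given criterion."""
        matches = []
        for handle, meta in self._entries.items():
            if resource_type is not None and meta.get("type") != resource_type:
                continue
            if activity is not None and activity not in meta.get("activities", ()):
                continue
            matches.append(handle)
        return sorted(matches)


registry = Registry()
registry.register_service(
    "gsh://host-a/frogs", {"type": "relational", "activities": ["sqlQuery", "sqlUpdate"]})
registry.register_service(
    "gsh://host-b/toads", {"type": "xml", "activities": ["xPathQuery"]})
registry.register_service(
    "gsh://host-c/newts", {"type": "relational", "activities": ["sqlQuery"]})

sql_capable = registry.find(activity="sqlQuery")
```

How useful such a search is depends entirely on the quality of the metadata each factory publishes.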



[Diagram: Analyst, Registry (DAISGR), Factory (GDSF), Location; calls labelled registerService and findServiceData.]

Data resource publication through registry

Data location hidden by factory

Data resource metadata available through Service Data Elements


Interaction Model: Start up

[Diagram: two OGSI Containers hosting the GDSF and the DAISGR.]

1. Start OGSI containers with persistent services.

2. Here GDSF represents Frog database.


Interaction Model: Registration

[Diagram: two OGSI Containers hosting the GDSF and the DAISGR; registry entry "Frogs: GSH".]

3. GDSF registers with DAISGR.


Interaction Model: Discovery

[Diagram: two OGSI Containers hosting the GDSF and the DAISGR; registry entry "Frogs: GSH"; client thinking "Mmmmm... Frogs?"]

4. Client wants to know about frogs. Can:

(i) Query the GDSF directly if known, or

(ii) Identify suitable GDSF through DAISGR.


Interaction Model: Service Creation

[Diagram: two OGSI Containers hosting the GDSF and the DAISGR; registry entry "Frogs: GSH"; a new GDS.]

5. Having identified a suitable GDSF, client asks for a GDS to be created.


Interaction Model: Perform

[Diagram: two OGSI Containers hosting the GDSF and the DAISGR; registry entry "Frogs: GSH"; the GDS.]

6. Client interacts with GDS by sending Perform documents.

7. GDS responds with a Response document.

8. Client may terminate GDS when finished or let it die naturally.
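
Steps 6 and 7 amount to a document exchange: the client sends a request document, the service executes it against the data resource and returns a response document. The sketch below shows that shape using Python's standard library. The element names (`perform`, `sqlQuery`, `expression`, `response`, `result`, `row`) are illustrative placeholders and not the real OGSA-DAI schema; an in-memory SQLite table with made-up rows stands in for the Frog database.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_perform(sql):
    """Client side: wrap a query in a request document (illustrative element names)."""
    perform = ET.Element("perform")
    query = ET.SubElement(perform, "sqlQuery", name="q1")
    ET.SubElement(query, "expression").text = sql
    return ET.tostring(perform, encoding="unicode")


def handle_perform(connection, perform_xml):
    """Service side: run each query in the document and build a response document."""
    response = ET.Element("response")
    for query in ET.fromstring(perform_xml).findall("sqlQuery"):
        result = ET.SubElement(response, "result", name=query.get("name"))
        for row in connection.execute(query.findtext("expression")):
            ET.SubElement(result, "row").text = ",".join(str(v) for v in row)
    return ET.tostring(response, encoding="unicode")


# Made-up sample data standing in for the Frog database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE frogs (name TEXT, legs INTEGER)")
db.executemany("INSERT INTO frogs VALUES (?, ?)", [("tree frog", 4), ("bullfrog", 4)])

reply = handle_perform(db, build_perform("SELECT name FROM frogs ORDER BY name"))
rows = [r.text for r in ET.fromstring(reply).iter("row")]
```

The client only ever sees documents; the connection object and driver stay on the service side.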


Interaction Model: Summary


Only described an access use case


Client not concerned with connection mechanism


Similar framework could accommodate service-service interactions

Discovery aspect is important

Probably requires a human

Needs adequate definition of metadata

Definitions of ontologies and vocabularies - not something that OGSA-DAI is doing …



More Complex Behaviour

[Diagram: Client; Containers each hosting a GDS with a GDT; Data Resources.]

Deliver data back to the client.

Deliver data to another GDS.

And there's a lot more that you can do …
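
The two delivery options can be contrasted with a small model. This is a sketch under simplifying assumptions: the `DataService` class, its `perform` and `receive` methods and the status dictionaries are invented for the example and do not correspond to OGSA-DAI interfaces.

```python
class DataService:
    """Toy data service that can return results or push them to another service."""

    def __init__(self, name, rows=()):
        self.name = name
        self.rows = list(rows)
        self.inbox = []          # data pushed to this service by others

    def receive(self, rows):
        self.inbox.extend(rows)

    def perform(self, predicate, deliver_to=None):
        selected = [r for r in self.rows if predicate(r)]
        if deliver_to is None:
            # Deliver data back to the client along with the status.
            return {"status": "complete", "data": selected}
        # Third-party delivery: the data bypasses the client entirely.
        deliver_to.receive(selected)
        return {"status": "complete", "delivered": len(selected), "to": deliver_to.name}


source = DataService("gds-1", range(10))
sink = DataService("gds-2")

direct = source.perform(lambda r: r % 2 == 0)
third_party = source.perform(lambda r: r > 6, deliver_to=sink)
```

In the second call the client receives only a status, which is the point of delivering to another service: the bulk data never travels through the client.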


Usage Patterns

[Diagrams: Retrieve, Update/Insert and Pipeline usage patterns.]

Actors (drawn as OGSI or non-OGSI processes): A - Analyst, C - Consumer, G - GDS, P - Producer

Messages: Q - Query, D - Delivery, S - Status, R - Result, U - Update, I - Data id

Arrows: Call, Response, Data Flow


Projects Using OGSA-DAI

OGSA-DAI (http://www.ogsadai.org.uk)
AstroGrid (http://www.astrogrid.org/)
BioSimGrid (http://www.biosimgrid.org/)
BioGrid (http://www.biogrid.jp/)
Bridges (http://www.brc.dcs.gla.ac.uk/projects/bridges/)
eDiaMoND (http://www.ediamond.ox.ac.uk/)
FirstDig (http://www.epcc.ed.ac.uk/~firstdig/)
GeneGrid (http://www.qub.ac.uk/escience/projects.php#genegrid)
GEON (http://www.geongrid.org/)
IU RGRBench (http://www.cs.indiana.edu/~plale/projects/RGR/OGSA-DAI.html)
myGrid (http://www.mygrid.org.uk/)
N2Grid (http://www.cs.univie.ac.at/institute/index.html?project-80=80)
ODD-Genes (http://www.epcc.ed.ac.uk/oddgenes/)
OGSA-WebDB (http://www.gtrc.aist.go.jp/dbgrid/)
INWA (http://www.epcc.ed.ac.uk/)


Project classification

[Diagram: projects using OGSA-DAI grouped into four categories: Biological Sciences, Physical Sciences, Commercial Applications, Computer Sciences.]

Projects shown: FirstDig, INWA, Bridges, AstroGrid, BioSimGrid, BioGrid, eDiaMoND, myGrid, ODD-Genes, N2Grid, GEON, MCS, IU RGRBench, OGSA-WebDB, GeneGrid, GridMiner


Points to Note


Feedback from users largely positive


Good suggestions


Fair criticisms


How OGSA-DAI is being used


Where it succeeds and where it fails


Helping us to capture requirements


Hope to allow user contributions


Plan to establish a policy/framework for this


Engage more with User Community


Meetings scheduled for this year


OGSA-DAI mini-workshop at AHM 2004

OGSA-DAI tutorials at various meetings/locations



e-Digital MammOgraphy National Database (eDiaMoND)

Mammogram - X-ray of the breast

Built prototype of a national database of mammographic images

In support of the UK Breast screening programme

Employed Grid technologies to facilitate process

Thanks to eDiaMoND project and the Digital Database for Screening Mammography for this image.



Breast screening in the UK began in 1988


Women aged 50-64 screened every 3 years

Women aged 50-70 from 2004

1 view/breast → 2 views by 2003

UK has

Over 90 breast screening units throughout the UK

Each one deals with about 45000 women on average p.a.

Each centre sees 5000-20000 images/year

In 2001-02 → 2002-03

Screened: 1.4M → 1.5M

Recalled for assessment: 77911 → 79441

Cancers detected: 10003 → 10467

Lives saved per year: 300 → 1250 (by 2010)


Distributed team of doctors perform the analysis
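
As a quick worked check, the figures above can be turned into rates. The counts are copied from the slide; the screened totals are rounded there (1.4M and 1.5M), so the resulting rates are approximate.

```python
# Counts as quoted on the slide; "screened" is a rounded figure.
figures = {
    "2001-02": {"screened": 1_400_000, "recalled": 77_911, "cancers": 10_003},
    "2002-03": {"screened": 1_500_000, "recalled": 79_441, "cancers": 10_467},
}

rates = {}
for year, f in figures.items():
    rates[year] = {
        # Share of screened women recalled for assessment, in percent.
        "recall_pct": round(100 * f["recalled"] / f["screened"], 1),
        # Cancers detected per 1000 women screened.
        "cancers_per_1000": round(1000 * f["cancers"] / f["screened"], 1),
    }

for year, r in rates.items():
    print(year, r)
```

On these numbers roughly one woman in eighteen is recalled, and about seven cancers are detected per thousand women screened.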



[Architecture diagram. Sites: UCL, KCL, UED, CHU. Labels: DB2 Content Manager (four instances), DB2 Federation, OGSA-DAI, Database, Files, Core Services, Training Services, Core API, Training API, Core & Training API, Data Load, Training App, Training Application.]



eDiaMoND Findings:


OGSA-DAI provides a flexible framework

Dynamically configure the system through discovery

Activities can operate with different levels of granularity

Federation can be introduced at various levels

Good documentation on how to extend the framework

Extended Activities to access IBM DB2 Content Manager

Changes between versions broke some things


Low level XML issues


FirstDIG


Data mining with the First Transport Group, UK


Example: "When buses are more than 10 minutes late there is an 82% chance that revenue drops by at least 10%"

"The results of this exercise will revolutionise the way we do things in the bus industry." Darren Unwin, Divisional Manager, First South Yorkshire.


[Diagram: Data Mining Application, OGSA-DAI Client Application, four OGSA-DAI instances.]


INWA


Innovation Node: Western Australia


Informing Business & Regional Policy: Grid-enabled fusion of global data and local knowledge

Project

Ran from Nov 2003 to Aug 2004


Involved 10 partners (6 UK + 4 Australia)


Aim


Data mine commercially sensitive data


Security an absolute MUST


Employ Grid technologies


Need access to data and computational resources


Demonstrator using:


OGSA-DAI

Incorporate data resources

Sun DCG's TOG (Transfer-queue Over Globus)


Handle job submission to analyse micro array data


INWA

[Deployment diagram. Sites: Curtin, Australia and EPCC, UK. Users: user@australia, user@edinburgh. Components at each site: Grid Engine, TOG, Data Browser, OGSA-DAI, Bank, Telco. Data sets: Telco data, Bank data, Australian property, UK property.]


INWA: Lessons Learned


Performing Data Integration:


TimeZone date problems


Security issues:


Bugs in JavaCoG in GT3

OGSA-DAI could not switch security for Grid data transfers

TOG had no security option

All of these have been fixed

Middleware not mature enough for commercial deployment
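
The slide does not say what the time zone problems were. One common class of problem when integrating data across the UK and Western Australia is comparing local timestamps that carry no zone information. A minimal sketch of the usual remedy, normalising to UTC before comparing, is below; the fixed UTC+8 offset for Perth and the sample timestamps are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC+8 offset for Western Australia, no daylight saving.
PERTH = timezone(timedelta(hours=8), "AWST")
UTC = timezone.utc


def to_utc(naive_local, tz):
    """Attach the zone a naive timestamp was recorded in, then convert to UTC."""
    return naive_local.replace(tzinfo=tz).astimezone(UTC)


# The same instant, recorded locally at each site (sample values).
recorded_in_perth = datetime(2004, 3, 1, 7, 30)
recorded_in_uk = datetime(2004, 2, 29, 23, 30)   # UK is on UTC in February

naive_dates_match = recorded_in_perth.date() == recorded_in_uk.date()
au = to_utc(recorded_in_perth, PERTH)
uk = to_utc(recorded_in_uk, UTC)
```

Compared naively, the two records fall on different calendar days; after normalisation they are the same instant.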



Why OGSA-DAI?


Why use OGSA-DAI over JDBC?


Can embed additional functionality at the service end


Transformations, compressions


Third party delivery


The extensible activity framework


Avoiding unnecessary data movement


Common interface to heterogeneous data resources


Relational, XML databases, and files


Usefulness of the Registry for service discovery


Dynamic service binding process


Provision of good metadata is necessary


Language independence at the client end


Do not need to use Java


Platform independence


Do not have to worry about connection technology, drivers, etc
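
The argument about embedding functionality at the service end and avoiding unnecessary data movement can be made concrete with a small comparison. This is a sketch with generated sample rows, not a measurement of OGSA-DAI: the row layout, function names and the choice of JSON plus zlib are all assumptions for illustration.

```python
import json
import zlib

# Generated sample rows standing in for a table held at the service.
ROWS = [{"id": i, "route": f"R{i % 5}", "late_min": i % 30} for i in range(1000)]


def ship_everything():
    """Client-side processing: the whole table crosses the network."""
    return json.dumps(ROWS).encode()


def ship_filtered_compressed(min_late):
    """Service-side processing: filter and compress before anything is sent."""
    selected = [r for r in ROWS if r["late_min"] > min_late]
    return zlib.compress(json.dumps(selected).encode())


raw = ship_everything()
packed = ship_filtered_compressed(min_late=10)
received = json.loads(zlib.decompress(packed))

print(len(raw), "bytes unfiltered;", len(packed), "bytes filtered and compressed")
```

With a plain database connection the filter can be pushed into the query, but a step such as compression or format transformation has to run somewhere after the query, and a service-side activity lets it run next to the data.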