Grid Data Management - GridPP

WP2: Data Management

Gavin McCance

University of Glasgow

November 5, 2001

Overview

Deliverables


Replication: GDMP


Meta-data: Spitfire

GridPP effort

Future work


Query Optimisation


Deliverables

EU DataGrid WP2: major M9 deliverables met


GDMP delivered


Spitfire delivered


Architecture Document



http://www.cern.ch/grid-data-management


GDMP

Generic mirroring tool for any file type (read-only replicas)


Particular plug-ins for Objectivity database files

Subscription model for automatic synchronisation of files

Automatic update of replica catalogue


Currently uses the Globus Replica Catalogue

http://cmsdoc.cern.ch/cms/grid/

…GDMP

BrokerInfo API from WP1



Allows users of GDMP to obtain information from the job scheduler

Mass Storage Interface from WP5


e.g. support for file staging

Security is provided via standard GSI (single sign-on)


Authorisation via grid mapfile

File transfer made using GridFTP

Installation: RPM and tarball
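
As an illustration of the grid mapfile authorisation mentioned above, here is a minimal Python sketch that parses mapfile-style text and looks up the local account for a certificate subject. It assumes the common one-entry-per-line layout of a quoted subject followed by a single account name; the subjects and account names are hypothetical, and this is not code from GDMP or Globus.

```python
import shlex


def parse_grid_mapfile(text: str) -> dict:
    """Return a mapping of certificate subject -> local account name.

    Simplification (assumption): exactly one account per entry.
    """
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # handles the quoted subject
        if len(parts) != 2:
            raise ValueError(f"unexpected grid mapfile line: {line!r}")
        subject, account = parts
        mapping[subject] = account
    return mapping


def authorise(subject: str, mapping: dict):
    """Return the mapped local account, or None if the subject is unknown."""
    return mapping.get(subject)


# Hypothetical entries, for illustration only.
EXAMPLE = '''
# subject                                      local account
"/O=Grid/O=Example/OU=example.org/CN=Jane Doe" gdmpuser
"/O=Grid/O=Example/OU=example.org/CN=John Roe" reader01
'''

mapping = parse_grid_mapfile(EXAMPLE)
jane = authorise("/O=Grid/O=Example/OU=example.org/CN=Jane Doe", mapping)
stranger = authorise("/O=Grid/O=Example/CN=Unknown", mapping)
print(jane, stranger)
```

A subject that is not listed gets no account, so the request would be refused.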

…GDMP usage

1. A, B) Start GDMP services (inetd)

2. B) Registers itself with site A

gdmp_host_subscribe

3. A) New files: register them

gdmp_register_local_file <path-to-file>

This updates the local catalogue (on A)

4. A) Tell the world (well, all subscribed sites)

gdmp_publish_catalogue

This will update the import catalogue on all subscribed sites
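
The subscribe, register and publish steps above can be pictured with a small in-memory model. This is only a toy sketch in Python: the `Site` class, its method names and the file path are hypothetical and merely mirror the command names on the slide; it is not GDMP code and does no networking.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Site:
    """Toy stand-in for a site running the GDMP services."""
    name: str
    local_catalogue: list = field(default_factory=list)   # files registered here
    import_catalogue: list = field(default_factory=list)  # files announced to us
    subscribers: list = field(default_factory=list)       # sites subscribed to us

    def host_subscribe(self, producer: "Site") -> None:
        """Step 2: register this site with a producer site."""
        if self not in producer.subscribers:
            producer.subscribers.append(self)

    def register_local_file(self, path: str) -> None:
        """Step 3: add a new file to the local catalogue."""
        if path not in self.local_catalogue:
            self.local_catalogue.append(path)

    def publish_catalogue(self) -> None:
        """Step 4: announce the local catalogue to every subscribed site."""
        for site in self.subscribers:
            for path in self.local_catalogue:
                if path not in site.import_catalogue:
                    site.import_catalogue.append(path)


site_a = Site("A")
site_b = Site("B")
site_b.host_subscribe(site_a)                  # B subscribes to A
site_a.register_local_file("/data/run42.db")   # hypothetical file on A
site_a.publish_catalogue()                     # A tells its subscribers
print(site_b.import_catalogue)
```

After publishing, site B knows which files exist on A but has not yet transferred anything.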

[Diagram: site A and site B]

…GDMP usage

5. B) Get the new files from site A

gdmp_replicate_get

The new files will be transferred from site A to site B

Globus replica catalogue updated

Filters so you only get the files you want

CRC checking of file transfer
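
The filtering and CRC checking mentioned above can be sketched as follows. This is a Python illustration under stated assumptions: a local file copy stands in for the GridFTP transfer, the filter is a simple shell-style pattern, and the function names and file names are hypothetical. It does not reproduce how GDMP implements either feature.

```python
import fnmatch
import shutil
import tempfile
import zlib
from pathlib import Path


def crc32_of(path: Path) -> int:
    """CRC-32 of a file, read in chunks so large files need little memory."""
    crc = 0
    with open(path, "rb") as handle:
        while chunk := handle.read(1 << 20):
            crc = zlib.crc32(chunk, crc)
    return crc


def replicate(import_catalogue, source_dir: Path, dest_dir: Path, pattern: str):
    """Copy catalogue entries matching `pattern`; verify each copy by CRC."""
    fetched = []
    for name in fnmatch.filter(import_catalogue, pattern):
        src, dst = source_dir / name, dest_dir / name
        shutil.copyfile(src, dst)  # local copy stands in for the real transfer
        if crc32_of(src) != crc32_of(dst):
            dst.unlink()  # discard the corrupted copy
            raise IOError(f"CRC mismatch for {name}")
        fetched.append(name)
    return fetched


with tempfile.TemporaryDirectory() as dir_a, tempfile.TemporaryDirectory() as dir_b:
    site_a, site_b = Path(dir_a), Path(dir_b)
    (site_a / "run1.db").write_bytes(b"event data")
    (site_a / "notes.txt").write_bytes(b"not wanted")
    got = replicate(["run1.db", "notes.txt"], site_a, site_b, "*.db")
    copied = sorted(p.name for p in site_b.iterdir())
    print(got)
```

Only the file matching the filter is copied, and a copy is kept only if its checksum equals that of the source.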

Spitfire

Provides grid-enabled access to any relational database


SQL Database Service


Storage of general meta-data


Service Index soon…

Secure access via GSI (single sign-on)

Installation: RPM and tarball

http://hep-proj-spitfire.web.cern.ch/hep-proj-spitfire/






Allows any HTTP-compliant system (e.g. web browsers or standard C++ HTTP libraries) to access any relational database across the grid…

…Spitfire

Oracle / PostgreSQL + Grid Security + Standard communication protocols (XML over HTTPS) = SQL Database Service (Spitfire)

Java Servlet based

…Spitfire security

Authentication is currently provided


Standard user and server grid certificates


For both application programs and web browsers

Authorisation matrix coming soon…


Will map grid identity to ‘role(s)’


Reader, info-update, manager


‘Roles’ will then map to a given database connection with given permissions on a database


E.g. query-only, insert, update, create new tables
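
One way to picture the planned authorisation matrix is as two lookups: grid identity to roles, then roles to permissions. The Python sketch below uses the role names from the slide, but the assignment of permissions to each role, the certificate subjects and the function names are assumptions made for illustration; the actual Spitfire design may differ.

```python
# Role names are from the slide; the permissions per role are an assumption.
ROLE_PERMISSIONS = {
    "reader": {"query"},
    "info-update": {"query", "insert", "update"},
    "manager": {"query", "insert", "update", "create-table"},
}

# Hypothetical grid identities (certificate subjects) and their roles.
IDENTITY_ROLES = {
    "/O=Grid/O=Example/CN=Jane Doe": {"reader"},
    "/O=Grid/O=Example/CN=John Roe": {"info-update", "manager"},
}


def permissions_for(subject: str) -> set:
    """Union of the permissions of every role held by this identity."""
    perms = set()
    for role in IDENTITY_ROLES.get(subject, ()):
        perms |= ROLE_PERMISSIONS[role]
    return perms


def is_allowed(subject: str, operation: str) -> bool:
    """True if any of the identity's roles grants the operation."""
    return operation in permissions_for(subject)


jane = "/O=Grid/O=Example/CN=Jane Doe"
john = "/O=Grid/O=Example/CN=John Roe"
print(is_allowed(jane, "query"), is_allowed(jane, "insert"))
print(is_allowed(john, "create-table"))
```

An identity with no entry in the matrix ends up with an empty permission set, so every operation is refused by default.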

…Spitfire

Easy to install

Good documentation

Ready to run examples


For grid-based meta-data catalogue needs…


… we need feedback!

WP2 GridPP Effort

Based at Glasgow

Effort will focus primarily on the query optimisation task of WP2


1 PhD student, 1.5 RA


Continuing effort in development of Spitfire and related applications


0.7 RA

Future Spitfire work

Look at common ground between WP2 and WP3


Spitfire and R-GMA?

Security


Authorisation mechanisms

Other Spitfire applications


Service Index, Replica Catalogue

Work on scalable architectures


Common with e.g. replica catalogue work

Query Optimisation work

Categorise possible areas for optimisation:


User oriented: high performance


Minimising cost for specific job


Grid oriented: high throughput


Maximise efficient usage of resources


Site oriented: local policy


Respond to specific site policies / requirements


Much preliminary work done!

Workshop in December 2001…

…Query Optimisation

Short term:


Data Access optimisation


Replica Optimiser component


How long will it take to get the data here?


Developing and evaluating appropriate algorithms for working this out and choosing the best replica…
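
To make the replica-selection question concrete, here is a deliberately simple Python sketch. It assumes the time to fetch a file can be estimated as a fixed start-up delay plus size divided by bandwidth, and it picks the replica with the smallest estimate. The cost model, the site names and the numbers are all hypothetical; the algorithms under evaluation in WP2 are not described in these slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    site: str
    bandwidth_mb_s: float  # assumed sustained throughput to the local site
    latency_s: float       # assumed fixed start-up cost, e.g. file staging


def estimated_transfer_time(replica: Replica, size_mb: float) -> float:
    """Seconds needed to fetch `size_mb` megabytes from this replica."""
    return replica.latency_s + size_mb / replica.bandwidth_mb_s


def best_replica(replicas, size_mb: float) -> Replica:
    """Pick the replica with the lowest estimated transfer time."""
    return min(replicas, key=lambda r: estimated_transfer_time(r, size_mb))


# Hypothetical replicas: one fast but slow to start, one slow but immediate.
replicas = [
    Replica("site-a", bandwidth_mb_s=10.0, latency_s=60.0),
    Replica("site-b", bandwidth_mb_s=2.0, latency_s=1.0),
]

small = best_replica(replicas, size_mb=10.0)    # 61 s at site-a vs 6 s at site-b
large = best_replica(replicas, size_mb=1000.0)  # 160 s at site-a vs 501 s at site-b
print(small.site, large.site)
```

Even in this toy model the best choice depends on the file size, which is why the estimate has to be computed per request rather than fixed per site.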

…Query Optimisation

Modelling and Simulation


Best not to test out the crazier algorithms on the experiment testbed

Work underway with MONARC tool


Evaluating suitability as simulation tool for this particular work


Integrate into the QO work

Summary

Major deliverables for M9 met


GDMP and Spitfire


GridPP will concentrate effort on the Query Optimisation task of WP2


+ continued Spitfire development


Work already underway