Middleware Summary - GridPP


2nd July 2003, GridPP7, Oxford, R.Middleton

Middleware Summary

Robin Middleton (RAL/PPD)


Overview


What’s in EDG2.0


What’s planned for EDG2.1 (Rel 3.0)


Towards GridPP2


General



Requirements


HEPCAL


43 Use cases


EDG 1.4


6 Fully implemented


12 Mostly satisfied (restrictions/complications)


16 not implemented as functionality missing


9 partially implemented


EDG 2.0


Re-assessment (for 1.4, 2.0 & 2.1) by AWG (FH) by 11th July


Authorisation, job control, optimisation improvements


Missing features


Virtual Data (not within EDG scope)


MetaData catalogues (some m/w support, but exp. clarification needed)



EDG 2.0


What’s in 2.0 ?


Starting from Rel 1.4…


Move to RH7.3 & use LCFGng


Jan 2003


Move to Globus 2.2 & Condor 6.4


Feb 2003


Based on the VDT packaging; compatibility with other projects


http://www.lsc-group.phys.uwm.edu/vdt/home.html


RB supports interactive jobs, MPICH, checkpointing


RLS - Replica Location Service, etc (see J.Casey’s talk)


RGMA as backbone of information & monitoring system (S.Fisher talk)


Updated to GLUE schema


Storage Element (see J.Jensen’s talk)


Network Cost Function


Job Checkpointing

[Diagram: a Job calls state.saveState() to save its checkpoint state to the Logging & Bookkeeping (LB) Server. Job checkpoint states are saved in the LB server, and the job checkpoint can be retrieved from it.]

Also used (even in rel. 1) as repository of job status info

Already proved to be robust and reliable

The load can be distributed between multiple LB servers, to address scalability problems
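
The checkpointing flow on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the EDG API: `LBServer`, `JobState`, `pick_server`, `run_job` and the hash-based spreading of jobs over servers are hypothetical; only the call name `saveState()` comes from the slide.

```python
import hashlib


class LBServer:
    """Hypothetical in-memory stand-in for a Logging & Bookkeeping server."""

    def __init__(self, name):
        self.name = name
        self._states = {}  # job id -> list of saved checkpoint states

    def store(self, job_id, state):
        self._states.setdefault(job_id, []).append(dict(state))

    def last_state(self, job_id):
        saved = self._states.get(job_id)
        return dict(saved[-1]) if saved else None


def pick_server(servers, job_id):
    """Assumed policy: spread jobs over several LB servers by hashing the job id."""
    digest = hashlib.sha256(job_id.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


class JobState:
    """Key/value state of a job; saveState() mirrors the call named on the slide."""

    def __init__(self, job_id, servers):
        self.job_id = job_id
        self._server = pick_server(servers, job_id)
        # Retrieval: start from the last checkpoint if one exists.
        self.values = self._server.last_state(job_id) or {}

    def saveState(self):
        self._server.store(self.job_id, self.values)


def run_job(job_id, servers, n_events, fail_at=None):
    """Process events, checkpointing after each one; returns the event it resumed from."""
    state = JobState(job_id, servers)
    start = state.values.get("next_event", 0)
    for event in range(start, n_events):
        if event == fail_at:
            raise RuntimeError("simulated crash")
        state.values["next_event"] = event + 1
        state.saveState()
    return start
```

Running `run_job` once with a simulated failure and then again with the same job id resumes from the first unprocessed event rather than from zero, which is the point of keeping the state outside the worker node.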


RLS in EDG2.0

[Diagram labels: User Interface or Worker Node; Resource Broker; Information Service; Replica Manager; Replica Location Service; Replica Optimization Service; Replica Metadata Catalog; Storage Element (two shown); SE Monitor; Network Monitor; Virtual Organization Membership Service]


RGMA in EDG2.0

[Diagram labels: LDAP InfoProvider (two shown); GIN (two shown); Stream Producer (two shown); R-GMA; GLUE Schema; Archiver (LatestProducer); RDBMS; Latest Producer; Consumer API; GOUT; LDAP Server; R-GMA Consumers: Consumer (CE), Consumer (SE), Consumer (SiteInfo)]

Push mode

Updates every 30s

>70 sites (simul.)

Ack : WP3
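
A rough Python sketch of the push-mode idea: stream producers push each update to an archiver, which keeps only the newest row per key so that consumers can query the latest state. All class names, the `ce_id`, `free_cpus` and `timestamp` fields, and the site names are assumptions made for illustration; they are neither R-GMA API calls nor GLUE schema attributes.

```python
class StreamProducer:
    """Hypothetical producer that pushes each new row to its subscribers."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, row):
        for callback in self._subscribers:
            callback(dict(row))


class Archiver:
    """Keeps only the newest row per key, in the spirit of a latest producer."""

    def __init__(self, key):
        self._key = key
        self._latest = {}

    def receive(self, row):
        current = self._latest.get(row[self._key])
        if current is None or row["timestamp"] >= current["timestamp"]:
            self._latest[row[self._key]] = row

    def query(self, predicate=lambda row: True):
        return sorted(
            (row for row in self._latest.values() if predicate(row)),
            key=lambda row: row[self._key],
        )


def demo():
    archiver = Archiver(key="ce_id")
    sites = {name: StreamProducer() for name in ("ral", "cern")}
    for producer in sites.values():
        producer.subscribe(archiver.receive)
    # Two update rounds, 30 s apart as on the slide; the values are made up.
    sites["ral"].publish({"ce_id": "ce.ral", "free_cpus": 12, "timestamp": 0})
    sites["cern"].publish({"ce_id": "ce.cern", "free_cpus": 40, "timestamp": 0})
    sites["ral"].publish({"ce_id": "ce.ral", "free_cpus": 7, "timestamp": 30})
    return archiver.query(lambda row: row["free_cpus"] > 5)
```

In this sketch the consumer never polls the sites; it only reads what the archiver has already received, so the second `ce.ral` update replaces the first.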


SE in EDG2.0

The design of the SE follows a layered model with a central core handling all paths between client and MSS.

The core is flexible and extensible, making it easy to support new protocols, features and MSS.

[Diagram labels: Client App; API; SE HTTP library; SSL socket library; AXIS; SE core; SE; Java Client; Tomcat; Client App; Java Client API; C Client; RMANMAN; Apache; Web Service]


WP7 in EDG2.0

http://comp7.in2p3.fr/wp7archive/


WP7


GridFTP Logging


WP7 Network Monitoring


EDG 2.1


EDG2.1


Integration schedule


Detailed integration times throughout July & August


Feature freeze end August…only debug, integrate & fix after this


General (effort) move from design & code -> test & bug-fix


Quality above quantity


Final software release of EDG


Some important new functionality (in the wings)



EDG 2.1 (aka Rel 3.0 !!)

MAY

JUNE

JULY

AUGUST

SEPTEMBER

Testing & Bugfixing

gcc3.2.2/RH7.3 (RH8/9 test on WN/UI); VDT update?

VOMS & Security (ACLs, LCMAPS,…)

Scalability/Stability measures (RLS, R-GMA, MSS staging,…)

New Functionality (gridOpen, DAGs, …)

(Ack: E.Laure CERN/EDG)


TB2.1


WP1


Direct interaction of RB with R-GMA (incl. Logging & Bookkeeping)


Integration with VOMS (proxy renewal)


Job dependencies & DAGman scheduling


Job partitioning


RB support for Data prefetch (depends on WP2)


Accounting & Advance Reservation


Strong dependence on underlying system


probably only a demonstrator



TB2.1 WP1 - DAGs

    A = [
        Executable = "A.sh";
        PreScript = "PreA.sh";
        PreScriptArguments = { "1" };
        Children = { "B", "C" }
    ];

    B = [
        Executable = "B.sh";
        PostScript = "PostA.sh";
        PostScriptArguments = { "$RETURN" };
        Children = { "D" }
    ];

    C = [
        Executable = "C.sh";
        Children = { "D" }
    ];

    D = [
        Executable = "D.sh";
        PreScript = "PreD.sh";
        PostScript = "PostD.sh";
        PostScriptArguments = { "1", "a" }
    ]
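
The `Children` attributes in the JDL above define a dependency graph. As a minimal sketch of how a scheduler might derive an execution order from them, the Python below groups the nodes into waves that can run once all parents have finished. The function name `submission_waves` and the wave-by-wave strategy are illustrative assumptions; this is not how DAGMan or the RB is implemented.

```python
# Children lists copied from the JDL on the slide; D declares no children.
CHILDREN = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
}


def submission_waves(children):
    """Group nodes into waves: a node is ready once all of its parents are done."""
    pending_parents = {node: 0 for node in children}
    for kids in children.values():
        for kid in kids:
            pending_parents[kid] += 1

    ready = sorted(n for n, count in pending_parents.items() if count == 0)
    waves = []
    while ready:
        waves.append(ready)
        next_ready = []
        for node in ready:
            for kid in children[node]:
                pending_parents[kid] -= 1
                if pending_parents[kid] == 0:
                    next_ready.append(kid)
        ready = sorted(next_ready)

    # Any node never reached sits on a cycle, so the graph is not a DAG.
    if sum(len(wave) for wave in waves) != len(children):
        raise ValueError("dependency cycle detected")
    return waves
```

For the graph on the slide, `submission_waves(CHILDREN)` returns `[["A"], ["B", "C"], ["D"]]`: B and C may run in parallel after A, and D waits for both.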





TB2.1 WP1 Job Partitioning

    JobType = Partitionable;
    Executable = ...;
    JobSteps = ...;
    StepWeight = ...;
    Requirements = ...;
    ...
    ...
    Prejob =
    [
        Executable = ...;
        Requirements = ...;
        ...
        ...
    ];
    Aggregator =
    [
        Executable = ...;
        Requirements = ...;
        ...
        ...
    ];
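
The slide elides the attribute values, so the following Python is only a guess at the idea behind a partitionable job. It assumes that `StepWeight` gives one relative weight per step, that steps are split into contiguous sub-jobs of roughly equal total weight, and that the job expands to prejob, then sub-jobs, then aggregator. The names `partition_steps`, `expand` and `partN` are hypothetical.

```python
def partition_steps(step_weights, n_subjobs):
    """Split steps 0..N-1 into contiguous chunks of roughly equal total weight."""
    if n_subjobs < 1:
        raise ValueError("need at least one sub-job")
    total = sum(step_weights)
    chunks, current, done = [], [], 0.0
    for step, weight in enumerate(step_weights):
        current.append(step)
        done += weight
        # Close the chunk once the running total reaches its share of the weight;
        # the last chunk simply takes whatever steps remain.
        share = total * (len(chunks) + 1) / n_subjobs
        if len(chunks) < n_subjobs - 1 and done >= share:
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks


def expand(step_weights, n_subjobs):
    """Return the chunks and (node, parents) pairs for prejob -> sub-jobs -> aggregator."""
    chunks = partition_steps(step_weights, n_subjobs)
    names = [f"part{i}" for i in range(len(chunks))]
    plan = [("prejob", [])]
    plan += [(name, ["prejob"]) for name in names]
    plan.append(("aggregator", names))
    return chunks, plan
```

With weights `[1, 1, 1, 1, 4]` and two sub-jobs, the four light steps land in one chunk and the heavy step in the other, so both sub-jobs carry about the same load.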


TB2.1


WP2


Full RLS deployment


RLI integrated with LRC


VOMS aware security


EDG Trust Manager


EDG Authorisation Manager (coarse grained)


File pre-fetch (needed by WP1)


not for 2.1


Replica Subscription Service


not for 2.1


First step towards proxy service for supporting sites w/o outbound IP


Must not compromise support RLS or security


RLS at SC2002

Ack : G.McCance

Used Globus RLS


TB2.1


WP3





General performance enhancements


Performance enhancement for GRM/PROVE use

Registry resilience (replication)

VOMS aware security (authentication + basic authorisation)

[Diagram: registry replication. Producers shown: Producer1, Producer2.]

Registry1: info mastered by Registry1; copy of info from Registry2; copy of info from Registry3

Registry2: info mastered by Registry2; copy of info from Registry1; copy of info from Registry3

Registry3: info mastered by Registry3; copy of info from Registry1; copy of info from Registry2
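
The replication scheme in the diagram, where each registry masters its own information and keeps copies from the others, might look like this in Python. It is a toy model under assumptions: the `Registry` class, its methods, and the push-a-snapshot-on-every-registration policy are invented for illustration and say nothing about the actual R-GMA replication protocol.

```python
class Registry:
    """Hypothetical registry that masters its own entries and mirrors its peers."""

    def __init__(self, name):
        self.name = name
        self.mastered = {}   # producer name -> endpoint, registered here
        self.copies = {}     # peer registry name -> copy of that peer's entries
        self.peers = []

    def connect(self, *peers):
        self.peers = [p for p in peers if p is not self]

    def register(self, producer, endpoint):
        self.mastered[producer] = endpoint
        self.replicate()

    def replicate(self):
        # Push a snapshot of the mastered info to every peer.
        for peer in self.peers:
            peer.copies[self.name] = dict(self.mastered)

    def lookup(self, producer):
        if producer in self.mastered:
            return self.mastered[producer]
        for entries in self.copies.values():
            if producer in entries:
                return entries[producer]
        return None
```

Because every registry holds a copy of the others' entries, a consumer that can reach only Registry3 can still find a producer that registered with Registry1.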


TB2.1


WP4


Resource management


GLUE info provider maintenance


Support for LSF, Condor & advance reservation


Fault tolerance framework


Gridification


LCMAPS-1.0, LCAS-2.0, VOMS plugin, job repository


Monitoring (see Jan van Eldik’s talk)


Full architecture, Oracle & MySQL backends, alarm display



New Install & Config architecture piloted at CERN, but NOT replacing LCFGng before end of EDG


Install & Config







[Diagram: Install & Config architecture. Labels: CDB; Client Nodes with CCM, Cdispd, NCM Components and SPMA (SPMA.cfg, cache); Registration; Notification; SWRep Servers holding packages (RPM, PKG), with Mgmt API and ACLs, reached over nfs, http, ftp; Installation server with DHCP handling, PXE handling and KS/JS generator, with Mgmt API and ACLs; Node Install; Node (re)install?]

Automated Installation Infrastructure

    DHCP and Kickstart (or JumpStart) are re-generated according to CDB contents
    PXE can be set to reboot or reinstall by operator

Software Repository

    Packages (in RPM or PKG format) can be uploaded into multiple Software Repositories
    Client access is using HTTP, NFS/AFS or FTP
    Management access subject to authentication/authorization

Node Configuration Manager (NCM)

    Configuration Management on the node is done by NCM Components
    Each component is responsible for configuring a service (network, NFS, sendmail, PBS)
    Components are notified by the Cdispd whenever there was a change in their configuration

Software Package Mgmt Agent (SPMA)

    SPMA manages the installed packages
    Runs on Linux (RPM) or Solaris (PKG)
    SPMA configuration done via an NCM component
    Can use a local cache for pre-fetching packages (simultaneous upgrades of large farms)

Configuration Data Base (CDB)

    Configuration Information store. The information is updated in transactions, it is validated and versioned. Pan Templates are compiled into XML profiles

Configuration Information is stored in the local cache. It is accessed via NVA-API
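
The notification step, in which components are told by the Cdispd when their configuration changes, can be illustrated with a small Python sketch. The `Dispatcher` class, the idea of one configuration subtree per component, and the profile layout as nested dictionaries are assumptions for illustration; the real Cdispd and NVA-API interfaces are not shown on the slide.

```python
class Dispatcher:
    """Hypothetical stand-in for Cdispd: compares profiles and notifies components."""

    def __init__(self):
        self._components = {}   # configuration subtree -> callback
        self._profile = {}      # last profile seen, kept like a local cache

    def register(self, subtree, callback):
        self._components[subtree] = callback

    def new_profile(self, profile):
        """Call the component of every subtree whose configuration changed."""
        notified = []
        for subtree, callback in sorted(self._components.items()):
            old = self._profile.get(subtree)
            new = profile.get(subtree)
            if old != new:
                callback(new)
                notified.append(subtree)
        # Remember a copy of the profile for the next comparison.
        self._profile = {key: dict(value) for key, value in profile.items()}
        return notified
```

Registering a `network` and an `nfs` component and then feeding two profiles that differ only in the network settings triggers both components on the first profile and only the network component on the second.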

Ack : WP4


TB2.1


WP5


SRM interface


Asynchronous interaction


SE setup for WP9(EO)/10(Bio)


VOMS aware security


Improved error handling


TB2.1


WP7




Probe Coordination Protocol





Network cost function enhancement


Network GLUE schema prototype




QoS & high throughput tests with GEANT


TB2.x User Authorisation

[Diagram labels: user; service; CA (three shown); VO-VOMS (four shown); authentication & authorization info; user cert (long life); host cert (long life); proxy cert (short life); authz cert (short life, two shown); service cert (short life); voms-proxy-init; crl update; registration (two shown); low frequency; high frequency; LCAS; edg-java-security]


TB2.1 - Security

VOMS deployment

    Server manually set up at several places
    Work on auto-config ongoing
    start testing soon

[Diagram: VOMS migration phases, with the labels shown for each]

    phase 0. testing the VOMS servers
        VO-LDAP, VOMS, user, service, proxy, grid-mapfile, voms-ldap-sync, grid-proxy-init
    phase 1. user management on VOMS
        VO-LDAP, VOMS, user, service, proxy, grid-mapfile, voms-ldap-sync, grid-proxy-init
    phase 2. compatibility mode: mixed services
        VOMS, user, service, proxy (voms), grid-mapfile
    phase 3. fully migrated: only VOMS-aware services
        VOMS, user, service, proxy (voms)

    Further labels in the diagram: VO-LDAP, grid-proxy-init, edg-mkgridmap, voms-proxy-init


SAM


D0/CDF


GridPP-2 - Middleware


GridPP-2 Middleware Directions


Policy


Mission critical to PP





OR


Demonstrable lead on international stage


OR


Contribute to wider programme leveraging benefit for PP


Guidelines


Clustering of expertise


Useful to LCG programme


Partnership/collaboration where possible (e.g. UK e-Science)


Tech. transfer to industrial sector


Awareness of / engagement with GGF (move to OGSA/I)


Areas


Data & Storage Management


Workload Management


Information & Monitoring


Security


Networking


Evolution of m/w Effort


GridPP2 - Middleware


Data Storage & Management


Fuller integration of exp meta-data with m/w

Site-local data management


caches, space reservation, cleanup


Full integration of MSS


Improved replica optimisation


Workload Management


OGSIfication of the RB


Redesign of WM architecture


Develop Java client


Autonomic aspects of WM


New scheduling algorithms


GridPP2 - Middleware


Information & Monitoring


Requirements & architecture revision/cycle


OGSIfication of core service(s) (see A.Djaoui’s talk)


QA & Production Service Dev


Information model co-ordination

End-user tools/displays


Security


LCG Security


Local Access Control


Pool accounts, GACL, /grid, batch interfaces


Local Usage Control


GridSite


VO Access Management


VO Usage Management


Interface alternative authorisation frameworks


Audit and grid intrusion detection


Tool ports to other UNIX & Windows


GridPP2 - Middleware


Networking


Next generation Grid Network Performance Measurement Service


High performance data transport


Resource allocation & reservation services


(UKLIGHT participation)


(PPNCG support)


General


EDG Quality Group


The Quality Group (QAG) was created in August 2002 with a Quality representative (QAR) from each WP. Each QAR ensures the measures are applied inside his/her WP. Chaired by Gabriel Zaquine.

http://www.eu-datagrid.org/QAG/

The Quality Group has produced an EDG developers guide document

The document gives an overview of the tools available and conventions to be followed for the software development within EDG:

    Packaging
    Code Management
    Automatic Build system
    Environment
    Interfaces and API's
    Documentation
    Test and validation process
    Integration procedure
    Style and naming conventions

http://edms.cern.ch/document/358824

Work on EDG 2.0 shows that conventions are not yet being followed by everyone

All developers must read this document and ensure their software complies

(Slide: R.Jones, Barcelona meeting)


EDG Architecture Group


ATF has been working to clarify the details of the interactions and interfaces of EDG 2.0

Continues to meet on a monthly basis
http://agenda.cern.ch/displayLevel.php?fid=3l148

Work driven by use cases provided by the application representatives

A document describing the architecture for EDG 2.0 has been produced:
https://edms.cern.ch/file/368971/

ATF has been further empowered to “own” the external interfaces

Intended to avoid discrepancies between the interface details agreed by ATF and those found in the software delivered by the mware WPs

Baseline document with interface definitions now in preparation

Mware WPs please make sure ATF have the APIs for your external interfaces

(Slide: R.Jones, Barcelona meeting)


How far have we come/to go ?

[Chart: one progress indicator per item; the indicator values are not recoverable]

Requirements (HEPCAL (+GAG))

Architecture & Design (EDG: ATF+WPs)

Implementation

Robustness (under daily operations)

Fault-tolerance (to breakages)

Scalability (promised but not proven)

Interoperability (GLUE)

(n.b. very subjective !!)


Summary


Achievements


RB, VOMS, RLS, RGMA, SE, LCFG,…


Challenges


LCG-1, LCG-2, …


OGSA migration


Engineering production quality (R^3 etc.)



END