Latest SC2 presentation (October 14, 2005) - LHC Computing Grid ...


INFSO-RI-508833

Enabling Grids for E-sciencE
www.eu-egee.org

ARDA status

Massimo Lamanna / LCG ARDA


Table of Contents


Introduction


Support material


Questions (Template from Matthias)


Activity


Middleware


Metadata


Prototypes


ALICE


ATLAS


CMS


LHCb


Other points


Personnel effort


Milestones


Outlook


Existing material


Last SC2 presentation:

http://lcg.web.cern.ch/lcg/PEB/arda/public_docs/ARDAatSC2.ppt

SC2: very constructive discussions (T. Doyle and J. Shank)

Recent relevant LCG presentations:

LCG PEB (March 05):
http://lcg.web.cern.ch/LCG/activities/arda/public_docs/2005/Q1/ARDA-PEBMarch05.ppt

LHCC demo (May 05):
http://lcg.web.cern.ch/LCG/activities/arda/public_docs/2005/Q2/LHCCdemo.ppt

OSG meeting (June 05):
http://lcg.web.cern.ch/LCG/activities/arda/public_docs/2005/Q2/ARDAatOSG_jun05.ppt

ARDA document page:

http://lcg.web.cern.ch/LCG/activities/arda/documents.html


Template


Presentation of workplan, milestones,


there is the SC2 request to have more detailed and meaningful
milestones


Baseline services:


what is available in the current gLite version,


development and release plan for future versions


Evolution of the experiments' grid applications:


fraction of gLite services used, intended to be used


situation and plans for testing and certification of new versions


use of gLite in SC3/SC4


use of ARDA applications in SC3/SC4


Presentation of manpower situation


Open questions


Achievements


Concerns and risks



Middleware


Mandate


Use the gLite middleware proactively in order to provide
feedback to developers and support the migration of the
experiments' systems


Implementation


Access to the development test bed


Contribution to the testing effort (gLite team, non-HEP EGEE resources)


Contribution to the set up of the preproduction service


Use of special installations for detailed tests (e.g. Taipei ARDA test bed, ATLAS-Milano resource broker)


Involvement of the team and experiments people in
meetings/reviews of gLite middleware and general discussions
(ARDA workshops)


ARDA Metadata activity (AMGA)



Access to the development test bed



Very positive experience overall


Very early access to new software



Nice feedback loop



All 4 experiments used the system to set up their early
prototypes (till beginning of 2005)


Now used to “play” with the middleware (this was always the
main goal behind the development test bed: it is not a service!)


Examples


Key components tested days after they became available on the development test bed

Watching the system since February 05: web results




Contribution to the testing
effort


The ARDA team developed "tests" to investigate basic
functionalities; in several cases these have been passed on
(the ideas, sometimes the actual code) as starting
examples for the gLite team


Examples:


FiReMan measurements (and comparison with LFC)


ACAT



LCG workshops


Data corruption tests using gLiteIO
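A data-corruption test of this kind boils down to a checksum round trip. In this sketch the transfer step is a plain local copy standing in for the grid I/O layer (the real tests went through gLiteIO); the function names are illustrative, not the actual test code.

```python
import hashlib, os, shutil, tempfile

def sha256_of(path):
    """Checksum a file block by block."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 16), b""):
            h.update(block)
    return h.hexdigest()

def round_trip_ok(src, transfer):
    """Write -> transfer -> read back: the file survived uncorrupted
    iff the checksums match. `transfer` stands in for the grid I/O layer."""
    dst = src + ".copy"
    transfer(src, dst)
    try:
        return sha256_of(src) == sha256_of(dst)
    finally:
        os.remove(dst)

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(os.urandom(1 << 20))        # 1 MiB of random payload
ok = round_trip_ok(f.name, transfer=shutil.copyfile)
os.remove(f.name)
print("round trip ok:", ok)             # -> round trip ok: True
```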




Contribution to the set up of the
preproduction service


In the early phase of setting up the PPS, we detected that
considerable EGEE application resources were being used to
contribute to the testing/certification effort without real
coordination

We provided this kind of coordination:

http://egee-na4.ct.infn.it/genapps/wiki/index.php/TestsOfGlite


A number of certification tests (“job storms”) have been ported
with ARDA resources


2 tutorials have been run by ARDA to speed up the transition of
a number of EGEE NA4 people to gLite (they provided more
tests for gLite and the PPS)


The PPS is just becoming available (October 2005)


Use of special installation for
detailed tests


Some tests require full access to the machines where
the middleware runs


Ability to install programs to "spy" on CPU usage, I/O traffic, etc.


Access to given hardware to make comparisons


E.g. LCG2 RB vs gLite RB


Try to crash the system without impacting other users



Main installations


Taipei

ATLAS-Milano installation (ATLAS Task Force activity): next slide


Use of special installation for detailed tests

First gLite pre1.4 WMS performance measurements in Milano
(30 Sept. 2005, LCG/EGEE Taskforce Meeting: First Measurement on gLite 1.4)

gLite 1.4 WMS (4 CPUs, 3 GB memory)

300 simple hello-world jobs submitted by 3 parallel threads

Submission rate ~ 4.1 jobs/sec

Dispatching rate ~ 0.08 jobs/sec

All jobs in bulk are submitted to the same CE: atlasce.lnf.infn.it

Thanks to Elisabetta Molinari for setting up the gLite WMS in Milan
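A job-storm measurement like the one above can be sketched with a small threaded harness. `submit_job` here is a stub (an assumption, not the real WMS client) that only mimics submission latency; swapping in the actual submission call gives the jobs/sec figure quoted on the slide.

```python
import threading, time

def submit_job(job_id):
    """Stub standing in for the real WMS submission call; just sleeps."""
    time.sleep(0.001)

def storm(n_jobs, n_threads):
    """Submit n_jobs from n_threads parallel threads; return jobs/sec."""
    def worker(ids):
        for i in ids:
            submit_job(i)
    # Deal the job ids round-robin across the threads.
    chunks = [range(t, n_jobs, n_threads) for t in range(n_threads)]
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    t0 = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return n_jobs / (time.perf_counter() - t0)

rate = storm(n_jobs=300, n_threads=3)   # 300 jobs, 3 threads, as in Milan
print(f"submission rate ~ {rate:.1f} jobs/sec")
```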


Involvement of the team and experiments people
in meetings/reviews of gLite middleware and
general discussions


3 ARDA workshops


Last in March


A few ARDA meetings


Informal presentations


Not very successful (but the few seminars were very nice)


EGEE link


ARDA circulated the architecture and design docs to several
experiments’ experts


ARDA suggested experiments' experts to be invited to relevant
EGEE technical fora


Other activities


Participation/presentation to GAG and the UK Metadata Group


In addition, ARDA is involved in the BaseLine service working
group and 3D working group


Metadata


ARDA was built on the idea that all the middleware would be
provided by gLite


Eventual exception: Metadata system


Key element of all experiments' systems (e.g. production systems)


ARDA studied both technology (test of experiments systems) and interface
(interaction with gLite and the HEP community)


Presentations at GAG, GridPP UK Metadata…


Good inputs from gLite


Eventually the resulting interface was accepted in gLite


Key role of the working prototype (AMGA)


Used by LHCb (Bookkeeping system)


Presented at GridPP UK Metadata group


Used for technology research (SOAP)


Recently the ARDA implementation has been integrated in gLite


It should appear in gLite 1.5


Integration OK; some activity on security has started; future integration with
catalogues is possible but not discussed in any detail yet


Used in other ARDA products providing database functionality


Notably in GANGA


By many EGEE and other non-HEP collaborators (ESR, Biomed, GILDA, UNOSAT)


Metadata: ARDA Implementation


Prototype


Validate our ideas and expose a
concrete example to interested parties


Multiple back ends


Currently: Oracle, PostgreSQL, SQLite,
MySQL


Dual front ends


TCP Streaming


Chosen for performance


SOAP


Formal requirement of EGEE


Compare SOAP with TCP Streaming


Also implemented as standalone
Python library


Data stored on the file system

[Architecture diagram: clients (C++ and a standalone Python library in a Python interpreter) talk to the Metadata Server over SOAP or TCP streaming; the server sits in front of Oracle, PostgreSQL, SQLite and the file system.]
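The catalogue operations the prototype exposes can be sketched as follows: a minimal, illustrative metadata catalogue with an SQLite back end. The class and method names (addentry, query) only mirror AMGA-style operations; they are not the actual AMGA API.

```python
import sqlite3

class MetadataCatalogue:
    """Toy metadata catalogue: entries addressed by path, with free-form
    attributes. Illustrative only -- mirrors AMGA-style operations."""

    def __init__(self, db_path=":memory:"):
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS entries "
            "(path TEXT, attr TEXT, value TEXT, PRIMARY KEY (path, attr))")

    def addentry(self, path, **attrs):
        """Create or update an entry and set its attributes."""
        for attr, value in attrs.items():
            self.db.execute(
                "INSERT OR REPLACE INTO entries VALUES (?, ?, ?)",
                (path, attr, str(value)))

    def query(self, attr, value):
        """Return the paths whose attribute matches the given value."""
        rows = self.db.execute(
            "SELECT path FROM entries WHERE attr = ? AND value = ?",
            (attr, value))
        return [r[0] for r in rows]

cat = MetadataCatalogue()
cat.addentry("/lhcb/bookkeeping/run001", events=10000, status="done")
cat.addentry("/lhcb/bookkeeping/run002", events=5000, status="running")
print(cat.query("status", "done"))   # -> ['/lhcb/bookkeeping/run001']
```

Swapping the sqlite3 connection for Oracle, PostgreSQL or MySQL drivers is what the multiple-back-ends design amounts to: the query interface stays the same.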

Metadata: ARDA Implementation
Dual Front End

TCP Streaming front end:

Text-based protocol

Data streamed to client in a single connection

Implementations: server in C++ (multiprocess); clients in C++, Java, Python, Perl, Ruby

SOAP front end:

Most operations are SOAP calls

Based on iterators: a session is created, an initial chunk of data is returned together with a session token, and for each subsequent request the client calls nextQuery() using the session token

Session closed when: end of data, the client calls endQuery(), or the client times out

Implementations: server with gSOAP (C++); WSDL tested with gSOAP, ZSI (Python) and AXIS (Java) clients

[Sequence diagrams: with streaming, one query creates a DB cursor and all data chunks are streamed back over a single connection; with SOAP iterators, the query creates a DB cursor and each further chunk of data is fetched with a nextQuery call.]
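The iterator-based session handling described above can be sketched in a few lines. QueryServer, its chunk size, and the method names are illustrative stand-ins for the gSOAP service, not the actual AMGA code.

```python
import itertools, uuid

class QueryServer:
    """Sketch of the SOAP-with-iterators pattern: the server keeps one
    cursor per session and hands data back in chunks."""

    CHUNK = 2

    def __init__(self, table):
        self.table = table
        self.sessions = {}          # token -> iterator over remaining rows

    def query(self, predicate):
        """First call: create a cursor, return session token + initial chunk."""
        it = iter([row for row in self.table if predicate(row)])
        token = str(uuid.uuid4())
        self.sessions[token] = it
        return token, list(itertools.islice(it, self.CHUNK))

    def next_query(self, token):
        """Subsequent calls: next chunk; an empty chunk closes the session."""
        chunk = list(itertools.islice(self.sessions[token], self.CHUNK))
        if not chunk:
            del self.sessions[token]        # end of data
        return chunk

    def end_query(self, token):
        """Client aborts early; server discards the cursor."""
        self.sessions.pop(token, None)

server = QueryServer(table=[1, 2, 3, 4, 5])
token, rows = server.query(lambda r: r > 1)   # initial chunk: [2, 3]
while True:
    chunk = server.next_query(token)
    if not chunk:
        break
    rows.extend(chunk)
print(rows)                                   # -> [2, 3, 4, 5]
```

The streaming front end avoids this bookkeeping entirely: one request, one connection, all chunks pushed back to the client.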

AMGA at ACAT

Scalability with multiple clients, pings (University of Coimbra):

Measure scalability of the protocols (1000 pings, switched 100 Mbit/s LAN)

TCP-S is 3x faster than gSOAP (with keepalive)

Poor performance without keepalive

Around 1000 ops/sec (both gSOAP and TCP-S)

[Plot: average throughput (calls/sec) vs number of clients, for TCP-S and gSOAP with and without keepalive; without keepalive the client ran out of sockets.]

SOAP toolkit performance (University of Coimbra):

Test protocol performance, no work done on the back end (1000 pings, switched 100 Mbit/s LAN)

Language comparison: TCP-S shows similar performance in all languages; SOAP performance varies strongly with the toolkit

Protocol comparison: keepalive improves performance significantly; on Java and Python, SOAP is several times slower than TCP-S

[Plot: execution time (s) for C++ (gSOAP), Java (Axis) and Python (ZSI), for TCP-S and SOAP with and without keepalive.]
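The keepalive effect in the measurements above comes almost entirely from saved connection setup. A back-of-the-envelope model makes the shape of the effect visible; the latency numbers below are assumptions chosen only for illustration, not the measured values.

```python
def throughput(n_calls, rtt_ms, setup_ms, keepalive):
    """Calls per second for n_calls pings: every call pays one round trip;
    without keepalive each call also pays a fresh connection setup."""
    per_call_ms = rtt_ms + (0.0 if keepalive else setup_ms)
    total_s = n_calls * per_call_ms / 1000.0
    return n_calls / total_s

# Illustrative LAN numbers: 1 ms round trip, 2 ms connection setup.
no_ka = throughput(1000, rtt_ms=1.0, setup_ms=2.0, keepalive=False)
ka = throughput(1000, rtt_ms=1.0, setup_ms=2.0, keepalive=True)
print(f"no keepalive: {no_ka:.0f} calls/s, keepalive: {ka:.0f} calls/s")
# With these numbers keepalive triples the ping rate -- the same order
# of effect as the TCP-S vs gSOAP keepalive measurements above.
```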

Metadata: ARDA Implementation:
Security Concepts


Security very important for BioMed (more than for HEP)

They need confidentiality, not only basic access control

Security
↔ Speed


Standalone catalogue has:

ACLs for directories and Unix permissions for directories/entries

Built-in group management, as in AFS

AMGA + LFC back end:

POSIX ACLs + Unix permissions for directories/entries
(ACLs currently not checked: slow!)


Users/groups via VOMS


Currently no security on attribute basis


AMGA allows the creation of views: safer, faster, similar to an RDBMS

Security tested by the GILDA team for the standalone catalogue; they liked the
built-in group management & ACLs, but we need feedback from BioMed!

The extra effort is largely counterbalanced by having more active users


ARDA prototypes: starting point

LHC Experiment | Main focus | Basic prototype component/framework | Middleware
LHCb | GUI to Grid | GANGA/DaVinci | gLite
ALICE | Interactive analysis | PROOF/AliROOT | gLite
ATLAS | High-level services and integration | DIAL/Athena | gLite
CMS | Explore/exploit native gLite functionality & integration | ORCA | gLite


GANGA

What is Ganga (Jakub.Moscicki@cern.ch)

[Diagram: Ganga4 lets the user prepare and configure jobs, store & retrieve job definitions, submit and kill jobs, get output and update status; applications (Gaudi, Athena, scripts) run through a single Job abstraction on many back ends (AtlasPROD, DIAL, DIRAC, LCG2, gLite, localhost, LSF), plus split, merge, monitor and dataset selection.]
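The uniform-interface idea in the diagram can be sketched as a job object with a pluggable backend. The class names below are illustrative stand-ins, not Ganga's actual API: one Job abstraction, many interchangeable backends.

```python
class LocalBackend:
    """Stand-in backend; a real one would wrap LSF, LCG2, gLite, DIRAC..."""
    def submit(self, job):
        job.status = "submitted"
    def kill(self, job):
        job.status = "killed"

class Job:
    """One job definition; the same calls work whatever backend is plugged in."""
    def __init__(self, application, backend):
        self.application = application
        self.backend = backend
        self.status = "new"
    def submit(self):
        self.backend.submit(self)
    def kill(self):
        self.backend.kill(self)

# Switching from local execution to a grid backend changes only one argument.
j = Job(application="Athena", backend=LocalBackend())
j.submit()
print(j.status)   # -> submitted
```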

Ganga4

Internal architecture (Jakub.Moscicki@cern.ch)

Ganga 4 is decomposed into 4 functional components: Client, Application Manager, Job Manager & Monitoring, Job Repository & Workspace

These components also describe the components in a distributed model

Strategy: design each component so that it could be a separate service, but allow two or more components to be combined into a single service

Major rewrite:

End 2004, beginning of 2005

Key contribution of the ARDA team

Hands-on activity at CERN

GANGA workshop (London, June 2005): http://agenda.cern.ch/fullAgenda.php?ida=a052763


GANGA (ATLAS and LHCb)


Common project (ATLAS and
LHCb)


Cornerstone of the ARDA-LHCb activity from the beginning


More and more at the centre of
ATLAS strategy


Tutorials + presentations in the
User Task Force (led by F. Gianotti)


ATLAS specific:


Good perspective of integration
with the
production system



D. Liko (CERN/ARDA) is the new
distributed analysis coordinator

GANGA ATLAS team

Karl Harrison

C.L. Tan

Dietrich Liko

"They did all the work. I joined much later."

Further resources

Coordinator: Ulrik Egede

GridPP

A. Soroko

ARDA

Jakub Moscicki

Andrew Maier

ARDA Metadata (AMGA)

Birger Koblitz

Nuno Santos

ARDA defined the interface for the gLite metadata, and recently the implementation became part of gLite itself

ARDA-LHCb


ALICE: interactive parallel

PROOF@GRID: multitier hierarchical setup with xrootd read-cache
(GridKA Schule, 30 September 2005)

[Diagram: client, PROOF master, submasters, proofd/xrootd, MSS; a local file catalogue and a storage index catalogue across sites. Depending on the catalogue model, LFNs can be either resolved by the PROOF master using a centralized file catalogue, or only MSS-indexed and resolved by submasters and local file catalogues.]

PROOF@GRID: 1-cluster setup. Evolution of last year's activity (SuperComputing 2004). Key ARDA contributions: integration with the underlying grid services and improvements in the PROOF sector (connectivity, etc.)


ALICE

Access to grid services via the C-Access Library
(Massimo Lamanna, FNAL Computing Seminar, 31-MAY-2005)

ARDA shell + C/C++ API

[Diagram: the client application uses the C API (POSIX-style) through a security wrapper (GSI, SSL, UUEnc), talking gSOAP/TEXT to the server-side service behind a matching security wrapper.]

A C++ access library for gLite has been developed by ARDA:

High performance

Protocol quite proprietary...

Essential for the ALICE prototype

Generic enough for general use

Using this API, grid commands have been added seamlessly to the standard shell


Catalogue inspection


CMS


ARDA developed a successful prototype


Tool for concrete investigation (ASAP)


Used to demo gLite for the LHCC


Very important (and positive) users’ feedback


Future (present)


Prototype


convergence on the CMS system


EGEE2


not only analysis: production is important!


In the framework of the CMS structure (Integration and
development activities and CMS LCG task force):


Key components of ASAP refactored and contributed to the CMS
framework (CRAB)


Contributions to CMS dashboard (aggregation of monitoring
information and high-level view of all CMS computing activities)


Informal set of milestones
. Close interaction with the other
contributors within the CMS task force


CMS dashboard

http://www-asap.cern.ch/dashboard/

[Screenshots: CMS user jobs; CMS jobs & I/O vs time; SC3 jobs]

The system collects info from R-GMA (middleware), MonALISA (CMS realtime) and the submission tools (Production, CRAB, ASAP)

This is essential (for CMS) and very instructive for the MW (we are using it to study the "efficiency" of one system at a time...)


Effort


Original envelope:


4 FTEs from EGEE


Matching funds from LCG: 4 FTEs


~6 FTEs


More people
interested/attracted


2 PhD students


CMS thesis (Brunel University)


Dependability thesis (Coimbra University)


Collaboration with LCG Russia (coordinate visits at CERN)


Very successful


Collaboration with ASCC Taipei


Very successful (2 FTEs at CERN, 2 FTEs at ASCC)


Within EGEE


Very important role


Especially in "testing":

other colleagues enabled/coordinated to contribute
to the gLite testing and certification


Very positive working environment


So far, people were primarily attached to one experiment, but being in
a single team they interact and augment their efficiency


Milestones


No major problem so far


But note that 2005 (internal) milestones are basically about using gLite
and giving feedback



Template: Use the gLite middleware (version n) on the extended
prototype (eventually the pre-production service) and provide feedback
(technical issues, and collect high-level comments and experience from
the experiments)



During the year, we decided to accept “delays” (LCG Quarterly Reports):


Waiting for gLite 1.0 to arrive and stabilise (mainly Q1), we focused with one
experiment (ATLAS) on middleware evaluation: the results at that stage were
de facto valid for all the other experiments (basic functionality), while putting
all the effort on the prototype and development activity (e.g. Ganga4,
ASAP…)


Within the ALICE activity (now ALICE task force), a lot of studies of the new
middleware were done by non-ARDA persons, and we agreed to focus on the
specific ARDA contribution


I think there is a problem (typo) on the table on the web



WBS | Description | Due | Done
1.6.18 | E2E prototype for each experiment (4 prototypes), capable of analysis (or advanced production) | 31-12-04 | 31-12-04
1.6.19 | E2E prototype for each experiment (4 prototypes), capable of analysis and production | 31-12-04 | (main 2005 milestone)



ALICE: Our contribution focuses almost 100% on advanced analysis
features. Contributions on xrootd and the monitor will be common with the
production usage


ATLAS: Experience in both submitting to the Grid and to the production system.
We hope that Ganga will be integrated as the common "front end"


CMS: Working very closely in the Task Force. The monitor part receives
data from both analysis and production jobs. Hopefully some part of the
task manager will also be used in the future production system


LHCb
: GANGA submits to multiple back ends (including Grid) and DIRAC



Milestones evolution


Since July, the LCG Task Forces have been set up


Under experiments’ lead


Experiments milestones will (should) become our milestones


EGEE2 perspectives


Invitation to include production in the future


As a matter of fact, contained in our milestone…


Clearly our mandate was more "user-analysis biased"


Already happening


For a while now: e.g. submission of analysis jobs to the production
system (cf. Ganga workshop in June)


In the task forces: e.g. CMS Dashboard


Not yet formalised


Wait for April 2006?




SC3/SC4


Main interest: SC4


Document + discussions (from arda.cern.ch):


http://lcg.web.cern.ch/LCG/activities/arda/public_docs/2005/Q3/SC4.doc

http://lcg.web.cern.ch/LCG/activities/arda/public_docs/2005/Q3/ARDAatSC4-preGDB.ppt


Working document exposing several use cases


On one side, we expect to be involved “via the
experiments”


This is natural due to the increasing integration in the
experiments’ plan


Service challenges flowing into the pilot services


On the other side, it is ARDA's role to prepare for the analysis
challenges using the present understanding and experience


It was a successful approach, for example, in studying the
metadata problem:



SC3/SC4


What does it mean "to prepare for the analysis" in concrete terms?


“Batch” analysis:


Low-latency jobs

Access to worker nodes for a mixture of production (~10 h) and
analysis (<1 h) jobs


Users dispatching large number of jobs


Job resubmission (using experiments' policies)


“Interactive” analysis


Low-latency access to partial results

Low-latency interaction with Grid services


Use our experience with the C-Access Library, xrootd, PROOF, GANGA, DIANE…


Efficient use of experiment-specific services


CMS MyFriend layer // VOBOX // Edge Services
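Job resubmission under an experiment policy reduces to a retry loop with a cap. The sketch below is illustrative: run_job is a stub, and a fixed retry cap stands in for a real experiment policy (which might also blacklist sites or back off between attempts).

```python
def resubmit(run_job, max_attempts=3):
    """Retry a failing job up to max_attempts times.
    Returns the attempt number on success, None if we gave up."""
    for attempt in range(1, max_attempts + 1):
        if run_job():
            return attempt          # succeeded on this attempt
    return None                     # policy exhausted

# A job that fails twice and then succeeds: resubmission recovers it.
outcomes = iter([False, False, True])
attempt = resubmit(lambda: next(outcomes))
print(attempt)   # -> 3
```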


Outlook



Achievements


Positive contributions to the experiments


Positive partnership with gLite


As a “side effect” of our EGEE contribution: good contacts and
exchange of ideas and experience with other scientific communities
in EGEE



Concerns and risks


Coherence between the middleware development, the
deployment and the experiments is a very delicate plant…