Title - Ian Bird - Cern


EGEE-II INFSO-RI-031688

Enabling Grids for E-sciencE

www.eu-egee.org

EGEE and gLite are registered trademarks


Dr. Ian Bird

EGEE Grid Operations & Management Leader

IT Department, CERN




The EGEE Production Grid



EGEE Objectives

Large-scale, production-quality grid infrastructure for e-Science

Attracting new resources and users from industry as well as science

Maintain and further improve gLite Grid middleware

Flagship grid infrastructure project co-funded by the European Commission

Now in 2nd phase with 91 partners in 32 countries

Ian Bird - OGF/EGEE User Forum - May 9th 2007




Outline


EGEE infrastructure & services


How we got to this point


Overview of services


Status


Middleware


Training etc.


Applications


Some key successes


Interoperation/interoperability


… and related projects


EGEE and standards …


Open issues


What next?





EGEE Project Activities

Service: 54%
Middleware Development: 13%
Application support: 16%
Training: 5%
Management, Dissemination, etc.: 12%



Evolution of production grid

Timeline: 2001, 2002, 2004, 2006

Globus, Condor

Middleware & test-beds for an operational grid

Deploying results of EDG to provide 1st production service for LHC

- Starts from LCG
- Shared production infrastructure
- Extended production service to other applications
- Growth from 40 to 190 sites

Continued expansion of resources and applications communities



Applications


Many applications from a growing number of domains


Astrophysics


Computational Chemistry


Earth Sciences


Financial Simulation


Fusion


Geophysics


High Energy Physics


Life Sciences


Multimedia


Material Sciences




~200 Virtual Organisations

Applications list: https://edms.cern.ch/file/722132/3/EGEE-II-DNA4.2.1-722132-v2.5-1.pdf



The EGEE Infrastructure


Production Service

Pre-production service

Certification test-beds

Test-beds & Services

Operations Coordination Centre

Regional Operations Centres

Global Grid User Support

EGEE Network Operations Centre

Operational Security Coordination Team

Operations Advisory Group

Joint Security Policy Group

EuGridPMA (& IGTF)

Grid Security Vulnerability Group

Security & Policy Groups

Support Structures & Processes

Training infrastructure

Training activities



Growth


ROC      Partner - DoW   Partner - actual   Total   % non-partner
CERN              1800               3548    5943             40%
France            1252               2550    2700              6%
De/CH             1852               2695    3364             20%
Italy             2280               3539    3628              2%
UK/I              2010               4527    7720             41%
CE                1163               1622    1875             13%
NE                1860               2473    3031             18%
SEE               1289               2552    2568              1%
SWE                898               1535    1593              4%
Russia             445                527     583             10%
A-P                801                841    1632             48%
Total            15650              26409   34637             24%
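
The "% non partner" column of the Growth table is not defined on the slide. A minimal sketch, assuming it is the share of the Total that does not come from partner sites, i.e. (Total - Partner actual) / Total, reproduces every row when rounded to a whole percent. The function names are illustrative only.

```python
# Rows of the Growth table: (ROC, partner DoW, partner actual, total).
GROWTH = [
    ("CERN", 1800, 3548, 5943),
    ("France", 1252, 2550, 2700),
    ("De/CH", 1852, 2695, 3364),
    ("Italy", 2280, 3539, 3628),
    ("UK/I", 2010, 4527, 7720),
    ("CE", 1163, 1622, 1875),
    ("NE", 1860, 2473, 3031),
    ("SEE", 1289, 2552, 2568),
    ("SWE", 898, 1535, 1593),
    ("Russia", 445, 527, 583),
    ("A-P", 801, 841, 1632),
]


def pct_non_partner(partner_actual: int, total: int) -> int:
    """Assumed formula: non-partner share of the total, as a whole percent."""
    return round(100 * (total - partner_actual) / total)


def column_totals(rows):
    """Sum the three numeric columns over all ROCs."""
    dow = sum(r[1] for r in rows)
    actual = sum(r[2] for r in rows)
    total = sum(r[3] for r in rows)
    return dow, actual, total


if __name__ == "__main__":
    for roc, _dow, actual, total in GROWTH:
        print(f"{roc:7s} {pct_non_partner(actual, total):3d}%")
    dow, actual, total = column_totals(GROWTH)
    print(f"Total   {pct_non_partner(actual, total):3d}%  ({dow} / {actual} / {total})")
```

Under this assumption the column sums also match the Total row of the table.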



CPU, countries, sites


ROC       CPU   Countries   Sites
CERN     5943           4      12
France   2700           1      10
De/CH    3364           2      14
Italy    3628           1      37
UK/I     7720           2      25
CE       1875           7      24
NE       3031           8      27
SEE      2568           8      38
SWE      1593           2      15
Russia    583           2      15
A-P      1632           8      20

35000 CPU

45 countries (31 partner countries)

237 sites (131 partner sites)



Workload


[Chart: No. jobs / month - all (series: OPS, Non-LHC, LHC); annotation: 98000 jobs/day]

[Chart: No. jobs / month - exc. LHC + Ops (series: Other VOs, planck, magic, geant4, fusion, esr, egrid, egeode, compchem); annotation: 13000 jobs/day]



CPU time delivered


[Chart: Normalised CPU hours - all, Apr-06 to Mar-07 (series: OPS, Non-LHC, LHC); annotation: 14000 CPU-month/month]

[Chart: Normalised CPU hours - exc. LHC + Ops, Apr-06 to Mar-07 (series: OPS, Other VOs, planck, magic, geant4, fusion, esr, egrid, egeode, compchem); annotation: 3600 CPU-month, ~1/3 of total]



Overall load


19.6 million jobs run in 1st year of EGEE-II

56000 per day sustained average

Peak of 98000

Non-LHC 13500 /day

Level of total in EGEE in 2005

8400 CPU-years delivered in 1 year

~1/3 of total available sustained over the year

Peak of 50% of available in Feb ’07

~1/3 of total was non-LHC in Dec ’06
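
The figures above can be cross-checked with simple arithmetic. This is a sketch under stated assumptions: the "~1/3" is taken as exactly one third, and the derived numbers are implied averages, not values quoted on the slide.

```python
def implied_available_cpus(delivered_cpu_years: float, fraction_used: float, years: float = 1.0) -> float:
    """Average CPU count that must have been available if `delivered_cpu_years`
    represents `fraction_used` of the capacity over `years`."""
    return delivered_cpu_years / (fraction_used * years)


def share(part: float, whole: float) -> float:
    """Fraction of `whole` represented by `part`."""
    return part / whole


if __name__ == "__main__":
    # 8400 CPU-years delivered, assumed to be exactly 1/3 of what was available.
    print(round(implied_available_cpus(8400, 1 / 3)))
    # Non-LHC jobs per day relative to the sustained daily average.
    print(f"{share(13500, 56000):.0%}")
```

With these assumptions the implied average capacity is about 25000 CPUs over the year, and non-LHC work accounts for roughly a quarter of the average daily job count.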


[Chart: Cumulative norm. CPU hours (series: OPS, Non-LHC, LHC)]

[Chart: Cumulative no. jobs (series: OPS, Non-LHC, LHC)]



Grid Middleware








Applications

Higher-Level Grid Services: Workload Management, Replica Management, Visualization, Workflow, Grid Economies, ...
- Additional functionality

Foundation Grid Middleware: Security model and infrastructure, Computing (CE) and Storage Elements (SE), Accounting, Information and Monitoring
- Robustness
- Coexistence
- Interoperability



gLite Grid Middleware Services

Service groups: Workload Management, Data Management, Security, Information & Monitoring, Access

Components: API, Computing Element, Workload Management, Metadata Catalog, Storage Element, Data Movement, File & Replica Catalog, Authorization, Authentication, Information & Monitoring, Application Monitoring, Auditing, Job Provenance, Package Manager, CLI, Accounting, Site Proxy

Overview paper: http://doc.cern.ch//archive/electronic/egee/tr/egee-tr-2006-001.pdf



Middleware and Certification


The goal is to produce a middleware distribution that can be deployed widely

Certification testing:

Installation and configuration

Component (service) functionality

System testing (trying to emulate real workloads and stress testing)

Test-beds

Virtual test-beds for individual testers (~5)

Dynamically allocated test nodes (> 50 nodes)

Central certification test-bed

Distributed test-beds for specific functions





Pre-production service

Pre-production service is now ~27 sites in 16 countries

Provides access to some 3000 CPU

Some sites allow access to their full production batch systems for scale tests

Sites install and test different configurations and sets of services

Services may be initially demonstrated in this environment

Before further development

New VOs: adapt their applications & gain experience

(e.g. DILIGENT)




Grid Management Structure

Regional Operations Centres

Core support infrastructure

Grid User Support (GGUS)

Coordination, management of user support

EGEE Network Operations Centre (ENOC)

Coordination with NRENs & GEANT2

Operations Coordination Centre

Management, oversight, coordination



Grid Operations


Fully distributed

Key are the Regional Operations Centres

Many of the ROCs are themselves distributed organizations

Grid Operator on Duty

Weekly rotation of teams

Critical activity in maintaining usability and stability of sites

Important tools

Site Availability Monitoring and Testing (SAM)

Information system monitoring

GGUS system for trouble ticket management

Portal for operations: https://cic.gridops.org

Significant work on operations procedures

Evolved throughout EGEE and EGEE-II

Contribute to establishment of regional grid infrastructures through related projects

well beyond Europe now





User Support


GGUS

now well established

Use continues to grow

Most ROCs provide dedicated effort to manage the process

similar to operator on duty teams

Setting up user support advisory groups to steer the priorities

GGUS tool used for all support activities

Interlinks many local ticketing systems

[Chart: No. Tickets Processed (series: Operations, Network, User, All)]


Policy & Security



Joint Security Policy Group (JSPG)

Produces and maintains security policy and procedures

for EGEE, OSG, NDGF, WLCG, and other EU Grid infrastructures

Achieved common policy between EGEE and OSG (for interoperation)

New Grid Site Operations Policy & Updated top-level Security Policy

Grid User AUP accepted by eIRG as good approach

Current work

New policy addressing User-level Accounting (data privacy issues)

New policy on VO and Grid service responsibilities

Operational Security Coordination Team (OSCT) focuses on:

Incident Response & improvement

Security Monitoring

Best practice for system managers

Pan-regional security coordination

Grid Security Vulnerability Group

New group analyzing potential vulnerabilities

[Diagram: TAGPMA (The Americas Grid PMA), EUGridPMA (European Grid PMA), APGridPMA (Asia-Pacific Grid PMA)]



Grid Monitoring


Becoming a critical activity to achieve reliability and stability


System Management: Fabric management, Best Practices, Security, ...

Grid Services: Grid sensors, Transport, Repositories, Views, ...

System Analysis: Application monitoring, ...

“… To help improve the reliability of the grid infrastructure …”

“… provide stakeholders with views of the infrastructure allowing them to understand the current and historical status of the service …”

“… to gain understanding of application failures in the grid environment and to provide an application view of the state of the infrastructure …”

“… improving system management practices,

- Provide site manager input to requirements on grid monitoring and management tools
- Propose existing tools to the grid monitoring working group
- Produce a Grid Site Fabric Management cook-book
- Identify training needs



Monitoring


Important to have standard solutions for:


Sensors


Repository schema


Interfaces




Experiment Dashboard

INPUT: Multiple sources of information

- Increasing the reliability
- Providing both global and very detailed view

Information sources

- Generic Grid Services
- Experiment specific services
- Experiment work load management and data management systems
- Jobs instrumented to report monitoring information
- Monitoring systems (RGMA, GridICE, SAM, ICRTMDB, MonALISA, BDII, GridView…)

Collect data of VO interest coming from various sources

Store it in a single location

Provide UI following VO requirements

Analyze collected statistics

Define alarm conditions

Can satisfy users with various roles:

- Generic user running his jobs on the Grid
- Site administrator
- VO manager, production or analysis group coordinator, data transfer coordinator…

OUTPUT: Providing output in various formats (Web pages, XML, CSV, image formats)

Can be used by various clients, both users and applications

VO users with various roles

Potentially other clients: PANDA, ATLAS production <XML, CSV, image formats>

This will be shown in the demo session
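
The dashboard flow described on this slide (collect from several sources, store in one place, analyze, define alarm conditions, export in several formats) can be sketched as below. This is an illustration only, not the Experiment Dashboard's actual code or API: the record fields, class name, source labels, and the failure-rate alarm rule are all hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass
from typing import Iterable, List


@dataclass
class JobRecord:
    source: str  # hypothetical label for the information source
    vo: str
    site: str
    status: str  # "done" or "failed" in this sketch


class Dashboard:
    """Single store for monitoring records gathered from several sources."""

    def __init__(self) -> None:
        self.records: List[JobRecord] = []

    def collect(self, records: Iterable[JobRecord]) -> None:
        """Add records from one information source to the common store."""
        self.records.extend(records)

    def failure_rate(self, site: str) -> float:
        """Fraction of a site's recorded jobs that failed (0.0 if none)."""
        jobs = [r for r in self.records if r.site == site]
        if not jobs:
            return 0.0
        return sum(r.status == "failed" for r in jobs) / len(jobs)

    def alarms(self, threshold: float) -> List[str]:
        """Sites whose failure rate exceeds the (illustrative) threshold."""
        sites = sorted({r.site for r in self.records})
        return [s for s in sites if self.failure_rate(s) > threshold]

    def to_csv(self) -> str:
        buf = io.StringIO()
        writer = csv.DictWriter(
            buf, fieldnames=["source", "vo", "site", "status"], lineterminator="\n"
        )
        writer.writeheader()
        for r in self.records:
            writer.writerow(asdict(r))
        return buf.getvalue()

    def to_xml(self) -> str:
        root = ET.Element("records")
        for r in self.records:
            ET.SubElement(root, "record", attrib=asdict(r))
        return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    dash = Dashboard()
    dash.collect([JobRecord("job-report", "atlas", "SITE-A", "done"),
                  JobRecord("job-report", "atlas", "SITE-A", "failed")])
    dash.collect([JobRecord("site-test", "cms", "SITE-B", "done")])
    print(dash.alarms(threshold=0.4))
    print(dash.to_csv())
```

Keeping storage separate from the export methods mirrors the INPUT/OUTPUT split on the slide: new sources only need to produce records, and new clients only need a new export format.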



Training



Broad range of courses to many disciplines and clients with very different backgrounds

Close relationships with applications and infrastructure activities for provision of material and lecturers

Needs are expanding rapidly with new communities and ‘beginner’ users

110 events; 1600 participants



Infrastructure for training


GILDA is an effective t-Infrastructure for EGEE and other European projects, providing resources and knowledge for training events

Besides training events, GILDA is available around the clock for grid novices, with dedicated facilities

The GILDA t-Infrastructure is currently supported by 12 sites, managed on a best-effort basis

GILDA is also available for application porting




Interoperability/interoperation


Well established with Open Science Grid in U.S.


In production use by CMS


submits work to OSG from EGEE


Weekly operations meetings attended by OSG staff


Processes set up with OSG for operations and user support workflows


OPS VO defined to support joint operations


for testing/monitoring use


Collaboration on monitoring tools and procedures



EGEE also working with other grid projects on specific interoperability at the level of middleware:

NAREGI, Unicore, NDGF (ARC)



Effort in GIN in several areas key for EGEE



Important to have a user community/use case driving this




Worldwide Grid Infrastructures



APAC


DEISA


EGEE


Naregi


NDGF


NGS



OSG


Pragma


Teragrid


GIN



Collaborating e-Infrastructures


Potential for linking ~80 countries

TWGRID


Registered Collaborating Projects

Applications: improved services for academia, industry and the public

Support Actions: key complementary functions

Infrastructures: geographical or thematic coverage

24 projects have registered as of February 2007 (web page)



Applications on EGEE


Multitude of applications from a growing number of domains


Astrophysics


Computational Chemistry


Earth Sciences


Financial Simulation


Fusion


Geophysics


High Energy Physics


Life Sciences


Multimedia


Material Sciences


…..




This is an exciting year for science

LHC, the largest scientific instrument ever built, comes on-line

- Grids are key to the success of LHC analysis




Virtual Organizations


Total Users: 5034

Affected People: 10200

Median members per VO: 18

Total VOs: 204

Registered VOs: 116

Median sites per VO: 3



Active VOs


Number of “active” VOs growing with time.


Turnover not shown: not same VOs every week!



Reported Applications


Disciplines: 10


Sub-disciplines: 36

See growth and diversification of applications.

Reported apps. only


Discipline                  PM3   PM11
Astronomy & Astrophysics      2      8
Computational Chemistry       6     27
Earth Science                16     16
Fusion                        2      3
High-Energy Physics           9     11
Life Sciences                23     39
Others                        4     14
Total                        62    118

Others: Condensed Matter Physics, Comp. Fluid Dynamics, Computer Science/Tools, Civil Protection



High Energy Physics


[Chart: LHC Experiment workloads, Normalized CPU (kSI2k.hours); series: lhcb, cms, atlas, alice]


User Analysis with Ganga



~550 different users, ~100 users weekly

Usage monitoring started end 2006

[Chart annotation: Easter]

~60% ATLAS

~25% LHCb

~15% others

Used by the ATLAS and LHCb experiments; developed with the contribution of EGEE NA4




CMS analysis


CRAB Jobs @ FNAL (OSG)

CRAB Jobs @ CERN (EGEE)

Users on the grid: April 2007 statistics

CMS users submitting jobs to Grids via CRAB (developed by CMS)

Over 1,000 jobs/day

Efficiency over 90%


ALICE Grid Access Service


[Charts: ALICE Grid Access (commands executed)]

Slope changes because of optimised access (fewer commands executed to interact with data management)




High Energy Physics


Data management:

Demonstrated data transfers at nominal rates: 1.6 GB/s through FTS

1 GB/s with real (simulated) workloads

2 large experiments transferred >1 PB/month in summer 2006

Workload management

CMS computing service challenge achieved 50k jobs/day

CMS aim this year for 100k jobs/day; ATLAS for 60k

Reliability and availability

Significant effort to ensure Tier 1 sites meet MoU commitments using site and service monitoring

Grid is now the primary source of computing resources for LCG
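
To relate the per-second transfer rates to the per-month volumes quoted above, a small conversion helps. This is a sketch assuming decimal units (1 PB = 10^6 GB), a 30-day month, and a rate sustained without interruption; none of these assumptions is stated on the slide.

```python
SECONDS_PER_DAY = 86_400


def petabytes_per_month(rate_gb_per_s: float, days: int = 30) -> float:
    """Volume moved at a sustained rate, in decimal petabytes."""
    return rate_gb_per_s * SECONDS_PER_DAY * days / 1e6


def sustained_rate_gb_per_s(pb_per_month: float, days: int = 30) -> float:
    """Average rate in GB/s needed to move `pb_per_month` in one month."""
    return pb_per_month * 1e6 / (SECONDS_PER_DAY * days)


if __name__ == "__main__":
    print(f"1.6 GB/s sustained ~ {petabytes_per_month(1.6):.2f} PB/month")
    print(f"1 PB/month ~ {sustained_rate_gb_per_s(1.0):.2f} GB/s average")
```

Under these assumptions, 1.6 GB/s held for a full month would correspond to about 4 PB, while 1 PB/month corresponds to an average of roughly 0.4 GB/s.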



Biomedical applications on different layers


Infrastructure level: Resources; Communication layer; Middleware

Specific biomedical services: Medical Data Management; Data-intensive workflow management

High-level interfaces: Generic portals; Application specific interface

Applications: 12 applications ported on the EGEE grid in areas of Medical Data management, Imaging, Bioinformatics and Drug Discovery



WISDOM


WISDOM (http://wisdom.healthgrid.org/)

Developing new drugs for neglected and emerging diseases with a particular focus on malaria.

Reduced R&D costs for neglected diseases

Accelerated R&D for emerging diseases

Three large calculations:

WISDOM-I (Summer 2005)

Avian Flu (Spring 2006)

WISDOM-II (Autumn 2006)

WISDOM calculations used FlexX from BioSolveIT in addition to Autodock.



Docking Results


Run                  Targets                    Compounds   CPU-years   Duration (wk)   Max. CPUs   Size of Results (TB)
WISDOM-I (Q3’05)     PBD                        1M                 80               6        1700   1
Avian Flu (Q2’06)    H5N1                       300k              105               6        1700   0.750
WISDOM-II (Q4’06)    GST, DHFR, DHFR, Tubulin   125M              420               8        5000   2
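
The CPU-years and duration columns of the Docking Results table imply an average number of CPUs in use, which can be compared with the quoted maximum. A minimal sketch, assuming 52 weeks per CPU-year and that the duration column is wall-clock time for the whole run; the averages are derived here and do not appear on the slide.

```python
WEEKS_PER_YEAR = 52  # assumption used to convert CPU-years to CPU-weeks

# (CPU-years, duration in weeks, max. CPUs) taken from the Docking Results table.
RUNS = {
    "WISDOM-I": (80, 6, 1700),
    "Avian Flu": (105, 6, 1700),
    "WISDOM-II": (420, 8, 5000),
}


def average_cpus(cpu_years: float, weeks: float) -> float:
    """Average number of CPUs busy over the whole run."""
    return cpu_years * WEEKS_PER_YEAR / weeks


if __name__ == "__main__":
    for name, (cpu_years, weeks, max_cpus) in RUNS.items():
        avg = average_cpus(cpu_years, weeks)
        print(f"{name:10s} average ~{avg:5.0f} CPUs ({avg / max_cpus:.0%} of max {max_cpus})")
```

On these assumptions each run kept, on average, roughly 40% to 55% of its peak CPU count busy.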



Confirming in vitro the results obtained in silico


Univ. Los Andes: Biological targets, Malaria biology

LPC Clermont-Ferrand: Biomedical grid

SCAI Fraunhofer: Knowledge extraction, Chemoinformatics

Univ. Modena: Biological targets, Molecular Dynamics

ITB CNR: Bioinformatics, Molecular modelling

Univ. Pretoria: Bioinformatics, Malaria biology

Academia Sinica: Grid user interface, Biological targets, In vitro testing

HealthGrid: Biomedical grid, Dissemination

CEA, Acamba project: Biological targets, Chemogenomics

Chonnam Nat. Univ.: In vitro testing (New)

Avian flu data challenge: in the selection of 2250 compounds out of the initial 308585 compounds, an enrichment factor of 111 was observed. Experimental trials confirmed 7 actives (“potential hits”) out of 123 compounds tested.

Data challenges on malaria: the 25 most promising compounds out of 500,000 are now being tested in vitro at Chonnam National University



Earthsystem Sciences


Goal: learn about the past, the present, and possible futures of the earth system

Community: internationally and interdisciplinary distributed but strongly interconnected

Method: Analysing, comparing and processing data

Input: data from observations and/or other modelling studies

Typical workflow

1. Find & Select
2. Collect & Prepare
3. Analyse
4. Visualize

Data: Data description; Distributed Climate Data (Model Data, Observation Data, Scenario data); Analysis Dataset; Result Dataset



An example workflow: “qflux”

1. Find & Select relevant & available datasets
2. Collect & Prepare a temporal and spatial subset of the data
3. Analyse the integrated transport of humidity between selected levels
4. Visualize selected result

Data: Distributed Climate Data (Wind speed, Temperature, Specific humidity); Analysis Dataset; Result Dataset

Datavolume: Several PB; ~3.1 TB (300-500 files); ~10.3 GB (28 files); ~76 MB; ~6 MB; ~66 KB

Location: Various data centers & portals; Institutional storage & computing facilities; local facilities; Personal Computer
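
The data volumes quoted for the "qflux" workflow shrink by orders of magnitude from stage to stage. The sketch below computes the reduction factor between consecutive quoted volumes, assuming decimal units and leaving out the initial "Several PB", which has no numeric value; which workflow step each volume belongs to is not assumed.

```python
UNIT_BYTES = {"KB": 1e3, "MB": 1e6, "GB": 1e9, "TB": 1e12}

# Quoted volumes in the order they appear on the slide.
VOLUMES = [(3.1, "TB"), (10.3, "GB"), (76, "MB"), (6, "MB"), (66, "KB")]


def to_bytes(value: float, unit: str) -> float:
    """Convert a (value, unit) pair to bytes using decimal prefixes."""
    return value * UNIT_BYTES[unit]


def reduction_factors(volumes):
    """Ratio between each volume and the next one in the list."""
    sizes = [to_bytes(v, u) for v, u in volumes]
    return [a / b for a, b in zip(sizes, sizes[1:])]


if __name__ == "__main__":
    for (v1, u1), (v2, u2), f in zip(VOLUMES, VOLUMES[1:], reduction_factors(VOLUMES)):
        print(f"{v1} {u1} -> {v2} {u2}: reduced ~{f:.0f}x")
    overall = to_bytes(*VOLUMES[0]) / to_bytes(*VOLUMES[-1])
    print(f"overall: ~{overall:.1e}x")
```

The overall reduction from the collected subset to the final result is on the order of 10^7, which illustrates why the early steps need data-centre or institutional facilities while the last step fits on a personal computer.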



Potential use of grid technology


Current issues

Search & select: Different portals with different authentications and data descriptions

Collect & prepare: Different access mechanisms of the different providers; pre-processing requires sufficient local facilities

Analyse: Existing tools and already processed data are available locally and lack proper description

Visualize: Detached from the remaining workflow

Grid technology could provide:

Central unique authentication to a common catalogue with standardized metadata

Shared resources with standardized access hiding proprietary access mechanisms

Commonly defined tool description

Log processing steps and automatically republish processed data

Integrate basic visualization (first peep) into the workflow




Presentations in User Forum on applications in EGEE and Related Projects

Specific applications

Atmosphere and Ocean Models

Earthquake modelling

Fusion

Range of biomedical applications

Computational Chemistry

Astrophysics

Space applications

HEP (LHC and non-LHC)

Applications in Related Projects

EUMEDgrid

BalticGrid

EELA

EUChinaGrid

EUIndiaGrid

G-Eclipse

SymGrid

DILIGENT

BeInGrid





Sustainability: Beyond EGEE-II

Need to prepare permanent, common Grid infrastructure

Ensure the long-term sustainability of the European e-infrastructure independent of short project funding cycles

Coordinate the integration and interaction between National Grid Infrastructures (NGIs)

Operate the European level of the production Grid infrastructure for a wide range of scientific disciplines to link NGIs




EGEE and standards


EGEE and other grid infrastructures need to co-exist and interoperate

At many levels

campus, local, national, regional, international

A large production system has inertia

cannot change quickly

Introducing new software and standards is slow, need to maintain backward compatibility

Cannot frequently change the infrastructure

gLite choice of standard adoption is based on interoperability needs and impact assessment on the infrastructure

Operational experience essential

Leads to best practices which in turn should drive standardization efforts

Actively pushing convergence for most pressing needs

The EGI/NGI era will rely on interoperability and coexistence

Appropriate and workable standards will be essential

Care not to fix standards too soon

this is not mature technology



See also: http://egee-na5.web.cern.ch/egee-na5/NA5Standardisation.html



Examples

EGEE has worked on real community implementations of standards


Example 1: SRM (Storage Resource Manager)

SRM v2.2 defined >1 year ago to satisfy LCG requirements

Dedicated effort to reach today with beta versions of real interoperating implementations (5)

and this was vital for LCG

Needed many iterations on details of the specifications

Interoperation test suites and real use case testing were essential

Also required changes to all clients

the APIs were completely changed from SRM v1.1

Example 2: GLUE (information system schema)

Today this is the accumulated knowledge of experience in real large scale production of EGEE, OSG, ARC over 5 years

The information systems are not perfect

we see scalability problems

The experience is in the schema

It can and should evolve to something better

but it must evolve

Is an OGF working group




Areas of standardization

Driven by the need for interoperation, co-existence, etc.

EGEE is actively involved in many areas, including with OGF

Security (AAA)

Policy work & IETF wg on Incident Response

VOMS and proxy certificates

Interoperability with Shibboleth

Data Management

SRM, FTS

Accounting & monitoring

Common usage record, schema, sensors

Job Management

Gatekeeper interfaces

Information system

Common schema

Important for coexistence/interoperability:

areas close to fabric (accounting, monitoring, sensors, etc.) need to be common



Open Issues

General issues:

Making grid tools easily usable by non-experts

Failures not easy to understand

Lack of consistent or thorough error reporting

Lack of consistent administrative interfaces makes grid services hard to manage

EGEE issues:

Limited portability of the current gLite distribution prevents wider acceptance and coexistence



Summary


EGEE is operating the world’s largest multi-disciplinary grid for science

In continuous use for production work at significant scale

Can bring experience at operating at this scale to the community and the standardization process

But we have to prioritize carefully

There is a long way to go to improve:

Usability, manageability, reliability, security

Interoperability and coexistence

It is time to move towards ensuring the long term sustainability of these infrastructures

Will rely on carefully selected common solutions for key services and processes




EGEE’07 Conference

Building Bridges…

Between Science and business

Between users and infrastructures

Between countries

Between scientific disciplines

Between projects

http://www.eu-egee.org/egee07

© 2006 Open Grid Forum

OGF and EGEE thank our event coordinating partners and sponsors

OGF20/EGEE User Forum Coordinating Partners

OGF20/EGEE User Forum Event Sponsors

Premier

Standard

Media

GRIDtoday

Technische Universität Berlin




User Forum agenda


Wednesday: Opening Plenary; Astro Workshop; Grids Mean Business; gLite; GIN; OMII-Europe; Poster and Demonstrations

Thursday: Data Management; Experience with Application Domains; Users in the wider grid community; Workflow; Poster and Demonstrations

Friday: Data Management; Experience with Application Domains; Users in wider grid community; Workflow; Grid Monitoring & Accounting; Interactivity & portals; User/VO community support; Closing Plenary