GAE

23rd. June 2003

JJB, GAE Workshop

1

GAE

(Grid Analysis Environment)


Overview of Caltech effort



Slides for the Caltech GAE
Workshop


June 2003


Overview


GAE crucial for LHC experiments


Utility of Grids proven for production


Their use for Analysis will be the Acid Test of Grids


Large, Diverse, Distributed community of users


Support for hundreds/thousands of analysis tasks


Widely varying requirements


Need for Priority Schemes, robust authentication and
security


Operation in a severely resource-limited and constrained
global system


GAE is where the physics gets done


Where physicists learn to collaborate on analysis at a
distance


Scope


Diagram shows “snapshot” in time of analysis activities


Groups of individuals, geographically separated, work on
specific analysis topics (e.g. Supersymmetry)


Resources in the Grid system are shared between the groups


Boundaries enclosing the groups move and change shape as
the composition or requirements of the groups change


Architecture


Several candidate computing system
architectures have been proposed to
support GAE


At Caltech we have defined the “CAIGEE”
Architecture, in collaboration with UCSD,
UCR, FNAL and UCD


Our work is focussed on developing critical
missing components of the CAIGEE
architecture, creating demonstration-grade
applications to determine its
validity, and working with other groups on
integration of existing software into the
CAIGEE scheme


CAIGEE Architecture


CAIGEE (continued)


Based on the use of Web Services or
Portals to provide heterogeneous
clients access to analysis tools and
data


An important feature is support for even
semi-infinitely thin clients, such as PDAs
with very limited CPU/Memory


Grid Authentication and transport
built in


mediates client/service
(portal) traffic


Web Services


Data/Processing services offered via the Web


Widely adopted in the commercial world


Good tools, de facto standard protocols, support etc.


We have been confirming their usefulness for
scientific data and services


Access to RDBMS-resident Tags and nTuples (Oracle,
SQLServer, PostgreSQL)


Access to ROOT files


Access to Objectivity object collections


To do this, we have updated existing tools to
“talk” with Web Services:


ROOT


COJAC (3D event viewer)


Others
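As a rough illustration of what access to RDBMS-resident Tags through a Web Service can look like, the sketch below exposes one query function through an XML-RPC dispatcher and calls it with a marshalled request. Everything in it is an assumption made for illustration: the `tags` table and its columns, the `tags.select` method name, the use of SQLite in place of Oracle, SQLServer or PostgreSQL, and the use of XML-RPC from the Python standard library in place of a SOAP stack. The request is dispatched in-process, so no network server is started.

```python
import sqlite3
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCDispatcher

# Hypothetical stand-in for an RDBMS-resident tag table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tags (event_id INTEGER, n_jets INTEGER, missing_et REAL)")
db.executemany(
    "INSERT INTO tags VALUES (?, ?, ?)",
    [(1, 2, 35.5), (2, 4, 120.0), (3, 3, 80.25)],
)


def select_tags(min_missing_et):
    """Return the tag rows whose missing_et is at least min_missing_et."""
    cur = db.execute(
        "SELECT event_id, n_jets, missing_et FROM tags "
        "WHERE missing_et >= ? ORDER BY event_id",
        (min_missing_et,),
    )
    return [list(row) for row in cur]


# Service side: register the function under a hypothetical method name.
dispatcher = SimpleXMLRPCDispatcher(allow_none=False, encoding=None)
dispatcher.register_function(select_tags, "tags.select")


def call(method, *params):
    """Marshal a request to XML, dispatch it in-process, unmarshal the reply.

    _marshaled_dispatch is the hook the standard-library request handlers use;
    a real client would send the same XML over http/https instead.
    """
    request = xmlrpc.client.dumps(params, methodname=method).encode("utf-8")
    response = dispatcher._marshaled_dispatch(request)
    (result,), _ = xmlrpc.client.loads(response)
    return result


rows = call("tags.select", 50.0)
print(rows)
```

The client only sees the method name and plain lists of values, so the same call could be issued from any tool that can speak the wire protocol.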


Web Services: Principles


Publish: makes the service description publicly available.


WSDL (Web Services Description Language) is the language used
to create the service description.



Find: discovers the web service.


UDDI (Universal Description, Discovery and Integration) is
the directory technology used by service registries. The registries
contain descriptions of web services, and support lookup.



Bind: allows the service to be used by the client.


SOAP (Simple Object Access Protocol) is the protocol through which
the service provider, service registry and service requestor
communicate.


[Diagram: Service Provider, Service Requestor and Service Registry, linked by three numbered steps: 1 Publish, 2 Find, 3 Bind]
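The Publish, Find and Bind steps can be sketched in a few lines. This is a toy in-memory model, not a WSDL, UDDI or SOAP implementation: `ServiceDescription`, `ServiceRegistry`, `bind` and the `ntuple.count` service are hypothetical names that only mirror the three roles.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ServiceDescription:
    """Plays the role of a WSDL document: a name, keywords and an endpoint."""
    name: str
    keywords: List[str]
    endpoint: Callable[..., object]


class ServiceRegistry:
    """Plays the role of a UDDI registry: stores descriptions, supports lookup."""

    def __init__(self) -> None:
        self._entries: Dict[str, ServiceDescription] = {}

    def publish(self, description: ServiceDescription) -> None:
        self._entries[description.name] = description

    def find(self, keyword: str) -> List[ServiceDescription]:
        return [d for d in self._entries.values() if keyword in d.keywords]


def bind(description: ServiceDescription) -> Callable[..., object]:
    """Return a client-side proxy; in a real system SOAP would carry the call."""
    def proxy(*args):
        return description.endpoint(*args)
    return proxy


registry = ServiceRegistry()

# 1. Publish: the service provider registers its description.
registry.publish(
    ServiceDescription("ntuple.count", ["ntuple", "count"], lambda rows: len(rows))
)

# 2. Find: the service requestor looks the service up by keyword.
matches = registry.find("ntuple")

# 3. Bind: the requestor obtains a proxy and invokes the service.
count = bind(matches[0])
print(count([10, 20, 30]))
```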


Web Services: Experimental
Setup

[Diagram: experimental setup. Labelled components: two Oracle9i servers and an MS-SQL server, each holding data (metadata); a server with the master database and a server with a materialized view database; data replication; Java XML API to connect with the database server; HTTP server; proxy server; SOAP processor; WSDL file; UDDI registry node; client Web application to connect with the database and bind with the provided service; SOAP request and response; SSL]

[Diagram annotations: the Service Provider is available on the Fabric layer of the Grid; the Service Requestor is available at the Connectivity and Resource layer of the Grid; the Service Registry is provided at the authentication and security layer of the Grid]



Example Web Services


GAE Tools (1) Clarens


Our emphasis is on accommodating existing
analysis tools in our CAIGEE architecture


To facilitate this, we use the “Clarens
Dataserver”


Clarens is server software that makes
datasets and services available to clients
in a suitable
lingua franca


Clients initially Grid-authenticate with a
Clarens server, and then are able to make
use of a wide set of data and analysis
services on offer


GAE Tools (2) Clarens


Clarens uses an interpreted Python
framework running inside Apache


PKI security for CA certificates


Commodity protocols (http/https) used to
talk with clients


Authorization of Web Service requests
using hierarchical ACLs for Virtual
Organisations


Distributed administration of VO/ACLs


Creating new Clarens services is
straightforward: this was one of
the design goals.
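One way to picture authorization with hierarchical ACLs is the sketch below. It is not the Clarens implementation: the group names, the certificate DNs, the dotted method names and the rule that the most specific ACL mentioning the caller decides (with allow checked before deny) are all assumptions chosen for illustration.

```python
from typing import Dict, NamedTuple, Set


class Acl(NamedTuple):
    allow: Set[str]  # VO groups allowed at this level
    deny: Set[str]   # VO groups denied at this level


# Hypothetical VO groups and the certificate DNs of their members.
VO_GROUPS: Dict[str, Set[str]] = {
    "cms": {"/O=Grid/CN=Alice", "/O=Grid/CN=Bob"},
    "cms.admins": {"/O=Grid/CN=Alice"},
}

# Hypothetical ACLs keyed by dotted method path; "file" covers "file.read" etc.
ACLS: Dict[str, Acl] = {
    "file": Acl(allow={"cms"}, deny=set()),
    "file.write": Acl(allow={"cms.admins"}, deny={"cms"}),
}


def groups_of(dn: str) -> Set[str]:
    """Return every VO group that lists this DN as a member."""
    return {group for group, members in VO_GROUPS.items() if dn in members}


def is_authorized(dn: str, method: str) -> bool:
    """Walk from the most specific ACL to the least specific one.

    The first level that mentions one of the caller's groups decides;
    if no level mentions the caller, access is refused.
    """
    groups = groups_of(dn)
    parts = method.split(".")
    for depth in range(len(parts), 0, -1):
        acl = ACLS.get(".".join(parts[:depth]))
        if acl is None:
            continue
        if groups & acl.allow:
            return True
        if groups & acl.deny:
            return False
    return False


print(is_authorized("/O=Grid/CN=Bob", "file.read"))   # inherits from "file"
print(is_authorized("/O=Grid/CN=Bob", "file.write"))  # denied at "file.write"
```

Because each level of the hierarchy is a separate entry, the ACL for one subtree can be edited without touching the others, which is one plausible reading of distributed administration.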


GAE Tools (4) Clarens


Services include:


Access to SOCATS (next slide)


Storage Resource Broker interface


Application execution (submit jobs to
cluster schedulers)


Proxy escrow


File access to files in server filesystem
or SRB files


GAE Tools (5) SOCATS


“STL Optimized Caching and Transport
System”


SOCATS is a general-purpose tool we have
developed that is able to deliver large object
collections (result sets) in response to an SQL
query on an RDBMS


Targeted at C++ clients that wish to send an
SQL query to a remote RDBMS (using the
Clarens dataserver) and receive back the
database rows/result set as a collection of C++
objects


Data delivered in binary format (avoids the heavy
overhead of explicit XML encoding)


Large result sets are streamed efficiently to the
client, allowing client processing to begin as
soon as the first data are available
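The idea of streaming a result set in binary form can be sketched as follows. This is not SOCATS itself, which targets C++ clients: the `hits` table, the fixed `<id` record layout, the `Hit` type and the batch size are assumptions, and SQLite stands in for the remote RDBMS.

```python
import sqlite3
import struct
from typing import Iterable, Iterator, NamedTuple

# Hypothetical fixed record layout: little-endian int32 event_id, float64 energy.
ROW_FORMAT = "<id"
ROW_SIZE = struct.calcsize(ROW_FORMAT)  # 12 bytes per row


class Hit(NamedTuple):
    event_id: int
    energy: float


def stream_rows(db: sqlite3.Connection, sql: str, batch_size: int = 2) -> Iterator[bytes]:
    """Server side: yield the result set as packed binary chunks, batch by batch."""
    cur = db.execute(sql)
    while True:
        batch = cur.fetchmany(batch_size)
        if not batch:
            return
        yield b"".join(struct.pack(ROW_FORMAT, *row) for row in batch)


def decode(chunks: Iterable[bytes]) -> Iterator[Hit]:
    """Client side: turn each chunk into objects as soon as it arrives."""
    for chunk in chunks:
        for values in struct.iter_unpack(ROW_FORMAT, chunk):
            yield Hit(*values)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE hits (event_id INTEGER, energy REAL)")
db.executemany(
    "INSERT INTO hits VALUES (?, ?)",
    [(1, 1.5), (2, 2.25), (3, 10.0), (4, 0.5), (5, 7.75)],
)

QUERY = "SELECT event_id, energy FROM hits ORDER BY event_id"
hits = list(decode(stream_rows(db, QUERY)))
print(hits[0], len(hits))
```

Both sides are generators, so the client starts building objects from the first chunk while later batches are still being fetched.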



GAE Tools (6) GroupMan


Developed in response to need for user-friendly
administration of LDAP-based
“Virtual Organisations”


Import of CA certificates into the LDAP server


User-friendly GUI allows ad hoc creation of
user groups and VOs


VO data stored to allow easy extraction by
standard Grid-based tools


E.g. creation of Globus gridmap files


Part of the DPE distribution
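As an example of extracting VO data for a standard Grid tool, the sketch below turns group membership into Globus grid-mapfile lines, where each line maps a quoted certificate DN to a local account. The membership data, the account names and the `gridmap_lines` helper are hypothetical, and plain dictionaries stand in for the LDAP server.

```python
from typing import Dict, List, Set

# Hypothetical VO membership, as it might be read back from the LDAP server.
VO_MEMBERS: Dict[str, Set[str]] = {
    "cms-analysis": {"/O=Grid/OU=Caltech/CN=Alice", "/O=Grid/OU=UCSD/CN=Bob"},
    "cms-production": {"/O=Grid/OU=FNAL/CN=Carol"},
}

# Hypothetical mapping from VO group to the local account its members share.
ACCOUNT_FOR_GROUP: Dict[str, str] = {
    "cms-analysis": "cmsana",
    "cms-production": "cmsprod",
}


def gridmap_lines(vo_members: Dict[str, Set[str]],
                  account_for_group: Dict[str, str]) -> List[str]:
    """Build grid-mapfile lines of the form: "<certificate DN>" <local account>."""
    lines = []
    for group in sorted(vo_members):
        account = account_for_group[group]
        for dn in sorted(vo_members[group]):
            lines.append(f'"{dn}" {account}')
    return lines


gridmap = gridmap_lines(VO_MEMBERS, ACCOUNT_FOR_GROUP)
print("\n".join(gridmap))
```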


GAE Tools (7) PDA Client


A handheld GAE client: fruits of collaboration
between NUST and Caltech


Software is Java Analysis Studio (JAS) ported to
the Pocket PC 2002 OS


Hardware is any Pocket PC 2002 device


This tool is still under development and
currently lacks authentication/security components


GAE Tools (8) Collaboration
Desktop


Four-screen desktop analysis setup


Driven by a single server and
single graphics card


Four flat panel monitors


Allows simultaneous work on:


Traditional analysis tools (e.g. ROOT)


Software development (e.g. VS.NET)


Event displays (e.g. IGUANA)


MonALISA monitoring displays


Persistent collaboration (e.g. VRVS)


Online event or detector monitoring


Web browsing, email


Chat windows, instant messaging


Shared whiteboards etc.