CO2-CG Grid Interface overview and status

Olli Tourunen

CO2 Community Grid Workshop

DTU Copenhagen

November 20th 2007



Requirements


Main target: Provide transparent access to grid computational resources to the user interface component (‘grid’)


Input: Snapshot of the user working directory


Output: Simulation results available to the user


Support for NorduGrid ARC middleware


Proper grid credential handling to avoid the need for custom security policies with participating sites


But this has to be totally hidden from the user…




Architecture

[Diagram: an application server hosting the DB (Job desc 1, Job desc 2, Job desc 3) and the Grid Job Manager; ARC; three grid resources, each with a MUFTE Runtime Environment.]


Architecture (contd.)


Grid Job Manager (application server side)


Scans DB for new jobs


Prepares the new jobs for the grid based on the job description in the DB


Submits the jobs to the grid


Keeps track of the grid jobs


Downloads the results when a job is ready


Downloads the evidence for optional autopsy when a job fails


MUFTE Runtime Environment (grid resource side)


Compiles the software based on local configuration and environment


Runs the simulation
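
The Grid Job Manager steps listed above can be read as one pass over the job table. The following is a minimal sketch under stated assumptions, not the actual GJM code: the state names, the `Job` record, and the `FakeGrid` class are all hypothetical, and `FakeGrid` stands in for the real middleware calls.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    job_id: int
    description: str
    state: str = "NEW"             # hypothetical state names
    grid_id: Optional[str] = None  # identifier handed back at submission


class FakeGrid:
    """Stand-in for the grid middleware (assumption, for illustration only)."""

    def __init__(self, outcomes):
        self.outcomes = outcomes   # grid_id -> "RUNNING" | "FINISHED" | "FAILED"
        self.fetched = []          # grid ids whose files were downloaded

    def submit(self, prepared):
        return "grid-%d" % prepared["job_id"]

    def status(self, grid_id):
        return self.outcomes.get(grid_id, "RUNNING")

    def download(self, grid_id):
        self.fetched.append(grid_id)


def sweep(jobs, grid):
    """One pass: submit new jobs, then check and collect submitted ones."""
    for job in jobs:
        if job.state == "NEW":
            # Prepare the job from its DB description and submit it.
            prepared = {"job_id": job.job_id, "description": job.description}
            job.grid_id = grid.submit(prepared)
            job.state = "SUBMITTED"
        elif job.state == "SUBMITTED":
            status = grid.status(job.grid_id)
            if status == "FINISHED":
                grid.download(job.grid_id)   # simulation results
                job.state = "RESULTS_FETCHED"
            elif status == "FAILED":
                grid.download(job.grid_id)   # evidence for the autopsy
                job.state = "FAILED_EVIDENCE_FETCHED"
```

A job submitted in one sweep is only checked in a later sweep, which matches the idea of a short job that is run repeatedly.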


Implementation


Grid Job Manager (GJM)


There is one GJM instance per user


“One sweep at a time” job, intended to be launched from cron


Runs under user credentials to get access to the user's grid proxy


Working directory currently under user home (~/.gridmufte)


Written in Python with the SQLAlchemy object-relational mapper


Interacts with grid middleware through standard user commands


Python API also available, might use that in the future
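
When a sweep is launched from cron, a slow sweep can still be running when the next one starts. The slides do not say how the GJM handles this; the sketch below shows one common guard, a non-blocking file lock in the per-user working directory. The function name and the lock file name are hypothetical, and `fcntl` is only available on Unix-like systems.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def single_sweep(workdir, lock_name="sweep.lock"):
    """Yield True if this process holds the sweep lock, False otherwise."""
    os.makedirs(workdir, exist_ok=True)
    handle = open(os.path.join(workdir, lock_name), "w")
    try:
        try:
            # Non-blocking exclusive lock: fails at once if another sweep holds it.
            fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            yield False
            return
        yield True
    finally:
        # Closing the file releases the lock if this handle held it.
        handle.close()
```

A cron-launched script could wrap its sweep in `with single_sweep(os.path.expanduser("~/.gridmufte")) as ok:` and exit quietly when `ok` is false.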



Implementation (contd.)


Database


Standard PostgreSQL relational database


Common tables for all the user jobs


The jobs are identified per user, though


Runtime Environments


Compilation is run on the frontend before the job is submitted


Compilation and execution parameters are taken from an ini file (created by the GJM based on the job description in the DB)


Currently only serial runs are supported on one cluster


Parallel runs to follow
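
The ini file that carries compilation and execution parameters from the GJM to the runtime environment could be produced as in the sketch below. The actual file layout is not given in the slides, so the section names, the key names, and the fields of the job description are all hypothetical.

```python
import configparser
import io


def job_description_to_ini(job):
    """Render a job description (a plain dict here) as ini text."""
    parser = configparser.ConfigParser()
    # Section and key names are hypothetical, chosen for illustration.
    parser["compilation"] = {"compiler_flags": str(job["compiler_flags"])}
    parser["execution"] = {
        "input_file": str(job["input_file"]),
        "processors": str(job["processors"]),
    }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# A made-up serial job; real descriptions would come from the DB.
example_job = {"compiler_flags": "-O2", "input_file": "model.in", "processors": 1}
ini_text = job_description_to_ini(example_job)
```

On the grid resource side the same module can read the file back with `ConfigParser.read_string` or `ConfigParser.read`.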


Challenges


Transparent grid credential handling


Balance the security policies and ease of use


Short lived credentials?


Parallel run parameterization


User needs vs. types of resources vs. available resources


No explicit brokering support for this in ARC


Database access rights management (not really an issue until this goes to a much bigger scale)


Lots of different possibilities to solve this if needed (DB-level access rights, per-user tablespaces, change staging, n-layer architecture outside the DB…)


Throttling the compilation on the frontend


It looks like ARC does not have support for limiting the total number of jobs in the ‘SUBMITTING’ state.
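
If the middleware cannot cap the number of jobs in the ‘SUBMITTING’ state, one possible workaround is to throttle on the GJM side: each sweep submits only as many new jobs as there are free slots. This is a sketch of that idea only, not something the slides describe as implemented; the function name and the `max_submitting` parameter are hypothetical.

```python
def select_jobs_to_submit(new_job_ids, submitting_count, max_submitting):
    """Pick the new jobs that fit under the cap on jobs being submitted.

    submitting_count is the number of this user's jobs that are already
    in the submitting (compiling) phase on the frontend.
    """
    free_slots = max(0, max_submitting - submitting_count)
    return list(new_job_ids)[:free_slots]
```

Jobs that do not fit stay in the DB as new and are picked up by a later sweep.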





Future developments


These of course depend very much on the user input and the outcome of this workshop!


Next goals


Get the MUFTE Runtime Environment installed at more resources


Make the GJM more configurable


More robust deployment of the software on the application server


During the first half of 2008


Parallel runs


Automated/assisted certificate handling


Reusing an existing job manager?