CENTRE FOR PARALLEL COMPUTING


8th IDGF Workshop
International Desktop Grid Federation
Hannover, August 17th, 2011


Experiences with the University of Westminster Desktop Grid

S C Winter, T Kiss, G Terstyanszky, D Farkas, T Delaitre

Contents

- Introduction to the Westminster Local Desktop Grid (WLDG)
- Architecture, deployment, management
- EDGeS Application Development Methodology (EADM)
- Application examples
- Conclusions
Introduction to the Westminster Local Desktop Grid (WLDG)

Client nodes per campus:

- New Cavendish St: 576
- Marylebone Road: 559
- Regent Street: 395
- Wells Street: 31
- Little Titchfield St: 66
- Harrow Campus: 254
WLDG Environment

- DG Server on a private University network
- Over 1500 client nodes on 6 different campuses
- Most machines are dual core, all running Windows
- Running the SZTAKI Local Desktop Grid package
- Based on student laboratory PCs:
  - if not used by a student, switch to DG mode
  - if no more work from the DG server, shut down (green policy)
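The idle-switch and green policy above can be sketched as a simple client-side loop. This is an illustrative sketch only: the function names (`student_logged_in`, `fetch_work`, `run_work_unit`, `shutdown_host`) are hypothetical placeholders, not part of the actual SZTAKI or BOINC client API.

```python
import time

def run_green_policy(student_logged_in, fetch_work, run_work_unit,
                     shutdown_host, poll_seconds=60):
    """Hypothetical sketch of the WLDG client policy: an idle lab PC
    switches to DG mode; when no work remains, it powers down."""
    while True:
        if student_logged_in():
            # Students have priority: stay out of DG mode and poll again.
            time.sleep(poll_seconds)
            continue
        work_unit = fetch_work()      # ask the DG server for a work unit
        if work_unit is None:
            shutdown_host()           # green policy: no work -> power off
            return "shutdown"
        run_work_unit(work_unit)
```

The policy is passed as callables so the same loop could wrap any idle-detection or power-management mechanism.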
The DG Scenario

[Diagram: the end-user creates a graph and concrete workflow in the gUSE WS P-GRADE portal and submits it to the UoW Local Desktop Grid. The WS P-GRADE DG Submitter submits jobs to the BOINC server and retrieves results via the 3G Bridge. BOINC workers download the executable and input files, and upload results.]
WLDG: ZENworks deployment

- BOINC clients are installed automatically and maintained by specifically developed Novell ZENworks objects
- An MSI file has been created to generate a ZENworks object that installs the client software
  - the BOINC client InstallShield executable was converted into an MSI package (/a switch on the BOINC client executable)
- The BOINC client is part of the generic image installed on all lab PCs throughout the University
  - guarantees that any newly purchased and installed PC automatically becomes part of the WLDG
- All clients are registered under the same user account
EDGeS Application Development Methodology (EADM)

- Generic methodology for DG application porting
- Motivation: special focus is required when porting/developing an application for an SG/DG platform
- Defines how the recommended software tools, e.g. those developed by EDGeS, can aid this process
- Supports iterative methods:
  - well-defined stages suggest a logical order
  - but, since in most cases the process is non-linear, allows revisiting and revising the results of previous stages at any point
EADM

Defined stages:

1. Analysis of current application
2. Requirements analysis
3. Systems design
4. Detailed design
5. Implementation
6. Testing
7. Validation
8. Deployment
9. User support, maintenance & feedback
Application Examples

- Digital Alias-Free Signal Processing
- AutoDock Molecular Modelling
Digital Alias-Free Signal Processing (DASP)

- Users: Centre for Systems Analysis, University of Westminster
- Traditional DSP is based on uniform sampling
  - suffers from aliasing
- Aim: Digital Alias-free Signal Processing (DASP)
  - one solution is Periodic Non-uniform Sampling (PNS)
- The DASP application designs PNS sequences
- Selecting the optimal sampling sequence is a computationally expensive process
  - a linear equation has to be solved and a large number of solutions (~10^10) compared
- The analyses of the solutions are independent of each other
  - suitable for DG parallelisation
DASP: Parallelisation

[Diagram: a linear equation is solved for the candidate sampling sequences q_r, q_(r+1), ..., q_(2r-1). Computer k (k = 1..m) finds the best permutation for solutions k, k+m, k+2m, ..., producing a locally best solution; a final step finds the globally best solution among the m local results.]
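The strided decomposition in the figure can be sketched in a few lines: worker k analyses solutions k, k+m, k+2m, ..., keeps a local best, and a final reduce picks the global best. The solution space and cost function here are toy stand-ins for the real PNS sequence evaluation.

```python
def local_best(solutions, worker, m, cost):
    """Worker `worker` (1..m) analyses solutions worker, worker+m,
    worker+2m, ... and returns its locally best (lowest-cost) one."""
    own = solutions[worker - 1::m]      # strided slice, as in the figure
    return min(own, key=cost)

def global_best(solutions, m, cost):
    """Reduce the m locally best solutions to the globally best one."""
    local_results = [local_best(solutions, k, m, cost)
                     for k in range(1, m + 1)]
    return min(local_results, key=cost)
```

Because each stride is analysed independently, the m `local_best` calls map directly onto m DG work units, with only the tiny final reduce running centrally.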
DASP: Performance test results

| Period T (factor) | Sequential | DG worst | DG median | DG best | # of work units | Speedup (best case) | # of nodes involved (median) |
|---|---|---|---|---|---|---|---|
| 18 | 13 min | 9 min | 7 min | 4 min | 50 | 3.3 | 59 |
| 20 | 2 hr 29 min | 111 min | 43 min | 20 min | 100 | 7.5 | 97 |
| 22 | 26 hr 40 min | 5 hr 1 min | 3 hr 24 min | 2 hr 31 min | 723 | 11 | 179 |
| 24 | ~820 hr | n/a | n/a | 17 hr 54 min | 980 | 46 | 372 |
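The best-case speedup column is sequential runtime divided by best DG runtime. A quick check with the times transcribed from the table (minute-level rounding of the underlying measurements explains small discrepancies, e.g. 13/4 = 3.25 vs the quoted 3.3):

```python
def to_minutes(hours=0, mins=0):
    """Convert an hours/minutes pair into total minutes."""
    return 60 * hours + mins

def speedup(sequential_min, dg_best_min):
    """Best-case speedup = sequential runtime / best DG runtime."""
    return sequential_min / dg_best_min

# Rows transcribed from the table: period T -> (sequential, DG best)
rows = {
    18: (to_minutes(0, 13), to_minutes(0, 4)),    # quoted speedup: 3.3
    20: (to_minutes(2, 29), to_minutes(0, 20)),   # quoted speedup: 7.5
    22: (to_minutes(26, 40), to_minutes(2, 31)),  # quoted speedup: 11
}
for t, (seq, best) in rows.items():
    print(f"T={t}: speedup ~ {speedup(seq, best):.2f}")
```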
DASP: Addressing the performance issues

- Inefficient load balancing
  - solutions of the equation should be grouped based on the execution time required to analyse individual solutions
- Inefficient work unit generation
  - some of the solutions should be divided into subtasks (more work units)
- Limits to the possible speed-up
  - user community/application developers to consider redesigning the algorithm
AutoDock Molecular Modelling

- Users: Dept of Molecular & Applied Biosciences, UoW
- AutoDock:
  - a suite of automated docking tools
  - designed to predict how small molecules, such as substrates or drug candidates, bind to a receptor of known 3D structure
- Application components:
  - AutoDock performs the docking of the ligand to a set of grids describing the target protein
  - AutoGrid pre-calculates these grids
Need for Parallelisation

- One run of AutoDock finishes in a reasonable time on a single PC
- However, thousands of scenarios have to be simulated and analysed to get stable and meaningful results
  - AutoDock has to be run multiple times with the same input files but with different random factors
- Simulation runs are independent of each other
  - suitable for DG
- AutoGrid does not require Grid resources
AutoDock component workflow

[Diagram: prepare_ligand4.py and prepare_receptor4.py convert the ligand and receptor pdb files into pdbqt files. AUTOGRID takes the gpf descriptor file and the pdbqt files and produces the map files. AUTODOCK is run many times in parallel, each instance taking the map files and a dpf descriptor file and producing a dlg file. SCRIPT1 and SCRIPT2 collect the dlg files, select the best ones and produce the output pdb file.]
Computational workflow in P-GRADE

Inputs: receptor.pdb, ligand.pdb, gpf descriptor file, dpf descriptor file, number of work units. The AutoGrid executables and scripts are uploaded by the developer and should not be changed. Output: a pdb file.

1. The Generator job creates the specified number of AutoDock jobs.
2. The AutoGrid job creates pdbqt files from the pdb files, runs the autogrid application, generates the map files and zips them into an archive file. This archive will be the input of all AutoDock jobs.
3. The AutoDock jobs run on the Desktop Grid. As output they provide dlg files.
4. The Collector job collects the dlg files, takes the best results and concatenates them into a pdb file.
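The four stages above can be sketched as a generator/worker/collector pipeline. This is a shape-only sketch: the functions stand in for the real AutoGrid/AutoDock binaries and P-GRADE job descriptions, and the scoring (seed-derived) is purely illustrative.

```python
def generate_jobs(n_work_units, base_dpf):
    """Stage 1: the Generator creates one AutoDock job per work unit,
    with identical inputs but a different random seed per job."""
    return [{"dpf": base_dpf, "seed": i} for i in range(n_work_units)]

def run_autogrid(receptor_pdb, ligand_pdb):
    """Stage 2: AutoGrid pre-calculates the map files once; the archive
    becomes the common input of every AutoDock job (stand-in)."""
    return {"maps": f"maps({receptor_pdb},{ligand_pdb})"}

def run_autodock(job, maps):
    """Stage 3: one AutoDock run on a DG worker, producing a dlg result
    with a docking score (stand-in: score derived from the seed)."""
    return {"dlg": f"dlg-{job['seed']}", "score": -10.0 + job["seed"] % 7}

def collect(dlg_results, keep=3):
    """Stage 4: the Collector keeps the best-scoring dlg files and
    concatenates them into the final output (lower score = better)."""
    best = sorted(dlg_results, key=lambda r: r["score"])[:keep]
    return [r["dlg"] for r in best]
```

Only `run_autodock` is fanned out across the Desktop Grid; the other three stages run once, which matches the slide's point that AutoGrid does not need Grid resources.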
AutoDock: Performance test results

[Chart: speedup against number of work units (10, 100, 1000, 3000); the speedup axis runs from 0 to 200.]
DG Drawbacks: The “Tail” Problem

[Charts: task completion over time in the two regimes, Jobs >> Nodes and Jobs ≈ Nodes.]
Tackling the Tail Problem

- Augment the DG infrastructure with more reliable nodes, e.g. service grid or cloud resources
- Redesign the scheduler to detect the tail and resubmit tardy tasks to the SG or cloud
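A minimal sketch of the second idea: once most work units have completed, any task running much longer than the median completed run time is flagged for duplication on a reliable (SG/cloud) resource. The thresholds and the function name are illustrative assumptions, not the actual scheduler.

```python
from statistics import median

def tardy_tasks(completed_secs, running_secs,
                done_fraction_threshold=0.9, slowdown_factor=2.0):
    """Return indices of running tasks to duplicate on reliable resources.

    The tail phase begins once most work units have completed; a running
    task is tardy if it has already exceeded `slowdown_factor` times the
    median completed run time."""
    total = len(completed_secs) + len(running_secs)
    if not completed_secs or len(completed_secs) / total < done_fraction_threshold:
        return []                     # not in the tail phase yet
    limit = slowdown_factor * median(completed_secs)
    return [i for i, t in enumerate(running_secs) if t > limit]
```

Duplicating rather than killing the tardy DG task keeps the original worker's result usable if it happens to finish first.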

Cloudbursting: Indicative Results

AutoDock: Conclusions

- The CygWin-on-Windows implementation inhibited performance
  - can be improved using, e.g.:
    - the DG to EGEE bridge
    - cloudbursting
- AutoDock is a black-box legacy application
  - source code not available
  - code-based improvement not possible
Further Applications

- Ultrasound Computer Tomography - Forschungszentrum Karlsruhe
- EMMIL E-marketplace optimization - SZTAKI
- Anti-Cancer Drug Research (CancerGrid) - SZTAKI
- Integrator of Stochastic Differential Equations in Plasmas - BIFI
- Distributed Audio Retrieval - Cardiff University
- Cellular Automata based Laser Dynamics - University of Sevilla
- Radio Network Design - University of Extremadura
- X-ray diffraction spectrum analysis - University of Extremadura
- DNA Sequence Comparison and Pattern Discovery - Erasmus Medical Center
- PLINK: Analysis of genotype/phenotype data - Atos Origin
- 3D video rendering - University of Westminster
Conclusions: Performance Issues

- Performance enhancements accrue from cyclical enterprise-level hardware and software upgrades
- These are countered by performance degradation arising from the shared nature of the resources
- Need to focus on robust performance measures in the face of random, unpredictable run-time behaviours
Conclusions: Load Balancing Strategies

- Heterogranular workflows
  - tasks can differ widely in size and run times
  - performance prediction, based e.g. on previous runs, can inform mapping (up to a point) ...
  - ... but after this, may need to re-engineer the code (white box only)
  - ... or consider offloading bottleneck tasks to reliable resources
- Homogranular workflows
  - classic example: the parameter sweep problem
  - fine-grain problems (#Tasks >> #Nodes) help smooth out the overall performance, but ...
  - ... the tail problem can be significant (especially if #Tasks ≈ #Nodes)
  - smart detection of delayed tasks coupled with speculative duplication
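The granularity point in the homogranular bullets can be illustrated with a toy makespan model: the same total work is split into n equal tasks over p nodes of varying speed. With n >> p a greedy scheduler evens out the load; with n ≈ p the makespan is hostage to the slowest node. The node-speed distribution is an arbitrary stand-in.

```python
import random

def makespan(n_tasks, node_speeds, total_work=1.0):
    """Greedy list scheduling: n equal tasks (total_work split evenly)
    assigned to whichever node frees up first; returns the finish time
    of the last node."""
    task = total_work / n_tasks
    finish = [0.0] * len(node_speeds)
    for _ in range(n_tasks):
        i = min(range(len(finish)), key=lambda j: finish[j])  # earliest-free node
        finish[i] += task / node_speeds[i]
    return max(finish)

random.seed(0)
speeds = [random.uniform(0.5, 1.5) for _ in range(50)]  # heterogeneous nodes
coarse = makespan(50, speeds)     # #Tasks ~ #Nodes: slowest node dominates
fine = makespan(5000, speeds)     # #Tasks >> #Nodes: load evens out
```

With 50 tasks each node gets exactly one, so the run ends when the slowest node does; with 5000 the fast nodes absorb more tasks and the makespan approaches total_work divided by the aggregate speed.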
Conclusions: Deployment Issues

- Integration within the enterprise desktop management environment has many advantages, e.g.:
  - PCs and applications are continually upgraded
  - hosts and licenses are “free” on the DG
- ... but also some drawbacks:
  - no direct control
    - typical environments can be slack and dirty
  - corporate objectives can override DG service objectives
    - examples: current UoW Win7 deployment, green agenda
  - service relationship, based on trust
    - DG bugs can easily damage the trust relationship, if not caught quickly
    - example: recent GenWrapper bug
  - non-dedicated resource
    - must give way to priority users, e.g. students

The End

Any questions?