SI2-SSE: Pipeline Framework for Ensemble Runs on the Cloud


Nov 4, 2013


Beth Plale (PI), Indiana University | Craig Mattocks (Co-PI), University of Miami


Figure: Scheduling tasks in Azure using Sigiri Middleware

Climate change can adversely affect the strength of storms. Even modest changes in ocean surface temperature can have a significant impact on hurricane strength, making coastal regions increasingly vulnerable to storm surge.




SLOSH (Sea, Lake, and Overland Surges from Hurricanes) is a computational model for predicting storm surges in coastal areas. Scientists usually run a large ensemble of SLOSH instances to cope with errors and uncertainties in storm tracks and landfall location.



In this project, we develop generalized tools for rapid and cost-effective deployment of a large number (between 500 and 15,000) of small tasks on cloud resources. The project develops a pipeline framework for running ensemble simulations on the cloud.




We use the SLOSH model as the specific motivating application. Users who could benefit from the application include the National Hurricane Center, which is a partner on the project, the Federal Emergency Management Agency (FEMA), the U.S. Army Corps of Engineers, and state and local emergency managers.



Users submit jobs using a web portal. A Service Manager (e.g., Azure Daemon) fetches the submitted jobs and schedules them on the cloud resources. The Service Manager balances load across worker nodes by partitioning the SLOSH instances.
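The partitioning step can be illustrated with a minimal sketch. The poster does not say which partitioning policy the Service Manager uses, so the round-robin split below is an assumption, as are the function name `partition_round_robin` and the track file names.

```python
from typing import List, Sequence


def partition_round_robin(track_files: Sequence[str], num_workers: int) -> List[List[str]]:
    """Split track files into num_workers sets whose sizes differ by at most one.

    Hypothetical helper: round-robin is one simple balancing policy,
    not necessarily the one used by the Service Manager.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    sets: List[List[str]] = [[] for _ in range(num_workers)]
    for index, name in enumerate(track_files):
        # Deal files out like cards so every worker gets a near-equal share.
        sets[index % num_workers].append(name)
    return sets


if __name__ == "__main__":
    # Illustrative file names; 141 track files on 20 workers.
    tracks = [f"track_{i:03d}.trk" for i in range(141)]
    for worker_id, subset in enumerate(partition_round_robin(tracks, 20)):
        print(worker_id, len(subset))
```

Round-robin balances the number of instances per worker, which is adequate only if SLOSH instances take roughly equal time; a cost-aware policy would be needed otherwise.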




The worker nodes run the SLOSH instances and locally merge the generated output files. A separate merge process aggregates the intermediate files across the worker nodes.
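The two-level merge can be sketched as follows, assuming each output is a grid of surge heights and that merging means taking the cell-wise maximum (consistent with the envelope products described on this poster). The grid representation and the names `merge_max` and `merge_all` are illustrative assumptions, not the SLOSH file format.

```python
from functools import reduce
from typing import List

Grid = List[List[float]]  # assumed in-memory form of one output file


def merge_max(a: Grid, b: Grid) -> Grid:
    """Cell-wise maximum of two equally shaped grids."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("grids must have the same shape")
    return [[max(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def merge_all(grids: List[Grid]) -> Grid:
    """Fold any number of grids into one envelope."""
    if not grids:
        raise ValueError("need at least one grid")
    return reduce(merge_max, grids)


# Each worker merges its own outputs locally ...
worker_1 = merge_all([[[1.0, 2.0]], [[3.0, 0.5]]])
worker_2 = merge_all([[[2.5, 4.0]]])
# ... and a separate step aggregates the intermediate results.
final = merge_all([worker_1, worker_2])
```

Because maximum is associative and commutative, merging locally first gives the same result as merging all outputs at once, while moving only one intermediate file per worker.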




The Scheduler within the Service Manager reduces storage and I/O overheads in handling and aggregating intermediate output files.




We use two approaches: a MapReduce runtime (Twister4Azure) and the Sigiri middleware. Users are able to make tradeoffs between cost and delay metrics.
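A rough way to see the cost versus delay tradeoff is a back-of-the-envelope model. Everything in this sketch is an assumption for illustration: tasks of equal duration, billing per started VM-hour, and the sample task time and price. None of these values come from the project's experiments.

```python
import math
from typing import NamedTuple


class Estimate(NamedTuple):
    vms: int
    makespan_sec: float
    cost: float


def estimate(num_tasks: int, task_sec: float, vms: int, price_per_vm_hour: float) -> Estimate:
    """Estimate completion time and cost under a simplified, hypothetical model."""
    if num_tasks < 0 or vms < 1:
        raise ValueError("num_tasks must be >= 0 and vms >= 1")
    waves = math.ceil(num_tasks / vms)        # rounds of tasks run back to back
    makespan = waves * task_sec               # total elapsed time
    billed_hours = math.ceil(makespan / 3600)  # each VM billed per started hour
    return Estimate(vms, makespan, vms * billed_hours * price_per_vm_hour)


if __name__ == "__main__":
    # Hypothetical inputs: 141 tasks of 600 s each at 0.10 per VM-hour.
    for vm_count in (10, 15, 20):
        print(estimate(141, 600.0, vm_count, 0.10))
```

Under these assumed inputs, adding VMs shortens the run but can raise the bill, which is the kind of choice the framework leaves to the user.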


The SLOSH instances generate a number of output files (for a number of groups) that record Maximum Envelopes of Water (MEOWs) and Maximum of MEOWs (MOMs). Dividing output files into groups facilitates interactive visualization and analysis; each group is identified by three parameters: storm direction, forward motion, and storm category.
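The grouping by three parameters can be sketched as a dictionary keyed on (direction, forward speed, category). The `Run` record, its field names, and the file names are hypothetical; only the three grouping parameters come from the text.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Run(NamedTuple):
    direction: str      # storm direction, e.g. "NW"
    forward_mph: int    # forward motion in miles per hour
    category: int       # storm category
    output_file: str    # hypothetical name of the run's output file


GroupKey = Tuple[str, int, int]


def group_runs(runs: List[Run]) -> Dict[GroupKey, List[str]]:
    """Collect output files that share direction, forward motion, and category."""
    groups: Dict[GroupKey, List[str]] = defaultdict(list)
    for run in runs:
        groups[(run.direction, run.forward_mph, run.category)].append(run.output_file)
    return dict(groups)


runs = [
    Run("NW", 5, 1, "run_001.out"),
    Run("NW", 5, 1, "run_002.out"),
    Run("N", 15, 3, "run_003.out"),
]
groups = group_runs(runs)
```

Each key then corresponds to one group whose files are merged into a single envelope, such as a category 1 storm moving northwest at 5 miles per hour.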


Figure: Maximum Envelope of Water for a hypothetical storm of category 1, moving at 5 miles per hour and headed northwest in the Miami basin.

Elastic processing: Revise the leased resources in an on-line fashion depending on the submitted load.

Metadata harvest: Automatic capture of metadata and provenance for the SLOSH output datasets to contribute towards trust and to reduce the burden of sharing the datasets. The metadata could be used to find which SLOSH simulation contributed each of the max values in the MEOWs/MOMs.

Web interface: Develop a simple web-based user interface (UI) for the system.
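The provenance idea in the metadata item, finding which simulation contributed each max value, can be sketched by recording the winning run alongside every maximum. The flat list of cell values and the run identifiers are assumptions for illustration; the actual metadata format is not described here.

```python
from typing import Dict, List, Tuple


def max_with_provenance(outputs: Dict[str, List[float]]) -> Tuple[List[float], List[str]]:
    """Return the cell-wise maxima and, per cell, the id of the run that produced it.

    `outputs` maps a hypothetical run id to that run's cell values.
    Ties go to the run listed first.
    """
    if not outputs:
        raise ValueError("need at least one run")
    lengths = {len(values) for values in outputs.values()}
    if len(lengths) != 1:
        raise ValueError("all runs must cover the same cells")
    maxima: List[float] = []
    sources: List[str] = []
    for cell in range(lengths.pop()):
        # Pick the run with the largest value in this cell.
        run_id = max(outputs, key=lambda rid: outputs[rid][cell])
        maxima.append(outputs[run_id][cell])
        sources.append(run_id)
    return maxima, sources


maxima, sources = max_with_provenance({
    "run_a": [1.0, 5.0, 2.0],
    "run_b": [3.0, 4.0, 2.0],
})
```

Keeping the `sources` list next to the merged envelope lets a user trace any extreme value back to the simulation, and hence the storm track, that produced it.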

SLOSH Execution Model

Job Scheduling in Cloud

Experiments

References

1. Kavitha Chandrasekar, Milinda Pathirage, Samindra Wijeratne, Craig Mattocks, and Beth Plale. Middleware Alternatives for Storm Surge Predictions in Windows Azure. 3rd Workshop on Scientific Cloud Computing, pp. 3-12, ACM, New York, NY, 2012. doi:10.1145/2287036.2287040

2. E. C. Withana and B. Plale. Sigiri: uniform resource abstraction for grids and clouds. Concurrency and Computation: Practice and Experience, 2012.

3. B. Glahn, A. Taylor, N. Kurkowski, and W. Shaffer. The role of the SLOSH model in National Weather Service storm surge forecasting. National Weather Digest, 33(1):3-14, 2009.


[Diagram labels: Sigiri Web Service; Job Queue; Azure Daemon; Azure queue; User jobs; Azure Blob Storage; Merge tasks; Worker roles (1, 2, 3, ..., N); Final output]

Figure: SLOSH ensemble execution model
[Diagram labels: Basin; Distribute; Set 1, Set 2, ..., Set p; SLOSH Program; Intermediate output; Aggregate (1, 2, 3, ..., k); Final Results (an envelope and a track file per group)]

[Plot: Execution time with varying track files (total VMs = 20). X-axis: number of track files (141, 257, 348, 385); y-axis: Time (sec), 2000 to 8000; series: Average, Min, Max]

[Plot: Execution time with varying Azure VMs (track files = 141). X-axis: number of Azure VMs (10, 15, 20); y-axis: Time (sec), 2000 to 8000; series: Average, Min, Max]
Ongoing Efforts

Introduction

Team Members: Abhirup Chakraborty, Kavitha Chandrasekar, Milinda Pathirage, Isuru Suriarachchi