
Pattern Programming


Introduction to Seeds Framework

ITCS 4/5145 Parallel Programming Spring 2013 PatterrnProgIntro.ppt Modification date: April 15, 2013


Problem Addressed

To make parallel programming more usable and scalable.

Parallel programming (writing programs that use multiple computers and processors collectively to solve problems) has a very long history but is still a challenge.

Traditional approach

Explicitly specifying message passing (MPI), and

low-level thread APIs (Pthreads, Java threads, OpenMP, …).

Need a better structured approach.

Pattern Programming Concept

Programmer begins by constructing the program using established computational or algorithmic patterns that provide a structure.


Design patterns

Part of software engineering for many years:

Reusable solutions to commonly occurring problems *

Patterns provide a guide to best practices, not a final implementation

Provide a good scalable design structure

Can reason more easily about programs

Potential for automatic conversion into executable code, avoiding low-level programming (we do that here)

Particularly useful for the complexities of parallel/distributed computing

* http://en.wikipedia.org/wiki/Design_pattern_(computer_science)

In Parallel/Distributed computing

What patterns are we talking about?

Low-level algorithmic patterns that might be embedded into a program, such as fork-join and broadcast/scatter/gather.

Higher-level algorithm patterns for forming a complete program, such as workpool, pipeline, stencil, and map-reduce.

We concentrate upon higher-level computational/algorithm-level patterns rather than lower-level patterns.

Some Patterns

[Figure: Workpool. Labels: Master, Workers. Legend: two-way connection, compute node, source/sink.]

[Figure: Pipeline. Labels: Master, Workers, Stage 1, Stage 2, Stage 3. Legend: one-way connection, two-way connection, compute node, source/sink.]

[Figure: Divide and Conquer. Labels: Divide, Merge. Legend: two-way connection, compute node, source/sink.]

[Figure: All-to-All. Legend: two-way connection, compute node, source/sink.]

[Figure: Stencil. Legend: two-way connection, compute node, source/sink.]

Stencil: usually a synchronous computation that performs a number of iterations to converge on a solution, e.g. for solving Laplace's/heat equation. On each iteration, each node communicates with its neighbors to get their stored computed values.
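
The stencil description above can be illustrated with a small sequential sketch. It is not Seeds code: the class name `StencilSketch`, the method `relax`, and the 5 x 5 grid with a hot top edge are all assumptions chosen for illustration. Each sweep replaces every interior cell by the average of its four neighbors (a Jacobi iteration for the Laplace equation), which is the per-iteration neighbor exchange that a parallel stencil would perform between nodes.

```java
/** Sequential sketch of a stencil (Jacobi) iteration; names and sizes are illustrative only. */
class StencilSketch {

    /**
     * Runs the given number of Jacobi sweeps on a copy of grid.
     * Boundary cells stay fixed; each interior cell becomes the mean of its 4 neighbors.
     */
    static double[][] relax(double[][] grid, int iterations) {
        int n = grid.length;
        int m = grid[0].length;
        double[][] cur = new double[n][];
        double[][] next = new double[n][];
        for (int i = 0; i < n; i++) {
            cur[i] = grid[i].clone();    // values read during a sweep
            next[i] = grid[i].clone();   // values written during a sweep
        }
        for (int it = 0; it < iterations; it++) {
            for (int i = 1; i < n - 1; i++) {
                for (int j = 1; j < m - 1; j++) {
                    next[i][j] = 0.25 * (cur[i - 1][j] + cur[i + 1][j]
                                       + cur[i][j - 1] + cur[i][j + 1]);
                }
            }
            double[][] tmp = cur;        // swap buffers: all cells update "synchronously"
            cur = next;
            next = tmp;
        }
        return cur;
    }

    public static void main(String[] args) {
        double[][] grid = new double[5][5];
        for (int j = 0; j < 5; j++) {
            grid[0][j] = 100.0;          // assumed boundary condition: hot top edge
        }
        double[][] result = relax(grid, 200);
        System.out.println("Center value after 200 sweeps: " + result[2][2]);
    }
}
```

Two buffers are used so that every cell in a sweep reads only values from the previous sweep, which mirrors the synchronous nature of the pattern.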

Parallel Patterns

Advantages

Possible to create parallel code from the pattern specification automatically (see later)

Abstracts/hides underlying computing environment

Generally avoids deadlocks and race conditions

Reduces source code size (lines of code)

Hierarchical designs with patterns embedded into patterns, and pattern operators to combine patterns

Disadvantages

New approach to learn

Takes away some of the freedom from the programmer

Performance reduced slightly (but compare with using high-level languages instead of assembly language)

Previous/Existing Work

Patterns explored in several projects.

Industrial efforts:

    Intel Threading Building Blocks (TBB), Intel Cilk Plus, Intel Array Building Blocks (ArBB). Focus on very low-level patterns such as fork-join.

Universities:

    University of Illinois at Urbana-Champaign and University of California, Berkeley

    University of Torino/Università di Pisa, Italy

Our approach

Focuses on a few higher-level patterns of wide applicability (e.g. workpool, synchronous all-to-all, pipelined, stencil).

Software framework called “Seeds” developed to easily construct an application from established patterns without the need to write low-level message-passing or thread-based code.

The framework will automatically distribute code across processor cores, computers, or geographically distributed computers and execute the parallel code.

Pattern Programming with the Seeds Framework

Basic User Programmer Interface

To create and execute parallel programs, the programmer selects a pattern and implements three principal Java methods in a module class:

Diffuse method: to distribute pieces of data.

Compute method: the actual computation.

Gather method: used to gather the results.

The programmer also has to fill in details in a bootstrap class to deploy and start the framework.

[Figure: Diffuse, Compute, and Gather in the module class; the bootstrap class runs the module.]

The framework self-deploys on a geographically distributed platform and executes the pattern.
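
The diffuse/compute/gather flow can be sketched without the framework. The code below is a sequential emulation under stated assumptions, not the Seeds API: the interface `PatternModule`, the driver `run`, and the example module `SumOfSquares` are hypothetical names introduced here. The driver plays the role of the framework by calling the three methods once per segment.

```java
import java.util.HashMap;
import java.util.Map;

/** Sequential emulation of the diffuse/compute/gather flow; this is not the Seeds API. */
class DiffuseComputeGatherSketch {

    /** Hypothetical module interface, named after the three steps described in the text. */
    interface PatternModule {
        int getDataCount();                                   // number of work items
        Map<String, Object> diffuse(int segment);             // hand out one piece of data
        Map<String, Object> compute(Map<String, Object> in);  // the actual computation
        void gather(int segment, Map<String, Object> out);    // collect one result
    }

    /** Stand-in for the framework: every segment is processed in turn on one thread. */
    static void run(PatternModule m) {
        for (int segment = 0; segment < m.getDataCount(); segment++) {
            m.gather(segment, m.compute(m.diffuse(segment)));
        }
    }

    /** Example module: sums the squares of 0..count-1, one number per segment. */
    static class SumOfSquares implements PatternModule {
        private final int count;
        private long total = 0;   // private result, filled in by gather()

        SumOfSquares(int count) { this.count = count; }

        public int getDataCount() { return count; }

        public Map<String, Object> diffuse(int segment) {
            Map<String, Object> d = new HashMap<>();
            d.put("value", segment);
            return d;
        }

        public Map<String, Object> compute(Map<String, Object> in) {
            int v = (Integer) in.get("value");
            Map<String, Object> out = new HashMap<>();
            out.put("square", (long) v * v);
            return out;
        }

        public void gather(int segment, Map<String, Object> out) {
            total += (Long) out.get("square");
        }

        long getTotal() { return total; }
    }

    public static void main(String[] args) {
        SumOfSquares m = new SumOfSquares(4);
        run(m);
        System.out.println("Sum of squares = " + m.getTotal());   // 0 + 1 + 4 + 9 = 14
    }
}
```

In a real deployment the compute step would run on remote or separate workers; only the driver changes, while the module's three methods keep the same shape.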


Module class

public Data DiffuseData (int segment) {
    DataMap<String, Object> d = new DataMap<String, Object>();
    inputData = ….
    d.put("name_of_inputdata", inputData);
    return d;
}

public Data Compute (Data data) {
    DataMap<String, Object> input = (DataMap<String,Object>) data;   // data produced by DiffuseData(); Data cast into a DataMap
    DataMap<String, Object> output = new DataMap<String, Object>();  // output returned to GatherData()
    inputData = input.get("name_of_inputdata");
    … // computation
    output.put("name_of_results", results);   // to return to GatherData()
    return output;
}

public void GatherData (int segment, Data dat) {
    DataMap<String,Object> out = (DataMap<String,Object>) dat;
    outdata = out.get("name_of_results");
    result … // aggregate outdata from all the worker nodes; result is a private variable
}

Notes:

The method arguments are supplied by the framework.

segment is used by the framework to keep track of where to put results.

GatherData is given back a Data object together with its segment number.

Seeds Workpool

DiffuseData, Compute, and GatherData Methods

[Figure: On the master, DiffuseData creates DataMap d and returns d to each slave. On the slaves, Compute receives it as DataMap input and creates DataMap output. Back on the master, GatherData receives DataMap output and accumulates it into a private variable total (the answer).]

Note: the DiffuseData, Compute, and GatherData method names start with a capital letter, although by Java convention method names should not.

DataMap methods

put (String, data): puts data into DataMap identified by string

get (String): gets stored data identified by string

DataMap extends Java HashMap, which implements Map; see
http://doc.java.sun.com/DocWeb/api/java.util.HashMap

Data and DataMap classes

For implementation convenience, two classes:

Data class: used to pass data between master and slaves. Uses a segment number to keep track of packets as they go from one method to another.

DataMap class: used inside the compute method. DataMap is a subclass of Data and so allows casting.
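
The put/get and casting behavior described above can be tried out with stand-in types. The sketch below does not reproduce the framework's classes: it assumes, purely for illustration, that `Data` is an interface, so that a hypothetical `DataMap` can both extend `HashMap` and be passed around as a `Data`. The helper names `pack` and `unpack` are also invented here.

```java
import java.io.Serializable;
import java.util.HashMap;

/** Stand-in types showing put/get and the cast from Data to DataMap; not the Seeds classes. */
class DataMapSketch {

    /** Hypothetical stand-in for the framework's Data type (assumed to be an interface). */
    interface Data extends Serializable { }

    /** Hypothetical stand-in for DataMap: a HashMap that can also travel as a Data. */
    static class DataMap<K, V> extends HashMap<K, V> implements Data {
        private static final long serialVersionUID = 1L;
    }

    /** Producer side: store an array under a string key and hand it on as plain Data. */
    static Data pack(int[] row) {
        DataMap<String, int[]> d = new DataMap<>();
        d.put("row", row);
        return d;
    }

    /** Consumer side: cast the Data back to a DataMap and look the array up by key. */
    @SuppressWarnings("unchecked")
    static int[] unpack(Data data) {
        DataMap<String, int[]> d = (DataMap<String, int[]>) data;
        return d.get("row");
    }

    public static void main(String[] args) {
        int[] row = unpack(pack(new int[] {2, 5, 8}));
        System.out.println("Last element = " + row[2]);   // prints 8
    }
}
```

The cast is unchecked because generic type arguments are erased at run time, so the producer and consumer must agree on the key names and value types by convention.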

Seeds Implementations

JXTA P2P networking version: suitable for a fully distributed network of computers; requires an Internet connection even when running on just a single computer.

No Network JXTA P2P version: for running on a single computer, not requiring an Internet connection.

Multicore version: implemented with threads for more efficient execution on a single multicore computer or shared-memory multiprocessor system; does not require an Internet connection.

The two JXTA versions use the same application code and run in a similar fashion.

The Multicore version uses the same Module code but slightly different bootstrap (Run Module) source code (see next).

Bootstrap class, JXTA P2P version

package edu.uncc.grid.example.workpool;

import java.io.IOException;
import net.jxta.pipe.PipeID;
import edu.uncc.grid.pgaf.Anchor;
import edu.uncc.grid.pgaf.Operand;
import edu.uncc.grid.pgaf.Seeds;
import edu.uncc.grid.pgaf.p2p.Types;

public class RunMonteCarloPiModule {
    public static void main(String[] args) {
        try {
            MyModule pi = new MyModule();                        // MyModule is the name of the module class
            Seeds.start("/path/to/seeds/seed/folder", false);    // starts framework, deploys on list of servers
            PipeID id = Seeds.startPattern(new Operand((String[]) null,
                    new Anchor("hostname", Types.DataFlowRoll.SINK_SOURCE), pi));   // starts Seeds pattern
            System.out.println(id.toString());
            Seeds.waitOnPattern(id);                             // waits for pattern to complete
            System.out.println("The result is: " + pi.getPi());
            Seeds.stop();                                        // stops framework
        } catch (SecurityException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

This code deploys the framework and starts execution of the pattern. Different patterns have similar code.


Bootstrap class, Multicore version

Thread-based version using shared memory.

Faster than the JXTA P2P version on a multicore platform.

Bootstrap class does not need to start/stop JXTA P2P: Seeds.start() and Seeds.stop() are not needed.

Module class unchanged.

public class RunMonteCarloPiModule {
    public static void main(String[] args) {
        try {
            MyModule pi = new MyModule();
            Thread id = Seeds.startPatternMulticore(
                    new Operand((String[]) null,
                            new Anchor(args[0], Types.DataFlowRole.SINK_SOURCE),
                            pi), 4);
            id.join();                        // wait for the pattern thread to finish
            System.out.println("The result is: " + pi.getPi());
        } catch (SecurityException e) {
            e.printStackTrace();
        } catch (InterruptedException e) {    // Thread.join() can throw this checked exception
            e.printStackTrace();
        }
    }
}
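
The module behind the bootstrap classes above is not shown on the slides. As a rough idea of what a thread-based workpool computing pi might look like, here is a sketch built only on the standard `java.util.concurrent` library, not on Seeds. The class `MonteCarloPiSketch`, the methods `hits` and `estimatePi`, the per-task seeding, and the task and sample counts are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Thread-pool sketch of a Monte Carlo pi workpool; standard Java only, not Seeds. */
class MonteCarloPiSketch {

    /** One task's work: count random points in the unit square that fall inside the quarter circle. */
    static long hits(long samples, long seed) {
        Random random = new Random(seed);
        long inside = 0;
        for (long i = 0; i < samples; i++) {
            double x = random.nextDouble();
            double y = random.nextDouble();
            if (x * x + y * y <= 1.0) {
                inside++;
            }
        }
        return inside;
    }

    /** Hands out the tasks to a fixed pool of threads, then adds up the hit counts. */
    static double estimatePi(int tasks, long samplesPerTask, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int t = 0; t < tasks; t++) {
                final long seed = t;                     // a distinct, repeatable seed per task
                Callable<Long> task = () -> hits(samplesPerTask, seed);
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : results) {
                total += f.get();                        // blocks until that task is done
            }
            return 4.0 * total / ((double) tasks * samplesPerTask);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("Estimate of pi: " + estimatePi(8, 250_000L, 4));
    }
}
```

Because each task owns its random generator and seed, the estimate does not depend on how many threads are used or on the order in which tasks finish.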

Measuring Time

Can instrument code in the bootstrap class:

public class RunMyModule {
    public static void main(String[] args) {
        try {
            long start = System.currentTimeMillis();
            MyModule m = new MyModule();
            Seeds.start( … );
            PipeID id = ( … );
            Seeds.waitOnPattern(id);
            Seeds.stop();
            long stop = System.currentTimeMillis();
            double time = (double) (stop - start) / 1000.0;   // elapsed time in seconds
            System.out.println("Execution time = " + time);
        } catch (SecurityException e) { …
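
As a variation on the instrumentation above, the timing can be factored into a small helper. The sketch below is an assumption-laden alternative, not part of the slides: the class `TimingSketch` and method `secondsTaken` are invented names, and it uses `System.nanoTime()`, which the Java documentation describes as intended for measuring elapsed time, in place of the wall-clock `System.currentTimeMillis()`.

```java
/** Small helper for timing a block of work; an illustrative alternative to inline timing. */
class TimingSketch {

    /** Runs the work once and returns the elapsed time in seconds. */
    static double secondsTaken(Runnable work) {
        long start = System.nanoTime();
        work.run();
        long stop = System.nanoTime();
        return (stop - start) / 1e9;
    }

    public static void main(String[] args) {
        // Placeholder workload; in a bootstrap class this would be the start/wait/stop sequence.
        double time = secondsTaken(() -> {
            long sum = 0;
            for (int i = 0; i < 10_000_000; i++) {
                sum += i;
            }
            System.out.println("Sum = " + sum);
        });
        System.out.println("Execution time = " + time);
    }
}
```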

Compiling/executing

Can be done on the command line (ant script provided) or through an IDE (Eclipse).

Pattern Programming with the Seeds Framework

Workpool pattern

Matrix addition and multiplication

Matrix Addition, C = A + B

Add corresponding elements of each matrix to form the elements of the result matrix. Given elements of A as a_{i,j} and elements of B as b_{i,j}, each element of C is computed as:

$c_{i,j} = a_{i,j} + b_{i,j}$

[Figure: A and B are added element by element to give C.]

Easy to parallelize: each processor computes one C element or a group of C elements.
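
Before the workpool version, a plain sequential sketch may help fix the arithmetic. The class `MatrixAddSketch` and its methods are illustrative names, not part of Seeds; the input values are the ones used for matrixA and matrixB later in these slides. The addition is already organized one row at a time, which is the unit of work handed to each slave in the workpool implementation.

```java
import java.util.Arrays;

/** Sequential matrix addition, organized row by row; illustrative names only. */
class MatrixAddSketch {

    /** Adds two rows element by element. */
    static int[] addRow(int[] rowA, int[] rowB) {
        int[] rowC = new int[rowA.length];
        for (int j = 0; j < rowA.length; j++) {
            rowC[j] = rowA[j] + rowB[j];
        }
        return rowC;
    }

    /** Adds two matrices of the same shape, one row at a time. */
    static int[][] add(int[][] a, int[][] b) {
        int[][] c = new int[a.length][];
        for (int i = 0; i < a.length; i++) {
            c[i] = addRow(a[i], b[i]);
        }
        return c;
    }

    public static void main(String[] args) {
        int[][] a = {{2, 5, 8}, {3, 4, 9}, {1, 5, 2}};
        int[][] b = {{2, 5, 8}, {3, 4, 9}, {1, 5, 2}};
        // prints [[4, 10, 16], [6, 8, 18], [2, 10, 4]]
        System.out.println(Arrays.deepToString(add(a, b)));
    }
}
```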

Workpool Implementation

Slave computation: adds one row of A to one row of B to create one row of C (rather than each slave adding single elements).

[Figure: The master (source/sink) holds A, B, and C and sends one row of A and B to each slave (compute node, one for each row); each slave returns one row of C.]

The following example uses 3 x 3 arrays and 3 slaves.

Workpool implementation

MatrixAddModule.java

Continues on several slides.

package edu.uncc.grid.example.workpool;

import …

public class MatrixAddModule extends Workpool {
    private static final long serialVersionUID = 1L;

    int[][] matrixA;
    int[][] matrixB;
    int[][] matrixC;

    public MatrixAddModule() {
        matrixC = new int[3][3];               // in this example matrices are 3 x 3
    }

    public void initMatrices() {               // some initial values
        matrixA = new int[][]{{2,5,8},{3,4,9},{1,5,2}};
        matrixB = new int[][]{{2,5,8},{3,4,9},{1,5,2}};
    }

    public int getDataCount() {                // required method: number of data objects (slaves)
        return 3;
    }

    public void initializeModule(String[] args) {
        Node.getLog().setLevel(Level.WARNING);
    }


DiffuseData method

    public Data DiffuseData(int segment) {
        int[] rowA = new int[3];
        int[] rowB = new int[3];
        DataMap<String, int[]> d = new DataMap<String, int[]>();
        int k = segment;                       // segment variable used to select rows
        for (int i = 0; i < 3; i++) {          // copy one row of A and one row of B into rowA, rowB
            rowA[i] = matrixA[k][i];
            rowB[i] = matrixB[k][i];
        }
        d.put("rowA", rowA);                   // rowA and rowB put in DataMap d to send to slaves
        d.put("rowB", rowB);
        return d;                              // d holds pairs of string key and associated array
    }


Compute method

    public Data Compute(Data data) {
        int[] rowC = new int[3];
        DataMap<String, int[]> input = (DataMap<String, int[]>) data;
        DataMap<String, int[]> output = new DataMap<String, int[]>();
        int[] rowA = (int[]) input.get("rowA");   // get two rows from data received
        int[] rowB = (int[]) input.get("rowB");
        for (int i = 0; i < 3; i++) {             // add rows
            rowC[i] = rowA[i] + rowB[i];
        }
        output.put("rowC", rowC);                 // put result row into output with key, to be sent back to master
        return output;
    }

GatherData method

    public void GatherData(int segment, Data dat) {   // note segment variable and Data from slave
        DataMap<String, int[]> out = (DataMap<String, int[]>) dat;
        int[] rowC = (int[]) out.get("rowC");         // get C row sent from slave
        for (int i = 0; i < 3; i++) {                 // place row into result matrix
            matrixC[segment][i] = rowC[i];            // segment associated with Data selects the correct row
        }
    }
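
Why the segment number matters becomes clearer when results come back in an unpredictable order. The sketch below emulates that with a standard Java thread pool and a `CompletionService`, which hands back results in completion order rather than submission order. It is not Seeds code: `SegmentGatherSketch`, `RowResult`, and `addRows` are hypothetical names, and the pool merely stands in for the slaves.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Emulates gathering rows that arrive in completion order; standard Java only, not Seeds. */
class SegmentGatherSketch {

    /** A computed row of C together with the segment (row index) it belongs to. */
    static final class RowResult {
        final int segment;
        final int[] row;

        RowResult(int segment, int[] row) {
            this.segment = segment;
            this.row = row;
        }
    }

    static int[][] addRows(int[][] a, int[][] b, int workers) {
        int[][] c = new int[a.length][];
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        CompletionService<RowResult> finished = new ExecutorCompletionService<>(pool);
        try {
            for (int s = 0; s < a.length; s++) {          // "diffuse": one task per row
                final int segment = s;
                final int[] rowA = a[s];
                final int[] rowB = b[s];
                Callable<RowResult> task = () -> {        // "compute": add the two rows
                    int[] rowC = new int[rowA.length];
                    for (int i = 0; i < rowC.length; i++) {
                        rowC[i] = rowA[i] + rowB[i];
                    }
                    return new RowResult(segment, rowC);
                };
                finished.submit(task);
            }
            for (int k = 0; k < a.length; k++) {          // "gather": results in completion order
                RowResult r = finished.take().get();
                c[r.segment] = r.row;                     // segment picks the correct row of C
            }
            return c;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[][] m = {{2, 5, 8}, {3, 4, 9}, {1, 5, 2}};
        System.out.println(Arrays.deepToString(addRows(m, m, 3)));
    }
}
```

Whatever order the tasks finish in, each row lands in the right place because the segment travels with the result.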

Bootstrap class - RunMatrixAddModule.java

package edu.uncc.grid.example.workpool;

import …

public class RunMatrixAddModule {
    public static void main(String[] args) {
        try {
            long start = System.currentTimeMillis();
            Seeds.start(args[0], false);
            MatrixAddModule m = new MatrixAddModule();
            m.initMatrices();
            PipeID id = Seeds.startPattern(new Operand((String[]) null,
                    new Anchor(args[1], Types.DataFlowRoll.SINK_SOURCE), m));
            Seeds.waitOnPattern(id);
            m.printResult();
            Seeds.stop();
            long stop = System.currentTimeMillis();
            double time = (double) (stop - start) / 1000.0;
            System.out.println("Execution time = " + time);
        } catch (Exception e) {     // handler assumed here; the source listing ends before it
            e.printStackTrace();
        }
    }
}

In this example the path to Seeds and the local host name are command-line arguments.

Matrix Multiplication, C = A * B

One slave computes one element of the result in the workpool implementation.

[Figure: The master (source/sink) holds A, B, and C and sends one row of A and one column of B to each slave (compute node, one for each element of the result); each slave returns one element of C.]

The following example uses 3 x 3 arrays and 9 slaves.

Workpool implementation

MatrixMultiplyModule.java

Continues on several slides.

package edu.uncc.grid.example.workpool;

import …

public class MatrixMultiplyModule extends Workpool {
    private static final long serialVersionUID = 1L;

    int[][] matrixA;
    int[][] matrixB;
    int[][] matrixC;

    public MatrixMultiplyModule() {
        matrixC = new int[3][3];               // in this example matrices are 3 x 3
    }

    public void initMatrices() {               // some initial values
        matrixA = new int[][]{{2,5,8},{3,4,9},{1,5,2}};
        matrixB = new int[][]{{2,5,8},{3,4,9},{1,5,2}};
    }

    public int getDataCount() {                // required method: number of data objects (slaves)
        return 9;
    }

    public void initializeModule(String[] args) {
        Node.getLog().setLevel(Level.WARNING);
    }

Note on mapping rows and columns to segments

             Arow   Bcol
segment 0     0      0
segment 1     0      1
segment 2     0      2
segment 3     1      0
segment 4     1      1
segment 5     1      2
segment 6     2      0
segment 7     2      1
segment 8     2      2

int Arow = segment / 3;
int Bcol = segment % 3;
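
The mapping generalizes from 3 to any number of columns n, and it can be inverted. The short sketch below uses invented names (`SegmentMappingSketch`, `rowOf`, `colOf`, `segmentOf`) and simply reproduces the table above using integer division and remainder.

```java
/** Maps a flat segment number to (row, column) and back; illustrative names only. */
class SegmentMappingSketch {

    /** Row of A used by this segment, for a result matrix with n columns. */
    static int rowOf(int segment, int n) {
        return segment / n;
    }

    /** Column of B used by this segment, for a result matrix with n columns. */
    static int colOf(int segment, int n) {
        return segment % n;
    }

    /** Inverse mapping: the segment that computes element (row, col). */
    static int segmentOf(int row, int col, int n) {
        return row * n + col;
    }

    public static void main(String[] args) {
        int n = 3;
        System.out.println("segment  Arow  Bcol");
        for (int segment = 0; segment < n * n; segment++) {
            System.out.println("   " + segment + "      " + rowOf(segment, n)
                    + "     " + colOf(segment, n));
        }
    }
}
```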


DiffuseData method

    public Data DiffuseData(int segment) {
        int[] rowA = new int[3];
        int[] colB = new int[3];
        DataMap<String, int[]> d = new DataMap<String, int[]>();
        int a = segment / 3, b = segment % 3;  // segment variable used to select row of A and column of B
        for (int i = 0; i < 3; i++) {          // copy one row of A and one column of B into rowA, colB
            rowA[i] = matrixA[a][i];
            colB[i] = matrixB[i][b];
        }
        d.put("rowA", rowA);                   // rowA and colB put in DataMap d to send to slaves
        d.put("colB", colB);
        return d;                              // d holds pairs of string key and associated array
    }


Compute method

    public Data Compute(Data data) {
        DataMap<String, int[]> input = (DataMap<String, int[]>) data;
        DataMap<String, Integer> output = new DataMap<String, Integer>();
        int[] rowA = (int[]) input.get("rowA");   // get row of A and column of B from data received
        int[] colB = (int[]) input.get("colB");
        int out = 0;
        for (int i = 0; i < 3; i++) {             // matrix multiplication, one result element
            out += rowA[i] * colB[i];
        }
        output.put("out", out);                   // put result into output with key, to be sent back to master
        return output;
    }

GatherData method

    public void GatherData(int segment, Data dat) {   // note segment variable and Data from slave
        DataMap<String, Integer> out = (DataMap<String, Integer>) dat;
        int answer = out.get("out");                  // get result sent from slave *
        int a = segment / 3, b = segment % 3;         // segment associated with Data selects the correct element
        matrixC[a][b] = answer;                       // place element into result matrix
    }

* Cast from Integer to int not necessary (auto-unboxing).
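
To check what the nine segments should produce, the same computation can be written sequentially. The sketch below is plain Java with invented names (`MatrixMultiplySketch`, `element`, `multiply`); each call to `element` does the work of one slave, a dot product of one row of A with one column of B, and the loop over segments plays the role of diffuse and gather. The inputs are the matrixA and matrixB values from the slides.

```java
import java.util.Arrays;

/** Sequential element-by-element matrix multiplication; illustrative names only. */
class MatrixMultiplySketch {

    /** One unit of work: the dot product of row a of A and column b of B. */
    static int element(int[][] matrixA, int[][] matrixB, int a, int b) {
        int out = 0;
        for (int i = 0; i < matrixB.length; i++) {
            out += matrixA[a][i] * matrixB[i][b];
        }
        return out;
    }

    /** Computes C = A * B by visiting one segment per element of C. */
    static int[][] multiply(int[][] matrixA, int[][] matrixB) {
        int rows = matrixA.length;
        int cols = matrixB[0].length;
        int[][] matrixC = new int[rows][cols];
        for (int segment = 0; segment < rows * cols; segment++) {
            int a = segment / cols;
            int b = segment % cols;
            matrixC[a][b] = element(matrixA, matrixB, a, b);
        }
        return matrixC;
    }

    public static void main(String[] args) {
        int[][] a = {{2, 5, 8}, {3, 4, 9}, {1, 5, 2}};
        int[][] b = {{2, 5, 8}, {3, 4, 9}, {1, 5, 2}};
        // prints [[27, 70, 77], [27, 76, 78], [19, 35, 57]]
        System.out.println(Arrays.deepToString(multiply(a, b)));
    }
}
```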

Bootstrap class - RunMatrixMultiplyModule.java

JXTA P2P version

package edu.uncc.grid.example.workpool;

import …

public class RunMatrixMultiplyModule {
    public static String localhost = "T5400";                     // name of local machine
    public static String seedslocation = "C:\\seeds_2.0\\pgaf";

    public static void main(String[] args) {
        try {
            long start = System.currentTimeMillis();
            Seeds.start(seedslocation, false);
            MatrixMultiplyModule m = new MatrixMultiplyModule();
            m.initMatrices();
            PipeID id = Seeds.startPattern(new Operand((String[]) null,
                    new Anchor(localhost, Types.DataFlowRoll.SINK_SOURCE), m));
            Seeds.waitOnPattern(id);
            m.printResult();
            Seeds.stop();
            long stop = System.currentTimeMillis();
            double time = (double) (stop - start) / 1000.0;
            System.out.println("Execution time = " + time);
        } catch (Exception e) {     // handler assumed here; the source listing ends before it
            e.printStackTrace();
        }
    }
}

In this example, the local host and the path to Seeds are hardcoded.

Acknowledgements

Extending work to teaching environment supported by the National Science Foundation under grant "Collaborative Research: Teaching Multicore and Many-Core Programming at a Higher Level of Abstraction" #1141005/1141006 (2012-2015).

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Work initiated by Jeremy Villalobos in his PhD thesis "Running Parallel Applications on a Heterogeneous Environment with Accessible Development Practices and Automatic Scalability," UNC-Charlotte, 2011. Jeremy developed the "Seeds" pattern programming software.

Department of Computer Science


Pattern Programming Group


http://coitweb.uncc.edu/~abw/PatternProgGroup/


Please contact B. Wilkinson if you would like to be
involved in this work for academic credit:



Undergraduate senior project/capstone project
(ITCS 4650/51/81/82, ITCS 4990/91)


Graduate level ITCS 6880 Individual Study


MS thesis (ITCS 6991)


(Currently no student funding available.)

Questions