10 Nov 2013

csinparallel.org

Patterns and Exemplars: Compelling Strategies for Teaching Parallel and Distributed Computing to CS Undergraduates

Libby Shoop, Joel Adams, Dick Brown

Today’s messages

- Parallel Design Patterns provide an established, practical set of principles for teaching PDC
- “Exemplar” example applications with multiple implemented solutions provide motivation for students and teaching materials for instructors
- Patterns and Exemplars fit together naturally and are ready for deployment

Parallel Design Patterns

- Following on the original Gang of Four design patterns work
- Active work on parallel design patterns and parallel pattern languages:
  - Catalog parallel patterns used in solutions and describe a methodology for using the patterns

Past Work

- Lea: Java Concurrency Patterns book
- Mattson, Sanders, and Massingill: PPLP book
- Ralph Johnson et al.: Parallel Programming Patterns online; books of Visual C++, .NET examples
- Ortega-Arjona book
- McCool, Reinders, and Robison book
- Keutzer, Mattson, et al.: Our Pattern Language (OPL) online
- ParaPLoP: Workshop on Parallel Programming Patterns (ParaPLoP ’10)

Timeline years shown on the slide: 1999, 2004, 2010, 2011, 2010, 2012

Pattern Approach

- Using existing design knowledge when designing new parallel programs
- Leads to parallel software systems that are:
  - modular, adaptable, understandable, and easy to evolve
- Also provides an effective problem-solving framework and a guide for teaching about good parallel solutions

PATTERNLETS

Patternlets

… are minimalist, scalable, executable programs, each illustrating a particular pattern’s behavior:

- Minimalist, so that students can grasp the concept without non-essential details getting in the way
- Scalable, so that students see different behaviors as the number of threads changes
- Executable, so that
  - Instructors can use it in a live-coding demo
  - Students can use it in a hands-on exercise

Patternlets let students see the pattern in action

Existing Patternlets (so far)

OpenMP Patternlets
- Fork-Join
- SPMD
- Master-Worker
- Parallel For Loop (blocks)
- Parallel For Loop (stripes)
- Reduction
- Private
- Atomic
- Critical
- Critical2
- Sections
- Barrier

MPI Patternlets
- SPMD
- Master-Worker
- Message Passing
- Parallel For Loop (stripes)
- Parallel For Loop (blocks)
- Broadcast
- Reduction
- Scatter
- Gather
- Barrier

/* masterWorker.c (MPI) … */

#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv) {
    int id = -1, numProcs = -1, length = -1;
    char hostName[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);
    MPI_Comm_size(MPI_COMM_WORLD, &numProcs);
    MPI_Get_processor_name(hostName, &length);

    if ( id == 0 ) {  // process with ID == 0 is the master
        printf("Greetings from the master, #%d (%s) of %d processes\n",
               id, hostName, numProcs);
    } else {          // processes with IDs > 0 are workers
        printf("Greetings from a worker, #%d (%s) of %d processes\n",
               id, hostName, numProcs);
    }

    MPI_Finalize();
    return 0;
}

Sample Executions

$ mpirun -np 1 ./masterWorker
Greetings from the master, #0 (node-01) of 1 processes

$ mpirun -np 8 ./masterWorker
Greetings from the master, #0 (node-01) of 8 processes
Greetings from a worker, #1 (node-02) of 8 processes
Greetings from a worker, #5 (node-06) of 8 processes
Greetings from a worker, #3 (node-04) of 8 processes
Greetings from a worker, #4 (node-05) of 8 processes
Greetings from a worker, #7 (node-08) of 8 processes
Greetings from a worker, #2 (node-03) of 8 processes
Greetings from a worker, #6 (node-07) of 8 processes

/* masterWorker.c (OpenMP) … */

#include <stdio.h>
#include <omp.h>

int main(int argc, char** argv) {

    // #pragma omp parallel
    {
        // declared inside the block so that each thread has its own copy
        // once the pragma is enabled (shared variables here would race)
        int id = omp_get_thread_num();
        int numThreads = omp_get_num_threads();

        if ( id == 0 ) {  // thread with ID 0 is master
            printf("Greetings from the master, #%d of %d threads\n\n",
                   id, numThreads);
        } else {          // threads with IDs > 0 are workers
            printf("Greetings from a worker, #%d of %d threads\n\n",
                   id, numThreads);
        }
    }

    return 0;
}

Sample Executions

$ ./masterWorker        // pragma omp parallel disabled
Greetings from the master, #0 of 1 threads

$ ./masterWorker        // pragma omp parallel enabled
Greetings from a worker, #1 of 8 threads
Greetings from a worker, #2 of 8 threads
Greetings from a worker, #5 of 8 threads
Greetings from a worker, #3 of 8 threads
Greetings from a worker, #6 of 8 threads
Greetings from the master, #0 of 8 threads
Greetings from a worker, #4 of 8 threads
Greetings from a worker, #7 of 8 threads

EXEMPLARS

Motivation

- Everyone in CS needs PDC
- Not everyone is naturally drawn to PDC topics

How shall we motivate every CS undergraduate to learn the PDC they will need for their careers?

Proposal: Teach PDC concepts with compelling applications.
- Some CS students are drawn by concepts and tech
- Other CS students are drawn by the applications

Exemplars

An exemplar is:

- A representative applied problem

plus

- multiple code solutions implemented in various PDC technologies, with commentary

Exemplar A (from EAPF Practicum)

- Compute π via numerical integration
- Implemented solutions
  - Serial
  - Shared memory (OpenMP, TBB, pthreads, Windows Threads, Go language)
  - Distributed computing (MPI)
  - Accelerators (CUDA, Array Building Blocks)
- Comments:
  - Flexible uses: demo, concepts, tech, compare
  - But not a compelling application

Exemplar B (from EAPF Practicum)

- Drug design
- Implemented solutions
  - Serial
  - Shared memory (OpenMP, boost threads, Go language)
  - Map-reduce framework (Hadoop)
Introduction to the Drug Design Exemplar

Problem definition

An important problem in the biological sciences is the drug design problem. The goal is to find small molecules, called ligands, that are good candidates for use as drugs.

At a high level, the problem is simple to state. A protein associated with an interesting disease is identified. The three-dimensional structure of a target protein for the desired drug is found by some means (experimentally or through a molecular modeling computation). A collection of ligands is tested against the protein using a docking algorithm: for every orientation of the ligand relative to the protein, a computation tests if the ligand binds with the protein in useful ways (for example, tying up a biologically active region on the protein). A score is set depending on these binding properties and the best scores are flagged to identify the ligands that would make good drug candidates.

Application Architecture Level Patterns

The application architecture level patterns constitute the highest level in the OPL hierarchy of patterns, and concern the architectural design of large software. As described in the module Introduction to Parallel Design Patterns, there are two kinds of application architecture level patterns: structural patterns describe the overall organization of a software application, and how an application’s computational patterns interact; and computational patterns describe the essential classes of computations that make up an application. Note that some application architecture patterns may not lead to parallelism; for example, the Finite state machine computational pattern often cannot be programmed effectively in parallel.

As a starting point for applying the pattern methodology to this problem, we first observe that we want to apply the same computation (comparing ligands to the target protein) to various data,

Exemplar B (from EAPF Practicum)

- Comments
  - Compelling application
  - Molecular dynamics, docking algorithm
  - Substitute for docking algorithm to score ligands (score is maximal match count)
  - Relates to genetic alignment algorithm
  - Multiple ways to scale: # ligands, ligand length, # cores
  - Random strings with random lengths for variable computational load per ligand
Working with actual ligand and protein data is beyond the scope of this module, so we will represent the computation by a simpler string-based comparison. Specifically,

- Proteins and ligands will be represented as (randomly-generated) character strings.
- The docking-problem computation will be represented by comparing a ligand string L to a protein string P. The score for a pair [L, P] will be the maximum number of matching characters among all possibilities when L is compared to P, moving from left to right, allowing possible insertions and deletions.

For example, if L is the string “cxtbcrv” and P is the string “lcacxtqvivg” then the score is 4, arising from this comparison of L to a segment of P:

    l c a c x t q       v i v g
          c x t   b c r v

This is not the only comparison of that ligand to that protein that yields four matching characters. Another one is

    l c a c x t q v i       v g
          c x t       b c r v

However, there is no comparison that matches five characters while moving from left to right, so the score is 4.



A sequential implementation

We will develop and discuss parallel solutions to our simplified drug design problem in other modules. For now, we will present sequential code for a solution, as a reference point for the parallel solutions. The example code dd_serial.cpp (see Appendix) provides such an implementation in C++.

1. Introducing the implementation

In this implementation, the class MR encapsulates the map-reduce steps Generate_tasks(), Map(), and Reduce() as private methods (member functions), and a public method run() invokes those steps according to a map-reduce algorithmic strategy. We have highlighted calls to the methods representing map-reduce steps in the following code segment from MR::run().

    Generate_tasks(tasks);
    // assert -- tasks is non-empty

    while (!tasks.empty()) {
      Map(tasks.front(), pairs);
      tasks.pop();
    }

    do_sort(pairs);

    int next = 0;  // index of first unprocessed pair in pairs[]
    while (next < pairs.size()) {

Exemplars + Patterns

- Exemplar implementations offer a rich opportunity for learning patterns
- Examples
  - π as area (among 8 PDC implementations): Data Decomposition, Geometric Decomposition; Parallel For Loop, Master-Worker, Strict Data Parallel, Distributed Array; SIMD, Thread Pool, Message Passing, Collective Communication, Mutual Exclusion
  - Drug design (among 4 PDC implementations): Map-Reduce; Data Decomposition; Parallel For Loop, Fork-Join, BSP, Master-Worker, Task Queue, Shared Array, Shared Queue; Thread Pool, Message Passing, Mutual Exclusion

Conclusion

- Patterns: a meaning for “parallel thinking,” best practice from industry
- Patternlets: minimalist, scalable, executable programs, each illustrating a particular pattern’s behavior
- Exemplars: motivation, hands-on/demo, teaching resource, opportunities for PDC
- These are naturally combined and ready for deployment