Distributed Programming CA107 Topics in Computing Series - DCU

Dec 1, 2013

Distributed Programming

CA107 Topics in Computing Series

Martin Crane

Karl Podesta

The Basics…


What is a Distributed System (DS)?


How does it differ from a Parallel Computer (MPP)?


the differences have become fuzzy… both are now called Supercomputers or High Performance Computers (HPC)


Supercomputers and Supermodels:


both expensive


both hard to deal with/prone to tantrums


both look glamorous but...


Both spend lots of time doing tedious tasks for others:


mostly matrix-vector products for Supercomputers


being live mannequins for Supermodels

Why High Performance Computing?


Solve larger and larger scientific problems


advanced product design


economic analysis


weather prediction/ climate modelling



Store and process huge amounts of data


data mining and knowledge discovery


image processing, multimedia information


internet information storage and search (e.g. Google)

Different Supercomputers (MPPs) in Your Neighbourhood


Single Instruction, Multiple Data (SIMD)


as seen on PlayStation 2


very useful for processing large arrays, e.g. a(i) = b(i) + c(i)*d(i), as are found in games


Multiple Instruction, Multiple Data (MIMD)


as seen in Deep Blue


But these are dinosaurs - we want something more flexible
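The SIMD array update from this slide, a(i) = b(i) + c(i)*d(i), can be sketched roughly as below. This is only an illustration in plain Python, not real vector hardware: the lane width of 4 and the function name `simd_multiply_add` are assumptions made for the example. The point is that one instruction (a multiply-add) is applied to several data elements in each step.

```python
LANES = 4  # assumed vector width; real hardware fixes this in silicon


def simd_multiply_add(b, c, d, lanes=LANES):
    """Compute a[i] = b[i] + c[i] * d[i], one chunk of `lanes` elements per step."""
    if not (len(b) == len(c) == len(d)):
        raise ValueError("input arrays must have the same length")
    a = []
    steps = 0
    for start in range(0, len(b), lanes):
        stop = start + lanes
        # One "instruction" (multiply-add) applied to every lane of this chunk.
        a.extend(
            bi + ci * di
            for bi, ci, di in zip(b[start:stop], c[start:stop], d[start:stop])
        )
        steps += 1
    return a, steps


b = [1, 2, 3, 4, 5, 6, 7, 8]
c = [2, 2, 2, 2, 3, 3, 3, 3]
d = [10, 20, 30, 40, 50, 60, 70, 80]
a, steps = simd_multiply_add(b, c, d)
print(a, steps)
```

With 8 elements and 4 assumed lanes, the loop takes 2 steps instead of 8, which is the whole attraction of SIMD for large arrays.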

Problems with the Traditional Supercomputer (i.e. MPP)


Expensive


Very high starting cost ($10,000s per node)


Expensive software


High maintenance cost


Costly to upgrade



Vendor dependent


lots of companies have come and gone (Datacube, Connection Machines, etc.)

So, real/poor people cannot do HPC!

PC Cluster: a poor man’s supercomputer!


built from high-end PCs and a high-speed comms network


supports standard parallel programming based on the message-passing model (MPI, the Message Passing Interface standard)


cheap (a 16-node cluster can cost less than $10k)
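The message-passing model mentioned on this slide can be sketched as follows. This is a toy, not MPI: threads stand in for cluster nodes and `queue.Queue` objects stand in for the network, and the names `worker` and `parallel_sum` are hypothetical. On a real cluster the same scatter/compute/gather pattern would be written with MPI send and receive calls between machines.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Receive a chunk of numbers, sum it, and send the partial result back."""
    chunk = inbox.get()              # blocking receive
    outbox.put((rank, sum(chunk)))   # send to the master


def parallel_sum(data, n_workers=4):
    """Scatter `data` over the workers, then gather and combine partial sums."""
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(n_workers)
    ]
    for t in threads:
        t.start()
    # Master scatters one strided slice of the data to each worker.
    for rank in range(n_workers):
        inboxes[rank].put(data[rank::n_workers])
    # Master gathers one partial sum per worker and combines them.
    total = sum(results.get()[1] for _ in range(n_workers))
    for t in threads:
        t.join()
    return total


print(parallel_sum(list(range(1, 101))))
```

The workers never touch each other's data directly; everything they know arrives as a message, which is what lets the same design run across separate PCs.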

Cluster Diagram Here

DCU CA Cluster Resources


“John the Baptist” Cluster


built by Redbrick using old CA machines


24 individual 450MHz machines


connected by a Fast Ethernet switch


harbinger of better things…


“The one that is to come”…


24 SMP machines


each with 2 GHz processors


plus loadsa memory!


arrives about Xmas time, appropriately enough

What are the issues in HPC?


Communication vs Computation


size/ nature of problem


interconnect speed/ processor speed


Fault tolerance


quality of hardware


nature of problem


Load balancing


nature of problem/ quality of programmer


even an easy problem can be made difficult and slow by a bad implementation
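The load-balancing point above can be illustrated with a small sketch. It compares two hypothetical ways of handing out the same tasks: fixed equal-sized blocks per node, versus giving each task to whichever node is free first. The task times, the node count, and the function names are made up for the example, and communication cost is ignored.

```python
import heapq


def static_makespan(task_times, n_nodes):
    """Split tasks into contiguous equal-count blocks, one block per node."""
    if not task_times:
        return 0
    block = -(-len(task_times) // n_nodes)  # ceiling division
    loads = [sum(task_times[i:i + block]) for i in range(0, len(task_times), block)]
    return max(loads)  # the run finishes when the busiest node finishes


def dynamic_makespan(task_times, n_nodes):
    """Hand each task, in order, to whichever node becomes free first."""
    free_at = [0] * n_nodes  # min-heap of times at which each node is idle
    for t in task_times:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + t)
    return max(free_at)


tasks = [8, 7, 6, 5, 1, 1, 1, 1]  # hypothetical task durations
print(static_makespan(tasks, 4), dynamic_makespan(tasks, 4))
```

The total work is identical in both cases; only the assignment differs, which is one way an easy problem ends up slow under a careless implementation.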

Influence of Nature of Problem on Speed


What is speed?


speed-up is a better measure: Time on 1 node / Time on n nodes


Speed-up and Problems


very good: embarrassingly parallel problems


fair to middling: regular and synchronous problems


a bit of cross-talk between nodes


bad: irregular/asynchronous problems


lots of cross-talk between nodes
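The speed-up ratio and the effect of cross-talk can be sketched numerically. The `speed_up` function is the slide's definition; everything else is an assumption for illustration: the toy cost model in `modelled_time`, the cross-talk costs, the 100-second single-node time, and the 10 nodes are not measurements.

```python
def speed_up(time_on_1_node, time_on_n_nodes):
    """Speed-up as defined on the slide: time on 1 node / time on n nodes."""
    return time_on_1_node / time_on_n_nodes


def modelled_time(t1, n, cross_talk):
    """Assumed toy model: perfectly divided compute, plus a communication
    cost that grows with the number of other nodes each node talks to."""
    return t1 / n + cross_talk * (n - 1)


T1 = 100.0  # hypothetical single-node run time in seconds
N = 10      # hypothetical node count
for label, cross_talk in [
    ("embarrassingly parallel", 0.0),  # no cross-talk
    ("regular/synchronous", 0.5),      # a bit of cross-talk
    ("irregular/asynchronous", 5.0),   # lots of cross-talk
]:
    tn = modelled_time(T1, N, cross_talk)
    print(f"{label}: speed-up {speed_up(T1, tn):.2f} on {N} nodes")
```

Under this model only the problem with no cross-talk reaches the ideal speed-up of n; the more the nodes have to talk, the further the ratio falls below it.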