Programming Multicores in Erlang Joe Armstrong Ericsson

Nov 13, 2013

1
Programming Multicores in Erlang
Joe Armstrong
Ericsson
2
Plan


Theory


Why parallel programming is difficult


Erlang - 1 minute intro


Make some abstractions for parallelism


Demo them (time permitting)


Does it work?


Who uses Erlang?
3
4
To program a multicore we must write
parallel programs
5
Why is parallel programming difficult?
6
Programming parallel algorithms in languages designed for
sequential programming is not only extremely difficult

it is also crazy


7
Erlang vs C etc. ?

C etc. - start off with sequential stuff, try to
make it parallel.

Erlang - start off with parallel stuff, try to avoid
sequential parts.


8
Erlang just goes faster

Erlang programmers have been writing parallel
programs since 1986. When multicores came,
many of their programs just went faster without
them having to do anything.[1]

[1] Provided they had used lots of processes and had no sequential bottlenecks, and ....


9
Erlang does lots of other things

Code change on-the-fly

Error detection and correction

Transparent distribution

Soft real-time GC

Process isolation

Designed for non-stop systems

Millions of lines of code

Dozens to hundreds of nodes
10
Why is parallel programming difficult?

Algorithmic difficulties

Pragmatic difficulties (latency, failure)

Accidental difficulties (wrong programming
language)


11
Using the wrong abstractions makes life artificially difficult
“Accidental complexity” is caused not by the problems
themselves but by the technology we use to solve them


XLVIII x XCIII

=
MMMMCDLXIV
13


ActionScript Ada Algol Apl AppleScript Assembly Awk

BAL Bash Basic Bcpl C C# C++ C--

CLOS Caml Clojure Clu Cobol CobolScript ColdFusion

Comal D Datalog Delphi Dylan EcmaScript Eiffel F#

FP Factor Focal Forth Fortran Groovy Haskell J

Java JavaFX JavaScript Lisp Logo Lua Lucid

Mathematica Matlab Miranda Modula-2 Modula-3 Nial

OCaml Oberon Objective Caml Objective-J

Objective-C Ops5 PHP PL/0 Pascal Perl Pizza Pop-11

Poplog PostScript Prolog Python Ratfor Rebol Ruby

SETL SISAL SQL Scala Scheme Self Simula Smalltalk

Snobol Snowball Spitbol Squeak Standard ML

SuperCollider Tcl Tex Visual Basic XUL
Don't use these languages
14


Use one of these

Go Limbo Occam Oz CHILL Concurrent Clean Erlang

But only two are supported


15

“But can't I just use a message passing library
with my sequential language?”

“No!”

“Do I really have to learn a new language?”

“Yes”

“Great”


16
Efficiency of a concurrent language
depends upon

Process spawn time

Message passing time

Context switching time
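
The three costs listed above can be estimated directly from Erlang. The following is a minimal sketch, not a rigorous benchmark: the module name `conc_cost` and its function names are made up for illustration, and the round-trip figure mixes message passing time with context switching time rather than separating them.

```erlang
-module(conc_cost).
-export([spawn_time/1, roundtrip_time/1]).

%% Average time in microseconds to spawn one process, over N spawns.
%% Each spawned process exits immediately.
spawn_time(N) when N > 0 ->
    {Micros, ok} = timer:tc(fun() -> spawn_n(N) end),
    Micros / N.

spawn_n(0) -> ok;
spawn_n(N) ->
    spawn(fun() -> ok end),
    spawn_n(N - 1).

%% Average time in microseconds for one ping/pong exchange with an
%% echo process, over N exchanges. Each exchange needs two messages
%% and at least two context switches.
roundtrip_time(N) when N > 0 ->
    Echo = spawn(fun echo/0),
    {Micros, ok} = timer:tc(fun() -> ping(Echo, N) end),
    Echo ! stop,
    Micros / N.

echo() ->
    receive
        {From, ping} ->
            From ! {self(), pong},
            echo();
        stop ->
            ok
    end.

ping(_Echo, 0) -> ok;
ping(Echo, N) ->
    Echo ! {self(), ping},
    receive {Echo, pong} -> ok end,
    ping(Echo, N - 1).
```

Calling `conc_cost:spawn_time(100000)` and `conc_cost:roundtrip_time(100000)` gives rough per-operation figures for the machine at hand; the numbers vary with hardware and runtime settings.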
17
Two models of Concurrency
Shared Memory

- mutexes

- threads

- locks
Message
Passing

- messages

- processes

(also known as agent languages and as
“asynchronous programming” by MS)
18
Problem 1
Your program crashes in the critical region
having corrupted memory


19
Thou shalt not
share memory


20
From: Alan Kay <alank@wdi.disney.com>

Date: 1998-10-10 07:39:40 +0200

To: squeak@cs.uiuc.edu

Subject: Re: prototypes vs classes was: Re: Sun's
HotSpot

Folks --
Just a gentle reminder that I took some pains at the last
OOPSLA to try to remind everyone that Smalltalk is not only
NOT its syntax or the class library, it is not even about classes.
I'm sorry that I long ago coined the term "objects" for this topic
because it gets many people to focus on the lesser idea.

The big idea is "messaging"
The Big Idea is Messaging


21


22

Decouples A and B

A and B can proceed in parallel

A and B can be on different machines

A and B on different continents

Failures inside A cannot affect B

The programming model does not change
when there is a scale change

If you use a queue B can be in the future
(it's a time machine)
A sends Msg to B:    B ! Msg
B receives Msg:      receive Msg -> … end


23

“Erlang-like” semantics

Pure message passing

Lightweight processes

Millions of processes

Isolated processes

Code and data can be sent in messages

Error detection and correction across machine boundaries

Higher-order functions/processes


24
Erlang in 1 minute

Function evaluation

spawn

send

receive
fib(0) -> 0;
fib(1) -> 1;
fib(N) when N > 1 -> fib(N-1) + fib(N-2).

> lecture:fib(30).
832040

Pid = spawn(fun() -> ... end)

Pid ! Msg

receive
    Pattern1 ->
        Actions1;
    Pattern2 ->
        Actions2
end
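
The four primitives fit together in a few lines. Below is a minimal sketch of a complete module that can be compiled and run; the module name `lecture` follows the shell call on the slide, while `demo/0` and the `{fib, N}` message shape are assumptions made for illustration.

```erlang
-module(lecture).
-export([fib/1, demo/0]).

%% Function evaluation: plain sequential code.
fib(0) -> 0;
fib(1) -> 1;
fib(N) when N > 1 -> fib(N - 1) + fib(N - 2).

%% spawn, send and receive: start a worker, send it a request,
%% and wait for its answer.
demo() ->
    Parent = self(),
    Pid = spawn(fun() ->
                    receive
                        {fib, N} -> Parent ! {self(), fib(N)}
                    end
                end),
    Pid ! {fib, 30},
    receive
        {Pid, Result} -> Result
    end.
```

The worker tags its reply with its own pid, so the caller only accepts the answer from the process it spawned.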


25
RPC in Erlang

Send a message

Wait for a reply
rpc(Pid, Request) ->
    Pid ! {self(), Request},
    receive
        {Pid, Reply} ->
            Reply
    end.

loop(State) ->
    receive
        {From, Request} ->
            Reply = ....
            State1 = …
            From ! {self(), Reply},
            loop(State1)
    end.
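
The slide leaves the computation of the reply and the new state open. As one possible way to fill it in, here is a minimal sketch of a counter server; the module name `counter`, the `start/0` helper and the `{add, N}` and `get` requests are hypothetical and not part of the original talk.

```erlang
-module(counter).
-export([start/0, rpc/2]).

%% Start the server with an initial state of 0.
start() ->
    spawn(fun() -> loop(0) end).

%% Send a request and wait for the reply from that same process.
rpc(Pid, Request) ->
    Pid ! {self(), Request},
    receive
        {Pid, Reply} -> Reply
    end.

%% The state lives only in the argument of the tail-recursive loop.
loop(State) ->
    receive
        {From, {add, N}} ->
            State1 = State + N,
            From ! {self(), State1},
            loop(State1);
        {From, get} ->
            From ! {self(), State},
            loop(State)
    end.
```

For example, `P = counter:start()` followed by `counter:rpc(P, {add, 5})` returns 5, and a later `counter:rpc(P, get)` returns the accumulated value. Note that `rpc/2` as written waits forever if the server dies; adding a timeout or a monitor would be a natural next step.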


26
Promises/futures/eventual

Liskov/Hewitt/Baker

Dispatch a computation, get a promise
that it will be done later

You ask for the result later
fib(0) -> 0;
fib(1) -> 1;
fib(N) when N > 1 -> fib(N-1) + fib(N-2).

promise(F) ->
    S = self(),
    spawn(fun() -> S ! {self(), F()} end).

yield(Promise) ->
    receive {Promise, Value} -> Value end.

P = promise(fun() -> fib(45) end),
%% do something else in parallel ...
Val = yield(P),
...
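
Put together as a compilable module, the promise abstraction looks roughly like this. It is a sketch under stated assumptions: the module name `promises` and the `demo/0` function are invented for illustration, and a smaller argument than `fib(45)` is used so the example finishes quickly.

```erlang
-module(promises).
-export([promise/1, yield/1, fib/1, demo/0]).

fib(0) -> 0;
fib(1) -> 1;
fib(N) when N > 1 -> fib(N - 1) + fib(N - 2).

%% Run F in a new process. The pid of that process is the promise.
promise(F) ->
    S = self(),
    spawn(fun() -> S ! {self(), F()} end).

%% Block until the value for this particular promise arrives.
yield(Promise) ->
    receive {Promise, Value} -> Value end.

demo() ->
    P = promise(fun() -> fib(25) end),
    %% Work done by the caller while the promise is being computed.
    Other = lists:sum(lists:seq(1, 100)),
    {Other, yield(P)}.
```

Because the reply is tagged with the worker's pid, several outstanding promises can be yielded in any order. If `F` crashes, `yield/1` in this sketch blocks forever, since no reply is ever sent.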


27
par eval

make many promises in parallel
fib(0) -> 0;
fib(1) -> 1;
fib(N) when N > 1 -> fib(N-1) + fib(N-2).

parallel(L) ->
    Promises = [promise(F) || F <- L],
    [yield(P) || P <- Promises].

sequential(L) ->
    [F() || F <- L].

Jobs = [fun() -> fib(I) end || I <- lists:seq(1,35)],
parallel(Jobs),
sequential(Jobs)
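
To see whether the parallel version actually pays off, both evaluators can be timed on the same job list. The sketch below assumes a module named `pareval` and a helper `compare/1`, neither of which appears in the talk; it returns the two wall-clock times in microseconds and checks that both strategies produce the same results.

```erlang
-module(pareval).
-export([parallel/1, sequential/1, compare/1, fib/1]).

fib(0) -> 0;
fib(1) -> 1;
fib(N) when N > 1 -> fib(N - 1) + fib(N - 2).

promise(F) ->
    S = self(),
    spawn(fun() -> S ! {self(), F()} end).

yield(Promise) ->
    receive {Promise, Value} -> Value end.

%% Start every job first, then collect the results in list order.
parallel(L) ->
    Promises = [promise(F) || F <- L],
    [yield(P) || P <- Promises].

%% Evaluate the jobs one after another in the calling process.
sequential(L) ->
    [F() || F <- L].

%% Time both strategies on fib(1)..fib(MaxN).
%% Returns {SequentialMicros, ParallelMicros}.
compare(MaxN) ->
    Jobs = [fun() -> fib(I) end || I <- lists:seq(1, MaxN)],
    {TSeq, RSeq} = timer:tc(fun() -> sequential(Jobs) end),
    {TPar, RPar} = timer:tc(fun() -> parallel(Jobs) end),
    RSeq = RPar,   % crash if the two strategies disagree
    {TSeq, TPar}.
```

With this job list the work is very uneven (the largest `fib` call dominates), so the parallel time is bounded below by the slowest single job rather than by the total divided by the number of cores.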


28
Demo

“Millions of processes”
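
The demo itself is not reproduced in the slides. A minimal sketch of the idea, assuming a hypothetical module `many`, is to spawn N processes that all sit waiting in `receive`, count how many are alive at once, and then tell them to stop.

```erlang
-module(many).
-export([run/1]).

%% Spawn N idle processes, count how many are alive at the same
%% time, then ask them all to terminate. Returns the live count.
run(N) when N >= 0 ->
    Pids = [spawn(fun() -> receive stop -> ok end end)
            || _ <- lists:seq(1, N)],
    Alive = length([P || P <- Pids, is_process_alive(P)]),
    [P ! stop || P <- Pids],
    Alive.
```

The number of simultaneous processes is capped by the value of `erlang:system_info(process_limit)`; starting the runtime with the `+P` flag raises that cap, which is needed before trying this with N in the millions.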


29
The Erlang Experience

Efficient

Scales for very large systems. Copying overhead “not
a problem”

High reliability is possible

Multi-core ready (here-and-now)

Works in large S/W projects (> 1 million lines of code)

Used in many “core” Internet applications

Plays well with other languages (but not in memory)


30
How do we write parallel programs?

Programmer writes code using sequential
functions, spawn, send and receive

Programmer tries to use lots of processes

Processes are mapped onto available cores at
run-time
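
The mapping of processes onto cores is done by the runtime's schedulers, not by the program. How many schedulers the runtime is using can be inspected with `erlang:system_info/1`; the sketch below wraps three such queries in a hypothetical module `cores`.

```erlang
-module(cores).
-export([info/0]).

%% Report how the runtime sees the machine. logical_processors may
%% be the atom 'unknown' on platforms where it cannot be detected.
info() ->
    [{schedulers, erlang:system_info(schedulers)},
     {schedulers_online, erlang:system_info(schedulers_online)},
     {logical_processors, erlang:system_info(logical_processors)}].
```

On a typical multicore machine the runtime starts one scheduler per logical processor, so a program written with many processes uses all cores without any code change.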


31
How we work

Write programs using concurrency to structure the
application

Run on multi-core

Tune if necessary

Fine tuning is sometimes necessary: round-robin scheduling is not
good enough; core affinity is needed


32
Programmer tries to use lots of parallel processes. Is this easy?

Depends on the problem

Most of our problems are embarrassingly parallel.
We don't look for the concurrency; it
hits us in the face. Think of 100K users accessing
a system in parallel: many more sessions than
cores.

Erlang itself (the kernel) spawns off dozens
of processes


33
Does it work?

Depends on the problem

Have seen x 4 speedups on quad cores with no change to the
program. (Have also seen x 0.8 on quads)

In general it's pretty good - we aim for a speedup
of 75%/core - so on 20 cores we aim to go
15 times faster

Why no faster?


34
Why no faster?

Sequential bottlenecks
- Amdahl's law

Core-to-core message passing times are non-uniform
(though not on NOC architectures, for example
Tilera)
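
Amdahl's law makes the sequential bottleneck concrete. As a worked illustration (the symbols $p$, $N$ and $S$ are introduced here and are not from the slides), let $p$ be the fraction of the work that can run in parallel and $N$ the number of cores:

```latex
S(N) = \frac{1}{(1 - p) + \dfrac{p}{N}}

% Solving for p, given a target speedup S on N cores:
p = \frac{1 - 1/S}{1 - 1/N}

% The target of 15 times faster on 20 cores therefore needs
p = \frac{14/15}{19/20} = \frac{56}{57} \approx 0.982
```

In other words, hitting the 15 times target on 20 cores leaves room for less than 2% of the work to be sequential, which is why the talk stresses avoiding sequential parts.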

Digging deeper
Compact Muon Solenoid
Honey, we've just made a black .....