Why is Matlab so slow?

Nov 18, 2013



IAP 2009
Scott Gorlin
gorlins@mit.edu
http://stellar.mit.edu/S/project/advanced-matlab/


Everyone Gets Frustrated

Why does it take forever to do x?

Other applications analyze my data faster…


Everyone Gets Frustrated

Why only 50%?!?!


The Curse of Compilation

Computers don’t speak English

English commands (‘m-files’) must either be interpreted or compiled


The Curse of Compilation

Compiled

Fast languages (e.g. C++) are compiled to machine code, which turns them into executables

Slow creation time, fast execution time


The Curse of Compilation

Compiled

Static memory – declare all variables, types

Can’t change memory types, add/remove variables

Separate functions for int, double, etc!


The Curse of Compilation

Interpreted

Every command immediately translated, executed at run-time

No up-front time, much slower execution time

NB: ~500 vs 10 μs per command

Allows dynamic memory addressing


The Curse of Compilation

Interpreted

Examples of dynamic benefit:

Don’t need to declare/init variables

Can interactively run code (console, GUIs)

Immediately update code (anonymous functions)

Work with dynamic data types (structures, cells)

Drawback

Takes longer to interpret commands than to execute them!
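
The dynamic benefits listed above can be illustrated with a few lines. This is only a sketch; the variable names (x, f, s, y) are made up for illustration and are not from the slides.

```matlab
% No declarations: a variable takes whatever type is assigned to it.
x = 3;                          % x is a double
x = 'now a string';             % same name, new type, no complaint

% Anonymous function defined (and redefinable) on the fly.
f = @(v) v.^2;
y = f(4);

% Dynamic data types: struct fields and mixed-type cells appear as needed.
s.name = 'trial';
s.data = {1, 'two', [3 4 5]};

disp(class(x));
disp(y);
disp(fieldnames(s));
```

Each of these conveniences is something the interpreter has to resolve at run time, which is where the per-command overhead comes from.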


Optimization

Vectorize…

Get rid of for loops

fib = [1 1 2 3 5 8 13 21 34 ...];

s = sum(fib);

instead of

s = 0;
for i = 1:length(fib)
    s = s+fib(i);
end
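
A runnable version of the comparison above, as a small sketch. It assumes only the nine Fibonacci numbers shown on the slide (the elided rest is left out), and the names s_vec and s_loop are introduced here for illustration.

```matlab
% Compare a vectorized sum with an explicit loop on the same data.
fib = [1 1 2 3 5 8 13 21 34];   % only the values shown on the slide

s_vec = sum(fib);               % vectorized: a single built-in call

s_loop = 0;                     % the loop needs an initial value
for i = 1:length(fib)
    s_loop = s_loop + fib(i);
end

fprintf('vectorized: %d, loop: %d\n', s_vec, s_loop);
```

Both forms give the same answer; the difference is only in how many commands the interpreter has to process.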


Optimization

Vectorize...

Can get overly complex! E.g., [example not recovered]


Interpreters

JIT

Just-In-Time compilation

Matlab immediately compiles parts of functions when they are called, before running

Small up-front cost (ms) first time a function is called

Much faster execution for some commands

Esp. loops


Optimization

What no one knows:

JIT makes vectorizing obsolete!

(sometimes)


Optimization

For 10-1000x speedup, write JIT-compliant code!

On Stellar: accel_matlab.pdf describes details of this

Shaded: Accelerated data types (as of v6.5)


Optimization

Cell vs double arrays

60-fold acceleration!

W/o JIT, ratio is ~3
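
The original benchmark code is not in this transcript, so the following is only a sketch of how such a cell-vs-double comparison might be set up. The array size and all variable names are assumptions, and the measured ratio will depend heavily on Matlab version and platform.

```matlab
% Time the same accumulation loop over a double array and over a cell array.
n = 1e5;
d = rand(1, n);          % plain double array
c = num2cell(d);         % same values, one cell per element

tic;
sd = 0;
for k = 1:n
    sd = sd + d(k);      % scalar index into a double array
end
t_double = toc;

tic;
sc = 0;
for k = 1:n
    sc = sc + c{k};      % brace index into a cell array
end
t_cell = toc;

fprintf('double: %.4f s, cell: %.4f s, ratio: %.1f\n', ...
        t_double, t_cell, t_cell / t_double);
```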


Optimization

Here, the for loop is made JIT compliant by using scalar indexing instead of a vector

In fact, here the JIT code is faster than the vectorized code!

W/o JIT, 1st ratio is 1
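
The code from this slide was an image and is not recoverable, so here is a stand-in sketch of the general idea: a loop whose subscripts are all scalars, next to the vectorized form of the same computation. The computation itself (2*x + 1), the preallocation, and the names are my assumptions, not the slide's example.

```matlab
n = 1e6;
x = rand(1, n);

% Loop form: every subscript is a scalar, and the output is preallocated.
y_loop = zeros(1, n);
tic;
for k = 1:n
    y_loop(k) = 2 * x(k) + 1;
end
t_loop = toc;

% Vectorized form of the same computation.
tic;
y_vec = 2 * x + 1;
t_vec = toc;

fprintf('loop: %.4f s, vectorized: %.4f s\n', t_loop, t_vec);
```

Which form wins is version and platform dependent, so it is worth timing both on the machine that matters.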


Optimization

JIT Takehome:

Interpreting is powerful but SLOW

Write code which can be ‘compiled’

Give up some flexibility for a lot of speed!

To test code compliance:

>> feature accel off

[execute function again], note time difference
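
A sketch of the compliance test described above, wrapped in a function file. The function names accel_compare and timed_sum are hypothetical. Note that `feature accel` is an undocumented switch taken from the slide; it may be ignored or behave differently in later Matlab releases, so treat the result as indicative only.

```matlab
function [t_on, t_off] = accel_compare(n)
% ACCEL_COMPARE  Hypothetical helper: time one loop with the JIT on and off.
x = rand(1, n);

feature accel on            % undocumented switch, as named on the slide
t_on = timed_sum(x);

feature accel off
t_off = timed_sum(x);

feature accel on            % restore the default before returning
end

function t = timed_sum(x)
% Simple loop using only scalar indexing into a double array.
tic;
s = 0;
for k = 1:numel(x)
    s = s + x(k);
end
t = toc;
end
```

Calling `[t_on, t_off] = accel_compare(1e6)` and comparing the two times gives a rough idea of how much the loop benefits from acceleration.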


Optimization

JIT Cheat Sheet

All code in a loop must be compliant for loop to be compiled

Supported data types

I.e. int/double/char arrays, not cells/structs

Built-in (non m-code) Matlab commands

For loops: scalar indexes

if/switch/while: scalar expressions

3 or fewer dimensions per array


Optimization

JIT No-nos (version/platform dependent)

4+ dimensional arrays

Function calls

Interpreted commands/dynamic allocation

Publishing, evaluating w/ R-click, console commands

In some versions, using i, j, e without defining them first

Calling function w/ different data types (forces recompile)


Optimization

JIT (Un)Fortunate Truth:

It is VERY POORLY documented

See original spec (online), then read 1-liners in the release notes for every version of Matlab since 6.5

But, JIT improves in every version

E.g. console accelerated in 7.6

Doesn't work as well on OSX, Linux, but getting better

Vectorize if possible, when unsure


Optimization

Local variables copied when passed to functions

Not entirely true… (see links under Materials)

Copy-On-Write

Pass-In-Place


Optimization

Copy-On-Write

Variables not really copied until they are changed

Passing or renaming a variable does not increase memory if the variable is unchanged!
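
A small sketch of copy-on-write in action. The matrix size and variable names are arbitrary choices for illustration; the timings are not guaranteed, but the assignment is expected to be nearly free and the first write to pay for the copy.

```matlab
A = rand(2000);          % 2000x2000 doubles, about 32 MB

tic;
B = A;                   % no write yet, so B should just share A's data
t_assign = toc;

tic;
B(1) = -1;               % first write to B forces the real copy
t_write = toc;

fprintf('assign: %.6f s, first write: %.6f s\n', t_assign, t_write);
```

After the write, A is untouched and only B holds the new value, which is what makes the lazy copy safe.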


Optimization

In-Place Operations

Function must return the same variable it's passed, both in definition and when called
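
A minimal sketch of the in-place pattern, assuming a function file with a hypothetical driver (inplace_demo) and worker (scale_in_place). As far as I know, the optimization only applies when the call is itself made from inside a function rather than from the console, which is why the call is wrapped here.

```matlab
function x = inplace_demo(n)
% INPLACE_DEMO  Hypothetical driver: calls the worker with matching names.
x = ones(n);
x = scale_in_place(x);   % same variable on both sides of the call
end

function x = scale_in_place(x)
% Same name for input and output in the definition, so Matlab is
% allowed to reuse the input's memory for the result.
x = 2 * x;
end
```

Writing `y = scale_in_place(x)` or declaring `function y = scale_in_place(x)` would still work, but either breaks the name match and so gives up the chance of avoiding a second copy.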


Everyone Gets Frustrated

Why only 50%?!?!


Multithreading

A computer can only do 1 thing at once

Start Windows, everything freezes…


Multithreading

Multithreading – illusion of ‘concurrent’ processes

OS rapidly switches between ‘threads’ and allows each to work for a brief period of time


Multithreading

Side note:

Windows is a poor scheduler, meaning threads do not always begin when they should

This leads to timing jitters, or random delays in Matlab up to several hundred ms

Can be helped by ‘realtime’ priority: Matlab thread takes precedence over other threads

Task Manager -> Processes -> MATLAB.exe R-click and set priority to Realtime

Can be done programmatically through a Java/C function, or download Psychtoolbox for an implementation

On a single core, will FREEZE Windows until process completes!


Multithreading

Parallel Programming

2 or more threads to simultaneously do something

On a single-core computer, gives the appearance of simultaneous execution

E.g., one thread handles GUI, other does background calculations

On a dual-core computer, ideally, doubles performance


Multithreading

Matlab is Single-Threaded

Means you can only do 1 thing at once

On newer, dual-core computers this is not optimal


Multithreading

Matlab is Single-Threaded

New versions have ‘multi-threading’ features

This is a misnomer – only a multithreaded BLAS

Faster matrix operations (sometimes), but that's it


Multithreading

Matlab is Single-Threaded

Benefits

Don’t worry about Thread Safety

On a Dual Core, setting Realtime Priority lets Matlab dominate one core, and the rest of the computer runs on the other!

We can ‘hack’ multithreading, some of the time


Multithreading

Simplest hack – timer function

This actually creates a Java object, and executes a callback function with variable delay, interval, etc

However, not a true multithread – will not execute while another process is dominating! (at least through v7.4)

Therefore mostly useful in GUI/console applications, etc

NB: Brief testing in v7.6 indicates it may be asynchronous now!

Not recommended to build true parallel applications with this method...
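
A short sketch of the timer hack: a timer that fires three times at half-second intervals and then stops. The delay, period, and callback body are arbitrary choices for illustration.

```matlab
% Create a timer that runs its callback three times, 0.5 s apart.
t = timer('StartDelay', 0.5, ...
          'Period', 0.5, ...
          'ExecutionMode', 'fixedRate', ...
          'TasksToExecute', 3, ...
          'TimerFcn', @(obj, event) fprintf('tick at %s\n', datestr(now)));

start(t);                     % returns immediately; callbacks fire later
wait(t);                      % block here only so the demo finishes cleanly
n_ticks = t.TasksExecuted;    % number of callbacks that actually ran
delete(t);                    % timers persist until explicitly deleted
```

In a real GUI or console session the `wait` call would be dropped, so the prompt stays usable between ticks.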


Multithreading

True parallel programming

Must either start a 2nd Matlab session,

Two Matlab sessions can run in parallel via COM, Java, or shared memory spaces

Actually a client/server interface – one session will dispatch jobs to the other

Write a C/Java applet for native threading,

Complicated but powerful

Or, use the distributed computing toolbox

$$, not provided with free MIT student bundle


Automation Server

Simplest parallel Matlab example

See External Interfaces/COM Automation Server for documentation

Starts 2nd Matlab session via COM/ActiveX

(Windows only, sadly)

COM protocol may be too slow for you


Automation Server

Create server with:

h = actxserver('matlab.application');

fields(h), methods(h) show ways to control new Matlab server. Important are:

Execute(h, 'command')

Feval(h, 'fcnName', numout, arg1, arg2, …)

PutFullMatrix, GetFullMatrix, etc

Can even hide window, but make sure to later close it programmatically!

h.Visible = 0; ... h.Quit();
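
Putting those calls together, a round trip through the server might look like the sketch below. It is Windows only and assumes Matlab is registered as a COM server; the variable names M and S and the computation are made up for illustration.

```matlab
% Start a second Matlab session as a COM Automation server (Windows only).
h = actxserver('matlab.application');
h.Visible = 0;                                   % hide the server window

% Send a matrix to the server's base workspace (real part, imaginary part).
h.PutFullMatrix('M', 'base', magic(4), zeros(4));

% Run a command in the server; the client blocks until it finishes.
h.Execute('S = sum(M(:));');

% Fetch the scalar result back; the last two arguments give the expected size.
[S, Simag] = h.GetFullMatrix('S', 'base', 0, 0);
fprintf('sum computed by server: %d\n', S);

h.Quit();                                        % close the hidden session
delete(h);                                       % release the COM handle
```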


Automation Server

Benefits

Native Matlab interface, easy to use, get/send data directly through Matlab


Automation Server

BUT:

By default, still single threaded!!

Calling Execute and Feval starts function in server workspace, but client Matlab pauses and waits for completion!

Makes sense for Feval (waits for a return value) but not for Execute

Must find a way to return control to client while server computes!


Automation Server

Remember timer()?

Execute or call a function which starts a timer in the new workspace!

This will allow client to return, and server will process timer thread immediately after

Main problem: still a SLOW protocol

Data transfer not too fast

Min ~3ms simply to invoke function in server and return

May or may not be fast enough for your application
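
One way the timer trick might be wired up, as an untested sketch. It is Windows only; the job (an SVD of a random matrix), the variable names, and the polling loop with its try limit are all my assumptions rather than the approach from the slides.

```matlab
h = actxserver('matlab.application');
h.Visible = 0;
h.Execute('clear result');

% Start a timer in the server whose callback does the real work. A callback
% given as a string is evaluated in the server's base workspace. Execute
% returns as soon as the timer has been started.
h.Execute(['t = timer(''StartDelay'', 0.1, ''TimerFcn'', ' ...
           '''result = max(svd(rand(300)));''); start(t);']);

% The client is free to do other work here. This sketch just polls.
done = false;
tries = 0;
while ~done && tries < 100
    pause(0.1);
    out = h.Execute('disp(exist(''result'', ''var''))');
    done = any(out == '1');          % exist(...) prints 1 once result is set
    tries = tries + 1;
end

[r, rimag] = h.GetFullMatrix('result', 'base', 0, 0);
h.Execute('delete(t)');
h.Quit();
delete(h);
```

Each poll is itself a COM call, so the polling interval should stay well above the ~3 ms invocation cost mentioned above.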


Parallel Matlab

Better, more complicated

Multithread in Java/C-mex code

Open 2nd Matlab, but communicate via a different protocol

Sockets

Java RMI between the JVMs

Actually quite simple – code posted online in MatlabRMI.zip

Also enables the second Matlab session to run anywhere on your network, for a truly distributed program

May run with Matlab Component Runtime (untested)

Shared memory through C-mex

I have no idea how to do this!

MPI

Distributed computing toolbox

Must purchase, I have no experience with so cannot recommend either way

Seems good for parallel array operations, batching identical tasks across worker pool – but not designed for concurrent programming


Optimization

Take home 4 steps for faster programs

JIT compliant code

Memory management

Copy-On-Write

Pass-In-Place

Write a parallel program

Write a Java/C object


Friday

Object Oriented Programming