2.1 The Programmer’s Abstract Machine

Douglas Wilhelm Harder, M.Math. LEL

Department of Electrical and Computer Engineering

University of Waterloo

Waterloo, Ontario, Canada


ece.uwaterloo.ca

dwharder@alumni.uwaterloo.ca


© 2012 by Douglas Wilhelm Harder. Some rights reserved.

Based in part on Gary Nutt, Operating Systems, 3rd ed.

Chapter 2: Using the Operating System


Topics:


The programmer’s abstract machine


Resources


Processes and threads


Writing concurrent programs


Objects


Outline


This topic introduces the abstract machine


Abstract and virtual machines


Cloud computing


Computing resources over time


Consequences


Sequential and parallel computation


Examples: quick sort and merge sort


Implementation


Abstract Machines


The operating system provides an abstract interface
between the programmer and the hardware


Suppose you had to rewrite an application as large as a word
processor each time a company built a new CPU



An operating system reduces development costs but
increases run times, since every interaction with the hardware
passes through the abstraction


The use of main memory and the CPU is invisible to the
programmer


The interaction between the programmer and devices is through
function calls in libraries


Programs must still be compiled and translated into machine
instructions and linked to the various libraries
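
As a small illustration of interacting with a device through library function calls, the following sketch stores and retrieves a line of text using only the C standard library. The function names `write_line` and `read_line` and any file name passed to them are hypothetical and chosen for this example; which disk, driver and memory are used is left entirely to the operating system.

```c
#include <stdio.h>
#include <string.h>

/* Write one line of text to a file using only standard-library calls.
   The program never names a disk, a controller or a memory address. */
int write_line(const char *path, const char *text) {
    FILE *fp = fopen(path, "w");          /* ask the OS for the resource */
    if (fp == NULL) {
        return -1;                        /* the request was refused */
    }
    int ok = (fputs(text, fp) >= 0) && (fputc('\n', fp) != EOF);
    if (fclose(fp) != 0) {                /* hand the resource back */
        ok = 0;
    }
    return ok ? 0 : -1;
}

/* Read the first line of a file into buf, dropping the trailing newline. */
int read_line(const char *path, char *buf, size_t size) {
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        return -1;
    }
    char *res = fgets(buf, (int)size, fp);
    fclose(fp);
    if (res == NULL) {
        return -1;
    }
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

The same source can be recompiled and relinked for a different CPU or operating system without change, which is the cost saving described above.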


Virtual Machines


One recent further abstraction is the creation of virtual machines


Now the individual machine instructions become abstracted to a
common set of instructions


Programs written in Java and C#, for example, need only be
compiled once after which they can be run on any machine that
implements a virtual machine


The Java Virtual Machine (JVM) runs Java byte code


The Common Language Runtime (CLR) of the .NET framework
runs an Intermediate Language


Again, the benefits are reduced development costs and further
increased run times


Cloud Computing


Even more recent is the advent of cloud computing


Now even the individual machines are abstracted away


Computing becomes a service which is seamlessly distributed to
available resources, including the possibility of multiple remote
processors



As always, the benefits are reduced development costs
and further increased run times



Computing Resources


Moore’s law allows us to afford the additional costs

(Figure credit: User:Wgsimon)

Computing Resources


Other resources are also increasing exponentially:


Hard drive capacity is following a similar curve to Moore’s law


Kryder’s law


Butters’ law makes similar predictions about network capacity


Nielsen's law says a high-end user's bandwidth grows by about 50 % each year


Hendy’s law observes that pixels-per-dollar also follows Moore’s
law



Caveat:


Moore’s Second Law says that the capital cost of
semiconductor fabrication also increases exponentially...



Hint for Fame


An engineer’s recipe for fame (if not fortune):


1.
Find an exponentially increasing process

2.
Call it a law

3.
Publicize it

4.
Get it named after you...


Computing Resources


Not all resources improve so quickly:


Seek time has only halved from 20 ms in 1990 to 9 ms today


Down to 3 ms for high-end server drives


Transfer rates depend on rotational speed


The 7200 rpm Barracuda was introduced in 1991


15 000 rpm hard drives became available in 2005


100 to 300 MB/s transfer rates


Cost: 5¢/GB to 10¢/GB



Solid state drives improve on this


Transfer rates of 100 to 500 MB/s


Cost: $1/GB to $2/GB


Computing Resources


Clock speed used to follow Moore’s law


It doubled every two years


It reached a plateau in 2002 at 3 GHz


The best available today is 4 GHz



There are other solutions


Using multiple processors


Multi-core processors


Distributed computing
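
As a rough worked example of what the plateau means, suppose (hypothetically) that the two-year doubling had continued from 3 GHz in 2002, and take "today" to be 2012, the copyright year of these slides. Both of these are assumptions made only for the arithmetic:

```latex
f(t) = f_0 \cdot 2^{(t - t_0)/2}, \qquad
f(2012) = 3\,\text{GHz} \cdot 2^{(2012 - 2002)/2}
        = 3\,\text{GHz} \cdot 2^{5}
        = 96\,\text{GHz}
```

Under these assumptions the extrapolated clock rate is 24 times the 4 GHz quoted above, which is the gap that multiple processors, multi-core processors and distributed computing are meant to close.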



Computing Resources


There were concerns over a memory wall in the 1990s:


CPU speed was following Moore’s law


The speed of RAM increased more slowly: 10 % per annum


Memory was becoming the bottleneck


With CPU clock rates having reached a plateau, RAM has caught up:


DDR SDRAM started at 100 MHz


DDR4 SDRAM capable of 4 GHz



There are other solutions


More registers in the processor


Multiple levels of caching on the processor


On multi-core machines, each core may have an L1 cache while the
cores may share L2 and L3 caches


Consequences


Despite the additional levels of abstraction being placed
on top of the hardware, hardware improvements more
than make up for this additional cost



To benefit from multiple serial processing elements, we
first need to look at our execution model


Computation


There are two techniques for execution:


Sequential execution


Parallel sequential execution





Sequential Computation


In a sequential execution, a program follows a predictable
sequence of instructions


Branching and jumps are defined by conditional and looping
statements as well as function calls



Given a problem:


An algorithm is designed to solve that problem


It is implemented in a high-level programming language: source


It is compiled into machine instructions: binary program


The binary program is executed


It requests and frees resources and determines the solution


Sequential Computation


In a sequential execution, the sub-problems are solved
through function calls


Functions have well-defined entrance and exit points


The calling function halts execution until the function returns
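
A minimal sketch of this model, using summing an array as a stand-in problem (the function names `sum_range` and `sum_all` are hypothetical): the base problem is split into two sub-problems, and each is solved by a function call during which the caller does nothing else.

```c
#include <stddef.h>

/* Sub-problem: sum the entries data[begin], ..., data[end - 1].
   One entrance point (the call) and one exit point (the return). */
long sum_range(const int *data, size_t begin, size_t end) {
    long total = 0;
    for (size_t i = begin; i < end; ++i) {
        total += data[i];
    }
    return total;
}

/* Base problem: sum all n entries by solving two sub-problems in turn. */
long sum_all(const int *data, size_t n) {
    size_t mid = n / 2;
    /* The caller halts here until the first call returns ... */
    long left = sum_range(data, 0, mid);
    /* ... and only then starts the second sub-problem. */
    long right = sum_range(data, mid, n);
    return left + right;
}
```

With one processing element this ordering costs nothing; with several, the second call waits needlessly even though it does not depend on the first.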


Parallel Sequential Computation


As an alternative, consider the following modification:


Break the base problem into independent tasks


For each task, determine an algorithm that solves it


Implement the solutions as source code


There is one base task


Compile the solutions into a binary program


When executed, the binary program begins a sequential
execution of the base task


If another task must be executed, it begins execution as a
sequential computation parallel to the base task



Parallel Sequential Computation


There is a hierarchical relationship between tasks


There is one base task


Every other task is forked from another executing task and
runs in parallel with it


This creates a tree structure








We will refer to parent tasks and child tasks

Parallel Sequential Computation


In a parallel execution, a task is forked from another
task; however, the original continues executing


Child tasks have well-defined entrance points relative to the parent task


The parent task continues execution



This can happen when:


The parent does not require an immediate result from the child


It does not matter in which order the parent and child are executed
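
The fork-and-continue behaviour can be sketched with POSIX threads. This is an assumption made for illustration: the slides do not prescribe a threading library, and the names `sum_task_t`, `sum_child` and `parallel_sum` are hypothetical. The parent forks a child to sum the first half of an array, keeps working on the second half, and only asks for the child's result when it finally needs it.

```c
#include <pthread.h>
#include <stddef.h>

/* Work description handed to the child task, plus a slot for its result. */
typedef struct {
    const int *data;
    size_t begin, end;
    long result;
} sum_task_t;

/* Entrance point of the child task. */
static void *sum_child(void *arg) {
    sum_task_t *t = arg;
    t->result = 0;
    for (size_t i = t->begin; i < t->end; ++i) {
        t->result += t->data[i];
    }
    return NULL;
}

long parallel_sum(const int *data, size_t n) {
    sum_task_t child = { data, 0, n / 2, 0 };
    pthread_t tid;
    int forked = (pthread_create(&tid, NULL, sum_child, &child) == 0);

    /* The parent does not wait: it continues with its own half. */
    long parent_total = 0;
    for (size_t i = n / 2; i < n; ++i) {
        parent_total += data[i];
    }

    if (forked) {
        pthread_join(tid, NULL);   /* the result is needed only now */
    } else {
        sum_child(&child);         /* fall back to a plain function call */
    }
    return parent_total + child.result;
}
```

The two halves touch disjoint data, so the order in which parent and child run does not affect the answer. On older toolchains the program may need to be built with `-pthread`.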


Example: Quick Sort


Consider quick sort:


Select a pivot point in the list


Distribute the entries relative to the pivot


Recursively call quick sort on each of the two sub-lists



When quick sort returns, that sub-list is sorted


We only have to know when all recursive tasks are complete



Each recursive call could occur in parallel


If so, the run-time can be reduced from $\Theta(n \ln(n))$ to $\Theta(n)$
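
A hedged sketch of quick sort with parallel recursive calls, again assuming POSIX threads (not something the slides specify) and using hypothetical names throughout. The pivot is simply the last entry, and a depth limit caps the number of threads at a handful; both are choices made for this example only. The linear bound quoted above assumes balanced pivots and enough processors: the partition steps along one path of the recursion then cost roughly n + n/2 + n/4 + ..., which is less than 2n.

```c
#include <pthread.h>
#include <stddef.h>

typedef struct {
    int *a;       /* sub-list to sort */
    size_t n;     /* its length */
    int depth;    /* levels at which new threads may still be forked */
} qs_task_t;

static void swap_int(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Distribute the entries relative to the pivot (here: the last entry).
   Returns the final index of the pivot. Requires n >= 1. */
static size_t partition(int *a, size_t n) {
    int pivot = a[n - 1];
    size_t store = 0;
    for (size_t i = 0; i + 1 < n; ++i) {
        if (a[i] < pivot) {
            swap_int(&a[i], &a[store]);
            ++store;
        }
    }
    swap_int(&a[store], &a[n - 1]);
    return store;
}

static void *qs_run(void *arg) {
    qs_task_t *t = arg;
    if (t->n < 2) {
        return NULL;
    }
    size_t p = partition(t->a, t->n);
    qs_task_t left  = { t->a, p, t->depth - 1 };
    qs_task_t right = { t->a + p + 1, t->n - p - 1, t->depth - 1 };
    pthread_t tid;
    if (t->depth > 0 && pthread_create(&tid, NULL, qs_run, &left) == 0) {
        qs_run(&right);            /* parent sorts one sub-list meanwhile */
        pthread_join(tid, NULL);   /* know when the child is complete */
    } else {
        qs_run(&left);             /* sequential fallback */
        qs_run(&right);
    }
    return NULL;
}

void parallel_quick_sort(int *a, size_t n) {
    qs_task_t root = { a, n, 3 };  /* 3 levels: at most 7 extra threads */
    qs_run(&root);
}
```

Nothing has to be done with the sub-lists after they are sorted, so the only synchronization is the join that tells the parent its children have finished.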


Example: Merge Sort


Consider merge sort:


Divide the list into two


Recursively sort each of the sub-lists


Merge the results



Problem:


We can apply merge sort on each sub-list in parallel


The parent task, however, must know when the two child tasks
are complete before the sorted sub-lists can be merged
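
The synchronization requirement can be sketched as follows, assuming POSIX threads and hypothetical names (`ms_task_t`, `ms_run`, `parallel_merge_sort`). The important line is the join: the merge must not start until both halves are sorted.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int *a;       /* sub-list to sort */
    int *tmp;     /* scratch space of the same length */
    size_t n;
    int depth;    /* levels at which new threads may still be forked */
} ms_task_t;

/* Merge the sorted halves a[0..mid) and a[mid..n) using tmp as scratch. */
static void merge_halves(int *a, int *tmp, size_t mid, size_t n) {
    size_t i = 0, j = mid, k = 0;
    while (i < mid && j < n) {
        tmp[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
    }
    while (i < mid) { tmp[k++] = a[i++]; }
    while (j < n)   { tmp[k++] = a[j++]; }
    memcpy(a, tmp, n * sizeof *a);
}

static void *ms_run(void *arg) {
    ms_task_t *t = arg;
    if (t->n < 2) {
        return NULL;
    }
    size_t mid = t->n / 2;
    ms_task_t left  = { t->a, t->tmp, mid, t->depth - 1 };
    ms_task_t right = { t->a + mid, t->tmp + mid, t->n - mid, t->depth - 1 };
    pthread_t tid;
    if (t->depth > 0 && pthread_create(&tid, NULL, ms_run, &left) == 0) {
        ms_run(&right);
        pthread_join(tid, NULL);   /* wait: both halves must be complete */
    } else {
        ms_run(&left);             /* sequential fallback */
        ms_run(&right);
    }
    merge_halves(t->a, t->tmp, mid, t->n);
    return NULL;
}

/* Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int parallel_merge_sort(int *a, size_t n) {
    if (n < 2) {
        return 0;
    }
    int *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL) {
        return -1;
    }
    ms_task_t root = { a, tmp, n, 3 };
    ms_run(&root);
    free(tmp);
    return 0;
}
```

Each child works on its own slice of both the list and the scratch buffer, so the only shared state is what the parent reads after the join.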


How are Tasks Implemented?


How are the parent and child tasks related?


Is the memory shared or distributed?


Is the data shared or distributed?



There are numerous models:


The thread model has the executing tasks share memory and
resources


The message passing model has each task executing as an
independent process with an inter-process message passing
interface


The data parallel model has each task accessing the same
memory, but memory is partitioned between the tasks
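
To contrast with shared memory, here is a minimal sketch of the message passing model, assuming a POSIX system and using a pipe as the message channel; the function name `sum_in_child_process` is hypothetical. The child is an independent process with its own copy of memory, so the only way its result reaches the parent is as a message.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sum n integers in a child process and receive the total as a message.
   Returns 0 on success and stores the total in *out; returns -1 on error. */
int sum_in_child_process(const int *data, size_t n, long *out) {
    int fd[2];                       /* fd[0]: read end, fd[1]: write end */
    if (pipe(fd) != 0) {
        return -1;
    }
    pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: works on its own copy of the data. */
        close(fd[0]);
        long total = 0;
        for (size_t i = 0; i < n; ++i) {
            total += data[i];
        }
        ssize_t w = write(fd[1], &total, sizeof total);   /* send */
        _exit(w == (ssize_t)sizeof total ? 0 : 1);
    }
    /* Parent: receive the message, then collect the child. */
    close(fd[1]);
    ssize_t r = read(fd[0], out, sizeof *out);
    close(fd[0]);
    int status = 0;
    waitpid(pid, &status, 0);
    if (r != (ssize_t)sizeof *out) {
        return -1;
    }
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

In the thread model the child could have written the total directly into a variable of the parent; here any change the child makes to its memory is invisible to the parent.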


The CMSIS Interface


ARM Holdings, as well as specifying


The ARMv7 architecture


The Cortex-M3 microcontroller interface



also specifies an interface to an RTOS: CMSIS


Any application using this interface can be compiled to a
platform which implements this interface





All images and interfaces based on www.arm.com/cmsis

CMSIS


The Cortex™ Microcontroller Software Interface Standard
(CMSIS) is an interface to the underlying microcontroller

http://www.arm.com/products/processors/cortex-m/cortex-microcontroller-software-interface-standard.php

CMSIS


The user can access the hardware either through the
interface or through direct calls


CMSIS


CMSIS specifies data structures, defines, typedefs,
enumerations and functions


You can download the specifications from the ARM website:



http://www.arm.com/products/processors/cortex-m/cortex-microcontroller-software-interface-standard.php


CMSIS


RTOS functions defined in CMSIS include:


Kernel status


Thread management


Waiting on events


Timer management


Sending and receiving signals


Mutual exclusion and semaphores for concurrency


Memory pool management


Message handling


Mail queue management


CMSIS


Kernel status:


Start the kernel and check its status

osStatus osKernelStart( osThreadDef_t *thread_def, void *argument );
int32_t osKernelRunning( void );


CMSIS


Thread management:


Create or terminate a thread, get or set a thread’s priority, and
get the current thread’s ID or force it to yield the processor

osThreadId osThreadCreate( osThreadDef_t *thread_def, void *argument );
osStatus osThreadTerminate( osThreadId thread_id );
osStatus osThreadSetPriority( osThreadId thread_id, osPriority priority );
osPriority osThreadGetPriority( osThreadId thread_id );
osThreadId osThreadGetId( void );
osStatus osThreadYield( void );





CMSIS


Timer management:


Create, start or restart, and stop a timer

osTimerId osTimerCreate( osTimerDef_t *timer_def, os_timer_type type, void *argument );
osStatus osTimerStart( osTimerId timer_id, uint32_t millisec );
osStatus osTimerStop( osTimerId timer_id );



CMSIS


Waiting on events:


Wait for a time delay or wait for any event (signal, message, mail,
or time delay)

osStatus osDelay( uint32_t millisec );
osEvent osWait( uint32_t millisec );


CMSIS


Sending and receiving signals:


Set, get, clear, or wait on a signal of an executing thread

int32_t osSignalSet( osThreadId thread_id, int32_t signal );
int32_t osSignalGet( osThreadId thread_id );
int32_t osSignalClear( osThreadId thread_id, int32_t signal );
osEvent osSignalWait( int32_t signals, uint32_t millisec );


CMSIS


Mutual exclusion and semaphores for concurrency:


Create and initialize, wait on, and release a mutual exclusion flag

osMutexId osMutexCreate( osMutexDef_t *mutex_def );
osStatus osMutexWait( osMutexId mutex_id, uint32_t millisec );
osStatus osMutexRelease( osMutexId mutex_id );


Create and initialize, wait on, and release a semaphore

osSemaphoreId osSemaphoreCreate( osSemaphoreDef_t *semaphore_def, int32_t count );
int32_t osSemaphoreWait( osSemaphoreId semaphore_id, uint32_t millisec );
osStatus osSemaphoreRelease( osSemaphoreId semaphore_id );
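
The create, wait and release pattern can be tried on a desktop machine with POSIX threads. This is an analogue chosen so that the example runs without an RTOS, not the CMSIS interface itself: `pthread_mutex_init`, `pthread_mutex_lock` and `pthread_mutex_unlock` play the roles of creating, waiting on and releasing a mutual exclusion flag. The names `counter_t`, `add_many` and `count_with_two_threads` are hypothetical.

```c
#include <pthread.h>

/* A counter shared by several threads, guarded by a mutex. */
typedef struct {
    pthread_mutex_t lock;
    long value;
} counter_t;

/* Each thread increments the shared counter 100000 times. */
static void *add_many(void *arg) {
    counter_t *c = arg;
    for (int i = 0; i < 100000; ++i) {
        pthread_mutex_lock(&c->lock);     /* wait on the mutex */
        ++c->value;                       /* critical section */
        pthread_mutex_unlock(&c->lock);   /* release the mutex */
    }
    return NULL;
}

/* Returns the final count, or -1 if a mutex or thread cannot be created. */
long count_with_two_threads(void) {
    counter_t c;
    c.value = 0;
    if (pthread_mutex_init(&c.lock, NULL) != 0) {   /* create and initialize */
        return -1;
    }
    pthread_t a, b;
    int ra = pthread_create(&a, NULL, add_many, &c);
    int rb = pthread_create(&b, NULL, add_many, &c);
    if (ra == 0) { pthread_join(a, NULL); }
    if (rb == 0) { pthread_join(b, NULL); }
    pthread_mutex_destroy(&c.lock);
    return (ra == 0 && rb == 0) ? c.value : -1;
}
```

Without the lock and unlock calls the two increments could interleave and updates could be lost; with them, every increment is applied exactly once.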


CMSIS


Memory pool management:


Create and initialize, allocate (possibly zeroing), and free
memory within a pool

osPoolId osPoolCreate( osPoolDef_t *pool_def );
void *osPoolAlloc( osPoolId pool_id );
void *osPoolCAlloc( osPoolId pool_id );
osStatus osPoolFree( osPoolId pool_id, void *block );


CMSIS


Message handling:


Create, send, and wait on a message

osMessageQId osMessageCreate( osMessageQDef_t *queue_def, osThreadId thread_id );
osStatus osMessagePut( osMessageQId queue_id, uint32_t info, uint32_t millisec );
osEvent osMessageGet( osMessageQId queue_id, uint32_t millisec );


CMSIS


Mail and mail queue management:


Create and initialize a mail queue and allocate memory for
(possibly zeroing), send, receive, and free memory for mail

osMailQId osMailCreate( osMailQDef_t *queue_def, osThreadId thread_id );
void *osMailAlloc( osMailQId queue_id, uint32_t millisec );
void *osMailCAlloc( osMailQId queue_id, uint32_t millisec );
osStatus osMailPut( osMailQId queue_id, void *mail );
osEvent osMailGet( osMailQId queue_id, uint32_t millisec );
osStatus osMailFree( osMailQId queue_id, void *mail );


CMSIS


CMSIS is an interface for a real-time operating system


Any vendor could implement additional functions


An implementation that provides all CMSIS-defined functions is
said to be “CMSIS compliant”


Summary


This topic covered the abstract machine


Abstract and virtual machines


Cloud computing


Computing resources over time


Consequences


Sequential and parallel computation


Examples: quick sort and merge sort


Implementation details of CMSIS

