The OSCAR Cluster System


Tarik Booker

CS 370

Topics


Introduction


OSCAR Basics


Introduction to LAM


LAM Commands


MPI Basics


Timing


Examples

Welcome to OSCAR!


Welcome to the free Linux-based cluster system


Use multiple computers to create one powerful
multi-processor system

Account Setup


Fill in the sign-in sheet


Receive account password (paper slip)


Log in:


Use SSH only to log into: oscar.calstatela.edu


SSH (Secure Shell) Log In


Use cs370studentxx as your account
(where xx = your account number)

Example for account 30 (cs370student30):
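
A minimal sketch of the log-in command, assuming a
standard OpenSSH client (the host name is from the
slide above; account 30 is only an illustration):

>ssh cs370student30@oscar.calstatela.edu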

Environment (LAM) Setup


LAM (Local Area Multicomputer) is our
implementation of MPI (Message Passing
Interface).


To run your parallel programs, you need to
have this running.

Environment (LAM) Setup (2)


After logging in to your account, type (assume ‘>’ is the
prompt):


>ls


You should have two files: hello.c and hosts


We need to run LAM. Do this by typing:


>lamboot hosts


Note: to see more in-depth loading, type:


>lamboot -v hosts


Both methods are perfectly fine.


Environment (LAM) Setup (3)


LAM should have taken a little while to load. (We are
starting a LAM daemon process on each node)


After done, verify LAM is running by typing:


>ps ux


This is merely a list of the processes running on your
account.


LAM is now set up and running on your account
(look for the running LAM daemon in the process list).

LAM Troubleshooting



If your LAM process dies (i.e. LAM no longer
shows up in your process list), use the previous
steps to start your LAM process again.


If something is wrong with your LAM process
(i.e. LAM is loaded and in the process list, but
refuses to run or runs indefinitely), use the
“lamhalt” command, simply:


>lamhalt


Compiling a Parallel Program


Included in your account is the ‘hello.c’ program. We’ll
use this as a test program for LAM/MPI.


We will be using the MPI C compiler. Use the
command:


>mpicc hello.c


This will compile your parallel program.


To specify the output file, type:


>mpicc hello.c -o hello


This compiles hello.c into the executable called ‘hello.’


Running Your Parallel Program


Running a program through MPI is a bit different from
other interfaces. You must use the ‘mpirun’ command
and specify the nodes to use.


The typical usage is:


>mpirun N hello


‘hello’ is the previous executable from the last slide.
The ‘N’ (UPPERCASE!) says to use all nodes.


Note that we don’t have to use all nodes. Try typing:


>mpirun n0-5 hello


(this uses only nodes 0 through 5)


(Also try >mpirun n4,5,6 hello)


The MPI Program


Let’s look at hello.c


The two most important functions are:


MPI_Init(&argc, &argv);


MPI_Finalize();


These functions initialize and close the parallel
environment (respectively).
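
hello.c itself is not reproduced in these slides; a
minimal sketch of what such a test program might look
like (the printed message is illustrative):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv)
{
    int size, node;

    MPI_Init(&argc, &argv);                  /* Open the parallel environment */
    MPI_Comm_size(MPI_COMM_WORLD, &size);    /* Number of nodes in the universe */
    MPI_Comm_rank(MPI_COMM_WORLD, &node);    /* This node's number */

    printf("Hello from node %d of %d\n", node, size);

    MPI_Finalize();                          /* Close the parallel environment */
    return 0;
}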


LAM Commands


LAM is our specific implementation of MPI



LAM comes with additional non-MPI
commands (for node management)



Most are not necessary, but they are useful


lamboot

lamboot [hostfile]


Starts the LAM environment


Use the -v flag for verbose boot


>lamboot -v hosts


lamhalt



Shuts down the LAM environment




>lamhalt

mpirun




Runs an MPI program




>mpirun N hello

lamclean


If your program terminates “badly,” use
lamclean to delete old processes and allocated
resources.



>lamclean

wipe


A stronger version of lamhalt that kills the LAM
run-time on every node




>wipe


laminfo



Detailed information list for LAM environment




>laminfo

lamnodes


List all nodes in the LAM environment




>lamnodes

lamshrink


Remove a node from the LAM environment
(without rebooting)




Ex: >lamshrink n3


(Note: this invalidates node n3, leaving an
empty slot in its place)

lamgrow


Add a node to the LAM environment (without
rebooting)



>lamgrow oscarnode3



Also: >lamgrow -n 3 oscarnode3


(Adds oscarnode3 to the previously empty n3 slot.
Note the space between -n and 3!)

lamexec


Run a non-MPI program in the LAM
environment




>lamexec {non-MPI program}

Termination Order of Bad Programs


In the event of a bad termination (program
takes up too much memory, doesn’t stop, etc.)
use this order of termination:


>lamclean



(good)


>lamhalt




(better)


>wipe




(severe)


>kill -9 [process_number]

(nuclear)



Basic MPI Functions


We have already covered MPI_Init and MPI_Finalize


MPI_Send


MPI_Recv


MPI_Bcast


MPI_Reduce


MPI_Barrier


Note:


MPI is a Message Passing Interface


It does not necessarily use shared memory


Instead, information is passed between nodes

MPI_Send


Send a variable to another node



MPI_Send(variable, number of variables to send,
MPI Data type, node that receives message, MPI
Tag, group communicator)


Ex:


MPI_Send(&value, 1, MPI_INT, 2, 0,
MPI_COMM_WORLD)


MPI_Recv


Receive a variable from another node


MPI_Recv(variable, number of variables to receive, MPI
Data type, node sending message, message tag,
group communicator, MPI status indicator)


Ex:


MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
&status)


(Note: You must create an MPI_Status variable when using
this function)
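
A minimal send/receive sketch, assuming at least two
nodes (node 0 sends, node 1 receives); the value 42 is
illustrative:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv)
{
    int node, value = 0;
    MPI_Status status;                /* Required by MPI_Recv */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &node);

    if(node == 0)
    {
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);             /* to node 1 */
    }
    else if(node == 1)
    {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);    /* from node 0 */
        printf("Node 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}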

MPI_Bcast


Broadcasts a variable to all nodes


MPI_Bcast(variable, number of variables, MPI Data
type, node that sends the broadcast, group
communicator)


Ex:


MPI_Bcast(&value, 1, MPI_INT, 0,
MPI_COMM_WORLD)

MPI_Reduce


Collect data at a node


Combine information with a specific operation


MPI_Reduce(variable to send, variable that receives,
number of values to receive, MPI Data type, reduction
operation, node receiving data, communicator to use)


Ex:


MPI_Reduce(&nodePi, &pi, 1, MPI_FLOAT, MPI_SUM, 0,
MPI_COMM_WORLD)


There are many types of reduction operators (not only
summation); you can even create your own
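
A minimal sketch of a user-defined reduction operator
created with MPI_Op_create (the element-wise product
here is only an illustration; MPI already provides
MPI_PROD for this):

#include <stdio.h>
#include <mpi.h>

/* User-defined reduction: element-wise product of two int vectors */
void my_prod(void* in, void* inout, int* len, MPI_Datatype* type)
{
    int i;
    int* a = (int*)in;
    int* b = (int*)inout;
    for(i = 0; i < *len; ++i)
        b[i] = a[i] * b[i];
}

int main(int argc, char** argv)
{
    int node, value, result;
    MPI_Op op;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &node);

    value = node + 1;                        /* Each node contributes node+1 */
    MPI_Op_create(my_prod, 1, &op);          /* 1 = the operation is commutative */
    MPI_Reduce(&value, &result, 1, MPI_INT, op, 0, MPI_COMM_WORLD);

    if(node == 0)
        printf("Product of (node+1) over all nodes: %d\n", result);

    MPI_Op_free(&op);
    MPI_Finalize();
    return 0;
}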


MPI_Barrier


Use a barrier in MPI


MPI_Barrier(communicator)



Ex:


MPI_Barrier(MPI_COMM_WORLD)


Timing in MPI


Introduction


Timing functions


What to do

Timing Intro


MPI has timing features


Not computational time but “Wall time”


Ticks

Timing Functions


Wall time function


double MPI_Wtime(void)



Clock Tick Function


double MPI_Wtick(void)

What to do with timing


Select a starting point and store the time


Select an end point and store the time


Subtract the start from the end (MPI_Wtime already
returns seconds; MPI_Wtick gives the clock's
resolution, not a multiplier)


Use “%.30lf” in printf to display the elapsed time

Code Example

start_time = MPI_Wtime();

/* Code that does something */

elapsed = MPI_Wtime() - start_time;   /* Elapsed wall time, in seconds */
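
A minimal, self-contained timing sketch (the loop being
timed is only a placeholder for real work):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv)
{
    int i;
    double start_time, elapsed, x = 0.0;

    MPI_Init(&argc, &argv);

    start_time = MPI_Wtime();                /* Wall-clock time, in seconds */

    for(i = 0; i < 1000000; ++i)             /* Placeholder for real work */
        x = x + 1.0;

    elapsed = MPI_Wtime() - start_time;

    printf("Elapsed: %.30lf seconds (clock resolution: %.30lf)\n",
        elapsed, MPI_Wtick());

    MPI_Finalize();
    return 0;
}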


Programming Examples


Ring


Arctan (Using Gregory’s formula)


Pi (Using Euler’s formula)


Music Program


Mandelbrot Program

The Ring


Pass a variable, one at a time, to each node in
the universe (environment)

(Diagram: the value is passed from each node to the next around the ring.)

Code Example


int main(int argc, char** argv)
{
    int size, node;
    int value;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &node);

    if(node == 0)
    {
        printf("Value:");
        scanf("%d", &value);
        MPI_Send(&value, 1, MPI_INT, node+1, 0, MPI_COMM_WORLD);
    }
    else
    {
        MPI_Recv(&value, 1, MPI_INT, node-1, 0, MPI_COMM_WORLD, &status);

        if(node < size-1)
        {
            MPI_Send(&value, 1, MPI_INT, node+1, 0, MPI_COMM_WORLD);
        }
    }

    printf("Node %d has %d in value.\n", node, value);

    MPI_Finalize();
    return 0;
}

Ring Code Example (2)

#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv)
{
    int size, node;
    int value;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &node);


Ring Code Example (3)

    if(node == 0)
    {
        printf("Value:");
        scanf("%d", &value);
        MPI_Send(&value, 1, MPI_INT, node+1, 0, MPI_COMM_WORLD);
    }
    else
    {
        MPI_Recv(&value, 1, MPI_INT, node-1, 0, MPI_COMM_WORLD, &status);

        if(node < size-1)
            MPI_Send(&value, 1, MPI_INT, node+1, 0, MPI_COMM_WORLD);
    }

    printf("Node %d has %d in value.\n", node, value);

    MPI_Finalize();
    return 0;
}

The parent node (node 0) sends first; every other
node receives; all nodes except the parent and the
last node send the value onward.

Let’s run the Ring example…

Computing arctan (tan⁻¹) of x


Using Gregory’s Formula


arctan(x) = x - x^3/3 + x^5/5 - x^7/7 + x^9/9 - …

Let’s use MPI to program this formula

Arctan code

int main(int argc, char** argv)
{
    int size, node;              // MPI variable placeholders
    int i, j, x;                 // Loop counters
    double init_value;
    double angle = 0.0;
    double sum = 0.0;
    int terms;                   // Number of terms processed
    double finished_sum = 0.0;
    MPI_Status status;

    MPI_Init(&argc, &argv);                   // Start MPI environment
    MPI_Comm_size(MPI_COMM_WORLD, &size);     // Get MPI size
    MPI_Comm_rank(MPI_COMM_WORLD, &node);     // Get this node number


Arctan code (2)

    if(node == 0)
    {
        printf("Angle:");
        scanf("%lf", &angle);
        printf("Number of arctan terms:");
        scanf("%d", &terms);
    }

    MPI_Bcast(&terms, 1, MPI_INT, 0, MPI_COMM_WORLD);
    MPI_Bcast(&angle, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);

Arctan code (3)

    // Start processing arctan
    init_value = angle;
    double middle_sum;

    for(x = node; x < terms; x = x + size - 1) {
        middle_sum = 0.0;
        double index = (double)x - 1.0;
        index = index + x;                    // index = 2x - 1 (the odd denominator)
        double temp = init_value;

        for(i = 0; i < (int)index - 1; ++i)   // temp = angle^index
            temp = temp * init_value;

        middle_sum = temp / index;

        if(x % 2 == 0)                        // Alternate the sign of the terms
            middle_sum = middle_sum * -1.0;

        sum = sum + middle_sum;
    }

    if(node == 0)                             // Discard node 0's partial sum (its terms
        sum = 0.0;                            // are invalid or duplicated elsewhere)

    MPI_Reduce(&sum, &finished_sum, 1, MPI_DOUBLE,
        MPI_SUM, 0, MPI_COMM_WORLD);


Arctan code (4)


    MPI_Barrier(MPI_COMM_WORLD);   // Wait for all processes

    if(node == 0)
        printf("Arctan of %lf = %.20lf\n", angle, finished_sum);

    MPI_Finalize();
    return 0;
}

Let’s run the arctan example…

Computing Pi


Using Euler’s formula


Pi/4 = arctan(1/2) + arctan(1/3)



Let’s use MPI to compute this value

Computing Pi (2)


The arctan code is the same


Run it twice (once for x = 1/2, once for x = 1/3)


Set a barrier, then compute


4 * [arctan(1/2) + arctan(1/3)]
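
A minimal serial sketch of this final combination, with a
plain Gregory-series helper standing in for the parallel
arctan code (gregory_arctan is a hypothetical helper, not
from the slides):

#include <stdio.h>
#include <math.h>

/* Hypothetical serial stand-in for the parallel arctan code */
double gregory_arctan(double x, int terms)
{
    double sum = 0.0;
    int k;
    for(k = 0; k < terms; ++k)               /* x - x^3/3 + x^5/5 - ... */
        sum += (k % 2 == 0 ? 1.0 : -1.0) * pow(x, 2*k + 1) / (2*k + 1);
    return sum;
}

int main(void)
{
    /* Euler: Pi/4 = arctan(1/2) + arctan(1/3) */
    double pi = 4.0 * (gregory_arctan(0.5, 20) + gregory_arctan(1.0/3.0, 20));
    printf("Pi is approximately %.20lf\n", pi);
    return 0;
}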

Additional OSCAR Programs


Created for directed studies and other classes


Outside the scope of this class



Music Program


Mandelbrot Program

OSCAR Music Program


Uses each node’s internal speaker to play a
choreographed song.

OSCAR Fractals


Fractals use heavy computations to generate
images


Clustered computers are perfect for dealing
with fractal generation


LAM comes with XMTV, a graphics server (that
runs on top of LAM)

The Mandelbrot Set


A subset of the complex plane, consisting of the
parameters c for which the Julia set of z² + c is
connected


The Mandelbrot Set can be computed using OSCAR
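
A minimal sketch of the membership test behind such
images (serial, and not the OSCAR program itself): a
point c is treated as inside the set if iterating
z → z² + c stays bounded.

#include <stdio.h>

/* Returns 1 if c = cr + ci*i appears to be in the Mandelbrot set */
int in_mandelbrot(double cr, double ci, int max_iter)
{
    double zr = 0.0, zi = 0.0;
    int k;
    for(k = 0; k < max_iter; ++k)
    {
        double next_zr = zr*zr - zi*zi + cr;   /* real part of z^2 + c */
        double next_zi = 2.0*zr*zi + ci;       /* imaginary part of z^2 + c */
        zr = next_zr;
        zi = next_zi;
        if(zr*zr + zi*zi > 4.0)                /* Escaped: c is not in the set */
            return 0;
    }
    return 1;
}

int main(void)
{
    printf("0+0i in set: %d, 1+1i in set: %d\n",
        in_mandelbrot(0.0, 0.0, 1000), in_mandelbrot(1.0, 1.0, 1000));
    return 0;
}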


OSCAR Cluster Has Countless
Uses


Any computation-intensive task (mathematical,
musical, or graphical) can be dramatically sped up
with the OSCAR cluster



Check the course webpage for information (questions,
announcements, etc.)