Getting Started on Emerald


ITS - Research Computing Group



Course Objectives


Word for the Day:
Heterogeneous


Emerald: the Swiss army knife of computing,
something for everyone :)


Something you can use today


A reference for something you can use tomorrow


Course Objectives Cont.


Educate users on the broader aspects of
research computing


Practical knowledge to allow you to
efficiently perform your research


Pointers towards more advanced topics



Course Outline


What are compute clusters and Emerald in particular?


Accessing Emerald

  login

  file systems


Running jobs on Emerald


Job Management

  job schedulers

  batch commands

  submitting jobs

  specialty scripts


Available Software

  software

  package space


Compiling Code


Help Documentation


Getting Started on Emerald


http://help.unc.edu/6020


General overview of Emerald for a range of users



Short Course


Getting Started on Emerald


http://help.unc.edu/6479


Detailed notes for beginning Emerald users

What is a compute cluster?

What is Emerald?


Emerald Linux Cluster



General Purpose Linux Cluster


Maintained by Research Computing Group


Appropriate for all users regardless of expertise
level


Other Servers:


Cedar/Cypress (128-processor SGI Altix)

a large shared memory system


Topsail (4160-processor Dell Linux Cluster)

homogeneous capability cluster with fast interconnect


Mass Storage


Account access

What is Emerald?


What is a compute cluster?

Some Typical Components


Compute Nodes


Interconnect


Shared File System


Software


Operating System (OS)


Job Scheduler/Manager


Mass Storage


Emerald is a Heterogeneous Cluster


Compute Nodes


Xeon blades, IBM Power4 and Power5


Interconnect


Gigabit Ethernet (aka gigE or GbE)


Shared File Systems


AFS, NFS, and GPFS


Mass Storage


~/ms


Software


much licensed and public domain s/w in package space


Operating Systems (OS)


RH5 (64-bit), RH4 (32-bit), and AIX (64-bit)


Job Scheduler/Manager


all handled by LSF


Emerald Overview



Advantages of Using Emerald


High performance


Large capacity


Parallel processing


Many available software packages


Variety of compiling options


Shared file systems


Mass storage



Emerald Compute Nodes


Mostly IBM BladeCenter Xeon
blades


all are dual socket Intel Xeons

1, 2, or 4 cores/socket (i.e. 2, 4, 8 processors/node)


2.0, 2.8, 3.0, 3.2 GHz processors


varying memory, mostly 2 or 4 GB
per core


IBM Power 4 and 5


large memory, varying processor speeds


Cluster is constantly
evolving


Emerald Blades

A chassis with 14 blades


Emerald Summary


Over 200 host blade nodes, Intel Xeon


Over 800 blade cores


typically 2-4 GB memory per core


4 IBM AIX p575’s, Power 5


64 cores, large memory


2 large memory Intel “Nehalem” X5570 nodes


8 cores, 96 GB memory, 2.93 GHz CPUs


Gigabit Ethernet switching fabric


Running 32 and 64 bit Linux and 64 bit AIX


Emerald Details


Run the lshosts command to see resources for each node (host). Note host, model, ncpus, maxmem, resources.


% lshosts

HOST_NAME  type    model     cpuf  ncpus  maxmem  maxswp  server  RESOURCES
bc12-n01   X86_64  Xeon_3_2  12.0  2      3954M   996M    Yes     (X64bit blade blade12 L26 lammpi mem3 mem4 mpich2 mpichp4 RH5 tmp25G xeon32)
bc10-n10   X86_64  Xeon_2_8  11.7  2      3954M   996M    Yes     (X64bit blade blade10 L26 lammpi mem3 mem4 mpich2 mpichp4 RH5 tmp25G xeon28)
bc09-n01   X86_64  Xeon_2_8  11.7  2      3954M   996M    Yes     (X64bit blade blade9 L26 lammpi mem3 mem4 mpich2 mpichp4 RH5 tmp25G xeon28)
bc01-n01   X86_64  Xeon_3_0  11.9  8      32190M  29313M  Yes     (X64bit blade blade1 L26 lammpi mem32 mpich2 mpichp4 RH5 tmp100G xeon30)

Accessing Emerald


Logging Into Emerald


UNIX/Linux/OSX


ssh
my_onyen@emerald.unc.edu


ssh -l my_onyen emerald.unc.edu



Windows: SSH Secure Shell


X windows software -> shareware.unc.edu


Setting up a Profile for Emerald


Forwarding X11 packets
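
For example, on Linux/OSX you can usually forward X11 with the -X (or trusted -Y) option of ssh; a minimal sketch, assuming the standard OpenSSH client on your machine:

ssh -X my_onyen@emerald.unc.edu   # graphical programs started on Emerald then display locally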



Head Nodes


Emerald has multiple head nodes or login
nodes for


login and basic file manipulation


compiling


testing short (~ <1 min), small memory jobs


Login nodes run the Linux operating system


take the Introduction to Linux class or see some
of the many online tutorials if you are
unfamiliar with Linux


Home Directory on Emerald


Home Directory


/afs/isis/home/m/y/my_onyen/


250 MB quota


~/private/


Files backed up daily [ ~/OldFiles ]


Space quota/usage in Home Directory: fs lq
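
A quick check from a login node, as a minimal sketch (fs lq with no path reports on the current directory):

cd ~      # your AFS home directory
fs lq     # prints the volume name, quota, and current usage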


Work Directories on Emerald


No space limit but periodically cleaned


Not backed up!!!


Work Directories:


/netscr/my_onyen, /nas/my_onyen,
/nas2/my_onyen


totals 26.2 TB


/largefs


optimized for large file operations (> 1MB)


23 TB


/smallfs


optimized for small file operations (< 1MB)


16 TB



File Permissions


Your home directory is in AFS space. AFS is
a distributed networked file system.


Permissions are determined by ACLs (access
control lists)


see Introduction to AFS (http://help.unc.edu/215)


The other file systems, /largefs, /netscr, etc. are controlled by the usual Linux file permissions


making everything under /netscr/myOnyen accessible: chmod -R a+rX /netscr/myOnyen
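
On the AFS side, access is granted with the fs command rather than chmod; a hedged sketch, where ~/shared and friend_onyen are placeholder names:

fs listacl ~/shared                    # show the current ACL on a directory
fs setacl ~/shared friend_onyen rl     # grant that onyen read (r) and lookup (l) rights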


Mass Storage

“To infinity … and beyond” - Buzz Lightyear


access via ~/ms


looks like ordinary disk file
system


data is actually stored on tape


“limitless” capacity


data is backed up


For storage only, not a work directory (i.e. don’t run jobs from here)


if you have many small files, use tar or zip to create a single file for better performance


Sign up for this service on
onyen.unc.edu
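
For example, to bundle a directory of small files into one archive before copying it to mass storage (a minimal sketch; myproject is a placeholder name):

tar -czf myproject.tar.gz myproject/   # one compressed file instead of many small ones
cp myproject.tar.gz ~/ms/              # copy the single archive into mass storage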

Job Scheduling and Management



What does a Job Scheduler
and batch system do?

Manage Resources


allocate user tasks to resources


monitor tasks


process control


manage input and output


report status, availability, etc


enforce usage policies


LSF


All Research Computing clusters use LSF to do job scheduling and management


LSF (Load Sharing Facility) is a (licensed) product
from Platform Computing


Fairly distribute compute nodes among users


enforce usage policies for established queues


most common queues:
int, now, week, month


RC uses Fair Share scheduling, not first come, first
served (FCFS)


LSF commands typically start with the letter b (as in batch), e.g. bsub, bqueues, bjobs, bhosts, …


see man pages for much more info!
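
For example, from a login node (a sketch; the exact output columns depend on the LSF version installed):

bqueues           # one-line status summary of every queue
bqueues -l week   # full policy and limit details for the week queue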



Simplified view of LSF

[Diagram] A user logged in to a login node submits a job:

bsub -R X64bit -q week myjob

The job is routed to a queue (shown holding job_J, job_F, myjob, and job_7) and is then dispatched to run on an available host which satisfies the job requirements.


Common batch commands


bsub - submit jobs

bqueues - view info on defined queues

bqueues -l week

bkill - stop/cancel submitted job

bjobs - view submitted jobs

bjobs -u all

bhist - job history

bhist -l <jobID>

bhosts - status and resources of hosts (nodes)


Common batch commands


bpeek - display output of running job

Use man pages to get much more info!

man bjobs

bfree - query LSF to find job slots currently available that fit your resource requirement

this is a RC command extension

bfree -help (or -h)

jobmon - monitor changes in job status

this is a RC command, typically runs in a separate window


Submitting Jobs: bsub Command


Submit Jobs - bsub


All files must be in scratch space, e.g. /netscr,
/largefs, /smallfs


Home directory is not mounted on compute nodes


bsub [-bsub_opts] executable [-exec_opts]
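
A minimal end-to-end sketch (the directory and executable names are placeholders):

cd /netscr/my_onyen/myproject   # work from scratch space, not your AFS home
bsub -q week ./myexe            # submit the executable to the week queue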



bsub continued


Common bsub options:



-o <filename>

-o out.%J

-q <queue name>

-q now

-R “resource specification”

-R xeon30

-n <number of processes>

used for parallel, MPI jobs

-a <application specific esub>

-a mpichp4 (used on MPI jobs)



Two methods to submit jobs:


bsub example: submit the executable job,
myexe, to the week queue to run on a 64 bit
Linux OS and redirect output to the file
out.<jobID> (default is to mail output)


Method 1:

Command Line


bsub -q week -R X64bit -o out.%J myexe


Method 2:

Create a file (details to follow)
called, for example, myexe.bsub, and then
submit that file. Note the redirect symbol, <


bsub < myexe.bsub


Method 2 cont.


The file you submitted will contain all the bsub
options you want in it, so for this example
myexe.bsub will look like this


#BSUB -q week
#BSUB -o out.%J
#BSUB -R X64bit

myexe


This is actually a shell script so the top line could
be the normal #!/bin/csh, etc and you can run any
commands you would like.


if this doesn’t mean anything to you then nevermind :)
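
For example, a complete myexe.bsub written as a csh script might look like this (a sketch; the scratch path and executable name are placeholders):

#!/bin/csh
#BSUB -q week
#BSUB -o out.%J
#BSUB -R X64bit

# any shell commands can go here
cd /netscr/my_onyen/myproject
./myexe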


Parallel Job example

Batch Command Line Method



bsub -q week -o out.%J -n 30 -a mpichp4 mpirun.lsf myParallelExe

Batch File Method



bsub < myexe.bsub


where myexe.bsub will look like this

#BSUB -q week
#BSUB -o out.%J
#BSUB -a mpichp4
#BSUB -n 30

mpirun.lsf myexe


Submitting Jobs: Specialty Scripts


Running a SAS job through batch (2 ways)


bsub -q week -R blade sas program.sas

bsas test.sas


Running a Matlab job through batch (2 ways)

bsub -q week -R blade matlab -nodisplay -nojvm -nosplash program.m -logfile program.log

bmatlab test.m



Interactive Jobs: Setup


X-Windows


Linux/OSX


X11 client



Windows


X-Win32


Offered on UNC Software Acquisition site


https://shareware.unc.edu


Port forwarding on SSH Secure Shell


Setting up a session on X-Win32



Interactive Jobs: Submission



-Ip or -Is

bsub -q int -R blade -Ip sas

bsub -q int -R blade -Ip gv

bsub -q int -R blade -Ip matlab

bsub -q int -Is tcsh


Specialty Scripts


xsas


xstata


Software


Licensed Software


over 20 licensed software applications (some are
site licensed, others restricted)


Matlab, Maple, Mathematica, Gaussian, Accelrys
Materials Studio and Discovery Studio modules, Sybyl,
Schrodinger, SAS, Stata, ArcGIS, NAG, IMSL, Totalview,
and more.


compilers (licensed and otherwise)


intel, PGI, absoft, gnu, IBM


Numerous other packages provided for research
and technical computing


including BLAST, PyMol, SOAP, PLINK, NWChem, R,
Cambridge Structural Database, Amber, Gromacs, Petsc,
Scalapack, Netcdf, Babel, Qt, Ferret, Gnuplot, Grace,
iRODS, XCrySDen, and more.


Available Software


Most of the software is installed under AFS
and is made available through package
space.


AFS (Andrew File System) is a distributed
networked file system. Your home directory
and software packages are mounted in AFS
space.


A new token is issued at login and it expires after 24 hours. Use klog to renew this.


Changes made to your package space are
preserved over login sessions.
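
For example, if a long session starts hitting AFS permission errors (a sketch; tokens and klog are the standard AFS client commands):

tokens   # show your current AFS tokens and when they expire
klog     # obtain a fresh token (prompts for your password)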


Package Space


Use ipm (Isis Package Manager) to manage your
packages.


ipm commands


ipm add (ipm a)


ipm remove (ipm r)


ipm query (ipm q)


Available packages


http://help.unc.edu/1689


man ipm
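
For example, to add one of the compiler packages listed in the next section (a minimal sketch; pgi is the Portland Group package):

ipm add pgi      # add the package to your package space
ipm remove pgi   # remove it later if it is no longer needed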


Compiling


Compiling on Emerald


Compilers


FORTRAN 77/90/95


C/C++


Parallel Computing


MPI (MPICH, LAM/MPI, MPICH-GM)


OpenMP



Compiling Details on Emerald

Compiler         Package name              Command
Intel            intel_fortran, intel_CC   ifort, icc, icpc
Portland Group   pgi                       pgf77, pgf90, pgcc, pgCC
Absoft           profortran                f77, f90
GNU              gcc                       gfortran, g77, gcc, g++



Compiling MPI programs


Use the MPI wrappers to compile your
program


mpicc, mpiCC, mpif90, mpif77


the wrappers will find the appropriate include
files and libraries and then invoke the actual
compiler


for example, mpicc will invoke either gcc, icc,
or pgcc depending upon which package you have
loaded
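
Putting it together, a hedged sketch (the source file and executable names are placeholders, and a package providing mpicc must already be in your package space):

mpicc myparallel.c -o myParallelExe                                # the wrapper invokes the underlying C compiler
bsub -q week -o out.%J -n 30 -a mpichp4 mpirun.lsf myParallelExe   # run it on 30 processors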


Compiling Details on Emerald


Add a compiler into your working
environment


ipm add package_name


Compile a code


command code.c -o executable


Run executable on a compute node using
the bsub command


bsub -q week -R blade executable
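
A concrete sketch of that workflow using the Intel C compiler (package and command names are from the table above; hello.c is a placeholder):

ipm add intel_CC                          # put icc into your environment
icc hello.c -o hello                      # compile on a login node
bsub -q week -R blade -o out.%J ./hello   # run the executable on a compute node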



Contacting Research Computing


Questions?





For assistance with Emerald, please
contact the Research Computing Group:


Email:
research@unc.edu


Phone: 919-962-HELP


Submit help ticket at
http://help.unc.edu