Introduction to Grid & Cluster Computing
Sriram Krishnan, Ph.D.
sriram@sdsc.edu
Motivation: NBCR Example
[Architecture diagram: web portals and rich clients (PMV/ADT, Vision, Continuity, QMView, and the Telescience portal) reach a set of biomedical applications (APBSCommand/APBS, Continuity, Gtomo2, TxBR, Autodock, GAMESS) through a cyber-infrastructure layer of web services, workflow, and middleware running on shared resources.]
Cluster Resources
• “A computer cluster is a group of tightly coupled computers that work together closely so that in many respects they can be viewed as though they are a single computer.” [Wikipedia]
• Typically built using commodity off-the-shelf hardware (processors, networking, etc.)
– Differs from traditional “supercomputers”
– Now at more than 70% of deployed Top500 machines
• Useful for: high availability, load-balancing, scalability, visualization, and high performance
Grid Computing
• “Coordinated resource sharing and problem solving in dynamic multi-institutional virtual organizations.” [Foster, Kesselman, Tuecke]
– Coordinated: multiple resources working in concert, e.g. disk & CPU, or instruments & database
– Resources: compute cycles, databases, files, application services, instruments
– Problem solving: focus on solving scientific problems
– Dynamic: environments that are changing in unpredictable ways
– Virtual Organization: resources spanning multiple organizations and administrative domains, security domains, and technical domains
Grids are not the same as Clusters!
• Foster’s three-point checklist:
– Resources not subject to centralized control
– Use of standard, open, general-purpose protocols and interfaces
– Delivery of non-trivial qualities of service
• Grids are typically made up of multiple clusters
Popular Misconception
• Misconception: Grids are all about CPU cycles
– CPU cycles are just one aspect; others are:
• Data: for publishing and accessing large collections of data, e.g. the Geosciences Network (GEON) Grid
• Collaboration: for sharing access to instruments (e.g. the TeleScience Grid) and collaboration tools (e.g. Global MMCS at IU)
SETI@Home
• Uses 1000s of Internet-connected PCs to help in the search for extraterrestrial intelligence
• When the computer is idle, the software downloads a ~1/2 MB chunk of data for analysis
• Results of the analysis are sent back to the SETI team and combined with those of 1000s of other participants
• Largest distributed computation project in existence
– Total CPU time: 2,433,979.781 years
– Users: 5,436,301
• Statistics from 2006
NCMIR TeleScience Grid
* Slide courtesy of the TeleScience team
NBCR Grid
[Architecture diagram: Gemstone, PMV/Vision, and Kepler clients use application services, security services (GAMA), and state management, which connect through Globus to a Condor pool, an SGE cluster, and a PBS cluster.]
Day 1 - Using Grids and Clusters: Job Submission
• Scenario 1 - Clusters:
– Upload data to the remote cluster using scp
– Log on to the cluster using ssh
– Submit the job via the command line to schedulers such as Condor or the Sun Grid Engine (SGE)
• Scenario 2 - Grids:
– Upload data to the Grid resource using GridFTP
– Submit the job via Globus command-line tools (e.g. globusrun) to remote resources
• Globus services communicate with the resource-specific schedulers (see the sketch below)
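A minimal sketch of the two scenarios, driven from Python for concreteness. The host names, user names, and file names are hypothetical; scp, ssh, qsub, globus-url-copy, and globusrun-ws are the standard tools, but exact flags (in particular the WS-GRAM factory option) vary by site and Globus Toolkit version, so treat this as an illustration rather than a recipe.

```python
#!/usr/bin/env python
# Sketch only: hosts, users, and file names below are made up.
import subprocess

def run(cmd):
    """Echo and execute a command, raising an error if it fails."""
    print("$ " + " ".join(cmd))
    subprocess.run(cmd, check=True)

# Scenario 1 - Clusters: stage input with scp, then submit via ssh + qsub (SGE/PBS).
run(["scp", "input.dat", "user@cluster.example.edu:/home/user/"])
run(["ssh", "user@cluster.example.edu", "qsub", "/home/user/job.sh"])

# Scenario 2 - Grids: stage input with GridFTP, then submit through Globus WS-GRAM.
run(["globus-url-copy", "file:///tmp/input.dat",
     "gsiftp://grid.example.edu/home/user/input.dat"])
run(["globusrun-ws", "-submit",      # GT4 command-line submission tool
     "-F", "grid.example.edu",       # factory contact (check your GT version's docs)
     "-c", "/home/user/job.sh"])     # command to run on the remote resource
```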
Day 1 - Using Grids & Clusters: Security
Day 1 - Using Grids & Clusters: User Interfaces
Day 2 - Managing Cluster Environments
• Clusters are great price/performance computational engines
– Can be hard to manage without experience
– Failure rate increases with cluster size
• Not cost-effective if maintenance is more expensive than the cluster itself
– System administrators can cost more than clusters (1 Tflops cluster < $100,000)
Day 2 - Rocks (Open Source Clustering Distribution)
• Technology transfer of commodity clustering to application scientists
– Making clusters easy
– Scientists can build their own supercomputers
• The Rocks distribution is a set of CDs
– Red Hat Enterprise Linux
– Clustering software (PBS, SGE, Ganglia, Globus)
– Highly programmatic software configuration management
• http://www.rocksclusters.org
Day 2 - Rocks Rolls
Day 3 - Advanced Usage Scenarios: Workflows
• Scientific workflows emerged as an answer to the need to combine multiple Cyberinfrastructure components into automated process networks (see the sketch below)
• Combination of:
– Data integration, analysis, and visualization steps
– An automated “scientific process”
• Promotes scientific discovery
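The workflow idea can be made concrete with a tiny sketch: three stand-in functions (integrate, analyze, visualize are invented names, not Kepler APIs) chained into a linear process network. Real systems such as Kepler describe this graph declaratively and run each step on Grid or cluster resources.

```python
"""Minimal sketch of a scientific workflow: independent components
(data integration, analysis, visualization) chained into an automated
process network. All names and data here are purely illustrative."""

def integrate(sources):
    """Merge raw records from several hypothetical data sources."""
    return [record for source in sources for record in source]

def analyze(records):
    """Stand-in analysis step: compute a simple summary statistic."""
    return {"count": len(records), "mean": sum(records) / len(records)}

def visualize(summary):
    """Stand-in visualization step: just print the summary."""
    print("analysis result:", summary)

# Wire the steps into a linear workflow and run it end to end.
visualize(analyze(integrate([[1, 2, 3], [4, 5, 6]])))
```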
Day 3 - The Big Picture: Scientific Workflows
[Example shown here: John Blondin, NC State, Astrophysics - Terascale Supernova Initiative (SciDAC, DOE). From “napkin drawings” (conceptual SWF) ... to executable workflows (executable SWF). Source: Mladen Vouk (NCSU)]
Day 3 - Kepler Workflows: A Closer Look
Day 3 - Advanced Usage Scenarios: MetaScheduling
• Local schedulers are responsible for load balancing and resource sharing within each local administrative domain
• Meta-schedulers are responsible for querying, negotiating access to, and managing resources that exist within different administrative domains in Grid systems (a toy example follows below)
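As a rough illustration of this division of labor, the toy Python below “meta-schedules” by querying an in-memory table of clusters and handing the job to the least loaded one. The cluster names and slot counts are invented; a real meta-scheduler such as CSF4 negotiates with remote Globus/WS-GRAM services rather than a local table.

```python
"""Toy meta-scheduler: pick the least loaded of several local schedulers,
each assumed to sit in its own administrative domain. Illustrative only."""

clusters = {
    "pbs.example.edu":    {"scheduler": "PBS",    "free_slots": 12},
    "sge.example.edu":    {"scheduler": "SGE",    "free_slots": 48},
    "condor.example.edu": {"scheduler": "Condor", "free_slots": 3},
}

def pick_cluster(pool):
    """Choose the cluster advertising the most free slots."""
    return max(pool, key=lambda name: pool[name]["free_slots"])

def submit(job, pool):
    """Pretend to hand the job to the chosen cluster's local scheduler."""
    target = pick_cluster(pool)
    print(f"submitting {job} to {target} via {pool[target]['scheduler']}")
    pool[target]["free_slots"] -= 1

submit("md_simulation.sh", clusters)
```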
Day 3 - MetaSchedulers: CSF4
• What is the CSF Meta-Scheduler?
– Community Scheduler Framework
– CSF4 is a group of Grid services hosted inside the Globus Toolkit (GT4)
– CSF4 is fully WSRF-compliant
– Open source project, available at http://sourceforge.net/projects/gcsf
– The CSF4 development team is from Jilin University, PRC
Day 3 - CSF4 Architecture
[Architecture diagram: the CSF4 services (Queuing Service, Resource Manager Factory Service, Job Service, Reservation Service) sit above resource managers (an LSF Resource Manager Service and a Gram Resource Manager Service with gabd). Jobs are dispatched through WS-GRAM (GT4) and the GT2 environment’s GateKeeper, via Gram adapters for PBS, SGE, Condor, LSF, and Fork, to the local schedulers (PBS, SGE, Condor, LSF) running on each local machine; WS-MDS supplies meta information about the Grid environment.]
Day 4 - Accessing TeraScale Resources
• I need more resources! What are my options?
– TeraGrid: “With 20 petabytes of storage, and more than 280 teraflops of computing power, TeraGrid combines the processing power of supercomputers across the continent”
– PRAGMA: “To establish sustained collaborations and advance the use of grid technologies in applications among a community of investigators working with leading institutions around the Pacific Rim”
Day 4 - TeraGrid
TeraGrid is a “top-down”, planned Grid: the Extensible Terascale Facility
• Members: IU, ORNL, NCSA, PSC, Purdue, SDSC, TACC, ANL, NCAR
• 280 Tflops of computing capability
• 30 PB of distributed storage
• High-performance networking between partner sites
• Linux-based software environment, uniform administration
• Focus is a national, production Grid
PRAGMA Grid Member Institutions
31 institutions in 15 countries/regions (+7 in preparation):
UZurich (Switzerland); NECTEC, ThaiGrid (Thailand); UoHyd (India); MIMOS, USM (Malaysia); CUHK (Hong Kong); ASGC, NCHC (Taiwan); HCMUT, IOIT-HCM (Vietnam); AIST, OsakaU, UTsukuba, TITech (Japan); BII, IHPC, NGO, NTU (Singapore); MU, APAC, QUT (Australia); KISTI (Korea); JLU, CNIC, GUCAS, LZU (China); SDSC, UUtah, NCSA, BU (USA); CICESE, UNAM (Mexico); UCN, UChile (Chile); ITCR (Costa Rica); BESTGrid (New Zealand); UPRM (Puerto Rico)
Track 1: Agenda (9AM-12PM at PFBH 161)
• Tues, July 31: Basic Cluster and Grid Computing Environment
• Wed, Aug 1: Rocks Clusters and Application Deployment
• Thurs, Aug 2: Workflow Management and MetaScheduling
• Fri, Aug 3: Accessing National and International TeraScale Resources