Technologies for Cluster Computing




Oren Laadan

Columbia University

<orenl@cs.columbia.edu>

ECI, July 2005


Course Overview (contd)


What is Cluster Computing?


Parallel computing, enabling technologies, definition of a cluster, taxonomy


Middleware


SSI, operating system support, software support


Virtualization & Process Migration


Resource sharing


Job assignment, load balancing, information dissemination


Grids


Motivation


Demanding Applications


Modeling and simulations (physics, weather, CAD, aerodynamics, finance, pharmaceutical)


Business and e-commerce (eBay, Oracle)


Internet (Google, eAnything)


Number crunching (encryption, data mining)


Entertainment (animation, simulators)


CPUs are reaching physical limits


Dimensions


Heat dissipation



How to Run Applications Faster?


3 ways to improve performance:


Work Harder


Work Smarter


Get Help


And in computers:


Using faster hardware


Optimized algorithms and techniques


Multiple computers to solve a particular task


Parallel Computing


Hardware: Instructions or Data ?


SISD


classic CPU


SIMD


vector computers


MISD


pipelined computers


MIMD


general purpose parallelism


Software adjustments


Parallel programming: multiple processes
collaborating, with communication and
synchronization between them


Operating systems, compilers etc.
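The parallel-programming model described above, multiple processes collaborating through communication and synchronization, can be sketched as follows. This is an illustrative sketch only; the worker layout and the use of Python's multiprocessing primitives are my choice, not part of the course material:

```python
# Several worker processes each compute part of a sum, communicate their
# partial results back through a queue, and synchronize with join().
from multiprocessing import Process, Queue

def partial_sum(lo, hi, out):
    # Each worker handles its own slice of the problem independently
    out.put(sum(range(lo, hi)))

def parallel_sum(n, workers=4):
    out = Queue()
    step = n // workers
    procs = [Process(target=partial_sum,
                     args=(i * step,
                           n if i == workers - 1 else (i + 1) * step,
                           out))
             for i in range(workers)]
    for p in procs:
        p.start()
    total = sum(out.get() for _ in procs)   # communication
    for p in procs:
        p.join()                            # synchronization
    return total

if __name__ == "__main__":
    print(parallel_sum(1000))  # -> 499500
```

The same division of labor underlies cluster programming models such as MPI, only with messages crossing a network instead of a local queue.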



Parallel Computer Architectures

Taxonomy of MIMD:


SMP - Symmetric Multiprocessing

MPP - Massively Parallel Processors

CC-NUMA - Cache-Coherent Non-Uniform Memory Access


Distributed Systems


COTS - Commodity Off The Shelf

NOW - Network of Workstations


Clusters


Taxonomy of MIMD (contd)


SMP


2-64 processors today


Everything shared


Single copy of OS


Scalability issues (hardware, software)


MPP


Nothing shared


Several hundred nodes


Fast interconnection


Inferior cost/performance ratio


Taxonomy of MIMD (contd)


CC-NUMA


Scalable multiprocessor system


Global view of memory at each node


Distributed systems


Conventional networks of independent nodes


Multiple system images and OS


Each node can be of any type (SMP, MPP etc)


Difficult to use and extract performance


Taxonomy of MIMD (contd)


Clusters


Nodes connected with high-speed network

Operate as an integrated collection of resources

Single system image

High performance computing: commodity supercomputing

High availability computing: mission-critical applications



Taxonomy of MIMD - summary



Enabling Technologies


Performance of individual components


Microprocessor (x2 every 18 months)


Memory capacity (x4 every 3 years)


Storage (capacity same!)


SAN, NAS


Network (scalable gigabit networks)


OS, Programming environments


Applications


Rate of performance improvement exceeds that of specialized systems


The “killer” workstation


Traditional usage


Workstations w/ Unix for science & industry


PCs for administrative work & word processing


Recent trend


Rapid convergence in processor performance and kernel-level functionality of PCs vs workstations


Killer CPU, killer memory, killer network, killer OS,
killer applications…




Computer Food Chain


Towards Commodity HPC


Link together multiple computers to jointly solve a computational problem

Ubiquitous availability of commodity high-performance components




Out: expensive, specialized proprietary parallel computers

In: cheaper clusters of loosely coupled workstations





History of Cluster Computing

[Timeline figure: 1960, 1980s, 1990, 1995+, 2000+; labels include PDA and Clusters]


Why PC/WS Clustering Now?


Individual PCs/WSs become increasingly powerful


Development cycle of supercomputers too long


Commodity network bandwidth is increasing and latency is decreasing


Easier to integrate into existing networks


Typically low user utilization of PCs/WSs (< 10%)


Development tools for PCs/WS are more mature


PCs/WS clusters are cheap and readily available


Clusters can leverage future technologies and be easily grown


What is a Cluster?


Cluster - a parallel or distributed processing system consisting of a collection of interconnected stand-alone computers working cooperatively as a single, integrated computing resource.


Each node in the cluster is

A UP/MP system with memory, I/O facilities, & OS

Connected via fast interconnect or LAN

Together, the nodes appear as a single system to users and applications



Cluster Architecture

[Layered diagram, top to bottom:]

Sequential Applications | Parallel Applications

Parallel Programming Environment

Cluster Middleware
(Single System Image and Availability Infrastructure)

Cluster Interconnection Network/Switch

Multiple PC/Workstation nodes, each with Communications Software and Network Interface Hardware


A Winning Symbiosis


Parallel Processing

Create MPP- or DSM-like parallel processing systems

Network RAM

Use cluster-wide available memory to aggregate a substantial cache in RAM

Software RAID

Use arrays of WS disks to provide cheap, highly available and scalable storage and parallel I/O

Multi-path communications

Use multiple networks for parallel file transfer
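The software-RAID idea above can be sketched as round-robin striping of data across several workstation "disks" so that reads and writes can proceed in parallel. This is a toy illustration, not any particular RAID implementation; the chunk size, disk count, and function names are made up:

```python
# Stripe a byte string chunk-by-chunk across N simulated disks, and
# reassemble it by walking the disks in the same round-robin order.
CHUNK = 4  # illustrative stripe-unit size in bytes

def stripe(data, ndisks):
    disks = [bytearray() for _ in range(ndisks)]
    for i in range(0, len(data), CHUNK):
        disks[(i // CHUNK) % ndisks].extend(data[i:i + CHUNK])
    return disks

def unstripe(disks, length):
    out = bytearray()
    offsets = [0] * len(disks)
    d = 0
    while len(out) < length:
        out.extend(disks[d][offsets[d]:offsets[d] + CHUNK])
        offsets[d] += CHUNK
        d = (d + 1) % len(disks)
    return bytes(out)

if __name__ == "__main__":
    data = b"The quick brown fox jumps over the lazy dog"
    assert unstripe(stripe(data, 3), len(data)) == data
```

Real software RAID adds parity or mirroring for availability; striping alone only buys parallel I/O bandwidth.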


Design Issues


Cost/performance ratio


Increased Availability


Single System Image (look-and-feel of one system)


Scalability (physical, size, performance, capacity)


Fast communication (network and protocols)


Resource balancing (cpu, network, memory, storage)


Security and privacy


Manageability (administration and control)


Usability and applicability (programming environment, cluster-aware apps)


Cluster Objectives


High performance

Usually dedicated clusters for HPC

Partitioning between users


High throughput

Steal idle cycles (cycle harvesting)

Maximum utilization of available resources


High availability

Fail-over configuration

Heartbeat connections



Combined: HP and HA
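The heartbeat mechanism behind high-availability clustering can be sketched as follows: each node periodically reports in, and a monitor declares a node failed (triggering fail-over) once its last heartbeat is older than a timeout. All names and the timeout value here are illustrative, not from any particular HA product:

```python
# A monitor tracks the last heartbeat time per node; any node silent
# for longer than TIMEOUT is considered failed.
TIMEOUT = 3.0  # seconds of silence before declaring a node dead

class HeartbeatMonitor:
    def __init__(self):
        self.last_seen = {}

    def heartbeat(self, node, now):
        self.last_seen[node] = now

    def failed_nodes(self, now):
        return [n for n, t in self.last_seen.items() if now - t > TIMEOUT]

m = HeartbeatMonitor()
m.heartbeat("node1", now=0.0)
m.heartbeat("node2", now=0.0)
m.heartbeat("node1", now=5.0)   # node2 has stopped reporting
print(m.failed_nodes(now=6.0))  # -> ['node2']
```

In a real cluster the heartbeats travel over dedicated links, and the timeout must be chosen to balance fast fail-over against false alarms from transient load.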


Example: MOSIX at HUJI


Example: Berkeley NOW


Cluster Components


Nodes


Operating System


Network


Interconnects


Communication protocols & services


Middleware


Programming models


Applications


Cluster Components: Nodes


Multiple High Performance Computers


PCs


Workstations


SMPs (CLUMPS)



Processors


Intel/AMD x86 Processors


IBM PowerPC


Digital Alpha


Sun SPARC


Cluster Components: OS


Basic services:


Easy access to hardware


Share hardware resources seamlessly


Concurrency (multiple threads of control)


Operating Systems:


Linux (Beowulf, and many more)

Microsoft NT (Illinois HPVM, Cornell Velocity)

SUN Solaris (Berkeley NOW, C-DAC PARAM)

Mach µ-kernel (CMU)

Cluster OS (Solaris MC, MOSIX)

OS gluing layers (Berkeley Glunix)


Cluster Components: Network


High Performance Networks/Switches


Ethernet (10Mbps), Fast Ethernet (100Mbps),
Gigabit Ethernet (1Gbps)


SCI (Scalable Coherent Interface - 12µs latency)


ATM (Asynchronous Transfer Mode)


Myrinet (1.2Gbps)


QsNet (5µsec latency for MPI messages)


FDDI (fiber distributed data interface)


Digital Memory Channel


InfiniBand


Cluster Components:
Interconnects


Standard Ethernet


10 Mbps, cheap, easy to deploy


bandwidth & latency don’t match CPU capabilities


Fast Ethernet, and Gigabit Ethernet


Fast Ethernet


100 Mbps


Gigabit Ethernet


1000Mbps


Myrinet


1.28 Gbps full-duplex interconnect, 5-10 µs latency


Programmable on-board processor


Leverage MPP technology


Interconnects (contd)


InfiniBand

Latency < 7 µs

Industry standard based on VIA


Connects components within a system


SCI


Scalable Coherent Interface


Interconnection technology for clusters


Directory based cache scheme


VIA


Virtual Interface Architecture


Standard for low-latency communications software interface


Cluster Interconnects: Comparison







Criteria                | Gigabit Ethernet | Gigabit cLAN        | Infiniband | Myrinet             | SCI
Bandwidth (MB/s)        | < 100            | < 125               | 850        | 230                 | < 320
Latency (µs)            | < 100            | 7-10                | < 7        | 10                  | 1-2
Hardware availability   | Now              | Now                 | Now        | Now                 | Now
Linux support           | Now              | Now                 | Now        | Now                 | Now
Max # of nodes          | 1000's           | 1000's              | > 1000's   | 1000's              | 1000's
Protocol implementation | Hardware         | Firmware on adaptor | Hardware   | Firmware on adaptor | Firmware on adaptor
VIA support             | NT / Linux       | NT / Linux          | Software   | Linux               | Software
MPI support             | MVICH            | 3rd party           | MPI/Pro    | 3rd party           | 3rd party
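The bandwidth and latency figures above can be compared with a simple first-order model: transfer time is roughly latency plus message size divided by bandwidth, so for small messages the low-latency interconnects win regardless of raw bandwidth. The function below is my own back-of-the-envelope sketch using rough figures from the table:

```python
# Approximate one-way transfer time. Since 1 MB/s moves 1 byte per µs,
# dividing bytes by MB/s yields microseconds directly.
def transfer_us(size_bytes, latency_us, bandwidth_mb_s):
    return latency_us + size_bytes / bandwidth_mb_s

# A 1 KB message on three interconnects (rough numbers from the table):
for name, lat, bw in [("Gigabit Ethernet", 100, 100),
                      ("Myrinet", 10, 230),
                      ("SCI", 2, 320)]:
    print(f"{name}: {transfer_us(1024, lat, bw):.1f} us")
```

For this 1 KB message, latency contributes most of the total on Gigabit Ethernet, which is why fine-grained parallel applications favor Myrinet- or SCI-class interconnects.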


Cluster Components:
Communication protocols


Fast communication protocols (and user-level communication):

Standard TCP/IP, zero-copy TCP/IP


Active Messages (Berkeley)


Fast Messages (Illinois)


U-Net (Cornell)


XTP (Virginia)


Virtual Interface Architecture (VIA)


Cluster Components:
Communication services


Communication infrastructure


Bulk-data transport


Streaming data


Group communications


Provide important QoS parameters


Latency, bandwidth, reliability, fault-tolerance


Wide range of communication methodologies


RPC


DSM


Stream-based and message passing (e.g., MPI, PVM)
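The message-passing style listed above can be sketched in the spirit of MPI's send/receive pairing, though using Python's multiprocessing Pipe rather than MPI's actual API; the function names here are illustrative:

```python
# Two processes exchange explicit messages over a pipe: no shared
# memory, just send and (blocking) receive, as in message passing.
from multiprocessing import Process, Pipe

def worker(conn):
    msg = conn.recv()       # blocking receive
    conn.send(msg * 2)      # reply to the parent
    conn.close()

def ping(value):
    parent, child = Pipe()
    p = Process(target=worker, args=(child,))
    p.start()
    parent.send(value)      # explicit communication, no shared state
    reply = parent.recv()
    p.join()
    return reply

if __name__ == "__main__":
    print(ping(21))  # -> 42
```

RPC wraps this same exchange in a procedure-call interface, while DSM hides it entirely behind ordinary memory reads and writes.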


Cluster Components: Middleware


Resides between the OS and the
applications


Provides infrastructure to transparently
support:


Single System Image (SSI)

Makes collection appear as a single machine


System Availability (SA)

Monitoring, checkpoint, restart, migration


Resource Management and Scheduling (RMS)


Cluster Components:
Programming Models


Threads (PCs, SMPs, NOW..)


POSIX Threads, Java Threads


OpenMP


MPI (Message Passing Interface)


PVM (Parallel Virtual Machine)


Software DSMs (Shmem)


Compilers


Parallel code generators, C/C++/Java/Fortran


Performance Analysis Tools


Visualization Tools
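The shared-memory threads model listed above (POSIX threads, Java threads) can be illustrated with Python's threading module. This is a sketch of the model, not of any library named in the slide: threads share one address space, so updates to shared state must be synchronized with a lock:

```python
# Four threads increment a shared counter; the lock serializes the
# read-modify-write so no updates are lost.
import threading

counter = 0
lock = threading.Lock()

def work(n):
    global counter
    for _ in range(n):
        with lock:            # synchronization on shared memory
            counter += 1

threads = [threading.Thread(target=work, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # -> 40000
```

Message-passing models like MPI avoid the lock entirely by giving each process private memory, which is what lets them span cluster nodes.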


Cluster Components:
Applications


Sequential

Parametric Modeling

Embarrassingly parallel

Parallel / Distributed

Cluster-aware

Grand Challenge applications

Web servers, data-mining
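The embarrassingly parallel, parametric-modeling case above is the easiest to run on a cluster: the same model is evaluated independently for many parameter values, so runs can simply be farmed out to a pool of workers. A minimal sketch, in which the model function is a stand-in for a real simulation:

```python
# Farm independent model evaluations out to a process pool; no
# communication is needed between runs, only the final gather.
from multiprocessing import Pool

def model(param):
    # Stand-in for an independent simulation run
    return param * param

def sweep(params, workers=4):
    with Pool(workers) as pool:
        return pool.map(model, params)

if __name__ == "__main__":
    print(sweep(range(6)))  # -> [0, 1, 4, 9, 16, 25]
```

On a cluster the pool would be replaced by a job scheduler distributing runs across nodes, but the structure of the computation is the same.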


Clusters Classification (I)


Application Target


High Performance (HP) Clusters


Grand Challenge Applications


High Availability (HA) Clusters


Mission Critical applications


Clusters Classification (II)


Node Ownership


Dedicated Clusters


Non-dedicated clusters


Adaptive parallel computing


Communal multiprocessing


Clusters Classification (III)


Node Hardware


Clusters of PCs (CoPs)


Piles of PCs (PoPs)


Clusters of Workstations (COWs)


Clusters of SMPs (CLUMPs)


Clusters Classification (IV)


Node Operating System


Linux Clusters (e.g., Beowulf)


Solaris Clusters (e.g., Berkeley NOW)


NT Clusters (e.g., HPVM)


AIX Clusters (e.g., IBM SP2)


SCO/Compaq Clusters (Unixware)


Digital VMS Clusters


HP-UX clusters


Microsoft Wolfpack clusters


Clusters Classification (V)


Node Configuration


Homogeneous Clusters

All nodes have similar architectures and run the same OS

Semi-Homogeneous Clusters

Similar architectures and OS, varying performance capabilities

Heterogeneous Clusters

Nodes have different architectures and run different OSs


Clusters Classification (VI)


Levels of Clustering


Group Clusters (#nodes: 2-99)


Departmental Clusters (#nodes: 10s to 100s)


Organizational Clusters (#nodes: many 100s)


National Metacomputers (WAN/Internet)


International Metacomputers (Internet-based, #nodes: 1000s to many millions)


Grid Computing


Web-based Computing


Peer-to-Peer Computing


Summary: Key Benefits


High Performance

With cluster-aware applications


High Throughput

Resource balancing and sharing


High Availability

Redundancy in hardware, OS, applications


Expandability and Scalability

Expand on-demand by adding HW