General Parallel File System


Oct 31, 2013


Presentation by:

Lokesh Pradhan



Introduction


File System


A file system is a way to organize data that is expected to persist after a program terminates. It provides procedures to store, retrieve, and update data, and it manages the available space on the device that holds the data.



Types of File System

Type: Examples

Disk file system: FAT, exFAT, NTFS…

Optical discs: CD, DVD, Blu-ray

Tape file system: IBM’s Linear Tape

Database file system: DB2

Transactional file system: TxF, Valor, Amino, TFFS

Flat file system: Amazon’s S3

Cluster file system (distributed, shared, SAN, and parallel file systems): NFS, CIFS, AFS, SMB, GFS, GPFS, LUSTRE, PAS


In HPC world


Equally large applications


Large input data set (e.g. astronomy data)


Parallel execution on large clusters



Use parallel file systems for scalable I/O


e.g. IBM’s GPFS, Sun’s Lustre FS, PanFS, and Parallel Virtual File System (PVFS)


General Parallel File System


Cluster: 512 nodes today, fast reliable communication

Shared disk: all data and metadata on disk accessible from any node through the disk I/O interface (i.e., "any to any" connectivity)

Parallel: data and metadata flow from all of the nodes to all of the disks in parallel

RAS: reliability, availability, serviceability


History of GPFS


Shark video server


Video streaming from single RS/6000


Complete system, included file system, network driver, control server


Large data blocks, admission control, deadline scheduling


Bell Atlantic video-on-demand trial (1993-94)


Tiger Shark multimedia file system


Multimedia file system for RS/6000 SP


Data striped across multiple disks, accessible from all nodes


Hong Kong and Tokyo video trials, Austin video server products


GPFS parallel file system


General purpose file system for commercial and technical computing
on RS/6000 SP, AIX and Linux clusters.


Recovery, online system management, byte-range locking, fast pre-fetch, parallel allocation, scalable directory, small-block random access.


Released as a product (release 1.1) in 05/98.


What is Parallel I/O?


Multiple processes (possibly on multiple nodes) participate in the I/O

Application level parallelism

“File” is stored on multiple disks on a parallel file system
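
The idea of several participants doing I/O on one file can be sketched in Python. This is only an illustration under stated assumptions: threads stand in for processes on separate nodes, an ordinary local file stands in for a file on a parallel file system, and the names `write_chunk`, `parallel_write`, `demo`, and `CHUNK` are hypothetical. It relies on `os.pwrite`, which is available on Unix-like systems.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

CHUNK = 4  # bytes owned by each worker (tiny, for illustration only)


def write_chunk(path, rank):
    """Worker `rank` writes its own disjoint byte range of the shared file."""
    fd = os.open(path, os.O_WRONLY)
    try:
        data = bytes([ord("A") + rank]) * CHUNK
        # Positional write: no shared file offset, so workers do not interfere.
        os.pwrite(fd, data, rank * CHUNK)
    finally:
        os.close(fd)


def parallel_write(path, workers):
    """Have `workers` threads fill one file concurrently, then read it back."""
    # Pre-size the file so every worker's byte range already exists.
    with open(path, "wb") as f:
        f.truncate(workers * CHUNK)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda rank: write_chunk(path, rank), range(workers)))
    with open(path, "rb") as f:
        return f.read()


def demo():
    """Run four workers against a temporary file and return its contents."""
    with tempfile.TemporaryDirectory() as tmp:
        return parallel_write(os.path.join(tmp, "shared.dat"), 4)
```

Because each worker owns a separate byte range, the final contents do not depend on the order in which the workers run.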


What does a Parallel File System support?


A parallel file system must support


Parallel I/O


Consistent global name space across all nodes of the cluster


Including maintaining a consistent view across all nodes for the same file


Programming model allowing programs to access file data


Distributed over multiple nodes


From multiple tasks running on multiple nodes


Physical distribution of data across disks and network entities eliminates bottlenecks both at the disk interface and the network, providing more effective bandwidth to the I/O resources


Why use general parallel file systems?



Native AIX File System


No file sharing - application can only access files on its own node


Applications must do their own data partitioning



Distributed File System


Application nodes (DCE clients) share files on server node


Switch is used as a fast LAN


Coarse-grained (file or segment level) parallelism

Server node: performance and capacity bottleneck



GPFS Parallel File System


GPFS file systems are striped across multiple disks on multiple storage nodes

Independent GPFS instances run on each application node

GPFS instances use storage nodes as "block servers" - all instances can access all disks




Performance advantages with GPFS file system

Allowing multiple processes or applications on all nodes in the cluster simultaneous access to the same file using standard file system calls.

Increasing aggregate bandwidth of your file system by spreading reads and writes across multiple disks.

Balancing the load evenly across all disks to maximize their combined throughput. One disk is no more active than another.
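
Spreading reads and writes across disks can be illustrated with a small striping sketch. It assumes simple round-robin placement of fixed-size blocks, which is one common way to balance load and is used here only as an illustration; the function names `locate_block`, `stripe`, and `unstripe` are hypothetical, and in-memory lists stand in for disks.

```python
def locate_block(block_index, num_disks):
    """Map a file's logical block to (disk, slot on that disk), round-robin."""
    return block_index % num_disks, block_index // num_disks


def stripe(data, block_size, num_disks):
    """Split data into fixed-size blocks and deal them out across the disks."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        disk, _slot = locate_block(offset // block_size, num_disks)
        disks[disk].append(data[offset:offset + block_size])
    return disks


def unstripe(disks):
    """Reassemble the original bytes by visiting the disks in round-robin order."""
    out = []
    depth = max(len(blocks) for blocks in disks)
    for slot in range(depth):
        for blocks in disks:
            if slot < len(blocks):
                out.append(blocks[slot])
    return b"".join(out)
```

With this placement, consecutive blocks of a large file land on different disks, so a sequential read or write keeps all disks busy and no disk holds more than one block above any other.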

Performance advantages with GPFS file system (cont.)

Supporting very large file and file system sizes.

Allowing concurrent reads and writes from multiple nodes.

Allowing for distributed token (lock) management. Distributing token management reduces system delays associated with a lockable object waiting to obtain a token.

Allowing for the specification of other networks for GPFS daemon communication and for GPFS administration command usage within your cluster.


GPFS Architecture Overview


Implications of Shared Disk Model


All data and metadata on globally accessible disks (VSD)

All access to permanent data through disk I/O interface

Distributed protocols, e.g., distributed locking, coordinate disk access from multiple nodes

Fine-grained locking allows parallel access by multiple clients

Logging and shadowing restore consistency after node failures
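
Fine-grained locking can be pictured as locking byte ranges rather than whole files. The sketch below is a toy, single-process lock table and not the GPFS token protocol; the class `ByteRangeLocks` and its methods are hypothetical names chosen for illustration, and ranges are assumed to be half-open (`start` inclusive, `end` exclusive).

```python
class ByteRangeLocks:
    """Toy lock table: grants a byte range unless it conflicts with another holder."""

    def __init__(self):
        # Each entry is (node, start, end, exclusive); `end` is exclusive.
        self.held = []

    def acquire(self, node, start, end, exclusive=True):
        """Return True and record the lock if no other node holds a conflicting range."""
        for other, s, e, other_exclusive in self.held:
            overlaps = start < e and s < end
            # Two shared (read) locks may overlap; anything else conflicts.
            if overlaps and other != node and (exclusive or other_exclusive):
                return False
        self.held.append((node, start, end, exclusive))
        return True

    def release(self, node):
        """Drop every range held by `node`."""
        self.held = [entry for entry in self.held if entry[0] != node]
```

Nodes that write to disjoint ranges of the same file never conflict here, which is the property that lets multiple clients work on one file in parallel.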

GPFS Architecture Overview (cont.)


Implications of Large Scale


Support up to 4096 disks of up to 1 TB each (4 petabytes)

The largest system in production is 75 TB

Failure detection and recovery protocols to handle node failures

Replication and/or RAID protect against disk / storage node failure

On-line dynamic reconfiguration (add, delete, replace disks and nodes; rebalance file system)


GPFS Architecture - Special Node Roles


Three types of nodes:


File system nodes


Manager nodes


Storage nodes

Disk Data Structures:


Large block size allows efficient use of disk bandwidth


Fragments reduce space overhead for small files


No designated "mirror", no fixed placement function:


Flexible replication (e.g., replicate only metadata, or only important files)

Dynamic reconfiguration: data can migrate block-by-block


Multi-level indirect blocks



Each disk address:


List of pointers to replicas


Each pointer:


Disk id + sector no.
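
The disk address layout described above (a list of replica pointers, each a disk id plus a sector number) can be sketched as a small data structure. This is a hypothetical illustration, not the on-disk format: the names `DiskPointer`, `DiskAddress`, and `first_available` are invented for the example, and the rule of reading from the first replica on a healthy disk is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class DiskPointer:
    """One replica location: which disk, and which sector on that disk."""
    disk_id: int
    sector: int


@dataclass
class DiskAddress:
    """A disk address is a list of pointers, one per replica of the block."""
    replicas: List[DiskPointer]

    def first_available(self, failed_disks: Set[int]) -> Optional[DiskPointer]:
        """Return the first replica that is not on a failed disk, if any."""
        for pointer in self.replicas:
            if pointer.disk_id not in failed_disks:
                return pointer
        return None
```

Because each block carries its own list of replica pointers, there is no designated mirror disk: a block can have one replica or several, and a replica can sit on any disk.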

Availability and Reliability


Eliminates single points of failure

Designed to transparently fail over token (lock) operations.

Supports data replication to increase availability in the event of a storage media failure.

Offers time-tested reliability and has been installed on thousands of nodes across industries


Basis of many cloud storage offerings


GPFS’s Achievements

Used on six of the ten most powerful supercomputers in the world, including the largest (ASCI White)

Installed at several hundred customer sites, on clusters ranging from a few nodes with less than a TB of disk, up to 512 nodes with 140 TB of disk in 2 file systems

20 filed patents

Used on the ASC Purple supercomputer, which is composed of more than 12,000 processors and has 2 PB of total disk storage spanning more than 11,000 disks.


Conclusion


Efficient for managing data volumes


Provides world-class performance, scalability, and availability for your file data

Designed to optimize the use of storage

Provides a highly available platform for data-intensive applications

Meets real business needs by streamlining data workflows, improving services, reducing cost, and managing risk.



Questions?