Parallel Computing with MATLAB on Amazon Elastic Compute Cloud

Application Guidelines
Contents
Introduction
  Who Should Read this Paper
Is Cloud Computing Right for You?
MATLAB Parallel Computing Tools: Basic Setup and Requirements
Using Parallel MATLAB on Amazon EC2
  Setup
  Performing Parallel MATLAB Computations on Amazon EC2
  Managing Data
Setting up Parallel MATLAB on Amazon EC2
  Setting up Your Desktop Computer
  Setting up a Basic Compute Environment on Amazon Web Services
    Choosing an Amazon EC2 AMI and Instance
  Setting up MATLAB Distributed Computing Server
    Configuring an AMI with MATLAB Distributed Computing Server
    Configuring MATLAB Distributed Computing Server Launch Mechanics
  Setting up a Scheduler
    MathWorks Job Manager
    Third-Party Schedulers
  Network Setup
    Network Setup for Using MathWorks Job Manager
    Network Setup for Using Third-Party Schedulers
  Setting up the MATLAB Client on a User’s Desktop
    Setup for Using MathWorks Job Manager
    Setup for Using Third-Party Schedulers
Licensing
  License Management on Amazon EC2
The MathWorks Support Services
References
Introduction
Cloud computing has captured popular imagination with the immense possibilities
this computing paradigm seems to offer. The term cloud computing itself is
variously defined; it encompasses a broad range of IT capabilities that are
accessible irrespective of the user’s geographic location. These capabilities are
offered as a high-availability service by a third party on a pay-as-you-go basis,
and the end user interacts with them over the Internet cloud.
You can distinguish among the cloud computing services different vendors offer
on the basis of the level of abstractions they provide: hardware-as-a-service or
infrastructure cloud (e.g., Amazon Elastic Compute Cloud), platform (e.g., Google
App Engine), and applications (e.g., Web-based e-mail services). In this paper, our
interest is in cloud computing services that offer hardware as a service.
Amazon Elastic Compute Cloud (Amazon EC2) is one of the better known
hardware-as-a-service cloud computing services. This paper outlines key
requirements and steps for integrating MATLAB® parallel computing products –
Parallel Computing Toolbox™ and MATLAB Distributed Computing Server™ – for use
with the Amazon cloud computing service.
Who Should Read this Paper
System and cluster administrators should review the following sections:
1. Is Cloud Computing Right for You?
2. MATLAB Parallel Computing Tools: Basic Setup and Requirements
3. Setting up Parallel MATLAB on Amazon EC2:
Setting up the tools for use with a cloud computing service is similar to setting
them up on a traditional computer cluster, with a similar set of requirements.
However, some additional configuration is required. We highlight the
additional requirements and setup steps in the sections that follow.
4. Licensing
5. The MathWorks Support Services
Users of MATLAB and Parallel Computing Toolbox should review the following
sections:
1. Is Cloud Computing Right for You?
2. MATLAB Parallel Computing Tools: Basic Setup and Requirements
3. Using Parallel MATLAB on Amazon EC2
Is Cloud Computing Right for You?
The cloud computing paradigm offers several advantages. Location-agnostic
availability is a key feature. More importantly, this computing paradigm lets
organizations break free from the cost-prohibitive task of maintaining their own
computing infrastructure.
In a variety of industries and disciplines – such as science, engineering, finance, oil
exploration, and bioinformatics – the demand for massive compute resources for
cluster applications can spike for relatively brief periods of time. This is particularly
true for groups with a research and development focus. These organizations can either
invest in continually adding compute resources (as well as the people to maintain
them) or outsource this to a cloud computing service.
Additionally, with cloud computing, horizontal scaling (acquiring additional
computer systems) is almost instantaneous. With a single button click, users can
commandeer hundreds of additional computational resources for their applications.
However, you must be aware of some fundamental issues before committing to
this paradigm.
Setup, Getting Started, and Cost
Setting up applications to run on a cloud computing service requires a significant
time investment, particularly in remotely installing and managing various software
components. The fact that the operating environment is virtualized may alter
some basic assumptions about how software will behave, particularly when multiple
software components need to communicate with each other. In our experience,
an administrator with no prior experience of a cloud computing service can take
approximately 6-8 hours to completely set up MATLAB parallel computing tools
on Amazon EC2. This time includes reviewing Amazon documentation, uploading
installation files, and performing basic tests on the Amazon EC2 cluster. A
virtualized environment has implications for the performance of user applications,
too, as we discuss in the next subsection.
Cloud computing services typically charge by usage. With Amazon Web Services (of
which Amazon EC2 is a part) you will bear three costs: using Amazon EC2, storing
data and machine images on the Amazon storage service (S3), and any data transfers that
you perform. Review these costs before signing up for any cloud computing service.
Amazon provides an online tool to estimate monthly costs while using its services.
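As a rough illustration of how the three cost components add up, the following Python sketch computes a monthly estimate. The rates and usage figures are hypothetical placeholders chosen to keep the arithmetic simple; they are not Amazon prices, and Amazon's online tool remains the place to get real numbers.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """Hypothetical unit prices in USD; replace with current Amazon pricing."""
    instance_hour: float      # per instance-hour on Amazon EC2
    storage_gb_month: float   # per GB-month stored on Amazon S3
    transfer_gb: float        # per GB transferred


def monthly_cost(rates, instances, hours_each, stored_gb, transferred_gb):
    """Return (compute, storage, transfer, total) for one month."""
    compute = rates.instance_hour * instances * hours_each
    storage = rates.storage_gb_month * stored_gb
    transfer = rates.transfer_gb * transferred_gb
    return compute, storage, transfer, compute + storage + transfer


if __name__ == "__main__":
    # Placeholder rates and usage, for illustration only.
    rates = Rates(instance_hour=0.10, storage_gb_month=0.15, transfer_gb=0.10)
    compute, storage, transfer, total = monthly_cost(
        rates, instances=8, hours_each=20, stored_gb=50, transferred_gb=30)
    print(f"compute={compute:.2f} storage={storage:.2f} "
          f"transfer={transfer:.2f} total={total:.2f}")
```

Software license costs, discussed next, are not part of this estimate.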
You may also incur additional costs for software licenses. Users’ ability to scale
applications may be limited by the number of available software licenses. For
example, a MATLAB Distributed Computing Server license provides access to a
certain number of workers. Once this pool of workers is exhausted, users must
wait until the workers are released by computations other users are running.
Performance and Quality of Service
You need to consider two aspects of performance and quality of service. First, the
interaction with a cloud computing service occurs over the Internet. Data transfer
over the Internet is slow compared with a local network and may become a significant
bottleneck for users’ applications.
Second, any cloud computing service relies heavily on virtualization, so
system resources such as the disk I/O system and networking are shared among
several virtual compute resources. As a result, user applications that rely on
interworker communication (using MPI, for example) or have significant disk reads
and writes may see performance deterioration compared to a compute cluster that
dedicates specialized resources for these purposes. Data transfer from client desktop
computers to the virtual compute resources will also suffer from these bottlenecks, in
addition to the Internet-induced latencies. Amazon’s service provides some freedom in
the choice of these virtual compute resources to mitigate some of these effects.
Moreover, the combination of the two factors described above means that
horizontal scaling is instantaneous only relative to long hardware procurement
and setup cycles. Starting appropriate compute resources on a cloud computing
service can take a significant amount of time. For example, on Amazon EC2,
launching an instance configured with MATLAB Distributed Computing Server
can take up to 10 minutes. This time increases as the size of user code and data
that is hosted on the instance increases.
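To get a feel for how launch time and data transfer eat into the benefit of extra workers, the following Python sketch adds up the pieces of a single run. It is a back-of-the-envelope model only: the 10-minute launch time is the upper bound quoted above, while the bandwidth, data sizes, and the assumption of ideal linear speedup are illustrative and optimistic.

```python
def transfer_seconds(size_mb, bandwidth_mbit_per_s):
    """Time to move size_mb megabytes over a link rated in megabits per second."""
    return size_mb * 8.0 / bandwidth_mbit_per_s


def cloud_turnaround_seconds(serial_seconds, workers, upload_mb, download_mb,
                             bandwidth_mbit_per_s, launch_seconds=600.0):
    """Launch + upload + ideally scaled compute + download, in seconds.

    launch_seconds defaults to the 10-minute upper bound quoted in the text.
    Dividing by the worker count assumes perfect scaling, which real
    applications rarely achieve.
    """
    compute = serial_seconds / workers
    return (launch_seconds
            + transfer_seconds(upload_mb, bandwidth_mbit_per_s)
            + compute
            + transfer_seconds(download_mb, bandwidth_mbit_per_s))


if __name__ == "__main__":
    # Hypothetical job: 8 hours serial, 16 workers, 500 MB up, 50 MB down, 10 Mbit/s.
    total = cloud_turnaround_seconds(8 * 3600, 16, 500, 50, 10)
    print(f"estimated turnaround: {total / 60:.1f} minutes")
```

For short jobs the fixed launch and transfer terms dominate, which is why the scaling is instantaneous only in a relative sense.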
Security and Intellectual Property
Security, both in terms of your organization’s network security and its intellectual
property, is another issue that needs careful thought.
Interacting with a cloud computing service requires configuring your organization’s
network security to allow necessary processes to communicate over the
Internet. For the Amazon EC2 service, it is essential to be able to establish an
SSH connection with a running instance for someone to be able to customize it
for users’ needs. In addition, accessing certain features of the MATLAB parallel
computing tools requires configuring desktop sharing software (such as VNC) and
a virtual private network (VPN). For example, the interactive parallel computing
capability in MATLAB parallel computing tools requires MATLAB workers to
establish a direct connection with the client MATLAB session on a user’s desktop.
In addition to network security, you must also consider whether your organization will
permit transferring data and user programs through the organization’s firewall to an
outside network. For example, users may wish to host their large data sets on Amazon’s
services to mitigate the cost of repeatedly transmitting the data over the Internet.
MATLAB Parallel Computing Tools: Basic Setup and Requirements
The fundamental setup of MATLAB parallel computing tools for use with a cluster
requires four essential software components§:
1. MATLAB with Parallel Computing Toolbox: MATLAB and the toolbox
provide users the language and tools for programming parallel MATLAB
applications, as well as mechanisms to send applications for execution. In a typical setup,
Parallel Computing Toolbox and MATLAB are installed on the users’ desktops.
More generally, MATLAB and the toolbox run on a client with which users
interact directly and which connects to a cluster to access the compute services
offered by MATLAB Distributed Computing Server.
2. MATLAB Distributed Computing Server consists of workers that perform
computations on the cluster computers. The server and the scheduler are
installed on the cluster. You need to run as many server instances as the number of
MATLAB workers your parallel computations require.
3. Scheduler manages the interaction between client computers and the cluster.
MATLAB Distributed Computing Server comes with a basic scheduler, the
MathWorks job manager, which manages MATLAB jobs only. You
can also use third-party schedulers if you have advanced cluster management and
security needs. In this paper, we focus primarily on the MathWorks job manager.
4. License Manager is for managing software licenses.
Figure 1: Basic setup of MATLAB parallel computing tools.
§ Note that a user can employ MATLAB and Parallel Computing Toolbox in a desktop-only mode where
MATLAB takes over the role of scheduler and spawns up to four workers locally on the user’s desktop that are
tied to the user and the MATLAB session. Users can develop and test applications locally on their desktops in
this mode before scaling up to clusters using MATLAB Distributed Computing Server.
The MATLAB parallel computing tools enable both batch and interactive workflows.
In a batch workflow, a MATLAB user can submit a job to the cluster scheduler,
possibly shut down MATLAB, and retrieve results later, once the job has been
executed. In an interactive workflow, a MATLAB user is connected directly with the
MATLAB workers running on the cluster. The user sends commands that are executed
immediately, and results are available as soon as the command execution is complete.
The following pairs should be able to communicate with each other for the tools
to function:
1. Client MATLAB and the scheduler: Required for submitting jobs,
receiving results, and initiating interactive sessions
2. Workers and scheduler: Required for the scheduler to send code and data
received as jobs from the client MATLAB to the workers, and for receiving results
from individual workers
3. Client MATLAB and workers: Required for interactive sessions
4. License manager and all others: Depending on the setup, you may need to
configure one or more instances of license manager to manage licenses for MATLAB,
MATLAB workers, and optionally a third-party scheduler. Note that only certain types of
MATLAB and toolbox licenses require a license manager. However, MATLAB Distributed
Computing Server requires you to configure a license manager for serving worker keys.
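The pairs listed above can be restated as a small lookup that an administrator might use when planning firewall rules. This Python sketch only encodes the list as written; the component names and the function are illustrative, not part of any MathWorks tool.

```python
CLIENT, SCHEDULER, WORKER, LICENSE = "client", "scheduler", "worker", "license_manager"


def required_links(workflow, client_needs_license_manager=False):
    """Return the set of component pairs that must be able to communicate.

    workflow is "batch" or "interactive".
    """
    if workflow not in ("batch", "interactive"):
        raise ValueError(f"unknown workflow: {workflow!r}")
    links = {
        frozenset({CLIENT, SCHEDULER}),  # submit jobs, receive results
        frozenset({WORKER, SCHEDULER}),  # distribute code and data, collect results
        frozenset({WORKER, LICENSE}),    # worker keys always come from a license manager
    }
    if workflow == "interactive":
        links.add(frozenset({CLIENT, WORKER}))   # direct client-worker connection
    if client_needs_license_manager:
        links.add(frozenset({CLIENT, LICENSE}))  # only for certain license types
    return links


if __name__ == "__main__":
    for pair in sorted(sorted(p) for p in required_links("interactive")):
        print(" <-> ".join(pair))
```

The point to notice is that an interactive workflow adds a direct client-to-worker link that a batch-only setup does not need.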
Allowing such communication requires opening necessary firewall ports and installing
additional software, depending on how you choose to configure the tools.
The setup shown in Figure 2 mirrors a common cluster setup. The client MATLAB is
installed on users’ desktops while the server is installed on the Amazon EC2 machine
images. Users connect with the Amazon EC2 cluster the same way they would with a
regular cluster from their desktop computers with some additional setup requirements.
Figure 2: Client MATLAB and MATLAB Distributed Computing Server
on two computer networks, with client MATLAB on a user’s desktop.
Using Parallel MATLAB on Amazon EC2
Setup
For a MATLAB user to connect from a desktop installation of MATLAB to a
MATLAB Distributed Computing Server cluster running on Amazon EC2, your
system administrator needs to set up two things on the desktop:
1. VPN and/or SSH software with appropriate credentials:
This is required to connect with MATLAB Distributed Computing
Server, send computations, and retrieve results.
2. A Parallel Computing Toolbox configuration for your MATLAB installation:
The toolbox configurations are available from the Manage Configurations item
in the Parallel menu in MATLAB (which launches a Configurations Manager GUI).
Your system administrator may perform this setup manually by entering the
appropriate network setup properties. Alternatively, you may receive a MAT-file
containing Configuration properties. This can be imported into your MATLAB
installation using the Import Wizard available from the File menu in the
Configurations Manager.
Performing Parallel MATLAB Computations on Amazon EC2
The steps required for performing parallel MATLAB computations on Amazon
EC2 are similar to those for using the local workers provided by Parallel
Computing Toolbox or a regular cluster that runs MATLAB Distributed
Computing Server.
However, you need to execute two steps before you can start using the Amazon EC2
cluster for MATLAB computations:
1. Connect to Amazon EC2: You must first establish a connection with the
Amazon EC2 cluster using the VPN and/or SSH software your system
administrator has installed on your computer. Your system administrator will provide
instructions on how to do this. Once the connection has been established, you can
start a MATLAB session.
2. Configure the MATLAB session: Execute the pctconfig command from the
MATLAB command prompt to set the client’s hostname to the one assigned by the
VPN server running on Amazon EC2.
>> pctconfig('hostname', 'ip-10-8-0-4.ec2.internal')
The steps to determine this hostname on a Windows desktop are outlined below.
(Request assistance from your system administrator.)
a. Open a command console. (In the Start menu, click Run, enter cmd in the text
box, and press the Enter or Return key.) This opens a separate window.
b. Execute the command ipconfig /all from the command prompt.
c. In the command window output, locate the adapter whose Description field contains
the word VPN. Note the IP Address property for this adapter.
d. Using this IP address, execute the command nslookup ipaddress. Copy the
hostname listed in the Name field. Use this hostname as an argument to the
pctconfig command in MATLAB.
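Steps c and d amount to a reverse DNS lookup of the VPN-assigned address. The Python sketch below shows the same lookup using the standard library. The helper that formats a name such as ip-10-8-0-4.ec2.internal from an address is an assumption inferred from the example hostname above, useful only as a cross-check; the name returned by the lookup (or by nslookup) is the one to use. The address 10.8.0.4 is likewise illustrative.

```python
import ipaddress
import socket


def ec2_style_hostname(ip, domain="ec2.internal"):
    """Format an IPv4 address in the style of the example hostname.

    The naming pattern is inferred from the example in the text; treat the
    result as a guess to compare against the real lookup, not as a fact.
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    return "ip-" + str(addr).replace(".", "-") + "." + domain


def reverse_lookup(ip):
    """Return the DNS name for ip (what nslookup reports), or None if unresolved."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


if __name__ == "__main__":
    vpn_ip = "10.8.0.4"  # illustrative: the IP Address noted in step c
    name = reverse_lookup(vpn_ip)
    print("reverse lookup:", name)
    print("pattern guess: ", ec2_style_hostname(vpn_ip))
```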
Once you have completed these two steps you are ready to perform MATLAB
computations on your Amazon EC2 cluster. Note that because MATLAB
communicates with Amazon EC2 over the Internet, the response can be slow depending
on the Internet traffic and the amount of data you transmit back and forth
between your desktop and the Amazon services.
Some of the steps you can execute in MATLAB are shown below. For details, refer
to the Parallel Computing Toolbox documentation available with the product
installation as well as online at www.mathworks.com:
1. Select the EC2 Configuration as the default from the Parallel menu in
MATLAB. As mentioned above, your administrator might set up the
configuration or it might be supplied to you as a MAT-file for importing into your
MATLAB setup.
Figure 3: Choosing the appropriate Parallel Computing Toolbox configuration
for Amazon EC2.
2. You can also query the cluster status using the findResource command.
Figure 4: Using the findResource command to query cluster status.
3. You can use the matlabpool command to initiate an interactive session for
using parfor or any of the parallel routines available in Optimization
Toolbox™ and Genetic Algorithm and Direct Search Toolbox™.
Figure 5: An interactive parallel session. Note that response time may be slow
depending on Internet traffic and amount of data exchanged between your desktop and
Amazon EC2.
4. Additionally, you can use functions such as batch, createMatlabpoolJob,
createJob, or createParallelJob for sending MATLAB scripts and functions
for offline execution on the Amazon EC2 cluster.
Figure 6: Using the batch command to send a MATLAB script for execution on Amazon
EC2. The batch command automatically uses the previously selected configuration.
Managing Data
Because your desktop and the Amazon EC2 cluster reside on two completely
separate networks, there are certain differences you need to bear in mind while
managing data sets for your computations.
The file system on your desktop (and your organization) is not shared with the
computers on Amazon EC2. Thus, any MATLAB files you create on your desktop,
or data files that you use within your organization, are not visible to MATLAB
workers running on the Amazon EC2 cluster. As a result, both your code and your
data files must be transferred from your desktop to the Amazon EC2 cluster for
your computations. There are two alternatives:
1. Using Parallel Computing Toolbox configurations:
Parallel Computing Toolbox configurations let you set a FileDependencies
property. The files pointed to by this property are bundled together and
transported to the cluster at the beginning of an interactive session or a batch
submission.
For multiple projects, you can configure multiple copies of the Amazon EC2
Configuration that your administrator has supplied, each with different file and path
dependencies, and pass in the name of the appropriate configuration as needed.
Figure 7: The FileDependencies property lets you automatically transfer code and data
to workers.
2. Using Amazon services to host data and code:
For a large code base and large data sets, transmitting code and data every time
you use the Amazon EC2 cluster can be very time consuming. Discuss your
requirements with your system administrator if you have large data sets and
would like to host them on Amazon services. You can consider two options:
a. Hosting on Amazon Machine Images (AMIs): This option is similar to keeping
all your data and code on your desktop. Because multiple copies of AMIs
are launched when a cluster is set up on Amazon EC2, you need to copy your
data only once. Using the appropriate MATLAB path manipulation in your
MATLAB workers, you can make the code and data visible when you perform
computations.
This option may be particularly useful if the stored code base and data are
common among several users within your organization. Your administrator can
simply copy the appropriate files and provide appropriate path settings through
the PathDependencies property in a Parallel Computing Toolbox configuration
(see Figure 7).
Note, however, that you are limited by the amount of storage allowed for the
AMI instance that your administrator has signed up for. See Choosing an
Amazon EC2 AMI and Instance below.
b. Hosting using Amazon storage services: If you must store very large data sets, consider
using the Amazon Simple Storage Service (S3) and Amazon Elastic Block Store
(EBS). Retrieving data from Amazon S3 requires understanding the S3 API and
programming in languages other than MATLAB. Similarly, using Amazon
EBS requires additional system-level configuration. Contact your system administrator
for assistance.
Setting up Parallel MATLAB on Amazon EC2
As described in the basic setup section above, users perform parallel MATLAB
computations by sending computations to MATLAB Distributed Computing
Server workers running on a cluster. With Amazon EC2 the cluster is a set of virtual
machine instances (instances of Amazon Machine Images or AMIs, described
later in this section).
There are two ways to make an installation of MATLAB Distributed Computing
Server available to these instances. The first approach is to install the server on
the machine images directly. The server workers can be configured to be launched
when instances of these machine images are launched.
The second approach is to use the Amazon Elastic Block Store (EBS) service, in
which you can configure a snapshot of an EBS volume to hold the server product
installation. Launching a server worker then requires that you launch an instance of
an AMI, launch an EBS volume from a previously configured snapshot, connect
the two together, and then launch worker processes. This process is repeated
for each worker launch. EBS volumes cannot be shared between virtual machine
instances, so a separate volume needs to be launched for each machine instance.
For more information on EBS, visit http://aws.amazon.com/ebs
The first approach requires fewer steps for configuring machine images, maintaining
them, and launching a cluster, so it is the simpler option. The second approach
introduces a few additional steps and is slightly more complex, but
it can provide other efficiencies, such as faster cluster launch on
Amazon EC2. (Because machine images are smaller without the server product
installation, it is faster to launch a virtual machine instance.)
For both of these approaches, there are seven key steps in setting up MATLAB
parallel computing tools for use with the Amazon cloud computing service. We
describe these steps only for the first approach in the sections that follow.
1. Set up the desktop computer through which you will connect with Amazon
services.
2. Set up the basic compute environment on the Amazon services.
3. Set up MATLAB Distributed Computing Server to run on the Amazon cluster.
4. Set up the scheduler.
5. Set up your network.
6. Set up client MATLAB with Parallel Computing Toolbox and other toolboxes
on the MATLAB users’ desktop computers.
7. Set up the license manager.
Setting up Your Desktop Computer
Some operations related to configuring and running systems on Amazon EC2
require an SSH connection from your desktop computer to the running system.
Most Linux- or UNIX-based systems come preconfigured with an SSH client.
Windows users can use PuTTY, a free SSH client. For detailed instructions on setting
up these tools for use with Amazon EC2, see the Getting Started guide on the
Amazon EC2 website. You will also need the command-line utilities that Amazon
provides (which require Java) for configuring the Amazon services.
In addition to having an SSH client, we recommend setting up a desktop-sharing
client on your machine, such as VNC (and a corresponding server on your
Amazon EC2 machine image). This lets you access graphics-based utilities and
applications installed on the Amazon EC2 machine image.
We also recommend installing a Firefox browser plug-in (ElasticFox) that enables
you to perform operations such as launching and closing systems by simple button
clicks. This is available for download from the following URL:
http://developer.amazonwebservices.com/connect/entry.jspa?externalID=609
Figure 8: ElasticFox, a browser plug-in for common Amazon EC2 related tasks.
Setting up a Basic Compute Environment on Amazon Web Services
To use the MATLAB parallel computing tools on Amazon EC2, you must sign up for
two Amazon services. The first, Simple Storage Service (S3), lets you store your data
as well as system images that you configure as a part of the Amazon EC2 service.
For our purposes, the Amazon S3 service does not require any special configuration
steps. The Amazon EC2 service provides you with the computing infrastructure.
This service lets you configure and launch instances of Amazon Machine Images, or
AMIs, on which you can run your applications. The AMIs are in effect compressed
encrypted copies of an entire operating system (Amazon EC2 currently supports
Linux-based systems) including any application or service you have installed.
Once you sign up for these services, you can begin accessing and configuring
them using the Amazon utilities you have previously downloaded and configured
on your desktop computer. After configuring an AMI, you must store your customized
AMI in the Amazon S3 service. Amazon provides utilities to register
this AMI so you can retrieve and launch saved AMIs later. For details, review the
documentation available on the Amazon website:
http://developer.amazonwebservices.com/connect/kbcategory.jspa?categoryID=87
Choosing an Amazon EC2 AMI and Instance
The first step in using Amazon EC2 is to create and configure an AMI. Amazon
enables you to build your own AMIs from scratch by bundling and uploading a
custom operating system installation from your own computers. A simpler way
is to modify and extend template AMIs that Amazon provides. Once you have a
base AMI chosen or set up, you can install and configure applications and services
based on your requirements.
The Amazon AMIs are based on Linux®. MathWorks products are supported on
the following Linux distributions:

32-bit products: Red Hat® Enterprise Linux v.4 and above, Fedora™ Core 4 and
above, Debian® 4.0 and above
64-bit products: Debian 4.0 and above, OpenSuSE 1.0 and above
Other distributions not listed above must be built using Kernel 2.4.x or 2.6.x, and
glibc (glibc6) 2.3.4 and above.
To configure an AMI, you first need to launch an instance of the AMI. Amazon
distinguishes among instances on the basis of the size and memory requirements
of the AMI. The instance you choose must satisfy two key requirements:
1. General System Requirements: Amazon AMIs are based on Linux, so you
must meet the system requirements listed at the following URL:
www.mathworks.com/support/sysreq/current_release/linux.html
2. System Requirements for MATLAB Distributed Computing Server:
Choose an instance that matches your disk space and RAM requirements.
MATLAB Distributed Computing Server requires approximately 3.5GB disk space
(equivalent to installing the entire suite of MathWorks products). Also, a server
worker is a headless MATLAB process and consumes roughly the same
system resources as a MATLAB session (1GB RAM is recommended for 32-bit MATLAB).
For the purposes of configuring the AMIs, you can choose from among the
smaller instances. Choose carefully between a 32-bit and a 64-bit instance. Once
MATLAB Distributed Computing Server is configured on a 32-bit AMI, it will still
function in 32-bit mode even when this AMI is launched as a 64-bit instance. A
64-bit server will simply not work on a 32-bit instance.
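The requirements above can be collected into a simple pre-flight check. In this Python sketch the Instance fields and the function are hypothetical, and applying the 1GB-per-worker figure to every worker (including 64-bit workers) is an assumption that goes beyond what the text states.

```python
from dataclasses import dataclass

MDCS_DISK_GB = 3.5   # approximate installation size quoted in the text
WORKER_RAM_GB = 1.0  # recommended for 32-bit MATLAB; assumed here per worker


@dataclass
class Instance:
    """Hypothetical description of an Amazon EC2 instance type."""
    bits: int       # 32 or 64
    disk_gb: float
    ram_gb: float


def check_instance(inst, server_bits, workers=1, data_gb=0.0):
    """Return a list of problems; an empty list means the instance qualifies."""
    problems = []
    if server_bits == 64 and inst.bits == 32:
        problems.append("a 64-bit server will not run on a 32-bit instance")
    if inst.disk_gb < MDCS_DISK_GB + data_gb:
        problems.append("not enough disk space for the server installation and data")
    if inst.ram_gb < WORKER_RAM_GB * workers:
        problems.append("not enough RAM for the requested number of workers")
    return problems


if __name__ == "__main__":
    small = Instance(bits=32, disk_gb=10.0, ram_gb=1.7)  # illustrative figures
    print(check_instance(small, server_bits=32, workers=1))
    print(check_instance(small, server_bits=64, workers=2))
```

A 32-bit server on a 64-bit instance passes the check, matching the behavior described above.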
Once you have selected and launched an AMI instance using Amazon’s command-line
utilities, you can connect to it using an SSH client and begin configuring it
for use. We recommend that you install a desktop-sharing server (e.g., VNC) on
the AMI. This will let you access the graphics-based utilities and applications on
the AMI from your desktop computer. We also recommend that once you have
performed the basic configuration steps on the template AMI, you save a copy as a
base AMI before further customizing it with MATLAB and other software.
Setting up MATLAB Distributed Computing Server
Configuring an AMI with MATLAB Distributed Computing Server
Once you have created a basic AMI and have a running instance, it is time to configure
it with MATLAB Distributed Computing Server. The MATLAB Distributed
Computing Server installation process requires the availability of the MATLAB
Distributed Computing Server installation package on the running AMI instance.
There are two ways to achieve this:
1. Bundle and upload the installer from your desktop. You can copy required
files from the MathWorks installation DVD or download the files from your
MathWorks account. If you use this option, you must use the Amazon
command-line utilities to bundle and upload the installation files to the base AMI.
2. Download the installer directly to a running instance using a Web browser.
Again, this requires logging in to your MathWorks account and downloading the
appropriate installers for MATLAB Distributed Computing Server.
Once the installer is available on the instance, you can launch the MATLAB
installer and follow the installation process.
Figure 9: Installing MATLAB Distributed Computing Server on an
Amazon EC2 AMI.
We recommend that you save this instance as a separate AMI. Once you have configured
and saved this AMI, you can launch several instances of the AMI to serve as workers
for your parallel MATLAB computations. For the remainder of this document, we
use the designation Worker-AMI to refer both to a saved AMI configured with
MATLAB Distributed Computing Server and to a running instance of this AMI.
Configuring MATLAB Distributed Computing Server Launch Mechanics
When using the MathWorks job manager, all server workers must register with the job
manager before computations are sent to them. The registration happens automatically
as a part of the worker launch process. This means that you must first launch an AMI
instance that starts the MathWorks job manager processes and supply the host URL or
IP address of this instance to the worker launch processes. There are three ways to
launch MATLAB Distributed Computing Server workers using the Worker-AMI:
1. Manually log in to a running Worker-AMI instance and launch the MATLAB
Distributed Computing Server worker processes using the appropriate launch
scripts that come with the MATLAB Distributed Computing Server installation.
2. Amazon EC2 lets you pass in arbitrary data to an instance on startup. For
example, using the ElasticFox Firefox browser plug-in, you can pass in arbitrary
text to an instance that can later be read by the instance boot scripts.
Therefore, it’s possible to configure your MDCS setup through instance data.
There are a number of possible ways to do this:
a. You can pass in a shell script and eval the shell script at instance startup. In
this shell script, you can call any UNIX® command, including the commands
used for starting server workers.
b. You can pass in structured data (e.g., XML) and have some tool read this
XML on instance startup.
Figure 10: Launching an instance of an Amazon EC2 AMI using ElasticFox.
3. You can hard-code the startup of the server workers using the traditional Linux
boot script techniques described in the server documentation. However, this
ties the startup sequence to an Amazon image, which means all instances
launched from this image will perform exactly the same startup steps. There is
no way to parameterize the startup.
No special settings are required for launching server workers on Worker-AMI
instances when using third-party schedulers. The server workers are launched on
already running Worker-AMI instances.
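As a rough illustration of option 2a, the following shell sketch could be passed as instance data and evaluated at boot. It only builds and prints the launch commands unless RUN=1 is set. The installation path, job manager name, and worker name are assumptions made for this example; the mdce and startworker scripts and their options should be checked against the launch scripts shipped with your MATLAB Distributed Computing Server installation.

```shell
#!/bin/sh
# Sketch of a boot-time script passed as EC2 instance data.
# Assumptions (not from this document): the MATLAB_ROOT location, the job
# manager name, and the worker name. Adjust them to your installation.

MATLAB_ROOT="${MATLAB_ROOT:-/usr/local/matlab}"
MDCS_BIN="$MATLAB_ROOT/toolbox/distcomp/bin"

# Print the commands that start the mdce service and register one worker
# with the job manager running on host $1.
worker_launch_commands() {
    jm_host="$1"
    jm_name="${2:-MyJobManager}"
    worker_name="${3:-worker1}"
    printf '%s\n' "$MDCS_BIN/mdce start"
    printf '%s\n' "$MDCS_BIN/startworker -jobmanagerhost $jm_host -jobmanager $jm_name -name $worker_name"
}

# Dry run by default; set RUN=1 in the instance data to execute the commands.
launch_worker() {
    worker_launch_commands "$@" | while IFS= read -r cmd; do
        if [ "${RUN:-0}" = "1" ]; then
            sh -c "$cmd"
        else
            echo "would run: $cmd"
        fi
    done
}

# Example: the job manager host is given as an address reachable by the worker.
launch_worker 10.9.0.1
```

Because the job manager address is an argument rather than part of the image, the same Worker-AMI can be pointed at different Scheduler-AMI instances, which is the parameterization that option 3 lacks.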
Setting up a Scheduler
MathWorks Job Manager
The MathWorks job manager is a simple scheduler that ships with MATLAB
Distributed Computing Server. It can manage only MATLAB jobs and it processes
them in first-in, first-out (FIFO) order.
The job manager is available with the installation of MATLAB Distributed
Computing Server. This means that you can redesignate a Worker-AMI as the
Scheduler-AMI and modify the appropriate launch scripts to start the job manager
processes in place of the server worker processes.
The job manager operates in a service-oriented architecture (SOA) fashion. It
requires the server workers to already be running before users submit computations
to it. Moreover, the server workers remain alive between jobs unless they fail or
are explicitly shut down. Thus, both the Worker-AMIs and the server workers must
remain alive. As we noted before, the server workers must know the location of the
job manager host to register themselves with the job manager before they start
receiving computations. This means that the typical launch sequence of the cluster
begins with the launch of the Scheduler-AMI (which starts up the job manager)
followed by Worker-AMI launches.
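Because the Scheduler-AMI must be up before workers can register, a worker boot script may need to wait for it. The helper below is a minimal sketch of that idea: it retries an arbitrary probe command a fixed number of times. The function name and the probe shown in the usage comment are illustrative assumptions, not part of the server installation.

```shell
#!/bin/sh
# retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS tries have been made.
# Returns 0 on the first success, 1 if every attempt failed.
retry_until() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only if another attempt will follow.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example use in a worker boot script (hypothetical probe: one ping to the
# address of the Scheduler-AMI), followed by the worker launch:
#   retry_until 30 10 ping -c 1 10.9.0.1 && echo "scheduler reachable"
```

A reachability probe only shows that the scheduler host is up; it does not prove that the job manager process has finished starting, so a real script might probe something more specific.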
A key requirement when using the server with the job manager is that the worker
processes be able to identify each other by hostname, and that the hostname
with which a computer identifies itself be the same as the hostname with which
a computer is visible to others. Because AMI instances are virtualized, calling the
hostname command returns the hostname of the virtualized server and not the
underlying actual computer. As a workaround, you can remove the etc/hostname
script on the Worker-AMI. Review the network requirements for using the server
with the MathWorks job manager:
www.mathworks.com/products/distriben/requirements.html
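The hostname requirement above can be checked with a small comparison before starting the worker processes. This is only a sketch: the function names are made up for this example, and how you obtain the name that other machines use for an instance (DNS, /etc/hosts, or the VPN address) depends on your setup.

```shell
#!/bin/sh
# normalize_host NAME: lowercase NAME and strip one trailing dot, so that
# "Host.Example." and "host.example" compare as equal.
normalize_host() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//'
}

# names_agree SELF_NAME PEER_NAME: succeed when the name a machine reports
# for itself matches the name under which peers see it.
names_agree() {
    [ "$(normalize_host "$1")" = "$(normalize_host "$2")" ]
}

# Example on an instance, where peer_name holds the name other nodes use:
#   names_agree "$(hostname)" "$peer_name" || echo "hostname mismatch" >&2
```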
Third-Party Schedulers
Third-party schedulers provide additional scheduling and security features
compared to the MathWorks job manager. These schedulers also provide the option of
managing additional applications on your cluster. Installing the third-party
schedulers is similar to the process of installing MATLAB Distributed Computing Server.
With third-party schedulers, the server workers are launched as any other
application at the beginning of a job and are shut down when the job is complete. Each
scheduler has its own mechanisms to maintain the list of cluster nodes and to
designate a head node and worker nodes. Once these instances are brought up
and are connected, the scheduler can launch and shut down server workers as any
other application.
The direct integration of schedulers such as Platform LSF®, Microsoft Windows®
Compute Cluster Server, PBS Pro®, and TORQUE with MATLAB parallel computing
tools assumes the availability of a shared file system between the client
computer and the cluster computers. This assumption is not valid when client
MATLAB and MATLAB Distributed Computing Server workers are on completely
separate networks.
As a result, you must use the API for the generic scheduler interface to customize
job submission and retrieval mechanisms for all third-party schedulers. Detailed
examples are available with the MATLAB Distributed Computing Server installation.
Customized scripts must reside on both the MATLAB users’ desktop computers and
the Worker-AMI. For details on implementing these, review Using Generic
Scheduler Interface in the Programming Distributed Jobs and Programming Parallel
Jobs sections in the Parallel Computing Toolbox documentation.
Network Setup
Because the desktop computer, the scheduler, and the server workers reside on
different networks, additional network setup is required on both the desktop client
and the Amazon instances. As noted before, the three should be able to
communicate with each other for job submission (desktop and scheduler, scheduler and
workers), results retrieval (desktop and scheduler, scheduler and workers) and
interactive parallel sessions (desktop and workers).
This section describes setting up a cluster managed by the MathWorks job
manager. Third-party schedulers will likely be able to use the same setup, but we
encourage you to contact the scheduler vendors for more information.
Network Setup for Using MathWorks Job Manager
Using virtual private networks (VPNs) is one of several ways to set up
communication between the different software components. We successfully experimented
with OpenVPN, an open source VPN program. In fact, we must establish two VPNs.
The first VPN is required between the clients within your organization (i.e.,
users’ desktops and the license manager) and Scheduler-AMI (which runs the job
manager) to enable users to find the job manager and to submit jobs and retrieve
results from the job manager.
A second VPN is required between the Worker-AMIs (which run the MATLAB
Distributed Computing Server workers) and the Scheduler-AMI. The two VPNs
must be configured to push out their virtual networks to each other. This setup is
required for workers to establish interactive connections with the client MATLAB
running on users’ desktops as well as to reach the license manager for checking
out worker keys.
There are eight steps required to establish communication between the software
running in your organization and on Amazon EC2:
1) Set up the first VPN server on Scheduler-AMI.
2) Set up the VPN clients for this server on users’ desktops and the license server.
3) Set up the license server.
4) Set up the second VPN server on the Scheduler-AMI.
5) Set up the VPN client for this server on Worker-AMIs.
6) Set up the job manager launch configuration file.
7) Set up the worker launch configuration file.
8) Configure the routing table on Scheduler-AMI.
1) Setting up the first VPN server on Scheduler-AMI:
1. Create an SSL root certificate (ca), server certificate (cert), and private key (key)
as well as a client certificate and a private key.
2. Generate the Diffie-Hellman parameters.
3. Create a configuration file with entries for:
a. Locations for the certificates and keys generated in step 1
b. Device type (set as TAP)
c. Server protocol (set as TCP)
d. (Optional) Base IP (or subnet) (default 10.8.0.0)
e. (Optional) Port on which you want the VPN server to listen (default
1194). You will have to open this port on your firewall.
A configuration file may have entries as follows:
    port 1194
    proto tcp
    dev tap
    ca /etc/openvpn/keys/ca.crt
    cert /etc/openvpn/keys/externalserver.crt
    key /etc/openvpn/keys/externalserver.key # This file should be kept secret
    dh /etc/openvpn/keys/dh1024.pem
    server 10.8.0.0 255.255.255.0
    ifconfig-pool-persist externalipp.txt
    push "route 10.9.0.0 255.255.255.0"
    duplicate-cn
    keepalive 10 120
    comp-lzo
    persist-key
    persist-tun
    status external-openvpn-status.log
    verb 3
4. Note that you will need to add a push entry to the configuration file for the
second VPN subnet for using interactive capabilities. See the steps below.
5. Once you have set up the configuration file, you can start the VPN server and
pass the location of the configuration file you created in step 3 as an input.
2) Setting up the VPN client (first VPN server) on a user’s desktop and license server:
1. Copy the root certificate (ca), client certificate (cert), and the client key (key)
that you created in step 1 above to the user’s desktop.
2. Create a configuration file with entries for:
a. Locations for the certificates and the key obtained in step 1
b. Location of the server
c. Device type and server protocol (the same as for the VPN server)
d. An indication that the VPN client must keep trying indefinitely to resolve the
hostname of the VPN server
Entries in the configuration file may resemble the following example (on a
Windows client). Note the double backslash (‘\\’) in the path names. Double
quotes are necessary if there are spaces in the path name.
    client
    dev tap
    ca C:\\MyCertificates\\ca.crt
    cert C:\\MyCertificates\\client.crt
    key C:\\MyCertificates\\client.key
    proto tcp
    remote ec2-75-101-227-140.compute-1.amazonaws.com 1194
    resolv-retry infinite
    nobind
    persist-key
    persist-tun
    ns-cert-type server
    comp-lzo
    verb 3
3. Once you have set up the configuration file, you can start the VPN client and
pass the location of the configuration file you created in step 2 as an input.
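Since the client must use the same device type and protocol as the server, a quick comparison of the two configuration files can catch a mismatch before you start the client. The following is a minimal sketch; the function names and file names are assumptions for illustration, and it only compares the proto and dev entries.

```shell
#!/bin/sh
# config_value FILE KEY: print the first argument of directive KEY in an
# OpenVPN-style configuration file. Lines starting with '#' never match,
# and a trailing carriage return (Windows line endings) is removed.
config_value() {
    awk -v key="$2" '$1 == key { sub(/\r$/, "", $2); print $2; exit }' "$1"
}

# vpn_configs_agree SERVER_CONF CLIENT_CONF: succeed only if both files
# specify the same protocol and the same device type.
vpn_configs_agree() {
    for key in proto dev; do
        s=$(config_value "$1" "$key")
        c=$(config_value "$2" "$key")
        if [ -z "$s" ] || [ "$s" != "$c" ]; then
            echo "mismatch for '$key': server='$s' client='$c'" >&2
            return 1
        fi
    done
    return 0
}

# Example (hypothetical file names), followed by starting the client:
#   vpn_configs_agree server.conf client.conf && openvpn --config client.conf
```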
Figure 11: Network setup requires establishing two virtual private networks (VPNs).
3) Setting up the license server:
1. Set up a VPN client on the license server as described in (2) above.
2. Enable IP forwarding so that incoming network traffic from the virtual
network interface can reach the real network interface (run
echo 1 > /proc/sys/net/ipv4/ip_forward, or add net.ipv4.ip_forward=1 to
/etc/sysctl.conf and run the command sysctl -p).
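The echo command above lasts only until the next reboot, while the sysctl.conf entry persists. The helper below sketches one way to add that entry without duplicating it on repeated runs. The function name is an assumption for this example, and the file is passed as a parameter so the helper can be tried on a copy first.

```shell
#!/bin/sh
# ensure_setting FILE KEY VALUE: make sure FILE ends up with exactly one
# active "KEY=VALUE" line. Any existing uncommented assignment of KEY is
# dropped and the new one is appended; all other lines are kept.
ensure_setting() {
    file="$1"
    key="$2"
    value="$3"
    [ -f "$file" ] || : > "$file"
    tmp=$(mktemp) || return 1
    # Escape dots so that the key is matched literally by grep.
    key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
    grep -v "^[[:space:]]*$key_re[[:space:]]*=" "$file" > "$tmp" || true
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example on a Linux host, run as root:
#   ensure_setting /etc/sysctl.conf net.ipv4.ip_forward 1 && sysctl -p
```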
4) Setting up the second VPN server on Scheduler-AMI:
1. Enable IP forwarding on the Scheduler-AMI
(echo 1 > /proc/sys/net/ipv4/ip_forward).
2. Create a configuration file with entries for:
a. The same certificate and key locations, device type, and protocol type as for
the first VPN server
b. A base IP (subnet) that is different from that of the first VPN server (e.g.,
10.9.0.0)
c. The port on which you want the VPN server to listen. This should be different
from the one you specified for the first VPN server (e.g., 1195).
d. A ‘push’ entry for the first VPN server’s subnet
A configuration file may have entries as follows:
    server 10.9.0.0 255.255.255.0
    port 1195
    proto tcp
    dev tap
    push "route 10.8.0.0 255.255.255.0"
    ca /etc/openvpn/keys/ca.crt
    cert /etc/openvpn/keys/server.crt
    key /etc/openvpn/keys/server.key # This file should be kept secret
    dh /etc/openvpn/keys/dh1024.pem
    ifconfig-pool-persist internalipp.txt
    client-to-client
    duplicate-cn
    keepalive 10 120
    comp-lzo
    persist-key
    persist-tun
    status internal-openvpn-status.log
    verb 3
3. Add a push entry for this VPN server’s subnet to the first VPN server’s
configuration file. Restart the first VPN server if it is already running.
4. Start the VPN server by specifying the configuration file you created in step 2 as
the input.
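The two server configurations differ only in subnet, port, certificate name, and the subnet each one pushes to its clients. The sketch below generates a trimmed version of both from one template to make that symmetry explicit. The function name is an assumption, and entries such as ifconfig-pool-persist, duplicate-cn, client-to-client, and status are left out for brevity.

```shell
#!/bin/sh
# vpn_server_config SUBNET PORT PEER_SUBNET CERT_NAME
# Prints a reduced OpenVPN server configuration: the server owns SUBNET,
# listens on PORT, and pushes a route for PEER_SUBNET to its clients.
vpn_server_config() {
    subnet="$1"
    port="$2"
    peer_subnet="$3"
    cert_name="$4"
    keys=/etc/openvpn/keys
    cat <<EOF
port $port
proto tcp
dev tap
ca $keys/ca.crt
cert $keys/$cert_name.crt
key $keys/$cert_name.key
dh $keys/dh1024.pem
server $subnet 255.255.255.0
push "route $peer_subnet 255.255.255.0"
keepalive 10 120
comp-lzo
persist-key
persist-tun
verb 3
EOF
}

# Example: first VPN (desktops and license server), then second VPN (workers).
#   vpn_server_config 10.8.0.0 1194 10.9.0.0 externalserver > external.conf
#   vpn_server_config 10.9.0.0 1195 10.8.0.0 server > internal.conf
```

Each server pushes the other server's subnet, which is what lets workers on the second VPN reach desktops and the license server on the first.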
5) Setting up the VPN Client (second VPN server) on Worker-AMI:
1. Copy the client certificates and key that you created for the desktop client above
on to the Worker-AMI.
2. Create a configuration file similar to the configuration file you created for the
desktop client above with the address and port for the second VPN server.
3. Add a ‘push’ rule that redirects traffic for the license server to the
Scheduler-AMI, which can then route it to the license server.
A configuration file may have entries as follows:
    client
    dev tap
    proto tcp
    remote domU-12-31-38-00-25-71.compute-1.internal 1195
    push "route 172.31.45.197 255.255.255.255" # 172.31.45.197 -> license server IP
    resolv-retry infinite
    nobind
    persist-key
    persist-tun
    ca /etc/openvpn/keys/ca.crt
    cert /etc/openvpn/keys/client.crt
    key /etc/openvpn/keys/client.key
    ns-cert-type server
    comp-lzo
    verb 3
6) Setting up the job manager launch configuration file:
1. In the MDCE_DEF.sh file, set the variables JOB_MANAGER_HOST and HOST_NAME
to the VPN IP address of the Scheduler-AMI. You can simply use the IP
address (e.g., 10.9.0.1) of the second VPN server for the two variables because
the Scheduler-AMI will always serve as the VPN server. Note that you need
to use the IP address of the second VPN server since the Worker-AMIs
connect to the job manager AMI over the second VPN (subnet 10.9.0.0).
7) Setting up the MATLAB worker launch configuration file:
1. In the MDCE_DEF.sh file on the Worker-AMI, you must set HOST_NAME to
the Worker-AMI instance’s VPN IP address. You will need to create a separate
script to automate this because the VPN IP address for each instance will be
different.
8) Configure the routing table on the Scheduler-AMI:
1. Determine the IP address assigned to the license server by the first VPN
server (e.g., 10.8.0.10) as well as the real IP address of the license server (e.g.,
172.31.45.197).
2. Add a rule to the routing table on Scheduler-AMI to forward the traffic
directed to 172.31.45.197 to the virtual IP 10.8.0.10 using the following
command:
    route add -net 172.31.45.197 netmask 255.255.255.255 dev tap0 gw 10.8.0.10
3. Every time you restart the Scheduler-AMI, the license manager will be
assigned a new IP address. As a result, you must repeat the steps above for each
Scheduler-AMI restart.
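Steps 7 and 8 both need values that change per instance or per restart, so they lend themselves to small helper functions. The sketch below is one possible shape, under several assumptions: the VPN interface is tap0, the address is read from `ip -4 -o addr show` output, and MDCE_DEF.sh holds plain NAME=value lines (possibly commented out). None of these details come from the server documentation, so verify them against your installation.

```shell
#!/bin/sh
# first_ipv4: read `ip -4 -o addr show dev tap0`-style output on stdin and
# print the first IPv4 address without its prefix length.
first_ipv4() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "inet") { split($(i + 1), a, "/"); print a[1]; exit } }'
}

# set_def_var FILE NAME VALUE: replace an existing (possibly commented-out)
# NAME=... line in FILE, or append one if no such line exists.
set_def_var() {
    file="$1"
    name="$2"
    value="$3"
    if grep -q "^#*[[:space:]]*$name=" "$file"; then
        tmp=$(mktemp) || return 1
        sed "s|^#*[[:space:]]*$name=.*|$name=\"$value\"|" "$file" > "$tmp" && cat "$tmp" > "$file"
        status=$?
        rm -f "$tmp"
        return "$status"
    fi
    printf '%s="%s"\n' "$name" "$value" >> "$file"
}

# license_route_command REAL_IP VPN_IP [DEVICE]: print the step 8 route
# command for the current license server addresses.
license_route_command() {
    echo "route add -net $1 netmask 255.255.255.255 dev ${3:-tap0} gw $2"
}

# Example on a Worker-AMI (the path to MDCE_DEF.sh is hypothetical):
#   vpn_ip=$(ip -4 -o addr show dev tap0 | first_ipv4)
#   set_def_var /path/to/MDCE_DEF.sh HOST_NAME "$vpn_ip"
# Example on the Scheduler-AMI after a restart:
#   $(license_route_command 172.31.45.197 10.8.0.10)
```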
Network Setup for Using Third-Party Schedulers

A setup similar to the one described above may work for third-party schedulers,
provided the schedulers can be configured to run in a VPN environment. Consult the
scheduler vendors to discover appropriate mechanisms for enabling a MATLAB client
to connect to the scheduler and the MATLAB workers to connect back to the MATLAB client.
Setting up the MATLAB Client on a User’s Desktop
As we noted before, a MATLAB user’s interaction with an Amazon EC2 cluster
mirrors (with minor changes) their typical interaction with a regular cluster.
Encourage your users to review the concerns described earlier in this document.
For example, users may see performance deterioration because of virtualized
compute resources and because the interaction with Amazon EC2 happens
over the Internet. Additionally, users must be aware of security and intellectual
property concerns with transmitting data over the Internet and hosting it on an
external service.
As a system administrator, you will need to set up Parallel Computing Toolbox
configurations to enable your users to use the Amazon EC2 cluster. Your choice of a
scheduler will decide how you set up these configurations. You can export these
configurations in the form of MAT-files, which can be distributed to several
users who can then import them into their MATLAB sessions. Users can make
multiple copies of these configurations and customize them for their project-
specific settings. For your users, switching between the schedulers and clusters
requires changing only the configuration name. Detailed information on Parallel
Computing Toolbox configurations is available (see reference 8).
Figure 12: Parallel Computing Toolbox configurations.
Setup for Using MathWorks Job Manager

The Parallel Computing Toolbox configuration for the MathWorks job manager
requires that you supply the hostname of the Scheduler-AMI that runs a copy of the
job manager. You must supply the internal hostname as assigned by the VPN server
(e.g., ip-10-9-0-1.ec2.internal for a server that is assigned the 10.9.0.1 address) as the
job manager hostname (LookupURL field in the configuration properties).
In addition, you must install and configure any software (such as VPN or SSH)
that you require to enable MATLAB users to connect to the Scheduler-AMI, to
submit jobs, and to receive results. If you are using the setup described in the
Network Setup section, configure and launch a VPN client as described.
Setup for Using Third-Party Schedulers

As noted above, the direct integration of various third-party schedulers requires
a shared file system between users’ desktop computers and cluster computers. This
requirement is not met when the users’ computers and the cluster computers reside
on completely different networks. Therefore, you will need to set up custom
scripts for users to be able to connect to the Amazon EC2 cluster. Customized
scripts must reside on both the MATLAB users’ desktop computers and
the Worker-AMI. For details on implementing these, review Using
Generic Scheduler Interface in the Programming Distributed Jobs and Programming
Parallel Jobs sections in the Parallel Computing Toolbox documentation.
In addition, you must install any software (such as VPN or SSH) that you require
for MATLAB users to connect to the Scheduler-AMI, to submit jobs, and to
receive results (see the Network Setup section).
Licensing
We strongly recommend reviewing The MathWorks Software License Agreement
(SLA) and consulting The MathWorks regarding license usage prior to setting
up MATLAB and parallel computing products on the Amazon EC2 service. In
particular, review the “MATLAB Distributed Computing Server” section in the
SLA. Your current MATLAB Distributed Computing Server licenses can be used
on cloud computing services. There are certain usage restrictions, however, that
you must consider. Contact your MathWorks sales representative to discuss your
requirements.
License Management on Amazon EC2
You can use your organization’s current license management infrastructure to serve
MATLAB Distributed Computing Server licenses for workers running on Amazon
EC2. To achieve this, the Worker-AMIs must be able to communicate with the license
server that resides within your organization’s network. The advantage of this setup is
that you do not need to obtain separate licenses or manage multiple license servers.
One of the possible solutions is outlined in the Network Setup section above.
It is possible to establish a separate License-AMI on Amazon EC2 for running
the Macrovision FLEXlm® license management software (which serves MathWorks
licenses). However, the license manager needs to bind to a specific local IP or
MAC address to let MATLAB Distributed Computing Server workers find the
manager and check out the appropriate number of keys. Note that a “static IP
address” offered by Amazon services is an external IP address and therefore
cannot be used by the license manager, which needs the internal address of the
physical machine on which it runs. Because an AMI instance can be launched
anywhere on the Amazon network, the IP or MAC address changes
with each launch. This change will cause the license checkout process to fail,
which in turn will cause user-submitted batch jobs and interactive sessions to fail.
Therefore, you will need to keep the License-AMI running all the time. If for some
reason the License-AMI fails or is shut down, you must manually redesignate a
new instance for managing the licenses. However, your Amazon EC2 cluster will
remain unusable until the redesignation process is complete. Note that the number of
license server redesignations is limited to four per year, after which you will need
to contact The MathWorks for each redesignation.
The MathWorks Support Services
Setting up MATLAB parallel computing products, tools, and services to run
on Amazon EC2 requires advanced installation maneuvers. The MathWorks
Consulting Group is available to support these activities. Consult your sales
representative or contact the Consulting Group directly:
www.mathworks.com/services/consulting
References
1. Amazon Elastic Compute Cloud: http://aws.amazon.com/ec2
2. Amazon Simple Storage Service: http://aws.amazon.com/s3
3. Amazon Elastic Block Store: http://aws.amazon.com/ebs
4. Amazon Elastic Compute Cloud: Getting Started: http://docs.amazonwebservices.com/AWSEC2/2008-05-05/GettingStartedGuide
5. User Guide: Parallel Computing Toolbox, The MathWorks
6. User Guide: MATLAB Distributed Computing Server, The MathWorks
7. Installation Guide: MATLAB Distributed Computing Server: www.mathworks.com/distconfig
8. Programming with User Configurations, Parallel Computing Toolbox User Guide, The MathWorks
9. Documentation, OpenVPN: www.openvpn.net/index.php/documentation/howto.html
© 2008 The MathWorks, Inc. MATLAB and Simulink are registered trademarks of The MathWorks, Inc. See www.mathworks.com/trademarks for a list of additional trademarks.
Other product or brand names may be trademarks or registered trademarks of their respective holders.