Link to Globus Tutorial Paper - Paolo Repole

Dec 16, 2012 (4 years and 7 days ago)

Globus Implementation

Paolo Repole
George Frank

Project Requirements:


The goal of this project was to set up a grid using assigned software. George and I
were assigned an architecture designed by Globus, which provides the Globus Toolkit
4.0.5 for “easy” installation. The intended end result was a working grid architecture
on which we could demonstrate a small piece of basic functionality. To meet this goal
we had the following requirements:

Set up and prepare a machine for installation of the Globus Toolkit: To meet this
requirement we acquired a machine, a Dell Dimension Pentium 4 with Fedora Core 4 and
adequate memory and CPU power. We also created several users to aid in the
installation and testing of the Globus Toolkit. These users were globus, griduser,
and a personal account we used with sudo access.

Gather and install all necessary software (excluding Globus): Although the Globus
Toolkit does provide a framework for gridded applications, it is reliant on many other
pieces of software. It is necessary to make sure that the software Globus relies on is
of a compatible version. In an effort to reduce problems during the installation
process we tried to install versions of software that were close to or equal to a
version known to work with the current Globus Toolkit. A chart in the Gathered
Software section below shows the version numbers of all the software we installed
alongside the expected version numbers.

Install and configure the Globus Toolkit: This is the most time-consuming requirement.
Although you may think a prepackaged toolkit of software would be easy to install, the
Globus Toolkit is anything but. We met this requirement by installing and configuring
Globus Toolkit 4.0.5 using the 4.0.x quickstart guide located on the Globus website.

Set up a second machine to show the gridded system in use: Nick and Mujtaba were also
working on setting up a machine using the Globus Toolkit. Due to the difficulties both
groups were having, we decided that each group would attempt to set up a primary
machine, and the group that finished first would then aid the other group in turning
their machine into the secondary machine. It turned out that George and I finished our
primary installation before Nick and Mujtaba finished theirs, so we aided them in
configuring their computer as the secondary machine.

Run an example application of the software: To meet this requirement we first got
GridFTP running and did a basic file transfer over the network. In an effort to
produce a more exciting end product we installed RFT and WS GRAM to get a proxy
running.

Hardware Requirements:



CPU

Globus Toolkit software is not itself computationally intensive. The CPU of the
Globus host should, therefore, be chosen with the computational requirements of
the jobs it will run in mind. If the CPU can adequately support the computational
requirements of user jobs, the Globus Toolkit software should not add undue
strain.



Physical Memory

Globus Toolkit software is not memory intensive. Any system with a nominal
amount of memory sufficient to support its user base will not be rendered
unusable with the addition of the Globus software.



Disk Space

Most components of the Globus Toolkit require very little space. For example, the
Monitoring and Discovery Service has a footprint on disk of 5 MB. On the other
hand, when deploying RLS (Replica Location Service) on x86 architecture, a
minimum of a 1 GHz CPU and 1 GB RAM should be used for deployments
managing thousands of mappings. For larger deployments, dual 1 GHz CPUs and
2+ GB RAM are recommended. Disk space will be dependent upon the scale of
your local replica catalog.

Gathered Software:

Software      Expected Version    Actual Version
GCC           3.3.5               4.0.1
G++           3.3.5               4.0.1
Tar           1.14                1.15.1
Sed           4.1.2               4.1.4
Make          3.80                3.80
Perl          5.8.4               5.8.6
Ant           1.6.5               1.6.5
J2SDK         1.4.2.10            1.4.2_15
Sudo          1.6.8p7             1.6.8p8
Xinetd        2.3.13-3            2.3.13-6
PostgreSQL    7.1 or greater      8.08
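
The comparison in this chart can also be done mechanically. The following is a minimal Python sketch, not something taken from the quickstart guide, that reduces each version string to its numeric fields and checks that the installed version is at least the expected one. The helper names (version_key, meets_expected) are hypothetical, and treating suffixes such as "p7" or "-3" as one more numeric field is a simplifying assumption.

```python
import re

# (expected, actual) version strings copied from the Gathered Software chart.
VERSIONS = {
    "GCC": ("3.3.5", "4.0.1"),
    "G++": ("3.3.5", "4.0.1"),
    "Tar": ("1.14", "1.15.1"),
    "Sed": ("4.1.2", "4.1.4"),
    "Make": ("3.80", "3.80"),
    "Perl": ("5.8.4", "5.8.6"),
    "Ant": ("1.6.5", "1.6.5"),
    "J2SDK": ("1.4.2.10", "1.4.2_15"),
    "Sudo": ("1.6.8p7", "1.6.8p8"),
    "Xinetd": ("2.3.13-3", "2.3.13-6"),
    "PostgreSQL": ("7.1 or greater", "8.08"),
}


def version_key(version):
    """Reduce a version string to a tuple of its integer fields."""
    return tuple(int(field) for field in re.findall(r"\d+", version))


def meets_expected(expected, actual):
    """True when the installed version is equal to or newer than the expected one."""
    return version_key(actual) >= version_key(expected)


for name, (expected, actual) in sorted(VERSIONS.items()):
    status = "ok" if meets_expected(expected, actual) else "older than expected"
    print("%-12s expected %-16s actual %-10s %s" % (name, expected, actual, status))
```

A newer version is not guaranteed to be compatible, so a passing check only says that the installed software is not older than what the guide was written against.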



Setup Machines, Software, and Experiment Explained:

The entire process of installing the Globus Toolkit was anything but a smooth one
for us. I imagine someone with more experience installing Globus could finish an
installation successfully in less than 5 hours, while it took four of us more than
12 hours to set up two machines. The first thing we did in the installation process
was familiarize ourselves with the GT4 Admin Guide; after giving that a quick look we
realized that there must be a better way to install this system and found the
quickstart guide. The quickstart guide is itself a very dense document, providing a
great deal of information in a small amount of space. Although this has its
advantages, it is very easy to miss a user change or include statement, potentially
leading to hours of troubleshooting. In an attempt to provide an easy-to-read account
of our installation, the following sections are divided as they are in the quickstart
guide. Our hope is that this account will act not only as a summary of our trials and
tribulations but also as an aid to future installers.


Quickstart Guide Summary:

1. Introduction:

The quickstart guide provides a basic introduction to its own purpose. In this
four-line introduction there are a few pieces of important information:


1. The installer used throughout this document is the GT4.0.1 installer. There are
no changes required to use this document with later 4.0.x installers.

2. This is a quickstart that shows …the installation of prereqs.

Neither of these statements is completely true. Although no procedural changes are
needed to use a later version of the 4.0.x installer, many of the example
configuration files in the document contain lines with 4.0.1, and these need to be
changed to the appropriate version for the software to run correctly. This may seem
obvious, but it is easy to miss one of the version numbers and then be unable to
figure out what is causing the problem.
Although the quickstart guide claims to show the installation of all prerequisites, it
uses PostgreSQL, which was not installed on our machine. We used the command
“yum -y install postgresql postgresql-server php-pgsql”, which installs Postgres 8.2.
With this installation the library is not at “/var/lib/postgres” but rather at
“/var/lib/pgsql”.
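
One way to catch leftover 4.0.1 references is to scan each edited configuration file before starting a service. The sketch below is a hypothetical helper (find_stale_versions is our own name, not a Globus tool), and the sample text is made up for illustration; it is not copied from the quickstart guide.

```python
import re


def find_stale_versions(config_text, stale="4.0.1"):
    """Return (line_number, line) pairs that still mention the stale version."""
    # The lookarounds keep "4.0.1" from matching inside "4.0.10" or "14.0.1".
    pattern = re.compile(r"(?<![\d.])" + re.escape(stale) + r"(?!\d)")
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        if pattern.search(line):
            hits.append((number, line.strip()))
    return hits


# Illustrative input only: these paths are assumptions, not lines from the guide.
sample = """\
export GLOBUS_LOCATION=/usr/local/globus-4.0.1
export JAVA_HOME=/usr/java/j2sdk1.4.2_15
server = /usr/local/globus-4.0.5/sbin/globus-gridftp-server
"""

for number, line in find_stale_versions(sample):
    print("line %d still references 4.0.1: %s" % (number, line))
```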

2.1. Pre-requisites


This section is exactly what you would expect it to be. It shows the installation of
the basic prerequisites. This is valuable both for seeing how to install a given
application and for seeing what version number is known to work with the Globus
Toolkit. This is how the majority of the list in the Gathered Software section above
was constructed.

2.2. Building the Toolkit


This section is fairly straightforward. It shows the building of the toolkit and
reminds users to add certain libraries to their profiles. Unfortunately, it fails to
mention adding the source script to the user profile or a startup script so that the
Globus libraries load on startup. This can be considerably confusing when you get to
the certificate authority section: you will receive errors when attempting to generate
valid CAs. This problem alone took us a substantial amount of time.
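
The fix amounts to making sure a couple of lines are present in the profile of every user who runs Globus commands. A minimal sketch follows. Two things in it are assumptions to adapt locally: the install prefix /usr/local/globus-4.0.5, and the script name globus-user-env.sh, which is where a typical GT4 layout keeps its environment setup. The helper ensure_profile_lines is a hypothetical name, and the demonstration writes to a temporary file rather than a real profile.

```python
import os
import tempfile

# Assumed values: adjust the prefix and script name to the local installation.
GLOBUS_PROFILE_LINES = [
    "export GLOBUS_LOCATION=/usr/local/globus-4.0.5",
    "source $GLOBUS_LOCATION/etc/globus-user-env.sh",
]


def ensure_profile_lines(profile_path, lines=GLOBUS_PROFILE_LINES):
    """Append any missing lines to the profile and return the lines added."""
    content = ""
    if os.path.exists(profile_path):
        with open(profile_path) as handle:
            content = handle.read()
    existing = content.splitlines()
    missing = [line for line in lines if line not in existing]
    if missing:
        with open(profile_path, "a") as handle:
            # Start on a fresh line if the file lacks a trailing newline.
            if content and not content.endswith("\n"):
                handle.write("\n")
            handle.write("\n".join(missing) + "\n")
    return missing


# Demonstration against a throwaway file instead of a real ~/.bash_profile.
demo_profile = os.path.join(tempfile.mkdtemp(), "bash_profile")
print(ensure_profile_lines(demo_profile))  # first call adds both lines
print(ensure_profile_lines(demo_profile))  # second call adds nothing
```

Because the helper only appends lines that are missing, it can be run once for each account (globus, griduser, and so on) without duplicating entries.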

2.3. Setting up security on your first machine

This section does a great job going over how to create a CA, but you must pay
close attention to which user is sending the request and which user must sign it. As
long as you pay close attention to user changes and add the appropriate libraries to
startup as explained above, this section should go smoothly. It is also important to
understand what is being done in the last three lines of this section:

root@choate:/etc/grid-security# vim /etc/grid-security/grid-mapfile
root@choate:/etc/grid-security# cat /etc/grid-security/grid-mapfile
"/O=Grid/OU=GlobusTest/OU=simpleCA-choate.mcs.anl.gov/OU=mcs.anl.gov/CN=Charles Bacon" bacon


These lines add the appropriate information to the grid-mapfile. The line that
needs to be added should appear as output earlier in the CA process. Some users may
find it easier to just edit the file rather than use the above lines to add the
appropriate information.
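
Judging from the sample line above, a grid-mapfile entry is the certificate subject in double quotes followed by the local account name. The sketch below builds and parses such a line; the function names are hypothetical and the format is inferred only from that one sample, so treat it as an illustration rather than a full description of the file format.

```python
def gridmap_entry(subject_dn, local_user):
    """Format one grid-mapfile line: quoted certificate subject, then local account."""
    if '"' in subject_dn:
        raise ValueError("subject DN must not contain double quotes")
    return '"%s" %s' % (subject_dn, local_user)


def parse_gridmap_entry(line):
    """Split a grid-mapfile line back into (subject_dn, local_user)."""
    # Split on the closing quote so spaces inside the DN are preserved.
    quoted_dn, local_user = line.strip().rsplit('" ', 1)
    return quoted_dn[1:], local_user


dn = "/O=Grid/OU=GlobusTest/OU=simpleCA-choate.mcs.anl.gov/OU=mcs.anl.gov/CN=Charles Bacon"
entry = gridmap_entry(dn, "bacon")
print(entry)
print(parse_gridmap_entry(entry))
```

The quoting matters because the subject contains a space ("Charles Bacon"); an unquoted entry would be ambiguous about where the subject ends and the account name begins.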

2.4. GridFTP


This section is very well written and not overly complicated. The only problems
we had in this section were related to previous CA problems. Some users may find it
helpful to put the following lines (edited for use on their own machine) in a script
to allow easy repeated testing of GridFTP:

choate % globus-url-copy gsiftp://choate.mcs.anl.gov/etc/group file:///tmp/bacon.test.copy
choate % diff /tmp/bacon.test.copy /etc/group

This is the first time we were able to show that some of our underlying architecture
was working. We were able to use “the grid” to copy a file securely, with
authentication based on our CA.

2.5. Starting the web services container


This section is short and to the point; my only suggestion to fellow installers is
that you pay close attention when you are asked to switch to root. You should also pay
attention to the following two notes supplied in the quickstart:

The RFT warnings are expected right now because we haven't setup our database yet.
Otherwise, things look good.

140.221.8.31 is my IP address. Some people following the quickstart may see "127.0.0.1"
here. You need to fix that! Edit $GLOBUS_LOCATION/etc/globus_wsrf_core/server-config.wsdd
and client-server-config.wsdd, add a line reading
<parameter name="logicalHost" value="140.221.8.32" /> under the <globalConfiguration>
section. For instance:

<globalConfiguration>
    <parameter name="logicalHost" value="140.221.8.32" />

Although we did not have this problem during our install, I can imagine missing this
note causing great distress in the future.
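
For installers who do hit the 127.0.0.1 problem, the edit described in the note can be scripted. The following is a minimal sketch that works on the file contents as plain text; add_logical_host is a hypothetical helper, the sample document is invented for illustration, and a real .wsdd file should be backed up before being rewritten.

```python
import re


def add_logical_host(wsdd_text, ip_address):
    """Insert a logicalHost parameter under <globalConfiguration> if none exists."""
    if 'name="logicalHost"' in wsdd_text:
        return wsdd_text  # already configured, leave the file alone
    match = re.search(r"<globalConfiguration[^>]*>", wsdd_text)
    if match is None:
        raise ValueError("no <globalConfiguration> section found")
    parameter = '<parameter name="logicalHost" value="%s" />' % ip_address
    insert_at = match.end()
    return wsdd_text[:insert_at] + "\n    " + parameter + wsdd_text[insert_at:]


# Invented sample: a real server-config.wsdd contains many more parameters.
sample_wsdd = """<deployment>
  <globalConfiguration>
    <parameter name="example" value="unchanged" />
  </globalConfiguration>
</deployment>
"""

print(add_logical_host(sample_wsdd, "140.221.8.32"))
```

Working on the text directly, instead of parsing and re-serializing the XML, leaves the rest of the file byte-for-byte unchanged, which makes the result easy to verify with diff.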

2.6. Configuring RFT


This is another very straightforward section. Unfortunately, we did spend a little
unneeded time trying to configure RFT because of the way we installed Postgres. We
did not immediately realize that “/var/lib/postgres” was “/var/lib/pgsql” in our
installation, which made the suggested editing of the
/var/lib/pgsql/data/postgresql.conf file to include “listen_addresses='*'” rather
difficult. Other than this small pitfall, this section's installation went smoothly.
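
Since the data directory depends on how Postgres was installed, it can help to locate postgresql.conf before following the RFT instructions. The sketch below checks the two locations discussed in this report; the "data" subdirectory under /var/lib/postgres is an assumption made by analogy with our own /var/lib/pgsql/data path, and find_postgresql_conf is a hypothetical helper name.

```python
import os

# Candidate data directories; the first entry is an assumption (see lead-in).
CANDIDATE_DATA_DIRS = ["/var/lib/postgres/data", "/var/lib/pgsql/data"]


def find_postgresql_conf(data_dirs=CANDIDATE_DATA_DIRS):
    """Return the first existing postgresql.conf among the candidates, or None."""
    for data_dir in data_dirs:
        candidate = os.path.join(data_dir, "postgresql.conf")
        if os.path.isfile(candidate):
            return candidate
    return None


# Run as a user allowed to read the data directory (for example root or postgres);
# otherwise an existing file may be reported as missing.
print(find_postgresql_conf())
```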


2.7. Setting up WS GRAM


This is the final section before setting up a second machine. Luckily it only took
us 5-10 minutes. If the rest of your system was installed correctly up to this point
you should have no problem whatsoever.


Setting up a Second Machine:


As a result of our group finishing before Nick and Mujtaba’s, we turned to helping
them set up a second machine. Since this was not our primary responsibility, I am not
going to go into great detail regarding this setup other than to say that it is very
similar to the single machine setup. For some time there were host problems
surrounding the setup of the second machine, but after spending some time outside of
class we were able to set up a successful trust relationship using CAs. With our
eventual successful setup we were able to get a basic proxy working. This work with a
proxy across machines is what we ultimately deemed our proof of a functioning grid,
thus ending our Globus experience.


Conclusions:


Globus is a system that is being used with increasing frequency but does not yet
have a streamlined installation. The quickstart guide, although close to providing
what is needed to perform a successful installation, is just far enough away to cause
an installer, or in our case a group of installers, to waste hours on trivial
mistakes. Even with the additional information this report provides, there must be a
better way to get Globus installed. I can see no reason why this installation has not
been reduced to a few installation scripts with some option screens. In the end I am
glad I was exposed to the Globus Toolkit, but I wish the installation had gone more
smoothly so I would have had more time to play with more advanced features and
possibly write an application to run on the grid.