
Exposing Computational Resources Across Administrative Domains

The Condor-Shibboleth Integration Project

A scalable alternative for the computational grid

09/13/2005


“… Since the early days of mankind the primary motivation for the establishment of communities has been the idea that by being part of an organized group the capabilities of an individual are improved. The great progress in the area of inter-computer communication led to the development of means by which stand-alone processing sub-systems can be integrated into multi-computer communities. …”

Miron Livny, “Study of Load Balancing Algorithms for Decentralized Distributed Processing Systems,” Ph.D. thesis, July 1983.

The single biggest roadblock to the grid

THE grid will not happen if resource owners do not bring their resources to the table.

Resource owners must know their resource is secure.

Security mechanisms must be scalable.

Grids focus on site autonomy

One of the underlying principles of the Grid is that a given site must have local control over its resources: which users can have an account, usage policies, etc.

Grids: The Top Ten Questions, Jennifer M. Schopf and Bill Nitzberg

Resources must be attracted to the system

Because users demand it!

It is secure: rules for use of each resource can be established and enforced locally (WHO can do WHAT, WHEN, and for HOW LONG?).

It is scalable: administration of access to the resource is no longer a daunting (if not impossible) task.

It is easy to set up.

Introduction to Condor

In a nutshell, Condor is a specialized batch system for managing compute-intensive jobs. Like most batch systems, Condor provides a queuing mechanism, scheduling policy, priority scheme, and resource classifications. Users submit their compute jobs to Condor, Condor puts the jobs in a queue, runs them, and then informs the user as to the result.
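
To make the queuing model concrete, here is a minimal vanilla-universe submit description file of the kind a user hands to condor_submit; the executable and file names are placeholders.

    # analyze.sub -- minimal Condor submit description file (names are placeholders)
    universe   = vanilla
    executable = analyze
    arguments  = input.dat
    output     = analyze.out
    error      = analyze.err
    log        = analyze.log
    queue

Running "condor_submit analyze.sub" places the job in the queue; Condor matches it to a machine, runs it, and records the outcome in the log file.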

Communities benefit from Matchmakers…

… someone has to bring together community members who have requests for goods and services with members who offer them.

Both sides are looking for each other

Both sides have constraints

Both sides have preferences

eBay is a matchmaker

Condor is a matchmaker
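
To illustrate how both sides express constraints and preferences in Condor's matchmaking, here is a minimal sketch of the two halves of a match: the job side (submit-file expressions) and the machine side (configuration expressions). The specific values are hypothetical.

    # Job side (in the submit description file): constraints and preferences
    Requirements = (OpSys == "LINUX") && (Memory >= 2048)
    Rank         = KFlops

    # Machine side (in condor_config): when this machine is willing to run jobs,
    # and which jobs it prefers
    START = (KeyboardIdle > 15 * 60) && (LoadAvg < 0.3)
    RANK  = (Owner == "physics")

The matchmaker pairs a job with a machine only when both Requirements and START are satisfied, and uses the Rank expressions to order the candidates.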

Condor’s Power

A user submits a job to Condor.

Condor finds an available machine on the network and begins the job.

The definition of “available” is highly configurable.

If a machine becomes unavailable, the job is checkpointed until the resource becomes available again, or is migrated to a different resource.

Condor does not require an account on the remote machine.

Migrating jobs to other Pools

Flocking: Flocking is Condor's way of allowing jobs that cannot immediately run (within the pool of machines where the job was submitted) to instead run on a different Condor pool.

Condor-C: Condor-C allows jobs in one machine's job queue to be moved to another machine's job queue. These machines may be far removed from each other, providing powerful grid computation mechanisms, while requiring only Condor software and its configuration.

Condor-C is highly resistant to network disconnections and machine failures on both the submission and remote sides.
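
For reference, a Condor-C job is expressed through the grid universe in the submit file. A minimal sketch follows; the remote schedd and central manager host names are placeholders.

    # Condor-C sketch: forward this job to another machine's job queue
    universe      = grid
    grid_resource = condor remote-schedd.example.edu remote-cm.example.edu
    executable    = analyze
    log           = analyze.log
    queue

Flocking, by contrast, needs no change to the job at all; it is enabled by FLOCK_TO/FLOCK_FROM settings in the pools' configuration files (an example appears later with the Grid Portal configuration).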

DAGMan

DAGMan (Directed Acyclic Graph Manager) is a meta-scheduler for Condor. It manages dependencies between jobs at a higher level than the Condor scheduler.
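
As a concrete illustration, a DAGMan input file simply names the jobs and their dependencies; the job names and submit files below are placeholders.

    # diamond.dag -- B and C run after A; D runs after both B and C
    JOB A a.sub
    JOB B b.sub
    JOB C c.sub
    JOB D d.sub
    PARENT A CHILD B C
    PARENT B C CHILD D

The DAG is submitted with "condor_submit_dag diamond.dag", and DAGMan submits each job to Condor as its dependencies are satisfied.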

Disadvantages

Even with Flocking and Condor-C, administrative scalability doesn’t exist.

The system requires excessive communication between administrators to exchange host names and/or IP addresses.

What is Shibboleth?

Internet2 Middleware

Shibboleth leverages campus identity and access management infrastructures to authenticate individuals and then sends information about them to the resource site, enabling the resource provider to make an informed authorization decision.

Shibboleth Goals

Use federated administration as the lever; have the enterprise broker most services (authentication, authorization, resource discovery, etc.) in inter-realm interactions.

Provide security while not degrading privacy.

Attribute-based Access Control.

Mike Gettes, Duke University

Shibboleth Components: Identity Provider

The user’s home site, where authentication credentials and attribute info are stored.

Handle Server (HS): provides a unique and potentially anonymous identity for the user. The user authenticates using the site’s existing technology (LDAP, Kerberos, WebISO, etc.).

Attribute Authority (AA): responds to requests about the user from the target. Retrieves user attributes from the site’s existing identity store, typically a user directory such as LDAP.

Both are implemented with Apache, Tomcat, and Java servlets/JSP.

Shibboleth Components: Service Provider

Protects the target application; enforces authentication and authorization.

Assertion Consumer Service (ACS): maintains state information about the user’s unique numerical identifier (handle).

Shibboleth Attribute Requester (SHAR): makes requests for attributes to the user’s Identity Provider’s Attribute Authority.

Co-located with the application web server as a server module. Implementations currently exist for Apache (UNIX and Windows) and IIS (Windows).

Typical Access Flow

1. The user attempts to access a new Shibboleth-protected resource on the target site’s application server.

2. The user is redirected to the Where Are You From (WAYF) server and selects a home site (origin site). This is only necessary once per user session.

3. The user is redirected to the origin site’s Handle Server (HS) and authenticates with their local credentials (e.g., username/password).

Typical Access Flow (cont.)

4. The Handle Server generates a unique numerical identifier (handle) and redirects the user to the target site’s ACS.

5. The target ACS hands off to the SHAR, which uses the handle to request attributes from the user’s origin site’s Attribute Authority.

6. The user’s AA responds with an attribute assertion, subject to Attribute Release Policies (ARPs).

7. The target site uses the returned user attributes for access control and other application-level decisions.

Federations

Associations of enterprises that come together to exchange information about their users and resources in order to enable collaborations and transactions.

Built on the premise of: initially, “Authenticate locally, act globally”; now, “Enroll, authenticate and attribute locally, act federally.”

The federation provides only modest operational support and consistency in how members communicate with each other.

Enterprises (and users) retain control over what attributes are released to a resource; the resources retain control (though they may delegate) over the authorization decision.

Mike Gettes, Duke University

What if we Integrated Shib with Condor?

Condor would function exactly as it now does.

Flocking (or eventually Condor-C) would use Shibboleth as its authentication model.

Classified Ads would include either user attributes or Shib unique identifiers.

This brings us to the Condor-Shibboleth Integration Project!

Project Goals

The primary goal is to create a scalable, expandable, grid-aware universal workflow management tool for computational grids that doesn’t require or exclude the use of Globus grid map files.

Scalable:
  Functions across unrelated administrative domains.
  Tied to the federated authentication model.
  No ties to Globus certificates and grid map files.

Expandable:
  Designed for relatively simple computational grids, but does not exclude connection to future projects requiring expanded grid services (Globus).
  The Grid-Shib project already in progress could become a future connection.

Project Goals

Phase I: Shib-enabled Condor web portal.
  Shibboleth was originally designed as a web services federated authentication tool; fat clients are not yet available.

Phase II: Shib-enabled Condor fat client.
  Extends the existing “submit client” model with Shib elements.
  There are already other open source projects in the works which will utilize Shibboleth in a fat client model, so we should be sure our work now will be compatible with the fat client model when it is supported by Shibboleth.

Project Goals

Impact Condor as little as possible:
  Identify key components that must be changed, mostly at the execute end.
  Some work can be performed by preprocessing scripts at the submit end.
  Ensure that changes made at the execute end will not interfere with other modes of Condor use.
  Ensure that changes made at the execute end will also work with the eventual fat client version.

Web-based Condor Scheduler Node

The Grid Portal must be running condor_schedd and have $(FLOCK_TO) populated with grid resources (sketched below).

The Grid Portal must be configured as a Shibboleth server.

The user creates a job:
  Uploads a Condor submit script, or
  Portal tools could help create simple submit scripts.
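
For reference, a minimal sketch of the submit-side flocking configuration on the Grid Portal; FLOCK_TO is a real Condor configuration variable, while the host names are placeholders.

    # condor_config on the Grid Portal (submit) node
    FLOCK_TO = cm.physics.georgetown.edu, condor.cs.wisc.edu

Each entry names the central manager of a remote pool that jobs may flock to when they cannot run locally.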

Conceptual Workflow Model

The user logs in to the Grid Portal web site (Phase I).

The Shibboleth web server module detects that no Shibboleth session exists and redirects the user to the “Where Are You From” (WAYF) server.

The user selects their home institution, is redirected to that institution’s Identity Provider site, and provides credentials that are verified against some back-end store (LDAP, PKI, RDBMS, Kerberos).

The user is redirected back to the Web Portal server.

The web server module makes a call back to the Shibboleth Attribute Authority, and a Shibboleth session is established for the user.

A Base64-encoded SAML attribute assertion is now available for consumption by the Web Portal.
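
Decoded, such an assertion looks roughly like the abbreviated, purely illustrative SAML 1.x fragment below; the attribute names follow the eduPerson schema, the issuer and values are hypothetical, and the version numbers, IDs, timestamps, and digital signature are omitted.

    <saml:Assertion xmlns:saml="urn:oasis:names:tc:SAML:1.0:assertion"
                    Issuer="urn:example:georgetown-idp">
      <saml:AttributeStatement>
        <saml:Attribute AttributeName="urn:mace:dir:attribute-def:eduPersonPrincipalName">
          <saml:AttributeValue>auser@georgetown.edu</saml:AttributeValue>
        </saml:Attribute>
        <saml:Attribute AttributeName="urn:mace:dir:attribute-def:eduPersonAffiliation">
          <saml:AttributeValue>member</saml:AttributeValue>
        </saml:Attribute>
      </saml:AttributeStatement>
    </saml:Assertion>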

Conceptual Workflow Model

The user submits a job:
  Clicks a ‘Submit’ button on the Portal site.
  The Portal application pulls key components (name, institution) out of the mapped HTTP request headers and adds them to the class ad.
  The Portal application appends the signed, Base64-encoded attribute assertion to the Condor submit file.
  The Portal application creates a temporary database entry to hold the data files for the job.
  The altered Condor submit file is passed as an argument to condor_submit from the web server.
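
Extending the basic submit file shown earlier, the portal's alterations might look like the sketch below. The "+name = value" lines are Condor's standard way of adding custom attributes to a job's class ad; the ShibUser, ShibAffiliation, and ShibAssertion attribute names (and their values) are hypothetical, since the presentation does not fix them.

    # Sketch of a portal-altered submit file
    universe         = vanilla
    executable       = analyze
    output           = analyze.out
    log              = analyze.log
    +ShibUser        = "auser@georgetown.edu"
    +ShibAffiliation = "member@georgetown.edu"
    +ShibAssertion   = "PHNhbWw6QXNzZXJ0aW9uIC4uLg=="
    queue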

Conceptual Workflow Model

Upon completion of the job, Condor behaves as though this were a flocking job:
  Data files are returned to the Grid Portal machine.
  A better solution would be to allow the user to dictate in the job submission file where the data will be stored. Condor supports this; we need to evaluate how Shib-enabling Condor will affect it.
  The user is notified that the job is done.

Behind the Scenes

Execute resources will advertise not only the typical attributes, but also whom they are willing to work for, based upon Grid Access Attributes.
  The path to one final Grid Access File is declared in a Condor configuration file on each master node (optionally on each compute node); see the sketch below.

Jobs are matched to resources based upon the usual “class ad” rules and new Shibboleth rules contained in the Grid Access File.

When a match is made, the execute resource will THEN parse the XML, verify the signature of the attributes added to the submit file by Shibboleth, and extract the relevant attributes.

If the signature is good, the job will be executed following all the functionality of flocking.
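
A minimal sketch of what the execute-side configuration might look like. STARTD_ATTRS is a real Condor mechanism for publishing extra attributes in a machine's class ad; the GRID_ACCESS_FILE and AcceptedAffiliations names are hypothetical, since the presentation only says that a path is declared in a configuration file.

    # condor_config on an execute-side master node (hypothetical names)
    GRID_ACCESS_FILE     = /etc/condor/grid_access
    AcceptedAffiliations = "georgetown.edu, wisc.edu"
    STARTD_ATTRS         = $(STARTD_ATTRS) AcceptedAffiliations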

Condor Modifications

We propose a hierarchical set of role-based authorization configuration files that match the config file structure already in place in Condor.

Each level should establish permissions for that specific level:
  A site-wide file would allow the most generic level of access (anyone from Georgetown, anyone from the University of Wisconsin, no one else).
  Resource-level files would specify more specific levels of access (Arnie on this machine any time, anyone from Georgetown only after 9:00 pm).

Grid Access File Structure

Grid Access Template Files contain users (individual or affiliation) and conditions of use (time, date, load, ranking); a hypothetical example follows below.

Any machine on the Internet can house a Grid Access Template File.

Any Grid Access Template File can refer to one or more other Grid Access Template Files in an include fashion.

GAT files will be processed into a local Grid Access File by a small GAT parsing program.
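
The presentation does not define a concrete file format, so the following is a purely hypothetical sketch meant only to illustrate the kinds of entries described: users or affiliations, conditions of use, and includes.

    # site-wide.gat -- hypothetical Grid Access Template File
    include http://cm.arc.georgetown.edu/gat/campus.gat

    allow user        amiles@georgetown.edu                      # individual, any time
    allow affiliation member@georgetown.edu  time=21:00-06:00    # campus users, nights only
    allow affiliation staff@wisc.edu         maxload=0.5         # capped by machine load
    deny  *                                                      # everyone else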

Grid Access Parser

A Grid Access Parser will run on each machine hosting a Grid Access Template File.
  “Interactive Mode” will allow the Parser to help resource owners build their custom Template Files.
  “Cron Mode” can help track day-to-day changes in the Grid.

The Parser will create a human-readable/consumer-readable text file of all constraints listed.
  All cited Grid Access Template Files will be parsed.
  The final list will include ALL constraints listed in ALL Grid Access Template Files up the chain.

The Parser must handle duplicate entries gracefully.
  Prevents bloated Grid Access Files.
  Allows redundant Template Files, preventing single points of failure.
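
In Python, the include-following and de-duplication steps might look like the sketch below. The line-oriented file format, paths, and function names are all hypothetical; this is only meant to show the merge logic the slide describes.

    # Hypothetical sketch of the GAT parsing step: follow includes, collect
    # entries, and discard duplicates so the final Grid Access File stays small.
    from urllib.request import urlopen

    def read_gat(location):
        """Return the non-empty, non-comment lines of a Grid Access Template file."""
        if location.startswith(("http://", "https://")):
            text = urlopen(location).read().decode("utf-8")
        else:
            with open(location) as f:
                text = f.read()
        return [line.strip() for line in text.splitlines()
                if line.strip() and not line.lstrip().startswith("#")]

    def build_grid_access_file(root, seen=None):
        """Recursively parse a GAT file and everything it includes, deduplicating entries."""
        seen = set() if seen is None else seen
        entries = []
        for line in read_gat(root):
            if line in seen:                      # duplicates handled gracefully
                continue
            seen.add(line)
            if line.startswith("include "):
                entries.extend(build_grid_access_file(line.split(None, 1)[1], seen))
            else:
                entries.append(line)
        return entries

    if __name__ == "__main__":
        for entry in build_grid_access_file("/etc/condor/site-wide.gat"):
            print(entry)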

Condor Modifications

$(FLOCK_FROM) should be pointed at this Grid Access File.
  This will improve scalability by passing responsibility for deciding the final list of submitters to each resource owner.

Some other variable should be set to a site file and one or more trust files (sketched below).
  The site file includes information about identity providers and service providers.
  The trust file includes information about trusted signing credentials.
  These files are provided by the Federation, represent the Federation in a physical sense, and are necessary for the operation of every identity provider and service provider.
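
Pulling the proposed settings together, the execute-side configuration might read as follows. FLOCK_FROM is a real Condor variable (normally a list of submit hosts); repurposing it this way, and the SHIB_SITE_FILE and SHIB_TRUST_FILE names, are hypothetical, since the slide leaves the variable names open. The sites.xml/trust.xml file names mirror the metadata files used by Shibboleth 1.x federations.

    # condor_config sketch of the proposed execute-side changes (names hypothetical)
    FLOCK_FROM      = /etc/condor/grid_access
    SHIB_SITE_FILE  = /etc/shibboleth/sites.xml
    SHIB_TRUST_FILE = /etc/shibboleth/trust.xml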

Summary

We will create a scalable, expandable computational grid system that allows the implementation of easy-to-manage computational grids.

This approach does not require Globus certificates and mapfiles, but will not preclude them (particularly once the separate Grid-Shib project is completed).

This project can be rolled out in two phases:
  one that allows the creation of a web-based Grid Portal, and
  a second that allows command-line access from a fat client.

Credits

University of Wisconsin
  Miron Livny, Professor of Computer Science and Condor Project Lead
  Todd Tannenbaum, Manager of Condor Development Staff
  Ian Alderman, Researcher of Data Security for the Condor team

Georgetown University
  Charlie Leonhardt, Chief Technologist
  Chad LaJoie
  Brent Putman, Programmer

Georgetown University Advanced Research Computing
  Steve Moore, Director
  Arnie Miles, Senior Systems Architect
  Jess Cannata, Systems Administrator
  Nick Marcou, Systems Administrator

Internet2
  Ken Klingenstein, Director of the Internet2 Middleware Initiative
  Mike McGill, Program Manager for the Internet2 Health Sciences Initiative

Special thanks to Jess Cannata, who helped engineer the specific details of this idea.

URL

http://www.guppi.arc.georgetown.edu/condor-shib/