OGSI.NET: An OGSI-compliant Hosting Container for the .NET Framework


Glenn Wasson, Norm Beekwilder, and Marty Humphrey
wasson@virginia.edu, nfb5z@virginia.edu, Humphrey@cs.virginia.edu

Grid Computing Group
Computer Science Department
University of Virginia
Charlottesville, VA 22904
Draft of March 15, 2003

Note: This paper describes on-going work (including unresolved issues) at the University of
Virginia that creates the ability to host OGSI-compliant Grid Services in the .NET Framework.
We have explicitly tried to be consistent with the excellent OGSI Java Hosting work being led by
Thomas Sandholm at Argonne National Lab, both in the design/implementation of the project and
the style of this document. This project is an on-going work; we are releasing this document to
encourage active feedback and involvement from the community. We welcome all comments.

1. Introduction

The Open Grid Services Architecture (OGSA) represents a new vision of both the grid and web
services. By defining standard communication protocols and formats, OGSA represents the
means to build truly large-scale, interoperable grid systems. The Open Grid Services
Infrastructure (OGSI) is a set of WSDL specifications and behaviors that are consistent with the
OGSA specification and which are to be implemented on a variety of hosting platforms. To date,
work has been focused on hosting environments for various Java platforms (J2SE, J2EE)
primarily operating on UNIX systems. The work described in this paper develops a hosting
environment for the Microsoft .NET framework.

In addition, we believe that the OGSA/OGSI specification processes will benefit from the
creation of hosting environments on additional platforms. We seek to not only leverage the work
being done on the Java OGSI environment in defining both types and interfaces, but to influence
both the OGSI and OGSA specifications based on our experience with the .NET platform. The
capabilities enabled by the .NET framework may be both important for grid systems and
unconsidered by current work.

2. Basic Architecture

Our two highest-level design goals in building an OGSI-compliant hosting environment for .NET
were: (a) to use IIS to receive requests from clients and (b) to support the dynamic creation of
grid service instances that persist between client invocations. We wanted to use IIS in order to
take advantage of the close integration of Microsoft software and to highlight potential inter-op
issues that may arise from not using Apache. The second goal is important because we felt it
would be too inefficient to create a new server object for each request (possibly reloading large
amounts of state from previous requests). Note that in this context "persist" is not synonymous
with "stateful"; it refers only to efficiency concerns.

Although this was an open issue in our early design, our basic design is now to have a "container"
entity that holds all the service instances running on a host. This is in part to leverage the
design and implementation work of Thomas Sandholm and the ANL team on GT3. In order to allow
support for grid services with arbitrary names, we dispatch requests from IIS to the container in
the ISAPI stage with an ISAPI filter. The container process could consist of a collection of
AppDomains. Each service instance could have its own AppDomain and there could also be one
additional domain for the container's logic (some dispatching and message processing
functionality). We refer to the object in this final AppDomain as the dispatcher. Note that one
issue remains open: whether to have one AppDomain per Grid Service Instance or one AppDomain per
service class (e.g., per factory).

The container process has a thread pool and each IIS request contacts the dispatcher (causing one
of the container process threads to be used to execute dispatcher code). The dispatcher
determines which service should get the request and transfers execution of that thread to an object
in the appropriate AppDomain. A diagram of the system is shown in Figure 1.

Inside the AppDomain, control transfers to an object called the Grid Service Wrapper (GSW).
The GSW encapsulates the service instance, including the implementations of its methods and its
service data, as well as other port types that the service supports (e.g. grid service port type,
notification port type, etc.) and various supported message serializers and deserializers. The GSW
performs processing of any message specific information (e.g. SOAP headers) and deserializes
the request message to get the name of the invoked method and any parameters. Then the GSW
searches its port types to find which implements the method. The implementations for the port
types may be specific to the service instance (and therefore written by a service author) or
provided with this container for convenience (e.g. the grid service port type). The GSW then uses
reflection on the selected port type to get a handle to the proper method and invokes it, passing in
any parameters and message specific data (e.g. SOAP header data). The data returned by the
invocation is serialized by the GSW and sent back to the dispatcher as a binary array. The
dispatcher sends that array back to IIS for return to the client.
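
The lookup-and-invoke step described above can be sketched as follows. This is only an illustration in Python rather than C#, chosen for brevity; JSON stands in for SOAP, getattr plays the role of .NET reflection, and every class and method name (CounterPortType, perform_service, and so on) is a hypothetical one that does not come from the container's actual code.

```python
import json


class CounterPortType:
    """Hypothetical service-specific port type written by a service author."""

    def __init__(self):
        self.total = 0

    def add(self, amount):
        self.total += amount
        return self.total


class GridServicePortType:
    """Hypothetical stand-in for a port type provided with the container."""

    def find_service_data(self, name):
        return {"name": name}


class GridServiceWrapper:
    def __init__(self, port_types):
        self.port_types = list(port_types)

    def perform_service(self, raw_message: bytes) -> bytes:
        # Deserialize the raw request to get the method name and parameters.
        request = json.loads(raw_message.decode("utf-8"))
        method_name = request["method"]
        params = request.get("params", [])
        # Search the port types for the one that implements the method.
        if not method_name.startswith("_"):
            for port_type in self.port_types:
                method = getattr(port_type, method_name, None)  # reflection
                if callable(method):
                    result = method(*params)
                    # Serialize the result and return it as a byte array.
                    return json.dumps({"result": result}).encode("utf-8")
        error = {"error": "no such method: " + method_name}
        return json.dumps(error).encode("utf-8")
```

A real wrapper would also run header processing before the lookup and would use the serializer selected for the service; both are omitted here to keep the control flow visible.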

3. System Elements

3.1 Dispatcher

The dispatcher is the interface between the client request and the service instance that serves that
request. We envision that there may be more than one dispatcher per container in order to provide
service-independent support for multiple transport protocols, such as http and httpg. In the case of
an http request, the dispatcher interfaces IIS to the service instances. In the case of an httpg
request, a dispatcher may be contacted directly by the client (see below). The remainder of the
document concentrates on how the service container will handle grid services using SOAP over
http.

The dispatchers main function is to route request messages to the appropriate service instance
and return the request result to the client (possibly through IIS). The dispatcher contains a
mapping from request URIs to AppDomains in the container. A function call to the Grid Service
Wrapper (see below) passes the raw request message to the services AppDomain. The simplicity
of the dispatcher is deliberate and assists in providing concurrent access to the container (see
below).
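
As a rough illustration of how little the dispatcher needs to do, the following Python sketch keeps a table from request URI to wrapper and forwards the raw message untouched. It is a simplification under stated assumptions: EchoWrapper, register, and dispatch are hypothetical names, and a plain dictionary stands in for the mapping to AppDomains.

```python
class EchoWrapper:
    """Hypothetical stand-in for a Grid Service Wrapper."""

    def __init__(self, name):
        self.name = name

    def perform_service(self, raw_message: bytes) -> bytes:
        return self.name.encode("utf-8") + b":" + raw_message


class Dispatcher:
    def __init__(self):
        self._service_table = {}  # request URI -> wrapper

    def register(self, uri, wrapper):
        self._service_table[uri] = wrapper

    def dispatch(self, uri, raw_message: bytes) -> bytes:
        wrapper = self._service_table.get(uri)
        if wrapper is None:
            raise LookupError("no service registered at " + uri)
        # The message is forwarded as-is; all parsing happens in the wrapper.
        return wrapper.perform_service(raw_message)
```

Because the dispatcher never inspects the message, the only shared state it touches per request is the table lookup.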




[Figure 1: OGSI Container on .NET Platform. A request on the wire reaches IIS, where an ISAPI
URL rewrite and an HttpHandler route it to the container. Inside the container process, the
Dispatcher (in its own AppDomain) routes the request to instances. Each instance is an AppDomain
holding a Grid Service Wrapper (GSW) with its serializer, deserializer, port types, and service
instance.]

3.2 Grid Service Wrapper

The Grid Service Wrapper or GSW encapsulates the various functional units of a grid service.
Each AppDomain in the container has a GSW and each GSW wraps a grid service instance. The
GSW provides a convenient programming model for the grid service author by allowing:
• Pluggable, service-specific message serializers/deserializers
• Easy specification of port types supported by this service
• Pluggable support for port types supported by a service, but not written by the service
author (e.g. the grid service port type)

One of the difficulties encountered by both this project and the ANL OGSI project in Java was
how to allow a grid service author to "inherit" the many different port types that they might
wish their service to support in a language with single inheritance. Both projects are now moving
in a direction in which the grid service author decorates their service with information that
allows tooling or the runtime system to add appropriate port type support without in-code
inheritance. This project uses the grid service wrapper for this purpose.

A grid service author will decorate their service code with .NET attributes which will be
recognized by the container runtime when the assembly is loaded. When a GSW is instantiated in
an AppDomain (see Factories), it will load the assembly containing the appropriate grid service.
This assembly will tell the GSW which of the various serializer/deserializer modules that are
registered with the container are usable by this service. The GSW also contains a "port type
array". This array points to instantiations of all the port types supported by this service
instance. Many services will have port types that are specific to them (i.e. the ones written by
the service author), so the port type array will point both to implementations provided by the
service author and to implementations provided with this container but supported by the service
instance (e.g. the grid service port type or the notification source port type). The supported
port types will be determined at load time from the attributes of the service read by the GSW.
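
The idea of building the port type array from metadata on the service class can be sketched as follows. This is a hypothetical Python analogue, with a class decorator standing in for a .NET attribute; the decorator name port_types, the registry CONTAINER_PORT_TYPES, and the port type names are all assumptions made for illustration.

```python
def port_types(*names):
    """Hypothetical decorator playing the role of a .NET attribute."""
    def decorate(cls):
        cls.declared_port_types = tuple(names)
        return cls
    return decorate


class GridServicePortType:
    """Stand-in for a port type provided with the container."""


class NotificationSourcePortType:
    """Stand-in for a second container-provided port type."""


# Registry of container-provided implementations (names are illustrative).
CONTAINER_PORT_TYPES = {
    "GridService": GridServicePortType,
    "NotificationSource": NotificationSourcePortType,
}


@port_types("GridService", "NotificationSource")
class CounterService:
    """Service author's code: only service-specific methods live here."""

    def add(self, a, b):
        return a + b


def build_port_type_array(service_cls):
    # The author's own implementation comes first, followed by the
    # container-provided port types named in the class metadata.
    array = [service_cls()]
    for name in getattr(service_cls, "declared_port_types", ()):
        array.append(CONTAINER_PORT_TYPES[name]())
    return array
```

The service class itself inherits from nothing container-specific here, which is the point of the decoration approach: support for additional port types is attached from outside the author's code.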

In addition to serialization information and "inherited" port types, the GSW also contains
information specific to the service instance it wraps. That includes not only the port type
implementations written by the service author, but also the service instance's service data. The
GSW exposes this data through the interfaces defined in the OGSI specification. In the future,
pluggable service data query engines can be added.

3.2.1 Service Function Invocation

Each grid service instance has some number of functions that clients will wish to invoke. In order
to perform an invocation, IIS routes a request to the dispatcher and the dispatcher routes it to the
correct GSW. The GSW performs the actual invocation on the service instance and returns the
results back along the reverse path. The GSW is also what processes the request message, since the
dispatcher just passes it along raw. Previously, this document had suggested that the processing
of the request message could be done in the dispatcher to separate the message-specific
processing from the service-specific processing that occurs in the service's AppDomain.
However, this made concurrent access to the container more difficult because it meant that the
dispatcher had to atomically 1) send the non-invocation-specific data from the message
processing (e.g. WS-Security information) to the service and 2) invoke the desired function on the
service object. By sending the request to the GSW and allowing it to process the message, the
synchronization mechanisms can be kept within a single AppDomain.

When a request arrives at the IIS web server, the following steps occur.

1. ISAPI filter: An ISAPI filter routes the message to our managed-code HttpHandler if
that message is bound for our container (see footnote 1).
2. HttpHandler routing: The HttpHandler forwards the message on to the container's
dispatcher, or if the request had ?WSDL as the query string, it retrieves and returns the
service's WSDL.
3. Dispatcher finds Grid Service: The dispatcher finds the appropriate GSW (and hence
AppDomain) for the service by looking up the request URI in its service table. The
dispatcher then gets a handle to that GSW and calls the PerformService method which
takes the raw message as a parameter.
4. Process SOAP headers: The GSW processes the SOAP headers of the raw message by
running the filter of the Web Services Enhancements for Microsoft .NET (WSE) [3].
5. Determine function name and parameters: The GSW finds the name of the function to
invoke on the grid service and decodes any parameters from the body of the SOAP
request using the deserializer specified for the service.
6. Find object reference for service name: The GSW finds the port type implementing the
specific function from its port type array and then uses reflection to get a handle to the
desired method.
7. Invoke function: The GSW invokes the requested function and gets the result.
8. Serialize the results: The GSW serializes the results into an XML document, which is then
put into a byte array, using the service-specific serializer.
9. Send results back to client: The GSW sends the result array to the dispatcher, which
sends it to the HttpHandler, which sends it to IIS. IIS handles forwarding the
result to the client.

Footnote 1: ISAPI filters are called very early in the processing of a message by IIS. They allow
us to intercept the request message and route it to our container. The routing takes place by
rewriting the URL to have an appropriate extension. This is because routing is handled by
HttpHandlers in Microsoft's ASP.NET architecture and they perform routing based on extensions.
In the future ISAPI filters may be useful for dealing with other transport protocols.

The inter-AppDomain communication (between the dispatcher and the GSW) as well as the
intra-AppDomain communication (between the GSW and the service instance port type objects) is
handled automatically by .NET's remoting.
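
The first two steps of the request path (URL rewriting so that the request reaches the handler, then the ?WSDL check) can be sketched in Python as below. This is an illustration only and not ISAPI or ASP.NET code: the URL prefix "/ogsi/" and the extension ".gsx" are invented for the example, since the text does not name them, and dispatch and get_wsdl are hypothetical callables supplied by the caller.

```python
from urllib.parse import urlsplit

CONTAINER_PREFIX = "/ogsi/"   # assumed URL prefix for container-bound requests
HANDLER_EXTENSION = ".gsx"    # assumed extension mapped to the handler


def rewrite_url(url: str) -> str:
    """Step 1: tag container-bound URLs with the handler's extension."""
    parts = urlsplit(url)
    path = parts.path
    if not path.startswith(CONTAINER_PREFIX) or path.endswith(HANDLER_EXTENSION):
        return url
    return parts._replace(path=path + HANDLER_EXTENSION).geturl()


def handle(url: str, raw_message: bytes, dispatch, get_wsdl) -> bytes:
    """Step 2: answer ?WSDL requests, otherwise forward to the dispatcher."""
    parts = urlsplit(url)
    path = parts.path
    if path.endswith(HANDLER_EXTENSION):
        # Strip the routing extension to recover the service URI.
        path = path[: -len(HANDLER_EXTENSION)]
    if parts.query.upper() == "WSDL":
        return get_wsdl(path)
    return dispatch(path, raw_message)
```

Requests outside the assumed prefix are left untouched by rewrite_url, so they would continue through the web server's normal processing.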

3.3 Service Instances

The portion of Figure 1 labeled "Service Instance" represents the portion of the grid service
instance that contains 1) method implementations specific to this service instance and 2) service
data. However, the full functionality of a Grid Service as defined in the Grid Service
Specification [1] is contained within the GSW. The GSW interfaces with the dispatcher to provide
the services of each grid service instance to clients.

In our approach, Grid service instances all derive from the GridService base class. This class
provides methods to allow the service author to communicate with the Grid Service Wrapper and
hence the container. This allows the grid service to access transport- and message-specific data
processed by the GSW (such as SOAP header data) and to determine information about the
container's host environment. As of this writing, the Grid Service specification is now called the
OGSI specification and has just reached v1.0. Additionally, a service writer may inherit from the
PersistentGridService base class (which itself inherits from GridService) if they wish to have
access to some helper functions for storing the persistent state of a service (see below).

3.4 Factories

Factories are services that create instances of other services. In the .NET hosting container, this
means creating a new AppDomain and a new Grid Service Wrapper in that domain. The Grid
Service Wrapper will then load the assembly for the appropriate service instance and instantiate
all necessary port types and serializers. A factory service will store a reference to the GSW in the
new domain along with the published name of that object (e.g., a URI). This mapping must also
be sent to the dispatcher. Object references can be passed across AppDomains, so the reference to
the service can be passed from the factory's domain to the dispatcher's.
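
A minimal sketch of the factory's bookkeeping, written in Python under simplifying assumptions: a plain object stands in for the new AppDomain and its GSW, a shared dictionary stands in for the dispatcher's service table, and the URI numbering scheme is invented for the example.

```python
import itertools


class Wrapper:
    """Hypothetical stand-in for a GSW created in a fresh AppDomain."""

    def __init__(self, service_cls):
        # Instantiating the class stands in for loading the service assembly.
        self.instance = service_cls()


class Factory:
    def __init__(self, base_uri, service_cls, service_table):
        self._base_uri = base_uri
        self._service_cls = service_cls
        self._service_table = service_table   # shared with the dispatcher
        self._instances = {}                  # the factory's own record
        self._ids = itertools.count(1)

    def create_service(self):
        wrapper = Wrapper(self._service_cls)
        uri = "%s/%d" % (self._base_uri, next(self._ids))
        self._instances[uri] = wrapper
        # Publish the name-to-wrapper mapping so requests can be routed.
        self._service_table[uri] = wrapper
        return uri
```

Each call yields a separate wrapper with its own published name, which mirrors the one-wrapper-per-instance arrangement described above.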

4. Persistence

There are multiple types of persistence in the OGSI framework:
-- Services which are automatically loaded into the container at startup
-- Services which do not require soft-state keep-alive messages (i.e., services that have an
infinite timeout)
-- Services with state that is saved to permanent media (e.g., disk) and which can load
that state back into a running service instance

Certain services must be available to make the container useful. For example, factory services and
indexing services are needed to create and discover service instances. The .NET container's
configuration files will allow the specification of which services should be created and loaded
when the container initializes.

The Grid Service specification [1] defines a timeout for each service. A service may elect to die
if it does not receive a message from a client within this time period. This prevents unused (and
forgotten) instances from cluttering the container. However, this timeout can be set such that no
keep-alive messages are needed. This, in effect, makes the service permanent as long as the
container is running. However, if the container is shut down or crashes, these services will not be
restarted (as opposed to services listed in the container's config file). Obviously, a service in the
container's config file will have both types of persistence. That is, there seems to be little point in
starting a service automatically, but then letting it time out.

The removal of services when their timeout expires is the job of the dispatcher. Periodically, the
dispatcher will run a garbage collector to remove old instances. The Grid Service specification
does not say that a service must immediately be removed if its timeout has expired, but rather that
clients may no longer count on that service's availability. This means the dispatcher's garbage
collection machinery does not need to be highly synchronized with the clocks of the service
instances. Although the behavior of this garbage collection mechanism is not precisely mandated
by the Grid Service specification, there are obviously advantages architecturally if multiple
hosting environments behave similarly (certainly clients may benefit from roughly consistent
semantics); as such, we are currently investigating how this is performed in the Java Hosting
Environment produced by Thomas Sandholm's team.
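
A periodic sweep of this kind could look roughly like the Python sketch below. It is an assumption-laden illustration: ServiceTable, add, and sweep are hypothetical names, an expiry of None models a service with an infinite timeout, and the clock is injectable only so that the behavior can be exercised without waiting.

```python
import time


class ServiceTable:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        # URI -> (wrapper, expiry time, or None for an infinite timeout)
        self._entries = {}

    def add(self, uri, wrapper, timeout_seconds=None):
        if timeout_seconds is None:
            expiry = None
        else:
            expiry = self._clock() + timeout_seconds
        self._entries[uri] = (wrapper, expiry)

    def sweep(self):
        """Remove every instance whose timeout has passed; return their URIs."""
        now = self._clock()
        expired = [uri for uri, (_, expiry) in self._entries.items()
                   if expiry is not None and expiry <= now]
        for uri in expired:
            del self._entries[uri]
        return expired

    def __contains__(self, uri):
        return uri in self._entries
```

Since removal happens only when sweep runs, an expired instance may linger until the next pass, which is consistent with the loose synchronization described above.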

Permanent service state is supported in multiple ways. First, service data elements can be
persistent. The GridService base class contains a hash table, which is accessed through functions
defined by the Grid Service specification. The Grid Service specification defines some standard
service data elements and service authors can populate this table with whatever extra service data
is appropriate. The PersistentGridService base class contains an additional hash table for service
data. This table saves its state to disk whenever it is updated. The PersistentGridService base
class provides Get() and Set() methods for dealing with service data in this hash table. However,
while this mechanism is convenient for service authors to use in saving small data elements, the
automatic save-on-update paradigm does not work well for large data structures that may be
modified many times in a short period. We plan to support Activate() and Deactivate() methods to
allow services to save custom data when they are being loaded or unloaded from the container.
These methods will be part of the GridService base class.
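
The save-on-update behavior, and why it becomes costly for frequently modified data, can be seen in the following Python sketch. It is a hypothetical analogue rather than the PersistentGridService implementation: the class name PersistentServiceData, the JSON file format, and the write-then-rename strategy are all assumptions made for the example.

```python
import json
import os
import tempfile


class PersistentServiceData:
    def __init__(self, path):
        self._path = path
        self._data = {}
        # Reload previously saved state, if any.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                self._data = json.load(f)

    def get(self, name, default=None):
        return self._data.get(name, default)

    def set(self, name, value):
        self._data[name] = value
        self._save()  # every update rewrites the whole table

    def _save(self):
        # Write to a temporary file first, then swap it in, so a crash
        # mid-write does not leave a truncated state file behind.
        directory = os.path.dirname(os.path.abspath(self._path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(self._data, f)
        os.replace(tmp, self._path)
```

Every call to set rewrites the full table, so a large structure updated many times in quick succession pays the full serialization cost on each update.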

5. Security

There are several security concerns in the creation of a .NET hosting container. Some of these
concerns are unique to .NET, while some are indicative of the broader class of hosting
environments. First, we must provide security mechanism to the service instances. This is done
via the WSE pipeline run by the Grid Service Wrapper.

In addition, there are security concerns relating to the dispatcher and container themselves. How
are these system components to be protected from badly implemented or malicious services?
AppDomains provide the memory protection of a process without the heavyweight activity of
creating a new process for each domain. Allowing each service instance to live in its own
AppDomain provides a large amount of protection for the other services in the container. Because
the Grid Service Wrapper actually invokes functions on the service from within the service's
AppDomain, any problems that cause the service to crash or hang will not affect the dispatcher or
other services in the container. However, if services are allowed to make calls into unmanaged
code, they can bypass the protection of AppDomains.

Similarly, factories will create Grid Service instances by creating a new AppDomain and then
creating a Grid Service Wrapper in that domain. That GSW will have the name of the appropriate
assembly to load for the actual grid service. By having the grid service's assembly loaded and the
grid service initialized from within the new AppDomain, the remainder of the AppDomains in the
container are protected from potential bugs in service authors' service creation code.

We also wish to support running services that have the local host access privilege of particular
users (i.e. the Windows equivalent of running a service as a particular UNIX user id). Note that
this does not mean simply having a service that can spawn computational jobs under a certain
user id, but having the services themselves be able to access local resources as particular users.
We are currently investigating two approaches to this problem. The first is to have a hierarchical
container system in which a single master container dispatches to one of a set of other containers,
each running under a different Windows user id. Each of these "user containers" could only
create services that are running under a particular user id and hence every contained service
would run as the container's owner. This approach is similar to the one used by the GT3 Managed
Job Service for running computational jobs (except that this would also apply to the services
themselves). A second option is to have any thread active in a given service run under
the service author's WindowsIdentity. In this way, multiple threads with different ids may be
operating within the same container. Currently we are pursuing the first option while we
investigate the operational feasibility of the second option.

Finally, it is not clear whether, given this design, we will need to provide some sandboxing
capabilities above the inherent .NET security mechanisms (e.g., evidence-based security, policy
evaluation in the CLR). While these .NET capabilities will be greatly leveraged, it is not clear if
they alone are sufficient given our constraints. For example, support for custom state saving for
Grid Services might create challenges.

6. Remoting as a GSR

Grid services can be identified/invoked by both their Grid Service Handle (GSH) and their Grid
Service Reference (GSR). The GSH is a unique, system-wide name that can be used to find more
detailed information necessary to communicate with the service. This information is represented
by the GSR, which can hold protocol and other information. A HandleResolver service is used to
map a GSH into a GSR. A client with a GSH would typically call a HandleResolver to get a
service's GSR and then use the GSR to communicate directly with the service. We believe that
remoting offers a new type of GSR that has not been previously considered. For example, it may
be possible to contact a container's dispatcher with a GSH, and get a remoting-specific GSR for a
service. Then a client could communicate directly with that service (i.e., that remote object)
without going through the dispatcher (e.g., the use of WS-Referral and WS-Routing) or IIS. It
should be noted that the Grid Service specification allows a service to have multiple GSRs for a
single GSH and each client need not be given the same GSR by a resolver. The remoting GSR
may only be appropriate for clients running on Windows machines and so either the resolver must
be able to determine this when returning a GSR, or the resolver must return all available GSRs
and let the client decide which one to use. This approach might raise new security
concerns/issues. For example, it is not clear if httpg could be supported in this approach. In
addition, the WSE pipeline on the server side might not be engaged, which could be problematic
for SOAP header processing. These are open concerns.
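
The second resolver policy mentioned above (return all available GSRs and let the client decide) can be sketched in Python as follows. All names here are hypothetical, including the binding labels "soap-http" and "remoting" and the example addresses; a GSR is modeled simply as a (binding, address) pair.

```python
class HandleResolver:
    def __init__(self):
        self._references = {}  # GSH -> list of (binding, address) GSRs

    def register(self, gsh, binding, address):
        self._references.setdefault(gsh, []).append((binding, address))

    def resolve(self, gsh):
        """Return every GSR known for the handle; the client chooses."""
        return list(self._references.get(gsh, []))


def choose_reference(references, supported_bindings):
    # Pick the first GSR whose binding the client supports, following
    # the client's own order of preference.
    for binding in supported_bindings:
        for ref in references:
            if ref[0] == binding:
                return ref
    return None
```

Under this policy a client that cannot use remoting simply omits that binding from its preference list and falls back to another GSR for the same handle.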

7. Grid Service Base Class

A grid service is more than just a service that implements the GridService port type. In general, a
grid service may want to implement many port types. Ideally, a service author would inherit from
a set of standard port types (provided by this project) and then implement their own service-
specific functionality. However, C# (and Java) allows only single inheritance. The Globus
Toolkit 3 approach to this problem is to use a delegate programming model in which a grid
service contains a number of port type objects, each of which can point to different
implementations of the functionality of those port types.

The approach used here is to have each service instance contained within a Grid Service Wrapper
(GSW) which allows the service author to plug in support for the various port types and
mechanisms used by the grid service, but not written by the service author. These include the
common port types (e.g. the grid service port type, notification source/sink port types, and
factory port type) as well as (de)serializers for various message formats and query engines for
service data. We believe that by specifying these plug-in components with attributes, they can be
added to a GSW at service creation time without requiring modifications to the service author's
code. In addition, these components could be updated without bringing down the service by loading
the new version into the service's AppDomain, changing the appropriate GSW pointer and unloading
the old version.

The actual Grid Service base class (from which all services inherit) is relatively simple, providing
methods to allow the service author to access the functionality pointed to by the GSW as well as
to allow access to the service data elements contained within the service instance. We are
currently working on defining this interface.

8. Leveraging Current OGSI/OGSA Work

It is important to leverage the large body of work being done in support of the Globus Toolkit 3.
Specifically, two goals of this work are 1) to inter-operate with services running in Java-based
hosting environments and 2) to allow clients to use the same software (unmodified) to access
services running in both GT3 and our .NET hosting environment. This is not as simple as
following the Grid Services Specification because that specification is not sufficient to fully
define the operation of a hosting environment. For example, the Grid Service specification does
not specify the name of the indexing service running in a container or the name of a local handle
resolver for a container. However, both clients and service authors need to have well-defined
values for these names in order to make use of them. If clients (or other services) wish to call
services running in any type of container, many such details must be worked out.

One way in which the current work is leveraged today is via its WSDL definitions for functions
and its data type definitions for standard types (e.g. service data elements). Through the
wsdl.exe and xsd.exe tools from .NET, these definitions can be turned into C# code and
used by the .NET container. We currently do this to a limited extent and plan to increase our use
of this in the near future.



Acknowledgments

We gratefully acknowledge the continuing contributions and feedback of (in no particular order)
Simon Cox, Thomas Sandholm, Shaun Arnold, Vijay Tewari, Dan Fay, Mark Lewin, Savas
Parastatidis, Neil Chue Hong, Steve Tuecke, and Dave Berry.

References

[1] Tuecke, S., Czajkowski, C., Foster, I., Frey, J., Graham, S., Kesselman, C., Vanderbilt,
P., and Snelling, D. 2002. Grid Service Specification, Draft 11/4/02. OGSI Working Group,
Global Grid Forum. http://www.ggf.org/osgi-wg.

[2] Sandholm, T., Tuecke, S., Gawor, J., Seed, R., Maguire, T., Rofrano, J., Sylvester, S., and
Williams, M. 2002. Java OGSI Hosting Environment Design: A Portable Grid Service Container
Framework. Globus Toolkit 3 Alpha 2 docs.

[3] Web Services Enhancements for Microsoft .NET.
http://msdn.microsoft.com/webservices/building/wse/default.aspx