The JAWS Web Server Framework - Applied Science University

Nov 18, 2013

Q1:

JAWS: Understanding High Performance Web Systems



Introduction

The emergence of the World Wide Web (Web) as a mainstream technology has forced the
issue on many hard problems for network application writers, with regard to providing a
high quality of service (QoS) to application users. Client side strategies have included
client side caching and, more recently, caching proxy servers. However, the other side of
the problem persists: a popular Web server that cannot handle the request load placed
upon it. Some recent implementations of Web servers have been designed to deal
specifically with high load, but they are tied down to a particular platform (e.g., SGI
WebFORCE) or employ a specific strategy (e.g., single thread of control). I believe that
the key to developing high performance Web systems is a design which is flexible enough
to accommodate different strategies for dealing with server load and is configurable from
a high level specification describing the characteristics of the machine and the expected
use load of the server.

There is a related role of server flexibility, namely that of making new services, or
protocols, available. The Service Configurator pattern has been identified as a solution
for making different services available, with inetd cited as an example of this pattern
in use. While a per process approach to services may be the right abstraction some of the
time, a more integrated (yet modular!) approach may allow for greater strategic reuse.
That is, a per process model of services requires each server to redesign and reimplement
code which is common to all, while making it difficult to reuse strategies developed for
one service in another. To gain ground in this area, the server should be designed so
that new services can be easily added, and can easily use strategies provided by the
adaptive server framework. By generalizing the notion of server-side service adaptation,
one can envision a framework in which clients negotiate with servers about how services
should be handled. Most protocols today have been designed so that data manipulation is
handled entirely on one side or the other. An adaptive protocol would enable a server and
a client to negotiate which parts of a protocol should be handled on each end for optimal
performance.


Motivation

Web servers are synonymous with HTTP servers and the HTTP 1.0 and 1.1 protocols are
relatively straightforward. HTTP requests typically name a file and the server locates
the file and returns it to the client requesting it. On the surface, therefore, Web
servers appear to have few opportunities for optimization. This may lead to the
conclusion that optimization efforts should be directed elsewhere (such as transport
protocol optimizations, specialized hardware, and client-side caching).

Empirical analysis reveals that the problem is more complex and the solution space is
much richer. For instance, our experimental results show that a heavily accessed Apache
Web server (the most popular server on the Web today) is unable to maintain satisfactory
performance on a dual-CPU 180 MHz UltraSPARC 2 over a 155 Mbps ATM network, due largely
to its choice of process-level concurrency. Other studies have shown that the relative
performance of different server designs depends heavily on server load characteristics
(such as hit rate and file size).

The explosive growth of the Web, coupled with the larger role servers play on the Web,
places increasingly larger demands on servers. In particular, the severe loads servers
already encounter handling millions of requests per day will be compounded by the
deployment of high speed networks, such as ATM. Therefore, it is critical to understand
how to improve server performance and predictability.

Server performance is already a critical issue for the Internet and is becoming more
important as Web protocols are applied to performance-sensitive intranet applications.
For instance, electronic imaging systems based on HTTP (e.g., Siemens MED or Kodak
Picture Net) require servers to perform computationally-intensive image filtering
operations (e.g., smoothing, dithering, and gamma correction). Likewise, database
applications based on Web protocols (such as AltaVista Search by Digital or Lexis Nexis)
support complex queries that may generate a higher number of disk accesses than a typical
Web server.


Modeling


Overview of the JAWS Model


Underlying Assumptions



Infinite network bandwidth.

This is consistent with my interests in high-speed networks. For a model of Web servers
which limits the network bandwidth, see [Slothouber:96].

Fixed network latency.

We assume the contribution of network latency to be negligible. This will be more true
with persistent HTTP connections and true request multiplexing.

Client requests are "serialized".

This simply means that the server will process successive requests from a single client
in the order they are issued by the client.

Research questions

What is performance when average server rate is constant?

What is performance when average server rate degrades with request rate?

What degradation best models actual performance?

Benchmarking


Benchmarking Testbed Overview


Hardware Testbed

Our hardware testbed consisted of one Sun Ultra-1 and four Sun Ultra-2 workstations. The
Ultra-1 has 128MB of RAM with a 167MHz UltraSPARC processor. Each Ultra-2 has 256MB of
RAM, and is equipped with 2 UltraSPARC processors running at 168MHz. Each processor has
1MB of internal cache. All the machines are connected to a regular Ethernet
configuration. The four Ultra-2 workstations are connected via an ATM network running
through a Bay Networks LattisCell 10114 ATM switch, with a maximum bandwidth of 155 Mbps.
One of the Ultra-2 workstations hosted the Web server, while the three remaining Ultra-2
workstations were used to generate requests to benchmark the server. The Ultra-1
workstation served to coordinate the startup of the benchmarking clients and the
gathering of data after the end of benchmarking runs.

Software Request Generator

Request load was generated by the WebSTONE webclient, which was modified to be
multi-threaded. Each "child" of the webclient iteratively issues a request, receives the
requested data, issues a new request, and so on. Server load can be increased by
increasing the number of webclient "children". The results of the tests are collected and
reported by the webclients after all the requests have completed.

Experiments

Each experiment consists of several rounds, one round for each server in our test suite.
Each round is conducted as a series of benchmarking sessions. Each session consists of
having the benchmarking client issue a number of requests (N) for a designated file of a
fixed size (Z), at a particular load level beginning at l. Each successive session
increases the load by some fixed step value (d) to a maximum load (L).
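The session schedule described above can be sketched as follows. This is only an illustration of how the load levels step from l to L in increments of d, not the WebSTONE driver itself; the function names `session_loads` and `plan_round` and the example values are assumptions made for the sketch.

```python
def session_loads(start, step, maximum):
    """Return the load level for each benchmarking session.

    start, step and maximum play the roles of l, d and L in the text.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    return list(range(start, maximum + 1, step))


def plan_round(n_requests, file_size, start, step, maximum):
    """Describe one round: one session per load level, each issuing
    n_requests (N) requests for a file of file_size (Z) bytes."""
    return [
        {"load": load, "requests": n_requests, "file_size": file_size}
        for load in session_loads(start, step, maximum)
    ]


if __name__ == "__main__":
    # Hypothetical values chosen only to show the shape of a round.
    for session in plan_round(n_requests=1000, file_size=4096,
                              start=1, step=5, maximum=21):
        print(session)
```

An experiment would then repeat `plan_round` once for each server in the test suite.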

The webclient requests a standard file mix distributed by WebSTONE, which is
representative of typical Web server request patterns.

Findings

By far, the greatest impediment to performance is the host filesystem of the Web server.
However, factoring out I/O, the primary determinant of server performance is the
concurrency strategy.

For single CPU machines, single-threaded solutions are acceptable and perform well.
However, they do not scale for multi-processor platforms.

Process-based concurrency implementations perform reasonably well when the network is
the bottleneck. However, on high-speed networks like ATM, the cost of spawning a new
process per request is relatively high.

Multi-threaded designs appear to be the choice of the top Web server performers. The
cost of spawning a thread is much lower than that of a process.

Additional information is available in this paper.


Adaptation

Concurrency Strategies

Each concurrency strategy has positive and negative aspects, which are summarized in the
table below. Thus, to optimize performance, Web servers should be adaptive, i.e., be
customizable to utilize the most beneficial strategy for particular traffic
characteristics, workload, and hardware/OS platforms. In addition, workload studies
indicate that the majority of the requests are for small files. Thus, Web servers should
adaptively optimize themselves to provide higher priorities for smaller requests. These
techniques combined could potentially produce a server that is highly responsive and
maximizes throughput. The next generation of the JAWS server plans to implement the
prioritized strategy.

Strategy             Advantages                                       Disadvantages
-------------------  -----------------------------------------------  ----------------------------------------------------
Single-threaded      No context switching overhead. Highly portable.  Does not scale for multi-processor systems.
Process-per-request  More portable for machines without threads.      Creation cost high. Resource intensive.
Process pool         Avoids creation cost.                            Requires mutual exclusion in some operating systems.
Thread-per-request   Much faster than fork.                           May require mutual exclusion. Not as portable.
Thread pool          Avoids creation cost.                            Requires mutual exclusion in some operating systems.


Summary of Concurrency Strategies
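To make the difference between two of the strategies in the table concrete, the following sketch contrasts thread-per-request with a thread pool. It is written in Python purely for illustration (JAWS itself is built on ACE in C++), and `handle` is a hypothetical stand-in for real request processing. In CPython the global interpreter lock limits CPU parallelism, so the sketch shows the structure of the two strategies rather than their relative performance.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle(request):
    """Hypothetical stand-in for parsing and answering one request."""
    return f"response to {request}"


def thread_per_request(requests):
    """Spawn a new thread for every request (creation cost paid each time)."""
    results = [None] * len(requests)

    def worker(index, request):
        # Each worker writes to its own slot, so no lock is needed here.
        results[index] = handle(request)

    threads = [
        threading.Thread(target=worker, args=(i, r))
        for i, r in enumerate(requests)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def thread_pool(requests, size=4):
    """Reuse a fixed set of pre-created threads (avoids creation cost)."""
    with ThreadPoolExecutor(max_workers=size) as pool:
        return list(pool.map(handle, requests))


if __name__ == "__main__":
    reqs = ["/index.html", "/a.gif", "/b.gif"]
    print(thread_per_request(reqs))
    print(thread_pool(reqs))
```

Both functions return the same responses; they differ only in when threads are created, which is the trade-off the table summarizes.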

Protocol Processing

There are instances where the contents being transferred may require extra processing.
For instance, in HTTP/1.0 and HTTP/1.1 files may have some encoding type. This generally
corresponds to a file having been stored in some compressed format (e.g., gzip). In HTTP,
it has been customary for the client to perform the decoding. However, there may be cases
where the client lacks the proper decoder.

To handle such cases, it would be useful if the server did the decoding on behalf of the
client. A more advanced server may detect that a particularly large file would transfer
more quickly for the client in some compressed form. But this kind of processing would
require negotiation between the client and the server as to which kinds of content
transformations are possible for the server and acceptable to the client. Thus, the
server would be required to adapt to the abilities of the client, as well as the
conditions of the network connection.
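A minimal sketch of this kind of negotiation is shown below, assuming the client announces what it can decode through the HTTP Accept-Encoding header. The function names and the simplified parsing (quality values are ignored) are assumptions for the example and are not taken from JAWS.

```python
import gzip


def negotiate_encoding(accept_encoding, server_supported=("gzip", "identity")):
    """Pick the first encoding listed by the client that the server supports.

    Simplification: any ";q=..." quality values are ignored.
    """
    offered = [
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    ]
    for name in offered:
        if name in server_supported:
            return name
    return "identity"


def prepare_body(stored_gzip_bytes, accept_encoding):
    """Return (encoding, body) for a file that is stored gzip-compressed."""
    encoding = negotiate_encoding(accept_encoding)
    if encoding == "gzip":
        # The client has a decoder: send the file exactly as stored.
        return "gzip", stored_gzip_bytes
    # The client lacks a decoder: the server decodes on its behalf.
    return "identity", gzip.decompress(stored_gzip_bytes)


if __name__ == "__main__":
    stored = gzip.compress(b"<html>hello</html>")
    print(prepare_body(stored, "gzip, deflate")[0])
    print(prepare_body(stored, "")[0])
```

A fuller version would also weigh network conditions, for example by compressing large files only when the link is slow.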


JAWS Adaptive Web Server

Here we will briefly describe the object-oriented architecture of the JAWS Web server
framework. In order to understand the design, it is important to motivate the need for
framework architectures.

Solutions to the Reuse Problem

Software reuse is a vital issue in the successful development of large software systems.
Software reuse can reduce development effort and maintenance costs. Thus, much effort in
software engineering has been devoted to the problem of creating reusable software.

The techniques for developing reusable software have evolved through several generations
of language features (e.g., structured programming, functional programming, 4GLs,
object-oriented programming), compilation tools (e.g., source file inclusion, compiled
object files, class libraries, components), and system design methods (e.g., functional
design, complexity analysis, formal methods, object-oriented design, design patterns).
While each of these techniques helps to facilitate the development and integration of
reusable software, their roles are passive. This means that the software developer must
make the decisions of how to put together the software system from the repository of
reusable software. The figure below illustrates the passive nature of these solutions.


Application development with class libraries and design patterns.


The advantage of this approach is that it maximizes the number of available options to
software developers. This can be important in development environments with open-ended
requirements, where design flexibility is of premium value. However, the disadvantage is
that every new project must be implemented from the ground up every single time.

To gain architectural reuse, software developers may utilize an application framework to
create a system. An application framework provides reusable software components for
applications by integrating sets of abstract classes and defining standard ways that
instances of these classes collaborate. Thus, a framework provides an application
skeleton which can be customized by inheriting and instantiating from reusable
components in the framework. The result is pre-fabricated design at the cost of reduced
design flexibility. An application framework architecture is shown in the figure below.


Application development with an application framework.


Frameworks can allow developers to gain greater reuse of designs and code. This comes
from leveraging the knowledge of an expert applications framework developer who has
largely pre-determined what libraries and objects to use, what patterns they follow, and
how they should interact. However, frameworks are much more difficult to develop than a
class library. The design must provide an adequate amount of flexibility and at the same
time dictate enough structure to be a nearly complete application. This balance must be
just right for the framework to be useful.

The JAWS Web Server Framework

The figure below illustrates the object-oriented software architecture of the JAWS Web
server framework. As indicated earlier, our results demonstrate the performance variance
that occurs as a Web server experiences changing load conditions. Thus, performance can
be improved by dynamically adapting the server behavior to these changing conditions.
JAWS is designed to allow Web server concurrency and event dispatching strategies to be
customized in accordance with key environmental factors. These factors include static
characteristics, such as support for kernel-level threading and/or asynchronous I/O in
the OS, and the number of available CPUs, as well as dynamic factors, such as Web traffic
patterns and workload characteristics.


JAWS Framework Overview


JAWS is structured as a framework that contains the following components: an Event
Dispatcher, Concurrency Strategy, I/O Strategy, Protocol Pipeline, Protocol Handlers,
and Cached Virtual Filesystem. Each component is structured as a set of collaborating
objects implemented with the ACE C++ communication framework. The components and their
collaborations follow several design patterns which are named along the borders of the
components. Each component plays the following role in JAWS:



Event Dispatcher: This component is responsible for coordinating the Concurrency
Strategy with the I/O Strategy. The passive establishment of connections with Web clients
follows the Acceptor Pattern. New incoming requests are serviced by some concurrency
strategy. As events are processed, they are dispatched to the Protocol Handler, which is
parameterized by I/O strategy. The ability to dynamically bind to a single concurrency
strategy and I/O strategy from a number of choices follows the Strategy Pattern.



Concurrency Strategy: This implements concurrency mechanisms (such as single-threaded,
thread-per-request, or thread pool) that can be selected adaptively at run-time, using
the State Pattern, or pre-determined at initialization-time. Configuring the server as to
which concurrency strategies are available follows the Service Configurator Pattern. When
concurrency involves multiple threads, the strategy creates protocol handlers that follow
the Active Object Pattern.



I/O Strategy: This implements the I/O mechanisms (such as asynchronous, synchronous and
reactive). Multiple I/O mechanisms can be used simultaneously. Asynchronous I/O is
implemented utilizing the Asynchronous Completion Token Pattern. Reactive I/O is
accomplished through the Reactor Pattern. Both Asynchronous and Reactive I/O utilize the
Memento Pattern to capture and externalize the state of a request so that it can be
restored at a later time.



Protocol Handler: This object allows system developers to apply the JAWS framework to a
variety of Web system applications. A Protocol Handler object is parameterized by a
concurrency strategy and an I/O strategy, but these remain opaque to the protocol
handler. In JAWS, this object implements the parsing and handling of HTTP request
methods. The abstraction allows for other protocols (e.g., HTTP/1.1 and DICOM) to be
incorporated easily into JAWS. To add a new protocol, developers simply write a new
Protocol Handler implementation, which is then configured into the JAWS framework.



Protocol Pipeline: This component provides a framework that allows a set of filter
operations to be applied easily to the data being processed by the Protocol Handler. This
integration is achieved by employing the Adapter Pattern. Pipelines follow the Streams
Pattern for input processing. Pipeline components are made available with the Service
Configurator Pattern.



Cached Virtual Filesystem: This component improves Web server performance by reducing
the overhead of filesystem accesses. The caching policy is strategized (e.g., LRU, LFU,
Hinted, and Structured) following the Strategy Pattern. This allows different caching
policies to be profiled for effectiveness and enables optimal strategies to be configured
statically or dynamically. The cache is instantiated using the Singleton Pattern.



Tilde Expander: This mechanism is another cache component that uses a perfect hash table
to map abbreviated user login names (e.g., ~schmidt) to user home directories (e.g.,
/home/cs/faculty/schmidt). When personal Web pages are stored in user home directories,
and user directories do not reside in one common root, this component substantially
reduces the disk I/O overhead required to access a system user information file, such as
/etc/passwd.
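As an illustration of how a caching policy can be "strategized", the sketch below plugs interchangeable eviction policies into a small file cache, in the spirit of the Strategy Pattern described for the Cached Virtual Filesystem. It is written in Python for brevity and is not the JAWS implementation (which uses ACE in C++); the class names, the `touch`/`evict` interface, and the `loader` callback are all assumptions made for the example.

```python
from collections import OrderedDict


class LRUPolicy:
    """Evict the least recently used key."""

    def __init__(self):
        self._order = OrderedDict()

    def touch(self, key):
        # Move the key to the most-recently-used end.
        self._order.pop(key, None)
        self._order[key] = True

    def evict(self):
        key, _ = self._order.popitem(last=False)
        return key


class LFUPolicy:
    """Evict the least frequently used key."""

    def __init__(self):
        self._counts = {}

    def touch(self, key):
        self._counts[key] = self._counts.get(key, 0) + 1

    def evict(self):
        key = min(self._counts, key=self._counts.get)
        del self._counts[key]
        return key


class FileCache:
    """Cache of file contents whose eviction policy is chosen at construction."""

    def __init__(self, capacity, policy, loader):
        self._capacity = capacity
        self._policy = policy    # the pluggable strategy object
        self._loader = loader    # hypothetical callback that reads a file
        self._data = {}

    def get(self, path):
        if path not in self._data:
            if len(self._data) >= self._capacity:
                del self._data[self._policy.evict()]
            self._data[path] = self._loader(path)
        self._policy.touch(path)
        return self._data[path]


if __name__ == "__main__":
    cache = FileCache(2, LRUPolicy(), loader=lambda p: f"<contents of {p}>")
    for path in ["/a", "/b", "/a", "/c"]:
        cache.get(path)
```

Swapping `LRUPolicy()` for `LFUPolicy()` changes the eviction behavior without touching `FileCache`, which is what makes it possible to profile different policies against each other.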








Q2:

ECE1770: Trends in Middleware Systems

Course Lecture Jan 18, 2001



This lecture includes the following:

o Examples of distributed applications
o Problems in developing distributed applications
o Middleware platforms, definitions and characteristics
o Categories of middleware









Examples of Distributed Applications

Distributed applications are applications that are designed to run on distributed systems.

The distributed systems of concern in this course are network-based or connected systems
such as LANs and WANs, not clusters or multiprocessor machines.



Some of the most popular distributed applications are:

o OLTP (Online Transaction Processing). Online reservation systems in travel agencies
  are an example of OLTP. In these kinds of systems there is a central system, and many
  other machines connect to it from different geographical locations.
o Banking applications
o Database management applications
o Groupware applications like Lotus Notes



In the above systems there are usually clients and servers with a network in between. On
the server side there may be databases and database management systems, and on the
client side there may be any number of clients with different applications. Client
applications may be thick or thin. Thin client programs, such as an applet or an HTML
browser, have very little business logic in them, whereas thick client programs, such as
banking or simulation applications, contain a lot of business logic and computation.







Some problems in developing distributed applications:

o Data sharing and concurrency

o Heterogeneity: different operating systems, different platforms, different
  architectures, different programming languages and so on. For instance, transferring
  data between machines with different architectures needs conversion, because different
  machines use different data formats. Imagine a 64-bit architecture on one side and a
  32-bit architecture on the other, or a machine with a little-endian binary format on
  one side and a big-endian format on the other. Since these machines interpret
  information differently, a special program is needed to do the required conversions
  when they exchange numbers or other information.

o Reliability in communication: if there is a failure in a single machine, the machine
  simply crashes, but in distributed systems it is more complicated. For example, if one
  side sends a request and doesn't get a response from the other side, how should it
  react? Is the connection down or is the server down? Should the calling application
  wait or should it send another request?

o Session management tracking

o Security: this is a more severe problem in distributed systems than in local systems
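The byte-order part of the heterogeneity problem above can be demonstrated with Python's standard `struct` module. The sketch packs the same 32-bit integer in little-endian and big-endian form, shows what happens when one side misreads the other's bytes, and then uses an agreed wire format (network byte order) so both sides recover the same value. The helper names are assumptions for the example.

```python
import struct


def encode_little(value):
    """Bytes as a little-endian machine would lay out a 32-bit unsigned int."""
    return struct.pack("<I", value)


def encode_big(value):
    """Bytes as a big-endian machine would lay out a 32-bit unsigned int."""
    return struct.pack(">I", value)


def to_wire(value):
    """Sender side: convert to the agreed wire format (network byte order)."""
    return struct.pack("!I", value)


def from_wire(data):
    """Receiver side: convert from the wire format to a native integer."""
    return struct.unpack("!I", data)[0]


if __name__ == "__main__":
    value = 0x12345678
    print(encode_little(value).hex())   # bytes in little-endian order
    print(encode_big(value).hex())      # bytes in big-endian order
    # A little-endian reader misinterprets raw big-endian bytes:
    print(hex(struct.unpack("<I", encode_big(value))[0]))
    # With an agreed wire format the value survives the trip:
    print(hex(from_wire(to_wire(value))))
```

The conversion functions play the role of the "special program" mentioned above; middleware performs this step on the application's behalf.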

There are many other problems that must be addressed for a distributed application to be
developed. Anyone who implements a distributed system entirely on their own has to solve
all of these problems. Because of these problems we need a standard solution, with
modules, libraries, functionality and services that address them. This is what is
generally referred to as middleware.

Middleware Platforms, definitions and characteristics

Although there are many definitions of middleware, there is still no clear and exact
definition. Some services that used to be part of operating systems are now considered
part of middleware. On the other hand, some services that are part of middleware today
will in future be part of operating systems. Some people may also regard other services,
such as the TCP/IP protocols, which are part of the network layer, as middleware or as a
middleware layer.

The essential role of middleware is to manage the complexity and heterogeneity of
distributed infrastructures and thereby provide a simpler programming environment for
distributed application developers. It is therefore most useful to define middleware as
any software layer that is placed above the distributed system's infrastructure (the
network OS and APIs) and below the application layer. One way of viewing a middleware
platform is to look at it as residing in the middle, that is, underneath the application
and above the operating system. This is possibly where the term "middleware" came from.

The classical definition of an operating system is "the software that makes the hardware
usable". Similarly, middleware can be considered to be the software that makes a
distributed system programmable. Just as a bare computer without an operating system can
be programmed only with great difficulty, programming a distributed system is in general
much more difficult without middleware.

Middleware is sometimes informally called "plumbing" because it connects parts of a
distributed application with data pipes and then passes data between them. It is also
sometimes called "glue" technology, because it is often used to integrate legacy
components.

Middleware provides transparency with respect to implementation language and to the
other heterogeneity issues mentioned above. Transparency in this context means that these
details are invisible to the application developer and at the implementation level.

Any kind of service that fits one of the aforementioned definitions can be considered a
middleware platform (Figure 1).







Early distributed systems technologies such as OSF DCE (Open Software Foundation
Distributed Computing Environment) and Sun's RPC (Remote Procedure Call) can be viewed
as middleware. DCE has very good security support, a time service and a directory
service, but it supports only the C language.

Several distributed object platforms have recently become quite popular. These platforms
extend earlier distributed systems technologies. These middleware platforms are CORBA
(Common Object Request Broker Architecture), DCOM (Distributed Component Object Model)
and Java RMI (Remote Method Invocation).

The differences between the above middleware platforms are the languages and the
platforms that they support. CORBA supports multiple languages and multiple platforms, so
a distributed system built on CORBA can use different operating systems. It also supports
object orientation.

Java RMI supports just one language (Java), but it can run on different platforms,
provided that a Java virtual machine exists on those platforms. It supports object
orientation too.

DCOM is a Microsoft middleware solution. It runs only on Microsoft's operating systems
(the Windows family) and it supports only Microsoft's programming languages. Therefore it
doesn't support multiple platforms. DCOM is also object oriented.

Which is the best? Which one should we use? Which one will survive in future?

There is no simple answer to the above questions. Each of the above middleware platforms
has strengths and weaknesses.



The characteristics of the new middleware platforms can be summarized as follows:

o Masking heterogeneity in the underlying infrastructure by cloaking system specifics
o Permitting heterogeneity at the application level by allowing the various components of
  the distributed application to be written in any suitable language
o Providing structure for distributed components by adopting object-oriented principles
o Offering invisible, behind-the-scenes distribution as well as the ability to know what's
  happening behind the scenes
o Providing general-purpose distributed services that aid application development and
  deployment



Categories of middleware platforms



o Distributed Tuple Spaces

A distributed relational database offers the abstraction of distributed tuples. Its
Structured Query Language (SQL) allows programmers to manipulate sets of these tuples in
a declarative language yet with intuitive semantics and rigorous mathematical
foundations based on set theory.

Linda is a framework offering a distributed tuple abstraction called Tuple Space (TS).
It allows people to publish information into TS. On one side there are publishers and on
the other side there are subscribers. Publishers can publish into TS, and subscribers can
subscribe to an item of interest and use the published materials. The advantage of such a
system is that it offers spatial decoupling by allowing depositing and withdrawing
processes to be unaware of each other's identities. Publishers (producers) and
subscribers (consumers) are also decoupled in time, i.e., they don't have to be present
in the system at the same time, although they can work concurrently. Data items flow into
and out of the system separately.

JavaSpaces is a concept very closely related to Linda's TS. Jini is a network technology
built on top of JavaSpaces. Jini network technology provides a simple infrastructure for
delivering services in a network and for creating spontaneous interaction between
programs that use these services regardless of their hardware/software implementation.
Any kind of network made up of services (applications, databases, servers, devices,
information systems, mobile appliances, storage, printers, etc.) and clients (requesters)
of those services can be easily assembled, disassembled, and maintained on the network
using Jini technology. Services can be added to or removed from the network, and new
clients can find existing services, all without administration.
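The tuple space abstraction described above can be sketched in a single process as follows. The operation names `out`, `rd` and `in` come from Linda (`in` is spelled `in_` here because `in` is a Python keyword); everything else, including the use of `None` as a wildcard and the `timeout` parameter, is an assumption made for the illustration. A real tuple space would be shared across a network rather than held in one process's memory.

```python
import threading


class TupleSpace:
    """In-memory, thread-safe sketch of a Linda-style tuple space."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, tup):
        """Publish a tuple into the space."""
        with self._cond:
            self._tuples.append(tuple(tup))
            self._cond.notify_all()

    @staticmethod
    def _matches(pattern, tup):
        # None in the pattern acts as a wildcard (assumption of this sketch).
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup)
        )

    def _find(self, pattern):
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def rd(self, pattern, timeout=None):
        """Read a matching tuple without removing it; None on timeout."""
        with self._cond:
            found = self._cond.wait_for(
                lambda: self._find(pattern) is not None, timeout
            )
            return self._find(pattern) if found else None

    def in_(self, pattern, timeout=None):
        """Withdraw a matching tuple from the space; None on timeout."""
        with self._cond:
            if not self._cond.wait_for(
                lambda: self._find(pattern) is not None, timeout
            ):
                return None
            tup = self._find(pattern)
            self._tuples.remove(tup)
            return tup


if __name__ == "__main__":
    space = TupleSpace()
    space.out(("temperature", "room1", 21))
    print(space.rd(("temperature", None, None)))
    print(space.in_(("temperature", "room1", None)))
```

The producer never names the consumer and the tuple waits in the space until someone asks for it, which is the spatial and temporal decoupling the text describes.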



o RPC (Remote Procedure Call)

RPC offers the abstraction of being able to invoke a procedure whose body is across the
network. RPC is a call sent from one machine or process to another machine or process for
some service. An RPC is synchronous, beginning with a request from a local calling
program to use a remote procedure and ending when the calling program receives the
results from the procedure.

An implementation of Distributed Computing Environment, DCE RPC, includes a compiler
that translates an interface definition into a client stub, which marshals a procedure
call and its parameters into a packet, and a server stub, which unmarshals the packet
into a local server call. The client stub can marshal parameters from a language and
machine representation different from the server stub's, thereby enabling interoperation.

For example, if the client machine has a 32-bit architecture, the server machine has a
64-bit architecture, and the server is to send an integer to the client, it is the stub
that is aware of the differences in data formats and does the conversion. The whole
process is invisible to the applications (transparency).

An RPC implementation also includes a run-time library which implements the protocol for
message exchanges on a variety of network transports, enabling interoperation at that
level.

Note: There is a problem with transparency! Imagine a client and a server and a
simulation application with a lot of computation on the server side and the
visualization on the client side. The server computes and sends the result to the client
machine. Now imagine the server program includes three nested loops as follows:

for i = 1 to 100
    for j = 1 to 100
        for k = 1 to 100
            RPC call

The RPC will be called a million times. Every time the RPC is called there are some
operations that must be done before the actual data is sent, such as opening
connections, depending on the RPC protocol. In this case a connection will be opened a
million times and each time only a few bytes of information will be sent. The ideal is to
open the connection once and send a lot of information. Since the RPC call is invisible
to the programmer, the code might become very inefficient. The point is that although
transparency is a very good feature, it allows programmers to implement inefficient
systems.
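The cost of this kind of "chatty" RPC usage, and the marshalling done by the stubs, can be illustrated with a toy example. Nothing here is DCE RPC: `FakeTransport` stands in for the network and simply counts round trips, the remote procedure (squaring an integer) is invented for the example, and the wire format (big-endian 32-bit arguments, 64-bit results) is an assumption. The sketch uses 100 calls instead of a million to keep it quick.

```python
import struct


class FakeTransport:
    """Stand-in for the network: delivers packets and counts round trips."""

    def __init__(self, server):
        self.server = server
        self.round_trips = 0

    def send(self, packet):
        self.round_trips += 1
        return self.server(packet)


def server_stub(packet):
    """Unmarshal 32-bit ints, call the 'remote' procedure, marshal the reply."""
    count = len(packet) // 4
    values = struct.unpack(f"!{count}i", packet)
    results = [v * v for v in values]          # the actual procedure body
    return struct.pack(f"!{count}q", *results)


class ClientStub:
    """Looks like a local object; hides marshalling and the transport."""

    def __init__(self, transport):
        self.transport = transport

    def square(self, x):
        # One network round trip per call.
        reply = self.transport.send(struct.pack("!i", x))
        return struct.unpack("!q", reply)[0]

    def square_many(self, xs):
        # One round trip for the whole batch.
        xs = list(xs)
        reply = self.transport.send(struct.pack(f"!{len(xs)}i", *xs))
        return list(struct.unpack(f"!{len(xs)}q", reply))


if __name__ == "__main__":
    chatty = ClientStub(FakeTransport(server_stub))
    a = [chatty.square(k) for k in range(1, 101)]
    batched = ClientStub(FakeTransport(server_stub))
    b = batched.square_many(range(1, 101))
    print(chatty.transport.round_trips, batched.transport.round_trips, a == b)
```

Both clients compute the same results, but the loop that calls `square` pays the per-call overhead every time, which is easy to overlook precisely because the stub makes the remote call look local.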



o Distributed Object Middleware

This is a refinement of the RPC category. The RPC category is procedure oriented and
involves languages like C, Pascal and Modula. Distributed Object Middleware is an
evolution of RPC with object-oriented technology. It provides the abstraction of an
object that is remote but whose methods can be invoked just like those of an object in
the same address space as the caller.

CORBA (Common Object Request Broker Architecture) is a standard for distributed object
computing. It is part of OMA (Object Management Architecture), developed by OMG (Object
Management Group). CORBA is considered by most experts to be the most advanced kind of
middleware commercially available, and the most faithful to classical object-oriented
programming principles. Figure 2.






DCOM (Distributed Component Object Model) is a distributed
object technology from Microsoft that evolved from its OLE
(Object Linking and Embedding) and COM (Component Object
Model). Figure 3.






Java has a facility called RMI (Remote Method Invocation) that is similar to the
distributed object abstraction of CORBA and DCOM. RMI is a specification from Sun
Microsystems that enables the programmer to create distributed Java-to-Java applications
in which the methods of remote Java objects can be invoked from other Java virtual
machines, possibly on different hosts. RMI provides heterogeneity across operating system
and Java vendor, but not across language. Figure 4.






Note: what is the advantage of object orientation?

The advantage of object
orientation can be viewed from user and
application developer perspectives.

Some of the advantages from user perspective are as follows:

o

Application objects are presented as objects that can be
manipulated in a way that is similar to the manipulation of
th
e real world objects.

o

Common functionality in different applications that is
realized by common shared objects, leading to a uniform
and consistent user interface.

o

Existing applications can be embedded in an object
-
oriented environment and object oriente
d technology does
not make existing applications obsolete.

Some of the advantages from the application developer's perspective
are as follows:

o

Through encapsulation of object data, applications are built
in a truly modular fashion.

o

It is possible to build applications in an incremental way,
preserving correctness during the development process.

o

Cost and lead-time can be saved by making use of existing
components.

Encapsulation, inheritance and polymorphism are features of
object orientation.
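
A short Python sketch may help to see the three features side by side. The class names Shape, Rectangle and Square are hypothetical and chosen only for illustration.

```python
class Shape:
    def __init__(self, name):
        # Encapsulation: state is kept inside the object and
        # reached through its methods.
        self._name = name

    def area(self):
        raise NotImplementedError

    def describe(self):
        # Polymorphism: area() resolves to the subclass version at run time.
        return f"{self._name} with area {self.area()}"


class Rectangle(Shape):
    # Inheritance: Rectangle reuses describe() from Shape.
    def __init__(self, width, height):
        super().__init__("rectangle")
        self._width = width
        self._height = height

    def area(self):
        return self._width * self._height


class Square(Rectangle):
    # Incremental development: Square adds very little to tested code.
    def __init__(self, side):
        super().__init__(side, side)
        self._name = "square"


descriptions = [shape.describe() for shape in (Rectangle(2, 3), Square(4))]
```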



o

MOM (Message Oriented Middleware)

MOM provides the abstraction of a message queue that can be
accessed across a network. In this system, information is passed in
the form of a message from one program to one or more other
programs.

In MOM there are queues on the client and server sides. These queues
can be viewed as mailboxes, storage places or buffers that are
managed by the MOM.

Such queues can support persistency. Persistent queues store and
keep the messages, so if a message arrives while the system is not
ready to accept the message or request, the queue guarantees that the
message will still be available to the system later. The issues with
such queues are managing the information or messages in the queue
(database issues) and reliability, i.e. messages must be delivered in
order even if they arrive out of order. In systems with a lot of
request processing and computation, queues are very useful for
keeping the system from being overloaded.
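
One common way to deliver messages in order even when they arrive out of order is to tag each message with a sequence number and hold back early arrivals. The sketch below assumes that the sender numbers messages from 0; the class name ReorderingBuffer is hypothetical and is not taken from any particular MOM product.

```python
class ReorderingBuffer:
    """Releases messages strictly in sequence-number order."""

    def __init__(self):
        self._next = 0      # sequence number expected next
        self._pending = {}  # early arrivals, keyed by sequence number

    def receive(self, sequence, body):
        """Accept one message; return the messages that are now deliverable."""
        if sequence >= self._next:  # ignore duplicates of delivered messages
            self._pending[sequence] = body
        ready = []
        while self._next in self._pending:
            ready.append(self._pending.pop(self._next))
            self._next += 1
        return ready


buffer = ReorderingBuffer()
first = buffer.receive(1, "b")   # arrives early, held back
second = buffer.receive(0, "a")  # fills the gap, both are released
third = buffer.receive(2, "c")
```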

Another kind of application that can benefit from persistent
queues is one that deals with disconnected operation, i.e.
the system is not online all the time. It may be stalled or migrated,
or it may be a wireless device that cannot receive information
in certain situations. With persistent queues, messages or
requests are stored in the queue, and whenever the system becomes
available again it can get the information from the queue.

MOM provides temporal and spatial decoupling.
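
The temporal decoupling described above can be illustrated with a small file-backed queue. This is a minimal sketch under stated assumptions, not a real MOM product: the class PersistentQueue is hypothetical, it uses an SQLite file as the storage place, and sender and receiver run one after the other in the same script to mimic a receiver that was offline.

```python
import os
import sqlite3
import tempfile


class PersistentQueue:
    """A minimal file-backed FIFO queue (illustrative only)."""

    def __init__(self, path):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS messages "
            "(id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
        )
        self._db.commit()

    def put(self, body):
        self._db.execute("INSERT INTO messages (body) VALUES (?)", (body,))
        self._db.commit()

    def get(self):
        """Remove and return the oldest message, or None if the queue is empty."""
        row = self._db.execute(
            "SELECT id, body FROM messages ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        self._db.execute("DELETE FROM messages WHERE id = ?", (row[0],))
        self._db.commit()
        return row[1]

    def close(self):
        self._db.close()


path = os.path.join(tempfile.mkdtemp(), "queue.db")

# The sender runs while the receiver is offline.
sender = PersistentQueue(path)
sender.put("debit 10")
sender.put("credit 25")
sender.close()

# Later the receiver comes online and drains the queue in order.
receiver = PersistentQueue(path)
received = [receiver.get(), receiver.get(), receiver.get()]
receiver.close()
```

Because the messages live in the file rather than in the sender's memory, they outlive the sender, which is the persistency property the note below defines.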



NOTE: Persistency and transiency are two terms in programming
languages with the following meanings:

Persistency: The life cycle of data units or objects extends beyond
the life cycle of the program. The program has finished executing but
the data units or objects related to it still exist. This becomes
possible by using special storage, such as queues, for the data units
or objects.

Transiency: Data units or objects do not exist after the
program has finished executing, and only the result may be
available.



o

TP Monitors (Transaction Processing Monitors)

The main function of a TP monitor is to coordinate the flow of
requests between terminals or other devices and application
programs that can process these requests. A request is a message
that asks the system to execute a transaction. The application that
executes the transaction usually accesses resource managers, such
as database and communications systems. A typical TP monitor
includes functions for transaction management, transactional
interprogram communications, queuing, and forms and menu
management.

In banking applications, imagine a crash or power outage
during a withdrawal from or deposit to an account. The
system must roll back the whole operation and leave the
account balance unchanged. In such systems the entire workload is one
unit of work (a transaction): either all of it or none of it gets
done. TP monitors provide primitives, functionality and services to
realize transactions in a distributed system.
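
The all-or-nothing behaviour can be sketched with Python's built-in sqlite3 module. This is a simplification under stated assumptions: a single in-memory SQLite database stands in for the resource managers that a real TP monitor would coordinate across a distributed system, a failed balance check stands in for the crash, and the function name transfer and the account names are hypothetical.

```python
import sqlite3


def transfer(db, source, target, amount):
    """Move `amount` between two accounts as one all-or-nothing unit of work."""
    try:
        db.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, source),
        )
        db.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, target),
        )
        (balance,) = db.execute(
            "SELECT balance FROM accounts WHERE name = ?", (source,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        db.commit()      # both updates become visible together
        return True
    except Exception:
        db.rollback()    # neither update survives
        return False


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
db.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 20)])
db.commit()

ok = transfer(db, "alice", "bob", 30)       # committed
failed = transfer(db, "alice", "bob", 500)  # rolled back, balances untouched
balances = dict(db.execute("SELECT name, balance FROM accounts"))
```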

It can be said that the concept of middleware started from early TP
monitors in OLTP (online transaction processing) and has evolved into
today's middleware concepts.

Message queues and TP monitors can also be combined in a single
system that has the properties of both and represents an
evolution of each of the two concepts.



o

Directory Services

A directory is like a database, but tends to contain more
descriptive, attribute-based information. The information in a
directory is generally read much more often than it is written. As a
consequence, directories don't usually implement the complicated
transaction or rollback schemes that regular databases use for doing
high-volume complex updates. Directory updates are typically
simple, all-or-nothing changes, if they are allowed at all.
Directories are tuned to give quick responses to high-volume
lookup or search operations. They may have the ability to replicate
information widely in order to increase availability and reliability,
while reducing response time. When directory information is
replicated, temporary inconsistencies between the replicas may be
acceptable, as long as they get in sync eventually.

DNS (Domain Name System) is a distributed database that resides
on multiple machines on the Internet and is used to convert
between names and addresses and to provide email routing
information.

LDAP (Lightweight Directory Access Protocol) is a protocol for
accessing online directory services. It runs over TCP/IP and is based
on entries. An entry is a collection of attributes that has a name,
called a distinguished name (DN). The DN is used to refer to the entry
unambiguously. Each of the entry's attributes has a type and one or
more values. The types are typically mnemonic strings, like "cn"
for common name, or "mail" for email address. The values depend
on what type of attribute it is. For example, a mail attribute might
contain the value "babs@umich.edu". A jpegPhoto attribute would
contain a photograph in binary JPEG/JFIF format.

In LDAP, directory entries are arranged in a hierarchical tree-like
structure that reflects political, geographic and/or organizational
boundaries. Entries representing countries appear at the top of the
tree. Below them are entries representing states or national
organizations. Below them might be entries representing people,
organizational units, printers, documents, or just about anything else
you can think of. In addition, LDAP allows you to control which
attributes are required and allowed in an entry through the use of a
special attribute called objectclass. The values of the objectclass
attribute determine the schema rules the entry must obey.
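
The entry model and the role of objectclass can be sketched with plain Python dictionaries. Everything here is an assumption made for illustration: the object class examplePerson and its rules are a toy schema, not a real LDAP schema definition, the DN is made up, and no LDAP library or server is involved.

```python
# Toy schema: for each object class, its required and optional attributes.
SCHEMA = {
    "examplePerson": {
        "required": {"cn", "sn"},
        "optional": {"mail", "jpegPhoto"},
    },
}


def validate(entry):
    """Return a list of schema violations for one directory entry."""
    allowed = {"objectclass"}
    required = set()
    for object_class in entry["attributes"].get("objectclass", []):
        rules = SCHEMA[object_class]
        required |= rules["required"]
        allowed |= rules["required"] | rules["optional"]
    present = set(entry["attributes"])
    problems = [f"missing required attribute: {name}"
                for name in sorted(required - present)]
    problems += [f"attribute not allowed: {name}"
                 for name in sorted(present - allowed)]
    return problems


# Each attribute has a type (the key) and one or more values (the list).
entry = {
    "dn": "cn=Babs Example,o=Example Organization,c=US",
    "attributes": {
        "objectclass": ["examplePerson"],
        "cn": ["Babs Example"],
        "sn": ["Example"],
        "mail": ["babs@umich.edu"],
    },
}

bad_entry = {
    "dn": "cn=Incomplete,o=Example Organization,c=US",
    "attributes": {
        "objectclass": ["examplePerson"],
        "cn": ["Incomplete"],
        "color": ["blue"],
    },
}

entry_problems = validate(entry)
bad_entry_problems = validate(bad_entry)
```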

A trader service is like the yellow pages. There is a language that
can be used to access the trader and that allows you to specify
categories. For instance, one party can use a trader to publish a
system that has 8 processors with a 64-bit architecture and high
speed interconnects, while someone else is looking for such a
machine. That person can use the language of the trader to submit a
query, and the trader will return the object reference, i.e. the
address of this machine.
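
The publish-and-query interaction with a trader can be sketched as follows. The class Trader, its method names and the host names are hypothetical, and a Python predicate stands in for the trader's query language.

```python
class Trader:
    """Toy yellow-pages service: providers export offers, clients query them."""

    def __init__(self):
        self._offers = []

    def export(self, service_type, properties, reference):
        """Publish an offer under a category together with its properties."""
        self._offers.append((service_type, properties, reference))

    def query(self, service_type, constraint):
        """Return references of offers in the category that satisfy `constraint`."""
        return [
            reference
            for offered_type, properties, reference in self._offers
            if offered_type == service_type and constraint(properties)
        ]


trader = Trader()
trader.export("compute", {"processors": 8, "word_size": 64}, "host-a.example:9000")
trader.export("compute", {"processors": 2, "word_size": 32}, "host-b.example:9000")

matches = trader.query(
    "compute",
    lambda p: p["processors"] >= 8 and p["word_size"] == 64,
)
```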



o

Component Oriented Frameworks

Application programming models fall into two categories. The
classic model entails the creation of an application as a single
standalone entity, whereas the component model allows the
creation of an application as a set of reusable components. Perhaps
the most significant recent development is components.
Component-based middleware evolves beyond object-oriented
software: you develop applications by gluing together off-the-shelf
components that may be supplied in binary form from a range
of vendors. This type of development has been strongly influenced
by Sun's JavaBeans and Microsoft's COM technologies.

EJB (Enterprise JavaBeans) is a specification created by Sun
Microsystems that defines a framework for server-side Java
components. EJB makes distributed object technology more
accessible and easier to use by offering an abstraction level that is
higher and therefore more efficient for software development.

COM+ (Component Object Model) is the next generation of
DCOM that greatly simplifies the programming of DCOM. COM
refers to both a specification and an implementation developed by
Microsoft Corporation which provides a framework for integrating
components. This framework supports interoperability and
reusability of distributed objects by allowing developers to build
systems by assembling reusable components from different
vendors which communicate via COM. By applying COM to build
systems of preexisting components, developers hope to reap
benefits of maintainability and adaptability. COM defines an
application-programming interface (API) to allow creation of
components for use in integrating custom applications or to allow
diverse components to interact. However, in order to interact,
components must adhere to a binary structure specified by
Microsoft. As long as components adhere to this binary structure,
components written in different languages can interoperate.

CCM (CORBA Component Model)




o

Database Access Technology

Mediators

ODBC (Open Database Connectivity) was developed to create a
single standard for database access in the Windows environment.

JDBC is Java's API for database access. It provides a
shared language through which applications can talk to
database engines.

It offers a set of interfaces that create a common point at which
database applications and database engines can meet.



o

Application Servers

Application servers are applications that realize most of the
business logic.

The concept of application servers argues that business logic
should reside on the server instead of in a thick client.

The advent of the PC made possible a dramatic paradigm shift
from the monolithic architecture (the user interface, business logic,
and data access functionality are all contained in one application)
of mainframe-based applications. The client/server architecture
was in many ways a revolution from the old way of doing things.
Despite solving the problems with mainframe-based applications,
however, client/server was not without faults. For example, because
database access functionality and business logic were often
contained in the client component, any changes to the business
logic, database access, or even the database itself often required the
deployment of a new client component to all the users of the
application. Usually such changes would break earlier versions of
the client components, resulting in a fragile application.

The problems with the traditional client/server (two-tier
client/server) were addressed by the multi-tier client/server
architecture. Conceptually an application can have any number of
tiers, but the most popular multi-tier architecture is three-tier,
which partitions the system into three logical tiers:

o

User interface layer

o

Business rules layer

o

Database access layer

Three-tier client/server enhances the two-tier client/server
architecture by further insulating the client from changes in the rest
of the application and hence creating a less fragile application.
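
The three logical tiers can be sketched as three small pieces of Python code. This is an illustration only: the names AccountStore, AccountService and render_withdrawal are hypothetical, a dictionary stands in for the database, and all three tiers run in one process, whereas in a deployed system they would typically run on different machines.

```python
class AccountStore:
    """Database access layer: the only code that knows how data is stored."""

    def __init__(self):
        self._balances = {"alice": 100}  # stands in for a real database

    def load(self, name):
        return self._balances[name]

    def save(self, name, balance):
        self._balances[name] = balance


class AccountService:
    """Business rules layer: lives on the server, not in the client."""

    def __init__(self, store):
        self._store = store

    def withdraw(self, name, amount):
        balance = self._store.load(name)
        if amount <= 0 or amount > balance:
            raise ValueError("withdrawal rejected")
        self._store.save(name, balance - amount)
        return balance - amount


def render_withdrawal(service, name, amount):
    """User interface layer: formats input and output, holds no business rules."""
    try:
        return f"New balance for {name}: {service.withdraw(name, amount)}"
    except ValueError as error:
        return f"Error: {error}"


service = AccountService(AccountStore())
accepted = render_withdrawal(service, "alice", 30)
rejected = render_withdrawal(service, "alice", 1000)
```

A change to the withdrawal rule touches only AccountService, and a change of storage touches only AccountStore; the user interface code stays as it is, which is the insulation the paragraph above describes.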

DBMS (Database Management System)

Web Servers



o

References



Course Lecture

Middleware, David E. Bakken, Washington State University

Managing Complexity: Middleware Explained, Andrew T.
Campbell, Geoff Coulson, and Michael E. Kounavis

Middleware: A Model for Distributed System Services

The Design of the TAO Real-Time Object Request Broker,
Douglas C. Schmidt, David L. Levine, and Sumedh Mungee,
Department of Computer Science, Washington University, January
18, 1999

CORBA in 14 Days, Jeremy Rosenberg

Communication Networks: Fundamental Concepts and Key
Architectures, Leon-Garcia and Widjaja

Implementing Remote Procedure Calls, Andrew D. Birrell and
Bruce Jay Nelson