Performance of Web servers in a distributed computing environment



W.K. Ehrlich (a), R. Hariharan (b), P.K. Reeser (a) and R.D. van der Mei (c,d)

(a) AT&T, Middletown, NJ, USA
(b) Sun Microsystems, Austin, TX, USA
(c) KPN Research, Leidschendam, The Netherlands
(d) Vrije Universiteit, Amsterdam, The Netherlands

Over the past few years the applicability of Web technology has evolved from standard document retrieval functionality to the ability to provide end-user interfaces in distributed computing environments. Consequently, Web technology is becoming an integral part of the IT infrastructure of service providers, which raises the need to predict and understand the performance capabilities and limitations of Web servers in a distributed computing environment. In this paper, we propose an end-to-end performance model for Web servers engaged in object-oriented distributed computing, including the impact of new features that have not been included in performance models of Web servers before: the object scope, threading model and location. We have implemented the model in a simulation tool. The performance predictions based on the model are shown to match very well with the performance of Web servers measured in a test lab environment.

1. INTRODUCTION

With the tremendous growth of the World Wide Web, the use of Web technology has become widespread. Over the past couple of years the applicability of Web servers has evolved from standard document retrieval functionality to the ability to provide end-user interfaces in distributed computing environments. Today, Web technology has become an integral part of the IT infrastructure of many service providers. In this context, Web servers are typically used as front-end devices, hiding the business logic and the communication protocols with the remote backend systems (e.g., database management systems, file servers, authentication servers) from the end users (see Figure 1). Typical examples of services supported by such a multi-tiered IT architecture are on-line services, where the PC-connected end users can retrieve and store information that is physically located at different remote backend systems. Specific examples of services that are offered today are PC banking, buying/selling stocks from the PC, or on-line services where the end user can view telephone bills from home.

Figure 1. Web server in a distributed environment (end users connected through a distributed environment to a Web server fronting backend systems).

The ability to offer on-line services has raised the need for service providers to get an understanding of the performance capabilities and limitations of their IT infrastructure, and to predict the end-to-end performance of the services offered to the customers. This, in turn, requires the development and analysis of performance models for Web servers that highlight the critical factors of the performance of Web servers in a distributed computing environment. The results presented in this paper are a significant first step in that direction.



To implement the business logic of Web-based services, the servers involved typically implement a significant amount of server-side scripting. The first generation standard for server-side scripting is the well-known Common Gateway Interface (CGI). Although the use of CGI scripts is widespread, implementation of CGI scripts has a serious drawback: for each invocation of a CGI application a new process is forked and executed, causing significant performance problems on the server. To overcome this performance penalty, Web servers may implement an Application Programming Interface (API) to perform server-side processing without spawning a new process, either by interpreting embedded scripting on Web pages, or by dynamically loading precompiled code. One approach to performing server-side scripting is to implement a script-engine (SE) dedicated to processing server-side scripts. Examples of SE implementations are the Active Server Pages (ASP) technology in Microsoft's Internet Information Server (IIS), or the Java Server Pages (JSP) technology in Netscape's Enterprise Server (NES). In ASP applications, for example, IIS retrieves a file, parses the file for scripting language content, and interprets the scripting statements. Since a script is interpreted, a complex script may slow down the SE. Consequently, some SEs (e.g., VBScript, JavaScript) enable instances of objects (e.g., compiled C++ or Java code) to be created on the Web server. In an object-oriented (OO) Web environment, an object's methods, properties, and events are directly accessible from the script.

In the literature, several papers focus on modeling the performance of Web servers. Slothouber [1] proposes to model a Web server as an open queueing network, ignoring essential lower-level details of the HTTP and TCP/IP protocols. Heidemann et al. [2] present analytic models for the interaction between HTTP and several transport layer protocols (TCP, T/TCP and UDP), including the impact of slow-start algorithms. Dilley et al. [3] present a high-level layered model of an HTTP server, and build a framework to collect and analyze empirical data. Van der Mei et al. [4] propose an end-to-end performance model for standard HTTP transactions, and Reeser et al. [5] derive a fast-to-evaluate analytic approximation for the response times and throughput for the model in [4]. Crovella et al. [16] study various scheduling mechanisms for Web servers based on the document size, and show the counter-intuitive result that under shortest-connection-first scheduling, long connections pay little penalty. Barford and Crovella [17] compare the performance of Web servers implementing HTTP versions 1.0 and 1.1, and show that when the CPU is the bottleneck, there is relatively little difference in performance between HTTP/1.0 and HTTP/1.1. Several other papers focus on modeling the performance of distributed architectures (but are not particularly focused on Web server performance). For example, Rolia and Sevcik [6] propose the Method of Layers to model the responsiveness of distributed applications [7]. Franks et al. [8] use the Stochastic Rendezvous Networks (SRVNs) formalism to model the performance of distributed client-server based systems. Chiola et al. [9] and German et al. [10] use Generalized Stochastic Petri Nets (GSPNs) for the performance modeling of distributed systems. Savino-Vazquez et al. [11] present a tool for predicting the performance of CORBA- and DCOM-based systems.

Despite the fact that significant progress has been made in modeling the performance of distributed systems, a thorough understanding of the end-to-end performance of Web servers incorporating server-side processing in a distributed OO environment is lacking. Therefore, the aim of this paper is to highlight the unique factors affecting Web server performance in the context of OO distributed computing. We propose a new end-to-end performance model for Web servers engaged in an object-oriented distributed computing environment. The model includes new features of object technology (object scope, threading model and location) that have not been included in performance models of Web servers before. We have implemented the model in a simulation tool. The validity of the model is analyzed by comparing the predictions based on the simulations with performance measurements in a test lab environment. Initial test results indicate that the model indeed captures the main performance-limiting factors of Web servers in a distributed computing environment.

In general, the specifics of the transaction flows depend on the Web server implementation and on the server operating system (OS). In this paper, we focus on the dynamics of the ASP technology for the IIS server running on the Windows NT OS. However, we emphasize that analogous constructs are also applicable beyond the NT/IIS/ASP technology.

2. HTTP TRANSACTION FLOWS

Each HTTP transaction proceeds through a Web server along four successive phases: TCP
connection setup, HTTP processing, SE processing,
and network I/O processing. The
different phases (see Figure 2) are briefly reviewed below. The reader is referred to [4,5] for a
detailed discussion and model of the TCP, HTTP, and I/O processing phases.

Figure 2. Model of the basic transaction flows within a Web server (client and network; TCP, HTTP, ASP/JSP servlet engine with objects, and IOC sub-systems; remote application, e.g., a mail engine; TCP flow control over the network connection).

2.1. TCP Connection Setup Phase

Before information can be exchanged between client and server, a two-way connection (a TCP socket) must be established. The TCP sub-system consists of a TCP Listen Queue (TCP-LQ) served by a server daemon. Immediately after the TCP socket is established, the transaction request is forwarded to the HTTP sub-system for further processing. If all slots in the TCP-LQ are occupied upon arrival of a connection request, then the request is rejected and the client must resubmit a connection request.

2.2. HTTP Layer Processing Phase

The HTTP sub-system consists of an HTTP Listen Queue (HTTP-LQ) and a number of multi-threaded HTTP daemons that coordinate the processing performed by a number of (worker) threads. The dynamics of the HTTP sub-system are summarized as follows:


if ( an HTTP thread is available )
then
    accept the request and retrieve the requested file;
    if ( the file requires script processing )
    then
        forward the transaction to the SE sub-system;
    else
        if ( an I/O buffer is available )
        then
            forward the static file contents to the I/O sub-system;
        else
            idle until an I/O buffer becomes available;
else
    if ( an HTTP-LQ slot is available )
    then
        the request enters the LQ and waits until a thread is available;
    else
        reject the transaction request and tear down the TCP connection.
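To make the control flow concrete, the following is a minimal executable Python rendering of the admission logic in the pseudocode above. It captures only the decision, not timing or thread release, and the function name, state dictionary, and limits are all hypothetical illustrations rather than IIS internals:

    def admit_http(state, request):
        """One admission decision of the HTTP sub-system (logic only)."""
        if state["busy_threads"] < state["num_threads"]:     # an HTTP thread is available
            state["busy_threads"] += 1
            if request["needs_script"]:
                return "forwarded to SE sub-system"
            if state["free_io_buffers"] > 0:                 # static file: try the I/O sub-system
                state["free_io_buffers"] -= 1
                return "forwarded to I/O sub-system"
            return "thread idles until an I/O buffer frees"
        if len(state["listen_queue"]) < state["lq_size"]:    # no free thread: try the HTTP-LQ
            state["listen_queue"].append(request)
            return "queued in HTTP-LQ"
        return "rejected; TCP connection torn down"          # HTTP-LQ full

    state = {"num_threads": 2, "busy_threads": 0, "free_io_buffers": 1,
             "lq_size": 1, "listen_queue": []}
    for i in range(5):
        print(i, admit_http(state, {"needs_script": i % 2 == 0}))

The SE sub-system below applies the same accept/queue/reject pattern to its own thread pool and listen queue.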

2.3. SE Layer Processing Phase

The script-engine (SE) dynamics generally depend on the Web server implementation and on the OS. In this section, we focus on the dynamics of the NT/IIS/ASP technology. We emphasize that the aim of this discussion is to give the reader a general understanding of the performance issues related to SE implementations, rather than to discuss ASP technology in great detail. To this end, the terminology used will be generic and may deviate from the ASP-specific terminology. The reader is referred to Microsoft documentation for extensive discussions of the ASP terminology and internals.

The SE sub-system consists of a Listen Queue (SE-LQ) and a pool of (handler) threads dedicated to interpreting scripting statements and executing embedded code (e.g., C++, Java). During object execution, communication with a remote backend server may be needed (e.g., to perform a database query). The dynamics of the SE sub-system are described as follows:


if ( an SE thread is available )
then
    accept the request and execute the embedded object code;
    if ( the HTTP/SE implementation operates in blocking mode )
    /* that is, the HTTP thread blocks on a response from the SE thread */
    then
        forward the results back to the waiting HTTP thread for completion;
    else
    /* the HTTP/SE implementation operates in non-blocking mode */
        if ( an I/O buffer is available )
        then
            forward the results to the I/O sub-system;
        else
            idle until an I/O buffer becomes available;
else
    if ( an SE-LQ slot is available )
    then
        the request enters the LQ and waits until a thread is available;
    else
        reject the transaction request and tear down the TCP connection.



2.4. I/O Processing Phase

The I/O sub-system consists of a number of parallel I/O buffers, an I/O controller (IOC), and the connection from the server to the network. The contents of the I/O buffers are "drained" over the network connection to the client as scheduled by the IOC. The IOC visits the different I/O buffers in a round-robin fashion, checks whether the I/O buffers have any data to send, and if so, places a chunk of data onto the network connection.
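As an illustration, here is a toy Python rendering of this round-robin draining discipline. The buffer contents and chunk size are invented numbers, and the function is a sketch of the scheduling idea only, not the IOC implementation:

    def ioc_scan(buffers, chunk):
        """One round-robin pass of the IOC: visit each I/O buffer in turn and,
        if it holds data, place at most one chunk onto the network connection."""
        placed = []
        for i in range(len(buffers)):
            if buffers[i] > 0:
                take = min(chunk, buffers[i])
                buffers[i] -= take
                placed.append((i, take))
        return placed

    bufs = [10, 3, 7]              # pending data units in three I/O buffers (made up)
    while any(bufs):
        print(ioc_scan(bufs, chunk=4), "remaining:", bufs)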

3. OBJECT SCOPE, THREADING MODEL AND LOCATION

The transaction flows related to the SE depend on the object scope and threading models used, as well as the location of the server object relative to the SE client thread. Below we briefly outline the basic concepts of object scope, threading, and location (for the ASP technology), and their impact on the performance of the SE sub-system.

3.1. Object Scope

Object scope determines whether a single instance of an object is accessible from all scripting pages, or whether the instance is accessible from only a single scripting page. The scope of an object determines both its longevity (lifetime) and the level of contention for the object instance by different SE threads (parallelism). Two extreme examples are Transaction Scope objects (TSOs) and Application Scope objects (ASOs).

A TSO lives only for the duration of a single object request. Each object request results in the client SE thread creating a separate instance of the object, so that a TSO is created, executed, and de-referenced over the course of the request. Multiple TSO requests result in multiple instances of the object. As a result, there is no contention for the same TSO instance by multiple SE threads.

In contrast, an ASO lives for the complete duration of the application. An ASO is created upon start of an application (first request), and is subsequently accessible by all client threads. Only a single instance of the object will exist for the duration of the application. As a result, there may be significant contention for the single ASO instance by multiple SE threads, with the potential for data corruption if the ASO is accessed simultaneously. Consequently, the ASO may have to be locked (synchronized) prior to a client's method invocation on the object, and subsequently unlocked following method execution.

Session Scope objects (SSOs) fall in between TSOs and ASOs in the sense that they live only for the duration of a user's session. An SSO instance is created at the start of a user's session, and each scripting page accessed by a user over the course of a session could then reference the same instance of the object. The reader is referred to [12,13] (and references therein) for more information on object scoping models.
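The three scopes differ only in when an instance is created and how long it is kept. The following sketch of a scope-aware instance registry is a hypothetical illustration (it is not how ASP is implemented internally) that makes the lifetimes explicit:

    class ObjectRegistry:
        """Hand out object instances according to their declared scope."""
        def __init__(self):
            self.app_objects = {}       # one instance per object name, application lifetime
            self.session_objects = {}   # one instance per (session, name), session lifetime

        def get(self, scope, name, session_id, factory):
            if scope == "transaction":
                return factory()        # fresh TSO per request; no sharing, no contention
            if scope == "session":
                key = (session_id, name)
                if key not in self.session_objects:    # SSO: created on first use in session
                    self.session_objects[key] = factory()
                return self.session_objects[key]
            if scope == "application":
                if name not in self.app_objects:       # ASO: single shared instance
                    self.app_objects[name] = factory()
                return self.app_objects[name]
            raise ValueError(f"unknown scope: {scope}")

    creations = {"n": 0}
    def factory():
        creations["n"] += 1
        return object()

    registry = ObjectRegistry()
    for request in range(4):
        registry.get("application", "mailer", session_id=1, factory=factory)
    print("instances created for 4 requests:", creations["n"])   # prints 1: all share one ASO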

3.2. Object Threading Model

Objects can be classified as single-threaded (ST) or multi-threaded (MT). ST objects have so-called thread affinity; that is, the object's methods can only be executed by the dedicated thread that initially created the object. Therefore, ST objects are never accessed concurrently. If another thread needs to access the object's methods, it must request the owner thread to access the ST object on its behalf, and these requests will queue at the owner thread. Consequently, implementers need not worry about ST object synchronization; however, method calls to ST objects are serialized through a specific thread.
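A toy Python illustration of thread affinity, under an assumed owner-thread design (the proxy class and its queueing discipline are invented for illustration, not COM's actual mechanism): calls from any thread queue at the single owner thread, which alone touches the object, so no locking is needed but all calls are serialized.

    import queue
    import threading

    class STObjectProxy:
        """All method calls are funneled through the one owner thread
        that created the object, so execution is serialized."""
        def __init__(self, obj):
            self._obj = obj
            self._calls = queue.Queue()
            threading.Thread(target=self._owner_loop, daemon=True).start()

        def _owner_loop(self):
            while True:
                method, args, result, ready = self._calls.get()
                result["value"] = getattr(self._obj, method)(*args)  # only the owner executes
                ready.set()

        def invoke(self, method, *args):
            result, ready = {}, threading.Event()
            self._calls.put((method, args, result, ready))   # request queues at the owner
            ready.wait()                                     # caller blocks for the reply
            return result["value"]

    class Counter:
        def __init__(self):
            self.n = 0
        def bump(self):
            self.n += 1          # never accessed concurrently: no lock needed
            return self.n

    proxy = STObjectProxy(Counter())
    workers = [threading.Thread(target=lambda: [proxy.invoke("bump") for _ in range(1000)])
               for _ in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(proxy.invoke("bump"))   # 4001: all 4000 calls ran one at a time on the owner thread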



In contrast, an MT object is not owned by a specific thread (including the thread that created it) and can be accessed concurrently by multiple threads. Implementers must therefore protect resources (e.g., shared static variables) used by a single instance of the MT object against concurrent access by making the object "thread-safe" (synchronized). The reader is referred to [14,15] (and references therein) for more details on object threading models.

3.3. Object Location

A third factor impacting Web performance in a distributed OO environment relates to object location; that is, whether the object executes in a different process from the client SE thread (either on the same or on a different physical machine), or whether the object executes within the same process space as the client SE thread.

In some cases, a client does not want the object to run in its process space because an in-process object can read and write to any memory location within the process, thereby reducing process reliability. Consequently, the client will use a proxy that implements only part of the object, with the remaining portion implemented by an out-of-process object on a local or remote machine. The client thread will pass the required parameters from its address space to the address space of the object (thread) through a technique called marshaling. In turn, the server thread that executes the method call on the object requires a stub to marshal data that the object sends back to the client. Consequently, crossing a process boundary is very costly in terms of CPU processing. Furthermore, if the two processes are on different machines, then additional serialization and socket overhead are incurred. Thus, crossing a machine boundary can be even more costly.

In other cases, a client may choose to use an in-process object because it is faster. In the Microsoft COM model, however, marshaling may still occur, in a manner analogous to crossing a process boundary, if the client accesses an object that is not "owned" by the client's thread. If the SE client thread attempts to invoke a method call on an object instance created by a different SE thread, COM will implicitly marshal the interface between the object's thread and the client thread. Consequently, method invocation will be indirect (i.e., the method call will be performed on the proxy by the SE client thread, and the request subsequently passed to the other SE thread for actual execution).
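The cost asymmetry is easy to feel with a toy measurement. The sketch below fakes "marshaling" by serializing the call and its result with pickle; it is only a stand-in for real COM/CORBA marshaling, and the Account class and its numbers are invented:

    import pickle
    import timeit

    class Account:
        def balance(self, customer_id):
            return customer_id * 2          # stand-in for real business logic

    obj = Account()

    def in_process():
        return obj.balance(42)              # direct call, same thread and process

    def marshaled():
        # arguments cross a (pretend) process boundary and back
        method, args = pickle.loads(pickle.dumps(("balance", (42,))))
        return pickle.loads(pickle.dumps(getattr(obj, method)(*args)))

    print("in-process:", timeit.timeit(in_process, number=100_000))
    print("marshaled: ", timeit.timeit(marshaled, number=100_000))

Even this crude stand-in is several times slower than the direct call, and a real cross-process or cross-machine hop adds context switches and socket latency on top.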

4. PERFORMANCE MODELING

The SE sub-system dynamics depend on the scope, threading, and location models, and lead to different performance models depending on the Web server implementation. Several possible performance modeling approaches are addressed below.

The simplest model is obtained in the case when all objects are TSOs. A TSO is created, executed, and de-referenced over the course of a request by a specific SE thread, T. Since the TSO only lives during the course of a single request, there is no concurrency in accessing the TSO by other threads. Therefore, this scenario can be modeled by a multi-server queueing system, where the servers represent threads, the customers represent transaction requests entering the SE sub-system, and the service time represents the CPU time required to create, execute, and de-reference the TSO and process scripting statements associated with thread T.
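For instance, using the open-source simpy discrete-event library (one off-the-shelf option, not the paper's own tool), the TSO case reduces to a few lines; the arrival rate, pool size, and mean service time below are arbitrary illustrative values:

    import random
    import simpy

    rng = random.Random(42)

    def request(env, se_threads, mean_service, response_times):
        t0 = env.now
        with se_threads.request() as req:
            yield req                                             # wait for a free SE thread
            yield env.timeout(rng.expovariate(1.0 / mean_service))  # create+execute+de-reference TSO
        response_times.append(env.now - t0)

    def arrivals(env, se_threads, rate, mean_service, response_times):
        while True:
            yield env.timeout(rng.expovariate(rate))
            env.process(request(env, se_threads, mean_service, response_times))

    env = simpy.Environment()
    se_threads = simpy.Resource(env, capacity=8)   # the multi-server pool of SE threads
    rts = []
    env.process(arrivals(env, se_threads, rate=60.0, mean_service=0.1, response_times=rts))
    env.run(until=100.0)
    print(f"{len(rts)} requests, mean response time {sum(rts)/len(rts):.3f}s")

Setting capacity=1 turns the same sketch into the single-server model for ST ASOs described next.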

A different performance model is obtained for ST ASOs (and SSOs), which are owned by a specific thread, T*, but not de-referenced after execution. Script interpretation is handled by SE thread T. In this case, the ASO's methods can only be accessed by T*, and transaction requests handled by other threads that need to execute the ASO are serialized through T* to execute the object. These dynamics can be modeled by a single-server queue, where the server represents T*, the customers represent transaction requests owned by other threads that need to access the ASO, and the service time represents the CPU time required to serialize the transaction and execute the ASO. Note that it may be possible (although not desirable) that T* owns multiple ASOs, which leads to a similar model.

Another situation arises for MT ASOs (or SSOs), which are not "owned" by a specific thread, and can be accessed concurrently by multiple threads. This situation can be modeled by a multiple- (possibly infinite-) server queueing model, where the servers represent the access ports to the concurrently accessible ASO, and the customers represent transactions or threads that need to access the ASO. The speed at which the active servers work generally depends on the amount of "critical sections" in the ASO object code. In the extreme case where all sections are critical, the server node may be a processor-sharing node (where the speed of the active servers is inversely proportional to the number of active servers). In the other extreme case where the amount of critical sections is negligible, the server speed may be assumed to be independent of the number of active servers.
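A processor-sharing node is also straightforward to simulate directly. The following self-contained toy PS queue (all parameters invented, exponential demands assumed) advances each of the n active jobs at rate 1/n:

    import random

    def simulate_ps(arrival_rate, mean_demand, horizon, seed=1):
        """Toy M/M/1 processor-sharing queue; returns the mean response time."""
        rng = random.Random(seed)
        t, next_arrival = 0.0, rng.expovariate(arrival_rate)
        active = []                    # [remaining_demand, arrival_time] per active job
        total_resp, done = 0.0, 0
        while t < horizon:
            if not active:             # system idle: jump to the next arrival
                t = next_arrival
                active.append([rng.expovariate(1.0 / mean_demand), t])
                next_arrival = t + rng.expovariate(arrival_rate)
                continue
            n = len(active)
            t_done = t + min(j[0] for j in active) * n   # earliest completion at rate 1/n
            if next_arrival < t_done:                    # an arrival happens first
                for j in active:
                    j[0] -= (next_arrival - t) / n
                t = next_arrival
                active.append([rng.expovariate(1.0 / mean_demand), t])
                next_arrival = t + rng.expovariate(arrival_rate)
            else:                                        # a job finishes first
                for j in active:
                    j[0] -= (t_done - t) / n
                t = t_done
                job = min(active, key=lambda j: j[0])
                active.remove(job)
                total_resp += t - job[1]
                done += 1
        return total_resp / max(done, 1)

    print(simulate_ps(arrival_rate=8.0, mean_demand=0.1, horizon=1000.0))

For these parameters the theoretical M/M/1-PS mean response time is E[S]/(1-rho) = 0.1/0.2 = 0.5 s, which the sketch reproduces to within sampling noise.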

Let C denote the resource usage attributed to object creation, E the resource usage attributed to object execution, M the resource usage attributed to marshaling, T the number of transactions per session, and S the number of sessions per application. Then Table 1 summarizes the effect of object scope and threading model on the CPU service time required to process a script file request.

Table 1
CPU service time per object vs. object scope and threading model

  Object Scope    Single-Threaded Object    Multi-Threaded Object
  Transaction     C + E                     C + E + M^1
  Session         C/T + E                   C/T + E + M^2
  Application     C/ST + E + M^3            C/ST + E + M^4
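The formulas in Table 1 can be encoded directly. The helper below is a hypothetical illustration: the function name and all numeric values are invented, C/ST is read as C/(S*T) (creation amortized over every transaction in the application's lifetime), and the four marshaling terms M^1 through M^4 are collapsed into a single M:

    def cpu_service_time(scope, threading_model, C, E, M, T, S):
        """Per-object CPU service time following Table 1."""
        if scope == "transaction":
            creation, marshals = C, threading_model == "MT"
        elif scope == "session":
            creation, marshals = C / T, threading_model == "MT"
        elif scope == "application":
            creation, marshals = C / (S * T), True   # marshaled for ST and MT alike
        else:
            raise ValueError(f"unknown scope: {scope}")
        return creation + E + (M if marshals else 0.0)

    # illustrative values only: creation 2 ms, execution 1 ms, marshaling 3 ms
    for scope in ("transaction", "session", "application"):
        for tm in ("ST", "MT"):
            print(scope, tm, cpu_service_time(scope, tm, C=2.0, E=1.0, M=3.0, T=10, S=100))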

We reemphasize that the above discussion of the modeling of different combinations of scope, threading, and location models is intended only to give a general idea of the type of additional performance issues arising in distributed OO environments. Clearly, refinement of the models may be required in specific implementations.

An HTTP request entering the system requires processing at different stages, executing a different process at each stage. At each stage, the request must obtain a resource (thread or buffer space) and then execute on processor-sharing CPU(s). Thus, the delay through the system is the sum of the time to acquire resources at each stage, and the CPU processing time in each stage. The processing stages include the TCP, HTTP, SE, and I/O sub-systems. The resources include the TCP buffer (LQ), HTTP buffer (LQ), HTTP threads, SE buffer (LQ), SE threads, I/O buffer, and I/O server (IOC).

The TCP, HTTP, and I/O stages can be modeled as multi-server queues [4,5]. The SE stage can be modeled as a multi-server queue in the case of TSOs, and as a single-server queue in the case of ASOs (where every request for a given object needs a particular thread). The HTTP buffer is held until an HTTP thread is assigned. The HTTP thread is held until HTTP processing is complete. (For simplicity, we assume that the Web server operates in non-blocking mode; adaptation to blocking mode is trivial.) The SE buffer is held until an SE thread is assigned. The SE thread is held until SE processing is complete and an I/O buffer is assigned. All CPU processing can be modeled as a processor-sharing queue.

The flow of a typical request that requires processing of a TSO is shown in Figure 3. In the case of SSOs (see Figure 4), the request waits for a particular SE thread (the thread that created the object for the session) and the CPU processing time includes only the execution time (because the object is created just once for each user).

In the case of ASOs (Figure 5), the request is first picked up by any SE thread. This thread then marshals the request to the thread that created this object, and that thread executes the object. Thus, the steps involved are 1) get an SE thread, 2) CPU time for marshaling, 3) get the SE thread that created the object, and 4) CPU time for execution.

Figure 3. Web server flow for Transaction Scope Object requests (per HTTP request: set up TCP connection, get HTTP buffer, get HTTP thread, HTTP CPU processing, get SE buffer, get SE thread; per object: SE thread creates, executes, and destroys the TS object; then get I/O buffer, IOC processing).

Figure 4. Web server flow for Session Scope Object requests (as Figure 3, except that the SE thread creates the SS object only the first time in the session and destroys it the last time in the session).

Figure 5. Web server flow for Application Scope Object requests (as Figure 3, except that the AS object is created the first time in the application and destroyed the last time in the application).

These models can be implemented in any modern simulation tool. The specific details of our implementation are omitted here for compactness. The main inputs to the models include the server hardware/OS configuration parameters (number of CPUs, TCP-LQ size, number and size of I/O buffers); the network configuration parameters (network connection speed, modem speed, network and backend RTTs); the application software configuration parameters (HTTP- and SE-LQ sizes, number of HTTP and SE threads); and the application workload parameters (transaction request rate, HTTP and SE CPU times, backend processing times). The output parameters in the current implementation include server throughput, end-to-end response time (mean and standard deviation), and proportion of transactions that are blocked at the TCP, HTTP and SE sub-systems.
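To indicate the shape such an implementation can take (our own tool is not reproduced here), the following simpy sketch wires the TSO flow of Figure 3 together end to end in non-blocking mode. Every pool size and timing constant is an invented placeholder, not a measured parameter:

    import random
    import simpy

    rng = random.Random(7)
    P = {"http_cpu": 0.002, "create": 0.003, "execute": 0.005, "io_drain": 0.004}  # seconds, made up

    def tso_request(env, res, rts):
        t0 = env.now
        with res["http_threads"].request() as h:
            yield h
            yield env.timeout(P["http_cpu"])               # HTTP-layer CPU processing
        # non-blocking mode: the HTTP thread is released once the request is handed to the SE
        with res["se_threads"].request() as s:
            yield s
            yield env.timeout(P["create"] + P["execute"])  # TSO created, executed, de-referenced
            io = res["io_buffers"].request()
            yield io                                       # SE thread held until an I/O buffer is assigned
        yield env.timeout(P["io_drain"])                   # IOC drains the response to the client
        res["io_buffers"].release(io)
        rts.append(env.now - t0)

    def workload(env, res, rate, rts):
        while True:
            yield env.timeout(rng.expovariate(rate))
            env.process(tso_request(env, res, rts))

    env = simpy.Environment()
    res = {"http_threads": simpy.Resource(env, capacity=16),
           "se_threads": simpy.Resource(env, capacity=8),
           "io_buffers": simpy.Resource(env, capacity=32)}
    rts = []
    env.process(workload(env, res, rate=400.0, rts=rts))
    env.run(until=30.0)
    print(f"throughput {len(rts)/30.0:.0f} req/s, mean response {1000*sum(rts)/len(rts):.2f} ms")

The SSO and ASO flows of Figures 4 and 5 follow by moving object creation out of the per-request path and, for ASOs, routing execution through the single owner thread.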

5. VALIDATION RESULTS

To validate the accuracy of the performance predictions based on the model discussed in section 4, we have performed a variety of experiments in a test environment. In this section we present a comparison between throughput predictions regarding object scope obtained in our simulation and measured values obtained in the test environment.

For each experiment, user requests accessed script files containing a fixed number of references to the same ST object. Each simulated user repeatedly issued requests (one at a time) in a loop. The number of users was increased until saturation was observed. The experiments were repeated with the referenced object declared at transaction, session, or application scope. Thus, in the cases of transaction and session scope, each user request generated its own instances of the object, while in the case of application scope, every user request shared the same instance of the referenced object. These experiments are not representative of typical scenarios in practice, but are intended to exaggerate and highlight the performance impacts resulting from different object scope. Figure 6 depicts the predicted and observed average throughput as a function of the number of test clients (simulated users) for the case of 15 ST object references/request.

Figure 6. Simulated vs. measured throughput for 15 ST objects (throughput in TPS vs. number of concurrent users, with model and test curves for the SSO, TSO, and ASO cases).

As can be seen, object scope has a tremendous impact on Web server performance. Request throughput varies by more than 3x depending on object scope, with session scope achieving the highest throughput and application scope realizing the lowest throughput.

It is not surprising that session scope performs the best for these contrived experiments, since this scope represents a reasonable balance between excessive object creation (with transaction scope) and excessive object/thread contention (with application scope). However, the magnitude of the performance penalty is surprising. Also, it would have been difficult to predict the winner between transaction and application scope, since object creation is synchronized, so both scopes result in significant code serialization. However, the results clearly demonstrate the severe penalty incurred when marshaling is required (as with ST application-scope objects, or MT objects of any scope).

The results also demonstrate that our understanding of the internals of the Web server captures the major object considerations that affect throughput. In particular, the model achieves an excellent fit to test data in the cases of transaction and application scope objects, and a reasonable fit (i.e., within 10%) in the case of session scope objects.

We reemphasize that Web server performance is a very complex interplay between many different factors, and that the results should be judged from that perspective. Viewed in that light, the results demonstrate that our model captures the major aspects that affect throughput, and as such is a significant step forward in understanding Web server performance in OO distributed computing.

6. DISCUSSION

The ultimate objective for both testing and modeling Web server performance is to identify appropriate configuration guidelines for deploying Web servers. Although testing is an important technique for assessing Web server performance, it has several severe drawbacks. In particular, load/stress testing is extremely time-consuming and tedious, testing alone is of limited applicability beyond the test workload, and testing alone cannot predict the performance tradeoffs in advance of major new software releases. Hence, modeling is critical to further understand the performance capabilities and limitations of Web servers. We reemphasize that a simulation model of a Web server is extremely useful as a Decision Support System for system architects, allowing them to predict the behavior of systems prior to their deployment, or the behavior of existing systems under new workload scenarios.

For Web servers that engage in OO computing, it is important to analyze factors that affect the threads responsible for script/object execution. These factors include whether the scripting thread pool is synchronous or asynchronous, whether objects are single- or multi-threaded, whether objects execute within the Web server's process space, thread affinity, and thread message filtering. Note that these factors can be contrasted with factors previously investigated in Web server performance studies (e.g., file-size distribution, transaction request arrival process, TCP window control algorithms, or HTTP version). A simulation model offers an ideal tool for such an investigation.

It is interesting to note that the performance model discussed above contrasts with most of the work on teletraffic engineering and performance modeling over the past few years. The vast majority of papers on the modeling and analysis of telecommunication systems focuses on the proper allocation of bandwidth to different types of users in the network. Examples of issues that have been addressed in many papers are admission control, equivalent bandwidth, traffic shaping, usage parameter control, traffic buffering priority mechanisms (e.g., weighted fair queueing, generalized processor sharing, weighted random early discard), and traffic and workload characterization (e.g., Markovian models, self-similarity, long-range dependence). In contrast, in this paper the performance limitations of the network nodes, rather than the connections between these nodes, are investigated.

In this context, one may expect a paradigm shift in telecommunications systems over the next few years. Bandwidth tends to become cheaper (at least in wired networks), so that bandwidth may no longer be a scarce commodity in many cases. Instead, with the strongly increasing demands for processing power at the network nodes (e.g., server-side scripting overhead, object creation and de-referencing overhead), it is not unlikely that teletraffic engineers will tend to focus on the performance of network nodes. The results presented in this paper are among the first steps in that direction.

7. TOPICS FOR FURTHER RESEARCH

The validation experiments performed in the test lab environment demonstrate that the model indeed captures the major aspects that affect throughput. Therefore, the model presented in this paper is a significant step forward in getting an understanding of the performance of Web servers engaged in OO distributed computing. However, despite the fact that the test results are rather accurate, it is useful to further analyze the sources of inaccuracy in the simulation model. Some of the sources for differences between the simulated and measured throughput include the following. First, for the CPU service time parameter values, the input to the simulation is based on a linear regression of the test measurements, and errors from this fit affect the model input and hence output. In particular, our simulation does not model OS non-linearities (e.g., due to context switching) that may lead to relatively lower measured throughput under high load conditions. Second, the implementation-specific details of the thread spawning algorithm in IIS are (to the best of our knowledge) not available in the public domain. Therefore, our understanding of how and when threads are created in IIS is far from complete. This behavior certainly has an effect on the throughput in the TSO and SSO cases, and further research on the details of thread spawning algorithms is needed.

The model presented here can be extended in several directions. First, in the current implementation the object threading model is assumed to be ST. To study the impact of object threading models in more detail, we are currently implementing the MT dynamics. Second, in this paper the modeling of the SE node has focused on the dynamics of the Active Server Component (ASC) of Microsoft's IIS Web server. It is a challenging topic for further research to incorporate other SE implementations (e.g., servlet technology) into the model and study the impact on end-to-end Web server performance.


Acknowledgments: The authors are indebted to Pravin Johri and John Francisco for their significant contributions to this project. In addition, the authors would like to thank the ITC reviewers for their insightful comments.



REFERENCES


1. Slothouber. A Model of Web Server Performance. http://louvx.biap.com/whitepapers/performance/overview/.
2. Heidemann et al. (1997). Modeling the performance of HTTP over several transport protocols. IEEE/ACM Trans. Netw. 5, 616-630.
3. Dilley et al. (1998). Web server performance measurements and modeling techniques. Perf. Eval. 33, 5-26.
4. Van der Mei et al. (2001). Web Server Performance Modeling. Telecommunication Systems 16, 361-378.
5. Reeser et al. (1999). An Analytic Model of a Web Server. In: Teletraffic Engineering in a Competitive World: Proceedings of the 16th ITC, 1199-1208.
6. Rolia and Sevcik (1995). The method of layers. IEEE Trans. Software Eng. 21, 689-699.
7. Hills, Rolia, and Serazzi (1995). Performance engineering of distributed software process architectures. Lecture Notes in Computer Science 977 (Springer), 79-85.
8. Franks et al. (1995). A toolset for performance engineering and software design of client-server systems. Perf. Eval. 24, 117-126.
9. Chiola, Franceschinis, Gaeta and Ribaudo (1995). GreatSPN 1.7: graphical editor and analyzer for timed and stochastic Petri nets. Perf. Eval. 24, 47-68.
10. German, Kelling, Zimmermann and Hommel (1995). TimeNET: a toolkit for evaluating non-Markovian stochastic Petri nets. Perf. Eval. 24, 69-88.
11. Savino-Vazquez, Anciano-Martin, Dumas, Corbacho, Puigjaner, Boudigue, and Gardarin (2000). Predicting the behaviour of three-tiered applications: distributed-object technology and databases. Perf. Eval. 39, 207-233.
12. Homer, Enfield, Gross, Jakab, Hartwell, Gill, Francis, and Harrison (1997). Professional Active Server Pages (Wrox Press).
13. Box. ASP and COM apartments. http://www.develop.com/dbox/aspapt.asp.
14. Box (1997). Essential COM (Addison-Wesley Longman).
15. Rogerson (1997). Inside COM (Microsoft Press).
16. Crovella, Frangioso and Harchol-Balter (1999). Connection scheduling in Web servers. Proceedings of the USENIX Symposium on Internet Technologies and Systems.
17. Barford and Crovella (1999). A performance evaluation of hypertext transfer protocols. Proceedings of the ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 188-197.