Chapter 21 - Web Services - Delmar


3 Nov 2013

Web Services

Chapter 21

Chapter Goals

Understand the terminology of the WWW.

Understand web clients (browsers).

Understand web servers.

Understand client and server security issues.

Understand web performance issues.

What is the World Wide Web (WWW)?

The World Wide Web is a client-server based application
originally developed to distribute documentation.

Researchers at various locations, notably the National Center for
Supercomputing Applications at the University of Illinois, extended
the original design to include the distribution of a wide variety of
media, including small applications or applets.

WWW clients, known as browsers, make requests from WWW
servers and display the results in the form of a page.

Pages and other resources are referenced using a universal
resource locator (URL).

The format of a URL is a resource type tag, followed by the
name of the system holding the resource, followed by the
path to the resource that may include option flags and
other data.
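
The URL layout described above can be illustrated with a short Python sketch. It relies on the standard library's urllib.parse module; the example URL, including the host www.example.com and port 8000, is a made-up placeholder and not something taken from this chapter.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the pieces named in the text above."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,   # the resource type tag, e.g. "http"
        "host": parts.hostname,   # the system holding the resource
        "port": parts.port,       # None when the URL gives no port
        "path": parts.path,       # path to the resource
        "query": parts.query,     # option flags and other data
    }


# Hypothetical URL used only for illustration.
info = describe_url("http://www.example.com:8000/docs/index.html?lang=en")
print(info)
```

The same function applies to other resource types, since only the scheme field changes.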

Web pages are written in HyperText Markup Language (HTML).

A single web page may include text, graphics and other
elements from one or more servers.

HTML and the format of other page elements are
standardized allowing a given web page to be rendered
and viewed on a wide variety of web browsers.

Web pages can also include forms and buttons. These
allow data to be entered into the page via the web
browser and communicated back to the web server.

Web Clients

Administering WWW clients is primarily a matter of keeping up to
date with browser and page content development.

At present, leading browsers are undergoing rapid development.

New versions of some browsers are available as frequently as every
few weeks.

New page content, in the form of new media data types, is
continually being developed.

Not all media types are directly viewable by a given browser and not
all pages follow the HTML specifications closely enough to be
properly rendered by all browsers.

Additional software may be needed to view certain content types
such as video, animated pictures and menus.

Such additions to the browser come in two flavors:

(1) extensions to the browser program itself, often called plug-ins, and

(2) separate applications started under the browser’s control,
known as helper applications.

Plug-ins can be categorized into two major groups based on the
programming interface (API) they use.

One group is designed for Microsoft’s Internet Explorer API, and the
other group is based on the Netscape API.

Most browsers, such as Mozilla, Opera and Konqueror, use the
Netscape API and are able to make use of plug-ins designed for
that API.

Plug-ins are further categorized by processor architecture and
operating system like other application software.

As one would expect, the widest selection of plug-ins for various
media types is for Internet Explorer on Microsoft Windows on Intel
processors.

Fewer plug-in choices are available for Mac OS X and Linux and
very few plug-ins are available for other UNIX variants.

Helper applications are standalone programs that the browser runs
to display content in formats not supported by the browser itself or a
plug-in.

A typical helper is Real’s RealPlayer audio and video player.

When a user clicks on a link to a RealPlayer video clip, the
browser starts the player and either passes along the URL or
downloads the video clip and passes the filename of the clip
to the player, depending on how the clip is specified on the
web page.

The system administrator needs to be aware of the media types his
users will need to view.

Macromedia’s Flash animation player plug-in and Real’s
RealPlayer audio and video player are two typical additions to the
base web browser that are widely used to display content found
on many web sites.

Some sites offer less common media types such as VRML or
other 3D images, Windows Media Player audio or video,
QuickTime video, and others.

Client Security Issues

Web browsers present several security problems revolving around
the issues raised by “active content”.

Active content is a program or script that is downloaded as part of
a web page and used to provide active features such as
animated menus, special page rendering effects, error checking
in forms and other features.

Most web browsers have the JavaScript scripting language built
in.

Additionally, most browsers include a Java interpreter either built
in or as a plug-in.

Some plug-ins such as the Macromedia Flash player
interpret active content and can be considered similar to a
scripting language in terms of their programmability.

Internet Explorer on Windows systems adds the capability of
both Windows scripting and executable applets known as
ActiveX controls.

The range of mischief an executable applet or script could potentially
cause is large.

Web browsers, Java, JavaScript interpreters and other content
viewers are designed with this in mind and combat the problem in
varying ways. However, bugs in these tools have appeared over
time and continue to appear making the display of active content
a risky activity.

Fortunately, most browsers allow the user to optionally turn off the
execution of Java applets, JavaScript programs and other active
content.

Turning these off will disable certain interactive features of some
web pages.

The desirability of turning these features off to gain additional
security must be weighed against the requirements of the
applications the user has and the web pages they need to view.

Bugs in the browser itself constitute another common problem.

Browsers are complex, often including their own Java virtual
machine as well as internal versions of ftp and other network
tools.

System managers at sites concerned about security should
continually monitor the browser vendor Web pages for updates
that address security problems.

WARNING: There are numerous security vulnerabilities associated with
downloaded applets and scripts on Microsoft Windows platforms that
can affect the security of other systems on a network. These include the
unintended installation of malicious software that may examine or
disrupt network traffic or adversely affect the operation of servers and
other networked systems. Security-conscious sites need to consider not
only the security of their servers, but also the risks involved in their
choice of client platforms and software.

Another client security issue is referring page information.

Many web browsers pass along the URL of the page they came
from to the web server of the next page they load.

This is done to help web sites track how people get to their
site. However any information encoded in the URL is
passed along as well.

Such additional data may include information believed to be
secure if the browser moves from a secure page to an
unsecured page.

Many Web sites avoid this problem by “wiping the
browser’s feet” via directing the browser to a blank or
unrevealing page after requesting secure information.

By default, many browsers will alert users to this problem
by posting an alert message when the user moves from a
secure page to an unsecured page.
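
The referring-page leak described above can be modeled with a small Python sketch. It is only an illustration of the idea, assuming the browser forwards the full referring URL; the function name exposed_parameters and both example URLs are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def exposed_parameters(referring_url, next_url):
    """Return query data that would reach next_url's server when the
    browser moves from a secure (https) page to an unsecured (http) one.

    An empty dict means nothing is exposed under this simple model.
    """
    ref = urlsplit(referring_url)
    nxt = urlsplit(next_url)
    if ref.scheme == "https" and nxt.scheme == "http":
        return parse_qs(ref.query)
    return {}


# Hypothetical URLs: data encoded in the secure page's URL is passed along.
leak = exposed_parameters(
    "https://shop.example.com/order?account=12345",
    "http://ads.example.net/banner",
)
print(leak)
```

A site that redirects to an unrevealing page first, as described above, changes the referring URL so that the query data is no longer present.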

Modern browsers are capable of storing small pieces of
information from Web sites such as a password or usage
history.

These bits of information are known as “cookies.”

The security preferences dialog box allows those
concerned about cookies to disable them or have the
browser announce the delivery of a cookie from the Web
site.

Turning off cookies will disable password memory and
history features of some Web sites.

The decision to turn off cookies depends on the user’s
concerns about her privacy and the Web pages she views
most often.

Web Servers

Installing and configuring a Web server is a much more
involved process than configuring a web browser.

A Web server is a very complex daemon with numerous features
that are controlled by several configuration files.

Web servers not only access files containing web pages,
graphics and other media types for distribution to clients, they
can also assemble pages from more than one file, run CGI
applications, and negotiate secure communications.

Security and performance issues are near the top of the list when
choosing, installing and configuring any web server.

Choosing a Web Server

Choosing a web server involves an evaluation of
several related factors.


Web servers that serve web pages on the Internet face
an extremely hostile environment.

They are the point of attack for persons interested in entering
a system, stealing data or simply defacing web pages.

Web servers must properly handle a wide range of input data
without fail.

Programs run via the web server, such as through the Common
Gateway Interface (CGI), must likewise deal with possibly
malicious input data and explicit attempts to exploit them.


Serving web pages is often a highly
I/O intensive task.

Many web pages are constructed “on the fly” from the
output of programs or as the result of a database
query.

The performance of a web site is dependent on the
performance of all the components that feed into the
web pages being served.

Included in this is the performance of the system the
web server resides on, the network it is connected to
and the data storage facility being used.


Some web servers are available for only
one operating system platform.

Some CGI programs, database interconnections and
other data sources are available for only selected
platforms.

A careful inventory of the desired CGI programs and
data sources is helpful in reducing the range of choices
to those where the needed software is available.

Viewed another way, if a specific platform has already
been selected, a review of the web servers, CGI
programs, etc. that are available for the selected
platform can help guide the development of the web
site.

WARNING: Based on a long string of security problems,
culminating in the infamous Code Red and Nimda worms,
many organizations have moved away from Microsoft’s
Internet Information Server (IIS) web server. Moving away
from IIS is also the recommendation of the Gartner Group.


The most widely used web server on the Internet,
Apache, is available for all UNIX variants and
Windows NT and later.

Many UNIX variants such as Red Hat Linux, Mac OS
X and Solaris ship Apache as part of the operating
system distribution.

For those that do not, Apache is freely available in
source code form.

Aside from its wide acceptance, Apache offers a
comprehensive suite of configuration options and
features found on many other web servers.

Server Add-ons

If a web server were all that was needed to set up a web site, life
would be pretty easy for the system administrator and web master.
However, the typical web server is extendable via several methods.

Common Gateway Interface (CGI)

The most common route to
extending the functionality of the web server is via CGI.

Web pages can refer to CGI programs and data from forms
can be passed to them.

Web pages can be created on the fly by CGI programs that
send data via the web server directly to the client web
browser.

CGI programs might be Perl scripts, Python scripts, or even
compiled binaries.
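
As a rough sketch of what such a CGI program looks like, the Python script below reads form data from the QUERY_STRING environment variable (part of the CGI convention) and writes a page to standard output. The function name build_response and the form field "name" are illustrative assumptions.

```python
import html
import os
from urllib.parse import parse_qs


def build_response(environ):
    """Build a complete CGI response: a header block, a blank line, a body."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical form field; fall back to a default value.
    name = params.get("name", ["world"])[0]
    # Escape the value so user input cannot inject markup into the page.
    body = "<html><body><p>Hello, {}!</p></body></html>".format(html.escape(name))
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # The web server sets the environment and relays stdout to the browser.
    print(build_response(os.environ), end="")
```

Placed in the server's CGI directory and made executable, a script of this shape would be run once per request.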

Application Servers

Tools such as Zope and PHP
provide templates for building web pages.

These templates form an entry point into a scripting
language and access to databases easing the
development of dynamically created web pages.


Analogous to web browser plug-ins, modules
extend the web server by directly adding functions.

Like web browser plug-ins, modules are specific to a
particular web server and match that web server’s API.

Status reporting, performance enhancements such as a
built-in Perl interpreter, encryption utilities, and even URL
spelling correction are some of the modules that are
available for the Apache web server.

Web Server Installation

Apache is available in both binary form from some
vendors and in source code form for all systems.

While a binary distribution saves time, it does not offer
the level of control that building from sources offers.

To prepare for an installation from source code, make
an inventory of the Apache modules that the web site
will require.

Also, check that the needed build tools are available.

Apache is built using the “configure and make” procedure
common for many open source packages.

Like other packages that use the configure utility,
typing “configure --help” will produce a list of all of the
available option flags.

Additional modules not found in the base Apache
distribution may require additional work.

For example, adding mod_ssl to provide secure
web connections requires that the OpenSSL
package be installed first and that an environment
variable, SSL_BASE, containing the path to
OpenSSL be set when Apache is configured.

Web Server Configuration

Current versions of the Apache web server are configured via a
series of directives kept in a plain text file, httpd.conf.

The Apache server distribution includes a set of sample files
that the system administrator can modify.

Over 100 configuration options can be applied to control the
behavior of the Apache Web server.

Directives in the configuration files are case insensitive, but
arguments to directives are often case sensitive.

Long directives can be extended by placing a backslash at the
end of the line as a continuation character.

Lines beginning with a pound sign (#) are considered comments.
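
The three syntax rules above (case-insensitive directive names, backslash continuation, and # comments) can be sketched as a tiny Python reader. This is a simplified illustration, not Apache's actual parser; it ignores container sections such as <Directory>, and the sample values are placeholders.

```python
def parse_directives(text):
    """Return (directive, arguments) pairs from httpd.conf-style text."""
    directives = []
    pending = ""  # holds the start of a line continued with a backslash
    for raw in text.splitlines():
        line = raw.strip()
        if not pending and (not line or line.startswith("#")):
            continue  # skip blank lines and comments
        if line.endswith("\\"):
            pending += line[:-1].rstrip() + " "
            continue
        line = pending + line
        pending = ""
        name, _, args = line.partition(" ")
        # Directive names are case insensitive; arguments are left untouched.
        directives.append((name.lower(), args.strip()))
    return directives


# Hypothetical sample configuration text.
SAMPLE = (
    "# Comments and blank lines are skipped\n"
    "ServerName www.example.com\n"
    "SERVERADMIN \\\n"
    "    webmaster@example.com\n"
)
parsed = parse_directives(SAMPLE)
print(parsed)
```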

A few of the most basic options to review when setting up a
new Web server are examined in the next section.

Basic Apache Directives

At a minimum, the system administrator will want to modify the User,
Group, ServerAdmin, ServerRoot, ServerName and DocumentRoot
lines to reflect the local site.

The User and Group lines specify the user id and group id that
the Web server will operate under once started.

The ServerAdmin is an e-mail address to which the server can
send problem reports.

The ServerRoot specifies the installation directory for the server.

The ServerName is the name the server returns to clients.

The DocumentRoot directive sets the base directory for the default web
pages served by the web server.

The Alias lines may also require updating to reflect the location of
icons and other local files.

The Alias lines allow Web page designers to use shortened
names for resources such as icons instead of specifying full
paths.

UserDir WWW

Alias /icons/ /usr/local/http/icons/

ScriptAlias /cgi-bin/ /usr/local/http/cgi-bin/

Besides making Web page construction easier by providing short
names for icons and CGI programs, these directives allow
access to users’ Web pages.

The UserDir line specifies the subdirectory each user can
create in his home directory to hold Web pages.

This directory, WWW in the example, is mapped to the user’s
username as follows.

A user whose username is bob has his WWW directory
mapped to the URL path /~bob on the web server.
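
The mapping from a /~username URL path to a directory can be sketched in Python as follows. This only illustrates the idea and is not how Apache implements UserDir; the home directory base /usr/users is an assumption borrowed from the access control example later in the chapter, and the sketch does not guard against ".." components in the path.

```python
import posixpath


def userdir_path(url_path, userdir="WWW", home_base="/usr/users"):
    """Map a URL path such as /~bob/index.html to a file under bob's WWW
    directory. Returns None when the path is not a user directory request."""
    if not url_path.startswith("/~"):
        return None
    user, _, rest = url_path[2:].partition("/")
    if not user:
        return None
    return posixpath.join(home_base, user, userdir, rest)


print(userdir_path("/~bob/index.html"))
```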

By default, the Apache Web server will display the index.html
file in that directory, or a directory listing if the index.html file
is not found.

This indexing behavior can be controlled by a set of
directives, including IndexIgnore and IndexOptions.

IndexOptions in particular has numerous options.

A new installation of Apache may also require changing
the <Directory> directives to indicate where the server
should look for documents to serve and for CGI
programs.

For example, if the server is installed in
/usr/local/apache with the documents and CGI
programs in directories under that directory, the
following <Directory> line may be necessary.

<Directory /usr/local/apache/htdocs>

NOTE: The “user” and “group” directives in the httpd.conf file
have significant security implications.

The “nobody” user is used to severely limit the access
privileges the web server has in order to limit what an
attacker might be able to access via the web server.

These directives also specify the default user under which any
CGI program is run.

Limiting the privileges that a CGI program has access to is an
important step in making the CGI program secure.

Server Modules

One of the more useful features found in the Apache web server is
the use of modules to extend the base server functionality.

These modules provide such services as web server status
monitoring, encrypted connections, URL rewriting and adding
native versions of CGI tools such as Perl.

For modules that are built as part of the standard Apache build,
activating them is a matter of calling the directive associated with
the module.

For example, here are the lines required to activate the
mod_status module that allows the administrator to query the
web server for status information.

<Location /server-status>
SetHandler server-status
Order deny,allow
Deny from all
Allow from
</Location>

The Location directive describes the “page” that is used to
view that status information, while SetHandler specifies the
status entry to the mod_status module.

The triple of Order, Deny and Allow directives controls
access to this “page” limiting it to only hosts within the
specified domain.
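
How the Order, Deny and Allow triple combines can be sketched in Python. This models the documented behavior of the two Order settings in a simplified way (host names only, no IP ranges); the function name is_allowed and the domain example.com are placeholders, since the domain is not given here.

```python
def _matches(host, patterns):
    """True if host matches 'all' or one of the (partial) domain names."""
    return any(
        p == "all" or host == p or host.endswith("." + p.lstrip("."))
        for p in patterns
    )


def is_allowed(host, order, allow, deny):
    """Evaluate access for host under 'deny,allow' or 'allow,deny'."""
    allowed = _matches(host, allow)
    denied = _matches(host, deny)
    if order == "deny,allow":
        # Deny is evaluated first and Allow can override it;
        # a host matching neither list is allowed.
        return allowed or not denied
    if order == "allow,deny":
        # Allow is evaluated first and Deny can override it;
        # a host matching neither list is denied.
        return allowed and not denied
    raise ValueError("unknown order: " + order)


# "Deny from all" plus "Allow from .example.com" (placeholder domain).
print(is_allowed("www.example.com", "deny,allow", [".example.com"], ["all"]))
print(is_allowed("host.example.net", "deny,allow", [".example.com"], ["all"]))
```

With Order deny,allow and Deny from all, only hosts matched by an Allow line get through, which is the effect described above.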

The URL used to access this page is the server’s name followed by
/server-status.


A more complex module to configure is mod_ssl.

This module provides the encryption used for secure web pages.

Before using SSL, a certificate to be used in the authentication of
the server will need to be purchased from a certificate authority
such as Thawte or generated and signed locally.

The locally generated certificates, also called self-signed
certificates, will be flagged by web browsers and require the user
to acknowledge them before viewing the web site.

The web browser can authenticate certificates purchased from a
certificate authority without any user interaction.


Next, several directives will need to be added to the Apache
configuration file to enable SSL and specify the content to be
accessed using an encrypted connection.

Here is an example that enables ssl using high quality encryption
and specifies content to use the encrypted connection.

SSLProtocol all

SSLVerifyClient none
SSLCACertificateFile conf/ssl.crt/ca.crt
<Location /secure/area>
SSLVerifyClient require
SSLVerifyDepth 1
</Location>


The ssl module has 22 directives and provides fine
control over the security of the connection.

The effort required to obtain a certificate and configure
secure web connections is well worth it.

Secure web connections form the basis of many other
services.

Two examples are web-based e-mail and web-based
remote system management.

The end-to-end encryption supplied by SSL is especially
important when remote users are utilizing potentially
insecure networks such as wireless networks, or network
connections offered at conferences or hotels.

Mime types

Web servers can serve an almost limitless range of file
types.

The mime.types file includes the mapping from a MIME type to
a file extension.

The most common types are provided in the sample file
provided with the Apache distribution.
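
The extension-to-type mapping can be tried out with Python's standard mimetypes module, which keeps a similar table. This is only an illustration of the lookup, not of Apache's own code; the helper name content_type_for and the fallback type are choices made for this sketch.

```python
import mimetypes


def content_type_for(filename, default="application/octet-stream"):
    """Look up the MIME type a server might send for a file name."""
    guessed, _encoding = mimetypes.guess_type(filename)
    # Unknown extensions fall back to a generic binary type.
    return guessed or default


for name in ("index.html", "logo.png", "archive.unknownext"):
    print(name, "->", content_type_for(name))
```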

Server Security Considerations

Web servers present a difficult security
problem.

They must be widely accessible to be useful, but
tightly controlled to prevent security breaches.

They must be tolerant of any requests submitted to
them, including requests specifically constructed to

gain unauthorized access to files or

to exploit bugs in application servers, CGI programs or
the web server itself.

Ports 80 and 443

By default a web server listens on port 80 for plaintext requests
and port 443 for SSL connections.

These are well-known ports and will be examined by attackers.

The port a web server listens on can be changed via the server
configuration file, however this will cause web browsers to be
unable to connect to the server unless the port number is
included in the URL specification.

For example, if the web server were set to
listen on port 8000, the URL for the server’s default page would
need to include :8000 after the host name.

WARNING: Changing the port on which a web server listens for requests
does not improve the security of the server. An attacker
can locate the web server by scanning all of the ports open on the
system.

File Access Control

The control files which determine the Web server’s function as well
as the log files it produces should not be accessible to the user ID
the Web server runs under.

Individuals attempting to gain unauthorized access are thwarted
to the extent that they cannot obtain information about the Web
server’s configuration and function.

One way to tightly control access is to set the default Apache
access rule to deny, and open up only those directories that
contain content to be distributed.

For example, the httpd.conf directives shown below set the
default access to deny and open up access to user web
directories and a system default web page area.

# Set default access to deny
<Directory />
Order Deny,Allow
Deny from all
</Directory>

# Allow access to users’ web directories
<Directory /usr/users/*/WWW>
Order Deny,Allow
Allow from all
</Directory>

# Allow access to the system web directory
<Directory /usr/local/httpd/WWW>
Order Deny,Allow
Allow from all
</Directory>

In addition to the access controls found in the web server
configuration files, many web servers provide access control for
individual user directories by means of control files found in those
directories.

Apache uses a file called “.htaccess” which contains directives
specifying access.

For example, one could restrict access to a particular directory to
a specific domain by placing this in the .htaccess file in the
directory to be protected.

deny from all

allow from

In a .htaccess file, the options are assumed to apply to the
directory the .htaccess resides in and explicit <Directory>
directives like those used in the httpd.conf file are not needed.

The access directives can include IP address ranges and
references to password databases if desired.

Server Side Includes

Web server options under which Web pages include other files and
execute programs should be carefully scrutinized for potential access
to files not intended for distribution.

In particular, server side includes (SSI) should be used
carefully.

By default, enabling SSI allows users to execute arbitrary
programs as part of an include directive.

The possible damage this can cause can be limited by using the
suexec facility to run the referenced program in a controlled
manner with privileges limited to that of the owner of the HTML
file.

A still more restrictive and secure approach is to allow files to be
included, but disallow execution.

This is accomplished by using the IncludesNOEXEC option
instead of the Includes option when specifying the options
allowed for a specific directory in httpd.conf.

Below is an example showing how to apply this
option to a specific directory.

<Directory /web/docs/ssi>
Options IncludesNOEXEC
</Directory>



CGI programs are among the biggest potential dangers to Web
server security.

These programs are run based on a URL passed to the Web
server by a client.

In normal operations this URL comes from a form or page.
However, the URL provided to a CGI program can be given
to the Web server by other means and can be carefully
constructed to exercise bugs in the CGI program itself.

For example, one of the most common attacks against a
web server is via the phf CGI program.

The phf program is not included with recent versions of
Apache, but was present in earlier versions.

Due to poor design, phf could be easily subverted.

To disable this CGI program, remove it from the cgi-bin
directory specified in the web server configuration file.

As a general rule, any unused CGI program should be removed from
the cgi-bin directory.

CGI programs must be carefully constructed to avert potential
problems resulting from the input passed to them.

One successful method is to use the “tainted” variable facility
found in the Perl scripting language.

If other languages are used, care must be taken to ensure that all
possible input characters are properly handled, including shell
metacharacters, quotes, asterisks, and braces.

Administrators must also be alert to the well-known problem of
very large input strings designed to overwrite small input buffers.
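
One common way to handle the input problems described above is to accept only a known-safe set of characters with a length limit, and to pass the value to other programs as a separate argument and never through a shell. The Python sketch below illustrates that approach; the field being validated (a username), the 64-character limit, and the finger command are all hypothetical choices.

```python
import re

# Allow-list: must start with a letter or digit, then letters, digits,
# underscore, dot or hyphen; at most 64 characters in total.
SAFE_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9_.-]{0,63}")


def validate_username(value):
    """Return value unchanged if it is safe, otherwise raise ValueError."""
    if not SAFE_NAME.fullmatch(value):
        raise ValueError("rejected input")
    return value


def build_command(username):
    """Build an argument list for a hypothetical lookup command.

    Passing a list (for example to subprocess.run) avoids the shell, so
    metacharacters would have no special meaning even if one slipped by.
    """
    return ["finger", validate_username(username)]


print(build_command("bob"))
```

Rejecting anything outside the allow-list is usually simpler to audit than trying to enumerate every dangerous character.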

Security-conscious sites should carefully audit CGI programs
before putting them into operation.


WARNING: The mod_perl module for the Apache web server does not
provide any security advantages over a standalone CGI program
written in Perl. While it does offer a substantial performance
improvement, CGI programs making use of mod_perl need to be as
carefully audited as standalone CGI programs.

Similarly, the sysadmin should disallow user executable CGI programs.

Like the executable server side includes mentioned earlier, user
executable CGI opens a Pandora’s box of possible vulnerabilities.

Limit CGI programs to a controlled directory and carefully audit any
CGI programs for security vulnerabilities.

If it is necessary to run a CGI under the UID of a user other than the
web server, a wrapper such as suexec or CGIWrap can be used.

The wrapper limits the damage an attacker can cause by exploiting
a poorly written CGI program.

Wrappers are often needed when a CGI program makes use of data
that is accessible only to a particular UID.

Some alternative approaches to standalone CGI programs are
application servers such as PHP and Zope.

These tools provide a standardized CGI interface designed
specifically to avoid problems found in input from web pages.

These tools also provide for rapid development of dynamic pages
used in a growing number of web applications.

PHP is also available as an Apache module giving better
performance than that of a standalone CGI program.

WARNING: While providing a more standardized way of using CGI,
tools like PHP and Zope are not without problems. Application servers
can contain bugs that make them vulnerable to attack like any other CGI
program or module.

For example, all versions of PHP prior to version 4.1.2 were
found to have a buffer overflow that can be exploited to gain
elevated privileges.

A privilege elevation problem was also found in Zope versions
prior to version 2.2.1 beta 1.

Unintended Web Servers

The pervasiveness of web browsers has made them a common
interface tool for a variety of devices and services beyond the web.

This unfortunately means that there may be unsecured web
servers hiding in obscure parts of a network waiting to be
exploited.

Some of these unintended web servers include the following.

Solaris’s AnswerBook2

AnswerBook2 is web based and it
installs and uses its own web server (dwhttpd).

Because AnswerBook2 is a web server, it does not need
to be installed on every system; a central server can be
used.

However, it represents another possible avenue of
access to a system and should not be enabled unless
needed.

The administrator can stop and start the AnswerBook2
web server with the following commands.


o stop


o start

To disable the AnswerBook2 web server from starting at
boot time, the ab2mgr init script needs to be removed
from the /etc/rc2.d directory.

rm /etc/rc2.d/S96ab2mgr


The popular Linux system administration GUI,
linuxconf, is available via the web on port 98. It is a well-known
port and will be scanned for by attackers.

On Red Hat Linux, web access to linuxconf can be
disabled using ntsysv, or “chkconfig linuxconf off”.

Popular printers from Hewlett-Packard,
Epson and others come with a built-in web server that
can be used to configure the printer when it is
connected to a network.

While these web servers often have a password
protection scheme in place for their settings, the default
passwords are widely known.

At a minimum, network accessible printers should have
their configuration password changed and their
firmware patched with the current set of patches
available from the vendor.

Security-conscious sites may want to go further and
disable remote configuration of network accessible
printers as per the printer vendors’ documentation.

Routers, switches and other network devices

Network infrastructure devices often also contain
embedded web servers.

As with printers, these devices need at a
minimum to have their default passwords
changed.

Security-conscious sites should consider
disabling remote configuration of these devices
as well.

Personal File Sharing

Web servers running on users’
PCs can pop up on a network like weeds.

On Windows 2000 and later editions, the personal file
sharing option includes a web server.

Unfortunately, this web server is the infamous IIS in
disguise and in the default installation, without any of
the numerous patches needed to secure it from attack.

Controlling this problem is difficult. A combination of
actively scanning one’s own network and a firm policy
regarding servers run on personal computers is needed
to combat the problem.

Where possible, these web servers should be
shut down and users directed to use a common web
server where security can more readily be maintained.

Web Servers and Firewalls

A common error in deploying web servers is to place the web server
behind the firewall and allow requests to the web server to pass
through the firewall.

While this seems like a good way to protect the web server, it in
fact more often leads to the web server becoming a conduit for
attackers to pass through the firewall and gain access to the
secured network behind it.

A better approach is to place the web server outside the firewall.

In this configuration, the web server is dedicated to web serving
only; all other services except for a secure communications
facility such as ssh are removed from the system.

Placing the web server outside the firewall acts to prevent a
compromise on the web server from proceeding on to the
systems protected by the firewall.

A still better approach for larger networks is to establish a so-called
“DeMilitarized Zone” or DMZ area between the firewall protected
internal network and the Internet using a second firewall.

The advantage of this approach is that the firewall between the
Internet and the DMZ offers some protection to the web server
while still allowing web requests to pass into and out of the DMZ.

The firewall between the DMZ and the internal network then acts
to prevent an attack on the web server from proceeding on to
systems on the internal network.

Either of these approaches protects the web server. However,
many web sites build their web pages on the fly from a database.

One method of handling this is to periodically push a copy of the
database from a protected system out onto the web server.

This isolates the transaction between the web server and the
database.

Log Files

Web servers maintain several log files that can aid in monitoring the
security of the Web server.


access_log: Listing of each individual request fielded by the Web
server.

agent_log: Listing of every program run by the Web server. This
log is optional in the default Apache installation and can be
enabled by editing the httpd.conf file.

error_log: Listing of the errors the server encountered. Errors
from CGI programs as well as the server itself are logged to this
file.

referer_log: Listing of the previous URL accessed by a given
browser. This log is optional in the default Apache installation
and can be enabled by editing the httpd.conf file.

Of principal interest from a security standpoint are error_log,
agent_log, and access_log.

These logs should be reviewed periodically for purposes of
identifying CGI program problems and attempts to access files
not intended for distribution.
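
As a rough sketch of such a review, the following Python scans
access_log entries for requests that were refused or that look like
attempts to reach files outside the served tree. It assumes the log is
in the Common Log Format. The sample entries, the function name
find_suspicious, and the rules for what counts as suspicious are
illustrative assumptions, not part of Apache.

```python
import re

# Common Log Format: host ident user [time] "request" status bytes
CLF = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)

def find_suspicious(lines):
    """Return (host, request, status) for entries worth a closer look."""
    flagged = []
    for line in lines:
        m = CLF.match(line)
        if not m:
            continue  # skip lines that are not in the expected format
        status = int(m.group("status"))
        request = m.group("request")
        # Assumed heuristics: refused requests and path traversal attempts.
        if status in (401, 403) or ".." in request:
            flagged.append((m.group("host"), request, status))
    return flagged

# Hypothetical log entries for illustration only.
sample = [
    '192.0.2.10 - - [03/Nov/2013:10:00:00 +0000] "GET /index.html HTTP/1.0" 200 1043',
    '192.0.2.77 - - [03/Nov/2013:10:00:05 +0000] "GET /cgi-bin/../../etc/passwd HTTP/1.0" 403 210',
    '192.0.2.77 - - [03/Nov/2013:10:00:09 +0000] "GET /private/plan.doc HTTP/1.0" 401 0',
]

for host, request, status in find_suspicious(sample):
    print(host, status, request)
```

A real review would likely also look at error_log and tune the rules
to the site; this only shows the general shape of the task.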

Another aspect of web server log files is the wealth of information
they hold regarding the usage of the web site.

Log analysis tools such as http-analyze can provide the web site
administrator with a variety of useful statistics on the usage of the
web site.
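
To give a feel for the kind of statistics involved, here is a minimal
Python sketch that counts hits per requested path and totals the bytes
sent. It is not how http-analyze works; it assumes Common Log Format
entries, and the function name usage_summary and the sample entries
are hypothetical.

```python
from collections import Counter

def usage_summary(lines):
    """Count hits per requested path and total the bytes sent."""
    hits = Counter()
    total_bytes = 0
    for line in lines:
        try:
            # The request is the first quoted field; status and size follow it.
            _before, request, after = line.split('"', 2)
            _method, path, _protocol = request.split()
            _status, size = after.split()[:2]
        except ValueError:
            continue  # malformed entry
        hits[path] += 1
        if size.isdigit():  # "-" means no body was sent
            total_bytes += int(size)
    return hits, total_bytes

# Hypothetical log entries for illustration only.
sample = [
    '192.0.2.10 - - [03/Nov/2013:10:00:00 +0000] "GET /index.html HTTP/1.0" 200 1043',
    '192.0.2.10 - - [03/Nov/2013:10:00:01 +0000] "GET /logo.gif HTTP/1.0" 200 5120',
    '192.0.2.11 - - [03/Nov/2013:10:02:00 +0000] "GET /index.html HTTP/1.0" 304 -',
]

hits, total_bytes = usage_summary(sample)
print(hits.most_common(1), total_bytes)
```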

WARNING: A web server’s log files can provide a wealth of information
for an attacker. Be certain that the log files are stored in a location
that the web server will not serve to clients. See the discussion in the
section on file access control for a description of how to limit the
parts of the file tree the web server is allowed to serve.

Web Performance Issues

The performance of a web server is a mixture of several factors
including the style of data served (dynamic versus static), system
resources (CPU, I/O) and the available network bandwidth.

Web requests can be viewed as requests for various objects.

A typical web page might include some text and one or more
graphical images.

A web browser will make separate requests, often in parallel, for
each element of the page.

The web server fills each request as a separate item.

Web server load is measured in the size of individual requests
and the number of requests it can fill per unit of time. Requests
are referred to as “hits”.

The Apache web server deals with requests by using a pool of slave
processes.

The number of processes in the pool is managed dynamically by
the parent web process within the bounds set in the httpd.conf file.

The parameters that control the pool are MinSpareServers,
MaxSpareServers, and StartServers.

The MinSpareServers parameter specifies the minimum number
of idle server processes kept in the pool.

The MaxSpareServers parameter specifies the maximum number of
idle server processes kept in the pool.

StartServers specifies how many servers to start when Apache is
started.

The default values of these parameters should not, in general,
be changed.
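
The way the parent process keeps the pool within these bounds can be
sketched as below. This is a simplified model, not Apache's code: in
Apache's documentation the two spare-server parameters bound the
number of idle processes, and the real server adds processes
gradually rather than all at once. The function name adjust_pool and
the numbers used are illustrative assumptions.

```python
def adjust_pool(idle, min_spare, max_spare):
    """Return how many processes the parent would add (+) or retire (-)."""
    if idle < min_spare:
        return min_spare - idle   # too few idle processes: start more
    if idle > max_spare:
        return max_spare - idle   # too many idle processes: retire the surplus
    return 0                      # within bounds: leave the pool alone

# Illustrative bounds only; check httpd.conf for the values in use.
for idle in (2, 7, 14):
    print(idle, adjust_pool(idle, min_spare=5, max_spare=10))
```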

Sites that see very large numbers of hits may consider increasing
the number of servers but will need to pay careful attention to
system resources, especially memory.

Processing data before a request is filled, whether by page
processing tools such as PHP or by CGI programs, places additional
load on the server.

Servers with dynamic page content may require additional
memory or faster processors to provide reasonable speed in
responding to requests.

Likewise, the speed of the network connection between the web
server and web clients will limit the maximum number of hits per
unit time that can be processed.
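
The network limit can be estimated with simple arithmetic: divide the
link capacity by the average size of a response. The sketch below
ignores protocol overhead and all other traffic on the link, and the
figures in it are assumed for illustration.

```python
def max_hits_per_second(link_bits_per_second, avg_response_bytes):
    """Upper bound on hits per second imposed by the network link alone."""
    bytes_per_second = link_bits_per_second / 8
    return bytes_per_second / avg_response_bytes

# Assumed figures: a 1.5 Mbit/s link and 12,000-byte average responses.
print(max_hits_per_second(1_500_000, 12_000))
```

With these assumed figures the link alone caps the server at roughly
15 to 16 hits per second, no matter how fast the server itself is.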

Spiders and robots.txt

A performance concern for some sites is the load placed
on the web site by web crawling “spiders” or “robots”
used by various web monitoring and indexing services.

These spiders request page elements in much the same
way a web browser would but do so systematically and
often at a faster rate.

There is an agreed-upon standard, called the robots exclusion
protocol, by which a web server specifies what parts of a
site, if any, a robot should traverse.

The protocol makes use of a file called robots.txt and an
HTML META tag to control access.
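
As an illustration of how a well-behaved robot consults robots.txt,
the sketch below uses Python's standard urllib.robotparser module.
The robots.txt contents, the robot name ExampleBot, and the
www.example.com URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt asking all robots to stay out of two areas.
robots_txt = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A polite robot asks before requesting each URL.
print(parser.can_fetch("ExampleBot", "http://www.example.com/index.html"))
print(parser.can_fetch("ExampleBot", "http://www.example.com/private/plan.html"))
```

Compliance is voluntary on the robot's part, so robots.txt limits
load from cooperative robots but is not an access control mechanism.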

Web Caches

Another method for improving web performance is the use of an
external cache system.

Most web browsers keep a cache of recently viewed pages,
graphics and other page elements for a period of time
defined by the content provider or optionally by the web browser

This allows the browser to display the page again quickly by loading
elements from the local cache instead of re-requesting them from a
web server.

A similar technique can be applied to both the serving of web
pages and the local network. Squid, a commonly used web
cache program, is listed in the reference section of this chapter.

For a local network with a slow connection to the Internet, a proxy
web cache can be used to improve performance and conserve
bandwidth on the slow speed link.

A proxy web cache acts as a local reference for all web requests.

The proxy cache holds copies of web page elements for a time
period defined by the content provider or by the proxy cache

Web browsers on the local network are configured to use the
proxy cache. The proxy cache in turn replies with page elements
already in its cache and requests from the Internet only those
elements it does not hold.
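
The behavior described above can be sketched as a small
time-limited cache. This is a toy model, not how Squid is
implemented: the class name ProxyCache, the fake_origin function
standing in for the remote web server, and the 300 second lifetime
are all assumptions made for the example.

```python
import time

class ProxyCache:
    """Toy proxy cache: serve stored copies until they expire."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self.fetch = fetch        # function that retrieves a URL from the origin
        self.ttl = ttl_seconds    # how long a stored copy stays fresh
        self.clock = clock        # injectable clock, handy for testing
        self.store = {}           # url -> (expiry_time, body)

    def get(self, url):
        now = self.clock()
        entry = self.store.get(url)
        if entry and entry[0] > now:
            return entry[1]       # cache hit: nothing crosses the slow link
        body = self.fetch(url)    # miss or stale copy: ask the origin server
        self.store[url] = (now + self.ttl, body)
        return body

origin_requests = []

def fake_origin(url):
    """Stand-in for a request over the slow link to the real web server."""
    origin_requests.append(url)
    return "contents of " + url

cache = ProxyCache(fake_origin, ttl_seconds=300)
cache.get("http://www.example.com/")
cache.get("http://www.example.com/")   # answered from the cache
print(len(origin_requests))
```

Two browser requests result in only one request over the slow link,
which is where the bandwidth saving comes from.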

A proxy web cache can be either explicitly or implicitly
configured for a web client.

Most web browsers have an option dialog box that
allows a specific proxy to be configured.

A web browser so configured will direct all web
requests to the proxy.

An implicit configuration uses a firewall or router to
intercept any web requests leaving a site and redirect
them to a proxy.

This technique does not require any additional
configuration on the client end.

Some web sites use a web cache as the “front end” to
their web server.

This improves the performance of page serving by
allowing the web cache to answer requests for frequently
requested pages from its cache, offloading that work
from the web server itself.

One situation where this is helpful is a web site with a
mixture of static and dynamic web pages.

The web cache can take on the load of serving the
static pages while requests for dynamic pages are
passed on to the web server itself.

Beyond Caching

An extension of the idea of using a web cache as a “front end” to a
web server is to use a set of distributed web servers or web caches
to provide more web service. There are several approaches to this.

Round Robin DNS

This is a special DNS configuration that
treats a series of web servers as a single DNS entry.

When a request is made for this special entry, the DNS
server replies with one of the IP addresses in the series.

It replies with the next address in the series for the next
request and so on.

This spreads the web service load over the machines in the

3DNS Appliances

These systems provide an enhanced version
of DNS that is tied to a database.

They can not only spread load between a group of servers as
the round robin DNS method does, but also assign requests
to servers that are physically close to the system making
the request via data on the topology of the Internet stored in
their database.

Load Balancing Routers

These systems perform a similar round-robin
load sharing function but work at the packet level, routing
incoming packets destined for a web server to a series of web
servers each in turn.

Commercial Service Providers

Companies such as Akamai
provide globally distributed web caching services aimed at large,
high-volume web sites.


Web servers are becoming a common service that nearly every
site will offer in some fashion.

Web browsers are relatively non

Some configuration options allow the user to configure the
look and feel of the browser.

Other configuration options allow the user to implement
rudimentary security, at a loss of convenience.

Some web servers are very configurable.

Some of the configuration options allow the admin to configure
the basic operation of the server.

Other configuration options allow the admin to configure basic
security of the web server.

Web server performance is an elusive goal.

Web caches and proxies might be used to improve web server