Reengineering Legacy Client-Server Systems for New Scalable Secure Web Platforms


Julius Dichter, Ausif Mahmood, Andrew Barrett

University of Bridgeport, Bridgeport CT 06601
Technology Farm, Tewksbury MA 011878

dichter@bridgeport.edu, mahmood@bridgeport.edu, andy@tfarm.com



Abstract

We have designed a methodology and developed a medium-scale soft real-time system that allows migration from a relatively unsecured, non-scalable, multi-tier legacy client-server system to a new system that maintains all existing functionality while providing full modern web server features. Our paper details methods that can be applied to complex legacy, synchronous, and asynchronous client-server systems to create simple web-based solutions. We detail one such system, and show how it was initially migrated to a CGI solution and then scaled up to a fast, multithreaded system. During the 1990s there was a huge proliferation of server-side scripting utilizing CGIs. These systems provided dynamic web service to clients. While their benefits were obvious and immediate, their shortcomings became evident as the Internet matured, requiring greater scalability, more responsiveness, and increased security. We solve the security problem that arises when legacy client-server systems need to send asynchronous socket communication messages despite an existing firewall prohibiting such communication. Our system, which is currently in operation at a large federal government agency, is one example of how other similar systems can be reengineered to work in current web-server environments.

Keywords: Server-side processing, CGI scripting, Firewall, Security, Scalability, HTTP programming, System Reengineering.



1 Introduction

Client-server systems have been present in the business environment since the 1980s, a simple evolution of the time-sharing systems. The advantages were clear: separate the logical functions of the two tiers, and reduce the load on the backend, or server [*]. As the benefits of this new model became clear, systems grew at a breakneck pace. Because each system was a specific solution for a single application, these systems were as different from each other as any two pre-database file systems of the 1960s [*]. Many of these client-server systems persisted into the 1990s, as the Internet became a force in the client-server market. The freedom of design and communication patterns between the client and the server was no longer possible in a web-based implementation. For each new client CGI request, the web server spawned a server process. However, this process was killed after its completion, and any state it may have had in the prior invocation was lost in the subsequent one. If complex, multi-tier client-server applications are to survive into the current web-based model, their migration must be relatively simple, and they must preserve the free communication patterns using just a simple stream communication as offered by CGIs or servlets, for instance.
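Because a CGI process loses all local state between invocations, any session context must travel with each request and be echoed back in each response. A minimal sketch of this round-tripping (the JSON payload shape and all names are our own illustration, not the system's actual format):

```python
import json

def handle_cgi_request(post_data: str) -> str:
    """One CGI invocation: no state survives between calls, so the
    client must resend its entire session context every time."""
    request = json.loads(post_data)
    state = request.get("state", {})          # context carried by the client
    count = state.get("request_count", 0) + 1
    response = {
        "result": f"processed request #{count}",
        "state": {"request_count": count},    # updated context, echoed back
    }
    return json.dumps(response)

# The client threads the returned state into its next request.
reply1 = json.loads(handle_cgi_request(json.dumps({"query": "a"})))
reply2 = json.loads(handle_cgi_request(
    json.dumps({"query": "b", "state": reply1["state"]})))
```

Each invocation of `handle_cgi_request` starts from nothing, yet the session "remembers" its request count because the client carries it.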



The newer technologies for server-side processing include Java JSP, Microsoft ASP, servlets, and other proprietary APIs which allow the server scripts to run as a sub-process of the web server itself (iPlanet's NSAPI or Microsoft's ISAPI). Increasingly, Java servlets are used for such applications [*]. Servlets have an advantage over CGI scripts because they are scalable: they are automatically multithreaded for clients, whereas CGIs create a new process for every client request. Because older CGIs were developed prior to the secure web server model, many are basically band-aided application programs with minor changes allowing them to run in web server environments. This often results in free back-and-forth communication between the client and server, often with socket-based messages being exchanged. Such communications are not desirable in security-conscious environments because they amount to nothing less than firewall exceptions and thus security risks.



The simple solution to the client-server system migration into a firewall-secure, scalable web server environment is to rewrite the system from the ground up. While sounding simple, it may be a daunting task. Consider, for example, that the CGI is a front end to a set of tiers, which may have been developed in stages by different development companies (likely with limited documentation). We may have back-end sub-systems accessing databases using proprietary APIs such as some flavor of RPC. We may have CGIs spawning long-term processes, which keep client state information as long as a session exists. Such long-term processes may implement asynchronous communication back to the client by sending socket-based messages to a receiving end of the client. Clearly, such systems would need to be rethought and reengineered to work in a secure, scalable web environment. In the next section we detail the reengineering of the client-server OLD system, first into a web-based secure CGI version, and then adding scalability and performance.



2 HTTP Pro Architecture

To introduce our system reengineering methodology, we begin by introducing the OLD client-server system architecture. We define its functionality and its deployment platform. Then we show the architectural shift which allows the system to be ported to a web environment, and finally its last version, which adds the performance, asynchronous communication, and additional security.



2.1 The OLD Client-Server Architecture

The client-server system was developed to facilitate management of a large-scale inventory for a federal agency. It was developed originally in 19xx, and its architecture was a three-tier model. The client is a PowerBuilder front end, which defines the end user's interface into the system. Behind the firewall was a Transrouter Server, to which the client connects using the remote procedure call interface. The Transrouter Server has a dual function. First, it uses the Oracle OCI protocol to fetch the PowerBuilder application screens from a database. The client is made up of so many different screens (a high GUI complexity) that, to minimize the size of the client, the screen configuration information is stored in a remote database. Second, because the client needs to populate the data screens with actual inventory information, the Transrouter Server makes a proprietary DCE RPC call to invoke a Functional Server which, in turn, makes an OCI request to another Oracle database for the population data. Because the system has many concurrent clients, there are many ports which necessarily go through the firewall for the system to function correctly. The client is implemented on a Windows NT platform, and the Transrouter Server runs on HP UNIX, HP-UX 10.2.


To facilitate the communication between the client and the server, information was passed from the client to the server, and other information was passed back to the client. For example, the client would send its user id, uid, to identify the actual person accessing the system, and its session id, sid, to specify the session number of which this request was a part. If a request was an initial one (first in a new session), the client would pass a null value, and the server would return an sid which would be submitted in subsequent requests from the same client. One additional important system problem occurred when some large data requests took an inordinate amount of time, on the order of an hour. If the client made such a request, the RPC would block the client for a potentially long period of time. To avoid this, the Transrouter Server would return immediately in such cases, and the client would open a server thread, which would wait until the Functional Server returned the data set. In this way, the client could do other things in its main thread.
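The sid handshake described above can be sketched as follows (function and variable names are our own illustration, not the system's actual API):

```python
import itertools
from typing import Optional

_next_sid = itertools.count(1)
_sessions = {}  # sid -> uid, the server's session table

def server_handle(uid: str, sid: Optional[int]) -> int:
    """A null (None) sid marks the first request of a new session;
    the server mints a fresh sid, which the client then echoes back
    on every subsequent request of that session."""
    if sid is None:
        sid = next(_next_sid)
        _sessions[sid] = uid
    elif _sessions.get(sid) != uid:
        raise ValueError("sid does not belong to this user")
    return sid

sid = server_handle("user42", None)   # initial request: null sid
same = server_handle("user42", sid)   # subsequent request reuses it
```

A second initial request (null sid) from any client yields a different sid, keeping concurrent sessions distinct.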


The system architecture was affected when the agency decided to implement a more secure environment. In this environment a second firewall was added. This was an outer wall to monitor external system access. Clients could connect from the outside, passing through the first firewall, then the second (the original one) before accessing the Transrouter Server. In this way, if a break-in was detected from the outside world, the first firewall could completely shut off all system access, while the clients inside the agency, on the inside network, could still have access. The first firewall enclosed the service net, which enabled users outside the agency to connect; its O/S was HP-UX 11.0. Inside the inner firewall was the server net, basically the original system with HP-UX 10.2. This system ran until early 2001, with each firewall containing a set of exceptions for outside clients, increasing security risk.


The system was difficult to rewrite because of the huge number of complicated screens encoded in the Oracle database. Further, the proprietary RPC-based Functional and Transrouter Servers were a set of well-behaved, complex, expensive, and tested components. Redevelopment of the entire system would cost on the order of several million dollars.


2.2 The Migration to a CGI Model

The motivation for moving to the CGI model from the client-server model was to make the system adapt to a new architecture while, at the same time, modifying as few components of the complex system as possible. In this way, a solution could be developed relatively quickly at a low cost, while maintaining functionality. In our approach, the modifications to the client were very minor, and those to the Transrouter Server, nil. A primary problem with the migration has to do with the flexible communication of freely developed client-server applications. Socket communication can be back-and-forth as needed. For instance, in a communication round, a client can send a message, receive one back, and repeat this any number of times. This cannot occur in the CGI model. When a CGI client sends a request, it goes through the web server. All its input data is typically encoded in an HTTP GET or POST request. The CGI server is instantiated and responds to the input data, and its standard output stream is forwarded as the response back to the client. Once the CGI server completes, its process is terminated and the local environment is lost. Effectively, a client is limited to a single send-receive cycle. The problem is how to manage state information if a session is to span multiple client requests. In our solution, the client request as well as the response is a potentially large payload. Therefore, we send the environment as well as the request data as part of the client HTTP POST data.
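Packing the session environment alongside the request data into a single POST body can be sketched like this (the field names are illustrative assumptions, not the system's actual wire format):

```python
from urllib.parse import urlencode, parse_qs

def build_post_body(uid, sid, request_type, query, environment):
    """Encode the request plus the full client-side environment so the
    stateless server can reconstruct the session on every invocation."""
    fields = {
        "uid": uid,
        "sid": "" if sid is None else str(sid),
        "type": request_type,
        "query": query,
        "env": environment,  # serialized client state, opaque to the transport
    }
    return urlencode(fields)

body = build_post_body("user42", 7, "SUBSEQUENT",
                       "inventory lookup", "screen=12;page=3")
decoded = parse_qs(body)  # what the server-side CGI would see
```

Note that `urlencode` percent-escapes the `=` and `;` inside the environment string, so arbitrary state survives the transport unchanged.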



The new deployment environment, which included the two firewalls, server net and service net, necessitated a new component. The real server was on the inside service net, accessible only to local users. Clients accessing the system from outside the server net would need to go through the server net's web server and have the request forwarded to the CGI server on the inside service net. A new server net CGI tunnel component was developed. This component was a CGI script which took all the client request information and made an HTTP POST request to the service net's CGI server. The goal of this move was to migrate the system to a web-server-based model such that the only access into the CGI system was through the web server port on the outside to the web server port on the inside. The idea was that the client would make its connection as an HTTP POST request to the outer web server, which would be passed through, or tunneled, to the inside web server. There, the inside CGI would be a new component, named the Transrouter Client, TR Client. It would be the task of the TR Client to execute the RPC access to the Transrouter Server. Since the TR Client was behind the firewalls, it had no restrictions on its socket communications within the service net.
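The tunnel's job reduces to re-issuing the incoming POST body against the inner web server. A minimal sketch under our own assumptions (the inner URL and the injectable `post_fn` are not from the paper):

```python
from urllib.parse import urlencode

# Hypothetical address of the service net's CGI server.
INNER_SERVER = "http://service-net.internal/cgi-bin/tr-client"

def tunnel(client_fields: dict, post_fn) -> str:
    """CGI tunnel: repackage the client's fields unchanged and POST them
    to the inner server. post_fn abstracts the actual HTTP call, since
    the tunnel itself carries no application logic at all."""
    body = urlencode(client_fields)
    return post_fn(INNER_SERVER, body)

# A stand-in for the real HTTP POST, used here to show the pass-through.
def fake_post(url, body):
    return f"POST {url} <- {body}"

reply = tunnel({"uid": "user42", "type": "NEW"}, fake_post)
```

Because the tunnel only re-posts, the sole firewall opening needed is web-server port to web-server port.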


An interesting solution had to be devised to implement the asynchronous service request. To make this work, the client would also send an asynchronous id, aid, to the server, initially a null value. If the request was determined to be asynchronous (the determination was made by the server, not the client), then the server would return to the client an aid with a unique non-null value. The client would then start a new local server thread, which listened for a response from the Transrouter. This was not a clean solution because it still required a hole in the service net's firewall allowing a socket back to the blocked, waiting client. A better and more secure solution was implemented in the next phase.



2.3 Converting the CGI to a Multi-Threaded Asynchronous System

The CGI solution worked quite well and was able to get the data through to the clients. It had shortcomings, however. First, the process was not scalable. When many clients connected, some with multiple clients running concurrently on different machines, the server was bogged down with multiple copies of the tunnel CGI (on the server net) and the server CGI (on the service net). This problem created a waste of resources, both in terms of memory and performance. Second, the solution to the problem of asynchronous communication from the Transrouter to the client was not acceptable, as firewall exceptions had to be made to allow backward communication. In addition, the client was not free to make other requests during the long response waiting periods.


The solution to the scalability issue was straightforward: we needed to multithread the tunnel and the server processes. An obvious solution was to implement both of these in Java as servlets. We did take this approach with the tunnel, but were unable to do so with the server process. Because we were dealing with an existing hardware and software platform, some limitations arose. The server net, when the tunnel was deployed, was running HP-UX 11.0 and used a Netscape 4.0 Web Server. This platform was capable of running Java servlets. The service net, however, was running HP-UX 10.2. This was an older, reliable, large-scale server, but it was not capable of running the HP-UX 11.0 operating system. The problem was compounded in that HP-UX 11.0 was required to run Netscape 4.0, and the Netscape 3.6 Web Server did not have sufficient support for Java servlets to allow that implementation. We solved the issue by implementing the server side in Netscape's NSAPI, an API which allows the server to run in the process space of the web server. When a request for such a server process is made, the web server runs it in a thread within its own process.


The problem of the asynchronous communication was resolved by modification of the software architecture of the server on the service net. In the CGI model, the server was a CGI script, which made the client screen data request and the inventory data request via an RPC call to the appropriate servers. The RPC mechanisms worked well, and used a proprietary DCE system, ENTERA. The service net's server was broken into three server components: the session manager, the session listener, and the TR Client. The first two of these components were NSAPI threads, which shared global structures. But because threads have a property similar to CGI scripts, namely that they die and lose their state after they complete, the session listener was a thread which started on web server launch and never terminated. This allowed the session listener to maintain tables of state information between consecutive new and repeat client connections. The session manager is an ordinary thread. Whenever a client makes a connection, the web server creates a session manager thread for that client request. The session manager thread can read or write the shared data stored in tables in the session listener thread. The last component, the TR Client, is a normal process, spawned by the session manager, which makes the RPC calls on behalf of the client to the Transaction Server. The TR Client is a stand-alone UNIX process. As such, one new process would be required for each new client request. We decided that it would be efficient to have only a single TR Client for the life of a client's many possible requests. Because a session manager manages only one query and then dies, we needed a way for each session manager thread instantiated for one client to contact the same TR Client each time. We were able to accomplish this by giving out unique session IDs to clients, logging them in the session listener's table, and associating them with the server socket of the dedicated TR Client.


2.4 Typical System Scenarios

There are three typical types of transactions which the client can initiate: a new session or subsequent session request, a session termination request, or asynchronous data retrieval. To enable this in our system, several variables were utilized. Variables were sent from the client in the original request, and variables were sent back to the client in the Transrouter Server response. Some of the variables are used to pass the identity of the end user so as to respond only to queries which the user has a right to pose. Variables of interest in our system include the user ID, session ID, request TYPE, and asynch ID. The user and session IDs are unique values that allow a user to have different sessions going on, possibly from different client terminals. The request TYPE variable is sent by the client to denote a new session request, a subsequent request (meaning that the client has already posted requests in this session), or a termination (meaning the client application is shutting down). There are additional variables which are used for error checking and validity.
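The request variables can be pictured as a small record. The symbolic TYPE values below are our own stand-ins, as the paper does not give the actual wire values:

```python
from dataclasses import dataclass
from typing import Optional

# Assumed symbolic values for the request TYPE variable.
TYPE_NEW, TYPE_SUBSEQUENT, TYPE_TERMINATE = "NEW", "SUBSEQUENT", "TERMINATE"

@dataclass
class ClientRequest:
    uid: str                   # user ID: identifies the end user
    sid: Optional[int]         # session ID: null on the first request
    rtype: str                 # request TYPE: NEW, SUBSEQUENT, or TERMINATE
    aid: Optional[int] = None  # asynch ID: null unless polling an async result

first = ClientRequest(uid="user42", sid=None, rtype=TYPE_NEW)
later = ClientRequest(uid="user42", sid=7, rtype=TYPE_SUBSEQUENT)
```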


2.4.1 Typical System Scenarios

The following scenarios depict what happens when a client makes a request. We assume that the client makes either a direct HTTP POST request to the inner web server, if it is located inside the service net, or an HTTP POST request to the Tunnel Servlet, which makes the POST request for the client, if the client is outside the firewall. In short, we will ignore the tunnel because it is necessary only in the event of outside firewall access.

In the case of a client connecting for the first time, the request TYPE has an agreed-upon special value, and the session ID is null. When the session manager receives the request, it creates a unique session ID value and logs it with the user ID in the shared table. Then the session manager creates a new TR Client process dedicated to this user's session. When the TR Client launches, it creates a listener port for the session manager. Then the TR Client sends a message to the session listener to report its port number, and the session listener records it in the table next to the client's user and session IDs. The session manager, which has been polling the table for a valid port entry to contact, reads the port number and opens a socket to the dedicated TR Client. It is at that time that the client's request gets forwarded to the TR Client, which in turn makes the RPC call to the Transaction Server. In a normal request, one in which the request is not very time consuming, the RPC returns a query result, which is returned to the session manager, then forwarded to the client (possibly through the tunnel).
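The port-reporting handshake can be sketched as follows: the session manager polls the shared table until the freshly spawned TR Client registers its listener port. All names and timing values here are our own illustration:

```python
import threading
import time

table = {}                       # shared: sid -> TR Client port
table_lock = threading.Lock()

def tr_client_startup(sid, port):
    """The spawned TR Client opens its listener, then reports the port."""
    time.sleep(0.05)             # simulate process launch delay
    with table_lock:
        table[sid] = port

def manager_wait_for_port(sid, timeout=2.0):
    """The session manager polls the table until the port appears."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with table_lock:
            if sid in table:
                return table[sid]
        time.sleep(0.01)
    raise TimeoutError("TR Client never reported its port")

t = threading.Thread(target=tr_client_startup, args=(7, 5001))
t.start()
port = manager_wait_for_port(7)
t.join()
```

Polling a shared table, rather than having the TR Client call the manager back, keeps all initiative on the server side of the firewall.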


When a client makes a subsequent request, the request TYPE is another well-known value, and the client sends a previously received session ID. This differs from an initial request in that the newly created session manager thread does not create a new TR Client process, but looks up the port of the previously created one and forwards the request to its socket. A termination request is very simple: the session manager reads the user and session ID from the request, clears the entries from the table, and returns.


The most interesting situation occurs when a new or subsequent request creates an asynchronous request. In such a case, the TR Client makes the RPC call to the Transaction Server, which returns a special variable, the asynch-session ID, indicating that the request will not be processed immediately but has been given a unique identifier. The client ultimately receives this variable. It then immediately issues a new request thread, sending the asynch-session ID variable. The session manager blocks and returns only when the response is finally returned. This does not block the client, because it is free to make other threaded requests. Each such request will result in a separate session manager dedicated to one request. So client requests for asynchronous data are in fact re-orchestrated as two independent requests, and the need for a special client listening port is eliminated.
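This re-orchestration of one asynchronous call into two client-initiated requests can be sketched as follows (a condition variable stands in for the blocked session manager; all names are our own):

```python
import threading

class AsyncBroker:
    """Models the split: request #1 returns an asynch-session ID at
    once; request #2 blocks server-side until the slow result arrives,
    so the client never accepts an inbound socket through the firewall."""
    def __init__(self):
        self._cv = threading.Condition()
        self._results = {}
        self._next_aid = 0

    def submit(self):
        """Request #1: returns immediately with a fresh aid."""
        self._next_aid += 1
        return self._next_aid

    def deliver(self, aid, value):
        """Called when the back-end RPC finally completes."""
        with self._cv:
            self._results[aid] = value
            self._cv.notify_all()

    def fetch(self, aid):
        """Request #2: the dedicated session manager blocks here."""
        with self._cv:
            while aid not in self._results:
                self._cv.wait()
            return self._results.pop(aid)

broker = AsyncBroker()
aid = broker.submit()                       # request 1: returns at once
threading.Timer(0.05, broker.deliver, (aid, "big data set")).start()
result = broker.fetch(aid)                  # request 2: blocks until ready
```

Both legs are outbound HTTP requests from the client's point of view, which is exactly what eliminates the firewall exception.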



4 The General Methodology

In a traditional client-server model, the client is free to make socket connections to the server process or processes in a free manner. It is possible, and sometimes desirable, to have the client act as a server to listen for asynchronous responses. In a secure environment, the model breaks down, as often the only accessible communication channel is the HTTP port 80. Server processes, such as Java servlets or CGI scripts, are run for the client request and terminate when the request is satisfied. The problem can be resolved by breaking the communication sequences between client and server into consecutive client requests, while maintaining client state information.



The lesson we have learned is a simple but powerful methodology which can be applied to complex legacy client-server systems. What makes such systems effective is flexible communication sequences between the two parties. When the model is mapped to a secure, firewall-protected web server environment, such sequences are no longer possible without anomalous exceptions. These exceptions are being slowly eliminated.


Our model solves this problem by creating a two-tier server-side application. In our system, the session manager and session listener define the web server tier. While the manager manages the current client request, the listener maintains the long-term data structures, such as lookup tables, which allow future session manager threads to see state changes that would be otherwise impossible in a single-cycle request-response server-side programming model. The second tier, the service tier, in our case the TR Client, was what made the system work in a static way, meaning that the server process was a stable, long-lived server, despite the clients arriving into the web server in session manager thread bursts.


5 Future Enhancements



The future



6 Conclusions



We have developed



7 References