Summary IT Architectures and
Note: this summary is a bit of overkill for this course
(as of 29
Check the sheets, this summary, make sure you understand the main concepts and practice the example exam questions!
Chapter 1: The problem
Chapter 2: The emergence of standard middleware
Chapter 3: Objects, components, and the web
Chapter 4: Web services
Chapter 5: A technical summary of middleware
Chapter 6: Using middleware to build distributed applications
Chapter 7: Resiliency
Chapter 8: Performance and scalability
Chapter 9: Systems management
Chapter 10: Security
Chapter 11: Application design and IT Architecture
Chapter 12: Implementing business processes
Chapter 13: Integration design
Chapter 14: Information access and information accuracy
Chapter 15: Changing and integrating applications
Chapter 16: Building an IT Architecture
Chapter 1: The problem
IT architecture = a solution to the problem “How do I structure my IT applications to best suit my business?”. An architecture identifies the components of a problem space, shows the relationships among them, and defines technology, rules and constraints on the relationships and components.
Typical architecture: see figure 1-4. The environment has three parts: presentation, application and data.
Middleware = software that is necessary in practice to build distributed applications. An important characteristic is that it can work over a network (but does not necessarily do so at all times!).
Eight elements to consider in middleware (see figure 1
The communications link
The middleware protocol
The programmatic interface
A common data format
Server process control, manages scheduling
Naming and directory services, manages finding others on the network
Security, ensures safe communication
Systems management, keeps the whole thing operating properly (fault mgt, performance…)
Silo applications = stand-alone applications that are generally difficult to integrate
Why still use silos? 1) Don’t lose power, 2) self-contained projects are easy to control, 3) development methodologies are silo based, 4) fear of big integrated systems, 5) fear of changing large existing
Surround architecture = interface ‘surrounding’ the silo applications. Includes a Hub towards presentation devices and a Merge to create a consolidated data view.
There are two options to change software:
Substantially change the existing application (new interface)
When to rewrite?
Serious issues with suppor
Eventually all applications need to evolve.
You can split big rewrite jobs in smaller rewrites and
To build an architecture you need:
s chapter 1:
Make an architecture that adapts easily to changing needs
Architecture separates applications logic from any representation by access
Follow an evolutionary approach when implementing the architecture
Ensure technical issues are addressed up front to avoid operationally unusable solutions
Do not finish the architecture and leave it on a shelf
Enable data transfer (medium)
Chapter 2: The emergence of standard middleware
TCP/IP = set of standards developed by the military that became popular because ARPANET (which used TCP/IP) became so popular and evolved into the worldwide Internet.
Requirements for middleware:
Ease of use (compared to writing it yourself using low
Location transparency (not know the network and address)
Message delivery integrity (not lost or duplicated)
Message format integrity (not corrupted)
Language transparency (communicate with programs in other languages)
RPC = Remote Procedure Calls. The syntax of the client (caller) and the server (the called) programs remains the same, just as if they were on the same machine. Examples: Open Network Computing (ONC) from Sun and Distributed Computing Environment (DCE) from OSF. It works like this: you write an Interface Definition Language (IDL) file. The IDL generates a stub and a skeleton: the stub converts parameters into a string of bits and the skeleton converts them back. This parameter conversion is called marshalling, see figure 2-4. The word marshalling is slowly being replaced by serialization. This process does require multithreading, otherwise the client is left waiting.
Advantage: programming like no server exists
Disadvantage: caller is blocked while waiting for server response (alleviate with multithreading), plus many clients create overhead.
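The stub/skeleton idea can be sketched in a few lines of Python. This is only an illustration, not how ONC or DCE work internally: JSON stands in for the "string of bits", a direct function call stands in for the network, and the names `add`, `stub_marshal`, `skeleton_dispatch` and `remote_add` are made up for this example.

```python
import json

def add(a, b):
    """The server-side procedure the client wants to call."""
    return a + b

# Procedures the (hypothetical) server exposes, looked up by name.
PROCEDURES = {"add": add}

def stub_marshal(name, *args):
    """Client stub: convert the call into a string of bytes (marshalling)."""
    return json.dumps({"proc": name, "args": list(args)}).encode("utf-8")

def skeleton_dispatch(request_bytes):
    """Server skeleton: convert the bytes back, call the procedure, marshal the reply."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")

def remote_add(a, b):
    """Looks like a local call to the caller; blocks until the reply is back."""
    request = stub_marshal("add", a, b)
    reply = skeleton_dispatch(request)  # stands in for the network round trip
    return json.loads(reply.decode("utf-8"))["result"]
```

The caller only sees `remote_add(2, 3)`, which is the "programming like no server exists" advantage; the blocking call inside it is the disadvantage.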
Remote database access = ability to read and write to a database on a different machine. Two approaches: SQL or by disguising the remote database as a local database. Creates large overhead on the network. Stored procedures are a way to speed up remote database access. There are loads of remote database access technologies: ODBC, OLE DB, ADO, ADO.NET, JDBC, JDO, etc.
Disadvantage: poor performance on transaction processing
Advantage: excellent at processing ad hoc queries on remote databases
Distributed transaction processing middleware synchronizes transactions in multiple databases across the network.
Transactions are used to ensure execution of all database commands or none at all. They should adhere to the ACID properties:
Atomic = transaction is never half done
Consistent = the database constraints hold true before and after execution of a command
Isolation = data updates are not visible until the entire transaction is done
Durable = transaction is truly done, updates don’t disappear for some magical reason in the
Transactions ensure ACID properties with database locks and rollback mechanisms.
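A small sketch of the "all or none" behaviour, using Python's built-in `sqlite3` module. The `accounts` table, the `transfer` and `balance` helpers and the CHECK constraint are assumptions made up for this illustration. The credit is deliberately done before the debit so that a failing debit shows the rollback undoing work that had already happened.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move money between accounts; returns False if the transaction was rolled back."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed, so the credit above was undone as well.
        return False

def balance(conn, name):
    """Read the current balance of one account."""
    row = conn.execute(
        "SELECT balance FROM accounts WHERE name = ?", (name,)
    ).fetchone()
    return row[0]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100)")
conn.execute("INSERT INTO accounts VALUES ('bob', 0)")
conn.commit()
```

Transferring 500 from alice fails on the constraint and leaves both balances untouched (atomic and consistent); transferring 30 succeeds and is committed as a whole.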
the steps that business processes take. If someone changes one step into two smaller steps, or adds
or removes a step, they change the business process.
Message queuing = program-to-message-queue communication. The queue is a very fast mailbox; the recipient does not have to be active. You Put messages in the queue, and another program does a Get.
This is e
second response times. Best known is WebSphere MQ.
Disadvantages include no IDL and no marshalling; thus, the receiver must know the message layout. Also, transactions might not be isolated; others can get in between.
network can go down without problem, doesn’t need two
(see page 14
, secure delivery.
Message queuing can be synchronized with database transactions, making it possible to build
systems with good levels of message integrity.
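The Put/Get idea, and the point that the receiver must know the message layout because there is no IDL, can be sketched with Python's standard `queue` and `struct` modules. This is an in-process stand-in, not WebSphere MQ; `ORDER_LAYOUT`, `put_order` and `get_order` are hypothetical names and the layout is invented for the example.

```python
import queue
import struct

# Agreed message layout (an assumption for this sketch): order id (4 bytes),
# product code (10 bytes, padded), quantity (2 bytes), network byte order.
ORDER_LAYOUT = struct.Struct("!I10sH")

def put_order(q, order_id, product, quantity):
    """Sender: Put a message on the queue and carry on without waiting."""
    q.put(ORDER_LAYOUT.pack(order_id, product.encode("ascii"), quantity))

def get_order(q):
    """Receiver: Get a message later; it must know the layout to decode it."""
    order_id, product, quantity = ORDER_LAYOUT.unpack(q.get_nowait())
    return order_id, product.rstrip(b"\x00").decode("ascii"), quantity
```

The sender can Put several orders while no receiver is running; a receiver that starts later reads them in the order they were sent.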
SQL parsing: a SQL text is sent to the server, which parses it into a query plan. The client receives a Query Output Description and sends the execute command plus some parameters, after which the query result is sent. To speed this process up the query plan can be cached.
Chapter 3: Objects, components, and the web
Object-oriented middleware: at the end of the 1980s object-oriented middleware came into existence. An example is Common Object Request Broker Architecture (CORBA). It basically calls methods on objects remotely; very similar to RPC. You no longer have client-server but client-object. Two steps are required: 1) getting a reference to the object, and 2) calling an operation on the object.
require an IDL (Interface Definition Language)
file that create stubs and skeletons. However, most
OO middleware use a macro
language that provides
the interface (in RPC the concept interface is less
Disadvantages: complex, plus interoperability between CORBA implementations is low.
Advantages: it fits naturally in OO languages. It is more flexible: interface is delinked from server, implementation and specification are detached.
Problems in OO middleware
How to get an object reference?
Three ways: 1) A special object reference is returned to the client when it first attaches to the middleware. 2) The client calls a special “Naming” service that takes a name. The directory returns the location of the object. 3) An operation on one object returns a reference to another object (cannot be used to retrieve the first object).
When are objects created and deleted?
Is it a good idea for more than
one client to share an object?
an OO component creates one or more objects and then makes the
interface of some or all of these objects available to the world outside the component. Example: Component Object Model (COM) from Microsoft and Enterprise JavaBeans from Sun.
Transactional component middleware = makes transaction processing systems easier to implement and more scalable. It provides a container with many features, most notably transaction support and resource pooling. A component is placed in such a container (i.e. moving it to a file directory where it can be accessed and registered by the container). The container contains an object wrapper that is called by the client to call the component. Because of this the container can take care of security and memory efficiency.
In COM+ you can declare to deactivate an object after a transaction or operation; deactivation means
elimination. COM+ does not store anything on session basis; a feature that is typically used for
Enterprise JavaBeans (EJB) is actually a standard, not a product and thus has several
implementations. There are two types: 1) stateless session beans, whose state is eliminated after every operation invocation, and 2) stateful session beans, which hold state for their entire life. The container
decides what happens. Often values are being cached in beans to improve performance, but this
does destroy transaction integrity.
Disadvantages COM+: only works on Windows
Disadvantages EJB: only works using
Differs from normal applications in:
User is in command (back buttons, favorites, explicit URL addresses, etc.)
Not all users have the same connection, pc, etc.
You cannot identify the user by their network address
Public medium with security being a major concern
Many short interactions of many users make many internet applications painfully slow
Using sessions or cookies you can store information longer when accessing a web application. In the
olden days the session was between the
workstation and application, now it is between cookie and
transaction server. If the client or connection fails; no problem because everything remains stored,
both the session and data (in the
). Stateless sessions are to be preferred.
KISS = Keep it Simple, Stupid
Chapter 4: Web services
Services are easy to understand: you have a requester requesting a service, and a provider fulfilling the request. A web browser or other PC software acts as a proxy on his or her behalf. In an IT context, programs can be providers of services to other programs. A provider itself can access other services to provide its own service.
A big problem is what to define as a service: what characteristics must it have? Many definitions exist; the book follows one by describing the attributes.
Independent of any requester; it is a black box
Verifiable identity (name) and interfaces
Possible to replace the existing implementation with a new version while maintaining backwards compatibility
Can be invoked by requesters of its services and can invoke services itself
Contains mechanisms for recovering from errors (ACID)
Although a service provider’s internal structure is unknown to requestors of its service, a service-oriented architecture (SOA) may be used within the provider itself. This gives a two-level approach in that external services are implemented by internal services.
If you look at the internet, it is very service oriented. Users act as requestors and internet
provide services for them.
The W3C is responsible for many of the web standards and architectures. XML plays a major role in
providing services because it is platform, vendor, language, and implementation independent.
The key messaging technology is SOAP (Simple Object Access Protocol). It is based on XML and is typically carried over HTTP. A simple example of SOAP is on page 68. A SOAP message comprises an envelope and a body, which contains the application payload. You define the namespace, which is a collection of names on which you build your SOAP message body. Optionally, processing information can be set in a header. With it you can build complex sequences of interactions (authentication, encryption, transactions, etc).
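To make the envelope/body/namespace structure concrete, here is a minimal sketch that builds and reads a SOAP message with Python's standard `xml.etree.ElementTree`. It is not the example from page 68 of the book. The envelope namespace is the SOAP 1.1 one; the application namespace `http://example.com/quotes` and the `GetQuote`/`Symbol` element names are hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.com/quotes"  # hypothetical application namespace

def build_request(symbol):
    """Wrap an application payload in a SOAP envelope and body."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}GetQuote")
    ET.SubElement(request, f"{{{APP_NS}}}Symbol").text = symbol
    return ET.tostring(envelope, encoding="unicode")

def read_symbol(message):
    """Provider side: parse the envelope and pull the payload out of the body."""
    root = ET.fromstring(message)
    return next(root.iter(f"{{{APP_NS}}}Symbol")).text
```

Because the message is plain XML text, requester and provider only need to agree on the names in the application namespace, not on platform or language.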
WSDL provides the description of services in an XML document.
UDDI (Universal Description, Discovery and Integration) is a protocol used to find (‘discover’) services. It contains white pages (contact information), yellow pages (description of the business) and green pages (technical information).
There remains a lot of work to be done to make sure different providers and requesters understand and interpret each other equally; they need one ontology. The W3C is working on the Web Ontology Language (OWL -> not WOL!) to improve integration.
To avoid UDDI, many groups are just using SOAP and agreed-upon structures, avoiding the complications of extensive discovery of services.
If you go to the extreme, an organization could outsource all of its IT functions by web services. However, many small organizations providing parts of the IT environment does bring other problems (reliability, continuity, etc).
Chapter 5: A technical summary of middleware
Answers the question “What middleware do we need?” by describing the eight elements of
middleware (see page 78).
The communications link: most are restricted to TCP/IP. You might not need the extra services offered by TCP/IP (e.g. DNS for converting names into IP addresses). You have two types of protocols: with or without connections. Without connections it is much like writing a letter, putting a receiver on it and hoping it finds its destination. UDP is connectionless. TCP is a connection protocol and offers useful features: 1) no message loss, 2) received in the right order, 3) no message corruption, 4) no message duplication.
The middleware protocol: Middleware protocols are generally dialogs and thus require
connections. E.g. client/server (many-to-1), and push protocols (1-to-many, e.g. publish and subscribe). To build a connectionless protocol you
can use SOAP and
phase commit between requester and provider cannot be implemented
through message header information; it requires additional messages to flow between the
The programmatic interface: a set of procedure calls used by a program. There can be a huge variation; three classifications are possible:
(1) By what entities are communicating (terminals and mainframes, processes with processes, clients with message queues, etc.). Historically, entities communicating with each other have become smaller and smaller.
(2) Nature of the interface, of which there are two types: APIs and GPIs. An Application Programming Interface is a fixed set of procedure calls. Generated Programming Interfaces either generate the interface from the component source or from a separate file written in an IDL. Within API there are three styles of interface: 1) message-based, 2) command-language-based (command is encoded in a language, e.g. SQL), and 3) operation-based (name of server operation and parameters is built up by a series of middleware procedure calls, e.g. COM+).
Many middleware have both API and GPI interfaces. API is for interpreters and GPI is for
(3) According to the impact on process thread control. Either 1) blocked (synchronous); the thread stops until the reply arrives, 2) unblocked (asynchronous); the client every now and then looks to see whether the reply has arrived, and 3) event-based; when reply arrives and client is
Data presentation: both sender and receiver must know the structure of the message and, for example, its character encoding.
Most middleware take care of that by using
Server control: break down in three main tasks:
(1) Process and thread control: a server must be running to receive messages, you might need a load balancer that is capable of processing it.
(2) Resource management: e.g. database connection pooling
(3) Object management: objects may be activated or deactivated
Web service standards have nothing to say about server control.
Naming and directory services: typically IP address or DNS. Directory services go one step
further and provide looking up functions (e.g. Microsoft Active Directory).
Security: access control, encryption.
System management: a human interface to all this software for operational control, debugging, monitoring, and configuration control.
Because of these aspects, and especially the huge number of programmatic interfaces, there is a huge variety of middleware.
Vendor architectures deal with which and how many of the middleware you need. Currently there are two big ones: .NET and J2EE. Both are interpreters; .NET uses a Common Language Runtime and J2EE the Java Virtual Machine. See page 88 for .NET architecture and page 89 for J2EE. .NET interprets more coding languages and J2EE runs on more platforms.
Vendor architectures serve a number of functions, of which three are discussed in the book:
Positioning: a well-presented architecture lets you see at a glance what elements you need to select to make a working system. The user knows what he is involved in.
Strawman for user target architecture: architectures tell users how functionality should be split (e.g. into layers such as presentation, business logic, and data).
Marketing: it shows you how to develop, it provides a strategy. The problem is that when explaining an architecture you are explaining very complex software.
Many vendors don’t explicitly name their architecture; many buy SAP or Oracle and thereby buy their architecture. When you have software based on different architectures you might need Middleware Interoperability: building software for different middleware technologies.
A Hub or Gateway,
‘Enterprise Application Integration software (EAI)’ can achieve this (sometimes this is packaged wi
on directly). The main question
is whether it is safe. For example application A sends a
message to the hub which in turn calls app B using Java Remote Method Invocation (RMI). Normally
A is guaranteed it arrives only once and arrives
however does a hub provide these
guarantees? One solution is a two
phase commit spanning app A and B.
can have a unique identifier and
the hub calls B if it did not
The same problem applies for connection loss. Either A and B are handled in a single transaction, meaning two-phase commit and synchronizing queues, or you handle it at the app level (e.g. A asks B “is the transaction done?”). Even more complex: there may be a session between A and B, but a message queue does not have a concept of a session. To solve this generally a session ID is used.
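One way to read the "unique identifier" idea above is that the receiving side remembers which message identifiers it has already handled, so a hub that re-sends after a failure does no harm. The sketch below is an assumption about how that could look, not the book's design; the `Receiver` class and its `deliver` method are hypothetical.

```python
class Receiver:
    """Stands in for application B behind the hub."""

    def __init__(self):
        self.seen = set()      # identifiers of messages already processed
        self.processed = []    # payloads, in the order they were accepted

    def deliver(self, message_id, payload):
        """Process a message once; return False for a duplicate delivery."""
        if message_id in self.seen:
            return False
        self.seen.add(message_id)
        self.processed.append(payload)
        return True
```

If the hub is unsure whether a call reached B, it can simply call `deliver` again with the same identifier; B processes the payload at most once. In a real system the `seen` set would have to be stored durably, for example in the same database transaction as the update itself.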
Chapter 6: Using middleware to build distributed applications
From a user’s perspective, there are four groups of distributed processing technology:
Information retrieval technology
Collaborative technology (e.g. email)
Internal IT distributed services (software distribution or remote systems operations)
You have two types of messages
(or three, if you count inquiries separately)
time: inquiries or ‘action
In business processes actions by others require messages that are deferrable
(capable of being postponed), often implemented using asynchronous message technology.
You cannot translate a real-time transaction into a deferrable transaction without a lot of thought. It is often simpler to first do a real-time transaction, and if that fails do a deferrable
Middleware choices for real time include RPC, CORBA, EJB, COM+, Tuxedo, and SOAP. It is generally not recommended to use message queuing for real-time processing because:
Two transaction servers with message queuing cannot support distributed transaction
Real-time calls have an end user for the reply; they have a time-out if it fails. With message queuing the user may go away if it takes too long but eventually the output response is sent, often ending up in a “dead letter box”.
There can be an enormous number of queues; when you have thousands of users you end up with thousands of queues.
Queues have no IDL and no control of
For high performance you need to write your own scheduler.
The alternative view, processing deferrable messages in real time, also has problems:
It’s slower, messages cannot be buffered
If the destination server is down the calling server cannot
queuing software has various hooks that can be used to automate operational tasks
In most cases, real-time transactions are used between user and server, and deferrable transactions
Information retrieval is positioned along four dimensions: 1) timeliness (the speed), 2) usability (raw data, fragmented information, inconsistencies, etc), 3) degree of flexibility of the query (only an ID up to complex SQL queries), and 4) whether the user wants to get the data or wants to be informed when something changes (time-based push, event-based push or pull).
Distributed system software is converging on a technical level (using your TV set-top box to pay your bills) and interface level (using email as a reporting mechanism).
There are three program tiers:
Presentation tier: especially the banking industry allows doing your banking using different
devices; an ATM, web, bank clerk, phone, etc.
Whatever the channel, there are only a few types of messages for the back end: real-time, deferrable, ad hoc queries, simple push messages and reports.
Processing tier: the programming glue between interface and database. Contains the decision logic that takes the input request and decides to do something with it (e.g. the business rules). The processing tier should support many small messages or a few big ones (e.g. filling in an order part by part or all at once). This becomes more troublesome when session state and recovery issues apply. The right order lines need to end up in the correct
Especially when small messages come from different channels. Generally it is easier to
make the inward session (the processing tier interface) session
Data tier: basically the database. You have to decide whether you run it locally (which is safer) or on a separate server (which creates a lot of network overhead). Also, running it locally has the benefit that when the database schema changes you can identify all instances accessing it; over a network this becomes more difficult. A second decision is whether to use a database handler (an abstraction). Today SQL is a broadly supported query language and these database handlers are needed less; they do however have the advantage that when the database changes your application doesn’t
Any of the interfaces of a tier could be made into a service. This is only logical if you end up with 1) a
loosely coupled interface and 2) the interface is used by many requesters.
common distributed architecture patterns in use:
Middleware bus (or ring) architectures
Middleware that unlocks core applications to other applications. It is fast, secure and flexible.
Most middleware busses are custom built and many organizations worry about maintaining
(somewhere in between)
A hub is a server routing messages. It can do a lot, like routing messages based on type,
reformat the message, broadcast the message, add information, etc. etc. A hub can
thus act as a bridge between different networking and middleware technologies. Using hubs
makes everything more flexible, however also forms a single point of failure, another link in
the chain, and another technology to pay for.
Especially with a lot of ad hoc software you
could end up
with something very complicated and you no longer know what does what and
how they relate to each other.
Web services architectures
Use standards such as SOAP, WSDL, and UDDI. Technically they are just a collection of technologies. They form a cheap alternative to integrate software. Compared to
traditional software web services are slower because of their translation to XML messages.
Also many organizations provide interfaces to web services.
These architectures are not mutually exclusive, many organizations have all of them. Many however
fall into a
category: ad hoc, or point
Coupling is the degree to which one party must make assumptions about the other party. The more complex the assumptions, the tighter the coupling. Tight coupling requires more changes than loose coupling. Ideally you can test each component individually. Dependencies fall into several
categories: protocol dependency (use same middleware standard), configuration dependency (does it cope with changes?), message format dependency, message semantic dependency (how to interpret the message), session state dependency (what order do messages need to be sent/received in?), security dependency (some services not available to some users), business process dependencies and business object dependencies (identifying objects in different actors). All these fall in three groups: the technology dimension, which can be resolved by following the same standards which are inherently flexible (e.g. XML); the application dimension (message format and session dependencies), which can only be changed in the application; and wider concerns (business process, business object and again security), which may require changes in many applications.
Tight and loose coupling has two dimensions; apps can be loosely coupled along the technology dimension but tightly coupled at the application dimension.
Chapter 7: Resiliency
Parallel servers greatly reduce server downtime. If one server is down 1 in 100 days, two would both be down 1 in 10000 days (theoretically).
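The arithmetic behind the "1 in 10000" figure, as a small sketch. It assumes the servers fail independently of each other, which is why the result is only theoretical; the function name `joint_downtime` is made up for this example.

```python
from fractions import Fraction

def joint_downtime(p_single, servers):
    """Probability that all servers are down at once, assuming independent failures."""
    return p_single ** servers

one_server = Fraction(1, 100)                 # down 1 day in 100
two_servers = joint_downtime(one_server, 2)   # both down on the same day
```

Here `two_servers` equals `Fraction(1, 10000)`. A shared cause of failure (same power supply, same software bug) breaks the independence assumption and makes the real figure worse.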
The obvious way to improve resiliency is to use backup servers. Recovery consists of four steps:
Detect the failure: a heartbeat feature checks whether the primary server says “Yes I am still
running”. This tells
you next to nothing and you need to do extra tests to figure out what’s
wrong and if the production application really is running. E.g. the response times of the last
10 transactions can be useful. When you switch to backup in case of an error lots can go
wrong; plus many software and operational problems are not cured by a switch. Very
conscious sites will make the switch easily because they’ve taken care of these
problems, most organizations try to avoid switches.
Cleanup work in progress: there are two ways to back up a server: 1) copy database logs and apply them against a copy of the database, or 2) have the disk subsystem keep a mirror copy of the disk on the backup system. The first is more efficient, the second easier. The database must clean up any uncompleted transactions.
Activating the application:
once database and message queues are tidied up users need to be
forwarded to the backup server by either running it under the same IP, using an intelligent
router or special protocol written
between client and server (starting a new session without
telling the user). Batch recovery is also described in the book but I can’t really understand it
128, have fun). In distributed systems there are three options: 1) the client failing, recovery as batch recovery, 2) server failing, simply recover last transaction, and 3) both fail at the same time. It is easiest to keep states stored in the database and keep server and client stateless.
Reprocessing “lost” messages: database recovery alone is not enough. Two problems when a server fails: 1) The client does not know whether the transaction was completed or not, and 2) if it did complete what was the last output message? You can solve this by storing a sequence number separately and letting the client interrogate the server.
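The sequence-number idea from the last step can be sketched as follows. This is an illustrative assumption about one possible shape, not the book's implementation: the `Server` class, its `interrogate` method and the `resend_if_needed` helper are hypothetical, and a real server would keep `last_completed` in the database so that it survives the failure.

```python
class Server:
    def __init__(self):
        # client id -> (sequence number, reply) of the last completed transaction
        self.last_completed = {}

    def process(self, client_id, sequence, request):
        """Run the transaction and remember its sequence number and reply."""
        done = self.last_completed.get(client_id)
        if done is not None and done[0] == sequence:
            return done[1]  # already completed: return the stored reply
        reply = "processed " + request
        self.last_completed[client_id] = (sequence, reply)
        return reply

    def interrogate(self, client_id):
        """Let a client ask what the last completed transaction was."""
        return self.last_completed.get(client_id)

def resend_if_needed(server, client_id, sequence, request):
    """After a failure: ask first, resend only if the transaction never completed."""
    done = server.interrogate(client_id)
    if done is not None and done[0] == sequence:
        return done[1], False   # completed before the failure, nothing resent
    return server.process(client_id, sequence, request), True
```

This answers both questions from the text: whether the transaction completed, and if so what the last output message was.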
A switch can cause major delays. You can have the user active on both the operations and backup systems at the same time: dual activity. Two approaches:
Clustering: the database (which is mirrored) is shared by both systems. This has
management problems, lock management, lock manager duplication and duplicate log files.
These problems are all solved but at the cost of lower performance.
Two-database approach: each system has its own database. Each transaction is processed once on each database. Two-phase commit ensures the systems are kept in sync. If one goes down the system simply stops sending to it. Two problems: 1) when a failed system comes back online it must catch up on all missed transactions. The problem here is that they need to be processed in the same order the commits were processed (thus not the input order). 2) The system needs to handle the situation where the client connection to both systems is working just fine but the connection between the two systems
The network however is often very complicated, involving many clients and many servers. Also the router is a single point of failure and requires backup. Also, a switch would go seamlessly when web servers maintain two connections, one to the active transaction server and one to the backup, and only use the active connection.
Most downtime is caused by planned downtime. You can use online backup copies or special disk mirroring features to copy the database onto another machine. Online backup copies require after-images to bring the database back to a consistent state. Most large-scale enterprise disk subsystems have features that make a mirror copy and then logically break the disks from one machine and attach them to another.
You can handle many configuration changes in hardware and software by intelligent use of backup systems (take backup offline, change software or hardware, bring backup online, switch to backup, repeat process on other machine).
In many ways application software failure is the worst kind of error. Programmers should 1) prevent bad data getting to the database, 2) look for corruption in the database, 3) display as much information as possible about any errors detected.
In a worst case the database can get corrupted. If it does you need to restore an uncorrupted backup, if you can find the uncorrupted point in time. Older traditional mainframe transactions are generally a lot easier to recover than object middleware.
IT people too often think backup is an IT issue, whereas it is actually a business concern and designers need to set resiliency goals; adding resiliency is a cost, and cost/benefit is a business analysis.
There are three parts to resiliency analysis:
Data safety analysis (distributed, or two central databases, or departmental databases, etc)
Uptime analysis (downtime is a nuisance and inactive workers cost money, but losing data is much worse; what uptime do you really need for what costs?)
Error-handling analysis (look at errors caused by external factors (e.g. no manager in department) and those of IT infrastructure breakdown, user errors, and program errors).
Chapter 8: Performance and scalability
We have been looking for 20 years at Moore’s law but still performance issues remain. A reason for this is the unslippery slope: there is a ‘gap’ where processors wait a lot because they do not have the right information in the memory. Even though processors have a cache these days the problem
remains. You can dampen this effect by introducing sophisticated hardware a
To push a program down the slope you can do: 1) reduce active memory, 2) reduce number of task
switches, 3) reduce lock contention, 4) reduce # of IO’s, 5) reduce # of network messages, 6) reduce #
of memory overlays.
Bottlenecks in transacti
A network of 10-Mbit LAN delivers considerably less than 10 Mbit depending on the number of devices connected on it.
Disk throughput: because IO operations are not evenly distributed over all disks you need many more disks to handle the load, ending up with loads of hard disks, most of which are doing very little.
Total efficiency: you need roughly 30% idle time because queue times deteriorate after that. Check the formula: Total time = service time / (1 - utilization). 70% utilization means total time is a bit over 3 times the service time!
Memory can run out very quickly; 2MB for each terminal connection would surpass the limit
of 4GB of many OS with 2000 connections. A transaction monitor is a solution; the number of
parallel copies is
then the number of active transactions rather than the number of
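The queueing formula from the "Total efficiency" item can be tried out with a few lines of Python. This is just the formula from the text turned into a function; the name `total_time` and the sample utilizations are chosen for illustration.

```python
def total_time(service_time, utilization):
    """Total time = service time / (1 - utilization), for 0 <= utilization < 1."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in the range [0, 1)")
    return service_time / (1 - utilization)

# How the total time grows for a service time of 1 unit.
slowdown = {u: total_time(1.0, u) for u in (0.5, 0.7, 0.9)}
```

At 50% utilization the total time is 2 times the service time, at 70% it is about 3.3 times, and at 90% it is about 10 times, which is why roughly 30% idle time is kept in reserve.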
Object interfaces: some languages (e.g. Visual Basic) encourage you to fetch attributes for each transaction. Where you used to have “read and write” or “find, get, and update”, you now have “find, do 10 messages to get 10 attributes, display, do 2 updates”. Increasing network traffic and especially
Transactional component servers: for various reasons e.g. J2EE application servers do not scale well vertically (= putting more processors on it). Solution is to scale horizontally; put in an extra server, set it up the same and balance the work load.
Two-phase commit: causes extra messages, thus network traffic and processing power. The transaction also takes longer and database locks are held longer (= lock contention).
Message queuing frees up network and processing load but does increase the time for a message to
Remote database access for real-time transactions is inefficient. To alleviate the workload you can
use stored procedures, but you then convert a database server into a transaction server. Instead, use a
real transaction server (e.g. .NET Enterprise Services or Enterprise JavaBeans) with better support for
multithreading and connection pool management.
There are three reasons for batch:
Support cyclical business processes (payroll or bank interest accrual)
Support housekeeping tasks (copying database to backup)
Optimization (for example when you have to insert 100.000 records, each with an index,
translating into between 200.000 and 400.000 IOs, which would take a few hours!)
With 24/7 economies and the internet, the time window available for batch runs has been shrinking.
In general distribution has four problems:
It requires a great deal of extra coding; transactions, reports and inquiries become more
Evenly spreading of data is hard, larger machines likely support much more traffic than
Considerable additional overhead, such as two
Much more difficult operationally
built the application as a distributed system from the ground up; it is very difficult to
Fool the client into believing that there is only one web server behind the IP address he accesses,
where in fact there are many more. The main problem is keeping the data equal on all servers. The
easiest solution is to put all the state in the back end (avoid stateful session beans, for example).
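To make the "state in the back end" advice concrete, here is a minimal sketch of round-robin load balancing over identical, stateless web servers. All class and method names (BackEndStore, WebServer, LoadBalancer) are hypothetical and invented for this illustration; a real setup would use a database and a network load balancer.

```python
from itertools import cycle


class BackEndStore:
    """Hypothetical shared back end that holds all session state."""

    def __init__(self):
        self._carts = {}

    def add_item(self, user, item):
        self._carts.setdefault(user, []).append(item)
        return list(self._carts[user])


class WebServer:
    """Stateless web server: it keeps no user data between requests."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, user, item):
        return self.name, self.store.add_item(user, item)


class LoadBalancer:
    """Presents one entry point and hands requests out round-robin."""

    def __init__(self, servers):
        self._servers = cycle(servers)

    def handle(self, user, item):
        return next(self._servers).handle(user, item)


store = BackEndStore()
balancer = LoadBalancer([WebServer(f"web{i}", store) for i in range(3)])
first = balancer.handle("alice", "book")
second = balancer.handle("alice", "pen")
print(first, second)
```

The two requests land on different servers, yet the second one still sees the first item, because neither server holds the state itself.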
Load balancing on database servers is tricky; if there are ‘hot spots’ of data which transactions want
to change all the time, performance will be poor. Examples are: control totals and end of indexes. You
have to design load balancing with that in mind.
Business intelligence systems
BI systems is a broad term used for
to decision support systems. Two performance
Ad hoc database queries: large queries will dominate the IO capacity (e.g. memory usage,
database buffers, they squeeze out the rest of the work). Data replication in other database
servers can solve this problem.
Data replication: introduces its own set of problems. The basic idea is simple: make a copy and
keep it up to date. It increases network load (although you can batch updates together) and
the IO load is generally higher on a query database than on a transaction server. You could
let the target machine get behind and catch up during the day or add more memory.
Backups and recovery
The bigger the organization, the more often you get failures and the longer it takes to recover
from them, while at the same time you want fewer of them. When you have to deal with large
amounts of stored data (e.g. a bank with 2 million transactions per day) and you back all of it up, it
will take quite a while (more than a day even) to restore the database.
All issues with online transaction processing equally apply to web services, but web services do have
some issues of their own.
Generally web services use SOAP built on XML, which takes time to process, and run on HTTP, which
is much slower than DCOM, RMI, or raw sockets.
In addition the internet slows you down depending on the number of hops and distance you have to
travel. Secondly, many user queries in the web browser require information from different systems,
adding to the load.
UDDI and WSDL add extra overhead as well; therefore many prefer to use just SOAP
within their own network to get familiar with the technology.
Design for scalability and performance
You should consider performance and scalability early in design because:
Performance consequences of data distribution can be assessed
The difference between deferrable transactions and real-time transactions should be noted. Use
message queues for deferrable transactions and distributed transaction processing for real-time
transactions.
The application design model gives the data and transaction volumes.
At early (high) level you measure the scale of the problem. At more detailed level you can look at
actual data. At the very detailed level you can investigate the interfaces, code and database usage
At the start of the chapter the TPC-C benchmark was explained (it measures how many transactions
can be performed by a processor), but after all the information here you can understand that a lot is
missing in that test; mirror disks, restart areas and other resiliency code are left out.
Chapter 9: Systems management
Systems management can be divided into 5 categories:
Administration; concerned with all aspects of managing the configuration of a system
Operation; concerned with keeping the system running
Fault tracking and resolution; info about faults not immediately resolved
Performance management and security
The functional (first four) categories are interrelated, see page 171. Often these categories are being
managed by different groups in the business.
Between the sixties and eighties systems were often just silos (see page 172) but they slowly migrated to
distributed systems (page 172), which increased the systems management effort considerably. From only
internal users we now have an internet network IP & others (workstations, routers, switches,
browsers, etc), an outsourced IP network (managed by a network provider) and the Internet.
A number of environmental attributes have an effect on systems management:
The environment is very complicated, systems may run well into the thousands
Large number of components means there are huge numbers of different conditions; how to
distinguish between what is important and what not?
The IP network is outsourced which makes it difficult to fix problems if any arises (e.g. a teller
cannot reach the banking system).
Web services are outsourced
These days you have more complex administration tools allowing you to manage multiple systems
from one PC using graphical interfaces instead of plain consoles. See page 178 for what the systems
management model looks like.
A rules engine consolidates many errors into more useful information and makes systems more
manageable. How this information is communicated from several systems to one has been
standardized; the best known standard is the Simple Network Management Protocol (SNMP). Devices of
s can now be managed by one management tool. It is most effective in
managing routers and switches, but less effective at managing system software and applications.
Collecting performance information in a system requires an agent which runs in the system being
measured, and contains configurable ‘hooks’ or ‘probes’ that collect the required information. The
manager itself (with the reporting tools) is often on a different system and aggregates information
from multiple systems.
In the olden days performance measurement tools were written manually but a lot is available now
off the shelf. Current attention is aimed at self
Guidelines in putting it all together:
Scope: meaning the amount of the environment over which the management functions apply.
You cannot control everything anymore. You can think of horizontal “slices” of the
environment. Take middleware as an example: this has to work in its entirety and Operations
staff should monitor this slice carefully.
A high level of automation is essential to reduce costs and improve quality.
End-to-end service provision: vertical management. There are many components but they
never exist in isolation; collectively they deliver services to users. You should gain insight into
end-to-end service status.
Applications can contribute to systems management. Logs for example.
Enhancing systems management environment: use an evolutionary approach. Don’t throw
everything away but improve on it. You cannot be fully aware of what the introduction of new
technology will do exactly.
Chapter 10: Security
Authentication = identifying users
Access control = authorization, giving users authority
Protection = stopping unauthorized access to resources
Security management = how to administer and report breaches in security
One tip in managing roles for users; keep it
merge roles if they have the same privileges for
Three issues in early security design:
How do you assign roles?
Often when changing roles and
people just give
them their logins or download the
data and mail it around.
How is duplicate data protected?
In any organization data is replicated, how you keep this safe is a difficult issue.
What strategy is there to guard against the enemy within?
Most frauds have inside help. Let two or more people give permission for something, build
fast detection systems and assume violation is possible and build in recovery procedures.
The onion model is illustrated on page 190. Each concentric ring represents a protective screen and
contains one or more resources. The rectangles represent access points and make a number of
services available to the outside ring/world. To get from the outside to inner services a request must
pass all rings and thus multiple access points. The most secure data should be in the inner circle, whereas
less secure data is in the outside circle (the ring that is easiest to breach).
Still, this doesn’t solve authentication problems; by breaking into the authorization servers an attacker
can assume any role he wants. The model is only as secure as the authorization mechanism. Dividing
information over several locations helps; if someone gets administration rights for a web server but no
data can be found on it, the damage can be low.
Also make sure assigning roles to users is thoroughly protected.
Because of legacy software and different branches within the organization, the onion model is in
practice a ‘boiling lava model’ (see page 193), with several smaller and bigger ‘onions’ and access points.
Different access points require different passwords, and people will start writing
them down. With the boiling lava model it is much more difficult to control and enforce security policies.
wide data (e.g. product data and customer data) suffer probl
for access in
separate organizational branches.
The onion model does not work well with the Web. Web services don’t trust each other’s
authorization/authentication very much and
have their own
You can use security tokens (a piece of data identifying the user) on the Web that act as a pass that
gives you access.
See page 197 for how it works. You can include a security token in a SOAP header
using WS-Security. A security token is supplied and validated by a security token
service. All the services controlled by one security token service are called a security context.
Most websites use SSL (Secure Sockets Layer) to encrypt their sign-on process (see page 195 if you
want to know more about SSL).
There are two forms of encryption:
Asymmetric (or ‘public key’): different key to encrypt than to decrypt. It is slow but
you can publicly publish your key and anyone can send you secure messages.
Symmetric (or ‘private key’): same key for both, is fast. Well known is DES
On a completely different note: should requesters trust service providers? Often they have no
choice. A timeout on security tokens can help with this problem.
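As an illustration of a security token with a timeout, here is a minimal sketch of a token service that signs a small set of claims and rejects the token once it has expired. It is not the WS-Security format; the token layout, the function names and the SECRET value are assumptions made purely for this example.

```python
import hashlib
import hmac
import json
import time

# Assumption: a secret known only to the (hypothetical) security token service.
SECRET = b"demo-secret-of-the-token-service"


def _sign(payload: str) -> str:
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(user, role, ttl_seconds, now=None):
    """Return 'payload.signature'; the payload carries an expiry time."""
    now = time.time() if now is None else now
    claims = {"user": user, "role": role, "expires": now + ttl_seconds}
    payload = json.dumps(claims, sort_keys=True)
    return payload + "." + _sign(payload)


def validate_token(token, now=None):
    """Return the claims if the token is genuine and not expired, else None."""
    now = time.time() if now is None else now
    payload, _, signature = token.rpartition(".")
    if not hmac.compare_digest(signature.encode(), _sign(payload).encode()):
        return None  # tampered with, or not issued by this service
    claims = json.loads(payload)
    if now >= claims["expires"]:
        return None  # timed out: the requester has to authenticate again
    return claims


token = issue_token("alice", "teller", ttl_seconds=60, now=1000.0)
print(validate_token(token, now=1030.0))  # still valid
print(validate_token(token, now=1061.0))  # expired
```

The shorter the timeout, the less time a stolen or misused token stays useful, at the cost of more frequent re-authentication.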
To develop security it is often easiest to use a network diagram and draw the access points and
security context (read rings) in the diagram.
Especially with legacy software with their own security it
becomes problematic to achieve a single authentication.
Chapter 11: Application design and IT Architecture
Everything described in this chapter is supposed to be done before the application project is under
way; it is about IT planning.
At first people didn’t grasp the enormity of programs and just started coding. When they did, they
started to program structurally and with waterfall development, which of course includes the
implementation structure. There are several problems with requirements: 1)
end users and business don’t know their requirements, 2) it is difficult to express the design in a way
that is understandable for both programmers and business sponsors, 3) a division among
requirements, design, and implementation leads to over-engineering (especially in large
organizations). Waterfall models don’t work well with changing requirements.
Alternatively, you can use agile methods that build iteratively and request feedback from end users
often to elicit more detailed requirements. The Agile manifesto (from the Agile Alliance) states:
We value individuals and interactions over processes and tools
We value working software over comprehensive documentation
We value customer collaboration over contract negotiation
We value responding to change over following a plan
Extreme Programming (XP) does a minimum of design and the only artifact that matters is the code;
changes are not really anticipated up front. XP instead builds a large test library up front so changes
can be made without much worry.
These two schools of thought on design are referred to as “design up front” (planned) and “design as
needed” (Agile). The authors prefer a third approach (discussed later) that takes design in three
levels: business level, task level and transaction level.
MDA (Model Driven Architecture), currently gaining popularity, aims to develop the program directly
from the design.
Business rules determine how facts (= data) are structurally defined and processed. There are five types:
Constraints (e.g. “Urgent order must be accepted if order value is less than 30 dollars”)
List constraints (e.g. “User status is raised if he 1) spends more than 1000 dollars and 2) is a
member for over 12 months”)
Classification (e.g. “order is urgent if the delivery must be done in less than 3 hours”)
Computation (e.g. “ratio = price/earnings”)
Enumeration (e.g. “customer standing can be gold, silver or bronze”)
There is no mathematical theorem for business rules; much of it depends on the context.
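The five kinds of business rule listed above can each be written down as a small piece of code. The sketch below simply encodes the examples from the list; the function and class names are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Standing(Enum):
    """Enumeration: customer standing can be gold, silver or bronze."""
    GOLD = "gold"
    SILVER = "silver"
    BRONZE = "bronze"


def is_urgent(delivery_hours: float) -> bool:
    """Classification: an order is urgent if delivery is due in under 3 hours."""
    return delivery_hours < 3


def price_earnings_ratio(price: float, earnings: float) -> float:
    """Computation: ratio = price / earnings."""
    return price / earnings


def must_accept(order_value: float, delivery_hours: float) -> bool:
    """Constraint: an urgent order must be accepted if its value is below 30 dollars."""
    return is_urgent(delivery_hours) and order_value < 30


def raise_status(spent: float, member_months: int) -> bool:
    """List constraint: every condition in the list has to hold."""
    conditions = [spent > 1000, member_months > 12]
    return all(conditions)


print(must_accept(order_value=25, delivery_hours=2))
print(raise_status(spent=1500, member_months=6))
```

Note that the constraint reuses the classification rule, which shows how rules of different kinds build on each other.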
Almost any application has some element of systems integration in it. You can leave legacy systems
like they are, make minimal changes or major changes depending on the sit
organizations make these decisions based on the age of the technology on which it is built.
oriented programming aims to improve reuse.
A marketplace for reusable components never
really took off. Why?
Development must be broken down in
three roles (see page 212):
The programmer who writes reusable components
The assembler who writes scripts that call the components
The administrator who writes the deployment descriptors and configures the production
In most cases the assembler cannot find the right components he needs (because of performance
issues, it does too much, it does too little, etc.), so the assembler will often become a programmer as
well. You have to be very lucky to find exactly what you need (the authors of the book call this
serendipitous reuse, to be translated as ‘fortunate reuse’).
An alternative to serendipitous reuse is architectural reuse: a top-down approach instead of
bottom-up. You enforce reuse by providing only a limited set of reusable components (one error
mechanism, for example).
Many silos are considered bad, but the alternative is bad as well: one monolithic application no
We seem to be bad at looking over a large number of requirements and splitting them
into logical chunks.
An enterprise architecture should focus on the problems faced by the organization and outline a
solution. Organizations generally look for 1) faster development 2) cost reduction and 3) better
security, reliability and performance. The architecture should
define how this should be
As with any large design process, many different people design different parts and often upper
designs have to be revised because something isn’t possible for appropriate cost.
The top layer of an IT application would be the business process level. The second level is the task
level (see them as dialogues between users and the system). At the bottom level there is design and
implementation called the transaction level.
In RUP (Rational Unified Process) a project starts with Inception: defining the project boundaries. However, the
authors of the book suggest you need to do business process level design first to know these
boundaries. They also suggest eXtreme Programmers do business process level design first,
despite their notion of ‘design as needed’. At task level the complex tasks should be written by the
most experienced programmers and discussed with the group. See page 222 for a table describing
the details on these levels. According to the book’s authors, this three-level approach is to be
preferred over planned and agile methods.
Chapter 12: Implementing business processes
Until recently functional analysis was much more common than process analysis. A simple example
would be a car rental process. After some time a gold card is introduced that speeds up the process.
Because the previous system was not
a new application is developed
for the gold card, without reusing anything. Now there are two applications fulfilling one business
process: car rental.
Everything a business does can be described in processes. Some processes are however more
defined than others. A process is a series of activities that delivers something. Activities themselves
can be processes; wiring a house is
an activity of the process building the house, but wiring a house is
a process in itself. The lowest level of detail that cannot be subdivided is called a task and is usually
done by one person in one place at one time and requires some skill or resource.
Many processes trigger other processes, which the book calls “send and forget”.
A prescriptive approach is fast and repeatable; such processes are documented diagrammatically or in a
series of steps.
An alternative is to have a plan and a process for converting the plan into a one-off process definition.
A process delivers something, usually some goods or a service
A process follows a plan that defines the order of the activities
Activities can sometimes be processed in parallel
Activities can be conditional, that is, process plans can have path choices
An activity can be a process in its own right
A process can start another process (send and forget)
A process may be ongoing, meaning it loops back on itself
Two extremes of plans: very prescriptive or a plan
that defines rules that must be obeyed
In practice process execution may deviate from the plan
In practice many companies have processes going on at any one time and are likely to be
competing for resources
Information outside its process context has little meaning. Information falls into one of four
categories (see page 230):
Plan objects; information about process plans
Tracking objects; information about where an individual process has got to in its execution
Resource objects; information needed by th
Result objects; information that is a process deliverable
Often tracking objects are combined with result objects (e.g.
a half finished order is completed,
making a tracking object the result object).
There are four patterns to architecture proce
Single centralized application (see page 231 for diagram)
Several processes with one application and one database
Tracking multiple centralized application (see page 231 for diagram)
Several processes with multiple applications, but each application
would handle all
processes. Advantages of (1) and (2) are they only have to save data in one database and can
be thoroughly secured. Disadvantage it relies on the network (although often this is not a
Pass through (see page 232 for diagram)
Each application does its job and passes the data to the next application. Advantages: each
app can have its own technology and the database can be restructured in any way you want.
Disadvantage is it has lots of timing dependencies.
Copy out/copy in (see page 233 for diagram)
Starts as Pass through but at the end of the app’s job the data is sent back to a central
database. Advantages are similar to Pass through + it has a centralized view of what is
happening to the processes. Disadvantage is it is more complicated
Normally you start clarification (part of design process) by drawing process implementation diagrams
(see page 234 for diagram). Guidelines in doing this:
Don’t go in more details than the task
If several tasks are closely related then group them in a b
At database level only mention the major tracking and resource objects; don’t mention
Represent a batch step as a single application
Don’t mention any technology
Once the process-level design is clarified the next step is to analyze it, which has
several areas for analysis:
Error handling: you can do three things when something goes wrong 1) fix what needs to be
fixed and try again, 2) leave the whole problem for manual reconciliation later, and 3) revert
to another process (e.
Timing constraints: when you look at process steps determine what dictates the transition.
Some process steps have a limited timeframe for their execution or data transfer.
Flexibility: process level design lays down the requirements of what users will do; the process
design team may not have the authority to make all the changes in the process.
Process level design helps in many ways, such as:
Provides tool for improving quality of data
Provides fault lines where system
Defines whether message flows between processes are deferrable or real time
Provides rationale for resiliency requirements
Provides rationale for performance criteria
Provides underlying basis for the discussion about security
Provides underlying basis for discussion about data distribution
On page 239 there is an extra bit that explains the difference between functions and processes. It boils
down to this: most organizations have taken a strong functional view, which often works, but is also the
reason why IT builds lots of silo apps. A functional approach is department-driven rather than
organization-wide, and departments only want to be responsible for as little as possible. A functional
approach is also an insufficient approach to analyze how to change the business.
Chapter 13: Integration design
Integration design is a major (but relatively short) element of task-level design, which ensures the
solution hangs together as an integrated whole. Integration design is the design of protocols
between app and app, and app and end user.
The integration design group needs to know the nonfunctional requirements, such as critical
performance goals, recovery time goals, and special security requirements.
When designing the database you should not just look at the task but take a bigger perspective and look
at the business rules as well.
The output from integration design is not program design or component design but simply a
description of input and output messages and any session state needed to control the protocols.
A security session should outlast all the task sessions a user performs (see page 243 for a diagram).
The sequence of tasks should be ACID (although perhaps a bit less strict). When a user is working on
a tracking object, for instance, he should have exclusive rights. This could cause very long
transactions, which is undesirable for performance reasons. Timing or pseudo locks partly solve this
by not creating one big lock but shorter, smaller ones; a full rollback of the entire transaction
becomes difficult in that case. An alternative is to copy the data locally (copy out/copy in pattern).
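One possible reading of a pseudo lock is a marker stored with the object (owner plus expiry time) instead of a database lock held for the whole task. The sketch below follows that reading; it is an assumption about what the summary means, and the class name, method names and timeout are hypothetical.

```python
import time


class PseudoLocks:
    """Application-level locks that expire by themselves after a timeout."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self._locks = {}  # object id -> (owner, expiry time)

    def acquire(self, object_id, user, now=None):
        """Claim the object; fails only if someone else holds a live lock."""
        now = time.time() if now is None else now
        holder = self._locks.get(object_id)
        if holder is not None and holder[0] != user and holder[1] > now:
            return False
        self._locks[object_id] = (user, now + self.timeout)
        return True

    def release(self, object_id, user):
        """Give the object back, but only if this user is the holder."""
        holder = self._locks.get(object_id)
        if holder is not None and holder[0] == user:
            del self._locks[object_id]


locks = PseudoLocks(timeout_seconds=300)
print(locks.acquire("order-1", "alice", now=0))    # alice starts the task
print(locks.acquire("order-1", "bob", now=100))    # bob is refused
print(locks.acquire("order-1", "bob", now=301))    # alice's lock has timed out
```

Each database transaction inside the task can then stay short, while the marker keeps other users away from the tracking object for the duration of the task.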
Task/message diagram = diagram showing actors, messages, application processing and objects
(see page 246 for an example). It displays the flow of data.
You should only model data if it is important to follow-up analysis and for clarification of the task.
The design process consists of: understanding requirements, brainstorming solutions, clarifying one or
more chosen solutions, and analysis.
Analysis using integration design can be broken down into eight areas:
Scalability and performance: take data volumes for each task and calculate the number of
End-to-end data integrity: take the task/message diagram and methodically trace through the
flow and check step by step if anything can go wrong
Security: what roles are associated with each message?
Systems management: assess configuration/version control and monitoring; how to get
notified in time and how to fix issues?
Enterprise data consistency: check data duplication in other systems (and if that’s bad), plus
how much do you depend on data from other apps and is this accurate and complete?
Ease of implementation: does your organization have the experience for this?
Flexibility for change: think of how the system might need to change in the future
IT strategy compliance: does the technology fit the chosen strategy?
A good integration design is loosely coupled. Tightly coupled applications are characterized by many
real-time messages and complex session state. Loosely coupled applications have large messages and no
session state. From tight to loose there are gradations (see page 252). Loosely coupled is more
resilient than tightly coupled.
Loosely coupled systems require more work to synchronize the database and reading local copies of
Tightly coupled require more testing but probably less code.
You can also add tiers to integration design (see page 253). The authors argue at length (and
not very understandably) that semi-coupled systems would be best. See page 252 why, I couldn’t follow it.
Chapter 14: Information access and information accuracy
Database design occurs during task
This chapter discusses four aspects of database design: 1) information access, 2) information
accuracy, 3) shared data or controlled redundancy, and
4) integration with existing databases.
See for the information access diagram page 258.
There is no simple solution when accessing information. Depending on how many objects you want
to access and performance issues you can adopt diffe
Generally there are requirements to reformat the data to be more understandable, but people want
to see the raw data as well.
Denormalization of data means codes such as “LHR” are converted into
“London Heathrow” by joining lookup tables with data tables.
But again the manager might need to
look up the abbreviation used for the production department.
When building a data mart (which is smaller and cheaper than a
should build using data from the
Most reports used to be printed during the morning are now available online in business systems.
These reports are typically built using SQL queries processed directly in the database. Data marts
help in these but do not replace the
To achieve process improvement (e.g. speed at which an order is fulfilled) you should log timestamps
and such. However while designing such systems we often forget these.
When dealing with customers (e.g. they want to check the status of their order) they require
up-to-date information. A data mart wouldn’t suffice; these are based on older data using Copy out/Copy in.
Reasons why data is inaccurate:
It is out of date
Wrong conclusions are drawn from the data
Information is duplicated (and you don’t know which is the correct one)
Information was input incorrectly
Often IT applications are part of the problem, and the requirements users have of them; e.g. they
want to spell the name directly in the order form instead
of looking them up first.
In almost any organization there is data duplication somewhere. Merging them is difficult because of:
How do you know you are talking about the same object?
How do you know whether an attribute is the same?
If the data for one attribute of the same object is different in two databases, what then?
The last one is always unsolvable.
Shared data or controlled duplication
See page 267 for a diagram of both these solutions.
Shared data implies a common database, typically separate from other databases. Disadvantages: 1)
poor performance, 2) some remote database access technologies do not support two-phase commits, and 3)
the structure of the shared database is hard coded, making changing schemas difficult. To get around these
problems see page 268.
call on s
hared data component = a component interface is put
above the shared database
so external systems no longer communicate with the database directly; thus
changing the database
itself is less of a problem. This works best if data is used on many pl
similar to embedded call but the client machine now also has an extra interface layer
that communicates with the shared database’s abstraction layer. This is even more loosely coupled.
This is best used if the data is used on fewer p
Controlled duplication has the same data in two or more databases. Advantages are better resiliency
and better performance; disadvantages are that the data is slightly out of date and the app is more complex.
Getting the data exactly the same in both databases can be a bit tricky; think of Java multi-thread
synchronization problems and you get the idea. Two-phase commits are not ideal because if
one system goes down the other is badly affected. You can have one database do all the
updates and only after that update the second one; the second database is then delayed. As with
shared data, the interfaces of the databases should be exactly the same so you can move and change
You can even make a hybrid version of controlled duplication and shared databases.
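The "one database does all the updates, the second follows later" variant of controlled duplication can be sketched as a primary copy plus a queue of pending changes. This is a simplified in-memory illustration under that assumption; the class names and the propagate step are hypothetical and stand in for real databases and message queuing.

```python
from collections import deque


class Database:
    """Both copies expose exactly the same interface."""

    def __init__(self):
        self.rows = {}

    def update(self, key, value):
        self.rows[key] = value

    def read(self, key):
        return self.rows.get(key)


class ControlledDuplication:
    """All updates go to the primary; the secondary is brought up to date later."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary
        self.pending = deque()  # changes not yet applied to the secondary

    def update(self, key, value):
        self.primary.update(key, value)
        self.pending.append((key, value))

    def propagate(self):
        """Apply queued changes to the secondary in their original order."""
        while self.pending:
            key, value = self.pending.popleft()
            self.secondary.update(key, value)


primary, secondary = Database(), Database()
dup = ControlledDuplication(primary, secondary)
dup.update("customer-7", "gold")
print(secondary.read("customer-7"))  # the copy is still behind
dup.propagate()
print(secondary.read("customer-7"))  # now both copies agree
```

Because the secondary is only updated afterwards, a failure of the secondary does not block updates on the primary, at the price of the copy being slightly out of date.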
Creating consistency in existing databases
Say you want a silo app to share its data; there are three problems:
Technical: convert the old program to use the new component interface or equivalent.
Often creating component interfaces is not easy on old technology.
Data migration problem: move data from the old database to a new one that may be formatted
differently. This could mean lots of manual labor because we don’t really trust AI for this.
Business process problem: change the business process to use the new data. No matter the
technology, data will always remain a business responsibility.
When the database supports ODBC and OLE DB interfaces and facilities this change can be done very
An information broker pattern (when one
update to the
data integrity issues (when both systems broadcast updates at the same time on
the same data) and the killer problem is often there is no equal object identifier for all the data. So
they don’t know
how to translate these updates.
The information controller
The authors of the book introduce the title information controller for the person who is responsible
for the accuracy and quality of all data in the organization. Data quality is best fixed when it is input,
thus the information controller has a very visible job. He should understand what is meant by data
access and transactions.
Chapter 15: Changing and integrating applications
Little has been written on how to move legacy applications into a new situation. The previous chapter
discussed how to migrate the data. The book gives an example which chops the path between the before and
after situations into several smaller projects and prioritizes them. Doing everything all at once would take
too much time before results ar
Use a high-level business process model of the system and map the current IT applications onto it.
From this information identify the necessary shared data and the integration dependencies (real-time
and deferrable) between the applications.
Usually most changes for before
new situations are data consolidation and adding a presentation
To build the presentation layer there are two options: 1) leave the old (green-screen) interface and
build a new one on top (also called screen scraping), and 2) a transaction service interface (see page
Screen scraping must do the following:
Log on to the back end
Do any menu traversal required
Reformat the data into a string of text that precisely conforms to the format expected b
Take out the fields from exactly the right place in the received message
Keep track of the position in the dialogue of many end users simultaneously
On top of that you need to take care of error handling. If you have to apply screen scraping on ten
screens it’s a lot of work. A web message might correspond to loads of smaller green-screen messages;
for each of those an error could have occurred, and you’ll have to roll back (two-phase
commit). See also page 282.
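Two of the screen scraping steps above, formatting the input string exactly and taking fields out of exactly the right place, come down to fixed-position text handling. The sketch below shows the idea; the screen layout in FIELDS and both function names are invented for this example and do not describe any real system.

```python
# Hypothetical layout of a green-screen reply: field name -> (start, end) column.
FIELDS = {
    "account": (0, 8),
    "name": (8, 28),
    "balance": (28, 38),
}
SCREEN_WIDTH = 38


def build_request(account: str, amount: float) -> str:
    """Format the input exactly as the old terminal dialogue expects it."""
    return f"{account:<8}{amount:>10.2f}"


def parse_screen(screen: str) -> dict:
    """Cut each field out of its fixed position in the received text."""
    if len(screen) < SCREEN_WIDTH:
        raise ValueError("screen is shorter than the expected layout")
    return {name: screen[start:end].strip() for name, (start, end) in FIELDS.items()}


reply = "12345678" + "J. Smith".ljust(20) + "100.50".rjust(10)
print(parse_screen(reply))
print(repr(build_request("1234", 25.5)))
```

If the back-end screen layout shifts by even one column, every offset in FIELDS is wrong, which is one reason screen scraping is fragile and a lot of work.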
A transaction service interface is the preferred choice. You need to do three things to make this
Change existing production code: most presentation and security code can be removed
because this will now be at the presentation layer. In many cases the
old code is not
considered the basis for reading and writing data but some of this logic ends up in the
presentation layer as well; e.g. a new customer is stored locally who has not been validated
is used for a new order. This ordering dependency is a reason to store state (which you don’t
want), as well as security and temporary data gathering. You should solve this as well as
possible by sending everything in one large message to the database, or store the state in the
Wrap old technology with new: you create a new interface for an application by installing
some software that converts the old interface to the new interface. This can be easy or very
difficult, which often only becomes clear when a programmer tries. If it is too difficult you
might still need to go for a new application.
Most important here is that all transactions need to be stateless; all the
baggage that comes from a terminal interface must be eliminated.
Look at the impact on the business processes. One option is to put all core processes in a
transaction server; making the processes very flexible would then require all state to be
recorded in the object (how far it is in the process). Advantages: 1) the core server
itself is stateless, 2) process rules are implemented in one place, 3) it is not difficult to mix
presentation modes (e.g. web or telephone call), and 4) different core activities can be
physically located on different servers. The alternative: each presentation layer refinement
is implemented as a separate application; instead of calling core processes, these are
incorporated in the application. Advantages: 1) fast (data stored locally), 2)
processes are easier to change. Disadvantages: 1) less assurance of consistency, 2)
impossible to mix presentation modes, and 3) hard to reuse components and modules.
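To illustrate the stateless requirement and the "one large message" approach from the first step, here is a small Python sketch. The legacy class, its two-step terminal-style dialogue and all names are hypothetical; the only point is that the wrapper receives the new customer and the order in a single request and keeps no session state between calls.

```python
class LegacyOrderApp:
    """Stand-in for an old application with a stateful, terminal-style dialogue."""

    def __init__(self):
        self.customers = {}
        self.orders = []
        self._current_customer = None  # state left over from the terminal dialogue

    def enter_customer(self, customer_id, name):
        self.customers[customer_id] = name
        self._current_customer = customer_id

    def enter_order(self, item, quantity):
        if self._current_customer is None:
            raise RuntimeError("no customer selected")
        self.orders.append((self._current_customer, item, quantity))
        self._current_customer = None


class StatelessOrderService:
    """Wrapper: one self-contained message in, one result out, no session state."""

    def __init__(self, legacy):
        self._legacy = legacy

    def place_order(self, message):
        customer = message.get("customer", {})
        order = message.get("order", {})
        # Validate everything up front so the legacy dialogue is never left half done.
        if not customer.get("id") or not customer.get("name"):
            return {"ok": False, "error": "customer id and name are required"}
        if not order.get("item") or order.get("quantity", 0) <= 0:
            return {"ok": False, "error": "item and a positive quantity are required"}
        # Drive the old two-step dialogue inside a single call.
        self._legacy.enter_customer(customer["id"], customer["name"])
        self._legacy.enter_order(order["item"], order["quantity"])
        return {"ok": True, "orders_stored": len(self._legacy.orders)}
```

Because each call carries everything it needs, any presentation channel (web, telephone) could send the same message, which is the property the notes ask for.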
Changing from (old) file transfer applications to message queuing involves training, installation,
configuration, development of operating procedures, and program changes to existing applications.
Benefits of message queuing over file transfer are responsiveness and integrity.
Similarly, moving from RPC to transactional component middleware has long-term benefits but not a
quick return.
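The responsiveness benefit of message queuing over file transfer can be pictured with a toy Python comparison. It uses the standard library's in-memory `queue.Queue` as a stand-in for real message queuing middleware, so it says nothing about integrity (a real product would persist messages and deliver them transactionally); the function names are made up for this sketch.

```python
import queue
import threading


def file_transfer_style(records):
    """Receiver sees nothing until the complete file has been handed over."""
    transferred_file = list(records)  # the whole file is written first
    return [f"processed {record}" for record in transferred_file]


def message_queue_style(records):
    """Receiver handles every record as soon as it is put on the queue."""
    q = queue.Queue()
    results = []

    def consumer():
        while True:
            record = q.get()
            if record is None:  # sentinel: the sender has finished
                break
            results.append(f"processed {record}")

    worker = threading.Thread(target=consumer)
    worker.start()
    for record in records:
        q.put(record)  # available to the consumer immediately
    q.put(None)
    worker.join()
    return results
```

Both functions end up with the same results; the difference is that in the queued version the consumer can start working while the sender is still producing.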
Batch runs are still very common. These days many banks need to calculate real-time balances and
run batch processes at night that do the actual debits/credits on the accounts. Because of the
24-hour society the batch runs need to be shortened more and more. There are four ways to do this:
Shorten the administration (discussed in chapter 7, which covers resiliency)
Shorten the batch process; usually this means running processes in parallel
Run batch alongside the online processing; as long as the transactions are short this can be
done. The odd thing is that the batch would then sit alongside the presentation layer elements
(see page 294, second diagram) and compete with them for the transaction server.
When the database needs to be frozen for a batch process (e.g. reporting functions), replace
the batch program with online programs that do things differently (see page 294; I have not
really understood this one)
In many ways running a batch is like running a giant transaction.
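The third option above (batch alongside online processing, with short transactions) can be sketched with Python's built-in `sqlite3` module. The table, the postings and the chunk size are invented for illustration; the idea is simply that the batch commits after every small chunk instead of holding one giant transaction.

```python
import sqlite3


def run_batch(conn, postings, chunk_size=100):
    """Apply (account, amount) postings, committing after every chunk."""
    commits = 0
    for start in range(0, len(postings), chunk_size):
        chunk = postings[start:start + chunk_size]
        # One short transaction per chunk: commit on success, roll back on error.
        with conn:
            conn.executemany(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                [(amount, account) for account, amount in chunk],
            )
        commits += 1
    return commits


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, 0)", [(i,) for i in range(1, 4)])
    conn.commit()
    postings = [(1, 50), (2, -20), (1, 25), (3, 10), (2, 5)]
    commits = run_batch(conn, postings, chunk_size=2)
    balances = dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
    conn.close()
    return commits, balances
```

Setting `chunk_size` to the number of postings turns the run back into the single giant transaction the notes compare a batch to; smaller chunks hold locks for a shorter time, at the cost of more commits.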
Chapter 16: Building an IT Architecture
An architectural model should be a guide for implementation, and modified when necessary. It is not
something to be put on a shelf. It includes functional and nonfunctional requirements (resiliency,
performance, manageability, and security of the resulting system).
Chapter 16 includes several case studies which are not summarized in this document. Check ’em out
for yourself to get some more feeling for practical applications.
Section 16.3 sums up the main points of the book in only a few pages; read it (if you made it this far)!
Also, a last word of note: each chapter has its own summary. These summaries sometimes deviate
from the actual chapter; a better title for them would be: “What we think you should
remember”. They are also a good read!