Unit IV: SOAP protocol, XML-RPC, HTTP, SOAP faults and SOAP ...

hungryhorsecabinSoftware and s/w Development

Dec 14, 2013 (3 years and 8 months ago)


SRM/MCA/HS
92


Unit IV:
SOAP protocol, XML-RPC, HTTP, SOAP faults and SOAP attachments, Web services, UDDI, XML security


1. RPC (Remote Procedure Call)


It is often necessary to design distributed systems, where the code to run an application is spread across multiple computers. For example, to create a large transaction processing system, you might have separate servers for business logic objects, for presentation logic objects, a database server, and so on, which all need to talk to each other.



In order for a model like this to work, code on one computer needs to call code on another computer. For example, the code in the web server might need a list of orders for display on a web page, in which case it would call code in the business objects to provide that list of orders. That code in turn might need to talk to the database. When code on one computer calls code on another computer, this is called a Remote Procedure Call, usually abbreviated RPC.
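As a rough illustration of the web server asking the business objects for a list of orders, the following Python sketch runs both sides on one machine. It uses the standard-library XML-RPC modules purely as a convenient stand-in for an RPC mechanism; the ORDERS data and the names list_orders, start_business_server, and fetch_orders_remotely are assumptions made for this example and do not come from the text.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Hypothetical "business objects" data; invented for this sketch.
ORDERS = {"alice": ["order-1001", "order-1002"], "bob": []}


def list_orders(customer):
    """Runs on the business-logic side and returns that customer's orders."""
    return ORDERS.get(customer, [])


def start_business_server():
    """Start an RPC server on a free local port and return (server, port)."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(list_orders, "list_orders")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, server.server_address[1]


def fetch_orders_remotely(port, customer):
    """Runs on the web-server side: reads like a local call, but crosses the network."""
    with ServerProxy(f"http://127.0.0.1:{port}/") as business_objects:
        return business_objects.list_orders(customer)
```

The caller never touches sockets or message formats directly; that hiding of the network hop is what makes the call a remote procedure call.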

2. RPC protocols

There are a number of protocols that exist for performing remote procedure calls, but the most common are DCOM and IIOP (both of which are extensions of other technologies: COM and CORBA, respectively), and Java RMI. Each of these protocols provides the functionality that you need to perform remote procedure calls, although each also has its drawbacks.


COM and DCOM are Microsoft-specific. Both IIOP and DCOM provide language independence. Because Java is specifically designed for platform independence, Java RMI evolved as its mechanism for distributed computing. It has the ability to transfer code with every call, but it has the drawback that it ties the programmer to one programming language, Java, for all of the objects in the distributed system.






SOAP: The new RPC protocol


To these existing protocols, we can now add one more: the Simple Object Access Protocol, or SOAP. According to the current SOAP specification it is "a lightweight protocol for exchange of information in a decentralized, distributed environment". In other words, it is a standard way to send information from one computer to another using XML to represent the information.

The fundamental change brought about by SOAP has been the ability to move data anywhere across the Web. Figure 4.2 illustrates that until SOAP there were only two main options for moving data between partners.


One was to build a wide area network spanning a broad geographic region and let partners plug into it. This was the approach taken by Electronic Data Interchange (EDI), which defined messages and protocols for data transfer but left the network details up to the partners. The result was a collection of networks that pretty much locked the partners in and made it difficult and expensive to reach out to other EDI networks and costly to bring in new partners.


The second approach for moving data between partners was to build a distributed object infrastructure that ran over the Internet. This was the approach taken by Common Object Request Broker Architecture (CORBA), Remote Method Invocation (RMI), and Distributed Component Object Model (DCOM). The problem was that each had to decide on a protocol that could sit on top of TCP/IP and handle interobject communication. CORBA chose Internet Inter-ORB Protocol (IIOP), DCOM chose Object Remote Procedure Call (ORPC), and RMI chose Java Remote Method Protocol (JRMP). While this approach reduced the need to share the same underlying network, the drawback was that CORBA could talk to CORBA, RMI to RMI, and DCOM to DCOM, but they could not talk to each other nor directly to the Web except through special sockets that required adding extra layers to an already complex architecture.


In a nutshell, SOAP is a transport protocol similar to IIOP for CORBA, ORPC for DCOM, or
JRMP for RMI. SOAP differs from CORBA, RMI, or DCOM in several ways:




IIOP, ORPC, and JRMP are binary protocols, while SOAP is a text-based protocol that uses XML. Using XML for data encoding makes SOAP easier to debug and easier to read than a binary stream.

Because SOAP is text-based, it is able to move more easily across firewalls than IIOP, ORPC, or JRMP.

SOAP is based on XML, which is standards-driven rather than vendor-driven. Potential adopters are less likely to fear vendor lock-in with SOAP.

SOAP, the third option shown in Figure 4.2, combines the data capabilities of XML with the transport capability of HTTP, thereby overcoming the drawbacks of both EDI and tightly coupled distributed object systems such as CORBA, RMI, and DCOM. It does this by breaking the dependence between data and transport and in doing so opens up a new era of loosely coupled distributed data exchange.





3. SOAP


SOAP is an XML-based protocol for exchanging information in a decentralized, distributed environment. SOAP enables two processes to communicate with each other regardless of the hardware and software platforms on which they are running.

It was literally made for the Web, a combination of XML and HTTP that opens up new options for distributed data exchange and interaction in a loosely coupled Web environment. The SOAP specification defines a protocol where all information sent from computer to computer is marked up in XML, and the information is usually transmitted via HTTP.


As Figure 4.10 shows, SOAP is a technology that allows XML to move easily over the Web. SOAP does this by defining an XML envelope for delivering XML content and specifying a set of rules for servers to follow when they receive a SOAP message.





Some of the advantages of SOAP over the other protocols we've discussed so far:

It's platform-, language-, and vendor-neutral. Because SOAP is implemented using XML and (usually) HTTP, it is easy to process and send SOAP requests in any language, on any platform, without having to depend on tools from a particular vendor.

It's easy to implement. SOAP was designed to be less complex than the other protocols, hence the word "simple" in its name. A SOAP server could be implemented using nothing more than a web server and an ASP page, or a CGI script.

It's firewall safe. Assuming that you use HTTP as your network protocol, SOAP messages can be passed across a firewall, without having to perform extensive configuration. However, if a firewall administrator does want to filter out SOAP messages,

The fundamental building block of SOAP is XML. SOAP defines a specialized, yet flexible XML grammar that standardizes the format and structure of messages. The benefits of XML in SOAP are:

XML is human readable, making it easier to understand and debug

XML parsers and related technologies are widely available

XML is an open standard

XML includes many related technologies that can be leveraged in SOAP


The capabilities of SOAP are:

Enables interoperability between systems using standard, widely available protocols such as XML and HTTP

Allows systems to communicate with each other through firewalls, without having to open additional, potentially unsafe ports

Fully describes each data element in the message, making it easier to understand and troubleshoot problems that may occur


SOAP does not do the following:

Attempt to define how objects are created or destroyed

Impose any specific security mechanism or implementation

Define an authentication scheme

SOAP messages can be exchanged over Secure Sockets Layer (SSL), which is a standard web protocol that provides a secure, encrypted HTTP connection between the client and server.

SOAP has gained wide acceptance across the software industry. Its impact is evident from the following observations:

Web services frameworks use SOAP as the transport technology for delivering data and XML-RPC messages across distributed networks.

Microsoft is committed to SOAP as part of its .NET initiative.




Sun is using SOAP in its Sun Open Net Environment (Sun ONE) Web services framework.

IBM, which has played a major role in the SOAP specification, has numerous SOAP support tools, including a SOAP toolkit for Java programmers. IBM has donated the toolkit to Apache Software Foundation's XML Project, which has published an Apache-SOAP implementation based on the toolkit.

CORBA Object Request Broker (ORB) vendors such as Iona are actively supporting SOAP in the form of CORBA-to-SOAP bridges.

SOAP specification:

The SOAP protocol specification is a W3C submitted Note that is now under the umbrella of the XML Protocol Working Group. SOAP consists of three parts:

Encoding rules that control XML tags that define a SOAP message and a framework that describes message content

Rules for exchanging application-defined data types, including when to accept or discard data or return an exception to the sender

Conventions for representing remote procedure calls and responses

SOAP message elements:

SOAP messages define one-way data transmission from a sender to a receiver. However, SOAP messages are often combined to implement patterns such as request-response. When using HTTP bindings with SOAP, SOAP response messages can use the same connection as the inbound request.

SOAP messages have a common format that includes a SOAP Envelope, an optional Header, and a Body section that contains the message content. SOAP also defines a message path and set of roles that SOAP nodes can adopt along a path.



SOAP envelope: It is a required part of a SOAP message. It serves as a container for all the remaining SOAP message elements. Typically, it includes the SOAP header and body elements. It also defines the namespaces used by these elements.

SOAP header: It is an optional part of a SOAP message. It defines additional information that can be related to the method request in the body element. It does not contain specific semantics.

SOAP body: It is a required part of a SOAP message that contains the data specific to a particular method call, such as the method name and any input/output arguments or the return values produced by the method.
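The envelope, header, and body structure described above can be sketched with Python's standard xml.etree.ElementTree module. This is only a minimal sketch for SOAP 1.1: the function name build_envelope, the application namespace http://tempuri.org/ used as a placeholder, and the way arguments are passed as a dictionary are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://tempuri.org/"  # placeholder application namespace (assumption)

ET.register_namespace("soap", SOAP_ENV)


def build_envelope(method, args, header_entries=None):
    """Return a SOAP 1.1 envelope, as a string, for one method call."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    if header_entries:  # the Header is optional, so emit it only when needed
        header = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
        for name, value in header_entries.items():
            ET.SubElement(header, f"{{{APP_NS}}}{name}").text = str(value)
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")  # the Body is required
    call = ET.SubElement(body, f"{{{APP_NS}}}{method}")
    for name, value in args.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")
```

For example, build_envelope("CtoF", {"Tempr": 100}) produces an envelope with a Body only, while passing header_entries adds a Header element before the Body.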






SOAP coding sample:


<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xsd="http://www.w3.org/2001/XMLSchema"
               xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Header>
    <authentication xmlns="http://tempuri.org">
      <username>srm</username>
      <password>mca</password>
    </authentication>
  </soap:Header>
  <soap:Body>
    <ctempresponse xmlns="http://tempuri.org/">
      <ctempresult>0</ctempresult>
    </ctempresponse>
  </soap:Body>
</soap:Envelope>
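On the receiving side, a client has to dig the result out of the Body. The following Python sketch shows one way this might be done with the standard library, assuming the response uses the SOAP 1.1 envelope namespace and the http://tempuri.org/ application namespace; the RESPONSE string is a trimmed copy of the sample and read_result is a name invented for this example.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace map used only for the lookup below.
NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "app": "http://tempuri.org/",
}

RESPONSE = """\
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <ctempresponse xmlns="http://tempuri.org/">
      <ctempresult>0</ctempresult>
    </ctempresponse>
  </soap:Body>
</soap:Envelope>
"""


def read_result(envelope_xml):
    """Return the text of <ctempresult> from the Body, or None if it is absent."""
    root = ET.fromstring(envelope_xml)
    return root.findtext("soap:Body/app:ctempresponse/app:ctempresult", namespaces=NS)
```

Note that the lookup is by namespace, not by prefix: the elements are found even though the message declares the application namespace as a default namespace rather than with an "app" prefix.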





4. XML-RPC

XML-RPC is a remote procedure call (RPC) protocol which uses XML to encode its calls and HTTP as a transport mechanism. XML-RPC works by sending an HTTP request to a server implementing the protocol. The client in that case is typically software wanting to call a single method of a remote system. Multiple input parameters can be passed to the remote method, and one return value is returned. The parameter types allow nesting of parameters into maps and lists, thus larger structures can be transported. Therefore XML-RPC can be used to transport objects or structures both as input and as output parameters.

Identification of clients for authorization purposes can be achieved using popular HTTP security methods. Basic access authentication is used for identification. HTTPS is used when identification (via certificates) and encrypted messages are needed.

XML-RPC is simpler to use and understand than SOAP because it

allows only one method of serialization, whereas SOAP defines multiple different encodings

has a simpler security model

does not support (nor require) the creation of WSDL service descriptions, although XRDL provides a simple subset of the functionality provided by WSDL

But SOAP tries to pick up where XML-RPC left off by implementing user-defined data types, the ability to specify the recipient, message-specific processing control, and other features.


Here's an example of an XML-RPC request:

POST /RPC2 HTTP/1.0
User-Agent: Frontier/5.1.2 (WinNT)
Host: betty.userland.com
Content-Type: text/xml
Content-length: 181

<?xml version="1.0"?>
<methodCall>
  <methodName>examples.getStateName</methodName>
  <params>
    <param>
      <value><i4>41</i4></value>
    </param>
  </params>
</methodCall>
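The body of a request like the one above does not have to be written by hand. As a small sketch, Python's standard xmlrpc.client module can encode and decode the same call; the variable names payload, params, and method are simply chosen for this example.

```python
import xmlrpc.client

# Encode the same call as the request above; 41 is the single integer parameter.
payload = xmlrpc.client.dumps((41,), methodname="examples.getStateName")

# Decoding recovers the parameter tuple and the method name from the XML.
params, method = xmlrpc.client.loads(payload)
```

The generated XML may differ cosmetically from the sample (for instance, an encoder may write the integer with an int tag where the sample uses i4; XML-RPC treats the two as the same type), but it decodes to the same method name and parameters.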




5. HTTP

The Hypertext Transfer Protocol, HTTP, is a request/response protocol. HTTP is an important building block for using XML as a Web-based messaging protocol. Although the Internet and various protocols such as FTP and TELNET had been in existence since the 1970s for moving files, sending email, and allowing individuals to connect remotely, it wasn't until 1992 that the face of the Internet was changed through the use of a simple request-response protocol known as HTTP.

Both HTTP and FTP move data across the Internet. FTP delivers data directly to disk while HTTP delivers it to a browser. When the data is in HTML or a format the browser understands, we have the Web.


When you make an HTTP request the following steps occur:

A connection to the HTTP server is opened

A request is sent to the server

Some processing is done by the server

A response from the server is sent back

The connection is closed
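The five steps above can be traced in a few lines of Python using only the standard library. This is a sketch that runs a tiny server and a client on the same machine; HelloHandler, start_server, and fetch are names invented for the example, and the greeting the server returns is arbitrary.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Stand-in server: its 'processing' step is just building a greeting."""

    def do_GET(self):
        body = f"hello from {self.path}".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_server():
    """Start the stand-in server on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(port, path):
    """Perform one GET and return (status, body text)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)      # steps 1-2: connection opens, request is sent
        response = conn.getresponse()  # steps 3-4: server processes, response returns
        return response.status, response.read().decode("utf-8")
    finally:
        conn.close()                   # step 5: connection is closed
```

Calling fetch(server.server_address[1], "/orders") against a server returned by start_server() goes through all five steps once.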

HTTP, a simple request-response Web protocol, has been the catalyst for XML's widespread use. The HTTP GET command requests a Web page. The HTTP POST command delivers information and receives information back.




The most common scenario on the Web is for the requested file to contain text and HTML tags. When text and tags are returned to a browser, the tags are interpreted according to a browser's internal programming model. The HTTP protocol says nothing about how tags are rendered, which is why different browsers often display the same Web page in very different ways. The HTTP GET command, however, allows data transfer only from server to client; to permit the transfer of data from client to server, the POST command was added.

The POST command is a request for a server to do something with data delivered as part of the POST message. POST was included in the HTTP specification in order to deliver HTML form data to a server for processing by some server program. The structure of a POST request is similar to a GET, except that data intended for the server appears after the header and is referred to as the body or payload of the request.

Figure 4.5 illustrates the structure of an HTTP request showing the difference between GET and POST. When a POST request arrives at a server, the server looks for data following the blank line that signals the end of header information. This data delivery mechanism turns out to be the key element in moving XML across the Internet. Instead of supplying data from an HTML form, the payload slot of an HTTP request can just as easily be packaged with XML.
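To make the role of the blank line concrete, the following Python sketch assembles a raw POST request around an XML payload and then splits it again the way a server would. The function names build_post_request and split_request, the host, the path, and the sample XML are all assumptions made for illustration.

```python
def build_post_request(host, path, xml_body):
    """Return the raw bytes of an HTTP/1.0 POST whose payload is an XML document."""
    body = xml_body.encode("utf-8")
    headers = [
        f"POST {path} HTTP/1.0",
        f"Host: {host}",
        "Content-Type: text/xml; charset=utf-8",
        f"Content-Length: {len(body)}",
    ]
    # A blank line (CRLF CRLF) ends the headers; everything after it is the payload.
    return "\r\n".join(headers).encode("ascii") + b"\r\n\r\n" + body


def split_request(raw):
    """Do what the server does: find the blank line, return (header_lines, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    return head.decode("ascii").split("\r\n"), body
```

Nothing in the request structure cares whether the payload is form data or XML; only the Content-Type header and the bytes after the blank line change.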




As the figure shows, XML's transport independence means that it may be carried by any Internet protocol, including HTTP and FTP, or even sent via mail using Simple Mail Transfer Protocol (SMTP). This freedom to move data has opened the door to XML-RPC, SOAP, and the entire Web services initiative. XML and HTTP are loosely coupled, with no internal dependencies on each other. Distributed infrastructures such as CORBA, RMI, and DCOM are tightly coupled, with dependencies between data and transport.


Why HTTP for SOAP?

Most SOAP implementations would probably use HTTP as their transport. Why is that? Here are a few reasons:

HTTP is already a widely implemented, and well understood, protocol.

The request/response paradigm lends itself to RPC well. In actual fact, the SOAP specification says that SOAP messages are one-way, instead of two-way. This would mean that there would have to be two separate messages sent: one from the "client" to the "server" with the numbers to add, and one from the "server" back to the "client" with the result of the calculation. Luckily, the SOAP specification also says that when a request/response protocol, such as HTTP, is used, these two messages can be combined in the request/response of the protocol.

Most firewalls are already configured to work with HTTP. Because firewalls are configured to let HTTP traffic go through, it is much easier to provide the necessary functionality if all of the communication between the web server and the other servers uses this protocol.

HTTP makes it easy to build in security, with Secure Sockets Layer (SSL).



SOAP request and response through HTTP:








6. SOAP faults

SOAP faults occur when an application cannot understand a SOAP message or when an error occurs during the processing of a message. SOAP defines an XML fault element that carries error and/or status information back to the message sender.

Faults are intended to provide detail to the sender as to why the fault occurred. The information that can be returned as part of a fault includes the following:

faultcode: SOAP defines a set of faultcodes for basic SOAP errors, although an application may provide its own codes.

faultstring: This element provides a readable explanation as to why the fault occurred.

detail: The value of the detail element is that it provides information about the problem that occurred while processing the Body element. If not present, it indicates that the problem did not occur in the body of the SOAP message.


A SOAP message indicating a fault might look similar to this:

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <SOAP-ENV:Fault>
      <faultcode>SOAP-ENV:Server</faultcode>
      <faultstring>Database error</faultstring>
      <faultactor>some URI</faultactor>
      <detail>
        <e:addNumbersFault xmlns:e="some URI">
          <e:dbError>1001</e:dbError>
          <e:message>invalid columnname</e:message>
        </e:addNumbersFault>
      </detail>
    </SOAP-ENV:Fault>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>


The <faultcode> element contains a unique identifier, which identifies this particular type of error. The SOAP specification defines four such identifiers:

Fault Code and Description

VersionMismatch: A SOAP message was received that specified a version of the SOAP protocol that this server doesn't understand. (This would happen by specifying a different namespace for the <Envelope> element than the one we've been using so far.)

MustUnderstand: The SOAP message contained a mandatory header, which the SOAP server didn't understand.

Client: Indicates that the message was not properly formatted. That is, the client made a mistake when creating the SOAP message.

Server: Indicates that the server had problems processing the message, even though the contents of the message were formatted properly. For example, perhaps a database was down.




Keep in mind that the identifier is actually namespace-qualified, using the http://schemas.xmlsoap.org/soap/envelope/ namespace.

The <faultstring> element simply contains a human-readable string, containing a description of the error.

The <faultactor> element contains a URI which specifies which SOAP intermediary caused the fault.

The <detail> element can be used to contain additional, application-specific information about the error. Just like the <Body> element, the actual information is included in namespace-qualified child elements of the <detail> element.
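A client that receives a response needs to check whether the Body carries a Fault before looking for a result. The following Python sketch shows one possible way to do that for SOAP 1.1 messages with the standard library. The FAULT_MESSAGE string is modeled on the fault sample, with urn:example:faults standing in for "some URI"; parse_fault and the keys of the dictionary it returns are invented for this example.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

FAULT_MESSAGE = """\
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <SOAP-ENV:Fault>
      <faultcode>SOAP-ENV:Server</faultcode>
      <faultstring>Database error</faultstring>
      <detail>
        <e:addNumbersFault xmlns:e="urn:example:faults">
          <e:dbError>1001</e:dbError>
        </e:addNumbersFault>
      </detail>
    </SOAP-ENV:Fault>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
"""


def parse_fault(envelope_xml):
    """Return a dict describing the Fault in the Body, or None if there is none."""
    root = ET.fromstring(envelope_xml)
    fault = root.find(f"{{{SOAP_ENV}}}Body/{{{SOAP_ENV}}}Fault")
    if fault is None:
        return None
    detail = fault.find("detail")
    return {
        # Drop the namespace prefix so "SOAP-ENV:Server" is reported as "Server".
        "code": fault.findtext("faultcode", default="").split(":")[-1],
        "string": fault.findtext("faultstring"),
        "actor": fault.findtext("faultactor"),  # None when no intermediary is named
        # An absent <detail> suggests the problem was not in processing the Body.
        "has_detail": detail is not None and len(detail) > 0,
    }
```

A caller might branch on the code: "Client" suggests fixing the request before retrying, while "Server" suggests the same request could succeed later.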

7. SOAP attachments


SOAP provides a protocol to deliver XML across the Internet. However, requirements often dictate that not just XML needs to be transported but also other related documents such as DTDs, schema, Unified Modeling Language diagrams, faxes, public and private keys, and digests that may be related to the XML.

In keeping with the spirit of the Web not to introduce new technologies when existing ones are available, SOAP relies on the existing rules for HTTP attachments to deliver auxiliary data with a primary SOAP message, allowing a SOAP message to reference the attachments.

The SOAP with Attachments document (see Figure 4.15) defines a binding for a SOAP message to be carried within a Multipurpose Internet Mail Extensions (MIME) multipart/related message in such a way that the processing rules for the SOAP message are preserved.

The MIME multipart mechanism for encapsulation of compound documents can be used to bundle entities related to the SOAP message, such as attachments.
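The multipart/related packaging can be sketched with Python's standard email.mime classes, which implement the same MIME rules. This is only an illustration of the idea, not the SOAP with Attachments binding in full: the SOAP_PART content, the doc element with its href, the Content-ID value, and the function name build_package are all assumptions made for this example.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical SOAP message that points at its attachment by Content-ID.
SOAP_PART = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    '<soap:Body><doc href="cid:diagram@example.com"/></soap:Body>'
    "</soap:Envelope>"
)


def build_package(soap_xml, attachment_bytes, content_id):
    """Bundle a SOAP envelope and one binary attachment as multipart/related."""
    package = MIMEMultipart("related", type="text/xml")
    package.attach(MIMEText(soap_xml, "xml", "utf-8"))  # root part: text/xml
    attachment = MIMEApplication(attachment_bytes)      # application/octet-stream
    attachment.add_header("Content-ID", f"<{content_id}>")
    package.attach(attachment)
    return package
```

The SOAP message stays an ordinary envelope in the first part; the attachment travels beside it and is tied to it only through the Content-ID reference.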






8. Web services


A Web service is a programmable URL. It is an application component that is remotely callable using standard Internet protocols such as HTTP and XML. Web services on the Internet do not depend on a specific OS, object model, or programming language.


Web services are all about delivering distributed applications via programmable components that are accessible over the web. As an example, many e-commerce sites need to calculate shipping charges based on a variety of shipping options. Typically, such a site might maintain a set of database tables that describe the shipping options and charges for each shipping company in its web site.

By utilizing screen scraping (essentially a process of analyzing the data in a page for certain patterns in order to extract the data for processing), a program can examine the web page and extract the shipping information from that page. Consider instead the e-commerce site programmatically calling a web service provided by the shipping company on its web site that automatically calculates shipping costs based on the shipping method and package weight that you specify in your request and returns the resulting charge to you in real time.


The potentially valuable applications of web services are:

Extend the capabilities of classic distributed applications and services to the heterogeneous platform that is the Internet

Services that are either too difficult or too expensive to implement yourself. Examples: credit card validation, financial account management, stock quotes, and so on.

Services that provide commonly needed functionality for other services. Examples: user authentication, usage billing, usage auditing, and so on.

Services that aggregate distributed, discrete services into an orchestrated whole. A good example of this type of service would be travel booking.

Services that integrate your business systems with your partners


Web services represent an industry-wide response to the need for a flexible and efficient business collaboration environment.

Describing: A Web service describes its functionality and attributes so that other applications can figure out how to use it.

Exposing: Web services register with a repository that contains a white pages holding basic service-provider information, a yellow pages listing services by category, and a green pages describing how to connect and use the services.

Being invoked: When a Web service has been located, a remote application can invoke the service.

Returning a response: When a service has been invoked, results are returned to the requesting application.

The Web Services Architecture:

A Web service depends on several enabling technologies including XML, SOAP, UDDI, and WSDL. As Figure 5.2 illustrates, there are three major aspects to Web services:




A service provider provides an interface for software that can carry out a specified set of tasks.

A service requester discovers and invokes a software service to provide a business solution. The requester will commonly invoke a remote procedure call on the service provider, passing parameter data to the provider and receiving a result in reply.

A repository or broker manages and publishes the service. Service providers publish their services with the broker, and requesters access those services by creating bindings to the service provider.


UDDI is a protocol for describing Web services components that allows businesses to register with an Internet directory so they can advertise their services and companies can find each other and carry out transactions over the Web.

WSDL is the proposed standard for describing a Web service. WSDL is built around an XML-based service Interface Definition Language that defines both the service interface and the implementation details. WSDL details may be obtained from UDDI entries that describe the SOAP messages needed to use a particular Web service.




SOAP is a protocol for communicating with a UDDI service (see Figure 5.3). SOAP simplifies UDDI access by allowing applications to invoke object methods or functions residing on remote servers. The advantage of SOAP is that it can use universal HTTP to make a request and to receive a response. SOAP requests and responses use XML not only to target the remote method but to package any data that is required by the method.


Exercise 40:

Web service example for temperature conversion in ASP.Net using VB:

<%@ WebService Language="VB" Class="TempConverter" %>

Imports System
Imports System.Web.Services

Public Class TempConverter

    ' Converts a temperature from Fahrenheit to Celsius.
    <WebMethod()> Public Function FtoC(ByVal Tempr As Decimal) As Decimal
        Return ((Tempr - 32) * 5) / 9
    End Function

    ' Converts a temperature from Celsius to Fahrenheit.
    <WebMethod()> Public Function CtoF(ByVal Tempr As Decimal) As Decimal
        Return ((Tempr * 9) / 5) + 32
    End Function

End Class
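As a quick check of the arithmetic in the two web methods, the same formulas can be written in Python. This is only a side-by-side sketch of the conversion logic, not a web service; the function names f_to_c and c_to_f are chosen for this example.

```python
from decimal import Decimal


def f_to_c(temp_f):
    """Fahrenheit to Celsius, same arithmetic as FtoC in the exercise."""
    return ((Decimal(temp_f) - 32) * 5) / 9


def c_to_f(temp_c):
    """Celsius to Fahrenheit, same arithmetic as CtoF in the exercise."""
    return ((Decimal(temp_c) * 9) / 5) + 32
```

Boiling point converts as expected in both directions: 212 degrees Fahrenheit gives 100 degrees Celsius, and 100 degrees Celsius gives 212 degrees Fahrenheit.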




Any software component or application can be exposed as a Web service so that it can be discovered and used by another component or application. Web services may be as simple as a movie review or weather forecast or as complex as a complete travel package that includes hotel and airline bookings and restaurant reservations.






9. UDDI

The Universal Description, Discovery, and Integration protocol (UDDI) is a protocol which allows web services to be registered, so that they can be discovered by programmers and other web services.


Registries may serve different public and private functions. A UDDI-compliant registry provides an information framework for describing the services of a Web entity or business. The Web services vision uses UDDI registries as the focal point for registering and locating services. It is expected that some registries will be public and others private. Microsoft, IBM, and HP have agreed to provide a public UDDI registry, open for search and connection across the entire Internet. But private registries will also be available either internally within companies or among a closely knit family of trusted partners and collaborators.

UDDI is the protocol for communicating with registries. The UDDI framework consists of several specifications that describe how a program can interact with a registry, including the following.


The UDDI Programmer's API Specification defines approximately 30 SOAP messages that are used to perform inquiry and publishing functions against any UDDI-compliant business registry. This specification outlines the details of each of the XML structures associated with these messages.

The UDDI Data Structure Specification defines the four major data structures used by the Programmer's API. These include businessEntity, businessService, bindingTemplate, and tModel.

The Organization of UDDI

Directory

White pages: Name, address, telephone number, and other contact information of a given business

Yellow pages: Categories of businesses based on existing (nonelectronic) standards

Green pages: Technical information about the Web services provided by a given business

Operation

Publish: How the provider of Web services registers itself

Find: How an application finds a particular Web service

Bind: How an application connects to and interacts with Web services after it's been found

Information

Business information: A businessEntity object contains information about services, categories, contacts, URLs, and other things necessary to interact with a given business.

Service information: Describes a group of Web services. These are contained in a businessService object.

Binding information: The technical details necessary to invoke Web services. This includes URLs, information about method names, argument types, and so on. The bindingTemplate object represents this data.

Service specification detail: This is metadata about the various specifications implemented by a given Web service. These are called tModels in the UDDI specification.




Using UDDI to Make the Connection (explained with the ZwiftBooks server connection):

The following is a scenario of interaction for connecting to our ZwiftBooks server using UDDI discovery:

1. A company is interested in writing software that connects to several book-service providers and comparing price and delivery times for each. It needs a program that can connect to the UDDI business registry via either a Web interface or a tool that uses the Inquiry API. After a lookup based on an appropriate yellow pages listing, the company obtains a businessEntity that represents ZwiftBooks.

2. Using the businessEntity, the client can either drill down for more detail or request a complete businessEntity structure. In either case, the objective is to obtain a bindingTemplate that provides the information about how to connect to the ZwiftBooks Web service.

3. Based on the details of the specification provided by the bindingTemplate, the company sets up its program to interact with the ZwiftBooks Web service. The semantics of the service may be obtained by accessing the tModel contained in the bindingTemplate for the service.

4. At runtime, the program invokes the Web service based on the connection details provided in the bindingTemplate.

When a failure occurs, the cached information is refreshed based on current information from a UDDI Web registry.
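The relationship between the four UDDI data structures, and the drill-down from a yellow pages category to an access point, can be pictured with a small in-memory model in Python. This is a teaching sketch only: it does not talk to a real registry or use the Inquiry API, and the field names (access_point, overview_url, categories) and the functions find_businesses and first_access_point are simplifications invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class TModel:
    name: str
    overview_url: str  # where the specification (for example a WSDL file) lives


@dataclass
class BindingTemplate:
    access_point: str  # URL used to invoke the service
    tmodels: list = field(default_factory=list)


@dataclass
class BusinessService:
    name: str
    bindings: list = field(default_factory=list)


@dataclass
class BusinessEntity:
    name: str
    categories: list = field(default_factory=list)  # yellow-pages style categories
    services: list = field(default_factory=list)


def find_businesses(registry, category):
    """Step 1: look up entities by category (a stand-in for an inquiry call)."""
    return [entity for entity in registry if category in entity.categories]


def first_access_point(entity, service_name):
    """Steps 2-4: drill down to a bindingTemplate and return its access point."""
    for service in entity.services:
        if service.name == service_name and service.bindings:
            return service.bindings[0].access_point
    return None
```

A program following the scenario would cache the access point it finds and repeat the lookup only when an invocation fails.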



10. XML security

XML is a flexible data framework that allows applications to communicate across the Internet. In order for XML to be used for e-commerce applications, there must be support for security and trust. Requirements for XML security include confidentiality, authentication, and data integrity. The World Wide Web Consortium (W3C) addresses these issues through XML Encryption and XML Signature, for authenticating merchants, suppliers, and buyers, and for digitally signing and encrypting XML documents. These initiatives make use of public and private keys but do not address the issue of how to trust key providers. Figure 7.1 illustrates the three basic security requirements for e-business:





Confidentiality: Ensuring that information is not made available or disclosed to unauthorized individuals, entities, or processes. Someone eavesdropping on a conversation or tapping into a data stream should not be able to understand the communication. Encrypting with a public key ensures confidentiality.

Authentication: The ability to determine that a message really comes from the listed sender. Closely associated with authentication is nonrepudiation: preventing the originator of a document or communication from denying having sent it. For a business transaction to be valid, neither party should later be able to deny participation. Encrypting with a private key ensures authentication.

Data integrity: Ensuring that when information arrives at its destination it hasn't been tampered with or altered in transit from its original form, either accidentally or deliberately. A digest or digital hash represents a unique snapshot of a document.


These three dimensions of secure e-commerce rest on a foundation of cryptography. All cryptography operates according to the same basic principle: some algorithm or formula is used to scramble or encipher information so that it is difficult to determine its meaning without an appropriate key to unscramble or decipher the information.
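The two uses of a key pair mentioned above (public key for confidentiality, private key for authentication) can be seen in a toy example. The sketch below uses textbook RSA with deliberately tiny primes; the numbers are chosen only for illustration and are completely insecure. Real systems use large keys, padding schemes and a vetted cryptographic library.

```python
# Toy RSA key pair (tiny primes, for illustration only).
p, q = 61, 53
n = p * q                    # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent (modular inverse, Python 3.8+)

message = 65                 # a message encoded as a number smaller than n

# Confidentiality: encrypt with the PUBLIC key, decrypt with the private key.
ciphertext = pow(message, e, n)
recovered = pow(ciphertext, d, n)

# Authentication: transform with the PRIVATE key, check with the public key.
signature = pow(message, d, n)
checked = pow(signature, e, n)
```

Only the holder of `d` can compute `recovered`, while anyone who knows `(e, n)` can compute `checked` and compare it with the message.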

HTTPS is not sufficient.

XML's use in Internet e-commerce applications demands the essential ingredients of all electronic security systems: confidentiality, authentication, and data integrity. While public-key cryptography provides techniques for meeting all three requirements, there are issues peculiar to XML and SOAP that require approaches that go beyond the basic capabilities provided by existing and widely accepted transport-layer security mechanisms, such as Secure Sockets Layer (SSL) and Transport Layer Security (TLS). The problem is that solutions such as SSL and TLS address only part of the requirements for confidentiality, authentication, and data integrity when an XML-SOAP combination is in use. The following sections describe several scenarios in which the security provided by the various transport-layer mechanisms may not be sufficient.

XML Document Security Issues

XML requires special treatment when encrypting or signing.

The rules of XML allow for some special scenarios that make it difficult to simply encrypt or digitally sign XML, such as the following:


Missing attributes declared to have default values are provided to the application as if present with the default value.

Character references are replaced with the corresponding character.

Entity references are replaced with the corresponding declared entity.

Attribute values are normalized by replacing character and entity references.

Attribute values are normalized further unless the attribute is declared to be of CDATA type: all leading and trailing spaces are stripped, and all interior runs of spaces are replaced with a single space.
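A short sketch of why these rules matter: the two strings below differ byte for byte, yet a standard parser hands the application exactly the same data. The example uses Python's built-in `xml.etree.ElementTree`; the element and attribute names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Same information, different surface form: character references,
# a line break inside an attribute value, and a different attribute order.
raw_a = '<price currency="US&#32;dollars" note="two\nlines">&#65;&amp;B</price>'
raw_b = '<price note="two lines" currency="US dollars">A&amp;B</price>'

parsed_a = ET.fromstring(raw_a)
parsed_b = ET.fromstring(raw_b)

same_bytes = raw_a == raw_b
same_text = parsed_a.text == parsed_b.text
same_attributes = parsed_a.attrib == parsed_b.attrib
```

A signature computed over the raw bytes of one form would not verify against the other, even though an application cannot tell the two documents apart after parsing.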




SOAP Security Issues

SOAP messaging raises new issues concerning digitally signing XML.

Just as XML processing brings up special issues related to digitally signing XML documents, SOAP, the common transport protocol for XML, also raises some issues:


SOAP security must illustrate how data can flow through an application and network topology to meet the requirements set by the policies of the business without exposing the data to undue risk.

SOAP security must not mandate specific technology or infrastructure, but must provide for portability, flexibility, interoperability, and heterogeneity.

XML white space may change while XML content remains the same.

Digital signatures only work if the verification calculation is performed on exactly the same bits as the signing calculation. Since noncontent white space can be added to an XML document without changing its meaning, some way to standardize a document must be used before signing and verification. For example, in ASCII text there are three commonly used line endings; we need to permit a signed text to be modified from one line-ending convention to another between the time of signing and signature verification and still be treated as the same document for verification purposes. The solution is to convert the document to some standard canonical form before signing so that surface changes in the document will not break the signature.
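The line-ending case can be sketched in a few lines. The example below is an illustration for plain text only, with a made-up normalization rule (convert every line ending to a single line feed) and SHA-256 as an assumed hash function.

```python
import hashlib


def normalize_line_endings(text: str) -> str:
    """Convert CR LF and bare CR line endings to a single LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def raw_digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def canonical_digest(text: str) -> str:
    # Hash the canonical form, not the bytes as they happened to arrive.
    return raw_digest(normalize_line_endings(text))


lf_text = "order: 42\nitem: book\n"        # LF line endings
crlf_text = "order: 42\r\nitem: book\r\n"  # CR LF line endings
cr_text = "order: 42\ritem: book\r"        # bare CR line endings

raw_match = raw_digest(lf_text) == raw_digest(crlf_text)
canonical_match = (
    canonical_digest(lf_text) == canonical_digest(crlf_text) == canonical_digest(cr_text)
)
```

Hashing the raw text gives different digests for the three conventions, while hashing the normalized text gives one digest for all of them.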


The XML Security Framework

The W3C is driving three XML security technologies:


XML Digital Signature


XML Encryption


XML Key Management Services

Figure 7-4 and the following paragraphs illustrate how the building blocks interrelate to form the XML security architecture.




A canonical form represents the underlying content of an XML document. XML Canonicalization is the use of an algorithm to generate the canonical form of an XML document to ensure security in cases where XML is subject to surface representation changes or to processing that discards some information not essential to the data represented in the XML. Canonicalization addresses the fact that when XML is read and processed using standard XML parsing and processing techniques, some surface representation information may be lost or modified.

Some of the steps that take place during the creation of a core canonical form include

Encoding the document in the Universal Character Set UTF-8

Normalizing line breaks before parsing

Normalizing attribute values as if by a validating processor

Replacing character and parsed entity references

Replacing CDATA sections with their character content

Removing the XML declaration and the document type declaration (DOCTYPE)

Converting empty elements to start-end tag pairs

Normalizing white space outside of the document element and within start and end tags
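Several of the steps above can be observed with Python's built-in `xml.etree.ElementTree.canonicalize` function (Python 3.8 or later). Note that it implements Canonical XML 2.0, a later version of the algorithm than the core canonical form described here, so treat this only as a sketch of the idea. The sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Two surface forms of the same content: an XML declaration, a CDATA
# section, an empty-element tag, and different attribute order and spacing.
variant_a = (
    '<?xml version="1.0"?>'
    '<order id="7" currency="USD"><note><![CDATA[a < b]]></note><item/></order>'
)
variant_b = (
    "<order   currency='USD'\n    id='7' >"
    "<note>a &lt; b</note><item></item></order>"
)

canonical_a = ET.canonicalize(variant_a)
canonical_b = ET.canonicalize(variant_b)

same_surface = variant_a == variant_b
same_canonical = canonical_a == canonical_b
```

Both variants reduce to one canonical string, which is what a signer and a verifier would each feed into the hash calculation.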

XML Encryption supports the encryption of all or part of an XML document. The specification
is flexible enough to allow the encryption of any of the following:


The entire XML document


An element and all its subelements


The content of an XML element


A reference to a resource outside the document

The XML Digital Signature specification defines both the syntax and rules for processing XML digital signatures. Signatures provide integrity, message authentication, and signer authentication services for data either contained within an XML document or referred to by such a document.

To digitally sign an XML document using XML Signature, you must carry out the following steps:

Create a SignedInfo element with SignatureMethod, CanonicalizationMethod, and Reference(s).

Canonicalize the XML document.

Calculate the SignatureValue based on algorithms specified in SignedInfo.

Construct the Signature element that includes SignedInfo, KeyInfo (if required), and SignatureValue.
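The four steps can be mirrored in a simplified sketch using only Python's standard library. It is not a conformant XML Signature implementation, and several choices are assumptions made for the example: an HMAC with a shared secret stands in for a public-key signature (the standard library has no RSA), KeyInfo is therefore omitted, the `#order` reference URI is hypothetical and is never dereferenced, and canonicalization uses Python's built-in Canonical XML 2.0 function.

```python
import base64
import hashlib
import hmac
import xml.etree.ElementTree as ET

DSIG = "http://www.w3.org/2000/09/xmldsig#"
ET.register_namespace("ds", DSIG)


def tag(name):
    return f"{{{DSIG}}}{name}"


def c14n(xml_text):
    """Canonical bytes of an XML string (Python's built-in C14N 2.0)."""
    return ET.canonicalize(xml_text).encode("utf-8")


def b64(data):
    return base64.b64encode(data).decode("ascii")


def signed_info_mac(signed_info, key):
    # The signature value is computed over the canonical SignedInfo.
    canonical = c14n(ET.tostring(signed_info, encoding="unicode"))
    return b64(hmac.new(key, canonical, hashlib.sha256).digest())


def sign(document_xml, key, uri="#order"):
    # Step 1: SignedInfo with CanonicalizationMethod, SignatureMethod, Reference.
    signed_info = ET.Element(tag("SignedInfo"))
    ET.SubElement(signed_info, tag("CanonicalizationMethod"),
                  Algorithm="http://www.w3.org/2010/xml-c14n2")
    ET.SubElement(signed_info, tag("SignatureMethod"),
                  Algorithm="http://www.w3.org/2001/04/xmldsig-more#hmac-sha256")
    reference = ET.SubElement(signed_info, tag("Reference"), URI=uri)
    ET.SubElement(reference, tag("DigestMethod"),
                  Algorithm="http://www.w3.org/2001/04/xmlenc#sha256")
    # Step 2: canonicalize the document and digest the canonical bytes.
    digest = hashlib.sha256(c14n(document_xml)).digest()
    ET.SubElement(reference, tag("DigestValue")).text = b64(digest)
    # Steps 3 and 4: compute SignatureValue and assemble the Signature element.
    signature = ET.Element(tag("Signature"))
    signature.append(signed_info)
    ET.SubElement(signature, tag("SignatureValue")).text = signed_info_mac(signed_info, key)
    return signature


def verify(document_xml, signature, key):
    signed_info = signature.find(tag("SignedInfo"))
    digest_value = signed_info.find(tag("Reference")).find(tag("DigestValue")).text
    signature_value = signature.find(tag("SignatureValue")).text
    digest = b64(hashlib.sha256(c14n(document_xml)).digest())
    return (hmac.compare_digest(signed_info_mac(signed_info, key), signature_value)
            and hmac.compare_digest(digest, digest_value))


key = b"shared-secret-for-demo-only"
order = '<order id="order" currency="USD"><item>book</item></order>'
signature = sign(order, key)
```

Because the digest is taken over the canonical form, a copy of the order with its attributes in a different order still verifies, while a copy with changed content does not.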

XKMS works with public-key infrastructures.

XKMS is a W3C initiative that targets the delegation of trust processing decisions to one or more specialized trust processors, to give businesses an easier way to manage digital signatures and data encryption.

XKMS specifies protocols for distributing and registering public keys and is suitable for use in conjunction with the proposed standard for XML Signature and as a companion standard for XML Encryption. XKMS has two parts: the XML Key Information Service Specification (X-KISS) and the XML Key Registration Service Specification (X-KRSS).



11. Java and Web Services:

Java development environments have evolved into rock-solid code tools that generate J2EE code, compile it, and let you test it on the J2EE application server of your choice. Tools for developing Web services have evolved as well. Today there are several excellent J2EE code libraries available for free that support Web service functions such as building SOAP envelopes and generating J2EE client proxy classes from WSDL files.



(i) Web service promotion by Apache


Most J2EE Web service software providers use the latest version of the Apache Software Foundation’s AXIS as the core of their offerings. AXIS stands for “Apache eXtensible Interaction System,” and is a reference implementation of the latest W3C recommendations for SOAP and WSDL.


The Web Services Invocation Framework (WSIF) is a Java API for invoking Web services without directly accessing a SOAP API, such as AXIS. WSIF provides the same kind of functionality that JAXP does for XML parsing. A WSIF interface can be used on any WSDL-compatible Web service, regardless of the SOAP or WSDL version or implementation that the service was originally created under. For example, moving from Apache SOAP classes to Apache AXIS classes does not require any changes to application code if the WSIF interface is employed.


The Web Services Inspection Language (WSIL) is a standardized way to find out about published WSDL without a UDDI server implementation. WSIL also provides rules for how inspection-related information can be revealed by a site.


Apache has implemented Java reference implementations of the XML-Signature Syntax and Processing Recommendation and the XML Encryption Syntax and Processing Recommendation.


(ii) Sun, Jakarta Tomcat, and IBM WebSphere also implement XML Web services.



JAXR: Java API for XML Registries; JAXM: Java API for XML Messaging