Nehru Arts & Science College Department of Computer Science & IT I M.SC Electronics and Communication System E-Learning Material























WEB TECHNOLOGIES





































Syllabus



UNIT I

Internetworking concepts - Devices: Repeaters - Bridges - Routers - Gateways - Internet topology - Internal Architecture of an ISP - IP Address - Basics of TCP - Features of TCP - UDP.

UNIT II

DNS - Email - FTP - HTTP - TELNET - Electronic commerce and Web technology - Aspects - Types - E-procurement models - Solutions - Supply chain management - Customer Relationship Management - Features required for enabling e-commerce - Tiers - Concepts of a Tier.

UNIT III

Web page - Static Web pages - Dynamic Web pages - DHTML - CGI - Basics of ASP technology - Active Web pages - User Sessions: Sessions and session Management - Maintaining state information - Transaction Management: Transaction Processing monitors - Object Request Brokers - Component transaction monitors - Enterprise Java Beans.

UNIT IV

Security issues: Basic concepts - Cryptography - Digital signatures - Digital certificates - Secure Sockets Layer (SSL) - Credit card Processing Models - Secure Electronic Transaction - 3D Secure Protocol - Electronic money - Electronic Data Interchange: Overview of EDI - Data Exchange Standards - EDI Architecture - EDI and the Internet.

UNIT V

Extensible Markup Language (XML) - Basics of XML - XML Parsers - Need for a standard - Limitations of Mobile Devices - WAP Architecture - WAP stack - Object Technology.

TEXT BOOK

1. Achyut S. Godbole and Atul Kahate, "Web Technologies", Tata McGraw-Hill Pub. Co., Delhi, 2006.












Unit I


Section A

1.
Internetworking Concepts



A large organization will use several networking technologies.

Inter-organizational communication is significant.

Universal service - any two computers should be able to communicate.

However, different network technologies cannot just be wired together.

2.

Devices To Expand The Network




Repeaters



Bridges



Switches



Routers



Gateway

3.

Network Devices



Provide transport for the data that needs to be transferred between end-user devices.

Extend cable connections

Concentrate connections

Convert data formats

Manage data transfer



Manage data transfer

4.

UDP Sockets



Creating UDP sockets.



Client



Server



Sending data.



Receiving data.



Connected Mode.
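A minimal sketch of the client and server roles listed above, using Python's standard socket module; the loopback address and port 9999 are chosen only for illustration.

    import socket

    SERVER_ADDR = ("127.0.0.1", 9999)   # illustrative address and port

    def run_server():
        # Creating a UDP socket and binding it so it can receive datagrams.
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as srv:
            srv.bind(SERVER_ADDR)
            data, client = srv.recvfrom(1024)          # receiving data
            srv.sendto(b"echo: " + data, client)       # sending data back

    def run_client():
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as cli:
            # Connected mode: connect() fixes the peer so plain send()/recv() can be used.
            cli.connect(SERVER_ADDR)
            cli.send(b"hello")                         # sending data
            print(cli.recv(1024))                      # receiving data

Run run_server() in one process and run_client() in another to see one datagram echoed back.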

5. Transport Control Protocol



Outline



TCP objectives revisited



TCP basics



New algorithms for RTO calculation


6. What Is TCP/IP

In simple terms, TCP/IP is a language that enables communication between computers.

A set of rules (protocol) that defines how two computers address each other and send data to each other.

A suite of protocols named after its two most important protocols, TCP and IP, but which also includes other protocols such as UDP, RTP, etc.


Section B

1. IP Address Allocation



Private IP address ranges:



10/8 (10.0.0.0 - 10.255.255.255)

192.168/16 (192.168.0.0 - 192.168.255.255)

172.16/12 (172.16.0.0 - 172.31.255.255)



Public IP address space



Assigned by an appropriate authority such as RIPE, ARIN, AFRINIC, etc. or Local Internet
Registries (LIRs)



Public Address space for the Africa Region available from AfriNIC



Choose a small block from whatever range you have, and subnet your networks (to
avoid problems
with broadcasts)
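The private ranges listed above can be checked and subnetted programmatically; a small sketch with Python's ipaddress module, where the sample addresses and the /26 prefix are chosen only for illustration.

    import ipaddress

    # The three RFC 1918 private blocks listed above.
    PRIVATE_BLOCKS = [
        ipaddress.ip_network("10.0.0.0/8"),
        ipaddress.ip_network("172.16.0.0/12"),
        ipaddress.ip_network("192.168.0.0/16"),
    ]

    def is_private(addr: str) -> bool:
        ip = ipaddress.ip_address(addr)
        return any(ip in block for block in PRIVATE_BLOCKS)

    print(is_private("10.1.2.3"))     # True
    print(is_private("8.8.8.8"))      # False

    # Subnetting a small block to keep broadcast domains small, as suggested above.
    for subnet in ipaddress.ip_network("192.168.10.0/24").subnets(new_prefix=26):
        print(subnet)                 # 192.168.10.0/26, 192.168.10.64/26, ...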


2. Bridges

Layer 2 (Data Link Layer) devices.

Divide a network into segments and filter traffic. Each segment is a collision domain.

Limit or filter traffic, keeping local traffic local, yet allow connectivity to other parts (segments).

Make decisions based on the MAC address list.

Connect different architectures and forward packets between them: Ethernet and Token Ring.

Read network addresses at the MAC (Media Access Control) sub-layer.

Decide which segment an address is on.

Decide whether or not to forward a packet.

Drawback: broadcast packets are passed across bridges.















3. UDP Sockets



Creating UDP sockets
.



Client



Server



Sending data.



Receiving data.



Connected Mode.


Position of UDP, TCP, and SCTP in TCP/IP suite























4. Routers

Layer 3 (Network Layer) devices.

Connect networks with multiple paths between network segments (subnets).

Make decisions based on the network address (network segment and network address).

Connect different layer 2 technologies (Ethernet, Token Ring, FDDI, etc.).

Have become the backbone of the Internet, running the IP protocol.

A router's purpose is to examine incoming messages (layer 3 data), choose the best path for them through the network, and switch them to the proper outgoing port.

Routers do not allow bad data or broadcast storms to be passed onto the network.

They can connect networks using the same protocol but different network architectures.




















5. Purpose Of An IP Address

Unique identification of:

Source - sometimes used for security or policy-based filtering of data.

Destination - so the networks know where to send the data.

Network-independent format - IP over anything.

Identifies a machine's connection to a network.

Physically moving a machine from one network to another requires changing the IP address.

TCP/IP uses unique 32-bit addresses.


Section C









1. Internet Architecture



Organizations choose network technologies appropriate for each need and use routers to connect all their networks.



Figure (below) illustrates how three routers can be used to connect four arbitrary physical networks
into an internet





















Figure shows each router with exactly two connections



commercial routers can connect more than two networks



a single router could connect all four networks in the example



An organization seldom uses a single router to connect all of its networks



There are reasons for multiple connections:

Load-balancing and speed - the processor in a given router may be insufficient to handle the traffic passing among an arbitrary number of networks.

Redundancy improves internet reliability and avoids a single point of failure.

The protocol software continuously monitors internet connections and instructs routers to send traffic along alternative paths when a network or router fails.

An organization must choose a design that meets its needs for:

Reliability

Capacity

Cost



The exact details of internet topology to be chosen often depend on the following



bandwidth of the physical networks



expected traffic



organization's reliability requirements



cost



performance of available router hardware








2. TCP Features

Connection-oriented

Byte-stream: the application writes bytes, TCP sends segments, and the receiving application reads bytes.

Reliable data transfer

Full duplex

Flow control: keep the sender from overrunning the receiver.

Congestion control: keep the sender from overrunning the network.
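A minimal sketch of the byte-stream idea above: the sending application just writes bytes, TCP handles segmentation and reliable, ordered delivery, and the receiver just reads bytes. The address and port are illustrative.

    import socket

    ADDR = ("127.0.0.1", 8888)   # illustrative address and port

    def server():
        with socket.create_server(ADDR) as srv:          # connection-oriented: listen and accept
            conn, _ = srv.accept()
            with conn:
                chunks = []
                while True:
                    data = conn.recv(4096)               # the app reads bytes, not "messages"
                    if not data:                         # empty read means the peer closed
                        break
                    chunks.append(data)
                print(b"".join(chunks))

    def client():
        with socket.create_connection(ADDR) as sock:     # TCP handshake happens here
            sock.sendall(b"the app writes bytes; ")      # TCP decides how to segment them
            sock.sendall(b"the receiver sees one ordered stream")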














3. Interconnection Devices

Basic idea: transfer data from input to output.

Repeater: amplifies the signal received on its input and transmits it on its output.

Modem: accepts a serial stream of bits as input and produces a modulated carrier as output (or vice versa).

Hub: connects nodes/segments of a LAN; when a packet arrives at one port, it is copied to all the other ports.

Switch: reads the destination address of each packet and forwards it to the appropriate port. Layer 3 switches (IP switches) also perform routing functions.

Bridge: "ignores" packets for same-LAN destinations and forwards those for interconnected LANs.

Router: decides routes for packets based on destination address and network topology, and exchanges information with other routers to learn the network topology.




4. Transport Control Protocol

Outline:

TCP objectives revisited

TCP basics

New algorithms for RTO calculation

Overview:

TCP is the most widely used Internet protocol (Web, peer-to-peer, FTP, telnet, ...).

A two-way, reliable, byte-stream-oriented, end-to-end protocol.

Includes flow and congestion control.

Closely tied to the Internet Protocol (IP).

A focus of intense study for many years.

Our goal is to understand the Reno version of TCP.

Reno is the most widely used TCP variant today (RFC 2001, now obsoleted); it mainly specifies mechanisms for dealing with congestion.
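As a sketch of the "new algorithms for RTO calculation" item above: the smoothed round-trip-time estimator standardised in RFC 6298 (the Jacobson/Karels style calculation used by Reno-era TCPs). The constants are the standard ones; the RTT samples are made up.

    # RFC 6298 retransmission-timeout estimation.
    ALPHA, BETA, K = 1 / 8, 1 / 4, 4

    def update_rto(srtt, rttvar, rtt_sample):
        """Return updated (srtt, rttvar, rto) after one RTT measurement."""
        if srtt is None:                            # first measurement initialises the estimator
            srtt, rttvar = rtt_sample, rtt_sample / 2
        else:
            rttvar = (1 - BETA) * rttvar + BETA * abs(srtt - rtt_sample)
            srtt = (1 - ALPHA) * srtt + ALPHA * rtt_sample
        rto = max(1.0, srtt + K * rttvar)           # RFC 6298 lower bound of one second
        return srtt, rttvar, rto

    srtt = rttvar = None
    for sample in [0.120, 0.140, 0.110, 0.300]:     # made-up RTT samples in seconds
        srtt, rttvar, rto = update_rto(srtt, rttvar, sample)
        print(f"sample={sample:.3f}s  srtt={srtt:.3f}s  rto={rto:.3f}s")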



Unit II


Section A


1. Domain Name System Basic Terminology

Name space: defines the set of possible names; consists of a set of name-to-value bindings.

Resolution mechanism: when invoked with a name, returns the corresponding value.


2. Electronic Commerce (E-Commerce)

Commerce refers to all the activities surrounding the purchase and sale of goods or services: marketing, sales, payment, fulfillment, customer service.

Electronic commerce is doing commerce with the use of computers, networks and commerce-enabled software (more than just online shopping).

3.

HTTP



Hypertext Transfer Protocol



Language of the Web



protocol used for communication between web browsers and web servers

4. The TELNET Protocol



TCP connection



data and control over the same connection.



Network Virtual
Terminal



negotiated options

5.

What is the eProcurement Process?



Requirement


Search


Select


(Negotiation)


Order


Reception


Payment


After service


Disposal



Reengineering using B2B Platform



Tight integration of internal and external systems is unavoidable.




Section B


1. What is SUPPLY CHAIN MANAGEMENT



Value Chain:

Supply side - raw materials, inbound logistics and production processes.

Demand side - outbound logistics, marketing and sales.




SUPPLY CHAIN MANAGEMENT






















Supply chain is the
system by which organizations source, make and deliver their products or
services according to market demand.



Supply chain management operations and decisions are ultimately triggered by demand signals at
the ultimate consumer level.



The supply chain, as defined by experienced practitioners, extends from suppliers' suppliers to customers' customers.




SUPPLY CHAIN MANAGEMENT IS FACILITATED BY :



PROCESSES



STRUCTURE




TECHNOLOGY



Supply chain serves two functions:



Physical



Market mediation



Supply chain objectives may differ from situation to situation.



For functional products, cost efficiency is the critical factor.



For innovative products, responsiveness is the important factor.



Leanness + Agility together make up Leagility













2. Components of eProcurement

What is the eProcurement process?

Requirement - Search - Select - (Negotiation) - Order - Reception - Payment - After service - Disposal

Reengineering using a B2B platform: tight integration of internal and external systems is unavoidable.

Components of eProcurement:

Content management

Requisitioning

Approval routing

Order management

Decision support

eProcurement System Architecture for Chevron

























3. Transmission Modes in FTP

Mode is used to specify additional coding or sequencing performed on data, independent of data type and file structure.

Stream (S): data is sent as a stream of bytes. With record structure, EOF is sent as a record indication; with file structure, EOF is indicated by closing the stream.

Block (B): the file is sent as a sequence of blocks preceded by header information, which allows restart of an interrupted transfer.

Compressed (C): data is compressed using run-length encoding.






4. FTP (File Transfer Protocol)

RFC 959.

Uses two TCP ports: one for control, one for data transfers.

Command-response protocol; the control port uses the telnet protocol to negotiate the session.

US-ASCII; <crlf> is the end-of-line character.

Active Mode FTP: the client connects from a random unprivileged port (N > 1023) to the server's command port (21) and sends a PORT command telling the server to connect to N+1, then listens on that next higher unprivileged port (N+1) for server responses. The server connects from its data port (20) to the client's data port (N+1).

Passive Mode FTP: the client opens two random unprivileged ports (N > 1023 and N+1; e.g. 1026 and 1027), connects the first port (N) to the server's command port 21 and issues a PASV command (the server replies with the port to use for data); the client then connects to the server's specified data port and the server completes the connection.
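Most client libraries default to the passive mode described above because it works better through firewalls. A minimal sketch with Python's ftplib; the host name, directory and file name are placeholders, not taken from the course material.

    from ftplib import FTP

    with FTP("ftp.example.com") as ftp:      # placeholder host
        ftp.login()                          # anonymous login over the control connection (port 21)
        ftp.set_pasv(True)                   # passive mode: the client opens the data connection
        ftp.cwd("/pub")                      # placeholder directory
        with open("readme.txt", "wb") as out:
            # The file itself travels on the separate data connection negotiated by PASV.
            ftp.retrbinary("RETR readme.txt", out.write)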


5. Email

SMTP - Simple Mail Transfer Protocol (RFC 821)

POP - Post Office Protocol (RFC 1939)

Also: RFC 822 (Standard for the Format of ARPA Internet Text Messages) and RFCs 1521, 1522 (MIME).

Terminology

User Agent: the end-user mail program.

Message Transfer Agent: responsible for communicating with remote hosts and transmitting/receiving email (both a client and a server).

Mail Exchanger: the host that takes care of email for a domain.

SMTP

Used to exchange mail messages between mail servers (Message Transfer Agents).

In the SMTP protocol, the SMTP sender is the client and the SMTP receiver is the server.

Alternating dialogue: the client sends a command and the server responds with a command status message. The order of the commands is important!

Status messages include an ASCII-encoded numeric status code (as in HTTP and FTP) and a text string.

SMTP Commands

HELO - identifies the sender.

MAIL FROM: - starts a mail transaction and identifies the mail originator.

RCPT TO: - identifies an individual recipient; there may be multiple RCPT TO: commands.

DATA - the sender is ready to transmit a series of lines of text, each ending with \r\n. A line containing only a period '.' indicates the end of the data.
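The command order described above (HELO/EHLO, MAIL FROM, RCPT TO, DATA terminated by a lone '.') is exactly what Python's smtplib drives for you. A hedged sketch: the addresses are placeholders and a test SMTP server is assumed to be listening on localhost port 1025.

    import smtplib
    from email.message import EmailMessage

    msg = EmailMessage()
    msg["From"] = "sender@example.com"        # placeholder originator
    msg["To"] = "recipient@example.com"       # placeholder recipient
    msg["Subject"] = "Test"
    msg.set_content("Hello over SMTP")

    # smtplib issues EHLO/HELO, MAIL FROM, RCPT TO and DATA in the required order.
    with smtplib.SMTP("localhost", 1025) as smtp:   # assumed local test server
        smtp.set_debuglevel(1)                      # print the client/server dialogue
        smtp.send_message(msg)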






Section C

1. Domain Name System Overview

What are names used for in general?

identify objects

locate objects

define membership in a group

Domain Name System Basic Terminology

Name space: defines the set of possible names; consists of a set of name-to-value bindings.

Resolution mechanism: when invoked with a name, returns the corresponding value.

DNS Properties

The size of the Internet demands a well-devised naming mechanism.

Specified in RFC 1034 and RFC 1035 (Mockapetris '87).

Names versus addresses: human readable versus router readable; location transparent versus location-dependent.

Flat versus hierarchical: can names be divided into components?

Global versus local: what is the scope of naming?

DNS for other purposes: determines where user requests are routed.

The Domain Name System

The domain name system is usually used to translate a host name into an IP address.

Domain names comprise a hierarchy so that names are unique, yet easy to remember.

The domain name for a host is the sequence of labels that lead from the host (a leaf node in the naming tree) to the top of the worldwide naming tree.

A domain is a subtree of the worldwide naming tree.

Hierarchical name space for Internet objects:

Names are read from right to left, separated by periods.

Each suffix in a domain name is a domain: wail.cs.wisc.edu, cs.wisc.edu, wisc.edu, edu.



(Figure: the DNS naming tree, with top-level domains such as edu, com, gov, mil, org, net, uk and fr, and example subdomains such as princeton, mit, cs, ee, cisco, yahoo, nasa, nsf, arpa, navy, acm and ieee beneath them.)
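A resolution mechanism, invoked with a name, returns the corresponding value, as defined above. A minimal sketch with Python's standard library; the host name is the example used above and may no longer resolve, so failures are caught.

    import socket

    HOST = "wail.cs.wisc.edu"    # example host from the text; may no longer resolve

    try:
        # Forward lookup: name -> IP address(es).
        for family, _, _, _, sockaddr in socket.getaddrinfo(HOST, None):
            print(family.name, sockaddr[0])
    except socket.gaierror as err:
        print("resolution failed:", err)

    # Reverse lookup: IP address -> name (requires a PTR record).
    try:
        print(socket.gethostbyaddr("8.8.8.8"))
    except socket.herror:
        print("no reverse mapping")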

2. Electronic Commerce (E-Commerce)

Commerce refers to all the activities surrounding the purchase and sale of goods or services: marketing, sales, payment, fulfillment, customer service.

Electronic commerce is doing commerce with the use of computers, networks and commerce-enabled software (more than just online shopping).

E-commerce applications:

Supply chain management

Video on demand

Remote banking

Procurement and purchasing

Online marketing and advertisement

Home shopping

Auctions

Marketing, sales, payment, fulfillment, customer service






Ecommerce infrastructure

Information superhighway infrastructure: Internet, LAN, WAN, routers, etc.; telecom, cable TV, wireless, etc.

Messaging and information distribution infrastructure: HTML, XML, e-mail, HTTP, etc.

Common business infrastructure: security, authentication, electronic payment, directories, catalogs, etc.



Advantages of Electronic Commerce

Increased sales



Reach narrow market segments in geographically dispersed locations



Create virtual communities



Decreased costs



Handling of sales inquiries



Providing price quotes



Determining product availability



Being in the space



Disadvantages of Electronic Commerce



Loss of ability to inspect products from remote locations



Rapid developing pace of underlying technologies



Difficult to calculate return on investment



Cultural and legal impediments























The process of e-commerce

Attract customers

Advertising, marketing



Interact with customers



Catalog, negotiation



Handle and manage orders



Order capture



Payment



Transaction



Fulfillment (physical good, service good, digital good)



React to customer inquiries



Customer service



Order tracking



3. HTTP

Hypertext Transfer Protocol



Language of the Web



protocol used for communication between web browsers and web servers



TCP port 80



RFC 1945




URI, URN, URL

Uniform Resource Identifier: information about a resource.

Uniform Resource Name: the name of the resource within a namespace.

Uniform Resource Locator: how to find the resource; a URI that says how to find the resource.



HTTP URLs

URL - Uniform Resource Locator:

protocol (http, ftp, news)

host name (name.domain name)

port (usually 80, but many servers use 8080)

directory path to the resource

resource name

http://xxx.myplace.com/www/index.html

http://xxx.myplace.com:80/cgi-bin/t.exe



HTTP Methods

GET: retrieve a URL from the server - a simple page request, running a CGI program, or running a CGI program with arguments attached to the URL.

POST: the preferred method for forms processing; runs a CGI program with the parameterized data passed in the request body (stdin for CGI), which is more secure and private.

PUT: used to transfer a file from the client to the server.

HEAD: requests the URL's status header only; used for conditional URL handling in performance-enhancement schemes (retrieve the URL only if it is not in the local cache or its date is more recent than the cached copy).





HTTP Request Packets

Sent from client to server.

Consist of an HTTP header (hidden in the browser environment) containing:

content type / MIME type

content length

user agent - the browser issuing the request

content types the user agent can handle

and a URL.

HTTP Request Headers

Precede HTTP method requests; headers are terminated by a blank line.

Header fields: From, Accept, Accept-Encoding, Accept-Language.
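A minimal sketch of the request described above: the request line, the header fields, the blank line that terminates the header block, and the response read back over TCP port 80. The host name is a placeholder.

    import socket

    HOST = "www.example.com"    # placeholder host

    request = (
        "GET /index.html HTTP/1.1\r\n"        # method, resource, protocol version
        f"Host: {HOST}\r\n"
        "User-Agent: simple-client/0.1\r\n"   # the user agent issuing the request
        "Accept: text/html\r\n"
        "Connection: close\r\n"
        "\r\n"                                # blank line terminates the headers
    )

    with socket.create_connection((HOST, 80)) as sock:
        sock.sendall(request.encode("ascii"))
        response = b""
        while chunk := sock.recv(4096):
            response += chunk

    # Print just the status line and response headers.
    print(response.split(b"\r\n\r\n", 1)[0].decode())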


4. TELNET

TELNET is a protocol that provides "a general, bi-directional, eight-bit byte oriented communications facility".



telnet

is a
program

that supports the TELNET protocol over TCP.



Many application protocols are built upon the TELNET protocol.



The TELNET Protocol



TCP connection



data and control over the same connection.



Network Virtual Terminal



negotiated options



TELNET Control Functions



TELNET includes support for a series of control functions commonly supported by servers.



This provides a uniform mechanism for communication of (the supported) control functions.



Interrupt Process (IP)



suspend/abort process.



Abort Output (AO)



process can complete, but send no more output to user’s terminal.



Are You There (AYT)



check to see if system is still running.



Erase Character (EC)



delete the last character sent



typically used to edit keyboard input.



Erase Line (EL)



delete all input in current line.










5. Seven features of e-commerce



Ubiquity



Global Reach



Universal Standards



Richness



Interactivity



Information Density



Personalization/Customization



Why study these features?

To some extent, these features are unique to e-commerce and explain why this technology has changed the world so much more than other technologies.



Do these technologies have any of the 7 features?

Phone, electricity, cars, planes - yes, but not all 7.



Ubiquity



Available everywhere



Built into other devices



Hidden, but still there



Why is ubiquity good?



Ubiquity lowers transaction costs for the consumer/buyer.



How so?



Think about buying gifts last Christmas, or about the fact that gas costs $2.40/gallon.

Related concept - cognitive energy: the mental effort needed to complete a task.

Ubiquity reduces cognitive energy.



Humans tend to seek options that require the minimum cognitive energy.



Consider the mental effort needed to buy your book online vs. hunting for it at various book stores.



Global Reach



E-commerce technologies enable a business to easily reach across geographic boundaries.



It's really easy to understand how this feature can benefit businesses and consumers.



While e-commerce can reach across geographic boundaries, can it reach across demographic boundaries?



Demographics



age



income



race



gender



religion



There is one demographic boundary that technology cannot reach. Luckily, business doesn't have an interest in reaching this group.















6.
What is CRM?



Customer Relationship Management (CRM)


“The approach of identifying, establishing, maintaining, and enhancing lasting relationships with
customers.”


“The formation of
bonds
between a company and its customers.”



Also discuss why CRM is needed: changes in customers, changes in the marketplace, changes in technology, globalisation, deregulation, and advances in IT.



















The Marketing Perspective



The marketing manager...

1.

Defines objectives

2.

Identifies customers

3.

Defines communication strategies

4.

Designs/improves products/offers/services/promotions

5.

Tests the impacts of her decisions

6.

Revises her decisions for maximum effectiveness




Defines objectives
















Step 2: Identify Customers



Perform SEGMENTATION



Define the right customers.

Use information from past transactions as the key to predicting future ones.



Define the segments and their characteristics



Develop customized marketing strategies for the different segments






Step 3: Communication Strategies



Which message should be transmitted?



Which channel should be used?




Step 4: Design the Products, Offers, Services and Promotions



Analyze the price, time period, risks, marketing costs



Define the product / offer / service / promotion and its general structure



Identify effective use of sales and communication channels.



Step 5: Test the Impacts



The impacts of the decisions have to be tested and assessed on a sample.



Step 6: Revise the Decisions



Make revisions to the targeted offer / service / promotions



Finally, apply the decisions to the whole segment or population.



7. Three-Tiered Applications

The key to using Remote Data Service technology lies in understanding the three-tiered client/server model. This approach separates the various components of a client/server system into three "tiers":

Client Tier

Middle Tier

Data source Tier

Client tier: a local computer on which either a Web browser displays a Web page that can display and manipulate data from a remote data source, or (in non-Web-based applications) a stand-alone compiled front-end application.


















Middle tier: a server computer that hosts components which encapsulate an organization's business rules. Middle-tier components can either be Active Server Page scripts executed on Internet Information Server, or (in non-Web-based applications) compiled executables.















Data source tier: a computer hosting a database management system (DBMS), such as a Microsoft SQL Server database. (In a two-tier application, the middle tier and data source tier are combined.)



















Advantages of three-tier architecture

Removes a huge processing burden from client machines.

Can be used to consolidate enterprise-wide business rules, as application servers process business rules in a single place for use by multiple applications. When rules change, only a change to the application server is required.

Any knowledge of the database server may be hidden from the client; database queries may be presented to the client in alternative forms.


Unit III


Section A


1. Dynamic vs. Static Web Sites

Roughly speaking, there are two kinds of Web sites: those with static content and those with dynamically generated content. These are also called static Web sites and dynamic Web sites, or Web sites with static pages versus Web sites with dynamic pages.


2. DHTML

"Dynamic HTML" is typically used to describe the combination of HTML, style sheets and scripts that allows documents to be animated. Dynamic HTML allows a web page to change after it's loaded into the browser -- there doesn't have to be any communication with the web server for an update. You can think of it as 'animated' HTML. For example, a piece of text can change from one size or color to another, or a graphic can move from one location to another, in response to some kind of user action, such as clicking a button.


3. ASP

ASP is a pseudo-programming language aimed at HTML development. It allows web pages to do more than contain just static content. By placing ASP tags in with your HTML tags, you can have a page that interacts with the user. The page can make decisions based on logic and user input.


4. Object request broker

In distributed computing, an object request broker (ORB) is a piece of middleware software that allows programmers to make program calls from one computer to another via a network. ORBs promote interoperability of distributed object systems because they enable users to build systems by piecing together objects from different vendors, so that they communicate with each other via the ORB.


5. The EJB container holds two major types of beans:

Session Beans, which can be either "Stateful", "Stateless" or "Singleton" and can be accessed via either a Local (same JVM) or Remote (different JVM) interface, or directly without an interface, in which case local semantics apply. All session beans support asynchronous execution for all views (local/remote/no-interface).

Message Driven Beans (also known as MDBs or Message Beans). MDBs also support asynchronous execution, but via a messaging paradigm.


Section B

1. What is ASP?

ASP is a pseudo-programming language aimed at HTML development. It allows web pages to do more than contain just static content. By placing ASP tags in with your HTML tags, you can have a page that interacts with the user. The page can make decisions based on logic and user input. If you're familiar with HTML, then you know that an HTML tag uses <> around its tags. For example, <b>this text will be bold</b> would make this text will be bold. ASP is similar in that it uses delimiter tags like HTML. However, the tags differ slightly. An ASP delimiter tag starts with <% and ends with %>.

ASP was developed by Microsoft and its core language is based on Microsoft's Visual Basic. The language ASP uses is VBScript. However, you can change this if you are familiar with another language that you would like to use instead. At the top of an ASP page, before the first <html> tag, you would put <% @Language=VBScript %>. If you wanted to use another language, you would make the change in this tag. There is a reason for this. ASP is not like HTML in certain ways. ASP is interpreted at the server instead of by the browser. This is what makes ASP dynamic. By taking user input, connecting to a database or whatever, ASP is translated on the server and outputs pure HTML. Because ASP was designed for Microsoft's NT servers, its default language is VBScript. However, it is good coding practice to always include the specified language of your code at the top of the page, so the server knows how to interpret it. ASP can also run on a UNIX server, but the UNIX server must be running ChiliSoft. To develop ASP on your own computer, you must be running either an NT server product, or Win95/98 Personal Web Server. You can usually find PWS on your Windows CD. Just run a find>files or folders on your Windows CD and search for PWS. After running the installation for PWS, you must open all ASP pages by typing into the address bar http://localhost/nameofmypage.asp. You cannot simply open an ASP page like you normally would open an HTML page.

Active Server Pages (ASP) is a proven, well-established technology for building dynamic Web applications, which provides the power and flexibility you need to create anything from a personal, Web-based photo gallery to a complete catalogue and shopping cart system for your next eCommerce project. One unique feature of ASP is that it lets you choose your favourite scripting language, be it JavaScript or VBScript; however, VBScript is by far the most popular choice. In this article, I'll bring you up to speed on the basic syntax of the VBScript language, including variables, operators, and control structures.





2. Common Gateway Interface

The Common Gateway Interface (CGI) is a standard that defines how web server software can delegate the generation of web pages to a stand-alone application, an executable file. Such applications are known as CGI scripts; they can be written in any programming language, although scripting languages are often used.

Technical overview

The common gateway interface (CGI) is a standard way for a Web server to pass a Web user's request to an application program and to receive data back to forward to the user. When the user requests a Web page (for example, by clicking on a hyperlink or entering a Web site address), the server sends back the requested page. However, when a user fills out a form on a Web page and sends it in, it usually needs to be processed by an application program. The Web server typically passes the form information to a small application program that processes the data and may send back a confirmation message. This method or convention for passing data back and forth between the server and the application is called the common gateway interface (CGI).

If you are creating a Web site and want a CGI application to get control, you specify the name of the application in the uniform resource locator (URL) that you code in an HTML file. This URL can be specified as part of the forms tags if you are creating a form. For example, you might code:

<form method="POST" action="http://www.mybiz.com/cgi-bin/formprog.pl">

and the server at "mybiz.com" would pass control to the CGI application called "formprog.pl" to record the entered data and return a confirmation message. (The ".pl" indicates a program written in Perl, but other languages could have been used.)

The common gateway interface provides a consistent way for data to be passed from the user's request to the application program and back to the user. This means that the person who writes the application program can make sure it gets used no matter which operating system the server uses (Windows, Linux, Macintosh, UNIX, OS/390, or others). It's simply a basic way for information to be passed from the Web server about your request to the application program and back again.
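A minimal CGI program, written here in Python as a sketch of the convention just described (the cgi module is deprecated in the newest Python releases but still illustrates the mechanism, and the form field name is invented): the server passes the form data to the program, which writes a header block, a blank line, and an HTML body back for forwarding to the user.

    #!/usr/bin/env python3
    import cgi
    import html

    form = cgi.FieldStorage()                    # parses the query string or POSTed form data
    name = form.getfirst("name", "anonymous")    # "name" is an invented form field

    # A CGI program must emit headers, then a blank line, then the body.
    print("Content-Type: text/html")
    print()
    print(f"<html><body><p>Thank you, {html.escape(name)}.</p></body></html>")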

3. Object request broker

In distributed computing, an object request broker (ORB) is a piece of middleware software that allows programmers to make program calls from one computer to another via a network. ORBs promote interoperability of distributed object systems because they enable users to build systems by piecing together objects from different vendors, so that they communicate with each other via the ORB.

ORBs handle the transformation of in-process data structures to and from the byte sequence which is transmitted over the network. This is called marshalling or serialization.

Some ORBs, such as CORBA-compliant systems, use an Interface Description Language (IDL) to describe the data that is to be transmitted on remote calls.

In addition to marshalling data, ORBs often expose many more features, such as distributed transactions, directory services or real-time scheduling.

In object-oriented languages, the ORB takes the form of an object with methods enabling connection to the objects being served. After an object connects to the ORB, the methods of that object become accessible for remote invocations. The ORB requires some means of obtaining the network address of the object that has now become remote. The typical ORB also has many other methods.
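Marshalling, as described above, just means turning an in-process data structure into a byte sequence for the wire and back again. A language-neutral sketch using JSON as the encoding; the request structure itself is invented and is not the format any particular ORB uses.

    import json

    # An in-process data structure describing a remote call.
    request = {"object_id": 42, "method": "get_balance", "args": ["ACC-001"]}

    wire_bytes = json.dumps(request).encode("utf-8")     # marshalling / serialization
    # ... wire_bytes travel over the network to the remote object ...
    received = json.loads(wire_bytes.decode("utf-8"))    # unmarshalling on the other side

    assert received == request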






4. TP Monitor Architectures

Process per client model - instead of an individual login session per terminal, a server process communicates with the terminal, handles authentication, and executes actions. Memory requirements are high, and multitasking causes high CPU overhead for context switching between processes.

Single process model - all remote terminals connect to a single server process. Used in client-server environments. The server process is multi-threaded, so thread switching is cheap, but there is no protection between applications, and the model is not suited for parallel or distributed databases.

Many-server, single-router model - multiple application server processes access a common database; clients communicate with the applications through a single communication process that routes requests. There are independent server processes for multiple applications and a multithreaded server process, and the model can run on parallel or distributed databases.

Many-server, many-router model - multiple processes communicate with clients. Client communication processes interact with router processes that route their requests to the appropriate server, and a controller process starts up and supervises the other processes.

Detailed Structure of a TP Monitor

A queue manager handles incoming messages. Some queue managers provide persistent or durable message queuing: the contents of the queue are safe even if the system fails.

Durable queuing of outgoing messages is important: the application server writes the message to a durable queue as part of a transaction, and once the transaction commits, the TP monitor guarantees the message is eventually delivered, regardless of crashes. ACID properties are thus provided even for messages sent outside the database.

Many TP monitors provide locking, logging and recovery services, to enable application servers to implement ACID properties by themselves.

Transaction Management

Transaction management is complicated in multidatabase systems because of the assumption of autonomy.

Global 2PL - each local site uses strict 2PL (locks are released at the end); locks set as a result of a global transaction are released only when that transaction reaches its end. This guarantees global serializability.

Due to autonomy requirements, sites cannot cooperate and execute a common concurrency control scheme; for example, there is no way to ensure that all databases follow strict 2PL. Solutions: provide a very low level of concurrent execution, or use weaker levels of consistency.

Local transactions are executed by each local DBMS, outside of the MDBS system's control; global transactions are executed under multidatabase control.

Local autonomy - local DBMSs cannot communicate directly to synchronize global transaction execution, and the multidatabase system has no control over local transaction execution. A local concurrency control scheme is needed to ensure that the DBMS's schedule is serializable; in the case of locking, the DBMS must be able to guard against local deadlocks; and additional mechanisms are needed to ensure global serializability.

Transaction Processing Monitors

TP monitors were initially developed as multithreaded servers to support large numbers of terminals from a single process.

They provide the infrastructure for building and administering complex transaction processing systems with a large number of clients and multiple servers.

They provide services such as: presentation facilities to simplify creating user interfaces; persistent queuing of client requests and server responses; routing of client messages to servers; and coordination of two-phase commit when transactions access multiple servers.

Some commercial TP monitors: CICS from IBM, Pathway from Tandem, Top End from NCR, and Encina from Transarc.

Section C

1. Static web page

A static web page (sometimes called a flat page) is a web page that is delivered to the user exactly as stored, in contrast to dynamic web pages which are generated by a web application.

Consequently, a static web page displays the same information for all users, from all contexts, subject to modern capabilities of a web server to negotiate content-type or language of the document where such versions are available and the server is configured to do so.

Static web pages are often HTML documents stored as files in the file system and made available by the web server over HTTP. However, loose interpretations of the term could include web pages stored in a database, and could even include pages formatted using a template and served through an application server, as long as the page served is unchanging and presented essentially as stored.


Advantages and disadvantages

Advantages

No programming skills are required to create a static page.

Inherently publicly cacheable (i.e. a cached copy can be shown to anyone).

No particular hosting requirements are necessary.

Can be viewed directly by a web browser without needing a web server or application server, for example directly from a CD-ROM or USB drive.

Disadvantages

Any personalization or interactivity has to run client-side (i.e. in the browser), which is restricting.

Maintaining large numbers of static pages as files can be impractical without automated tools.

Dynamic vs. Static Web Sites

Roughly speaking, there are two kinds of Web sites: those with static content and those with dynamically generated content. These are also called static Web sites and dynamic Web sites, or Web sites with static pages versus Web sites with dynamic pages.



Static Web Sites

For a static-content Web site, all content appearing on Web pages is placed manually by professional Web developers. This is also called "design-time page construction," because the pages are fully built while the site is being developed. A static-content Web site is developed and then maintained by experienced professionals. Such a Web site usually costs less when initially developed, but then all future changes still have to be done by Web professionals. Therefore a static Web site can be more expensive to maintain, especially when you want to make frequent changes to your site.


2. Dynamic Web Sites

On the other hand, pages in a dynamic-content Web site are constructed "on the fly" when a page is requested from a Web browser. A dynamic-content Web site, while still developed by professionals, can be maintained directly by you, our customer. Such a Web site initially costs more to develop, but then you don't have to pay Web professionals every time you need to change something on your site. If you plan to make frequent changes to your site, you most likely will be better off with a dynamic Web site.


Dynamic web page

A dynamic web page is a kind of web page that has been prepared with fresh information (content and/or layout) for each individual viewing. It is not static because it changes with the time (e.g. news content), the user (e.g. preferences in a login session), the user interaction (e.g. a web page game), the context (parametric customization), or any combination of the foregoing.



Properties associated with dynamic web pages

Classical hypertext navigation occurs among "static" documents, and, for web users, this experience is reproduced using static web pages, meaning that a page retrieved by different users at different times is always the same, in the same form.

However, a web page can also provide a live user experience. Content (text, images, form fields, etc.) on a web page can change in response to different contexts or conditions. In dynamic sites, page content and page layout are created separately. The content is retrieved from a database and is placed on a web page only when needed or asked. This allows for quicker page loading, and it allows just about anyone with limited web design experience to update their own website via an administrative tool. This set-up is ideal for those who wish to make frequent changes to their websites, including text and image updates, e.g. e-commerce.


Two types of dynamic web sites

Client-side scripting and content creation

Using client-side scripting to change interface behaviors within a specific web page, in response to mouse or keyboard actions or at specified timing events. In this case the dynamic behavior occurs within the presentation.

Such web pages use presentation technology called rich interfaced pages. Client-side scripting languages like JavaScript or ActionScript, used for Dynamic HTML (DHTML) and Flash technologies respectively, are frequently used to orchestrate media types (sound, animations, changing text, etc.) of the presentation. The scripting also allows use of remote scripting, a technique by which the DHTML page requests additional information from a server, using a hidden frame, XMLHttpRequests, or a Web service.

The client-side content is generated on the user's computer. The web browser retrieves a page from the server, then processes the code embedded in the page (often written in JavaScript) and displays the retrieved page's content to the user.

The innerHTML property (or write command) can illustrate client-side dynamic page generation: two distinct pages, A and B, can be regenerated as document.innerHTML = A and document.innerHTML = B; or "on load dynamic" by document.write(A) and document.write(B).

There are also some utilities and frameworks for converting HTML files into JavaScript files; one such example uses the innerHTML property for rendering pages from converted HTML on the client side.



Server-side scripting and content creation

A program running on the web server (server-side scripting) is used to change the web content on various web pages, or to adjust the sequence of or reload the web pages. Server responses may be determined by such conditions as data in a posted HTML form, parameters in the URL, the type of browser being used, the passage of time, or a database or server state.

Such web pages are often created with the help of server-side languages such as PHP, Perl, ASP, ASP.NET, JSP, ColdFusion and other languages. These server-side languages typically use the Common Gateway Interface (CGI) to produce dynamic web pages. These kinds of pages can also use, on the client side, the first kind (DHTML, etc.).

Server-side dynamic content is more complicated: the client sends the server the request; the server receives the request and processes the server-side script (such as PHP) based on the query string, HTTP POST data, cookies, etc.

Dynamic page generation was made possible by the Common Gateway Interface, stable in 1993. Then Server Side Includes pointed to a more direct way to deal with server-side scripts at the web servers.



Combining client and server side

Ajax is a web development technique for dynamically interchanging content with the server side, without reloading the web page. Google Maps is an example of a web application that uses Ajax techniques and a database.


Disadvantages

Search engines work by creating indexes of published HTML web pages that were, initially, "static". With the advent of dynamic web pages, often created from a private database, the content is less visible. Unless this content is duplicated in some way (for example, as a series of extra static pages on the same site), a search may not find the information it is looking for. It is unreasonable to expect generalized web search engines to be able to access complex database structures, some of which in any case may be secure.

3. Session Management

Desktop session management

A desktop session manager is a program that can save and restore desktop sessions. A desktop session is all the windows currently running and their current content. Session management on Linux-based systems is provided by the X session manager. On Microsoft Windows systems, no session manager is included in the system; session management is provided by third-party applications like twinsplay.

A full description of session management under X Window-based systems is on the X session manager page.



Browser session management

Session management is particularly useful in a web browser where a user can save all open pages and settings and restore them at a later date. To help recover from a system or application crash, pages and settings can also be restored on the next run. OmniWeb and Opera are examples of web browsers that support session management. Other modern browsers such as Mozilla Firefox support session management through third-party plugins or extensions. Session management is often managed through the application of cookies.



Web server session management

Hypertext Transfer Protocol (HTTP) is stateless: a client computer running a web browser must establish a new Transmission Control Protocol (TCP) network connection to the web server with each new HTTP GET or POST request. The web server, therefore, cannot rely on an established TCP network connection for longer than a single HTTP GET or POST operation. Session management is the technique used by the web developer to make the stateless HTTP protocol support session state. For example, once a user has authenticated to the web server, the user's next HTTP request (GET or POST) should not cause the web server to ask for the account and password again. For a discussion of the methods used to accomplish this, see HTTP cookie.

The session information is stored on the web server using the session identifier (session ID) generated as a result of the first (sometimes the first authenticated) request from the end user running a web browser. The "storage" of session IDs and the associated session data (user name, account number, etc.) on the web server is accomplished using a variety of techniques including, but not limited to: local memory, flat files, and databases.

In situations where multiple web servers must share knowledge of session state (as is typical in a cluster environment; see computer cluster), session information must be shared between the cluster nodes that are running web server software. Methods for sharing session state between nodes in a cluster include: multicasting session information to member nodes (see JGroups for one example of this technique), sharing session information with a partner node using distributed shared memory or memory virtualization, sharing session information between nodes using network sockets, storing session information on a shared file system such as the network file system or the global file system, or storing the session information outside the cluster in a database.

If session information is considered transient, volatile data that is not required for non-repudiation of transactions and does not contain data that is subject to compliance auditing (in the U.S., for example, see the Health Insurance Portability and Accountability Act and the Sarbanes-Oxley Act for examples of two laws that necessitate compliance auditing), then any method of storing session information can be used. However, if session information is subject to audit compliance, consideration should be given to the method used for session storage, replication, and clustering.

In a service-oriented architecture, Simple Object Access Protocol (SOAP) messages constructed with Extensible Markup Language (XML) can be used by consumer applications to cause web servers to create sessions.
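A hedged sketch of the technique described above: the server generates an opaque session ID after authentication, hands it to the browser in a cookie, and keeps the per-user state on the server keyed by that ID. The in-memory dictionary stands in for the files or databases mentioned in the text.

    import secrets

    SESSIONS = {}    # session ID -> session data (stand-in for local memory, files or a database)

    def create_session(username: str) -> str:
        session_id = secrets.token_urlsafe(32)          # unguessable identifier
        SESSIONS[session_id] = {"user": username}
        return session_id

    def handle_request(cookie_header: str) -> str:
        # Parse "sessionid=<value>" out of the Cookie header sent by the browser.
        cookies = dict(p.strip().split("=", 1) for p in cookie_header.split(";") if "=" in p)
        session = SESSIONS.get(cookies.get("sessionid"))
        if session is None:
            return "401: please log in again"
        return f"200: welcome back, {session['user']}"

    sid = create_session("alice")                        # after a successful login
    print(f"Set-Cookie: sessionid={sid}; HttpOnly")      # sent back in the HTTP response
    print(handle_request(f"sessionid={sid}"))            # a later request carries the cookie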

4. Dynamic HTML (DHTML)

Dynamic HTML, or DHTML, is an umbrella term for a collection of technologies used together to create interactive and animated web sites by using a combination of a static markup language (such as HTML), a client-side scripting language (such as JavaScript), a presentation definition language (such as CSS), and the Document Object Model.

DHTML allows scripting languages to change variables in a web page's definition language, which in turn affects the look and function of otherwise "static" HTML page content, after the page has been fully loaded and during the viewing process. Thus the dynamic characteristic of DHTML is the way it functions while a page is viewed, not in its ability to generate a unique page with each page load.

By contrast, a dynamic web page is a broader concept: any web page generated differently for each user, load occurrence, or specific variable values. This includes pages created by client-side scripting, and ones created by server-side scripting (such as PHP, Perl, JSP or ASP.NET) where the web server generates content before sending it to the client.




Uses

DHTML allows authors to add effects to their pages that are otherwise difficult to achieve. For example, DHTML allows the page author to:

Animate text and images in their document, independently moving each element from any starting point to any ending point, following a predetermined path or one chosen by the user.

Embed a ticker that automatically refreshes its content with the latest news, stock quotes, or other data.

Use a form to capture user input, and then process and respond to that data without having to send data back to the server.

Include rollover buttons or drop-down menus.

A less common use is to create browser-based action games. During the late 1990s and early 2000s, a number of games were created using DHTML, but differences between browsers made this difficult: many techniques had to be implemented in code to enable the games to work on multiple platforms. Recently browsers have been converging towards the web standards, which has made the design of DHTML games more viable. Those games can be played on all major browsers and they can also be ported to Widgets for Mac OS X and Gadgets for Windows Vista, which are based on DHTML code.

The term "DHTML" has fallen out of use in recent years, as it was associated with practices and conventions that tended not to work well between various web browsers. DHTML may now be referred to as unobtrusive JavaScript coding (DOM Scripting), in an effort to place an emphasis on agreed-upon best practices while allowing similar effects in an accessible, standards-compliant way.

Basic DHTML support was introduced with Internet Explorer 4.0, although there was a basic dynamic system with Netscape Navigator 4.0. When it originally became widespread, DHTML-style techniques were difficult to develop and debug due to varying degrees of support among web browsers for the technologies involved. Development became easier when Internet Explorer 5.0+, Mozilla Firefox 2.0+, and Opera 7.0+ adopted a shared Document Object Model.

More recently, JavaScript libraries such as jQuery have abstracted away much of the day-to-day difficulties in cross-browser DOM manipulation.



DHTML

"Dynamic HTML" is typically used to describe the combination of HTML, style sheets and scripts that allows documents to be animated. Dynamic HTML allows a web page to change after it's loaded into the browser -- there doesn't have to be any communication with the web server for an update. You can think of it as 'animated' HTML. For example, a piece of text can change from one size or color to another, or a graphic can move from one location to another, in response to some kind of user action, such as clicking a button.



Technology Components

The major components of Dynamic HTML technology are:

Style Sheets (NS) (MS): let you specify the stylistic attributes of the typographic elements of your web page. They let you change the color, size, or style of the text on a page without waiting for the screen to refresh.

Content Positioning (NS) (MS): lets a web developer animate any element on a web page, moving pictures, text, and objects at will. It lets you ensure that pieces of content are displayed on the page exactly where you want them to appear, and you can modify their appearance and location after the page has been displayed.

Dynamic Content (MS): actually changes the words, pictures, or multimedia on a page without another trip to the web server.

Data Binding (MS): lets you get all the information you need to ask questions, change elements, and get results without going back to the web server.

Downloadable Fonts (NS): let you use the fonts of your choice to enhance the appearance of your text. Then you can package the fonts with the page so that the text is always displayed with your chosen fonts.

5. Enterprise Java Beans

Enterprise JavaBeans (EJB) is a managed, server-side component architecture for modular construction of enterprise applications.

The EJB specification is one of several Java APIs in the Java EE specification. EJB is a server-side model that encapsulates the business logic of an application. The EJB specification was originally developed in 1997 by IBM and later adopted by Sun Microsystems (EJB 1.0 and 1.1) in 1999, and enhanced under the Java Community Process as JSR 19 (EJB 2.0), JSR 153 (EJB 2.1), JSR 220 (EJB 3.0) and JSR 318 (EJB 3.1).

The EJB specification intends to provide a standard way to implement the back-end 'business' code typically found in enterprise applications (as opposed to 'front-end' interface code). Such code was frequently found to address the same types of problems, and it was found that solutions to these problems are often repeatedly re-implemented by programmers. Enterprise JavaBeans were intended to handle such common concerns as persistence, transactional integrity, and security in a standard way, leaving programmers free to concentrate on the particular problem at hand.














Types

An EJB container holds two major types of beans:

Session Beans, which can be either "Stateful", "Stateless" or "Singleton" and can be accessed via either a Local (same JVM) or Remote (different JVM) interface, or directly without an interface, in which case local semantics apply. All session beans support asynchronous execution for all views (local/remote/no-interface).

Message Driven Beans (also known as MDBs or Message Beans). MDBs also support asynchronous execution, but via a messaging paradigm.



Session beans

Stateful Session Beans

Stateful Session Beans are business objects having state: that is, they keep track of which calling client they are dealing with throughout a session, and thus access to the bean instance is strictly limited to only one client at a time. If concurrent access to a single bean is attempted anyway, the container serializes those requests, but via the @AccessTimeout annotation the container can throw an exception instead.

Stateful session beans' state may be persisted (passivated) automatically by the container to free up memory after the client hasn't accessed the bean for some time. The JPA extended persistence context is explicitly supported by Stateful Session Beans.

Examples

Checking out in a web store might be handled by a stateful session bean that would use its state to keep track of where the customer is in the checkout process, possibly holding locks on the items the customer is purchasing (from a system architecture's point of view, it would be less ideal to have the client manage those locks).



Stateless Session Beans

Stateless Session Beans are business objects that do not have state associated with them. However, access to a single bean instance is still limited to only one client at a time, and thus concurrent access to the bean is prohibited. In case concurrent access to a single bean is attempted anyway, the container simply routes each request to a different instance. This makes a stateless session bean automatically thread-safe. Instance variables can be used during a single method call to the bean, but the contents of those instance variables are not guaranteed to be preserved across method calls. Instances of Stateless Session Beans are typically pooled. If a second client accesses a specific bean right after a method call on it made by a first client has finished, it might get the same instance. The lack of overhead to maintain a conversation with the calling client makes them less resource-intensive than stateful beans.



Examples

Sending an e-mail to customer support might be handled by a stateless bean, since this is a one-off operation and not part of a multi-step process.

A user of a website clicking on a "keep me informed of future updates" box may trigger a call to an asynchronous method of the session bean to add the user to a list in the company's database (this call is asynchronous because the user does not need to wait to be informed of its success or failure).

Fetching multiple independent pieces of data for a website, like a list of products and the history of the current user, might be handled by asynchronous methods of a session bean as well (these calls are asynchronous because they can execute in parallel that way, which potentially increases performance). In this case, the asynchronous method will return a Future instance.
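A minimal sketch of a stateless session bean with such an asynchronous method, again assuming an EJB 3.x container, is shown below; the names (ProductCatalogBean, findProducts) and the sample data are illustrative only.

// A minimal sketch of a stateless session bean with an asynchronous method.
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Future;
import javax.ejb.AsyncResult;
import javax.ejb.Asynchronous;
import javax.ejb.Stateless;

@Stateless
public class ProductCatalogBean {

    // No conversational state: any pooled instance may serve any client.
    @Asynchronous
    public Future<List<String>> findProducts(String keyword) {
        List<String> products = queryDatabase(keyword); // placeholder for a real query
        return new AsyncResult<>(products);             // wraps the result in a Future
    }

    private List<String> queryDatabase(String keyword) {
        return Arrays.asList("product-1", "product-2"); // illustrative data only
    }
}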



Singleton Session Beans

Singleton Session Beans are business objects having a global shared state. Concurrent access to the one and only bean instance can be controlled by the container (Container-managed concurrency, CMC) or by the bean itself (Bean-managed concurrency, BMC). CMC can be tuned using the @Lock annotation, which designates whether a read lock or a write lock will be used for a method call. Additionally, Singleton Session Beans can explicitly request to be instantiated when the EJB container starts up, using the @Startup annotation.



Examples

Loading a global daily price list that will be the same for every user might be done with a singleton session bean, since this will prevent the application having to do the same query to a database over and over again.
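The sketch below shows, under the same container assumptions, how such a price-list cache might look as a singleton session bean with container-managed concurrency; the class and member names are illustrative only.

// A minimal sketch of a singleton session bean caching a daily price list.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.annotation.PostConstruct;
import javax.ejb.Lock;
import javax.ejb.LockType;
import javax.ejb.Singleton;
import javax.ejb.Startup;

@Singleton
@Startup   // created eagerly when the container starts
public class PriceListBean {

    private final Map<String, Double> prices = new ConcurrentHashMap<>();

    @PostConstruct
    private void load() {
        // In a real application this would query the database once per day.
        prices.put("sample-product", 9.99);
    }

    @Lock(LockType.READ)   // many clients may read concurrently
    public Double getPrice(String productId) {
        return prices.get(productId);
    }

    @Lock(LockType.WRITE)  // writers get exclusive access
    public void updatePrice(String productId, double price) {
        prices.put(productId, price);
    }
}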



Message driven beans

Message Driven Beans are business objects whose execution is triggered by messages instead of by method calls. The Message Driven Bean is used, among others, to provide a high-level ease-of-use abstraction for the lower-level JMS (Java Message Service) specification. It may subscribe to JMS message queues or message topics, which are typically injected into the bean. They were added in EJB to allow event-driven processing. Unlike session beans, an MDB does not have a client view (Local/Remote/No-interface), i.e. clients cannot look up an MDB instance. It just listens for any incoming message on, for example, a JMS queue or topic and processes them automatically. Only JMS support is required by the Java EE spec, but Message Driven Beans can support other messaging protocols. Such protocols may be asynchronous but can also be synchronous. Since session beans can also be synchronous or asynchronous, the prime difference between session and message driven beans is not the synchronicity, but the difference between (object oriented) method calling and messaging.
.


Unit IV


Section A


1. Cryptography


Cryptography is where security engineering meets mathematics. It provides us with the tools that underlie most modern security protocols. It is probably the key enabling technology for protecting distributed systems, yet it is surprisingly hard to do right.

Computer security people often ask for non-mathematical definitions of cryptographic terms. The basic terminology is that cryptography refers to the science and art of designing ciphers; cryptanalysis to the science and art of breaking them; while cryptology, often shortened to just crypto, is the study of both. The input to an encryption process is commonly called the plaintext, and the output the ciphertext. Thereafter, things get somewhat more complicated. There are a number of cryptographic primitives: basic building blocks, such as block ciphers, stream ciphers, and hash functions. Block ciphers may either have one key for both encryption and decryption, in which case they're called shared key (also secret key or symmetric), or have separate keys for encryption and decryption, in which case they're called public key or asymmetric. A digital signature scheme is a special type of asymmetric crypto primitive.
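As a small illustration of the shared-key case, the sketch below encrypts and decrypts a message with the JDK's javax.crypto API; the choice of AES in GCM mode is an assumption made for the example, not something prescribed by the text above.

// A minimal sketch of symmetric (shared-key) encryption with javax.crypto.
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

public class SymmetricDemo {
    public static void main(String[] args) throws Exception {
        // One key is used for both encryption and decryption (symmetric / secret key).
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(128);
        SecretKey key = keyGen.generateKey();

        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        GCMParameterSpec spec = new GCMParameterSpec(128, iv);

        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, spec);
        byte[] ciphertext = cipher.doFinal("plaintext message".getBytes(StandardCharsets.UTF_8));

        cipher.init(Cipher.DECRYPT_MODE, key, spec);
        byte[] plaintext = cipher.doFinal(ciphertext);
        System.out.println(new String(plaintext, StandardCharsets.UTF_8));
    }
}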


2. SSL

Secure Socket Layer (SSL), and its newer revision, Transport Layer Security (TLS), are the de-facto standard used to end-to-end encrypt and verify any website traffic deemed worthy of encryption. This includes specifically credit card purchases and bank sites, but it may also be used on any site requesting a password or dealing with personal information. SSL and TLS use public key encryption.





3. Secure Electronic Transaction (SET)

The Secure Electronic Transaction (SET) is an open encryption and security specification that is designed for protecting credit card transactions on the Internet.

4. Security Mechanisms in Electronic Money

The security mechanisms in these procedures are similar to the mechanisms described earlier. Let us study the process of the customer obtaining the money in the form of files from the bank. The same principles would apply to other transactions (e.g. a customer buying something from a merchant and then sending these files to him).



Section B


1. Digital Signature


Suppose that the base p and the generator g (which may or may not be a primitive root) are public values chosen in some suitable way, and that each user who wishes to sign messages has a private signing key X and a public signature verification key Y = g^X. An ElGamal signature scheme works as follows: choose a message key k at random, and form r = g^k (modulo p). Now form the signature s using a linear equation in k, r, the message M, and the private key X. There are a number of equations that will do; the particular one that happens to be used in ElGamal signatures is:

rX + sk = M (modulo p - 1)

So s is computed as s = (M - rX)/k (modulo p - 1). When both sides are passed through our one-way homomorphism f(x) = g^x (modulo p) we get:

g^(rX) * g^(sk) = g^M (modulo p)

or

Y^r * r^s = g^M (modulo p)


An ElGamal signature on the message M consists of the values r and s, and the recipient can verify it using the above equation. A few details need to be sorted out to get a functional digital signature scheme. For example, bad choices of p or g can weaken the algorithm; and we will want to hash the message M using a hash function so that we can sign messages of arbitrary length, and so that an opponent can't use the algorithm's algebraic structure to forge signatures on messages that were never signed. Having attended to these details and applied one or two optimizations, we get the Digital Signature Algorithm (DSA) which is a U.S. standard and widely used in government applications.


DSA (also known as DSS, for Digital Signature Standard) assumes a prime p of typically 1024 bits, a prime q of 160 bits dividing (p - 1), an element g of order q in the integers modulo p, a secret signing key x, and a public verification key y = g^x. The signature on a message M, Sig_x(M), is (r, s), where:

r = (g^k modulo p) modulo q

s = (h(M) + xr)/k modulo q

The hash function used here is SHA1.


DSA is the classic example of a randomized digital signature scheme without message recovery.
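As an illustration, the sketch below signs and verifies a message with the JDK's java.security.Signature API; the "SHA256withDSA" algorithm name and 2048-bit key size are assumptions suited to a modern JDK, whereas the text above describes the original SHA-1 based DSA.

// A minimal sketch of signing and verifying with the JDK's DSA implementation.
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class DsaDemo {
    public static void main(String[] args) throws Exception {
        // Generate the private signing key x and public verification key y.
        KeyPairGenerator keyGen = KeyPairGenerator.getInstance("DSA");
        keyGen.initialize(2048);
        KeyPair pair = keyGen.generateKeyPair();

        byte[] message = "message to be signed".getBytes(StandardCharsets.UTF_8);

        // Sign: the provider hashes M and computes (r, s) internally.
        Signature signer = Signature.getInstance("SHA256withDSA");
        signer.initSign(pair.getPrivate());
        signer.update(message);
        byte[] signature = signer.sign();

        // Verify with the public key.
        Signature verifier = Signature.getInstance("SHA256withDSA");
        verifier.initVerify(pair.getPublic());
        verifier.update(message);
        System.out.println("Signature valid: " + verifier.verify(signature));
    }
}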


2. The SET Process

Let us now take a simplistic look at the SET process before we describe its technical details.





The cardholder opens an account: The cardholder opens a credit card account (such as MasterCard or Visa) with a bank (issuer) that supports electronic payment mechanisms and the SET protocol.

The cardholder receives a certificate: After the cardholder's identity is verified (with the help of details such as passport, business documents, etc.) the cardholder receives a digital certificate from a CA. The certificate also contains details such as the cardholder's public key and its expiration date.

The merchant receives a certificate: A merchant who wants to accept credit cards must possess a digital certificate. The merchant must also obtain the payment gateway's digital certificate.

The cardholder places an order: This is a typical shopping cart process, wherein the cardholder browses the list of items available, searches for specific items, selects one or more of them and places the order. The merchant, in turn, sends details such as the list of items selected, their quantities, prices, total bill, etc. back to the cardholder for his records, with the help of an order form.



3. EDI Architecture



Business application layer: The business application layer receives the business document to be transmitted to the business partner of the organization. This document can be a purchase order, a funds transfer request, etc. Note that the business documents at this layer are usually not in the EDI format. They are, instead, in the organization's internal format. The business application layer is simply an interface that receives documents created in such internal formats.



Internal format conversion layer: This layer takes the business document prepared by the business application layer and converts each individual field into its EDI equivalent field.



EDI Translator layer: Whereas the internal format conversion layer maps individual data fields on to EDI fields, the translator maps the entire document on to its EDI equivalent document. That is, it makes sure that the inter-links in the document (e.g. in a payroll form, there could be a reference to an employee's department within the document), if any, conform to the EDI standard and that the document as a whole is now in the format specified by the EDI standards.



EDI envelope layer: Also known as the EDI communications layer, this piece of EDI software dials the phone number of the VAN, if it is a VAN-operated network. If it uses leased lines, the appropriate communication with the leased line software is the responsibility of this layer. At the sender's end (A), this software layer sends the business document to the VAN. The VAN stores it in its mailbox for the appropriate destination (B). The envelope layer at the destination (B) then receives this business document from its mailbox when it dials the VAN the next time.


4. What is EDI?



An acronym for Electronic Data Interchange.



Electronic data interchange (EDI) is the computer-to-computer exchange of data in standardized, electronic formats between companies. Computer to computer means "original application program to processing application program."

EDI is also a business strategy utilizing technology to achieve business objectives and enhance business relationships.

EDI transactions can be exchanged between two companies (commonly referred to as Trading Partners) anywhere in the world within hours or minutes.

The standardized formats make it possible for a variety of organizations to exchange information easily and without confusion.



EDI Standards

The sender and receiver must use the same standards so that everyone is speaking the same language. The American National Standards Institute (ANSI) Accredited Standards Committee (ASC) X12 defines EDI Standards. This is a cross-industry standards body with representation from many industries interested in EDI. The X12 standards serve as a common business language allowing all EDI trading partners to communicate electronically with one another. At Entergy we also follow guidelines developed by the Utility Industry Group (UIG), a sub-group of ANSI. It has been a key group in establishing standards for transactions commonly used by utilities. The invoice transaction is referred to as an 810 transaction.





5. Major sets of EDI standards:



The UN-recommended UN/EDIFACT is the only international standard and is predominant outside of North America.

The US standard ANSI ASC X12 (X12) is predominant in North America.

The TRADACOMS standard developed by the ANA (Article Numbering Association) is predominant in the UK retail industry.

The ODETTE standard is used within the European automotive industry.

All of these standards first appeared in the early to mid 1980s. The standards prescribe the formats, character sets, and data elements used in the exchange of business documents and forms. The complete X12 Document List includes all major business documents, including purchase orders (called "ORDERS" in UN/EDIFACT and an "850" in X12) and invoices (called "INVOIC" in UN/EDIFACT and an "810" in X12).

The EDI standard says which pieces of information are mandatory for a particular document, which pieces are optional, and gives the rules for the structure of the document. The standards are like building codes. Just as two kitchens can be built "to code" but look completely different, two EDI documents can follow the same standard and contain different sets of information. For example a food company may indicate a product's expiration date while a clothing manufacturer would choose to send color and size information.


Section C


1. SECURE ELECTRONIC TRANSACTION (SET)


The Secure Electronic Transaction (SET) is an open encryption and security specification that is designed for protecting credit card transactions on the Internet. The pioneering work in this area was done in 1996 by MasterCard and Visa jointly. They were joined by IBM, Microsoft, Netscape, RSA, Terisa and VeriSign. Starting from that time, there have been many tests of the concept, and by 1998 the first generation of SET-compliant products appeared in the market.

The need for SET came from the fact that MasterCard and Visa had realized that, for e-commerce payment processing, software vendors were coming up with new and conflicting standards. These were mainly driven by Microsoft on one hand and by IBM on the other. To avoid all sorts of future incompatibilities, MasterCard and Visa decided to come up with a standard, ignoring all their competition issues, and in the process, involving all the major software manufacturers.

SET is not a payment system. Instead, it is a set of security protocols and formats that enable the users to employ the existing credit card payment infrastructure on the Internet in a secure manner. SET services can be summarized as follows:

1. It provides a secure communication channel among all the parties involved in an e-commerce transaction.

2. It provides authentication by the use of digital certificates.

3. It ensures confidentiality, because the information is only available to the parties involved in a transaction, and that too only when and where necessary.



SET Participants

Before we discuss SET, let us have an overview of the participants in the SET system. We have discussed them earlier, but let us recap the main points for the ease of understanding and clarity.

Cardholder: Using the Internet, consumers and corporate purchasers interact with the merchants (discussed subsequently) for buying goods and services. A cardholder is an authorized holder of a payment card such as MasterCard or Visa that has been issued by an Issuer (discussed subsequently). This term is used interchangeably with customer.



Merchant: A merchant is a person or an organization that wants to sell goods or services to cardholders. A merchant must have a relationship with an Acquirer (discussed subsequently) for processing card payments.




Issuer: The issuer is a financial institution (such as a bank) that provides a payment card to a cardholder. The most critical point is that the issuer is ultimately responsible for the payment of the cardholder's debt.




Acquirer: This is a financial institution that has a relationship with merchants for processing payment card authorizations and payments. The reason for having acquirers is that merchants accept credit cards of more than one brand, but are not interested in dealing with so many bankcard organizations or issuers. Instead, an acquirer provides the merchant an assurance that a particular cardholder account is active, that the purchase amount does not exceed the credit limits, etc. The acquirer also provides electronic funds transfer to the merchant account. Later, the issuer reimburses the acquirer using some payment network.





Payment Gateway: This is a task that can be taken up by the acquirer or by an organization as a dedicated function. The payment gateway processes the payment messages on behalf of the merchant. Specifically in SET, the payment gateway acts as an interface between SET and the existing card payment networks for payment authorization. The merchant exchanges SET messages with the payment gateway over the Internet. The payment gateway, in turn, connects to the acquirer's systems using a dedicated network line in most cases.




Certification Authority (CA): This is an authority that is trusted to provide public key certificates to cardholders, merchants and payment gateways. In fact, CAs are very crucial to the success of SET.


2. SSL

Secure Socket Layer (SSL), and its newer revision, Transport Layer Security (TLS), are the de-facto standard used to end-to-end encrypt and verify any website traffic deemed worthy of encryption. This includes specifically credit card purchases and bank sites, but it may also be used on any site requesting a password or dealing with personal information. SSL and TLS use public key encryption.

The most recent draft of the SSL 3.0 specification was published in November of 1996 by Netscape. The intent was to be a "security protocol that provides communications privacy over the Internet. The protocol allows client/server applications to communicate in a way that is designed to prevent eavesdropping, tampering, or message forgery." The goals included cryptographic security, interoperability, extensibility, and relative efficiency.


Interoperability was a goal so that applications could be written to the standard and expected to work with any other applications written to the standard. Interoperability, it was noted, does not imply that two programs will always be able to connect. One might not have the correct algorithm support or credentials necessary for the connection to the other.


Extensibility was described as providing "a framework into which new public key and bulk encryption methods can be incorporated as necessary." It was noted that this should prevent the need to implement a new security protocol entirely should a weakness be found in one of the current encryption methods.

Cryptography, obviously, causes a higher CPU load than sending the data unencrypted. Still, they
made some effort to minimize the network traffic and allow for session caching.


SSL 3.0 was the basis for the TLS 1.0 (RFC 2246) specification published by the Internet Engineering Task Force (IETF) in 1999. The TLS 1.0 specification described itself as being similar to but not backwards compatible with the SSL 3.0 specification. It did include a fallback mechanism for SSL 3.0 if TLS was not available.



Protocols

SSL/TLS has 4 underlying protocols: Handshake, Record, Change Cipher Spec, and Alert.


All other SSL/TLS protocols reside inside of the Record protocol. This is laid out as:



The type allows for a
ny of the other 3 protocols as well as application data. In decimal, the types
are as follows:

20 ChangevCiphervSpec

21 Alert

22 Handshake

23 Application (data)


The version would be 3 then 0 for SSL 3.0. Because TLS is a "minor modification to the SSL 3.0 protocol," TLS is defined as major version 3, minor version 1. TLS 1.1 is 3 then 2, and the upcoming TLS 1.2 will be major version 3 then minor version 3.


The record length is written in terms of bytes and cannot exceed 2^14 (16,384). Compression allows for the length to be extended by up to 1024 bytes, to a new maximum of 17,408 bytes in the TLSCompressed.length field.


TLS connections begin with a 6-way handshake. (The handshake protocol message structure diagram is omitted here.)
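As a small illustration of the client side, the sketch below opens a TLS connection with the JDK's JSSE API (SSLSocket); the host name example.com and the request sent are placeholders, not part of the course text.

// A minimal sketch of opening a TLS connection from Java.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

public class TlsClientDemo {
    public static void main(String[] args) throws Exception {
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        try (SSLSocket socket = (SSLSocket) factory.createSocket("example.com", 443)) {
            socket.startHandshake();  // runs the TLS handshake protocol
            System.out.println("Negotiated protocol: " + socket.getSession().getProtocol());
            System.out.println("Cipher suite: " + socket.getSession().getCipherSuite());

            // Application data travels inside TLS record-protocol messages.
            PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
            out.print("GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n");
            out.flush();
            BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
            System.out.println(in.readLine());  // first line of the HTTP response
        }
    }
}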



3. Interpreting data

EDI translation software provides the interface between internal systems and the EDI format sent/received. For an "inbound" document the EDI solution will receive the file (either via a Value Added Network or directly using protocols such as FTP or AS2), take the received EDI file (commonly referred to as a "mailbag"), validate that the trading partner who is sending the file is a valid trading partner, that the structure of the file meets the EDI standards and that the individual fields of information conform to the agreed upon standards. Typically the translator will either create a file of either fixed length, variable length or XML tagged format, or "print" the received EDI document (for non-integrated EDI environments). The next step is to convert/transform the file that the translator creates into a format that can be imported into a company's back-end business systems or ERP. This can be accomplished by using a custom program, an integrated proprietary "mapper" or an integrated standards-based graphical "mapper" using a standard data transformation language such as XSLT. The final step is to import the transformed file (or database) into the company's back-end enterprise resource planning (ERP) system.
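As an illustration of the transformation step, the sketch below applies an XSLT stylesheet with the JDK's javax.xml.transform API, much as a translator might when mapping a received document into an internal format; the file names order-map.xsl, received-order.xml and erp-import.xml are placeholders.

// A minimal sketch of applying an XSLT transformation with the JDK.
import java.io.File;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TransformDemo {
    public static void main(String[] args) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        // Load the mapping rules expressed in XSLT.
        Transformer transformer = factory.newTransformer(new StreamSource(new File("order-map.xsl")));
        // Transform the received document into the back-end import format.
        transformer.transform(new StreamSource(new File("received-order.xml")),
                              new StreamResult(new File("erp-import.xml")));
    }
}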

For an "outbound" document the process for integrated EDI is to export a file (or read a database) from a company's back-end ERP, transform the file to the appropriate format for the translator. The translation software will then "validate" the EDI file sent to ensure that it meets the standard agreed upon by the trading partners, convert the file into "EDI" format (adding in the appropriate identifiers and control structures) and send the file to the trading partner (using the appropriate communications protocol).

Another critical component of any EDI translation software is a complete "audit" of all the steps to move business documents between trading partners. The audit ensures that any transaction (which in reality is a business document) can be tracked to ensure that they are not lost. In case of a retailer sending a Purchase Order to a supplier, if the Purchase Order is "lost" anywhere in the business process, the effect is devastating to both businesses. To the supplier, they do not fulfill the order as they have not received it, thereby losing business and damaging the business relationship with their retail client. For the retailer, they have a stock outage and the effect is lost sales, reduced customer service and ultimately lower profits.

In EDI terminology "inbound" and "outbound" refer to the direction of transmission of an EDI document in relation to a particular system, not the direction of merchandise, money or other things represented by the document. For example, an EDI document that tells a warehouse to perform an outbound shipment is an inbound document in relation to the warehouse computer system. It is an outbound document in relation to the manufacturer or dealer that transmitted the document.



Unit V


Section A


1. Extensible Markup Language



Extensible Markup Language (XML) is a set of rules for encoding documents in machine-readable form. It is defined in the XML 1.0 Specification produced by the W3C, and several other related specifications, all gratis open standards.


2. XML Schema

A newer schema language, described by the W3C as the successor of DTDs, is XML Schema, often referred to by the initialism for XML Schema instances, XSD (XML Schema Definition). XSDs are far more powerful than DTDs in describing XML languages. They use a rich datatyping system and allow for more detailed constraints on an XML document's logical structure. XSDs also use an XML-based format, which makes it possible to use ordinary XML tools to help process them.

3. Data binding

Another form of XML processing API is XML data binding, where XML data is made available as a hierarchy of custom, strongly typed classes, in contrast to the generic objects created by a Document Object Model parser. This approach simplifies code development, and in many cases allows problems to be identified at compile time rather than run-time. Example data binding systems include the Java Architecture for XML Binding (JAXB), XML Serialization in .NET, and CodeSynthesis XSD for C++.
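A minimal sketch of JAXB-style data binding is shown below, assuming an environment where javax.xml.bind is available (Java 8 or a Java EE container); the Customer class and the XML content are illustrative assumptions, not taken from the text.

// A minimal sketch of XML data binding with JAXB: XML is unmarshalled into typed objects.
import java.io.StringReader;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlRootElement;

public class JaxbDemo {

    @XmlRootElement(name = "customer")
    @XmlAccessorType(XmlAccessType.FIELD)
    public static class Customer {
        public String name;
        public int id;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<customer><name>Asha</name><id>42</id></customer>";
        JAXBContext context = JAXBContext.newInstance(Customer.class);
        Customer c = (Customer) context.createUnmarshaller()
                                       .unmarshal(new StringReader(xml));
        // Fields are now available as ordinary typed Java members.
        System.out.println(c.name + " / " + c.id);
    }
}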

4. Object-oriented programming (OOP)

It is a programming paradigm using "objects" (data structures consisting of data fields and methods together with their interactions) to design applications and computer programs. Programming techniques may include features such as data abstraction, encapsulation, messaging, modularity, polymorphism, and inheritance. Many modern programming languages now support OOP.



5. Standards organizations

Some of the standards organizations of relevance for communications protocols are the International Organization for Standardization (ISO), the International Telecommunications Union (ITU), the Institute of Electrical and Electronics Engineers (IEEE), and the Internet Engineering Task Force (IETF). The IETF maintains the protocols in use on the Internet. The IEEE controls many software and hardware protocols in the electronics industry for commercial and consumer devices. The ITU is an umbrella organization of telecommunications engineers designing the public switched telephone network (PSTN), as well as many radio communication systems. For marine electronics the NMEA standards are used. The World Wide Web Consortium (W3C) produces protocols and standards for Web technologies.

International standards organizations are supposed to be more impartial than local organizations with a national or commercial self-interest to consider. Standards organizations also do research and development for standards of the future. In practice, the standards organizations mentioned cooperate closely with each other.





Section B


1. WAP Stack

Application Layer: The application layer is also called Wireless Application Environment (WAE). This layer provides an environment for wireless application development, similar to the application layer of the TCP/IP stack.




Session Layer: Also called Wireless Session Protocol (WSP), the session layer provides methods for allowing a client-server interaction in the form of sessions between a mobile device and the WAP gateway. This is conceptually similar to the session layer of the OSI model.




Transaction Layer: The transaction layer is also called Wireless Transaction Protocol (WTP) in the WAP terminology. It provides methods for performing transactions with the desired degree of reliability. Such a layer is missing from the TCP/IP and OSI models.




Security Layer: The security layer in the WAP stack is also called the Wireless Transport Layer Security (WTLS) protocol. It is an optional layer which, when present, provides features such as authentication, privacy and secure connections, as required by many modern e-commerce and m-commerce applications.




Transport Layer: The transport layer of the WAP stack is also called Wireless Datagram Protocol (WDP) and it deals with the issues of transporting data packets between the mobile device and the WAP gateway, similar to the way TCP and UDP work.















2. Pull parsing in XML

Pull parsing treats the document as a series of items which are read in sequence using the Iterator design pattern. This allows for writing of recursive-descent parsers in which the structure of the code performing the parsing mirrors the structure of the XML being parsed, and intermediate parsed results can be used and accessed as local variables within the methods performing the parsing, or passed down (as method parameters) into lower-level methods, or returned (as method return values) to higher-level methods. Examples of pull parsers include StAX in the Java programming language, XMLReader in PHP and System.Xml.XmlReader in the .NET Framework.

A pull parser creates an iterator that sequentially visits the various elements, attributes, and data in an XML document. Code which uses this iterator can test the current item (to tell, for example, whether it is a start or end element, or text), and inspect its attributes (local name, namespace, values of XML attributes, value of text, etc.), and can also move the iterator to the next item. The code can thus extract information from the document as it traverses it. The recursive-descent approach tends to lend itself to keeping data as typed local variables in the code doing the parsing, while SAX, for instance, typically requires a parser to manually maintain intermediate data within a stack of elements which are parent elements of the element being parsed. Pull-parsing code can be more straightforward to understand and maintain than SAX parsing code.
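A minimal sketch of pull parsing with StAX (javax.xml.stream), which ships with the JDK, is shown below; the XML content and the element name "item" are placeholders used only for illustration.

// A minimal sketch of pull parsing an XML string with StAX.
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StaxDemo {
    public static void main(String[] args) throws Exception {
        String xml = "<order><item>pen</item><item>notebook</item></order>";
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));

        // The application pulls events one at a time and decides what to do with each.
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT
                    && "item".equals(reader.getLocalName())) {
                System.out.println("Item: " + reader.getElementText());
            }
        }
        reader.close();
    }
}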


3. The standardization process

The standardization process starts off with ISO commissioning a sub-committee workgroup. The workgroup issues working drafts and discussion documents to interested parties (including other standards bodies) in order to provoke discussion and comments. This will generate a lot of questions, much discussion and usually some disagreement on what the standard should provide and if it can satisfy all needs (usually not). All conflicting views should be taken into account, often by way of compromise, to progress to a draft proposal of the working group.

The draft proposal is discussed by the member countries' standard bodies and other organizations within each country. Comments and suggestions are collated and national views will be formulated, before the members of ISO vote on the proposal. If rejected, the draft proposal has to consider the objections and counter-proposals to create a new draft proposal for another vote. After a lot of feedback, modification, and compromise the proposal reaches the status of a draft international standard, and ultimately an international standard.

The process normally takes several years to complete. The original paper draft created by the designer will differ substantially from the standard, and will contain some of the following 'features':

Various optional modes of operation, for example to allow for setup of different packet sizes at startup time, because the parties could not reach consensus on the optimum packet size.

Parameters that are left undefined or allowed to take on values of a defined set at the discretion of the implementer. This often reflects conflicting views of some of the members.

Parameters reserved for future use, reflecting that the members agreed the facility should be provided, but could not reach agreement on how this should be done in the available time.

Various inconsistencies and ambiguities will inevitably be found when implementing the standard.

International standards are reissued periodically to handle the deficiencies and reflect changing
views on the subject.







Section C

1. Related specifications

A cluster of specifications closely related to XML has been developed, starting soon after the initial publication of XML 1.0. It is frequently the case that the term "XML" is used to refer to XML together with one or more of these other technologies which have come to be seen as part of the XML core.



XML Namespaces enable the same document to contain XML elements and attributes taken from different vocabularies, without any naming collisions occurring. Although XML Namespaces are not part of the XML specification itself, virtually all XML software also supports XML Namespaces.

XML Base defines the xml:base attribute, which may be used to set the base for resolution of relative URI references within the scope of a single XML element.

The XML Information Set or XML infoset describes an abstract data model for XML documents in terms of information items. The infoset is commonly used in the specifications of XML languages, for convenience in describing constraints on the XML constructs those languages allow.

xml:id Version 1.0 asserts that an attribute named xml:id functions as an "ID attribute" in the sense used in a DTD.

XPath defines a syntax named XPath expressions which identifies one or more of the internal components (elements, attributes, and so on) included in an XML document. XPath is widely used in other core-XML specifications and in programming libraries for accessing XML-encoded data (see the sketch after this list).

XSLT is a language with an XML-based syntax that is used to transform XML documents into other XML documents, HTML, or other, unstructured formats such as plain text or RTF. XSLT is very tightly coupled with XPath, which it uses to address components of the input XML document, mainly elements and attributes.

XSL Formatting Objects, or XSL-FO, is a markup language for XML document formatting which is most often used to generate PDFs.

XQuery is an XML-oriented query language strongly rooted in XPath and XML Schema. It provides methods to access, manipulate and return XML.

XML Signature defines syntax and processing rules for creating digital signatures on XML content.

XML Encryption defines syntax and processing rules for encrypting XML content.

Some other specifications conceived as part of the "XML Core" have failed to find wide adoption, including XInclude, XLink, and XPointer.
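As the sketch referred to above, the following evaluates an XPath expression with the JDK's javax.xml.xpath API; the XML content and the expression are illustrative placeholders only.

// A minimal sketch of addressing part of an XML document with XPath.
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XPathDemo {
    public static void main(String[] args) throws Exception {
        String xml = "<catalog><book><title>Web Technologies</title></book></catalog>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Address an internal component of the document with an XPath expression.
        XPath xpath = XPathFactory.newInstance().newXPath();
        String title = xpath.evaluate("/catalog/book/title", doc);
        System.out.println("Title: " + title);
    }
}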


2. History of XML

The versatility of SGML for dynamic information display was understood by early digital media publishers in the late 1980s prior to the rise of the Internet. By the mid-1990s some practitioners of SGML had gained experience with the then-new World Wide Web, and believed that SGML offered solutions to some of the problems the Web was likely to face as it grew. Dan Connolly added SGML to the list of W3C's activities when he joined the staff in 1995; work began in mid-1996 when Sun Microsystems engineer Jon Bosak developed a charter and recruited collaborators. Bosak was well connected in the small community of people who had experience both in SGML and the Web.


XML was compiled by a working group of eleven members, supported by an (approximately) 150-member Interest Group. Technical debate took place on the Interest Group mailing list and issues were resolved by consensus or, when that failed, majority vote of the Working Group. A record of design decisions and their rationales was compiled by Michael Sperberg-McQueen on December 4, 1997. James Clark served as Technical Lead of the Working Group, notably contributing the empty-element "<empty />" syntax and the name "XML". Other names that had been put forward for consideration included "MAGMA" (Minimal Architecture for Generalized Markup Applications), "SLIM" (Structured Language for Internet Markup) and "MGML" (Minimal Generalized Markup Language). The co-editors of the specification were originally Tim Bray and Michael Sperberg-McQueen. Halfway through the project Bray accepted a consulting engagement with Netscape, provoking vociferous protests from Microsoft. Bray was temporarily asked to resign the editorship. This led to intense dispute in the Working Group, eventually solved by the appointment of Microsoft's Jean Paoli as a third co-editor.

The XML Working Group never met face-to-face; the design was accomplished using a combination of email and weekly teleconferences. The major design decisions were reached in twenty weeks of intense work between July and November 1996, when the first Working Draft of an XML specification was published. Further design work continued through 1997, and XML 1.0 became a W3C Recommendation on February 10, 1998.



Sources

XML is a profile of an ISO standard SGML, and most of XML comes from SGML unchanged. From SGML comes the separation of logical and physical structures (elements and entities), the availability of grammar-based validation (DTDs), the separation of data and metadata (elements and attributes), mixed content, the separation of processing from representation (processing instructions), and the default angle-bracket syntax. Removed were the SGML Declaration (XML has a fixed delimiter set and adopts Unicode as the document character set).

Other sources of technology for XML were the Text Encoding Initiative (TEI), which defined a profile of SGML for use as a "transfer syntax"; and HTML, in which elements were synchronous with their resource, document character sets were separate from resource encoding, the xml:lang attribute was invented, and (like HTTP) metadata accompanied the resource rather than being needed at the declaration of a link. The Extended Reference Concrete Syntax (ERCS) project of the SPREAD (Standardization Project Regarding East Asian Documents) project of the ISO-related China/Japan/Korea Document Processing expert group was the basis of XML 1.0's naming rules; SPREAD also introduced hexadecimal numeric character references and the concept of references to make available all Unicode characters. To support ERCS, XML and HTML better, the SGML standard IS 8879 was revised in 1996 and 1998 with WebSGML Adaptations. The XML header followed that of ISO HyTime.

Ideas that developed during discussion which were novel in XML included the algorithm for encoding detection and the encoding header, the processing instruction target, the xml:space attribute, and the new close delimiter for empty-element tags. The notion of well-formedness as opposed to validity (which enables parsing without a schema) was first formalized in XML, although it had been implemented successfully in the Electronic Book Technology "Dynatext" software; the software from the University of Waterloo New Oxford English Dictionary Project; the RISP LISP SGML text processor at Uniscope, Tokyo; the US Army Missile Command IADS hypertext system; Mentor Graphics Context; Interleaf and Xerox Publishing System.



Versions

There are two current versions of XML. The first (XML 1.0) was initially defined in 1998. It has undergone minor revisions since then, without being given a new version number, and is currently in its fifth edition, as published on November 26, 2008. It is widely implemented and still recommended for general use.

The second (XML 1.1) was initially published on February 4, 2004, the same day as XML 1.0 Third Edition, and is currently in its second edition, as published on August 16, 2006. It contains features (some contentious) that are intended to make XML easier to use in certain cases. The main changes are to enable the use of line-ending characters used on EBCDIC platforms, and the use of scripts and characters absent from Unicode 3.2. XML 1.1 is not very widely implemented and is recommended for use only by those who need its unique features.

Prior to its fifth edition release, XML 1.0 differed from XML 1.1 in having stricter requirements for characters available for use in element and attribute names and unique identifiers: in the first four editions of XML 1.0 the characters were exclusively enumerated using a specific version of the Unicode standard (Unicode 2.0 to Unicode 3.2). The fifth edition substitutes the mechanism of XML 1.1, which is more future-proof but reduces redundancy. The approach taken in the fifth edition of XML 1.0 and in all editions of XML 1.1 is that only certain characters are forbidden in names, and everything else is allowed, in order to accommodate the use of suitable name characters in future versions of Unicode. In the fifth edition, XML names may contain characters in the Balinese, Cham, or Phoenician scripts among many others which have been added to Unicode since Unicode 3.2. Almost any Unicode code point can be used in the character data and attribute values of an XML 1.0 or 1.1 document, even if the character corresponding to the code point is not defined in the current version of Unicode. In character data and attribute values, XML 1.1 allows the use of more control characters than XML 1.0, but, for "robustness", most of the control characters introduced in XML 1.1 must be expressed as numeric character references (and #x7F through #x9F, which had been allowed in XML 1.0, are in XML 1.1 even required to be expressed as numeric character references). Among the supported control characters in XML 1.1 are two line break codes that must be treated as whitespace. Whitespace characters are the only control codes that can be written directly.

There has been discussion of an XML 2.0, although no organization has announced plans for work on such a project. XML-SW (SW for skunkworks), written by one of the original developers of XML, contains some proposals for what an XML 2.0 might look like: elimination of DTDs from syntax, integration of namespaces, XML Base and XML Information Set (infoset) into the base standard.

The World Wide Web Consortium also has an XML Binary Characterization Working Group doing preliminary research into use cases and properties for a binary encoding of the XML infoset. The working group is not chartered to produce any official standards.

3. Fundamental concepts and features: Object Technology

A survey by Deborah J. Armstrong of nearly 40 years of computing literature identified a number of "quarks", or fundamental concepts, found in the strong majority of definitions of OOP.


Not all of these concepts are to be found in all object-oriented programming languages. For example, object-oriented programming that uses classes is sometimes called class-based programming, while prototype-based programming does not typically use classes. As a result, a significantly different yet analogous terminology is used to define the concepts of object and instance.

Benjamin C. Pierce and some other researchers view as futile any attempt to distill OOP to a minimal set of features. He nonetheless identifies fundamental features that support the OOP programming style in most object-oriented languages:

Dynamic dispatch: when a method is invoked on an object, the object itself determines what code gets executed by looking up the method at run time in a table associated with the object. This feature distinguishes an object from an abstract data type (or module), which has a fixed (static) implementation of the operations for all instances. It is a programming methodology that gives modular component development while at the same time being very efficient.

Encapsulation (or multi-methods, in which case the state is kept separate)

Subtype polymorphism

Object inheritance (or delegation)

Open recursion: a special variable (syntactically it may be a keyword), usually called this or self, that allows a method body to invoke another method body of the same object. This variable is late-bound; it allows a method defined in one class to invoke another method that is defined later, in some subclass thereof.

Similarly, in his 2003 book, Concepts in programming languages, John C. Mitchell identifies four main features: dynamic dispatch, abstraction, subtype polymorphism, and inheritance. Michael Lee Scott in Programming Language Pragmatics considers only encapsulation, inheritance and dynamic dispatch.

Additional concepts used in object-oriented programming include:

Classes of objects

Instances of classes

Methods which act on the attached objects

Message passing

Abstraction



Decoupling

Decoupling refers to careful controls that separate code modules from particular use cases, which increases code re-usability. A common use of decoupling in OOP is to polymorphically decouple the encapsulation (see Bridge pattern and Adapter pattern), for example by using a method interface which an encapsulated object must satisfy, as opposed to using the object's class.
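To tie several of these concepts together, the sketch below shows encapsulation, subtype polymorphism and dynamic dispatch in Java; the class and interface names (Shape, Circle, Square) are illustrative only.

// A minimal sketch of encapsulation, polymorphism and dynamic dispatch.
public class OopDemo {

    interface Shape {                      // the method interface callers depend on
        double area();
    }

    static class Circle implements Shape {
        private final double radius;       // encapsulated state, hidden behind methods
        Circle(double radius) { this.radius = radius; }
        public double area() { return Math.PI * radius * radius; }
    }

    static class Square implements Shape {
        private final double side;
        Square(double side) { this.side = side; }
        public double area() { return side * side; }
    }

    public static void main(String[] args) {
        Shape[] shapes = { new Circle(1.0), new Square(2.0) };
        for (Shape s : shapes) {
            // Dynamic dispatch: the object itself determines which area() implementation runs.
            System.out.println(s.area());
        }
    }
}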


4. Future of standardization (OSI)

A lesson learned from ARPANET (the predecessor of the Internet) is that standardization of protocols is not enough, because protocols also need a framework to operate. It is therefore important to develop a general purpose, future-proof framework suitable for structured protocols (such as layered protocols) and their standardization. This would prevent protocol standards with overlapping functionality and would allow clear definition of the responsibilities of a protocol at the different levels (layers). This gave rise to the ISO Open Systems Interconnection reference model (RM/OSI), which is used as a framework for the design of standard protocols and services conforming to the various layer specifications.

In the OSI model, communicating systems are assumed to be connected by an underlying physical medium providing a basic (and unspecified) transmission mechanism. The layers above it are numbered (from one to seven); the nth layer is referred to as (n)-layer. Each layer provides service to the layer above it (or at the top to the application process) using the services of the layer immediately below it. The layers communicate with each other by means of an interface, called a service access point. Corresponding layers at each system are called peer entities. To communicate, two peer entities at a given layer use a (n)-protocol, which is implemented by using services of the (n-1)-layer. When systems are not directly connected, intermediate peer entities (called relays) are used. An address uniquely identifies a service access point. The address naming domains need not be restricted to one layer, so it is possible to use just one naming domain for all layers. For each layer there are two types of standards: protocol standards defining how peer entities at a given layer communicate, and service standards defining how a given layer communicates with the layer above it.



In the original version of RM/OSI, the layers and their functionality are (from highest to lowest layer):

The application layer may provide the following services to the application processes: identification of the intended communication partners, establishment of the necessary authority to communicate, determination of availability and authentication of the partners, agreement on privacy mechanisms for the communication, agreement on responsibility for error recovery and procedures for ensuring data integrity, synchronization between cooperating application processes, identification of any constraints on syntax (e.g. character sets and data structures), determination of cost and acceptable quality of service, selection of the dialogue discipline, including required logon and logoff procedures.



The presentation layer may provide the following services to the application layer: a request for the establishment of a session, data transfer, negotiation of the syntax to be used between the application layers, any necessary syntax transformations, formatting and special purpose transformations (e.g. data compression and data encryption).



The session layer may provide the following services to the presentation layer: establishment and release of session connections, normal and expedited data exchange, a quarantine service which allows the sending presentation entity to instruct the receiving session entity not to release data to its presentation entity without permission, interaction management so presentation entities can control whose turn it is to perform certain control functions, resynchronization of a session connection, reporting of unrecoverable exceptions to the presentation entity.




The transport layer provides reliable and transparent data transfer in a cost effective way as required by the selected quality of service. It may support the multiplexing of several transport connections on to one network connection or split one transport connection into several network connections.



The network layer does the setup, maintenance and release of network paths between transport peer entities. When relays are needed, routing and relay functions are provided by this layer. The quality of service is negotiated between network and transport entities at the time the connection is setup. This layer is also responsible for (network) congestion control.



The data link layer does the setup, maintenance and release of data link connections. Errors occurring in the physical layer are detected and may be corrected. Errors are reported to the network layer. The exchange of data link units (including flow control) is defined by this layer.



The physical layer describes details like the electrical characteristics of the physical connection, the transmission techniques used, and the setup, maintenance and clearing of physical connections.

In contrast to the TCP/IP layering scheme, which assumes a connectionless network, RM/OSI assumed a connection oriented network. Connection oriented networks are more suitable for wide area networks and connectionless networks are more suitable for local area networks. Using connections to communicate implies some form of session and (virtual) circuits, hence the session layer, which is lacking in the TCP/IP model. The constituent members of ISO were mostly concerned with wide area networks, so development of RM/OSI concentrated on connection oriented networks, and connectionless networks were only mentioned in an addendum to RM/OSI. At the time, the IETF had to cope with this and the fact that the Internet needed protocols which simply were not there. As a result the IETF developed its own standardization process based on "rough consensus and running code". The standardization process is described by RFC 2026.

Nowadays, the IETF has become a standards organization for the protocols in use on the Internet. RM/OSI has extended its model to include connectionless services and because of this, both TCP and IP could be developed into international standards.