The Architecture of the World Wide Web

24 Oct 2013

Min Song

IS

NJIT

Internet Architecture


Today’s Internet


Thousands of networks


Connected by legal agreements and commercial contracts


Uses TCP/IP protocol


Internet service providers (ISPs)


Provide most individual users with access to the Internet


Dialup connections


Modems and conventional phone lines


DSL and cable modems provide broadband access

Packet Switching


Most modern Wide Area Network (WAN) protocols, including TCP/IP, X.25, and Frame Relay, are based on packet switching


Packet switching is more efficient and robust for data that can withstand some delays in transmission, such as e-mail messages and Web pages.


Circuit-switching: Normal telephone service is based on a circuit-switching technology


a dedicated line is allocated for transmission between two parties.


ideal when data must be transmitted quickly and must arrive in the same order in which it's sent.


for example real-time data, such as live audio and video.

Use of Packets

Internet Protocols: TCP/IP


Communications protocol suite


Packet switched protocol


No dedicated end-to-end circuit is required


Each message broken down into small pieces called packets


Packets possibly routed to destination over different paths


Transmission Control Protocol (TCP)


Breaks messages into packets


Numbers packets in order


Reorders packets at the destination


Internet Protocol (IP)


Routes packets to the proper destination
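
The division of labour described above (break the message into packets, number them, reorder them at the destination) can be sketched in a few lines of Python. This is a toy illustration only: the function names and the packet size are assumptions made for the example, and real TCP numbers bytes rather than whole packets and also handles loss and retransmission.

```python
import random

def to_packets(message: bytes, size: int) -> list:
    """Split a message into (sequence_number, payload) pairs."""
    return [(seq, message[start:start + size])
            for seq, start in enumerate(range(0, len(message), size))]

def reassemble(packets: list) -> bytes:
    """Sort packets by sequence number and join their payloads."""
    return b"".join(payload for _, payload in sorted(packets))

message = b"Packets may take different paths to the destination."
packets = to_packets(message, size=8)
random.shuffle(packets)           # simulate out-of-order arrival
restored = reassemble(packets)    # same bytes as the original message
```

Because each packet carries its own sequence number, the receiver does not need to know which path any packet took.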




Domain Names


Every computer connected to the Internet must have a unique IP address


An IPv4 address has the format xxx.xxx.xxx.xxx, where each xxx is a number between 0 and 255


How do we know that 207.46.245.222 is Microsoft?


Domain Name System (DNS)


A database of Internet names


DNS Servers convert Internet names to IP addresses


Top level domains
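
As a rough illustration of the points above, the sketch below checks the dotted-decimal address format with Python's standard ipaddress module and mimics a name lookup with a small in-memory table. The table, its single entry, and the helper names are hypothetical; a real lookup would query a DNS server (for example through socket.gethostbyname).

```python
import ipaddress
from typing import Optional

def is_ipv4(text: str) -> bool:
    """Return True if text is four dot-separated numbers, each 0 to 255."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False

# Hypothetical name table standing in for a DNS server's database.
NAME_TABLE = {"www.example.com": "192.0.2.10"}

def resolve(name: str) -> Optional[str]:
    """Convert an Internet name to an IP address, or None if unknown."""
    return NAME_TABLE.get(name.lower())

def top_level_domain(name: str) -> str:
    """Return the last label of a domain name, e.g. 'com' or 'edu'."""
    return name.rstrip(".").rsplit(".", 1)[-1]
```

For instance, is_ipv4("207.46.245.222") is true, while is_ipv4("256.1.1.1") is false because 256 is out of range.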



Ping: to test whether a particular host is reachable across an IP network.


Tcpdump: to capture (sniff) network packets and run statistical analysis on the resulting dumps



The World Wide Web


Collection of hyperlinked computer files on the Internet


Client-server application


Web servers


Web browsers as clients


WWW standards


Hypertext markup language (HTML)


Current standard for writing Web pages


Application of SGML specifically for Web pages


Tags in HTML instruct the client browser how to format and display the Web page content


Hypertext transfer protocol (HTTP)


Protocol for exchanging requests and responses between Web client and server


Extensible markup language (XML)


A meta-markup language


Gives meaning to the data enclosed within XML tags
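
To make the last point concrete, here is a small sketch of how XML tags label data so a program can pick values out by name rather than by position. The order document, its element names, and its attribute names are invented for the example; parsing uses Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the tag and attribute names carry the meaning.
document = """
<order id="17">
  <customer>Ada</customer>
  <item sku="B-204" quantity="2">Notebook</item>
</order>
""".strip()

root = ET.fromstring(document)
item = root.find("item")

summary = {
    "order": root.get("id"),                  # attribute on the root element
    "customer": root.findtext("customer"),    # text of a child element
    "sku": item.get("sku"),
    "quantity": int(item.get("quantity")),
    "description": item.text,
}
```

An HTML page showing the same order would tell the browser how to display it, but not which string is the customer and which is the product.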


Static versus Dynamic Web Pages


HTML and XML only display and exchange data


No interactivity; no processing of data


Scripting languages


Provides basic interactivity


Rollovers


Crawling text


JavaScript


VBScript


Full-featured Web programming


Java


Client side scripting or browser side scripting


Applets


J2EE


Common Gateway Interface (CGI)


Allows passing of data between a static HTML page and a computer program
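
A minimal sketch of the CGI idea, assuming a form that sends a single field called name (a hypothetical field chosen for the example): the Web server places the query string in the QUERY_STRING environment variable, and the program answers with a header block, a blank line, and the generated page.

```python
import html
from urllib.parse import parse_qs

def handle_request(environ) -> str:
    """Build a CGI-style response from the server-provided environment."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]   # fall back to a default value
    # Escape user input before placing it in the page.
    body = "<TITLE>Hello</TITLE>\nHello, " + html.escape(name) + "!\n"
    return "Content-Type: text/html\r\n\r\n" + body

# Example call with a hand-built environment instead of a real server.
page = handle_request({"QUERY_STRING": "name=Ada"})
```

In an actual CGI deployment the program would read os.environ and write the returned text to standard output; the server then relays it to the browser.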



Searching the WWW


Most data on the Internet is part of the WWW


Search engines


large databases that index WWW content


Building the search engine database


Submit a site to the search engine administrator for listing


Spiders


Metatags


Google


Yahoo
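
The "large database that indexes WWW content" can be pictured as an inverted index: a map from each word to the pages that contain it. The sketch below is a simplified illustration with made-up page URLs and text; it is not how any particular search engine is implemented, and it ignores ranking, metatags, and the spider that would fetch the pages.

```python
from collections import defaultdict

def build_index(pages: dict) -> dict:
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index: dict, query: str) -> set:
    """Return the URLs that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical pages that a spider might have collected.
pages = {
    "http://example.com/a": "packet switching on the Internet",
    "http://example.com/b": "circuit switching and telephone service",
}
index = build_index(pages)
hits = search(index, "packet switching")
```

Answering a query then needs only a few set lookups, regardless of how many pages were crawled to build the index.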



Hypertext Transfer Protocol


A protocol (syntax and semantics) for transferring representations of resources


usually across the Internet using TCP


Design goals


speed (stateless, cachable, few round-trips)


simplicity


extensibility


data (payload) independence


A true network-based API

HTTP/0.9 (pre-1993)


Absolute Simplicity

GET /url-path

<TITLE>Hello World</TITLE>

Hello World


No Extensibility


only one method (GET)


no request modifiers


no response metadata
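
The limits listed above show up clearly in a toy handler. The sketch below assumes a hypothetical in-memory document table and treats a request as the single line "GET /url-path"; the reply is the document body and nothing else.

```python
# Hypothetical document store for the example.
DOCUMENTS = {
    "/hello.html": b"<TITLE>Hello World</TITLE>\nHello World\n",
}

def handle_http09(request: bytes) -> bytes:
    """Answer an HTTP/0.9-style request: one method, no headers, no status."""
    parts = request.strip().split()
    if len(parts) != 2 or parts[0] != b"GET":
        return b""                      # no way to report an error
    path = parts[1].decode("latin-1")
    return DOCUMENTS.get(path, b"")     # missing documents look the same

reply = handle_http09(b"GET /hello.html\r\n")
```

With no status line or headers, the client cannot tell an empty document from a missing one or from a malformed request, which is part of what HTTP/1.0 set out to fix.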

HTTP/1.0 (1993-present)


Simple and (mostly) Extensible

GET /Test/hello.html HTTP/1.0
Accept: text/html
User-Agent: GET/5 libwww-perl/0.40

HTTP/1.0 200 OK
Date: Fri, 12 Jan 1996 01:02:49 GMT
Server: Apache/1.0.5
Content-type: text/html
Content-length: 38
Last-modified: Wed, 10 Jan 1996 01:

<TITLE>Hello</TITLE>
Hello out there!
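
To show what the added status line and headers give a client, here is a minimal parsing sketch. It assumes header lines end in CRLF and that the two body lines each end in a single line feed (which makes the body exactly 38 bytes, matching Content-length); the Last-modified header is left out because its value is cut off in the example above.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.0 response into status line, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    version, status, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()   # names are case-insensitive
    return version, int(status), reason, headers, body

raw = (b"HTTP/1.0 200 OK\r\n"
       b"Date: Fri, 12 Jan 1996 01:02:49 GMT\r\n"
       b"Server: Apache/1.0.5\r\n"
       b"Content-type: text/html\r\n"
       b"Content-length: 38\r\n"
       b"\r\n"
       b"<TITLE>Hello</TITLE>\nHello out there!\n")

version, status, reason, headers, body = parse_response(raw)
```

Unlike HTTP/0.9, the client now learns the outcome (200 OK), the media type, and how many bytes of body to expect.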

HTTP/1.0 Deficiencies


No complete specification until end of '94


No minimum standard for compliance


Poor network behavior


one request per connection


no reliable transfer of dynamic content


no control over response caching


failed to anticipate proxies and gateways


created huge demand for vanity addresses


misuse/misunderstanding of MIME

HTTP/1.1


Culmination of two years' work, RFC 2068


with Henrik Frystyk, Jim Gettys, Jeff Mogul


designed at UCI and W3C; expanded in IETF


Improved Reliability


chunked transfer of dynamic content


recognition of proxy and gateway requirements


explicit cachability of responses


Improved Network Behavior


persistent connections


virtual hosts (many names, one address)
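
Chunked transfer is what lets a server send dynamic content without knowing its total length in advance: each chunk is prefixed with its size in hexadecimal, and a zero-size chunk ends the body. The sketch below is a simplified encoder and decoder written for illustration; it ignores trailers and does no error handling.

```python
def encode_chunked(parts) -> bytes:
    """Encode an iterable of byte strings using chunked transfer coding."""
    out = b""
    for part in parts:
        if part:  # a zero-length chunk would end the body early
            out += f"{len(part):X}".encode("ascii") + b"\r\n" + part + b"\r\n"
    return out + b"0\r\n\r\n"           # last chunk, no trailers

def decode_chunked(data: bytes) -> bytes:
    """Decode a chunked body back into the original bytes."""
    body = b""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size_field = data[pos:line_end].split(b";")[0]   # drop chunk extensions
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            return body
        body += data[pos:pos + size]
        pos += size + 2                 # skip the chunk data and its CRLF

parts = [b"<TITLE>Hello</TITLE>\n", b"Hello out there!\n"]
encoded = encode_chunked(parts)
decoded = decode_chunked(encoded)
```

Because the end of the body is marked in-band, the connection can stay open for the next request, which is what makes chunking a natural companion to persistent connections.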

HTTP/1.1 (1997-????)


Less Simple, More Extensible, but Compatible

GET /Test/hello.html HTTP/1.1
Host: kiwi.ics.uci.edu:8080
User-Agent: GET/7 libwww-perl/5.40

HTTP/1.1 200 OK
Date: Fri, 07 Jan 1997 15:40:09 GMT
Server: Apache/1.2b6
Content-type: text/html
Transfer-Encoding: chunked
Etag: "a797cd-465af"
Cache-control: max-age=3600
Vary: Accept-Language
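
The Cache-control header in this response is an instance of explicit cachability: max-age=3600 tells any cache that it may reuse the response for up to an hour. A minimal freshness check might look like the sketch below; the function names are assumptions, and other directives (no-cache, no-store, and so on) are deliberately ignored.

```python
from typing import Optional

def max_age(cache_control: str) -> Optional[int]:
    """Return the max-age value in seconds, or None if it is absent."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None

def is_fresh(age_seconds: int, cache_control: str) -> bool:
    """A stored response is fresh while its age is below max-age."""
    limit = max_age(cache_control)
    return limit is not None and age_seconds < limit

fresh_after_two_minutes = is_fresh(120, "max-age=3600")
fresh_after_two_hours = is_fresh(7200, "max-age=3600")
```

Once a stored copy is no longer fresh, a cache can revalidate it with the origin server using the Etag value instead of downloading the body again.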




HTTP/1.x Deficiencies


MIME is too verbose (overhead per message)


Control mixed with metadata


Metadata restricted to header or trailer


Fixed request/response ordering can block progress


Incurs frequent round-trip delays due to connection establishment.

HTTP/2.x


Tokenized transfer of common fields


reducing bandwidth usage, latency


removal of MIME syntax limitations


self-descriptive for extensions


Multiplexing control, data, metadata streams


reducing desire for multiple connections


enabling multi-protocol connections


per-stream priority or credit mechanism


Layered streams for meta-metadata, encryption...
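
The multiplexing idea can be sketched independently of any wire format: cut each stream into frames tagged with a stream identifier, interleave the frames on one connection, and regroup them at the other end. The frame layout, the round-robin order, and the function names below are assumptions made for illustration and do not describe an actual protocol's framing.

```python
from itertools import zip_longest

def multiplex(streams: dict, frame_size: int) -> list:
    """Interleave several byte streams as (stream_id, payload) frames."""
    per_stream = [
        [(stream_id, data[i:i + frame_size])
         for i in range(0, len(data), frame_size)]
        for stream_id, data in streams.items()
    ]
    frames = []
    for round_of_frames in zip_longest(*per_stream):   # one frame per stream per round
        frames.extend(f for f in round_of_frames if f is not None)
    return frames

def demultiplex(frames: list) -> dict:
    """Regroup frames by stream id, preserving order within each stream."""
    streams = {}
    for stream_id, payload in frames:
        streams[stream_id] = streams.get(stream_id, b"") + payload
    return streams

streams = {1: b"control", 2: b"a much longer data stream"}
frames = multiplex(streams, frame_size=4)
restored = demultiplex(frames)
```

Since a short stream finishes after a few rounds instead of waiting behind a long one, there is less incentive to open several connections in parallel.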

XML to the rescue?


“X” for extensible:


self-descriptive syntax


semantics by reference (doctype, namespaces)


rendering by reference (style sheets)


An XML representation is an object turned inside-out, with behavior-by-reference


However, network application performance will demand standards for domain-specific doctypes and style sheets

Future Work


Dynamic application architectures


Architectural analysis and performance bounds


Impact of future network architectures (ATM)


Balancing secure transfer with firewall visibility


Protocol for manipulating resource mappings


HTTP-NG (W3C/Xerox PARC)


rHTTP (UCI)