HTML and HTTP
Based on Chapter 32 in Computer
Networks and Internets, Comer
HTML stands for HyperText Markup Language
and HTTP stands for Hypertext Transfer
Protocol, so that raises the question: what is
hypertext?
Hypertext is “a method of storing data through a
computer program that allows a user to create and
link fields of information at will and to retrieve
the data nonsequentially.”
A link is a region on one document (page)
that, when clicked, brings up another
document for the user.
Hypertext was developed by Ted Nelson in the 1960s.
The “resources” (data or program files) are
located on many computers across an
internet or the Internet, hence this is a
distributed system.
The location of a resource is given by its
URL (Uniform Resource Locator)
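As a rough illustration of what a URL encodes, the sketch below splits one into its parts with Python's standard urllib.parse module. The helper name describe_url is made up for this example, and the URL is the La Salle address used elsewhere in these notes.

```python
from urllib.parse import urlsplit

def describe_url(url):
    """Return the main pieces of a URL (hypothetical helper for illustration)."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,    # the protocol to use, e.g. http
        "host": parts.hostname,    # the computer holding the resource
        "path": parts.path or "/", # the resource on that computer
    }

print(describe_url("http://www.lasalle.edu"))
```

The scheme says which protocol to use, the host says which computer holds the resource, and the path names the resource on that computer.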
Hypertext is generally viewed in a web browser,
an application used to locate (linked or otherwise)
web pages and display them.
Some browsers, such as Lynx, link only text.
But when most people think of browsers they
think of Netscape Navigator and/or Microsoft
Internet Explorer, which support more than just
text.
Modern browsers link information in non-text
format (graphics, sound, video, etc.) and so are
“multimedia” or “hypermedia” programs.
The browser may need a plug-in to support some
formats. A plug-in adds a particular feature or
service to a larger system.
Plug-ins are based on MIME file types.
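To make the idea of a MIME type concrete, here is a minimal sketch that uses Python's standard mimetypes module to guess the type from a file name. The file names and the helper mime_type_for are invented for illustration. A browser could use a type like this to decide which plug-in, if any, should handle a file.

```python
import mimetypes

def mime_type_for(filename):
    """Guess the MIME type from a file name (hypothetical helper)."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type

# Example file names are made up for this sketch.
for name in ("index.html", "logo.gif", "photo.jpg"):
    print(name, "->", mime_type_for(name))
```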
The first widely used multimedia browser was
Mosaic.
Marc Andreessen is credited with initiating the
development of Mosaic.
Mosaic moved the Internet out of the realm of
academics and computer hobbyists by making it
accessible to a much more general audience.
It helped the Internet maintain its exponential
growth in number of users.
[Fig. 2.1: Computers connected to the Internet vs. year]
Andreessen started Mosaic while working for the
National Center for Supercomputing Applications
(NCSA) at the University of Illinois.
Andreessen helped found Netscape
Communications, which was originally called
Mosaic Communications.
Mosaic is distinct from Netscape. In fact, Mosaic
is also licensed for commercial use and is
provided to users by some Internet access
providers.
Browsers interpret web documents,
especially HTML documents
HyperText Markup Language is an
“authoring” scheme for creating documents
for the World Wide Web.
The World Wide Web (WWW) is the
collection of resources available through
HTTP to users on the Internet.
The M in HTML stands for “Markup”
Markup refers to the sequence of characters
(or symbols) inserted in a document to
indicate how the file should look when it is
printed or displayed and/or to describe the
document's logical structure.
The markup indicators are often called
tags.
These formatting instructions must be
distinguishable from the text they are in.
In HTML, angle brackets < and > are used as
delimiters to indicate the beginning and end of a
tag.
This gives <b>bold</b> type.
As with the byte stuffing we saw in Ethernet
frames (where soh and eot were special characters),
angle brackets that are meant as ordinary text must
be replaced in an HTML document with &lt; and &gt;
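The replacement described above can be sketched with Python's standard html module. The wrapper name escape_for_html is hypothetical. html.escape also replaces the ampersand itself, since it is what introduces the replacement sequences.

```python
import html

def escape_for_html(text):
    """Replace characters that HTML would otherwise treat as markup."""
    # quote=False leaves quotation marks alone; only &, < and > are replaced.
    return html.escape(text, quote=False)

print(escape_for_html("if a < b and b > c"))
```

html.unescape reverses the substitution, much as a receiver undoes byte stuffing.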
The formatting or structure the tag indicates often
refers to an entire region, so many HTML tags
occur in pairs (heading and trailing). The trailing
tag includes a slash.
An HTML document begins with an <HTML> tag and
ends with an </HTML> tag.
An HTML document is broken into two pieces:
the head and the body.
The head is the part between the head tags <head> and
</head>.
The body is the part between the body tags <body> and
</body>.
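The paired-tag structure can be seen by feeding a tiny document to Python's standard html.parser module. This is only a sketch: the sample page and the names TagCollector and collect_tags are invented for illustration.

```python
from html.parser import HTMLParser

# A made-up minimal page with a head and a body.
SAMPLE_PAGE = """<html>
<head><title>Sample page</title></head>
<body><p>Hello, <i>world</i>.</p></body>
</html>"""

class TagCollector(HTMLParser):
    """Record leading and trailing tags in the order they appear."""

    def __init__(self):
        super().__init__()
        self.opened = []
        self.closed = []

    def handle_starttag(self, tag, attrs):
        self.opened.append(tag)

    def handle_endtag(self, tag):
        self.closed.append(tag)

def collect_tags(document):
    collector = TagCollector()
    collector.feed(document)
    collector.close()
    return collector.opened, collector.closed

opened, closed = collect_tags(SAMPLE_PAGE)
print("leading tags: ", opened)
print("trailing tags:", closed)
```

The html tag opens first and closes last, with the head and body nested inside it.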
Page from my site
There are many other tags used to format
and lay out the information in a Web page.
For instance, <P> is used to make paragraphs and
<I> … </I> is used to italicize text.
Tags are also used to specify hypertext links.
<a href="http://www.lasalle.edu">La Salle</a>
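As a sketch of how a program might find the hypertext links in a page, the example below pulls the href value out of each anchor tag using Python's standard html.parser module. The names LinkExtractor and extract_links are hypothetical.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(document):
    extractor = LinkExtractor()
    extractor.feed(document)
    extractor.close()
    return extractor.links

print(extract_links('<a href="http://www.lasalle.edu">La Salle</a>'))
```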
HTML is not the only Markup Language.
HTML has similarities to SGML (Standard
Generalized Markup Language), a generic system
for organizing and tagging elements of a
document.
GML was started by IBM and became SGML
when it was taken over by the International
Organization for Standardization (ISO).
SGML is not about formatting; it is more general.
SGML provides rules for tagging elements.
Those tags might be interpreted as formatting as is
done in HTML but can be interpreted in other
ways as well.
Compact HTML is a reduced set of HTML
used for hand-held and other devices with
limited CPU, memory, storage and so on.
The display has limited color, no jpg files
and the user moves and selects with
“buttons” instead of a mouse.
XHTML has been taking its place.
Extensible Markup Language
“Extensible” means capable of being extended,
and markup language involves tags, so XML is a
scheme in which the user can define his or her
own tags.
For example, a company may elect to designate a
social security number by placing it in tags
defined for that purpose.
This data can be transported from application to
application and system to system, carrying a
self-identifying tag with it.
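A minimal sketch of the idea, using Python's standard xml.etree.ElementTree module: the tag names employee, name and ssn, the placeholder values, and the helper read_field are all invented for illustration and are not part of any standard.

```python
import xml.etree.ElementTree as ET

# A made-up record using user-defined tags; the values are placeholders.
RECORD = "<employee><name>Pat Example</name><ssn>000-00-0000</ssn></employee>"

def read_field(xml_text, tag):
    """Return the text inside the first child element named `tag`, or None."""
    root = ET.fromstring(xml_text)
    element = root.find(tag)
    return element.text if element is not None else None

print(read_field(RECORD, "ssn"))
```

Because the data carries its own tag, any application that receives the record can pick out the field by name without knowing how it was displayed.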
Unlike HTML tags, XML tags are not necessarily
about formatting and presentation.
However, a presentation application can be
instructed to represent a certain type of data (as
identified by its XML tags) in a particular way.
On the other hand, a database interface program
can be instructed to place the information into the
Extensible Hypertext Markup Language is a
mixture of HTML and XML designed for
network display devices.
XHTML is written in XML; therefore, it is
an XML application.
ColdFusion Markup Language (CFML) is a proprietary
markup language developed by Allaire for use
with ColdFusion.
CFML is a tag-based scripting language
supporting dynamic Web page creation and
database access.
ColdFusion tags are embedded in HTML files.
The HTML tags determine the page's layout while
the CFML tags import content based on user input
or the results of a database query.
Files created with CFML have the file extension
.cfm
The Document Object Model is a set of
specifications concerning how web page objects
(such as text, images, textboxes, buttons) look and
behave.
The DOM defines the attributes and events
associated with each object, and so forth.
Dynamic HTML (DHTML) uses DOM to
dynamically alter the appearance of Web pages
after they have been downloaded (client-side).
Alas, Netscape Navigator and Microsoft
Internet Explorer use different DOMs.
This is why their implementations of
DHTML are so different.
Both companies have submitted their
DOMs to the World Wide Web Consortium
(W3C) for standardization.
HTML and other web documents are
transported across the network using HTTP
(Hypertext Transfer Protocol), originally
developed by Dr. Tim Berners-Lee.
HTTP defines rules for how messages are
formatted and transmitted, what actions are
allowed by Web servers, what actions are
allowed by clients, etc.
A Web server has an HTTP daemon that waits for
HTTP requests and handles them when they
arrive.
A Web browser is an HTTP client, sending
requests to server machines.
For example, entering a URL in the location field
of a browser (client) sends an HTTP request to the
appropriate Web server, which responds with the
requested Web page.
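To show roughly what travels over the wire, the sketch below builds the text of a simple GET request for a URL and reads the status line of a reply. It sends nothing over the network. The helper names and the canned response are invented for illustration, and a real browser sends many more header lines.

```python
from urllib.parse import urlsplit

def build_get_request(url):
    """Build the text of a minimal HTTP/1.1 GET request (illustrative only)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, resource, version
        f"Host: {parts.netloc}",  # which site on the server is wanted
        "Connection: close",
        "",                       # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines)

def parse_status_line(response_text):
    """Split the first line of a response into version, code and reason."""
    status_line = response_text.split("\r\n", 1)[0]
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason

print(build_get_request("http://www.lasalle.edu"))
# A canned reply, made up for this sketch.
print(parse_status_line("HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"))
```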
HTTP is a stateless protocol because each
command is executed independently, without any
knowledge of the commands that came before it.
This is good for keeping transmission lines available,
since there are no ongoing sessions tying up resources.
This is bad for having a web site respond in an
intelligent way to a user.
This shortcoming of HTTP is addressed in a
variety of ways, including ActiveX, Java,
JavaScript and cookies.
Most modern browsers support HTTP 1.1
Instead of opening and closing a connection for
each application request, HTTP 1.1 provides a
persistent connection that allows multiple
requests to be batched or pipelined to an output
buffer.
The underlying TCP layer can put multiple
requests (and responses to requests) into one TCP
segment.
Fewer segments, less overhead.
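The persistent-connection idea can be tried on one machine with Python's standard http.server and http.client modules. This is a sketch under stated assumptions: the handler class, the paths and the helper name are invented, the server runs only on the loopback address, and the requests are sent one after another on the same connection (connection reuse), not pipelined.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class KeepAliveHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets http.server keep the connection open
    seen_connections = set()       # one (address, port) entry per TCP connection

    def do_GET(self):
        KeepAliveHandler.seen_connections.add(self.client_address)
        body = f"you asked for {self.path}".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet

def fetch_over_one_connection(paths):
    """Request several paths over a single client connection."""
    KeepAliveHandler.seen_connections.clear()
    server = HTTPServer(("127.0.0.1", 0), KeepAliveHandler)  # port 0: any free port
    worker = threading.Thread(target=server.serve_forever, daemon=True)
    worker.start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
    try:
        bodies = []
        for path in paths:
            conn.request("GET", path)
            bodies.append(conn.getresponse().read().decode())
        return bodies, len(KeepAliveHandler.seen_connections)
    finally:
        conn.close()          # closing the client lets the handler finish
        server.shutdown()
        server.server_close()

bodies, connections_used = fetch_over_one_connection(["/a", "/b", "/c"])
print(bodies, "connections used:", connections_used)
```

All three requests arrive from the same client address and port, so the server sees a single TCP connection rather than three.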
HTTP 1.1 (Cont.)
Compression: If a browser (client) indicates
that it can decompress HTML files, then a
server compresses them for transport across
the Internet.
Standard image files are already in a
compressed format, so this improvement
applies only to HTML and other non-image
data types.
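The effect can be sketched with Python's standard gzip module. The page content and the helper name are made up. Compressing the already-compressed bytes a second time stands in for an image file that is already in a compressed format.

```python
import gzip

def compress_for_transport(payload):
    """Compress bytes the way a server might before sending them."""
    return gzip.compress(payload)

# A repetitive, made-up HTML payload; markup usually compresses well.
html_page = b"<p>Hello, Web!</p>\n" * 200
once = compress_for_transport(html_page)
twice = compress_for_transport(once)  # compressing compressed data gains little

print("original:", len(html_page), "bytes")
print("compressed once:", len(once), "bytes")
print("compressed twice:", len(twice), "bytes")
```

In practice the browser signals that it can decompress with the Accept-Encoding request header, and the server labels a compressed reply with Content-Encoding.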
Secure HTTP (S-HTTP) is an extension to the HTTP
protocol for sending data securely over the Web.
Not all browsers and servers support S-HTTP.
Another technology for secure communications
over the Web is Secure Sockets Layer (SSL).
SSL and S-HTTP have different designs and
goals. SSL is designed to establish a secure
connection between two computers; S-HTTP is
designed to send individual messages securely.
To increase speed, browsers cache web
page documents locally.
There are also cache servers, machines on
the local network that cache web page
documents.
First, the page is looked for on the local
machine, then on the local network (cache
server) and then at the remote location.
Refresh if you don’t want the cached version.
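The lookup order can be sketched with plain dictionaries standing in for the browser cache, the cache server and the remote site. Everything here (function name, parameters, sample page) is hypothetical, and real caches also check whether a stored copy is still fresh.

```python
def fetch_page(url, browser_cache, cache_server, origin, refresh=False):
    """Return (page, where it was found), checking the nearest copy first."""
    if not refresh:
        if url in browser_cache:
            return browser_cache[url], "local machine"
        if url in cache_server:
            browser_cache[url] = cache_server[url]  # remember it locally too
            return browser_cache[url], "cache server"
    # Not cached anywhere, or the user asked for a refresh: go to the source.
    page = origin[url]
    cache_server[url] = page
    browser_cache[url] = page
    return page, "remote location"

# Made-up data for the demonstration.
origin = {"http://www.lasalle.edu": "<html>home page</html>"}
browser_cache, cache_server = {}, {}

print(fetch_page("http://www.lasalle.edu", browser_cache, cache_server, origin)[1])
print(fetch_page("http://www.lasalle.edu", browser_cache, cache_server, origin)[1])
print(fetch_page("http://www.lasalle.edu", browser_cache, cache_server, origin, refresh=True)[1])
```

The first request goes to the remote location, the second is answered from the local machine, and the refresh bypasses both caches.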