Introduction to XML


Marek Podgorny and Lukasz Beca

EECS SU and CollabWorx, Inc.

Syracuse University

Fall 2002


CPS606, Fall 2002, EECS SU & CollabWorx

Markup Languages


Marking up text is a methodology for encoding data with information about itself

Yellow highlighter is a valid markup methodology

You decide which parts of the document are important

It is portable

others can benefit from your markup

Two critical properties of a valid markup:

A standard must be in place to define what a valid markup is

Above, markup is defined as a bit of yellow ink atop text

In HTML, a markup is a <font color=yellow>tag</font>

A standard must be in place to define what markup means

Yellow highlight means the highlighted text represents an important point

In HTML, each tag carries a well-defined formatting instruction


What is XML?


Like HTML, XML (Extensible Markup Language) is a markup language which relies on the concept of rule-specifying tags and the use of a tag-processing application that knows how to deal with the tags

For HTML, the application is a browser

This is because HTML is a presentation markup

For XML, the application can be anything

XML may be processed by browsers, but its application domain is huge and not even completely understood today


eXtensibility of XML


The most important technical difference between XML and HTML is that while HTML is a closed set of tags, XML is a meta-language for defining other markup languages


XML specifies the standards with which you can define
your own markup languages with their own sets of tags


This very statement makes people nervous…


We will discuss methodology to define a new language
but in practice very few people will ever write a DTD


Made-up Markup Language (MuML)

<CONTACT>
  <NAME>Kim Smith</NAME>
  <ID>027</ID>
  <COMPANY>WebtopSystems Inc.</COMPANY>
  <EMAIL>kim@webtopsystems.com</EMAIL>
  <PHONE>315 443-4868</PHONE>
  <STREET>111 College Pl</STREET>
  <CITY>Syracuse</CITY>
  <STATE>New York</STATE>
  <ZIP>13244</ZIP>
</CONTACT>

This is a chunk of well-formed XML. How is it useful?

A Netscape browser surely doesn't know what to do with it…
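
A browser may not know what to do with the record, but a small program can. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module plays the role of the processing application; the helper name contact_to_dict is made up for illustration.

```python
import xml.etree.ElementTree as ET

# The MuML contact record from the slide, as a string.
MUML = """\
<CONTACT>
  <NAME>Kim Smith</NAME>
  <ID>027</ID>
  <COMPANY>WebtopSystems Inc.</COMPANY>
  <EMAIL>kim@webtopsystems.com</EMAIL>
  <PHONE>315 443-4868</PHONE>
  <STREET>111 College Pl</STREET>
  <CITY>Syracuse</CITY>
  <STATE>New York</STATE>
  <ZIP>13244</ZIP>
</CONTACT>
"""


def contact_to_dict(xml_text):
    """Hypothetical helper: map each child tag of <CONTACT> to its text."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed
    return {child.tag: (child.text or "").strip() for child in root}


contact = contact_to_dict(MUML)
print(contact["NAME"], contact["EMAIL"])
```

Because the tags name the data, the program can ask for the e-mail address directly instead of guessing its position in a formatted page.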


How to make MuML useful?


There must be a set of rules allowing us (or a computer) to understand the syntax of the language

In XML, this information is provided to the processing application by the Document Type Definition (DTD)

The DTD specifies what it means to be a valid tag: the syntax for marking up

There must be a set of rules defining the meaning (semantics) of the markup

To specify what valid tags mean, XML documents are also associated with style sheets, which provide GUI instructions for a processing application like a web browser.

Note that other application domains of XML might do without a style sheet

e.g., an application using XML as an object serialization technique
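
The serialization case can be illustrated without any style sheet. This is a sketch under stated assumptions: the Contact class, its three fields, and the to_xml/from_xml helpers are invented for the example and are not part of any standard.

```python
from dataclasses import dataclass, fields
import xml.etree.ElementTree as ET


@dataclass
class Contact:
    """Hypothetical application object to be serialized."""
    name: str
    email: str
    city: str


def to_xml(contact):
    """Write each field as an upper-case child element of <CONTACT>."""
    root = ET.Element("CONTACT")
    for f in fields(contact):
        ET.SubElement(root, f.name.upper()).text = getattr(contact, f.name)
    return ET.tostring(root, encoding="unicode")


def from_xml(xml_text):
    """Rebuild the object from the element names and their text."""
    root = ET.fromstring(xml_text)
    values = {child.tag.lower(): child.text or "" for child in root}
    return Contact(**values)


original = Contact("Kim Smith", "kim@webtopsystems.com", "Syracuse")
xml_text = to_xml(original)
restored = from_xml(xml_text)
```

No display rules are involved here: the XML is only a transport format between two programs.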


Style Sheet Pseudo-Code

Anytime you see a <CONTACT>, display it using a <UL> tag. </CONTACT> tags should be converted to </UL>

All <NAME> tags can be replaced with <LI> tags and </NAME> tags should be replaced with </LI>

All <EMAIL> tags can be replaced with <LI> tags and </EMAIL> tags should be ignored

The style sheet uses the functionality of HTML to define the formatting of MuML.

For non-browser apps, the HTML translation is irrelevant

The processing application combines the logic of the style sheet, the DTD, and the data of the MuML document, and displays it according to the rules and the data.

So instead of simple HTML we got three different chunks. Why the pain?
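
The pseudo-code rules can be sketched as a small program. This is an illustration only, not an XSL processor: the RULES table and the contact_to_html function are hypothetical, and unlike the pseudo-code the sketch closes every <LI> explicitly.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical rule table: which MuML tags become which HTML tags.
RULES = {"NAME": "LI", "EMAIL": "LI"}

MUML = """<CONTACT>
  <NAME>Kim Smith</NAME>
  <ID>027</ID>
  <EMAIL>kim@webtopsystems.com</EMAIL>
</CONTACT>"""


def contact_to_html(xml_text):
    """Apply the style rules: <CONTACT> becomes <UL>, NAME/EMAIL become <LI>."""
    contact = ET.fromstring(xml_text)
    if contact.tag != "CONTACT":
        raise ValueError("expected a <CONTACT> document")
    lines = ["<UL>"]
    for child in contact:
        html_tag = RULES.get(child.tag)
        if html_tag is None:
            continue  # no rule for this tag, so it is not displayed
        lines.append(f"  <{html_tag}>{escape(child.text or '')}</{html_tag}>")
    lines.append("</UL>")
    return "\n".join(lines)


html_list = contact_to_html(MUML)
print(html_list)
```

Changing the presentation means editing RULES; the MuML data stays untouched.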



Complex XML World


We need a processing agent which will put together
the DTD, the style sheet, and the data


Note: Web browsers are barely up to the task yet

Formal definition:

"A software module called an XML processor is used to read XML documents and provide access to their content and structure. It is assumed that an XML processor is doing its work on behalf of another module, called the application."

And this is not all yet…




Build your own ColdFusion?


XML allows each specific industry to develop its own tag
sets to meet its unique needs


Doesn’t force everyone's browser to incorporate zillions of tag sets,
or developers to settle for a tag set that is too generic to be useful


Compelling? Well…


The real power of XML:


Not only can you define your own set of tags, but the rules specified by those tags are not limited to formatting rules

XML allows you to define all sorts of tags with all sorts of rules

tags representing business rules or tags representing data description or data relationships.

As these tags are reflected in the DOM, you can do computation on documents!
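
A small sketch of "computation on documents", assuming Python's standard xml.dom.minidom as the DOM implementation. The <order> document, its qty attribute, and the order_total function are invented for the example.

```python
from xml.dom.minidom import parseString

# Hypothetical order document; tag and attribute names are made up.
ORDER = """<order>
  <item qty="2"><price>40</price></item>
  <item qty="1"><price>150</price></item>
</order>"""


def order_total(xml_text):
    """Walk the DOM tree and sum quantity times price over all items."""
    dom = parseString(xml_text)
    total = 0
    for item in dom.getElementsByTagName("item"):
        qty = int(item.getAttribute("qty"))
        price_node = item.getElementsByTagName("price")[0]
        price = int(price_node.firstChild.data)  # text node inside <price>
        total += qty * price
    return total


total = order_total(ORDER)
print(total)
```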



Why are HTML's days numbered?

The GUI is embedded in the data.

What happens if you decide that you like a table-based presentation better than a list-based presentation?

Searching for information in the data is tough

The data is tied to the logic and language of HTML and hence to browsers

What if I want to use my data in a Java applet?

HTML: <LI>State: Ohio
      <LI>State: Oregon

XML: <state>Ohio</state>
     <state>Oregon</state>

How do I find all records for Ohio?

What is the relationship of Ohio and Oregon?


HTML Search in Action


Long Live XML!


With XML, the GUI and data are divorced

Thus, changes to display do not require messing with the data: a separate style sheet will specify a table display or a list display

Searching the data is easy and efficient

Search engines can parse description-bearing tags rather than muddling in the data. Tags provide them with the intelligence they otherwise lack

Complex relationships (trees, inheritances, classes) can be communicated

The code is much more legible to a lay person

It is obvious that <ID>911</ID> represents an ID whereas <LI>911 might not. XML is self-describing
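
Searching by tag can be sketched as follows, assuming Python's standard xml.etree.ElementTree and its limited XPath support. The <records> document and the names_in_state function are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical data set built around the <state> tag from the earlier slide.
RECORDS = """<records>
  <record><name>Ann</name><state>Ohio</state></record>
  <record><name>Bob</name><state>Oregon</state></record>
  <record><name>Cid</name><state>Ohio</state></record>
</records>"""


def names_in_state(xml_text, state):
    """Return the names of all records whose <state> child equals `state`.

    The state value is pasted into the path expression, so this sketch
    assumes it contains no quote characters.
    """
    root = ET.fromstring(xml_text)
    matches = root.findall(f"./record[state='{state}']")
    return [rec.findtext("name") for rec in matches]


ohio = names_in_state(RECORDS, "Ohio")
print(ohio)
```

The query names the field it is looking for; with the <LI>State: Ohio form, the same search would have to rely on matching display text.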


Why isn’t it there if it is so good?


No XML applications…


IE 5.0 provides some support for XSL and XML if output
is HTML


Netscape 5.0 (Mozilla) also implements support for XML
but not for XSL


A quote:


"XML isn't about display -- it's about structure. This has implications that make the browser question secondary. So the whole issue of what is to be displayed and by what means is intentionally left to other applications. You can target the same XML (with different XSL) for different devices (standard web browser, palm pilot, printer, etc.). You should not get the impression that XML is useless until browsers support it. This is definitely not true -- we are using it at NASA in ways where no browser plays any role."

- Ken Sall, NASA IT Manager


XML Design Goals


Enable better search algorithms (metadata)


Enable presentation of various views for same data


Integrate data from different sources


Provide easy use over the Internet


Create documents readable even by humans


Support data interchange


Enable easy development of document processing
applications



XML: Summary

Extensible Markup Language: a subset of Standard Generalized Markup Language (SGML)

Universal format for describing structured data on the Web

Specification developed by the World Wide Web Consortium (W3C), supervised by the XML Working Group

Applications of XML



XML languages


XML protocols


Support for XML


Client side


Server side


XML and databases


Data interchange





XML Deployment


XML is a basis for the development of industry language and protocol standards

Corporations and academic organizations form special organizations (consortia or forums) in order to develop standards for whole branches of industry. Examples: the World Wide Web Consortium and the WAP Forum


Extensible HyperText Markup Language (XHTML)

XML-based syntax

Extensibility through XHTML modules allows the combination of existing and new feature sets when developing content and when designing new user agents (web browsers, portable devices, etc.)

Examples of modules:

required modules: structure, basic text, hypertext, lists

optional modules: presentation, forms, tables, images, stylesheets, applets, frames, etc.

XHTML is designed with general user agent interoperability in mind: XHTML documents should be displayable on any type of XHTML-compliant device

Current version: XHTML™ 1.0; DTD specification available at the http://www.w3.org site
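
Because XHTML uses XML-based syntax, a generic XML parser can read it. A minimal sketch, assuming Python's standard xml.etree.ElementTree; the sample page and the helper names are made up, and only well-formedness is checked here, not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"  # the XHTML namespace URI

# Hypothetical page; note the closed <li> elements and the empty <br/>.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Parts</title></head>
  <body>
    <ul>
      <li>window</li>
      <li>muffler</li>
    </ul>
    <p>Updated<br/>today</p>
  </body>
</html>"""


def list_items(xhtml_text):
    """Collect the text of every <li> element in the XHTML namespace."""
    root = ET.fromstring(xhtml_text)
    return [li.text for li in root.iter(f"{{{XHTML_NS}}}li")]


def is_well_formed(text):
    """True if an XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


items = list_items(PAGE)
```

Old-style HTML such as <ul><li>a<li>b</ul> is rejected by is_well_formed, which is the practical difference the XML-based syntax makes.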



Synchronized Multimedia Integration Language (SMIL)

SMIL allows developers to combine media streams so that they are presented together and synchronized with each other

For example, a SMIL document can specify:

the position where the visual content appears in the player

when audio or video (or another type of stream) starts and stops playing

Users need a special player to view SMIL documents

Products supporting SMIL: RealNetworks RealPlayer, Apple QuickTime

See http://www.empirenet.com/~joseram/smil_intro/smil_intro.html for a tutorial about SMIL written in SMIL

Current version: SMIL 1.0; specification available at the http://www.w3.org site
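
The two things a SMIL document specifies, position and timing, can be sketched by generating a tiny presentation. This is an illustration that has not been checked against the SMIL 1.0 DTD: the file names, region size, and the build_smil function are assumptions, while the element names (smil, head, layout, region, body, par, video, audio) follow SMIL 1.0.

```python
import xml.etree.ElementTree as ET


def build_smil(video_src, audio_src, audio_begin="2s"):
    """Build a SMIL tree: one region for video, audio starting later in parallel."""
    smil = ET.Element("smil")
    layout = ET.SubElement(ET.SubElement(smil, "head"), "layout")
    ET.SubElement(layout, "root-layout", width="320", height="240")
    # Position: where the visual content appears in the player.
    ET.SubElement(layout, "region", id="main",
                  left="0", top="0", width="320", height="240")
    body = ET.SubElement(smil, "body")
    par = ET.SubElement(body, "par")  # children of <par> play in parallel
    ET.SubElement(par, "video", src=video_src, region="main")
    # Timing: the audio stream starts after the given offset.
    ET.SubElement(par, "audio", src=audio_src, begin=audio_begin)
    return smil


doc = build_smil("intro.mpg", "narration.wav")  # hypothetical media files
smil_text = ET.tostring(doc, encoding="unicode")
print(smil_text)
```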


Wireless Application Protocol (WAP) and Wireless Markup Language (1)

Forecast number of users of wireless services by 2001: 530 million

Current and upcoming devices have multimedia capabilities: sending and receiving e-mail, accessing the Internet

Wireless Application Protocol: a standard for the presentation and delivery of wireless information and telephony on mobile phones and other wireless terminals

handset manufacturers that represent 90 percent of the world market support this standard

Wireless Markup Language (WML): part of the standard, designed to describe information to be presented on small displays

WML documents can be accessed over the Internet using the standard HTTP protocol

traditional servers can be used for hosting WML documents
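
Serving WML from an ordinary HTTP server mostly comes down to sending the right content type. A minimal sketch, assuming a deck of cards with one paragraph each; build_deck and http_response are hypothetical helpers, and the response is only assembled here, not sent over a network.

```python
import xml.etree.ElementTree as ET

WML_CONTENT_TYPE = "text/vnd.wap.wml"  # MIME type registered for WML decks


def build_deck(cards):
    """Build a <wml> deck from (id, title, text) tuples, one <card> each."""
    wml = ET.Element("wml")
    for card_id, title, text in cards:
        card = ET.SubElement(wml, "card", id=card_id, title=title)
        ET.SubElement(card, "p").text = text
    return wml


def http_response(deck):
    """Return the headers and body a web server would send for this deck."""
    body = ET.tostring(deck, encoding="utf-8")
    headers = {
        "Content-Type": WML_CONTENT_TYPE,
        "Content-Length": str(len(body)),
    }
    return headers, body


deck = build_deck([("parts", "Parts", "window: in stock")])  # sample content
headers, body = http_response(deck)
```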



Simple Object Access Protocol (SOAP)



Support for Remote Procedure Call and messaging mechanisms over various protocols (for example, HTTP), implemented in XML

Describes conventions for definition of:

method calls

method parameters

results of method calls

serialization mechanisms for encoding application-defined data types

Since SOAP messages can be transported over the HTTP protocol, the currently deployed Web infrastructure becomes one distributed computing platform (distributed objects can be placed on HTTP servers)

Current version: SOAP 1.1 (status: Note); specification available at the http://www.w3.org site
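
A method call and its parameters can be sketched as a SOAP 1.1 envelope. The envelope namespace is the one defined by SOAP 1.1; the application namespace urn:example:parts, the GetPrice method, and the partId parameter are invented for the example, and details such as the encodingStyle attribute and the HTTP SOAPAction header are left out.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
APP_NS = "urn:example:parts"  # hypothetical application namespace


def build_call(method, params):
    """Serialize a method call: Envelope > Body > method element > parameters."""
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_call(xml_text):
    """Recover the method name and its parameters on the receiving side."""
    envelope = ET.fromstring(xml_text)
    call = envelope.find(f"{{{SOAP_ENV}}}Body")[0]
    method = call.tag.split("}", 1)[1]  # drop the "{namespace}" prefix
    return method, {param.tag: param.text for param in call}


request = build_call("GetPrice", {"partId": "p001"})
method, params = parse_call(request)
```

The resulting string is what would travel in the body of an HTTP POST.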


Support for XML in Web Browsers


Internet Explorer 5.0+


Extensible Markup Language


Extensible Stylesheet Language


Cascading Stylesheets


Document Object Model


Data Islands


Mozilla 5.0


Extensible Markup Language


Cascading Stylesheets


Document Object Model


Graphical User Interface built using XUL (XML User Interface Language): users can provide their own user interface documents to customize the layout of the browser


Microbrowsers for portable devices


Wireless Markup Language


Support for XML on Server Side


Web servers can host XML documents


XML documents can be dynamically generated by
servlets, JSP pages, and ASP pages


XML adapters allow translation from application
specific formats to XML


XML documents can be stored in databases for fast
retrieval


Enterprise applications with XML processing
functionality can be easily built using available XML
parser components and XSL processors



XML Document and Database (1)

Part Name   Part ID   Price   InStock
window      001       $40     yes
muffler     002       $150    yes
door        003       $30     no

Information stored in database


XML Document and Database (2)

<store>
  <part id="p001">
    <part-name>window</part-name>
    <price>40</price>
    <instock>yes</instock>
  </part>
  <part id="p002">
    <part-name>muffler</part-name>
    <price>150</price>
    <instock>yes</instock>
  </part>
</store>

The same information represented as an XML document
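
The step from database rows to the XML document can be sketched with an in-memory SQLite database. The table name parts, its column names, and the table_to_xml function are assumptions made for the example; the element names follow the slide.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Rows from the parts table on the previous slide.
ROWS = [
    ("window", "001", 40, "yes"),
    ("muffler", "002", 150, "yes"),
    ("door", "003", 30, "no"),
]


def table_to_xml(conn):
    """Turn every row of the hypothetical `parts` table into a <part> element."""
    store = ET.Element("store")
    query = "SELECT part_name, part_id, price, instock FROM parts ORDER BY part_id"
    for name, part_id, price, instock in conn.execute(query):
        part = ET.SubElement(store, "part", id=f"p{part_id}")
        ET.SubElement(part, "part-name").text = name
        ET.SubElement(part, "price").text = str(price)
        ET.SubElement(part, "instock").text = instock
    return store


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parts (part_name TEXT, part_id TEXT, price INTEGER, instock TEXT)"
)
conn.executemany("INSERT INTO parts VALUES (?, ?, ?, ?)", ROWS)
store = table_to_xml(conn)
xml_text = ET.tostring(store, encoding="unicode")
```

The part ID is stored as text so that the leading zeros survive the round trip.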


Data Interchange


One of the most costly aspects of Enterprise Application Integration: conversion of proprietary data formats to other data formats

XML: a new data interchange standard

Information handled by different applications and data sources can be converted into XML to provide a uniform data format

Using XML:

applications can exchange data easily

application-specific data can be used on the Internet
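
The interchange idea can be sketched with two made-up sources, one exporting CSV and one exporting JSON with different field names, both converted into the same XML vocabulary. All names below (the field names, from_csv, from_json, records_to_xml) are hypothetical.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Two hypothetical applications exporting the same kind of data differently.
CSV_DATA = "id,name,price\np001,window,40\n"
JSON_DATA = '[{"partId": "p002", "title": "muffler", "cost": 150}]'


def from_csv(text):
    """Adapter for the CSV source: normalize rows to id/name/price dicts."""
    rows = csv.DictReader(io.StringIO(text))
    return [{"id": r["id"], "name": r["name"], "price": int(r["price"])} for r in rows]


def from_json(text):
    """Adapter for the JSON source, which uses different field names."""
    return [
        {"id": item["partId"], "name": item["title"], "price": item["cost"]}
        for item in json.loads(text)
    ]


def records_to_xml(records):
    """Emit the uniform XML format shared by all applications."""
    store = ET.Element("store")
    for rec in records:
        part = ET.SubElement(store, "part", id=rec["id"])
        ET.SubElement(part, "part-name").text = rec["name"]
        ET.SubElement(part, "price").text = str(rec["price"])
    return store


merged = records_to_xml(from_csv(CSV_DATA) + from_json(JSON_DATA))
xml_text = ET.tostring(merged, encoding="unicode")
```

Each source needs one adapter into XML rather than one converter per pair of formats.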