XML

tacitmarigold, Internet and Web Development

Jan 25, 2014

XML

1/25/2014

1



XML Introduction


eXtensible Markup Language

A language for defining language syntax and language use.

An element is a pair of start and end tags.

Well-formed when start and end tags match.

Hierarchical, elements are nested within others.

Human readable and writeable.



XML Applications


XML (eXtensible Markup Language) is a language for
defining and representing languages.

One key use is to define a data structure and data within
the structure, useful for exchanging data between different
applications and computer systems.

Because XML is plain text, it is readable and writeable by humans
and is also computer architecture neutral.


Many database systems such as Oracle can generate
entries from a database in XML form for use by other
applications that understand XML without regard to the
computer system word size, etc.


XML use is likely to grow in areas where multiple
computer systems must interact and exchange data.




Hierarchical Structure


XML documents must have a strictly hierarchical tag
structure.


That is, start tags must have corresponding end tags.


In XML vocabulary, a pair of start and end tags is
called an element.

Every element except the outermost (root) element must be properly nested within another.

The following snippet is well-formed because the <To> start tag
has a matching </To> end tag.

<To>47150</To>
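As a rough illustration of the well-formedness rule, the sketch below hands a snippet to a parser and reports whether it was accepted. It assumes Python's standard `xml.etree.ElementTree` module, which is not part of the original slides; `is_well_formed` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Raised for a missing or mismatched end tag, among other errors.
        return False

print(is_well_formed("<To>47150</To>"))  # start tag has a matching end tag
print(is_well_formed("<To>47150"))       # end tag is missing
```

The parser only checks well-formedness here; it does not check the document against any grammar.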





Example - Hierarchical Structure

<package>
    <To>47150</To>
    <From>47165</From>
    <Weight>17.0</Weight>
    <Rate>27.50</Rate>
</package>


Example - Nested Structure

<USPS>
    <package>
        <To>47150</To>
        <From>90210</From>
        <Weight>25.0</Weight>
        <Rate>43.50</Rate>
    </package>
    <package>
        <To>47150</To>
        <From>47165</From>
        <Weight>17.0</Weight>
        <Rate>27.50</Rate>
    </package>
</USPS>
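One way to see the nesting is to walk it from a program. The sketch below is an illustration only: it assumes Python's standard `xml.etree.ElementTree` module, and `package_rates` is a hypothetical helper, not something from the slides.

```python
import xml.etree.ElementTree as ET

USPS_XML = """<USPS>
  <package>
    <To>47150</To><From>90210</From><Weight>25.0</Weight><Rate>43.50</Rate>
  </package>
  <package>
    <To>47150</To><From>47165</From><Weight>17.0</Weight><Rate>27.50</Rate>
  </package>
</USPS>"""

def package_rates(root):
    """Return a (From, Rate) pair for each <package> directly under root."""
    return [(package.findtext("From"), float(package.findtext("Rate")))
            for package in root.findall("package")]

root = ET.fromstring(USPS_XML)
rates = package_rates(root)
print(rates)
print(sum(rate for _, rate in rates))  # total of both rates
```

Each `<package>` is a child of `<USPS>`, and each `<To>`, `<From>`, `<Weight>`, `<Rate>` is a child of its `<package>`, which is why the lookup proceeds one level at a time.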


Exercise 1

1. Give XML that defines a Person with your age and name as elements.
2. Diagram the hierarchical structure (parse tree) defined by the XML.
3. Give XML that defines a Family of Person.
4. Diagram the hierarchical structure (parse tree) defined by the XML.



XML Parsing


Parsing XML:
    determines whether the XML is well-formed or valid,
    allows the programmer to access and manipulate elements.

XML parsers are available for the IE4 browser, Java, C++, ...

Within the IE browser, JavaScript parses XML using an XML parser object.

Two general types of XML parsers:

SAX parsers:
    generate events as each element is parsed, which the programmer must handle,
    do not build a parse tree in memory,
    are useful for large structures that cannot be maintained in memory.

DOM parsers:
    build a complete parse tree that the programmer can manipulate,
    are simple to use,
    can require large resources to hold the complete tree.


XML Parsers


Many languages and applications such as Web browsers
can parse XML to extract the data from the structure.

The following brief example illustrates how XML,
HTML, and JavaScript would be used to define and
access the data structure containing shipping
information.

The results can only be viewed in an XML-enabled
browser, such as IE4 and above, that supports the
Microsoft XML objects.




SAX and DOM Parsers


SAX parsers are event driven and do not build a parse
tree from the XML, so the structure and data related to
an event are only available when the parse event occurs.
The user of the parser must decide what part of the parse
to store when a parse event occurs for later
manipulation.
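A minimal sketch of the event-driven style, assuming Python's standard `xml.sax` module rather than any parser named in the slides. The handler class `RateHandler` is hypothetical; it stores only the text of `<Rate>` and lets every other event pass by.

```python
import xml.sax

PACKAGE_XML = (b"<package><To>47150</To><From>47165</From>"
               b"<Weight>17.0</Weight><Rate>27.50</Rate></package>")

class RateHandler(xml.sax.ContentHandler):
    """Keeps the text inside <Rate>; all other parse events are ignored."""

    def __init__(self):
        super().__init__()
        self.in_rate = False
        self.rate_text = ""

    def startElement(self, name, attrs):
        if name == "Rate":
            self.in_rate = True

    def characters(self, content):
        # Text may arrive in several pieces, so accumulate it.
        if self.in_rate:
            self.rate_text += content

    def endElement(self, name):
        if name == "Rate":
            self.in_rate = False

handler = RateHandler()
xml.sax.parseString(PACKAGE_XML, handler)
print("Rate: $" + handler.rate_text)
```

No tree exists after the parse finishes; whatever the handler did not save is gone, which is the trade-off described above.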


DOM parsers are not event driven but build a complete
parse tree of the XML that can be accessed by the parser
user with parser methods similar to those in
the JavaScript and XML example (e.g.
xmlDoc.documentElement.childNodes.item(0).text).
DOM parsers are also generally simpler to use. Their
main drawback is the resource requirement for
constructing and storing the full parse tree when the
XML document is large.
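For comparison, here is a sketch of the same kind of access with a DOM parser. It assumes Python's standard `xml.dom.minidom`, not the Microsoft parser object; minidom has no `.text` property, so `text_of` is a hypothetical helper that stands in for it.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<package><To>47150</To><From>47165</From>"
    "<Weight>17.0</Weight><Rate>27.50</Rate></package>")

def text_of(element):
    """Join the text nodes directly under element."""
    return "".join(node.data for node in element.childNodes
                   if node.nodeType == node.TEXT_NODE)

# The whole tree is in memory, so children can be visited in any order.
children = doc.documentElement.childNodes
print(children[0].nodeName, text_of(children[0]))  # first child, <To>
print(children[3].nodeName, text_of(children[3]))  # fourth child, <Rate>
```

Child positions are counted from 0, so index 3 refers to the fourth child.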



Parse of XML in JavaScript

<SCRIPT LANGUAGE="JavaScript">
    // Microsoft XML DOM parser object; available only in Internet Explorer.
    var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
    xmlDoc.loadXML(
        '<package>' +
        '<To>47150</To>' +
        '<From>47165</From>' +
        '<Weight>17.0</Weight>' +
        '<Rate>27.50</Rate>' +
        '</package>');
    // Child nodes are numbered from 0, so item(3) is the fourth child, <Rate>.
    document.write("Rate: $" + xmlDoc.documentElement.childNodes.item(3).text + "<br>");
</SCRIPT>

Browser Result

Rate: $27.50


JavaScript Parse Explanation


JavaScript is embedded within an HTML file interpreted by a
browser. The syntax below has the following meaning:

<SCRIPT LANGUAGE="JavaScript"> - Whatever follows, up to the </SCRIPT>, is JavaScript.

var xmlDoc = new ActiveXObject("Microsoft.XMLDOM"); - Defines an XML DOM parser object.

document.write("Rate: $" + xmlDoc.documentElement.childNodes.item(3).text + "<br>"); - Writes to the browser display as HTML.

xmlDoc.documentElement.childNodes.item(3).text
    xmlDoc - the complete XML document.
    documentElement - the top level of the XML document, <package>.
    childNodes - the nodes <To>, <From>, <Weight>, <Rate> below the top level <package>.
    item(3) - the <Rate> child node (items are numbered from 0).
    item(3).text - 27.50, the text of child node <Rate>.





XML Grammar

1. Hierarchical element structure - start tags must have corresponding end tags.
2. Case sensitivity
3. Extensible - Extend XML by creating new tags, defined with EBNF-style rules in a Document Type Definition (DTD).
4. Empty - Defines tags with no content.
5. Quoted attribute values


Hierarchical Element Structure


XML documents must have a strictly hierarchical tag
structure; each start tag must have a corresponding end tag.

In XML vocabulary, a start and end tag pair is called an
element. An element can be nested within another.

The snippet below is not well-formed; an <Order>
start tag requires an </Order> end tag.

<Order>2 Tacos

The snippet below is well-formed:

The <Order><Option1> start tags imply that Option1 is nested
within Order, so Option1 should end before Order, as in
</Option1></Order>.

<Order>2 Tacos
    <Option1>Cheese</Option1>
</Order>
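The nesting rule amounts to a stack discipline: the most recently opened element must be the first one closed. The sketch below shows that idea only; it is a deliberately simplified checker written for this illustration (it ignores comments, CDATA sections, and processing instructions), and `check_nesting` is a hypothetical name.

```python
import re

# Matches a start tag, an end tag, or an empty tag such as <Option3/>.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^<>]*?(/?)>")

def check_nesting(text):
    """Return None if tags nest properly, else a short description of the problem."""
    stack = []
    for match in TAG.finditer(text):
        closing, name, empty = match.groups()
        if empty:
            continue                  # an empty tag opens and closes at once
        if not closing:
            stack.append(name)        # start tag: remember it
        elif not stack or stack.pop() != name:
            return "unexpected </%s>" % name
    return "missing </%s>" % stack[-1] if stack else None

print(check_nesting("<Order>2 Tacos"))
print(check_nesting("<Order>2 Tacos<Option1>Cheese</Option1></Order>"))
print(check_nesting("<Order><Option1>Cheese</Order></Option1>"))
```

Because names are compared exactly, a pair such as `<To>` and `</to>` is also reported as a mismatch, which is consistent with XML tags being case-sensitive.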


Empty Tags


Empty tags are also allowed as elements in
XML documents.

An empty tag is essentially a start and end
tag in one, and is identified by a trailing slash
after the tag name.

For example, this is well-formed XML:

<Order>2 Tacos
    <Option1>Cheese</Option1>
    <Option3/>
</Order>
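To check the claim that an empty tag is a start and end tag in one, the two spellings can be parsed and compared. This is a small sketch assuming Python's standard `xml.etree.ElementTree` module; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring("<Option3/>")
long_form = ET.fromstring("<Option3></Option3>")

for element in (short_form, long_form):
    # Neither form carries text or child elements.
    print(element.tag, element.text, len(element))

# Both forms serialize to the same bytes once parsed.
print(ET.tostring(short_form) == ET.tostring(long_form))
```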




Attribute Values


All attribute values must be within single or
double quotes.

The following is not well-formed. Note the missing
quotes around the value Ray of attribute Nickname.

<Name Nickname=Ray>Raymond Wisman</Name>

But these are well-formed:

<Name Nickname="Ray">Raymond Wisman</Name>
<a href="http://www.whitehouse.gov">George Bush</a>
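The quoting rule can be observed directly by parsing the examples. The sketch below assumes Python's standard `xml.etree.ElementTree` module; `unquoted_ok` is just an illustrative flag.

```python
import xml.etree.ElementTree as ET

# Double and single quotes are both accepted.
quoted = ET.fromstring('<Name Nickname="Ray">Raymond Wisman</Name>')
single = ET.fromstring("<Name Nickname='Ray'>Raymond Wisman</Name>")
print(quoted.attrib)

try:
    ET.fromstring("<Name Nickname=Ray>Raymond Wisman</Name>")
    unquoted_ok = True
except ET.ParseError as error:
    # The parser stops at the unquoted value.
    unquoted_ok = False
    print("not well-formed:", error)
```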


Case Sensitive and Extensible


Case sensitivity - XML tags are case-sensitive.

Extensible - Extend XML by creating new tags.

To create new tags, you must define, or
constrain, them by writing grammar rules,
which the tags must obey.

One means of defining new tags is by a
grammar, a Document Type Definition (DTD),
written using EBNF.




Document Type Definition (DTD)

A DTD is a grammar that describes what tags
and attributes are valid in an XML document,
and in what context they are valid.

EBNF is used to define the allowable XML
constructs. EBNF uses production rules
where the left side represents a construct,
and the right side defines what that construct
can contain.

Without a DTD, an XML document can be
checked for well-formedness, but not for
validity.


DTD Grammar Definition

Element definition    What it means
A?                    Matches A or nothing; optional A.
A+                    Matches one or more occurrences of A.
A*                    Matches zero or more occurrences of A.
A | B                 Matches A or B but not both.
A , B                 Matches A followed by B, in that order.
(A, B)+               Parenthesized expression treated as a unit. Matches one or more occurrences of (A followed by B).
EMPTY                 EMPTY Element Content Model.

Element Declaration: <!ELEMENT AorBB (A | B+)>
    Type: ELEMENT
    Element Content Model: (A | B+)


DTD Example

<!ELEMENT AorBB (A | B+)>

defines a rule where:

<AorBB> <A/> </AorBB>
<AorBB> <B/> </AorBB>
<AorBB> <B/> <B/> <B/> <B/> </AorBB>

are valid but:

<AorBB> <A/> <B/> </AorBB>
<AorBB> <A/> <A/> <A/> </AorBB>

are invalid.



DTD Example

<!ELEMENT AandBB (A , B+)>

defines a rule where:

<AandBB> <A/> <B/> </AandBB>
<AandBB> <A/> <B/> <B/> <B/> </AandBB>

are valid but:

<AandBB> <A/> </AandBB>
<AandBB> <B/> </AandBB>
<AandBB> <B/> <A/> </AandBB>
<AandBB> <B/> <B/> </AandBB>

are invalid.
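The content-model operators (`?`, `+`, `*`, `|`, `,` and parentheses) behave much like regular-expression operators applied to the sequence of child element names. The sketch below is one way to exploit that, not how a real validating parser works: `model_to_regex` and `children_match` are hypothetical helpers, and mixed content, `ANY`, and attribute rules are not handled.

```python
import re

def model_to_regex(model):
    """Translate a content model such as '(A | B+)' into a compiled regex."""
    if model.strip() == "EMPTY":
        return re.compile("")                      # only the empty sequence matches
    pattern = re.sub(r"\s+", "", model)            # spaces carry no meaning
    # Each element name becomes a group matching that name plus a separator.
    pattern = re.sub(r"[A-Za-z_][\w.-]*",
                     lambda m: "(?:%s )" % re.escape(m.group()), pattern)
    return re.compile(pattern.replace(",", ""))    # ',' is plain concatenation

def children_match(model, child_names):
    """Check a list of child element names against a content model."""
    sequence = "".join(name + " " for name in child_names)
    return model_to_regex(model).fullmatch(sequence) is not None

print(children_match("(A | B+)", ["B", "B", "B", "B"]))
print(children_match("(A | B+)", ["A", "B"]))
print(children_match("(A , B+)", ["A", "B", "B", "B"]))
print(children_match("(A , B+)", ["B", "A"]))
```

The child names passed in correspond to the elements listed inside `<AorBB>` and `<AandBB>` in the examples above.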



Parse Tree Examples

<!ELEMENT AorB (a | b)*>
<!ELEMENT Bs ((Bs, b) | b)>
<!ELEMENT a EMPTY>
<!ELEMENT b EMPTY>

<AorB><a/><b/><a/></AorB>

AorB
| a
| b
| a

<Bs><Bs><Bs><b/></Bs><b/></Bs><b/></Bs>

Bs
| Bs
| | Bs
| | | b
| | b
| b


Parse Tree Examples

<!ELEMENT AandB (a, b)*>
<!ELEMENT Bs ((b,Bs) | b)>
<!ELEMENT a EMPTY>
<!ELEMENT b EMPTY>

<AandB><a/><b/><a/><b/></AandB>

AandB
| a
| b
| a
| b

<Bs><b/><Bs><b/><Bs><b/></Bs></Bs></Bs>

Bs
| b
| Bs
| | b
| | Bs
| | | b
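The indented parse trees can be produced mechanically by visiting each element and its children in order. The sketch below assumes Python's standard `xml.etree.ElementTree` module; `tree_lines` is a hypothetical helper that uses the same `| ` prefix per nesting level as the trees above.

```python
import xml.etree.ElementTree as ET

def tree_lines(element, indent=""):
    """Return one line per element: its name, prefixed by '| ' per nesting level."""
    lines = [indent + element.tag]
    for child in element:
        lines.extend(tree_lines(child, indent + "| "))
    return lines

left = ET.fromstring("<Bs><Bs><Bs><b/></Bs><b/></Bs><b/></Bs>")
right = ET.fromstring("<Bs><b/><Bs><b/><Bs><b/></Bs></Bs></Bs>")
print("\n".join(tree_lines(left)))
print()
print("\n".join(tree_lines(right)))
```

The two documents contain the same three `b` leaves; only the side on which the nested `Bs` appears differs, following the `(Bs, b)` and `(b, Bs)` rules.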


Exercise 2

1. State in English and BNF the following rules:
    a) <!ELEMENT rule1 (a,b)>
    b) <!ELEMENT rule2 ((a | b), b)>
    c) <!ELEMENT rule3 (a*,b+)>
    d) <!ELEMENT rule4 ((rule4, b) | b)>
    e) <!ELEMENT a EMPTY>
    f) <!ELEMENT b EMPTY>

2. Which of the following are valid under the rules?
    a) <rule1> <a/> </rule1>
    b) <rule2> <b/> <a/> </rule2>
    c) <rule3> <a/> <b/> <b/> </rule3>
    d) <rule4> <b/> <b/> <b/> </rule4>
    e) <rule4> <rule4> <rule4><b/></rule4> <b/></rule4> <b/> </rule4>

3. Give the parse trees.
    a) <rule1> <a/> <b/> </rule1>
    b) <rule3> <b/> <b/> </rule3>
    c) <rule4> <rule4> <rule4><b/></rule4> <b/></rule4> <b/></rule4>


DTD Example - <exp> Language Definition

<!ELEMENT exp ( (exp, plus, exp) | (exp, times, exp) | (lparen, exp, rparen) | a | b | c )>
<!ELEMENT a EMPTY>
<!ELEMENT b EMPTY>
<!ELEMENT c EMPTY>
<!ELEMENT plus EMPTY>
<!ELEMENT times EMPTY>
<!ELEMENT lparen EMPTY>
<!ELEMENT rparen EMPTY>

<exp> ::= <exp> + <exp> | <exp> * <exp> | ( <exp> ) | a | b | c

The following are part of the <exp> language.

a

<exp> <a/> </exp>

a + b

<exp>
    <exp><a/></exp>
    <plus/>
    <exp><b/></exp>
</exp>
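Because the `exp` rule refers to itself, checking a document against it is naturally recursive. The sketch below is a simplified stand-in for a validating parser, assuming Python's standard `xml.etree.ElementTree` module: `EXP_ALTERNATIVES` and `is_valid_exp` are hypothetical names, and stray text between elements is not checked.

```python
import xml.etree.ElementTree as ET

# Allowed child sequences for <exp>, copied from the <!ELEMENT exp ...> rule.
EXP_ALTERNATIVES = [
    ["exp", "plus", "exp"],
    ["exp", "times", "exp"],
    ["lparen", "exp", "rparen"],
    ["a"], ["b"], ["c"],
]

def is_valid_exp(element):
    """Check an <exp> element, and every <exp> nested in it, against the rule."""
    if element.tag != "exp":
        return False
    if [child.tag for child in element] not in EXP_ALTERNATIVES:
        return False
    for child in element:
        if child.tag == "exp":
            if not is_valid_exp(child):
                return False
        elif len(child) != 0:      # a, b, c, plus, ... are declared EMPTY
            return False
    return True

print(is_valid_exp(ET.fromstring("<exp><a/></exp>")))
print(is_valid_exp(ET.fromstring(
    "<exp><exp><a/></exp><plus/><exp><b/></exp></exp>")))
print(is_valid_exp(ET.fromstring("<exp><a/><plus/><b/></exp>")))
```

The last document is well-formed but fails the check, since `a` and `b` must each be wrapped in their own `<exp>` for the sequence `(exp, plus, exp)` to match.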


1. Examples of <exp> Language

a

<exp>
    <a/>
</exp>

a + b

<exp>
    <exp><a/></exp>
    <plus/>
    <exp><b/></exp>
</exp>

<!ELEMENT exp ( (exp, plus, exp) | (exp, times, exp) | (lparen, exp, rparen) | a | b | c )>


2. Examples of <exp> Language

a+b*c

<exp>
    <exp>
        <exp><a/></exp>
        <plus/>
        <exp><b/></exp>
    </exp>
    <times/>
    <exp><c/></exp>
</exp>

( a )

<exp>
    <lparen/>
    <exp><a/></exp>
    <rparen/>
</exp>

<!ELEMENT exp ( (exp, plus, exp) | (exp, times, exp) | (lparen, exp, rparen) | a | b | c )>


3. Examples of <exp> Language

( a+b )

<exp>
    <lparen/>
    <exp>
        <exp><a/></exp>
        <plus/>
        <exp><b/></exp>
    </exp>
    <rparen/>
</exp>

<!ELEMENT exp ( (exp, plus, exp) | (exp, times, exp) | (lparen, exp, rparen) | a | b | c )>
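Reading the leaves of an `<exp>` tree from left to right gives back the expression it encodes. The sketch below illustrates that correspondence, assuming Python's standard `xml.etree.ElementTree` module; `SYMBOLS` and `to_infix` are hypothetical names introduced for the illustration.

```python
import xml.etree.ElementTree as ET

# Printable symbol for each operator or parenthesis element.
SYMBOLS = {"plus": "+", "times": "*", "lparen": "(", "rparen": ")"}

def to_infix(element):
    """Read the leaves of an <exp> tree left to right as an infix string."""
    if element.tag == "exp":
        return " ".join(to_infix(child) for child in element)
    return SYMBOLS.get(element.tag, element.tag)   # a, b, c print as themselves

paren_sum = ET.fromstring(
    "<exp><lparen/><exp><exp><a/></exp><plus/><exp><b/></exp></exp><rparen/></exp>")
print(to_infix(paren_sum))
```

The nesting of `<exp>` elements records how the expression is grouped; the flattened string does not, which is why the `lparen` and `rparen` elements are needed to keep the parentheses visible.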


Exercise 3


Give the parse tree of each as an <exp>:

1. a + b
2. a * b + c
3. (a + b) * c

Copy and save the following three slides as:

1. exp.dtd - <exp> grammar definition.
2. exp.xml - strings to be parsed as <exp> grammar.
3. exp.htm - parses strings in exp.xml file.

Open exp.htm in IE to validate the exp.xml file.

<!ELEMENT exp ( (exp, plus, exp) | (exp, times, exp) | (lparen, exp, rparen) | a | b | c )>


1. Exercise 3 - Exp.dtd

<!ELEMENT exp ((exp, plus, exp) | (exp, times, exp) | (lparen, exp, rparen) | a | b | c )>
<!ELEMENT a EMPTY>
<!ELEMENT b EMPTY>
<!ELEMENT c EMPTY>
<!ELEMENT plus EMPTY>
<!ELEMENT times EMPTY>
<!ELEMENT lparen EMPTY>
<!ELEMENT rparen EMPTY>


2. Exercise 3 - Exp.xml

<!DOCTYPE exp SYSTEM "exp.dtd">
<exp>
    <lparen/>
    <exp>
        <exp><a/></exp>
        <plus/>
        <exp><b/></exp>
    </exp>
    <rparen/>
</exp>


3. Exercise 3 - JavaScript XML Parsing (Exp.htm)

<SCRIPT LANGUAGE="JavaScript">
var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
// Load synchronously so the document is fully parsed before it is traversed.
xmlDoc.async = false;

try {
    xmlDoc.load("exp.xml");
    document.write("<pre>");
    traverse(xmlDoc.documentElement, "");
    document.write("</pre>");
}
catch(e) {
    // Report where and why the parse failed.
    document.write(
        "URL " + xmlDoc.parseError.url + " Line " + xmlDoc.parseError.line +
        " position " + xmlDoc.parseError.linepos + " " +
        xmlDoc.parseError.srcText + " " + xmlDoc.parseError.reason);
}

// Write the name of each element, indented by its depth in the parse tree.
function traverse(node, indent) {
    var i, children, type = node.nodeTypeString;
    if (type == "element") {
        document.write("<br>" + indent + node.nodeName);
        children = node.childNodes;
        if (children != null)
            for (i = 0; i < children.length; i++)
                traverse(children.item(i), indent + "| ");
    }
}
</SCRIPT>


Exercise 3 Continued


Edit exp.xml to define XML for an <exp>.

Give the XML for each as an <exp> and parse to
validate:

1. a + b
2. a * b + c
3. (a + b) * c

<!ELEMENT exp ( (exp, plus, exp) | (exp, times, exp) | (lparen, exp, rparen) | a | b | c )>


Example - XML Parse

(a+b)

Input

<exp>
    <lparen/>
    <exp>
        <exp>
            <a/>
        </exp>
        <plus/>
        <exp>
            <b/>
        </exp>
    </exp>
    <rparen/>
</exp>

Output: Parse Tree

exp
| lparen
| exp
| | exp
| | | a
| | plus
| | exp
| | | b
| rparen

<!ELEMENT exp ( (exp, plus, exp) | (exp, times, exp) | (lparen, exp, rparen) | a | b | c )>