Introduction to XPath

30 Oct 2013

Bun Yue

Professor, CS/CIS

UHCL

Resources


XPath 1.0:
http://www.w3.org/TR/xpath


XPath 2.0:
http://www.w3.org/TR/xpath20/


EditiX (free edition):
http://free.editix.com/


XPath 1.0 testbed by whitebeam:
http://www.whitebeam.org/library/guide/TechNotes/xpathtestbed.rhtm




Introduction to XPath 1.0


XPath is used to address parts of an XML
document.


XPath is a W3C recommendation.


The newest version is 2.0, which is largely
backward compatible.


XPath is used by XPointer, XSLT and
XQuery.


XPath is designed to access elements, not
to create new ones.


Designed to be embedded in a host
language, such as XSLT or XQuery.

Location Path


XPath uses path expressions, called
location paths, to address parts of a
document.


A location path is composed of a
sequence of location steps, separated
by a '/'.

Location Path


A location path can be absolute or
relative.


An absolute location path starts with '/',
the document root.


A relative location path does not start
with '/'. It is evaluated relative to a
context node.


XPath 1.0 Results


The result of an XPath 1.0 expression is
one of the following four types:


Number


String


Boolean


Node-set: a set of nodes


As a set, it contains no duplicate nodes.


Not the same as a document fragment.


Replaced by sequences in XPath 2.0.


Example

/stocks/stock



matches all element nodes stock that
are children of the root element
stocks.
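
A minimal sketch of this example in Python, assuming the standard library's xml.etree.ElementTree, which supports only a subset of XPath 1.0. The sample document and its values are made up for illustration. ElementTree evaluates paths relative to an element, so with the root element stocks as the context node, the relative location path stock selects the same elements as /stocks/stock.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; symbols and prices are illustrative only.
XML = """
<stocks>
  <stock symbol="IBM"><lastprice>100.0</lastprice></stock>
  <stock symbol="HPQ"><lastprice>20.0</lastprice></stock>
  <index name="demo"/>
</stocks>
"""

root = ET.fromstring(XML)        # root is the <stocks> element
stocks = root.findall("stock")   # child elements named stock; <index> is skipped
symbols = [s.get("symbol") for s in stocks]
print(symbols)
```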


EditiX


In EditiX, use "View > Windows >
XPath View" to execute XPath
expressions.


May select XPath 1.0 or 2.0.

Location Step


A location step is composed of three
parts:


an axis (required; child is assumed when
none is written): describes the
direction of navigation.


a node test (required): specifies the
node type or name to select, and


zero or more predicates (optional):
specify additional inclusion tests.


Example

//stocks/child::stock[@symbol="IBM"]/lastprice


Consider the location step:

child::stock[@symbol="IBM"]


axis: child

node test: stock

predicate: [@symbol="IBM"]
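
The same location step can be tried out in Python. This is only a sketch, assuming the standard library's xml.etree.ElementTree and a made-up sample document. ElementTree does not accept the explicit child:: syntax, so the abbreviated form with the implicit child axis is used, starting from the stocks root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; values are illustrative only.
XML = """
<stocks>
  <stock symbol="IBM"><lastprice>100.0</lastprice></stock>
  <stock symbol="HPQ"><lastprice>20.0</lastprice></stock>
</stocks>
"""
root = ET.fromstring(XML)

# Step 1: node test "stock" on the implicit child axis,
#         filtered by the predicate [@symbol='IBM'].
# Step 2: node test "lastprice" on the implicit child axis.
prices = root.findall("stock[@symbol='IBM']/lastprice")
ibm_prices = [p.text for p in prices]
print(ibm_prices)
```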

Axis


An axis is the first part of the location
step and is followed by :: before the
node test and predicates.


There are 13 axes in XPath 1.0.


The default axis is the child axis.


The symbol @ can be used for the
attribute axis.


Axes in XPath 1.0


child: the children of the context node (not including
attribute nodes).


descendant: the descendants of the context
node.


parent: the parent of the context node, if
there is one.


ancestor: the ancestors of the context node, including
the root node if the context node is not the root node.


following-sibling: all the following siblings of the
context node.


preceding-sibling: all the preceding siblings of the
context node.


Axes in XPath 1.0


following: all nodes in the same document as the
context node that are after the context node in
document order, excluding any descendants and
excluding attribute nodes and namespace nodes


preceding: all nodes in the same document as the
context node that are before the context node in
document order, excluding any ancestors and
excluding attribute nodes and namespace nodes


attribute: contains the attributes of the context node;
the axis will be empty unless the context node is an
element

Axes in XPath 1.0


namespace: the namespace nodes of the
context node; the axis will be empty unless
the context node is an element


self: contains just the context node itself


descendant-or-self: the context node and
the descendants of the context node


ancestor-or-self: the context node and the
ancestors of the context node; thus, the
ancestor-or-self axis will always include the root
node.
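
To make a few of these axes concrete, here is a rough Python sketch that emulates them by walking the tree by hand. It assumes xml.etree.ElementTree, which has no axis syntax of its own; the helper names (ancestors, following_siblings, descendants) and the tiny document are hypothetical. ElementTree models elements only, so the XPath root node, text nodes, and comments are not represented here.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<a><b><c/><d/><e/></b><f/></a>")

# ElementTree elements do not store a parent pointer, so build a lookup table.
parent_of = {child: parent for parent in root.iter() for child in parent}

def ancestors(node):
    """Emulates the ancestor axis (elements only), nearest ancestor first."""
    result = []
    while node in parent_of:
        node = parent_of[node]
        result.append(node)
    return result

def following_siblings(node):
    """Emulates following-sibling: siblings after node, in document order."""
    parent = parent_of.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[siblings.index(node) + 1:]

def descendants(node):
    """Emulates descendant: iter() yields node itself first, so skip it."""
    return list(node.iter())[1:]

d = root.find("b/d")
anc = [n.tag for n in ancestors(d)]
fol = [n.tag for n in following_siblings(d)]
desc = [n.tag for n in descendants(root)]
print(anc, fol, desc)
```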

Shorthand


. is the shorthand for self::node()


.. is the shorthand for parent::node()


// is the shorthand for /descendant-or-self::node()/
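
The three shorthands can be exercised with Python's xml.etree.ElementTree, which accepts ., .. and // in its XPath subset. This is a small sketch over a made-up document, not a full XPath 1.0 evaluation.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; values are illustrative only.
XML = """
<stocks>
  <stock symbol="IBM"><lastprice>100.0</lastprice></stock>
  <stock symbol="HPQ"><lastprice>20.0</lastprice></stock>
</stocks>
"""
root = ET.fromstring(XML)

self_nodes = root.findall(".")             # . selects the context node itself
all_prices = root.findall(".//lastprice")  # // searches all descendants
parents = root.findall(".//lastprice/..")  # .. steps back up to each parent

price_texts = [p.text for p in all_prices]
parent_symbols = [p.get("symbol") for p in parents]
print(price_texts, parent_symbols)
```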

Node tests in XPath 1.0


The second part of a location step. It is
required.


There are three kinds of node tests:


NameTest: the name of the node.


NodeType test:


node(): matches any node. On the child axis this
includes comments and processing instructions (PIs),
but not attributes or the document root, which are
never children.


text()


comment()


processing-instruction('pi-name')


* is a wildcard character matching any
name. It is a name test.


Predicate tests


Predicate tests are the last part of a
location step.


They are enclosed by [] and are optional.


There may be more than one predicate
test.


XPath built-in functions can be used to
construct a predicate (boolean) expression
as the added condition for inclusion.


Boolean operators: and, or.


Example

//text()

matches all text nodes.


//@p[.='1']

selects all attributes named p
whose value is 1.

//person[first][last]

selects all person elements that have
both a first and a last child element.
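
A hedged Python sketch of these predicates, assuming xml.etree.ElementTree and a made-up people document. ElementTree can only return elements, so the attribute example is approximated by selecting the elements that own a matching p attribute, and //text() is approximated with itertext().

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; names are illustrative only.
XML = """
<people>
  <person p="1"><first>Boris</first><last>Smith</last></person>
  <person p="2"><first>Ann</first></person>
  <person p="1"><last>Lee</last></person>
</people>
"""
root = ET.fromstring(XML)

# Two predicates in a row: both must hold for a person to be selected.
both = root.findall(".//person[first][last]")
both_first_names = [p.findtext("first") for p in both]

# Elements whose p attribute equals '1' (not the attribute nodes themselves).
p_is_1 = root.findall(".//person[@p='1']")

# Rough stand-in for //text(): non-blank text content in document order.
texts = [t for t in root.itertext() if t.strip()]
print(both_first_names, len(p_is_1), texts)
```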

XPath Functions


There are many XPath 1.0 functions
for testing and other purposes.


Many of them are obvious. The non-obvious
ones are explained below.


XPath 1.0 Functions


boolean(): convert to boolean data type.


false(): returns false always.


lang(arg): returns true iff the language of the
context node, as given by xml:lang attributes, is
the same as or is a sublanguage of the language
specified by the argument string arg.


not(arg): negation of arg.


true()


count(arg): number of nodes in the node-set
argument arg.

XPath Functions


id(arg): selects elements by their unique ID, given by the argument arg.


last(): returns the context size of the expression
evaluation context.


local-name(arg): returns the local name of the first
node in the node-set argument arg; returns the local
name of the context node if arg is omitted.


name()


namespace-uri()


position(): returns the proximity position (starting
from one) of the context node within the axis.
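
As a rough illustration of position(), last() and count(), the sketch below uses Python's xml.etree.ElementTree, which accepts numeric positions and last() inside predicates. The document is made up, and count() is approximated with len() because ElementTree has no function library.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; symbols are illustrative only.
XML = """
<stocks>
  <stock symbol="IBM"/>
  <stock symbol="HPQ"/>
  <stock symbol="ORCL"/>
</stocks>
"""
root = ET.fromstring(XML)

first = root.findall("stock[1]")        # same as stock[position()=1] in XPath
last = root.findall("stock[last()]")    # position equals the context size
count = len(root.findall("stock"))      # stands in for count(/stocks/stock)

first_symbol = first[0].get("symbol")
last_symbol = last[0].get("symbol")
print(first_symbol, last_symbol, count)
```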


XPath 1.0 Functions


ceiling(arg): ceiling of the number
argument arg.


floor(arg)


number(arg): convert arg to number.


round(arg)


sum(arg): sum of values of the node set
argument arg.


concat(): string concatenation of
arguments.


contains(arg1, arg2): true iff arg1 contains
arg2.


XPath 1.0 Functions


normalize-space(arg): returns the string
argument arg with leading and trailing white space
stripped and each internal run of white space
replaced by a single space.


starts-with(arg1, arg2): true iff arg1
starts with arg2.


string(): convert to string.


string-length(arg): the number of
characters in the string arg.


substring(arg1, arg2, arg3): returns the
substring of arg1 that starts at position
arg2 (counting from 1) and has length arg3.


XPath 1.0 Functions


substring-after(arg1, arg2): the
substring of arg1 after the first occurrence of arg2.


substring-before(arg1, arg2): the substring of
arg1 before the first occurrence of arg2.


translate(arg1, arg2, arg3): returns
arg1 with each character in arg2
replaced by the character at the
corresponding position in arg3.
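
The behaviour of a few of these string functions can be sketched in plain Python. The function names below are hypothetical stand-ins, not part of any XPath library; they follow the XPath 1.0 rules as I understand them (for example, translate removes characters of arg2 that have no counterpart in arg3).

```python
import re

def substring_before(s, sep):
    """Text before the first occurrence of sep, or "" if sep is absent."""
    if sep == "" or sep not in s:
        return ""
    return s.split(sep, 1)[0]

def substring_after(s, sep):
    """Text after the first occurrence of sep, or "" if sep is absent."""
    if sep == "":
        return s
    _, found, tail = s.partition(sep)
    return tail if found else ""

def normalize_space(s):
    """Strip leading/trailing XML white space and collapse internal runs."""
    return re.sub(r"[ \t\r\n]+", " ", s).strip(" ")

def translate(s, src, dst):
    """Map each char of src to the char at the same index in dst.
    Chars of src beyond the length of dst are deleted; the first
    occurrence of a repeated char in src wins."""
    table = {}
    for i, ch in enumerate(src):
        if ord(ch) not in table:
            table[ord(ch)] = dst[i] if i < len(dst) else None
    return s.translate(table)

print(substring_before("1999/04/01", "/"), substring_after("1999/04/01", "/"))
print(normalize_space("  a \n b  "), translate("bar", "abc", "ABC"))
```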

XPath 1.0 Classwork


To be handed in during class.


Use Familytree.xml

XPath 2.0


W3C related specifications:


XQuery 1.0 and XPath 2.0 Data Model


XQuery 1.0 and XPath 2.0 Functions and
Operators


XQuery 1.0 and XPath 2.0 Formal Semantics


XML Path Language (XPath) 2.0


XSL Transformations (XSLT) Version 2.0


XSLT 2.0 and XQuery 1.0 Serialization


XQuery 1.0: An XML Query Language

Major Changes in XPath 2.0


Sequences replace node-sets as
the main data model.


XML Schema data types


Variable binding


A rich set of functions


Richer expressions


New comment styles




Sequences and items


A sequence is an ordered, heterogeneous
collection of items.


An item can be:


A node


An atomic value

Sequences

Example:


(1, 5 to 8, "Bun Yue", 2.1)

(1+2, 5)

(1 to 50)[. mod 3 = 1]

/* | //person

(1, 2, (3, (4, 5))) is the same as (1, 2, 3, 4, 5)


Sequences


Items within a sequence


Can be in any order, not only document order.


Can be heterogeneous.


Can be repeating.


Sequences are not nested.


XPath 2.0 results are sequences.
Atomic values are considered to be
sequences with a single item.
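
As a loose analogy only, the sequence examples above can be mimicked with Python lists. The flatten helper is hypothetical and exists just to show that nested sequences collapse into one flat sequence; Python itself does not flatten nested tuples.

```python
def flatten(items):
    """Flatten nested tuples the way XPath 2.0 flattens nested sequences."""
    flat = []
    for item in items:
        if isinstance(item, tuple):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat

# (1, 2, (3, (4, 5))) is the same as (1, 2, 3, 4, 5)
flat = flatten((1, 2, (3, (4, 5))))

# (1, 5 to 8, "Bun Yue", 2.1): "5 to 8" is the inclusive integer range 5..8.
mixed = flatten((1, tuple(range(5, 9)), "Bun Yue", 2.1))

# (1 to 50)[. mod 3 = 1]: keep the items whose remainder modulo 3 is 1.
filtered = [n for n in range(1, 51) if n % 3 == 1]

print(flat, mixed, filtered[:3])
```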

For expression & variable binding


for $varname in (expression) return (expression)


Example:


for $person in //person
return count($person/email)

The same expression with the fn: prefix written explicitly:

for $person in //person
return fn:count($person/email)
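
The for expression maps naturally onto a Python list comprehension. The sketch below assumes xml.etree.ElementTree (which does not evaluate XPath 2.0 itself) and a made-up people document, so the loop is written in Python rather than in XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; names and addresses are illustrative only.
XML = """
<people>
  <person><first>Boris</first>
    <email>b1@example.org</email><email>b2@example.org</email></person>
  <person><first>Ann</first></person>
  <person><first>Lee</first><email>lee@example.org</email></person>
</people>
"""
root = ET.fromstring(XML)

# for $person in //person return count($person/email)
email_counts = [len(person.findall("email")) for person in root.iter("person")]
print(email_counts)
```

The result is one count per person, in document order, which mirrors the sequence the XPath 2.0 expression would return.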


If statement

Example:


if (//person[first/text()='Boris'])
then 'found Boris'
else 'no Boris'
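
A rough Python counterpart of this conditional, assuming xml.etree.ElementTree and made-up documents. The helper name boris_message is hypothetical. A non-empty result list plays the role of a true condition, and ElementTree's [first='Boris'] predicate stands in for first/text()='Boris'.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; names are illustrative only.
XML = """
<people>
  <person><first>Boris</first></person>
  <person><first>Ann</first></person>
</people>
"""

def boris_message(context):
    # An empty list is falsy, so it behaves like an empty node sequence.
    matches = context.findall(".//person[first='Boris']")
    return "found Boris" if matches else "no Boris"

message = boris_message(ET.fromstring(XML))
empty_message = boris_message(ET.fromstring("<people/>"))
print(message, "/", empty_message)
```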

XPath 2.0 Functions


Many new functions:
http://www.w3schools.com/XPath/xpath_functions.asp


Some categories:


Sequences


Aggregate functions


Nodes


Numeric


String, with regular expressions


Quantified Expressions


Applied to a sequence:


some


every


Format:


some $v in sequence satisfies condition


every $v in sequence satisfies condition


Example

if (every $person in //person satisfies $person/email) then
"everyone has email address"
else
"oh oh"

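The quantifiers some and every correspond closely to Python's any() and all(). The sketch below assumes xml.etree.ElementTree and a made-up people document in which one person has no email element; it is an analogy written in Python, not an XPath 2.0 evaluation.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; names and addresses are illustrative only.
XML = """
<people>
  <person><first>Boris</first><email>boris@example.org</email></person>
  <person><first>Ann</first></person>
</people>
"""
root = ET.fromstring(XML)
people = root.findall(".//person")

# every $person in //person satisfies $person/email
every_has_email = all(p.find("email") is not None for p in people)

# some $person in //person satisfies $person/email
some_has_email = any(p.find("email") is not None for p in people)

message = "everyone has email address" if every_has_email else "oh oh"
print(every_has_email, some_has_email, message)
```
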
Classwork


To be handed in during class.


Use Familytree.xml


Questions