Introduction to XML Path Language (XPath20)

Page 1: Introduction to XML Path Language (XPath20)

Introduction to XPath

Transparency No. 1

Introduction to XML Path Language (XPath20)

Cheng-Chia Chen

Page 2: Introduction to XML Path Language (XPath20)

What is XPath?

Latest version: 2.0: http://www.w3.org/TR/xpath20
  Companion specifications: XQuery/XPath Data Model (XDM); XQuery/XPath Formal Semantics; XQuery 1.0 and XPath 2.0 Functions and Operators

Earlier version: 1.0: http://www.w3.org/TR/xpath

A language for addressing parts of an XML document,

designed to be used by XSLT, XQuery, XML Schema and XPointer.

References: W3Schools

Page 3: Introduction to XML Path Language (XPath20)

TOC

1 Introduction

2 Data Model

3 Location Paths

4 Expressions

5 Core Function Library

Page 4: Introduction to XML Path Language (XPath20)

1. Introduction

What is XPath? A language that
  addresses parts of an XML [XML] document,
  provides basic facilities for manipulating strings, numbers and booleans,
  operates on the abstract, logical structure of an XML document rather than its surface syntax.

Page 5: Introduction to XML Path Language (XPath20)

XPath (2.0) data model

provides a tree representation of XML documents, atomic values such as numbers, strings and booleans, and flat sequences that may contain both references to nodes in an XML document and atomic values.

The result of evaluating an XPath expression is a sequence of items, each of which is either a node from the input document, or an atomic value.

Page 6: Introduction to XML Path Language (XPath20)

Type system of XPath

XPath expression: the primary syntactic construct in XPath. An expression is evaluated to yield a value, which is a possibly empty sequence of items.

An item is either a node or an atomic value.

Page 7: Introduction to XML Path Language (XPath20)

Expression evaluation

occurs with respect to a context. XSLT, XQuery and XPointer specify how the context is determined.

A context consists of:
  1. a node (the context node)
  2. a pair of non-zero positive integers (the context position and the context size)
  3. a set of variable bindings
  4. a function library
  5. the set of namespace declarations in scope for the expression

Notes:
  3, 4 and 5 do not change when evaluating subexpressions.
  2 can only be changed by predicates.
  Some expressions may change 1.

Page 8: Introduction to XML Path Language (XPath20)

Location path

The most important kind of expression; it selects a set of nodes relative to a context node.

Page 9: Introduction to XML Path Language (XPath20)

2. Data Model

Details in the XQuery/XPath Data Model.

XPath operates on an XML document as a tree of nodes.

All XPath expressions are evaluated to produce a sequence of items.

An item is either
  an atomic value (atom), or
  a node (of an XML document tree).

Page 10: Introduction to XML Path Language (XPath20)

Kinds of Atoms

In XPath 1.0:
  number (a double-precision floating-point number)
  boolean (true or false)
  string (a sequence of Unicode characters)

In XPath 2.0: generalized to include all simple datatypes defined by XML Schema.
  number is classified further into integer, decimal, float and double.

Page 11: Introduction to XML Path Language (XPath20)

Atomization

A sequence of items can be atomized to produce a sequence of atoms by replacing every node item with its string value, as follows:
  text node → the contents of the text node
  root node or element node → the concatenation, in document order, of the string values of all descendant text nodes
  attribute node, comment node, processing-instruction node → the string value of the node

The string value of a node can be queried by invoking fn:string(node).
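The string-value rule for element nodes can be tried out with Python's standard xml.etree.ElementTree module. This is only a sketch: the sample document and the helper name string_value are made up for illustration, and ElementTree is not an XDM implementation. It merely lets us concatenate descendant text in document order.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document (not taken from the slides).
DOC = "<book><title>XPath</title> by <author>Chen</author></book>"


def string_value(element):
    """Concatenate all descendant text in document order,
    mimicking fn:string() applied to an element node."""
    return "".join(element.itertext())


root = ET.fromstring(DOC)
print(string_value(root))                # XPath by Chen
print(string_value(root.find("title")))  # XPath
```

The element's own markup never appears in the result; only the text content of its descendants does.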

Page 12: Introduction to XML Path Language (XPath20)

Types of nodes in an XML tree

Same as in XPath 1.0. The tree contains nodes. Types of nodes and their possible children:
  root node: element (exactly 1), comment, PI
  element nodes: element, text, PI, comment, [attribute, namespace]
  text nodes: leaves
  attribute nodes: leaves
  namespace nodes: leaves
  processing instruction nodes: leaves
  comment nodes: leaves

Page 13: Introduction to XML Path Language (XPath20)

Basic concepts

See Concepts from XDM

Node Identities; Document Order; Sequences; Types

Page 14: Introduction to XML Path Language (XPath20)

Node Identity

Every node has a unique identity (like objects in Java): it is identical to itself and not identical to any other node. I.e., node1 is node2 iff node1 and node2 correspond to the same node occurrence.

Notes:
  1. Node identity ≠ ID attribute.
  2. An element has an identity even if it has no ID attributes.
  3. Non-element nodes also have a unique identity.

Atomic values do not have identity; every occurrence of "5" as an integer is identical to every other occurrence of "5" as an integer.

Page 15: Introduction to XML Path Language (XPath20)

Example

<courses>
  <course name="dismath">
    <student idref="Wang" />
    <student idref="chen" /> …
  </course>
  <course name="compiler">
    <student idref="Wang" />
    <student idref="Chang" /> …
  </course>
</courses>

Ex:
  xpath: ( /courses/course[@name='dismath']/student[1] is (//student)[3] ) returns false.
  xpath: ( (//student)[1]/@idref is (//student)[3]/@idref ) returns false. (why?)
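The difference between "same value" and "same node" can be imitated in Python, where the `is` operator also tests object identity. The sketch below uses the course document with the elided students ("…") simply left out, which is an assumption; it compares the first and the third student in document order. ElementTree has no attribute nodes, so the identity test is done on the element nodes only.

```python
import xml.etree.ElementTree as ET

# The slide's document; the elided students ("…") are omitted here (assumption).
DOC = """
<courses>
  <course name="dismath">
    <student idref="Wang"/>
    <student idref="chen"/>
  </course>
  <course name="compiler">
    <student idref="Wang"/>
    <student idref="Chang"/>
  </course>
</courses>
"""

root = ET.fromstring(DOC)
students = list(root.iter("student"))  # all students in document order

first, third = students[0], students[2]

# Both students carry the same attribute value ...
same_value = first.get("idref") == third.get("idref")
# ... but they are two different node occurrences.
same_node = first is third

print(same_value, same_node)  # True False
```

Equal content never implies identity: a node is identical only to itself.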

Page 16: Introduction to XML Path Language (XPath20)

Document order and reverse document order

Same as in XPath 1.0

Page 17: Introduction to XML Path Language (XPath20)

Example [to be added]

<?xml version="1.0" ?>
<a xmlns:ns1="uri1" at1="…" at2="…">
  <a1> data1 </a1>
  <a2> data2 </a2>
  <a3><b3/><!-- comment 1 --> </a3>
  <?pi pidata ?>
</a>

Doc order: root < a < ns1 < {at1, at2} < a1 < ns1 (for a1) < data1 … < a3 < ns1 (for a3) < b3 < ns1 (for b3) < comment < pi

Page 18: Introduction to XML Path Language (XPath20)

Sequences

A sequence of items is the unique output type of all XPath expressions.
  A sequence may contain nodes, atomic values, or any mixture of nodes and atomic values.
  There is no distinction between an item and a singleton sequence containing that item: ('123') = '123'; node2 = (node2).

A node does not lose its identity when it is added to a sequence [i.e., only references to the node are added].
  A node may occur in multiple places of one or more sequences.

Sequences are flat and never contain other sequences.
  Appending (d e) to (a b c) will not produce (a b c (d e)); the result is flattened to (a b c d e) automatically.

Notes:
  Sequences replace node-sets from XPath 1.0.
  In XPath 1.0, node-sets do not contain duplicates.
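The flattening rule can be illustrated with a small Python model in which a sequence is a tuple. The helper name seq is hypothetical; it only mimics how the comma operator splices nested sequences into one flat sequence.

```python
def seq(*items):
    """Build a flat XPath-style sequence: nested tuples are spliced in,
    so a sequence never contains another sequence."""
    flat = []
    for item in items:
        if isinstance(item, tuple):
            flat.extend(seq(*item))
        else:
            flat.append(item)
    return tuple(flat)


print(seq(("a", "b", "c"), ("d", "e")))  # ('a', 'b', 'c', 'd', 'e')
print(seq(1, (2, 3, 4), ((5,),)))        # (1, 2, 3, 4, 5)
print(seq("123") == seq(("123",)))       # True: an item equals its singleton sequence
```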

Page 19: Introduction to XML Path Language (XPath20)

Types in XDM

accepts all types defined by XML Schema
  supports XSLT and XQuery, whose type systems are based on XML Schema.
  includes 19 built-in primitive types, 5 additional types defined by XDM, and user/implementor-defined types.

type system defined in XQuery & XPath Formal Semantics

Every item in the data model has both a value and a type.
  Examples: nodes → node type; 5 → xs:integer; '5' → xs:string; "Hello World." → xs:string.

Page 20: Introduction to XML Path Language (XPath20)

XDM Type Hierarchy

from XDM Type Hierarchy.

Page 21: Introduction to XML Path Language (XPath20)

Representation of Types

Use expanded-QName (EQName) to represent a type.

Definition: An expanded-QName is a set of three values consisting of
  {prefix}: a possibly empty prefix,
  {namespace name}: a possibly empty namespace URI, and
  {local name}: a local name.
  Note: Only the URI and the local name are used for identity.

Lexical representation of an expanded QName: [pre1:]localName
  URI determined by context.

A type [with target namespace = n1 and local name = loc1] is represented by an EQName [whose URI = n1 and local name = loc1].

Page 22: Introduction to XML Path Language (XPath20)

General constraints on nodes

All nodes must satisfy the following general constraints:
  1. Every node must have a unique identity, distinct from all other nodes. [unique identity]
  2. The children property of a node must not contain two consecutive Text Nodes. [no adjacent texts]
  3. The children property of a node must not contain any empty Text Nodes. [no empty text]
  4. The children and attributes properties of a node must not contain two nodes with the same identity. [no sharing of nodes] I.e., no sharing of contained nodes (hence a tree, not a DAG).

Page 23: Introduction to XML Path Language (XPath20)

Predefined Types

xs:untyped
  denotes the dynamic type of an element node that has not been validated, or has been validated in skip mode.

xs:untypedAtomic
  denotes untyped atomic data, such as text that has not been assigned a more specific type, or an attribute value that is validated in skip mode.

xs:anyAtomicType
  derived from xs:anySimpleType
  the root of all atomic types (not including list or union types)
  the base type of all 23 primitive types.

xs:dayTimeDuration, xs:yearMonthDuration
  derived from xs:duration
  xs:dayTimeDuration form: PddDTddHddMdd.dddS
  xs:yearMonthDuration form: PddddYmmM

Page 24: Introduction to XML Path Language (XPath20)

atomic (typed) value construction

signature (format): see XPath constructor functions
  prefix:TYPE($arg as xs:anyAtomicType?) as prefix:TYPE?
    prefix:TYPE (function name): QName of the target type
    xs:anyAtomicType?: input type; prefix:TYPE?: output type

Notes:
  ? means the input and the output are each a sequence of zero or one atomic value.
  if $arg is empty then the output is also the empty sequence.
  possible prefix:TYPE
    xs:integer, xs:int, xs:dateTime, xs:boolean, …
    can also be user-defined atomic types: bk:ISBN, np:IP

Page 25: Introduction to XML Path Language (XPath20)

List of constructors for built-in types

xs:string($arg as xs:anyAtomicType?) as xs:string?
  xs:string("abc") → string "abc"; xs:string(123) → "123"

xs:boolean($arg as xs:anyAtomicType?) as xs:boolean?
  xs:boolean("abc") → error; xs:boolean("") → error (only "true", "false", "1" and "0" are valid lexical forms; compare fn:boolean("") → false); xs:boolean(10) → true; xs:boolean() → error; xs:boolean(()) → ()

xs:decimal($arg as xs:anyAtomicType?) as xs:decimal?
  xs:decimal("123.456789") → 123.456789

xs:float($arg as xs:anyAtomicType?) as xs:float?
xs:double($arg as xs:anyAtomicType?) as xs:double?

Note:
  xs:int("1234567891234") → error (out of range for xs:int)
  xs:integer("1234567891234") → 1234567891234

Page 26: Introduction to XML Path Language (XPath20)

All others are similar:
  xs:duration, xs:dateTime, xs:time, xs:date, xs:gYearMonth, xs:gYear, xs:gMonthDay, xs:gDay, xs:gMonth
  xs:hexBinary, xs:base64Binary
  xs:anyURI, xs:QName
  xs:normalizedString, xs:token, xs:language, xs:NMTOKEN, xs:Name, xs:NCName, xs:ID, xs:IDREF, xs:ENTITY
  xs:integer, xs:long, xs:int, xs:short, xs:byte
  xs:nonPositiveInteger, xs:negativeInteger
  xs:nonNegativeInteger, xs:unsignedLong, xs:unsignedInt, xs:unsignedShort, xs:unsignedByte, xs:positiveInteger
  xs:yearMonthDuration, xs:dayTimeDuration, xs:untypedAtomic

Page 27: Introduction to XML Path Language (XPath20)

More Examples

xs:string("abc"), xs:int("123"), xs:float("123.3e10"), xs:date("2006-11-12")

xs:gMonthDay("--11-12"), xs:gMonth("--11"), xs:gDay("---12")

xs:dateTime("2006-11-12T12:00:00")

fn:dateTime(xs:date("1999-12-31"), xs:time("12:00:00")) returns xs:dateTime("1999-12-31T12:00:00").

fn:dateTime(xs:date("1999-12-31"), xs:time("24:00:00")) returns xs:dateTime("1999-12-31T00:00:00") because "24:00:00" is an alternate lexical form for "00:00:00".

note: 24:00:00 = 00:00:00

Page 28: Introduction to XML Path Language (XPath20)

construction of typed value w/o namespace

How to construct a value whose type does not belong to any namespace (i.e., has no namespace URI)?

1. Use the cast operation:
  ex: weight is a subtype of xs:int that does not belong to any namespace. Then we can use 40 cast as weight to get an instance of weight.

2. Undeclare the default function namespace:
  declare default function namespace ""; … weight(40) …

Page 29: Introduction to XML Path Language (XPath20)

String values

Every atomic value has a string representation.

The value can be obtained by the casting operation:
  Ex: ( xs:int("123") + 45 ) cast as xs:string returns "168"

Page 30: Introduction to XML Path Language (XPath20)

Properties of nodes

string-value
  Every node has a string-value, which is part of the node or computed from the string-values of descendant nodes.

expanded-name (XPath 1.0; in 2.0 it is replaced by EQName)
  expanded-name = namespace URI + local part
  The namespace URI is either null or a URI string [RFC2396].
  Two expanded-names are equal if they have the same local part and the same namespace URI.

Page 31: Introduction to XML Path Language (XPath20)

Node relationship

Same as in xpath 1.0

Page 32: Introduction to XML Path Language (XPath20)

properties/relationship of nodes (m(e) is the URI bound to prefix e)

node type | expanded-name | string-value | child | parent
1. root; document | --- (no value) | descendant texts | 2,5,6 | {}
2. element (e:local) | m(e) + local, or null + local | descendant texts | 2,3,5,6 | 1,2
3. text | --- | text content | {} | 2
4. attribute (e:attr="…") | m(e) + attr, or null + attr | attr value (normalized) | {} | 2
5. comment | --- | text of content | {} | 1,2
6. PI | null + PITarget | PIData | {} | 1,2
7. namespace (xmlns:p="uri") | null + p, or null + "" | uri | {} | 2

(The numbers in the child and parent columns refer to the node types in the first column.)

Page 33: Introduction to XML Path Language (XPath20)

3 Location Paths (renamed PathExpr in 2.0)

Same as in XPath 1.0 (except for some minor changes)

LocationPath
  a special kind of expression, used to locate a sequence of nodes in the document.
  sorted in document order
  no duplicates

Page 34: Introduction to XML Path Language (XPath20)

4. General Expressions

Every expression evaluates to a sequence of items
  atomic values
  nodes

Atomic values may be
  double (XPath 1.0) or numeric (XPath 2.0)
  booleans
  Unicode strings
  or other datatypes defined by XML Schema

Page 35: Introduction to XML Path Language (XPath20)

Atomization

A sequence may be atomized:
  atomic members are not affected;
  node items become strings.
  This results in a sequence of atomic values.

Conversion rules:
  document or element nodes → the concatenation of all descendant text nodes (a string)
  other kinds of nodes → the obvious string:
    attribute node → attribute value cast as xs:string
    text → text content
    comment → comment text
    PI → PI data (PI target dropped)
    namespace node → text of the namespace URI

Page 36: Introduction to XML Path Language (XPath20)

Kinds of Expressions

3.1 Primary Expressions: string + numeric literals

3.2 Path Expressions

3.3 Sequence Expressions: , (comma), to, [ … ], |, intersect, except

3.4 Arithmetic Expressions: +, -, *, div, idiv, mod

3.5 Comparison Expressions: is, <, >, =, le, ge, eq, ne, …

3.6 Logical Expressions: and, or, not(…)

3.7 For Expressions : for

3.8 Conditional Expressions : if

3.9 Quantified Expressions : every, some

3.10 Expressions on SequenceTypes

Page 37: Introduction to XML Path Language (XPath20)

Primary Expressions

Literals
  string: "abc", 'abc', "He said ""OK""", 'He said "ok" '.
  numerical: 123 → xs:integer, 123.4 → xs:decimal, 124.4e5 → xs:double
  non-literals: xs:int("125") = xs:int(125) = 125 cast as xs:int
  boolean: fn:true(), fn:false()

Variable References: $pre:name, $var-1
Parenthesized Expressions: ( ), ( expr )
Context Item Expression: .
  (1 to 100)[. mod 5 eq 0]
  //book[ fn:count(./author) > 1 ]

Function Calls: pre:fName( arg1, …, argn )
  fn:concat("abc", "def")

Page 38: Introduction to XML Path Language (XPath20)

Literal Expressions

42
3.1415
6.022E23
'XPath is a lot of fun'
"XPath is a lot of fun"
'The cat said "Meow!"'
"The cat said ""Meow!"""
"XPath is just so much fun"

Page 39: Introduction to XML Path Language (XPath20)

Variable References

$foo
$bar:foo

$foo-17 refers to the variable "foo-17"
Possible fixes: ($foo)-17, $foo -17, $foo+-17

Page 40: Introduction to XML Path Language (XPath20)

XPath operators and their precedences

see reference XPath 2.0 grammar

Page 41: Introduction to XML Path Language (XPath20)

Path Expressions

Location paths are expressions
  They may be applied to arbitrary sequences
  evaluation rules discussed before.

Page 42: Introduction to XML Path Language (XPath20)

Sequence Expressions

Constructing Sequences: , (comma), to
  (1,2,3), (), (3) → (1,2,3,3)
  2 to 4 → (2,3,4)
  (10, (1 to 3)) → (10,1,2,3)
  (1,(2,3,4),((5))) → (1,2,3,4,5) -- flattened

Filter Expressions: PrimaryExpr [ … ]*
  (1 to 30)[. mod 3 = 0][. mod 5 = 0] → (15, 30)
  (10 to 20)[5] → (14)

Combining Node Sequences (for nodes only): assume doc order: A < B < C < D < E
  union: (A,B,A) | (B,C) | (A,C) = (A,B) union (B,C) → (A,B,C)
  intersect, except: (A,B,C,D) intersect (B,D,A,E) except (B) → (A, D)
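The node-combining operators can be modelled in Python as a rough sketch. The Node class and the helper names below are hypothetical; each node carries an explicit document-order index, and duplicates are detected by identity rather than by value. Results are always returned in document order without duplicates.

```python
from dataclasses import dataclass


@dataclass(frozen=True, eq=False)  # eq=False: nodes compare by identity only
class Node:
    name: str
    order: int  # position in document order


def _doc_order(nodes):
    """Remove duplicate nodes (by identity) and sort in document order."""
    unique = {id(n): n for n in nodes}
    return tuple(sorted(unique.values(), key=lambda n: n.order))


def union(s1, s2):
    return _doc_order(s1 + s2)


def intersect(s1, s2):
    ids = {id(n) for n in s2}
    return _doc_order(tuple(n for n in s1 if id(n) in ids))


def except_(s1, s2):
    ids = {id(n) for n in s2}
    return _doc_order(tuple(n for n in s1 if id(n) not in ids))


def names(nodes):
    return "".join(n.name for n in nodes)


A, B, C, D, E = (Node(n, i) for i, n in enumerate("ABCDE"))

print(names(union(union((A, B, A), (B, C)), (A, C))))               # ABC
print(names(except_(intersect((A, B, C, D), (B, D, A, E)), (B,))))  # AD
```

The second call groups from the left, since intersect and except share one precedence level.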

Page 43: Introduction to XML Path Language (XPath20)

Filter Expressions

Predicates are generalized to arbitrary sequences
The expression '.' is the context item
The expression:

(10 to 40)[. mod 5 = 0 and position() > 20]

has the result:

30, 35, 40
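A predicate sees each item together with its context position and the context size. The following Python sketch imitates that; filter_seq and to are hypothetical helper names, and only boolean and integer predicate results are handled.

```python
def filter_seq(items, predicate):
    """Apply an XPath-style predicate to a sequence.

    predicate(item, position, last) receives the context item, its
    1-based context position and the context size. A boolean result
    keeps or drops the item; an integer result n keeps the item only
    when position == n, as in (10 to 20)[5].
    """
    last = len(items)
    kept = []
    for position, item in enumerate(items, start=1):
        result = predicate(item, position, last)
        keep = result if isinstance(result, bool) else (result == position)
        if keep:
            kept.append(item)
    return kept


def to(lo, hi):
    """XPath range expression `lo to hi` (inclusive)."""
    return list(range(lo, hi + 1))


# (10 to 40)[. mod 5 = 0 and position() > 20]
print(filter_seq(to(10, 40), lambda x, pos, last: x % 5 == 0 and pos > 20))  # [30, 35, 40]

# (1 to 30)[. mod 3 = 0][. mod 5 = 0]: the second predicate filters the first result
step1 = filter_seq(to(1, 30), lambda x, pos, last: x % 3 == 0)
print(filter_seq(step1, lambda x, pos, last: x % 5 == 0))  # [15, 30]

# (10 to 20)[5]
print(filter_seq(to(10, 20), lambda x, pos, last: 5))  # [14]
```

Position 21 of (10 to 40) is the item 30, which is why the first result starts there.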

Page 44: Introduction to XML Path Language (XPath20)

Arithmetic Expressions

+, -, *, div, idiv, mod, +, - (unary)
  -3 div 2 → -1.5 (decimal)
  -3 idiv 2 → -1 (integer)
  -3.4 mod 2 (or -2) → -1.4
  rule: x = y * (x idiv y) + (x mod y)

precedence: {+, -} < {*, mod, div, idiv} < {unary +, -}

Operators are generalized to sequences
  if any argument is empty, the result is empty: () + 3 → ()
  if all arguments are singleton sequences of numbers: (3) + (4) + 5 → 12
  otherwise, a runtime error occurs: (1,3) + (2,4) → error
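The rounding direction of idiv and the sign of mod are easy to get wrong, because Python's own // and % round toward negative infinity. A minimal sketch with hypothetical helper names idiv and xmod follows; it works on floats, so very large integers would lose precision.

```python
import math


def idiv(x, y):
    """XPath idiv: divide and truncate toward zero."""
    return math.trunc(x / y)


def xmod(x, y):
    """XPath mod: the remainder takes the sign of the dividend."""
    return math.fmod(x, y)


print(idiv(-3, 2))  # -1, whereas Python's -3 // 2 gives -2
print(-3 // 2)      # -2

# The rule x = y * (x idiv y) + (x mod y) holds for these helpers:
x, y = -3.4, 2
print(math.isclose(y * idiv(x, y) + xmod(x, y), x))  # True
```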

Page 45: Introduction to XML Path Language (XPath20)

Comparison Expressions → boolean

Value Comparisons
  comparison operators: eq, ne, lt, le, gt, ge.
  used for comparing single values.

General Comparisons (**)
  operators: =, !=, <, <=, >, >=.
  are existentially quantified comparisons that may be applied to operand sequences of any length.
  The result is true or false if it does not raise an error.

Node Comparisons
  operators: is, >>, <<
  A is B → true if A and B are the same node
  A << B = B >> A → true if A precedes B in doc order.

Page 46: Introduction to XML Path Language (XPath20)

Value Comparison

Comparison operators: eq(=), ne(≠), lt(<), le(<=), gt(>), ge(>=)

Used on atomic values
When applied to arbitrary values (sequences):
  atomize
  if either argument is empty, the result is empty
  if either has length > 1, a type error is raised (XPTY0004)
  if incomparable, a runtime error; ex: 8 lt "abc"
  otherwise, compare the two atomic values

8 eq 4+4
(//rcp:ingredient)[1]/@name eq "beef cube steak"

Page 47: Introduction to XML Path Language (XPath20)

Node Comparison

Operators: is, <<, >>
Used to compare nodes on identity and order
  is is for node identity; >>, << for node ordering

When applied to arbitrary values:
  if either argument is empty, the result is empty
  if both are singleton nodes, the nodes are compared
  otherwise, a runtime error. Ex: //book[1] is "abc"

Ex:
  (//student)[2] is //student[@id = "s9527"]
  /rcp:collection << (//rcp:recipe)[4]
  (//rcp:recipe)[4] >> (//rcp:recipe)[3]

Page 48: Introduction to XML Path Language (XPath20)

General Comparison (use with care!!)

Operators: =, !=, <, <=, >, >=
Used on general sequences:
  atomize
  if there exist two values, one from each argument, whose value comparison holds, the result is true
    Note: It may raise an error during the value comparison
  otherwise, the result is false

8 = 4+4
(1,2) = (2,4)
//rcp:ingredient/@name = "salt"

() = () → false!!
(2) != ("2") → runtime error
(1,2) = (1, "2") → true
(1,2) = ("2", 1) → runtime error

I.e., seq1 gop seq2 means ∃x1∈seq1 ∃x2∈seq2 (x1 vop x2).
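The existential reading of general comparisons maps directly onto Python's any(). The sketch below uses hypothetical helper names and models sequences as tuples; unlike XPath it does not raise type errors for mixed integer and string operands, so it only illustrates the quantifier structure.

```python
from itertools import product


def general_eq(seq1, seq2):
    """seq1 = seq2: true if SOME pair (x1, x2) is equal."""
    return any(x1 == x2 for x1, x2 in product(seq1, seq2))


def general_ne(seq1, seq2):
    """seq1 != seq2: true if SOME pair (x1, x2) differs."""
    return any(x1 != x2 for x1, x2 in product(seq1, seq2))


print(general_eq((1, 2), (2, 4)))  # True: the pair (2, 2) matches
print(general_eq((), ()))          # False: there is no pair at all

# Not transitive: both premises hold, the conclusion does not.
print(general_eq((1, 2), (2, 3)), general_eq((2, 3), (3, 4)), general_eq((1, 2), (3, 4)))  # True True False

# != is not the negation of =: with an empty operand both are false.
print(general_eq((1,), ()), general_ne((1,), ()))  # False False
```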

Page 49: Introduction to XML Path Language (XPath20)

Be Careful About Comparisons

((//rcp:ingredient)[40]/@name, (//rcp:ingredient)[40]/@amount) eq ((//rcp:ingredient)[53]/@name, (//rcp:ingredient)[53]/@amount)

type error (XPTY0004), since only singletons and compatible values can be compared with eq

((//rcp:ingredient)[40]/@name, (//rcp:ingredient)[40]/@amount) = ((//rcp:ingredient)[53]/@name, (//rcp:ingredient)[53]/@amount)

true, since the two names are found to be equal

((//rcp:ingredient)[40]/@name, (//rcp:ingredient)[40]/@amount) is ((//rcp:ingredient)[53]/@name, (//rcp:ingredient)[53]/@amount)

runtime error, since only single-node sequences can be compared

Page 50: Introduction to XML Path Language (XPath20)

Algebraic Axioms for Comparisons

• Reflexivity: x = x

• Symmetry: x = y ⇒ y = x

• Transitivity: x = y ∧ y = z ⇒ x = z; x < y ∧ y < z ⇒ x < z

• Anti-symmetry: x ≤ y ∧ y ≤ x ⇒ x = y

• Negation: x ≠ y ⇔ ¬(x = y)

Page 51: Introduction to XML Path Language (XPath20)

General comparisons violate most axioms

Reflexivity?
  ()=() yields false

Transitivity?
  (1,2)=(2,3), (2,3)=(3,4), not (1,2)=(3,4)

Anti-symmetry?
  (1,4)<=(2,3), (2,3)<=(1,4), not (1,4)=(2,3)

Negation?
  (1)!=() yields false, (1)=() yields false

Page 52: Introduction to XML Path Language (XPath20)

Logical Expressions

Operators: and, or
Constants use functions: true() and false()
Negation uses a function: not(…)
precedence: or < and < not(.)

Arguments are coerced; false if the value is:
  the boolean: false()
  the empty sequence: ()
  the empty string: ""
  the number zero: 0
  e.g.: 0 or "0" → true; not("0") → false; 0 or () → false

Page 53: Introduction to XML Path Language (XPath20)

Functions

XPath has an extensive function library
Default namespace for functions: http://www.w3.org/2005/xpath-functions

http://www.w3.org/2006/xpath-functions

106 functions are required

More functions with the namespace:

http://www.w3.org/2001/XMLSchema for constructors

Page 54: Introduction to XML Path Language (XPath20)

Function Invocation

Calling a function with 4 arguments:

fn:avg(1,2,3,4) -- fail

Calling a function with 1 argument:

fn:avg((1,2,3,4))

Page 55: Introduction to XML Path Language (XPath20)

Numeric operators and functions

Arithmetic operators: +, -, *, div, idiv, mod
  ex: 2 + 3, + 3, 5.0 - 4, -+4.0, 30.2 div 4.2, 30 idiv 4, 20 mod 3

value comparisons: eq(=), ne(!=), le(<=), lt(<), ge(>=), gt(>)
  2.3 > 5; 4 != 3; 4 ge 6

functions:
  fn:abs(-23.4) = 23.4
  fn:ceiling(23.4) = 24
  fn:floor(23.4) = 23
  fn:round(23.4) = 23; fn:round(23.5) = 24
  fn:round-half-to-even(-22.5) = -22
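The two rounding functions differ only in how they treat ties. A small Python sketch, with hypothetical helper names and precision 0 only, makes the difference visible. fn_round relies on float addition, so inputs extremely close to a tie may behave differently from an exact decimal implementation.

```python
import math
from decimal import Decimal, ROUND_HALF_EVEN


def fn_round(x):
    """fn:round: nearest integer, ties go toward positive infinity."""
    return math.floor(x + 0.5)


def fn_round_half_to_even(x):
    """fn:round-half-to-even with precision 0: ties go to the even neighbour."""
    return int(Decimal(str(x)).quantize(Decimal("1"), rounding=ROUND_HALF_EVEN))


print(fn_round(23.4), fn_round(23.5), fn_round(-2.5))                  # 23 24 -2
print(fn_round_half_to_even(-22.5), fn_round_half_to_even(23.5))       # -22 24
print(abs(-23.4), math.ceil(23.4), math.floor(23.4))                   # 23.4 24 23
```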

Page 56: Introduction to XML Path Language (XPath20)

Boolean Functions

Note: there are no constants for true/false; use the functions true() and false() instead.

Boolean operators: and, or
  a and b or c means (a and b) or c

functions: not(-), true(), false()
  fn:not(0) = fn:true()
  fn:not(fn:true()) = fn:false()
  fn:not("") = fn:true()
  fn:not((1)) = fn:false()

Notes: 0 and "" have effective boolean value false; (1) has effective boolean value true.

Page 57: Introduction to XML Path Language (XPath20)

Effective boolean values

The following values are interpreted as true:
  boolean true
  non-empty string
  non-zero number
  a sequence whose first item is a node

The following values are interpreted as false:
  boolean false
  empty string
  0
  () // empty sequence

Examples:
  (2,3) or (4,5) → runtime error
  2 and "" → false; (2) and (3) → true (why?)
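The effective boolean value rules can be written down as one small function. This is a sketch under simplifying assumptions: sequences are Python tuples, the Node class is a hypothetical stand-in for any XML node, and only booleans, strings and numbers are handled as atomic values. NaN is treated as false, as the XPath 2.0 specification requires.

```python
class Node:
    """Stand-in for any XML node (hypothetical, for illustration only)."""


def ebv(seq):
    """Effective boolean value of a sequence given as a Python tuple."""
    if len(seq) == 0:
        return False  # the empty sequence
    if isinstance(seq[0], Node):
        return True   # first item is a node
    if len(seq) > 1:
        raise TypeError("no effective boolean value for several atomic values")
    value = seq[0]
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        return len(value) > 0
    if isinstance(value, (int, float)):
        # Zero and NaN (the only value unequal to itself) are false.
        return not (value == 0 or value != value)
    raise TypeError("no effective boolean value for this type")


print(ebv((2,)) and ebv(("",)))  # False: 2 and ""
print(ebv((2,)) and ebv((3,)))   # True: (2) and (3) are singletons
print(ebv(("0",)))               # True: a non-empty string
```

Calling ebv((2, 3)) raises TypeError, which mirrors the runtime error of (2,3) or (4,5).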

Page 58: Introduction to XML Path Language (XPath20)

String Functions

fn:concat("X","ML") = "XML"
fn:concat("X","ML"," ","book") = "XML book"
fn:string-join(("XML","book")," ") = "XML book"
fn:string-join(("1","2","3"),"+") = "1+2+3"
fn:substring("XML book",5) = "book"
fn:substring("XML book",2,4) = "ML b"
fn:string-length("XML book") = 8
fn:upper-case("XML book") = "XML BOOK"
fn:lower-case("XML book") = "xml book"

fn:translate("bar","abc","ABC") = "BAr"
fn:translate("--aaa--","abc-","ABC") = "AAA"
fn:translate("abcdabc", "abc", "AB") = "ABdAB"
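Two of these functions behave differently from their nearest Python counterparts: fn:substring counts positions from 1, and fn:translate deletes characters that have no replacement. The sketch below re-implements both under simplifying assumptions (integer arguments only for the substring helper); the function names are hypothetical.

```python
def fn_translate(s, map_chars, trans_chars):
    """Replace each character of map_chars by the character at the same
    position in trans_chars; characters without a counterpart are deleted."""
    table = {}
    for i, ch in enumerate(map_chars):
        if ord(ch) in table:
            continue  # the first occurrence in map_chars wins
        table[ord(ch)] = trans_chars[i] if i < len(trans_chars) else None
    return s.translate(table)


def fn_substring(s, start, length=None):
    """1-based substring for integer arguments."""
    end = len(s) + 1 if length is None else start + length
    return "".join(ch for pos, ch in enumerate(s, start=1) if start <= pos < end)


print(fn_translate("bar", "abc", "ABC"))       # BAr
print(fn_translate("--aaa--", "abc-", "ABC"))  # AAA
print(fn_translate("abcdabc", "abc", "AB"))    # ABdAB
print(fn_substring("XML book", 5))             # book
print(fn_substring("XML book", 2, 4))          # ML b
```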

Page 59: Introduction to XML Path Language (XPath20)

Regexp Functions

fn:contains("XML book","XML") = fn:true()
fn:matches("XML book","XM..[a-z]*") = fn:true()
fn:matches("XML book",".*Z.*") = fn:false()
fn:replace("XML book","XML","Web") = "Web book"
fn:replace("XML book","[a-z]","8") = "XML 8888"

Page 60: Introduction to XML Path Language (XPath20)

Cardinality Functions on sequence

fn:exists(()) = fn:false()
fn:exists((1,2,3,4)) = fn:true()
fn:empty(()) = fn:true()
fn:empty((1,2,3,4)) = fn:false()
fn:count((1,2,3,4)) = 4
fn:count(//rcp:recipe) = 5

Page 61: Introduction to XML Path Language (XPath20)

Sequence Functions

fn:distinct-values((1, 2, 3, 4, 3, 2)) = (1, 2, 3, 4)

fn:insert-before((2, 4, 6, 8), 2, (3, 5)) = (2, 3, 5, 4, 6, 8) (: 2 is the position :)

fn:remove((2, 4, 6, 8), 3) = (2, 4, 8)

fn:reverse((2, 4, 6, 8)) = (8, 6, 4, 2)

fn:subsequence((2, 4, 6, 8, 10), 2) = (4, 6, 8, 10)

fn:subsequence((2, 4, 6, 8, 10), 2, 3) = (4, 6, 8)

Page 62: Introduction to XML Path Language (XPath20)

Aggregate Functions

fn:avg((2, 3, 4, 5, 6, 7)) = 4.5

fn:max((2, 3, 4, 5, 6, 7)) = 7

fn:min((2, 3, 4, 5, 6, 7)) = 2

fn:sum((2, 3, 4, 5, 6, 7)) = 27

fn:count((2, 3, 4, 5, 6, 7)) = 6

Page 63: Introduction to XML Path Language (XPath20)

Node Functions

fn:doc("http://www.brics.dk/ixwt/recipes/recipes.xml")

fn:position()

fn:last()

Page 64: Introduction to XML Path Language (XPath20)

Coercion Functions

xs:integer("5") = 5
xs:integer(7.0) = 7
xs:decimal(5) = 5.0
xs:decimal("4.3") = 4.3
xs:decimal("4") = 4.0
xs:double(2) = 2.0E0
xs:double(14.3) = 1.43E1
xs:boolean(0) = fn:false()
xs:boolean("true") = fn:true()
xs:string(17) = "17"
xs:string(1.43E1) = "14.3"
xs:string(fn:true()) = "true"

Page 65: Introduction to XML Path Language (XPath20)

For Expressions

The expression

for $r in //rcp:recipe
return fn:count($r//rcp:ingredient[fn:not(rcp:ingredient)])

returns the value

11, 12, 15, 8, 30

The expression

for $i in (1 to 5), $j in (1 to $i) return $j

returns the value

1, 1, 2, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5
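A for expression with two range variables behaves like a nested loop whose results are concatenated into one flat sequence. In Python the same iteration can be sketched as a nested list comprehension:

```python
# for $i in (1 to 5), $j in (1 to $i) return $j
result = [j for i in range(1, 6) for j in range(1, i + 1)]

print(result)  # [1, 1, 2, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, 4, 5]
```

The inner range depends on the outer variable, exactly as $j ranges over (1 to $i).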

Page 66: Introduction to XML Path Language (XPath20)

Conditional Expressions (IfThenElse)

fn:avg(
  for $r in //rcp:ingredient return
    if ( $r/@unit = "cup" ) then xs:double($r/@amount) * 237
    else if ( $r/@unit = "teaspoon" ) then xs:double($r/@amount) * 5
    else if ( $r/@unit = "tablespoon" ) then xs:double($r/@amount) * 15
    else ()
)

Page 67: Introduction to XML Path Language (XPath20)

Quantified Expressions

form: ( some | every ) $var1 in Expr1, …, $varn in Exprn satisfies Expr
  → a boolean expr

Ex:
  some $r in //rcp:ingredient satisfies $r/@name eq "sugar"

can also be written as

  fn:exists(
    for $r in //rcp:ingredient return
      if ($r/@name eq "sugar") then fn:true() else ()
  )
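The quantifiers some and every correspond closely to Python's any() and all(). The sketch below uses a made-up list of ingredient names in place of //rcp:ingredient/@name, and also shows the fn:exists rewriting as a list comprehension.

```python
# Hypothetical ingredient names standing in for //rcp:ingredient/@name.
ingredient_names = ["flour", "sugar", "butter"]

# some $r in ... satisfies $r/@name eq "sugar"
some_sugar = any(name == "sugar" for name in ingredient_names)

# every $r in ... satisfies $r/@name eq "sugar"
every_sugar = all(name == "sugar" for name in ingredient_names)

# fn:exists(for $r in ... return if (...) then fn:true() else ())
via_exists = len([True for name in ingredient_names if name == "sugar"]) > 0

print(some_sugar, every_sugar, via_exists)  # True False True

# Over an empty sequence, some is false and every is true.
print(any(n == "sugar" for n in []), all(n == "sugar" for n in []))  # False True
```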

Page 68: Introduction to XML Path Language (XPath20)

Expressions on sequence types

Expressions on SequenceTypes
  1. Instance Of
  2. Cast
  3. Castable
  4. Constructor Functions
  5. Treat

sequence types
  A sequence type is a type that can be expressed using the SequenceType syntax.
  Sequence types are used to refer to a type in an XPath expression, whose value is always a sequence.

Page 69: Introduction to XML Path Language (XPath20)

sequence type syntax

sequence type → empty-sequence() | item-type ( ? | + | * )?

item-type → atomic-type | item() | kind-test

atomic-type → any QName // xs:int, my:type

kind-test

Page 70: Introduction to XML Path Language (XPath20)

kind-test

DocumentTest: document-node(), document-node(element(book, booktype))
ElementTest: element(), element(*, xs:int), element(p:e1)
AttributeTest: attribute(), attribute(*, my:type), attribute(my:attr1)
SchemaElementTest: schema-element(Ele-Name)
  Ele-Name is declared in an in-scope schema element declaration
SchemaAttributeTest: schema-attribute(Att-Name)
PITest: processing-instruction([ NCName | string ])
CommentTest: comment()
TextTest: text()
AnyKindTest: node()

Page 71: Introduction to XML Path Language (XPath20)

SequenceType expressions

InstanceofExpr ::= TreatExpr [ instance of SequenceType ]
  5 instance of xs:integer, 5 instance of xs:decimal
  (6,5) instance of xs:integer+
  . instance of element()

CastExpr ::= UnaryExpr [ cast as AtomicType [?] ]
  2 cast as xs:double
  2 cast as xs:float
  Note: the target is a single atomic type, optionally followed by ?; the occurrence indicators + and * are not allowed here.

CastableExpr ::= CastExpr [ castable as AtomicType [?] ]
  2 castable as xs:double → true
  (2,3) castable as xs:double? → false

TreatExpr ::= CastableExpr [ treat as SequenceType ]
  ex: @addr treat as attribute(*, USAddress)
  changes the declared (static) type of @addr to USAddress.
  During evaluation, if the actual (dynamic) type does not match, an error is raised.

Page 72: Introduction to XML Path Language (XPath20)

XPath 1.0 Restrictions

Many implementations only support XPath 1.0
  Smaller function library
  Implicit casts of values
  Some expressions change semantics:

"4" < "4.0"

is false in XPath 1.0 but true in XPath 2.0

Page 73: Introduction to XML Path Language (XPath20)

XPointer

A fragment identifier mechanism based on XPath
Different ways of pointing to the fourth recipe:

...#xpointer(//recipe[4])

...#xpointer(//rcp:recipe[./rcp:title ='Zuppa Inglese'])

...#element(/1/5)

...#r102

Page 74: Introduction to XML Path Language (XPath20)

Expression Hierarchy (1.0)

PrimaryExpr: (Expr), funCall, number, literal, varReference
  ex: (Expr), f(a,b,c), 2.3, "abc", $pre

FilterExpr: PrimaryExpr pred*
  ex: $ns[@name='abc']

PathExpr: FilterExpr / LP, FilterExpr // LP, LP
  ex: $ns[@name='abc']//author[2]

UnionExpr: PathExpr | PathExpr
UnaryExpr: - UnionExpr
MultiplicativeExpr: *, div, mod
AdditiveExpr: +, -
RelationalExpr: <, <=, >, >=
EqualityExpr: =, !=
AndExpr: and
OrExpr: or
Expr: OrExpr

Page 75: Introduction to XML Path Language (XPath20)

Expression Hierarchy (2.0)

PrimaryExpr: (Expr?), funCall, numberOrStringLiteral, varRef, cxtItemExpr
  ex: (Expr), (), f(a,b,c), 2.3, "abc", $xyz, .

StepExpr ::= (PrimaryExpr | AxisStep) Pred*
  ex: $x[@name eq 'abc'], pre:e1[@name][2]

RelativePathExpr ::= StepExpr (('/' | '//') StepExpr)*
  ex: $ns[@name='abc']//author[2]/@name

PathExpr ::= ('/' RelativePathExpr?) | ('//' RelativePathExpr) | RelativePathExpr
ValueExpr ::= PathExpr
UnaryExpr ::= ('+' | '-')* ValueExpr
CastExpr ::= UnaryExpr ('cast' 'as' AtomicType '?'?)?
  ex: /bk:books[2]/@name cast as xs:string
  ex: () cast as xs:int?

Page 76: Introduction to XML Path Language (XPath20)

CastableExpr ::= CastExpr ('castable' 'as' AtomicType '?'?)?
  ex: if ($x castable as my:type) then $x cast as my:type else $x cast as xs:string

TreatExpr ::= CastableExpr ('treat' 'as' SequenceType)?
  ex: $addr treat as element(*, USAddress)
  the static type of $addr may be element(*, Address), but it is required to be element(*, USAddress) at runtime; otherwise a dynamic error is raised.

InstanceofExpr ::= TreatExpr ('instance' 'of' SequenceType)?
IntersectExpr ::= InstanceofExpr (('intersect' | 'except') InstanceofExpr)*
UnionExpr ::= IntersectExpr (('union' | '|') IntersectExpr)*

Page 77: Introduction to XML Path Language (XPath20)

MultiplicativeExpr: *, div, idiv, mod
  ex: 5 div 2 * 3

AdditiveExpr: +, -
  ex: 2 + 3 - 4

RangeExpr ::= AdditiveExpr ('to' AdditiveExpr)?
  ex: 3 to 100

ComparisonExpr ::= RangeExpr ((NodeCmp | ValueCmp | GeneralCmp) RangeExpr)?

AndExpr: and
OrExpr: or
ExprSingle ::= OrExpr | IfExpr | ForExpr | QuantifiedExpr
Expr ::= ExprSingle (',' ExprSingle)*
XPath ::= Expr