
Advanced Database Topics

Copyright © Ellis Cohen 2002-2005

Querying XML with XPath and XQuery

These slides are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.

For more information on how you may use them, please see http://www.openlineconsult.com/db

CS 779 Spring 2005 © Ellis Cohen, 2002-2005

Topics

XPath 1.0

Predicates

XPath Nodes & Axes

XPath 2.0

XQuery

Element Construction with XQuery


XPath 1.0


XPath: The XML Path Language

Used to select nodes of the document matching given criteria

Identifies parts of an XML document; used in XML Schema, XPointer, XSLT and XQuery

Data Types: strings, numbers, booleans and node-sets (with support for basic operations & functions)

Compact, non-XML syntax similar to OS paths (uses / to walk hierarchy)


XML Element Nodes

<CourseBooks>
  <Course>CS779</Course>
  <Book>
    <Title>Database Design, Implementation &amp; Management, 5th Edition</Title>
    <Author>Rob &amp; Coronel</Author>
    <Publisher>Course Technology</Publisher>
  </Book>
  <Book>
    <Title>Professional XML Databases</Title>
    <Author>Williams</Author>
    <Publisher>Wrox Press</Publisher>
  </Book>
</CourseBooks>

root
  CourseBooks
    Course
    Book
      Title, Author, Publisher
    Book
      …

This XML structure will be used for examples, with other elements & attributes added as needed


Hierarchy Navigation

XPath uses / to navigate down the tree

Selects a nodeset – a set of nodes, not just a single node

Absolute Navigation
  Start XPath expression with /
  Starts at root of document

Relative Navigation
  Doesn't start with /
  Starts at the context (i.e. current node)


Simple Navigation

/CourseBooks
  The CourseBooks element(s) of the root

/CourseBooks/Book
  All the Book element children of the CourseBooks child of the root

/CourseBooks/Book/Author
  All the authors of all the books

Author
  All the Author children of the context node

./Author
  Same, since . means the context node
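The child steps above can be tried out with Python's standard xml.etree.ElementTree module. This is only a sketch: ElementTree implements a small subset of XPath and evaluates every path relative to an element, so the CourseBooks element returned by the parser stands in for the /CourseBooks step.

```python
import xml.etree.ElementTree as ET

DOC = """
<CourseBooks>
  <Course>CS779</Course>
  <Book>
    <Title>Database Design, Implementation &amp; Management, 5th Edition</Title>
    <Author>Rob &amp; Coronel</Author>
    <Publisher>Course Technology</Publisher>
  </Book>
  <Book>
    <Title>Professional XML Databases</Title>
    <Author>Williams</Author>
    <Publisher>Wrox Press</Publisher>
  </Book>
</CourseBooks>
"""

# fromstring returns the CourseBooks element; paths below are relative to it.
course_books = ET.fromstring(DOC)

# /CourseBooks/Book: the Book children of CourseBooks
books = course_books.findall("Book")

# /CourseBooks/Book/Author: all the authors of all the books
authors = [a.text for a in course_books.findall("Book/Author")]

# Author and ./Author, taking the first Book as the context node
first_book = books[0]
same_result = first_book.findall("Author") == first_book.findall("./Author")

print(len(books), authors, same_result)
```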


Navigation to Descendants

//Author
  All Author descendants of the root

.//Author
  All Author descendants of the context node

//Author/Name
  The names (i.e. child Name elements) of all the authors

/CourseBooks//Author
  All Author descendants of CourseBooks


XPath 1.0 Result

Query: //Author
Result is a sequence of nodes in document order

[Tree diagram: root, CourseBooks, then Course ("CS779") and two Book nodes; the result consists of the two Author nodes, "Rob & Coronel" and "Williams"]


Text Fragments & Node Sets

[Diagram: the two result nodes, Author ("Rob & Coronel") and Author ("Williams")]

You can imagine that the result of the query is a set of XML text fragments:

<Author>Rob & Coronel</Author>, <Author>Williams</Author>

but that's just a convenient way of visualizing the result. The result is the actual set of nodes.


Selecting All Elements

/CourseBooks/*
  All elements which are children of CourseBooks (e.g. the Course and the 2 Book elements)

//Author/*
  All child elements of all the authors

//Book//*
  All descendant elements of all the books
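A small sketch of the wildcard and descendant steps, again using Python's xml.etree.ElementTree. The sample document (authors with a Name child, titles T1 and T2) is made up for illustration, and the leading "." is needed because ElementTree only accepts relative paths.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<CourseBooks><Course>CS779</Course>"
    "<Book><Title>T1</Title><Author><Name>Rob</Name></Author></Book>"
    "<Book><Title>T2</Title><Author><Name>Williams</Name></Author></Book>"
    "</CourseBooks>"
)

# /CourseBooks/*: the element children of CourseBooks
children = [e.tag for e in root.findall("*")]

# //Author/*: the child elements of every Author
author_children = [e.tag for e in root.findall(".//Author/*")]

# //Book//*: every descendant element of every Book
book_descendants = [e.tag for e in root.findall(".//Book//*")]

print(children, author_children, book_descendants)
```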


Predicates


Positional Predicates

//Book/Author[1]
  1st author of each book

//Book/Author[3]/Address[1]
  First address of the 3rd author of each book (ignores books that don't have 3 authors)

(//Book/Author)[1]
  The 1st of all the book authors
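The difference between a positional predicate on a step and one on a parenthesized expression can be sketched with xml.etree.ElementTree. ElementTree has no parenthesized form, so the second case is emulated by indexing the Python list; the two-book document is a made-up example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<Books>"
    "<Book><Author>Kelly</Author><Author>Cohen</Author></Book>"
    "<Book><Author>Williams</Author></Book>"
    "</Books>"
)

# //Book/Author[1]: position is counted within each Book
first_of_each_book = [a.text for a in root.findall(".//Book/Author[1]")]

# (//Book/Author)[1]: position is counted over the whole result
first_overall = root.findall(".//Book/Author")[0].text

print(first_of_each_book, first_overall)
```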


Descendant Positional Predicates

//Author[1]
  1st author of each book (assuming authors are only children of books; more generally, the first Author child of each node that has Author children)

(//Author)[1]
  The 1st of all the authors

(//Author)[3]/Address[1]
  The 1st address of the 3rd author


Comparative Predicates

//Author[Lastname="Cohen"]
//Author[./Lastname="Cohen"]
  Authors whose lastname is Cohen
  Note: Lastname is evaluated in the context of each author
  Note: Lastname is a node; comparing it to a string automatically uses its string-value (for simple elements, its contents)

//Book[Author/Lastname="Cohen"]
  Books with an author whose lastname is Cohen

//Book[.//Lastname="Cohen"]
  Same, if lastnames are only children of authors

//Book/Author[starts-with(Lastname,"C")]
  Book authors whose lastname starts with C


Comparison Problem

What's the difference between

//Book/Author[Lastname = "Cohen"]

and

//Book[Author/Lastname = "Cohen"]/Author


Comparison Answer

//Book/Author[Lastname = "Cohen"]
  -- Book authors whose last name is Cohen

//Book[Author/Lastname = "Cohen"]
  -- Books that have an author whose last name is Cohen

//Book[Author/Lastname = "Cohen"]/Author
  -- Authors of books that have an author whose last name is Cohen

If a book has two authors, Kelly and Cohen, then the first expression will include just the Author node for Cohen, while the last expression will include the Author nodes for both Kelly and Cohen.
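The Kelly and Cohen case can be replayed in Python. Since ElementTree does not accept a path such as Author/Lastname inside a predicate, this sketch emulates both expressions with list comprehensions; the helper lastname() and the one-book document are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<Books><Book>"
    "<Author><Lastname>Kelly</Lastname></Author>"
    "<Author><Lastname>Cohen</Lastname></Author>"
    "</Book></Books>"
)

def lastname(author):
    """String-value of the author's Lastname child."""
    return author.findtext("Lastname")

# //Book/Author[Lastname = "Cohen"]: the predicate filters the authors
filtered_authors = [
    lastname(a)
    for b in root.iter("Book")
    for a in b.findall("Author")
    if lastname(a) == "Cohen"
]

# //Book[Author/Lastname = "Cohen"]/Author: the predicate filters the books,
# then every author of a surviving book is returned
authors_of_filtered_books = [
    lastname(a)
    for b in root.iter("Book")
    if any(lastname(x) == "Cohen" for x in b.findall("Author"))
    for a in b.findall("Author")
]

print(filtered_authors, authors_of_filtered_books)
```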


Parent Traversal & Duplicates

//Author[Lastname="Cohen"]
  – authors whose lastname is Cohen

//Author[Lastname="Cohen"]/..
  – books with an author whose lastname is Cohen (assuming that authors only appear as children of books)

Note: Suppose a book has two authors whose lastnames are both Cohen. The parent book node will only be included once. XPath 1.0 expressions automatically eliminate duplicate nodes.


Function-Based Predicates

//Book/Author[2]
//Book/Author[position()=2]
  The second author of each book

//Book/Author[last()]
//Book/Author[position()=last()]
  The last author of each book

//Book/Author[position()<3]
  The first 2 authors of each book
  (//Book/Author[1..2] or //Book/Author[(1,2)] are not legal!)

//Book[count(Author)>2]
  Books with 3 or more authors

//Book[count(Author)>2]/Author
  The authors of those books
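A sketch of these function-based predicates in Python. ElementTree understands [last()] directly; position() < 3 and count() are emulated with list slicing and len(). The id attributes and author names are invented for the example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<Books>"
    "<Book id='b1'><Author>Ann</Author><Author>Bob</Author><Author>Cy</Author></Book>"
    "<Book id='b2'><Author>Dee</Author></Book>"
    "</Books>"
)

# //Book/Author[last()]: the last author of each book
last_authors = [a.text for a in root.findall(".//Book/Author[last()]")]

# //Book/Author[position() < 3]: the first two authors of each book
first_two = [a.text for b in root.iter("Book") for a in b.findall("Author")[:2]]

# //Book[count(Author) > 2]: books with 3 or more authors
big_books = [b.get("id") for b in root.iter("Book") if len(b.findall("Author")) > 2]

print(last_authors, first_two, big_books)
```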


Aggregate Functions

count(//Book[.//Lastname="Cohen"])
  # of books authored by Cohen

//Book[Price > avg(//Book/Price)]
  Books whose price is greater than the average book price

//Book[Price = min(//Book/Price)]
  Books whose price is equal to the minimum book price

(count and sum are XPath 1.0 functions; avg, min and max were added in XPath 2.0)

What is //Book[Price > avg(//Book/Price)]/Price ?


Average Problem Solution

What is //Book[Price > avg(//Book/Price)]/Price

The prices of all the books whose price is above average
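The aggregate queries can be checked by hand with a few lines of Python. This sketch computes the average and minimum with the standard library rather than with an XPath engine; the three books and their prices are made-up sample data.

```python
import xml.etree.ElementTree as ET
from statistics import mean

root = ET.fromstring(
    "<Books>"
    "<Book><Title>A</Title><Price>32.95</Price></Book>"
    "<Book><Title>B</Title><Price>18.25</Price></Book>"
    "<Book><Title>C</Title><Price>55.00</Price></Book>"
    "</Books>"
)

prices = [float(p.text) for p in root.findall("Book/Price")]
average = mean(prices)  # plays the role of avg(//Book/Price)

# //Book[Price > avg(//Book/Price)]/Price
above_average = [p for p in prices if p > average]

# //Book[Price = min(//Book/Price)], reported by title
cheapest = [
    b.findtext("Title")
    for b in root.findall("Book")
    if float(b.findtext("Price")) == min(prices)
]

print(above_average, cheapest)
```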

What query would find all books that have more than one author named Cohen?

Hint: The answer is of the form: //Book[ count( something ) > 1 ]


Count Problem Solutions

What query would find all books that have more than one author named Cohen?

//Book[ count(Author[Lastname = "Cohen"]) > 1 ]

Note that the following does not work

//Book[ count(Author/Lastname = "Cohen") > 1 ]

since Author/Lastname = "Cohen" is a boolean expression, and count operates on a nodeset.
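The working query has a close ElementTree counterpart, because ElementTree supports the simple predicate form [tag='text']. In this sketch the book ids are invented, and count() is played by len().

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<Books>"
    "<Book id='b1'>"
    "<Author><Lastname>Cohen</Lastname></Author>"
    "<Author><Lastname>Cohen</Lastname></Author>"
    "</Book>"
    "<Book id='b2'>"
    "<Author><Lastname>Cohen</Lastname></Author>"
    "<Author><Lastname>Kelly</Lastname></Author>"
    "</Book>"
    "</Books>"
)

# //Book[ count(Author[Lastname = "Cohen"]) > 1 ]
# The inner predicate yields a list of Author nodes, which can be counted.
multi_cohen = [
    b.get("id")
    for b in root.iter("Book")
    if len(b.findall("Author[Lastname='Cohen']")) > 1
]

print(multi_cohen)
```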


Unions and Boolean Operators

//Book[(count(Author)>2) or (Price>50)]
//Book[count(Author)>2] | //Book[Price>50]
  Books that have more than 2 authors or that cost more than $50

//Book[Price>50]/Author | //Book/Author[count(Address)>1]
  Authors of books whose price is more than $50, or authors who have more than one address


Intersection & Difference

//Book[(count(Author)>2) and (Price>50)]
//Book[count(Author)>2] intersect //Book[Price>50]
  Books that have more than 2 authors and that cost more than $50

//Book[(count(Author)>2) and not(Price>50)]
//Book[count(Author)>2] except //Book[Price>50]
  Books that have more than 2 authors and that do not cost more than $50

intersect & except added in XPath 2.0


Multiple Predicates

//Book[(count(Author)>2) and (Price>50)]
//Book[count(Author)>2][Price>50]
  Books that have more than 2 authors and that cost more than $50

//Book[Price > 50][position() < 11]
(//Book[Price > 50])[position() < 11]
  The first 10 books that cost more than $50 (the first form counts positions separately under each parent, so the two agree only when all the books share one parent)


General Comparison of Sets and Values

//Book[Author/Lastname="Cohen"]
  Books with some author whose lastname is Cohen

//Book[Author/Lastname!="Cohen"]
  Books with some author whose lastname is not Cohen

//Book[not(Author/Lastname="Cohen")]
  Books that do not have some author whose lastname is Cohen

What is //Book[not(Author/Lastname!="Cohen")] ?

When comparing a set of values (or nodes with values) to a single value, the comparison only needs to be satisfied for a single element of the set.


Every Author

//Book[not(Author/Lastname!="Cohen")]
  Books that do not have some author whose lastname is not Cohen

That is:
  Books whose authors' lastnames are all Cohen (including any book that has no authors at all)
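The "some" semantics of = and != map onto Python's any(), and the double negation maps onto all(). In this sketch a plain dictionary (book label to list of author lastnames, all invented) stands in for the XML document.

```python
# Hypothetical data: each book label maps to its authors' lastnames.
books = {
    "A": ["Cohen", "Kelly"],
    "B": ["Cohen", "Cohen"],
    "C": ["Kelly"],
}

# //Book[Author/Lastname = "Cohen"]: some lastname equals Cohen
some_equal = [t for t, names in books.items() if any(n == "Cohen" for n in names)]

# //Book[Author/Lastname != "Cohen"]: some lastname differs from Cohen
some_differ = [t for t, names in books.items() if any(n != "Cohen" for n in names)]

# //Book[not(Author/Lastname = "Cohen")]: no lastname equals Cohen
none_equal = [t for t, names in books.items() if not any(n == "Cohen" for n in names)]

# //Book[not(Author/Lastname != "Cohen")]: every lastname equals Cohen
all_equal = [t for t, names in books.items() if all(n == "Cohen" for n in names)]

print(some_equal, some_differ, none_equal, all_equal)
```

Note that all() over an empty list is True, which mirrors the way the last query also accepts a book with no authors.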


Comparing Sets and Sets

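//Book[Author/Lastname=("Cohen","Jones")]
  – books that have some author whose lastname is either Cohen or Jones

//Book[Author/Lastname!=("Cohen","Jones")]
  – books that have some author whose lastname differs from at least one of Cohen and Jones. Since no lastname equals both, this matches every book that has an author with a lastname. For "neither Cohen nor Jones", use //Book[Author[not(Lastname=("Cohen","Jones"))]]

What do you think the meaning of this is:

//Book[Publisher!="Wrox Press"][Author/Lastname = //Book[Publisher="Wrox Press"]/Author/Lastname]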


Comparing Sets Problem

//Book[Publisher="Wrox Press"]
  – books published by Wrox Press

//Book[Publisher="Wrox Press"]/Author/Lastname
  – lastnames of authors of books published by Wrox Press

//Book[Publisher!="Wrox Press"][Author/Lastname=("Cohen","Jones")]
  – books not published by Wrox Press that have an author whose lastname is either Cohen or Jones

//Book[Publisher!="Wrox Press"][Author/Lastname = //Book[Publisher="Wrox Press"]/Author/Lastname]
  – books not published by Wrox Press that have an author whose lastname is the same as the lastname of some author of a book published by Wrox Press
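Comparing two sets with = succeeds when the sets share at least one value, which is a non-empty intersection in Python terms. The sketch below uses a made-up list of dictionaries in place of the XML document.

```python
# Hypothetical data standing in for the Book elements.
books = [
    {"title": "T1", "publisher": "Wrox Press", "lastnames": {"Williams", "Cohen"}},
    {"title": "T2", "publisher": "Course Technology", "lastnames": {"Cohen"}},
    {"title": "T3", "publisher": "Course Technology", "lastnames": {"Rob"}},
]

# //Book[Publisher="Wrox Press"]/Author/Lastname
wrox_lastnames = set().union(
    *(b["lastnames"] for b in books if b["publisher"] == "Wrox Press")
)

# //Book[Publisher!="Wrox Press"][Author/Lastname = <the set above>]
shared = [
    b["title"]
    for b in books
    if b["publisher"] != "Wrox Press" and b["lastnames"] & wrox_lastnames
]

print(shared)
```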


XPath 1.0 Problem

Suppose that Author is a subelement of Book, and that Author has a Name element, as well as an Authid element which uniquely identifies the author.

Can you write an XPath 1.0 expression to return the names of authors who have authored more than one book?


Limitations of XPath 1.0

Can you write an XPath 1.0 expression to return the authors who have authored more than one book?

You'd like to write something like

//Author[count( //Book[Author/Authid=$./Authid] ) > 1]/Name

That is, you'd like $. to mean the Author nodes we are currently examining, and which we want to include in the result if the # of books by that author is > 1.

But, alas, there is no $. syntax, and . would refer to the inner book being examined, not the author, so this can't be written in XPath 1.0; we'll see how to do this using XPath 2.0


XPath Nodes & Axes


Attributes

//Book[@isbn]
  All books with an isbn attribute

//Book[Author/@status="deceased"]
  All books with a deceased author

//Book[@*]
  All books that have some attribute

//Book[not(@*)]
  All books that have no attributes

//Book/@id
  The id attributes of all books. WHOA! Those aren't element nodes!


XPath Node Types

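1. Root node
2. Element node
3. Attribute node
4. Text node
5. Processing instruction node
6. Comment node
7. Namespace node

Every element node has an associated namespace node for EACH in-scope namespace (the element is their parent, though they are not counted among its children)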


Attribute Nodes

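//Book/@isbn
  The isbn attribute nodes of all the books

//Book/@*
  All attributes of all books

id("here")
  The element whose ID-type attribute has the value "here". id() returns elements rather than attribute nodes, it requires an argument (so //id() is not valid), and an attribute has type ID only if the DTD declares it so

id("curbook")[self::Book]
  The book with an ID of "curbook". (Written as //Book[id("curbook")], the predicate would be true for every book whenever some element has that ID.)

What is the result of
//Book[starts-with(.//@authid, "fr_")]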


Attribute Solution

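What is the result of
//Book[starts-with(.//@authid, "fr_")]

Well, assuming that authors have an authid attribute, and that authid attributes start with a prefix which indicates the author's nationality ("fr_" meaning French), this identifies books whose first author (in document order) is French: in XPath 1.0, starts-with converts a node-set argument to the string-value of its first node.

To identify all books with a French author, use //Book[.//@authid[starts-with(., "fr_")]]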


Text Nodes

Book
  Title
    "XML Stuff"   (text node)
  Description
    "It's "       (text node)
    em
      "way"       (text node)
    " cool"       (text node)


Text Node Queries

(//Book)[1]/Title
  Might correspond to the element node: <Title>XML Stuff</Title>

(//Book)[1]/Title/text()
  Would correspond to the text node containing: "XML Stuff"

(//Book)[1]/Description
  Might correspond to the element node
  <Description>It's <em>way</em> cool</Description>

(//Book)[1]/Description/node()
  This returns a set of the 3 child nodes: a text node, an element node, and another text node

(//Book)[1]/Description/*
  Would correspond to the one child element node: <em>way</em>

(//Book)[1]/Description/text()
  Would correspond to the two text nodes for "It's " and " cool"

What about (//Book)[1]/Description//text() ?


Text Descendants vs Children

Description
  "It's "
  em
    "way"
  " cool"

(//Book)[1]/Description/text()
  Would correspond to the two child text nodes for "It's " and " cool"

(//Book)[1]/Description//text()
  Would correspond to the three descendant text nodes for "It's ", "way" and " cool"
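Python's xml.etree.ElementTree has no text-node objects, so this sketch only approximates the two queries: an element's text before its first child is stored in .text, and text following a child element is stored in that child's .tail. With that mapping, the child and descendant text of Description come out as follows.

```python
import xml.etree.ElementTree as ET

desc = ET.fromstring("<Description>It's <em>way</em> cool</Description>")
em = desc.find("em")

# Description/text(): text that sits directly inside Description
child_text = [desc.text, em.tail]

# Description//text(): all text inside Description, at any depth
descendant_text = list(desc.itertext())

print(child_text, descendant_text)
```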


XPath Navigation Axes

[Diagram of the axes around a context node (self): ancestor, descendant, following, preceding, child, attribute, namespace, following-sibling, preceding-sibling]

from Arnaud Sahuguet


XPath Navigation Axes

child (default)
self (abbreviate using .)
parent (abbreviate using ..)
attribute (abbreviate using @)
descendant-or-self (abbreviate using //)
descendant
ancestor-or-self
ancestor
preceding
preceding-sibling
following
following-sibling
namespace


Using XPath Navigation Axes

//book/author/@status
  /descendant-or-self::node()/child::book/child::author/attribute::status

//description/text()
  /descendant-or-self::node()/child::description/child::text()


Uses of XPath

To specify queries

To specify which set of elements need to have unique values, be keys, or contain keyrefs for XML Schema

To identify sets of nodes to be formatted or transformed by XSLT

To identify the parts of documents to be hyperlinked using XPointer


XML Key Reference Example

root
  BookDB
    Booklist
      Book: title, Authref, publisher
      ……
    Authlist
      Author: name, address, dob, authid
      ……


XML Schema Key and Keyref Example

<xs:key name="authkeys">
  <xs:selector xpath=".//Author"/>
  <xs:field xpath="@authid"/>
</xs:key>

Every author's authid is unique and non-nil

<xs:keyref name="authrefs" refer="authkeys">
  <xs:selector xpath=".//Book"/>
  <xs:field xpath="@authref"/>
</xs:keyref>

Each book's authref refers to a legal authid: the contents of a book's authref attribute must correspond to some author's authid attribute

(XML Schema selectors accept only a restricted XPath subset, in which a descendant search is written .// rather than //)


Cross Referencing

In BookDB,

Find the titles of books whose author is Williams

//Book[@authref = //Author[@name="Williams"]/@authid]/@title

Find the names of authors of books published by Wrox Press


Cross Referencing Solution

Find the names of authors of books published by Wrox Press

//Author[@authid = //Book[@publisher="Wrox Press"]/@authref]/@name
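Both cross-referencing queries are joins between Book/@authref and Author/@authid. A rough Python equivalent, using a small invented BookDB document with attribute names taken from the queries, looks like this:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""
<BookDB>
  <Booklist>
    <Book title="Professional XML Databases" publisher="Wrox Press" authref="a1"/>
    <Book title="Database Design" publisher="Course Technology" authref="a2"/>
  </Booklist>
  <Authlist>
    <Author authid="a1" name="Williams"/>
    <Author authid="a2" name="Rob"/>
  </Authlist>
</BookDB>
""")

# //Book[@authref = //Author[@name="Williams"]/@authid]/@title
williams_ids = {a.get("authid") for a in root.iter("Author") if a.get("name") == "Williams"}
titles = [b.get("title") for b in root.iter("Book") if b.get("authref") in williams_ids]

# //Author[@authid = //Book[@publisher="Wrox Press"]/@authref]/@name
wrox_refs = {b.get("authref") for b in root.iter("Book") if b.get("publisher") == "Wrox Press"}
names = [a.get("name") for a in root.iter("Author") if a.get("authid") in wrox_refs]

print(titles, names)
```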


Early OPath

Filters
  Employee[empno = 3417]
    • The employee whose empno is 3417
  Employee[job = 'ANALYST']
    • The employees who are analysts

"Collecting" Navigation
  Employee[dept.dname = 'RESEARCH']
  Dept[dname = 'RESEARCH'].empls
    • Employees in the research department
  Employee[job = 'CLERK'].dept
  Dept[empls.job = 'CLERK']
    • Departments that have clerks

Initial versions of OPath (Microsoft's language for querying in object models) used the same syntax as XPath. The next iteration replaced the "/" (standard in the web-based world) with "." (standard in the OO world).

OPath has since evolved farther from XPath.


XPath 2.0


XPath 2.0 Queries

Extend XPath 1.0 queries by
  Adding "for … return …" syntax
  Additional conditional and quantified expressions
  Adding more functions and operators

XPath 2.0 is designed to be a common subset of two full-fledged XML query languages: XQuery & XSLT


XPath 2.0 Type Model

XPath 1.0
  Boolean, Number, String
  Nodeset
    no duplicates
    traversed in forward or backward document order

XPath 2.0
  All primitive XML Schema Datatypes
  Sequences of nodes and/or primitive values
    Ordered (not always document order), allow duplicates (use distinct-values/distinct-nodes fns). XPath 1.0 expressions always in document order with duplicates removed
    Flattened (no sequences of sequences, though a node can represent an arbitrary hierarchy)
    No difference between a single node/value and a singleton sequence


Iteration

for $x in //Book return $x/Price
  returns a sequence of the price nodes
  equivalent to //Book/Price

( <Price>32.95</Price>, <Price>18.25</Price>, … )

XPath 1.0 Expressions


Obtaining Values

for $x in //Book return $x/Price/text()
  returns a sequence of the prices as text
  equivalent to //Book/Price/text()

for $x in //Book return number($x/Price)
  returns a sequence of the actual price values

( 32.95, 18.25, … )


Conditional Expressions

for $x in //Book
return
  if (count($x/Author) > 2)
  then $x/Price * .5
  else $x/Price

Note the resulting sequence has both price nodes and numbers
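The for/if/then/else expression reads much like a Python list comprehension with a conditional expression. One difference in this sketch: it converts every price to a float, so the result holds only numbers, whereas the XPath result mixes Price nodes and numbers. The sample books are invented.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<Books>"
    "<Book><Author>A</Author><Author>B</Author><Author>C</Author><Price>40.00</Price></Book>"
    "<Book><Author>D</Author><Price>18.25</Price></Book>"
    "</Books>"
)

def price(book):
    """Numeric value of the book's Price child."""
    return float(book.findtext("Price"))

# for $x in //Book return
#   if (count($x/Author) > 2) then $x/Price * .5 else $x/Price
result = [
    price(x) * 0.5 if len(x.findall("Author")) > 2 else price(x)
    for x in root.iter("Book")
]

print(result)
```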


XPath 1.0 vs XPath 2.0

Suppose that Author is a subelement of Book, and that Author has a Name element, as well as an authid attribute which uniquely identifies the author.

Can you write an XPath 2.0 expression to return the names of authors who have authored more than one book?

Hint: Try
  for $a in //Author return
    if (count( something ) > 1) then $a/Name else ()


XPath 2.0 Solution

Return the names of authors who have authored more than one book

for $a in //Author return
  if (count( //Book[Author/@authid=$a/@authid] ) > 1)
  then $a/Name else ()

Note: () denotes the empty sequence

The final result is the concatenation of the sequences returned for each iteration of the for loop.

Concatenating the empty sequence has no effect
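The way the per-iteration sequences are concatenated can be traced in Python, where extending a list with an empty list plays the role of concatenating the empty sequence. The helper books_by() and the two-book document are assumptions made for this sketch. Because Author elements are nested in books, an author of two books is visited twice and the name appears twice.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<Books>"
    "<Book><Author authid='a1'><Name>Cohen</Name></Author></Book>"
    "<Book>"
    "<Author authid='a1'><Name>Cohen</Name></Author>"
    "<Author authid='a2'><Name>Kelly</Name></Author>"
    "</Book>"
    "</Books>"
)

def books_by(authid):
    """Emulates //Book[Author/@authid = $a/@authid]."""
    return [
        b for b in root.iter("Book")
        if any(a.get("authid") == authid for a in b.findall("Author"))
    ]

result = []
for a in root.iter("Author"):  # for $a in //Author
    names = [a.findtext("Name")] if len(books_by(a.get("authid"))) > 1 else []
    result.extend(names)       # extending with [] leaves the result unchanged

print(result)
```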


Quantified Expressions

//Book[some $a in Author satisfies starts-with($a/Lastname,"C")]
  Books with some author whose lastname starts with C

//Book[some $a in Author satisfies $a/Lastname="Cohen"]
  Books with some author whose lastname is Cohen

//Book[every $a in Author satisfies $a/Lastname="Cohen"]
  Books all of whose authors' lastnames are Cohen

Get the names of authors, all of whose books are published by Wrox Press


Quantified Problem Solution

for $a in //Author return
  if (every $b in //Book[Author/Name = $a/Name]
      satisfies $b/Publisher = "Wrox Press")
  then $a/Name else ()

Note: This will include authors who are not authors of any books.

How could this be fixed?


Duplicate Nodes in XPath 2.0

//Author[Lastname="Cohen"]/..
Books with an author whose lastname is Cohen (assuming that authors only appear as children of books). A book that has two authors whose lastnames are both Cohen will appear once, since XPath 1.0 expressions automatically eliminate duplicate nodes.

for $a in //Author[Lastname="Cohen"]
return $a/..
Books with an author whose lastname is Cohen (assuming that authors only appear as children of books). A book that has two authors whose lastnames are both Cohen will appear twice. Duplicate nodes are only eliminated automatically in XPath 1.0 expressions.

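distinct-nodes(
  for $a in //Author[Lastname="Cohen"]
  return $a/..
)
Explicitly eliminates duplicate nodes.

Note: distinct-nodes comes from the XPath 2.0 working drafts, and a processor that follows the final recommendation may not provide it. There, taking the union of the sequence with the empty sequence, ( … ) | (), also removes duplicate nodes.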
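
The difference between the loop result and the de-duplicated result can be reproduced with a short Python sketch. The sample document and the helper names are invented; ElementTree has no parent axis, so a child-to-parent map stands in for "..".

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: one book with two authors named Cohen.
ROOT = ET.fromstring("""
<Catalog>
  <Book><Title>Family Business</Title>
    <Author><Lastname>Cohen</Lastname></Author>
    <Author><Lastname>Cohen</Lastname></Author></Book>
  <Book><Title>Solo</Title>
    <Author><Lastname>Smith</Lastname></Author></Book>
</Catalog>
""")

# Build a child -> parent map, since ElementTree keeps no parent pointers.
PARENT = {child: parent for parent in ROOT.iter() for child in parent}

def parents_per_iteration(root):
    # for $a in //Author[Lastname="Cohen"] return $a/..
    return [PARENT[a] for a in root.iter("Author")
            if a.findtext("Lastname") == "Cohen"]

def distinct_nodes(nodes):
    # Remove duplicates by node identity (not by value), keeping first-seen order.
    seen, out = set(), []
    for n in nodes:
        if id(n) not in seen:
            seen.add(id(n))
            out.append(n)
    return out

books = parents_per_iteration(ROOT)
print([b.findtext("Title") for b in books])                  # same book twice
print([b.findtext("Title") for b in distinct_nodes(books)])  # once
```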

XQuery

FLWOR Expressions

for …        Any number of these (at least one), in any order
let …
where …      Optional
order by …   Optional
return …     Required

Variable Binding

for $x in //Book
let $p := $x/Price
return number($p)

Result: ( 32.95, 18.25, … )

Let is a binding operator. There is no assignment operator.

Where Clause

for $x in //Book
let $p := $x/Price
where $p > 5.00
return $p

Equivalent to //Book[Price > 5.00]/Price

XQuery Problem

Suppose that Author is a subelement of Book, and that Author has a Name element, as well as an authid attribute which uniquely identifies the author.

What's the clearest XQuery expression which returns the string values of the names of authors who have authored more than one book?

XQuery Solution

What's the clearest XQuery expression which returns the string values of the names of authors who have authored more than one book?

for $a in //Author
let $abooks := //Book[Author/@authid = $a/@authid]
where count($abooks) > 1
return string($a/Name)

Ordering

for $b in //Book[Price > 100]
order by $b/Author[1]/Name, $b/Price descending
return string($b/Title)

Return a sequence of the string values of the titles of all the books whose price is greater than $100.

Order them
  First, by the name of the first author
  Secondly, by price, highest price first
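
The two-level ordering can be illustrated with a Python sort on a compound key. This is only a sketch over invented records (the titles, names and prices are hypothetical): the first key is ascending, and negating the numeric price makes the second key descending.

```python
# Hypothetical book records; field names loosely follow the slide's elements.
BOOKS = [
    {"title": "T1", "authors": ["Zed"], "price": 120.0},
    {"title": "T2", "authors": ["Ann", "Zed"], "price": 150.0},
    {"title": "T3", "authors": ["Ann"], "price": 200.0},
    {"title": "T4", "authors": ["Ann"], "price": 90.0},
]

def expensive_titles(books):
    # //Book[Price > 100]
    selected = [b for b in books if b["price"] > 100]
    # order by first author's name ascending, then price descending
    selected.sort(key=lambda b: (b["authors"][0], -b["price"]))
    # return string($b/Title)
    return [b["title"] for b in selected]

print(expensive_titles(BOOKS))
```

T4 is dropped by the price filter; among Ann's books the more expensive one comes first.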

User-Defined Functions

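(: User-defined functions must be declared in a namespace;
   the local: prefix is predeclared for this purpose. :)
declare function local:depth($e as node()) as xs:integer
{
  if (empty($e/*))
  then 1
  else max(for $c in $e/* return local:depth($c)) + 1
};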
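
The same recursion, written as a Python sketch over ElementTree for comparison (the sample document is invented): an element with no child elements has depth 1, otherwise the depth is one more than that of its deepest child.

```python
import xml.etree.ElementTree as ET

def depth(e):
    children = list(e)              # child elements, like $e/*
    if not children:                # empty($e/*)
        return 1
    return max(depth(c) for c in children) + 1

doc = ET.fromstring("<a><b><c/></b><d/></a>")
print(depth(doc))  # 3: a -> b -> c
```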

XQueryX

The XQuery syntax is not XML-Based

XQueryX has the same semantics as XQuery, but it is XML-Based.

Too large & ugly to include

See http://w3.org/TR/xqueryx

Element Construction with XQuery

Element Construction

<somenum>20</somenum>

<somenum>20</somenum>

This doesn't generate text. It generates a somenum element.

Expression Substitution

let $s := 20
return <Somenum>$s</Somenum>

Result: <Somenum>$s</Somenum>

let $s := 20
return <Somenum>{ $s }</Somenum>

Result: <Somenum>20</Somenum>

Dynamic Element Creation

let $s := 20
return element Somenum { $s }

Result: <Somenum>20</Somenum>

let $tag := "Somenum",
    $s := 20
return element { $tag } { $s }

Elements with Nodes

<Naming>{ (//Author/Name)[1] }</Naming>

Result:
<Naming>
  <Name>John Doe</Name>
</Naming>

Concatenating Nested Sequences

<Names>{ //Author/Name[starts-with(.,"John")] }</Names>

Result:
<Names>
  <Name>John Doe</Name>
  <Name>John Bigboutay</Name>
  <Name>John Doe</Name>
  <Name>John YaYa</Name>
</Names>

Note that XPath 1.0 expressions eliminate duplicated nodes, but not nodes with duplicated values

Nested Substitution

<Result>{
  for $j in (3, 1, 2)
  return <Val>{ $j }</Val>
}</Result>

Result:
<Result>
  <Val>3</Val>
  <Val>1</Val>
  <Val>2</Val>
</Result>

Eliminating Duplicate Values

<Names>{
  for $nm in distinct-values(//Author/Name[starts-with(.,"John")])
  return <Name>{ $nm }</Name>
}</Names>

Result:
<Names>
  <Name>John Doe</Name>
  <Name>John Bigboutay</Name>
  <Name>John YaYa</Name>
</Names>

Calculated Selection

<Names>{
  let $nms := distinct-values(//Author/Name[starts-with(.,"John")])
  for $j in (3, 1, 3)
  return <Name>{ $nms[$j] }</Name>
}</Names>

Result:
<Names>
  <Name>John YaYa</Name>
  <Name>John Doe</Name>
  <Name>John YaYa</Name>
</Names>
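
The selection step can be traced with a small Python sketch, assuming the four John names from the earlier slide as input. Two caveats: XQuery positions are 1-based while Python lists are 0-based, and this sketch keeps first-occurrence order for the distinct values, whereas the order returned by distinct-values is implementation-dependent.

```python
def distinct_values(values):
    # Drop repeated values, keeping the first occurrence of each.
    return list(dict.fromkeys(values))

names = ["John Doe", "John Bigboutay", "John Doe", "John YaYa"]
nms = distinct_values(names)

# $nms[$j] for $j in (3, 1, 3); subtract 1 for Python's 0-based indexing.
picked = [nms[j - 1] for j in (3, 1, 3)]
print(picked)
```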

Joins & Child Transfer

<Books-with-Reviews>{
  let $bks := doc("www.mybooks.com/books.xml"),
      $revs := doc("www.bookreview.com/reviews.xml")
  for $b in $bks//Book
  let $isbn := $b/@isbn
  where some $r in $revs//BookReview
        satisfies $r/@isbn = $isbn
  return
    <Book>
      { $b/@* }
      { $b/* }
      { for $r in $revs//BookReview[@isbn = $isbn]
        return <Review>{ $r/node() }</Review> }
    </Book>
}</Books-with-Reviews>
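
The join can be followed step by step in a Python sketch. The two documents are replaced here by invented in-memory samples (nothing is fetched from the URLs above), and copying text plus child elements only approximates $r/node().

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for books.xml and reviews.xml.
BOOKS = ET.fromstring("""
<Books>
  <Book isbn="111"><Title>Reviewed</Title></Book>
  <Book isbn="222"><Title>Unreviewed</Title></Book>
</Books>
""")
REVIEWS = ET.fromstring("""
<Reviews>
  <BookReview isbn="111"><Rating>5</Rating><Comment>Great</Comment></BookReview>
  <BookReview isbn="111"><Rating>3</Rating></BookReview>
</Reviews>
""")

def books_with_reviews(books, reviews):
    out = ET.Element("Books-with-Reviews")
    for b in books.iter("Book"):                        # for $b in $bks//Book
        isbn = b.get("isbn")                            # let $isbn := $b/@isbn
        matching = [r for r in reviews.iter("BookReview")
                    if r.get("isbn") == isbn]
        if not matching:                                # where some $r ... satisfies ...
            continue
        book = ET.SubElement(out, "Book", dict(b.attrib))   # { $b/@* }
        book.extend([copy.deepcopy(c) for c in b])          # { $b/* }
        for r in matching:
            review = ET.SubElement(book, "Review")
            review.text = r.text                            # leading text, if any
            review.extend([copy.deepcopy(c) for c in r])    # child transfer
    return out

result = books_with_reviews(BOOKS, REVIEWS)
print(ET.tostring(result, encoding="unicode"))
```

Only the book with at least one matching review appears in the output, and each review's children are moved under a new Review element.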