Post on 11-Jan-2016
1
XQuery
Slides From
Dr. Suciu
2
FLWR (“Flower”) Expressions
FOR ...
LET...
WHERE...
RETURN...
3
FOR-WHERE-RETURN
Find all book titles published after 1995:
for $x in doc("bib.xml")/bib/book
where data($x/@year) > 1995
return $x/title
Result: <title> abc </title> <title> def </title> <title> ghi </title>
In the freexml editor, doc("bib.xml") by default looks for bib.xml in the directory where the editor was installed.
4
FOR-WHERE-RETURN
Equivalently (perhaps more geekish)
for $x in doc("bib.xml")/bib/book[@year > 1995]/title
return $x
And even shorter:
doc("bib.xml")/bib/book[@year > 1995]/title
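To try the FLWR form and the path form without a bib.xml file, here is a minimal sketch that binds a small inline element to $bib. The variable names $bib, $flwr and $path and the <check> wrapper are assumptions made for illustration; the titles and years are the ones shown in the Answer slide.

```xquery
(: Hypothetical inline data standing in for doc("bib.xml")/bib :)
let $bib :=
  <bib>
    <book year="1994"><title>TCP/IP Illustrated</title></book>
    <book year="1992"><title>Advanced Programming in the UNIX Environment</title></book>
    <book year="2000"><title>Data on the Web</title></book>
    <book year="1999"><title>The Economics of Technology and Content for Digital TV</title></book>
  </bib>
(: FLWR form :)
let $flwr := (for $x in $bib/book where data($x/@year) > 1995 return $x/title)
(: Path form with a predicate :)
let $path := $bib/book[@year > 1995]/title
(: deep-equal compares the two result sequences item by item :)
return <check same="{ deep-equal($flwr, $path) }">{ $flwr }</check>
```

With this data both forms should select the titles of the 2000 and 1999 books, so the same attribute is expected to read true.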
5
FOR-WHERE-RETURN
• Find all book titles and the year when they were published:
for $x in doc("bib.xml")/bib/book
return
  <answer>
    <what>{ data($x/title) }</what>
    <when>{ data($x/@year) }</when>
  </answer>
We can construct whatever XML results we want!
6
Answer
<answer> <what>TCP/IP Illustrated</what> <when>1994</when> </answer>
<answer> <what>Advanced Programming in the UNIX Environment</what> <when>1992</when> </answer>
<answer> <what>Data on the Web</what> <when>2000</when> </answer>
<answer> <what>The Economics of Technology and Content for Digital TV</what> <when>1999</when> </answer>
7
FOR-WHERE-RETURN
• Notice the use of “{” and “}”
• What is the result without them?
for $x in doc("bib.xml")/bib/book
return
  <answer>
    <title> $x/title/text() </title>
    <year> $x/year/text() </year>
  </answer>
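One way to explore the question: inside an element constructor, text outside braces is copied literally, and only expressions inside braces are evaluated. The sketch below puts both side by side. The inline $bib data and the element names <literal> and <evaluated> are hypothetical, chosen only for this illustration.

```xquery
(: Hypothetical inline data standing in for doc("bib.xml")/bib :)
let $bib :=
  <bib>
    <book year="1994"><title>TCP/IP Illustrated</title></book>
  </bib>
for $x in $bib/book
(: First child: no braces, so the path is plain text.
   Second child: braces, so the path is evaluated. :)
return
  <answer>
    <literal> $x/title/text() </literal>
    <evaluated>{ $x/title/text() }</evaluated>
  </answer>
```

The <literal> element should contain the characters $x/title/text() themselves, while <evaluated> should contain the book title.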
8
Aggregates
Find all books with more than 3 authors:
count = a function that counts
avg = computes the average
sum = computes the sum
distinct-values = eliminates duplicates

for $x in doc("bib.xml")/bib/book
where count($x/author) > 3
return $x

for $l in distinct-values(doc("bib.xml")//author/last)
return <last>{ $l }</last>
9
Aggregates
<answer>{
  for $b in doc("bib.xml")//book
  let $c := $b/author
  return <book>{ $b/title, <count>{ count($c) }</count> }</book>
}</answer>
10
XQuery
Find books whose price is larger than average:
<answer>{
  for $b in doc("bib.xml")/bib
  let $a := avg($b/book/price/text())
  for $x in $b/book
  where ($x/price/text() > $a)
  return $x
}</answer>
LET binds a variable to one value (which may itself be a whole sequence);
FOR iterates a variable over a list of values, one item at a time
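A self-contained sketch of the same pattern: let binds $a once to the average over all prices, and for then visits each book in turn. The inline data and its prices (10, 20, 60) are hypothetical values picked so that the average is easy to check by hand.

```xquery
(: Hypothetical inline data standing in for doc("bib.xml")/bib :)
let $bib :=
  <bib>
    <book><title>abc</title><price>10</price></book>
    <book><title>def</title><price>20</price></book>
    <book><title>ghi</title><price>60</price></book>
  </bib>
(: let: $a is bound once, to a single number (the average of 10, 20, 60) :)
let $a := avg($bib/book/price)
(: for: $x is bound to each book in turn :)
for $x in $bib/book
where $x/price > $a
return $x/title
```

The average here is 30, so only the book priced 60 should pass the where clause.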
11
FOR vs. LET

for $x in /bib/book
return <result> { $x } </result>

Returns:
<result> <book>...</book> </result>
<result> <book>...</book> </result>
<result> <book>...</book> </result>
...

let $x := /bib/book
return <result> { $x } </result>

Returns:
<result> <book>...</book> <book>...</book> <book>...</book> ... </result>
12
For vs Let (cont’d)

for $i in (1, 2, 3)
let $j := "a"
return <tuple><i>{ $i }</i><j>{ $j }</j></tuple>

for $i in (1, 2, 3), $j in (4, 5, 6)
return <tuple><i>{ $i }</i><j>{ $j }</j></tuple>

for $i in (1, 2, 3)
let $j := (4 to 6)
return <tuple><i>{ $i }</i><j>{ $j }</j></tuple>
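A quick way to compare the three queries above is to count the tuples each one produces. In this sketch the variable names $q1, $q2, $q3 and the <summary> wrapper are hypothetical, introduced only to report the counts.

```xquery
(: Each query from the slide, parenthesized so its result can be counted :)
let $q1 := (for $i in (1, 2, 3)
            let $j := "a"
            return <tuple><i>{ $i }</i><j>{ $j }</j></tuple>)
let $q2 := (for $i in (1, 2, 3), $j in (4, 5, 6)
            return <tuple><i>{ $i }</i><j>{ $j }</j></tuple>)
let $q3 := (for $i in (1, 2, 3)
            let $j := (4 to 6)
            return <tuple><i>{ $i }</i><j>{ $j }</j></tuple>)
(: The first tuple of $q3 is included to show what $j holds there :)
return
  <summary>
    <q1 tuples="{ count($q1) }"/>
    <q2 tuples="{ count($q2) }"/>
    <q3 tuples="{ count($q3) }">{ $q3[1] }</q3>
  </summary>
```

The two for variables in the second query iterate over every pair, giving 9 tuples, while the first and third queries give 3 each. In the third, let binds $j to the whole sequence, so each tuple's <j> holds all of 4, 5 and 6.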
13
XQuery: Nesting
For each author of a book by Morgan Kaufmann, list all books she published:

for $b in doc("bib.xml")/bib,
    $a in $b/book[publisher/text()="Morgan Kaufmann"]/author
return
  <result>
    { $a,
      for $t in $b/book[author/text()=$a/text()]/title
      return $t }
  </result>
14
XQuery
Result:
<result>
  <author>Jones</author>
  <title> abc </title>
  <title> def </title>
</result>
<result>
  <author> Smith </author>
  <title> ghi </title>
</result>
15
Inverting Hierarchy
List of titles published by each publisher

<listings>
{
  for $p in distinct-values(doc("bib.xml")//publisher)
  order by $p
  return
    <result>
      { $p }
      {
        for $b in doc("bib.xml")/bib/book
        where $b/publisher = $p
        order by $b/title
        return $b/title
      }
    </result>
}
</listings>
16
SQL and XQuery Side-by-side
Product(pid, name, maker, price)
Find all product names, prices, sort by price

SQL
SELECT x.name, x.price
FROM Product x
ORDER BY x.price

XQuery
for $x in doc("db.xml")/db/Product/row
order by $x/price/text()
return <answer> { $x/name, $x/price } </answer>
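One caveat worth checking on your own processor: when the data has no schema, price text is untyped, and order by compares untyped values as strings, so "23" would sort before "7". The sketch below forces a numeric comparison with the xs:decimal constructor. The inline $db data is hypothetical, and whether the cast is needed depends on whether your documents are schema-validated.

```xquery
(: Hypothetical inline data standing in for doc("db.xml")/db :)
let $db :=
  <db>
    <Product>
      <row><name>def</name><price>23</price></row>
      <row><name>abc</name><price>7</price></row>
    </Product>
  </db>
for $x in $db/Product/row
(: xs:decimal makes the sort numeric rather than string-based :)
order by xs:decimal($x/price)
return <answer>{ $x/name, $x/price }</answer>
```

With the cast, the row priced 7 should come before the row priced 23, matching what SQL's ORDER BY does on a numeric column.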
17
Answers

Name   Price
abc    7
def    23
. . .  . . .

<answer> <name> abc </name> <price> 7 </price> </answer>
<answer> <name> def </name> <price> 23 </price> </answer>
. . . .

Notice: this is NOT a well-formed document! (WHY???)
18
Producing a Well-Formed Answer
<myQuery>
{
  for $x in doc("db.xml")/db/Product/row
  order by $x/price/text()
  return <answer> { $x/name, $x/price } </answer>
}
</myQuery>
19
XQuery’s Answer
<myQuery>
  <answer> <name> abc </name> <price> 7 </price> </answer>
  <answer> <name> def </name> <price> 23 </price> </answer>
  . . . .
</myQuery>
Now it is well-formed!
20
SQL and XQuery Side-by-side
Product(pid, name, maker, price)
Company(cid, name, city, revenues)
Find all products made in Seattle

SQL
SELECT x.name
FROM Product x, Company y
WHERE x.maker=y.cid and y.city='Seattle'

XQuery
for $r in doc("db.xml")/db,
    $x in $r/Product/row,
    $y in $r/Company/row
where $x/maker/text()=$y/cid/text() and $y/city/text() = "Seattle"
return $x/name

Cool XQuery
for $y in /db/Company/row[city/text()="Seattle"],
    $x in /db/Product/row[maker/text()=$y/cid/text()]
return $x/name
21
<product>
  <row> <pid> 123 </pid> <name> abc </name> <maker> efg </maker> </row>
  <row> …. </row>
  …
</product>
<product>
  . . .
</product>
. . . .
22
SQL and XQuery Side-by-side
For each company with revenues < 1M count the products over $100

SELECT y.name, count(*)
FROM Product x, Company y
WHERE x.price > 100 and x.maker=y.cid and y.revenue < 1000000
GROUP BY y.cid, y.name

for $r in doc("db.xml")/db,
    $y in $r/Company/row[revenue/text()<1000000]
return
  <proudCompany>
    <companyName> { $y/name/text() } </companyName>
    <numberOfExpensiveProducts>
      { count($r/Product/row[maker/text()=$y/cid/text()][price/text()>100]) }
    </numberOfExpensiveProducts>
  </proudCompany>
23
SQL and XQuery Side-by-side
Find companies with more than 30 products, and their average price

SELECT y.name, avg(x.price)
FROM Product x, Company y
WHERE x.maker=y.cid
GROUP BY y.cid, y.name
HAVING count(*) > 30

for $r in doc("db.xml")/db,
    $y in $r/Company/row
let $p := $r/Product/row[maker/text()=$y/cid/text()]
where count($p) > 30
return
  <theCompany>
    <companyName> { $y/name/text() } </companyName>
    <avgPrice> { avg($p/price/text()) } </avgPrice>
  </theCompany>
$p (bound by LET): a collection
$y (bound by FOR): an element
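The grouping pattern above can be tried on a small scale with the sketch below. The inline $db data, the company names, and the lowered threshold (more than 1 product instead of more than 30) are all hypothetical, chosen so the result can be checked by hand.

```xquery
(: Hypothetical inline data standing in for doc("db.xml")/db :)
let $db :=
  <db>
    <Product>
      <row><name>p1</name><maker>c1</maker><price>10</price></row>
      <row><name>p2</name><maker>c1</maker><price>30</price></row>
      <row><name>p3</name><maker>c2</maker><price>50</price></row>
    </Product>
    <Company>
      <row><cid>c1</cid><name>Acme</name></row>
      <row><cid>c2</cid><name>Bolt</name></row>
    </Company>
  </db>
(: for binds $y to one company element at a time :)
for $y in $db/Company/row
(: let binds $p to the collection of that company's products :)
let $p := $db/Product/row[maker = $y/cid]
(: plays the role of SQL's HAVING; threshold lowered for the tiny data set :)
where count($p) > 1
return
  <theCompany>
    <companyName>{ data($y/name) }</companyName>
    <avgPrice>{ avg($p/price) }</avgPrice>
  </theCompany>
```

Only the company with two products should appear, with an average price of 20. Because avg sits inside braces, it is evaluated instead of being copied as text.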