Native XML support in DB2 9 for z/OS

Native XML Support in DB2 9 for z/OS Phil Grainger CA


Transcript of Native XML support in DB2 9 for z/OS

Page 1: Native XML support in DB2 9 for z/OS

Native XML Support in DB2 9 for z/OS

Phil Grainger, CA

Page 2: Native XML support in DB2 9 for z/OS

2

Agenda

> Introduction

> What exactly IS XML?

> DB2 9 XML storage

> DB2 9 XML processing

> Further thoughts on XML and DB2

> Bibliography

Page 3: Native XML support in DB2 9 for z/OS

3

What IS XML?

> eXtensible Markup Language

> Self describing data storage/transport

> Vendor and platform independent

E.g. RSS feeds

Podcasts

> Can contain structured, unstructured or a mix of data

Page 4: Native XML support in DB2 9 for z/OS

4

An example of XML

> XML consists of a series of nodes which form a hierarchy

> Neither the names nor the contents of the nodes are predefined

This is why it’s termed “extensible”

> A node is enclosed between <nodename> and </nodename> tags

Microsoft Word has a neat way of showing this

> I’ll use some XML borrowed from the Sky television news feed

RSS feeds are a great example of XML usage

> Please note that the screenshots are only the FIRST PART of the XML

So some </end> tags are missing

Page 5: Native XML support in DB2 9 for z/OS

5

An example of “raw” XML

<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel>

<title>Sky News | Strange News | First For Breaking News</title> <link>http://news.sky.com/skynews/strangebuttrue</link>

<image>http://static.sky.com/images/skynews/rss/rss.gif<title>Sky News</title><url>http://static.sky.com/images/skynews/rss/rss.gif</url><link>http://news.sky.com/</link>

</image> <description>Sky News Strange But True</description> <language>en-us</language> <copyright>Copyright 2007, BSKYB. All Rights Reserved.</copyright> <lastBuildDate>Thu, 02 Aug 2007 11:22:30 GMT</lastBuildDate> <category>Sky News</category> <ttl>60</ttl>

<channelLinks> <link name="Uk News" url="http://news.sky.com/skynews/uknews"/> <link name="World" url="http://news.sky.com/skynews/worldnews"/> <link name="Money" url="http://news.sky.com/skynews/money"/> <link name="Business" url="http://news.sky.com/skynews/business"/> </channelLinks>

<item><title><![CDATA[World's Cheekiest Burglar Hunted On Facebook]]></title><link>http://news.sky.com/skynews/article/0,,30100-1278207,00.html?f=rss</link><description><![CDATA[A disgruntled homeowner has fallen victim to possibly the

cheekiest burglar in the world - and has now turned to social networking website Facebook to track him down.]]></description>

<enclosure url="" length="123456" type="image/gif" height="45" width="95"><![CDATA[]]></enclosure>

</item>

Page 6: Native XML support in DB2 9 for z/OS

6

Or as Microsoft Word shows it

Page 7: Native XML support in DB2 9 for z/OS

7

Or an alternative Word view

> Showing how all the elements and nodes are related

> Also this makes the hierarchical nature of XML even more obvious

Page 8: Native XML support in DB2 9 for z/OS

8

So we could draw it as a hierarchy!

rss
  channel
    title
    link
    image
      title
      url
      link
    description
    language
    copyright
    lastBuildDate
    category
    ttl
    channelLinks
      link
    item *
      title
      link
      description
      enclosure

* In our example this is repeating

So, who defines the format?

Hey! Does this look like IMS to you?
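The same hierarchy can be recovered mechanically from the document itself. The sketch below is an illustration outside DB2, using Python's standard `xml.etree.ElementTree` parser; the `FEED` text is a cut-down stand-in for the Sky News feed (the example.com links are placeholders) and `outline` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Cut-down stand-in for the feed shown earlier; not the real document.
FEED = """<rss version="2.0"><channel>
<title>Sky News</title>
<link>http://example.com/</link>
<item><title>First story</title><link>http://example.com/1</link></item>
<item><title>Second story</title><link>http://example.com/2</link></item>
</channel></rss>"""


def outline(element, depth=0):
    """Return one indented line per node, parents before their children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(outline(ET.fromstring(FEED))))
```

The repeating `item` node shows up as two sibling entries at the same depth, which is the "*" in the drawing.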

Page 9: Native XML support in DB2 9 for z/OS

9

XML Schemas

> We’ve already seen that XML is infinitely extensible

> Does this mean anarchy?

> It can

> But there are also things called “XML Schemas”

A schema defines what can appear in an XML document

A well formed XML document can still violate an XML schema

Page 10: Native XML support in DB2 9 for z/OS

10

DB2 9 for z/OS support for XML

> DB2 9 provides a new XML datatype

> An entire XML document can be stored in a single XML column

So one column of one row in a table has one complete document

Page 11: Native XML support in DB2 9 for z/OS

11

DB2 9 for z/OS support for XML

There are some limitations, for example:

> Only well-formed documents are allowed

All tags must have end tags

Elements must be nested correctly

Attributes must have values (enclosed by " or ')

Tags are case sensitive: </A> doesn’t end <a>

> An XML schema can optionally be applied with the DSN_XMLVALIDATE() function

Providing you have defined a schema to DB2

In database DSNXSR (part of the catalog)
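The well-formedness rules above are not specific to DB2; any conforming XML parser enforces them. As a rough sketch outside DB2, the check below uses Python's standard `xml.etree.ElementTree`; `is_well_formed` and the sample strings are made up for illustration. Schema validation is deliberately not shown, since the standard library has no schema validator.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """True if the parser accepts the text as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One sample per rule on the slide; all names are illustrative.
SAMPLES = {
    "well formed": "<a><b id='1'>text</b></a>",
    "missing end tag": "<a><b>text</a>",
    "bad nesting": "<a><b></a></b>",
    "unquoted attribute": "<a id=1></a>",
    "case mismatch": "<a></A>",
}

if __name__ == "__main__":
    for label, text in SAMPLES.items():
        print(f"{label:20} {is_well_formed(text)}")
```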

Page 12: Native XML support in DB2 9 for z/OS

12

DB2 9 for z/OS support for XML

> Documents are not stored as strings

So not comparable with any string data type

> But are manipulated by various XML expressions and functions

Including an XML predicate “function”

Page 13: Native XML support in DB2 9 for z/OS

13

DB2 9 for z/OS support for XML

> Storage of XML data is a little like LOB storage

> When you create a table with an XML column, you get some other things as well

A hidden column called DB2_GENERATED_DOC_ID_FOR_XML

A unique index on this column

A table space to store the XML data

A table in the above table space

An XML index for the above table

> Luckily DB2 creates all of these things for us!

Page 14: Native XML support in DB2 9 for z/OS

14

DB2 9 for z/OS support for XML

CREATE TABLE GRAPH02.XML_TABLE2

( KEY_COLUMN INTEGER NOT NULL,

XML_COLUMN XML NOT NULL);

> Wasn’t THAT easy!

> Note that there is no length specification

Maximum XML size is the same as the max LOB size

Currently 2GB

Page 15: Native XML support in DB2 9 for z/OS

15

DB2 9 for z/OS support for XML

> There are some things you can’t do with an XML column

Sort

Group

Most predicates

Primary, foreign or unique key

> Also, no host languages have XML data type manipulation support

Yet …….

So XML data has to be manipulated as string data

Page 16: Native XML support in DB2 9 for z/OS

16

Processing XML data

> Inserting data into an XML column is simply a matter of issuing an INSERT statement

Or a LOAD

> The XML document MUST conform to DB2 standards

And must be “well formed”

Look out for SQLCODE -20398, which says you have an error somewhere

– A byte offset IS given, but this is after DB2 has converted the XML to UTF-8

– So it may not exactly match where the error is in your original text
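To see why a UTF-8 byte offset can drift away from the position you would count by eye, here is a small sketch in Python. It only illustrates the character-versus-byte effect; it does not reproduce how DB2 computes the offset it reports, and `utf8_byte_offset` and the sample text are hypothetical.

```python
def utf8_byte_offset(text, char_index):
    """Byte position of character number char_index once text is UTF-8 encoded."""
    return len(text[:char_index].encode("utf-8"))


# "\u00e9" is e-acute, which takes two bytes in UTF-8.
doc = "<title>Caf\u00e9 soci\u00e9t\u00e9</title><oops"
char_index = doc.index("<oops")  # where the malformed tag starts, in characters

if __name__ == "__main__":
    print("character offset:", char_index)
    print("UTF-8 byte offset:", utf8_byte_offset(doc, char_index))
```

Every multi-byte character before the error pushes the byte offset one or more positions past the character offset, so the two agree only for plain single-byte text.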

Page 17: Native XML support in DB2 9 for z/OS

17

Processing XML data

> And you can optionally apply a schema too, remember

Bear in mind that there WILL be an overhead to applying a schema

Page 18: Native XML support in DB2 9 for z/OS

18

XML functions

> There are a number of functions for manipulating XML data

> Be careful though, not all the functions starting “XML” are for manipulating XML data

Many are for CREATING XML data from relational data

Page 19: Native XML support in DB2 9 for z/OS

19

XMLDOCUMENT()

> For creating XML documents from relational data or from parts of other XML documents

> At its simplest, it returns the same as a basic SELECT from the table

But can produce XML documents with all the necessary headers

Page 20: Native XML support in DB2 9 for z/OS

20

XMLSERIALIZE()

> Converts XML data into textual data

Can include/exclude XML declarations

Converts to a LOB type: BLOB, CLOB or DBCLOB

Page 21: Native XML support in DB2 9 for z/OS

21

XPath

> Before we can talk about working with these XML data types, we need to talk about XPath

> XPath notation allows you to navigate the XML document

> You can use XPath to return subsets of your documents

Page 22: Native XML support in DB2 9 for z/OS

22

XPath

> There is not time here for an in-depth XPath discussion

But, for example ….

DB2 needs to know, when we refer to a node name, which specific one we mean

/rss/channel/item/title would allow us to work with the title nodes under each item in our XML data

– In our case the <item> node is also a repeating node
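The point about "which specific one we mean" is easy to try outside DB2. The sketch below uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath and takes paths relative to the root element, so `/rss/channel/item/title` is written as `channel/item/title`. The feed content is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative feed: one channel-level title and two item-level titles.
FEED = """<rss version="2.0"><channel>
<title>Channel title</title>
<item><title>First story</title></item>
<item><title>Second story</title></item>
</channel></rss>"""

root = ET.fromstring(FEED)  # root is the <rss> element

# Counterpart of /rss/channel/item/title, relative to <rss>.
item_titles = [t.text for t in root.findall("channel/item/title")]

# Counterpart of /rss/channel/title: same node name, different place.
channel_titles = [t.text for t in root.findall("channel/title")]

if __name__ == "__main__":
    print(item_titles)
    print(channel_titles)
```

Because `item` repeats, the first path finds one title per item, while the second finds only the single channel-level title.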

Page 23: Native XML support in DB2 9 for z/OS

23

XPath

> XPath can be used in SELECT lists

Using XMLQUERY functions, for example

> In predicates

Using the new XMLEXISTS predicate

Returns TRUE or FALSE depending on XPath expression

Page 24: Native XML support in DB2 9 for z/OS

24

Let’s start simple - XMLQUERY()

> Returns a portion of an XML document matching a query

> Also returns all the subsidiary nodes

SELECT KEY_COLUMN,

XMLSERIALIZE(XMLQUERY('/rss/channel/item

[title="Worlds Cheekiest Burglar Hunted On Facebook"]'

PASSING XML_COLUMN)

AS CLOB(2K))

FROM GRAPH02.XML_TABLE

> Could return

Page 25: Native XML support in DB2 9 for z/OS

25

XMLQUERY()

> KEY_COLUMN followed by textual XML

1 <item><title>Worlds Cheekiest Burglar Hunted On Facebook</title><link>http://news.sky.com/skynews/article/0,,30100-1278207,00.html?f=rss</link><description>A disgruntled homeowner has fallen victim to possibly the cheekiest burglar in the world - and has now turned to social networking website Facebook to track him down.</description><enclosure url="" length="123456" type="image/gif" height="45" width="95"/></item>

> This is ONE of a repeating set of nodes from one document

Page 26: Native XML support in DB2 9 for z/OS

26

XMLQUERY()

> HOWEVER, if the XMLQUERY() returns <null>

As it will if it can’t find the text in your document

> A row will still be returned for each row in the table

KEY_COLUMN value and <null>

> We also need a way to specify predicates on the XML data

Page 27: Native XML support in DB2 9 for z/OS

27

A bit more complex - XMLEXISTS()

> XMLEXISTS() returns TRUE or FALSE depending on whether an XPath expression finds a result

> So we expand our query into:

SELECT KEY_COLUMN,

XMLSERIALIZE(XMLQUERY('/rss/channel/item

[title="Worlds Cheekiest Burglar Hunted On Facebook"]'

PASSING XML_COLUMN)

AS CLOB(2K))

FROM GRAPH02.XML_TABLE

WHERE XMLEXISTS('/rss/channel/item

[title="Worlds Cheekiest Burglar Hunted On Facebook"]'

PASSING XML_COLUMN)

> Now, rows will only be returned where the XPath in XMLEXISTS() finds data
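The difference between using XMLQUERY() alone and adding XMLEXISTS() in the WHERE clause can be mimicked outside DB2. This is only an analogy in Python, not DB2's implementation: `ROWS`, `query` and `exists` are hypothetical stand-ins for the table, XMLQUERY() and XMLEXISTS(), and the documents are shortened.

```python
import xml.etree.ElementTree as ET

# Two stand-in rows: (KEY_COLUMN, XML_COLUMN). Contents are illustrative.
ROWS = [
    (1, "<rss><channel><item><title>Burglar Hunted</title></item></channel></rss>"),
    (2, "<rss><channel><item><title>Other story</title></item></channel></rss>"),
]
PATH = "channel/item[title='Burglar Hunted']"


def query(doc):
    """Like XMLQUERY: the matching nodes as text, or None when nothing matches."""
    matches = ET.fromstring(doc).findall(PATH)
    return "".join(ET.tostring(m, encoding="unicode") for m in matches) or None


def exists(doc):
    """Like XMLEXISTS: True when the path finds at least one node."""
    return bool(ET.fromstring(doc).findall(PATH))


# SELECT with XMLQUERY only: every row comes back, some with no XML.
select_only = [(key, query(doc)) for key, doc in ROWS]

# SELECT with XMLQUERY plus WHERE XMLEXISTS: only rows with a match.
with_where = [(key, query(doc)) for key, doc in ROWS if exists(doc)]

if __name__ == "__main__":
    print(select_only)
    print(with_where)
```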

Page 28: Native XML support in DB2 9 for z/OS

28

Searching

> You can see though that the arguments to XMLQUERY and XMLEXISTS have to be an EXACT match for the content we are searching for

> What if we want to do a wildcarded sort of search?

> XPath has no concept of “%” or “_”, but it does have a series of functions that may help

> One useful one is contains()

Like this:

Page 29: Native XML support in DB2 9 for z/OS

29

Searching

SELECT KEY_COLUMN,

XMLSERIALIZE(XMLQUERY('/rss/channel

[contains(title,"Facebook")]'

PASSING XML_COLUMN)

AS CLOB(8K))

FROM GRAPH02.XML_TABLE

WHERE XMLEXISTS('/rss/channel

[contains(title,"Facebook")]'

PASSING XML_COLUMN)

Page 30: Native XML support in DB2 9 for z/OS

30

Searching

> This should be clear, but we are

Wanting data returned that has “Facebook” in the /rss/channel/title node

ONLY for rows that contain “Facebook” in an /rss/channel/title node

Page 31: Native XML support in DB2 9 for z/OS

31

Not just SELECT

> Of course, we could also say something like

DELETE

FROM GRAPH02.XML_TABLE

WHERE XMLEXISTS('/rss/channel

[contains(title,"Facebook")]'

PASSING XML_COLUMN)

> Delete all the rows that contain “Facebook” in a title node

Page 32: Native XML support in DB2 9 for z/OS

32

XML indexes

> Using XPath notation, you can create indexes on your XML column

CREATE UNIQUE INDEX XML_INDEX

ON GRAPH02.XML_TABLE(XML_COLUMN)

GENERATE KEY USING XMLPATTERN

'/rss/channel/item/title'

AS SQL VARCHAR(128)

> Yes, this IS a unique index

And it DOES constrain the content of the node specified to VARCHAR(128)

Page 33: Native XML support in DB2 9 for z/OS

33

XML indexes

> So you can also use indexes to constrain the CONTENT of nodes

> In the previous example, we said

GENERATE KEY USING XMLPATTERN

'/rss/channel/item/title'

AS SQL VARCHAR(128)

> Any attempt to insert a document with a /title/ longer than 128 characters will fail

Page 34: Native XML support in DB2 9 for z/OS

34

XML indexes

> Here’s something unusual

> I have two rows in my table SELECT COUNT(*) does indeed return 2

> So why does REBUILD INDEX say

DSNUCRUL - UNLOAD PHASE STATISTICS - NUMBER OF RECORDS PROCESSED=34 ?

Because each row has MULTIPLE index keys!

Look again at the CREATE INDEX XPath statement

We’re indexing INTO an XML document

Each document (in this case) has 17 occurrences of /rss/channel/item/title

So 2 documents x 17 titles = 34 index keys
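The arithmetic behind the 34 records can be sketched outside DB2. The Python below is an illustration only, assuming one index key per node matched by the pattern; `make_feed` and `index_keys` are hypothetical helpers and the generated feeds are stand-ins for the two real documents.

```python
import xml.etree.ElementTree as ET


def make_feed(item_count):
    """Build a stand-in feed with the requested number of <item> nodes."""
    items = "".join(
        f"<item><title>Story {n}</title></item>" for n in range(1, item_count + 1)
    )
    return f"<rss><channel><title>Feed</title>{items}</channel></rss>"


def index_keys(doc, path="channel/item/title"):
    """One key per node the pattern matches, so one document can yield many."""
    return [node.text for node in ET.fromstring(doc).findall(path)]


table = [make_feed(17), make_feed(17)]  # two rows, 17 items each
key_count = sum(len(index_keys(doc)) for doc in table)

if __name__ == "__main__":
    print("rows:", len(table), "index keys:", key_count)
```

The channel-level title is not counted, because the pattern names the title under `item` only.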

Page 35: Native XML support in DB2 9 for z/OS

35

More on XPath

> XPath arguments are case sensitive

> Be VERY careful about how you code them!

> Also, errors in XPath specifications can be hard to debug

Syntax errors are easy: DB2 tells you

XML errors aren’t so simple

– Spelling errors

– Capitalization errors

– Etc.

Page 36: Native XML support in DB2 9 for z/OS

36

More on XPath

> Why does this not return any data?

SELECT KEY_COLUMN,

XMLSERIALIZE(XMLQUERY('/rss/chanel/item

[title="ITV Profits Take A Dive In First Half"]'

PASSING XML_COLUMN)

AS CLOB(2K))

FROM GRAPH02.XML_TABLE

WHERE XMLEXISTS('/rss/chanel/item

[title="ITV Profits Take A Dive In First Half"]'

PASSING XML_COLUMN)

Page 37: Native XML support in DB2 9 for z/OS

37

More on XPath

> It’s because we spelled “channel” with one “n”

NO error is returned even though the node “chanel” does not exist in the XML

> Just because no node of that name exists TODAY, that does not mean one will not be there tomorrow
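This "no match is not an error" behaviour is a general property of path expressions, and can be reproduced outside DB2. A minimal sketch with Python's standard `xml.etree.ElementTree` (paths are relative to the `rss` root; the document is a one-item stand-in):

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<rss><channel><item><title>ITV Profits Take A Dive In First Half</title>"
    "</item></channel></rss>"
)

correct = root.findall("channel/item")
misspelled = root.findall("chanel/item")   # one "n": no such node, no error
wrong_case = root.findall("Channel/item")  # node names are case sensitive

if __name__ == "__main__":
    print(len(correct), len(misspelled), len(wrong_case))
```

Both mistakes simply produce an empty result, which is why they are harder to spot than a syntax error.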

Page 38: Native XML support in DB2 9 for z/OS

38

Further thoughts on XML

> Firstly, remember that the XML data is effectively free form

> What is in your XML column could be ANY valid XML data

Each row in the table does not have to contain similar data for the XML column

Page 39: Native XML support in DB2 9 for z/OS

39

Further thoughts on XML

> My examples just happen to contain two almost identical rows

But I could add a third, very different, document

From a business perspective, this would not be sensible

But DB2 would allow it

Page 40: Native XML support in DB2 9 for z/OS

40

New Features by APAR

> PK51571, PK51572 and PK51573

XMLTABLE() and XMLCAST()

Page 41: Native XML support in DB2 9 for z/OS

41

XMLTABLE()

> Turns a “repeating group” in an XML document into rows in a “table”

SELECT X.*

FROM GRAPH02.XML_TABLE G,

XMLTABLE('/rss/channel/item' PASSING G.XML_COLUMN

COLUMNS "SEQ" FOR ORDINALITY,

"TITLE" CHAR(64) PATH 'title')

AS X;

Page 42: Native XML support in DB2 9 for z/OS

42

XMLTABLE()

> Returns

1 Worlds Cheekiest Burglar Hunted On Facebook

2 Deadly Petrol Roller Skates Seized

3 Why Panda Poo Will Play A Part In Olympics

4 Surgeons Operate By Mobile Phone Light

5 Great White Shark Seen In Cornwall

6 Lightning Strike No Flash In The Pan For Survivor

7 Spooky Scamp Has Skill For Sniffing Death
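What XMLTABLE() does to the repeating group, with FOR ORDINALITY supplying the sequence number, can be imitated outside DB2. The Python below is a rough analogy, not DB2's implementation; `shred` is a hypothetical helper and the feed holds only three of the titles listed above.

```python
import xml.etree.ElementTree as ET

# Three of the titles from the result above, in a cut-down feed.
FEED = """<rss><channel>
<item><title>Worlds Cheekiest Burglar Hunted On Facebook</title></item>
<item><title>Deadly Petrol Roller Skates Seized</title></item>
<item><title>Great White Shark Seen In Cornwall</title></item>
</channel></rss>"""


def shred(doc):
    """Return (SEQ, TITLE) rows, one per <item>, numbered from 1."""
    items = ET.fromstring(doc).findall("channel/item")
    return [(seq, item.findtext("title")) for seq, item in enumerate(items, start=1)]


rows = shred(FEED)

# Once the data is in rows, an ordinary substring test works on the column,
# much as a LIKE predicate does in SQL.
facebook_rows = [row for row in rows if "Facebook" in row[1]]

if __name__ == "__main__":
    for seq, title in rows:
        print(seq, title)
```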

Page 43: Native XML support in DB2 9 for z/OS

43

XMLTABLE() in a View

CREATE VIEW XML_VIEW AS

SELECT X.*

FROM GRAPH02.XML_TABLE G,

XMLTABLE('/rss/channel/item' PASSING G.XML_COLUMN

COLUMNS "SEQ" FOR ORDINALITY,

"TITLE" CHAR(64) PATH 'title')

AS X

Page 44: Native XML support in DB2 9 for z/OS

44

XMLTABLE() in a View

SELECT * FROM

XML_VIEW

WHERE TITLE LIKE '%Facebook%'

> Now we can use wildcards and column names to access our XML data!!

> Do be careful of performance though

This will require materialisation of the data BEFORE the predicate can be applied

Page 45: Native XML support in DB2 9 for z/OS

45

New Features by APAR

> PK55585 and PK55831 (still open)

13 new XPath functions

e.g. fn:lower-case, fn:upper-case, fn:matches, fn:position, fn:replace & fn:tokenize

> PK47594 and PK58766

XML Load performance improvement

Page 46: Native XML support in DB2 9 for z/OS

Questions??

Page 47: Native XML support in DB2 9 for z/OS

Bibliography

Page 48: Native XML support in DB2 9 for z/OS

48

Bibliography

> Look out for

GC18-9856 “DB2 Version 9.1 for z/OS – What’s New”

SG24-7330 “DB2 9 for z/OS Technical Overview”

SG24-7239 “Enhancing SAP by Using DB2 9 for z/OS”

SC18-9858 “DB2 Version 9.1 for z/OS – XML Guide”

> SG24-7315 “DB2 9 pureXML Guide” is for DB2 LUW NOT for z/OS