Relational data as_xml

Efficiently Publishing Relational Data as XML Documents
Authors: J. Shanmugasundaram, Michael Carey, et al. (IBM Almaden Research Center)
Presented by Harshavardhan Achrekar (University of Massachusetts-Lowell)



Page 2: Relational data as_xml

What drove them?

XML is emerging as the standard for business data exchange on the World Wide Web.

A mechanism is needed to publish currently stored relational data as XML documents.

Page 3: Relational data as_xml

Primary Issues

Language specification: how to structure and tag data from tables as hierarchical XML documents.

Best implementation technique: study the characteristics and performance of various alternatives for constructing XML documents.
• When to add tags and structure?
• How much of the processing is done within the relational engine?

Page 4: Relational data as_xml

Roadmap

Language specification based on SQL

Implementation
• Early tagging, early structuring
• Late tagging, late structuring
• Early structuring, late tagging

Performance Evaluation

Page 5: Relational data as_xml

Sample XML Document for Customer

<customer id="C1">
    <name> John Doe </name>
    <accounts>
        <account id="A1"> 1894654 </account>
        <account id="A2"> 3849342 </account>
    </accounts>
    <porders>
        <porder id="PO1" acct="A1"> <!-- first purchase order -->
            <date>1 Jan 2000</date>
            <items>
                <item id="I1"> Shoes </item>
                <item id="I2"> Bungee Ropes </item>
            </items>
            <payments>
                <payment id="P1"> due Jan 15 </payment>
                <payment id="P2"> due Jan 20 </payment>
                <payment id="P3"> due Feb 15 </payment>
            </payments>
        </porder>
        <porder id="PO2" acct="A2"> <!-- second purchase order -->
            …
        </porder>
    </porders>
</customer>

Note the:
• Elements
• Names/Tags
• ID Refs
• Attributes
• Nested sub-elements

Page 6: Relational data as_xml

Underlying tables

Customer(id int, name varchar)

Account(id varchar, custID int, acctnum int)

Item(id int, poID int, desc varchar)

PurchOrder(id int, custID int, acctID varchar, date varchar)

Payment(id int, poID int, desc varchar)

Page 7: Relational data as_xml

SQL-based language specifications

SQL functions:
    Define XML Constructor CUST (Custid: integer, CustName: varchar) AS {
        <Customer id=$Custid> $CustName </Customer>
    }

SQL aggregates:
    Select XMLAGG(ITEM(item.id, item.desc))
    From Item item    -- returns an XML aggregation of items
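As a rough illustration of these two building blocks, the sketch below registers a scalar function ITEM and an aggregate XMLAGG in an in-memory SQLite database through Python's sqlite3 module. SQLite, the sample rows, and the element layout produced by ITEM are assumptions made for this example; the slides do not show the definition of the ITEM constructor.

```python
import sqlite3
from xml.sax.saxutils import escape, quoteattr


def item(item_id, desc):
    """Scalar XML constructor: one row's values in, one XML fragment out."""
    # Assumed layout for an item element; not taken from the slides.
    return f"<item id={quoteattr(str(item_id))}>{escape(desc)}</item>"


class XmlAgg:
    """Aggregate: concatenates the XML fragments of all rows in a group."""

    def __init__(self):
        self.fragments = []

    def step(self, fragment):
        if fragment is not None:
            self.fragments.append(fragment)

    def finalize(self):
        return "".join(self.fragments)


conn = sqlite3.connect(":memory:")
conn.create_function("ITEM", 2, item)        # plays the role of an XML constructor
conn.create_aggregate("XMLAGG", 1, XmlAgg)   # plays the role of XMLAGG

# "desc" is quoted because DESC is also an SQL keyword.
conn.execute('CREATE TABLE Item (id INTEGER, poID INTEGER, "desc" VARCHAR)')
conn.executemany(
    "INSERT INTO Item VALUES (?, ?, ?)",
    [(1, 1, "Shoes"), (2, 1, "Bungee Ropes")],
)

(items_xml,) = conn.execute(
    'SELECT XMLAGG(ITEM(item.id, item."desc")) FROM Item item'
).fetchone()
print(items_xml)
```

The scalar function tags one row at a time, while the aggregate only concatenates, so nesting comes from passing an aggregated fragment as an argument to an outer constructor.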

Page 8: Relational data as_xml

XML Constructor Definition for Customer

Define XML Constructor CUST (custId: integer,
                             custName: varchar(20),
                             acctList: xml,
                             porderList: xml) AS {
    <customer id=$custId>
        <name> $custName </name>
        <accounts> $acctList </accounts>
        <porders> $porderList </porders>
    </customer>
}

Input: the constructor's parameters (custId, custName, acctList, porderList).

Output: a customer XML element.

The aggregate function XMLAGG concatenates the XML fragments produced by XML constructors.

Page 9: Relational data as_xml

Sample SQL query constructs XML from relational tables

1.  Select cust.name, CUST(cust.id, cust.name,
2.      (Select XMLAGG(ACCT(acct.id, acct.acctnum))
3.       From Account acct
4.       Where acct.custID = cust.id),
5.      (Select XMLAGG(PORDER(porder.id, porder.acctID, porder.date,
6.          (Select XMLAGG(ITEM(item.id, item.desc))
7.           From Item item
8.           Where item.poID = porder.id),
9.          (Select XMLAGG(PAYMENT(pay.id, pay.desc))
10.          From Payment pay
11.          Where pay.poID = porder.id)))
12.      From PurchOrder porder
13.      Where porder.custID = cust.id))
14. From Customer cust

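Notes on the query:
• Lines 1 and 14: the top-level query returns each customer from the Customer table.
• Lines 2-4: correlated sub-query for the customer's accounts.
• Lines 5-13: correlated sub-query for the customer's purchase orders.
• Each correlated sub-query returns an XML fragment.
• Lines 1-14 as a whole: CUST is a scalar function that returns the customer XML.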

Page 10: Relational data as_xml

Implementation Alternatives

The alternatives differ in two main ways: when nesting (structuring) is done and when tagging is done.

Space of alternatives (each combination can be implemented inside or outside the engine):
• Early tagging, early structuring (outside the engine: Stored Procedures; inside the engine: CLOB)
• Late tagging, early structuring
• Late tagging, late structuring

Page 11: Relational data as_xml

Early tagging and structuring

Stored Procedure - Outside the engine
Approach: explicitly issue nested queries.
Algorithm:
• First, query to retrieve the root elements (customer id, name).
• Using the customer id, issue a query to retrieve account information.
• Next, for the same customer id, issue a query to retrieve the customer's purchase orders.
• For each purchase order retrieved, issue queries to get item and payment information.
• Once done, processing of one customer is over. Repeat the same for the next customer until the entire XML document is ready.
Notes:
• Fixed-order nested loop join outside the engine.
• Tag and structure as soon as the data is ready.
• Many SQL queries are issued: one per tuple for tables with nested structure.
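The algorithm above can be sketched as application code that drives the database with one query per nesting step. This is a minimal sketch, not the authors' implementation: SQLite, the sample rows (loosely based on the sample customer document), and the names `make_db`, `publish`, and `query` are all assumptions for illustration.

```python
import sqlite3
from xml.sax.saxutils import escape, quoteattr


def make_db():
    """Hypothetical helper: the underlying tables with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Customer  (id INTEGER, name VARCHAR);
        CREATE TABLE Account   (id VARCHAR, custID INTEGER, acctnum INTEGER);
        CREATE TABLE PurchOrder(id INTEGER, custID INTEGER, acctID VARCHAR, date VARCHAR);
        CREATE TABLE Item      (id INTEGER, poID INTEGER, "desc" VARCHAR);
        CREATE TABLE Payment   (id INTEGER, poID INTEGER, "desc" VARCHAR);
        INSERT INTO Customer   VALUES (1, 'John Doe');
        INSERT INTO Account    VALUES ('A1', 1, 1894654), ('A2', 1, 3849342);
        INSERT INTO PurchOrder VALUES (1, 1, 'A1', '1 Jan 2000');
        INSERT INTO Item       VALUES (1, 1, 'Shoes'), (2, 1, 'Bungee Ropes');
        INSERT INTO Payment    VALUES (1, 1, 'due Jan 15'), (2, 1, 'due Jan 20');
    """)
    return conn


def publish(conn):
    """Tag and structure outside the engine, issuing nested queries."""
    out, n_queries = [], 0

    def query(sql, *params):
        nonlocal n_queries
        n_queries += 1                      # count every round trip
        return conn.execute(sql, params).fetchall()

    for cid, name in query("SELECT id, name FROM Customer ORDER BY id"):
        out.append(f"<customer id={quoteattr(str(cid))}><name>{escape(name)}</name><accounts>")
        for aid, num in query(
                "SELECT id, acctnum FROM Account WHERE custID = ? ORDER BY id", cid):
            out.append(f"<account id={quoteattr(aid)}>{num}</account>")
        out.append("</accounts><porders>")
        for pid, acct, date in query(
                "SELECT id, acctID, date FROM PurchOrder WHERE custID = ? ORDER BY id", cid):
            out.append(f"<porder id={quoteattr(str(pid))} acct={quoteattr(acct)}>"
                       f"<date>{escape(date)}</date><items>")
            for iid, desc in query(
                    'SELECT id, "desc" FROM Item WHERE poID = ? ORDER BY id', pid):
                out.append(f"<item id={quoteattr(str(iid))}>{escape(desc)}</item>")
            out.append("</items><payments>")
            for yid, desc in query(
                    'SELECT id, "desc" FROM Payment WHERE poID = ? ORDER BY id', pid):
                out.append(f"<payment id={quoteattr(str(yid))}>{escape(desc)}</payment>")
            out.append("</payments></porder>")
        out.append("</porders></customer>")
    return "".join(out), n_queries


xml_doc, n_queries = publish(make_db())
print(n_queries, "queries issued")
print(xml_doc)
```

The query counter makes the cost visible: the number of queries grows with the number of customers and purchase orders, because the loops fix the join order in application code.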

Page 12: Relational data as_xml

Early tagging and structuring

Correlated CLOB - Inside the engine
Approach: push the queries into the engine.
• Plug XMLAGG and XMLCONSTRUCT support into the engine.
• XML fragments are represented as Character Large Objects (CLOBs).
Performance issues:
• The engine has to handle huge CLOBs.
• Fixed join order, which implies a nested loop join strategy.
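The idea of pushing tagging and structuring into the engine can be approximated with a single query whose correlated sub-queries build the XML as strings. The sketch below is an assumption-laden stand-in: it uses SQLite, plain string concatenation, and `group_concat` in place of XML constructors and XMLAGG, omits payments for brevity, and does no XML escaping because the sample data has no special characters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer  (id INTEGER, name VARCHAR);
    CREATE TABLE Account   (id VARCHAR, custID INTEGER, acctnum INTEGER);
    CREATE TABLE PurchOrder(id INTEGER, custID INTEGER, acctID VARCHAR, date VARCHAR);
    CREATE TABLE Item      (id INTEGER, poID INTEGER, "desc" VARCHAR);
    INSERT INTO Customer   VALUES (1, 'John Doe');
    INSERT INTO Account    VALUES ('A1', 1, 1894654), ('A2', 1, 3849342);
    INSERT INTO PurchOrder VALUES (1, 1, 'A1', '1 Jan 2000');
    INSERT INTO Item       VALUES (1, 1, 'Shoes'), (2, 1, 'Bungee Ropes');
""")

# One query; every sub-query is correlated with its parent row, so the
# engine evaluates it once per parent (a fixed, nested-loop-like order).
QUERY = """
SELECT cust.name,
       '<customer id="' || cust.id || '"><name>' || cust.name || '</name><accounts>'
       || COALESCE((SELECT group_concat(
                        '<account id="' || acct.id || '">' || acct.acctnum || '</account>', '')
                    FROM Account acct
                    WHERE acct.custID = cust.id), '')
       || '</accounts><porders>'
       || COALESCE((SELECT group_concat(
                        '<porder id="' || porder.id || '" acct="' || porder.acctID || '"><items>'
                        || COALESCE((SELECT group_concat(
                                         '<item id="' || item.id || '">' || item."desc" || '</item>', '')
                                     FROM Item item
                                     WHERE item.poID = porder.id), '')
                        || '</items></porder>', '')
                    FROM PurchOrder porder
                    WHERE porder.custID = cust.id), '')
       || '</porders></customer>' AS customer_xml
FROM Customer cust
ORDER BY cust.id
"""

rows = conn.execute(QUERY).fetchall()
for name, customer_xml in rows:
    print(name, customer_xml)
```

Each intermediate fragment here is an ordinary string value flowing through the query, which is the role the CLOBs play on this slide.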


Page 14: Relational data as_xml

Early tagging and structuring

De-Correlated CLOB - Inside the engine
Approach: decorrelate the query and use outer joins, so the join order is no longer fixed.
• Compute the account lists associated with all customers.
• Compute the purchase order lists associated with all customers.
• Join the results above on customer id.
• CLOBs are still carried around (due to early tagging!).
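The decorrelated plan can be sketched as grouped sub-results that are computed once for all customers and then outer-joined on the customer id. As before, this is only an illustration under assumptions: SQLite, string concatenation with `group_concat` in place of XML constructors and XMLAGG, no payments, no XML escaping, and a second sample customer (Jane Roe, invented here) with no accounts or orders to show why outer joins are needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer  (id INTEGER, name VARCHAR);
    CREATE TABLE Account   (id VARCHAR, custID INTEGER, acctnum INTEGER);
    CREATE TABLE PurchOrder(id INTEGER, custID INTEGER, acctID VARCHAR, date VARCHAR);
    CREATE TABLE Item      (id INTEGER, poID INTEGER, "desc" VARCHAR);
    INSERT INTO Customer   VALUES (1, 'John Doe'), (2, 'Jane Roe');
    INSERT INTO Account    VALUES ('A1', 1, 1894654), ('A2', 1, 3849342);
    INSERT INTO PurchOrder VALUES (1, 1, 'A1', '1 Jan 2000');
    INSERT INTO Item       VALUES (1, 1, 'Shoes'), (2, 1, 'Bungee Ropes');
""")

QUERY = """
WITH acct_lists AS (      -- account list for every customer at once
    SELECT custID,
           group_concat('<account id="' || id || '">' || acctnum || '</account>', '') AS xml
    FROM Account
    GROUP BY custID
),
item_lists AS (           -- item list for every purchase order at once
    SELECT poID,
           group_concat('<item id="' || id || '">' || "desc" || '</item>', '') AS xml
    FROM Item
    GROUP BY poID
),
porder_lists AS (         -- purchase order list for every customer at once
    SELECT po.custID,
           group_concat('<porder id="' || po.id || '" acct="' || po.acctID || '"><items>'
                        || COALESCE(il.xml, '') || '</items></porder>', '') AS xml
    FROM PurchOrder po
    LEFT OUTER JOIN item_lists il ON il.poID = po.id
    GROUP BY po.custID
)
SELECT cust.name,
       '<customer id="' || cust.id || '"><name>' || cust.name || '</name><accounts>'
       || COALESCE(al.xml, '') || '</accounts><porders>'
       || COALESCE(pl.xml, '') || '</porders></customer>' AS customer_xml
FROM Customer cust
LEFT OUTER JOIN acct_lists   al ON al.custID = cust.id
LEFT OUTER JOIN porder_lists pl ON pl.custID = cust.id
ORDER BY cust.id
"""

rows = conn.execute(QUERY).fetchall()
for name, customer_xml in rows:
    print(name, customer_xml)
```

Because the lists are plain joins and group-bys, the optimizer is free to pick the join order and method, but the tagged strings still travel through every join.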
