Efficiently Publishing Relational Data as XML Documents
Authors: J. Shanmugasundaram, Michael Carey, et al. (IBM Almaden Research Center)
Presented by Harshavardhan Achrekar (University of Massachusetts-Lowell)
What drove them?
XML emerging as standard for business data exchange on World Wide Web.
Need a mechanism to publish currently stored relational data as XML Documents.
Primary Issues
Language specification: structure and tag data from tables as hierarchical XML documents.
Best implementation technique: study the characteristics and performance of the alternatives for constructing XML documents.
• When should tags and structure be added?
• How much of the processing is done within the relational engine?
RoadMap
Language specification based on SQL
Implementation
• Early tagging, early structuring
• Late tagging, late structuring
• Early structuring, late tagging
Performance Evaluation
Sample XML Document for Customer
<customer id="C1">
  <name> John Doe </name>
  <accounts>
    <account id="A1"> 1894654 </account>
    <account id="A2"> 3849342 </account>
  </accounts>
  <porders>
    <porder id="PO1" acct="A1"> <!-- first purchase order -->
      <date>1 Jan 2000</date>
      <items>
        <item id="I1"> Shoes </item>
        <item id="I2"> Bungee Ropes </item>
      </items>
      <payments>
        <payment id="P1"> due Jan 15 </payment>
        <payment id="P2"> due Jan 20 </payment>
        <payment id="P3"> due Feb 15 </payment>
      </payments>
    </porder>
    <porder id="PO2" acct="A2"> <!-- second purchase order -->
      …
    </porder>
  </porders>
</customer>
Note the:
• Elements
• Names/tags
• ID refs
• Attributes
• Nested sub-elements
Underlying tables
Customer(id int, name varchar)
Account(id varchar, custID int, acctnum int)
Item(id int, poID int, desc varchar)
PurchOrder(id int, custID int, acctID varchar, date varchar)
Payment(id int, poID int, desc varchar)
SQL-based language specifications
SQL functions:
Define XML Constructor CUST(Custid: integer, CustName: varchar) AS {
  <Customer id=$Custid> $CustName </Customer>
}
SQL aggregates:
Select XMLAGG(ITEM(item.id, item.desc))
From Item item   // returns an XML aggregation of items
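The two building blocks above can be tried out in a small Python sketch. Here SQLite, accessed through the standard sqlite3 module, stands in for the relational engine, and ITEM and XMLAGG are registered as user-defined functions. The choice of SQLite, the lowercase tag names, the integer ids, and the two sample rows (borrowed from the sample customer document) are assumptions of this sketch, not the authors' implementation.

```python
import sqlite3
from xml.sax.saxutils import escape, quoteattr


def item(item_id, desc):
    """Scalar function: tag one Item row as an XML element."""
    return f"<item id={quoteattr(str(item_id))}>{escape(str(desc))}</item>"


class XmlAgg:
    """Aggregate function: concatenate the XML fragments of a group."""

    def __init__(self):
        self.parts = []

    def step(self, fragment):
        if fragment is not None:
            self.parts.append(fragment)

    def finalize(self):
        return "".join(self.parts)


conn = sqlite3.connect(":memory:")
conn.create_function("ITEM", 2, item)
conn.create_aggregate("XMLAGG", 1, XmlAgg)

# "desc" is quoted because DESC is also an SQL keyword.
conn.execute('CREATE TABLE Item (id INTEGER, poID INTEGER, "desc" TEXT)')
conn.executemany(
    "INSERT INTO Item VALUES (?, ?, ?)",
    [(1, 1, "Shoes"), (2, 1, "Bungee Ropes")],
)

(items_xml,) = conn.execute(
    'SELECT XMLAGG(ITEM(item.id, item."desc")) FROM Item item'
).fetchone()
print(items_xml)
```

The aggregate concatenates fragments in whatever order the engine visits the rows, so a real publisher that needs a stable element order would have to sort explicitly.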
Definition of the Customer XML Constructor
Define XML Constructor CUST (custId: integer, custName: varchar(20), acctList: xml, porderList: xml) AS {
  <customer id=$custId>
    <name> $custName </name>
    <accounts> $acctList </accounts>
    <porders> $porderList </porders>
  </customer>
}
Input: the constructor's parameters
Output: a customer XML element
Aggregate function XMLAGG: concatenates the XML fragments produced by XML constructors
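As a rough illustration of what the CUST constructor and XMLAGG compute, the following Python sketch uses string.Template, whose $name placeholders happen to resemble the constructor's parameters. The function names cust and xmlagg, and the decision to escape scalar values while inserting XML-typed inputs unchanged, are assumptions made for this sketch.

```python
from string import Template
from xml.sax.saxutils import escape

# Template body mirrors the CUST constructor; $-placeholders are its parameters.
CUST_TEMPLATE = Template(
    '<customer id="$custId">'
    "<name>$custName</name>"
    "<accounts>$acctList</accounts>"
    "<porders>$porderList</porders>"
    "</customer>"
)


def cust(cust_id, cust_name, acct_list, porder_list):
    """Build one customer element; acct_list and porder_list are XML fragments."""
    return CUST_TEMPLATE.substitute(
        custId=escape(str(cust_id), {'"': "&quot;"}),
        custName=escape(cust_name),    # scalar values are escaped
        acctList=acct_list or "",      # XML-typed inputs are inserted as-is
        porderList=porder_list or "",  # a missing list becomes an empty element
    )


def xmlagg(fragments):
    """Concatenate XML fragments, as XMLAGG does over a group of rows."""
    return "".join(fragments)


accounts = xmlagg([
    '<account id="A1">1894654</account>',
    '<account id="A2">3849342</account>',
])
print(cust("C1", "John Doe", accounts, None))
```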
Sample SQL query constructs XML from relational tables
1.  Select cust.name, CUST(cust.id, cust.name,
2.      (Select XMLAGG(ACCT(acct.id, acct.acctnum))
3.       From Account acct
4.       Where acct.custId = cust.id),
5.      (Select XMLAGG(PORDER(porder.id, porder.acct, porder.date,
6.          (Select XMLAGG(ITEM(item.id, item.desc))
7.           From Item item
8.           Where item.poid = porder.id),
9.          (Select XMLAGG(PAYMENT(pay.id, pay.desc))
10.          From Payment pay
11.          Where pay.poid = porder.id)))
12.      From PurchOrder porder
13.      Where porder.custID = cust.id))
14. From Customer cust
• Lines 2-4: correlated sub-query for the customer's accounts
• Lines 5-13: correlated sub-query for the customer's purchase orders
• Lines 1 and 14: the top-level query returns each customer from the Customer table
• Each correlated sub-query returns an XML fragment
• Lines 1-14: the CUST scalar function returns the Customer XML
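The nested query can be run end to end in a small Python sketch, with SQLite standing in for the relational engine and the constructors registered as user-defined functions. Several things here are assumptions of the sketch rather than part of the talk: the helper names element and text, the integer ids taken from the table definitions, the sample rows, and the use of porder.acctID (the column name in the PurchOrder table) where the slide's query writes porder.acct.

```python
import sqlite3
from xml.sax.saxutils import escape, quoteattr


def element(tag, elem_id, body, **attrs):
    """Hypothetical helper: <tag id=... attrs>body</tag>; body is inserted as-is."""
    extra = "".join(f" {k}={quoteattr(str(v))}" for k, v in attrs.items())
    return f"<{tag} id={quoteattr(str(elem_id))}{extra}>{body}</{tag}>"


def text(value):
    return escape(str(value))


class XmlAgg:
    def __init__(self):
        self.parts = []

    def step(self, fragment):
        if fragment is not None:
            self.parts.append(fragment)

    def finalize(self):
        return "".join(self.parts)


def cust(cid, name, accts, porders):
    # An aggregate over zero rows yields NULL (None), hence the "or ''".
    body = (f"<name>{text(name)}</name><accounts>{accts or ''}</accounts>"
            f"<porders>{porders or ''}</porders>")
    return element("customer", cid, body)


def porder(pid, acct, date, items, payments):
    body = (f"<date>{text(date)}</date><items>{items or ''}</items>"
            f"<payments>{payments or ''}</payments>")
    return element("porder", pid, body, acct=acct)


conn = sqlite3.connect(":memory:")
conn.create_aggregate("XMLAGG", 1, XmlAgg)
conn.create_function("CUST", 4, cust)
conn.create_function("PORDER", 5, porder)
conn.create_function("ACCT", 2, lambda i, n: element("account", i, text(n)))
conn.create_function("ITEM", 2, lambda i, d: element("item", i, text(d)))
conn.create_function("PAYMENT", 2, lambda i, d: element("payment", i, text(d)))

conn.executescript("""
CREATE TABLE Customer(id INTEGER, name TEXT);
CREATE TABLE Account(id TEXT, custID INTEGER, acctnum INTEGER);
CREATE TABLE PurchOrder(id INTEGER, custID INTEGER, acctID TEXT, date TEXT);
CREATE TABLE Item(id INTEGER, poID INTEGER, "desc" TEXT);
CREATE TABLE Payment(id INTEGER, poID INTEGER, "desc" TEXT);
INSERT INTO Customer VALUES (1, 'John Doe');
INSERT INTO Account VALUES ('A1', 1, 1894654), ('A2', 1, 3849342);
INSERT INTO PurchOrder VALUES (1, 1, 'A1', '1 Jan 2000');
INSERT INTO Item VALUES (1, 1, 'Shoes'), (2, 1, 'Bungee Ropes');
INSERT INTO Payment VALUES (1, 1, 'due Jan 15'), (2, 1, 'due Jan 20');
""")

QUERY = """
SELECT cust.name, CUST(cust.id, cust.name,
    (SELECT XMLAGG(ACCT(acct.id, acct.acctnum))
     FROM Account acct WHERE acct.custID = cust.id),
    (SELECT XMLAGG(PORDER(porder.id, porder.acctID, porder.date,
        (SELECT XMLAGG(ITEM(item.id, item."desc"))
         FROM Item item WHERE item.poID = porder.id),
        (SELECT XMLAGG(PAYMENT(pay.id, pay."desc"))
         FROM Payment pay WHERE pay.poID = porder.id)))
     FROM PurchOrder porder WHERE porder.custID = cust.id))
FROM Customer cust
"""
for name, doc in conn.execute(QUERY):
    print(name, doc)
```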
Implementation Alternatives
Two main differences:
• Nesting (structuring)
• Tagging
Space of alternatives:
                     Late Tagging                     Early Tagging
Late Structuring     Inside Engine, Outside Engine    -
Early Structuring    Inside Engine, Outside Engine    Inside Engine (CLOB), Outside Engine (Stored Procedures)
Early tagging and structuring
Stored Procedure: outside-the-engine approach
• Explicitly issue nested queries
• Algorithm:
  1. First, query to retrieve the root elements (customer id, name).
  2. Using the customer id, issue a query to retrieve the account info.
  3. Next, for the same customer id, issue a query to retrieve the customer's purchase orders.
  4. For each purchase order retrieved, query to get the item and payment info.
  5. Once done, processing of one customer is over. Repeat the same for the next customer until the entire XML document is ready.
• Fixed-order nested loop join outside the engine
• Tag and structure each element as soon as its data is ready
• Many SQL queries issued: one per tuple for tables with nested structure
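The steps above can be sketched in Python as an application-side nested loop. SQLite stands in for the relational engine, and the helper names run and leaves, the integer ids, and the sample rows are assumptions of this sketch. A counter records how many queries cross the engine boundary.

```python
import sqlite3
from xml.sax.saxutils import escape, quoteattr

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer(id INTEGER, name TEXT);
CREATE TABLE Account(id TEXT, custID INTEGER, acctnum INTEGER);
CREATE TABLE PurchOrder(id INTEGER, custID INTEGER, acctID TEXT, date TEXT);
CREATE TABLE Item(id INTEGER, poID INTEGER, "desc" TEXT);
CREATE TABLE Payment(id INTEGER, poID INTEGER, "desc" TEXT);
INSERT INTO Customer VALUES (1, 'John Doe');
INSERT INTO Account VALUES ('A1', 1, 1894654), ('A2', 1, 3849342);
INSERT INTO PurchOrder VALUES (1, 1, 'A1', '1 Jan 2000');
INSERT INTO Item VALUES (1, 1, 'Shoes'), (2, 1, 'Bungee Ropes');
INSERT INTO Payment VALUES (1, 1, 'due Jan 15'), (2, 1, 'due Jan 20'),
                           (3, 1, 'due Feb 15');
""")

queries_issued = 0


def run(sql, params=()):
    """Issue one SQL query from outside the engine and count it."""
    global queries_issued
    queries_issued += 1
    return conn.execute(sql, params).fetchall()


def leaves(tag, sql, key):
    """Query the child rows for one parent key and tag each row immediately."""
    return "".join(
        f"<{tag} id={quoteattr(str(i))}>{escape(str(t))}</{tag}>"
        for i, t in run(sql, (key,))
    )


def publish_customers():
    out = []
    # Outer loop: one customer at a time, in a fixed order.
    for cid, name in run("SELECT id, name FROM Customer ORDER BY id"):
        out.append(f'<customer id="{cid}"><name>{escape(name)}</name><accounts>')
        out.append(leaves(
            "account",
            "SELECT id, acctnum FROM Account WHERE custID = ? ORDER BY id", cid))
        out.append("</accounts><porders>")
        for pid, acct, date in run(
                "SELECT id, acctID, date FROM PurchOrder "
                "WHERE custID = ? ORDER BY id", (cid,)):
            out.append(f'<porder id="{pid}" acct={quoteattr(acct)}>'
                       f"<date>{escape(date)}</date><items>")
            out.append(leaves(
                "item",
                'SELECT id, "desc" FROM Item WHERE poID = ? ORDER BY id', pid))
            out.append("</items><payments>")
            out.append(leaves(
                "payment",
                'SELECT id, "desc" FROM Payment WHERE poID = ? ORDER BY id', pid))
            out.append("</payments></porder>")
        out.append("</porders></customer>")
    return "".join(out)


document = publish_customers()
print(document)
print("queries issued:", queries_issued)
```

In this sketch the query count is one for the customer list, plus two per customer, plus two per purchase order, which is why the number of queries grows with the data.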
Early tagging and structuring
Correlated CLOB: inside-the-engine approach
• Push queries into the engine
• Plug XMLAGG and XMLCONSTRUCT support into the engine
• Character Large Objects (CLOBs) hold the XML fragments
• Performance issue: handling huge CLOBs in the engine
• Fixed join order, which implies a nested loop join strategy
Early tagging and structuring
De-Correlated CLOB: inside-the-engine approach
• Decorrelate and use outer joins, so the join order is no longer fixed
• Compute the account lists associated with all customers
• Compute the purchase order lists associated with all customers
• Join the results above on customer id
• Still carry around CLOBs (due to early tagging!)
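A decorrelated version of the query can be sketched as follows, again with SQLite standing in for the relational engine. Each list is computed once for all parents with GROUP BY, and the tagged fragments are then outer-joined on the parent id. The common table expression names, the helper leaf, the integer ids, and the sample rows (including a hypothetical second customer with no accounts or orders) are assumptions of this sketch.

```python
import sqlite3
from xml.sax.saxutils import escape, quoteattr


class XmlAgg:
    """XMLAGG stand-in: concatenates the XML fragments (CLOBs) of a group."""

    def __init__(self):
        self.parts = []

    def step(self, fragment):
        if fragment is not None:
            self.parts.append(fragment)

    def finalize(self):
        return "".join(self.parts)


def leaf(tag):
    """Return a 2-argument constructor producing <tag id="...">text</tag>."""
    return lambda i, t: f"<{tag} id={quoteattr(str(i))}>{escape(str(t))}</{tag}>"


def porder(pid, acct, date, items, pays):
    return (f"<porder id={quoteattr(str(pid))} acct={quoteattr(str(acct))}>"
            f"<date>{escape(date)}</date><items>{items or ''}</items>"
            f"<payments>{pays or ''}</payments></porder>")


def cust(cid, name, accts, porders):
    # Outer joins deliver NULL (None) for customers without a list.
    return (f"<customer id={quoteattr(str(cid))}><name>{escape(name)}</name>"
            f"<accounts>{accts or ''}</accounts>"
            f"<porders>{porders or ''}</porders></customer>")


conn = sqlite3.connect(":memory:")
conn.create_aggregate("XMLAGG", 1, XmlAgg)
conn.create_function("CUST", 4, cust)
conn.create_function("PORDER", 5, porder)
for fn, tag in (("ACCT", "account"), ("ITEM", "item"), ("PAYMENT", "payment")):
    conn.create_function(fn, 2, leaf(tag))

conn.executescript("""
CREATE TABLE Customer(id INTEGER, name TEXT);
CREATE TABLE Account(id TEXT, custID INTEGER, acctnum INTEGER);
CREATE TABLE PurchOrder(id INTEGER, custID INTEGER, acctID TEXT, date TEXT);
CREATE TABLE Item(id INTEGER, poID INTEGER, "desc" TEXT);
CREATE TABLE Payment(id INTEGER, poID INTEGER, "desc" TEXT);
INSERT INTO Customer VALUES (1, 'John Doe'), (2, 'No Orders');  -- 2 is hypothetical
INSERT INTO Account VALUES ('A1', 1, 1894654), ('A2', 1, 3849342);
INSERT INTO PurchOrder VALUES (1, 1, 'A1', '1 Jan 2000');
INSERT INTO Item VALUES (1, 1, 'Shoes'), (2, 1, 'Bungee Ropes');
INSERT INTO Payment VALUES (1, 1, 'due Jan 15'), (2, 1, 'due Jan 20');
""")

QUERY = """
WITH acct_lists AS (
    SELECT custID, XMLAGG(ACCT(id, acctnum)) AS frag
    FROM Account GROUP BY custID),
item_lists AS (
    SELECT poID, XMLAGG(ITEM(id, "desc")) AS frag FROM Item GROUP BY poID),
pay_lists AS (
    SELECT poID, XMLAGG(PAYMENT(id, "desc")) AS frag FROM Payment GROUP BY poID),
porder_lists AS (
    SELECT po.custID,
           XMLAGG(PORDER(po.id, po.acctID, po.date, il.frag, pl.frag)) AS frag
    FROM PurchOrder po
    LEFT OUTER JOIN item_lists il ON il.poID = po.id
    LEFT OUTER JOIN pay_lists pl ON pl.poID = po.id
    GROUP BY po.custID)
SELECT c.name, CUST(c.id, c.name, al.frag, pol.frag)
FROM Customer c
LEFT OUTER JOIN acct_lists al ON al.custID = c.id
LEFT OUTER JOIN porder_lists pol ON pol.custID = c.id
ORDER BY c.id
"""
for name, doc in conn.execute(QUERY):
    print(name, doc)
```

The outer joins keep customers that have no accounts or purchase orders, and because each list is aggregated before the join, the engine is free to choose the join order and method. The tagged fragments still travel through the joins as strings, which is the CLOB overhead the slide points out.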