XML To Relational Model. Key Index – Forward Traversal Backward Traversal.

  • date post

    15-Jan-2016
  • Category

    Documents



Key Index – Forward Traversal

Backward Traversal


Binary Approach

Bname(source, ordinal, flag, target)
  • Create as many tables as there are different subelement and attribute names in the XML document
  • Partition the Edge Table by name

Universal table: take the outer join of all binary tables
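
Partitioning the edge table by name can be sketched as below. This is only an illustration: the edge-table column layout (source, ordinal, name, flag, target), the sample rows, and the helper name partition_by_name are assumptions, not taken from the slides.

```python
from collections import defaultdict

# Hypothetical edge-table rows: (source, ordinal, name, flag, target).
EDGE = [
    (1, 1, "Name", "string", "v1"),
    (1, 2, "Department", "ref", 2),
    (1, 3, "Department", "ref", 3),
    (2, 1, "Name", "string", "v2"),
]

def partition_by_name(edge_rows):
    """Split the edge table into one binary table per distinct name."""
    binary = defaultdict(list)
    for source, ordinal, name, flag, target in edge_rows:
        # The name moves into the table name, so it is dropped from the row.
        binary["B" + name].append((source, ordinal, flag, target))
    return dict(binary)

tables = partition_by_name(EDGE)
```

Each resulting table has the Bname(source, ordinal, flag, target) shape, one table per element or attribute name.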


Universal Table with Overflow


Converting Ordered XML to Relations


Skynet Hitech. Company

<Company>
  <Name>Skynet Hitech</Name>
  <Department>
    <Name>Research</Name>
    <Manager>John Smith</Manager>
    <Employee>Tom Jackson</Employee>
  </Department>
  <Department>
    <Name>Sales</Name>
    <Manager>Linda White</Manager>
    <Employee>Kevin Lee</Employee>
  </Department>
</Company>


Ordered XML model for Skynet Hitech. Company

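[Figure: document tree, with each node's position among its siblings in parentheses]

Company (1)
  Name (1): Skynet Hitech
  Department (2)
    Name (1): Research
    Manager (2): John Smith
    Employee (3): Tom Jackson
  Department (3)
    Name (1): Sales
    Manager (2): Linda White
    Employee (3): Kevin Lee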


Schema of the storing table

Attributes
  • ID: the unique index for each tuple
  • DID: the document ID
  • Path: the path from the root to the leaf node, used to find a particular node
  • Surrogate Pattern: number representation of nodes
  • Value: text value associated with each node


Numbering nodes

[Figure: the Skynet Hitech document tree with numbered nodes; the path to "Linda White" is labeled Company 1[1], Department 2[2], Manager 2[1].]


Tuple that stores “Linda White”

ID: 00334
DID: 501
Path: Company/Department/Manager
Surrogate Pattern: 1[1]2[2]2[1]
Value: Linda White


Old Skynet file stored in the RDBMS

OLD  

Path                          Surrogate Pattern   Value
Company/Name                  1[1]1[1]            Skynet Hitech
Company/Department/Name       1[1]2[1]1[1]        Research
Company/Department/Manager    1[1]2[1]2[1]        John Smith
Company/Department/Employee   1[1]2[1]3[1]        Tom Jackson
Company/Department/Name       1[1]2[2]1[1]        Sales
Company/Department/Manager    1[1]2[2]2[1]        Linda White
Company/Department/Employee   1[1]2[2]3[1]        Kevin Lee
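
The rows above can be reproduced with a short sketch. The slides do not spell out the numbering rule, so the code assumes one that is consistent with the table: in n[k], n is the order in which the element's name first appears among its siblings, and k counts occurrences of that name under the same parent. The function name shred and the rule itself are assumptions.

```python
import xml.etree.ElementTree as ET

XML = """<Company>
  <Name>Skynet Hitech</Name>
  <Department>
    <Name>Research</Name><Manager>John Smith</Manager>
    <Employee>Tom Jackson</Employee>
  </Department>
  <Department>
    <Name>Sales</Name><Manager>Linda White</Manager>
    <Employee>Kevin Lee </Employee>
  </Department>
</Company>"""

def shred(xml_text):
    """Return (path, surrogate pattern, value) for every leaf element."""
    root = ET.fromstring(xml_text)
    rows = []

    def visit(elem, path, pattern):
        children = list(elem)
        if not children:
            rows.append((path, pattern, (elem.text or "").strip()))
            return
        name_number = {}   # assumed rule: number names by first appearance
        occurrence = {}    # assumed rule: count repeats of the same name
        for child in children:
            number = name_number.setdefault(child.tag, len(name_number) + 1)
            occurrence[child.tag] = occurrence.get(child.tag, 0) + 1
            step = f"{number}[{occurrence[child.tag]}]"
            visit(child, f"{path}/{child.tag}", pattern + step)

    visit(root, root.tag, "1[1]")
    return rows

rows = shred(XML)
for path, pattern, value in rows:
    print(path, pattern, value)
```

Adding the ID and DID columns from the storing-table schema would only require prefixing each row with a tuple counter and a document identifier.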


[Figure: DTD graph. Node labels: book, booktitle, article, title, contactauthor, authorID, monograph, editor, name, author, authorid, name, address, firstname, lastname; operator nodes "*" and "?".]


<!ELEMENT book (booktitle, author)>
<!ELEMENT booktitle (#PCDATA)>
<!ELEMENT author (name, address)>
<!ATTLIST author id ID #REQUIRED>
<!ELEMENT name (firstname?, lastname)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT address ANY>
<!ELEMENT article (title, author*, contactauthor)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT contactauthor EMPTY>
<!ATTLIST contactauthor authorID IDREF #IMPLIED>
<!ELEMENT monograph (title, author, editor)>
<!ELEMENT editor (monograph*)>
<!ATTLIST editor name CDATA #REQUIRED>


Basic Inline Algorithm

A relation is created for the root element of the graph.

All of the element's descendants are inlined into that relation, except:
  • Children below a "*" node are made into separate relations; this corresponds to creating a new relation for a set-valued child
  • Each node that has a backpointer edge pointing to it is made into a separate relation


Drawbacks

  • Grossly inefficient for many queries: "List all authors having first name Jack" has to be executed as the union of 5 separate queries
  • It creates a large number of relations


To determine the set of relations to be created for an element, we construct an element graph as follows:
  • Do a DFS traversal of the DTD graph, starting at the element node for which we are constructing relations
  • Each node is marked as "visited" the first time it is reached and is unmarked once all its children have been traversed
  • If an unmarked node in the DTD graph is reached during the DFS, a new node bearing the same name is created in the element graph
  • A regular edge is created from the most recently created node in the element graph with the same name as the DFS parent of the current DTD node to the newly created node
  • If an attempt is made to traverse an already marked DTD node, a backpointer edge is added from the most recently created node in the element graph to the most recently created node in the element graph with the same name as the marked DTD node
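
The traversal described above can be sketched as follows. This is a simplified illustration, not a reference implementation: the operator nodes ("*", "?") and attributes of the DTD graph are left out, the dictionary DTD_GRAPH is a hand-written encoding of the sample DTD, and the function name element_graph is hypothetical.

```python
# Simplified DTD graph for the sample DTD: element -> child elements.
DTD_GRAPH = {
    "book": ["booktitle", "author"],
    "booktitle": [],
    "author": ["name", "address"],
    "name": ["firstname", "lastname"],
    "firstname": [], "lastname": [], "address": [],
    "article": ["title", "author", "contactauthor"],
    "title": [], "contactauthor": [],
    "monograph": ["title", "author", "editor"],
    "editor": ["monograph"],
}

def element_graph(dtd, start):
    """Build the element graph for `start` by DFS over the DTD graph."""
    nodes = []        # element-graph nodes; list index is the node id
    regular = []      # regular edges as (from_id, to_id)
    back = []         # backpointer edges as (from_id, to_id)
    latest = {}       # name -> id of the most recently created node
    marked = set()    # DTD nodes currently marked as visited

    def dfs(name, parent_name):
        if name in marked:
            # Already marked: add a backpointer from the most recently
            # created node to the latest node with the marked name.
            back.append((len(nodes) - 1, latest[name]))
            return
        marked.add(name)
        nodes.append(name)
        node_id = len(nodes) - 1
        if parent_name is not None:
            regular.append((latest[parent_name], node_id))
        latest[name] = node_id
        for child in dtd[name]:
            dfs(child, name)
        marked.discard(name)  # unmark once all children are traversed

    dfs(start, None)
    return nodes, regular, back

nodes, regular, back = element_graph(DTD_GRAPH, "monograph")
```

For monograph, the only backpointer edge runs from the editor node back to the monograph root, which reflects the recursion monograph → editor → monograph in the DTD.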


Fragmentation: Example

Results in 5 relations. Just retrieving the first and last names of an author requires three joins!

<!ELEMENT author (name, address)>
<!ATTLIST author id ID #REQUIRED>

<!ELEMENT name (firstname?, lastname)>

<!ELEMENT firstname (#PCDATA)>

<!ELEMENT lastname (#PCDATA)>

<!ELEMENT address ANY>

author (authorID: integer, id: string)

name (nameID: integer, authorID: integer)

firstname (firstnameID: integer, nameID: integer, value: string)

lastname (lastnameID: integer, nameID: integer, value: string)

address (addressID: integer, authorID: integer, value: string)
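
To make the cost concrete, the five relations above can be loaded into an in-memory SQLite database and queried. This is a sketch under assumptions: the sample row values are invented for illustration, and SQLite stands in for whatever RDBMS is actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author    (authorID INTEGER, id TEXT);
CREATE TABLE name      (nameID INTEGER, authorID INTEGER);
CREATE TABLE firstname (firstnameID INTEGER, nameID INTEGER, value TEXT);
CREATE TABLE lastname  (lastnameID INTEGER, nameID INTEGER, value TEXT);
CREATE TABLE address   (addressID INTEGER, authorID INTEGER, value TEXT);
-- Hypothetical sample data for a single author.
INSERT INTO author    VALUES (1, 'a1');
INSERT INTO name      VALUES (10, 1);
INSERT INTO firstname VALUES (100, 10, 'Jack');
INSERT INTO lastname  VALUES (200, 10, 'Smith');
""")

# Three joins just to put an author's first and last name side by side.
# firstname is optional in the DTD (firstname?), hence the LEFT JOIN.
rows = conn.execute("""
SELECT a.id, f.value, l.value
FROM author a
JOIN name n           ON n.authorID = a.authorID
LEFT JOIN firstname f ON f.nameID = n.nameID
JOIN lastname l       ON l.nameID = n.nameID
""").fetchall()
print(rows)
```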


Shared Inlining Method

Relations are created for:
  • All elements in the DTD graph whose nodes have an in-degree greater than one (nodes with an in-degree of one are inlined)
  • Elements that have an in-degree of zero
  • Elements below a "*" node
  • Of mutually recursive elements all having in-degree one, one of them is made a separate relation

Each element node X that is a separate relation inlines all nodes Y that are reachable from it such that the path from X to Y does not contain a node that is to be made a separate relation.
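
The first three rules and the inlining step can be sketched on the sample DTD as below. Several things here are assumptions: the dictionary DTD is a hand-written encoding in which each child carries a flag saying whether it sits below a "*" node, attributes are omitted, the function names are hypothetical, and the rule for mutually recursive elements with in-degree one is not implemented (in this DTD the recursive element monograph is already covered by the "*" rule).

```python
# element -> list of (child, is_below_star)
DTD = {
    "book": [("booktitle", False), ("author", False)],
    "booktitle": [],
    "author": [("name", False), ("address", False)],
    "name": [("firstname", False), ("lastname", False)],
    "firstname": [], "lastname": [], "address": [],
    "article": [("title", False), ("author", True), ("contactauthor", False)],
    "title": [], "contactauthor": [],
    "monograph": [("title", False), ("author", False), ("editor", False)],
    "editor": [("monograph", True)],
}

def shared_relations(dtd):
    """Elements that become separate relations under the rules above."""
    in_degree = {name: 0 for name in dtd}
    below_star = set()
    for children in dtd.values():
        for child, starred in children:
            in_degree[child] += 1
            if starred:
                below_star.add(child)
    return {
        name for name in dtd
        if in_degree[name] == 0 or in_degree[name] > 1 or name in below_star
    }

def inlined_into(dtd, relations, start):
    """Nodes reachable from `start` without crossing another relation."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for child, _ in dtd[node]:
            if child not in relations and child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

relations = shared_relations(DTD)
author_columns = inlined_into(DTD, relations, "author")
print(sorted(relations))
print(sorted(author_columns))
```

On this DTD the sketch yields relations for book, article, monograph, title, and author, with name, firstname, lastname, and address inlined into author.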


Issues with Sharing Elements

Parent of elements not fixed at schema level

Need to store type and ids of parents
  • parentCODE field (type of parent)
  • parentID field (id of parent)
  • No foreign key relationship


Hybrid

Same as Shared, except that it inlines some elements that are not inlined in Shared:
  • Inlines elements with in-degree greater than one that are not recursive or reached through a "*" node
  • Set sub-elements and recursive elements are treated as in Shared


book (bookID: integer, book.booktitle.isroot: boolean, book.booktitle : string)

article (articleID: integer, article.contactauthor.isroot: boolean, article.contactauthor.authorid: string)

monograph (monographID: integer, monograph.parentID: integer, monograph.parentCODE: integer, monograph.editor.isroot: boolean, monograph.editor.name: string)

title (titleID: integer, title.parentID: integer, title.parentCODE: integer, title: string)

author (authorID: integer, author.parentID: integer, author.parentCODE: integer, author.name.isroot: boolean, author.name.firstname.isroot: boolean, author.name.firstname: string, author.name.lastname.isroot: boolean, author.name.lastname: string, author.address.isroot: boolean, author.address: string, author.authorid: string)


Shared Inline


Hybrid
