Enumerating XML Data for Dynamic Updating
L. Kit and V. Ng, Hong Kong Polytechnic University
Sang-Ho Nah, Lily, Daniel, Yun Hee Lee
Introduction
n-INode: a new model-mapping approach
Multidimensional node ID for indexing and node-to-node relationship calculation
Supports flexible dynamic updating of XML data
n-INode
XML document is represented as an nc-ary complete tree, where nc = maximum number of child nodes per node
Multidimensional node ID: a k-dimensional ID (id1, io1, id2, io2, …, idk, iok)
idx: node identifier assigned by the numbering scheme
iox: insertion order, a sequential number starting from 0
The presence of iox allows more than nc child nodes to be inserted, with no re-calculation of existing nodes’ IDs required
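The ID layout above can be sketched as a small data structure. This is a minimal illustration only, not the paper's implementation: the class name NodeId and the methods k and flat are hypothetical, and the numbering scheme that produces each idx is not described in these slides, so the values are simply supplied by hand.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class NodeId:
    """Hypothetical container for a k-dimensional n-INode ID."""

    # Each dimension is one (id_x, io_x) pair: the identifier from the
    # numbering scheme and the insertion order (starting from 0).
    dims: Tuple[Tuple[int, int], ...]

    @property
    def k(self) -> int:
        # Number of dimensions in the ID.
        return len(self.dims)

    def flat(self) -> Tuple[int, ...]:
        # Flatten to the (id1, io1, id2, io2, ..., idk, iok) form.
        return tuple(value for pair in self.dims for value in pair)


# A 2-dimensional ID with made-up values.
example = NodeId(((5, 0), (2, 1)))
print(example.k, example.flat())
```

Keeping each (idx, iox) pair together makes the number of dimensions explicit, which is the quantity the relationship rules compare.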
n-INode (cont.)
Insertion rules:
If the newly inserted node’s id1 already exists in the tree, its io1 is incremented from the maximum io1 among existing nodes with the same id1
If the new node is inserted at the “rightmost position” and the maximum io1 (over all nodes with the same id1) is less than nc, then io1 = nc+1
A new dimension is introduced for all descendants of a node that has io1 > 0. The parent’s first dimension is assigned to the child’s first dimension.
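The first insertion rule can be sketched as follows. This covers only the "increment from the maximum io1" rule; the rightmost-position rule and the introduction of a new dimension are left out because the slides do not give enough detail to implement them. The function name next_insertion_order and the representation of existing nodes as (id1, io1) pairs are assumptions made for illustration.

```python
from typing import Iterable, Tuple


def next_insertion_order(existing: Iterable[Tuple[int, int]], id1: int) -> int:
    """Return io1 for a new node whose first identifier is id1.

    existing holds the (id1, io1) pairs already present in the tree.
    """
    # Collect the insertion orders of nodes that share the same id1.
    orders = [io1 for (node_id1, io1) in existing if node_id1 == id1]
    # Increment from the maximum if id1 is already in use; otherwise
    # start at 0, since insertion order is numbered from 0.
    return max(orders) + 1 if orders else 0


# Two nodes already use id1 = 4, so the next one gets io1 = 2.
print(next_insertion_order([(4, 0), (4, 1), (7, 0)], 4))
```

Because only the new node receives a fresh io1, the IDs of existing nodes stay untouched, which is the property the slides emphasize.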
n-INode (cont.)
Parent-Child relationship:
Pair of nodes with the same number of dimensions
Pair of nodes with a dimensional difference of one
Parent and child MUST share an identical first dimension
Ancestor-Descendant relationship:
The two situations above
Pair of nodes with a dimensional difference of more than one
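One literal reading of the parent-child conditions can be sketched as a quick filter. This is a necessary-condition check only: the actual parent-child test through the numbering scheme is not given in these slides. The function name may_be_parent_child is hypothetical, and treating the "first dimension" as the whole (id1, io1) pair is an assumption.

```python
from typing import Tuple

Dim = Tuple[int, int]      # one (id_x, io_x) pair
NodeId = Tuple[Dim, ...]   # a k-dimensional ID as k pairs


def may_be_parent_child(a: NodeId, b: NodeId) -> bool:
    """Filter pairs that satisfy the stated parent-child conditions."""
    # Condition 1: same number of dimensions, or a difference of one.
    dimensions_ok = abs(len(a) - len(b)) <= 1
    # Condition 2: identical first dimension (assumed to mean the
    # full (id1, io1) pair).
    first_dimension_ok = a[0] == b[0]
    return dimensions_ok and first_dimension_ok


# Dimensional difference of one, shared first dimension: passes.
print(may_be_parent_child(((3, 1),), ((3, 1), (2, 0))))
# Two 1-dimensional IDs with different first dimensions: rejected.
print(may_be_parent_child(((3, 0),), ((4, 0),)))
```

Under this reading, a pair of distinct 1-dimensional IDs can never pass the filter, since their first dimensions differ by construction.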
Implementation & Experiment
Required storage space is not the smallest of all the models tested
Other test results show that this is a reasonable trade-off
Query time is reasonable and consistent – shows it does not depend heavily on the type of query
Possible flaws in n-INode
The node relationship calculation/verification rule excludes the case where both nodes in the pair have 1-dimensional IDs (their first dimensions cannot be the same)
Allowing more than nc child nodes to be inserted changes the path sequence of each node, so the path sequence should not be used in node identifier calculation
Conclusion
Identifying the insertion order removes restriction on the number of child nodes to be inserted
Re-calculation of existing nodes’ ID is not required
This allows for more effective and efficient node-locating operations, supporting dynamic updates of XML data.
However, some aspects were overlooked, which makes the proof of correctness presented in the paper somewhat deficient.