Enumerating XML Data for Dynamic Updating, L. Kit and V. Ng, Hong Kong Polytechnic University

Sang-Ho Nah, Lily, Daniel, Yun Hee Lee

Introduction

n-INode: a new model-mapping approach

Multidimensional node ID for indexing and node-to-node relationship calculation

Supports flexible dynamic updating of XML data

n-INode

XML document is represented as a complete nc-ary tree, where nc = maximum number of child nodes per node

Multidimensional node ID: a k-dimensional ID (id1, io1, id2, io2, …, idk, iok)

idx: Node identifier assigned by numbering scheme

iox: Insertion order, sequential number starting from 0

Presence of iox allows more than nc child nodes to be inserted under a node, with no re-calculation of existing nodes’ IDs required
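As a rough illustration of the ID layout above, the sketch below models a k-dimensional ID as a tuple of (id, io) pairs and numbers a complete nc-ary tree level by level, starting from 1 at the root. The slides do not spell out the numbering scheme, so this level-order numbering and the names NodeId, parent_number and child_numbers are assumptions made for illustration only.

```python
from typing import List, Tuple

# One dimension of an ID is the pair (id_x, io_x) described on the slide.
Dimension = Tuple[int, int]
# A k-dimensional ID is a sequence of k such pairs.
NodeId = Tuple[Dimension, ...]


def parent_number(i: int, nc: int) -> int:
    """Parent of node i in a complete nc-ary tree.

    Assumes level-order numbering with the root numbered 1
    (an assumption; the slides only say "numbering scheme").
    """
    if i <= 1:
        raise ValueError("the root has no parent")
    return (i - 2) // nc + 1


def child_numbers(i: int, nc: int) -> List[int]:
    """The nc child slots of node i under the same assumed numbering."""
    first = (i - 1) * nc + 2
    return list(range(first, first + nc))


def dimensions(node_id: NodeId) -> int:
    """Number of dimensions k of an ID."""
    return len(node_id)


# A 1-dimensional ID: node number 5, first node placed at that number (io = 0).
example_id: NodeId = ((5, 0),)
```

With nc = 3, child_numbers(1, 3) gives the slots 2, 3 and 4, and parent_number maps each of them back to 1. The insertion order io is what lets an additional node share a slot number without renumbering the others.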

n-INode cont

Insertion Rules

If the newly inserted node’s id1 already exists in the tree, its io1 is incremented from the maximum io1 among existing nodes with the same id1

If the new node is inserted at the “rightmost position”, and the maximum io1 (of all the nodes with the same id1) is less than nc, then io1 = nc+1

A new dimension is introduced for all descendants of a node that has io1 > 0. The parent’s first dimension is assigned to the child’s first dimension.
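The first insertion rule can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name next_insertion_order and the representation of the tree as a plain list of (id1, io1) pairs are assumptions. The "rightmost position" rule and the introduction of new dimensions are deliberately left out.

```python
from typing import Iterable, Tuple


def next_insertion_order(existing: Iterable[Tuple[int, int]], id1: int) -> int:
    """Pick io1 for a node being inserted with first identifier id1.

    existing holds the (id1, io1) first dimensions already in the tree
    (a hypothetical representation chosen for this sketch).
    """
    orders = [io for (node_id, io) in existing if node_id == id1]
    if not orders:
        # id1 is not in use yet: insertion order starts from 0.
        return 0
    # id1 already exists: increment from the maximum io1 for that id1.
    return max(orders) + 1


tree = [(5, 0), (5, 1), (6, 0)]
new_io = next_insertion_order(tree, 5)
```

Here the new node receives io1 = 2, and the existing entries (5, 0), (5, 1) and (6, 0) keep their IDs unchanged.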

n-INode cont

Parent-Child relationship:

Pair of nodes with the same number of dimensions

Pair of nodes with a dimensional difference of one

Parent and child MUST share the identical first dimension

Ancestor-Descendant relationship:

The above 2 situations

Pair of nodes with a dimensional difference of more than one
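The dimension-count conditions above can be read as a pre-filter on a pair of IDs. The sketch below only restates those conditions; it does not verify that two nodes are actually related, which would also need the numbering scheme. The function names and the tuple-of-pairs ID representation are assumptions for illustration.

```python
from typing import Set, Tuple

Dimension = Tuple[int, int]          # (id_x, io_x)
NodeId = Tuple[Dimension, ...]       # k-dimensional ID


def candidate_relationships(a: NodeId, b: NodeId) -> Set[str]:
    """Relationships the slide's dimension-count rules leave possible."""
    diff = abs(len(a) - len(b))
    if diff <= 1:
        # Same number of dimensions, or a difference of one:
        # both relationships remain possible.
        return {"parent-child", "ancestor-descendant"}
    # A difference of more than one rules out parent-child.
    return {"ancestor-descendant"}


def share_first_dimension(a: NodeId, b: NodeId) -> bool:
    """True if both IDs carry the identical first (id1, io1) pair."""
    return a[0] == b[0]


one_dim: NodeId = ((2, 1),)
two_dim: NodeId = ((2, 1), (3, 0))
three_dim: NodeId = ((2, 1), (3, 0), (4, 0))
```

For example, one_dim and three_dim differ by two dimensions, so only the ancestor-descendant relationship remains a candidate for that pair.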

Implementation & Experiment

Required storage space is not the smallest of all the models tested

Other test results show that this is a reasonable trade-off

Query time is reasonable and consistent, which suggests it does not depend heavily on the type of query

Possible flaws in n-INode

The node relationship calculation/verification rule excludes the case where both nodes in the pair have 1-dimensional IDs (their first dimensions cannot be the same)

The path sequence of each node changes when more than nc child nodes are allowed to be inserted, so the path sequence should not be used in node identifier calculation

Conclusion

Identifying the insertion order removes the restriction on the number of child nodes that can be inserted

Re-calculation of existing nodes’ IDs is not required

This allows for more effective and efficient node-locating operations, supporting dynamic updates of XML data.

However, some aspects were overlooked and this makes the proof of correctness presented in the paper somewhat deficient.
