Covering Indexes for XML Queries by Prakash Ramanan

Covering Indexes for XML Queries by Prakash Ramanan presented by Dilek Demirel


Page 1: Covering  Indexes for XML Queries by Prakash Ramanan

Covering Indexes for XML Queries by Prakash Ramanan

presented by

Dilek Demirel

Page 2: Covering  Indexes for XML Queries by Prakash Ramanan

Contents

XML query languages

Some definitions and concepts

Bisimulation and simulation relations

Results

Page 3: Covering  Indexes for XML Queries by Prakash Ramanan

The paper is about

Minimizing the search tree by building similar but smaller graphs that are equivalent to the original XML document graph for a given class of queries.

Page 4: Covering  Indexes for XML Queries by Prakash Ramanan

An XML document can be represented as a graph D=(N, E, Eref), where N is the set of nodes, E is the set of edges and Eref is a set of idref edges.

Each edge in E denotes an element–subelement relationship; each idref edge in Eref is a reference from one element to another.

The subgraph T=(N, E) is a tree.
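As a minimal sketch of the representation D = (N, E, Eref), the document graph can be held in three small containers. The class name `XMLGraph`, its field names, and the sample document are assumptions made for illustration; they do not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class XMLGraph:
    """D = (N, E, Eref). All names here are illustrative."""
    node_type: dict = field(default_factory=dict)    # N: node id -> type t(n)
    tree_edges: set = field(default_factory=set)     # E: (parent, child) pairs
    idref_edges: set = field(default_factory=set)    # Eref: (source, target) pairs

    def is_tree(self) -> bool:
        """Check that T = (N, E) is a rooted tree, ignoring Eref."""
        parent = {}
        children = {n: [] for n in self.node_type}
        for p, c in self.tree_edges:
            if c in parent or p not in children or c not in children:
                return False          # two parents, or an unknown endpoint
            parent[c] = p
            children[p].append(c)
        roots = [n for n in self.node_type if n not in parent]
        if len(roots) != 1:
            return False
        seen, stack = set(), [roots[0]]
        while stack:                  # every node must be reachable from the root
            n = stack.pop()
            seen.add(n)
            stack.extend(children[n])
        return len(seen) == len(self.node_type)


# Hypothetical document: a book with a title, and a paper that cites the book.
doc = XMLGraph(
    node_type={0: "root", 1: "book", 2: "title", 3: "paper", 4: "cite"},
    tree_edges={(0, 1), (1, 2), (0, 3), (3, 4)},
    idref_edges={(4, 1)},             # the cite element refers to the book
)
```

Keeping Eref apart from E is what lets T = (N, E) stay a tree: if the idref edge (4, 1) were treated as a tree edge, node 1 would have two parents.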

Page 5: Covering  Indexes for XML Queries by Prakash Ramanan

Example

Page 6: Covering  Indexes for XML Queries by Prakash Ramanan

XML query languages

Some XML query languages:
  XPath
  XQuery

They allow navigation in an XML document along different axes, to locate the desired element.

Page 7: Covering  Indexes for XML Queries by Prakash Ramanan

Axes

XPath provides 13 different axes:
  self
  child
  descendant / descendant-or-self
  parent
  ancestor / ancestor-or-self
  preceding / preceding-sibling
  following / following-sibling
  attribute
  namespace
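A short sketch of navigation along a few of these axes, using Python's standard `xml.etree.ElementTree`. That module supports only a small subset of XPath, so only the child axis, the descendant axis, and an attribute test are shown. The sample document is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
doc = ET.fromstring(
    "<bib>"
    "<book id='b1'><title>T1</title><author>A</author></book>"
    "<paper><title>T2</title><section><title>T3</title></section></paper>"
    "</bib>"
)

# child axis: title elements that are children of a book child of the root
child_titles = [t.text for t in doc.findall("book/title")]

# descendant axis: every title element below the root, in document order
all_titles = [t.text for t in doc.findall(".//title")]

# attribute axis, used inside a predicate: children that carry an id attribute
with_id = [e.tag for e in doc.findall("*[@id]")]
```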

Page 8: Covering  Indexes for XML Queries by Prakash Ramanan

Subset languages of XPath

Core XPath (CXPath)
Branching Path Queries (BPQ)
Tree Pattern Queries (TPQ)

TPQ = TPQ+ ⊂ BPQ+ ⊂ CXPath+ ⊂ XPath, where C+ denotes the query language C without the operator not

Page 9: Covering  Indexes for XML Queries by Prakash Ramanan

Core XPath

Does not contain arithmetic and string operations

Has the full navigational power of XPath

Consists of all queries involving the thirteen axes and the three Boolean operators and, or, and not

Page 10: Covering  Indexes for XML Queries by Prakash Ramanan

Branching Path Queries

A subset of CXPath

CXPath queries that ignore the order of sibling elements

Allows nine axes; the four order-respecting axes (preceding, preceding-sibling, following, following-sibling) are excluded

Page 11: Covering  Indexes for XML Queries by Prakash Ramanan

Tree Pattern Queries

Involve four axes:
  self
  child
  descendant
  descendant-or-self

The only Boolean operator is and

Do not involve idref edges
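To make the shape of a tree pattern query concrete, here is a small sketch of a matcher. It is a naive recursive check, not the evaluation method of the paper. The tuple encodings of patterns and documents, and the names `descendants` and `matches`, are assumptions made for this example.

```python
def descendants(node):
    """Yield all proper descendants of a (label, children) node."""
    for child in node[1]:
        yield child
        yield from descendants(child)


def matches(pattern, node):
    """True if `node` satisfies `pattern`.

    A pattern is (label, branches); each branch is (axis, subpattern) with
    axis "child" or "descendant". All branches must hold: the implicit "and".
    """
    label, branches = pattern
    if label != node[0]:
        return False
    for axis, sub in branches:
        candidates = node[1] if axis == "child" else descendants(node)
        if not any(matches(sub, c) for c in candidates):
            return False
    return True


# Hypothetical document, tree edges only (idref edges play no role in TPQ).
book = ("book", [("title", []), ("author", [])])
paper = ("paper", [("section", [("title", [])])])

# book[title and author], paper[.//title], paper[title]
book_with_title_and_author = ("book", [("child", ("title", [])),
                                       ("child", ("author", []))])
paper_with_title_below = ("paper", [("descendant", ("title", []))])
paper_with_title_child = ("paper", [("child", ("title", []))])
```

Here `paper` satisfies the descendant pattern but not the child pattern, because its only title sits inside a section.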

Page 12: Covering  Indexes for XML Queries by Prakash Ramanan

Definitions and concepts

An index for an XML document is obtained by merging “equivalent” nodes into a single node.

“Equivalent” according to what? Coming soon…

Page 13: Covering  Indexes for XML Queries by Prakash Ramanan

Index of an XML doc.

Page 14: Covering  Indexes for XML Queries by Prakash Ramanan

Definitions cont’d

A query Q distinguishes between two nodes in an XML document D, if exactly one of the two nodes is in the result of evaluating query Q on D.

Page 15: Covering  Indexes for XML Queries by Prakash Ramanan

Definitions cont’d

An index DI is a covering index for a class C of queries if the following holds: no query in C can distinguish between two nodes of D that are in the same extent in DI (the extent of an index node is the set of nodes of D merged into it).

The important point about a covering index: a covering index DI can be used to evaluate the queries in C without using D.
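The two definitions above can be sketched as set operations. A class C of queries is infinite in general, so the check below only tests a finite sample of query results; it can refute the covering property but cannot prove it. The function names and the sample data are assumptions made for illustration.

```python
def distinguishes(result, n1, n2):
    """A query with result set `result` distinguishes n1 from n2
    if exactly one of the two nodes is in the result."""
    return (n1 in result) != (n2 in result)


def respects_extents(extents, results):
    """Return True if none of the given query results separates two
    nodes of the same extent (a necessary condition for covering)."""
    for extent in extents:
        nodes = sorted(extent)
        for result in results:
            # Comparing against one representative is enough: "not
            # distinguished" means "same membership", which is transitive.
            if any(distinguishes(result, nodes[0], n) for n in nodes[1:]):
                return False
    return True


# Hypothetical extents of an index and hypothetical query results.
extents = [{1, 6}, {3}, {2, 4, 7}]
results_ok = [{1, 6}, {2, 3, 4, 7}, set()]
results_bad = [{1}]          # separates node 1 from node 6
```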

Page 16: Covering  Indexes for XML Queries by Prakash Ramanan

Focus of the paper

The paper studies the evaluation of CXPath queries, and covering indexes for the subclasses of CXPath mentioned above.

Page 17: Covering  Indexes for XML Queries by Prakash Ramanan

Definitions cont’d

CXPath+ is complete in the following sense: for any node n in an XML document D, one can always construct a query Q in CXPath+, starting from the root, that distinguishes n from all other nodes.

The paper presents a method to build this query.

Page 18: Covering  Indexes for XML Queries by Prakash Ramanan

So far, we have:
  Described some classes of XML queries
  Given some definitions and concepts

Next, we describe the equivalence relations mentioned at the beginning:
  Define the simulation relation on the vertices of an ordinary graph
  Define the simulation and bisimulation relations on an XML document

Page 19: Covering  Indexes for XML Queries by Prakash Ramanan

Simulation and bisimulation

Page 20: Covering  Indexes for XML Queries by Prakash Ramanan

Question

Why do people deal with these simulation quotients?

Because, if the simulation quotient of an XML document is small, a set of queries can be evaluated faster on this index than on the larger XML document graph.

Page 21: Covering  Indexes for XML Queries by Prakash Ramanan

Simulation for Ordinary Graphs

Two directed graphs G1 = (V1, A1) and G2 = (V2, A2); each vertex v has a type t(v)

Simulation is a binary relation between the vertex sets V1 and V2 of two graphs. It provides a possible notion of dominance/equivalence between the vertices of the two graphs.

Page 22: Covering  Indexes for XML Queries by Prakash Ramanan

Forward simulation

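The Fsimulation of G1 by G2 is the largest binary relation R ⊆ V1 × V2 such that, for every pair (v1, v2) in R, it:
  Preserves vertex types: t(v1) = t(v2)
  Preserves outgoing arcs: for each v1’ ∈ post(v1), there exists v2’ ∈ post(v2) such that v1’ is Fsimulated by v2’

Here post(v) is the set of vertices reached by the arcs leaving v.

Two vertices are Fsimilar if each is Fsimulated by the other; Fsimilarity is an equivalence relation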
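The definition can be turned into a naive fixpoint computation: start from all type-compatible pairs and delete pairs that violate the outgoing-arc condition until nothing changes. This is a sketch for small graphs, not an efficient algorithm and not necessarily the one used in the paper. The function name and the sample graphs are assumptions.

```python
def forward_simulation(types1, arcs1, types2, arcs2):
    """Return the largest R with (v1, v2) in R meaning
    "v1 is Fsimulated by v2"."""
    post1 = {v: set() for v in types1}
    for u, v in arcs1:
        post1[u].add(v)
    post2 = {v: set() for v in types2}
    for u, v in arcs2:
        post2[u].add(v)

    # Start with every pair that preserves vertex types.
    rel = {(a, b) for a in types1 for b in types2 if types1[a] == types2[b]}

    changed = True
    while changed:
        changed = False
        for a, b in list(rel):
            # Every successor of a needs a matching successor of b.
            ok = all(any((a2, b2) in rel for b2 in post2[b])
                     for a2 in post1[a])
            if not ok:
                rel.discard((a, b))
                changed = True
    return rel


# Hypothetical graphs: G1 is x -> y; G2 is p -> q, p -> r, plus an isolated s.
types1 = {"x": "a", "y": "b"}
arcs1 = {("x", "y")}
types2 = {"p": "a", "q": "b", "r": "c", "s": "a"}
arcs2 = {("p", "q"), ("p", "r")}
sim = forward_simulation(types1, arcs1, types2, arcs2)
```

In this example x is Fsimulated by p but not by s: s has the right type but no outgoing arc to match the arc from x to y.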

Page 23: Covering  Indexes for XML Queries by Prakash Ramanan

Backward Simulation

Analogous to Fsimulation

Deals with the incoming arcs at a vertex, as opposed to forward simulation, which deals with outgoing arcs.

Page 24: Covering  Indexes for XML Queries by Prakash Ramanan

Forward and Backward Simulation

FBsimulation:
  Preserves vertex types
  Preserves outgoing arcs
  Preserves incoming arcs

Page 25: Covering  Indexes for XML Queries by Prakash Ramanan

Simulation for an XML Document

The Fsimulation of D is the largest binary relation on N (the node set of D) such that, whenever n1 is Fsimulated by n2, it:

Preserves node types:
  If n1 = root(D) then n2 = root(D)
  Else t(n2) = t(n1)

Preserves outgoing tree edges:
  For each tree edge (n1, n1’), there exists a tree edge (n2, n2’) such that n1’ is Fsimulated by n2’.

Preserves outgoing idref edges:
  For each idref edge (n1, n1’), there exists an idref edge (n2, n2’) such that n1’ is Fsimulated by n2’.

Page 26: Covering  Indexes for XML Queries by Prakash Ramanan

FBsimulation of D

Deals with both incoming and outgoing arcs:
  Preserves node types
  Preserves outgoing tree edges
  Preserves outgoing idref edges
  Preserves incoming tree edges
  Preserves incoming idref edges
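A naive sketch of the five conditions for a single document: begin with all pairs that preserve node types (with the root rule from the Fsimulation slide), then remove pairs that break any of the four edge conditions until the relation is stable. The function name, argument layout, and sample document are assumptions made for illustration, and the loop is written for clarity rather than speed.

```python
def fb_simulation(node_type, root, tree_edges, idref_edges):
    """Return the largest R on N with (n1, n2) in R meaning
    "n1 is FBsimulated by n2"."""
    def neighbours(edges):
        out = {n: set() for n in node_type}
        for u, v in edges:
            out[u].add(v)
        return out

    def reverse(edges):
        return {(v, u) for u, v in edges}

    # One neighbour map per condition: outgoing tree, outgoing idref,
    # incoming tree, incoming idref.
    maps = [neighbours(tree_edges), neighbours(idref_edges),
            neighbours(reverse(tree_edges)), neighbours(reverse(idref_edges))]

    # Node types: the root is matched only by the root; other nodes by type.
    rel = {(a, b) for a in node_type for b in node_type
           if (b == root if a == root else node_type[a] == node_type[b])}

    changed = True
    while changed:
        changed = False
        for a, b in list(rel):
            ok = all(any((a2, b2) in rel for b2 in nb[b])
                     for nb in maps for a2 in nb[a])
            if not ok:
                rel.discard((a, b))
                changed = True
    return rel


# Hypothetical document: book 1 has a title; book 3 has a title and an author.
node_type = {0: "root", 1: "book", 2: "title",
             3: "book", 4: "title", 5: "author"}
tree_edges = {(0, 1), (1, 2), (0, 3), (3, 4), (3, 5)}
fb = fb_simulation(node_type, 0, tree_edges, set())
```

Book 1 is FBsimulated by book 3 but not the other way round, since book 1 has no author child. The incoming-edge condition then also separates the titles: title 4 is not FBsimulated by title 2, because its parent is not FBsimulated by the parent of title 2.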

Page 27: Covering  Indexes for XML Queries by Prakash Ramanan

Bisimulation Relation

The forward bisimulation of D is the largest binary relation on N (the node set of D) such that, for every related pair (n1, n2), it:

Preserves node types:
  n1 = root(D) if and only if n2 = root(D)
  Otherwise t(n2) = t(n1)

Preserves outgoing tree edges:
  For each tree edge (n1, n1’), there exists a tree edge (n2, n2’) such that n1’ is forward bisimilar to n2’, and vice versa.

Preserves outgoing idref edges:
  For each idref edge (n1, n1’), there exists an idref edge (n2, n2’) such that n1’ is forward bisimilar to n2’, and vice versa.
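Because bisimulation is symmetric, its classes can be computed by partition refinement: group nodes by type, then repeatedly split groups whose members see different groups along their outgoing tree or idref edges. The sketch below is a simple, unoptimized version of that idea; the function names and the sample document are assumptions, not the paper's algorithm.

```python
def renumber(signature):
    """Replace each distinct signature by a small integer block id."""
    ids = {}
    return {n: ids.setdefault(s, len(ids)) for n, s in signature.items()}


def forward_bisimulation_classes(node_type, root, tree_edges, idref_edges):
    """Return the classes of the largest forward bisimulation on N."""
    tree_out = {n: set() for n in node_type}
    for u, v in tree_edges:
        tree_out[u].add(v)
    ref_out = {n: set() for n in node_type}
    for u, v in idref_edges:
        ref_out[u].add(v)

    # Initial partition: preserve node types, keep the root on its own.
    block = renumber({n: (n == root, node_type[n]) for n in node_type})
    while True:
        signature = {
            n: (block[n],
                frozenset(block[c] for c in tree_out[n]),
                frozenset(block[c] for c in ref_out[n]))
            for n in node_type
        }
        refined = renumber(signature)
        # Each round can only split blocks, so an equal count means stable.
        if len(set(refined.values())) == len(set(block.values())):
            break
        block = refined

    classes = {}
    for n, b in block.items():
        classes.setdefault(b, set()).add(n)
    return list(classes.values())


# Hypothetical document: books 1 and 6 have only a title; book 3 also
# has an author.
node_type = {0: "root", 1: "book", 2: "title", 3: "book",
             4: "title", 5: "author", 6: "book", 7: "title"}
tree_edges = {(0, 1), (1, 2), (0, 3), (3, 4), (3, 5), (0, 6), (6, 7)}
classes = forward_bisimulation_classes(node_type, 0, tree_edges, set())
```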

Page 28: Covering  Indexes for XML Queries by Prakash Ramanan

The Quotients

An equivalence relation on N partitions N into equivalence classes. Any two nodes in the same class are related, any two nodes in different classes are not.

The quotient graph D~ is obtained from D by merging the nodes of each equivalence class into a single node.
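Building the quotient is a direct relabelling once the equivalence classes are known. In this sketch each class becomes one index node, and every edge of D is mapped to an edge between the classes of its endpoints. The function name `quotient` and the sample classes and edges are assumptions made for illustration.

```python
def quotient(classes, tree_edges, idref_edges):
    """Merge each equivalence class into one node.

    Returns (class_of, q_tree, q_ref): the class index of every node of D
    and the tree and idref edges of the quotient graph. The class with
    index i is the extent of quotient node i.
    """
    class_of = {n: i for i, cls in enumerate(classes) for n in cls}
    q_tree = {(class_of[u], class_of[v]) for u, v in tree_edges}
    q_ref = {(class_of[u], class_of[v]) for u, v in idref_edges}
    return class_of, q_tree, q_ref


# Hypothetical equivalence classes on an 8-node document.
classes = [{0}, {1, 6}, {3}, {2, 4, 7}, {5}]
tree_edges = {(0, 1), (1, 2), (0, 3), (3, 4), (3, 5), (0, 6), (6, 7)}
class_of, q_tree, q_ref = quotient(classes, tree_edges, set())
```

The 8 nodes and 7 tree edges of the sample document collapse to 5 nodes and 5 edges, which is the kind of size reduction that makes the quotient useful as an index.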

Page 29: Covering  Indexes for XML Queries by Prakash Ramanan

Example

Page 30: Covering  Indexes for XML Queries by Prakash Ramanan

Results

A CXPath+ query Q can be evaluated on an XML document D by computing the simulation of Q by D.

For an XML document, its simulation quotient is the smallest covering index for BPQ+.

For an XML document, its simulation quotient, with idref edges ignored throughout, is the smallest covering index for TPQ.

Page 31: Covering  Indexes for XML Queries by Prakash Ramanan

Questions?