1
ICDT'2001, London, UK
Minimizing View Sets without Losing Query-Answering Power
Chen Li
Stanford University
Joint work with Mayank Bawa and Jeff Ullman
2
A web-caching scenario
[Figure: a Client holding a cache receives user queries; it sends source queries to the Server and gets answers back.]
3
Client
Source relation: Book(Title, Author, Pub, Price)
Cached query results:
Q1(T,A,Pr) :- book(T,A,Pub,Pr)
Q2(T,A,Pr) :- book(T,A,prenhall,Pr)
Q3(A1,A2) :- book(T,A1,prenhall,Pr1), book(T,A2,prenhall,Pr2)
4
What query results to remove?
Book(Title, Author, Pub, Price)
Cached query results:
Q1(T,A,Pr) :- book(T,A,Pub,Pr)
Q2(T,A,Pr) :- book(T,A,prenhall,Pr)
Q3(A1,A2) :- book(T,A1,prenhall,Pr1), book(T,A2,prenhall,Pr2)
• Q2 ⊆ Q1 (Q2 is contained in Q1).
• Remove Q2? Then we cannot answer the query: Q(T,Pr) :- book(T,smith,prenhall,Pr)
– Q1 does not keep Pub, so the Prentice Hall books cannot be selected from it.
5
How about removing Q3?
Book(Title, Author, Pub, Price)
Cached query results:
Q1(T,A,Pr) :- book(T,A,Pub,Pr)
Q2(T,A,Pr) :- book(T,A,prenhall,Pr)
Q3(A1,A2) :- book(T,A1,prenhall,Pr1), book(T,A2,prenhall,Pr2)
Compute Q3 using Q2:
Q3(A1,A2) :- Q2(T,A1,Pr1), Q2(T,A2,Pr2)
We are not losing any query-answering power!
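The rewriting of Q3 over Q2 on this slide can be checked on data. The sketch below uses a small made-up `book` table (the rows are assumptions for illustration only) and computes Q3 once from the source relation and once from the cached result of Q2 alone.

```python
# Toy source relation book(Title, Author, Pub, Price); rows are illustrative only.
book = {
    ("db systems", "smith", "prenhall", 60),
    ("db systems", "jones", "prenhall", 60),
    ("web caching", "lee", "acm", 45),
}

# Q2(T,A,Pr) :- book(T,A,prenhall,Pr)
q2 = {(t, a, pr) for (t, a, pub, pr) in book if pub == "prenhall"}

# Q3(A1,A2) :- book(T,A1,prenhall,Pr1), book(T,A2,prenhall,Pr2), from the source
q3_from_source = {
    (a1, a2)
    for (t1, a1, pub1, _) in book
    for (t2, a2, pub2, _) in book
    if t1 == t2 and pub1 == pub2 == "prenhall"
}

# Q3(A1,A2) :- Q2(T,A1,Pr1), Q2(T,A2,Pr2), using the cached Q2 only
q3_from_q2 = {(a1, a2) for (t1, a1, _) in q2 for (t2, a2, _) in q2 if t1 == t2}

print(q3_from_source == q3_from_q2)
```

Agreement on one table is only a sanity check; that the two definitions agree on every database follows from the rewriting itself.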
6
• Observations:
– Traditional query containment [Chandra and Merlin, 1977] does not help.
– We should consider query-answering power.
• General questions:
– How to describe “query-answering power”?
– How to minimize a view set without losing its query-answering power?
7
Rest of the talk
• Answering queries using views
• Query-answering power
– p-containment
– Relationship with traditional query containment
– Minimizing a view set
• p-containment relative to a set of queries
• Conclusion and open problems
8
Answering queries using views
• Conjunctive queries and views:
h(X) :- g1(X1),…,gn(Xn)
• Example:
V1(T,A,Pr) :- book(T,A,Pub,Pr)
V2(T,A,Pr) :- book(T,A,prenhall,Pr)
V3(A1,A2) :- book(T,A1,prenhall,Pr1), book(T,A2,prenhall,Pr2)
9
Query answerability
• A query Q is answerable by a view set V if we can rewrite Q using views in V [LMSS95].
• Example:
V2(T,A,Pr) :- book(T,A,prenhall,Pr)
V3(A1,A2) :- book(T,A1,prenhall,Pr1), book(T,A2,prenhall,Pr2)
V3 is answerable by V2:
V3(A1,A2) :- V2(T,A1,Pr1), V2(T,A2,Pr2)
10
Algorithms
• Bucket algorithm [LRO96]
• Inverse-rule algorithm [DG97,Qia96]
• MiniCon algorithm [PL00]
• SVB algorithm [Mit99]
• CoreCover algorithm [ALU00]
Testing whether a query is answerable by a set of views is NP-complete.
11
Views are expensive to maintain
• Require storage space.
• Need to be kept up-to-date.
We want to minimize a given view set while keeping its query-answering power.
12
p-containment
• A view set V is p-contained in another view set W if W can answer all the queries that are answerable by V.
– “p” stands for “power.”
– Denoted: V ⪯p W
• Two view sets V and W are equipotent if V ⪯p W and W ⪯p V.
– They have the same power to answer queries.
13
Example:
V1(T,A,Pr) :- book(T,A,Pub,Pr)
V2(T,A,Pr) :- book(T,A,prenhall,Pr)
V3(A1,A2) :- book(T,A1,prenhall,Pr1), book(T,A2,prenhall,Pr2)
{V1,V2,V3} ⪯p {V1,V2}
{V1,V2} ⪯p {V1,V2,V3}
Therefore: {V1,V2,V3} and {V1,V2} are equipotent.
14
• Lemma: V ⪯p W iff each view in V can be answered by W.
– Implies an algorithm for testing p-containment.
– Assuming view sets are finite.
• Theorem: Testing V ⪯p W is NP-complete.
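The lemma reduces p-containment to one answerability test per view. The sketch below is one brute-force way to run that test for conjunctive queries without built-in predicates; it is not necessarily the algorithm used in the paper. It assumes variables start with an uppercase letter, constants do not start with an underscore, variable names contain no underscore, and view heads list distinct variables. Answerability is decided with the canonical-rewriting check: freeze the query into a database, collect every view tuple over it, expand those tuples back into base atoms, and test whether the expansion is contained in the query. All function names are hypothetical.

```python
def is_var(term):
    # Assumed convention: variables start with an uppercase letter.
    return term[0].isupper()

def freeze(atoms):
    # Turn each variable X into the constant "_X" (a canonical database).
    return {(p, tuple("_" + t if is_var(t) else t for t in args)) for p, args in atoms}

def evaluate(head, body, facts):
    # Evaluate a conjunctive query over a set of facts by backtracking.
    results = set()

    def extend(i, binding):
        if i == len(body):
            results.add(tuple(binding.get(t, t) for t in head))
            return
        pred, args = body[i]
        for fpred, fargs in facts:
            if fpred != pred or len(fargs) != len(args):
                continue
            new = dict(binding)
            if all((new.setdefault(a, f) if is_var(a) else a) == f
                   for a, f in zip(args, fargs)):
                extend(i + 1, new)

    extend(0, {})
    return results

def answerable(query, views):
    # True if the query has an equivalent rewriting that uses only the views.
    (_, qhead), qbody = query
    canonical_db = freeze(qbody)
    expansion, covered, n = [], set(), 0
    for (_, vhead), vbody in views:
        for tup in evaluate(vhead, vbody, canonical_db):
            args = [t[1:] if t.startswith("_") else t for t in tup]  # unfreeze
            covered.update(a for a in args if is_var(a))
            sub = dict(zip(vhead, args))
            for p, bargs in vbody:
                # Non-head view variables get a fresh name per view tuple.
                expansion.append((p, tuple(
                    sub.get(t, f"{t}_{n}") if is_var(t) else t for t in bargs)))
            n += 1
    if any(is_var(t) and t not in covered for t in qhead):
        return False  # some head variable is not exposed by any view tuple
    frozen_head = tuple("_" + t if is_var(t) else t for t in qhead)
    return frozen_head in evaluate(qhead, qbody, freeze(expansion))

def p_contained(v_set, w_set):
    # Lemma: V is p-contained in W iff W answers every view in V.
    return all(answerable(v, w_set) for v in v_set)

V1 = (("V1", ("T", "A", "Pr")), [("book", ("T", "A", "Pub", "Pr"))])
V2 = (("V2", ("T", "A", "Pr")), [("book", ("T", "A", "prenhall", "Pr"))])
V3 = (("V3", ("A1", "A2")), [("book", ("T", "A1", "prenhall", "Pr1")),
                              ("book", ("T", "A2", "prenhall", "Pr2"))])

print(p_contained([V1, V2, V3], [V1, V2]))
print(p_contained([V2], [V1]))
```

The first call checks the equipotence example from the talk; the second checks whether V1 alone can stand in for V2. The search is exponential in the worst case, consistent with the NP-completeness result.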
15
p-containment and query containment
V1(T,A,Pr) :- book(T,A,Pub,Pr)
V2(T,A,Pr) :- book(T,A,prenhall,Pr)
V3(A1,A2) :- book(T,A1,prenhall,Pr1), book(T,A2,prenhall,Pr2)
• Query containment does not imply p-containment: {V1} and {V2}
– V2 is contained in V1, but {V1} cannot answer V2.
• p-containment does not imply query containment: {V2} and {V3}
– {V2} can answer V3, but V3 is not contained in V2.
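Traditional containment of conjunctive queries can be tested by searching for a containment mapping [Chandra and Merlin, 1977]. The sketch below is a brute-force version of that search, assuming variables start with an uppercase letter; `contained_in` is a hypothetical name. It lets the containment relationships among V1, V2 and V3 be checked directly.

```python
from itertools import product

def is_var(term):
    # Assumed convention: variables start with an uppercase letter.
    return term[0].isupper()

def contained_in(q1, q2):
    """True if q1 is contained in q2, i.e. some mapping of q2's variables to
    q1's terms sends q2's head to q1's head and q2's body into q1's body."""
    (_, head1), body1 = q1
    (_, head2), body2 = q2
    if len(head1) != len(head2):
        return False
    vars2 = sorted({t for _, args in body2 for t in args if is_var(t)}
                   | {t for t in head2 if is_var(t)})
    terms1 = sorted({t for _, args in body1 for t in args} | set(head1))
    atoms1 = set(body1)
    for image in product(terms1, repeat=len(vars2)):
        h = dict(zip(vars2, image))

        def apply(args):
            # Constants of q2 are left unchanged; variables follow the mapping.
            return tuple(h.get(t, t) for t in args)

        if apply(head2) == tuple(head1) and all(
                (p, apply(args)) in atoms1 for p, args in body2):
            return True
    return False

V1 = (("V1", ("T", "A", "Pr")), [("book", ("T", "A", "Pub", "Pr"))])
V2 = (("V2", ("T", "A", "Pr")), [("book", ("T", "A", "prenhall", "Pr"))])
V3 = (("V3", ("A1", "A2")), [("book", ("T", "A1", "prenhall", "Pr1")),
                              ("book", ("T", "A2", "prenhall", "Pr2"))])

print(contained_in(V2, V1), contained_in(V1, V2), contained_in(V3, V2))
```

The mapping search only compares query results; it says nothing about whether one view can be computed from another, which is the gap p-containment addresses.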
16
Minimizing a view set
• Keep removing views from the view set while retaining the equipotence.
• Might have multiple equipotent minimals:
V1(A) :- r(A,B)
V2(B) :- r(A,B)
V3(A,B) :- r(A,X), r(Y,B)
{V1,V2,V3} has two equipotent minimals: {V1,V2}, {V3}
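The removal loop on this slide can be sketched as a single greedy pass: a view is dropped when the views still kept can answer it. In the sketch below the answerability test is passed in as a function; `toy_answerable` is a hypothetical stand-in that hard-codes the answerability facts for the three views over r, not a real rewriting algorithm.

```python
def minimize(views, answerable):
    """Drop every view that the remaining views can still answer."""
    kept = list(views)
    for v in views:
        rest = [w for w in kept if w != v]
        if answerable(v, rest):
            kept = rest
    return kept

def toy_answerable(view, others):
    # Hard-coded facts for V1(A) :- r(A,B), V2(B) :- r(A,B),
    # V3(A,B) :- r(A,X), r(Y,B); an assumption for illustration only.
    available = set(others)
    if view in available:
        return True
    if view in ("V1", "V2"):
        return "V3" in available          # project V3 onto one column
    return {"V1", "V2"} <= available      # V3 is the product of V1 and V2

print(minimize(["V1", "V2", "V3"], toy_answerable))
print(minimize(["V3", "V1", "V2"], toy_answerable))
```

The two calls visit the views in different orders and end at different minimal sets, which is why the result of minimization is not unique. One pass is enough because a view that cannot be answered by a larger set cannot be answered by any subset of it.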
17
p-containment relative to queries
Queries: Q={Q1,Q2,…}
V = {V1,V2,…,Vm} W = {W1,W2,…,Wn}
V is p-contained in W w.r.t. Q if the queries in Q that are answerable by V are also answerable by W.
18
Example of relative p-containment
Relations: car(Make,Dealer), loc(Dealer,City)
Queries:
Q1(D,C) :- car(toyota,D), loc(D,C)
Q2(D,C) :- car(honda,D), loc(D,C)
Views:
V = {V1,V2}, V1 = Q1, V2 = Q2
W = {W1}, W1(M,D,C) :- car(M,D), loc(D,C)
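In this example each query in Q can be computed from W1 alone by selecting on the make, for instance Q1(D,C) :- W1(toyota,D,C). The sketch below checks this on small made-up `car` and `loc` tables (the rows are assumptions for illustration only).

```python
# Toy relations car(Make, Dealer) and loc(Dealer, City); rows are illustrative only.
car = {("toyota", "d1"), ("honda", "d2"), ("ford", "d3")}
loc = {("d1", "sf"), ("d2", "pa"), ("d3", "sf")}

# W1(M,D,C) :- car(M,D), loc(D,C)
w1 = {(m, d, c) for (m, d) in car for (d2, c) in loc if d == d2}

def query_from_source(make):
    # Q(D,C) :- car(make,D), loc(D,C), evaluated on the base relations
    return {(d, c) for (m, d) in car for (d2, c) in loc if m == make and d == d2}

def query_from_w1(make):
    # Q(D,C) :- W1(make,D,C), evaluated on the view only
    return {(d, c) for (m, d, c) in w1 if m == make}

print(query_from_source("toyota") == query_from_w1("toyota"))
print(query_from_source("honda") == query_from_w1("honda"))
```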
19
Testing relative p-containment
• Q is finite: test by the definition.
• Q is infinite?
20
Parameterized queries
• Motivation: web search forms.
• A PQ is a conjunctive query with placeholders.
• Example: q(D) :- car($M,D),loc(D,$C)
– Placeholders $M, $C are replaced by constants.
– Instances:
q(D) :- car(toyota,D),loc(D,sf)
q(D) :- car(honda,D),loc(D,pa)
– The domain of each placeholder is infinite.
– Thus, a PQ represents an infinite number of queries.
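Instantiating a parameterized query is a plain substitution of constants for placeholders. A minimal sketch, assuming a query is stored as a head atom plus a list of body atoms and that placeholders are strings starting with `$`; the representation and the name `instantiate` are hypothetical.

```python
def instantiate(pq, values):
    """Replace each placeholder in the body by the constant given in values."""
    head, body = pq
    return head, [(pred, tuple(values.get(t, t) for t in args)) for pred, args in body]

# q(D) :- car($M,D), loc(D,$C)
PQ = (("q", ("D",)), [("car", ("$M", "D")), ("loc", ("D", "$C"))])

print(instantiate(PQ, {"$M": "toyota", "$C": "sf"}))
print(instantiate(PQ, {"$M": "honda", "$C": "pa"}))
```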
21
Q: q(D) :- car($M,D),loc(D,$C)
• v1(M,D,C) :- car(M,D),loc(D,C)
– Answers all instances of Q.
• v2(M,D) :- car(M,D),loc(D,sf)
– Answers some instances of Q.
– Answerable instances of Q are instances of: q(D) :- car($M,D),loc(D,sf)
• v3(M) :- car(M,D),loc(D,sf)
– Answers no instances of Q.
22
• Assume queries are generated by one PQ.
• Results are easily extendable to the case with a finite set of PQs.
• Complete answerability of a PQ using views
– V can answer all instances of a PQ Q.
– Example:
q(D) :- car($M,D),loc(D,$C)
v1(M,D,C) :- car(M,D),loc(D,C)
23
An algorithm for testing complete answerability
• Replace each placeholder with a new distinct constant to get a canonical instance I.
• Test if I is answerable by V.
Example:
PQ: q(D) :- car($M,D),loc(D,$C)
View: v1(M,D,C) :- car(M,D),loc(D,C)
Canonical instance: q(D) :- car(m0,D),loc(D,c0)
Rewriting: q(D) :- v1(m0,D,c0)
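The first step of this algorithm, building the canonical instance, can be sketched as follows. The sketch assumes placeholders are strings starting with `$` and names each fresh constant after its placeholder (`$M` becomes `m0`), skipping names already used in the query; both the representation and the naming scheme are assumptions. The second step, testing whether the instance is answerable by V, is left to whichever rewriting algorithm is in use.

```python
def canonical_instance(pq):
    """Replace every placeholder by a constant that appears nowhere else in the query."""
    head, body = pq
    used = {t for _, args in body for t in args}
    fresh = {}
    for _, args in body:
        for t in args:
            if t.startswith("$") and t not in fresh:
                i = 0
                while f"{t[1:].lower()}{i}" in used:
                    i += 1  # skip names that would collide with existing terms
                fresh[t] = f"{t[1:].lower()}{i}"
                used.add(fresh[t])
    return head, [(pred, tuple(fresh.get(t, t) for t in args)) for pred, args in body]

# q(D) :- car($M,D), loc(D,$C)
PQ = (("q", ("D",)), [("car", ("$M", "D")), ("loc", ("D", "$C"))])

print(canonical_instance(PQ))
```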
24
Partial answerability
• Some instances of Q are answerable by V:
q(D) :- car($M,D),loc(D,$C)
v2(M,D) :- car(M,D),loc(D,sf)
• Theorem: All the answerable instances of a PQ using V are instances of a finite set of PQs, s.t. each of them is completely answerable by V.
– Here: q(D) :- car($M,D),loc(D,sf)
25
[Figure: all instances of a parameterized query Q; the instances answerable by V={V1,…,Vn} are covered by PQ1, PQ2, …, PQk.]
An algorithm for finding the finite set of PQs.
26
Testing p-containment w.r.t. PQ
To test whether V is p-contained in W w.r.t. a PQ Q:
• Find the PQs whose instances are all the instances of Q that are answerable by V.
• For each of the PQs, test if it is completely answerable by W.
• Details are in the paper.
27
Conclusion
• Introduced p-containment, which is different from query containment.
• Showed how to minimize a view set without losing query-answering power.
• Developed an algorithm for testing relative p-containment w.r.t. instances of PQs.
• Extended to MCR-containment.
28
Open problems
• Find a view subset with lowest “cost.”
• If views are not given, find the best views to materialize.