Data Exchange: Computing Cores in Polynomial Time
![Page 1: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/1.jpg)
Data Exchange: Computing Cores in Polynomial Time
Georg Gottlob, Oxford University
Joint work with Alan Nash, UCSD
![Page 2: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/2.jpg)
This talk is based on two recent papers:
G....: Computing Cores for Data Exchange: New Algorithms and Practical Solutions. PODS 2005.
G.... & Nash: Data Exchange: Computing Cores in Polynomial Time. Submitted to PODS 2006.
Detailed joint extended version of both papers:
G.... & Nash: Efficient Core Computation in Data Exchange. Available from the authors (Draft).
![Page 3: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/3.jpg)
Talk Structure
Introduction & basics
Computing Cores
• for weakly acyclic TGDs as target dependencies
• for EGDs and weakly acyclic TGDs as target dependencies
Further results (time permitting)
![Page 4: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/4.jpg)
Cores
Instance:
{ p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
Logical meaning:
∃ X, Y, U, V: p(X,Y) & p(X,b) & p(a,b) & p(U,c) & p(U,V) & q(a,c,d)
![Page 5: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/5.jpg)
Cores
I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
{ p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
endomorphism h: {Y → b}
![Page 6: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/6.jpg)
Cores
I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
{ p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
endomorphism h: {Y → b}
REDUNDANT!
∃X,Y p(X,Y) & p(X,b) ⇔ ∃X p(X,b)
![Page 7: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/7.jpg)
Cores
I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
{ p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
endomorphism h: {Y → b}
REDUNDANT!
∃X p(X,b) ⇒ ∃X,Y p(X,Y)
![Page 8: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/8.jpg)
Cores
I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
endomorphism h: {Y → b}
h(I) = { p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
I ⇔ h(I)
![Page 9: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/9.jpg)
Cores
I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
endomorphism h: {Y → b}
h(I) = { p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
I ⇔ h(I)
h(I) can be further reduced by endomorphism g: {X → a, V → c}
![Page 10: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/10.jpg)
Cores
I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
endomorphism h: {Y → b}
h(I) = { p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
I ⇔ h(I)
h(I) can be further reduced by endomorphism g: {X → a, V → c}
g(h(I)) = { p(a,b), p(U,c), q(a,c,d) }
![Page 11: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/11.jpg)
Cores
I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
h(I) = { p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
endomorphism f: {X → a, Y → b, V → c}
f(I) = g(h(I)) = g○h(I) = { p(a,b), p(U,c), q(a,c,d) }
I ⇔ h(I) ⇔ f(I)
![Page 12: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/12.jpg)
Cores
I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
h(I) = { p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
endomorphism f: {X → a, Y → b, V → c}
f(I) = g(h(I)) = g○h(I) = { p(a,b), p(U,c), q(a,c,d) }
I ⇔ h(I) ⇔ f(I)
No refinement by endomorphisms possible!
![Page 13: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/13.jpg)
Cores
I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
h(I) = { p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
endomorphism f: {X → a, Y → b, V → c}
f(I) = g(h(I)) = g○h(I) = { p(a,b), p(U,c), q(a,c,d) }
I ⇔ h(I) ⇔ f(I)
Core(I) = f(I): unique up to variable renaming!
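The reduction of I to its core by successive endomorphisms can be replayed with a small brute-force sketch. This is only an illustration under stated assumptions, not the algorithm of the talk: atoms are encoded as tuples, upper-case terms are treated as variables and lower-case terms as constants (the convention the slides appear to use), and the names `is_var`, `apply_map` and `core` are hypothetical. The search tries every mapping of variables into the domain, so it is exponential in the number of variables.

```python
from itertools import product

def is_var(term):
    # Assumed convention: variables are upper-case, constants lower-case.
    return term[0].isupper()

def apply_map(h, atoms):
    """Apply the substitution h to every argument of every atom."""
    return frozenset((a[0],) + tuple(h.get(t, t) for t in a[1:]) for a in atoms)

def core(atoms):
    """Shrink the instance by endomorphisms until none yields a proper subinstance."""
    atoms = frozenset(atoms)
    while True:
        variables = sorted({t for a in atoms for t in a[1:] if is_var(t)})
        domain = sorted({t for a in atoms for t in a[1:]})
        for image in product(domain, repeat=len(variables)):
            h = dict(zip(variables, image))
            reduced = apply_map(h, atoms)
            if reduced < atoms:   # h maps the instance onto a proper subinstance
                atoms = reduced
                break
        else:                     # no shrinking endomorphism left: this is a core
            return atoms

I = {("p", "X", "Y"), ("p", "X", "b"), ("p", "a", "b"),
     ("p", "U", "c"), ("p", "U", "V"), ("q", "a", "c", "d")}
print(sorted(core(I)))
```

On the instance I of the slides, the result consists of the three atoms p(a,b), p(U,c) and q(a,c,d), whichever shrinking endomorphisms the search happens to find first.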
![Page 14: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/14.jpg)
Blocks
I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
Blocks: Connected components in the variable-graph
Atom-Blocks: corresponding sets of atoms
![Page 15: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/15.jpg)
Blocks
I = { p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
Blocks: Connected components in the variable-graph
variable-graph: edges X-Y and U-V
{X,Y} {U,V} blocksize(I)=2
blocksize(I) = size of largest block of I
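The block computation can be sketched as a connected-components pass over the variables that co-occur in an atom. The encoding (atoms as tuples, upper-case terms as variables) and the names `blocks` and `blocksize` are assumptions made for this illustration.

```python
def is_var(term):
    # Assumed convention: variables are upper-case, constants lower-case.
    return term[0].isupper()

def blocks(atoms):
    """Connected components of the variable-graph, as a list of sets of variables."""
    comps = []
    for atom in atoms:
        merged = {t for t in atom[1:] if is_var(t)}
        if not merged:
            continue              # ground atoms contribute no variables
        rest = []
        for comp in comps:
            if comp & merged:     # shares a variable with this atom: same block
                merged |= comp
            else:
                rest.append(comp)
        comps = rest + [merged]
    return comps

def blocksize(atoms):
    """Size of the largest block (0 for a ground instance)."""
    return max((len(b) for b in blocks(atoms)), default=0)

I = [("p", "X", "Y"), ("p", "X", "b"), ("p", "a", "b"),
     ("p", "U", "c"), ("p", "U", "V"), ("q", "a", "c", "d")]
print(blocks(I), blocksize(I))
```

The atom-blocks can be read off the same way, by grouping the atoms according to the block their variables fall into.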
![Page 16: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/16.jpg)
[Fagin, Kolaitis, Popa PODS’03]:
- Computing core(I) is NP-hard in general.
- It is tractable for bounded blocksize b:
  core(I) can be computed in time n * O(|dom(I)|^(b+2)) = O(n^(b+3))
[G. PODS’05]:
- Computing core(I) is tractable for bounded treewidth or hypertree-width of the variable-graph
  => new bound: O(n^(b/2+3)), based on hypertree decompositions (end of talk, time permitting)
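The step from the first to the second form of the Fagin, Kolaitis, Popa bound can be spelled out as follows. This assumes, which the slide does not state, that n is the number of atoms of I and that the maximum arity k of the relations is a fixed constant, so that |dom(I)| ≤ k·n.

```latex
n \cdot O\!\left(|\mathrm{dom}(I)|^{\,b+2}\right)
  \;\subseteq\; n \cdot O\!\left((k\,n)^{b+2}\right)
  \;=\; O\!\left(k^{\,b+2}\, n^{\,b+3}\right)
  \;=\; O\!\left(n^{\,b+3}\right)
```

The last equality holds because k^(b+2) is a constant once k and the blocksize b are fixed.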
![Page 17: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/17.jpg)
Dependencies
Tuple generating dependencies (TGDs):
∀X ∀Y ∀Z p(X,Y) & q(Y,Z) → ∃U ∃V r(X,U) & p(Z,V)
We usually omit universal quantifiers…
TGDs can be cyclic, in which case the chase may not terminate.
Cyclic TGD:
p(X,Y) & q(Y,Z) → ∃U,V r(X,U) & p(Z,V)
Equality generating dependencies (EGDs):
∀X ∀Y ∀Z p(X,Y) & p(X,Z) → Y=Z
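To make the two kinds of dependencies concrete, here is a minimal sketch of single chase steps for the TGD and the EGD above. It rests on several assumptions that are not part of the slides: facts are tuples, upper-case terms in an instance are labeled nulls, dependencies contain only variables, fresh nulls are named `N1`, `N2`, …, and all function names are hypothetical. `chase_tgd` makes one pass only, so it is not a full chase.

```python
from itertools import count

_fresh = count(1)   # source of fresh labeled nulls N1, N2, ... (assumed naming)

def is_null(term):
    # Assumed convention: nulls are upper-case, constants lower-case.
    return term[0].isupper()

def matches(body, instance, h=None):
    """Yield every assignment of the body variables mapping all body atoms into instance."""
    h = {} if h is None else h
    if not body:
        yield dict(h)
        return
    first, rest = body[0], body[1:]
    for fact in instance:
        if fact[0] != first[0] or len(fact) != len(first):
            continue
        g = dict(h)
        if all(g.setdefault(v, t) == t for v, t in zip(first[1:], fact[1:])):
            yield from matches(rest, instance, g)

def chase_tgd(body, head, instance):
    """One pass: for each body match whose head is not yet satisfied, add the head
    atoms with fresh nulls for the existential variables."""
    instance = set(instance)
    for h in list(matches(body, instance)):
        if next(matches(head, instance, h), None) is None:
            g = dict(h)
            for atom in head:
                for v in atom[1:]:
                    if v not in g:
                        g[v] = f"N{next(_fresh)}"
            instance |= {(a[0],) + tuple(g[v] for v in a[1:]) for a in head}
    return instance

def chase_egd(body, left, right, instance):
    """Enforce left = right for every body match by replacing a null; fail on two constants."""
    instance = set(instance)
    while True:
        for h in matches(body, instance):
            old, new = h[left], h[right]
            if old == new:
                continue
            if not is_null(old):
                old, new = new, old
            if not is_null(old):
                raise ValueError("chase fails: two distinct constants equated")
            instance = {(f[0],) + tuple(new if x == old else x for x in f[1:])
                        for f in instance}
            break
        else:
            return instance

tgd_body = [("p", "X", "Y"), ("q", "Y", "Z")]
tgd_head = [("r", "X", "U"), ("p", "Z", "V")]
J = chase_tgd(tgd_body, tgd_head, {("p", "a", "b"), ("q", "b", "c")})

egd_body = [("p", "X", "Y"), ("p", "X", "Z")]
K = chase_egd(egd_body, "Y", "Z", {("p", "a", "N0"), ("p", "a", "b")})
print(sorted(J), sorted(K))
```

In the TGD example the single body match X=a, Y=b, Z=c adds r(a,·) and p(c,·) with two fresh nulls; in the EGD example the null N0 is identified with the constant b.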
![Page 18: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/18.jpg)
We restrict ourselves to the setting of weakly acyclic sets of TGDs + arbitrary EGDs
( [Fagin, Kolaitis, Popa 03], [Deutsch,Tannen 03] )
This covers the overwhelming part of relevant constraints:
• Functional dependencies
• w.a. inclusion dependencies
• referential integrity
• foreign key constraints…
• …
![Page 19: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/19.jpg)
Data Exchange
[Diagram] Source schema S, target schema T
Σst: source-target TGDs; Σt: target TGDs and EGDs
I → (apply Σst) → J → (apply Σt) → J´ → (compute core) → Core(J´)
![Page 20: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/20.jpg)
Data Exchange
[Diagram] Source schema S, target schema T
Σst: source-target TGDs; Σt: target TGDs and EGDs
I → (apply Σst) → J → (apply Σt) → J´ → (compute core) → Core(J´)
Open Problem [F.K.P.03]: Can Core(J´) be computed in polytime if Σt consists of w.a. TGDs and EGDs?
![Page 21: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/21.jpg)
Data Exchange
[Diagram] Source schema S, target schema T
Σst: source-target TGDs; Σt: target TGDs and EGDs
I → (apply Σst) → J → (apply Σt) → J´ → (compute core) → Core(J´)
Nice: bounded blockwidth, treewidth, hypertree-width
![Page 22: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/22.jpg)
Data Exchange
[Diagram] Source schema S, target schema T
Σst: source-target TGDs; Σt: target TGDs and EGDs
I → (apply Σst) → J → (apply Σt) → J´ → (compute core) → Core(J´)
Nice: bounded blockwidth, treewidth, hypertree-width
The problem: unbounded blockwidth, treewidth, hypertree-width
Lumps blocks together!
![Page 23: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/23.jpg)
TGDs (even full TGDs) destroy blockwidth
{X,Y} {U,V} blocksize(I)=2
{ p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
![Page 24: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/24.jpg)
TGDs (even full TGDs) destroy blockwidth
{X,Y} {U,V} blocksize=2
{ p(X,Y), p(X,b), p(a,b), p(U,c), p(U,V), q(a,c,d) }
TGD: p(R,S) & p(R’,S’) → p(R,R’)
derives, e.g., p(X,U)
{X,Y,U,V} blocksize=4
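The effect of this full TGD on the blocks can be checked mechanically. The sketch below hard-codes the rule p(R,S) & p(R’,S’) → p(R,R’) rather than implementing a general chase, and reuses the assumed encoding (tuples, upper-case terms as variables); `chase_full_tgd`, `blocks` and `blocksize` are hypothetical names.

```python
def is_var(term):
    # Assumed convention: variables are upper-case, constants lower-case.
    return term[0].isupper()

def blocks(atoms):
    """Connected components of the variable-graph, as a list of sets of variables."""
    comps = []
    for atom in atoms:
        merged = {t for t in atom[1:] if is_var(t)}
        if not merged:
            continue
        rest = []
        for comp in comps:
            if comp & merged:
                merged |= comp
            else:
                rest.append(comp)
        comps = rest + [merged]
    return comps

def blocksize(atoms):
    return max((len(b) for b in blocks(atoms)), default=0)

def chase_full_tgd(atoms):
    """Apply p(R,S) & p(R',S') -> p(R,R') until no new atom is derived."""
    atoms = set(atoms)
    while True:
        firsts = {a[1] for a in atoms if a[0] == "p"}
        new = {("p", r, r2) for r in firsts for r2 in firsts} - atoms
        if not new:
            return atoms
        atoms |= new

I = {("p", "X", "Y"), ("p", "X", "b"), ("p", "a", "b"),
     ("p", "U", "c"), ("p", "U", "V"), ("q", "a", "c", "d")}
chased = chase_full_tgd(I)
print(blocksize(I), blocksize(chased))
```

The derived atom p(X,U) links the two original blocks, so all four variables end up in a single block even though the TGD introduces no new variables.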
![Page 25: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/25.jpg)
Efficient Core Computation
• Fagin, Kolaitis, and Popa [PODS 2003]
  - Target dependencies are empty or contain only EGDs (blocks method and rigidity)
• G….. [PODS 2005]
  - Target dependencies without existential quantification (= full)
  - Target dependencies with a single atom in the premise
  (they preserve hypertree-width)
• G…., Nash [PODS 2006]
  - General target dependencies for which the chase is known to terminate (weakly acyclic or new conditions)
• In summary: Whenever we know we can compute universal solutions in PTIME, we know we can compute their cores in PTIME
![Page 29: Data Exchange: Computing Cores in Polynomial Timelenzerin/INFINT2007/material/Pichler1.pdf · Data Exchange: Computing Cores in Polynomial Time Georg Gottlob Oxford University Joint](https://reader030.fdocuments.in/reader030/viewer/2022040804/5e4181a7c1682537dc3e84ff/html5/thumbnails/29.jpg)
Data Exchange
[Diagram] Source schema S, target schema T
Σst: source-target TGDs; Σt: target TGDs and EGDs
I → (apply Σst) → J → (apply Σt) → J´ → (compute core) → Core(J´)
Theorem: Core(J´) can be computed in polynomial time.