Provenance for Natural Language Queries
Daniel Deutch Nave Frost Amir Gilad
Tel Aviv University
August 2017
Presented by Amir Gilad
Daniel Deutch, Nave Frost, Amir Gilad Provenance for Natural Language Queries August 2017 1 / 23
Outline
1 Introduction
2 Mappings and Answer Tree - Single Assignment
3 Factorization
4 Summarization
5 Experiments
6 Related Work and Conclusions
Motivation
NL Query: Return the organization of authors who published papers in database conferences after 2005
Formal Query:
query(oname) :- org(oid, oname), conf(cid, cname),
pub(wid, cid, ptitle, pyear), author(aid, aname, oid),
domainConf(cid, did), domain(did, dname),
writes(aid, wid), dname = 'Databases', pyear > 2005
Result: Tel Aviv University (TAU) (why?)
What We Have - Provenance:
(oname,TAU)·(aname,Tova M.)·(ptitle,OASSIS...)·(cname,SIGMOD)·(pyear,14’)
+ (oname,TAU)·(aname,Tova M.)·(ptitle,Querying...)·(cname,VLDB)·(pyear,06’)
+ (oname,TAU)·(aname,Tova M.)·(ptitle,Monitoring...)·(cname,VLDB)·(pyear,07’)
+ (oname,TAU)·(aname,Slava N.)·(ptitle,OASSIS...)·(cname,SIGMOD)·(pyear,14’)
+ (oname,TAU)·(aname,Tova M.)·(ptitle,A sample...)·(cname,SIGMOD)·(pyear,14’)
+ ...
What We Want - Explanations:
TAU is the organization of 43 authors who published 170 papers in 31 conferences in 2006 - 2015
Solution Overview
Solution
Use the input question to formulate a detailed NL answer by replacing words with values
  - This is a general idea: showing provenance in a way that corresponds to the standard user interaction
When a long answer is needed, compact it using algebraic factorization and summarization
  - Here, again, we leverage the structure of the user question
Current Limitations
Only conjunctive queries are supported
Some aspects of the solution are limited to a specific NLIDB
  - But the general idea is not
Framework
[Figure: system architecture. Components: (Augmented) NaLIR Parser, Query Builder, SelP over the DB, Factorization, Sentence Generation, Summarization. Data passed between them: NL Query, Dep. Tree, Query + Mapping, Results + Provenance + Mapping, Fact. + Mapping, Sentence, Summarized Sentence]
Augment NaLIR [Fei Li, Jagadish, 15’]
Use a provenance-aware engine - SelP [Deutch et al., 15’]
Store the provenance and mappings
Translate results and provenance to NL using factorization and summarization
Mappings
[Figure: dependency tree of the NL query; each word that is mapped to a query variable is annotated with its value under one assignment]
Return
the
organization (POS=NN, REL=dobj): (oname, TAU)
of (POS=IN, REL=prep)
authors (POS=NNS, REL=pobj): (aname, Tova M.)
who
published (POS=VBD, REL=rcmod)
papers: (ptitle, 'OASSIS...')
in
database (POS=NN, REL=nn)
conferences (POS=NNS, REL=pobj): (cname, SIGMOD)
after (POS=IN, REL=prep)
2005 (POS=CD, REL=pobj): (pyear, 2014)
NL Query: Return the organization of authors who published papers in database conferences after 2005
Formal Query: query(oname) :- org(oid, oname), conf(cid, cname), pub(wid, cid, ptitle, pyear), author(aid, aname, oid), domainConf(cid, did), domain(did, dname), writes(aid, wid), dname = 'Databases', pyear > 2005
From Mappings to an Answer
Mappings: (oname, TAU), (aname, Tova M.), (ptitle, 'OASSIS...'), (cname, SIGMOD), (pyear, 2014)
[Figure: the dependency tree of the question next to the answer tree obtained by replacing mapped words with their values]
Answer tree nodes: TAU (is the), organization, of, Tova M., who, published, 'OASSIS...', in, SIGMOD, in, 2014
Answer: TAU is the organization of Tova M. who published 'OASSIS...' in SIGMOD in 2014
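The word-for-value replacement on this slide can be sketched on a flat token list. This is a simplification: the slides operate on the dependency tree, whereas the sketch below uses plain token order. The names `MAPPING`, `DROPPED`, `REWRITTEN` and `answer_sentence`, and the three hand-written rewrite rules, are assumptions made for illustration and are not part of NaLIR or of the presented system.

```python
# Illustrative only: all names and rewrite rules here are assumptions for this sketch.
QUESTION = ("Return the organization of authors who published papers "
            "in database conferences after 2005")

# Word in the question -> (query variable, value under one assignment), as on the slide.
MAPPING = {
    "organization": ("oname", "TAU"),
    "authors": ("aname", "Tova M."),
    "papers": ("ptitle", "'OASSIS...'"),
    "conferences": ("cname", "SIGMOD"),
    "2005": ("pyear", "2014"),
}
RESULT_WORD = "organization"   # word mapped to the head variable of the query
DROPPED = {"database"}         # modifier that the concrete conference value makes redundant
REWRITTEN = {"after": "in"}    # the comparison becomes "in" once the year is concrete


def answer_sentence(question, mapping, result_word, dropped, rewritten):
    """Build a one-assignment answer by replacing mapped words with values."""
    out = []
    for tok in question.split():
        if tok == "Return":
            # The imperative verb is replaced by "<result value> is".
            out.append(f"{mapping[result_word][1]} is")
        elif tok in dropped:
            continue
        elif tok == result_word:
            out.append(tok)              # the result word itself is kept
        elif tok in mapping:
            out.append(mapping[tok][1])  # other mapped words become their values
        else:
            out.append(rewritten.get(tok, tok))
    return " ".join(out)


ANSWER = answer_sentence(QUESTION, MAPPING, RESULT_WORD, DROPPED, REWRITTEN)
print(ANSWER)
```

With the mapping above, the sketch reproduces the answer sentence shown on the slide.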
Provenance Factorization
Idea
Use algebraic factorization of the provenance to take out common values that appear in multiple assignments
Provenance:
[TAU]·[Tova M.]·[OASSIS...]·[SIGMOD]·[2014]
+ [TAU]·[Tova M.]·[Querying...]·[VLDB]·[2006]
+ [TAU]·[Tova M.]·[Monitoring...]·[VLDB]·[2007]
+ [TAU]·[Slava N.]·[OASSIS...]·[SIGMOD]·[2014]
+ [TAU]·[Tova M.]·[A sample...]·[SIGMOD]·[2014]
Two Different Factorizations:
(1) [TAU] · ( [SIGMOD] · [2014] · ( [OASSIS...] · ([Tova M.] + [Slava N.])
                                    + [Tova M.] · [A sample...] )
            + [VLDB] · [Tova M.] · ( [2006] · [Querying...] + [2007] · [Monitoring...] ) )
(2) [TAU] · ( [Tova M.] · ( [VLDB] · ( [2006] · [Querying...] + [2007] · [Monitoring...] )
                            + [SIGMOD] · [2014] · ( [OASSIS...] + [A sample...] ) )
            + [Slava N.] · [OASSIS...] · [SIGMOD] · [2014] )
Shorter means better?
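A small check can confirm that the two factorizations are equivalent to the original provenance and compare their lengths. The sketch below encodes expressions as nested tuples and measures length as the number of value occurrences; both the encoding and that length measure are assumptions made here, not definitions taken from the slides.

```python
from itertools import product

# Expression encoding (an assumption of this sketch):
# a value is a str, a product is ("*", e1, e2, ...), a sum is ("+", e1, e2, ...).


def expand(e):
    """Multiply out e and return its set of monomials (frozensets of values)."""
    if isinstance(e, str):
        return {frozenset([e])}
    op, *args = e
    parts = [expand(a) for a in args]
    if op == "+":
        return set().union(*parts)
    return {frozenset().union(*combo) for combo in product(*parts)}


def size(e):
    """Number of value occurrences in e."""
    return 1 if isinstance(e, str) else sum(size(a) for a in e[1:])


PROVENANCE = ("+",
    ("*", "TAU", "Tova M.", "OASSIS...", "SIGMOD", "2014"),
    ("*", "TAU", "Tova M.", "Querying...", "VLDB", "2006"),
    ("*", "TAU", "Tova M.", "Monitoring...", "VLDB", "2007"),
    ("*", "TAU", "Slava N.", "OASSIS...", "SIGMOD", "2014"),
    ("*", "TAU", "Tova M.", "A sample...", "SIGMOD", "2014"))

VLDB_PAPERS = ("+", ("*", "2006", "Querying..."), ("*", "2007", "Monitoring..."))

# Factorization (1): conference values are taken out before author values.
F1 = ("*", "TAU", ("+",
    ("*", "SIGMOD", "2014", ("+",
        ("*", "OASSIS...", ("+", "Tova M.", "Slava N.")),
        ("*", "Tova M.", "A sample..."))),
    ("*", "VLDB", "Tova M.", VLDB_PAPERS)))

# Factorization (2): author values are taken out first.
F2 = ("*", "TAU", ("+",
    ("*", "Tova M.", ("+",
        ("*", "VLDB", VLDB_PAPERS),
        ("*", "SIGMOD", "2014", ("+", "OASSIS...", "A sample...")))),
    ("*", "Slava N.", "OASSIS...", "SIGMOD", "2014")))

print(expand(F1) == expand(PROVENANCE), expand(F2) == expand(PROVENANCE))
print(size(PROVENANCE), size(F1), size(F2))
```

Under this length measure the flat provenance has 25 value occurrences, factorization (1) has 14 and factorization (2) has 15, so (1) is the shorter of the two.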
T-Compatibility
NL Query: Return the organization of authors who published papers in database conferences after 2005
Shortest Factorization:
[TAU] · ( [SIGMOD] · [2014] · ( [OASSIS...] · ([Tova M.] + [Slava N.])
                                + [Tova M.] · [A sample...] )
        + [VLDB] · [Tova M.] · ( [2006] · [Querying...] + [2007] · [Monitoring...] ) )
As a Sentence:
TAU is the organization of authors who published in SIGMOD 2014 'OASSIS...' which was published by Tova M. and Slava N. and Tova M. published 'A sample...' and Tova M. published in VLDB 'Querying...' in 2006 and 'Monitoring...' in 2007.
conferences ≤_T authors but conferences ≰_{f_bad} authors
T-Compatibility
NL Query: Return the organization of authors who published papers in database conferences after 2005
Longer, T-Compatible Factorization:
[TAU] · ( [Tova M.] · ( [VLDB] · ( [2006] · [Querying...] + [2007] · [Monitoring...] )
                        + [SIGMOD] · [2014] · ( [OASSIS...] + [A sample...] ) )
        + [Slava N.] · [OASSIS...] · [SIGMOD] · [2014] )
As a Sentence:
TAU is the organization of Tova M. who published in VLDB 'Querying...' in 2006 and 'Monitoring...' in 2007 and in SIGMOD in 2014 'OASSIS...' and 'A sample...' and Slava N. who published 'OASSIS...' in SIGMOD in 2014.
Factorization Algorithm
Proposition
Obtaining a minimal T-compatible factorization is coNP-hard
Algorithm
Factorize greedily: traverse the dependency tree level by level
For every level with mapped words, factorize their corresponding values in the provenance
Prioritize which values to take out in each level by frequency
Complexity
O(n² · log n): recursively traverse the dependency tree and sort the variables at each layer by their frequency in O(n · log n)
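The greedy idea can be illustrated with a short sketch. It is not the authors' implementation and makes no claim about the stated complexity: it assumes assignments are given as dictionaries, that the tree levels are supplied as lists of attribute names (top level first), and that the three lowest attributes can be treated as a single level. The names `factorize`, `expand`, `ROWS` and `LEVELS` are hypothetical.

```python
from collections import Counter
from itertools import product


def factorize(rows, levels):
    """Greedy sketch: rows are distinct dicts (attribute -> value); levels are
    lists of attribute names, top tree level first. Returns a value (str),
    ("*", ...), ("+", ...), or None for an empty product."""
    levels = [lvl for lvl in levels if lvl]  # skip exhausted levels
    if not levels:
        return None
    attrs = levels[0]
    # Most frequent (attribute, value) pair among the attributes of this level.
    counts = Counter((a, r[a]) for r in rows for a in attrs)
    (attr, value), _ = counts.most_common(1)[0]
    inside = [{k: v for k, v in r.items() if k != attr}
              for r in rows if r[attr] == value]
    outside = [r for r in rows if r[attr] != value]
    # Take the value out of the rows that contain it, then continue below it.
    sub = factorize(inside, [[a for a in attrs if a != attr]] + levels[1:])
    if sub is None:
        term = value
    elif isinstance(sub, tuple) and sub[0] == "*":
        term = ("*", value, *sub[1:])  # flatten nested products
    else:
        term = ("*", value, sub)
    if not outside:
        return term
    other = factorize(outside, levels)
    if isinstance(other, tuple) and other[0] == "+":
        return ("+", term, *other[1:])  # flatten nested sums
    return ("+", term, other)


def expand(e):
    """Multiply out e into its set of monomials, used only as a sanity check."""
    if isinstance(e, str):
        return {frozenset([e])}
    op, *args = e
    parts = [expand(a) for a in args]
    if op == "+":
        return set().union(*parts)
    return {frozenset().union(*combo) for combo in product(*parts)}


COLUMNS = ("oname", "aname", "ptitle", "cname", "pyear")
ROWS = [dict(zip(COLUMNS, t)) for t in [
    ("TAU", "Tova M.", "OASSIS...", "SIGMOD", "2014"),
    ("TAU", "Tova M.", "Querying...", "VLDB", "2006"),
    ("TAU", "Tova M.", "Monitoring...", "VLDB", "2007"),
    ("TAU", "Slava N.", "OASSIS...", "SIGMOD", "2014"),
    ("TAU", "Tova M.", "A sample...", "SIGMOD", "2014"),
]]
# Assumed level grouping: organization, then authors, then everything below.
LEVELS = [["oname"], ["aname"], ["cname", "pyear", "ptitle"]]

RESULT = factorize(ROWS, LEVELS)
print(RESULT)
```

On the running example the organization value is taken out first and the author values second, so conference, year and title values always end up nested under an author, which is the nesting order the T-compatible factorization on the slides has.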
Factorization Example
[Figure: the dependency tree of the NL query, shown next to the provenance at each step]
Initial provenance:
[TAU]·[Tova M.]·[OASSIS...]·[SIGMOD]·[2014]
+ [TAU]·[Tova M.]·[Querying...]·[VLDB]·[2006]
+ [TAU]·[Tova M.]·[Monitoring...]·[VLDB]·[2007]
+ [TAU]·[Slava N.]·[OASSIS...]·[SIGMOD]·[2014]
+ [TAU]·[Tova M.]·[A sample...]·[SIGMOD]·[2014]
After taking out [TAU]:
[TAU] · ( [Tova M.]·[OASSIS...]·[SIGMOD]·[2014]
        + [Tova M.]·[Querying...]·[VLDB]·[2006]
        + [Tova M.]·[Monitoring...]·[VLDB]·[2007]
        + [Slava N.]·[OASSIS...]·[SIGMOD]·[2014]
        + [Tova M.]·[A sample...]·[SIGMOD]·[2014] )
After taking out [Tova M.]:
[TAU] · ( [Tova M.] · ( [OASSIS...]·[SIGMOD]·[2014]
                        + [Querying...]·[VLDB]·[2006]
                        + [Monitoring...]·[VLDB]·[2007]
                        + [A sample...]·[SIGMOD]·[2014] )
        + [Slava N.] · [OASSIS...] · [SIGMOD] · [2014] )
Final factorization:
[TAU] · ( [Tova M.] · ( [VLDB] · ( [2006] · [Querying...] + [2007] · [Monitoring...] )
                        + [SIGMOD] · [2014] · ( [OASSIS...] + [A sample...] ) )
        + [Slava N.] · [OASSIS...] · [SIGMOD] · [2014] )
Resulting sentence:
TAU is the organization of Tova M. who published in VLDB 'Querying...' in 2006 and 'Monitoring...' in 2007 and in SIGMOD in 2014 'OASSIS...' and 'A sample...' and Slava N. who published 'OASSIS...' in SIGMOD in 2014.
Summarization
Two Levels of Summarization
[TAU] · ( [Tova M.] · ( [VLDB] · ( [2006] · [Querying...] + [2007] · [Monitoring...] )
                        + [SIGMOD] · [2014] · ( [OASSIS...] + [A sample...] ) )
        + [Slava N.] · [OASSIS...] · [SIGMOD] · [2014] )
A: the whole sub-expression multiplied by [TAU]
B: the sub-expression multiplied by [Tova M.]
Shorter Summarized Answer Based on A
TAU is the organization of 2 authors who published 4 papers in 2 conferences in 2006 - 2014
More Detailed Summarized Answer Based on B
TAU is the organization of Tova M. who published 4 papers in 2 conferences in 2006 - 2014 and Slava N. who published 'OASSIS...' in SIGMOD in 2014.
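Both summarized answers can be reproduced by counting distinct values and taking the range of years. The sketch below is a simplification with fixed sentence templates; the function names and the rule "summarize an author only when they have more than one assignment" are assumptions made for this example, not the selection criterion used in the presented work.

```python
# Illustrative only: templates, function names and the per-author rule are assumptions.
COLUMNS = ("oname", "aname", "ptitle", "cname", "pyear")
ROWS = [dict(zip(COLUMNS, t)) for t in [
    ("TAU", "Tova M.", "OASSIS...", "SIGMOD", 2014),
    ("TAU", "Tova M.", "Querying...", "VLDB", 2006),
    ("TAU", "Tova M.", "Monitoring...", "VLDB", 2007),
    ("TAU", "Slava N.", "OASSIS...", "SIGMOD", 2014),
    ("TAU", "Tova M.", "A sample...", "SIGMOD", 2014),
]]


def counts_phrase(rows):
    """'<n> papers in <m> conferences in <first year> - <last year>'."""
    papers = {r["ptitle"] for r in rows}
    confs = {r["cname"] for r in rows}
    years = [r["pyear"] for r in rows]
    return (f"{len(papers)} papers in {len(confs)} conferences "
            f"in {min(years)} - {max(years)}")


def summarize_level_a(rows):
    """Summarize everything below the organization (level A)."""
    authors = {r["aname"] for r in rows}
    return (f"{rows[0]['oname']} is the organization of {len(authors)} authors "
            f"who published {counts_phrase(rows)}")


def summarize_level_b(rows):
    """Keep author names and summarize what is below each author (level B)."""
    by_author = {}
    for r in rows:
        by_author.setdefault(r["aname"], []).append(r)
    parts = []
    for author, rs in by_author.items():
        if len(rs) == 1:
            # A single assignment is short enough to spell out in full.
            r = rs[0]
            parts.append(f"{author} who published '{r['ptitle']}' "
                         f"in {r['cname']} in {r['pyear']}")
        else:
            parts.append(f"{author} who published {counts_phrase(rs)}")
    return f"{rows[0]['oname']} is the organization of " + " and ".join(parts) + "."


SUMMARY_A = summarize_level_a(ROWS)
SUMMARY_B = summarize_level_b(ROWS)
print(SUMMARY_A)
print(SUMMARY_B)
```

On the five example assignments this yields the two sentences shown above for levels A and B.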
Sample Use-Cases
Q: Return the authors who published papers in VLDB before 2016 and after 2007
A: Tova M. published 16 papers in VLDB in 2008 - 2015
Q: Return the authors who published papers in database conferences
A: Tova M. published 134 papers in 18 conferences
Q: Return the organization of authors who published papers in database conferences after 2005
A: TAU is the organization of 43 authors who published 170 papers in 31 conferences in 2006 - 2015
Sample Scalability Results
Computation time as a function of the number of assignments.
Overhead of only 16% w.r.t. evaluation time.
[Figure: time (sec, 0 to 4.5) vs. number of assignments (0 to 5000), one series each for Query 4 through Query 12]
Breakdown of Computation Time
[Figure (a) Factorization time: time (sec, 0 to 0.6) vs. domain of unique values per answer (0 to 5000), one series each for Query 4 through Query 12]
[Figure (b) Sentence gen. time: time (sec, 0 to 1.4) vs. domain of unique values per answer (0 to 5000), one series each for Query 4 through Query 12]
Summarization time was negligible.
Related Work
NL Interfaces:
Formulate the NL query and present the answers, e.g., [Fei Li et al., 15’], [Song et al., 15’]
Present the answers in NL based on the schema [Franconi et al., 14’]
Explain the query in NL [Koutrika et al., 10’]
Provenance:
Showing the provenance in graph form, e.g., [Ailamaki et al., 98’], [Davidson et al., 08’]
Allowing user control over granularity [Cohen-Boulakia et al., 08’]
Provenance factorization and summarization, e.g., [Chapman et al., 08’], [Olteanu et al., 12’]
Summary
Main Contributions:
First to formulate the provenance of output tuples in NL
Employing both factorization and summarization to make provenance more understandable
Devising a criterion for provenance factorization that accounts for its presentation in NL
Future Work:
Extend the solution to UCQs, aggregation, nested queries, ...
Support more provenance models
Generalize the requirements from NL interfaces
Thank You
Questions?