Mazda Trio Meeting
Trio: A System for Data, Uncertainty, and Lineage
Search “stanford trio”
http://i.stanford.edu/trio
People
Current
• Jennifer Widom (faculty)
• Omar Benjelloun (post-doc)
• Parag Agrawal, Anish Das Sarma, Shubha Nabar (PhD)
• Michi Mutsuzaki (MS)
• Tomoe Sugihara (visitor)
Incoming
• Martin Theobald (post-doc)
• Raghu Murthy (MS)
• Ander de Keijzer (visitor)
Alums
• Alon Halevy, Ashok Chandra (visitors)
• Chris Hayworth (MS)
Why Uncertainty + Lineage?
Many applications seem to need both
From a technical standpoint, it turns out that lineage...
1. Enables simple and consistent representation of uncertain data
2. Correlates uncertainty in query results with uncertainty in the input data
3. Can make computation over uncertain data more efficient
Trio Components
1. Data Model: ULDBs (Uncertainty-Lineage Databases), a simple extension to the relational model
2. Query Language: TriQL, a simple extension to SQL with well-defined semantics and intuitive behavior
3. System: Version 1, a complete system and GUI built on top of a conventional DBMS
Running Example: Crime-Solving
Saw(witness, car)    // may be uncertain
Drives(person, car)  // may be uncertain
Suspects(person) = π_person(Saw ⋈ Drives)
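For intuition, this is what the Suspects query computes when both relations are ordinary (certain) tables. It is only a minimal Python sketch: the sample rows and the function name `suspects` are illustrative assumptions, not data from the talk.

```python
# Hypothetical sample data (not from the slides); each relation is a set of tuples.
saw = {("Amy", "Honda"), ("Betty", "Acura")}
drives = {("Jimmy", "Honda"), ("Billy", "Toyota"), ("Hank", "Acura")}

def suspects(saw, drives):
    """Project person from the join of Saw and Drives on car."""
    return {
        person
        for (_witness, seen_car) in saw
        for (person, driven_car) in drives
        if seen_car == driven_car
    }

result = suspects(saw, drives)
```

With uncertain inputs, the same query has to be interpreted over every possible instance of Saw and Drives.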
Our Model for Uncertainty
1. Alternatives
2. ‘?’ (Maybe) Annotations
3. Confidences
Our Model for Uncertainty
1. Alternatives: uncertainty about value
2. ‘?’ (Maybe) Annotations
3. Confidences
Saw (witness, car)
(Amy, Honda) ∥ (Amy, Toyota) ∥ (Amy, Mazda)
=
witness  car
Amy      { Honda, Toyota, Mazda }
Three possible instances
Our Model for Uncertainty
1. Alternatives
2. ‘?’ (Maybe): uncertainty about presence
3. Confidences
Saw (witness, car)
(Amy, Honda) ∥ (Amy, Toyota) ∥ (Amy, Mazda)
(Betty, Acura) ?
Six possible instances
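The count of six can be checked by enumerating the instances directly. The sketch below assumes a simple encoding that is not Trio's: each uncertain tuple is a list of alternatives plus a flag for the ‘?’ annotation, and `possible_instances` is a hypothetical helper name.

```python
from itertools import product

# Each uncertain tuple is (alternatives, maybe); maybe=True models the '?' annotation.
saw = [
    ([("Amy", "Honda"), ("Amy", "Toyota"), ("Amy", "Mazda")], False),
    ([("Betty", "Acura")], True),
]

def possible_instances(relation):
    """Return every possible instance as a frozenset of ordinary tuples."""
    choices_per_tuple = []
    for alternatives, maybe in relation:
        choices = list(alternatives)
        if maybe:
            choices.append(None)  # None stands for "this tuple is absent"
        choices_per_tuple.append(choices)
    return [
        frozenset(choice for choice in combo if choice is not None)
        for combo in product(*choices_per_tuple)
    ]

instances = possible_instances(saw)
```

Three choices for Amy's tuple times two for Betty's (present or absent) gives the six instances.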
Our Model for Uncertainty
1. Alternatives
2. ‘?’ (Maybe) Annotations
3. Confidences: weighted uncertainty
Saw (witness, car)
(Amy, Honda): 0.5 ∥ (Amy, Toyota): 0.3 ∥ (Amy, Mazda): 0.2
(Betty, Acura): 0.6 ?
Six possible instances, each with a probability
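One way to read "each with a probability" is sketched below. It assumes the two tuples are independent and treats any confidence mass not assigned to an alternative as the probability that the tuple is absent. Both are assumptions made for illustration, and `weighted_instances` is a hypothetical helper name.

```python
from itertools import product
from math import prod

# Confidences from the slide; one dict per uncertain tuple.
saw = [
    {("Amy", "Honda"): 0.5, ("Amy", "Toyota"): 0.3, ("Amy", "Mazda"): 0.2},
    {("Betty", "Acura"): 0.6},
]

def weighted_instances(relation):
    """Return (instance, probability) pairs, assuming independent tuples."""
    per_tuple = []
    for alternatives in relation:
        options = list(alternatives.items())
        leftover = 1.0 - sum(alternatives.values())
        if leftover > 1e-9:
            options.append((None, leftover))  # the tuple is absent
        per_tuple.append(options)
    result = []
    for combo in product(*per_tuple):
        instance = frozenset(t for t, _ in combo if t is not None)
        probability = prod(p for _, p in combo)
        result.append((instance, probability))
    return result

instances = weighted_instances(saw)
```

Under these assumptions the instance containing (Amy, Honda) and (Betty, Acura) has probability 0.5 × 0.6, and the six probabilities sum to 1.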
Models for Uncertainty
• Our model (so far) is not especially new
• We spent some time exploring the space of models for uncertainty [ICDE 06, journal]
• Tension between understandability and expressiveness
  – Our model is understandable
  – But it is not complete, or even closed under common operations
Our Model is Not Closed
Saw (witness, car)
(Cathy, Honda) ∥ (Cathy, Mazda)
Drives (person, car)
(Jimmy, Toyota) ∥ (Jimmy, Mazda)
(Billy, Honda) ∥ (Frank, Honda)
(Hank, Honda)
Suspects = π_person(Saw ⋈ Drives)
Suspects
Jimmy ?
Billy ∥ Frank ?
Hank ?
Does not (in fact, CANNOT) correctly capture possible instances in the result
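The mismatch can be made concrete by brute force: run the query in every possible instance of the inputs, then compare with the instances described by a Suspects table made of three independent maybe-tuples. The encoding and helper names below are assumptions for illustration only.

```python
from itertools import product

def instances(relation):
    """relation: list of (alternatives, maybe). Returns the set of possible instances."""
    per_tuple = [list(alts) + ([None] if maybe else []) for alts, maybe in relation]
    return {
        frozenset(c for c in combo if c is not None)
        for combo in product(*per_tuple)
    }

saw = [([("Cathy", "Honda"), ("Cathy", "Mazda")], False)]
drives = [
    ([("Jimmy", "Toyota"), ("Jimmy", "Mazda")], False),
    ([("Billy", "Honda"), ("Frank", "Honda")], False),
    ([("Hank", "Honda")], False),
]

def suspects(saw_inst, drives_inst):
    """The Suspects query on one ordinary instance of each input."""
    return frozenset(
        person
        for (_witness, seen) in saw_inst
        for (person, driven) in drives_inst
        if seen == driven
    )

# Correct answer: evaluate the query in every possible instance of the inputs.
true_results = {suspects(s, d) for s in instances(saw) for d in instances(drives)}

# Attempted representation without lineage: three independent maybe-tuples.
naive = [(["Jimmy"], True), (["Billy", "Frank"], True), (["Hank"], True)]
naive_results = instances(naive)
```

Here `true_results` contains four instances, while `naive_results` contains twelve, including {Jimmy, Billy, Hank}, which can never occur because Jimmy requires Cathy to have seen a Mazda and the others require a Honda.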
Lineage to the Rescue
Lineage
• Captures “where data came from”
• In Trio: A function λ from alternatives to other alternatives (or external sources)
Example with Lineage
ID  Saw (witness, car)
11  (Cathy, Honda) ∥ (Cathy, Mazda)
ID  Drives (person, car)
21  (Jimmy, Toyota) ∥ (Jimmy, Mazda)
22  (Billy, Honda) ∥ (Frank, Honda)
23  (Hank, Honda)
ID  Suspects
31  Jimmy ?
32  Billy ∥ Frank ?
33  Hank ?
Suspects = π_person(Saw ⋈ Drives)
λ(31) = (11,2), (21,2)
λ(32,1) = (11,1), (22,1); λ(32,2) = (11,1), (22,2)
λ(33) = (11,1), 23
Correctly captures possible instances in the result
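A small sketch of why lineage fixes the problem: if a result alternative is considered present exactly when every base alternative in its lineage has been chosen, the Suspects table yields only the instances the query can actually produce. The dictionary encoding is an assumption, and tuple 23, which has a single alternative, is written as (23, 1).

```python
from itertools import product

# Base data: tuple ID -> number of alternatives (IDs as on the slide).
base_alternatives = {11: 2, 21: 2, 22: 2, 23: 1}

# Lineage from the slide: result alternative -> (value, base alternatives it derives from).
suspects_lineage = {
    (31, 1): ("Jimmy", [(11, 2), (21, 2)]),
    (32, 1): ("Billy", [(11, 1), (22, 1)]),
    (32, 2): ("Frank", [(11, 1), (22, 2)]),
    (33, 1): ("Hank", [(11, 1), (23, 1)]),
}

def result_instances(base_alternatives, lineage):
    """Enumerate base choices; keep result alternatives whose whole lineage was chosen."""
    ids = sorted(base_alternatives)
    ranges = [range(1, base_alternatives[i] + 1) for i in ids]
    results = set()
    for picks in product(*ranges):
        chosen = set(zip(ids, picks))
        present = frozenset(
            person
            for person, sources in lineage.values()
            if all(source in chosen for source in sources)
        )
        results.add(present)
    return results

instances = result_instances(base_alternatives, suspects_lineage)
```

The result is four instances: the empty set, {Jimmy}, {Billy, Hank}, and {Frank, Hank}.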
Uncertainty-Lineage Databases (ULDBs)
1. Alternatives
2. ‘?’ (Maybe) Annotations
3. Confidences
4. Lineage
ULDBs are closed and complete [VLDB 06]
ULDBs: Lineage
• Conjunctive lineage sufficient for most operations
• Duplicate-elimination: Disjunctive lineage
• Difference: Negative lineage
• General case after multiple operations/queries: Boolean formula
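The three lineage shapes can be illustrated with a tiny formula evaluator. This is a sketch under assumptions: formulas are nested tuples, leaves are made-up alternative names such as "saw.1", and `holds` is a hypothetical helper, not part of Trio.

```python
def holds(formula, present):
    """Evaluate a lineage formula given the set of base alternatives that are present."""
    if isinstance(formula, str):  # leaf: the name of a base alternative
        return formula in present
    op, *args = formula
    if op == "and":
        return all(holds(arg, present) for arg in args)
    if op == "or":
        return any(holds(arg, present) for arg in args)
    if op == "not":
        return not holds(args[0], present)
    raise ValueError(f"unknown operator: {op}")

# Join: conjunctive lineage.
join_lineage = ("and", "saw.1", "drives.1")
# Duplicate elimination: disjunctive lineage over the merged duplicates.
dedup_lineage = ("or", "drives.1", "drives.2")
# Difference: the left alternative is present and the right one is not.
difference_lineage = ("and", "r.1", ("not", "s.1"))
```

Nesting these forms, as happens after several queries, gives a general Boolean formula over base alternatives.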
ULDBs: Interesting Questions
• Data-minimality: extraneous alternatives, extraneous “?”
• Lineage-minimality: harder
• Membership: tuple and table, some-instance and all-instances
• Coexistence: multiple tuples
• Extraction: remove tables, retain possible-instances
Example: Extraneous Data
(Diane, Mazda) ∥ (Diane, Acura)
(Diane, Mazda) ?
(Diane, Acura) ?
Diane ? (extraneous)
Example: Coexistence
(Diane, Mazda) ∥ (Diane, Acura)
(Diane, Mazda) ?
(Diane, Acura) ?
Mazda ?
Acura ?
Can’t coexist
Querying ULDBs: Semantics
Query Q on ULDB D
[Commutative diagram]
D → (possible instances) → D1, D2, …, Dn → (Q on each instance) → Q(D1), Q(D2), …, Q(Dn)
D → (implementation of Q: operational semantics) → D’ = D + Result → (representation of instances) → Q(D1), Q(D2), …, Q(Dn)
Querying ULDBs: TriQL
Basic TriQL: SQL with new semantics
• Obeys commutative diagram for uncertain data
• Tracks lineage
• Query results: new table or on-the-fly
Implemented TriQL: also built-in predicates conf(), lineage(), lineage*()
Additional TriQL Constructs [Language manual on web site]
• “Horizontal subqueries”: refer to tuple alternatives as a relation
• Unmerged (horizontal duplicates)
• Flatten, GroupAlts
• NoLineage, NoConf, NoMaybe
• Query-specified confidences [done]
• Data modification statements
Confidence Computation
• Confidences computed on-demand based on lineage
  – Confidence of alternative A is function of confidences in λ*(A)
  – Permits any query plan for data computation
• Default probabilistic interpretation, but queries can override
SELECT person, min(conf(Saw),conf(Drives)) as conf
FROM Saw, Drives
WHERE Saw.car = Drives.car
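The following sketch shows on-demand confidence computation over lineage. The slides do not spell out the default formula, so the product rule used here (conjunctive lineage over independent base alternatives) is an assumption, as are the confidence values and the `conf` helper. The `min` variant mirrors the override in the query above.

```python
from math import prod

# Hypothetical base confidences keyed by (tuple ID, alternative index).
base_conf = {(11, 1): 0.7, (11, 2): 0.3, (22, 1): 0.6, (22, 2): 0.4, (23, 1): 1.0}

# Conjunctive lineage of derived alternatives (IDs follow the lineage example).
lineage = {
    (32, 1): [(11, 1), (22, 1)],
    (32, 2): [(11, 1), (22, 2)],
    (33, 1): [(11, 1), (23, 1)],
}

def conf(alternative, combine=prod):
    """Confidence of an alternative, computed on demand by walking its lineage."""
    if alternative in base_conf:
        return base_conf[alternative]
    return combine([conf(source, combine) for source in lineage[alternative]])

default_conf = conf((32, 1))                # product of 0.7 and 0.6
override_conf = conf((32, 1), combine=min)  # smaller of 0.7 and 0.6
```

Because confidences are derived from lineage only when requested, the data itself can be computed with any query plan.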
Trio System: Version 1
Clients: command-line client, TrioExplorer (GUI client)
• DDL commands
• TriQL queries
• Schema browsing
• Table browsing
• Explore lineage
• On-demand confidence computation
Trio API and translator (Python)
• Standard SQL
Standard relational DBMS
Encoded data tables
• “Verticalize”
• Shared IDs for alternatives
• Columns for confidence, “?”
Lineage tables
• One per result table
• Uses unique IDs
Trio metadata
• Table types
• Schema-level lineage structure
Trio stored procedures
• conf()
• lineage() “==>”
• lineage*() “==>>”
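To make the "verticalize" idea concrete, the sketch below stores one row per alternative in an ordinary SQL table, with a shared ID grouping the alternatives of one tuple and extra columns for confidence and ‘?’. The table name `saw_enc`, the column names, and the `decode` helper are all hypothetical. Trio's actual encoding may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE saw_enc (
    xid     INTEGER,             -- shared ID: same xid = alternatives of one tuple
    aid     INTEGER PRIMARY KEY, -- unique ID of this alternative
    witness TEXT,
    car     TEXT,
    conf    REAL,                -- confidence of the alternative
    maybe   INTEGER              -- 1 if the tuple carries '?'
);
""")
rows = [
    (1, 1, "Amy", "Honda", 0.5, 0),
    (1, 2, "Amy", "Toyota", 0.3, 0),
    (1, 3, "Amy", "Mazda", 0.2, 0),
    (2, 4, "Betty", "Acura", 0.6, 1),
]
conn.executemany("INSERT INTO saw_enc VALUES (?, ?, ?, ?, ?, ?)", rows)

def decode(conn):
    """Regroup the vertical rows into tuples with alternatives, using standard SQL."""
    tuples = {}
    query = "SELECT xid, witness, car, conf, maybe FROM saw_enc ORDER BY xid, aid"
    for xid, witness, car, conf, maybe in conn.execute(query):
        entry = tuples.setdefault(xid, {"alternatives": [], "maybe": bool(maybe)})
        entry["alternatives"].append(((witness, car), conf))
    return tuples

decoded = decode(conn)
```

A lineage table for a result could follow the same pattern, mapping each result alternative's unique ID to the IDs of the alternatives it was derived from.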
Current & Future Topics
Algorithms: confidence computation, coexistence, extraneous data
• Minimize lineage traversal
• Memoization
• Batch operations
System
• Full query language
• More internal processing ?
  – Storage and indexing
  – Statistics and query optimization
Current & Future Topics
• Top-K by confidence
• Extend basic uncertainty model
  – Incomplete relations
  – Continuous uncertainty
  – Correlated uncertainty ?
• External lineage, update lineage, versioning