1 Scalable Distributed Data Structures Part 2 Witold Litwin [email protected].
1 Explicit and Implicit LIST Aggregate Function for Relational Databases Witold Litwin Université...
-
Upload
clarissa-claudia-prigmore -
Category
Documents
-
view
215 -
download
0
Transcript of 1 Explicit and Implicit LIST Aggregate Function for Relational Databases Witold Litwin Université...
1
Explicit and Implicit LIST Aggregate Function for
Relational Databases
Witold LitwinUniversité Paris 9 Dauphine
mailto:[email protected]
2
Summary
• New Aggregate Function
• Transforms a set of values into single one– Char type
• A basic long time need
• Should be highly useful
3
Plan
• Motivating Examples
• Explicit LIST
• Implicit LIST
• Conlusion
• Further Work
4
Motivating Example 1
• The Supplier-Part (SP) table of the best-known S-P database
S# P# Qty
s1 p1 300
s1 p2 200
s1 p3 400
s1 p4 200
s1 p5 100
s1 p6 100
s2 p1 300
s2 p2 400
s3 p2 200
s4 p2 200
s4 p4 300
s4 p5 400
5
Motivating Example 1• The classical query :
select SP.[S#], Sum(SP.Qty) AS [Total Qty]from SPgroup By SP.[S#];
S# Total QtyS1 1300S2 700S3 200S4 900
• How to get also the individual quantities ?
6
Motivating Example 2
• A database of persons having:– Multiple Hobbies– Multiple preferred Restaurants– Many Friends
• Best design:– four 4-NF tables
• P (SS #, Name), H (SS#, Hobby), R (SS#, Rest), F (SS#, Friend)
7
Database
8
Fragment
9
Query
select P.[SS#], P.Name, F.Friend, R.Rest, H.Hobby
from ((P INNER JOIN F ON P.[SS#] = F.[SS#])
INNER JOIN H ON P.[SS#] = H.[SS#])
INNER JOIN R ON P.[SS#] = R.[SS#]
where P.[SS#] ="ss1" ;
Select Name, Friends, Restaurants, Hobbies, of Person ‘SS1’
SQL :
10
Result
Usable ???
11
General Problem
• Current RDBs manage tables in 1NF– All attributes are single-valued (atomic values)
• Example 1 ; We wished– Single-valued attribute :
• SUM(QTY)
– Multi-valued attribute• Individual quantities
• The result would not be 1NF
12
General Problem
• RDB manages tables in 1NF– All attributes are single-valued
• Example 2 ; We wished :– Single-valued attributes :
• S#, Name
– Multi-valued attributes (multi-sets):• Hobby, Rest, Friend
• The result is normalized to 1NF– {(ss1,Witold, x, y, z) : x Hobby, y Rest, z Friend }
– The table is not in 4NF– Subject to well-known anomalies
13
Solutions
• Design RDBS for 0NF tables– A revolution
– 0NF RDBS will not be here for years
• Aggregate set or multi-set values into atomic values– An evolution
– All RDBS already do it using:• SUM, AVG, COUNT…
• perhaps with GROUP BY
– We need a new aggregate leaving the entire set visible• E.g: (multi)-set of values X => (single) list of values X
14
Local Culinary Example
• The set-valued attribute:– (Schwarz, Wälder; Kirchen, Chocoladen, Torte)
• The aggregated attribute:– Schwarzwälderkirchenchocoladentorte
• Local specialty, try it !
15
Explicit LIST function
Select S#, sum (Qty) AS [Total Qty], LIST (Qty) AS Histogram from SP group by S#;
S# Total Qty Histogram
s1 1300 300, 200, 400,200;100, 100
s2 700 300, 400
s3 200 200
s4 900 200, 300, 400
16
Explicit LIST functionselect P.SS#, Name, LIST (DISTINCT (Friend)), LIST (DISTINCT (Rest)), LIST (DISTINCT (Hobby)) from P, F, R, H where P.SS# = F.SS# and F.SS# = R.SS# and R.SS# = H.SS# and P.SS# ="ss1" group by P.SS#, Name ;
P Name Friend Rest Hobby
SS1 Witold Alexis, Christopher, Ron, Jim, Donna, Elisabeth, Dave, Peter, Per-Ake, Thomas
Bengal, Cantine Paris 9, Chef Wu, Ferme de Condé, Miyake, Louis XIII, Mela, North Beach Pizza, Pizza Napoli, Sushi Etoile
Bike, Classical Music, Good food, Hike, Movie, Science Fiction, Ski, Swim, Tennis, Wine
17
Explicit LIST function• Simulated actual output using MsAccess forms with list boxes
• Form with three subforms• No SQL query used
18
Explicit LIST functionselect P#, SUM (Qty) as [Total Qty], LIST (S#, Qty) as [Per supplier]
from SP
group by P#;
P# Total Qty Per supplier
p1 600 s1 300s2 300
p2 1000 s1 200s2 400s3 200s4 200
p3 400 s1 400
p4 500 s1 200 s4 300
p5 500 s1 100s4 400
p6 100 s1 100
19
Implicit LIST function
• For any single-valued A : – A = LIST (A)
• Any non-aggregated attribute in an SQL query has to be in the GROUP BY clause
• Now, any non-aggregated perhaps composite attribute A from a single table and not in GROUP BY clause is implicitly under – LIST (DISTINCT (A))
• Queries may become less procedural
20
Implicit LIST function
select P#, SUM (Qty) as [Total Qty], S#, Qty
from SP
group by P#
having ‘S# QTY’ like ‘*s4*’;
• Implicit LIST is LIST (S#, QTY)
21
Implicit LIST function
• QuerySelect S.*, P#, Qty From S, SP Where S.S# = SP.S#
Repeats all the data of the supplier S in every resulting tuple– 6 times for S1: its Name, City, Status
• Query Select S.*, P#, Qty From S, SP Where S.S# = SP.S#
Group By S.S#
Does it only once per supplier • Less redundancy
22
Implicit Joins and From
• Equijoins following the referential semantic links or integrity may be implicit
• MsAccess, SQL Server…
• FROM clause content can be inferred from the attribute names
• Even less procedural formulation may result:select P.SS#, Name, Friend, Rest, Hobbygroup by P.SS#, Name ;
23
Implementation Issues
• Should be easy for the RDBS owner– Any RDB already processes the aggregates
S# Total Qty Histogram
s1 1300 300, 200, 400,200;100, 100
s2 700 300, 400
s3 200 200
s4 900 200, 300, 400
Already done hiding the list Should also
be shown
24
Implementation Issues
• For explicit LIST, foreign function interface may suffice– Oracle, DB2, Yukon…– See related work in the paper for current
(limited) proposals • Oracle & iAnywhere (core code)
• Not for the implicit LIST– Access to core code is necessary
25
Conclusion
• LIST is a new aggregate function• Aggregates a multi-valued attribute into a single
value• Responds to a long-standing fundamental RDBS
user need- 30 years ?
• Should be rather easy to implement• Future work should start with the implementation
– Using foreign functions for explicit LIST
26
Research Support
• European Commission ICONS Project – no. IST-2001-32429.
• Microsoft Research