Explicit and Implicit LIST Aggregate Function for Relational Databases

Witold Litwin
Université Paris 9 Dauphine
mailto:Witold.litwin@dauphine.fr


Summary

• New Aggregate Function

• Transforms a set of values into a single one
  – Char type

• A basic, long-standing need

• Should be highly useful


Plan

• Motivating Examples

• Explicit LIST

• Implicit LIST

• Conclusion

• Further Work


Motivating Example 1

• The Supplier-Part (SP) table of the best-known S-P database

S#   P#   Qty
s1   p1   300
s1   p2   200
s1   p3   400
s1   p4   200
s1   p5   100
s1   p6   100
s2   p1   300
s2   p2   400
s3   p2   200
s4   p2   200
s4   p4   300
s4   p5   400


Motivating Example 1

• The classical query:

select SP.[S#], Sum(SP.Qty) AS [Total Qty]
from SP
group by SP.[S#];

S#   Total Qty
s1   1300
s2   700
s3   200
s4   900

• How can we also get the individual quantities?


Motivating Example 2

• A database of persons having:
  – Multiple Hobbies
  – Multiple preferred Restaurants
  – Many Friends

• Best design:
  – four 4NF tables
    • P (SS#, Name), H (SS#, Hobby), R (SS#, Rest), F (SS#, Friend)


Database


Fragment


Query

Select Name, Friends, Restaurants, Hobbies of Person 'SS1'

SQL:

select P.[SS#], P.Name, F.Friend, R.Rest, H.Hobby
from ((P INNER JOIN F ON P.[SS#] = F.[SS#])
INNER JOIN H ON P.[SS#] = H.[SS#])
INNER JOIN R ON P.[SS#] = R.[SS#]
where P.[SS#] = "ss1";


Result

Usable ???
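Why the joined result is hard to use can be checked by counting rows. The sketch below is not from the slides. It assumes Python's sqlite3 module as a stand-in RDBS and a small, hypothetical subset of the person's data (3 friends, 2 restaurants, 4 hobbies). The Access-style parenthesized joins are written as a flat join chain, which is equivalent for this query.

```python
import sqlite3

# Hypothetical sample data for person ss1 (a subset, for illustration only).
FRIENDS = ["Alexis", "Christopher", "Ron"]
RESTS = ["Bengal", "Chef Wu"]
HOBBIES = ["Bike", "Hike", "Ski", "Swim"]

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE P ([SS#] TEXT, Name TEXT);
    CREATE TABLE F ([SS#] TEXT, Friend TEXT);
    CREATE TABLE R ([SS#] TEXT, Rest TEXT);
    CREATE TABLE H ([SS#] TEXT, Hobby TEXT);
    """
)
conn.execute("INSERT INTO P VALUES ('ss1', 'Witold')")
conn.executemany("INSERT INTO F VALUES ('ss1', ?)", [(f,) for f in FRIENDS])
conn.executemany("INSERT INTO R VALUES ('ss1', ?)", [(r,) for r in RESTS])
conn.executemany("INSERT INTO H VALUES ('ss1', ?)", [(h,) for h in HOBBIES])

# The query of the slide: every friend is paired with every restaurant
# and every hobby, so the answer is a cross product per person.
joined = conn.execute(
    """
    SELECT P.[SS#], P.Name, F.Friend, R.Rest, H.Hobby
    FROM P
    INNER JOIN F ON P.[SS#] = F.[SS#]
    INNER JOIN H ON P.[SS#] = H.[SS#]
    INNER JOIN R ON P.[SS#] = R.[SS#]
    WHERE P.[SS#] = 'ss1'
    """
).fetchall()

print(len(joined))  # 3 * 2 * 4 = 24 rows for only 9 stored facts
```

With the ten friends, ten restaurants and ten hobbies listed later in the talk, the same query would return 1000 rows.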


General Problem

• Current RDBs manage tables in 1NF
  – All attributes are single-valued (atomic values)

• Example 1: we wanted
  – A single-valued attribute:
    • SUM(QTY)
  – A multi-valued attribute:
    • Individual quantities

• The result would not be in 1NF


General Problem

• RDBs manage tables in 1NF
  – All attributes are single-valued

• Example 2: we wanted
  – Single-valued attributes:
    • SS#, Name
  – Multi-valued attributes (multi-sets):
    • Hobby, Rest, Friend

• The result is normalized to 1NF
  – {(ss1, Witold, x, y, z) : x ∈ Hobby, y ∈ Rest, z ∈ Friend}
  – The table is not in 4NF
  – Subject to well-known anomalies


Solutions

• Design RDBS for 0NF tables
  – A revolution
  – 0NF RDBS will not be here for years

• Aggregate set or multi-set values into atomic values
  – An evolution
  – All RDBS already do it using:
    • SUM, AVG, COUNT…
    • perhaps with GROUP BY
  – We need a new aggregate leaving the entire set visible
    • E.g.: (multi-)set of values X => (single) list of values X


Local Culinary Example

• The set-valued attribute:
  – (Schwarz, Wälder, Kirchen, Chocoladen, Torte)

• The aggregated attribute:
  – Schwarzwälderkirchenchocoladentorte

• Local specialty, try it!


Explicit LIST function

select S#, sum (Qty) AS [Total Qty], LIST (Qty) AS Histogram
from SP
group by S#;

S#   Total Qty   Histogram
s1   1300        300, 200, 400, 200, 100, 100
s2   700         300, 400
s3   200         200
s4   900         200, 300, 400
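The effect of the explicit LIST query above can be approximated today. The following is a minimal sketch, assuming Python's sqlite3 module and SQLite's built-in `group_concat` aggregate as a stand-in for the proposed LIST; it is not the implementation discussed in the talk. SQLite does not guarantee the order of the items inside each concatenated value.

```python
import sqlite3

# The SP table of the S-P database, as shown in Motivating Example 1.
SP_ROWS = [
    ("s1", "p1", 300), ("s1", "p2", 200), ("s1", "p3", 400),
    ("s1", "p4", 200), ("s1", "p5", 100), ("s1", "p6", 100),
    ("s2", "p1", 300), ("s2", "p2", 400),
    ("s3", "p2", 200),
    ("s4", "p2", 200), ("s4", "p4", 300), ("s4", "p5", 400),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SP ([S#] TEXT, [P#] TEXT, Qty INTEGER)")
conn.executemany("INSERT INTO SP VALUES (?, ?, ?)", SP_ROWS)

# group_concat plays the role of LIST (Qty): one Char value per group.
rows = conn.execute(
    """
    SELECT [S#], SUM(Qty) AS [Total Qty],
           group_concat(Qty, ', ') AS Histogram
    FROM SP
    GROUP BY [S#]
    ORDER BY [S#]
    """
).fetchall()

for supplier, total, histogram in rows:
    print(supplier, total, histogram)
```

Both the total and the individual quantities appear in a single 1NF row per supplier, which is the point of the slide.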


Explicit LIST function

select P.SS#, Name, LIST (DISTINCT (Friend)), LIST (DISTINCT (Rest)), LIST (DISTINCT (Hobby))
from P, F, R, H
where P.SS# = F.SS# and F.SS# = R.SS# and R.SS# = H.SS# and P.SS# = "ss1"
group by P.SS#, Name;

SS#:    SS1
Name:   Witold
Friend: Alexis, Christopher, Ron, Jim, Donna, Elisabeth, Dave, Peter, Per-Ake, Thomas
Rest:   Bengal, Cantine Paris 9, Chef Wu, Ferme de Condé, Miyake, Louis XIII, Mela, North Beach Pizza, Pizza Napoli, Sushi Etoile
Hobby:  Bike, Classical Music, Good food, Hike, Movie, Science Fiction, Ski, Swim, Tennis, Wine
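DISTINCT matters in this query because the join repeats every friend once per (restaurant, hobby) pair. A rough emulation, assuming Python's sqlite3 module, SQLite's `group_concat(DISTINCT ...)` in place of LIST (DISTINCT (...)), and a hypothetical subset of the data above, could look like this. SQLite only accepts the default "," separator together with DISTINCT.

```python
import sqlite3

# Hypothetical subset of the data for person ss1 (for illustration only).
FRIENDS = ["Alexis", "Christopher", "Ron"]
RESTS = ["Bengal", "Chef Wu"]
HOBBIES = ["Bike", "Hike", "Ski", "Swim"]

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE P ([SS#] TEXT, Name TEXT);
    CREATE TABLE F ([SS#] TEXT, Friend TEXT);
    CREATE TABLE R ([SS#] TEXT, Rest TEXT);
    CREATE TABLE H ([SS#] TEXT, Hobby TEXT);
    """
)
conn.execute("INSERT INTO P VALUES ('ss1', 'Witold')")
conn.executemany("INSERT INTO F VALUES ('ss1', ?)", [(f,) for f in FRIENDS])
conn.executemany("INSERT INTO R VALUES ('ss1', ?)", [(r,) for r in RESTS])
conn.executemany("INSERT INTO H VALUES ('ss1', ?)", [(h,) for h in HOBBIES])

# One row per person; each multi-valued attribute becomes one Char value.
row = conn.execute(
    """
    SELECT P.[SS#], Name,
           group_concat(DISTINCT Friend) AS Friends,
           group_concat(DISTINCT Rest)   AS Restaurants,
           group_concat(DISTINCT Hobby)  AS Hobbies
    FROM P, F, R, H
    WHERE P.[SS#] = F.[SS#] AND F.[SS#] = R.[SS#]
      AND R.[SS#] = H.[SS#] AND P.[SS#] = 'ss1'
    GROUP BY P.[SS#], Name
    """
).fetchone()

print(row)
```

Without DISTINCT, each friend would appear 2 * 4 = 8 times in the first list for this sample.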


Explicit LIST function

• Simulated actual output using MsAccess forms with list boxes
• Form with three subforms
• No SQL query used


Explicit LIST function

select P#, SUM (Qty) as [Total Qty], LIST (S#, Qty) as [Per supplier]
from SP
group by P#;

P#   Total Qty   Per supplier
p1   600         s1 300, s2 300
p2   1000        s1 200, s2 400, s3 200, s4 200
p3   400         s1 400
p4   500         s1 200, s4 300
p5   500         s1 100, s4 400
p6   100         s1 100
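A composite LIST (S#, Qty) can be imitated by concatenating the two attributes before aggregating. The sketch below assumes Python's sqlite3 module and SQLite's `group_concat`; the space between S# and Qty and the ", " between items are presentation choices made here, not something the slides prescribe.

```python
import sqlite3

# The SP table of the S-P database, as shown in Motivating Example 1.
SP_ROWS = [
    ("s1", "p1", 300), ("s1", "p2", 200), ("s1", "p3", 400),
    ("s1", "p4", 200), ("s1", "p5", 100), ("s1", "p6", 100),
    ("s2", "p1", 300), ("s2", "p2", 400),
    ("s3", "p2", 200),
    ("s4", "p2", 200), ("s4", "p4", 300), ("s4", "p5", 400),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SP ([S#] TEXT, [P#] TEXT, Qty INTEGER)")
conn.executemany("INSERT INTO SP VALUES (?, ?, ?)", SP_ROWS)

# Each list item is the pair "S# Qty"; items are joined per part P#.
rows = conn.execute(
    """
    SELECT [P#], SUM(Qty) AS [Total Qty],
           group_concat([S#] || ' ' || Qty, ', ') AS [Per supplier]
    FROM SP
    GROUP BY [P#]
    ORDER BY [P#]
    """
).fetchall()

for part, total, per_supplier in rows:
    print(part, total, per_supplier)
```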


Implicit LIST function

• For any single-valued A:
  – A = LIST (A)

• In current SQL, any non-aggregated attribute in a query with GROUP BY has to be in the GROUP BY clause

• Now, any non-aggregated, perhaps composite, attribute A from a single table and not in the GROUP BY clause is implicitly under
  – LIST (DISTINCT (A))

• Queries may become less procedural


Implicit LIST function

select P#, SUM (Qty) as [Total Qty], S#, Qty

from SP

group by P#

having ‘S# QTY’ like ‘*s4*’;

• Implicit LIST is LIST (S#, QTY)
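No current RDBS evaluates the implicit form, but the query can be rewritten by hand into an explicit one to see what it would return. The sketch below assumes Python's sqlite3 module, `group_concat` as a stand-in for LIST (S#, Qty), and the SQL wildcard `%` instead of the Access-style `*`. The rewriting itself is an interpretation of the slide, not a specification.

```python
import sqlite3

# The SP table of the S-P database, as shown in Motivating Example 1.
SP_ROWS = [
    ("s1", "p1", 300), ("s1", "p2", 200), ("s1", "p3", 400),
    ("s1", "p4", 200), ("s1", "p5", 100), ("s1", "p6", 100),
    ("s2", "p1", 300), ("s2", "p2", 400),
    ("s3", "p2", 200),
    ("s4", "p2", 200), ("s4", "p4", 300), ("s4", "p5", 400),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SP ([S#] TEXT, [P#] TEXT, Qty INTEGER)")
conn.executemany("INSERT INTO SP VALUES (?, ?, ?)", SP_ROWS)

# S# and Qty are not in GROUP BY, so they fall under an (emulated) LIST;
# HAVING then filters on the aggregated Char value.
rows = conn.execute(
    """
    SELECT [P#], SUM(Qty) AS [Total Qty],
           group_concat([S#] || ' ' || Qty, ', ') AS [S# Qty]
    FROM SP
    GROUP BY [P#]
    HAVING group_concat([S#] || ' ' || Qty, ', ') LIKE '%s4%'
    ORDER BY [P#]
    """
).fetchall()

for part, total, listed in rows:
    print(part, total, listed)  # only parts supplied by s4: p2, p4, p5
```

Note that the totals still cover all suppliers of each selected part, since the filter applies to groups rather than to individual SP tuples.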


Implicit LIST function

• Query

Select S.*, P#, Qty
From S, SP
Where S.S# = SP.S#

Repeats all the data of the supplier S in every resulting tuple
– 6 times for S1: its Name, City, Status

• Query

Select S.*, P#, Qty
From S, SP
Where S.S# = SP.S#
Group By S.S#

Does it only once per supplier
• Less redundancy
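The redundancy claim can be illustrated by comparing row counts. This is a sketch under assumptions: Python's sqlite3 module, `group_concat` in place of the implicit LIST, and illustrative supplier values for Name, Status and City that do not appear in the slides.

```python
import sqlite3

# Illustrative supplier data (assumed values, not taken from the slides).
S_ROWS = [
    ("s1", "Smith", 20, "London"), ("s2", "Jones", 10, "Paris"),
    ("s3", "Blake", 30, "Paris"), ("s4", "Clark", 20, "London"),
]
# The SP table of the S-P database, as shown in Motivating Example 1.
SP_ROWS = [
    ("s1", "p1", 300), ("s1", "p2", 200), ("s1", "p3", 400),
    ("s1", "p4", 200), ("s1", "p5", 100), ("s1", "p6", 100),
    ("s2", "p1", 300), ("s2", "p2", 400),
    ("s3", "p2", 200),
    ("s4", "p2", 200), ("s4", "p4", 300), ("s4", "p5", 400),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE S ([S#] TEXT, Name TEXT, Status INTEGER, City TEXT)")
conn.execute("CREATE TABLE SP ([S#] TEXT, [P#] TEXT, Qty INTEGER)")
conn.executemany("INSERT INTO S VALUES (?, ?, ?, ?)", S_ROWS)
conn.executemany("INSERT INTO SP VALUES (?, ?, ?)", SP_ROWS)

# First query of the slide: supplier data repeated in every joined tuple.
flat = conn.execute(
    "SELECT S.*, [P#], Qty FROM S, SP WHERE S.[S#] = SP.[S#]"
).fetchall()

# Second query, rewritten explicitly: one tuple per supplier plus a list.
grouped = conn.execute(
    """
    SELECT S.[S#], S.Name, S.Status, S.City,
           group_concat(SP.[P#] || ' ' || SP.Qty, ', ') AS Shipments
    FROM S, SP
    WHERE S.[S#] = SP.[S#]
    GROUP BY S.[S#], S.Name, S.Status, S.City
    ORDER BY S.[S#]
    """
).fetchall()

s1_copies = sum(1 for row in flat if row[0] == "s1")
print(len(flat), len(grouped), s1_copies)  # 12 4 6
```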


Implicit Joins and From

• Equijoins following the referential semantic links or integrity may be implicit

• MsAccess, SQL Server…

• FROM clause content can be inferred from the attribute names

• Even less procedural formulation may result:

select P.SS#, Name, Friend, Rest, Hobby
group by P.SS#, Name;


Implementation Issues

• Should be easy for the RDBS owner
  – Any RDB already processes the aggregates

S#   Total Qty   Histogram
s1   1300        300, 200, 400, 200, 100, 100
s2   700         300, 400
s3   200         200
s4   900         200, 300, 400

Total Qty: already done, hiding the list
Histogram: should also be shown


Implementation Issues

• For explicit LIST, foreign function interface may suffice
  – Oracle, DB2, Yukon…
  – See related work in the paper for current (limited) proposals
    • Oracle & iAnywhere (core code)

• Not for the implicit LIST
  – Access to core code is necessary
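The foreign-function route for the explicit LIST can be tried on a small scale. The sketch below is an assumption-laden illustration, not the implementation the talk refers to: it uses Python's sqlite3 module and its `Connection.create_aggregate` hook, and the class name `ListAggregate` is hypothetical. Registering with -1 arguments lets the same aggregate serve both LIST (Qty) and LIST (S#, Qty).

```python
import sqlite3


class ListAggregate:
    """Hypothetical user-defined LIST: joins a group's values into one string."""

    def __init__(self):
        self.items = []

    def step(self, *values):
        # Called once per tuple of the group; several arguments, as in
        # LIST(S#, Qty), form one space-separated item. NULLs are skipped.
        present = [str(v) for v in values if v is not None]
        if present:
            self.items.append(" ".join(present))

    def finalize(self):
        # Called once per group; returns the single Char value.
        return ", ".join(self.items)


SP_ROWS = [
    ("s1", "p1", 300), ("s1", "p2", 200), ("s1", "p3", 400),
    ("s1", "p4", 200), ("s1", "p5", 100), ("s1", "p6", 100),
    ("s2", "p1", 300), ("s2", "p2", 400),
    ("s3", "p2", 200),
    ("s4", "p2", 200), ("s4", "p4", 300), ("s4", "p5", 400),
]

conn = sqlite3.connect(":memory:")
conn.create_aggregate("LIST", -1, ListAggregate)  # -1: any number of arguments
conn.execute("CREATE TABLE SP ([S#] TEXT, [P#] TEXT, Qty INTEGER)")
conn.executemany("INSERT INTO SP VALUES (?, ?, ?)", SP_ROWS)

by_supplier = conn.execute(
    "SELECT [S#], SUM(Qty), LIST(Qty) FROM SP GROUP BY [S#] ORDER BY [S#]"
).fetchall()
by_part = conn.execute(
    "SELECT [P#], SUM(Qty), LIST([S#], Qty) FROM SP GROUP BY [P#] ORDER BY [P#]"
).fetchall()

for row in by_supplier + by_part:
    print(row)
```

The implicit LIST cannot be obtained this way, since it changes how the query processor treats attributes missing from GROUP BY, which is consistent with the slide's remark about core code.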


Conclusion

• LIST is a new aggregate function
• Aggregates a multi-valued attribute into a single value
• Responds to a long-standing fundamental RDBS user need
  – 30 years?
• Should be rather easy to implement
• Future work should start with the implementation
  – Using foreign functions for explicit LIST


Research Support

• European Commission ICONS Project – no. IST-2001-32429.

• Microsoft Research


Thank You for Your Attention

Witold Litwin
Université Paris 9 Dauphine
mailto:Witold.litwin@dauphine.fr