Extended Operators in SQL and Relational...

60
Bags Versus Sets Extended Operators Joins Extended Operators in SQL and Relational Algebra T. M. Murali September 16, 2009 T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Transcript of Extended Operators in SQL and Relational...

Page 1: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Extended Operators in SQL and Relational Algebra

T. M. Murali

September 16, 2009

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 2: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Bags or Sets?

I So far, we have said that relational algebra and SQL operate onrelations that are sets of tuples.

I Real RDBMSs treat relations as bags of tuples.I A tuple can appear multiple times in a relation.

I Performance is one of the main reasons; duplicate elimination isexpensive since it requires sorting.

I If we use bag semantics, we may have to redefine the meaning of eachrelation algebra operator.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 3: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Bags or Sets?

I So far, we have said that relational algebra and SQL operate onrelations that are sets of tuples.

I Real RDBMSs treat relations as bags of tuples.I A tuple can appear multiple times in a relation.I Performance is one of the main reasons;

duplicate elimination isexpensive since it requires sorting.

I If we use bag semantics, we may have to redefine the meaning of eachrelation algebra operator.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 4: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Bags or Sets?

I So far, we have said that relational algebra and SQL operate onrelations that are sets of tuples.

I Real RDBMSs treat relations as bags of tuples.I A tuple can appear multiple times in a relation.I Performance is one of the main reasons; duplicate elimination is

expensive since it requires sorting.

I If we use bag semantics, we may have to redefine the meaning of eachrelation algebra operator.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 5: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Bag Semantics: Projection and Selection

I Projection (π()): process each tuple independently; a tuple mayappear in the resulting relation multiple times.

I Selection (σ()): process each tuple independently; a tuple may appearin the resulting relation multiple times.

R

A B C

1 2 3

1 2 4

2 3 4

2 3 4

πA,B(R)

A B

1 2

1 2

2 3

2 3

σC≥3(R)

A B C

1 2 3

1 2 4

2 3 4

2 3 4

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 6: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Bag Semantics: Projection and Selection

I Projection (π()): process each tuple independently; a tuple mayappear in the resulting relation multiple times.

I Selection (σ()): process each tuple independently; a tuple may appearin the resulting relation multiple times.

R

A B C

1 2 3

1 2 4

2 3 4

2 3 4

πA,B(R)

A B

1 2

1 2

2 3

2 3

σC≥3(R)

A B C

1 2 3

1 2 4

2 3 4

2 3 4

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 7: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Bag Semantics: Projection and Selection

I Projection (π()): process each tuple independently; a tuple mayappear in the resulting relation multiple times.

I Selection (σ()): process each tuple independently; a tuple may appearin the resulting relation multiple times.

R

A B C

1 2 3

1 2 4

2 3 4

2 3 4

πA,B(R)

A B

1 2

1 2

2 3

2 3

σC≥3(R)

A B C

1 2 3

1 2 4

2 3 4

2 3 4

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 8: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Bag Semantics: Projection and Selection

I Projection (π()): process each tuple independently; a tuple mayappear in the resulting relation multiple times.

I Selection (σ()): process each tuple independently; a tuple may appearin the resulting relation multiple times.

R

A B C

1 2 3

1 2 4

2 3 4

2 3 4

πA,B(R)

A B

1 2

1 2

2 3

2 3

σC≥3(R)

A B C

1 2 3

1 2 4

2 3 4

2 3 4

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 9: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Bag Semantics: Projection and Selection

I Projection (π()): process each tuple independently; a tuple mayappear in the resulting relation multiple times.

I Selection (σ()): process each tuple independently; a tuple may appearin the resulting relation multiple times.

R

A B C

1 2 3

1 2 4

2 3 4

2 3 4

πA,B(R)

A B

1 2

1 2

2 3

2 3

σC≥3(R)

A B C

1 2 3

1 2 4

2 3 4

2 3 4

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 10: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Bag Semantics: Projection and Selection

I Projection (π()): process each tuple independently; a tuple mayappear in the resulting relation multiple times.

I Selection (σ()): process each tuple independently; a tuple may appearin the resulting relation multiple times.

R

A B C

1 2 3

1 2 4

2 3 4

2 3 4

πA,B(R)

A B

1 2

1 2

2 3

2 3

σC≥3(R)

A B C

1 2 3

1 2 4

2 3 4

2 3 4

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 11: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Bag Semantics: Union, Intersection, and Difference

I R ∪ S : if tuple t appears ktimes in R and l times in S , tappears in R ∪ S k + l times.

I R ∩ S : if tuple t appears k timesin R and l times in S , t appearsin R ∩ S min{k, l} times.

I R−S : if tuple t appears k timesin R and l times in S , t appearsin R − S max{0, k − l} times.

R

A B

1 21 22 32 3

S

A B

1 21 21 22 32 4

R ∪ S

A B

1 21 21 21 21 22 32 32 32 4

R ∩ S

A B

1 21 22 3

R − S

A B

2 3

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 12: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Bag Semantics: Union, Intersection, and Difference

I R ∪ S : if tuple t appears ktimes in R and l times in S , tappears in R ∪ S k + l times.

I R ∩ S : if tuple t appears k timesin R and l times in S , t appearsin R ∩ S min{k, l} times.

I R−S : if tuple t appears k timesin R and l times in S , t appearsin R − S max{0, k − l} times.

R

A B

1 21 22 32 3

S

A B

1 21 21 22 32 4

R ∪ S

A B

1 21 21 21 21 22 32 32 32 4

R ∩ S

A B

1 21 22 3

R − S

A B

2 3

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 13: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Bag Semantics: Union, Intersection, and Difference

I R ∪ S : if tuple t appears ktimes in R and l times in S , tappears in R ∪ S k + l times.

I R ∩ S : if tuple t appears k timesin R and l times in S , t appearsin R ∩ S min{k, l} times.

I R−S : if tuple t appears k timesin R and l times in S , t appearsin R − S max{0, k − l} times.

R

A B

1 21 22 32 3

S

A B

1 21 21 22 32 4

R ∪ S

A B

1 21 21 21 21 22 32 32 32 4

R ∩ S

A B

1 21 22 3

R − S

A B

2 3

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 14: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Bag Semantics: Union, Intersection, and Difference

I R ∪ S : if tuple t appears ktimes in R and l times in S , tappears in R ∪ S k + l times.

I R ∩ S : if tuple t appears k timesin R and l times in S , t appearsin R ∩ S min{k, l} times.

I R−S : if tuple t appears k timesin R and l times in S , t appearsin R − S max{0, k − l} times.

R

A B

1 21 22 32 3

S

A B

1 21 21 22 32 4

R ∪ S

A B

1 21 21 21 21 22 32 32 32 4

R ∩ S

A B

1 21 22 3

R − S

A B

2 3

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 15: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Bag Semantics: Union, Intersection, and Difference

I R ∪ S : if tuple t appears ktimes in R and l times in S , tappears in R ∪ S k + l times.

I R ∩ S : if tuple t appears k timesin R and l times in S , t appearsin R ∩ S min{k, l} times.

I R−S : if tuple t appears k timesin R and l times in S , t appearsin R − S max{0, k − l} times.

R

A B

1 21 22 32 3

S

A B

1 21 21 22 32 4

R ∪ S

A B

1 21 21 21 21 22 32 32 32 4

R ∩ S

A B

1 21 22 3

R − S

A B

2 3

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 16: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Bag Semantics: Union, Intersection, and Difference

I R ∪ S : if tuple t appears ktimes in R and l times in S , tappears in R ∪ S k + l times.

I R ∩ S : if tuple t appears k timesin R and l times in S , t appearsin R ∩ S min{k, l} times.

I R−S : if tuple t appears k timesin R and l times in S , t appearsin R − S max{0, k − l} times.

R

A B

1 21 22 32 3

S

A B

1 21 21 22 32 4

R ∪ S

A B

1 21 21 21 21 22 32 32 32 4

R ∩ S

A B

1 21 22 3

R − S

A B

2 3

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 17: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Bag Semantics: Union, Intersection, and Difference

I R ∪ S : if tuple t appears ktimes in R and l times in S , tappears in R ∪ S k + l times.

I R ∩ S : if tuple t appears k timesin R and l times in S , t appearsin R ∩ S min{k, l} times.

I R−S : if tuple t appears k timesin R and l times in S , t appearsin R − S max{0, k − l} times.

R

A B

1 21 22 32 3

S

A B

1 21 21 22 32 4

R ∪ S

A B

1 21 21 21 21 22 32 32 32 4

R ∩ S

A B

1 21 22 3

R − S

A B

2 3

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 18: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Bag Semantics: Products and Joins

I Product (×): If a tuple r appears k times in a relation R and tuple sappears l times in a relation S , then the tuple rs appears kl times inR × S .

I Theta-join and Natural join (./): Since both can be expressed asapplying a selection followed by a projection to a product, use thesemantics of selection, projection, and the product.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 19: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Extended Operators

I Powerful operators based on basic relational operators and bagsemantics.

I Sorting: convert a relation into a list of tuples.

I Duplicate elimination: turn a bag into a set by eliminating duplicatetuples.

I Grouping: partition the tuples of a relation into groups, based on theirvalues among specified attributes.

I Aggregation: used by the grouping operator and tomanipulate/combine attributes.

I Extended projections: projection on steroids.

I Outerjoin: extension of joins that make sure every tuple is in theoutput.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 20: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Extended Operators

I Powerful operators based on basic relational operators and bagsemantics.

I Sorting: convert a relation into a list of tuples.

I Duplicate elimination: turn a bag into a set by eliminating duplicatetuples.

I Grouping: partition the tuples of a relation into groups, based on theirvalues among specified attributes.

I Aggregation: used by the grouping operator and tomanipulate/combine attributes.

I Extended projections: projection on steroids.

I Outerjoin: extension of joins that make sure every tuple is in theoutput.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 21: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Extended Operators

I Powerful operators based on basic relational operators and bagsemantics.

I Sorting: convert a relation into a list of tuples.

I Duplicate elimination: turn a bag into a set by eliminating duplicatetuples.

I Grouping: partition the tuples of a relation into groups, based on theirvalues among specified attributes.

I Aggregation: used by the grouping operator and tomanipulate/combine attributes.

I Extended projections: projection on steroids.

I Outerjoin: extension of joins that make sure every tuple is in theoutput.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 22: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Extended Operators

I Powerful operators based on basic relational operators and bagsemantics.

I Sorting: convert a relation into a list of tuples.

I Duplicate elimination: turn a bag into a set by eliminating duplicatetuples.

I Grouping: partition the tuples of a relation into groups, based on theirvalues among specified attributes.

I Aggregation: used by the grouping operator and tomanipulate/combine attributes.

I Extended projections: projection on steroids.

I Outerjoin: extension of joins that make sure every tuple is in theoutput.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 23: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Extended Operators

I Powerful operators based on basic relational operators and bagsemantics.

I Sorting: convert a relation into a list of tuples.

I Duplicate elimination: turn a bag into a set by eliminating duplicatetuples.

I Grouping: partition the tuples of a relation into groups, based on theirvalues among specified attributes.

I Aggregation: used by the grouping operator and tomanipulate/combine attributes.

I Extended projections: projection on steroids.

I Outerjoin: extension of joins that make sure every tuple is in theoutput.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 24: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Sorting

RA τA1,A2,...(R).

SQL SELECT ... FROM . . . WHERE ... ORDER BY A1, A2, . . ..

I The result is a list of tuples in R but with the tuples sorted by theirvalues in attributes A1, A2, . . .

I In SQL, use DESC after an attribute to specify sorting in descendingorder; ASC is the default.

I If you use the result in another query, sorted order is lost.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 25: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Sorting

RA τA1,A2,...(R).

SQL SELECT ... FROM . . . WHERE ... ORDER BY A1, A2, . . ..

I The result is a list of tuples in R but with the tuples sorted by theirvalues in attributes A1, A2, . . .

I In SQL, use DESC after an attribute to specify sorting in descendingorder; ASC is the default.

I If you use the result in another query, sorted order is lost.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 26: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Duplicate Elimination

RA δ(R) is the relation containing exactly one copy of each tuplein R.

SQL SELECT DISTINCT ...

I Duplicate elimination is expensive, since tuples must be sorted orpartitioned.

I Set operations in SQL (UNION, INTERSECT, and EXCEPT) operate onsets of tuples, i.e., they first eliminate duplicates.

I To make these operators treat relations as bags, follow the operationwith the keyword ALL.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 27: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Duplicate Elimination

RA δ(R) is the relation containing exactly one copy of each tuplein R.

SQL SELECT DISTINCT ...

I Duplicate elimination is expensive, since tuples must be sorted orpartitioned.

I Set operations in SQL (UNION, INTERSECT, and EXCEPT) operate onsets of tuples, i.e., they first eliminate duplicates.

I To make these operators treat relations as bags, follow the operationwith the keyword ALL.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 28: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Duplicate Elimination

RA δ(R) is the relation containing exactly one copy of each tuplein R.

SQL SELECT DISTINCT ...

I Duplicate elimination is expensive, since tuples must be sorted orpartitioned.

I Set operations in SQL (UNION, INTERSECT, and EXCEPT) operate onsets of tuples, i.e., they first eliminate duplicates.

I To make these operators treat relations as bags, follow the operationwith the keyword ALL.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 29: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Aggregation

I Operators that summarise or aggregate the values in a single attributeof a relation.

I Operators are the same in relational algebra and SQL.

I All operators treat a relation as a bag of tuples.

I SUM: computes the sum of a column with numerical values.

I AVG: computes the average of a column with numerical values.I MIN and MAX:

I for a column with numerical values, computes the smallest or largestvalue, respectively.

I for a column with string or character values, computes thelexicographically smallest or largest values, respectively.

I COUNT: computes the number of non-NULL tuples in a column.I In SQL, can use COUNT (*) to count the number of tuples in a relation.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 30: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Aggregation

I Operators that summarise or aggregate the values in a single attributeof a relation.

I Operators are the same in relational algebra and SQL.

I All operators treat a relation as a bag of tuples.

I SUM: computes the sum of a column with numerical values.

I AVG: computes the average of a column with numerical values.

I MIN and MAX:I for a column with numerical values, computes the smallest or largest

value, respectively.I for a column with string or character values, computes the

lexicographically smallest or largest values, respectively.

I COUNT: computes the number of non-NULL tuples in a column.I In SQL, can use COUNT (*) to count the number of tuples in a relation.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 31: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Aggregation

I Operators that summarise or aggregate the values in a single attributeof a relation.

I Operators are the same in relational algebra and SQL.

I All operators treat a relation as a bag of tuples.

I SUM: computes the sum of a column with numerical values.

I AVG: computes the average of a column with numerical values.I MIN and MAX:

I for a column with numerical values, computes the smallest or largestvalue, respectively.

I for a column with string or character values, computes thelexicographically smallest or largest values, respectively.

I COUNT: computes the number of non-NULL tuples in a column.I In SQL, can use COUNT (*) to count the number of tuples in a relation.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 32: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Aggregation

I Operators that summarise or aggregate the values in a single attributeof a relation.

I Operators are the same in relational algebra and SQL.

I All operators treat a relation as a bag of tuples.

I SUM: computes the sum of a column with numerical values.

I AVG: computes the average of a column with numerical values.I MIN and MAX:

I for a column with numerical values, computes the smallest or largestvalue, respectively.

I for a column with string or character values, computes thelexicographically smallest or largest values, respectively.

I COUNT: computes the number of non-NULL tuples in a column.I In SQL, can use COUNT (*) to count the number of tuples in a relation.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 33: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Grouping

I How do we answer the query “Count the number of classes and thetotal enrollment of the classes each department teaches”?

I Can we answer the query using the operators discussed so far?

I We need to group the tuples of Teach by DeptName and thenaggregate within each group.

I Use the grouping operator.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 34: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Grouping

I How do we answer the query “Count the number of classes and thetotal enrollment of the classes each department teaches”?

I Can we answer the query using the operators discussed so far?

I We need to group the tuples of Teach by DeptName and thenaggregate within each group.

I Use the grouping operator.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 35: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Grouping

I How do we answer the query “Count the number of classes and thetotal enrollment of the classes each department teaches”?

I Can we answer the query using the operators discussed so far?

I We need to group the tuples of Teach by DeptName and thenaggregate within each group.

I Use the grouping operator.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 36: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Grouping

I How do we answer the query “Count the number of classes and thetotal enrollment of the classes each department teaches”?

I Can we answer the query using the operators discussed so far?

I We need to group the tuples of Teach by DeptName and thenaggregate within each group.

I Use the grouping operator.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 37: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Example of Grouping in Relational Algebra

I How do we answer the query “Count the number of classes and totalenrollment of the classes each department teaches”?

1. Group Courses by DeptName.2. For each group, create a new attribute that stores the number of

classes taught by the department.3. For each group, create a new attribute that stores the total enrollment

of the classes taught by the department.

I γL(Courses), where L is a list containing three elements:

1. DeptName: the grouping attribute,2. COUNT(Number) → NumCourses: an aggregated attribute computing

the count of the Number attribute in each group and naming the newattribute NumCourses, and

3. SUM(Enrollment) → TotalEnrollment: an aggregated attributecomputing the total of the Enrollment attribute and naming the newattribute TotalEnrollment.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 38: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Example of Grouping in Relational Algebra

I How do we answer the query “Count the number of classes and totalenrollment of the classes each department teaches”?

1. Group Courses by DeptName.

2. For each group, create a new attribute that stores the number ofclasses taught by the department.

3. For each group, create a new attribute that stores the total enrollmentof the classes taught by the department.

I γL(Courses), where L is a list containing three elements:

1. DeptName: the grouping attribute,2. COUNT(Number) → NumCourses: an aggregated attribute computing

the count of the Number attribute in each group and naming the newattribute NumCourses, and

3. SUM(Enrollment) → TotalEnrollment: an aggregated attributecomputing the total of the Enrollment attribute and naming the newattribute TotalEnrollment.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 39: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Example of Grouping in Relational Algebra

I How do we answer the query “Count the number of classes and totalenrollment of the classes each department teaches”?

1. Group Courses by DeptName.2. For each group, create a new attribute that stores the number of

classes taught by the department.

3. For each group, create a new attribute that stores the total enrollmentof the classes taught by the department.

I γL(Courses), where L is a list containing three elements:

1. DeptName: the grouping attribute,2. COUNT(Number) → NumCourses: an aggregated attribute computing

the count of the Number attribute in each group and naming the newattribute NumCourses, and

3. SUM(Enrollment) → TotalEnrollment: an aggregated attributecomputing the total of the Enrollment attribute and naming the newattribute TotalEnrollment.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 40: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Example of Grouping in Relational Algebra

I How do we answer the query “Count the number of classes and totalenrollment of the classes each department teaches”?

1. Group Courses by DeptName.2. For each group, create a new attribute that stores the number of

classes taught by the department.3. For each group, create a new attribute that stores the total enrollment

of the classes taught by the department.

I γL(Courses), where L is a list containing three elements:

1. DeptName: the grouping attribute,2. COUNT(Number) → NumCourses: an aggregated attribute computing

the count of the Number attribute in each group and naming the newattribute NumCourses, and

3. SUM(Enrollment) → TotalEnrollment: an aggregated attributecomputing the total of the Enrollment attribute and naming the newattribute TotalEnrollment.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 41: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Example of Grouping in Relational Algebra

I How do we answer the query “Count the number of classes and totalenrollment of the classes each department teaches”?

1. Group Courses by DeptName.2. For each group, create a new attribute that stores the number of

classes taught by the department.3. For each group, create a new attribute that stores the total enrollment

of the classes taught by the department.

I γL(Courses), where L is a list containing three elements:

1. DeptName: the grouping attribute

,2. COUNT(Number) → NumCourses: an aggregated attribute computing

the count of the Number attribute in each group and naming the newattribute NumCourses, and

3. SUM(Enrollment) → TotalEnrollment: an aggregated attributecomputing the total of the Enrollment attribute and naming the newattribute TotalEnrollment.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 42: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Example of Grouping in Relational Algebra

I How do we answer the query “Count the number of classes and totalenrollment of the classes each department teaches”?

1. Group Courses by DeptName.2. For each group, create a new attribute that stores the number of

classes taught by the department.3. For each group, create a new attribute that stores the total enrollment

of the classes taught by the department.

I γL(Courses), where L is a list containing three elements:

1. DeptName: the grouping attribute,2. COUNT(Number) → NumCourses: an aggregated attribute computing

the count of the Number attribute in each group and naming the newattribute NumCourses, and

3. SUM(Enrollment) → TotalEnrollment: an aggregated attributecomputing the total of the Enrollment attribute and naming the newattribute TotalEnrollment.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 43: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Example of Grouping in Relational Algebra

I How do we answer the query “Count the number of classes and totalenrollment of the classes each department teaches”?

1. Group Courses by DeptName.2. For each group, create a new attribute that stores the number of

classes taught by the department.3. For each group, create a new attribute that stores the total enrollment

of the classes taught by the department.

I γL(Courses), where L is a list containing three elements:

1. DeptName: the grouping attribute,2. COUNT(Number) → NumCourses: an aggregated attribute computing

the count of the Number attribute in each group and naming the newattribute NumCourses, and

3. SUM(Enrollment) → TotalEnrollment: an aggregated attributecomputing the total of the Enrollment attribute and naming the newattribute TotalEnrollment.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 44: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Example of Grouping Continued

I How do we answer the query “Count the number of classes and totalenrollment of the classes each department teaches”?

I The complete operator isγDeptName,COUNT(Number)→NumCourses,SUM(Enrollment)→TotalEnrollment(Courses)

I The schema of the new relation is(DeptName, NumCourses, TotalEnrollment).

I We can group by multiple attributes.

I We can create as many new attributes as necessary.

I We can apply other operators to the result, since the groupingoperator produces a relation.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 45: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Example of Grouping Continued

I How do we answer the query “Count the number of classes and totalenrollment of the classes each department teaches”?

I The complete operator isγDeptName,COUNT(Number)→NumCourses,SUM(Enrollment)→TotalEnrollment(Courses)

I The schema of the new relation is(DeptName, NumCourses, TotalEnrollment).

I We can group by multiple attributes.

I We can create as many new attributes as necessary.

I We can apply other operators to the result, since the groupingoperator produces a relation.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 46: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Example of Grouping Continued

I How do we answer the query “Count the number of classes and totalenrollment of the classes each department teaches”?

I The complete operator isγDeptName,COUNT(Number)→NumCourses,SUM(Enrollment)→TotalEnrollment(Courses)

I The schema of the new relation is(DeptName, NumCourses, TotalEnrollment).

I We can group by multiple attributes.

I We can create as many new attributes as necessary.

I We can apply other operators to the result, since the groupingoperator produces a relation.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 47: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Grouping in SQL

I Syntax is much simpler than relational algebra.

I Use the GROUP BY clause after the WHERE clause or after the FROM, ifthere is no WHERE clause.

I List grouping attributes after GROUP BY.

I Use SELECT clause to aggregate attributes.

I How do we answer the query “Count the number of classes and totalenrollment of the classes each department teaches”?

SELECT DeptName, COUNT(Number) AS NumCourses,SUM(Enrollment) AS TotalEnrollment

FROM COURSESGROUP BY DeptName;

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 48: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Grouping in SQL

I Syntax is much simpler than relational algebra.

I Use the GROUP BY clause after the WHERE clause or after the FROM, ifthere is no WHERE clause.

I List grouping attributes after GROUP BY.

I Use SELECT clause to aggregate attributes.I How do we answer the query “Count the number of classes and total

enrollment of the classes each department teaches”?

SELECT DeptName, COUNT(Number) AS NumCourses,SUM(Enrollment) AS TotalEnrollment

FROM COURSESGROUP BY DeptName;

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 49: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Grouping in SQL

I Syntax is much simpler than relational algebra.

I Use the GROUP BY clause after the WHERE clause or after the FROM, ifthere is no WHERE clause.

I List grouping attributes after GROUP BY.

I Use SELECT clause to aggregate attributes.I How do we answer the query “Count the number of classes and total

enrollment of the classes each department teaches”?

SELECT DeptName, COUNT(Number) AS NumCourses,SUM(Enrollment) AS TotalEnrollment

FROM COURSESGROUP BY DeptName;

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 50: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Grouping in SQL

SELECT DeptName, COUNT(Number) AS NumCourses,SUM(Enrollment) AS TotalEnrollment

FROM COURSESGROUP BY DeptName;

I Aggregated attributes are evaluated on a per-group basis.

I Only attributes mentioned in the GROUP BY clause may appearunaggregated in the SELECT clause, e.g., Number must have anaggregation operator applied to it.

I There need not be any aggregated attribute in the SELECT clause.

I Read Chapter 6.4.6 of the textbook about affect of NULL values ongrouping and aggregation.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 51: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Restricting Grouping in SQL

I How do we answer the query “Count the number of classes eachdepartment teaches, restricted to departments that have totalenrollment at least 500 in their classes (the classes taught by thatdepartment)”?

I Need to introduce the HAVING clause

SELECT DeptName, COUNT(Number) AS NumCoursesFROM COURSESGROUP BY DeptNameHAVING SUM(Enrollment) >= 500;

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 52: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Restricting Grouping in SQL

I How do we answer the query “Count the number of classes eachdepartment teaches, restricted to departments that have totalenrollment at least 500 in their classes (the classes taught by thatdepartment)”?

I Need to introduce the HAVING clause

SELECT DeptName, COUNT(Number) AS NumCoursesFROM COURSESGROUP BY DeptNameHAVING SUM(Enrollment) >= 500;

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 53: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Rules for HAVING Clauses

I An aggregation in a HAVING clause applies only to the group beingtested.

I If an attribute appears unaggregated in a HAVING clause, it mustappear in the GROUP BY line.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 54: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Complete SELECT Statement

SELECT Attribute listFROM Relation listWHERE Condition or SubqueryGROUP BY Attribute listHAVING Condition or SubqueryORDER BY Attribute list;

I WHERE is evaluated before GROUP BY and HAVING.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 55: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Complete SELECT Statement

SELECT Attribute listFROM Relation listWHERE Condition or SubqueryGROUP BY Attribute listHAVING Condition or SubqueryORDER BY Attribute list;

I WHERE is evaluated before GROUP BY and HAVING.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 56: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Joins in Relational Algebra and SQL

I Cross product:

RA R × SSQL R CROSS JOIN S;

I Theta join:

RA R ./C

S

SQL R JOIN S ON C;

I Natural join:

RA R ./ SSQL R NATURAL JOIN S;

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 57: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Outer JoinsI A dangling tuple is one that fails to pair with any tuple in the other

relation in a join operation.I Outer joins allow dangling tuples to be included in the result of join

operations, by padding them with NULL values.

RA R◦./ S

SQL R NATURAL FULL OUTER JOIN S;I Contains all tuples in R ./ S .I Includes every tuple in R that is not joined with a tuple in S , after

padding a special null symbol ⊥ (NULL in case of SQL).I Same condition applied to S .

I Left outer join:

RA R◦./L S

SQL R NATURAL LEFT OUTER JOIN S;I Like R

◦./ S but ignores dangling tuples in S .

I Right outer join is analogous to left outer join.I All outerjoin operators have theta-join analogues.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 58: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Outer JoinsI A dangling tuple is one that fails to pair with any tuple in the other

relation in a join operation.I Outer joins allow dangling tuples to be included in the result of join

operations, by padding them with NULL values.

RA R◦./ S

SQL R NATURAL FULL OUTER JOIN S;I Contains all tuples in R ./ S .I Includes every tuple in R that is not joined with a tuple in S , after

padding a special null symbol ⊥ (NULL in case of SQL).I Same condition applied to S .

I Left outer join:

RA R◦./L S

SQL R NATURAL LEFT OUTER JOIN S;I Like R

◦./ S but ignores dangling tuples in S .

I Right outer join is analogous to left outer join.I All outerjoin operators have theta-join analogues.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 59: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Outer JoinsI A dangling tuple is one that fails to pair with any tuple in the other

relation in a join operation.I Outer joins allow dangling tuples to be included in the result of join

operations, by padding them with NULL values.

RA R◦./ S

SQL R NATURAL FULL OUTER JOIN S;I Contains all tuples in R ./ S .I Includes every tuple in R that is not joined with a tuple in S , after

padding a special null symbol ⊥ (NULL in case of SQL).I Same condition applied to S .

I Left outer join:

RA R◦./L S

SQL R NATURAL LEFT OUTER JOIN S;I Like R

◦./ S but ignores dangling tuples in S .

I Right outer join is analogous to left outer join.I All outerjoin operators have theta-join analogues.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra

Page 60: Extended Operators in SQL and Relational Algebracourses.cs.vt.edu/~cs4604/Fall09/lectures/lecture... · Bags Versus SetsExtended OperatorsJoins Extended Operators in SQL and Relational

Bags Versus Sets Extended Operators Joins

Outer JoinsI A dangling tuple is one that fails to pair with any tuple in the other

relation in a join operation.I Outer joins allow dangling tuples to be included in the result of join

operations, by padding them with NULL values.

RA R◦./ S

SQL R NATURAL FULL OUTER JOIN S;I Contains all tuples in R ./ S .I Includes every tuple in R that is not joined with a tuple in S , after

padding a special null symbol ⊥ (NULL in case of SQL).I Same condition applied to S .

I Left outer join:

RA R◦./L S

SQL R NATURAL LEFT OUTER JOIN S;I Like R

◦./ S but ignores dangling tuples in S .

I Right outer join is analogous to left outer join.I All outerjoin operators have theta-join analogues.

T. M. Murali September 16, 2009 Extended Operators in SQL and Relational Algebra