CS 377 Database Systems - Emory Universitylxiong/cs377_f11/share/slides/06_relationa… · Unary...

CS 377 Database Systems
Relational Algebra and Calculus
Li Xiong
Department of Mathematics and Computer Science
Emory University



ER Diagram of Company Database


Relational Algebra and Relational Calculus

• The previous lecture on the relational model presented the structures and constraints of the relational model

• Relational Algebra
  • Formal foundation for relational model operations
  • Basis for implementing and optimizing queries in RDBMS
  • Basis for practical query languages such as SQL

• Relational Calculus
  • Formal declarative language for relational queries


Outline

• Relational Algebra
  • Unary Relational Operations
  • Relational Algebra Operations From Set Theory
  • Binary Relational Operations
  • Additional Relational Operations

• Relational Calculus
  • Tuple Relational Calculus
  • Domain Relational Calculus

• Coming up
  • SQL


Relational Algebra

• Relational algebra is a mathematical language with a basic set of operations for manipulating relations.

• A relational algebra operation operates on one or more relations and produces a new relation, which can be further manipulated using operations of the same algebra.

• A relational algebra expression is a sequence of relational algebra operations.


Relational Algebra Operations


Unary Relational Operations - Select

• SELECT Operation: selects a subset of the tuples from a relation that satisfy a selection condition.

• Example: To select the EMPLOYEE tuples whose department number is four or those whose salary is greater than $30,000 the following notation is used:

    σ DNO = 4 (EMPLOYEE)
    σ SALARY > 30,000 (EMPLOYEE)

• Notation: σ <selection condition>(R)

• Selection condition is a Boolean expression containing clauses in the form:

    <attribute name> <comparison op> <constant value>
    <attribute name> <comparison op> <attribute name>

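The SELECT operation can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: a relation is modeled as a list of dictionaries, the helper name `select` is hypothetical, and the sample EMPLOYEE tuples are made up.

```python
# Hypothetical sample data; only the attribute names DNO and SALARY come from the slide.
EMPLOYEE = [
    {"SSN": "111", "LNAME": "Smith", "DNO": 5, "SALARY": 30000},
    {"SSN": "222", "LNAME": "Wong", "DNO": 5, "SALARY": 40000},
    {"SSN": "333", "LNAME": "Zelaya", "DNO": 4, "SALARY": 25000},
    {"SSN": "444", "LNAME": "Wallace", "DNO": 4, "SALARY": 43000},
]


def select(relation, condition):
    """sigma <condition>(relation): keep the tuples for which condition is true."""
    return [t for t in relation if condition(t)]


# sigma DNO = 4 (EMPLOYEE)
dno_4 = select(EMPLOYEE, lambda t: t["DNO"] == 4)

# sigma SALARY > 30,000 (EMPLOYEE)
high_paid = select(EMPLOYEE, lambda t: t["SALARY"] > 30000)
```

Each result contains whole EMPLOYEE tuples, so it has the same attributes as the input relation.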

SELECT Operation Properties

• The SELECT operation σ <selection condition>(R) produces a relation S that has the same schema as R

• The SELECT operation σ is commutative; i.e.,

    σ <condition1>(σ <condition2>(R)) = σ <condition2>(σ <condition1>(R))

• Cascaded SELECT operations may be applied in any order; i.e.,

    σ <condition1>(σ <condition2>(σ <condition3>(R)))
    = σ <condition2>(σ <condition3>(σ <condition1>(R)))

• A cascaded SELECT operation may be replaced by a single selection with a conjunction of all the conditions; i.e.,

    σ <condition1>(σ <condition2>(σ <condition3>(R)))
    = σ <condition1> AND <condition2> AND <condition3>(R)
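These properties can be checked on a small example. The sketch below assumes a relation modeled as a list of dictionaries. The helper `select`, the three conditions, and the sample tuples are all hypothetical.

```python
# Hypothetical relation R with three attributes.
R = [
    {"DNO": 4, "SALARY": 25000, "SEX": "F"},
    {"DNO": 4, "SALARY": 43000, "SEX": "F"},
    {"DNO": 5, "SALARY": 40000, "SEX": "M"},
    {"DNO": 4, "SALARY": 38000, "SEX": "M"},
]


def select(relation, condition):
    """sigma <condition>(relation)."""
    return [t for t in relation if condition(t)]


def c1(t):
    return t["DNO"] == 4


def c2(t):
    return t["SALARY"] > 30000


def c3(t):
    return t["SEX"] == "F"


# sigma c1(sigma c2(sigma c3(R)))
one_order = select(select(select(R, c3), c2), c1)

# sigma c2(sigma c3(sigma c1(R)))
other_order = select(select(select(R, c1), c3), c2)

# sigma c1 AND c2 AND c3 (R)
conjunction = select(R, lambda t: c1(t) and c2(t) and c3(t))
```

All three expressions keep exactly the tuples that satisfy every condition, which is why a query optimizer is free to reorder or merge selections.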


Unary Relational Operations - Project

• PROJECT Operation: selects certain columns from the table and discards the other columns.

• Example: To list each employee’s first and last name and salary

    π LNAME, FNAME, SALARY (EMPLOYEE)

• Notation: π <attribute list>(R)

• Duplicate Elimination: the project operation removes any duplicate tuples, so the result of the project operation is a set of tuples
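A minimal Python sketch of PROJECT with duplicate elimination follows. The helper name `project` and the sample tuples are assumptions made for illustration, and a relation is again modeled as a list of dictionaries.

```python
# Hypothetical sample data; attribute names follow the slide's EMPLOYEE example.
EMPLOYEE = [
    {"SSN": "111", "FNAME": "John", "LNAME": "Smith", "SALARY": 30000},
    {"SSN": "222", "FNAME": "Joyce", "LNAME": "English", "SALARY": 25000},
    {"SSN": "333", "FNAME": "Ahmad", "LNAME": "Jabbar", "SALARY": 25000},
]


def project(relation, attributes):
    """pi <attributes>(relation): keep the listed columns and drop duplicate tuples."""
    seen = set()
    result = []
    for t in relation:
        values = tuple(t[a] for a in attributes)
        if values not in seen:  # duplicate elimination
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result


# pi LNAME, FNAME, SALARY (EMPLOYEE)
names = project(EMPLOYEE, ["LNAME", "FNAME", "SALARY"])

# pi SALARY (EMPLOYEE): two employees share a salary, so one duplicate is removed
salaries = project(EMPLOYEE, ["SALARY"])
```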


PROJECT Operation Properties

• The number of tuples in π <list>(R) is always less than or equal to the number of tuples in R

• If the list of attributes includes a key of R, then the number of tuples is equal to the number of tuples in R

• π <list1>(π <list2>(R)) = π <list1>(R) as long as <list2> contains the attributes in <list1>
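The three properties can be observed on a small relation. In this sketch the helper `project`, the relation `R`, and the assumption that SSN is its key are all hypothetical.

```python
# Hypothetical relation; SSN is assumed to be the key.
R = [
    {"SSN": "111", "DNO": 5, "SALARY": 30000},
    {"SSN": "222", "DNO": 5, "SALARY": 30000},
    {"SSN": "333", "DNO": 4, "SALARY": 25000},
]


def project(relation, attributes):
    """pi <attributes>(relation) with duplicate elimination."""
    seen = set()
    result = []
    for t in relation:
        values = tuple(t[a] for a in attributes)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result


without_key = project(R, ["DNO", "SALARY"])  # duplicates collapse
with_key = project(R, ["SSN", "DNO"])        # key keeps every tuple distinct

# pi DNO (pi DNO, SALARY (R)) compared with pi DNO (R)
nested = project(project(R, ["DNO", "SALARY"]), ["DNO"])
direct = project(R, ["DNO"])
```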


Sequences of Operations and the RENAME Operation

• In-line expression:

• Sequence of operations:

• Rename attributes in intermediate results

• RENAME operation

• Examples

  • ρ DEPT5_EMPS (σ DNO = 5 (EMPLOYEE))


Relational Algebra Operations From Set Theory

• UNION Operation: denoted by R ∪ S, is a relation that includes all tuples that are either in R or in S or in both R and S.
  • Duplicate tuples are eliminated.

• INTERSECTION Operation: denoted by R ∩ S, is a relation that includes all tuples that are in both R and S.

• Set Difference (or MINUS) Operation: denoted by R - S, is a relation that includes all tuples that are in R but not in S.

• Type Compatibility: The two operands must be “type compatible”.
  • The operands R(A1, A2, ..., An) and S(B1, B2, ..., Bn) must have the same number of attributes, and the domains of corresponding attributes must be compatible; that is, dom(Ai) = dom(Bi) for i = 1, 2, ..., n.
  • The resulting relation for R ∪ S, R ∩ S, or R - S has the same attribute names as R (by convention).
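The set operations and the type-compatibility check can be sketched as follows. The `Relation` class, the use of Python types as stand-ins for attribute domains, and the STUDENT and INSTRUCTOR sample data are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    attrs: tuple       # attribute names A1..An
    domains: tuple     # Python types standing in for dom(A1)..dom(An)
    tuples: frozenset  # set of value tuples, so duplicates cannot occur


def check_type_compatible(r, s):
    """Same number of attributes and pairwise equal domains."""
    if len(r.attrs) != len(s.attrs) or r.domains != s.domains:
        raise ValueError("operands are not type compatible")


def union(r, s):
    check_type_compatible(r, s)
    return Relation(r.attrs, r.domains, r.tuples | s.tuples)  # attribute names of R


def intersection(r, s):
    check_type_compatible(r, s)
    return Relation(r.attrs, r.domains, r.tuples & s.tuples)


def difference(r, s):
    check_type_compatible(r, s)
    return Relation(r.attrs, r.domains, r.tuples - s.tuples)


# Hypothetical sample relations with different attribute names but equal domains.
STUDENT = Relation(("FN", "LN"), (str, str),
                   frozenset({("Susan", "Yao"), ("Ramesh", "Shah")}))
INSTRUCTOR = Relation(("FNAME", "LNAME"), (str, str),
                      frozenset({("John", "Smith"), ("Susan", "Yao")}))

everyone = union(STUDENT, INSTRUCTOR)
both = intersection(STUDENT, INSTRUCTOR)
only_students = difference(STUDENT, INSTRUCTOR)
only_instructors = difference(INSTRUCTOR, STUDENT)
```

Here `only_students` and `only_instructors` differ, which shows on concrete data that the minus operation depends on the order of its operands.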


Relational Algebra Operations From Set Theory - Properties

• Notice that both union and intersection are commutative operations; that is

    R ∪ S = S ∪ R, and R ∩ S = S ∩ R

• Both union and intersection can be treated as n-ary operations applicable to any number of relations as both are associative operations; that is

    R ∪ (S ∪ T) = (R ∪ S) ∪ T, and (R ∩ S) ∩ T = R ∩ (S ∩ T)

• The minus operation is not commutative; that is, in general

    R - S ≠ S - R


Relational Algebra Operations From Set Theory – Cartesian Product

• CARTESIAN (or cross product) Operation: combines tuples from two relations. In general, the result of R(A1, A2, ..., An) x S(B1, B2, ..., Bm) is a relation Q of degree n + m with attributes Q(A1, A2, ..., An, B1, B2, ..., Bm), in that order. The resulting relation Q has one tuple for each combination of tuples, one from R and one from S.
  • Hence, if R has nR tuples (denoted as |R| = nR), and S has nS tuples, then R x S will have nR * nS tuples; that is, |R x S| = nR * nS.

• The two operands do NOT have to be "type compatible"

• Example:

    FEMALE_EMPS ← σ SEX=’F’(EMPLOYEE)
    EMPNAMES ← π FNAME, LNAME, SSN (FEMALE_EMPS)
    EMP_DEPENDENTS ← EMPNAMES x DEPENDENT
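The last step of the example, the Cartesian product itself, can be sketched in Python. The sample tuples and the helper name `cartesian_product` are hypothetical, and the sketch assumes the two relations have no attribute names in common.

```python
from itertools import product

# Hypothetical contents for EMPNAMES and DEPENDENT.
EMPNAMES = [
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SSN": "999"},
    {"FNAME": "Jennifer", "LNAME": "Wallace", "SSN": "987"},
]
DEPENDENT = [
    {"ESSN": "333", "DEPENDENT_NAME": "Alice"},
    {"ESSN": "987", "DEPENDENT_NAME": "Abner"},
    {"ESSN": "123", "DEPENDENT_NAME": "Michael"},
]


def cartesian_product(r, s):
    """R x S: one combined tuple for every pair (tuple of R, tuple of S)."""
    return [{**tr, **ts} for tr, ts in product(r, s)]


EMP_DEPENDENTS = cartesian_product(EMPNAMES, DEPENDENT)
```

With 2 tuples in EMPNAMES and 3 in DEPENDENT the result has 2 * 3 = 6 tuples of degree 3 + 2 = 5, most of which pair an employee with someone else's dependent.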


Binary Relational Operations - Join

• JOIN Operation: a Cartesian product followed by a select

• Notation: R ⋈ <join condition> S

  where R and S can be any relations that result from general relational algebra expressions.
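Read literally, this definition gives a direct (if inefficient) way to compute a join: build the Cartesian product, then filter it. The sketch below assumes that reading. The helper name `theta_join`, the sample tuples, and the join condition SSN = ESSN applied to the EMPNAMES and DEPENDENT relations are illustrative assumptions.

```python
from itertools import product

# Hypothetical contents for EMPNAMES and DEPENDENT.
EMPNAMES = [
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SSN": "999"},
    {"FNAME": "Jennifer", "LNAME": "Wallace", "SSN": "987"},
]
DEPENDENT = [
    {"ESSN": "333", "DEPENDENT_NAME": "Alice"},
    {"ESSN": "987", "DEPENDENT_NAME": "Abner"},
    {"ESSN": "123", "DEPENDENT_NAME": "Michael"},
]


def theta_join(r, s, condition):
    """R join S on condition, computed as a select over the Cartesian product."""
    combined = ({**tr, **ts} for tr, ts in product(r, s))  # R x S
    return [t for t in combined if condition(t)]           # sigma <condition>


ACTUAL_DEPENDENTS = theta_join(EMPNAMES, DEPENDENT, lambda t: t["SSN"] == t["ESSN"])
```

Only the combinations in which the dependent really belongs to the employee survive the selection. A real DBMS would normally avoid materializing the full product.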


ACTUAL_DEPENDENTS ← EMPNAMES ⋈ SSN=ESSN DEPENDENT


Retrieve the department and the manager’s information:

DEPT_MGR ← DEPARTMENT ⋈ MGRSSN=SSN EMPLOYEE


Variations of Join

• EQUIJOIN Operation
  • Involves join conditions with equality comparisons only.
  • The result always has one or more pairs of attributes (whose names need not be identical) that have identical values in every tuple.

• NATURAL JOIN Operation (denoted by *)
  • Gets rid of the second (superfluous) attribute in an EQUIJOIN condition.
  • Requires that the two join attributes, or each pair of corresponding join attributes, have the same name in both relations.
  • If this is not the case, a renaming operation is applied first.
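A natural join, including the renaming step, might be sketched like this. The helper names `natural_join` and `rename`, the attribute names, and the sample tuples are hypothetical. Relations are lists of dictionaries.

```python
def rename(relation, mapping):
    """Rename attributes according to mapping (old name -> new name)."""
    return [{mapping.get(a, a): v for a, v in t.items()} for t in relation]


def natural_join(r, s):
    """R * S: equijoin on all attributes with the same name, keeping one copy of each."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    return [
        {**tr, **ts}  # shared attributes hold equal values, so merging keeps a single copy
        for tr in r
        for ts in s
        if all(tr[a] == ts[a] for a in common)
    ]


# Hypothetical sample data.
PROJECT = [
    {"PNAME": "ProductX", "PNUMBER": 1, "DNUM": 5},
    {"PNAME": "Computerization", "PNUMBER": 10, "DNUM": 4},
]
DEPARTMENT = [
    {"DNAME": "Research", "DNUMBER": 5},
    {"DNAME": "Administration", "DNUMBER": 4},
]

# The join attributes are named differently, so DNUMBER is renamed to DNUM first.
PROJ_DEPT = natural_join(PROJECT, rename(DEPARTMENT, {"DNUMBER": "DNUM"}))
```

Note that if the two relations shared no attribute name at all, this sketch would degenerate to a Cartesian product, because the equality test over an empty list of attributes is always true.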


Proj_Dept <- Project * Department

Dept_Locs <- Department * Dept_Locations


Additional Relational Operations – Outer Join

• In NATURAL JOIN, tuples without a matching (or related) tuple are eliminated from the join result. Tuples with null in the join attributes are also eliminated.

• Outer joins can be used when we want to keep all the tuples in R or S, regardless of whether or not they have matching tuples in the other relation.

• The left outer join operation keeps every tuple in the first or left relation R in R ⟕ S; if no matching tuple is found in S, then the attributes of S in the join result are filled or “padded” with null values.

• The right outer join keeps every tuple in the second or right relation S in the result of R ⟖ S.

• A third operation, full outer join, denoted by R ⟗ S, keeps all tuples in both the left and the right relations.
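The padding behavior of a left outer join can be sketched as follows, using Python's `None` in place of null. The helper name `left_outer_join`, its `s_attrs` parameter, the sample tuples, and the join condition SSN = MGR_SSN are assumptions for illustration.

```python
def left_outer_join(r, s, condition, s_attrs):
    """Keep every tuple of R; pad the attributes of S with None when nothing matches."""
    result = []
    for tr in r:
        matches = [{**tr, **ts} for ts in s if condition(tr, ts)]
        if matches:
            result.extend(matches)
        else:
            result.append({**tr, **{a: None for a in s_attrs}})
    return result


# Hypothetical sample data: only Wong manages a department.
EMPLOYEE = [
    {"LNAME": "Smith", "SSN": "123"},
    {"LNAME": "Wong", "SSN": "333"},
]
DEPARTMENT = [
    {"DNAME": "Research", "MGR_SSN": "333"},
]

result = left_outer_join(
    EMPLOYEE,
    DEPARTMENT,
    lambda e, d: e["SSN"] == d["MGR_SSN"],
    s_attrs=["DNAME", "MGR_SSN"],
)
```

A right outer join can be obtained from the same helper by swapping the operands, and a full outer join by combining the results of both.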


Outer Join Example

EMPLOYEE left outer join (SSN=Mgr_SSN) DEPARTMENT


Binary Relational Operations - Division

• DIVISION Operation: R(Z) ÷ S(X), where X ⊆ Z.

• Example: retrieve the SSN of employees who work on all the projects that ‘John Smith’ is working on

• Let Y = Z - X (and hence Z = X ∪ Y). The result of DIVISION is a relation T(Y) that includes a tuple t if tuples tR appear in R with tR[Y] = t, and with tR[X] = tS for every tuple tS in S.

• For a tuple t to appear in the result T of the DIVISION, the values in t must appear in R in combination with every tuple in S.
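The definition translates into a short Python sketch: group the tuples of R by their Y values and keep each group whose X values cover all of S. The helper name `divide`, the relation names, and the sample tuples are hypothetical.

```python
def divide(r, s, x_attrs):
    """R(Z) / S(X): the Y = Z - X values that occur in R together with every tuple of S."""
    s_values = {tuple(ts[a] for a in x_attrs) for ts in s}
    y_attrs = [a for a in r[0] if a not in x_attrs] if r else []
    x_values_by_y = {}
    for tr in r:
        y = tuple(tr[a] for a in y_attrs)
        x_values_by_y.setdefault(y, set()).add(tuple(tr[a] for a in x_attrs))
    # Keep y only if its set of X values contains every tuple of S.
    return [dict(zip(y_attrs, y)) for y, xs in x_values_by_y.items() if s_values <= xs]


# Hypothetical data: which employee (ESSN) works on which project (PNO).
SSN_PNOS = [
    {"ESSN": "123", "PNO": 1}, {"ESSN": "123", "PNO": 2},
    {"ESSN": "453", "PNO": 1}, {"ESSN": "453", "PNO": 2},
    {"ESSN": "333", "PNO": 2}, {"ESSN": "333", "PNO": 3},
]
# Hypothetical set of projects that John Smith works on.
SMITH_PNOS = [{"PNO": 1}, {"PNO": 2}]

SSNS = divide(SSN_PNOS, SMITH_PNOS, ["PNO"])
```

Employee 333 is excluded because project 1 is missing from that employee's tuples, even though 333 works on other projects.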
