Chapter 6

The Relational Algebra

The SELECT Operation (1/2)

• The SELECT operation is used to select a subset of the tuples from a relation that satisfy a selection condition.

σ<selection condition>(R)

σ(DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000)(EMPLOYEE)
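As a rough illustration, the selection above can be sketched in Python by treating a relation as a list of dictionaries. The `select` helper and the sample rows are assumptions made for this sketch; they are not part of the slides.

```python
# Hypothetical sample rows; only the attribute names come from the slide.
EMPLOYEE = [
    {"SSN": "1", "DNO": 4, "SALARY": 43000},
    {"SSN": "2", "DNO": 4, "SALARY": 25000},
    {"SSN": "3", "DNO": 5, "SALARY": 38000},
    {"SSN": "4", "DNO": 5, "SALARY": 30000},
    {"SSN": "5", "DNO": 1, "SALARY": 55000},
]

def select(relation, condition):
    """Keep the tuples of `relation` for which `condition` holds."""
    return [t for t in relation if condition(t)]

# (DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000)
result = select(
    EMPLOYEE,
    lambda t: (t["DNO"] == 4 and t["SALARY"] > 25000)
    or (t["DNO"] == 5 and t["SALARY"] > 30000),
)
selected_ssns = [t["SSN"] for t in result]
```

Every tuple in `result` keeps all attributes of EMPLOYEE; only the number of tuples changes.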

The SELECT Operation (2/2)

• The degree of the relation resulting from a SELECT operation is the same as that of R.

• The number of tuples in the resulting relation is always less than or equal to the number of tuples in R.

• The fraction of tuples selected by a selection condition is referred to as the selectivity of the condition.

• The SELECT operation is commutative.

• We can always combine a cascade of SELECT operations into a single SELECT operation with a conjunctive (AND) condition.
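The properties above can be checked on a small example. This is a minimal sketch with made-up rows and a hypothetical `select` helper; it shows that the order of two selections does not matter, that a cascade equals one AND condition, and how selectivity is computed.

```python
# Hypothetical relation R with two attributes.
R = [
    {"DNO": 4, "SALARY": 20000},
    {"DNO": 4, "SALARY": 40000},
    {"DNO": 5, "SALARY": 35000},
    {"DNO": 5, "SALARY": 25000},
]

def select(relation, condition):
    """Keep the tuples of `relation` for which `condition` holds."""
    return [t for t in relation if condition(t)]

def in_dept_5(t):
    return t["DNO"] == 5

def high_paid(t):
    return t["SALARY"] > 30000

# Commutativity: either order gives the same relation.
a = select(select(R, in_dept_5), high_paid)
b = select(select(R, high_paid), in_dept_5)

# Cascade collapsed into one conjunctive (AND) condition.
c = select(R, lambda t: in_dept_5(t) and high_paid(t))

# Selectivity: fraction of tuples of R that satisfy the condition.
selectivity = len(c) / len(R)
```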

The PROJECT Operation (1/2)

• The PROJECT operation, on the other hand, selects certain columns from the table and discards the other columns.

π<attribute list>(R)

πSEX, SALARY(EMPLOYEE)

The PROJECT Operation (2/2)

• The number of tuples in a relation resulting from a PROJECT operation is always less than or equal to the number of tuples in R.

– If the projection list is a superkey of R (that is, it includes some key of R), the resulting relation has the same number of tuples as R.

• Commutativity does not hold on PROJECT.
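A minimal sketch of PROJECT with duplicate elimination follows. The `project` helper and the rows are illustrative assumptions; SSN is assumed to be the key of EMPLOYEE.

```python
# Hypothetical sample rows; SSN is assumed to be a key.
EMPLOYEE = [
    {"SSN": "1", "SEX": "M", "SALARY": 30000},
    {"SSN": "2", "SEX": "F", "SALARY": 25000},
    {"SSN": "3", "SEX": "M", "SALARY": 30000},
]

def project(relation, attrs):
    """Keep only `attrs` and drop duplicate tuples, preserving first-seen order."""
    seen, out = set(), []
    for t in relation:
        key = tuple(t[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out

# The list (SEX, SALARY) is not a superkey, so duplicates collapse.
by_sex_salary = project(EMPLOYEE, ["SEX", "SALARY"])

# The list (SSN, SALARY) includes the key, so no tuple is lost.
by_ssn_salary = project(EMPLOYEE, ["SSN", "SALARY"])
```

In this sketch, swapping two projections can fail outright: `project(project(EMPLOYEE, ["SEX"]), ["SEX", "SALARY"])` raises `KeyError` because the inner projection has already discarded SALARY.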

Sequences of Operations and the RENAME Operation (1/4)

• In general, we may want to apply several relational algebra operations one after the other.

• Either we can write the operations as a single relational algebra expression by nesting the operations, or we can apply one operation at a time and create intermediate result relations.

• In the latter case, we must name the relations that hold the intermediate results.

Sequences of Operations and the RENAME Operation (2/4)

• πFNAME, LNAME, SALARY(σDNO=5(EMPLOYEE))

• DEP5_EMPS ← σDNO=5(EMPLOYEE)

RESULT ← πFNAME, LNAME, SALARY(DEP5_EMPS)
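The two styles, one nested expression versus named intermediate relations, can be compared directly. The helpers `select` and `project` and the rows below are assumptions for this sketch.

```python
# Hypothetical sample rows.
EMPLOYEE = [
    {"FNAME": "John", "LNAME": "Smith", "SALARY": 30000, "DNO": 5},
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SALARY": 25000, "DNO": 4},
    {"FNAME": "Joyce", "LNAME": "English", "SALARY": 25000, "DNO": 5},
]

def select(relation, condition):
    return [t for t in relation if condition(t)]

def project(relation, attrs):
    out = []
    for t in relation:
        row = {a: t[a] for a in attrs}
        if row not in out:  # eliminate duplicate tuples
            out.append(row)
    return out

attrs = ["FNAME", "LNAME", "SALARY"]

# Single nested expression.
nested = project(select(EMPLOYEE, lambda t: t["DNO"] == 5), attrs)

# One operation at a time, naming the intermediate result.
DEP5_EMPS = select(EMPLOYEE, lambda t: t["DNO"] == 5)
RESULT = project(DEP5_EMPS, attrs)
```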

Sequences of Operations and the RENAME Operation (3/4)

• To rename the attributes in a relation, we simply list the new attribute names in parentheses.

• TEMP ← σDNO=5(EMPLOYEE)

R(FIRSTNAME, LASTNAME, SALARY) ← πFNAME, LNAME, SALARY(TEMP)

Sequences of Operations and the RENAME Operation (4/4)

• We can also define a RENAME operation—which can rename either the relation name, or the attribute names, or both.

• The general RENAME operation when applied to a relation R of degree n is denoted by ρS(B1, B2, ..., Bn)(R) or ρS(R) or ρ(B1, B2, ..., Bn)(R), where S is the new relation name and B1, B2, ..., Bn are the new attribute names.
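A small sketch of attribute renaming, assuming a relation is a list of dictionaries whose key order gives the attribute order. The `rename` helper is hypothetical; in this representation the relation name is simply the Python variable name.

```python
# Hypothetical intermediate relation.
TEMP = [
    {"FNAME": "John", "LNAME": "Smith", "SALARY": 30000},
    {"FNAME": "Joyce", "LNAME": "English", "SALARY": 25000},
]

def rename(relation, new_attrs):
    """Rename attributes positionally: the i-th attribute becomes new_attrs[i]."""
    return [dict(zip(new_attrs, t.values())) for t in relation]

# R(FIRSTNAME, LASTNAME, SALARY): same tuples, new attribute names.
R = rename(TEMP, ["FIRSTNAME", "LASTNAME", "SALARY"])
```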

Set Theoretic Operations (1/7)

• DEP5_EMPS ← σDNO=5(EMPLOYEE)

RESULT1 ← πSSN(DEP5_EMPS)

RESULT2(SSN) ← πSUPERSSN(DEP5_EMPS)

RESULT ← RESULT1 ∪ RESULT2

Set Theoretic Operations (2/7)

• Two relations R(A1, A2, . . ., An) and S(B1, B2, . . ., Bn) are said to be union compatible if they have the same degree n, and if dom(Ai) = dom(Bi) for 1 ≤ i ≤ n.

• This means that the two relations have the same number of attributes and that each pair of corresponding attributes have the same domain.

Set Theoretic Operations (3/7)

• UNION: The result of this operation, denoted by R ∪ S, is a relation that includes all tuples that are either in R or in S or in both R and S. Duplicate tuples are eliminated.

• INTERSECTION: The result of this operation, denoted by R ∩ S, is a relation that includes all tuples that are in both R and S.

• SET DIFFERENCE: The result of this operation, denoted by R - S, is a relation that includes all tuples that are in R but not in S.
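Python's built-in sets give a quick way to try the three operations. In this sketch a relation is a set of tuples and a schema is a list of (attribute name, domain) pairs; that representation, the `union_compatible` helper, and the SSN values are assumptions.

```python
def union_compatible(schema_r, schema_s):
    """Same degree, and matching domains position by position."""
    return len(schema_r) == len(schema_s) and all(
        dom_r == dom_s for (_, dom_r), (_, dom_s) in zip(schema_r, schema_s)
    )

# Attribute names may differ; only degree and domains matter.
RESULT1_SCHEMA = [("SSN", str)]
RESULT2_SCHEMA = [("SUPERSSN", str)]

# Hypothetical SSN values.
RESULT1 = {("123456789",), ("333445555",)}
RESULT2 = {("333445555",), ("888665555",)}

compatible = union_compatible(RESULT1_SCHEMA, RESULT2_SCHEMA)
union = RESULT1 | RESULT2          # duplicates are eliminated by the set
intersection = RESULT1 & RESULT2
difference = RESULT1 - RESULT2     # in RESULT1 but not in RESULT2
```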

Set Theoretic Operations (4/7)

• Both UNION and INTERSECTION are commutative operations.

• Both union and intersection can be treated as n-ary operations applicable to any number of relations as both are associative operations.

Set Theoretic Operations (5/7)

• Next we discuss the CARTESIAN PRODUCT operation—also known as CROSS PRODUCT or CROSS JOIN—denoted by x, which is also a binary set operation, but the relations on which it is applied do not have to be union compatible.

• This operation is used to combine tuples from two relations in a combinatorial fashion.

Set Theoretic Operations (6/7)

• In general, the result of R(A1, A2, . . ., An) x S(B1, B2, . . ., Bm) is a relation Q with n + m attributes Q(A1, A2, . . ., An, B1, B2, . . ., Bm), in that order.

• If R has nR tuples and S has nS tuples, then R x S will have nR * nS tuples.

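• Applied by itself, the operation is generally meaningless; it becomes useful when followed by a selection that matches values of attributes coming from the two component relations.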

Set Theoretic Operations (7/7)

• FEMALE_EMPS ← σSEX=’F’(EMPLOYEE)

EMPNAMES ← πFNAME, LNAME, SSN(FEMALE_EMPS)

EMP_DEPENDENTS ← EMPNAMES x DEPENDENT

ACTUAL_DEPENDENTS ← σSSN=ESSN(EMP_DEPENDENTS)

RESULT ← πFNAME, LNAME, DEPENDENT_NAME(ACTUAL_DEPENDENTS)
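The product-then-select pattern can be sketched with `itertools.product`. The rows are made up, and EMPNAMES is assumed to already hold the projected female employees.

```python
from itertools import product

# Hypothetical rows; EMPNAMES stands for the already-projected female employees.
EMPNAMES = [
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SSN": "999"},
    {"FNAME": "Joyce", "LNAME": "English", "SSN": "453"},
]
DEPENDENT = [
    {"ESSN": "999", "DEPENDENT_NAME": "Sam"},
    {"ESSN": "123", "DEPENDENT_NAME": "Alice"},
    {"ESSN": "999", "DEPENDENT_NAME": "Kim"},
]

# Cartesian product: every employee tuple paired with every dependent tuple.
EMP_DEPENDENTS = [{**e, **d} for e, d in product(EMPNAMES, DEPENDENT)]

# The selection SSN=ESSN keeps only the pairs that actually belong together.
ACTUAL_DEPENDENTS = [t for t in EMP_DEPENDENTS if t["SSN"] == t["ESSN"]]

RESULT = [(t["FNAME"], t["LNAME"], t["DEPENDENT_NAME"]) for t in ACTUAL_DEPENDENTS]
```

The product has 2 * 3 = 6 tuples, most of which pair an employee with someone else's dependent; the selection is what makes the result meaningful.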

The JOIN Operation (1/7)

• DEPT_MGR ← DEPARTMENT ⋈MGRSSN=SSN EMPLOYEE

RESULT ← πDNAME, LNAME, FNAME(DEPT_MGR)

The JOIN Operation (2/7)

• The result of the JOIN is a relation Q with n + m attributes Q(A1, A2, . . ., An, B1, B2, . . ., Bm) in that order; Q has one tuple for each combination of tuples (one from R and one from S) whenever the combination satisfies the join condition.

The JOIN Operation (3/7)

• A general join condition is of the form: <condition> AND <condition> AND . . . AND <condition> where each condition is of the form Ai θ Bj, Ai is an attribute of R, Bj is an attribute of S, Ai and Bj have the same domain, and θ (theta) is one of the comparison operators {=, <, ≤, >, ≥, ≠}.

• A JOIN operation with such a general join condition is called a THETA JOIN.

The JOIN Operation (4/7)

• The most common JOIN involves join conditions with equality comparisons only.

• Such a JOIN, where the only comparison operator used is =, is called an EQUIJOIN.
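A THETA JOIN can be sketched as a nested loop that keeps the combined tuple whenever the condition holds. The `theta_join` helper and the rows are assumptions; the first call is an EQUIJOIN, the second uses ≠ to show a non-equality comparison.

```python
def theta_join(r, s, condition):
    """Combine each pair (tr, ts) that satisfies `condition`."""
    return [{**tr, **ts} for tr in r for ts in s if condition(tr, ts)]

# Hypothetical rows.
DEPARTMENT = [
    {"DNAME": "Research", "MGRSSN": "333"},
    {"DNAME": "Headquarters", "MGRSSN": "888"},
]
EMPLOYEE = [
    {"SSN": "333", "LNAME": "Wong"},
    {"SSN": "123", "LNAME": "Smith"},
    {"SSN": "888", "LNAME": "Borg"},
]

# EQUIJOIN: the only comparison operator is =.
DEPT_MGR = theta_join(DEPARTMENT, EMPLOYEE, lambda d, e: d["MGRSSN"] == e["SSN"])

# THETA JOIN with a different comparison operator (not equal).
NOT_MANAGER_OF = theta_join(DEPARTMENT, EMPLOYEE, lambda d, e: d["MGRSSN"] != e["SSN"])
```

Note that the EQUIJOIN result carries both MGRSSN and SSN, which hold identical values in every tuple.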

The JOIN Operation (5/7)

• A new operation called NATURAL JOIN—denoted by *—was created to get rid of the second (superfluous) attribute in an EQUIJOIN condition.

• In general, NATURAL JOIN is performed by equating all attribute pairs that have the same name in the two relations.

The JOIN Operation (6/7)

• PROJ_DEPT ← PROJECT * ρ(DNAME, DNUM, MGRSSN, MGRSTARTDATE)(DEPARTMENT)

• DEPT_LOCS ← DEPARTMENT * DEPT_LOCATIONS
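A minimal sketch of NATURAL JOIN on dictionaries: tuples are combined when they agree on every attribute name they share, and the shared attribute appears only once in the result. The helper and the rows are assumptions.

```python
def natural_join(r, s):
    """Join on all attributes with the same name; keep one copy of each."""
    out = []
    for tr in r:
        for ts in s:
            common = tr.keys() & ts.keys()
            if all(tr[a] == ts[a] for a in common):
                out.append({**tr, **ts})
    return out

# Hypothetical rows; DNUMBER is the only shared attribute name.
DEPARTMENT = [
    {"DNAME": "Research", "DNUMBER": 5},
    {"DNAME": "Administration", "DNUMBER": 4},
]
DEPT_LOCATIONS = [
    {"DNUMBER": 5, "DLOCATION": "Bellaire"},
    {"DNUMBER": 5, "DLOCATION": "Houston"},
    {"DNUMBER": 4, "DLOCATION": "Stafford"},
    {"DNUMBER": 1, "DLOCATION": "Houston"},
]

DEPT_LOCS = natural_join(DEPARTMENT, DEPT_LOCATIONS)
```

When the join attributes have different names, as with DNUM in PROJECT and DNUMBER in DEPARTMENT, one side has to be renamed first, which is what the ρ in the PROJ_DEPT expression does.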

The JOIN Operation (7/7)

• In general, if R has nR tuples and S has nS tuples, the result of a JOIN operation R ⋈<join condition> S will have between zero and nR * nS tuples.

• The expected size of the join result divided by the maximum size nR * nS leads to a ratio called join selectivity, which is a property of each join condition.

A Complete Set of Relational Algebra Operations

• It has been shown that the set of relational algebra operations {σ, π, ∪, -, x} is a complete set; that is, any of the other relational algebra operations can be expressed as a sequence of operations from this set.

The DIVISION Operation (1/3)

• Retrieve the names of employees who work on all the projects that ‘John Smith’ works on.

• SMITH ← σFNAME=’John’ AND LNAME=’Smith’(EMPLOYEE)

SMITH_PNOS ← πPNO(WORKS_ON ⋈ESSN=SSN SMITH)

SSN_PNOS ← πESSN, PNO(WORKS_ON)

SSNS(SSN) ← SSN_PNOS ÷ SMITH_PNOS

RESULT ← πFNAME, LNAME(SSNS * EMPLOYEE)

The DIVISION Operation (2/3)

• In general, the DIVISION operation is applied to two relations R(Z) ÷ S(X), where X ⊆ Z.

• Let Y = Z - X (and hence Z = X ∪ Y); that is, let Y be the set of attributes of R that are not attributes of S.

• The result of DIVISION is a relation T(Y) that includes a tuple t if tuples tR appear in R with tR[Y] = t, and with tR[X] = tS for every tuple tS in S.

The DIVISION Operation (3/3)

• The DIVISION operator can be expressed as a sequence of π, x, and - operations as follows:

– T1 ← πY(R)

T2 ← πY((S x T1) - R)

T ← T1 - T2
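The three-step expansion can be traced on a tiny example. In this sketch R is a set of (Y, X) pairs, here (ESSN, PNO), and S is a set of X values; that layout, the function names, and the data are assumptions. The second function applies the definition directly so the two results can be compared.

```python
# Hypothetical data: (ESSN, PNO) pairs and the project numbers to divide by.
SSN_PNOS = {("123", 1), ("123", 2), ("453", 1), ("666", 1), ("666", 2), ("666", 3)}
SMITH_PNOS = {1, 2}

def divide(r, s):
    """Division via projection, product and difference (T1, T2, T)."""
    t1 = {y for (y, _) in r}                      # T1: all Y values in R
    missing = {(y, x) for y in t1 for x in s} - r  # (S x T1) - R
    t2 = {y for (y, _) in missing}                # T2: Y values lacking some X
    return t1 - t2                                # T: Y values lacking nothing

def divide_by_definition(r, s):
    """A Y value qualifies if it is paired in R with every X value in S."""
    return {y for (y, _) in r if all((y, x) in r for x in s)}

SSNS = divide(SSN_PNOS, SMITH_PNOS)
```

Employee 453 works only on project 1, so it ends up in T2 and is removed from the result.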

Additional Relational Operations

Aggregate Functions and Grouping (1/2)

• We can define an AGGREGATE FUNCTION operation, using the symbol ℱ (pronounced "script F"), to specify these types of requests as follows: <grouping attributes> ℱ <function list> (R) where <grouping attributes> is a list of attributes of the relation specified in R, and <function list> is a list of (<function> <attribute>) pairs.

• In each such pair, <function> is one of the allowed functions—such as SUM, AVERAGE, MAXIMUM, MINIMUM, COUNT—and <attribute> is an attribute of the relation specified by R.

• The resulting relation has the grouping attributes plus one attribute for each element in the function list.

Aggregate Functions and Grouping (2/2)

ρR(DNO, NO_OF_EMPLOYEES, AVERAGE_SAL)(DNO ℱ COUNT SSN, AVERAGE SALARY(EMPLOYEE))

• DNO ℱ COUNT SSN, AVERAGE SALARY(EMPLOYEE)

• ℱ COUNT SSN, AVERAGE SALARY(EMPLOYEE)
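Grouping and aggregation can be sketched with a dictionary of groups. The rows and the result attribute names COUNT_SSN and AVERAGE_SALARY used for the ungrouped case are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows.
EMPLOYEE = [
    {"SSN": "1", "DNO": 5, "SALARY": 30000},
    {"SSN": "2", "DNO": 5, "SALARY": 40000},
    {"SSN": "3", "DNO": 4, "SALARY": 25000},
]

# Group by DNO, then apply COUNT SSN and AVERAGE SALARY to each group.
groups = defaultdict(list)
for t in EMPLOYEE:
    groups[t["DNO"]].append(t)

R = [
    {
        "DNO": dno,
        "NO_OF_EMPLOYEES": len(ts),
        "AVERAGE_SAL": sum(t["SALARY"] for t in ts) / len(ts),
    }
    for dno, ts in sorted(groups.items())
]

# No grouping attributes: the functions apply to all tuples, giving one tuple.
overall = {
    "COUNT_SSN": len(EMPLOYEE),
    "AVERAGE_SALARY": sum(t["SALARY"] for t in EMPLOYEE) / len(EMPLOYEE),
}
```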

Recursive Closure Operations

• Another type of operation that, in general, cannot be specified in the basic relational algebra is recursive closure.

• This operation is applied to a recursive relationship between tuples of the same type.
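One way to picture recursive closure is a loop that keeps expanding until nothing new is found. The sketch below assumes a supervision relationship stored as (SSN, SUPERSSN) pairs and finds everyone supervised by a given employee at any level; the relation, the values, and the function name are illustrative assumptions, not taken from the slides.

```python
# Hypothetical (SSN, SUPERSSN) pairs: employee, direct supervisor.
SUPERVISION = {("2", "1"), ("3", "1"), ("4", "2"), ("5", "4")}

def all_supervisees(boss):
    """Everyone supervised by `boss`, directly or through any number of levels."""
    found, frontier = set(), {boss}
    while frontier:
        # One more level: employees whose supervisor is in the current frontier.
        frontier = {e for (e, s) in SUPERVISION if s in frontier} - found
        found |= frontier
    return found

under_1 = all_supervisees("1")
under_2 = all_supervisees("2")
```

Each pass of the loop corresponds to one more join of the relation with itself; the number of passes depends on the data, which is why a fixed algebra expression cannot cover every depth.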

OUTER JOIN and OUTER UNION Operations (1/5)

• A set of operations, called OUTER JOINs, can be used when we want to keep all the tuples in R, or those in S, or those in both relations in the result of the JOIN, whether or not they have matching tuples in the other relation.

OUTER JOIN and OUTER UNION Operations (2/5)

• For example, suppose that we want a list of all employee names and also the name of the departments they manage if they happen to manage a department; we can apply an operation LEFT OUTER JOIN, denoted by ⟕, to retrieve the result as follows:

– TEMP ← (EMPLOYEE ⟕SSN=MGRSSN DEPARTMENT)

RESULT ← πFNAME, MINIT, LNAME, DNAME(TEMP)

OUTER JOIN and OUTER UNION Operations (3/5)

• A similar operation, RIGHT OUTER JOIN, keeps every tuple in the second or right relation.

• A third operation, FULL OUTER JOIN, keeps all tuples in both the left and the right relations when no matching tuples are found, padding them with null values as needed.
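A LEFT OUTER JOIN can be sketched by padding unmatched left tuples with `None` standing in for null. The helper, its `s_attrs` parameter, and the rows are assumptions; a RIGHT OUTER JOIN is the same idea with the roles of the two relations swapped.

```python
def left_outer_join(r, s, condition, s_attrs):
    """Keep every tuple of r; pad with None where no tuple of s matches."""
    out = []
    for tr in r:
        matches = [{**tr, **ts} for ts in s if condition(tr, ts)]
        out.extend(matches or [{**tr, **dict.fromkeys(s_attrs)}])
    return out

# Hypothetical rows.
EMPLOYEE = [
    {"FNAME": "John", "SSN": "123"},
    {"FNAME": "Franklin", "SSN": "333"},
]
DEPARTMENT = [{"DNAME": "Research", "MGRSSN": "333"}]

TEMP = left_outer_join(
    EMPLOYEE,
    DEPARTMENT,
    lambda e, d: e["SSN"] == d["MGRSSN"],
    ["DNAME", "MGRSSN"],
)
RESULT = [(t["FNAME"], t["DNAME"]) for t in TEMP]
```

John manages no department, so his tuple is kept with a null DNAME instead of being dropped as an ordinary JOIN would do.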

OUTER JOIN and OUTER UNION Operations (4/5)

• The OUTER UNION operation was developed to take the union of tuples from two relations if the relations are not union compatible.

• This operation will take the UNION of tuples in two relations that are partially compatible, meaning that only some of their attributes are union compatible.

OUTER JOIN and OUTER UNION Operations (5/5)

• For example, an OUTER UNION can be applied to two relations whose schemas are STUDENT(Name, SSN, Department, Advisor) and FACULTY(Name, SSN, Department, Rank).

• The resulting relation schema is R(Name, SSN, Department, Advisor, Rank), and all the tuples from both relations are included in the result.

• Student tuples will have a null for the Rank attribute, whereas faculty tuples will have a null for the Advisor attribute.

• A tuple that exists in both will have values for all its attributes.
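The STUDENT and FACULTY example can be sketched as follows. Tuples are matched on the shared attributes (Name, SSN, Department), and `None` stands in for null; the helper and the rows are assumptions.

```python
def outer_union(r, s, shared, all_attrs):
    """Merge tuples that agree on `shared`; pad missing attributes with None."""
    merged = {}
    for t in r + s:
        key = tuple(t[a] for a in shared)
        row = merged.setdefault(key, dict.fromkeys(all_attrs))
        row.update(t)
    return list(merged.values())

# Hypothetical rows.
STUDENT = [
    {"Name": "Ann", "SSN": "1", "Department": "CS", "Advisor": "Lee"},
    {"Name": "Bob", "SSN": "2", "Department": "EE", "Advisor": "Kim"},
]
FACULTY = [
    {"Name": "Ann", "SSN": "1", "Department": "CS", "Rank": "Lecturer"},
    {"Name": "Cy", "SSN": "3", "Department": "CS", "Rank": "Professor"},
]

R = outer_union(
    STUDENT,
    FACULTY,
    shared=["Name", "SSN", "Department"],
    all_attrs=["Name", "SSN", "Department", "Advisor", "Rank"],
)
by_ssn = {t["SSN"]: t for t in R}
```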

Examples of Queries in Relational Algebra (1/7)

• Retrieve the name and address of all employees who work for the ‘Research’ department.

• RESEARCH_DEPT ← σDNAME=’Research’(DEPARTMENT)

RESEARCH_EMPS ← (RESEARCH_DEPT ⋈DNUMBER=DNO EMPLOYEE)

RESULT ← πFNAME, LNAME, ADDRESS(RESEARCH_EMPS)

Examples of Queries in Relational Algebra (2/7)

• For every project located in ‘Stafford’, list the project number, the controlling department number, and the department manager’s last name, address, and birthdate.

• STAFFORD_PROJS ← σPLOCATION=’Stafford’(PROJECT)

CONTR_DEPT ← (STAFFORD_PROJS ⋈DNUM=DNUMBER DEPARTMENT)

PROJ_DEPT_MGR ← (CONTR_DEPT ⋈MGRSSN=SSN EMPLOYEE)

RESULT ← πPNUMBER, DNUM, LNAME, ADDRESS, BDATE(PROJ_DEPT_MGR)

Examples of Queries in Relational Algebra (3/7)

• Find the names of employees who work on all the projects controlled by department number 5.

• DEPT5_PROJS(PNO) ← πPNUMBER(σDNUM=5(PROJECT))

EMP_PROJ(SSN, PNO) ← πESSN, PNO(WORKS_ON)

RESULT_EMP_SSNS ← EMP_PROJ ÷ DEPT5_PROJS

RESULT ← πLNAME, FNAME(RESULT_EMP_SSNS * EMPLOYEE)

Examples of Queries in Relational Algebra (4/7)

• Make a list of project numbers for projects that involve an employee whose last name is ‘Smith’, either as a worker or as a manager of the department that controls the project.

• SMITHS(ESSN) ← πSSN(σLNAME=’Smith’(EMPLOYEE))

SMITH_WORKER_PROJS ← πPNO(WORKS_ON * SMITHS)

MGRS ← πLNAME, DNUMBER(EMPLOYEE ⋈SSN=MGRSSN DEPARTMENT)

SMITH_MANAGED_DEPTS(DNUM) ← πDNUMBER(σLNAME=’Smith’(MGRS))

SMITH_MGR_PROJS(PNO) ← πPNUMBER(SMITH_MANAGED_DEPTS * PROJECT)

RESULT ← (SMITH_WORKER_PROJS ∪ SMITH_MGR_PROJS)

Examples of Queries in Relational Algebra (5/7)

• List the names of all employees with two or more dependents.

• T1(SSN, NO_OF_DEPS) ← ESSN ℱ COUNT DEPENDENT_NAME(DEPENDENT)

T2 ← σNO_OF_DEPS ≥ 2(T1)

RESULT ← πLNAME, FNAME(T2 * EMPLOYEE)

Examples of Queries in Relational Algebra (6/7)

• Retrieve the names of employees who have no dependents.

• ALL_EMPS ← πSSN(EMPLOYEE)

EMPS_WITH_DEPS(SSN) ← πESSN(DEPENDENT)

EMPS_WITHOUT_DEPS ← (ALL_EMPS - EMPS_WITH_DEPS)

RESULT ← πLNAME, FNAME(EMPS_WITHOUT_DEPS * EMPLOYEE)

Examples of Queries in Relational Algebra (7/7)

• List the names of managers who have at least one dependent.

• MGRS(SSN) ← πMGRSSN(DEPARTMENT)

EMPS_WITH_DEPS(SSN) ← πESSN(DEPENDENT)

MGRS_WITH_DEPS ← (MGRS ∩ EMPS_WITH_DEPS)

RESULT ← πLNAME, FNAME(MGRS_WITH_DEPS * EMPLOYEE)