08 Relational Algebra


CS 338: Computer Applications in Business: Databases (Fall 2014)

©1992-2014 by Addison Wesley & Pearson Education, Inc., McGraw Hill, Cengage Learning. Slides adapted and modified from Fundamentals of Database Systems (5/6) (Elmasri et al.), Database System Concepts (5/6) (Silberschatz et al.), Database Systems (Coronel et al.), Database Systems (4/5) (Connolly et al.)

CS 338: Computer Applications in Business: Databases

Relational Algebra


Rice University Data Center

Fall 2014

Chapter 6

Formal Languages for Relational Model → Relational Algebra

In previous lectures, we studied SQL (a practical language for the relational model)

• Relational algebra and relational calculus were developed before the SQL language
• SQL is based on concepts from both the algebra and the calculus
• Relational algebra
  • Basic set of operations for the relational model
  • These operations enable a user to specify basic retrieval requests as relational algebra expressions (sequences of relational algebra operations)
• Relational calculus
  • Higher-level declarative language for specifying relational queries


Formal Languages for Relational Model → Relational Algebra

There are six basic operators in relational algebra:
• select σ
• project π
• union ∪
• set difference -
• Cartesian product ×
• rename ρ

• Two types of relational operations
  • Unary: operate on one relation
  • Binary: operate on two relations by combining related tuples (rows)
• The operators take one or two relations as inputs and produce a new relation as a result

SELECT Operation

• Used to choose the subset of tuples from a relation that satisfy a selection condition
• Different from the SELECT clause of SQL
• SELECT is simply a filter that keeps only those tuples that satisfy a qualifying condition

Notation: σ<predicate>(R)
• σ (sigma) denotes the SELECT operator
• <predicate> is the selection condition
• R is the argument relation


SELECT Operation

• The predicate or selection condition consists of terms or Boolean expressions connected by the Boolean operators ∧ (and), ∨ (or), ¬ (not)
• Each term or Boolean expression is made up of a number of clauses of the form
    <attribute name> <comparison operator> <constant value or attribute name>
• <selection condition> is applied independently to each individual tuple t in R
  • If the condition evaluates to TRUE, the tuple is selected
  • All selected tuples appear in the result of the SELECT operation
• The SELECT operation is commutative
  • A sequence of SELECT operations can be applied in any order
• The fraction of tuples selected by a selection condition is referred to as the selectivity of the condition

SELECT Operation → Examples

Example 1: σ (A=B ∧ D>5) (R)

R
A  B  C   D
x  x  1   7
x  y  5   7
y  y  12  3
y  y  23  10

σ (A=B ∧ D>5) (R)
A  B  C   D
x  x  1   7
y  y  23  10

Example 2: σ (Dno=4 ∧ Salary>25000) ∨ (Dno=5 ∧ Salary>30000) (EMPLOYEE)

corresponds to the following SQL query:
SELECT *
FROM EMPLOYEE
WHERE (Dno = 4 AND Salary > 25000) OR (Dno = 5 AND Salary > 30000)
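As a rough illustration of Example 1 above, the sketch below models a relation as a Python list of dictionaries and SELECT as a filter over it. The `select` helper and the dictionary representation are assumptions made for this example only; they are not part of the slides.

```python
# Relation R from Example 1, one dict per tuple (attribute name -> value).
R = [
    {"A": "x", "B": "x", "C": 1, "D": 7},
    {"A": "x", "B": "y", "C": 5, "D": 7},
    {"A": "y", "B": "y", "C": 12, "D": 3},
    {"A": "y", "B": "y", "C": 23, "D": 10},
]


def select(relation, predicate):
    """sigma: keep the tuples for which the predicate evaluates to True."""
    return [t for t in relation if predicate(t)]


# sigma (A=B AND D>5) (R)
result = select(R, lambda t: t["A"] == t["B"] and t["D"] > 5)

# Commutativity: applying the two conditions in either order keeps the same tuples.
a_then_d = select(select(R, lambda t: t["A"] == t["B"]), lambda t: t["D"] > 5)
d_then_a = select(select(R, lambda t: t["D"] > 5), lambda t: t["A"] == t["B"])

# Selectivity: fraction of the tuples of R that satisfy the condition.
selectivity = len(result) / len(R)
```

Here two of the four tuples qualify, so the selectivity of the condition on this instance of R is 0.5.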


PROJECT Operation

• The SELECT operation chooses certain rows and discards the other rows
• The PROJECT operation, on the other hand, chooses certain columns from the table and discards the other columns
  • If we are interested in only certain attributes of a relation, we use PROJECT
  • No duplicates: the result of the PROJECT operation is a set of distinct tuples

Notation: π<attribute list>(R)
• π (pi) denotes the PROJECT operator
• degree of the result = number of attributes in <attribute list>
• R is the argument relation

PROJECT Operation → Example

Example 1: π A,C (R)

R
A  B   C
x  10  1
x  20  1
y  30  1
y  40  2

π A,C (R) (duplicates removed)
A  C
x  1
y  1
y  2

Note: Duplicates are removed in relational algebra but not in SQL statements, unless DISTINCT is specified
SQL: SELECT DISTINCT A, C FROM R
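The duplicate removal in Example 1 can be sketched in Python as follows. The `project` helper and the list-of-dictionaries representation are illustrative assumptions, not something the slides define.

```python
# Relation R from Example 1.
R = [
    {"A": "x", "B": 10, "C": 1},
    {"A": "x", "B": 20, "C": 1},
    {"A": "y", "B": 30, "C": 1},
    {"A": "y", "B": 40, "C": 2},
]


def project(relation, attributes):
    """pi: keep only the listed attributes and drop duplicate tuples."""
    seen = set()
    result = []
    for t in relation:
        key = tuple(t[a] for a in attributes)
        if key not in seen:  # set semantics: each tuple appears once
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


# pi A,C (R)
projected = project(R, ["A", "C"])
```

The first two tuples of R both project to (x, 1), so the result has three tuples rather than four, and its degree is 2 because the attribute list names two attributes.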

RENAME Operation

• In general, several relational algebra operations are applied for most queries
• We can write these queries in two ways:
  • Write the operations as a single relational algebra expression by nesting the operations
  • Apply one operation at a time and create intermediate result relations (we need to give names to the relations that hold intermediate results)

Give a name to each intermediate relation

RENAME Operation

• The RENAME operation renames relations and the attributes in intermediate results
• ρ (rho) denotes the RENAME operator

Notation:
• ρ <new relation name> (R): renames the relation only
• ρ (<new attribute names>) (R): renames the attributes only
• ρ <new relation name>(<new attribute names>) (R): renames both the relation and the attributes

where R is the argument relation

SQL: SELECT DISTINCT Fname, Lname, Salary FROM EMPLOYEE WHERE Dno = 5

RENAME Operation → Example

Example 1

R
A  B  C
x  a  x
x  a  z
x  a  z

ρ S (R)
S
A  B  C
x  a  x
x  a  z
x  a  z

ρ S(D, E, F) (R)
S
D  E  F
x  a  x
x  a  z
x  a  z

ρ (D, E, F) (S) renames only the attributes of S to D, E, F
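A minimal sketch of the attribute-renaming form, assuming relations are stored as lists of dictionaries whose key order follows the attribute order. The `rename` helper is hypothetical; renaming the relation itself corresponds here to simply binding the result to a new variable name.

```python
# Relation R from the RENAME example, with the rows as shown on the slide.
R = [
    {"A": "x", "B": "a", "C": "x"},
    {"A": "x", "B": "a", "C": "z"},
    {"A": "x", "B": "a", "C": "z"},
]


def rename(relation, new_attributes):
    """rho (attributes only): replace the attribute names positionally."""
    return [dict(zip(new_attributes, t.values())) for t in relation]


# rho S(D, E, F) (R): new attribute names, result bound to the new name S.
S = rename(R, ["D", "E", "F"])
```

The values are untouched; only the names under which they are addressed change.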

Summary → SELECT, PROJECT, and RENAME

SELECT
• SQL: SELECT * FROM EMPLOYEE WHERE Dno = 4 AND Salary > 25000
• Algebra: σ (Dno=4 ∧ Salary>25000) (EMPLOYEE)

PROJECT
• SQL: SELECT DISTINCT Fname, Salary FROM EMPLOYEE
• Algebra: π Fname, Salary (EMPLOYEE)

RENAME
• SQL: SELECT E.Fname AS First_name, E.Salary AS ESalary FROM EMPLOYEE AS E WHERE E.Dno = 5
• Algebra: ρ E(First_name, ESalary) (π Fname, Salary (σ Dno=5 (EMPLOYEE)))

SQL equivalents of the RENAME forms:
• Relation only: SELECT * FROM R AS S
• Attributes only: SELECT A AS D, B AS E, C AS F FROM S

UNION Operation

• Denoted by: R ∪ S
• Purpose: The result of this operation includes all tuples that are in R, in S, or in both R and S
  • Duplicate tuples are eliminated

Example 1: To find all course IDs for courses taught in the Fall 2009 semester, or in the Spring 2010 semester, or in both:
π course_id (σ semester="Fall" ∧ year=2009 (section)) ∪ π course_id (σ semester="Spring" ∧ year=2010 (section))

UNION Operation

Example 2: R ∪ S

R
A  B
x  1
x  2
y  1

S
A  B
x  2
y  3

R ∪ S (no duplicates)
A  B
x  1
x  2
y  1
y  3
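Both union examples can be sketched with Python sets, which eliminate duplicates automatically. The rows of `section` below are hypothetical sample data invented for illustration; only the attribute names course_id, semester, and year come from Example 1.

```python
# Example 2: relations over the same attributes (A, B), stored as sets of tuples.
R = {("x", 1), ("x", 2), ("y", 1)}
S = {("x", 2), ("y", 3)}
r_union_s = R | S  # ("x", 2) is in both inputs but appears once in the result

# Example 1: hypothetical sample of section as (course_id, semester, year).
section = [
    ("CS-101", "Fall", 2009),
    ("CS-315", "Spring", 2010),
    ("CS-101", "Spring", 2010),
    ("PHY-101", "Fall", 2008),
]

# pi course_id (sigma semester="Fall" AND year=2009 (section))
fall_2009 = {cid for (cid, sem, year) in section if sem == "Fall" and year == 2009}
# pi course_id (sigma semester="Spring" AND year=2010 (section))
spring_2010 = {cid for (cid, sem, year) in section if sem == "Spring" and year == 2010}

# A course taught in both semesters is listed only once.
either = fall_2009 | spring_2010
```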


INTERSECTION and SET DIFFERENCE Operations

• INTERSECTION
  • Denoted by R ∩ S
  • Purpose: The result of this operation includes all tuples that are in both R and S
  • Intersection is:
    • Commutative: R ∩ S = S ∩ R
    • Associative: R ∩ (S ∩ W) = (R ∩ S) ∩ W
• SET DIFFERENCE (or MINUS)
  • Denoted by R - S
  • Purpose: The result of this operation includes all tuples that are in R but not in S
• Intersection can be expressed using set difference: R ∩ S = R - (R - S)

INTERSECTION Operation → Example

Example: R ∩ S

R
A  B
x  1
x  2
y  1

S
A  B
x  2
y  3

R ∩ S
A  B
x  2


SET DIFFERENCE Operation → Example

Example: R - S

R
A  B
x  1
x  2
y  1

S
A  B
x  2
y  3

R - S
A  B
x  1
y  1
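The intersection and set difference examples, together with the identity R ∩ S = R - (R - S), can be checked with Python's built-in set operators. Storing each relation as a set of (A, B) tuples is an assumption made for this sketch.

```python
# R and S from the examples above, as sets of (A, B) tuples.
R = {("x", 1), ("x", 2), ("y", 1)}
S = {("x", 2), ("y", 3)}

intersection = R & S  # tuples that are in both R and S
difference = R - S    # tuples that are in R but not in S

# Intersection rewritten with set difference only: R - (R - S)
via_difference = R - (R - S)

# Intersection is commutative; set difference in general is not.
commutes = (R & S) == (S & R)
difference_commutes = (R - S) == (S - R)
```

This is why intersection is not counted among the six basic operators: it adds convenience but no expressive power beyond set difference.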

DIVISION Operation

• Denoted by: R ÷ S
• Purpose: Suited for queries that include the phrase "for all"

Example 1: Let TABLE1 have two columns (A and B) and TABLE2 have one column (B):
• TABLE1 ÷ TABLE2 contains all A values such that for every B value in TABLE2, there is an (A, B) tuple in TABLE1


DIVISION Operation → Example 1

Example: R ÷ S

R
A  B
x  1
x  2
x  3
y  1
z  1
m  1
m  3
m  4
n  6
n  1
y  2

S
B
1
2

R ÷ S
A
x
y

DIVISION Operation → Example 2

Example: R ÷ S

R
A  B  C  D  E
x  a  x  a  1
x  a  z  a  1
x  a  z  b  1
y  a  z  a  1
y  a  z  b  3
z  a  z  a  1
z  a  z  b  1
z  a  y  b  1

S
D  E
a  1
b  1

R ÷ S
A  B  C
x  a  z
z  a  z
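The "for all" reading of division can be sketched directly for Example 1. The `divide` helper below is a hypothetical illustration for a two-column R and a one-column S. The second computation uses the standard rewriting of division with basic operators, π A (R) - π A ((π A (R) × S) - R), which is not shown on the slides.

```python
def divide(r, s):
    """R ÷ S for R holding (a, b) tuples and S holding b values.

    Returns every a such that (a, b) is in r for every b in s.
    """
    candidates = {a for (a, _) in r}
    return {a for a in candidates if all((a, b) in r for b in s)}


# R and S from Example 1.
R = {
    ("x", 1), ("x", 2), ("x", 3), ("y", 1), ("z", 1), ("m", 1),
    ("m", 3), ("m", 4), ("n", 6), ("n", 1), ("y", 2),
}
S = {1, 2}

quotient = divide(R, S)

# Same result from projection, Cartesian product, and set difference only.
all_a = {a for (a, _) in R}                       # pi A (R)
missing = {(a, b) for a in all_a for b in S} - R  # required pairs that R lacks
via_basic_ops = all_a - {a for (a, _) in missing}
```

Only x and y are paired with both 1 and 2 in R, so they are the only values in the quotient.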


Assignment Operation

• Denoted by ←
• Purpose: Provides a convenient way to express complex queries
  • Write the query as a sequential program consisting of
    • a series of assignments
    • followed by an expression whose value is displayed as the result of the query
  • Assignment must always be made to a temporary relation variable
• The result to the right of the ← is assigned to the relation variable on the left of the ←

Assignment Operation → Example 1

Example 1: Retrieve a list of all female employees in the EMPLOYEE table

FEMALE_EMPS ← σ Sex='F' (EMPLOYEE)
EMPNAMES ← π Fname, Lname, Ssn (FEMALE_EMPS)

• The result to the right of the ← is assigned to the relation variable on the left of the ←


Assignment Operation → Example 2

Example: Result1 ← π A,C (R)

R
A  B   C
x  10  1
x  20  1
y  30  1
y  40  2

Result1
A  C
x  1
y  1
y  2
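Assignment maps naturally onto ordinary variable assignment. The sketch below follows the FEMALE_EMPS / EMPNAMES example step by step and then writes the same query as one nested expression. The EMPLOYEE rows are hypothetical sample data invented for illustration; only the attribute names come from the example.

```python
# Hypothetical EMPLOYEE tuples, restricted to the attributes the query uses.
EMPLOYEE = [
    {"Fname": "Ann", "Lname": "Lee", "Ssn": "111", "Sex": "F"},
    {"Fname": "Bob", "Lname": "Ray", "Ssn": "222", "Sex": "M"},
    {"Fname": "Eve", "Lname": "Kim", "Ssn": "333", "Sex": "F"},
]

# FEMALE_EMPS <- sigma Sex='F' (EMPLOYEE)
FEMALE_EMPS = [t for t in EMPLOYEE if t["Sex"] == "F"]

# EMPNAMES <- pi Fname, Lname, Ssn (FEMALE_EMPS)
# Ssn is unique per tuple here, so the projection cannot create duplicates.
EMPNAMES = [{a: t[a] for a in ("Fname", "Lname", "Ssn")} for t in FEMALE_EMPS]

# The same query as a single nested expression, without intermediate relations.
nested = [
    {a: t[a] for a in ("Fname", "Lname", "Ssn")}
    for t in EMPLOYEE
    if t["Sex"] == "F"
]
```

Both forms return the same tuples; the step-by-step form simply names the intermediate result.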

Cartesian Product (Cross Product) Operation

• Denoted by R × S
• Purpose: Combines every member (tuple) from one relation (set) with every member (tuple) from the other relation (set)
• Also known as Cross Product or Cross Join
• This is also a binary set operation, but the relations on which it is applied do not have to be union compatible


Cartesian Product Operation → Examples

Example 1: R × S

R
A  B
x  1
y  2

S
C  D   E
x  10  a
y  10  a
y  20  b
z  10  b

R × S
A  B  C  D   E
x  1  x  10  a
x  1  y  10  a
x  1  y  20  b
x  1  z  10  b
y  2  x  10  a
y  2  y  10  a
y  2  y  20  b
y  2  z  10  b

Example 2: Course × TA

Course
dept   cnum  instructor  term
CS     338   Jones       Spring
CS     330   Smith       Winter
STATS  330   Wong        Winter

TA
name    major
Ashley  CS
Lee     STATS

Course × TA
dept   cnum  instructor  term    name    major
CS     338   Jones       Spring  Ashley  CS
CS     330   Smith       Winter  Ashley  CS
STATS  330   Wong        Winter  Ashley  CS
CS     338   Jones       Spring  Lee     STATS
CS     330   Smith       Winter  Lee     STATS
STATS  330   Wong        Winter  Lee     STATS

Characteristics of Cartesian Product

• Degree
  • degree(R × S) = degree(R) + degree(S)
• Cardinality
  • cardinality(R × S) = cardinality(R) × cardinality(S)
• The Cartesian product returns all possible combinations of tuples
  • It is generally not useful unless it is followed by a SELECT operation
  • A Cartesian product followed by a SELECT is called a JOIN
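The degree and cardinality rules can be checked on the Course × TA example with `itertools.product` from the Python standard library. Storing relations as lists of tuples is an assumption made for this sketch.

```python
from itertools import product

# Course(dept, cnum, instructor, term) and TA(name, major) from Example 2.
course = [
    ("CS", 338, "Jones", "Spring"),
    ("CS", 330, "Smith", "Winter"),
    ("STATS", 330, "Wong", "Winter"),
]
ta = [("Ashley", "CS"), ("Lee", "STATS")]

# Course x TA: concatenate every Course tuple with every TA tuple.
cross = [c + t for c, t in product(course, ta)]

degree = len(cross[0])    # 4 + 2 attributes
cardinality = len(cross)  # 3 * 2 tuples

# Following the product with a selection (dept = major) keeps only related pairs.
related = [row for row in cross if row[0] == row[5]]
```

Only three of the six combinations survive the selection, which is the step that turns the raw product into a meaningful result.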


JOIN Operation

• Denoted by R ⋈<join condition> S
• Purpose: Like cross product, a JOIN combines tuples from two relations into single "longer" tuples, but only those that satisfy the matching condition
• Formally, a combination of cross product and select

Example 1: What are the names and salaries of all department managers?

Example 2: Who can TA courses offered by their own department?

Course
dept   cnum  instructor  term
CS     338   Jones       Spring
CS     330   Smith       Winter
STATS  330   Wong        Winter

TA
name    major
Ashley  CS
Lee     STATS

Course ⋈<dept = major> TA
dept   cnum  instructor  term    name    major
CS     338   Jones       Spring  Ashley  CS
CS     330   Smith       Winter  Ashley  CS
STATS  330   Wong        Winter  Lee     STATS

Types of JOIN → Theta JOIN

• Denoted by: R ⋈<r.A θ s.B> S
• where
  • A is an attribute of R, B is an attribute of S
  • A and B have the same domain
  • θ is one of the comparison operators: θ ∈ {=, ≠, <, >, ≤, ≥}
• Purpose: Serves as a general join condition
  • Combines two tuples into a single combined tuple satisfying a given condition
• Also known as the inner join


Types of JOIN → Theta JOIN Example

Employee
EmployeeID  Name    DeptID  Position    Salary
111         Sue     1       Manager     65000
222         John    2       Accountant  42000
333         Mary    1       Clerk       28000
444         Victor  NULL    Manager     52000

Department
DeptID  DeptName   Location
1       Sales      Room 245
2       Purchase   Room 430
3       Marketing  Room 212

Example: Employee ⋈<DeptID = DeptID> Department
EmployeeID  Name  DeptID  Position    Salary  DeptID  DeptName  Location
111         Sue   1       Manager     65000   1       Sales     Room 245
222         John  2       Accountant  42000   2       Purchase  Room 430
333         Mary  1       Clerk       28000   1       Sales     Room 245

Types of JOIN → EQUIJOIN

• The most common join condition involves only equality comparisons, where θ is =
  • This special type of Theta JOIN is called EQUIJOIN
• Denoted by: R ⋈<r.A = s.B> S

Example: Employee ⋈<DeptID = DeptID> Department
EmployeeID  Name  DeptID  Position    Salary  DeptID  DeptName  Location
111         Sue   1       Manager     65000   1       Sales     Room 245
222         John  2       Accountant  42000   2       Purchase  Room 430
333         Mary  1       Clerk       28000   1       Sales     Room 245

Disadvantage: redundancy (the join attribute DeptID appears twice in the result)
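A theta join can be sketched as a nested loop (the cross product) with a filter (the selection). The `theta_join` helper is hypothetical, and NULL is modeled with Python's `None`, which is an assumption of this sketch. Each result is kept as an (employee, department) pair so that both DeptID values remain visible, mirroring the redundancy of the EQUIJOIN result.

```python
EMP_ATTRS = ("EmployeeID", "Name", "DeptID", "Position", "Salary")
DEPT_ATTRS = ("DeptID", "DeptName", "Location")

employee = [dict(zip(EMP_ATTRS, row)) for row in [
    (111, "Sue", 1, "Manager", 65000),
    (222, "John", 2, "Accountant", 42000),
    (333, "Mary", 1, "Clerk", 28000),
    (444, "Victor", None, "Manager", 52000),  # None stands in for NULL
]]
department = [dict(zip(DEPT_ATTRS, row)) for row in [
    (1, "Sales", "Room 245"),
    (2, "Purchase", "Room 430"),
    (3, "Marketing", "Room 212"),
]]


def theta_join(r, s, condition):
    """Cross product of r and s, keeping the pairs that satisfy the condition."""
    return [(tr, ts) for tr in r for ts in s if condition(tr, ts)]


# EQUIJOIN: theta is "=". Victor's None never equals a department number.
equijoin = theta_join(employee, department,
                      lambda e, d: e["DeptID"] == d["DeptID"])

# A theta join with "<" instead of "=" (None is excluded explicitly).
less_than = theta_join(
    employee, department,
    lambda e, d: e["DeptID"] is not None and e["DeptID"] < d["DeptID"],
)

names = [(e["Name"], d["DeptName"]) for e, d in equijoin]
```

Victor and the Marketing department have no partner under equality, so neither appears in the EQUIJOIN result.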


Types of JOIN → Natural JOIN

• Denoted by: R * S
• Purpose: Removes the extra (duplicate) join attribute of an EQUIJOIN
• Both join attributes must have the same name in both relations
  • If this is not the case, a renaming operation is applied first

Example: Employee * Department
EmployeeID  Name  DeptID  Position    Salary  DeptName  Location
111         Sue   1       Manager     65000   Sales     Room 245
222         John  2       Accountant  42000   Purchase  Room 430
333         Mary  1       Clerk       28000   Sales     Room 245
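A natural join can be sketched by matching on every attribute name the two relations share and merging each matching pair into one tuple. The `natural_join` helper is hypothetical, and treating `None` as a NULL that never matches is an assumption of this sketch.

```python
EMP_ATTRS = ("EmployeeID", "Name", "DeptID", "Position", "Salary")
DEPT_ATTRS = ("DeptID", "DeptName", "Location")

employee = [dict(zip(EMP_ATTRS, row)) for row in [
    (111, "Sue", 1, "Manager", 65000),
    (222, "John", 2, "Accountant", 42000),
    (333, "Mary", 1, "Clerk", 28000),
    (444, "Victor", None, "Manager", 52000),
]]
department = [dict(zip(DEPT_ATTRS, row)) for row in [
    (1, "Sales", "Room 245"),
    (2, "Purchase", "Room 430"),
    (3, "Marketing", "Room 212"),
]]


def natural_join(r, s):
    """Join on all attribute names shared by r and s; shared names appear once."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    result = []
    for tr in r:
        for ts in s:
            if all(tr[a] is not None and tr[a] == ts[a] for a in common):
                result.append({**tr, **ts})  # merged dict holds one DeptID key
    return result


joined = natural_join(employee, department)
```

The merged tuples have seven attributes rather than the eight of the EQUIJOIN result, because DeptID is kept only once.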

Types of JOIN → OUTER JOIN Operation

• Natural JOIN eliminates tuples that have no matching tuple in the other relation
  • This includes tuples with NULL values in the join attributes
• Outer JOIN is an extension of the join operation that avoids loss of information
  • That is, keep all tuples in R, or all those in S, or all those in both relations, regardless of whether or not they have matching tuples in the other relation
• Three main Outer JOIN types:
  • LEFT OUTER JOIN
  • RIGHT OUTER JOIN
  • FULL OUTER JOIN


Types of JOIN → OUTER JOIN Operation: LEFT OUTER JOIN

• Denoted by: R ⟕ S
• Purpose: Keeps every tuple in the first (or left) relation R
  • If no matching tuple is found in S, then the attributes of S in the join result are filled with NULL values

Example: Employee ⟕ Department
EmployeeID  Name    DeptID  Position    Salary  DeptName  Location
111         Sue     1       Manager     65000   Sales     Room 245
222         John    2       Accountant  42000   Purchase  Room 430
333         Mary    1       Clerk       28000   Sales     Room 245
444         Victor  NULL    Manager     52000   NULL      NULL

Types of JOIN → OUTER JOIN Operation: RIGHT OUTER JOIN

• Denoted by: R ⟖ S
• Purpose: Keeps every tuple in the second (or right) relation S
  • If no matching tuple is found in R, then the attributes of R in the join result are filled with NULL values

Example: Employee ⟖ Department
EmployeeID  Name  DeptID  Position    Salary  DeptName   Location
111         Sue   1       Manager     65000   Sales      Room 245
222         John  2       Accountant  42000   Purchase   Room 430
333         Mary  1       Clerk       28000   Sales      Room 245
NULL        NULL  3       NULL        NULL    Marketing  Room 212


Types of JOIN → OUTER JOIN Operation: FULL OUTER JOIN

• Denoted by: R ⟗ S
• Purpose: Keeps all tuples in both the left and the right relations, even when no matching tuples are found
  • NULL values are added as required

Example: Employee ⟗ Department
EmployeeID  Name    DeptID  Position    Salary  DeptName   Location
111         Sue     1       Manager     65000   Sales      Room 245
222         John    2       Accountant  42000   Purchase   Room 430
333         Mary    1       Clerk       28000   Sales      Room 245
444         Victor  NULL    Manager     52000   NULL       NULL
NULL        NULL    3       NULL        NULL    Marketing  Room 212
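The three outer joins can be sketched as the inner (natural) join plus the unmatched tuples of one or both sides, padded with NULLs. As before, `None` stands in for NULL and the list-of-dictionaries representation is an assumption of this sketch, not part of the slides.

```python
EMP_ATTRS = ("EmployeeID", "Name", "DeptID", "Position", "Salary")
DEPT_ATTRS = ("DeptID", "DeptName", "Location")

employee = [dict(zip(EMP_ATTRS, row)) for row in [
    (111, "Sue", 1, "Manager", 65000),
    (222, "John", 2, "Accountant", 42000),
    (333, "Mary", 1, "Clerk", 28000),
    (444, "Victor", None, "Manager", 52000),
]]
department = [dict(zip(DEPT_ATTRS, row)) for row in [
    (1, "Sales", "Room 245"),
    (2, "Purchase", "Room 430"),
    (3, "Marketing", "Room 212"),
]]


def matches(e, d):
    # A NULL (None) join value never matches anything.
    return e["DeptID"] is not None and e["DeptID"] == d["DeptID"]


inner = [{**e, **d} for e in employee for d in department if matches(e, d)]

# Employees without a department: pad the Department attributes with NULL.
unmatched_left = [
    {**e, "DeptName": None, "Location": None}
    for e in employee
    if not any(matches(e, d) for d in department)
]

# Departments without employees: pad the Employee attributes with NULL.
unmatched_right = [
    {**dict.fromkeys(EMP_ATTRS), **d}
    for d in department
    if not any(matches(e, d) for e in employee)
]

left_outer = inner + unmatched_left
right_outer = inner + unmatched_right
full_outer = inner + unmatched_left + unmatched_right
```

With the Employee and Department tables above, the left outer join adds Victor, the right outer join adds Marketing, and the full outer join adds both.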

Summary

• Select σ
• Project π
• Rename ρ
• Union ∪
• Difference -
• Intersection ∩
• Division ÷
• Assignment ←
• Cartesian Product ×
• Join ⋈
  • Natural Join *
  • Left Outer Join ⟕
  • Right Outer Join ⟖
  • Full Outer Join ⟗