RELATIONAL ALGEBRA

69
RELATIONAL ALGEBRA Prof. Sin-Min LEE Department of Computer Science

description

CS157A. Lecture 8. RELATIONAL ALGEBRA. Prof. Sin-Min LEE Department of Computer Science. Relation schema Named relation defined by a set of attribute and domain name pairs. Relational database schema Set of relation schemas, each with a distinct name. - PowerPoint PPT Presentation

Transcript of RELATIONAL ALGEBRA

Page 1: RELATIONAL ALGEBRA

RELATIONAL ALGEBRA

Prof. Sin-Min LEE

Department of Computer Science

Page 2: RELATIONAL ALGEBRA

•Relation schema–Named relation defined by a set of attribute and domain name pairs.

•Relational database schema–Set of relation schemas, each with a distinct name.

•Each tuple is distinct; there are no duplicate tuples.•Order of attributes has no significance.•Order of tuples has no significance, theoretically.•Relation name is distinct from all other relation names in relational schema.•Each cell of relation contains exactly one atomic (single) value.•Each attribute has a distinct name.•Values of an attribute are all from the same domain.•Each tuple is distinct; there are no duplicate tuples.•Order of attributes has no significance.•Order of tuples has no significance, theoretically.

Page 3: RELATIONAL ALGEBRA

Database Scheme

A relational database scheme, or schema, corresponds to a set of table definitions.

Eg: product(p_id, name, category, description) supply(p_id, s_id, qnty_per_month) supplier(s_id, name, address, ph#)

* remember the difference between a DB instance and a DB scheme.

Page 4: RELATIONAL ALGEBRA

SAMPLE SCHEMAS AND INSTANCESThe Schemas:

Sailors(sid: integer, sname: string, rating: integer, age: real)

Boats(bid: integer, bname: string, color: string)Reserves(sid: integer, bid: integer, day: date)

The Instances:

Page 5: RELATIONAL ALGEBRA
Page 6: RELATIONAL ALGEBRA

What is Relational Algebra?

Relational algebra is a procedural query language.

It consists of the select, project, union, set difference, Cartesian product, and rename operations.

Set intersection, division, natural join, and assignment combine the fundamental operations.

SQL is based on relational algebra

Page 7: RELATIONAL ALGEBRA

•Relational algebra and relational calculus are formal languages associated with the relational model.•Both are equivalent to one another

Page 8: RELATIONAL ALGEBRA

It is an abstract language. We use it to express the set of operations that any relational query language must perform.

Two types of operations: 1.set-theoretic operations: tables are

essentially sets of rows 2.native relational operations: focus on the

structure of the rows Query languages are specialized languages for asking questions,or queries,that involve the data in database.

What are the query languages ?

Page 9: RELATIONAL ALGEBRA

Query languages

procedural vs. non-proceduralcommercial languages have some

of bothwe will study:

relational algebra (which is procedural, i.e. tells you how to process a query)

relational calculus (which is non-procedural i.e. tells what you want)

Page 10: RELATIONAL ALGEBRA

SEMANTICS OF THE SAMPLE RELATIONS

Sailors: Entity set; lists the relevant properties of sailors.

BoatsBoats: Entity set; lists the relevant properties of boats.

Reserves: Relationship set: links sailors and boats by describing the boat number and date for which a sailor made a reservation.Example of the declarative sentences for which rows stand:Row 1: “Sailor ’22’ reserved boat number ‘101’ on 10/10/98”.

Page 11: RELATIONAL ALGEBRA

Selection and ProjectionSelection Operator: σrating>8 (S2)Retrieves from the current instance of relation named S2

those rows where the value of the attribute ‘rating’ is greater than 8.

Applying the above selection operator to the sample instance of S2 shown in figure 4.2 yields the relational instance on figure 4.4 as shown below:

π

condition

Page 12: RELATIONAL ALGEBRA

Projection Operator πsname,rating(S2)Retrieves from the current instance of the relation

named S2 those columns whose names are ‘sname’ and ‘rating’.

Applying the above operator to the sample instance of S2 shown in figure 4.2 yields the relational instance on figure 4.5 as shown below:

N. B.: Note that the projection operator can produce duplicate rows in the resulting instance.

Page 13: RELATIONAL ALGEBRA

- Projection Operator (cont’d)

Similarly πage(S2) yields the following

relational instance

Note here the elimination of duplicates

SQL would yield

For πage (S2):

age

35.0

55.0

35.0

35.0

Page 14: RELATIONAL ALGEBRA

Introduction

one of the two formal query languages of the relational model

collection of operators for manipulating relations

Operators: two types of operators Set Operators: Union(),Intersection(),

Difference(-), Cartesian Product (x) New Operators: Select (), Project (),

Join (⋈)

Page 15: RELATIONAL ALGEBRA

Introduction – cont’d

A Relational Algebra Expression: a sequence of relational algebra operators and operands (relations), formed according to a set of rules.

The result of evaluating a relational algebra expression is a relation.

Page 16: RELATIONAL ALGEBRA

Selection

Denoted by c(R)

Selects the tuples (rows) from a relation R that satisfy a certain selection condition c.

It is a unary operatorThe resulting relation has the same

attributes as those in R.

Page 17: RELATIONAL ALGEBRA

Example 1:

SNO SNAME AGE STATE

S1 MIKE 21 IL

S2 STEVE 20 LA

S3 MARY 18 CA

S4 MING 19 NY

S5 OLGA 21 NY

S:

state=‘IL’(S)

Page 18: RELATIONAL ALGEBRA

Example 2:

CNOCNAME

CREDIT

DEPT

C1Database

3 CS

C2Statistics

3 MATH

C3 Tennis 1SPORTS

C4 Violin 4 MUSIC

C5 Golf 2SPORTS

C6 Piano 5 MUSIC

C:

CREDIT 3(C)

Page 19: RELATIONAL ALGEBRA

Example 3

SNO CNO Grade

S1 C1 90

S1 C2 80

S1 C3 75

S1 C4 70

S1 C5 100

S1 C6 60

S2 C1 90

S2 C2 80

S3 C2 90

S4 C2 80

S4 C4 85

S4 C5 100

E:

SNO=‘S1’and CNO=‘C1’(E)

Page 20: RELATIONAL ALGEBRA
Page 21: RELATIONAL ALGEBRA

Selection - Properties

Selection Operator is commutative C1(C2 (R)) = C2(C1 (R))

The Selection is an unary operator, it cannot be used to select tuples from more than one relations.

Page 22: RELATIONAL ALGEBRA

Projection

Denoted by L(R), where L is list of attribute names and R is a relation name or some other relational algebra expression.

The resulting relation has only those attributes of R specified in L.

The projection is also an unary operation.  Duplicate rows are not permitted in relational

algebra. Duplication is removed from the result.

Duplicate rows can occur in SQL, though they may be controlled by explicit keywords.

Page 23: RELATIONAL ALGEBRA

Projection - Example

Example 1: STATE (S)

SNO SNAME AGE STATE

S1 MIKE 21 IL

S2 STEVE 20 LA

S3 MARY 18 CA

S4 MING 19 NY

S5 OLGA 21 NY

STATE

IL

LA

CA

NY

Page 24: RELATIONAL ALGEBRA

Projection - Example

Example 2: CNAME, DEPT(C)

CNOCNAME

CREDIT

DEPT

C1Database

3 CS

C2Statistics

3 MATH

C3 Tennis 1SPORTS

C4 Violin 4 MUSIC

C5 Golf 2SPORTS

C6 Piano 5 MUSIC

CNAME

DEPT

Database

CS

Statistics

MATH

TennisSPORTS

Violin MUSIC

GolfSPORTS

Piano MUSIC

Page 25: RELATIONAL ALGEBRA

Projection - Example

Example 3: S#(STATE=‘NY'(S))

SNO SNAME AGE STATE

S1 MIKE 21 IL

S2 STEVE 20 LA

S3 MARY 18 CA

S4 MING 19 NY

S5 OLGA 21 NY

SNO

S4

S5

Page 26: RELATIONAL ALGEBRA

SET Operations

UNION: R1 R2

INTERSECTION: R1 R2

DIFFERENCE: R1 - R2

CARTESIAN PRODUCT: R1 R2

Page 27: RELATIONAL ALGEBRA

Union Compatibility

For operators , , -, the operand relations R1(A1, A2, ..., An) and R2(B1, B2, ..., Bn) must have the same number of attributes, and the domains of the corresponding attributes must be compatible; that is, dom(Ai)=dom(Bi) for i=1,2,...,n.

The resulting relation for , , or - has the same attribute names as the first operand relation R1 (by convention).

Page 28: RELATIONAL ALGEBRA

Union Compatibility - Examples

Are S(SNO, SNAME, AGE, STATE) and C(CNO, CNAME, CREDIT, DEPT) union compatible?

Are S(S#, SNAME, AGE, STATE) and C(CNO, CNAME, CREDIT_HOURS, DEPT_NAME) union compatible?

Page 29: RELATIONAL ALGEBRA
Page 30: RELATIONAL ALGEBRA

UNION, SET DIFFERENCE & SET INTERSECT

Union puts all tuples of two relations in one relation. To use this operator, two conditions must hold:1. The two relations must be of the same arity.2. The domain of ith attribute of the two participating relation must

be the same.

Set difference operator computes tuples that are in one relation, but not in another.

Set intersect operator computes tuples that are common in two relations:

The five fundamental operations of the relational algebra are: select, project, cartesian product, Union, and set difference

All other operators can be constructed using these operators

Page 31: RELATIONAL ALGEBRA

EXAMPLE

Assume a database with the following three relations:Sailors (sid, sname, rating)Boats (bid, bname, color)Reserve (sid, bid, date)

Query 1: Find the bid of red colored boats:

Page 32: RELATIONAL ALGEBRA

EXAMPLE

Assume a database with the following three relations:Sailors (sid, sname, rating)Boats (bid, bname, color)Reserve (sid, bid, date)

Query 1: Find the bid of red colored boats: ∏bid(бcolor=red(Boats))

Page 33: RELATIONAL ALGEBRA

EXAMPLE

Assume a database with the following three relations:Sailors (sid, sname, rating)Boats (bid, bname, color)Reserve (sid, bid, date)

Query 1: Find the name of sailors who have reserved Boat number 2.

Page 34: RELATIONAL ALGEBRA

EXAMPLE

Assume a database with the following three relations:Sailors (sid, sname, rating)Boats (bid, bname, color)Reserve (sid, bid, date)

Query 1: Find the name of sailors who have reserved Boat number 2. ∏sname(бbid=2(Sailors (sid)Reserve))

Page 35: RELATIONAL ALGEBRA

EXAMPLE

Assume a database with the following three relations:Sailors (sid, sname, rating)Boats (bid, bname, color)Reserve (sid, bid, date)

Query 1: Find the name of sailors who have reserved both a red and a green boat.

Page 36: RELATIONAL ALGEBRA

Union, Intersection, Difference

T= R U S : A tuple t is in relation T if and only if t is in relation R or t is in relation S

T = R S: A tuple t is in relation T if and only if t is in both relations R and S

T= R - S :A tuple t is in relation T if and only if t is in R but not in S

Page 37: RELATIONAL ALGEBRA

Set-Intersection

Denoted by the symbol . Results in a relation that contains

only the tuples that appear in both relations.

R S = R – (R – S)Since set-intersection can be written

in terms of set-difference, it is not a fundamental operation.

Page 38: RELATIONAL ALGEBRA

Examples

A1 A2

1 Red

3 White

4 green

B1 B2

3 White

2 Blue

R S

Page 39: RELATIONAL ALGEBRA

Examples

A1 A2

1 Red

3 White

4 Green

2 Blue

A1 A2

3 White

R SR S

S - R

B1 B2

2 Blue

A1 A2

1 Red

4 Green

R - S

Page 40: RELATIONAL ALGEBRA

RENAME OPERATOR

Rename operator changes the name of its input table to its subscript, ρe2(Emp) Changes the name of Emp table to e2

Page 41: RELATIONAL ALGEBRA

RELATIONAL ALGEBRA INTRODUCTION

Assume the following two relations:Emp (SS#, name, age, salary, dno) Dept (dno, dname, floor, mgrSS#)

Relational algebra is a procedural query language, i.e., user must define both “how” and “what” to retrieve.

Relational algebra consists of a set of operators that consume either one or two relations as input. An operator produces one relation as its output.

Unary operators include: select, project, and rename

Binary operators include: cartesian product, equality join, natural join, join, semi-join, division, union, and set difference.

Page 42: RELATIONAL ALGEBRA

SELECT OPERATOR Select (б): selects tuples that satisfy a predicate; e.g., retrieve

the employees whose salary is 30,000 бSalary=30,000(Employee)

Conjunctive ( ) and disjunctive ( ) selection predicates are allowed;e.g., retrieve employees whose salary is higher than 30,000 and are younger than 25 years old:бSalary>30,000 age<25(Employee)

Note that only selection predicates are allowed. A selection predicate is either (1) a comparison (=, ≠, ≤, ≥, <, >) between an attribute and a constant (e.g., salary = 30,000) or (2) a comparison between two different attributes of the same relation (e.g., salary = age × 100).

Note: This operator is different than the SELECT command of SQL.

<

<<

Page 43: RELATIONAL ALGEBRA

EXAMPLE

Emp table:

SS# Name Age Salary dno

1 Joe 24 20000 2

2 Mary 20 25000 3

3 Bob 22 27000 4

4 Kathy 30 30000 5

5 Shideh 4 4000 1

Page 44: RELATIONAL ALGEBRA

EXAMPLEEmp table:

бSalary=30,000(Employee)

SS# Name Age Salary dno

1 Joe 24 20000 2

2 Mary 20 25000 3

3 Bob 22 27000 4

4 Kathy 30 30000 5

5 Shideh 4 4000 1

Page 45: RELATIONAL ALGEBRA

EXAMPLEEmp table:

бSalary=30,000(Employee)

SS# Name Age Salary dno

1 Joe 24 20000 2

2 Mary 20 25000 3

3 Bob 22 27000 4

4 Kathy 30 30000 5

5 Shideh 4 4000 1

SS# Name Age Salary dno

4 Kathy 30 30000 5

Page 46: RELATIONAL ALGEBRA

EXAMPLEEmp table:

бAge>22(Employee)

SS# Name Age Salary dno

1 Joe 24 20000 2

2 Mary 20 25000 3

3 Bob 22 27000 4

4 Kathy 30 30000 5

5 Shideh 4 4000 1

Page 47: RELATIONAL ALGEBRA

EXAMPLEEmp table:

бAge>22(Employee)

SS# Name Age Salary dno

1 Joe 24 20000 2

2 Mary 20 25000 3

3 Bob 22 27000 4

4 Kathy 30 30000 5

5 Shideh 4 4000 1

SS# Name Age Salary dno

1 Joe 24 20000 2

4 Kathy 30 30000 5

Page 48: RELATIONAL ALGEBRA

PROJECT OPERATOR

Project (∏) retrieves a column. It is a unary operator that eliminate duplicates.

e.g., name of employees: ∏ name(Employee)

e.g., name of employees earning more than 30,000:

∏ name(бSalary>30,000(Employee))

Page 49: RELATIONAL ALGEBRA

EXAMPLE

Emp table:

SS# Name Age Salary dno

1 Joe 24 20000 2

2 Mary 20 25000 3

3 Bob 22 27000 4

4 Kathy 30 30000 5

5 Shideh 4 4000 1

Page 50: RELATIONAL ALGEBRA

EXAMPLEEmp table:

∏ age(Emp)

SS# Name Age Salary

dno

1 Joe 24 20000 2

2 Mary 20 25000 3

3 Bob 22 27000 4

4 Kathy 30 30000 5

5 Shideh 4 4000 1

Age

24

20

22

30

4

Page 51: RELATIONAL ALGEBRA

EXAMPLEEmp table:

∏ name,age(бSalary=4000 (Emp) )

SS# Name Age Salary dno

1 Joe 24 20000 2

2 Mary 20 25000 3

3 Bob 22 27000 4

4 Kathy 30 30000 5

5 Shideh 4 4000 1

Page 52: RELATIONAL ALGEBRA

EXAMPLEEmp table:

∏ name,age(бSalary=4000 (Emp) )

SS# Name Age Salary dno

1 Joe 24 20000 2

2 Mary 20 25000 3

3 Bob 22 27000 4

4 Kathy 30 30000 5

5 Shideh 4 4000 1

SS# Name Age Salary dno

5 Shideh 4 4000 1

Page 53: RELATIONAL ALGEBRA

EXAMPLEEmp table:

∏ name,age(бSalary=4000 (Emp) )

SS# Name Age Salary dno

1 Joe 24 20000 2

2 Mary 20 25000 3

3 Bob 22 27000 4

4 Kathy 30 30000 5

5 Shideh 4 4000 1

Name Age

Shideh 4

Page 54: RELATIONAL ALGEBRA

CARTESIAN PRODUCTCartesian Product (R1 × R2) combines two

relations by concatenating their tuples together, evaluating all possible combinations. If the name of a column is identical for two relations, this ambiguity is resolved by attaching the name of each relation to a column. e.g., Emp × Dept (SS#, name, age, salary, Emp.dno, Dept.dno, dname,

floor, mgrSS#)

If t(Emp) and t(Dept) is the cardinality of the Employee and Dept relations respectively, then the cardinality of Emp × Dept is: t(Emp) × t(Dept)

Page 55: RELATIONAL ALGEBRA

CARTESIAN PRODUCT (Cont…)Example:

Emp table:

Dept table:

345 John Doe 23 25,000 1943 Jane Java 25 28,000 2876 Joe SQL 22 32,000 1

SS# Name age salary dno

1 Toy 1 3452 Sho

e2 943

dno dname floor mgrSS#

Page 56: RELATIONAL ALGEBRA

CARTESIAN PRODUCT (Cont…) Cartesian product of Emp and Dept: Emp × Dept:

345

John Doe

23 25,000

1 1 Toy 1 345

943

Jane Java

25 28,000

2 1 Toy 1 345

876

Joe SQL 22 32,000

1 1 Toy 1 345

345

John Doe

23 25,000

1 2 Shoe 2 943

943

Jane Java

25 28,000

2 2 Shoe 2 943

876

Joe SQL 22 32,000

1 2 Shoe 2 943

SS# Name age salary Emp.dno Dept.dno dname floor mgrSS#

Page 57: RELATIONAL ALGEBRA

CARTESIAN PRODUCT

Example: retrieve the name of employees that work in the toy department:

Page 58: RELATIONAL ALGEBRA

CARTESIAN PRODUCT

Example: retrieve the name of employees that work in the toy department: ∏name(бEmp.dno=Dept.dno(Emp ×

бdname=‘toy’(Dept)))

Page 59: RELATIONAL ALGEBRA

CARTESIAN PRODUCT (Cont…)∏name(бdname=‘toy’ (б Emp.dno=Dept.dno(Emp × Dept)))

345 John Doe 23 25,000 1 1 Toy 1 345

943 Jane Java 25 28,000 2 1 Toy 1 345

876 Joe SQL 22 32,000 1 1 Toy 1 345

345 John Doe 23 25,000 1 2 Shoe 2 943

943 Jane Java 25 28,000 2 2 Shoe 2 943

876 Joe SQL 22 32,000 1 2 Shoe 2 943

SS# Name age salary Emp.dno Dept.dno dname floor mgrSS#

Page 60: RELATIONAL ALGEBRA

CARTESIAN PRODUCT (Cont…)∏name(бdname=‘toy’ (б Emp.dno=Dept.dno(Emp × Dept)))

345 John Doe

23 25,000 1 1 Toy 1 345

876 Joe SQL

22 32,000 1 1 Toy 1 345

943 Jane Java

25 28,000 2 2 Shoe 2 943

SS# Name age salary Emp.dno Dept.dno dname floor mgrSS#

Page 61: RELATIONAL ALGEBRA

CARTESIAN PRODUCT (Cont…) ∏name(бdname=‘toy’ (б Emp.dno=Dept.dno(Emp × Dept)))

345 John Doe

23 25,000 1 1 Toy 1 345

876 Joe SQL 22 32,000 1 1 Toy 1 345

SS# Name age salary Emp.dno Dept.dno dname floor mgrSS#

Page 62: RELATIONAL ALGEBRA

CARTESIAN PRODUCT (Cont…) ∏name(бdname=‘toy’ (б Emp.dno=Dept.dno(Emp × Dept)))

John Doe

Joe SQL

Name

Page 63: RELATIONAL ALGEBRA
Page 64: RELATIONAL ALGEBRA

•Join is a derivative of Cartesian product.

•Equivalent to performing a Selection, using join predicate as selection formula, over Cartesian product of the two operand relations.

•One of the most difficult operations to implement efficiently in an RDBMS and one reason why RDBMSs have intrinsic performance problems.•Various forms of join operation

–Natural join (defined by Codd)–Outer join –Theta join–Equijoin (a particular type of Theta join)–Semijoin

Page 65: RELATIONAL ALGEBRA

EQUALITY JOIN, NATURAL JOIN, JOIN, SEMI-JOIN

Equality join connects tuples from two relations that match on certain attributes. The specified joining columns are kept in the resulting relation. ∏name(бdname=‘toy’(Emp Dept)))

Natural join connects tuples from two relations that match on the specified common attributes ∏name(бdname=‘toy’(Emp Dept)))

How is an equality join between Emp and Dept using dno different than a natural join between Emp and Dept using dno? Equality join: SS#, name, age, salary, Emp.dno, Dept.dno, … Natural join: SS#, name, age, salary, dno, dname, …

Join is similar to equality join using different comparison operators A S op = {=, ≠, ≤, ≥, <, >} att op att

(dno)

(dno)

Page 66: RELATIONAL ALGEBRA

EXAMPLE JOIN

Equality Join, (Emp Dept)))

SS# Name Age Salary

dno

1 Joe 24 20000 2

2 Mary 20 25000 1

3 Bob 22 27000 1

4 Kathy 30 30000 2

5 Shideh 4 4000 1EMP

dno dname

floor mgrss#

1 Toy 1 5

2 Shoe 2 1

Dept

(dno)

SS# Name Age Salary

EMP.dno

Dept.dno

dname

floor mgrss#

1 Joe 24 20000 2 2 Shoe 2 1

2 Mary 20 25000 1 1 Toy 1 5

3 Bob 22 27000 1 1 Toy 1 5

4 Kathy 30 30000 2 2 Shoe 2 1

5 Shideh 4 4000 1 1 Toy 1 5

Page 67: RELATIONAL ALGEBRA

EXAMPLE JOIN

Natural Join, (Emp Dept)))

SS# Name Age Salary dno

1 Joe 24 20000 2

2 Mary 20 25000 1

3 Bob 22 27000 1

4 Kathy 30 30000 2

5 Shideh 4 4000 1

EMP

dno dname

floor mgrss#

1 Toy 1 5

2 Shoe 2 1

Dept

(dno)

SS# Name Age Salary dno dname floor mgrss#

1 Joe 24 20000 2 Shoe 2 1

2 Mary 20 25000 1 Toy 1 5

3 Bob 22 27000 1 Toy 1 5

4 Kathy 30 30000 2 Shoe 2 1

5 Shideh 4 4000 1 Toy 1 5

Page 68: RELATIONAL ALGEBRA

EXAMPLE JOIN

Join, (Emp ρx(Emp))))

SS# Name Age Salary

dno

1 Joe 24 20000 2

2 Mary 20 25000 1

3 Bob 22 27000 1

4 Kathy 30 30000 2

5 Shideh 4 4000 1EMP

dno dname

floor mgrss#

1 Toy 1 5

2 Shoe 2 1Dept

Salary > 5 * salary

SS#

Name

Age Salary

dno x.SS#

x.Name

x.Age

x.Salary

x.dno

2 Mary 20 25000 1 2 Shideh 4 4000 1

3 Bob 22 27000 1 3 Shideh 4 4000 1

4 Kathy 30 30000 2 4 Shideh 4 4000 1

Page 69: RELATIONAL ALGEBRA

EQUALITY JOIN, NATURAL JOIN, JOIN, SEMI-JOIN (Cont…)

Example: retrieve the name of employees who earn more than Joe: ∏name(Emp (sal>x.sal)бname=‘Joe’(ρ x(Emp)))

Semi-Join selects the columns of one relation that joins with another. It is equivalent to a join followed by a projection: Emp (dno)Dept ≡∏SS#, name, age, salary, dno(Emp

Dept)