Relational Algebra and My SQL(II)

Post on 31-Dec-2015


Lecture 6, CS157B

Prof. Sin Min Lee

Department of Computer Science

San Jose State University

Lecture 12: Further relational algebra, further SQL

www.cl.cam.ac.uk/Teaching/current/Databases/

Today’s lecture

Where does SQL differ from relational model?

What are some other features of SQL?

How can we extend the relational algebra to match more closely SQL?

Duplicate rows

Consider our relation instances from lecture 6: Reserves, Sailors and Boats

Consider

SELECT rating, age
FROM Sailors;

We get a relation that doesn’t satisfy our definition of a relation!

RECALL: We have the keyword DISTINCT to remove duplicates

Multiset semantics

A relation in SQL is really a multiset or bag, rather than a set as in the relational model

A multiset has no order (unlike a list), but allows duplicates

E.g. {1,2,1,3} is a bag

select, project and join work for bags as well as sets

Just work on a tuple-by-tuple basis

Bag operations

Bag union: Sum the number of times that an element appears in the two bags, e.g. {1,2,1} ∪ {1,2,3} = {1,1,1,2,2,3}

Bag intersection: Take the minimum of the number of occurrences in each bag, e.g. {1,2,1} ∩ {1,2,3,3} = {1,2}

Bag difference: Proper-subtract the number of occurrences in the two bags, e.g. {1,2,1} − {1,2,3,3} = {1}
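The three bag operations can be tried out with a small sketch. It assumes Python's `collections.Counter` as a stand-in for a bag (element mapped to its number of occurrences); the variable names are illustrative only, and the inputs are the bags from the examples above.

```python
from collections import Counter

# Bags from the examples above, stored as element -> number of occurrences.
r = Counter([1, 2, 1])
s = Counter([1, 2, 3])
t = Counter([1, 2, 3, 3])

# Bag union: add the occurrence counts.
bag_union = sorted((r + s).elements())
# Bag intersection: take the minimum occurrence count.
bag_intersection = sorted((r & t).elements())
# Bag difference: proper subtraction (counts never drop below zero).
bag_difference = sorted((r - t).elements())

print(bag_union)         # [1, 1, 1, 2, 2, 3]
print(bag_intersection)  # [1, 2]
print(bag_difference)    # [1]
```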

Laws for bags

Note that whilst some of the familiar (set-theoretic) laws continue to hold, some of them do not

Example: R ∩ (S ∪ T) = (R ∩ S) ∪ (R ∩ T) ??
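As one concrete check (an illustration chosen here, not necessarily the law the slide had in mind), distributing intersection over union holds for sets but can fail for bags. The sketch below again assumes `collections.Counter` as the bag representation; `bag_law_holds` is a hypothetical helper name.

```python
from collections import Counter

def bag_law_holds(r, s, t):
    """Check R ∩ (S ∪ T) == (R ∩ S) ∪ (R ∩ T) under bag semantics."""
    return (r & (s + t)) == ((r & s) + (r & t))

# Counterexample: all three bags contain the element 1 exactly once.
one = Counter([1])
bag_result = bag_law_holds(one, one, one)

# The same law with ordinary sets.
rs, ss, ts = {1}, {1}, {1}
set_result = (rs & (ss | ts)) == ((rs & ss) | (rs & ts))

print(bag_result)  # False: left side is {1}, right side is {1,1}
print(set_result)  # True
```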

Extended relational algebra

Add features needed for SQL

1. Bag semantics

2. Duplicate elimination operator, δ

3. Sorting operator, τ

4. Grouping and aggregation operator, γ

5. Outerjoin operators: ⟕ (left), ⟖ (right), ⟗ (full)

Duplicate-elimination operator

δ(R) = relation R with any duplicated tuples removed

R =

A  B
1  2
3  4
1  2

δ(R) =

A  B
1  2
3  4

This is used to model the DISTINCT feature of SQL

Sorting

τ_{L1,…,Ln}(R) returns a list of tuples of R, ordered according to the attributes L1, …, Ln

Note: does not return a relation

R =

A  B
1  3
3  4
5  2

τ_B(R) = [(5,2),(1,3),(3,4)]

ORDER BY in SQL, e.g.

SELECT * FROM Sailors WHERE rating>7 ORDER BY age, sname;

Extended projection

SQL allows us to use arithmetic operators

SELECT age*5
FROM Sailors;

We extend the projection operator to allow the columns in the projection to be functions of one or more columns in the argument relation, e.g.

R =

A  B
1  2
3  4

π_{A+B,A,A}(R) =

A+B  A.1  A.2
3    1    1
7    3    3

Arithmetic

Arithmetic (and other expressions) cannot be used at the top level, i.e. 2+2 is not a valid SQL query

How would you get SQL to compute 2+2?

Aggregation

SQL provides us with operations to summarise a column in some way, e.g.

SELECT COUNT(rating) FROM Sailors;

SELECT COUNT(DISTINCT rating) FROM Sailors;

SELECT COUNT(*) FROM Sailors WHERE rating>7;

We also have SUM, AVG, MIN and MAX

Grouping

These aggregation operators have been applied to all qualifying tuples. Sometimes we want to apply them to each of several groups of tuples, e.g.

For each rating, find the average age of the sailors

For each rating, find the age of the youngest sailor

GROUP BY in SQL

SELECT [DISTINCT] target-list
FROM relation-list
WHERE qualification
GROUP BY grouping-list;

The target-list contains

1. List of column names

2. Aggregate terms

NOTE: The column names in target-list must be contained in grouping-list

GROUP BY cont.

For each rating, find the average age of the sailors

SELECT rating,AVG(age)

FROM Sailors

GROUP BY rating;

For each rating find the age of the youngest sailor

SELECT rating,MIN(age)

FROM Sailors

GROUP BY rating;
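The two GROUP BY queries above can be run against a toy instance. This is a minimal sketch using Python's built-in `sqlite3` module; the Sailors rows and the column types are invented for illustration and are not the lecture's actual relation instance.

```python
import sqlite3

# Hypothetical Sailors rows; the lecture's own instance is not reproduced here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)"
)
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [(1, "Ann", 7, 20.0), (2, "Bob", 7, 30.0), (3, "Cy", 9, 40.0),
     (4, "Dee", 9, 50.0), (5, "Eve", 10, 35.0)],
)

# One output row per distinct rating; the aggregate is computed within each group.
avg_age = conn.execute(
    "SELECT rating, AVG(age) FROM Sailors GROUP BY rating ORDER BY rating"
).fetchall()
min_age = conn.execute(
    "SELECT rating, MIN(age) FROM Sailors GROUP BY rating ORDER BY rating"
).fetchall()

print(avg_age)  # [(7, 25.0), (9, 45.0), (10, 35.0)]
print(min_age)  # [(7, 20.0), (9, 40.0), (10, 35.0)]
```

The ORDER BY clause is added only to make the output order predictable; GROUP BY on its own does not promise one.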

Grouping and aggregation

γ_L(R), where L is a list of elements that are either

Individual column names (“Grouping attributes”), or

Of the form θ(A), where θ is an aggregation operator (MIN, SUM, …) and A is the column it is applied to

For example, γ_{rating,AVG(age)}(Sailors)

Semantics

Group R according to the grouping attributes

Within each group, compute each aggregate θ(A) in L

Result is the relation consisting of one tuple for each group. The components of that tuple are the values associated with each element of L for that group

Example

Let R =

bar     beer     price
Anchor  6X       2.50
Anchor  Adnam’s  2.40
Mill    6X       2.60
Mill    Fosters  2.80
Eagle   Fosters  2.90

Compute γ_{beer,AVG(price)}(R)

Example cont.

1. Group according to the grouping attribute, beer:

bar     beer     price
Anchor  6X       2.50
Mill    6X       2.60
Anchor  Adnam’s  2.40
Mill    Fosters  2.80
Eagle   Fosters  2.90

2. Compute average of price within groups:

beer     price
6X       2.55
Adnam’s  2.40
Fosters  2.85
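The worked example corresponds to a GROUP BY query in SQL. The sketch below loads the five rows of R into an in-memory SQLite database through Python's `sqlite3` module; the column types and the ROUND call (used only to hide floating-point noise) are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (bar TEXT, beer TEXT, price REAL)")
conn.executemany(
    "INSERT INTO R VALUES (?, ?, ?)",
    [("Anchor", "6X", 2.50), ("Anchor", "Adnam's", 2.40), ("Mill", "6X", 2.60),
     ("Mill", "Fosters", 2.80), ("Eagle", "Fosters", 2.90)],
)

# Group by beer, then average the price inside each group.
avg_price = conn.execute(
    "SELECT beer, ROUND(AVG(price), 2) FROM R GROUP BY beer ORDER BY beer"
).fetchall()

for beer, price in avg_price:
    print(beer, price)  # one line per beer: 6X, Adnam's, Fosters
```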

NULL values

Sometimes field values are unknown (e.g. rating not known yet), or inapplicable (e.g. no spouse name)

SQL provides a special value, NULL, for both these situations

This complicates several issues

Special operators needed to check for NULL

Is NULL>8? Is (NULL OR TRUE)=TRUE?

We need a three-valued logic

Need to carefully re-define semantics

NULL values

Consider

INSERT INTO Sailors (sid,sname) VALUES (101,'Julia');

SELECT * FROM Sailors;

SELECT rating FROM Sailors;

SELECT sname FROM Sailors WHERE rating>0;
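What these statements return can be observed with a short sketch. It uses Python's `sqlite3` module; the table definition and the sailor with sid 22 are assumptions added for illustration, and SQLite shows NULL as Python's `None` and TRUE as `1`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)"
)
# One fully specified (hypothetical) sailor, plus Julia with no rating or age.
conn.execute("INSERT INTO Sailors VALUES (22, 'Dustin', 7, 45.0)")
conn.execute("INSERT INTO Sailors (sid, sname) VALUES (101, 'Julia')")

all_rows = conn.execute("SELECT * FROM Sailors ORDER BY sid").fetchall()
# rating > 0 evaluates to unknown for Julia, so her row is not selected.
rated = conn.execute("SELECT sname FROM Sailors WHERE rating > 0").fetchall()
# IS NULL is the special operator needed to find her.
unrated = conn.execute("SELECT sname FROM Sailors WHERE rating IS NULL").fetchall()
# Three-valued logic: NULL > 8 is unknown, but NULL OR TRUE is true.
null_gt_8, null_or_true = conn.execute("SELECT NULL > 8, NULL OR 1").fetchone()

print(all_rows)                 # [(22, 'Dustin', 7, 45.0), (101, 'Julia', None, None)]
print(rated, unrated)           # [('Dustin',)] [('Julia',)]
print(null_gt_8, null_or_true)  # None 1
```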

Entity integrity constraint

An entity integrity constraint states that no primary key value can be NULL

Outer join

Note that with the usual join, a tuple that doesn’t ‘join’ with any from the other relation is removed from the resulting relation

Instead, we can ‘pad out’ the columns with NULLs

This operator is called a full outer join, written ⟗

Example of full outer join

Let R =

A  B
1  2
3  4

Let S =

B  C
4  5
6  7

Then R ⋈ S =

A  B  C
3  4  5

But R ⟗ S =

A     B  C
1     2  NULL
3     4  5
NULL  6  7
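The example can be reproduced with a minimal sketch in Python's `sqlite3` module. Because older SQLite versions lack FULL OUTER JOIN, the sketch emulates it (an assumption of this illustration, not the lecture's method) as a left outer join plus the unmatched rows of S.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (A INTEGER, B INTEGER)")
conn.execute("CREATE TABLE S (B INTEGER, C INTEGER)")
conn.executemany("INSERT INTO R VALUES (?, ?)", [(1, 2), (3, 4)])
conn.executemany("INSERT INTO S VALUES (?, ?)", [(4, 5), (6, 7)])

# Ordinary join: only the tuples that match on B survive.
inner = conn.execute("SELECT R.A, R.B, S.C FROM R JOIN S ON R.B = S.B").fetchall()

# Emulated full outer join: keep every row of R, then add rows of S with no partner.
full = conn.execute(
    """
    SELECT R.A, R.B, S.C FROM R LEFT OUTER JOIN S ON R.B = S.B
    UNION ALL
    SELECT NULL, S.B, S.C FROM S WHERE S.B NOT IN (SELECT B FROM R)
    """
).fetchall()
full_sorted = sorted(full, key=lambda row: row[1])  # B is never NULL here

print(inner)        # [(3, 4, 5)]
print(full_sorted)  # [(1, 2, None), (3, 4, 5), (None, 6, 7)]
```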

Outer joins in SQL

SQL/92 has three variants:

LEFT OUTER JOIN (algebra: ⟕)

RIGHT OUTER JOIN (algebra: ⟖)

FULL OUTER JOIN (algebra: ⟗)

For example:

SELECT * FROM Reserves r LEFT OUTER JOIN Sailors s ON r.sid=s.sid;

Views

A view is a query with a name that can be used in further SELECT statements, e.g.

CREATE VIEW ExpertSailors(sid,sname,age)

AS SELECT sid,sname,age

FROM Sailors

WHERE rating>9;

Note that ExpertSailors is not a stored relation

(WARNING: MySQL versions before 5.0 do not support views)

Querying views

So an example query

SELECT sname
FROM ExpertSailors
WHERE age>27;

is translated by the system to the following:

SELECT sname
FROM Sailors
WHERE rating>9 AND age>27;
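The equivalence of the two queries can be checked with a small sketch in Python's `sqlite3` module. The Sailors rows are invented for illustration, and the view is created without an explicit column list (an assumption made so that older SQLite versions accept it; the column names then come from the SELECT).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)"
)
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [(1, "Ann", 10, 30.0), (2, "Bob", 10, 25.0), (3, "Cy", 8, 40.0)],
)
conn.execute(
    "CREATE VIEW ExpertSailors AS SELECT sid, sname, age FROM Sailors WHERE rating > 9"
)

via_view = conn.execute(
    "SELECT sname FROM ExpertSailors WHERE age > 27 ORDER BY sname"
).fetchall()
expanded = conn.execute(
    "SELECT sname FROM Sailors WHERE rating > 9 AND age > 27 ORDER BY sname"
).fetchall()

# The view is not a stored relation: a later insert into Sailors shows up in it.
conn.execute("INSERT INTO Sailors VALUES (4, 'Dee', 10, 50.0)")
after_insert = conn.execute(
    "SELECT sname FROM ExpertSailors WHERE age > 27 ORDER BY sname"
).fetchall()

print(via_view, expanded)  # [('Ann',)] [('Ann',)]
print(after_insert)        # [('Ann',), ('Dee',)]
```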

Relational Algebra

The Relational Algebra is used to define the ways in which relations (tables) can be operated to manipulate their data.

It is used as the basis of SQL for relational databases, and illustrates the basic operations required of any DML.

This Algebra is composed of Unary operations (involving a single table) and Binary operations (involving multiple tables).

SQL

Structured Query Language (SQL)

Standardised by ANSI

Supported by modern RDBMSs

Commands fall into three groups

Data Definition Language (DDL): Create tables, etc

Data Manipulation Language (DML): Retrieve and modify data

Data Control Language (DCL): Control what users can do – grant and revoke privileges

Unary Operations

Selection

Projection

Selection

The selection or σ operation selects rows from a table that satisfy a condition:

σ <condition> <tablename>

Example: σ course = ‘CM’ Students

Students

stud#  name  course
100    Fred  PH
200    Dave  CM
300    Bob   CM

Result

stud#  name  course
200    Dave  CM
300    Bob   CM

Projection

The projection or π operation selects a list of columns from a table.

π <column list> <tablename>

Example: π stud#, name Students

Students

stud#  name  course
100    Fred  PH
200    Dave  CM
300    Bob   CM

Result

stud#  name
100    Fred
200    Dave
300    Bob

Selection / Projection

Selection and Projection are usually combined:

π stud#, name (σ course = ‘CM’ Students)

Students

stud#  name  course
100    Fred  PH
200    Dave  CM
300    Bob   CM

Result

stud#  name
200    Dave
300    Bob
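Selection and projection can be sketched as two small functions over a list of rows. This is only an illustration: `select` and `project` are hypothetical helper names, rows are modelled as Python dictionaries, and the data is the Students table from the slides.

```python
students = [
    {"stud#": 100, "name": "Fred", "course": "PH"},
    {"stud#": 200, "name": "Dave", "course": "CM"},
    {"stud#": 300, "name": "Bob", "course": "CM"},
]

def select(rows, predicate):
    """Selection: keep the rows that satisfy the condition."""
    return [row for row in rows if predicate(row)]

def project(rows, columns):
    """Projection: keep only the listed columns, dropping duplicate rows."""
    seen, result = set(), []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result

# Projection applied to the result of a selection, as in the combined example.
cm_students = project(select(students, lambda r: r["course"] == "CM"), ["stud#", "name"])
print(cm_students)  # [{'stud#': 200, 'name': 'Dave'}, {'stud#': 300, 'name': 'Bob'}]
```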

Binary Operations

Cartesian Product

Theta Join

Inner Join

Natural Join

Outer Joins

Semi Joins

Cartesian Product

Concatenation of every row in the first relation (R) with every row in the second relation (S):

R X S

Cartesian Product - Example

Students

stud#  name  course
100    Fred  PH
200    Dave  CM
300    Bob   CM

Courses

course#  name
PH       Pharmacy
CM       Computing

Students X Courses =

stud#  Students.name  course  course#  Courses.name
100    Fred           PH      PH       Pharmacy
100    Fred           PH      CM       Computing
200    Dave           CM      PH       Pharmacy
200    Dave           CM      CM       Computing
300    Bob            CM      PH       Pharmacy
300    Bob            CM      CM       Computing

Theta Join

A Cartesian product with a condition applied:

R ⋈ <condition> S

Theta Join - Example

Students

stud#  name  course
100    Fred  PH
200    Dave  CM
300    Bob   CM

Courses

course#  name
PH       Pharmacy
CM       Computing

Students ⋈ stud# = 200 Courses

stud#  Students.name  course  course#  Courses.name
200    Dave           CM      PH       Pharmacy
200    Dave           CM      CM       Computing

Inner Join (Equijoin)

A Theta join where the <condition> is the match (=) of the primary and foreign keys.

R ⋈ <R.primary_key = S.foreign_key> S

Inner Join - Example

Students

stud#  name  course
100    Fred  PH
200    Dave  CM
300    Bob   CM

Courses

course#  name
PH       Pharmacy
CM       Computing

Students ⋈ course = course# Courses

stud#  Students.name  course  course#  Courses.name
100    Fred           PH      PH       Pharmacy
200    Dave           CM      CM       Computing
300    Bob            CM      CM       Computing

Natural Join

Inner join produces redundant data (in the previous example: course and course#). To get rid of this duplication:

π <stud#, Students.name, course, Courses.name> (Students ⋈ <course = course#> Courses)

Or

R1 = Students ⋈ <course = course#> Courses

R2 = π <stud#, Students.name, course, Courses.name> R1

The result is called the natural join of Students and Courses

Natural Join - Example

Students

stud#  name  course
100    Fred  PH
200    Dave  CM
300    Bob   CM

Courses

course#  name
PH       Pharmacy
CM       Computing

R1 = Students ⋈ <course = course#> Courses

R2 = π <stud#, Students.name, course, Courses.name> R1

stud#  Students.name  course  Courses.name
100    Fred           PH      Pharmacy
200    Dave           CM      Computing
300    Bob            CM      Computing

Outer Joins

Inner join + rows of one table which do not satisfy the <condition>.

Left Outer Join: R ⟕ <R.primary_key = S.foreign_key> S

All rows from R are retained; rows of R with no match in S are padded with NULLs in the columns of S

Right Outer Join: R ⟖ <R.primary_key = S.foreign_key> S

All rows from S are retained; rows of S with no match in R are padded with NULLs in the columns of R

Left Outer Join - Example

Students

stud#  name   course
100    Fred   PH
200    Dave   CM
400    Peter  EN

Courses

course#  name
PH       Pharmacy
CM       Computing
CH       Chemistry

Students ⟕ <course = course#> Courses

stud#  Students.name  course  course#  Courses.name
100    Fred           PH      PH       Pharmacy
200    Dave           CM      CM       Computing
400    Peter          EN      NULL     NULL

Right Outer Join - Example

Students

stud#  name   course
100    Fred   PH
200    Dave   CM
400    Peter  EN

Courses

course#  name
PH       Pharmacy
CM       Computing
CH       Chemistry

Students ⟖ <course = course#> Courses

stud#  Students.name  course  course#  Courses.name
100    Fred           PH      PH       Pharmacy
200    Dave           CM      CM       Computing
NULL   NULL           NULL    CH       Chemistry

Combination of Unary and Join Operations

Students

stud#  name  address   course
100    Fred  Aberdeen  PH
200    Dave  Dundee    CM
300    Bob   Aberdeen  CM

Courses

course#  name
PH       Pharmacy
CM       Computing

Show the names of students (from Aberdeen) and the names of their courses

R1 = Students ⋈ <course=course#> Courses

R2 = σ <address=“Aberdeen”> R1

R3 = π <Students.name, Courses.name> R2

Students.name  Courses.name
Fred           Pharmacy
Bob            Computing

Set Operations

Union

Intersection

Difference

Union

Takes the set of rows in each table and combines them, eliminating duplicates

Participating relations must be compatible, i.e. have the same number of columns, and the same column names, domains, and data types

R

A   B
a1  b1
a2  b2

S

A   B
a2  b2
a3  b3

R ∪ S

A   B
a1  b1
a2  b2
a3  b3

Intersection

Takes the set of rows that are common to each relation

Participating relations must be compatible

R

A   B
a1  b1
a2  b2

S

A   B
a2  b2
a3  b3

R ∩ S

A   B
a2  b2

Difference

Takes the set of rows in the first relation but not the second

Participating relations must be compatible

R

A   B
a1  b1
a2  b2

S

A   B
a2  b2
a3  b3

R - S

A   B
a1  b1

Exercise (May 2004 Exam)

Employee

empid  name
E100   Fred
E200   Dave
E300   Bob
E400   Peter

WorkLoad

empid*  projid*  duration
E100    P001     17
E200    P001     12
E300    P002     15

Project

projid  name
P001    DB
P002    Access
P003    SQL

Determine the outcome of the following operations:

A natural join between Employee and WorkLoad

A left outer join between Employee and WorkLoad

A right outer join between WorkLoad and Project

Relational Algebra Operations written in SQL

Unary Operations

Selection

σ course = ‘Computing’ Students

In SQL:

Select *
From Students
Where course = 'Computing';

Projection

π stud#, name Students

In SQL:

Select stud#, name

From Students;

Selection & Projection

π stud#, name (σ course = ‘Computing’ Students)

In SQL:

Select stud#, name
From Students
Where course = 'Computing';

Binary Operations/Joins

Cartesian Product: Students X Courses

In SQL:

Select *

From Students, Courses;

Theta Join: Students ⋈ <stud# =200> Courses

In SQL:

Select *

From Students, Courses

Where stud# = 200;

Binary Operations/Joins

Inner Join (Equijoin): Students ⋈ <course=course#> Courses

In SQL:

Select *

From Students, Courses

Where course=course#;

Natural Join:

R1= Students ⋈ <course = course#> Courses

R2= < stud#, Students.name, course, Courses.name > R1

In SQL:

Select stud#, Students.name, course, Courses.name

From Students, Courses

Where course=course#;

Outer Joins

Left Outer Join

Students ⟕ <course = course#> Courses

In SQL (Oracle's (+) notation):

Select *
From Students, Courses
Where course = course#(+);

Right Outer Join

Students ⟖ <course = course#> Courses

In SQL (Oracle's (+) notation):

Select *
From Students, Courses
Where course(+) = course#;
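The same two outer joins can also be written with the SQL/92 join keywords. The sketch below runs them in SQLite through Python's `sqlite3` module, using the Students and Courses rows from the outer join example slides. The column names `stud_no` and `course_no` are stand-ins chosen here for stud# and course#, and the right outer join is expressed as a left outer join with the operands swapped, since older SQLite versions lack RIGHT OUTER JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (stud_no INTEGER, name TEXT, course TEXT)")
conn.execute("CREATE TABLE Courses (course_no TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [(100, "Fred", "PH"), (200, "Dave", "CM"), (400, "Peter", "EN")],
)
conn.executemany(
    "INSERT INTO Courses VALUES (?, ?)",
    [("PH", "Pharmacy"), ("CM", "Computing"), ("CH", "Chemistry")],
)

# Left outer join: every student is kept; Peter has no matching course.
left_rows = conn.execute(
    """
    SELECT s.stud_no, s.name, s.course, c.course_no, c.name
    FROM Students s LEFT OUTER JOIN Courses c ON s.course = c.course_no
    ORDER BY s.stud_no
    """
).fetchall()

# Right outer join, written as a left outer join from the Courses side.
right_rows = conn.execute(
    """
    SELECT s.stud_no, s.name, s.course, c.course_no, c.name
    FROM Courses c LEFT OUTER JOIN Students s ON s.course = c.course_no
    ORDER BY c.course_no
    """
).fetchall()

print(left_rows[-1])  # (400, 'Peter', 'EN', None, None)
print(right_rows[0])  # (None, None, None, 'CH', 'Chemistry')
```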

Combination of Unary and Join Operations

R1= Students ⋈ <course=course#> Courses

R2= σ <address=“Aberdeen”> R1

R3= π <Students.name, Courses.name> R2

In SQL:

Select Students.name, Courses.name
From Students, Courses
Where course=course#
AND address='Aberdeen';

Set Operations

Union: R ∪ S

In SQL:

Select * From R

Union

Select * From S;

Intersection: R ∩ S

In SQL:

Select * From R

Intersect

Select * From S;

Difference: R - S

In SQL (Oracle uses Minus; standard SQL uses Except):

Select * From R
Minus
Select * From S;
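The three set operations can be run on the R and S tables from the set operation slides. This sketch uses Python's `sqlite3` module; note that SQLite, like standard SQL, spells the difference operator EXCEPT rather than Minus, and the ORDER BY 1 is added only to fix the output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (A TEXT, B TEXT)")
conn.execute("CREATE TABLE S (A TEXT, B TEXT)")
conn.executemany("INSERT INTO R VALUES (?, ?)", [("a1", "b1"), ("a2", "b2")])
conn.executemany("INSERT INTO S VALUES (?, ?)", [("a2", "b2"), ("a3", "b3")])

union_rows = conn.execute(
    "SELECT * FROM R UNION SELECT * FROM S ORDER BY 1"
).fetchall()
intersect_rows = conn.execute(
    "SELECT * FROM R INTERSECT SELECT * FROM S ORDER BY 1"
).fetchall()
difference_rows = conn.execute(
    "SELECT * FROM R EXCEPT SELECT * FROM S ORDER BY 1"
).fetchall()

print(union_rows)       # [('a1', 'b1'), ('a2', 'b2'), ('a3', 'b3')]
print(intersect_rows)   # [('a2', 'b2')]
print(difference_rows)  # [('a1', 'b1')]
```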

SQL Operators

Between, In, Like, Not

SQL Operators

SELECT *
FROM Book
WHERE catno BETWEEN 200 AND 400;

SELECT *
FROM Product
WHERE prod_desc BETWEEN 'C' AND 'S';

SELECT *
FROM Book
WHERE catno NOT BETWEEN 200 AND 400;

SQL Operators

SELECT Catno

FROM Loan

WHERE Date-Returned IS NULL;

SELECT Catno

FROM Loan

WHERE Date-Returned IS NOT NULL;

SQL Operators

SELECT Name

FROM Member

WHERE memno IN (100, 200, 300, 400);

SELECT Name

FROM Member

WHERE memno NOT IN (100, 200, 300, 400);

SQL Operators

SELECT Name
FROM Member
WHERE address NOT LIKE '%Aberdeen%';

SELECT Name
FROM Member
WHERE Name LIKE '_ES%';

Note: In MS Access, use * and # instead of % and _

Selecting Distinct Values

Student

stud#  name  address
100    Fred  Aberdeen
200    Dave  Dundee
300    Bob   Aberdeen

SELECT Distinct address

FROM Student;

address

Aberdeen

Dundee

Exercise

Employee(empid, name)

Project(projid, name)

WorkLoad(empid*, projid*, duration)

List the names of employees working on the project named ‘Databases’.

Nested Subqueries: Use of IN

SELECT property

FROM PropertyForRent

WHERE staff IN (staff who work at the branch on ‘112 A St’);

Source: Database Systems Connolly/Begg

Since the subquery may select more than one row, “=” cannot be used.

Use of ANY/SOME

SELECT name, salary

FROM Staff

WHERE salary > SOME( SELECT salary

FROM Staff

WHERE branch = ‘A’ );

Source: Database Systems Connolly/Begg

Subquery result: {2000, 3000, 4000}

Query result: {list of staff with salary greater than 2000}

Use of ALL

SELECT name, salary

FROM Staff

WHERE salary > ALL( SELECT salary

FROM Staff

WHERE branch = ‘A’ );

Source: Database Systems Connolly/Begg

Subquery result: {2000, 3000, 4000}

Query result: {list of staff with salary greater than 4000}

Use of Any/Some and All

If the subquery is empty:

ALL returns true

ANY returns false

ISO standard allows SOME to be used interchangeably with ANY.

Source: Database Systems Connolly/Begg
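The behaviour of `> SOME` and `> ALL`, including the empty-subquery case, can be mimicked with Python's built-in `any` and `all`. This is a sketch only: the helper names are hypothetical, the salary list is the subquery result {2000, 3000, 4000} from the slides, and NULL salaries are assumed not to occur.

```python
def greater_than_some(value, subquery_result):
    """value > SOME (subquery): true if at least one comparison is true."""
    return any(value > x for x in subquery_result)

def greater_than_all(value, subquery_result):
    """value > ALL (subquery): true if every comparison is true."""
    return all(value > x for x in subquery_result)

branch_a_salaries = [2000, 3000, 4000]

print(greater_than_some(2500, branch_a_salaries))  # True: 2500 > 2000
print(greater_than_all(2500, branch_a_salaries))   # False: 2500 is not > 3000
print(greater_than_all(4500, branch_a_salaries))   # True
# Empty subquery: SOME/ANY gives false, ALL gives true.
print(greater_than_some(2500, []))                 # False
print(greater_than_all(2500, []))                  # True
```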

04/19/23

Natural Join

A Natural Join is a join operation that joins two tables by their common column. This operation is similar to the setting relation of two tables.

SELECT a.comcol, a.col1, b.col2, expr1, expr2 ;
FROM table1 a, table2 b ;
WHERE a.comcol = b.comcol

[Figure: Student (id, name, class) and Music (id, type) are joined on the same id, e.g. 9801, to give Product (id, name, class, type)]

eg. 25 Make a list of students and the instruments they learn. (Natural Join)

SELECT s.class, s.name, s.id, m.type ;

FROM student s, music m ;

WHERE s.id=m.id ORDER BY class, name

Result

class  name    id    type
1A     Aaron   9812  Piano
1A     Bobby   9811  Flute
1A     Gigi    9824  Recorder
1A     Jill    9820  Piano
1A     Johnny  9803  Violin
1A     Luke    9810  Piano
1A     Mary    9802  Flute
:      :       :     :

eg. 26 Find the number of students learning piano in each class.

Three Parts:

(1) Natural Join.

(2) Condition: m.type="Piano"

(3) GROUP BY class

[Figure: eg. 26. Student and Music are joined into Product, filtered by the condition m.type = "Piano", then grouped by class]

SELECT s.class, COUNT(*) ;

FROM student s, music m ;

WHERE s.id=m.id AND m.type="Piano" ;

GROUP BY class ORDER BY class

Result

class  cnt
1A     4
1B     2
1C     1

Outer Join

An Outer Join is a join operation that includes rows that have a match, plus rows that do not have a match in the other table.

eg. 27 List the students who have not yet chosen an instrument. (No match)

[Figure: rows of Student (id, name, class) whose id has no match in Music (id, type)]

SELECT class, name, id FROM student ;

WHERE id NOT IN ( SELECT id FROM music ) ;
ORDER BY class, name

Result

class  name    id
1A     Mandy   9821
1B     Kenny   9814
1B     Tobe    9805
1C     Edmond  9818
1C     George  9817
:      :       :

eg. 28 Make a checking list of students and the instruments they learn. The list should also contain the students without an instrument. (Outer Join)

[Figure: the Outer Join is the Natural Join result together with the No Match rows]

SELECT s.class, s.name, s.id, m.type ;

FROM student s, music m ;

WHERE s.id=m.id ;

UNION ;

SELECT class, name, id, "" ;

FROM student ;

WHERE id NOT IN ( SELECT id FROM music ) ;

ORDER BY 1, 2
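The UNION construction above can be compared with a direct LEFT OUTER JOIN. The sketch below does this in SQLite through Python's `sqlite3` module on four of the students that appear in the slides' result tables. The table definitions are assumptions, the trailing `;` line-continuation marks of the original dialect are dropped, and COALESCE is used so that the join also shows an empty string for students without an instrument.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER, name TEXT, class TEXT)")
conn.execute("CREATE TABLE music (id INTEGER, type TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(9812, "Aaron", "1A"), (9811, "Bobby", "1A"),
     (9821, "Mandy", "1A"), (9814, "Kenny", "1B")],
)
conn.executemany("INSERT INTO music VALUES (?, ?)", [(9812, "Piano"), (9811, "Flute")])

# Natural join rows, plus the "no match" rows with an empty instrument.
union_rows = conn.execute(
    """
    SELECT s.class, s.name, s.id, m.type
    FROM student s, music m
    WHERE s.id = m.id
    UNION
    SELECT class, name, id, ''
    FROM student
    WHERE id NOT IN (SELECT id FROM music)
    ORDER BY 1, 2
    """
).fetchall()

# The same list written as a single left outer join.
left_rows = conn.execute(
    """
    SELECT s.class, s.name, s.id, COALESCE(m.type, '')
    FROM student s LEFT OUTER JOIN music m ON s.id = m.id
    ORDER BY 1, 2
    """
).fetchall()

for row in union_rows:
    print(row)
print(union_rows == left_rows)  # True
```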

No Match (type is empty)

class  name    id
1A     Mandy   9821
1B     Kenny   9814
1B     Tobe    9805
1C     Edmond  9818
1C     George  9817
:      :       :

Natural Join

class  name    id    type
1A     Aaron   9812  Piano
1A     Bobby   9811  Flute
1A     Gigi    9824  Recorder
1A     Jill    9820  Piano
1A     Johnny  9803  Violin
1A     Luke    9810  Piano
1A     Mary    9802  Flute
:      :       :     :

Outer Join

class  name    id    type
1A     Aaron   9812  Piano
1A     Bobby   9811  Flute
1A     Gigi    9824  Recorder
1A     Jill    9820  Piano
1A     Johnny  9803  Violin
1A     Luke    9810  Piano
1A     Mandy   9821
1A     Mary    9802  Flute
1A     Peter   9801  Piano
1A     Ron     9813  Guitar
1B     Eddy    9815  Piano
1B     Janet   9822  Guitar
1B     Kenny   9814
1B     Kitty   9806  Recorder
:      :       :     :

Multi-Table Queries

Join

Inner Join

Left Outer Join

Right Outer Join

Full Outer Join

Source: Database Systems Connolly/Begg

Join

SELECT client

FROM Client c, View v

WHERE c.client = v.client;

Source: Database Systems Connolly/Begg

ISO standard alternatives:

FROM Client c JOIN View v ON c.client = v.client (creates two identical client columns)

FROM Client JOIN View USING (client)

FROM Client NATURAL JOIN View

Join

The join operation combines data from two tables by forming pairs of related rows where the matching columns in each table have the same value.

If one row of a table is unmatched, the row is omitted from the resulting table.

Source: Database Systems Connolly/Begg