Relational Algebra and MySQL (II)
Prof. Sin Min Lee
Department of Computer Science
San Jose State University
Lecture 12: Further relational algebra, further SQL
www.cl.cam.ac.uk/Teaching/current/Databases/
Today’s lecture
Where does SQL differ from relational model?
What are some other features of SQL?
How can we extend the relational algebra to match more closely SQL?
Duplicate rows
Consider our relation instances from lecture 6: Reserves, Sailors and Boats
Consider
SELECT rating,age
FROM Sailors;
We get a relation that doesn’t satisfy our definition of a relation!
RECALL: We have the keyword DISTINCT to remove duplicates
Multiset semantics
A relation in SQL is really a multiset or bag, rather than a set as in the relational model
A multiset has no order (unlike a list), but allows duplicates
E.g. {1,2,1,3} is a bag
select, project and join work for bags as well as sets
Just work on a tuple-by-tuple basis
Bag operations
Bag union: Sum the number of times that an element appears in the two bags, e.g. {1,2,1} ∪ {1,2,3} = {1,1,1,2,2,3}
Bag intersection: Take the minimum of the number of occurrences in each bag, e.g. {1,2,1} ∩ {1,2,3,3} = {1,2}
Bag difference: Proper-subtract the number of occurrences in the two bags, e.g. {1,2,1} - {1,2,3,3} = {1}
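The three bag operations can be tried out with Python's `collections.Counter`, which stores a count per element. This is only a sketch; the function names `bag_union`, `bag_intersection` and `bag_difference` are illustrative and not part of the lecture.

```python
from collections import Counter

def bag_union(r, s):
    # Sum the number of occurrences of each element.
    return r + s

def bag_intersection(r, s):
    # Take the minimum number of occurrences of each element.
    return r & s

def bag_difference(r, s):
    # Proper subtraction: counts never drop below zero.
    return r - s

union = sorted(bag_union(Counter([1, 2, 1]), Counter([1, 2, 3])).elements())
inter = sorted(bag_intersection(Counter([1, 2, 1]), Counter([1, 2, 3, 3])).elements())
diff = sorted(bag_difference(Counter([1, 2, 1]), Counter([1, 2, 3, 3])).elements())
print(union, inter, diff)
```

The three printed lists correspond to the three examples given for union, intersection and difference.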
Laws for bags
Note that whilst some of the familiar (set-theoretic) laws continue to hold, some of them do not
Example: R ∩ (S ∪ T) = (R ∩ S) ∪ (R ∩ T) ??
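A candidate law can be tested on small bags. The sketch below checks one distributive law, R ∩ (S ∪ T) = (R ∩ S) ∪ (R ∩ T), using sum of counts for bag union and minimum of counts for bag intersection. The choice of this particular law and of the sample bags is an assumption made for illustration.

```python
from collections import Counter

def distributes(r, s, t):
    # Compare both sides of R ∩ (S ∪ T) = (R ∩ S) ∪ (R ∩ T) under bag semantics.
    lhs = r & (s + t)
    rhs = (r & s) + (r & t)
    return lhs == rhs

one = Counter([1])
bag_holds = distributes(one, one, one)

# The same law with ordinary sets, where union and intersection are | and &.
a = {1}
set_holds = (a & (a | a)) == ((a & a) | (a & a))
print(bag_holds, set_holds)
```

With R = S = T = {1} the left side contains one copy of 1 and the right side two, so the law holds for sets but fails for bags.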
Extended relational algebra
Add features needed for SQL
1. Bag semantics
2. Duplicate elimination operator, δ
3. Sorting operator, τ
4. Grouping and aggregation operator, γ
5. Outerjoin operators, oV, Vo, oVo
Duplicate-elimination operator
δ(R) = relation R with any duplicated tuples removed
R=
A B
1 2
3 4
1 2
δ(R)=
A B
1 2
3 4
This is used to model the DISTINCT feature of SQL
Sorting
τ_{L1,…,Ln}(R) returns a list of tuples of R, ordered according to the attributes L1, …, Ln
Note: does not return a relation
R=
A B
1 3
3 4
5 2
τ_{B}(R)= [(5,2),(1,3),(3,4)]
ORDER BY in SQL, e.g.
SELECT * FROM Sailors WHERE rating>7 ORDER BY age, sname;
Extended projection
SQL allows us to use arithmetic operators
SELECT age*5
FROM Sailors;
We extend the projection operator to allow the columns in the projection to be functions of one or more columns in the argument relation, e.g.
R=
A B
1 2
3 4
π_{A+B,A,A}(R)=
A+B A.1 A.2
3   1   1
7   3   3
Arithmetic
Arithmetic (and other expressions) cannot be used at the top level, i.e. 2+2 is not a valid SQL query
How would you get SQL to compute 2+2?
Aggregation
SQL provides us with operations to summarise a column in some way, e.g.
SELECT COUNT(rating) FROM Sailors;
SELECT COUNT(DISTINCT rating) FROM Sailors;
SELECT COUNT(*) FROM Sailors WHERE rating>7;
We also have SUM, AVG, MIN and MAX
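The three COUNT queries can be run against an in-memory SQLite database from Python. The Sailors rows below are hypothetical sample data (the lecture 6 instance is not reproduced here), and the column list (sid, sname, rating, age) is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
# Hypothetical sample rows, not the instance used in the lecture.
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5),
     (58, "Rusty", 10, 35.0), (64, "Horatio", 7, 35.0)],
)

def scalar(sql):
    # Return the single value produced by an aggregate query.
    return conn.execute(sql).fetchone()[0]

count_rating = scalar("SELECT COUNT(rating) FROM Sailors")
count_distinct = scalar("SELECT COUNT(DISTINCT rating) FROM Sailors")
count_high = scalar("SELECT COUNT(*) FROM Sailors WHERE rating > 7")
print(count_rating, count_distinct, count_high)
```

COUNT(rating) counts every row with a rating, while COUNT(DISTINCT rating) counts each rating value once, so the two differ whenever a rating repeats.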
Grouping
These aggregation operators have been applied to all qualifying tuples. Sometimes we want to apply them to each of several groups of tuples, e.g.
For each rating, find the average age of the sailors
For each rating, find the age of the youngest sailor
GROUP BY in SQL
SELECT [DISTINCT] target-list
FROM relation-list
WHERE qualification
GROUP BY grouping-list;
The target-list contains
1. List of column names
2. Aggregate terms
NOTE: The column names in target-list must be contained in grouping-list
GROUP BY cont.
For each rating, find the average age of the sailors
SELECT rating,AVG(age)
FROM Sailors
GROUP BY rating;
For each rating find the age of the youngest sailor
SELECT rating,MIN(age)
FROM Sailors
GROUP BY rating;
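Both GROUP BY queries can be combined into one and run in SQLite from Python. As before, the Sailors rows are hypothetical sample data and the schema is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
# Hypothetical sample rows, not the instance used in the lecture.
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5),
     (58, "Rusty", 10, 35.0), (64, "Horatio", 7, 35.0)],
)

# One output row per rating: the average age and the youngest age in that group.
per_rating = conn.execute(
    "SELECT rating, AVG(age), MIN(age) FROM Sailors GROUP BY rating ORDER BY rating"
).fetchall()
for row in per_rating:
    print(row)
```

Only rating appears as a plain column in the target list, and it is also the grouping column, which is what the NOTE on the target-list requires.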
Grouping and aggregation
γ_{L}(R) where L is a list of elements that are either
Individual column names (“Grouping attributes”), or
Of the form θ(A), where θ is an aggregation operator (MIN, SUM, …) and A is the column it is applied to
For example, γ_{rating,AVG(age)}(Sailors)
Semantics
Group R according to the grouping attributes
Within each group, compute θ(A) for each aggregate term θ(A) in L
Result is the relation consisting of one tuple for each group. The components of that tuple are the values associated with each element of L for that group
Example
Let R=
bar     beer     price
Anchor  6X       2.50
Anchor  Adnam’s  2.40
Mill    6X       2.60
Mill    Fosters  2.80
Eagle   Fosters  2.90
Compute γ_{beer,AVG(price)}(R)
Example cont.
1. Group according to the grouping attribute, beer:
bar     beer     price
Anchor  6X       2.50
Mill    6X       2.60
Anchor  Adnam’s  2.40
Mill    Fosters  2.80
Eagle   Fosters  2.90
2. Compute average of price within groups:
beer     price
6X       2.55
Adnam’s  2.40
Fosters  2.85
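The same grouping can be checked in SQLite from Python using the rows of R from the example. The table name `R` and the variable `avg_price` are illustrative choices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (bar TEXT, beer TEXT, price REAL)")
conn.executemany(
    "INSERT INTO R VALUES (?, ?, ?)",
    [("Anchor", "6X", 2.50), ("Anchor", "Adnam's", 2.40), ("Mill", "6X", 2.60),
     ("Mill", "Fosters", 2.80), ("Eagle", "Fosters", 2.90)],
)

# Group by beer and average the price within each group.
avg_price = dict(conn.execute("SELECT beer, AVG(price) FROM R GROUP BY beer").fetchall())
for beer, price in sorted(avg_price.items()):
    print(beer, round(price, 2))
```

The averages are floating-point values, so they are rounded to two decimal places for display.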
NULL values
Sometimes field values are unknown (e.g. rating not known yet), or inapplicable (e.g. no spouse name)
SQL provides a special value, NULL, for both these situations
This complicates several issues
Special operators needed to check for NULL
Is NULL>8? Is (NULL OR TRUE)=TRUE?
We need a three-valued logic
Need to carefully re-define semantics
NULL values
Consider
INSERT INTO Sailors (sid,sname) VALUES (101,'Julia');
SELECT * FROM Sailors;
SELECT rating FROM Sailors;
SELECT sname FROM Sailors WHERE rating>0;
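The behaviour of these queries can be observed in SQLite from Python, where NULL is returned as `None`. This is a sketch under an assumed Sailors schema (sid, sname, rating, age); other systems follow the same three-valued logic but may display results differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age REAL)")
conn.execute("INSERT INTO Sailors (sid, sname) VALUES (101, 'Julia')")

# The unspecified rating is stored as NULL.
ratings = conn.execute("SELECT rating FROM Sailors").fetchall()
# NULL > 0 is unknown, so Julia does not qualify.
positive = conn.execute("SELECT sname FROM Sailors WHERE rating > 0").fetchall()
# IS NULL is the special operator for testing NULL.
unknown = conn.execute("SELECT sname FROM Sailors WHERE rating IS NULL").fetchall()

null_gt_8 = conn.execute("SELECT NULL > 8").fetchone()[0]
null_or_true = conn.execute("SELECT NULL OR 1").fetchone()[0]
print(ratings, positive, unknown, null_gt_8, null_or_true)
```

NULL > 8 evaluates to unknown (NULL), whereas NULL OR TRUE is true because one true operand is enough for OR.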
Entity integrity constraint
An entity integrity constraint states that no primary key value can be NULL
Outer join
Note that with the usual join, a tuple that doesn’t ‘join’ with any from the other relation is removed from the resulting relation
Instead, we can ‘pad out’ the columns with NULLs
This operator is called a full outer join, written oVo
Example of full outer join
Let R=
A B
1 2
3 4
Let S=
B C
4 5
6 7
Then R ⋈ S =
A B C
3 4 5
But R oVo S =
A    B C
1    2 NULL
3    4 5
NULL 6 7
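The example can be reproduced in SQLite from Python. Because FULL OUTER JOIN is only available in recent SQLite versions, this sketch emulates it (an assumption of the example, not the lecture's method) as a left outer join plus the unmatched rows of S padded with NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (A INTEGER, B INTEGER)")
conn.execute("CREATE TABLE S (B INTEGER, C INTEGER)")
conn.executemany("INSERT INTO R VALUES (?, ?)", [(1, 2), (3, 4)])
conn.executemany("INSERT INTO S VALUES (?, ?)", [(4, 5), (6, 7)])

# Usual join on the common column B: unmatched tuples are dropped.
natural = conn.execute("SELECT R.A, R.B, S.C FROM R JOIN S ON R.B = S.B").fetchall()

# Emulated full outer join: keep all of R, then add the rows of S with no partner.
full_outer = conn.execute("""
    SELECT R.A, R.B, S.C FROM R LEFT OUTER JOIN S ON R.B = S.B
    UNION ALL
    SELECT NULL, S.B, S.C FROM S WHERE S.B NOT IN (SELECT B FROM R)
""").fetchall()
print(natural)
print(full_outer)
```

NULL appears as `None` in the Python result tuples.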
Outer joins in SQL
SQL/92 has three variants:
LEFT OUTER JOIN (algebra: oV)
RIGHT OUTER JOIN (algebra: Vo)
FULL OUTER JOIN (algebra: oVo)
For example:
SELECT * FROM Reserves r LEFT OUTER JOIN Sailors s ON r.sid=s.sid;
Views
A view is a query with a name that can be used in further SELECT statements, e.g.
CREATE VIEW ExpertSailors(sid,sname,age)
AS SELECT sid,sname,age
FROM Sailors
WHERE rating>9;
Note that ExpertSailors is not a stored relation
(WARNING: MySQL versions before 5.0 do not support views)
Querying views
So an example query
SELECT sname
FROM ExpertSailors
WHERE age>27;
is translated by the system to the following:
SELECT sname
FROM Sailors
WHERE rating>9 AND age>27;
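The equivalence of the two queries can be checked in SQLite from Python. The Sailors rows are hypothetical sample data and the schema is an assumption; the view definition is the one given for ExpertSailors.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
# Hypothetical sample rows, not the instance used in the lecture.
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [(22, "Dustin", 7, 45.0), (58, "Rusty", 10, 35.0), (71, "Zorba", 10, 16.0)],
)
conn.execute("""
    CREATE VIEW ExpertSailors(sid, sname, age)
    AS SELECT sid, sname, age FROM Sailors WHERE rating > 9
""")

# Query against the view, and the query it is translated to on the base table.
via_view = conn.execute("SELECT sname FROM ExpertSailors WHERE age > 27").fetchall()
expanded = conn.execute("SELECT sname FROM Sailors WHERE rating > 9 AND age > 27").fetchall()
print(via_view, expanded)
```

Both queries return the same rows because the view stores no data of its own; it is evaluated against Sailors each time it is used.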
Relational Algebra
The Relational Algebra defines the operations that can be applied to relations (tables) to manipulate their data.
It is used as the basis of SQL for relational databases, and illustrates the basic operations required of any DML.
This Algebra is composed of Unary operations (involving a single table) and Binary operations (involving multiple tables).
SQL
Structured Query Language (SQL)
Standardised by ANSI
Supported by modern RDBMSs
Commands fall into three groups
Data Definition Language (DDL): Create tables, etc
Data Manipulation Language (DML): Retrieve and modify data
Data Control Language (DCL): Control what users can do, i.e. grant and revoke privileges
Unary Operations
Selection
Projection
Selection
The selection or σ operation selects rows from a table that satisfy a condition:
σ < condition > < tablename >
Example: σ <course = ‘CM’> Students
Students
stud#  name  course
100    Fred  PH
200    Dave  CM
300    Bob   CM
Result
stud#  name  course
200    Dave  CM
300    Bob   CM
Projection
The projection or π operation selects a list of columns from a table.
π < column list > < tablename >
Example: π <stud#, name> Students
Students
stud#  name  course
100    Fred  PH
200    Dave  CM
300    Bob   CM
Result
stud#  name
100    Fred
200    Dave
300    Bob
Selection / Projection
Selection and Projection are usually combined:
π <stud#, name> (σ <course = ‘CM’> Students)
Students
stud#  name  course
100    Fred  PH
200    Dave  CM
300    Bob   CM
Result
stud#  name
200    Dave
300    Bob
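Selection, projection and their combination can be tried in SQLite from Python on the Students table. In this sketch the column name stud# is written as a quoted identifier, "stud#", because # is not allowed in an unquoted name in SQLite; that quoting is a detail of the example, not of the algebra.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Students ("stud#" INTEGER, name TEXT, course TEXT)')
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [(100, "Fred", "PH"), (200, "Dave", "CM"), (300, "Bob", "CM")],
)

# Selection: keep the rows that satisfy the condition.
selection = conn.execute(
    """SELECT * FROM Students WHERE course = 'CM' ORDER BY 1"""
).fetchall()
# Projection: keep the listed columns.
projection = conn.execute(
    """SELECT "stud#", name FROM Students ORDER BY 1"""
).fetchall()
# Selection followed by projection.
combined = conn.execute(
    """SELECT "stud#", name FROM Students WHERE course = 'CM' ORDER BY 1"""
).fetchall()
print(selection, projection, combined)
```

The WHERE clause plays the role of the selection and the SELECT list plays the role of the projection.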
Binary Operations
Cartesian Product
Theta Join
Inner Join
Natural Join
Outer Joins
Semi Joins
Cartesian Product
Concatenation of every row in the first relation (R) with every row in the second relation (S):
R X S
Cartesian Product - Example
Students
stud#  name  course
100    Fred  PH
200    Dave  CM
300    Bob   CM
Courses
course#  name
PH       Pharmacy
CM       Computing
Students X Courses =
stud#  Students.name  course  course#  Courses.name
100    Fred           PH      PH       Pharmacy
100    Fred           PH      CM       Computing
200    Dave           CM      PH       Pharmacy
200    Dave           CM      CM       Computing
300    Bob            CM      PH       Pharmacy
300    Bob            CM      CM       Computing
Theta Join
A Cartesian product with a condition applied:
R ⋈ <condition> S
Theta Join - Example
Students
stud#  name  course
100    Fred  PH
200    Dave  CM
300    Bob   CM
Courses
course#  name
PH       Pharmacy
CM       Computing
Students ⋈ <stud# = 200> Courses
stud# Students.name course course# Courses.name
200 Dave CM PH Pharmacy
200 Dave CM CM Computing
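The Cartesian product and the theta join can be compared in SQLite from Python using the Students and Courses rows from the example. Column names containing # are written as quoted identifiers, which is a detail of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Students ("stud#" INTEGER, name TEXT, course TEXT)')
conn.execute('CREATE TABLE Courses ("course#" TEXT, name TEXT)')
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [(100, "Fred", "PH"), (200, "Dave", "CM"), (300, "Bob", "CM")],
)
conn.executemany(
    "INSERT INTO Courses VALUES (?, ?)",
    [("PH", "Pharmacy"), ("CM", "Computing")],
)

# Cartesian product: every Students row paired with every Courses row.
product = conn.execute("SELECT * FROM Students, Courses").fetchall()
# Theta join: the product restricted by a condition.
theta = conn.execute(
    """SELECT * FROM Students, Courses WHERE "stud#" = 200"""
).fetchall()
print(len(product), theta)
```

With 3 students and 2 courses the product has 6 rows, and the condition stud# = 200 keeps the 2 rows belonging to Dave.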
Inner Join (Equijoin)
A Theta join where the <condition> is the match (=) of the primary and foreign keys.
R ⋈ <R.primary_key = S.foreign_key> S
Inner Join - Example
Students
stud#  name  course
100    Fred  PH
200    Dave  CM
300    Bob   CM
Courses
course#  name
PH       Pharmacy
CM       Computing
Students ⋈ <course = course#> Courses
stud# Students.name course course# Courses.name
100 Fred PH PH Pharmacy
200 Dave CM CM Computing
300 Bob CM CM Computing
Natural Join
Inner join produces redundant data (in the previous example: course and course#). To get rid of this duplication:
π < stud#, Students.name, course, Courses.name > (Students ⋈ <course = course#> Courses)
Or
R1= Students ⋈ <course = course#> Courses
R2= π < stud#, Students.name, course, Courses.name > R1
The result is called the natural join of Students and Courses
Natural Join - Example
Students
stud#  name  course
100    Fred  PH
200    Dave  CM
300    Bob   CM
Courses
course#  name
PH       Pharmacy
CM       Computing
R1= Students ⋈ <course = course#> Courses
R2= π < stud#, Students.name, course, Courses.name > R1
stud#  Students.name  course  Courses.name
100    Fred           PH      Pharmacy
200    Dave           CM      Computing
300    Bob            CM      Computing
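The two steps, equijoin and then projection, can be written as a single query. The following is a sketch in SQLite from Python using the example rows, with # column names written as quoted identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Students ("stud#" INTEGER, name TEXT, course TEXT)')
conn.execute('CREATE TABLE Courses ("course#" TEXT, name TEXT)')
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [(100, "Fred", "PH"), (200, "Dave", "CM"), (300, "Bob", "CM")],
)
conn.executemany(
    "INSERT INTO Courses VALUES (?, ?)",
    [("PH", "Pharmacy"), ("CM", "Computing")],
)

# The WHERE clause is the equijoin (R1); the SELECT list is the projection (R2)
# that drops the redundant course# column.
natural = conn.execute("""
    SELECT "stud#", Students.name, course, Courses.name
    FROM Students, Courses
    WHERE course = "course#"
    ORDER BY 1
""").fetchall()
for row in natural:
    print(row)
```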
Outer Joins
Inner join + rows of one table which do not satisfy the <condition>.
Left Outer Join: R ⟕ <R.primary_key = S.foreign_key> S
All rows from R are retained; rows of R with no match in S are padded with NULL in the columns of S
Right Outer Join: R ⟖ <R.primary_key = S.foreign_key> S
All rows from S are retained; rows of S with no match in R are padded with NULL in the columns of R
Left Outer Join - Example
Students
stud#  name   course
100    Fred   PH
200    Dave   CM
400    Peter  EN
Courses
course#  name
PH       Pharmacy
CM       Computing
CH       Chemistry
Students ⟕ <course = course#> Courses
stud#  Students.name  course  course#  Courses.name
100    Fred           PH      PH       Pharmacy
200    Dave           CM      CM       Computing
400    Peter          EN      NULL     NULL
Right Outer Join - Example
Students
stud#  name   course
100    Fred   PH
200    Dave   CM
400    Peter  EN
Courses
course#  name
PH       Pharmacy
CM       Computing
CH       Chemistry
Students ⟖ <course = course#> Courses
stud#  Students.name  course  course#  Courses.name
100    Fred           PH      PH       Pharmacy
200    Dave           CM      CM       Computing
NULL   NULL           NULL    CH       Chemistry
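Both examples can be reproduced in SQLite from Python. Older SQLite versions have no RIGHT OUTER JOIN, so this sketch obtains the right outer join by swapping the two tables in a LEFT OUTER JOIN and keeping the column order; that workaround is an assumption of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Students ("stud#" INTEGER, name TEXT, course TEXT)')
conn.execute('CREATE TABLE Courses ("course#" TEXT, name TEXT)')
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [(100, "Fred", "PH"), (200, "Dave", "CM"), (400, "Peter", "EN")],
)
conn.executemany(
    "INSERT INTO Courses VALUES (?, ?)",
    [("PH", "Pharmacy"), ("CM", "Computing"), ("CH", "Chemistry")],
)

columns = 's."stud#", s.name, s.course, c."course#", c.name'

# Left outer join: every student is kept, with NULL course columns if unmatched.
left = conn.execute(
    f'SELECT {columns} FROM Students s LEFT OUTER JOIN Courses c ON s.course = c."course#"'
).fetchall()
# Right outer join, emulated by swapping the tables: every course is kept.
right = conn.execute(
    f'SELECT {columns} FROM Courses c LEFT OUTER JOIN Students s ON s.course = c."course#"'
).fetchall()
print(left)
print(right)
```

Peter (course EN) survives only in the left outer join, and Chemistry (CH) survives only in the right outer join.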
Combination of Unary and Join Operations
Students
stud#  name  address   course
100    Fred  Aberdeen  PH
200    Dave  Dundee    CM
300    Bob   Aberdeen  CM
Courses
course#  name
PH       Pharmacy
CM       Computing
Show the names of students (from Aberdeen) and the names of their courses
R1= Students ⋈ <course=course#> Courses
R2= σ <address=“Aberdeen”> R1
R3= π <Students.name, Courses.name> R2
Students.name  Courses.name
Fred           Pharmacy
Bob            Computing
Set Operations
Union
Intersection
Difference
Union
Takes the set of rows in each table and combines them, eliminating duplicates
Participating relations must be compatible, i.e. have the same number of columns, and the same column names, domains, and data types
R
A   B
a1  b1
a2  b2
S
A   B
a2  b2
a3  b3
R ∪ S
A   B
a1  b1
a2  b2
a3  b3
Intersection
Takes the set of rows that are common to each relation
Participating relations must be compatible
R
A   B
a1  b1
a2  b2
S
A   B
a2  b2
a3  b3
R ∩ S
A   B
a2  b2
Difference
Takes the set of rows in the first relation but not the second
Participating relations must be compatible
R
A   B
a1  b1
a2  b2
S
A   B
a2  b2
a3  b3
R - S
A   B
a1  b1
Exercise (May 2004 Exam)
Employee
empid  name
E100   Fred
E200   Dave
E300   Bob
E400   Peter
WorkLoad
empid*  projid*  duration
E100    P001     17
E200    P001     12
E300    P002     15
Project
projid  name
P001    DB
P002    Access
P003    SQL
Determine the outcome of the following operations:
A natural join between Employee and WorkLoad
A left outer join between Employee and WorkLoad
A right outer join between WorkLoad and Project
Relational Algebra Operations written in SQL
Unary Operations
Selection
σ <course = ‘Computing’> Students
In SQL:
Select *
From Students
Where course = 'Computing';
Projection
π <stud#, name> Students
In SQL:
Select stud#, name
From Students;
Selection & Projection
π <stud#, name> (σ <course = ‘Computing’> Students)
In SQL:
Select stud#, name
From Students
Where course = 'Computing';
Binary Operations/Joins
Cartesian Product: Students X Courses
In SQL:
Select *
From Students, Courses;
Theta Join: Students ⋈ <stud# =200> Courses
In SQL:
Select *
From Students, Courses
Where stud# = 200;
Binary Operations/Joins
Inner Join (Equijoin): Students ⋈ <course=course#> Courses
In SQL:
Select *
From Students, Courses
Where course=course#;
Natural Join:
R1= Students ⋈ <course = course#> Courses
R2= π < stud#, Students.name, course, Courses.name > R1
In SQL:
Select stud#, Students.name, course, Courses.name
From Students, Courses
Where course=course#;
Outer Joins
Left Outer Join
Students ⟕ <course = course#> Courses
In SQL (Oracle (+) syntax):
Select *
From Students, Courses
Where course = course#(+)
Right Outer Join
Students ⟖ <course = course#> Courses
In SQL (Oracle (+) syntax):
Select *
From Students, Courses
Where course(+) = course#
Combination of Unary and Join Operations
R1= Students ⋈ <course=course#> Courses
R2= σ <address=“Aberdeen”> R1
R3= π <Students.name, Courses.name> R2
In SQL:
Select Students.name, Courses.name
From Students, Courses
Where course=course#
AND address='Aberdeen';
Set Operations
Union: R ∪ S
In SQL:
Select * From R
Union
Select * From S;
Intersection: R ∩ S
In SQL:
Select * From R
Intersect
Select * From S;
Difference: R - S
In SQL (MINUS is Oracle syntax; the ISO standard keyword is EXCEPT):
Select * From R
Minus
Select * From S;
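The three set operations can be run in SQLite from Python on the relations R and S from the set-operation examples. SQLite uses the keyword EXCEPT for difference, so that keyword is used in this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (A TEXT, B TEXT)")
conn.execute("CREATE TABLE S (A TEXT, B TEXT)")
conn.executemany("INSERT INTO R VALUES (?, ?)", [("a1", "b1"), ("a2", "b2")])
conn.executemany("INSERT INTO S VALUES (?, ?)", [("a2", "b2"), ("a3", "b3")])

def run(operator):
    # Combine R and S with the given set operator and return the rows in order.
    sql = f"SELECT * FROM R {operator} SELECT * FROM S ORDER BY 1"
    return conn.execute(sql).fetchall()

union = run("UNION")
intersection = run("INTERSECT")
difference = run("EXCEPT")
print(union, intersection, difference)
```

UNION removes the duplicate row (a2, b2), which appears in both relations.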
SQL Operators
Between, In, Like, Not
SQL Operators
SELECT *
FROM Book
WHERE catno BETWEEN 200 AND 400;
SELECT *
FROM Product
WHERE prod_desc BETWEEN 'C' AND 'S';
SELECT *
FROM Book
WHERE catno NOT BETWEEN 200 AND 400;
SQL Operators
SELECT Catno
FROM Loan
WHERE Date-Returned IS NULL;
SELECT Catno
FROM Loan
WHERE Date-Returned IS NOT NULL;
SQL Operators
SELECT Name
FROM Member
WHERE memno IN (100, 200, 300, 400);
SELECT Name
FROM Member
WHERE memno NOT IN (100, 200, 300, 400);
SQL Operators
SELECT Name
FROM Member
WHERE address NOT LIKE '%Aberdeen%';
SELECT Name
FROM Member
WHERE Name LIKE '_ES%';
Note: In MS Access, use * and ? instead of % and _
Selecting Distinct Values
Student
stud#  name  address
100    Fred  Aberdeen
200    Dave  Dundee
300    Bob   Aberdeen
SELECT Distinct address
FROM Student;
address
Aberdeen
Dundee
Exercise
Employee(empid, name)
Project(projid, name)
WorkLoad(empid*, projid*, duration)
List the names of employees working on the project named ‘Databases’.
Nested Subqueries: Use of IN
SELECT property
FROM PropertyForRent
WHERE staff IN ( staff who work
at the branch on ‘112 A St’ );
Source: Database Systems Connolly/Begg
Since the subquery may select more than one row, “=” cannot be used.
Use of ANY/SOME
SELECT name, salary
FROM Staff
WHERE salary > SOME( SELECT salary
FROM Staff
WHERE branch = 'A' );
Source: Database Systems Connolly/Begg
Subquery result: {2000, 3000, 4000}
Query result: {list of staff with salary greater than 2000}
Use of ALL
SELECT name, salary
FROM Staff
WHERE salary > ALL( SELECT salary
FROM Staff
WHERE branch = 'A' );
Source: Database Systems Connolly/Begg
Subquery result: {2000, 3000, 4000}
Query result: {list of staff with salary greater than 4000}
Use of Any/Some and All
If the subquery is empty:
ALL returns true
ANY returns false
ISO standard allows SOME to be used interchangeably with ANY.
Source: Database Systems Connolly/Begg
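The meaning of > SOME and > ALL, including the empty-subquery case, can be sketched with Python's built-in `any` and `all`. The staff names and salaries below are hypothetical; the subquery result {2000, 3000, 4000} is the one from the examples.

```python
# Hypothetical staff rows: (name, salary).
staff = [("Ann", 1500), ("Bob", 2500), ("Cat", 3500), ("Dan", 4500)]
branch_a_salaries = [2000, 3000, 4000]

def greater_than_some(rows, salaries):
    # salary > SOME(...): true if the salary exceeds at least one value.
    return [name for name, salary in rows if any(salary > s for s in salaries)]

def greater_than_all(rows, salaries):
    # salary > ALL(...): true if the salary exceeds every value.
    return [name for name, salary in rows if all(salary > s for s in salaries)]

some_result = greater_than_some(staff, branch_a_salaries)
all_result = greater_than_all(staff, branch_a_salaries)
# With an empty subquery, SOME is false for every row and ALL is true for every row.
some_empty = greater_than_some(staff, [])
all_empty = greater_than_all(staff, [])
print(some_result, all_result, some_empty, all_empty)
```

`any` over an empty sequence is False and `all` over an empty sequence is True, which matches the behaviour stated for ANY and ALL on an empty subquery.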
04/19/23
Natural Join
A Natural Join is a join operation that joins two tables by their common column. This operation is similar to the setting relation of two tables.
SELECT a.comcol, a.col1, b.col2, expr1, expr2 ;
FROM table1 a, table2 b ;
WHERE a.comcol = b.comcol
[Diagram: a Student row (id, name, class) and a Music row (id, type) with the same id, e.g. 9801, are joined into one Product row (id, name, class, type).]
eg. 25 Make a list of students and the instruments they learn. (Natural Join)
SELECT s.class, s.name, s.id, m.type ;
FROM student s, music m ;
WHERE s.id=m.id ORDER BY class, name
Result
class  name    id    type
1A     Aaron   9812  Piano
1A     Bobby   9811  Flute
1A     Gigi    9824  Recorder
1A     Jill    9820  Piano
1A     Johnny  9803  Violin
1A     Luke    9810  Piano
1A     Mary    9802  Flute
:      :       :     :
eg. 26 Find the number of students learning piano in each class.
Three Parts:
(1) Natural Join.
(2) Condition: m.type="Piano"
(3) GROUP BY class
[Diagram: Student and Music are joined into a product, the condition m.type="Piano" is applied, and the result is grouped by class.]
SELECT s.class, COUNT(*) ;
FROM student s, music m ;
WHERE s.id=m.id AND m.type="Piano" ;
GROUP BY class ORDER BY class
Result
class  cnt
1A     4
1B     2
1C     1
Outer Join
An Outer Join is a join operation that includes rows that have a match, plus rows that do not have a match in the other table.
eg. 27 List the students who have not yet chosen an instrument. (No match)
[Diagram: a Student row (id, name, class) whose id has no match in Music (id, type).]
SELECT class, name, id FROM student ;
WHERE id NOT IN ( SELECT id FROM music ) ;
ORDER BY class, name
Result
class  name    id
1A     Mandy   9821
1B     Kenny   9814
1B     Tobe    9805
1C     Edmond  9818
1C     George  9817
:      :       :
eg. 28 Make a checking list of students and the instruments they learn. The list should also contain the students without an instrument. (Outer Join)
[Diagram: Outer Join = Natural Join + No Match.]
SELECT s.class, s.name, s.id, m.type ;
FROM student s, music m ;
WHERE s.id=m.id ;
UNION ;
SELECT class, name, id, "" ;
FROM student ;
WHERE id NOT IN ( SELECT id FROM music ) ;
ORDER BY 1, 2
Natural Join
class  name    id    type
1A     Aaron   9812  Piano
1A     Bobby   9811  Flute
1A     Gigi    9824  Recorder
1A     Jill    9820  Piano
1A     Johnny  9803  Violin
1A     Luke    9810  Piano
1A     Mary    9802  Flute
:      :       :     :
No Match (type is empty)
class  name    id
1A     Mandy   9821
1B     Kenny   9814
1B     Tobe    9805
1C     Edmond  9818
1C     George  9817
:      :       :
Outer Join
class  name    id    type
1A     Aaron   9812  Piano
1A     Bobby   9811  Flute
1A     Gigi    9824  Recorder
1A     Jill    9820  Piano
1A     Johnny  9803  Violin
1A     Luke    9810  Piano
1A     Mandy   9821
1A     Mary    9802  Flute
1A     Peter   9801  Piano
1A     Ron     9813  Guitar
1B     Eddy    9815  Piano
1B     Janet   9822  Guitar
1B     Kenny   9814
1B     Kitty   9806  Recorder
:      :       :     :
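The UNION query of eg. 28 can be tried in SQLite from Python. The sketch loads only the 14 rows visible in the Outer Join result (the full tables are not shown), and it drops the trailing semicolons, which are line-continuation marks in the slides' SQL dialect rather than part of the statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER, name TEXT, class TEXT)")
conn.execute("CREATE TABLE music (id INTEGER, type TEXT)")
# Only the rows visible in the Outer Join result table.
conn.executemany("INSERT INTO student VALUES (?, ?, ?)", [
    (9812, "Aaron", "1A"), (9811, "Bobby", "1A"), (9824, "Gigi", "1A"),
    (9820, "Jill", "1A"), (9803, "Johnny", "1A"), (9810, "Luke", "1A"),
    (9821, "Mandy", "1A"), (9802, "Mary", "1A"), (9801, "Peter", "1A"),
    (9813, "Ron", "1A"), (9815, "Eddy", "1B"), (9822, "Janet", "1B"),
    (9814, "Kenny", "1B"), (9806, "Kitty", "1B"),
])
conn.executemany("INSERT INTO music VALUES (?, ?)", [
    (9812, "Piano"), (9811, "Flute"), (9824, "Recorder"), (9820, "Piano"),
    (9803, "Violin"), (9810, "Piano"), (9802, "Flute"), (9801, "Piano"),
    (9813, "Guitar"), (9815, "Piano"), (9822, "Guitar"), (9806, "Recorder"),
])

# Matched rows, plus the students with no instrument padded with an empty type.
checklist = conn.execute("""
    SELECT s.class, s.name, s.id, m.type
    FROM student s, music m
    WHERE s.id = m.id
    UNION
    SELECT class, name, id, ''
    FROM student
    WHERE id NOT IN (SELECT id FROM music)
    ORDER BY 1, 2
""").fetchall()
for row in checklist:
    print(row)
```

Mandy and Kenny have no row in music, so they come from the second SELECT and appear with an empty type.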
Multi-Table Queries
Join
Inner Join
Left Outer Join
Right Outer Join
Full Outer Join
Source: Database Systems Connolly/Begg
Join
SELECT c.client
FROM Client c, View v
WHERE c.client = v.client;
Source: Database Systems Connolly/Begg
ISO standard alternatives:
FROM Client c JOIN View v ON c.client = v.client (creates two identical client columns)
FROM Client JOIN View USING (client)
FROM Client NATURAL JOIN View
Join
The join operation combines data from two tables by forming pairs of related rows where the matching columns in each table have the same value.
If one row of a table is unmatched, the row is omitted from the resulting table.
Source: Database Systems Connolly/Begg