Databases 2010
The Relational Model and SQL
Christian S. Jensen
Computer Science, Aarhus University
Acknowledgments: revised version of slides developed by Michael I. Schwartzbach
What is a Database?
Efficient, convenient, and safe storage of, and multi-user access to, very large amounts of persistent data
Queries are much more general than searching
Main Entry: da·ta·base
Pronunciation: \ˈdā-tə-ˌbās, ˈda- also ˈdä-\
Function: noun
Date: circa 1962
: a usually large collection of data organized especially for rapid search and retrieval (as by a computer)— database transitive verb
Examples:
Bank accounts
Blog archives
Google.com
Human genome
Amazon.com
Student records
Data Model
A (mathematical) representation of data
• tables/relations
• trees
• graphs
Operations on data
• insert, delete, update, query
Constraints on data
• data types
• uniqueness
• dependencies
The Relational Data Model
Data is stored in tables (relations)
Simple but flexible; supports many real-world applications
name     age  city
Joe      22   London
Jacques  27   Paris
Jose     34   Madrid
Terminology, illustrated on the table above:
• row (tuple): one line of the table, e.g., (Joe, 22, London)
• schema: the header of the table, here (name, age, city)
• column: all values under one header, e.g., 22, 27, 34
• attribute: the name of a column, e.g., age
• attribute value: a single entry in a row, e.g., 27
Abstract tables
• invariant under permutation of rows and columns
• no information is stored in the order
May or may not allow duplicate rows
name age city
Joe 22 London
Jacques 27 Paris
Jose 34 Madrid
The same relation with rows and columns permuted:
city name age
Madrid Jose 34
London Joe 22
Paris Jacques 27
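The permutation invariance described above can be illustrated with a small sketch. It assumes a relation is modelled as a multiset (a `Counter`) of rows, where each row is a set of (attribute, value) pairs; the helper name `relation` is hypothetical and not part of the slides.

```python
from collections import Counter

def relation(columns, rows):
    """Model a relation as a multiset of rows; a row is a set of (attribute, value) pairs."""
    return Counter(frozenset(zip(columns, row)) for row in rows)

original = relation(
    ("name", "age", "city"),
    [("Joe", 22, "London"), ("Jacques", 27, "Paris"), ("Jose", 34, "Madrid")],
)
# Same data with both the columns and the rows reordered
permuted = relation(
    ("city", "name", "age"),
    [("Madrid", "Jose", 34), ("London", "Joe", 22), ("Paris", "Jacques", 27)],
)

same_relation = original == permuted
print(same_relation)
```

Because neither the set of pairs nor the multiset of rows records an order, the two tables compare equal. Using a multiset rather than a set keeps duplicate rows, matching the "may or may not allow duplicate rows" remark.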
NULL Values
An attribute value may be NULL
• it is unknown
• no value exists
• it is unknown or does not exist
NULL values are treated specially
animal             color   zoo
lion               yellow  Copenhagen
crocodile          green   London
Tyrannosaurus Rex  NULL    NULL
polar bear         white   Berlin
Advantages of The Relational Model
A simple, intuitive model
Often convenient for real-life data
• but richer models are also needed, e.g., XML
An elegant mathematical foundation
• set and multi-set theory
• relational algebra and calculi
Allows efficient algorithms
Industrial-strength implementations are available
Schemas
Relation schema
• name of the relation
• names of the attributes
• types of the attributes
• constraints
Database schema
• collection of all relation schemas
Running Example
The database behind a tiny calendar system
• Rooms
• People
• Meetings
• Participants
• Equipment
Rooms
room: the name of a room
capacity: the number of people that it will hold
room capacity
Turing-216 6
Ada-333 26
Store-Aud 286
People
userid: unique user name
name: ordinary name
group: vip, tap, phd
office: a room or NULL
userid    name                 group  office
csj       Christian S. Jensen  vip    Turing-216
doina     Doina Bucur          phd    NULL
bnielsen  Kai Birger Nielsen   tap    Hopper-017
Meetings
meetid: a unique id
date: the date of the meeting
slot: 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
owner: the userid of the owner
what: a textual description
meetid date slot owner what
34716 2010-08-23 14 csj dDB
34717 2010-08-23 15 csj dDB
42835 2010-08-16 10 mis TA-meeting
Participants
meetid: the id of the meeting
pid: a userid or a room
status: u(nknown), a(ccept), d(ecline)
meetid pid status
34716 Store-Aud a
34716 csj a
42835 sigurd d
Equipment
room: the name of a room
type: the type of equipment
room type
Store-Aud projector
Store-Aud whiteboard
Hopper-017 mini-fridge
SQL
Structured Query Language
Invented by IBM in the 1970s (many versions)
High-level and declarative: no low-level manipulations
Algebraic foundations
Representations, operations, constraints
Query optimization
DB2, Oracle, SQL Server, MySQL, …
Declaring Tables (1/3)
CREATE TABLE Rooms (
room VARCHAR(15),
capacity INT
);
CREATE TABLE People (
name VARCHAR(40),
office VARCHAR(15),
userid VARCHAR(15),
group CHAR(3)
);
Declaring Tables (2/3)
CREATE TABLE Meetings (
meetid INT,
date DATE,
slot INT,
owner VARCHAR(15),
what VARCHAR(40)
);
Declaring Tables (3/3)
CREATE TABLE Participants (
meetid INT,
pid VARCHAR(15),
status CHAR(1)
);
CREATE TABLE Equipment (
room VARCHAR(15),
type VARCHAR(20)
);
SQL Types
INT         217
CHAR(2)     'aa', 'ab', '12', '++'
VARCHAR(5)  '', '12345', 'foo', 'x''y'
FLOAT       3.14, 42, 0.0018
DATE        '2008-08-25'
TIME        '14:15:00'
CLOB        a text file
BLOB        a movie
XML         an XML document
Refinements
NOT NULL
• the value cannot be NULL
DEFAULT value
• a default value is specified
UNIQUE
• the value is unique in the table
• unless it is NULL
PRIMARY KEY
• the value is unique in the table
• the value is never NULL
• special syntax for multi-attribute primary keys
Refined Tables (1/3)
CREATE TABLE Rooms (
room VARCHAR(15) PRIMARY KEY,
capacity INT NOT NULL
);
CREATE TABLE People (
name VARCHAR(40) NOT NULL,
office VARCHAR(15),
userid VARCHAR(15) PRIMARY KEY,
group CHAR(3)
);
Refined Tables (2/3)
CREATE TABLE Meetings (
meetid INT PRIMARY KEY,
date DATE,
slot INT,
owner VARCHAR(15) NOT NULL,
what VARCHAR(40)
);
Refined Tables (3/3)
CREATE TABLE Participants (
meetid INT NOT NULL,
pid VARCHAR(15) NOT NULL,
status CHAR(1) DEFAULT 'u'
);
CREATE TABLE Equipment (
room VARCHAR(15) NOT NULL,
type VARCHAR(20) NOT NULL,
PRIMARY KEY (room, type)
);
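As a rough check of what these refinements do, the sketch below runs two of the refined declarations through Python's built-in `sqlite3` module. SQLite is only a convenient stand-in here (an assumption; the slides do not prescribe a system), and other systems may word their errors differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Rooms (
    room     VARCHAR(15) PRIMARY KEY,
    capacity INT NOT NULL
);
CREATE TABLE Participants (
    meetid INT NOT NULL,
    pid    VARCHAR(15) NOT NULL,
    status CHAR(1) DEFAULT 'u'
);
""")
conn.execute("INSERT INTO Rooms VALUES ('Turing-216', 6)")

# Both statements violate a constraint and should be rejected
rejected = []
for label, statement in [
    ("duplicate primary key", "INSERT INTO Rooms VALUES ('Turing-216', 8)"),
    ("NULL in NOT NULL column", "INSERT INTO Rooms VALUES ('Ada-333', NULL)"),
]:
    try:
        conn.execute(statement)
    except sqlite3.IntegrityError:
        rejected.append(label)

# status is omitted, so the declared DEFAULT 'u' is stored
conn.execute("INSERT INTO Participants (meetid, pid) VALUES (34716, 'csj')")
default_status = conn.execute("SELECT status FROM Participants").fetchone()[0]

print(rejected, default_status)
```

The People table is left out of this sketch because `group` is a reserved word in SQLite and would have to be quoted there.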
SELECT-FROM-WHERE
The basic form of an SQL query
SELECT desired attributes
FROM one or more tables
WHERE condition about the involved rows
Which meetings (“what”) has csj arranged?
SELECT what
FROM Meetings
WHERE owner = 'csj';
meetid date slot owner what
34716 2010-08-23 14 csj dDB
34717 2010-08-23 15 csj dDB
42835 2010-08-16 10 mis TA-meeting
Simple Example
what
dDB
dDB
Loop Semantics for Single Table
Loop through all rows in the table
Check if the condition is true
Project the rows onto the desired attributes
Note that duplicates are kept...
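The loop semantics above can be written out directly. This is a minimal sketch that assumes a table is a list of dictionaries; the function name `select_from_where` is made up for illustration, and a real system would evaluate the query far more cleverly.

```python
meetings = [
    {"meetid": 34716, "date": "2010-08-23", "slot": 14, "owner": "csj", "what": "dDB"},
    {"meetid": 34717, "date": "2010-08-23", "slot": 15, "owner": "csj", "what": "dDB"},
    {"meetid": 42835, "date": "2010-08-16", "slot": 10, "owner": "mis", "what": "TA-meeting"},
]

def select_from_where(table, attributes, condition):
    result = []
    for row in table:                      # loop through all rows in the table
        if condition(row):                 # check if the condition is true
            # project the row onto the desired attributes
            result.append(tuple(row[a] for a in attributes))
    return result                          # no duplicate elimination

what = select_from_where(meetings, ["what"], lambda r: r["owner"] == "csj")
print(what)
```

The result contains `('dDB',)` twice, mirroring the two identical rows in the result table of the simple example.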
Renaming in SELECT
The selected attributes can be given new names
SELECT name, group AS category
FROM People
WHERE office = 'Ada-230';
name                 category
Vaida Ceikute        phd
Rasmus Ibsen-Jensen  phd
Expressions in SELECT
The attributes may have computed values
SELECT owner, date, slot*60 AS minute
FROM Meetings
WHERE owner = 'csj';
owner date minute
csj 2010-08-23 840
csj 2010-08-23 900
Conditions in WHERE
AND, OR, NOT, =, <>, <, >, <=, >=, LIKE, ...
SELECT owner, what
FROM Meetings
WHERE slot >= 12 AND slot < 16
AND what LIKE '%beer%';
owner  what
mis    Afternoon beer
mis    Belgian beer testing
mis    Return empty beer bottles
3-Valued Logic
Arithmetic operations on NULL yield NULL
Any comparison with NULL yields unknown
This gives 3 truth values: true (tt), false (ff), unknown (u)
Boolean connectives are defined appropriately
The WHERE clause accepts a row only if the result is true
AND  tt  ff  u
tt   tt  ff  u
ff   ff  ff  ff
u    u   ff  u
OR   tt  ff  u
tt   tt  tt  tt
ff   tt  ff  u
u    tt  u   u
NOT
tt   ff
ff   tt
u    u
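The three truth tables can be encoded in a few lines. In this sketch, Python's `None` stands for unknown (and for NULL); the function names `and3`, `or3`, `not3`, and `eq3` are hypothetical helpers, not SQL features.

```python
def and3(a, b):
    if a is False or b is False:
        return False                 # ff dominates AND
    if a is None or b is None:
        return None
    return True

def or3(a, b):
    if a is True or b is True:
        return True                  # tt dominates OR
    if a is None or b is None:
        return None
    return False

def not3(a):
    return None if a is None else not a

def eq3(x, y):
    """Equality that yields unknown (None) if either operand is NULL (None)."""
    return None if x is None or y is None else x == y

office = None  # a NULL office
verdict = or3(eq3(office, "Turing-216"), not3(eq3(office, "Turing-216")))
print(verdict)
```

For a NULL office, "equal or not equal" evaluates to unknown rather than true, so a WHERE clause with that condition would reject the row.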
A Surprise?
SELECT userid
FROM People
WHERE office='Turing-216' OR office<>'Turing-216';
People
userid    name                 group  office
csj       Christian S. Jensen  vip    Turing-216
doina     Doina Bucur          phd    NULL
bnielsen  Kai Birger Nielsen   tap    Hopper-017
Result
userid
csj
bnielsen
doina is missing: her office is NULL, so both comparisons yield unknown
Testing for NULL
SELECT userid
FROM People
WHERE office IS NULL;
People
userid    name                 group  office
csj       Christian S. Jensen  vip    Turing-216
doina     Doina Bucur          phd    NULL
bnielsen  Kai Birger Nielsen   tap    Hopper-017
Result
userid
doina
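Both NULL queries can be tried out with Python's built-in `sqlite3` module, used here only as a convenient stand-in (an assumption; the slides do not prescribe a system). The helper `userids` is hypothetical, and `group` is quoted because it is a reserved word in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE People (userid TEXT, name TEXT, "group" TEXT, office TEXT)')
conn.executemany(
    "INSERT INTO People VALUES (?, ?, ?, ?)",
    [
        ("csj", "Christian S. Jensen", "vip", "Turing-216"),
        ("doina", "Doina Bucur", "phd", None),  # None is stored as NULL
        ("bnielsen", "Kai Birger Nielsen", "tap", "Hopper-017"),
    ],
)

def userids(where):
    rows = conn.execute(f"SELECT userid FROM People WHERE {where} ORDER BY userid")
    return [r[0] for r in rows]

either = userids("office = 'Turing-216' OR office <> 'Turing-216'")
is_null = userids("office IS NULL")
equals_null = userids("office = NULL")  # the comparison is unknown for every row

print(either, is_null, equals_null)
```

The last query shows why `IS NULL` exists: comparing with `= NULL` is never true, so it selects nothing.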
Multiple Relations
Who has booked meetings on August 23, 2010?
SELECT name
FROM People, Meetings
WHERE date = '2010-08-23' AND
owner = userid;
The relations are joined
Multiple Relations Example
Meetings
meetid  date        slot  owner  what
34716   2010-08-23  14    csj    dDB
34717   2010-08-23  15    csj    dDB
42835   2010-08-16  10    mis    TA-meeting
People
userid    name                 group  office
csj       Christian S. Jensen  vip    Turing-216
doina     Doina Bucur          phd    NULL
bnielsen  Kai Birger Nielsen   tap    Hopper-017
General Loop Semantics
Loop through all rows in all tables
For each combination
• check if the condition is true
• project the rows onto the desired attributes
Note that duplicates are still kept...
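The general loop semantics amounts to a nested loop over every combination of rows. The sketch below assumes tables are lists of dictionaries and uses `itertools.product` for the combinations; it describes the meaning of the query, not how a real system would execute it.

```python
from itertools import product

people = [
    {"userid": "csj", "name": "Christian S. Jensen"},
    {"userid": "doina", "name": "Doina Bucur"},
    {"userid": "bnielsen", "name": "Kai Birger Nielsen"},
]
meetings = [
    {"meetid": 34716, "date": "2010-08-23", "owner": "csj"},
    {"meetid": 34717, "date": "2010-08-23", "owner": "csj"},
    {"meetid": 42835, "date": "2010-08-16", "owner": "mis"},
]

names = []
for p, m in product(people, meetings):       # every combination of rows
    # check the condition on the combined row
    if m["date"] == "2010-08-23" and m["owner"] == p["userid"]:
        names.append(p["name"])              # project onto name

print(names)
```

The same name appears twice because csj owns two meetings on that date; duplicates are kept unless DISTINCT is requested.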
Prefixing Attribute Variables
Avoid possible name clashes
SELECT People.name
FROM People, Meetings
WHERE Meetings.date = '2010-08-23' AND Meetings.owner = People.userid;
Multiple Relations
Who shares a room?
userid  name                 group  office
csj     Christian S. Jensen  vip    Turing-216
vaida   Vaida Ceikute        phd    Turing-216
ira     Ira Assent           vip    Turing-217
Result
roomie1              roomie2
Christian S. Jensen  Vaida Ceikute
Naming Row Variables
Enables self-joins
SELECT p1.name AS roomie1, p2.name AS roomie2
FROM People p1, People p2
WHERE p1.office = p2.office AND
p1.userid <> p2.userid;
A table of all roommates...
Avoiding Symmetric Pairs
SELECT p1.name AS roomie1,
p2.name AS roomie2
FROM People p1, People p2
WHERE p1.office = p2.office AND
p1.userid < p2.userid;
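To see the difference between the two self-joins, the following sketch runs both variants with Python's built-in `sqlite3` module (an assumed stand-in system). The helper `roomies` and the trimmed-down People table are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (userid TEXT, name TEXT, office TEXT)")
conn.executemany("INSERT INTO People VALUES (?, ?, ?)", [
    ("csj", "Christian S. Jensen", "Turing-216"),
    ("vaida", "Vaida Ceikute", "Turing-216"),
    ("ira", "Ira Assent", "Turing-217"),
])

def roomies(comparison):
    query = f"""
        SELECT p1.name AS roomie1, p2.name AS roomie2
        FROM People p1, People p2
        WHERE p1.office = p2.office AND p1.userid {comparison} p2.userid
        ORDER BY roomie1
    """
    return conn.execute(query).fetchall()

both_orders = roomies("<>")  # each pair appears twice, once per order
one_order = roomies("<")     # each pair appears once

print(both_orders)
print(one_order)
```

Replacing `<>` by `<` keeps only the pair whose first userid sorts before the second, which removes the mirrored row.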
Aggregation
The SELECT clause may involve aggregate functions
• SUM
• AVG
• COUNT
• MIN
• MAX
NULLs are ignored in these computations
Except that COUNT(*) counts all rows
Requirements
Aggregation of a column x with values a1, a2, a3, ..., an computes
a1 ⊗ a2 ⊗ a3 ⊗ ... ⊗ an
for some operator ⊗
This is only well-formed if ⊗ is
• commutative: a ⊗ b = b ⊗ a
• associative: (a ⊗ b) ⊗ c = a ⊗ (b ⊗ c)
since the rows may be permuted
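The requirement can be checked by brute force on a small column. The sketch below folds an operator over every ordering of the values and collects the distinct results; the helper name `results_over_all_orders` is hypothetical.

```python
from functools import reduce
from itertools import permutations
import operator

def results_over_all_orders(op, values):
    """Fold op over every ordering of values and collect the distinct results."""
    return {reduce(op, order) for order in permutations(values)}

capacities = [6, 26, 286]
sums = results_over_all_orders(operator.add, capacities)         # + is commutative and associative
maxima = results_over_all_orders(max, capacities)                # so is max
differences = results_over_all_orders(operator.sub, capacities)  # - is neither

print(sums, maxima, differences)
```

Addition and maximum give a single result regardless of row order, so SUM and MAX are well-defined. Subtraction gives several results, which is why it could not serve as an aggregate over an unordered table.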
Simple Example
What is the average capacity of a room?
SELECT AVG(capacity) AS average
FROM Rooms;
average
106
Avoiding Duplicates
SELECT DISTINCT removes duplicates
This is expensive!
But sometimes necessary...
What kinds of equipment do we have?
SELECT DISTINCT type
FROM Equipment;
Avoiding Duplicates in Aggregation
How many kinds of equipment do we have?
SELECT COUNT(DISTINCT type) as number
FROM Equipment;
number
4
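The aggregation rules above (NULLs ignored, `COUNT(*)` counting all rows, DISTINCT inside an aggregate) can be tried with Python's built-in `sqlite3` module, an assumed stand-in system. The data is the small excerpt shown on earlier slides plus one hypothetical Equipment row, so the counts differ from the slide's result of 4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Rooms (room TEXT, capacity INT);
INSERT INTO Rooms VALUES ('Turing-216', 6), ('Ada-333', 26), ('Store-Aud', 286);
CREATE TABLE People (userid TEXT, office TEXT);
INSERT INTO People VALUES ('csj', 'Turing-216'), ('doina', NULL), ('bnielsen', 'Hopper-017');
CREATE TABLE Equipment (room TEXT, type TEXT);
INSERT INTO Equipment VALUES
    ('Store-Aud', 'projector'), ('Store-Aud', 'whiteboard'),
    ('Hopper-017', 'mini-fridge'),
    ('Ada-333', 'projector');  -- hypothetical extra row, not in the slides
""")

average = conn.execute("SELECT AVG(capacity) FROM Rooms").fetchone()[0]
all_rows, with_office = conn.execute(
    "SELECT COUNT(*), COUNT(office) FROM People").fetchone()
items, kinds = conn.execute(
    "SELECT COUNT(type), COUNT(DISTINCT type) FROM Equipment").fetchone()

print(average, all_rows, with_office, items, kinds)
```

`COUNT(office)` skips the NULL office while `COUNT(*)` does not, and `COUNT(DISTINCT type)` counts the projector only once.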
Scalar Functions
Lots of useful functions are available
• integer and float functions
• string functions
• calendar functions
• ...
SELECT CHARACTER_LENGTH(name,CODEUNITS16),
UPPER(group)
FROM People;
Subqueries
Any query in parentheses can be used in
• FROM clauses
• WHERE clauses
A query may be used as a value
• if it returns only one row and one column
• otherwise, a run-time error occurs
Simple Example
Who shares an office with Ira?
SELECT name
FROM People
WHERE office = (SELECT office
FROM People
WHERE userid='ira');
Membership Tests
IN and NOT IN test membership in tables
Who has csj arranged to meet?
SELECT pid
FROM Participants
WHERE meetid IN (SELECT meetid
FROM Meetings
WHERE owner='csj')
AND
pid NOT IN (SELECT room
FROM Rooms);
Participants
meetid  pid        status
34716   Store-Aud  a
34716   csj        a
42835   sigurd     d
Meetings
meetid  date        slot  owner  what
34716   2010-08-23  14    csj    dDB
34717   2010-08-23  15    csj    dDB
42835   2010-08-16  10    mis    TA-meeting
Correlated Subqueries
Which meetings exceed the capacity of a room?
SELECT meetid
FROM Meetings
WHERE (SELECT COUNT(DISTINCT pid)
FROM Participants
WHERE meetid=Meetings.meetid AND
status<>'d' AND
pid NOT IN (SELECT room
FROM Rooms)
)
>
(SELECT capacity
FROM Rooms, Participants
WHERE room=pid AND meetid=Meetings.meetid)
;
Static nested scope rules apply: inside each subquery, the unqualified meetid refers to Participants, while Meetings.meetid refers to the current row of the outer query
EXISTS and NOT EXISTS
Check for emptiness or non-emptiness of a table
Who is alone in an office?
SELECT name
FROM People p1
WHERE NOT EXISTS (
SELECT *
FROM People
WHERE office = p1.office AND
userid <> p1.userid
);
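The NOT EXISTS query can be run as written with Python's built-in `sqlite3` module (an assumed stand-in system). The sketch adds a row with a NULL office, taken from the earlier People examples, to show how it interacts with the correlated comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (userid TEXT, name TEXT, office TEXT)")
conn.executemany("INSERT INTO People VALUES (?, ?, ?)", [
    ("csj", "Christian S. Jensen", "Turing-216"),
    ("vaida", "Vaida Ceikute", "Turing-216"),
    ("ira", "Ira Assent", "Turing-217"),
    ("doina", "Doina Bucur", None),  # NULL office
])

alone = [row[0] for row in conn.execute("""
    SELECT name
    FROM People p1
    WHERE NOT EXISTS (
        SELECT *
        FROM People
        WHERE office = p1.office AND
              userid <> p1.userid
    )
    ORDER BY name
""")]

print(alone)
```

Doina is reported as alone as well: with a NULL office the inner comparison is unknown for every row, so the subquery is empty and NOT EXISTS is true.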
ANY and ALL
Allow comparisons against
• any row in a subquery
• all rows in a subquery
Which are the latest meetings that are planned?
SELECT what
FROM Meetings
WHERE date >= ALL(
SELECT date FROM Meetings
);
UNION, INTERSECT, and EXCEPT
Treat tables with the same schema as sets
• remove duplicates (unless ALL is added)
• compute ∪, ∩, and \
Who does not participate in a meeting they have arranged themselves?
(SELECT owner AS userid, meetid
FROM Meetings)
EXCEPT
(SELECT pid AS userid, meetid
FROM Participants);
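The EXCEPT query can be checked against the running example with Python's built-in `sqlite3` module (an assumed stand-in system). SQLite does not accept parentheses around the operands of EXCEPT, so the sketch drops them; the query is otherwise the one above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Meetings (meetid INT, date TEXT, slot INT, owner TEXT, what TEXT);
INSERT INTO Meetings VALUES
    (34716, '2010-08-23', 14, 'csj', 'dDB'),
    (34717, '2010-08-23', 15, 'csj', 'dDB'),
    (42835, '2010-08-16', 10, 'mis', 'TA-meeting');
CREATE TABLE Participants (meetid INT, pid TEXT, status TEXT);
INSERT INTO Participants VALUES
    (34716, 'Store-Aud', 'a'),
    (34716, 'csj', 'a'),
    (42835, 'sigurd', 'd');
""")

absent_owners = set(conn.execute("""
    SELECT owner AS userid, meetid FROM Meetings
    EXCEPT
    SELECT pid AS userid, meetid FROM Participants
"""))

print(sorted(absent_owners))
```

On this excerpt, csj is missing from meeting 34717 and mis from meeting 42835, while the pair (csj, 34716) is removed because it also occurs in Participants.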
The JOIN Operator
T1 JOIN T2 ON condition
is syntactic sugar for:
SELECT *
FROM T1, T2
WHERE condition
Dangling Rows and FULL JOIN
T1 JOIN T2 ON condition
A row in T1 or T2 that does not match a row in the other table is dangling
An ordinary JOIN throws away dangling rows
A FULL JOIN preserves dangling rows by padding them with NULL values
A LEFT or RIGHT JOIN preserves dangling rows from one argument only
Simple Example
In which offices are meetings planned?
All offices with meetings or NULL
SELECT office, meetid
FROM People LEFT JOIN Participants
ON pid=office;
Only those offices with meetings
SELECT office, meetid
FROM People JOIN Participants
ON pid=office;
People and Participants
People
userid    name                 group  office
csj       Christian S. Jensen  vip    Turing-216
doina     Doina Bucur          phd    NULL
bnielsen  Kai Birger Nielsen   tap    Hopper-017
Participants
meetid  pid        status
34716   Store-Aud  a
34716   csj        a
42835   sigurd     d
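The two join queries can be compared with Python's built-in `sqlite3` module (an assumed stand-in system). With the rows shown in the slides no office matches any pid, so the sketch adds one hypothetical Participants row that books the office Turing-216.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE People (userid TEXT, office TEXT);
INSERT INTO People VALUES
    ('csj', 'Turing-216'), ('doina', NULL), ('bnielsen', 'Hopper-017');
CREATE TABLE Participants (meetid INT, pid TEXT, status TEXT);
INSERT INTO Participants VALUES
    (34716, 'Store-Aud', 'a'),
    (34716, 'csj', 'a'),
    (42835, 'sigurd', 'd'),
    (34717, 'Turing-216', 'a');  -- hypothetical row, not in the slides
""")

inner = conn.execute("""
    SELECT office, meetid
    FROM People JOIN Participants ON pid=office
""").fetchall()

left = conn.execute("""
    SELECT office, meetid
    FROM People LEFT JOIN Participants ON pid=office
""").fetchall()

print(inner)
print(left)
```

The ordinary JOIN keeps only the matching office. The LEFT JOIN additionally keeps the two dangling People rows, padded with NULL (`None` in Python) for meetid.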
Grouping
SELECT-FROM-WHERE-GROUP BY
Rows are grouped by a set of attributes
Aggregations in SELECT are done for each group
The attributes in SELECT must be either
• aggregates or
• mentioned in the GROUP BY clause
Simple Example
How many meetings has each person arranged?
SELECT owner, COUNT(meetid) AS number
FROM Meetings
GROUP BY owner;
owner number
amoeller 4
kjensen 1
csj 3
Advanced Example
What is the average number of invitations for the meetings that each person has arranged?
SELECT owner, AVG(pidno) AS average
FROM (SELECT owner,
m.meetid,
COUNT(pid) as pidno
FROM Meetings m, Participants p
WHERE m.meetid = p.meetid
GROUP BY owner, m.meetid)
GROUP BY owner;
HAVING
A HAVING clause may eliminate some groups
Which offices have more than one occupant?
SELECT office
FROM People
GROUP BY office
HAVING COUNT(*) > 1;
Attributes in HAVING must be aggregates or mentioned in GROUP BY
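Grouping and HAVING can be tried on the small excerpts from earlier slides with Python's built-in `sqlite3` module (an assumed stand-in system). The counts therefore differ from the result table of the grouping example, which is based on more data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Meetings (meetid INT, owner TEXT);
INSERT INTO Meetings VALUES (34716, 'csj'), (34717, 'csj'), (42835, 'mis');
CREATE TABLE People (userid TEXT, office TEXT);
INSERT INTO People VALUES
    ('csj', 'Turing-216'), ('vaida', 'Turing-216'), ('ira', 'Turing-217');
""")

# One output row per group of rows with the same owner
per_owner = conn.execute("""
    SELECT owner, COUNT(meetid) AS number
    FROM Meetings
    GROUP BY owner
    ORDER BY owner
""").fetchall()

# HAVING filters whole groups after aggregation
shared_offices = conn.execute("""
    SELECT office
    FROM People
    GROUP BY office
    HAVING COUNT(*) > 1
""").fetchall()

print(per_owner, shared_offices)
```

WHERE filters individual rows before grouping, whereas HAVING discards entire groups based on their aggregate values.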
Modifications
SQL commands may modify the database
Three kinds of modifications
• insert one or more rows
• delete one or more rows
• update existing rows or columns
Modifications do not return a result
Inserting a Single Row
INSERT INTO table VALUES (list of values);
INSERT INTO Participants
VALUES (42432, 'mis', 'a');
Optionally specify attribute names:
INSERT INTO
Participants(pid, status, meetid)
VALUES ('mis', 'a', 42432);
Missing values are NULL or defaults
Inserting a Subquery
Invite everyone Anders meets with to his Belgian beer tasting
INSERT INTO Participants (
SELECT 46432 AS meetid, pid, 'u' AS status
FROM Meetings, Participants
WHERE Meetings.meetid=Participants.meetid
AND owner = 'amoeller'
AND pid <> 'amoeller'
AND pid NOT IN (SELECT room FROM Rooms));
Deleting Some Rows
DELETE FROM table WHERE condition;
Delete Christian's office
DELETE FROM Rooms
WHERE room='Turing-216';
Delete all offices
DELETE FROM Rooms;
Deleting a Subquery
Delete all people with a roommate
DELETE FROM People p
WHERE EXISTS(
SELECT *
FROM People
WHERE office = p.office
AND userid <> p.userid
);
Meaning of Deletion
First the condition is computed for all rows
Then the deletions are performed
Otherwise the last person in a multi-person office would not be deleted!
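The difference between the two possible meanings of deletion can be simulated in plain Python. This is only a sketch of the semantics, with a table modelled as a list of (userid, office) pairs; the function names are hypothetical.

```python
people = [("csj", "Turing-216"), ("vaida", "Turing-216"), ("ira", "Turing-217")]

def has_roommate(person, table):
    userid, office = person
    return any(o == office and u != userid for u, o in table)

def delete_two_phase(table):
    """SQL semantics: evaluate the condition for all rows, then delete."""
    doomed = [p for p in table if has_roommate(p, table)]
    return [p for p in table if p not in doomed]

def delete_interleaved(table):
    """Rejected alternative: test each row against the already shrunken table."""
    remaining = list(table)
    for p in table:
        if has_roommate(p, remaining):
            remaining.remove(p)
    return remaining

two_phase = delete_two_phase(people)
interleaved = delete_interleaved(people)

print(two_phase)
print(interleaved)
```

With interleaved deletion, vaida survives because her roommate is already gone when her row is tested. The two-phase meaning removes both occupants of Turing-216.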
Update
UPDATE table SET attribute assignments
WHERE condition;
Move Anders to a smaller office
UPDATE People
SET office = 'Turing-213'
WHERE userid = 'amoeller';