Lecture 03: Normal Forms, Constraints, Views
Wednesday, October 12, 2011
Dan Suciu -- CSEP544 Fall 2011


Announcements

• HW1: was due Monday

• HW2: due next Monday

Outline and Reading Material

• Schema Normalization: 19.1 – 19.7
  – Read only about BCNF, skip 3NF
  – Note: you need BCNF for HW2
• Constraints and triggers: 3.2, 3.3, 5.8
• Views: 3.6
  – Answering queries using views: Sec. 1, 2, 3

Some material is NOT covered in the book. Read the slides carefully: we will not cover all in class!


Schema Normalization

Normal Forms

• 1st Normal Form = all tables are flat
• 2nd Normal Form = obsolete
• Boyce Codd Normal Form = will study
• 3rd Normal Form = see book
• 4th Normal Form = golden standard

First Normal Form (1NF)

• A database schema is in First Normal Form if all tables are flat

Not flat (Courses is a nested, set-valued attribute):

Student
Name   GPA  Courses
Alice  3.8  {Math, DB, OS}
Bob    3.7  {DB, OS}
Carol  3.9  {Math, OS}

Flat:

Student
Name   GPA
Alice  3.8
Bob    3.7
Carol  3.9

Course
Course
Math
DB
OS

Takes
Student  Course
Alice    Math
Carol    Math
Alice    DB
Bob      DB
Alice    OS
Carol    OS

May need to add keys

Relational Schema Design

Conceptual Model: [E/R diagram: Person buys Product; attributes shown: name, price, name, ssn]

Relational Model: plus FD’s

Normalization: Eliminates anomalies

Data Anomalies

When a database is poorly designed we get anomalies:

Redundancy: data is repeated

Update anomalies: need to change in several places

Deletion anomalies: may lose data when we don’t want to

Relational Schema Design

Recall set attributes (persons with several phones):

Name  SSN          PhoneNumber   City
Fred  123-45-6789  206-555-1234  Seattle
Fred  123-45-6789  206-555-6543  Seattle
Joe   987-65-4321  908-555-2121  Westfield

One person may have multiple phones, but lives in only one city

Anomalies:
• Redundancy = repeat data
• Update anomalies = Fred moves to “Bellevue”
• Deletion anomalies = Joe deletes his phone number: what is his city ?

Relation Decomposition

Break the relation into two:

Name  SSN          PhoneNumber   City
Fred  123-45-6789  206-555-1234  Seattle
Fred  123-45-6789  206-555-6543  Seattle
Joe   987-65-4321  908-555-2121  Westfield

Name  SSN          City
Fred  123-45-6789  Seattle
Joe   987-65-4321  Westfield

SSN          PhoneNumber
123-45-6789  206-555-1234
123-45-6789  206-555-6543
987-65-4321  908-555-2121

Anomalies have gone:
• No more repeated data
• Easy to move Fred to “Bellevue” (how ?)
• Easy to delete all of Joe’s phone numbers (how ?)

Relational Schema Design (or Logical Design)

Main idea:
• Start with some relational schema
• Find out its functional dependencies
• Use them to design a better relational schema

Functional Dependencies

What is a functional dependency?
• It is a kind of constraint
• In theory one should find the FDs during requirement analysis, then apply rigorously the normalization steps that we discuss next
• In practice one rarely follows this strict prescription

Functional Dependencies

Meaning: If two tuples agree on the attributes

A1, A2, …, An

then they must also agree on the attributes

B1, B2, …, Bm

Notation:

A1, A2, …, An → B1, B2, …, Bm

When Does an FD Hold

Definition: A1, ..., Am → B1, ..., Bn holds in R if:

∀t, t’ ∈ R, (t.A1 = t’.A1 ∧ ... ∧ t.Am = t’.Am ⇒ t.B1 = t’.B1 ∧ ... ∧ t.Bn = t’.Bn)

[Figure: relation R with columns A1 ... Am and B1 ... Bn and two tuples t, t’; if t, t’ agree on A1 ... Am, then t, t’ agree on B1 ... Bn]

Examples

An FD holds, or does not hold on an instance:

EmpID  Name   Phone  Position
E0045  Smith  1234   Clerk
E3542  Mike   9876   Salesrep
E1111  Smith  9876   Salesrep
E9999  Mary   1234   Lawyer

EmpID → Name, Phone, Position
Position → Phone
but not Phone → Position
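Checking an FD against one instance can be sketched in a few lines of Python. This is only an illustration of the definition (every pair of tuples that agrees on the left side must agree on the right side); the function name `fd_holds` and the dict-per-tuple layout are assumptions, not part of the lecture.

```python
from itertools import combinations

def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds on this instance (a list of dicts)."""
    for t, u in combinations(rows, 2):
        agree_lhs = all(t[a] == u[a] for a in lhs)
        agree_rhs = all(t[b] == u[b] for b in rhs)
        if agree_lhs and not agree_rhs:
            return False  # t and u agree on lhs but differ on rhs
    return True

employees = [
    {"EmpID": "E0045", "Name": "Smith", "Phone": "1234", "Position": "Clerk"},
    {"EmpID": "E3542", "Name": "Mike", "Phone": "9876", "Position": "Salesrep"},
    {"EmpID": "E1111", "Name": "Smith", "Phone": "9876", "Position": "Salesrep"},
    {"EmpID": "E9999", "Name": "Mary", "Phone": "1234", "Position": "Lawyer"},
]

print(fd_holds(employees, ["EmpID"], ["Name", "Phone", "Position"]))  # True
print(fd_holds(employees, ["Position"], ["Phone"]))                   # True
print(fd_holds(employees, ["Phone"], ["Position"]))                   # False
```

Note that a check like this only says whether the FD holds on this particular instance; whether it is a constraint of the schema is a design decision.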


Example

FD’s are constraints:
• On some instances they hold
• On others they don’t

name → color
category → department
color, category → price

name    category  color  department  price
Gizmo   Gadget    Green  Toys        49
Tweaks  Gadget    Green  Toys        99

Does this instance satisfy all the FDs ?

Example

name → color
category → department
color, category → price

name    category    color  department    price
Gizmo   Gadget      Green  Toys          49
Tweaks  Gadget      Black  Toys          99
Gizmo   Stationary  Green  Office-supp.  59

What about this one ?

An Interesting Observation

If all these FDs are true:

name → color
category → department
color, category → price

Then this FD also holds:

name, category → price

Why ??


Goal: Find ALL Functional Dependencies

• Anomalies occur when certain “bad” FDs hold

• We know some of the FDs

• Need to find all FDs, then look for the bad ones

Armstrong’s Rules (1/3)

Splitting rule and Combining rule

A1, A2, …, An → B1, B2, …, Bm

Is equivalent to

A1, A2, …, An → B1
A1, A2, …, An → B2
. . . . .
A1, A2, …, An → Bm

[Figure: relation with columns A1 ... Am and B1 ... Bm]

Armstrong’s Rules (2/3)

Trivial Rule

A1, A2, …, An → Ai

where i = 1, 2, ..., n

Why ?

[Figure: relation with columns A1 … Am]

Armstrong’s Rules (3/3)

Transitive Closure Rule

If

A1, A2, …, An → B1, B2, …, Bm

and

B1, B2, …, Bm → C1, C2, …, Cp

then

A1, A2, …, An → C1, C2, …, Cp

Why ?

[Figure: relation with columns A1 … Am, B1 … Bm, C1 ... Cp]

Example (continued)

Start with these:

1. name → color
2. category → department
3. color, category → price

Infer these:

Inferred FD                           Which Rule did we apply ?
4. name, category → name
5. name, category → color
6. name, category → category
7. name, category → color, category
8. name, category → price

Example (continued)

Answers:

1. name → color
2. category → department
3. color, category → price

Inferred FD                           Which Rule did we apply ?
4. name, category → name              Trivial rule
5. name, category → color             Transitivity on 4, 1
6. name, category → category          Trivial rule
7. name, category → color, category   Split/combine on 5, 6
8. name, category → price             Transitivity on 3, 7

THIS IS TOO HARD ! There is an easier way.

Closure of a set of Attributes

Given a set of attributes A1, …, An

The closure, {A1, …, An}+ = the set of attributes B s.t. A1, …, An → B

Example:

name → color
category → department
color, category → price

Closures:
name+ = {name, color}
{name, category}+ = {name, category, color, department, price}
color+ = {color}

Closure Algorithm

X = {A1, …, An}.

repeat until X doesn’t change do:
    if B1, …, Bn → C is a FD and B1, …, Bn are all in X
    then add C to X.

Example:

name → color
category → department
color, category → price

{name, category}+ = ? [ …in class…]
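The closure algorithm translates almost line by line into Python. The sketch below is one possible rendering, not the course's reference code; representing each FD as a pair of Python sets is an assumption made for illustration.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) pairs of sets."""
    x = set(attrs)
    changed = True
    while changed:  # repeat until X doesn't change
        changed = False
        for lhs, rhs in fds:
            # if the whole left side is already in X, add the right side
            if lhs <= x and not rhs <= x:
                x |= rhs
                changed = True
    return x

fds = [
    ({"name"}, {"color"}),
    ({"category"}, {"department"}),
    ({"color", "category"}, {"price"}),
]

print(sorted(closure({"name"}, fds)))              # ['color', 'name']
print(sorted(closure({"name", "category"}, fds)))  # ['category', 'color', 'department', 'name', 'price']
```

To check whether some FD X → A is implied, compute `closure(X, fds)` and test whether A is in the result.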

Practice at Home…

R(A,B,C,D,E,F)

A, B → C
A, D → E
B → D
A, F → B

Compute {A, B}+    X = {A, B,    }

Compute {A, F}+    X = {A, F,    }

Why Do We Need Closure

• With closure we can find all FD’s easily
• To check if X → A
  – Compute X+
  – Check if A ∈ X+

Practice at Home

Find all FD’s implied by:

A, B → C
A, D → B
B → D


Practice at Home

Find all FD’s implied by:

A, B → C
A, D → B
B → D

Step 1: Compute X+, for every X:
A+ = A, B+ = BD, C+ = C, D+ = D
AB+ = ABCD, AC+ = AC, AD+ = ABCD, BC+ = BCD, BD+ = BD, CD+ = CD
ABC+ = ABD+ = ACD+ = ABCD (no need to compute – why ?)
BCD+ = BCD, ABCD+ = ABCD

Step 2: Enumerate all FD’s X → Y, s.t. Y ⊆ X+ and X∩Y = ∅:
AB → CD, AD → BC, ABC → D, ABD → C, ACD → B

Keys

A superkey is a set of attributes X = A1, ..., An s.t. for any other attribute B, we have X → B

Note: X is a superkey if X+ = [all attributes]

A key is a minimal superkey X

How do we compute all keys X ?


Example

Product(name, price, category, color)

name, category → price
category → color

What are the keys ?

(name, category)+ = name, category, price, color

Hence X = (name, category) is a key
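One brute-force way to find all keys is to try attribute subsets from smallest to largest and keep those whose closure is the full schema. The sketch below assumes FDs are (lhs, rhs) pairs of sets; the helper names `closure` and `keys` are illustrative, and the search is exponential in the number of attributes, so it only suits small examples like this one.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds (list of (lhs, rhs) pairs of sets)."""
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= x and not rhs <= x:
                x |= rhs
                changed = True
    return x

def keys(all_attrs, fds):
    """All keys (minimal superkeys), trying subsets from smallest to largest."""
    all_attrs = set(all_attrs)
    found = []
    for size in range(1, len(all_attrs) + 1):
        for cand in combinations(sorted(all_attrs), size):
            cand = set(cand)
            if any(k <= cand for k in found):
                continue  # contains a smaller key, so it is not minimal
            if closure(cand, fds) == all_attrs:
                found.append(cand)
    return found

product = {"name", "price", "category", "color"}
fds = [({"name", "category"}, {"price"}), ({"category"}, {"color"})]
print(keys(product, fds))  # a single key: name together with category
```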

Practice at Home

Enrollment(student, address, course, room, time)

student → address
room, time → course
student, course → room, time

Find all keys

Read at Home

We can have more than one key.

Example: relation R(A,B,C) satisfies

AB → C
BC → A

or

A → BC
B → AC

Find all keys

Boyce-Codd Normal Form

There are no “bad” FDs:

A relation R is in BCNF if:

Whenever X → B is a non-trivial dependency, then X is a superkey.

Equivalently:

A relation R is in BCNF if: ∀X, either X+ = X or X+ = [all attributes]

Example

Name  SSN          PhoneNumber   City
Fred  123-45-6789  206-555-1234  Seattle
Fred  123-45-6789  206-555-6543  Seattle
Joe   987-65-4321  908-555-2121  Westfield
Joe   987-65-4321  908-555-1234  Westfield

SSN → Name, City

The only key is: {SSN, PhoneNumber}
Hence SSN → Name, City is a “bad” dependency

Alternatively: SSN+ = SSN, Name, City, which is neither SSN nor all attributes

BCNF Decomposition Algorithm

Normalize(R)
    find X s.t.: X ≠ X+ ≠ [all attributes]
    if (not found) then “R is in BCNF”
    let Y = X+ - X; Z = [all attributes] - X+
    decompose R into R1(X ∪ Y) and R2(X ∪ Z)
    Normalize(R1); Normalize(R2);

[Figure: the attributes of R split into three regions Y, X, Z, with X+ covering X and Y]
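The Normalize(R) procedure can be sketched recursively in Python. This is an illustrative rendering under a few assumptions: FDs are (lhs, rhs) pairs of sets, candidate sets X are tried smallest first in sorted order (so the result is deterministic, though other choices of X can give other valid decompositions), and the FDs of a sub-relation are handled by intersecting closures with that sub-relation's attributes.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds (list of (lhs, rhs) pairs of sets)."""
    x = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= x and not rhs <= x:
                x |= rhs
                changed = True
    return x

def normalize(attrs, fds):
    """Return a list of attribute sets, each in BCNF, following Normalize(R)."""
    attrs = frozenset(attrs)
    for size in range(1, len(attrs)):
        for x in combinations(sorted(attrs), size):
            x = frozenset(x)
            x_plus = closure(x, fds) & attrs  # closure restricted to this relation
            if x != x_plus and x_plus != attrs:
                z = attrs - x_plus
                # R1(X ∪ Y) is exactly X+, R2 is X ∪ Z
                return normalize(x_plus, fds) + normalize(x | z, fds)
    return [attrs]  # no violating X found: already in BCNF

person = {"name", "SSN", "age", "hairColor", "phoneNumber"}
fds = [({"SSN"}, {"name", "age"}), ({"age"}, {"hairColor"})]
for relation in normalize(person, fds):
    print(sorted(relation))
```

On the Person schema with SSN → name, age and age → hairColor, this produces three relations with attribute sets {age, hairColor}, {SSN, age, name}, and {SSN, phoneNumber}.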


Example BCNF Decomposition

Person(name, SSN, age, hairColor, phoneNumber)

SSN → name, age
age → hairColor

Find X s.t.: X ≠ X+ ≠ [all attributes]

Iteration 1: Person: SSN+ = SSN, name, age, hairColor
Decompose into:
    P(SSN, name, age, hairColor)
    Phone(SSN, phoneNumber)

Iteration 2: P: age+ = age, hairColor
Decompose:
    People(SSN, name, age)
    Hair(age, hairColor)
    Phone(SSN, phoneNumber)

What are the keys ?


Practice at Home

A → B
B → C

R(A,B,C,D)
A+ = ABC ≠ ABCD

    R1(A,B,C)
    B+ = BC ≠ ABC

        R11(B,C)
        R12(A,B)

    R2(A,D)

What are the keys ?

What happens if in R we first pick B+ ? Or AB+ ?

Decompositions in General

R(A1, ..., An, B1, ..., Bm, C1, ..., Cp)

R1(A1, ..., An, B1, ..., Bm)    R2(A1, ..., An, C1, ..., Cp)

R1 = projection of R on A1, ..., An, B1, ..., Bm
R2 = projection of R on A1, ..., An, C1, ..., Cp

Theory of Decomposition

Sometimes it is correct:

Name      Price  Category
Gizmo     19.99  Gadget
OneClick  24.99  Camera
Gizmo     19.99  Camera

Name      Price
Gizmo     19.99
OneClick  24.99
Gizmo     19.99

Name      Category
Gizmo     Gadget
OneClick  Camera
Gizmo     Camera

Lossless decomposition

Incorrect Decomposition

Sometimes it is not:

Name      Price  Category
Gizmo     19.99  Gadget
OneClick  24.99  Camera
Gizmo     19.99  Camera

Name      Category
Gizmo     Gadget
OneClick  Camera
Gizmo     Camera

Price  Category
19.99  Gadget
24.99  Camera
19.99  Camera

What’s incorrect ??

Lossy decomposition

Decompositions in General

R(A1, ..., An, B1, ..., Bm, C1, ..., Cp)

R1(A1, ..., An, B1, ..., Bm)    R2(A1, ..., An, C1, ..., Cp)

If A1, ..., An → B1, ..., Bm
Then the decomposition is lossless

Note: don’t need A1, ..., An → C1, ..., Cp

BCNF decomposition is always lossless. WHY ?
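The difference between a lossless and a lossy decomposition can be checked by projecting a relation and joining the pieces back. The sketch below uses the Name/Price/Category instance from the two preceding examples; the helpers `project` and `natural_join` are hypothetical names written for this illustration, with tuples stored as sets of (attribute, value) pairs.

```python
def project(rows, attrs):
    """Projection with duplicate elimination; rows are dicts."""
    return {tuple((a, r[a]) for a in attrs) for r in rows}

def natural_join(r1, r2):
    """Natural join of two projected relations (sets of (attr, value) tuples)."""
    out = set()
    for t in r1:
        for u in r2:
            dt, du = dict(t), dict(u)
            # tuples join when they agree on every shared attribute
            if all(dt[a] == du[a] for a in dt.keys() & du.keys()):
                out.add(frozenset({**dt, **du}.items()))
    return out

r = [
    {"Name": "Gizmo", "Price": "19.99", "Category": "Gadget"},
    {"Name": "OneClick", "Price": "24.99", "Category": "Camera"},
    {"Name": "Gizmo", "Price": "19.99", "Category": "Camera"},
]
original = {frozenset(t.items()) for t in r}

# Split on Name (Name -> Price holds on this instance): the join gives R back.
lossless = natural_join(project(r, ["Name", "Price"]), project(r, ["Name", "Category"]))
# Split on Category (Category determines neither side): spurious tuples appear.
lossy = natural_join(project(r, ["Name", "Category"]), project(r, ["Price", "Category"]))

print(lossless == original)          # True
print(len(lossy), original < lossy)  # 5 True
```

The lossy join returns the three original tuples plus two that were never in R, which is what makes the decomposition incorrect.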


Constraints

Integrity Constraints in SQL

• Constraint = a property we want to hold
• System enforces constraints at updates
• Constraints in SQL:
  – Keys, foreign keys
  – Attribute-level constraints
  – Tuple-level constraints
  – Global constraints: assertions

Keys

Product(name, price)

CREATE TABLE Product (
    name CHAR(30) PRIMARY KEY,
    price INT)

OR:

CREATE TABLE Product (
    name CHAR(30),
    price INT,
    PRIMARY KEY (name))

Keys with Multiple Attributes

Product(name, category, price)

CREATE TABLE Product (
    name CHAR(30),
    category VARCHAR(20),
    price INT,
    PRIMARY KEY (name, category))

Name    Category  Price
Gizmo   Gadget    10
Camera  Photo     20
Gizmo   Photo     30
Gizmo   Gadget    40
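To see the multi-attribute key being enforced, the four rows above can be inserted into an in-memory SQLite database through Python's `sqlite3` module. SQLite is used here only because it needs no setup; it is an assumption of this sketch, not the system used in the course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Product (
        name CHAR(30),
        category VARCHAR(20),
        price INT,
        PRIMARY KEY (name, category))
""")

rows = [("Gizmo", "Gadget", 10), ("Camera", "Photo", 20),
        ("Gizmo", "Photo", 30), ("Gizmo", "Gadget", 40)]
rejected = []
for row in rows:
    try:
        conn.execute("INSERT INTO Product VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        rejected.append(row)  # duplicate (name, category) pair

print(rejected)  # [('Gizmo', 'Gadget', 40)]
```

Only the last row is refused: it repeats the (name, category) pair of the first row, while ('Gizmo', 'Photo') is fine because just one of the two key attributes repeats.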

Other Keys

CREATE TABLE Product (
    productID CHAR(10),
    name CHAR(30),
    category VARCHAR(20),
    price INT,
    PRIMARY KEY (productID),
    UNIQUE (name, category))

There is at most one PRIMARY KEY; there can be many UNIQUE

Foreign Key Constraints

Purchase(buyer, seller, product, store)
Product(name, price)

CREATE TABLE Purchase (
    buyer CHAR(30),
    seller CHAR(30),
    product CHAR(30) REFERENCES Product(name),   -- foreign key
    store VARCHAR(30))


Foreign Key Constraints

Purchase(buyer, seller, product, category, store)
Product(name, category, price)

CREATE TABLE Purchase (
    buyer VARCHAR(50),
    seller VARCHAR(50),
    product CHAR(20),
    category VARCHAR(20),
    store VARCHAR(30),
    FOREIGN KEY (product, category)
        REFERENCES Product(name, category)
);

Product
Name      Category
Gizmo     gadget
Camera    Photo
OneClick  Photo

Purchase
ProdName  Store
Gizmo     Wiz
Camera    Ritz
Camera    Wiz

What happens during updates ?

Types of updates:
• In Purchase: insert/update
• In Product: delete/update

What happens during updates ?

• SQL has three policies for maintaining referential integrity:
  • Reject violating modifications (default)
  • Cascade: after a delete/update do a delete/update
  • Set-null: set foreign-key field to NULL
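The three policies can be tried out with SQLite from Python. This is a sketch under stated assumptions: the three Purchase* table names are made up so that each policy can be shown side by side, SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, and its default action (NO ACTION) behaves like "reject" here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores foreign keys unless enabled
conn.executescript("""
    CREATE TABLE Product (name CHAR(30) PRIMARY KEY, category VARCHAR(20));
    CREATE TABLE PurchaseReject (
        prodName CHAR(30) REFERENCES Product(name), store VARCHAR(30));
    CREATE TABLE PurchaseCascade (
        prodName CHAR(30) REFERENCES Product(name) ON DELETE CASCADE, store VARCHAR(30));
    CREATE TABLE PurchaseSetNull (
        prodName CHAR(30) REFERENCES Product(name) ON DELETE SET NULL, store VARCHAR(30));
    INSERT INTO Product VALUES ('Gizmo', 'gadget'), ('Camera', 'Photo'), ('OneClick', 'Photo');
    INSERT INTO PurchaseReject VALUES ('Gizmo', 'Wiz');
    INSERT INTO PurchaseCascade VALUES ('Camera', 'Ritz'), ('Gizmo', 'Wiz');
    INSERT INTO PurchaseSetNull VALUES ('Camera', 'Ritz'), ('Gizmo', 'Wiz');
""")

# Deleting Camera: the cascade table drops its row, the set-null table keeps it with NULL.
conn.execute("DELETE FROM Product WHERE name = 'Camera'")
cascade_rows = conn.execute("SELECT * FROM PurchaseCascade").fetchall()
setnull_rows = conn.execute("SELECT * FROM PurchaseSetNull ORDER BY store").fetchall()
print(cascade_rows)  # [('Gizmo', 'Wiz')]
print(setnull_rows)  # [(None, 'Ritz'), ('Gizmo', 'Wiz')]

# Deleting Gizmo is rejected because PurchaseReject still references it.
try:
    conn.execute("DELETE FROM Product WHERE name = 'Gizmo'")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)  # True
```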

Constraints on Attributes and Tuples

Attribute level constraints:

CREATE TABLE Purchase (
    . . .
    store VARCHAR(30) NOT NULL,
    . . . )

CREATE TABLE Product (
    . . .
    price INT CHECK (price > 0 and price < 999))

Tuple level constraints:

. . . CHECK (price * quantity < 10000) . . .
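A small SQLite session shows attribute-level and tuple-level checks rejecting rows. The table below is a made-up combination for illustration: the slide does not say which table carries `quantity`, so placing `name`, `price`, and `quantity` together in one Product table is an assumption of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Product (
        name CHAR(30) NOT NULL,                       -- attribute level
        price INT CHECK (price > 0 AND price < 999),  -- attribute level
        quantity INT,
        CHECK (price * quantity < 10000))             -- tuple level
""")

attempts = [
    ("Gizmo", 10, 5),     # satisfies every constraint
    ("Camera", 1200, 1),  # violates the price range
    ("Widget", 500, 30),  # 500 * 30 = 15000 violates the tuple-level check
    (None, 10, 1),        # violates NOT NULL
]
accepted = []
for row in attempts:
    try:
        conn.execute("INSERT INTO Product VALUES (?, ?, ?)", row)
        accepted.append(row)
    except sqlite3.IntegrityError:
        pass  # the system refuses the violating insert

print(accepted)  # [('Gizmo', 10, 5)]
```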

CREATE TABLE Purchase (
    prodName CHAR(30)
        CHECK (prodName IN
               (SELECT Product.name FROM Product)),
    date DATETIME NOT NULL)

What is the difference from Foreign-Key ?

General Assertions

CREATE ASSERTION myAssert CHECK
    NOT EXISTS (
        SELECT Product.name
        FROM Product, Purchase
        WHERE Product.name = Purchase.prodName
        GROUP BY Product.name
        HAVING count(*) > 200)


Comments on Constraints

• Can give them names, and alter later

• We need to understand exactly when they are checked

• We need to understand exactly what actions are taken if they fail

Semantic Optimization using Constraints

Purchase(buyer, seller, product, store)
Product(name, price)

SELECT Purchase.store
FROM Product, Purchase
WHERE Product.name = Purchase.product

can be simplified to:

SELECT Purchase.store
FROM Purchase

Why ? and When ?


Views


Overview

Views are ubiquitous in data management:

• Used in SQL as names for predefined queries

• More generally, any derived data is a view

View Basics

Views are relations, defined by a query

Purchase(customer, product, store)
Product(pname, price)

CREATE VIEW CustomerPrice AS
    SELECT DISTINCT x.customer, y.price
    FROM Purchase x, Product y
    WHERE x.product = y.pname

CustomerPrice(customer, price)    “virtual table”

View Basics

Purchase(customer, product, store)
Product(pname, price)
CustomerPrice(customer, price)

We can later use the view:

SELECT DISTINCT u.customer, v.store
FROM CustomerPrice u, Purchase v
WHERE u.customer = v.customer AND
      u.price > 100


View Basics

Modified query:

SELECT DISTINCT u.customer, v.store
FROM (SELECT DISTINCT x.customer, y.price
      FROM Purchase x, Product y
      WHERE x.product = y.pname) u, Purchase v
WHERE u.customer = v.customer AND
      u.price > 100

Next, unnest the query…

View Basics

Modified and unnested query:

SELECT DISTINCT x.customer, v.store
FROM Purchase x, Product y, Purchase v
WHERE x.customer = v.customer AND
      y.price > 100 AND
      x.product = y.pname
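The claim that the query over the view and the unnested query return the same answer can be spot-checked on a toy database. The sketch below uses SQLite through Python; the sample rows (Ann, Bob, Gizmo, Camera and their prices) are invented for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Purchase (customer TEXT, product TEXT, store TEXT);
    CREATE TABLE Product (pname TEXT, price INT);
    INSERT INTO Product VALUES ('Gizmo', 50), ('Camera', 200);
    INSERT INTO Purchase VALUES
        ('Ann', 'Camera', 'Ritz'), ('Ann', 'Gizmo', 'Wiz'), ('Bob', 'Gizmo', 'Wiz');
    CREATE VIEW CustomerPrice AS
        SELECT DISTINCT x.customer, y.price
        FROM Purchase x, Product y
        WHERE x.product = y.pname;
""")

# Query written against the view.
via_view = conn.execute("""
    SELECT DISTINCT u.customer, v.store
    FROM CustomerPrice u, Purchase v
    WHERE u.customer = v.customer AND u.price > 100
    ORDER BY 1, 2
""").fetchall()

# Same query after inlining the view definition and unnesting.
unnested = conn.execute("""
    SELECT DISTINCT x.customer, v.store
    FROM Purchase x, Product y, Purchase v
    WHERE x.customer = v.customer AND y.price > 100 AND x.product = y.pname
    ORDER BY 1, 2
""").fetchall()

print(via_view)              # [('Ann', 'Ritz'), ('Ann', 'Wiz')]
print(via_view == unnested)  # True
```

Agreement on one instance is of course not a proof of equivalence; it is only a quick sanity check of the rewriting.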

Practice at Home…

Purchase(customer, product, store)
Product(pname, price)
CustomerPrice(customer, price)

SELECT DISTINCT u.customer, v.store
FROM CustomerPrice u, Purchase v
WHERE u.customer = v.customer AND
      u.price > 100

??

Answer

SELECT DISTINCT x.customer, v.store
FROM Purchase x, Product y, Purchase v
WHERE x.customer = v.customer AND
      y.price > 100 AND
      x.product = y.pname

Types of Views

• Virtual views:
  – Used in databases
  – Computed only on-demand – slow at runtime
  – Always up to date
• Materialized views
  – Used in databases and data warehouses
  – Pre-computed offline – fast at runtime
  – May have stale data or expensive synchronization

Technical Aspects

• View inlining, or query modification
• Query answering using views
• Updating views
• Incremental view update

[Figures, one per bullet: a database Db with a view V; the first two show a query Q producing an Answer, the last two show an Update]

Applications

Views have lots of applications
• Physical and logical data independence
• Security
• Indexes
• Denormalization
• Semantic caching
• …

Physical Data Independence: Vertical Partitioning

Resumes
SSN     Name  Address   Resume  Picture
234234  Mary  Huston    Clob1…  Blob1…
345345  Sue   Seattle   Clob2…  Blob2…
345343  Joan  Seattle   Clob3…  Blob3…
234234  Ann   Portland  Clob4…  Blob4…

T1
SSN     Name  Address
234234  Mary  Huston
345345  Sue   Seattle
. . .

T2
SSN     Resume
234234  Clob1…
345345  Clob2…

T3
SSN     Picture
234234  Blob1…
345345  Blob2…

Vertical Partitioning

CREATE VIEW Resumes AS
    SELECT T1.ssn, T1.name, T1.address,
           T2.resume, T3.picture
    FROM T1, T2, T3
    WHERE T1.ssn = T2.ssn and T2.ssn = T3.ssn

Vertical Partitioning

SELECT address
FROM Resumes
WHERE name = 'Sue'

We want the system to query only table T1.

Will that happen ?

Vertical Partitioning

• Hot trend in databases today for analytics
• Main idea:
  – Storage = Column(TID, value) pairs
  – Sort by TID: enables reconstructing the table
  – Compress: great compression, minimize I/O
  – Updates = VERY, VERY expensive
• Companies: C-Store and Vertica

Horizontal Partitioning

Customers
SSN     Name   City
234234  Mary   Huston
345345  Sue    Seattle
345343  Joan   Seattle
234234  Ann    Portland
--      Frank  Calgary
--      Jean   Montreal

CustomersInHuston
SSN     Name  City
234234  Mary  Huston

CustomersInSeattle
SSN     Name  City
345345  Sue   Seattle
345343  Joan  Seattle

. . . . .

Horizontal Partitioning

CREATE VIEW Customers AS
    CustomersInHuston
        UNION ALL
    CustomersInSeattle
        UNION ALL
    . . .

Horizontal Partitioning

SELECT name
FROM Customers
WHERE city = 'Seattle'

Which tables are queried by the system ?

WHY ???

Horizontal Partitioning

CREATE VIEW Customers AS
    CustomersInXXX
        UNION ALL
    CustomersInYYY
        UNION ALL
    . . .

SELECT name
FROM Customers
WHERE city = 'Seattle'

Now even humans can’t tell which table contains customers in Seattle

Horizontal Partitioning

A hack around the problem:

CREATE VIEW Customers AS
    (SELECT SSN, name, 'Huston' as city
     FROM CustomersInHuston)
        UNION ALL
    (SELECT SSN, name, 'Seattle' as city
     FROM CustomersInSeattle)
        UNION ALL
    . . .


Horizontal Partitioning

SELECT name
FROM Customers
WHERE city = 'Seattle'

Rewritten to:

SELECT name
FROM CustomersInSeattle
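The constant-column trick can be tried out in a small sketch. It uses SQLite through Python's sqlite3 module as a stand-in for the course's database system, and the sample rows are taken from the Customers table shown earlier. Whether a given optimizer really skips the Huston branch is system-dependent; the sketch only checks that the query over the view and the query over the single partition return the same answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One table per city; the city itself is implied by the table name.
    CREATE TABLE CustomersInHuston  (SSN TEXT, name TEXT);
    CREATE TABLE CustomersInSeattle (SSN TEXT, name TEXT);
    INSERT INTO CustomersInHuston  VALUES ('234234', 'Mary');
    INSERT INTO CustomersInSeattle VALUES ('345345', 'Sue'), ('345343', 'Joan');

    -- The view re-attaches the city as a constant in each branch.
    CREATE VIEW Customers AS
        SELECT SSN, name, 'Huston'  AS city FROM CustomersInHuston
        UNION ALL
        SELECT SSN, name, 'Seattle' AS city FROM CustomersInSeattle;
""")

via_view = sorted(conn.execute(
    "SELECT name FROM Customers WHERE city = 'Seattle'").fetchall())
via_partition = sorted(conn.execute(
    "SELECT name FROM CustomersInSeattle").fetchall())
print(via_view == via_partition)
```

Because each branch carries its city as a literal, a system can see that city = 'Seattle' is unsatisfiable in the Huston branch without reading any rows.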


Data Integration Terminology

Global as View: the integrated data is defined as a view V over Local DB1, …, Local DBk

Local as View: the local databases Local DB1, …, Local DBk are defined as views V1, …, Vk over the integrated data

Which one needs query expansion, which one needs query answering using views ?


Horizontal Partitioning as LAV

CREATE VIEW CustomersInSeattle AS
  (SELECT * FROM Customers WHERE city = 'Seattle')
CREATE VIEW CustomersInHuston AS
  (SELECT * FROM Customers WHERE city = 'Huston')
….

SELECT name
FROM Customers
WHERE city = 'Seattle'

Rewritten to:

SELECT name
FROM CustomersInSeattle


Views and Security

Customers:

Name   Address    Balance
Mary   Huston     450.99
Sue    Seattle    -240
Joan   Seattle    333.25
Ann    Portland   -520

Fred is not allowed to see Balance:

CREATE VIEW PublicCustomers AS
  SELECT Name, Address
  FROM Customers

John is not allowed to see >0 balances:

CREATE VIEW BadCreditCustomers AS
  SELECT *
  FROM Customers
  WHERE Balance < 0


Indexes

REALLY important to speed up query processing time.

Person (name, age, city)

SELECT *
FROM Person
WHERE name = 'Smith'

May take too long to scan the entire Person table

CREATE INDEX myindex05 ON Person(name)

Now, when we rerun the query it will be much faster


B+ Tree Index


Adam Betty Charles …. Smith ….

We will discuss them in detail in a later lecture.


Creating Indexes

Indexes can be created on more than one attribute:

Example:

CREATE INDEX doubleindex ON Person (age, city)

Helps in:

SELECT *
FROM Person
WHERE age = 55 AND city = 'Seattle'

and even in:

SELECT *
FROM Person
WHERE age = 55

But not in:

SELECT *
FROM Person
WHERE city = 'Seattle'
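The three cases can be observed in a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in system. EXPLAIN QUERY PLAN is SQLite-specific and its wording varies between versions, so the sketch only looks for the index name in the plan text. The helper `plan` is introduced here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (name TEXT, age INTEGER, city TEXT);
    CREATE INDEX doubleindex ON Person (age, city);
""")

def plan(sql):
    """Return SQLite's query plan for `sql` as one string (detail column only)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

both = plan("SELECT * FROM Person WHERE age = 55 AND city = 'Seattle'")
age_only = plan("SELECT * FROM Person WHERE age = 55")
city_only = plan("SELECT * FROM Person WHERE city = 'Seattle'")

for p in (both, age_only, city_only):
    print(p)
```

The index is sorted by age first and by city only within equal ages, so a condition on city alone does not identify a contiguous range of index entries.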


Indexes are Materialized Views

Product(pid, name, weight, price, …)

CREATE INDEX W ON Product(weight)
CREATE INDEX P ON Product(price)

Indexes as LAV:

CREATE VIEW W AS
  SELECT weight, pid FROM Product y
CREATE VIEW P AS
  SELECT price, pid FROM Product y

SELECT weight, price
FROM Product
WHERE weight > 10 and price < 100

“Covering indexes”: the query uses only the indexes:

SELECT x.weight, y.price
FROM W x, P y
WHERE x.weight > 10 and y.price < 100 and x.pid = y.pid
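As a sanity check, the sketch below models the two indexes as ordinary views W and P, exactly as in the LAV definitions, and compares the original query with the one that touches only W and P. SQLite (via Python's sqlite3) is an assumption, as are the sample products. SQLite views are virtual rather than materialized, so this only illustrates that the rewriting returns the same answers, not any speedup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (pid INTEGER PRIMARY KEY, name TEXT, weight REAL, price REAL);
    INSERT INTO Product VALUES
        (1, 'gizmo', 12, 50), (2, 'gadget', 5, 20),
        (3, 'anvil', 80, 300), (4, 'widget', 15, 99);

    -- Each index is modeled as a view holding (indexed value, pid).
    CREATE VIEW W AS SELECT weight, pid FROM Product;
    CREATE VIEW P AS SELECT price,  pid FROM Product;
""")

original = sorted(conn.execute(
    "SELECT weight, price FROM Product WHERE weight > 10 AND price < 100"))

# The rewritten query reads only W and P and joins them back on pid.
rewritten = sorted(conn.execute("""
    SELECT x.weight, y.price
    FROM W x, P y
    WHERE x.weight > 10 AND y.price < 100 AND x.pid = y.pid"""))

print(original == rewritten)
```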


Denormalization

Purchase(customer, product, store)
Product(pname, price)

• Compute a view that is the join of several tables

CREATE VIEW CustomerPurchase AS
  SELECT x.customer, x.store, y.pname, y.price
  FROM Purchase x, Product y
  WHERE x.product = y.pname

• The view is now a relation that is not in BCNF (why not?)


Semantic Caching

• Queries Q1, Q2, … have been executed, and their results are stored at the client
• Now we need to compute a new query Q
• Sometimes we can use the prior results in answering Q
• These queries can be seen as materialized views
• The problem becomes: answer Q using the views Q1, Q2, …


Technical Aspects of Views

• Simplifying queries after the views have been in-lined
  – Query un-nesting
  – Query minimization
• Handling updates
  – Updating virtual views
  – Incremental update of materialized views


Set vs. Bag Semantics

Set semantics:

SELECT DISTINCT a,b,c
FROM R, S, T
WHERE . . .

Bag semantics:

SELECT a,b,c
FROM R, S, T
WHERE . . .


Unnesting: Sets/Sets

SELECT DISTINCT a,b,c
FROM (SELECT DISTINCT u,v
      FROM R,S
      WHERE …), T
WHERE . . .

Unnested:

SELECT DISTINCT a,b,c
FROM R, S, T
WHERE . . .


Unnesting: Sets/Bags

SELECT DISTINCT a,b,c
FROM (SELECT u,v
      FROM R,S
      WHERE …), T
WHERE . . .

Unnested:

SELECT DISTINCT a,b,c
FROM R, S, T
WHERE . . .


Unnesting: Bags/Bags

SELECT a,b,c
FROM (SELECT u,v
      FROM R,S
      WHERE …), T
WHERE . . .

Unnested:

SELECT a,b,c
FROM R, S, T
WHERE . . .


Unnesting: Bags/Sets

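SELECT a,b,c
FROM (SELECT DISTINCT u,v
      FROM R,S
      WHERE …), T
WHERE . . .

NO: this query cannot be unnested. The inner DISTINCT removes duplicates that the flat bag query would keep.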
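A tiny instance makes the Bags/Sets case concrete. The sketch assumes SQLite via Python's sqlite3 and three hypothetical one-column tables R(u), S(v), T(w); the duplicate row in R is deliberate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R (u INTEGER);
    CREATE TABLE S (v INTEGER);
    CREATE TABLE T (w INTEGER);
    INSERT INTO R VALUES (1), (1);   -- duplicate on purpose
    INSERT INTO S VALUES (7);
    INSERT INTO T VALUES (1);
""")

# Outer query under bag semantics, inner query under set semantics.
nested = conn.execute("""
    SELECT q.u, q.v, T.w
    FROM (SELECT DISTINCT R.u, S.v FROM R, S) q, T
    WHERE q.u = T.w""").fetchall()

# Naive unnesting: same tables and condition, but no inner DISTINCT.
flat = conn.execute("""
    SELECT R.u, S.v, T.w
    FROM R, S, T
    WHERE R.u = T.w""").fetchall()

print(len(nested), len(flat))
```

Both queries return the same set of tuples, but with different multiplicities, which is exactly what bag semantics must preserve.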


Query Minimization

• Replace a query Q with Q’ having fewer tables in the FROM clause
• When Q has the fewest possible tables in the FROM clause, we say it is minimized
• Usually (but not always) users write queries that are already minimized
• But the result of rewriting a query over views is often not minimized


Query Minimization under Bag Semantics

Rule 1: If:
• x, y are tuple variables over the same table, and
• the condition x.key = y.key is in the WHERE clause
Then combine x, y into a single variable in the query


Query Minimization under Bag Semantics

Order(cid, pid, weight, date)
Product(pid, name, price)

SELECT x.date, y.name
FROM Order x, Product y, Order z
WHERE x.pid = y.pid and y.price < 99 and z.weight > 150
  and y.pid = z.pid and x.cid = z.cid

Minimized:

SELECT x.date, y.name
FROM Order x, Product y
WHERE x.pid = y.pid and y.price < 99 and x.weight > 150

What constraints do we need to have for this optimization ?
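One constraint that suffices is that (cid, pid) is a key of Order: then x and z, which agree on cid and pid, must be the same tuple, and Rule 1 applies. The sketch below assumes that key, uses SQLite via Python's sqlite3, and renames the table to Orders because ORDER is a reserved word. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (pid INTEGER PRIMARY KEY, name TEXT, price REAL);
    CREATE TABLE Orders (cid INTEGER, pid INTEGER, weight REAL, date TEXT,
                         PRIMARY KEY (cid, pid));   -- assumed key
    INSERT INTO Product VALUES (1, 'gizmo', 50), (2, 'anvil', 500);
    INSERT INTO Orders VALUES
        (10, 1, 200, '2011-10-01'), (11, 1, 20, '2011-10-02'),
        (10, 2, 300, '2011-10-03');
""")

with_self_join = sorted(conn.execute("""
    SELECT x.date, y.name
    FROM Orders x, Product y, Orders z
    WHERE x.pid = y.pid AND y.price < 99 AND z.weight > 150
      AND y.pid = z.pid AND x.cid = z.cid"""))

# x and z agree on the key (cid, pid), so they are merged into x.
minimized = sorted(conn.execute("""
    SELECT x.date, y.name
    FROM Orders x, Product y
    WHERE x.pid = y.pid AND y.price < 99 AND x.weight > 150"""))

print(with_self_join, minimized)
```

Without the key, two different orders with the same cid and pid could pair up in the self-join and change the multiplicities of the answers.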


Query Minimization under Bag Semantics

Rule 2: If:
• x ranges over S, y ranges over T, and
• the condition x.fk = y.key is in the WHERE clause (fk is a foreign key in S referencing the key of T), and
• there is a not null constraint on x.fk, and
• y is not used anywhere else
Then remove T (and y) from the query


Query Minimization under Bag Semantics

Order(cid, pid, weight, date)
Product(pid, name, price)

SELECT x.cid, x.date
FROM Order x, Product y
WHERE x.pid = y.pid and x.weight > 20

Minimized:

SELECT x.cid, x.date
FROM Order x
WHERE x.weight > 20

What constraints do we need to have for this optimization ?

Q: Where do we encounter non-minimized queries ?
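Rule 2 suggests the constraints: pid in Order must be a non-null foreign key to Product, whose key is pid, so that every order joins with exactly one product. The sketch below declares those constraints (an assumption about the intended answer), again using SQLite via Python's sqlite3 with the table renamed to Orders and made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
    CREATE TABLE Product (pid INTEGER PRIMARY KEY, name TEXT, price REAL);
    CREATE TABLE Orders (
        cid INTEGER,
        pid INTEGER NOT NULL REFERENCES Product(pid),
        weight REAL,
        date TEXT);
    INSERT INTO Product VALUES (1, 'gizmo', 50), (2, 'anvil', 500);
    INSERT INTO Orders VALUES
        (10, 1, 25, '2011-10-01'), (11, 2, 30, '2011-10-02'),
        (12, 1,  5, '2011-10-03');
""")

with_join = sorted(conn.execute("""
    SELECT x.cid, x.date
    FROM Orders x, Product y
    WHERE x.pid = y.pid AND x.weight > 20"""))

# Product contributes nothing to the output, so it is dropped.
without_join = sorted(conn.execute(
    "SELECT x.cid, x.date FROM Orders x WHERE x.weight > 20"))

# An order without a product would break the equivalence; the constraint rejects it.
try:
    conn.execute("INSERT INTO Orders VALUES (13, NULL, 40, '2011-10-04')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(with_join == without_join, rejected)
```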


Query Minimization under Bag Semantics

A: in queries resulting from view inlining

Order(cid, pid, weight, date)
Product(pid, name, price)

CREATE VIEW CheapOrders AS
  SELECT x.cid, x.pid, x.date, y.name, y.price
  FROM Order x, Product y
  WHERE x.pid = y.pid and y.price < 99

CREATE VIEW HeavyOrders AS
  SELECT a.cid, a.pid, a.date, b.name, b.price
  FROM Order a, Product b
  WHERE a.pid = b.pid and a.weight > 150

Customers who ordered cheap, heavy products:

SELECT u.cid
FROM CheapOrders u, HeavyOrders v
WHERE u.pid = v.pid and u.cid = v.cid

After view inlining:

SELECT a.cid
FROM Order x, Product y, Order a, Product b
WHERE . . . .

Redundant Orders and Products


SELECT a.cid
FROM Order x, Product y, Order a, Product b
WHERE x.pid = y.pid and a.pid = b.pid
  and y.price < 99 and a.weight > 150
  and x.cid = a.cid and x.pid = a.pid

x = a:

SELECT x.cid
FROM Order x, Product y, Product b
WHERE x.pid = y.pid and x.pid = b.pid
  and y.price < 99 and x.weight > 150

y = b:

SELECT x.cid
FROM Order x, Product y
WHERE x.pid = y.pid and y.price < 99 and x.weight > 150


Query Minimization under Set Semantics

• Rules 1 and 2 still apply

• Rule 3 involves homomorphisms


Definition of a Homomorphism

Q:

SELECT DISTINCT A
FROM R1 x1, …, Rk xk
WHERE C

Q’:

SELECT DISTINCT A’
FROM R1’ y1, …, Rm’ ym
WHERE C’

A homomorphism from Q’ to Q is a mapping h : {y1, …, ym} → {x1, …, xk} such that:
(a) If h(yi) = xj, then Ri’ = Rj
(b) C logically implies h(C’), and
(c) h(A’) = A


Definition of a Homomorphism

Theorem: If there exists a homomorphism from Q’ to Q, then every answer returned by Q is also returned by Q’.

We say that Q is contained in Q’.

If there exists a homomorphism from Q’ to Q, and a homomorphism from Q to Q’, then Q and Q’ are equivalent.


Find Homomorphism

Order(cid, pid, weight, date)
Product(pid, name, price)

Q:

SELECT DISTINCT x.cid
FROM Order x, Product y
WHERE x.pid = y.pid and y.price < 99 and x.weight > 150

Q’:

SELECT DISTINCT x.cid
FROM Order x, Product y, Order z
WHERE x.pid = y.pid and y.pid = z.pid and y.price < 99
  and x.weight > 150 and z.weight > 100

Homomorphism Q’ → Q

Every answer to Q is also an answer to Q’. WHY ?
Map x to x, y to y, and z to x: the condition of Q implies the image of every condition of Q’ (in particular, x.weight > 150 implies x.weight > 100).

Homomorphism Q → Q’

Map x to x and y to y: every condition of Q already appears in Q’.

Q and Q’ are equivalent !


Query Minimization under Set Semantics

SELECT DISTINCT x.pid
FROM Product x, Product y, Product z
WHERE x.category = y.category and y.price > 100
  and x.category = z.category and z.price > 500 and z.weight > 10

Same as:

SELECT DISTINCT x.pid
FROM Product x, Product z
WHERE x.category = z.category and z.price > 500 and z.weight > 10


Query Minimization under Set Semantics

Rule 3: Let Q’ be the query obtained by removing the tuple variable x from Q. If:

• Q has set semantics (and same for Q’)
• there exists a homomorphism from Q to Q’

Then Q’ is equivalent to Q. Hence one can safely remove x.


Example

Q:

SELECT DISTINCT x.pid
FROM Product x, Product y, Product z
WHERE x.category = y.category and y.price > 100
  and x.category = z.category and z.price > 500 and z.weight > 10

Q’:

SELECT DISTINCT x’.pid
FROM Product x’, Product z’
WHERE x’.category = z’.category and z’.price > 500 and z’.weight > 10

Find a homomorphism h: Q → Q’

Answer: h(x) = x’, h(y) = h(z) = z’
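The answer can also be found mechanically by trying every mapping of tuple variables and checking conditions (a), (b), (c) of the definition. The following Python sketch does this by brute force for the two queries of this example. The encoding of queries as dictionaries, and the names `implied`, `rename`, and `homomorphisms`, are assumptions made for illustration. The implication test is deliberately simple: it is sound but incomplete (for instance, it does not reason about transitivity of equalities).

```python
from itertools import product

# A term is (tuple variable, attribute).
# Atoms: ("eq", term, term) or ("gt", term, constant).
Q = {
    "vars": {"x": "Product", "y": "Product", "z": "Product"},
    "cond": {("eq", ("x", "category"), ("y", "category")),
             ("gt", ("y", "price"), 100),
             ("eq", ("x", "category"), ("z", "category")),
             ("gt", ("z", "price"), 500),
             ("gt", ("z", "weight"), 10)},
    "out": ("x", "pid"),
}
Q2 = {  # the query called Q' in the slides
    "vars": {"x'": "Product", "z'": "Product"},
    "cond": {("eq", ("x'", "category"), ("z'", "category")),
             ("gt", ("z'", "price"), 500),
             ("gt", ("z'", "weight"), 10)},
    "out": ("x'", "pid"),
}

def rename(term, h):
    var, attr = term
    return (h[var], attr)

def implied(atom, cond):
    """Sound but incomplete check that the conjunction `cond` implies `atom`."""
    kind, left, right = atom
    if kind == "eq":
        return left == right or atom in cond or ("eq", right, left) in cond
    # t > c follows from any t > c2 with c2 >= c.
    return any(k == "gt" and t == left and c >= right for k, t, c in cond)

def homomorphisms(src, dst):
    """Yield every h: vars(src) -> vars(dst) satisfying conditions (a), (b), (c)."""
    names = list(src["vars"])
    for image in product(dst["vars"], repeat=len(names)):
        h = dict(zip(names, image))
        same_relation = all(src["vars"][v] == dst["vars"][h[v]] for v in names)
        conds_ok = all(
            implied((k, rename(l, h), rename(r, h) if k == "eq" else r), dst["cond"])
            for k, l, r in src["cond"])
        if same_relation and conds_ok and rename(src["out"], h) == dst["out"]:
            yield h

found = list(homomorphisms(Q, Q2))
print(found)
```

Calling `homomorphisms(Q2, Q)` searches in the opposite direction; finding a mapping both ways is what the theorem requires for equivalence.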


Updating Views

Purchase(customer, product, store)
Product(pname, price)

CREATE VIEW Expensive-Product AS
  SELECT pname
  FROM Product
  WHERE price > 100

INSERT INTO Expensive-Product VALUES('Gizmo')

Updateable view


Updatable Views

• Have a virtual view V(A1, A2, …) over tables R1, R2, …
• User wants to update a tuple in V
  – Insert/modify/delete
• Can we translate this into updates to R1, R2, … ?
• If yes: V = “an updateable view”
• If not: V = “a non-updateable view”


Updating Views

Purchase(customer, product, store)
Product(pname, price)

CREATE VIEW Expensive-Product AS
  SELECT pname
  FROM Product
  WHERE price > 100

INSERT INTO Expensive-Product VALUES('Gizmo')

Translated to:

INSERT INTO Product VALUES('Gizmo', NULL)

Updateable view


Updating Views

Purchase(customer, product, store)
Product(pname, price)

CREATE VIEW AcmePurchase AS
  SELECT customer, product
  FROM Purchase
  WHERE store = 'AcmeStore'

INSERT INTO AcmePurchase VALUES('Joe', 'Gizmo')

Translated to:

INSERT INTO Purchase VALUES('Joe', 'Gizmo', NULL)

Note this: the store is NULL, so the inserted tuple does not show up in the view.

Updateable view
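The translation can be reproduced in a short sketch. SQLite (via Python's sqlite3) is assumed here, and its views are read-only, so the translation rule is written out by hand as an INSTEAD OF trigger. The trigger name AcmePurchaseInsert and the sample row for Ann are made up for illustration; other systems may translate the insert automatically and differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Purchase (customer TEXT, product TEXT, store TEXT);
    INSERT INTO Purchase VALUES ('Ann', 'Widget', 'AcmeStore');

    CREATE VIEW AcmePurchase AS
        SELECT customer, product FROM Purchase WHERE store = 'AcmeStore';

    -- Hand-written translation rule: pad the missing column with NULL.
    CREATE TRIGGER AcmePurchaseInsert INSTEAD OF INSERT ON AcmePurchase
    BEGIN
        INSERT INTO Purchase VALUES (NEW.customer, NEW.product, NULL);
    END;

    INSERT INTO AcmePurchase VALUES ('Joe', 'Gizmo');
""")

base = conn.execute("SELECT * FROM Purchase ORDER BY customer").fetchall()
view = conn.execute("SELECT * FROM AcmePurchase").fetchall()
print(base)   # Joe's purchase is stored with store = NULL
print(view)   # ... and therefore is not visible through the view
```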


Updating Views

Purchase(customer, product, store)
Product(pname, price)

CREATE VIEW CustomerPrice AS
  SELECT x.customer, y.price
  FROM Purchase x, Product y
  WHERE x.product = y.pname

INSERT INTO CustomerPrice VALUES('Joe', 200)

? ? ? ? ?

Non-updateable view

Most views are non-updateable


Incremental View Update

Also known as view synchronization
• Immediate synchronization = after each update
• Deferred synchronization
  – Lazy = at query time
  – Periodic
  – Forced = manual


Incremental View Update

Order(cid, pid, date)
Product(pid, name, price)

CREATE VIEW FullOrder AS
  SELECT x.cid, x.pid, x.date, y.name, y.price
  FROM Order x, Product y
  WHERE x.pid = y.pid

UPDATE Product
SET price = price / 2
WHERE pid = '12345'

Propagated to the view as:

UPDATE FullOrder
SET price = price / 2
WHERE pid = '12345'

No need to recompute the entire view !


Incremental View Update

Product(pid, name, category, price)

CREATE VIEW Categories AS
  SELECT DISTINCT category
  FROM Product

DELETE FROM Product
WHERE pid = '12345'

Propagated to the view as:

DELETE FROM Categories
WHERE category in (SELECT category
                   FROM Product
                   WHERE pid = '12345')

It doesn’t work ! Why ? How can we fix it ?


Incremental View Update

Product(pid, name, category, price)

CREATE VIEW Categories AS
  SELECT category, count(*) as c
  FROM Product
  GROUP BY category

DELETE FROM Product
WHERE pid = '12345'

Propagated to the view as:

UPDATE Categories
SET c = c-1
WHERE category in (SELECT category
                   FROM Product
                   WHERE pid = '12345');
DELETE FROM Categories
WHERE c = 0
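The counting fix can be exercised end to end in a small sketch. SQLite (via Python's sqlite3) has no materialized views, so an ordinary table named Categories stands in for one; the sample products and the helpers `delete_product` and `recomputed` are assumptions for illustration. Note the order of the statements: the counts are adjusted while the row being deleted is still in Product, because the UPDATE looks up its category there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (pid TEXT PRIMARY KEY, name TEXT, category TEXT, price REAL);
    INSERT INTO Product VALUES
        ('12345', 'gizmo',  'gadgets', 20),
        ('22222', 'widget', 'gadgets', 30),
        ('33333', 'anvil',  'tools',   90);

    -- A plain table stands in for the materialized view.
    CREATE TABLE Categories AS
        SELECT category, count(*) AS c FROM Product GROUP BY category;
""")

def delete_product(pid):
    """Delete one product and patch Categories instead of recomputing it."""
    conn.execute("""UPDATE Categories SET c = c - 1
                    WHERE category IN (SELECT category FROM Product WHERE pid = ?)""",
                 (pid,))
    conn.execute("DELETE FROM Categories WHERE c = 0")
    conn.execute("DELETE FROM Product WHERE pid = ?", (pid,))

def recomputed():
    """Recompute the view from scratch, for comparison."""
    return sorted(conn.execute(
        "SELECT category, count(*) FROM Product GROUP BY category"))

delete_product('12345')   # 'gadgets' still has one product left
after_first = sorted(conn.execute("SELECT category, c FROM Categories"))
delete_product('33333')   # last product in 'tools': the category disappears
after_second = sorted(conn.execute("SELECT category, c FROM Categories"))

print(after_first, after_second, after_second == recomputed())
```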


Answering Queries Using Views

[See the paper]
• We have several materialized views:
  – V1, V2, …, Vn
• Given a query Q
  – Answer it by using views instead of base tables
• Variation: Query rewriting using views
  – Answer it by rewriting it to another query first
• Example: if the views are indexes, then we rewrite the query to use indexes


Rewriting Queries Using Views

Purchase(buyer, seller, product, store)
Person(pname, city)

Have this materialized view:

CREATE VIEW SeattleView AS
  SELECT y.buyer, y.seller, y.product, y.store
  FROM Person x, Purchase y
  WHERE x.city = 'Seattle' AND x.pname = y.buyer

Goal: rewrite this query in terms of the view:

SELECT y.buyer, y.seller
FROM Person x, Purchase y
WHERE x.city = 'Seattle' AND x.pname = y.buyer AND y.product = 'gizmo'

Rewritten query:

SELECT buyer, seller
FROM SeattleView
WHERE product = 'gizmo'
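The equivalence of the original query and its rewriting over SeattleView can be checked on sample data. The sketch assumes SQLite via Python's sqlite3, where the view is virtual rather than materialized, and the rows for Joe and Ann are made up. It illustrates the rewriting on one instance and is not a proof of equivalence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (pname TEXT PRIMARY KEY, city TEXT);
    CREATE TABLE Purchase (buyer TEXT, seller TEXT, product TEXT, store TEXT);
    INSERT INTO Person VALUES ('Joe', 'Seattle'), ('Ann', 'Portland');
    INSERT INTO Purchase VALUES
        ('Joe', 'Sue', 'gizmo',  'Acme'),
        ('Joe', 'Bob', 'widget', 'Acme'),
        ('Ann', 'Sue', 'gizmo',  'Acme');

    CREATE VIEW SeattleView AS
        SELECT y.buyer, y.seller, y.product, y.store
        FROM Person x, Purchase y
        WHERE x.city = 'Seattle' AND x.pname = y.buyer;
""")

over_base_tables = sorted(conn.execute("""
    SELECT y.buyer, y.seller
    FROM Person x, Purchase y
    WHERE x.city = 'Seattle' AND x.pname = y.buyer AND y.product = 'gizmo'"""))

# The view already contains the join and the city filter; only the
# product filter remains to be applied.
over_view = sorted(conn.execute(
    "SELECT buyer, seller FROM SeattleView WHERE product = 'gizmo'"))

print(over_base_tables, over_view)
```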


Rewriting is not always possible

CREATE VIEW DifferentView AS
  SELECT y.buyer, y.seller, y.product, y.store
  FROM Person x, Purchase y, Product z
  WHERE x.city = 'Seattle' AND x.pname = y.buyer
    AND y.product = z.name AND z.price < 100

SELECT y.buyer, y.seller
FROM Person x, Purchase y
WHERE x.city = 'Seattle' AND x.pname = y.buyer AND y.product = 'gizmo'

“Maximally contained rewriting”:

SELECT buyer, seller
FROM DifferentView
WHERE product = 'gizmo'