Lecture 09: Functional Dependencies, Database Design

33
1 Lecture 09: Functional Dependencies, Database Design Monday, January 24, 2005

description

Lecture 09: Functional Dependencies, Database Design. Monday, January 24, 2005. Outline. Functional dependencies (3.4) Rules about FDs (3.5) Design of a Relational schema (3.6). Functional Dependencies. Definition: A 1 , ..., A m  B 1 , ..., B n holds in R if: - PowerPoint PPT Presentation

Transcript of Lecture 09: Functional Dependencies, Database Design

Page 1: Lecture 09: Functional Dependencies, Database Design

1

Lecture 09:Functional Dependencies,

Database Design

Monday, January 24, 2005

Page 2: Lecture 09: Functional Dependencies, Database Design

2

Outline

• Functional dependencies (3.4)

• Rules about FDs (3.5)

• Design of a Relational schema (3.6)

Page 3: Lecture 09: Functional Dependencies, Database Design

3

Functional Dependencies

Definition: A1, ..., Am B1, ..., Bn holds in R if:

t, t’ R, (t.A1=t’.A1 ... t.Am=t’.Am t.B1=t’.B1 ... t.Bn=t’.Bn )

A1 ... Am B1 ... Bm

if t, t’ agree here then t, t’ agree here

t

t’

R

Page 4: Lecture 09: Functional Dependencies, Database Design

4

In Class: Find All FDs

Student Dept Course Room

Alice CSE C++ 020

Bob CSE C++ 020

Alice EE HW 040

Carol CSE DB 045

Dan CSE Java 050

Elsa CSE DB 045

Frank EE Circuits 020

Page 5: Lecture 09: Functional Dependencies, Database Design

5

Answer

Course Dept, RoomDept, Room CourseStudent, Dept Course, RoomStudent, Course Dept, RoomStudent, Room Dept, Course

Course Dept, RoomDept, Room CourseStudent, Dept Course, RoomStudent, Course Dept, RoomStudent, Room Dept, Course

Do all FDsmake sensein practice ?

Page 6: Lecture 09: Functional Dependencies, Database Design

6

Review: Closure

Compute {A,B}+ X = {A, B, }

Compute {A, F}+ X = {A, F, }

R(A,B,C,D,E,F) A, B CA, D EB DA, F B

A, B CA, D EB DA, F B

Example

Page 7: Lecture 09: Functional Dependencies, Database Design

7

Review: Inference

A, B CA, D BB D

A, B CA, D BB D

Example:

Step 1: Compute X+, for every X:

A+ = A, B+ = BD, C+ = C, D+ = DAB+ = ABCD, AC+ = AC, AD+ = ABCDABC+ = ABD+ = ACD+ = ABCD (no need to compute– why ?)BCD+ = BCD, ABCD+ = ABCD

A+ = A, B+ = BD, C+ = C, D+ = DAB+ = ABCD, AC+ = AC, AD+ = ABCDABC+ = ABD+ = ACD+ = ABCD (no need to compute– why ?)BCD+ = BCD, ABCD+ = ABCD

Step 2: Enumerate all FD’s X Y, s.t. Y X+ and XY = :

AB CD, ADBC, ABC D, ABD C, ACD BAB CD, ADBC, ABC D, ABD C, ACD B

Page 8: Lecture 09: Functional Dependencies, Database Design

8

Keys

• A superkey is a set of attributes A1, ..., An s.t. for any other attribute B, we have A1, ..., An B

• In other words: a set of attributes X is a superkey if X+ = all attributes

• A key is a minimal superkey– I.e. set of attributes which is a superkey and for which no subset

is a superkey

Page 9: Lecture 09: Functional Dependencies, Database Design

9

Computing (Super)Keys

• Compute X+ for all sets X

• If X+ = all attributes, then X is a key

• List only the minimal X’s

Note: there can be many keys !

• Example: R(A,B,C), ABC, BCAKeys: AB and BC

Page 10: Lecture 09: Functional Dependencies, Database Design

10

Examples of Keys• Product(name, price, category, color)

name, category pricecategory color

Key is: {name, category} ; superkeys are all supersets

• Enrollment(student, address, course, room, time)student addressroom, time coursestudent, course room, time

Keys are: [in class]

Page 11: Lecture 09: Functional Dependencies, Database Design

11

FD’s for E/R Diagrams

Given a relation constructed from an E/R diagram, what is its key?

Rule 1: If the relation comes from an entity set, the key of the relation is the set of attributes which is the key of the entity set.

address name ssn

Person Person(address, name, ssn)

Page 12: Lecture 09: Functional Dependencies, Database Design

12

FD’s for E/R Diagrams

PersonbuysProduct

name

price name ssn

buys(name, ssn, date)

date

Rule 2: If the relation comes from a many-many relationship, the key of the relation is the set of all attribute keys in the relations corresponding to the entity sets

Page 13: Lecture 09: Functional Dependencies, Database Design

13

FD’s for E/R DiagramsExcept: if there is an arrow from the relationship to E, then we don’t need the key of E as part of the relation key.

Purchase

Product

Person

Store

CreditCard

name

card-nossn

sname

Purchase(name , sname, ssn, card-no)

Page 14: Lecture 09: Functional Dependencies, Database Design

14

FD’s for E/R Diagrams

More rules:

• Many-one, one-many, one-one relationships

• Multi-way relationships

• Weak entity sets

(Try to find them yourself, or check book)

Page 15: Lecture 09: Functional Dependencies, Database Design

15

FD’s for E/R DiagramsSay: “the CreditCard determines the Person”

Purchase

Product

Person

Store

CreditCard

name

card-nossn

sname

Purchase(name , sname, ssn, card-no)

Incomplete(what does

it say ?)

card-no ssn

Page 16: Lecture 09: Functional Dependencies, Database Design

16

Relational Schema Design(or Logical Design)

Main idea:

• Start with some relational schema

• Find out its FD’s

• Use them to design a better relational schema

Page 17: Lecture 09: Functional Dependencies, Database Design

17

Data Anomalies

When a database is poorly designed we get anomalies:

Redundancy: data is repeated

Updated anomalies: need to change in several places

Delete anomalies: may lose data when we don’t want

Page 18: Lecture 09: Functional Dependencies, Database Design

18

Relational Schema Design

Anomalies:• Redundancy = repeat data• Update anomalies = Fred moves to “Bellevue”• Deletion anomalies = Joe deletes his phone number:

what is his city ?

Recall set attributes (persons with several phones):

SSN Name, CitySSN Name, City

Name SSN PhoneNumber City

Fred 123-45-6789 206-555-1234 Seattle

Fred 123-45-6789 206-555-6543 Seattle

Joe 987-65-4321 908-555-2121 Westfield

but not SSN PhoneNumber

Page 19: Lecture 09: Functional Dependencies, Database Design

19

Relation DecompositionBreak the relation into two:

Name SSN City

Fred 123-45-6789 Seattle

Joe 987-65-4321 Westfield

SSN PhoneNumber

123-45-6789 206-555-1234

123-45-6789 206-555-6543

987-65-4321 908-555-2121Anomalies have gone:• No more repeated data• Easy to move Fred to “Bellevue” (how ?)• Easy to delete all Joe’s phone number (how ?)

Name SSN PhoneNumber City

Fred 123-45-6789 206-555-1234 Seattle

Fred 123-45-6789 206-555-6543 Seattle

Joe 987-65-4321 908-555-2121 Westfield

Page 20: Lecture 09: Functional Dependencies, Database Design

20

Relational Schema Design

PersonbuysProduct

name

price name ssn

Conceptual Model:

Relational Model:plus FD’s

Normalization:Eliminates anomalies

Page 21: Lecture 09: Functional Dependencies, Database Design

21

Decompositions in General

R1 = projection of R on A1, ..., An, B1, ..., Bm

R2 = projection of R on A1, ..., An, C1, ..., Cp

R(A1, ..., An, B1, ..., Bm, C1, ..., Cp) R(A1, ..., An, B1, ..., Bm, C1, ..., Cp)

R1(A1, ..., An, B1, ..., Bm)R1(A1, ..., An, B1, ..., Bm) R2(A1, ..., An, C1, ..., Cp)R2(A1, ..., An, C1, ..., Cp)

Page 22: Lecture 09: Functional Dependencies, Database Design

22

Decomposition

• Sometimes it is correct:Name Price Category

Gizmo 19.99 Gadget

OneClick 24.99 Camera

Gizmo 19.99 Camera

Name Price

Gizmo 19.99

OneClick 24.99

Gizmo 19.99

Name Category

Gizmo Gadget

OneClick Camera

Gizmo Camera

Lossless decomposition

Page 23: Lecture 09: Functional Dependencies, Database Design

23

Incorrect Decomposition

• Sometimes it is not:

Name Price Category

Gizmo 19.99 Gadget

OneClick 24.99 Camera

Gizmo 19.99 Camera

Name Category

Gizmo Gadget

OneClick Camera

Gizmo Camera

Price Category

19.99 Gadget

24.99 Camera

19.99 Camera

What’sincorrect ??

Lossy decomposition

Page 24: Lecture 09: Functional Dependencies, Database Design

24

Decompositions in General

R(A1, ..., An, B1, ..., Bm, C1, ..., Cp) R(A1, ..., An, B1, ..., Bm, C1, ..., Cp)

If A1, ..., An B1, ..., Bm

Then the decomposition is lossless

R1(A1, ..., An, B1, ..., Bm)R1(A1, ..., An, B1, ..., Bm) R2(A1, ..., An, C1, ..., Cp)R2(A1, ..., An, C1, ..., Cp)

Example: name price, hence the first decomposition is lossless

Note: don’t need A1, ..., An C1, ..., Cp

Page 25: Lecture 09: Functional Dependencies, Database Design

25

Normal Forms

First Normal Form = all attributes are atomic

Second Normal Form (2NF) = old and obsolete

Third Normal Form (3NF) = will discuss

Boyce Codd Normal Form (BCNF) = will discuss

Others...

Page 26: Lecture 09: Functional Dependencies, Database Design

26

Boyce-Codd Normal Form

A simple condition for removing anomalies from relations:

In English (though a bit vague):

Whenever a set of attributes of R is determining another attribute, should determine all the attributes of R.

A relation R is in BCNF if:

If A1, ..., An B is a non-trivial dependency

in R , then {A1, ..., An} is a superkey for R

A relation R is in BCNF if:

If A1, ..., An B is a non-trivial dependency

in R , then {A1, ..., An} is a superkey for R

Page 27: Lecture 09: Functional Dependencies, Database Design

27

BCNF Decomposition Algorithm

A’s OthersB’s

R1

Is there a 2-attribute relation that isnot in BCNF ?

Repeat choose A1, …, Am B1, …, Bn that violates the BNCF condition split R into R1(A1, …, Am, B1, …, Bn) and R2(A1, …, Am, [others]) continue with both R1 and R2

Until no more violations

Repeat choose A1, …, Am B1, …, Bn that violates the BNCF condition split R into R1(A1, …, Am, B1, …, Bn) and R2(A1, …, Am, [others]) continue with both R1 and R2

Until no more violations

R2

In practice, we havea better algorithm (next):

Page 28: Lecture 09: Functional Dependencies, Database Design

28

Example

What are the dependencies?SSN Name, City

What are the keys?{SSN, PhoneNumber}

Is it in BCNF?

Name SSN PhoneNumber City

Fred 123-45-6789 206-555-1234 Seattle

Fred 123-45-6789 206-555-6543 Seattle

Joe 987-65-4321 908-555-2121 Westfield

Joe 987-65-4321 908-555-1234 Westfield

Page 29: Lecture 09: Functional Dependencies, Database Design

29

Decompose it into BCNF

Name SSN City

Fred 123-45-6789 Seattle

Joe 987-65-4321 Westfield

SSN PhoneNumber

123-45-6789 206-555-1234

123-45-6789 206-555-6543

987-65-4321 908-555-2121

987-65-4321 908-555-1234

SSN Name, City

Let’s check anomalies:• Redundancy ?• Update ?• Delete ?

Page 30: Lecture 09: Functional Dependencies, Database Design

30

Example Decomposition Person(name, SSN, age, hairColor, phoneNumber)

SSN name, ageage hairColor

Decompose in BCNF (in class):

Page 31: Lecture 09: Functional Dependencies, Database Design

31

BCNF Decomposition Algorithm

BCNF_Decompose(R)

find X s.t.: X ≠X+ ≠ [all attributes]

if (not found) then “R is in BCNF”

let Y = X+ - X let Z = [all attributes] - X+ decompose R into R1(X Y) and R2(X Z) continue to decompose R1 and R2

BCNF_Decompose(R)

find X s.t.: X ≠X+ ≠ [all attributes]

if (not found) then “R is in BCNF”

let Y = X+ - X let Z = [all attributes] - X+ decompose R into R1(X Y) and R2(X Z) continue to decompose R1 and R2

Page 32: Lecture 09: Functional Dependencies, Database Design

32

Example BCNF DecompositionPerson(name, SSN, age, hairColor, phoneNumber)

SSN name, ageage hairColor

Iteration 1: PersonSSN+ = SSN, name, age, hairColorDecompose into: P(SSN, name, age, hairColor) Phone(SSN, phoneNumber)

Iteration 2: Page+ = age, hairColorDecompose: People(SSN, name, age) Hair(age, hairColor) Phone(SSN, phoneNumber)

Iteration 1: PersonSSN+ = SSN, name, age, hairColorDecompose into: P(SSN, name, age, hairColor) Phone(SSN, phoneNumber)

Iteration 2: Page+ = age, hairColorDecompose: People(SSN, name, age) Hair(age, hairColor) Phone(SSN, phoneNumber)

Find X s.t.: X ≠X+ ≠ [all attributes]

What isthe key ?

Page 33: Lecture 09: Functional Dependencies, Database Design

33

Example

• R(A,B,C,D) A B, B C

• In R: A+ = ABC ≠ ABCD– Split into R1(A,B,C), R2(A,D)

• In R1: B+ = BC ≠ ABC– Split into R11(B,C), R12(A,B)

• Final decomposition: R11(B,C), R12(A,B), R2(A,D)

• What happens if in R we first pick B+ ? Or AB+ ?

What arethe keys ?