The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model...

40
1 The Relational Data Model Lecture 6 2 Outline Relational Data Model Functional Dependencies Logical Schema Design Reading Chapter 8

Transcript of The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model...

Page 1: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

1

The Relational DataModel

Lecture 6

2

Outline

• Relational Data Model

• Functional Dependencies

• Logical Schema Design

Reading

Chapter 8

Page 2: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

3

The Relational Data Model

Data

Modeling

Relational

Schema

Physical

storage

E/R diagrams Tables:

column names: attributes

rows: tuples

Complex

file organization

and index

structures.

Have seen

this in SQL

Have seen

this tooDiscuss next

4

Terminology

Name Price Category Manufacturer

gizmo $19.99 gadgets GizmoWorks

Power gizmo $29.99 gadgets GizmoWorks

SingleTouch $149.99 photography Canon

MultiTouch $203.99 household Hitachi

Tuples or rows or records

Attribute namesTable name or relation name

Products:

Page 3: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

5

Schemas

Relational Schema:

– Relation name plus attribute names

– E.g.• Product(Name, Price, Category, Manufacturer)

– In practice we add the domain for each attribute

Database Schema

– Set of relational schemas

– E.g.• Product(Name, Price, Category, Manufacturer)

• Company(Name, Address, Phone), . . .

6

Instances

• Relational schema = R(A1,…,Ak)Instance = relation with k attributes

• Database schema = R1(…), R2(…), …, Rn(…)Instance = n relations, of types R1, R2, ..., Rn

This is all mathematics, not to be confused with SQLtables!

(What's the difference?)

Page 4: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

7

Example

Name Price Category Manufacturer

gizmo $19.99 gadgets GizmoWorks

Power gizmo $29.99 gadgets GizmoWorks

SingleTouch $149.99 photography Canon

MultiTouch $203.99 household Hitachi

Relational schema: Product(Name, Price, Category, Manufacturer)

Instance:

8

Design Criteria

• A relational schema should ensure:– Data integrity

• data is consistent and satisfies integrityconstraints

– Data redundancy should be avoided

• The process of creating a relationalschema that follows certain rules iscalled normalisation– We will learn several of these rules

Page 5: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

9

Another “Example”

3.9Carol

3.7Bob

3.8Alice

CoursesGPAName

OS

DB

Math

OS

DB

OS

Math

Student

How can this be expressed in a relational schema?

10

Outlook: First Normal Form (1NF)

• A database schema is in First Normal Form ifall tables are flat

3.9Carol

3.7Bob

3.8Alice

CoursesGPAName

OS

DB

Math

OS

DB

OS

Math

Student

3.9Carol

3.7Bob

3.8Alice

GPAName

Student

Course

OS

DB

Math

OSCarol

OSAlice

DBBob

Alice

Carol

Alice

Student Course

DB

Math

Math

Takes Course

Page 6: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

11

Example cont.

OS3.7Bob

OS3.8Alice

DB3.8Alice

OS3.9Carol

DB3.7Bob

Math3.8Alice

CourseGPAName

Theoretically, also this flattened

table is in 1NF but it has several

problems:

• Update anomalies• Inserts etc.

Only a compound key between

Name and Course is possible

12

1NF cont.

• Features:– All values (attributes) are atomic

• For instance, no comma separated are valuesallowed!

– “Conventional” SQL based databasestypically adhere to the 1NF

• Relational Rule 1 (based on Codd!s 12 Rules):

– A table that has no multi-valued fields issaid to be in the first normal form.

Page 7: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

13

Normal Forms: Overview

– 1st Normal Form (1NF)

– 2nd Normal Form (2NF)

– 3rd Normal Form (3NF)

– Boyce Codd Normal Form (BCNF)

• The higher the normal form, the moreredundancies are reduced

Str

ict

ord

erin

g

14

Outline

• Relational Data Model

• Functional Dependencies

• Logical Schema Design

Page 8: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

15

Functional Dependencies (FD)

• A form of constraint– hence, part of the schema

• Finding them is part of the databasedesign

• Also used in normalising the relations

Warning: this is the most abstract, and “hardest” part ofthe course.

16

Functional Dependency:

Graphically

(Patrick O’Neil 1994)

Page 9: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

17

FD cont.

• Important: the intent of the DB designer isexpressed

• “Two rows cannot agree in value on attributeA and disagree on B”.

If r1(A) = r2(A) then r1(B) = r2(B)

– “A functionally determines B”

– “B is functionally dependent on A”

• In other words: A must be unique

18

FD Definitions

Definition:

If two tuples agree on the attributes

then they must also agree on the attributes

Formally:

A1, A2, …, An ! B1, B2, …, Bm

A1, A2, …, An

B1, B2, …, Bm

Page 10: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

19

Examples

• EmpID ! Name, Phone, Position

• Position ! Phone

• but Phone ! Position

EmpID Name Phone Position

E0045 Smith 1234 ClerkE1847 John 9876 SalesrepE1111 Smith 9876 SalesrepE9999 Mary 1234 Lawyer

20

Example

EmpID Name Phone Position

E0045 Smith 1234 Clerk

E1847 John 9876 Salesrep

E1111 Smith 9876 Salesrep

E9999 Mary 1234 Lawyer

Position ! Phone

Cle

rk

Sal

esre

p

Lay

er

.

.

.

Phone

Position

1234

9876

Page 11: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

21

In General

• To check A ! B, erase all other columns

• check if the remaining relation is many-to-one (called functional in mathematics)

… A … B

X1 Y1

X2 Y2

… …

22

Typical Examples of FDs

Product: name ! price, manufacturer

Person: ssn ! name, age

Company: name ! stockprice, president

Page 12: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

23

Example

Product(name, category, color, department, price)

name ! color

category ! department

color, category ! price

Consider these FDs:

What do they say?

24

ExampleFDs are constraints on relations:

• On some instances they hold

• On others they don’t

99ToysGreenGadgetTweaker

49ToysGreenGadgetGizmo

pricedepartmentcolorcategoryname

Does this instance satisfy all the FDs?

name ! color

category ! department

color, category ! price

Page 13: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

25

Example

59Office-

supp.GreenStationaryGizmo

99ToysBlackGadgetTweaker

49ToysGreenGadgetGizmo

pricedepartmentcolorcategoryname

What about this one?

name ! color

category ! department

color, category ! price

26

Example

If some FDs are satisfied, then

others are satisfied too

If all these FDs are true:

name ! color

category ! department

color, category ! price

Then this FD also holds: name, category ! price

Why ??

Page 14: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

27

Inference Rules for FDs

Is equivalent to

(1) Splitting rule

and

(2) Combining rule

Bm...B1An...A1

A1, A2, …, An ! B1, B2, …, Bm

A1, A2, …, An ! B1

A1, A2, …, An ! B2

. . . . .

A1, A2, …, An ! Bm

splitcombine

28

Inference Rules for FDs

(continued)

(3) Trivial Rule

Am…A1

where i = 1, 2, ..., n

Ai is a subset of A1..n

A1, A2, …, An ! Ai

Page 15: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

29

Inference Rules for FDs

(continued)

(4) Transitive Closure Rule

If

and

then

A1, A2, …, An ! B1, B2, …, Bm

B1, B2, …, Bm ! C1, C2, …, Cp

A1, A2, …, An ! C1, C2, …, Cp

30

...C1 CpBm…B1Am…A1

Page 16: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

31

Example (continued)

Start from the following FDs:

Infer the following FDs:

1. name ! color

2. category ! department

3. color, category ! price

8. name, category ! price

7. name, category ! color, category

6. name, category ! category

5. name, category ! color

4. name, category ! name

Which Rule

did we apply ?Inferred FD

32

Example (continued)

Answers:

Transitivity on 3, 78. name, category ! price

Split/combine on 5, 67. name, category ! color, category

Trivial rule6. name, category ! category

Transitivity on 4, 15. name, category ! color

Trivial rule4. name, category ! name

Which Rule

did we apply ?Inferred FD

1. name ! color

2. category ! department

3. color, category ! price

Page 17: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

33

Another Example

• Enrollment(student, major, course, room,time)

student ! major

major, course ! room

course ! time

34

Another Rule

If

then

Augmentation follows from trivial rules and transitivity

How?

A1, A2, …, An ! B

A1, A2, …, An , C1, C2, …, Cp ! B

(5) Augmentation Rule

Page 18: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

35

“Solving” the Augmentation

Rule

A1, A2, …, An, C1, C2, …, Cp ! A1, A2, …, An

A1, A2, …, An , C1, C2, …, Cp ! B

A1, A2, …, An ! B

Transitivity gives:

Trivial Rule

36

Summary of Rules

(1)Splitting (Decomposition) Rule

If X ! YZ, then X ! Y and X ! Z

(2) Combining (Union) Rule

If X ! Y and X ! Z, then X ! YZ

(3) Trivial (Reflexivity) Rule

If Y ! X, then X ! Y

(4) Transitive Closure Rule

If X ! Y and Y ! Z, then X ! Z

(5) Augmentation Rule

If X ! Y, then XZ ! Y, for every Z

Page 19: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

37

Problem: infer ALL FDs

Given a set of FDs, infer all possible FDs

How to proceed ?

• Try all possible FDs, apply all rules– E.g. R(A, B, C, D): how many FDs are possible ?

• Answer: 24 subsets of attributes

• Drop trivial FDs, drop augmented FDs– Still way too many

• Better: use the Closure Algorithm (next)

38

FDs and Closure

• Typically, a relation R has a set of

defined functional dependencies F

• The closure of F (written F+) is the setof all functional dependencies that may

be derived from F

– F+ contains all defined FDs and derivedFDs

Page 20: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

39

Closure of a set of AttributesGiven a set of attributes A1, …, An

The closure, {A1, …, An}+ , is the set of attributes B

s.t. A1, …, An ! B

name ! color

category ! department

color, category ! price

Example:

Closures:

name+ = {name, color}

{name, category}+ = {name, category, color, department, price}

color+ = {color}

40

Closure Algorithm

Start with X={A1, …, An}.

Repeat until X doesn’t change do:

if B1, …, Bn ! C is a FD and

B1, …, Bn are all in X

then add C to X. {name, category}+ =

{name, category, color,

department, price}

name ! color

category ! department

color, category ! price

Example:

Page 21: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

41

Closure Algorithm more verbose

• Starting with the given set of attributes, repeatedly expand the set by

adding the right sides of FDs as soon as we have included their left sides.

• Eventually, we cannot expand the set any more, and the resulting set is

the closure.

1 Let X be a set of attributes that eventually will become the closure. First

we initialize X to be {A1, A2, …, An}.

2 Now, repeatedly search for some FD in X:

B1B2…Bm!C

such that all of Bs are in the set X, but C is not. We then add C to X.

3 Repeat step 2 as many times as necessary until no more attributes can

be added to X.

Since X can only grow, and the number of attributes is finite, eventually nothing

more can be added to X.

4 The set X after no more attributes can be added to it is the: {A1, A2, …,

An}+. (Alex Thomo 2006)

42

Example

Compute {A,B}+ X = {A, B, ? }

Compute {A, F}+ X = {A, F, ? }

R(A,B,C,D,E,F) A, B ! C

A, D ! E

B ! D

A, F ! B

Page 22: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

43

Example (Solution)

Compute {A,B}+ X = {A, B, C, D, E }

Compute {A, F}+ X = {A, F, B, C, D, E }

R(A,B,C,D,E,F) A, B ! C

A, D ! E

B ! D

A, F ! B

44

Using Closure to Infer ALL FDs

A, B ! C

A, D ! B

B ! D

Example:

A+ = A, B+ = BD, C+ = C, D+ = D

AB+ = ABCD, AC+ = AC, AD+ = ABCD

ABC+ = ABD+ = ACD+ = ABCD (no need to compute– why ?)

BCD+ = BCD, ABCD+ = ABCD

Page 23: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

45

Problem: Finding FDs

• Approach 1: During Database Design– Designer derives them from real-world knowledge

of users

– Problem: knowledge might not be available

• Approach 2: From a Database Instance– Analyze a given database instance and find all

FDs satisfied by that instance

– Useful if designers don!t get enough informationfrom users

– Problem: FDs might be artificial for the giveninstance

46

Find All FDs

020CircuitsEEFrank

045DBCSEElsa

050JavaCSEDan

045DBCSECarol

040HWEEAlice

020C++CSEBob

020C++CSEAlice

RoomCourseDeptStudent

Page 24: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

47

Some Answers

Course ! Dept, Room

Dept, Room ! Course

Student, Dept ! Course, Room

Student, Course ! Dept, Room

Student, Room ! Dept, Course

Do all FDs

make sense

in practice ?

48

Keys

• A key is a set of attributes A1, ..., An s.t. for anyother attribute B, we have A1, ..., An ! B– Example

• A minimal key is a set of attributes which is akey and for which no subset is a key– Example

• Note: our course book calls them superkeyand key

A1, A2, A3, A4

A1, A2, A3, A4

Page 25: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

49

Computing Keys

• Compute X+ for all sets X

• If X+ = all attributes, then X is a key

• List only the minimal keys

Note: there can be many minimal keys !

• Example: R(A,B,C), AB!C, BC!AMinimal keys: AB and BC

50

Let!s reuse the Closure

ExampleA, B ! C

A, D ! B

B ! D

Example:

A+ = A, B+ = BD, C+ = C, D+ = D

AB+ = ABCD, AC+ = AC, AD+ = ABCD

ABC+ = ABD+ = ACD+ = ABCD (no need to compute– why ?)

BCD+ = BCD, ABCD+ = ABCD

Minimal keys

Keys

Page 26: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

51

More Examples of Keys

• Product(name, price, category, color)name, category ! price

category ! color

Keys are: {name, category} and all supersets

• Enrollment(student, address, course, room, time)student ! address

room, time ! course

student, course ! room, time

… done in class

52

Outline

• Relational Data Model

• Functional Dependencies

• Logical Schema Design

Page 27: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

53

Relational Schema Design

(or Logical Schema Design)

Main idea:

• Start with some relational schema

• Find out its FDs

• Use them to design a better relational

schema

54

Data Anomalies

When a database is poorly designed we getanomalies:

Redundancy: data is repeated

Update anomalies: need to change in severalplaces

Delete anomalies: may lose data when we don!twant

Page 28: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

55

Relational Schema Design

Anomalies:• Redundancy = repeat data

• Update anomalies = Fred moves to “Bellevue”

• Deletion anomalies = Joe deletes his phone number:

what is his city ?

Example: Persons with several phones

SSN ! Name, City

Westfield908-555-2121987-65-4321Joe

Seattle206-555-6543123-45-6789Fred

Seattle206-555-1234123-45-6789Fred

CityPhoneNumberSSNName

but not SSN ! PhoneNumber

56

Relation DecompositionBreak the relation into two:

Westfield987-65-4321Joe

Seattle123-45-6789Fred

CitySSNName

908-555-2121987-65-4321

206-555-6543123-45-6789

206-555-1234123-45-6789

PhoneNumberSSN

Anomalies have gone:• No more repeated data

• Easy to move Fred to “Bellevue” (how ?)

• Easy to delete all Joe’s phone number (how ?)

Westfield908-555-2121987-65-4321Joe

Seattle206-555-6543123-45-6789Fred

Seattle206-555-1234123-45-6789Fred

CityPhoneNumberSSNName

Page 29: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

57

Relational Schema Design

PersonbuysProduct

name

price name ssn

Conceptual Model:

Relational Model:

plus FDs

Normalization:

Eliminates anomalies

58

Decompositions in General

R1 = projection of R on A1, ..., An, B1, ..., Bm

R2 = projection of R on A1, ..., An, C1, ..., Cp

R(A1, ..., An, B1, ..., Bm, C1, ..., Cp)

R1(A1, ..., An, B1, ..., Bm) R2(A1, ..., An, C1, ..., Cp)

Page 30: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

59

Decomposition

• Sometimes it is correct:

Camera19.99Gizmo

Camera24.99OneClick

Gadget19.99Gizmo

CategoryPriceName

19.99Gizmo

24.99OneClick

19.99Gizmo

PriceName

CameraGizmo

CameraOneClick

GadgetGizmo

CategoryName

Lossless decomposition

60

Incorrect Decomposition

• Sometimes it is not:

Camera19.99Gizmo

Camera24.99OneClick

Gadget19.99Gizmo

CategoryPriceName

CameraGizmo

CameraOneClick

GadgetGizmo

CategoryName

Camera19.99

Camera24.99

Gadget19.99

CategoryPrice

What’s

incorrect ??

Lossy decomposition

Page 31: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

61

Decompositions in General

R(A1, ..., An, B1, ..., Bm, C1, ..., Cp)

If A1, ..., An ! B1, ..., Bm

Then the decomposition is lossless

R1(A1, ..., An, B1, ..., Bm) R2(A1, ..., An, C1, ..., Cp)

Example: name ! price, hence the first decomposition is lossless

Note: don’t need necessarily A1, ..., An ! C1, ..., Cp

62

Reminder: Normal Forms

First Normal Form = all attributes are atomic

Second Normal Form (2NF) = in principleold/obsolete

Third Normal Form (3NF) = this lecture

Boyce Codd Normal Form (BCNF) = this lecture

Others...

Page 32: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

63

Third Normal Form (3NF)

• A relation is in 3NF if, and only if:– 1NF is satisfied

– Every non-key attribute is functionallydependent on the whole key

• i.e. no partial key functional dependencies

– No transitive functional dependencies:• A non-key attribute must not be functionally

dependent on another non-key attribute

– No data redundancies

64

Example

• Now we havereached 3NF

1NF but not 3NF

manager ! address(http://db.grussel.org)

Page 33: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

65

Potential Problems with 3NF

• If a relation has more than 1 candidate key,anomalies may occur

• 3NF does not deal satisfactory withoverlapping candidate keys

• Need a stricter form:

– Boyce Codd Normal Form (BCNF)

• Example see:http://www.answers.com/topic/boyce-codd-normal-form

66

“Prod Cat, Stock Cat ! Range Code”

BCNF

Page 34: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

67

Boyce-Codd Normal Form

In English (though a bit vague):

Whenever a set of attributes of R is determining another attribute,

it should determine all the attributes of R.

Every determinant (LHS of the FD) is a candidate key.

A relation R is in BCNF if:

If A1, ..., An ! B is a non-trivial dependency

in R , then {A1, ..., An} is a key for R

68

BCNF Decomposition

Algorithm

A’s OthersB’s

R1

Repeat

choose A1, …, Am ! B1, …, Bn that violates the BCNF condition

split R into R1(A1, …, Am, B1, …, Bn) and R2(A1, …, Am, [others])

continue with both R1 and R2

Until no more violations

R2

Page 35: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

69

Example

What are the dependencies?

SSN ! Name, City

What are the keys?

{SSN, PhoneNumber}

Is it in BCNF?

Westfield908-555-1234987-65-4321Joe

Westfield908-555-2121987-65-4321Joe

Seattle206-555-6543123-45-6789Fred

Seattle206-555-1234123-45-6789Fred

CityPhoneNumberSSNName

70

Decompose it into BCNF

Westfield987-65-4321Joe

Seattle123-45-6789Fred

CitySSNName

908-555-1234987-65-4321

908-555-2121987-65-4321

206-555-6543123-45-6789

206-555-1234123-45-6789

PhoneNumberSSN

SSN ! Name, City

Let’s check anomalies:

• Redundancy ?

• Update ?

• Delete ?

Page 36: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

71

Summary of BCNF

DecompositionFind a dependency that violates the BCNF condition:

A’s OthersB’s

R1 R2

Heuristics: choose B , B , … B “as large as possible”1 2 m

Decompose:

2-attribute

relations are BCNF

Continue until

there are no

BCNF violations

left.

A1, A2, …, An ! B1, B2, …, Bm

72

Example DecompositionPerson(name, SSN, age, hairColor, phoneNumber)

SSN ! name, age

age ! hairColor

Decompose in BCNF (in class):

Step 1: find all keys (How ? Compute S+, for various sets S)

Step 2: now decompose

Page 37: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

73

Other Example

• R(A,B,C,D) A ! B, B ! C

• Key: AD

• Violations of BCNF (need to decompose)

– A ! B, A! C, A!BC

• Pick A! BC:

– split into R1(A,B,C) R2(A,D)

• What happens if we pick A ! B first ?

74

Lossless Decompositions

A decomposition is lossless if we can recover:

R(A,B,C)

R1(A,B) R2(A,C)

R!(A,B,C) should be the same as R(A,B,C)

R’ is in general larger than R. Must ensure R’ = R

Decompose

Recover

Page 38: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

75

Lossless Decompositions

• Given R(A,B,C) s.t. A!B, thedecomposition into R1(A,B), R2(A,C) is

lossless

76

3NF: A Problem with BCNF

Unit Company Product

Unit Company

Unit Product

FDs: Unit ! Company; Company, Product ! Unit

So, there is a BCNF violation, and we decompose.

Unit ! Company

No FDs

Notice: we loose the FD: Company, Product ! Unit

Page 39: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

77

So What!s the Problem?

Unit Company Product

Unit Company Unit Product

Galaga99 UW Galaga99 databases

Bingo UW Bingo databases

No problem so far. All local FD’s are satisfied.

Let’s put all the data back into a single table again (anomalies?):

Galaga99 UW databases

Bingo UW databases

Violates the dependency: company, product -> unit!

78

Solution: 3rd Normal Form

(3NF)

A relation R is in Third Normal Form if :

Whenever there is a nontrivial dependency A1, A2, ..., An ! B

for R , then {A1, A2, ..., An } is a key for R,

or B is part of a key.

Tradeoff:

BCNF = no anomalies, but may lose some FDs

3NF = keeps all FDs, but may have some anomalies

Page 40: The Relational Datalsir · The Relational Data Model Lecture 6 2 Outline ¥Relational Data Model ¥Functional Dependencies ¥Logical Schema Design Reading Chapter 8. 3 The Relational

79

Summary

• Dependencies on attributes are

important when designing database

schema

– Functional dependencies

– Attributes should be dependent only on(primary) key

• Use normalised database schemas to

avoid certain anomalies with data