Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational...

49
Intro to DB CHAPTER 7 RELATIONAL DB DESIGN

Transcript of Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational...

Page 1: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Intro to DB

CHAPTER 7RELATIONAL DB DESIGN

Page 2: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Chapter 7: Relational Database Design

Features of Good Relational DesignAtomic Domains and First Normal FormDecomposition Using Functional DependenciesFunctional Dependency TheoryAlgorithmsDecomposition Using Multivalued Dependencies More Normal FormDatabase-Design ProcessModeling Temporal Datag p

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 2

Page 3: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Pitfalls of Relational Database Design

Relational database designFi d “ d” ll ti f l ti h f i f ti dFind a “good” collection of relation schemas for our information need

R = ( A B C D E ) <----- single relation schemaDB1 = { R1, …… , Rn } <----- DB schema (set of relation schemas)1 { 1, , n } ( )

Design Goals:Ensure that relationships among attributes are represented (information

)content)Avoid redundant dataFacilitate enforcement of database integrity constraintsg y

A bad design may lead to Inability to represent certain informationRepetition of InformationLoss of information

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 3

Page 4: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

ExampleLending-schema = (branch-name, branch-city, assets, customer-name, loan-number, amount)

Redundancy:Data for branch-name, branch-city, assets are repeated for each loan that a branch makesWastes spaceWastes space Complicates updating, introducing possibility of inconsistency of assets value

Null valuesC ll l b h diffi l h dlCan use null values, but they are difficult to handle.

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 4

Page 5: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Redundancy creates problems

Anomalies (by Codd)I i l i f i b b h if l iInsertion anomaly: cannot store information about a branch if no loans exist Deletion anomaly: lose branch info when that last account for the branch is deletedUpdate anomaly: what happens when you modify asset for a branch in only a single record?

The problems are caused by redundancy!Solution => decompose schema so that each information

i d l (l )content is represented only once (later)information content: relationship between attributes

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 5

Page 6: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

First Normal Form

Domain is atomic if its elements are considered to be indivisible unitsExamples of non-atomic domains:p

set of names, composite attributesidentification numbers like CS101 that can be broken up into parts

A relational schema is in first normal form (1NF) if the domains of allA relational schema is in first normal form (1NF) if the domains of all attributes are atomicAtomicity is actually a property of how the elements of the domain are used

Student ID numbers: CS0012, EE1127, …If the first two characters are extracted to find the department => the domain is not atomic

Non-atomic attributesleads to encoding of information in application program rather than in the databasecomplicate storage and query processing

We assume all relations are in first normal form

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 6

Page 7: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Relational Theory

Goal: Devise a theory for the followingDecide whether a particular relation R is in “good” form.In the case that a relation R is not in “good” form, decompose it i f l i {R R R } h hinto a set of relations {R1, R2, ..., Rn} such that

each relation is in good form the decomposition is lossless (preserves the information in the originalthe decomposition is lossless (preserves the information in the original relation before decomposition)

Our theory is based on:yfunctional dependenciesmultivalued dependencies (not covered in this semester)

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 7

Page 8: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Functional Dependencies

Constraints on the set of legal relations.Require that the value for a certain set of attributes determines uniquely the value for another set of attributes.A f i l d d i li i f h i fA functional dependency is a generalization of the notion of a key.

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 8

Page 9: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Functional Dependencies (Cont.)

Let R be a relation schemaα ⊆ R d β ⊆ Rα ⊆ R and β ⊆ R

The functional dependency α → β holds on R if and only iffor any legal relations r(R),y g ( ),whenever any two tuples t1 and t2 of r agree on the attributes α,they also agree on the attributes β.Th t iThat is,

t1[α] = t2 [α] ⇒ t1[β ] = t2 [β ]

Examplep eConsider r(A,B) with the following instance of r 1 4

1 53 7

On this instance, A → B does NOT hold, but B → A does hold

3 7

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 9

Page 10: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Applications of FD

K is a superkey for relation schema R if and only if K → RK i did t k f R if d l ifK is a candidate key for R if and only if

K → R, andfor no α ⊂ K, α → R,

Functional dependencies allow us to express constraints that cannot be expressed using superkeys.Loan-info-schema = (customer-name, loan-number,

branch-name, amount).W h f ll i f i l d d i h ldWe expect the following functional dependencies to hold:

loan-number → amountloan-number → branch-nameloan number → branch name

but would not expect the following to hold: loan-number → customer-name

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 10

loan number → customer name

Page 11: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Applications of FD (Cont.)

Specify constraints on the set of legal relationsW h F h ld R if ll l l l i R i f h fWe say that F holds on R if all legal relations on R satisfy the set of functional dependencies F.

Test relations to see if they are legal under a given set of FDsTest relations to see if they are legal under a given set of FDs If a relation r is legal under a set F of functional dependencies, we say that r satisfies F.

Note: A specific instance of a relation schema may satisfy a functional dependency even if the functional dependency does not hold on all legal instances.

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 11

Page 12: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Trivial FD

A functional dependency is trivial if it is satisfied by all instances of a relationof a relation

E.g.customer-name, loan-number → customer-namecustomer name, loan number → customer namecustomer-name → customer-name

Lemma: α → β is trivial if β ⊆ α

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 12

Page 13: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Closure of a Set of FDs

Given a set F of FDs, there are certain other FDs that are logically implied by Flogically implied by F

E.g. If A → B and B → C, then we can infer that A → CThe set of all functional dependencies logically implied by F is p g y p ythe closure of F.We denote the closure of F by F+

We can find all of F+ by applying Armstrong’s Axioms:if β ⊆ α, then α → β (reflexivity)if α → β then γ α → γ β (augmentation)if α → β, then γ α → γ β (augmentation)if α → β, and β → γ, then α → γ (transitivity)

These rules are sound (generate only functional dependencies that actually hold) and complete (generate all functional dependencies that hold).

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 13

Page 14: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Example

R = (A, B, C, G, H, I)F = { A → B{

A → CCG → HCG → IB → H}

some members of F+

A → HA → H by transitivity from A → B and B → H

AG → I by augmenting A → C with G, to get AG → CG

and then transitivity with CG → I CG → HI

from CG → H and CG → I : “union rule” can be inferred fromdefinition of functional dependencies, or Augmentation of CG → I to infer CG → CGI, augmentation of

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 14

g gCG → H to infer CGI → HI, and then transitivity

Page 15: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Decomposition

Redundancy causes problems: Solution => decompose schema so that each information content is represented only oncep yDefinition: Let R be a relation scheme {R1, ..., Rn} is a decomposition of R if R = R1∪ ... ∪ Rn

(i.e., all of R’s attributes are represented) We will deal mostly with binary decomposition:

R into {R1 R2} where R = R1 ∪ R2R into {R1, R2} where R R1 ∪ R2

student(ID, name, dept, dept_chair, dept_phone, year)=> student’(ID, name, year, dept)

department(dept, chair, phone)Lending = (b name asset b city loan# c name amount)Lending (b_name, asset, b_city, loan#, c_name, amount)

=> Branch = (b_name, asset, b_city)Loan = (loan#, c_name, amount)

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 15

Page 16: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Lossy Decomposition

Careless decomposition leads to loss of information: Lossydecompositionp

Lending = (b_name, asset, b_city, loan#, c_name, amount)B h (b b i )=> Branch = (b_name, asset, b_city)Loan = (loan#, c_name, amount)

problem: relationship between loan and branch is lost- problem: relationship between loan and branch is lost- loss of information

> B h (b t b it )=> Branch = (b_name, asset, b_city)Loan = (loan#, c_name, amount, b_city)

t pl i t l j i- more tuples in natural join- but we have lost the relationship- loss of information

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 16

Page 17: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Lossy Decomposition (cont.)

Decomposition of R = (A, B) intoR1 = (A) and R2 = (B)

A B A B

ααβ

121

αβ

12

Can we recover the original information content?

β 1∏A(r) ∏B(r)

∏A (r) ∏B (r)A B

α 1ααββ

1212

Lossy!

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 17

β 2

Page 18: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Lossless-join Decomposition

For r(R) and decomposition {R1, R2}, it is always the case that ∏ ∏r⊆ ∏R1 (r) ∏R2 (r)

D fi iti D iti {R R } i l l j i d p iti fDefinition: Decomposition {R1, R2} is a lossless-join decomposition of R if

r = ∏ (r) ∏ (r)r = ∏R1 (r) ∏R2 (r)

The information content of the original relation r is always the basisg y

r1r2

a 1c acr1 r2

a 1r

aabb

1123

cdef

aabb

cdef

abb

123

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 18

Page 19: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Lossless-join Decomposition

Lemma: {R1,...,Rn} is a lossless decomposition if

R1 ∩ R2 → R1, or R1 ∩ R2 → R2

i.e., if one of the two subschemas hold the key of the other subschema

r1r2

a 1c acr1 r2

a 1r

aabb

123

cdef

aabb

cdef

abb

123

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 19

Page 20: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Boyce-Codd Normal Form

We want a way to decide whether a particular relation R is in “good” formgood form.Definition: A relation schema R is in BCNF (with respect to a set F of FDs) if for each FD α → β in F+ (α ⊆ R and β ⊆ R), at ) β ( β )least one of the following holds:

α → β is trivial (i.e., β ⊆ α)α is a superkey for R

ExampleR = (A, B, C), F = {A → B ; B → C}, Key = {A}

R is not in BCNFR is not in BCNFDecompose into R1 = (A, B), R2 = (B, C)

R1 and R2 in BCNF

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 20

Lossless-join decomposition

Page 21: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

BCNF Example

R = (bname, bcity, assets, cname, loan#, amount)F = { bname → assets bcity ; loan# → amount bname }Key = {loan#, cname}

DecompositionR1 = (bname, bcity, assets)R (b l # )R2 = (bname, cname, loan#, amount)

R3 = (bname, loan#, amount)R4 = (cname, loan#)R4 (cname, loan#)

Final decomposition result: { R1, R3, R4 }p { 1, 3, 4 }

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 21

Page 22: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Dependency Preservation

Example student{name, dept, college}name → dept, collegedept→ college

Decomposition 1student1(name, dept) name → deptdepartment(dept, college) dept → college

Decomposition 2student1(name, dept) name → deptstudent2(name, college) name → college( , g ) g

is a lossless decompositionb i d d pt ll j i i i d

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 22

but in order to test dept→ college, a join is required

Page 23: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Dependency Preservation (cont.)

DefinitionF f FD R {R R } d i i f RF : set of FD on R. {R1, ..., Rn} : decomposition of R.Fi : restriction of F to Ri is the set of all F.D.s in F+ that include only attributes

of R ii

DefinitionLet F'=F1∪…∪ Fn. The decomposition is dependency-preserving if F+ = F'+

Motivation: We wish to guarantee F by locally enforcing the each g y y grestriction (Ri) on the respective decomposed relation.

SQL does not provide a direct way of specifying functional dependencies h h kother than superkeys.

Using assertions can be expensive

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 23

Page 24: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Example

R = (A, B, C)F = { A → B B → C }F = { A → B, B → C }

R1 = (A, B), R2 = (B, C)Lossless-join decomposition: R1 ∩ R2 = {B} and B → BCDependency preserving

R1 = (A, B), R2 = (A, C)Lossless-join decomposition: R1 ∩ R2 = {A} and A → ABNot dependency preserving (cannot check B → C without computing R1 R2)

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 24

Page 25: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

BCNF and Dependency Preservation

R = (J, K, L)F = { JK → L L → K }F = { JK → L; L → K }Two candidate keys = JK and JL

R is not in BCNFR is not in BCNFAny decomposition of R will fail to preserve

JK → L

It is not always possible to get a BCNF decomposition that is dependency preserving There are some situations where

BCNF is not dependency preserving, and ffi i t h ki f FD i l ti d t i i t tefficient checking for FD violation on updates is important

=> solution: define a weaker normal form

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 25

Page 26: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Third Normal Form

Third Normal FormAllows some redundancy (with resultant problems)y ( p )But FDs can be checked on individual relations without computing a joinThere is always a lossless-join, dependency-preserving decomposition into 3NF

A relation schema R is in third normal form (3NF) if for all α → β in F+ at least one of the following holds:β g

α → β is trivial (i.e., β ∈ α)α is a superkey for REach attribute A in β – α is contained in a candidate key for R.

(NOTE: each attribute may be in a different candidate key)

If a relation is in BCNF it is in 3NFsince in BCNF one of the first two conditions above must hold

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 26

Page 27: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Example

R = (J, K, L)F = { JK → L L → K }F = { JK → L, L → K }

Two candidate keys: JK and JLR is in 3NFR is in 3NF

JK → L JK is a superkeyL → K K is contained in a candidate key

There is some redundancy in this schemaBCNF decomposition has (JL) and (LK)=> Testing for JK → L requires a join

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 27

Page 28: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

St Zp C

R(street, city, zip)street, city → zip

s1

s2

z1

z1

z

c1

c1

czip → city

/* 3NF b t t i BCNF ( t i i l & ip i t k ) */

s3

null

z2

z3

c1

c2

/* 3NF but not in BCNF (nontrivial & zip is not key) */repetition of information (e.g., the relationship z1, c1)need to use null values (e.g., to represent the relationship z3, c2 where there is ( g , p p z3, 2 wno corresponding value for St)

R1(street, zip) R2(zip, city)

St

s1

Zp

z1

Zp

z1

C

c1

R1, R2 are in BCNF but not dependency-preserving.s2

s3

z1

z2

z2

z3

c1

c2

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 28

Page 29: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Comparison of BCNF and 3NF

It is always possible to decompose a relation into relations in 3NF and3NF and

the decomposition is losslessth d d i dthe dependencies are preserved

i l ibl d l i i l i iIt is always possible to decompose a relation into relations in BCNF and

th d iti i l lthe decomposition is losslessit may not be possible to preserve dependencies.

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 29

Page 30: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Design Goals

When we decompose a relation schema R with a set of functional dependencies F into R1 R2 R we wantfunctional dependencies F into R1, R2,.., Rn we want1. Lossless decomposition2. No redundancy3. Dependency preservation

First try to achieveFirst, try to achieveBCNFLossless joinDependency preservation

If t hi thi t fIf we cannot achieve this, we accept one ofLack of dependency preservation Redundancy due to use of 3NF

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 30

y

Page 31: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Algorithms

Testing for BCNFBCNF DecompositionTesting for 3NF3NF Decomposition

Closure of FDsClosure of attributesCoverCanonical cover

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 31

Page 32: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Closure of Attribute Sets

Given a set of attributes α, define the closure of α under F(d n t d b α+) th t f ttrib t th t r f n ti n ll(denoted by α+) as the set of attributes that are functionally determined by α under F:

α → β is in F+ ⇔ β ⊆ α+α → β is in F ⇔ β ⊆ α

Al ith t t α+ th l f α d FAlgorithm to compute α+, the closure of α under Fresult := α;while (changes to result) dowhile (changes to result) do

for each β → γ in F dobeging

if β ⊆ result then result := result ∪ γend

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 32

Page 33: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Example

R = (A, B, C, G, H, I)F = { A → B; A → C; CG → H; CG → I; B → H }F { A → B; A → C; CG → H; CG → I; B → H }

(AG)+

1. result = AG2. result = ABCG (A → C and A → B)3. result = ABCGH (CG → H and CG ⊆ AGBC)( ⊆ )4. result = ABCGHI (CG → I and CG ⊆ AGBCH)

Is AG a candidate key?Is AG a candidate key? Is AG a super key?

Does AG → R? I b f AG k ?Is any subset of AG a superkey?

Does A+ → R?Does G+ → R?

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 33

Page 34: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Uses of Attribute Closure

There are several uses of the attribute closure algorithm:Testing for superkey:

To test if α is a superkey, we compute α+, and check if α+ contains all attributes of Rattributes of R.

Testing functional dependenciesTo check if a functional dependency α → β holds (or, in other words, is inTo check if a functional dependency α → β holds (or, in other words, is in F+), just check if β ⊆ α+. Is a very useful simple test

Computing closure of FFor each γ ⊆ R, we find the closure γ+, and f h S + f i l d d Sfor each S ⊆ γ+, we output a functional dependency γ → S.

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 34

Page 35: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Testing for BCNF

Check if α →β cause a violation of BCNF1 omp te α+ (the ttrib te los re of α) nd1. compute α+ (the attribute closure of α), and 2. verify that it includes all attributes of R (i.e., it is a superkey of R)

Check if R is in BCNF (w.r.t. F)Simplified test: check only the dependencies in F for violation ofSimplified test: check only the dependencies in F for violation of BCNF, rather than checking all dependencies in F+

It can be shown that if none of the dependencies in F causes a violation of BCNF, then none of the dependencies in F+ will cause a violation of BCNF either

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 35

Page 36: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Testing for BCNF (cont.)

However, using only F is incorrect when testing a relation in a decomposition of Rdecomposition of R

ExampleE a p eConsider R (A, B, C, D)

with F = { A→B, B→C }{ , }Decompose into R1(A, B) and R2(A, C, D) Neither of the dependencies in F contain only attributes from (A, C, D) so

i h b i l d i hi ki R i fi BCNFwe might be mislead into thinking R2 satisfies BCNF. In fact, dependency A → C in F+ shows R2 is not in BCNF.

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 36

Page 37: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Cover

Sets of functional dependencies may have redundant dependencies that can be inferred from the othersdependencies that can be inferred from the others

A → C is redundant in: { A → B, B → C, A → C }Parts of a functional dependency may be redundant

E.g. on RHS: { A → B, B → C, A → CD } can be simplified to { A → B, B → C, A → D }

E.g. on LHS: { A → B, B → C, AC → D } can be simplified to { A B B C A D }{ A → B, B → C, A → D }

A cover of F is any F’ such that F’+ = F+yA FD g ∈ F is redundant if

(F- {g})+ = F+ or g ∈ (F- {g})+

F’ is a nonredundant (minimal) cover of F ifF’+ = F+ andF’ contains no redundant FD

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 37

Page 38: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Extraneous Attributes

Let α → β in F.A ∈ α is extraneous if F ⇒ (F {α→β }) ∪ {(α A)→β }A ∈ α is extraneous if F ⇒ (F - {α→β }) ∪ {(α - A)→β }A ∈ β is extraneous if F ⇐ (F - {α→β }) ∪ {α→(β - A)}Note: implication in the opposite direction is trivial in each of the cases b i “ ” f i l d d l i li kabove, since a “stronger” functional dependency always implies a weaker

oneExamplep

Given F = {A → C, AB → C }B is extraneous in AB → C because

A C l i ll i li AB CA → C logically implies AB → C

ExampleGiven F = {A → C, AB → CD}Given F {A → C, AB → CD}C is extraneous in AB → CD since

A → C can be inferred even after deleting C

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 38

Page 39: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Testing if an Attribute is Extraneous

Consider a set F of functional dependencies and the functional d p nd n α → β in Fdependency α → β in F.To test if attribute A ∈ α is extraneous in α1 omp te ({α} A)+ sing the dependen ies in F1. compute ({α} – A)+ using the dependencies in F2. check that ({α} – A)+ contains A; if it does, A is extraneous

To test if attribute A ∈ β is extraneous in βTo test if attribute A ∈ β is extraneous in β1. compute α+ using only the dependencies in

F’ = (F – {α → β}) ∪ {α →(β – A)}, 2. if α+ contains A, A is extraneous

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 39

Page 40: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Canonical Cover

A canonical cover for F is a set of dependencies Fc such that Fc

+ = F+c

No FD in Fc contains an extraneous attributeEach left side of a FD in Fc is unique

Intuitively a canonical cover of F is a “minimal” set of functionalIntuitively, a canonical cover of F is a minimal set of functional dependencies equivalent to F, with no redundant dependencies or having redundant parts of dependencies T i l f FTo compute a canonical cover for F:repeat

Use the union rule to replace any dependencies in Fα → β d α → β ith α → β βα1 → β1 and α1 → β1 with α1 → β1 β2

Find a functional dependency α → β with an extraneous attribute either in α or in βdelete the extraneous attribute from α → βdelete the extraneous attribute from α → β

until F does not changeUnion rule may become applicable after some extraneous attributes have been deleted so it has to be re-applied

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 40

deleted, so it has to be re-appliedO(n2)

Page 41: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Example

R = (A, B, C)F = { A → BCF { A → BC

B → CA → B

AB → C }AB → C }Combine A → BC and A → B into A → BC

Set is now { A → BC, B → C, AB → C }

A is extraneous in AB → C because B → C logically implies AB → C.Set is now { A → BC, B → C }

C is extraneous in A → BC since A → BC is logically implied by A → B and B → C.The canonical cover is:The canonical cover is:

A → BB → C

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 41

Page 42: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

BCNF Decomposition Algorithmresult := {R}; done := false; compute F+;while (not done) do

if (there is a schema Ri in result that is not in BCNF)if (there is a schema Ri in result that is not in BCNF)then begin

let α → β (α ∩ β = ∅) be a nontrivial FDthat holds on Ri , and α→Ri is not in F+,

result := (result – Ri) ∪ (Ri – β) ∪ (α, β );dend

else done := true;

Note: each Ri in result is in BCNF, and decomposition is lossless-join.i , p j

R = (bname, bcity, assets, cname, loan#, amount)F = { bname → assets bcity ; loan# → amount bname } Key = {loan#, cname}

DecompositionR1 = (bname, bcity, assets), R2 = (bname, cname, loan#, amount)R = (bname loan# amount) R = (cname loan#)R3 = (bname, loan#, amount), R4 = (cname, loan#)

Final decomposition result: R1, R3, R4

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 42

Page 43: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Testing for 3NF

Need to check only FDs in F (not F+)

Use attribute closure to check, for each dependency α → β, if αis a superkey.If i k h if if h ib i β iIf α is not a superkey, we have to verify if each attribute in β is contained in a candidate key of R

this test is rather more expensive since it involves finding candidate keysthis test is rather more expensive, since it involves finding candidate keystesting for 3NF has been shown to be NP-hardInterestingly, decomposition into third normal form can be done in g ppolynomial time

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 43

Page 44: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

3NF Decomposition AlgorithmLet Fc be a canonical cover for F;i := 0;for each FD α → β in F dofor each FD α → β in Fc do

if none of Rj, 1 ≤ j ≤ i contains α β then { i := i + 1R := α β }Ri := α β }

if none of Rj, 1 ≤ j ≤ i contains a candidate key for Rthen { i := i + 1;

Ri := any candidate key for R }r t rn (R R R )return (R1, R2, ..., Ri)

Example: Banker = (branch, cname, banker, office#)FD: { banker → branch office# ; cname branch → banker }

The key: {cname, branch}Follow the algorithmg

Banker1 = (banker, branch, office#)Banker2 = (cname, branch, banker)

Since Banker2 contains a candidate key we are done

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 44

Since Banker2 contains a candidate key we are done.

Page 45: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Overall Database Design Process

We have assumed schema R is givenR could have been a single relation containing all attributes that are of interest (called universal relation).

Normalization breaks R into smaller relationsNormalization breaks R into smaller relations.

R could have been generated when converting E-R diagram to a set of tablesset of tables.R could have been the result of some ad hoc design of relations, which we then test/convert to normal form.which we then test/convert to normal form.

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 45

Page 46: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

ER Model and Normalization

When an E-R diagram is carefully designed, identifying all entities correctl the tables generated from the E R diagramentities correctly, the tables generated from the E-R diagram should not need further normalization.However in a real (imperfect) design there can be FDs fromHowever, in a real (imperfect) design there can be FDs from non-key attributes of an entity to other attributes of the entity

E.g. employee entity with attributes department-number and department-address, g p y y p p ,and an FD department-number → department-addressGood design would have made department an entity

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 46

Page 47: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Denormalization for Performance

May want to use non-normalized schema for performanceE di l i l i h b d b l iE.g. displaying customer-name along with account-number and balance requires join of account with depositor

Alternative 1: Use denormalized relation containing attributes ofAlternative 1: Use denormalized relation containing attributes of account as well as depositor with all above attributes

faster lookupExtra space and extra execution time for updatesextra coding work for programmer and possibility of error in extra code

Alternative 2: use a materialized view defined asaccount depositor

B fit d d b k b t t di k fBenefits and drawbacks same as above, except no extra coding work for programmer and avoids possible errors

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 47

Page 48: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

Other Design Issues

Some aspects of database design are not caught by normalizationE l f b d d t b d i t b id dExamples of bad database design, to be avoided: Instead of earnings(company-id, year, amount), use

earnings-2000 earnings-2001 earnings-2002 etc all on the schema (company-idearnings-2000, earnings-2001, earnings-2002, etc., all on the schema (company-id, earnings).

Above are in BCNF, but make querying across years difficult and needs new table each yeartable each year

company-year(company-id, earnings-2000, earnings-2001, earnings-2002)Also in BCNF, but also makes querying across years difficult and requires new attribute each yearattribute each year.Is an example of a crosstab, where values for one attribute become column namesUsed in spreadsheets and in data analysis toolsUsed in spreadsheets, and in data analysis tools

Original Slides:© Silberschatz, Korth and Sudarshan

Intro to DB (2008-1)Copyright © 2006 - 2008 by S.-g. Lee Chap 7 - 48

Page 49: Intro to DB CHAPTER 7 RELATIONAL DB DESIGN · CHAPTER 7 RELATIONAL DB DESIGN. Chapter 7: Relational Database Design ... Test relations to see if they are legal under a given set of

END OF CHAPTER 7