Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

56
Database System Concepts, 5th Ed. Chapter 7: Relational Database Chapter 7: Relational Database Design Design

Transcript of Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Page 1: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Database System Concepts, 5th Ed.

Chapter 7: Relational Database DesignChapter 7: Relational Database Design

Page 2: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.2Database System Concepts - 5th Edition

Chapter 7: Relational Database DesignChapter 7: Relational Database Design

Features of Good Relational Design

Atomic Domains and First Normal Form

Decomposition Using Functional Dependencies

Functional Dependency Theory

Algorithms for Functional Dependencies

Database-Design Process

Page 3: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.3Database System Concepts - 5th Edition

The Banking SchemaThe Banking Schema Branch = (branch_name, branch_city, assets)

支行(支行名称,部门所在城市,资产) customer = (customer_id, customer_name, customer_street,

customer_city)

顾客(顾客 id, 顾客名,顾客所在街道,顾客所在城市) loan = (loan_number, amount) 贷款(贷款号,金额) account = (account_number, balance) 账户(账户号,余额) employee = (employee_id. employee_name, telephone_number,

start_date)

雇员(雇员 id ,雇员姓名,电话号码,开始工作日期) dependent_name = (employee_id, dname)

家属姓名(雇员 id, 家属姓名) account_branch = (account_number, branch_name)

loan_branch = (loan_number, branch_name)

Page 4: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.4Database System Concepts - 5th Edition

The Banking SchemaThe Banking Schema borrower = (customer_id, loan_number)

depositor = (customer_id, account_number)

cust_banker = (customer_id, employee_id, type)

works_for = (worker_employee_id, manager_employee_id)

payment = (loan_number, payment_number, payment_date, payment_amount)

savings_account = (account_number, interest_rate)

checking_account = (account_number, overdraft_amount)

Page 5: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.5Database System Concepts - 5th Edition

Combine Schemas?Combine Schemas?

Suppose we combine borrower and loan to get

bor_loan = (customer_id, loan_number, amount )

Result is possible repetition of information (L-100 in example below)

Page 6: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.6Database System Concepts - 5th Edition

A Combined Schema Without RepetitionA Combined Schema Without Repetition

Consider combining loan_branch and loan

loan_amt_br = (loan_number, amount, branch_name)

No repetition (as suggested by example below)

Page 7: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.7Database System Concepts - 5th Edition

What About Smaller Schemas?What About Smaller Schemas?

Suppose we had started with bor_loan. How would we know to split up (decompose) it into borrower and loan?

Write a rule “if there were a schema (loan_number, amount), then loan_number would be a candidate key”

Denote as a functional dependency:

loan_number amount

In bor_loan, because loan_number is not a candidate key, the amount of a loan may have to be repeated. This indicates the need to decompose bor_loan.

Not all decompositions are good. Suppose we decompose employee into

employee1 = (employee_id, employee_name)

employee2 = (employee_name, telephone_number, start_date)

The next slide shows how we lose information -- we cannot reconstruct the original employee relation -- and so, this is a lossy decomposition.

Page 8: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.8Database System Concepts - 5th Edition

A Lossy DecompositionA Lossy Decomposition

Page 9: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.9Database System Concepts - 5th Edition

关系模型的形式化定义关系模型的形式化定义

1 、关系模型的五元组定义: R < U,D,DOM,F >

R — 关系名, U — 属性组, D — 域,

DOM — 映射(属性与域之间的联系),

F — 数据依赖(属性与属性之间的联系)

2 、关系模型的三元组定义: R < U,F >

当且仅当 U 上的一个关系 r 满足 F 时, r 称为关系模式R<U,F> 的一个关系。

Page 10: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.10Database System Concepts - 5th Edition

数据依赖数据依赖

1 、定义

数据依赖是通过一个关系中属性间值的相等与否体现出来的数据间的相互关系,它是现实世界属性间相互联系的抽象。

2 、种类 函数依赖

数据依赖 多值依赖

连接依赖

Page 11: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.11Database System Concepts - 5th Edition

Functional DependenciesFunctional Dependencies

定义:设 R(U) 是属性集 U 上的关系模式。 X , Y 是 U 的子集。对于R(U) 的任意一个可能的关系 r ,如果 r 中不存在两个元组,它们在 X 上的属性值相等,而在 Y 上的属性值不等,则称“ X 函数确定 Y” 或“ Y函数依赖于 X” ,记作 XY 。 X 称为这个函数依赖的决定属性集。

Constraints on the set of legal relations. Require that the value for a certain set of attributes determines

uniquely the value for another set of attributes. A functional dependency is a generalization of the notion of a key.

学号 姓名 年龄 性别 籍贯 98601 王晓燕 20 女 北京 98602 李 波 23 男 上海 98603 陈志坚 21 男 长沙 98604 张 兵 20 男 上海

学号姓名学号年龄学号性别学号籍贯

Page 12: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.12Database System Concepts - 5th Edition

Functional Dependencies (Cont.)Functional Dependencies (Cont.)

Let R be a relation schema

R and R The functional dependency

holds on (成立) R if and only if for any legal relations r(R), whenever any two tuples t1 and t2 of r agree on the attributes , they also agree on the attributes . That is,

t1[] = t2 [] t1[ ] = t2 [ ] Example: Consider r(A,B ) with the following instance of r.

On this instance, A B does NOT hold, but B A does hold.

1 41 53 7

Page 13: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.13Database System Concepts - 5th Edition

Functional Dependencies (Cont.)Functional Dependencies (Cont.)

K is a superkey for relation schema R if and only if K R

K is a candidate key for R if and only if

K R, and

for no K, R

Functional dependencies allow us to express constraints that cannot be expressed using superkeys. Consider the schema:

bor_loan = (customer_id, loan_number, amount ).

We expect this functional dependency to hold:

loan_number amount

but would not expect the following to hold:

amount customer_name

Page 14: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.14Database System Concepts - 5th Edition

Use of Functional DependenciesUse of Functional Dependencies

We use functional dependencies to:

test relations to see if they are legal under a given set of functional dependencies.

If a relation r is legal under a set F of functional dependencies, we say that r satisfies F.

specify constraints on the set of legal relations

We say that F holds on R if all legal relations on R satisfy the set of functional dependencies F.

Note: A specific instance of a relation schema may satisfy a functional dependency even if the functional dependency does not hold on all legal instances.

For example, a specific instance of loan may, by chance, satisfy amount customer_name.

Page 15: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.15Database System Concepts - 5th Edition

Functional Dependencies (Cont.)Functional Dependencies (Cont.)

A functional dependency is trivial (平凡) if it is satisfied by all instances of a relation

Example:

customer_name, loan_number customer_name

customer_name customer_name

In general, is trivial if

Page 16: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.16Database System Concepts - 5th Edition

Closure of a Set of Functional Closure of a Set of Functional DependenciesDependencies

Given a set F of functional dependencies, there are certain other functional dependencies that are logically implied by F.

For example: If A B and B C, then we can infer that A C

The set of all functional dependencies logically implied by F is the closure of F (函数依赖 F 的闭包) .

We denote the closure of F by F+.

F+ is a superset of F.

Page 17: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.17Database System Concepts - 5th Edition

完全依赖

在 R(U) 中,如果 XY ,并且对于 X 的任何一个真子集 X’ 都有X’Y ,则称 Y 对 X 完全依赖,记作 X Y 。

部分依赖

若 XY 但 Y 不完全依赖于 X ,则称 Y 对 X 部分函数依赖,记作 X

Y 。

传递依赖

在 R(U) 中,如果 XY , YZ ,且 YX, ZY , YX ,则称 Z对 X 传递依赖。记作 X Y 。

f

p

函数依赖的种类函数依赖的种类

传递

Page 18: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.18Database System Concepts - 5th Edition

规范化规范化 【目的】通过研究关系之间的等价问题,找出一些方法来指导我们

定义数据库的逻辑结构,使其具有好的性能(冗余小、数据完整性好、操作方便)。

OF240ZHOU90C1S4

OF347WANG56C4S3

OF235LIU70C2S3

OF240ZHOU75C1S3

OF240ZHOU90C1S2

OF347WANG87C4S1

OF235LIU85C3S1

OF235LIU90C2S1

OF240ZHOU90C1S1

OFFICETAGETNAMEGRADEC-NOS-NO

SCT 关系 关系规范化定义——通常将结构较简单的关系取代结构较复杂的关系的过程称为关系的规范化。

Page 19: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.19Database System Concepts - 5th Edition

Normalization (Normalization ( 范式范式 ))

范式表示符合某一种级别的关系模式的集合。

R 为第几范式写成 RxNF 。

范 式 的 概 念 是 由 Codd 给 出 的 , 并 在 1971 ~ 1972 年 提 出 了1NF、 2NF、 3NF 的概念, 1974年 Codd和 Boyce 又共同提出了BCNF 的概念, 1976年 Fagin 又提出了 4NF ,后来又有人提出了5NF 。

对于各种范式之间的联系是:

5NF 4NF BCNF 3NF 2NF 1NF 。

一个低一级范式的关系模式,通过模式分解可以转换为若干个高一级范式的关系模式的集合,这种过程就叫规范化。

Page 20: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.20Database System Concepts - 5th Edition

Goals of NormalizationGoals of Normalization

Let R be a relation scheme with a set F of functional dependencies.

Decide whether a relation scheme R is in “good” form.

In the case that a relation scheme R is not in “good” form, decompose it into a set of relation scheme {R1, R2, ..., Rn} such that

each relation scheme is in good form

the decomposition is a lossless-join decomposition

Preferably, the decomposition should be dependency preserving.

Page 21: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.21Database System Concepts - 5th Edition

7.2 First Normal Form7.2 First Normal Form

Domain is atomic if its elements are considered to be indivisible units

Examples of non-atomic domains:

Set of names, composite attributes

Identification numbers like CS101 that can be broken up into parts

A relational schema R is in first normal form if the domains of all attributes of R are atomic

Non-atomic values complicate storage and encourage redundant (repeated) storage of data

Example: Set of accounts stored with each customer, and set of owners stored with each account

We assume all relations are in first normal form (and revisit this in Chapter 9)

Page 22: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.22Database System Concepts - 5th Edition

First Normal Form (Cont’d)First Normal Form (Cont’d)

Atomicity is actually a property of how the elements of the domain are used.

Example: Strings would normally be considered indivisible

Suppose that students are given roll numbers which are strings of the form CS0012 or EE1127

If the first two characters are extracted to find the department, the domain of roll numbers is not atomic.

Doing so is a bad idea: leads to encoding of information in application program rather than in the database.

Page 23: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.23Database System Concepts - 5th Edition

2NF2NF

若 R1NF ,且每个非主属性完全函数依赖于关键字,则 R2NF 。

Example:

R( S_NO, C_NO, GRADE, TNAME, TAGE, OFFIC

E) ;

F = { (S_NO,C_NO)GRADE,

C_NOTNAME, TNAMETAGE, TNAMEOFFICE };

请问 R2NF ?分解: SC( S_NO, C_NO, GRADE ) CTO( C_NO, TNAME, TAGE, OFFICE ) FSC = { (S_NO,C_NO)GRADE }

FCTO ={ C_NOTNAME, TNAMETAGE, TNAMEOFFICE };

Page 24: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.24Database System Concepts - 5th Edition

3NF3NF

若 R2NF ,且 R 的每个非主属性都不传递依赖于关键字,则 R3NF

。Example : SC( S_NO, C_NO, GRADE ) CTO( C_NO, TNAME, TAGE, OFFICE ) FSC = { (S_NO,C_NO)GRADE }

FCTO ={ C_NOTNAME, TNAMETAGE, TNAMEOFFICE }

请问 SC3NF? CTO3NF ?

分解: SC( S_NO, C_NO, GRADE) FSC = { (S_NO,C_NO)GRADE }

CT( C_NO, TNAME) FCT ={ C_NOTNAME }

TO( TNAME, TAGE, OFFICE) FTO={TNAMETAGE,TNAMEOFFICE

}

Page 25: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.25Database System Concepts - 5th Edition

BCNFBCNF

R1NF ,若 XY 且 YX时 X 必含有关键字,则 RBCNF 。 一个满足 BCNF 的关系框架 R :

所有非主属性对每一候选关键字都是完全依赖; 所有主属性对每一不包含它的候选关键字也是完全依赖; 没有任何属性完全依赖于非候选关键字的任何一组属性。

消除了主属性对候选关键字的部分依赖和传递依赖。

S-NO NAME C-NO GRADE

S1 WANG C1 90

S1 WANG C2 90

S1 WANG C3 85

S2 LI C1 90

S3 CHEN C1 75

S3 CHEN C2 70

分析:此关系有两个候选关键字(S-NO,C-NO)

(NAME,C-NO)

而 S-NONAME

因此 RBCNF

Page 26: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.26Database System Concepts - 5th Edition

Boyce-Codd Normal FormBoyce-Codd Normal Form

is trivial (i.e., )

is a superkey for R

A relation schema R is in BCNF with respect to a set F of functional dependencies if for all functional dependencies in F+ of the form

where R and R, at least one of the following holds:

Example : schema not in BCNF:

bor_loan = ( customer_id, loan_number, amount )

because loan_number amount holds on bor_loan but loan_number is not a superkey

Page 27: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.27Database System Concepts - 5th Edition

BCNF and Dependency PreservationBCNF and Dependency Preservation

Constraints, including functional dependencies, are costly to check in practice unless they pertain to only one relation

If it is sufficient to test only those dependencies on each individual relation of a decomposition in order to ensure that all functional dependencies hold, then that decomposition is dependency preserving.

Because it is not always possible to achieve both BCNF and dependency preservation, we consider a weaker normal form, known as third normal form.

Page 28: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.28Database System Concepts - 5th Edition

Functional-Dependency TheoryFunctional-Dependency Theory

We now consider the formal theory that tells us which functional dependencies are implied logically by a given set of functional dependencies.

We then develop algorithms to generate lossless decompositions into 3NF( 3NF 分解算法)

We then develop algorithms to test if a decomposition is dependency-preserving (保持函数依赖的分解算法)

Page 29: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.29Database System Concepts - 5th Edition

Closure of a Set of Functional Closure of a Set of Functional DependenciesDependencies

Given a set F set of functional dependencies, there are certain other functional dependencies that are logically implied by F.

For example: If A B and B C, then we can infer that A C

The set of all functional dependencies logically implied by F is the closure of F.

We denote the closure of F by F+.

Page 30: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.30Database System Concepts - 5th Edition

公理

F1 (自反性, reflexivity ):若 XY ,则 XY 或 XX 。

F2 (增广性, augmentation ):若 XY ,则 XZYZ或 XZY。

F3 (传递性, transitivity) ):若 XY , YZ ,则 XZ 。 推理规则

F5 (伪传 性 , pseudotransitivity ) : 若 XY , YWZ , 则XWZ 。

F6 (合成性 , union ):若 XY , XZ ,则 XYZ 。

F7 (分解性 , decomposition ):若 XYZ ,则 XY , XZ 。

FDFD 公理及推理规则公理及推理规则

Page 31: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.31Database System Concepts - 5th Edition

ExampleExample R = (A, B, C, G, H, I)

F = { A B A CCG HCG I B H}

some members of F+

A H

by transitivity from A B and B H

AG I

by augmenting A C with G, to get AG CG and then transitivity with CG I

CG HI

by augmenting CG I to infer CG CGI,

and augmenting of CG H to infer CGI HI,

and then transitivity

Page 32: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.32Database System Concepts - 5th Edition

Procedure for Computing FProcedure for Computing F++ (选讲)(选讲)

To compute the closure of a set of functional dependencies F:

F + = Frepeat

for each functional dependency f in F+

apply reflexivity and augmentation rules on f add the resulting functional dependencies to F +

for each pair of functional dependencies f1and f2 in F +

if f1 and f2 can be combined using transitivity then add the resulting functional dependency

to F +

until F + does not change any further

NOTE: We shall see an alternative procedure for this task later

Page 33: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.33Database System Concepts - 5th Edition

Closure of Attribute SetsClosure of Attribute Sets (重点)(重点)

Given a set of attributes define the closure of under F (denoted by +) as the set of attributes that are functionally determined by under F

Algorithm to compute +, the closure of under F

result := ;while (changes to result) do

for each in F dobegin

if result then result := result end

Page 34: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.34Database System Concepts - 5th Edition

Example of Attribute Set ClosureExample of Attribute Set Closure

R = (A, B, C, G, H, I) F = {A B

A C CG HCG IB H}

(AG)+

1. result = AG

2. result = ABCG (A C and A B)

3. result = ABCGH (CG H and CG AGBC)

4. result = ABCGHI (CG I and CG AGBCH)

Page 35: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.35Database System Concepts - 5th Edition

Example of Attribute Set ClosureExample of Attribute Set Closure

Is AG a candidate key?

1. Is AG a super key?

1. Does AG R? == Is (AG)+ R

2. Is any subset of AG a superkey?

1. Does A R? == Is (A)+ R

2. Does G R? == Is (G)+ R

Page 36: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.36Database System Concepts - 5th Edition

Uses of Attribute ClosureUses of Attribute Closure

There are several uses of the attribute closure algorithm:

Testing for superkey:

To test if is a superkey, we compute +, and check if + contains all attributes of R.

Testing functional dependencies

To check if a functional dependency holds (or, in other words, is in F+), just check if +.

That is, we compute + by using attribute closure, and then check if it contains .

Is a simple and cheap test, and very useful

Computing closure of F

For each R, we find the closure +, and for each S +, we output a functional dependency S.

Page 37: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.37Database System Concepts - 5th Edition

Canonical CoverCanonical Cover (最小覆盖、正则覆盖)(最小覆盖、正则覆盖)

Sets of functional dependencies may have redundant dependencies that can be inferred from the others

For example: A C is redundant in: {A B, B C}

Parts of a functional dependency may be redundant

E.g.: on RHS: {A B, B C, A CD} can be simplified to {A B, B C, A D}

E.g.: on LHS: {A B, B C, AC D} can be simplified to {A B, B C, A D}

Intuitively, a canonical cover of F is a “minimal” set of functional dependencies equivalent to F, having no redundant dependencies or redundant parts of dependencies

Page 38: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.38Database System Concepts - 5th Edition

Canonical CoverCanonical Cover

A canonical cover for F is a set of dependencies Fc such that

F logically implies all dependencies in Fc, and

Fc logically implies all dependencies in F, and

No functional dependency in Fc contains an extraneous attribute, and

Each left side of functional dependency in Fc is unique.

To compute a canonical cover for F:repeat

Use the union rule to replace any dependencies in F 1 1 and 1 2 with 1 1 2

Find a functional dependency with an extraneous attribute either in or in

If an extraneous attribute is found, delete it from until F does not change

Note: Union rule may become applicable after some extraneous attributes have been deleted, so it has to be re-applied

Page 39: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.39Database System Concepts - 5th Edition

Lossless-join DecompositionLossless-join Decomposition(分解成简单模式的判别算法)(分解成简单模式的判别算法)

For the case of R = (R1, R2), we require that for all possible relations r on schema R

r = R1 (r ) R2 (r )

A decomposition of R into R1 and R2 is lossless join if and only if at least one of the following dependencies is in F+:

R1 R2 R1

R1 R2 R2

Page 40: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.40Database System Concepts - 5th Edition

ExampleExample

R = (A, B, C)F = {A B, B C)

Can be decomposed in two different ways

R1 = (A, B), R2 = (B, C)

Lossless-join decomposition:

R1 R2 = {B} and B BC

Dependency preserving

R1 = (A, B), R2 = (A, C) ? Lossless-join decomposition:

R1 R2 = {A} and A AB

Not dependency preserving (cannot check B C without computing R1 R2)

Page 41: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.41Database System Concepts - 5th Edition

Dependency PreservationDependency Preservation

Let Fi be the set of dependencies F + that include only attributes in Ri.

A decomposition is dependency preserving, if

(F1 F2 … Fn )+ = F +

If it is not, then checking updates for violation of functional dependencies may require computing joins, which is expensive.

Page 42: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.42Database System Concepts - 5th Edition

Overall Database Design ProcessOverall Database Design Process

We have assumed schema R is given

R could have been generated when converting E-R diagram to a set of tables.

R could have been a single relation containing all attributes that are of interest (called universal relation).

Normalization breaks R into smaller relations.

R could have been the result of some ad hoc design of relations, which we then test/convert to normal form.

Page 43: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.43Database System Concepts - 5th Edition

ER Model and NormalizationER Model and Normalization

When an E-R diagram is carefully designed, identifying all entities correctly, the tables generated from the E-R diagram should not need further normalization.

However, in a real (imperfect) design, there can be functional dependencies from non-key attributes of an entity to other attributes of the entity

Example: an employee entity with attributes department_number and department_address, and a functional dependency department_number department_address

Good design would have made department an entity

Functional dependencies from non-key attributes of a relationship set possible, but rare --- most relationships are binary

Page 44: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.44Database System Concepts - 5th Edition

Denormalization for PerformanceDenormalization for Performance

May want to use non-normalized schema for performance

For example, displaying customer_name along with account_number and balance requires join of account with depositor

Alternative 1: Use denormalized relation containing attributes of account as well as depositor with all above attributes

faster lookup

extra space and extra execution time for updates

extra coding work for programmer and possibility of error in extra code

Alternative 2: use a materialized view defined as account depositor

Benefits and drawbacks same as above, except no extra coding work for programmer and avoids possible errors

Page 45: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Database System Concepts, 5th Ed.

End of ChapterEnd of Chapter

Page 46: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Database System Concepts, 5th Ed.

Proof of Correctness of 3NF Proof of Correctness of 3NF Decomposition AlgorithmDecomposition Algorithm

Page 47: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.47Database System Concepts - 5th Edition

Correctness of 3NF Decomposition Correctness of 3NF Decomposition AlgorithmAlgorithm

3NF decomposition algorithm is dependency preserving (since there is a relation for every FD in Fc)

Decomposition is lossless

A candidate key (C ) is in one of the relations Ri in decomposition

Closure of candidate key under Fc must contain all attributes in R.

Follow the steps of attribute closure algorithm to show there is only one tuple in the join result for each tuple in Ri

Page 48: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.48Database System Concepts - 5th Edition

Correctness of 3NF Decomposition Correctness of 3NF Decomposition Algorithm (Cont’d.)Algorithm (Cont’d.)

Claim: if a relation Ri is in the decomposition generated by the

above algorithm, then Ri satisfies 3NF.

Let Ri be generated from the dependency

Let B be any non-trivial functional dependency on Ri. (We need only consider FDs whose right-hand side is a single attribute.)

Now, B can be in either or but not in both. Consider each case separately.

Page 49: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.49Database System Concepts - 5th Edition

Correctness of 3NF Decomposition Correctness of 3NF Decomposition (Cont’d.)(Cont’d.)

Case 1: If B in :

If is a superkey, the 2nd condition of 3NF is satisfied

Otherwise must contain some attribute not in Since B is in F+ it must be derivable from Fc, by using

attribute closure on .

Attribute closure not have used . If it had been used, must be contained in the attribute closure of , which is not possible, since we assumed is not a superkey.

Now, using (- {B}) and B, we can derive B

(since , and B since B is non-trivial)

Then, B is extraneous in the right-hand side of ; which is not possible since is in Fc.

Thus, if B is in then must be a superkey, and the second condition of 3NF must be satisfied.

Page 50: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.50Database System Concepts - 5th Edition

Correctness of 3NF Decomposition Correctness of 3NF Decomposition (Cont’d.)(Cont’d.)

Case 2: B is in .

Since is a candidate key, the third alternative in the definition of 3NF is trivially satisfied.

In fact, we cannot show that is a superkey.

This shows exactly why the third alternative is present in the definition of 3NF.

Q.E.D.

Page 51: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.51Database System Concepts - 5th Edition

Figure 7.5: Sample Relation Figure 7.5: Sample Relation rr

Page 52: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.52Database System Concepts - 5th Edition

Figure 7.6Figure 7.6

Page 53: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.53Database System Concepts - 5th Edition

Figure 7.7Figure 7.7

Page 54: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.54Database System Concepts - 5th Edition

Figure 7.15: An Example of Figure 7.15: An Example of Redundancy in a BCNF RelationRedundancy in a BCNF Relation

Page 55: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.55Database System Concepts - 5th Edition

Figure 7.16: An Illegal Figure 7.16: An Illegal RR22 RelationRelation

Page 56: Database System Concepts, 5th Ed. Chapter 7: Relational Database Design.

Pany, ccsu7.56Database System Concepts - 5th Edition

Figure 7.18: Relation of Practice Figure 7.18: Relation of Practice Exercise 7.2Exercise 7.2