Download - Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

Transcript
Page 1: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

Relational Database Design Theory

� Informal guidelines for good relational designs

� Functional dependencies

� Normal forms and normalization

� 1NF, 2NF, 3NF

31

� BCNF, 4NF, 5NF

� Inference rules on functional dependencies

� Additional properties for relational decompositions

� Nonadditive join property

� Dependency preservation property

Page 2: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

2NF, 3NF

� 2NF

� 3NF

32

Page 3: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

33

Page 4: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

34

Page 5: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

35

Page 6: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

Boyce-Codd Normal Form� BCNF

� Difference from 3NF:

� 3NF allows A to be a key attribute

36

� 3NF allows A to be a key attribute

� Every relation in BCNF is also in 3NF

� Most relation schemas that are in 3NF are also in

BCNF but not all:

Page 7: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

37

Page 8: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

38

Page 9: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

1NF, 2NF, 3NF, BCNF

Test on any non-

trivial X�A

Violations normalization

1NF Multi-valued attributes New relation for each

multi-valued attributes

2NF a) X is a super key

or

b) X is not a key

1) partial key -> non-key

attribute

(partial FD)

New relation for the

partial key and its

dependent attributes

39

b) X is not a key

or

c) A is a key attribute

(partial FD) dependent attributes

3NF a) X is a super key

or

c) A is a key attribute

1) or

2) Non-key attribute -> non-

key attribute

(transitive FD)

Above and

New relation for the

non-key attribute and its

dependent attributes

BCNF a) X is a super key 1) or

2) or

3) non-key -> key attribute

Above and

New relation for the

non-key attribute and its

dependent attributes

Page 10: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

Relational Database Design Theory

� Informal guidelines for good relational designs

� Functional dependencies

� Normal forms and normalization

� 1NF, 2NF, 3NF

40

� BCNF, 4NF, 5NF

� Functional dependencies and keys

� Additional properties for relational decompositions

� Nonadditive join property

� Dependency preservation property

Page 11: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

Keys and Functional

Dependencies� Superkey

� No two distinct tuples in any state r of R can have the

same value for SK

� Functional dependency: SK � R

� Key

41

� Key

� Superkey of R; and it is minimal (removing any attribute

A from K leaves a set of attributes that is not a superkey

any more)

� Functional dependency: K � R, and for any A in K, K-

{A}�R does not hold

� Given a set of functional dependencies, can we find

the keys of R?

Page 12: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

Inference Rules for FDs

� Armstrong’s inference rules (complete)

� Reflexivity rule: if Y ⊆ X then X→ Y

� Augmentation rule: if X→ Y then XZ→ YZ

� Transitivity rule: if X→ Y and Y→ Z then X→ Z

42

Transitivity rule: if X→ Y and Y→ Z then X→ Z

� Other rules

� Decomposition Rule:

if X→ YZ then: X→ Y and X→ Z

� Union or Additive Rule:

if X→ Y and X→ Z then: X→ YZ

� Pseudfotransitive Rule:

if X→ Y and YW→ Z then: XW→ Z

Page 13: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

Inference Rules for FDs

� Specify a set of FDs that can be easily determined

from attribute semantics

� Infer additional FDs using inference rules

� The closure of a FD set F

43

� The closure of a FD set F

� the set of all FDs that include F as well as all FDs that

can be inferred from F

Page 14: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

Closure of attribute set

� The closure of attribute set X under FD set F,

denoted as X+

� The set of attributes that are functionally determined by

X based on F

44

� How can we computationally find X+

� How can we determine a set of attributes X is a

key?

Page 15: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued
Page 16: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

Example

� SSN+

� Dnumber+

46

� Dnumber+

� {SSN, Dnumber}+

� {SSN, Dnumber, Ename}+

Page 17: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

Example

� SSN+ = {SSN, Ename, Bdate, Address, Dnumber, Dname,

Dmgr_ssn}

47

Dmgr_ssn}

� Dnumber+ = {Dnumber, Dname, Dmgr_ssn}

� {SSN, Dnumber}+ = {SSN, Ename, Bdate, Address,

Dnumber, Dname, Dmgr_ssn}

� {SSN, Dnumber, Ename}+ = {SSN, Ename, Bdate, Address,

Dnumber, Dname, Dmgr_ssn}

� Which of these attribute sets are superkeys? Keys?

Page 18: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

Finding Keys based on Attribute

Closure� If X+ = R, then X is a superkey

� If X+ = R, and (X-{A})+ != R, then X is a key

� How do we find all keys given a FD set F?

48

� How do we find all keys given a FD set F?

� How do we find one key given a FD set F?

Page 19: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued
Page 20: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

Example

� Initialization

� K = {SSN, Ename, Bdate, Address, Dnumber, Dname, Dmgr_ssn}

50

� K = {SSN, Ename, Bdate, Address, Dnumber, Dname, Dmgr_ssn}

� Decrease one attribute at a time

� …

� K = {SSN, Dnumber}

Page 21: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

Relational Database Design Theory

� Informal guidelines for good relational designs

� Functional dependencies

� Normal forms and normalization

� 1NF, 2NF, 3NF

51

� BCNF, 4NF, 5NF

� Functional dependencies and keys

� Additional properties for relational decompositions

� Nonadditive join property

� Dependency preservation property

Page 22: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

Dependency Preservation Property

� Informally, given a decomposition D of R and a FD set F on R, each FD in F either appear directly in D or could be inferred

� Claim 1. it is always possible to find a dependency-preserving decomposition D with respect to F such

52

preserving decomposition D with respect to F such that each relation Ri in D is in 3NF

Page 23: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

Non-additive Join Property

� A decomposition D = {R1, R2, ..., Rm} of R has the lossless (nonadditive) join property with respect to the set of FDs F on R if, for every legal relation state r, the following holds, where * is the natural join of all the relations in D:

π π

53

* (π R1(r), ..., πRm(r)) = r

� lossless for “loss of information”, i.e. “addition of spurious information”

� How to test whether a decomposition satisfies the lossless join property?

Page 24: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued
Page 25: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued
Page 26: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued
Page 27: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued
Page 28: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued
Page 29: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued
Page 30: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

Successive Lossless Join

Decomposition� Claim 2 (Preservation of non-additivity in

successive decompositions):

� If a decomposition D = {R1, R2, ..., Rm} of R has the

lossless (non-additive) join property with respect to F

60

� and if a decomposition Di = {Q1, Q2, ..., Qk} of Ri has

the lossless (non-additive) join property with respect to

the projection of F

� then the decomposition D2 = {R1, R2, ..., Ri-1, Q1, Q2,

..., Qk, Ri+1, ..., Rm} of R has the non-additive join

property with respect to F.

Page 31: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

Normalization with BCNF and

Lossless Join Property

Page 32: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

In Practice

� Relational design from ER model or existing

tables/reports

� Normalization for 3NF or BCNF with lossless join

property

62

property

� Sometimes normal forms are violated deliberately

to achieve better performance (less join operations)