Relational Database Design Theory - Department of...

Relational Database Design Theory: informal guidelines for good relational designs; functional dependencies; normal forms and normalization (1NF, 2NF, 3NF, BCNF, 4NF, 5NF); inference rules on functional dependencies; additional properties for relational decompositions (nonadditive join property, dependency preservation property).

Transcript of Relational Database Design Theory - Department of...

Page 1: Relational Database Design Theory - Department of …lxiong/cs377_f11/share/slides/15_relational... · Relational Database Design Theory ... Violations normalization 1NF Multi-valued

Relational Database Design Theory

• Informal guidelines for good relational designs
• Functional dependencies
• Normal forms and normalization
  • 1NF, 2NF, 3NF
  • BCNF, 4NF, 5NF
• Inference rules on functional dependencies
• Additional properties for relational decompositions
  • Nonadditive join property
  • Dependency preservation property


2NF, 3NF

• 2NF
• 3NF


Boyce-Codd Normal Form

• BCNF: for every nontrivial FD X → A, X is a superkey
• Difference from 3NF:
  • 3NF also allows X → A when A is a key attribute, even if X is not a superkey
• Every relation in BCNF is also in 3NF
• Most relation schemas that are in 3NF are also in BCNF, but not all.


1NF, 2NF, 3NF, BCNF

Normal form | Test on any nontrivial X → A | Violations | Normalization
1NF  | - | Multi-valued attributes | New relation for each multi-valued attribute
2NF  | a) X is a superkey, or b) X is not a partial key, or c) A is a key attribute | 1) Partial key → non-key attribute (partial FD) | New relation for the partial key and its dependent attributes
3NF  | a) X is a superkey, or c) A is a key attribute | 1), or 2) non-key attribute → non-key attribute (transitive FD) | Above, and new relation for the non-key attribute and its dependent attributes
BCNF | a) X is a superkey | 1), or 2), or 3) non-key → key attribute | Above, and new relation for the non-key attribute and its dependent attributes
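The BCNF test in the table (X must be a superkey for every nontrivial X → A) can be sketched in a few lines of Python. This is only an illustration: the function names, the representation of FDs as (left side, right side) pairs of sets, and the Student/Course/Instructor schema are assumptions that do not come from the slides.

```python
def closure(attrs, fds):
    """Attribute closure X+ of `attrs` under `fds`, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(r, fds):
    """Return the FDs in `fds` whose left side is not a superkey of schema `r`."""
    r = set(r)
    violations = []
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs)
        if rhs <= lhs:
            continue  # trivial FD, never a violation
        if not closure(lhs, fds) >= r:
            violations.append((lhs, rhs))
    return violations


# Hypothetical schema: in 3NF (Course is a key attribute) but not in BCNF.
R = {"Student", "Course", "Instructor"}
FDS = [({"Student", "Course"}, {"Instructor"}),
       ({"Instructor"}, {"Course"})]
print(bcnf_violations(R, FDS))
```

In this hypothetical schema the only violating FD is Instructor → Course: Instructor is not a superkey, and only the 3NF exception "A is a key attribute" makes the FD acceptable.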


Relational Database Design Theory

• Informal guidelines for good relational designs
• Functional dependencies
• Normal forms and normalization
  • 1NF, 2NF, 3NF
  • BCNF, 4NF, 5NF
• Functional dependencies and keys
• Additional properties for relational decompositions
  • Nonadditive join property
  • Dependency preservation property


Keys and Functional Dependencies

• Superkey SK
  • No two distinct tuples in any state r of R can have the same value for SK
  • Functional dependency: SK → R
• Key K
  • A superkey of R that is minimal (removing any attribute A from K leaves a set of attributes that is no longer a superkey)
  • Functional dependency: K → R, and for any A in K, K − {A} → R does not hold
• Given a set of functional dependencies, can we find the keys of R?


Inference Rules for FDs

• Armstrong's inference rules (sound and complete)
  • Reflexivity rule: if Y ⊆ X then X → Y
  • Augmentation rule: if X → Y then XZ → YZ
  • Transitivity rule: if X → Y and Y → Z then X → Z
• Other rules
  • Decomposition rule: if X → YZ then X → Y and X → Z
  • Union (additive) rule: if X → Y and X → Z then X → YZ
  • Pseudotransitive rule: if X → Y and YW → Z then XW → Z
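As a worked illustration (not taken from the slides), the union rule and the pseudotransitive rule can each be derived using only Armstrong's three rules:

```latex
% Union rule: from X -> Y and X -> Z derive X -> YZ
\begin{align*}
&1.\; X \to Y   && \text{(given)}\\
&2.\; X \to XY  && \text{(augment 1 with } X \text{, using } XX = X)\\
&3.\; X \to Z   && \text{(given)}\\
&4.\; XY \to YZ && \text{(augment 3 with } Y)\\
&5.\; X \to YZ  && \text{(transitivity on 2 and 4)}
\end{align*}

% Pseudotransitive rule: from X -> Y and YW -> Z derive XW -> Z
\begin{align*}
&1.\; X \to Y   && \text{(given)}\\
&2.\; XW \to YW && \text{(augment 1 with } W)\\
&3.\; YW \to Z  && \text{(given)}\\
&4.\; XW \to Z  && \text{(transitivity on 2 and 3)}
\end{align*}
```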


Inference Rules for FDs

• Specify a set of FDs that can be easily determined from attribute semantics
• Infer additional FDs using inference rules
• The closure of an FD set F
  • The set of all FDs that include F as well as all FDs that can be inferred from F


Closure of attribute set

• The closure of attribute set X under FD set F, denoted X+
  • The set of attributes that are functionally determined by X based on F
• How can we computationally find X+?
• How can we determine whether a set of attributes X is a key?
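One common way to compute X+ is a fixed-point loop: start with X and keep adding the right side of any FD whose left side is already contained in the result, until nothing changes. The sketch below assumes FDs are given as (left side, right side) pairs of sets; the function name and the sample FDs over A to E are made up for illustration.

```python
def closure(attrs, fds):
    """Return X+ for the attribute set `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the FD when its whole left side is already determined.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# Hypothetical FDs: A -> B, B -> C, CD -> E
fds = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"C", "D"}, {"E"})]
print(sorted(closure({"A"}, fds)))       # ['A', 'B', 'C']
print(sorted(closure({"A", "D"}, fds)))  # ['A', 'B', 'C', 'D', 'E']
```

The loop terminates because each pass either adds at least one attribute or stops, and there are only finitely many attributes.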


Example

• SSN+
• Dnumber+
• {SSN, Dnumber}+
• {SSN, Dnumber, Ename}+


Example

• SSN+ = {SSN, Ename, Bdate, Address, Dnumber, Dname, Dmgr_ssn}
• Dnumber+ = {Dnumber, Dname, Dmgr_ssn}
• {SSN, Dnumber}+ = {SSN, Ename, Bdate, Address, Dnumber, Dname, Dmgr_ssn}
• {SSN, Dnumber, Ename}+ = {SSN, Ename, Bdate, Address, Dnumber, Dname, Dmgr_ssn}
• Which of these attribute sets are superkeys? Keys?


Finding Keys based on Attribute Closure

• If X+ = R, then X is a superkey
• If X+ = R and (X − {A})+ ≠ R for every A in X, then X is a key
• How do we find all keys given an FD set F?
• How do we find one key given an FD set F?
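The two closure-based tests above translate directly into code, and a brute-force search over attribute subsets gives one (exponential) way to list all keys. The sketch below is illustrative only. The FD set is an assumption: the slides do not state it, but SSN → {Ename, Bdate, Address, Dnumber} and Dnumber → {Dname, Dmgr_ssn} are consistent with the closures shown in the earlier example. All function names are hypothetical.

```python
from itertools import combinations

R = {"SSN", "Ename", "Bdate", "Address", "Dnumber", "Dname", "Dmgr_ssn"}
# Assumed FDs, chosen to match the closures listed in the example slide.
FDS = [({"SSN"}, {"Ename", "Bdate", "Address", "Dnumber"}),
       ({"Dnumber"}, {"Dname", "Dmgr_ssn"})]


def closure(attrs, fds):
    """Attribute closure X+ under a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(x, r, fds):
    """X is a superkey if X+ covers every attribute of R."""
    return closure(x, fds) >= set(r)


def is_key(x, r, fds):
    """X is a key if it is a superkey and no attribute can be dropped."""
    x = set(x)
    return is_superkey(x, r, fds) and all(
        not is_superkey(x - {a}, r, fds) for a in x)


def all_keys(r, fds):
    """Enumerate subsets by increasing size; skip supersets of keys already found."""
    keys = []
    for size in range(1, len(r) + 1):
        for combo in combinations(sorted(r), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # contains a smaller key, so it cannot be minimal
            if is_superkey(candidate, r, fds):
                keys.append(candidate)
    return keys


for x in [{"SSN"}, {"Dnumber"}, {"SSN", "Dnumber"}, {"SSN", "Dnumber", "Ename"}]:
    print(sorted(x), is_superkey(x, R, FDS), is_key(x, R, FDS))
```

Under these assumed FDs, every set containing SSN is a superkey, but only {SSN} itself is a key.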


Example

• Initialization
  • K = {SSN, Ename, Bdate, Address, Dnumber, Dname, Dmgr_ssn}
• Remove one attribute at a time
  • …
  • K = {SSN, Dnumber}
  • K = {SSN} (Dnumber can also be removed, because SSN+ already contains every attribute)
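The shrinking procedure in this example can be sketched as follows: start with K equal to all attributes and drop an attribute whenever the remaining set is still a superkey. The FD set is again an assumption consistent with the closures shown earlier, and the names are hypothetical. Under these assumed FDs the loop keeps going past {SSN, Dnumber} until only SSN remains.

```python
R = {"SSN", "Ename", "Bdate", "Address", "Dnumber", "Dname", "Dmgr_ssn"}
# Assumed FDs (not stated on the slides).
FDS = [({"SSN"}, {"Ename", "Bdate", "Address", "Dnumber"}),
       ({"Dnumber"}, {"Dname", "Dmgr_ssn"})]


def closure(attrs, fds):
    """Attribute closure X+ under a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def find_key(r, fds):
    """Return one key of schema `r` by removing redundant attributes in turn."""
    key = set(r)                      # initialization: K = R
    for attr in sorted(r):            # fixed order, so the result is repeatable
        if closure(key - {attr}, fds) >= set(r):
            key.remove(attr)          # still a superkey without attr
    return key


print(find_key(R, FDS))
```

A single pass is enough for minimality: an attribute that could not be removed from a larger set cannot be removed from a smaller one. When a schema has several keys, which one is returned depends on the order in which attributes are tried.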


Dependency Preservation Property

• Informally, given a decomposition D of R and an FD set F on R, each FD in F either appears directly in one of the relations of D or can be inferred from the FDs that do
• Claim 1: It is always possible to find a dependency-preserving decomposition D with respect to F such that each relation Ri in D is in 3NF
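A standard textbook check for dependency preservation avoids computing the projections of F explicitly. For each FD X → Y in F, it grows a set Z from X by repeatedly adding (Z ∩ Ri)+ ∩ Ri for every Ri, and then tests whether Y ⊆ Z. The sketch below is one possible rendering of that idea; the slides do not give an algorithm, so the function names and the A, B, C example are assumptions.

```python
def closure(attrs, fds):
    """Attribute closure X+ under a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserves_dependencies(decomposition, fds):
    """True if every FD in `fds` follows from the FDs projected onto the Ri."""
    for lhs, rhs in fds:
        z = set(lhs)
        changed = True
        while changed:
            changed = False
            for ri in decomposition:
                ri = set(ri)
                # Attributes of Ri determined by the part of Z that lies in Ri.
                gained = closure(z & ri, fds) & ri
                if not gained <= z:
                    z |= gained
                    changed = True
        if not set(rhs) <= z:
            return False
    return True


# Hypothetical example: R(A, B, C) with A -> B and B -> C.
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(preserves_dependencies([{"A", "B"}, {"B", "C"}], fds))  # True
print(preserves_dependencies([{"A", "B"}, {"A", "C"}], fds))  # False
```

In the second decomposition, B and C never appear in the same relation and B → C cannot be recovered from the projected FDs, so the dependency is lost.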


Non-additive Join Property

• A decomposition D = {R1, R2, ..., Rm} of R has the lossless (nonadditive) join property with respect to the set of FDs F on R if, for every legal relation state r, the following holds, where * is the natural join of all the relations in D:

  *(π_R1(r), ..., π_Rm(r)) = r

• "Lossless" refers to loss of information, not loss of tuples: a decomposition without this property adds spurious tuples (spurious information) when its relations are joined
• How to test whether a decomposition satisfies the lossless join property?
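One well-known test is the tableau (chase) method. Build one row per relation Ri, with a distinguished symbol in each column whose attribute belongs to Ri. Then repeatedly apply the FDs to equate symbols, and report a lossless join if some row ends up with distinguished symbols in every column. The sketch below is a compact version under assumed names and an assumed (left side, right side) representation of FDs. The slides that presumably described the test did not survive in this transcript, so this may differ from the version presented there.

```python
def is_lossless(attrs, decomposition, fds):
    """Chase-based test of the nonadditive join property."""
    attrs = sorted(attrs)
    # "a" is the distinguished symbol; ("b", i) is a symbol private to row i.
    rows = [{a: "a" if a in ri else ("b", i) for a in attrs}
            for i, ri in enumerate(decomposition)]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            lhs = sorted(lhs)
            groups = {}
            for row in rows:  # group the rows that agree on the left side
                groups.setdefault(tuple(row[a] for a in lhs), []).append(row)
            for group in groups.values():
                for a in rhs:
                    symbols = {row[a] for row in group}
                    if len(symbols) > 1:
                        # Equate the symbols, preferring the distinguished one.
                        target = "a" if "a" in symbols else min(symbols)
                        for row in rows:
                            if row[a] in symbols:
                                row[a] = target
                        changed = True
    return any(all(row[a] == "a" for a in attrs) for row in rows)


# Hypothetical example: R(A, B, C) with the single FD A -> B.
fds = [({"A"}, {"B"})]
print(is_lossless({"A", "B", "C"}, [{"A", "B"}, {"A", "C"}], fds))  # True
print(is_lossless({"A", "B", "C"}, [{"A", "B"}, {"B", "C"}], fds))  # False
```

The first decomposition is lossless because the shared attribute A determines B, so the second row becomes fully distinguished. In the second decomposition the shared attribute B determines nothing, and joining on it can produce spurious tuples.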


Successive Lossless Join Decomposition

• Claim 2 (Preservation of non-additivity in successive decompositions):
  • If a decomposition D = {R1, R2, ..., Rm} of R has the lossless (non-additive) join property with respect to F
  • and if a decomposition Di = {Q1, Q2, ..., Qk} of Ri has the lossless (non-additive) join property with respect to the projection of F on Ri
  • then the decomposition D2 = {R1, R2, ..., Ri-1, Q1, Q2, ..., Qk, Ri+1, ..., Rm} of R has the non-additive join property with respect to F.


Normalization with BCNF and Lossless Join Property


In Practice

• Relational design from ER model or existing tables/reports
• Normalization for 3NF or BCNF with lossless join property
• Sometimes normal forms are violated deliberately to achieve better performance (fewer join operations)