Part V Relational Database Design Theory - dbse.ovgu.deConcepts+(DB1+Eng... · Relational Database...

Part V Relational Database Design Theory


1 Target Model of the Logical Design

2 Relational DB Design

3 Normal Forms

4 Transformation Properties

5 Design Methods

Saake Database Concepts Last Edited: April 2019 5–1


Educational Objective for Today . . .

Know how to refine the relational design
Understanding of normal forms
Methodology and techniques for normalization


Relational Database Design Theory Target Model of the Logical Design

Relation Model

WINES
WineID  Name               Color  Vintage  Vineyard
1042    La Rose Grand Cru  Red    1998     Château La Rose
2168    Creek Shiraz       Red    2003     Creek
3456    Zinfandel          Red    2004     Helena
2171    Pinot Noir         Red    2001     Creek
3478    Pinot Noir         Red    1999     Helena
4711    Riesling Reserve   White  1999     Müller
4961    Chardonnay         White  2002     Bighorn

PRODUCER
Vineyard           District        Region
Creek              Barossa Valley  South Australia
Helena             Napa Valley     California
Château La Rose    Saint-Emilion   Bordeaux
Château La Pointe  Pomerol         Bordeaux
Müller             Rheingau        Hessen
Bighorn            Napa Valley     California


Terms of the Relational Model

Term             Informal Meaning
Attribute        Column of a table
Value domain     Possible values of an attribute
Attribute value  Element of a value domain
Relation schema  Set of attributes
Relation         Set of rows in a table
Tuple            Row in a table
Database schema  Set of relation schemas
Database         Set of relations (base relations)


Terms of the Relational Model /2

Term                    Informal Meaning
Key                     Minimal set of attributes whose values uniquely identify a tuple in a table
Primary key             A key designated during database design
Foreign key             Set of attributes that are a key in another relation
Foreign key constraint  All attribute values of the foreign key show up as key values in the other relation


Integrity Constraints

Identifying set of attributes $K := \{B_1, \dots, B_k\} \subseteq R$:

$$\forall t_1, t_2 \in r \; [\, t_1 \neq t_2 \implies \exists B \in K : t_1(B) \neq t_2(B) \,]$$

Key: a minimal identifying set of attributes
  - {Name, Vintage, Vineyard} and
  - {WineID} for WINES

Prime attribute: element of a key
Primary key: designated key
Superkey: every superset of a key (= identifying set of attributes)
Foreign key: $X(R_1) \to Y(R_2)$

$$\{t(X) \mid t \in r_1\} \subseteq \{t(Y) \mid t \in r_2\}$$
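The two conditions above can be checked mechanically on a concrete relation instance. The following is a minimal sketch, assuming relations are stored as lists of distinct Python dicts; the helper names `is_identifying` and `satisfies_foreign_key` are hypothetical and only illustrate the definitions.

```python
def is_identifying(rows, attrs):
    """True if no two distinct tuples agree on every attribute in attrs."""
    seen = set()
    for t in rows:
        projected = tuple(t[a] for a in attrs)
        if projected in seen:
            return False
        seen.add(projected)
    return True


def satisfies_foreign_key(r1, x, r2, y):
    """True if {t(X) | t in r1} is a subset of {t(Y) | t in r2}."""
    referenced = {tuple(t[a] for a in y) for t in r2}
    return all(tuple(t[a] for a in x) in referenced for t in r1)


# Excerpt of the WINES and PRODUCER relations (assumed sample data).
wines = [
    {"WineID": 1042, "Name": "La Rose Grand Cru", "Vintage": 1998, "Vineyard": "Château La Rose"},
    {"WineID": 2171, "Name": "Pinot Noir", "Vintage": 2001, "Vineyard": "Creek"},
    {"WineID": 3478, "Name": "Pinot Noir", "Vintage": 1999, "Vineyard": "Helena"},
]
producer = [{"Vineyard": "Creek"}, {"Vineyard": "Helena"}, {"Vineyard": "Château La Rose"}]

print(is_identifying(wines, ["WineID"]))  # True
print(is_identifying(wines, ["Name"]))    # False: two wines are called Pinot Noir
print(satisfies_foreign_key(wines, ["Vineyard"], producer, ["Vineyard"]))  # True
```

Note that such a check only tests one instance; whether a set of attributes is a key is a statement about all valid instances of the schema.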


Relational Database Design Theory Relational DB Design

Relation with Redundancies

WINES
WineID  Name             ...  Vineyard     District        Region
1042    La Rose Gr. Cru  ...  Ch. La Rose  Saint-Emilion   Bordeaux
2168    Creek Shiraz     ...  Creek        Barossa Valley  South Australia
3456    Zinfandel        ...  Helena       Napa Valley     California
2171    Pinot Noir       ...  Creek        Barossa Valley  South Australia
3478    Pinot Noir       ...  Helena       Napa Valley     California
4711    Riesling Res.    ...  Müller       Rheingau        Hessen
4961    Chardonnay       ...  Bighorn      Napa Valley     California

Saake Database Concepts Last Edited: April 2019 5–7


Update Anomalies

Insertion into the redundancy-containing relation WINES:

    insert into WINES (WineID, Name, Color, Vintage, Vineyard, District, Region)
    values (4711, 'Chardonnay', 'White', 2004, 'Helena', 'Rheingau', 'California')

  - WineID 4711 is already assigned to another wine: violates FD WineID → Name
  - Up to now, vineyard Helena was located in Napa Valley: violates FD Vineyard → District
  - Rheingau is not located in California: violates FD District → Region

Also: update and delete anomalies


Functional Dependencies

A functional dependency between two sets of attributes X and Y of a relation holds iff, for each tuple of the relation, the attribute values of the X components determine the attribute values of the Y components.

If two tuples have the same values for the X attributes, they also have the same values for all Y attributes.

Notation for functional dependency (FD): X → Y

Example:
  WineID → Name, Vineyard
  District → Region

But not: Vineyard → Name
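The "same X values imply same Y values" condition translates directly into a check over a relation instance. This is a small sketch under the assumption that tuples are Python dicts; `fd_holds` is a hypothetical helper name, and the rows are an excerpt of the redundant WINES relation.

```python
def fd_holds(rows, lhs, rhs):
    """Check X -> Y on one instance: tuples agreeing on lhs must agree on rhs."""
    seen = {}
    for t in rows:
        x = tuple(t[a] for a in lhs)
        y = tuple(t[a] for a in rhs)
        # setdefault stores y for a new x and returns the stored value otherwise.
        if seen.setdefault(x, y) != y:
            return False
    return True


wines = [
    {"WineID": 2168, "Name": "Creek Shiraz", "Vineyard": "Creek",
     "District": "Barossa Valley", "Region": "South Australia"},
    {"WineID": 2171, "Name": "Pinot Noir", "Vineyard": "Creek",
     "District": "Barossa Valley", "Region": "South Australia"},
    {"WineID": 3456, "Name": "Zinfandel", "Vineyard": "Helena",
     "District": "Napa Valley", "Region": "California"},
]

print(fd_holds(wines, ["WineID"], ["Name", "Vineyard"]))  # True
print(fd_holds(wines, ["District"], ["Region"]))          # True
print(fd_holds(wines, ["Vineyard"], ["Name"]))            # False: Creek produces two wines
```

An instance can only refute an FD; that an FD holds in general is a design decision about the modeled domain.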


Keys as a Special Case

For the example on Slide 5-7 (Relation with Redundancies):

WineID → Name, Color, Vintage, Vineyard, District, Region

Always: WineID → WineID, so the whole schema is on the right side
If the left side is minimal: key
Formally: X is a key if the FD X → R holds for relation schema R and X is minimal

Goal of database design: transform all existing functional dependencies into "key dependencies" without losing semantic information


Deriving FDs

r
A   B   C
a1  b1  c1
a2  b1  c1
a3  b2  c1
a4  b1  c1

Satisfies A → B and B → C

Then A → C also holds
Not derivable: C → A or C → B


Deriving FDs /2

If for an FD f over R it holds that $SAT_R(F) \subseteq SAT_R(f)$, then F implies the FD f (short: $F \models f$)

Previous example:

$$F = \{A \to B, B \to C\} \models A \to C$$

Computing the closure: determine all functional dependencies that can be derived from a given set of FDs

Closure: $F^+_R := \{f \mid (f \text{ FD over } R) \wedge F \models f\}$

Example:

$$\{A \to B, B \to C\}^+ = \{A \to B, B \to C, A \to C, AB \to C, A \to BC, \dots, AB \to AB, \dots\}$$


Derivation Rules

F1  Reflexivity          $X \supseteq Y \implies X \to Y$
F2  Augmentation         $\{X \to Y\} \implies XZ \to YZ$ and $XZ \to Y$
F3  Transitivity         $\{X \to Y, Y \to Z\} \implies X \to Z$
F4  Decomposition        $\{X \to YZ\} \implies X \to Y$
F5  Union                $\{X \to Y, X \to Z\} \implies X \to YZ$
F6  Pseudo-transitivity  $\{X \to Y, WY \to Z\} \implies WX \to Z$

F1-F3 are known as the Armstrong axioms (sound, complete)

Sound: the rules do not derive FDs that are not logically implied
Complete: all implied FDs are derived
Independent (i.e., minimal w.r.t. $\subseteq$): no rule can be omitted

(w.r.t. = with respect to)
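As an illustration of why F1-F3 suffice, here is one possible derivation of the union rule F5 from augmentation and transitivity alone. It is a sketch of a standard argument, not taken from the slides.

```latex
\begin{align*}
&(1)\quad X \to Y   && \text{given} \\
&(2)\quad X \to XY  && \text{F2 applied to (1) with } Z := X \text{, using } XX = X \\
&(3)\quad X \to Z   && \text{given} \\
&(4)\quad XY \to YZ && \text{F2 applied to (3), augmenting with } Y \\
&(5)\quad X \to YZ  && \text{F3 applied to (2) and (4)}
\end{align*}
```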


Alternative Set of Rules

B-axioms or RAP rules

R  Reflexivity   $\{\} \implies X \to X$
A  Accumulation  $\{X \to YZ, Z \to AW\} \implies X \to YZA$
P  Projectivity  $\{X \to YZ\} \implies X \to Y$

The rule set is complete because the Armstrong axioms can be derived from it


Membership Problem

Can a certain FD X → Y be derived from a given set F, i.e., is it implied by F?

Membership problem: "$X \to Y \in F^+$?"

Closure of a set of attributes X w.r.t. F: $X^+_F := \{A \mid X \to A \in F^+\}$

The membership problem can be solved in linear time by solving the modified problem

Membership problem (2): "$Y \subseteq X^+_F$?"


Algorithm CLOSURE

Compute $X^+_F$, the closure of X w.r.t. F

CLOSURE(F, X):
    X+ := X
    repeat
        X+_old := X+                        /* R-rule */
        forall FDs Y → Z ∈ F
            if Y ⊆ X+ then X+ := X+ ∪ Z     /* A-rule */
    until X+ = X+_old
    return X+

MEMBER(F, X → Y):                           /* Test if X → Y ∈ F+ */
    return Y ⊆ CLOSURE(F, X)                /* P-rule */

Example: is A → C ∈ {A → B, B → C}+ ? (with f1 = A → B and f2 = B → C)
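The CLOSURE and MEMBER procedures can be sketched in a few lines of Python. This version assumes FDs are given as pairs of Python sets `(lhs, rhs)`; it is the straightforward fixpoint loop, which is quadratic in the worst case and not the linear-time variant mentioned for the membership problem.

```python
def closure(fds, attrs):
    """Attribute closure X+ of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply Y -> Z whenever Y is already covered and Z adds something new.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def member(fds, lhs, rhs):
    """Test whether lhs -> rhs is implied by fds."""
    return set(rhs) <= closure(fds, lhs)


F = [({"A"}, {"B"}), ({"B"}, {"C"})]  # f1 and f2 from the example
print(sorted(closure(F, {"A"})))      # ['A', 'B', 'C']
print(member(F, {"A"}, {"C"}))        # True
print(member(F, {"C"}, {"A"}))        # False
```

For the example, the loop first adds B via f1, then C via f2, and stops when a full pass adds nothing.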


Minimal Cover

... to minimize a set of FDs

forall FD X → Y ∈ F                              /* Left reduction */
    forall A ∈ X                                 /* A superfluous? */
        if Y ⊆ CLOSURE(F, X − {A})
            then replace X → Y with (X − {A}) → Y in F

forall remaining FD X → Y ∈ F                    /* Right reduction */
    forall B ∈ Y                                 /* B superfluous? */
        if B ∈ CLOSURE(F − {X → Y} ∪ {X → (Y − {B})}, X)
            then replace X → Y with X → (Y − {B})

Eliminate FDs of the form X → ∅
Combine FDs of the form X → Y1, X → Y2, ... into X → Y1Y2...
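The four steps above (left reduction, right reduction, elimination, combination) can be sketched as follows. This is an illustrative implementation under the assumption that FDs are `(lhs, rhs)` pairs of sets; the function names are hypothetical, and the result may depend on the order in which attributes are tried.

```python
def closure(fds, attrs):
    """Attribute closure of attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def minimal_cover(fds):
    fds = [(frozenset(l), frozenset(r)) for l, r in fds]
    # Left reduction: drop attribute a from X if Y still follows from X - {a}.
    for i in range(len(fds)):
        lhs, rhs = fds[i]
        for a in sorted(lhs):
            if rhs <= closure(fds, lhs - {a}):
                lhs = lhs - {a}
                fds[i] = (lhs, rhs)
    # Right reduction: drop b from Y if b still follows once X -> Y is weakened.
    for i in range(len(fds)):
        lhs, rhs = fds[i]
        for b in sorted(rhs):
            trial = fds[:i] + [(lhs, rhs - {b})] + fds[i + 1:]
            if b in closure(trial, lhs):
                rhs = rhs - {b}
                fds[i] = (lhs, rhs)
    # Eliminate X -> {} and combine FDs with the same left-hand side.
    merged = {}
    for lhs, rhs in fds:
        if rhs:
            merged[lhs] = merged.get(lhs, frozenset()) | rhs
    return merged


F = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"A"}, {"C"}), ({"A", "B"}, {"C"})]
for lhs, rhs in sorted(minimal_cover(F).items(), key=lambda kv: sorted(kv[0])):
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
# A -> B
# B -> C
```

In the example, AB → C is left-reduced to B → C, and the redundant copies of B → C and A → C are emptied by the right reduction.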


Relational Database Design Theory Normal Forms

Normal Forms . . .

. . . determine properties of relation schemata

. . . forbid certain combinations of functional dependencies in relations

. . . should prevent redundancies and anomalies


First Normal Form

Allows only atomic attributes in relation schemas, i.e., only elements of standard data types, such as integer or string, are allowed as attribute values, but not arrays or sets

Not in 1NF:

Vineyard     District        Region           WName
Ch. La Rose  Saint-Emilion   Bordeaux         La Rose Grand Cru
Creek        Barossa Valley  South Australia  Creek Shiraz, Pinot Noir
Helena       Napa Valley     California       Zinfandel, Pinot Noir
Müller       Rheingau        Hessen           Riesling Reserve
Bighorn      Napa Valley     California       Chardonnay


First Normal Form /2

In first normal form:

Vineyard     District        Region           WName
Ch. La Rose  Saint-Emilion   Bordeaux         La Rose Grand Cru
Creek        Barossa Valley  South Australia  Creek Shiraz
Creek        Barossa Valley  South Australia  Pinot Noir
Helena       Napa Valley     California       Zinfandel
Helena       Napa Valley     California       Pinot Noir
Müller       Rheingau        Hessen           Riesling Reserve
Bighorn      Napa Valley     California       Chardonnay
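The step from the non-1NF table to the 1NF table amounts to emitting one tuple per element of the set-valued attribute. A minimal sketch, assuming the multi-valued attribute is stored as a comma-separated string (an assumption made only for this illustration; `to_first_normal_form` is a hypothetical name):

```python
def to_first_normal_form(rows, attr, sep=", "):
    """Emit one tuple per value of the multi-valued attribute attr."""
    flat = []
    for row in rows:
        for value in row[attr].split(sep):
            # Copy the row and replace the set-valued entry by one atomic value.
            flat.append({**row, attr: value})
    return flat


producers = [
    {"Vineyard": "Creek", "District": "Barossa Valley", "WName": "Creek Shiraz, Pinot Noir"},
    {"Vineyard": "Bighorn", "District": "Napa Valley", "WName": "Chardonnay"},
]

for row in to_first_normal_form(producers, "WName"):
    print(row["Vineyard"], "|", row["WName"])
# Creek | Creek Shiraz
# Creek | Pinot Noir
# Bighorn | Chardonnay
```

The flattened relation repeats Vineyard and District for every wine, which is exactly the redundancy the higher normal forms address.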


Second Normal Form

Partial dependency: an attribute functionally depends on only part of the key

Name               Vineyard     Color  District        Region           Price
La Rose Grand Cru  Ch. La Rose  Red    Saint-Emilion   Bordeaux         39.00
Creek Shiraz       Creek        Red    Barossa Valley  South Australia   7.99
Pinot Noir         Creek        Red    Barossa Valley  South Australia  10.99
Zinfandel          Helena       Red    Napa Valley     California        5.99
Pinot Noir         Helena       Red    Napa Valley     California       19.99
Riesling Reserve   Müller       White  Rheingau        Hessen           14.99
Chardonnay         Bighorn      White  Napa Valley     California        9.90

f1: Name, Vineyard → Price
f2: Name → Color
f3: Vineyard → District, Region
f4: District → Region

Second normal form eliminates such partial dependencies for non-key attributes

Saake Database Concepts Last Edited: April 2019 5–21


Elimination of Partial Dependencies

[Figure: key K, part of key X, dependent attribute A]


Second Normal Form /2

Example relation in 2NF

R1(Name, Vineyard, Price)
R2(Name, Color)
R3(Vineyard, District, Region)


Second Normal Form /3

Note: a partially dependent attribute is only problematic if it is not a prime attribute

2NF formally: extended relation schema $\mathcal{R} = (R, \mathcal{K})$, FD set F over R

Y partially depends on X w.r.t. F if the FD X → Y is not left-reduced
Y fully depends on X if the FD X → Y is left-reduced
$\mathcal{R}$ is in 2NF if $\mathcal{R}$ is in 1NF and every non-prime attribute of R fully depends on every key of $\mathcal{R}$
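The 2NF condition can be tested by looking for non-prime attributes in the closure of a proper subset of a key. The sketch below assumes FDs as `(lhs, rhs)` set pairs and uses the wine relation with FDs f1 to f4 and the single key {Name, Vineyard}; `partial_dependencies` is a hypothetical helper, and its enumeration of key subsets is exponential in the key size.

```python
from itertools import combinations


def closure(fds, attrs):
    """Attribute closure of attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(fds, key, prime):
    """List (part_of_key, attribute) pairs that violate 2NF for this key."""
    found = []
    for size in range(1, len(key)):  # proper, non-empty subsets of the key
        for part in combinations(sorted(key), size):
            for attr in sorted(closure(fds, set(part)) - set(prime)):
                found.append((part, attr))
    return found


F = [
    ({"Name", "Vineyard"}, {"Price"}),     # f1
    ({"Name"}, {"Color"}),                 # f2
    ({"Vineyard"}, {"District", "Region"}),  # f3
    ({"District"}, {"Region"}),            # f4
]
key = {"Name", "Vineyard"}

for part, attr in partial_dependencies(F, key, prime=key):
    print(part, "->", attr)
# ('Name',) -> Color
# ('Vineyard',) -> District
# ('Vineyard',) -> Region
```

Price does not appear in the output because it depends on the whole key only.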


Third Normal Form

Eliminates transitive dependencies (in addition to the other kinds of dependencies)
For instance, Vineyard → District and District → Region in the relation on Slide 5-21
Note: 3NF only considers non-key attributes as endpoints of transitive dependencies


Elimination of Transitive Dependencies

[Figure: key K, set of attributes X, dependent attribute A]


Third Normal Form /2

Transitive dependency in R3, i.e., R3 violates 3NF

Example relation in 3NF

R3_1(Vineyard, District)
R3_2(District, Region)


Third Normal Form: Formally

Relation schema R, $X \subseteq R$, and F an FD set over R

$A \in R$ is called transitively dependent on X w.r.t. F if and only if there is a $Y \subseteq R$ for which it holds that

$$X \to Y, \quad Y \not\to X, \quad Y \to A, \quad A \notin XY$$

Extended relation schema $\mathcal{R} = (R, \mathcal{K})$ is in 3NF w.r.t. F if and only if

$$\nexists A \in R : A \text{ is a non-prime attribute in } \mathcal{R} \;\wedge\; A \text{ transitively dependent on a } K \in \mathcal{K} \text{ w.r.t. } F$$
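The definition can be turned into a brute-force search for a witness set Y. This sketch assumes FDs as `(lhs, rhs)` set pairs over the given schema and enumerates all subsets Y, so it is exponential and meant for small examples only; `transitive_dependencies` is a hypothetical name. The example is R3(Vineyard, District, Region) with key {Vineyard}.

```python
from itertools import combinations


def closure(fds, attrs):
    """Attribute closure of attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def transitive_dependencies(fds, schema, key, prime):
    """Find (Y, A) with K -> Y, not Y -> K, Y -> A, A not in K or Y, A non-prime."""
    key = set(key)
    key_closure = closure(fds, key)
    attrs = sorted(schema)
    found = []
    for size in range(1, len(attrs) + 1):
        for y in combinations(attrs, size):
            y_set = set(y)
            y_closure = closure(fds, y_set)
            # Require K -> Y and rule out Y -> K.
            if not y_set <= key_closure or key <= y_closure:
                continue
            for a in sorted(y_closure - key - y_set - set(prime)):
                found.append((y, a))
    return found


F = [({"Vineyard"}, {"District", "Region"}), ({"District"}, {"Region"})]
R3 = {"Vineyard", "District", "Region"}
print(transitive_dependencies(F, R3, key={"Vineyard"}, prime={"Vineyard"}))
# [(('District',), 'Region')]
```

The single witness Y = {District} shows that Region depends on the key only via District, so R3 is not in 3NF.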


Boyce-Codd Normal Form

Stronger version of 3NF: elimination of transitive dependencies also between prime attributes

Name               Vineyard         Dealer          Price
La Rose Grand Cru  Château La Rose  Weinkontor      39.90
Creek Shiraz       Creek            Wein.de          7.99
Pinot Noir         Creek            Wein.de         10.99
Zinfandel          Helena           GreatWines.com   5.99
Pinot Noir         Helena           GreatWines.com  19.99
Riesling Reserve   Müller           Weinkeller      19.99
Chardonnay         Bighorn          Wein-Dealer      9.90

FDs:
  Name, Vineyard → Price
  Vineyard → Dealer
  Dealer → Vineyard

Candidate keys: {Name, Vineyard} and {Name, Dealer}
The example relation meets 3NF but not BCNF


Boyce-Codd Normal Form /2

Extended relation schema $\mathcal{R} = (R, \mathcal{K})$, FD set F

BCNF formally:

$$\nexists A \in R : A \text{ transitively depends on a } K \in \mathcal{K} \text{ w.r.t. } F$$

Schema in BCNF:

WINES(Name, Vineyard, Price)
WINE_TRADE(Vineyard, Dealer)

However, BCNF may violate dependency preservation; therefore, the design often stops at 3NF
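A common textbook way to test BCNF is to require that every nontrivial FD has a superkey on its left side. The sketch below uses that criterion, which is assumed here to agree with the transitive-dependency formulation above; FDs are `(lhs, rhs)` set pairs and `bcnf_violations` is a hypothetical name.

```python
def closure(fds, attrs):
    """Attribute closure of attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(fds, schema):
    """Return the nontrivial FDs whose left side is not a superkey of schema."""
    schema = set(schema)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not schema <= closure(fds, lhs)
    ]


R = {"Name", "Vineyard", "Dealer", "Price"}
F = [
    ({"Name", "Vineyard"}, {"Price"}),
    ({"Vineyard"}, {"Dealer"}),
    ({"Dealer"}, {"Vineyard"}),
]

for lhs, rhs in bcnf_violations(F, R):
    print(sorted(lhs), "->", sorted(rhs))
# ['Vineyard'] -> ['Dealer']
# ['Dealer'] -> ['Vineyard']
```

Both reported FDs only involve prime attributes, which is why the relation passes 3NF but fails BCNF.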


Minimality

Avoid global redundancies
Meet other criteria (such as normal forms) with as few schemas as possible
Example: set of attributes ABC, set of FDs {A → B, B → C}
Database schemas in third normal form:

S = {(AB, {A}), (BC, {B})}

S' = {(AB, {A}), (BC, {B}), (AC, {A})}

Redundancies in S'


Schema Properties

Identifier  Schema Property  Key Points
            1NF              Only atomic attributes
            2NF              No non-prime attribute that partially depends on a key
S1          3NF              No non-prime attribute that transitively depends on a key
            BCNF             No attribute that transitively depends on a key
S2          Minimality       Minimal number of relation schemas that satisfies the other properties


Relational Database Design Theory Transformation Properties

Transformation Properties

When decomposing a relation into multiple relations, care must be taken that . . .

1 . . . only semantically sensible and consistent application data is presented (dependency preservation), and

2 . . . all application data can be derived from the base relations (lossless-join decomposition)


Dependency Preservation

Dependency preservation: a set of dependencies can be transformed into an equivalent second set of dependencies
More specifically: into the set of key dependencies, because these can be validated efficiently by the database system

  - The set of dependencies shall be equivalent to the set of key constraints in the resulting database schema.
  - Equivalence ensures that, on a semantic level, the key dependencies express the exact same integrity constraints as the functional and other dependencies did before.


Dependency Preservation: Example

Decomposition of the relation schema WINES (Slide 5-21) into 3NF:

R1(Name, Vineyard, Price)
R2(Name, Color)
R3_1(Vineyard, District)
R3_2(District, Region)

with key dependencies

Name, Vineyard → Price
Name → Color
Vineyard → District
District → Region

Equivalent to FDs f1 . . . f4 (Slide 5-21) ⇒ dependency-preserving


Dependency Preservation: Example /2

Zip code (a.k.a. postal code) structure of the Deutsche Post

ADDRESS(ZIP (Z), City (C), Street (S), Street Number (N))

and functional dependencies F

CSN → Z, Z → C

Candidate keys: CSN and ZSN ⇒ 3NF
Does not meet BCNF (because ZSN → Z → C): therefore decomposition of ADDRESS
But: every decomposition would destroy CSN → Z

The set of resulting FDs is not equivalent to F; the decomposition is therefore not dependency-preserving


Dependency Preservation: Formally

Locally extended database schema $S = \{(R_1, \mathcal{K}_1), \dots, (R_p, \mathcal{K}_p)\}$; a set F of local dependencies

S fully characterizes F (or: is dependency-preserving w.r.t. F) if and only if

$$F \equiv \{K \to R \mid (R, \mathcal{K}) \in S, K \in \mathcal{K}\}$$
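The equivalence in this definition can be tested with attribute closures: two FD sets are equivalent if each implies every FD of the other. The sketch below assumes schemas as `(attributes, keys)` pairs and FDs as `(lhs, rhs)` set pairs; the helper names are hypothetical. It uses the wine decomposition and, as a counterexample, one possible BCNF decomposition of ADDRESS into ZC and ZSN (chosen here for illustration).

```python
def closure(fds, attrs):
    """Attribute closure of attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def key_dependencies(schema):
    """Build {K -> R} for every relation schema R and each of its keys K."""
    return [(set(k), set(r)) for r, keys in schema for k in keys]


def equivalent(f, g):
    """True if f implies every FD in g and g implies every FD in f."""
    covers = lambda a, b: all(rhs <= closure(a, lhs) for lhs, rhs in b)
    return covers(f, g) and covers(g, f)


wine_fds = [
    ({"Name", "Vineyard"}, {"Price"}), ({"Name"}, {"Color"}),
    ({"Vineyard"}, {"District", "Region"}), ({"District"}, {"Region"}),
]
wine_schema = [
    ({"Name", "Vineyard", "Price"}, [{"Name", "Vineyard"}]),
    ({"Name", "Color"}, [{"Name"}]),
    ({"Vineyard", "District"}, [{"Vineyard"}]),
    ({"District", "Region"}, [{"District"}]),
]
print(equivalent(wine_fds, key_dependencies(wine_schema)))  # True

address_fds = [({"C", "S", "N"}, {"Z"}), ({"Z"}, {"C"})]
address_schema = [({"Z", "C"}, [{"Z"}]), ({"Z", "S", "N"}, [{"Z", "S", "N"}])]
print(equivalent(address_fds, key_dependencies(address_schema)))  # False: CSN -> Z is lost
```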


Lossless-Join Decomposition

In order to satisfy the criteria of the normal forms, relation schemas sometimes have to be decomposed into smaller relation schemas
In order to restrict this to "sensible" decompositions, require that the original relation can be recreated from the decomposed relations using a natural join ⇒ lossless-join decomposition


Lossless-Join Decomposition: Examples

Decompose the relation schema R = ABC into

R1 = AB and R2 = BC

The decomposition is not join-lossless given the dependencies

F = {A → B, C → B}

In contrast, the decomposition is join-lossless given the dependencies

F' = {A → B, B → C}


Lossless-Join Decomposition

Original relation:

A  B  C
1  2  3
4  2  3

Decomposition:

A  B
1  2
4  2

B  C
2  3

Join (join-lossless):

A  B  C
1  2  3
4  2  3


Non-Join-Lossless Decomposition

Original relation:

A  B  C
1  2  3
4  2  5

Decomposition:

A  B
1  2
4  2

B  C
2  3
2  5

Join (not join-lossless):

A  B  C
1  2  3
4  2  5
1  2  5
4  2  3
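Both examples can be replayed with a small projection and natural-join sketch. It assumes relations are sets of tuples with an explicit attribute list, and that the attributes of the first projection followed by the remaining attributes of the second reproduce the original column order; the function names are hypothetical.

```python
def project(relation, schema, attrs):
    """Projection onto attrs, with duplicates removed by the set."""
    idx = [schema.index(a) for a in attrs]
    return {tuple(t[i] for i in idx) for t in relation}


def natural_join(r1, s1, r2, s2):
    """Natural join of r1 (schema s1) and r2 (schema s2) on common attributes."""
    common = [a for a in s1 if a in s2]
    extra = [a for a in s2 if a not in s1]
    out = set()
    for t1 in r1:
        for t2 in r2:
            if all(t1[s1.index(a)] == t2[s2.index(a)] for a in common):
                out.add(t1 + tuple(t2[s2.index(a)] for a in extra))
    return out


def rejoin(relation, schema, x1, x2):
    """Decompose relation into projections on x1 and x2 and join them again."""
    return natural_join(project(relation, schema, x1), x1,
                        project(relation, schema, x2), x2)


schema = ["A", "B", "C"]
r_ok = {(1, 2, 3), (4, 2, 3)}
r_bad = {(1, 2, 3), (4, 2, 5)}

print(rejoin(r_ok, schema, ["A", "B"], ["B", "C"]) == r_ok)    # True
print(sorted(rejoin(r_bad, schema, ["A", "B"], ["B", "C"])))
# [(1, 2, 3), (1, 2, 5), (4, 2, 3), (4, 2, 5)]
```

The second join returns two spurious tuples, so the information about which A value belongs to which C value is lost even though the result is larger.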


Lossless-Join Decomposition: Formally

The decomposition of a set of attributes X into $X_1, \dots, X_p$ with $X = \bigcup_{i=1}^{p} X_i$ is called a lossless-join decomposition under a set of dependencies F over X if and only if

$$\forall r \in SAT_X(F) : \pi_{X_1}(r) \bowtie \cdots \bowtie \pi_{X_p}(r) = r$$

holds.

Simple criterion for a join-lossless decomposition into two relation schemas: the decomposition of X into $X_1$ and $X_2$ is join-lossless under F if $X_1 \cap X_2 \to X_1 \in F^+$ or $X_1 \cap X_2 \to X_2 \in F^+$
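The simple two-schema criterion reduces to one attribute closure of the shared attributes. A minimal sketch, assuming FDs as `(lhs, rhs)` set pairs; `lossless_binary` is a hypothetical name, and a result of False only means that this criterion does not establish losslessness.

```python
def closure(fds, attrs):
    """Attribute closure of attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def lossless_binary(fds, x1, x2):
    """Check whether (X1 ∩ X2) -> X1 or (X1 ∩ X2) -> X2 is implied by fds."""
    shared_closure = closure(fds, set(x1) & set(x2))
    return set(x1) <= shared_closure or set(x2) <= shared_closure


F = [({"A"}, {"B"}), ({"C"}, {"B"})]        # F from the AB / BC example
F_prime = [({"A"}, {"B"}), ({"B"}, {"C"})]  # F' from the AB / BC example

print(lossless_binary(F, {"A", "B"}, {"B", "C"}))        # False: B determines neither side
print(lossless_binary(F_prime, {"A", "B"}, {"B", "C"}))  # True: B -> BC
```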


Transformation Properties

Identifier  Transformation Property      Key Points
T1          Dependency Preservation      All given dependencies are represented by keys
T2          Lossless-Join Decomposition  Original relations can be recreated by joining base relations


Relational Database Design Theory Design Methods

Design Methods: Goals

Given: universe $\mathcal{U}$ and set of FDs F

Compute a locally extended database schema $S = \{(R_1, \mathcal{K}_1), \dots, (R_p, \mathcal{K}_p)\}$ with

  - T1: Dependency preservation (S fully characterizes F)
  - S1: S is in 3NF under F
  - T2: Lossless-join decomposition
  - S2: Minimality, i.e., $\nexists S' : S'$ satisfies T1, S1, T2 and $|S'| < |S|$


Design Methods: Example

A database schema is badly designed if even one of these four criteria is not fulfilled
Example: S = {(AB, {A}), (BC, {B}), (AC, {A})} fulfills T1, S1 and T2 under F = {A → B, B → C, A → C}
In the third relation AC, tuples are redundant or inconsistent
Correct: S' = {(AB, {A}), (BC, {B})}


Decomposition

Given: initial universal relation schema $\mathcal{R} = (\mathcal{U}, \mathcal{K}(F))$ with all attributes and the set of keys implied by the FDs F over R

  - Set of attributes $\mathcal{U}$ and set of FDs F
  - Find all K → U with K minimal, for which K → U ∈ F+ (this yields $\mathcal{K}(F)$)

Wanted: decomposition into $D = \{\mathcal{R}_1, \mathcal{R}_2, \dots\}$ of 3NF relation schemas
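Finding all minimal K with K → U in F+ can be done by brute force for small schemas. The sketch below enumerates attribute subsets by increasing size, so it is exponential in the number of attributes; FDs are assumed to be `(lhs, rhs)` set pairs, `candidate_keys` is a hypothetical name, and the ADDRESS schema from the zip code example serves as input.

```python
from itertools import combinations


def closure(fds, attrs):
    """Attribute closure of attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(fds, universe):
    """All minimal attribute sets whose closure covers the universe."""
    universe = set(universe)
    keys = []
    for size in range(1, len(universe) + 1):
        for cand in combinations(sorted(universe), size):
            cand = set(cand)
            if any(k <= cand for k in keys):
                continue  # contains a smaller key, hence not minimal
            if universe <= closure(fds, cand):
                keys.append(cand)
    return keys


F = [({"C", "S", "N"}, {"Z"}), ({"Z"}, {"C"})]
print([sorted(k) for k in candidate_keys(F, {"Z", "C", "S", "N"})])
# [['C', 'N', 'S'], ['N', 'S', 'Z']]
```

The output matches the candidate keys CSN and ZSN stated for ADDRESS.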


Decomposition: Algorithm

DECOMPOSE(R)
    Set D := {R}
    while R' ∈ D does not meet 3NF
        /* Find attribute A that is transitively dependent on K */
        if key K with K → Y, Y ↛ K, Y → A, A ∉ KY
        then
            /* Decompose relation schema R w.r.t. A */
            R1 := R − A, R2 := YA
            R1 := (R1, K), R2 := (R2, K2 = {Y})
            D := (D − R') ∪ {R1} ∪ {R2}
        end if
    end while
    return D


Decomposition: Example

Initial relation schema R = ABC

Functional dependencies F = {A → B, B → C}
Keys K = A
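One way to trace the decomposition algorithm on this input is the following walk-through. It is a sketch of how the steps would play out under the transitive-dependency condition, not a result stated on the slide.

```latex
\begin{align*}
&\text{Choose } K = A,\; Y = B,\; \text{attribute } C: \quad
  A \to B,\quad B \not\to A,\quad B \to C,\quad C \notin AB \\
&\Rightarrow\; C \text{ is transitively dependent on the key } A,
  \text{ so } ABC \text{ violates 3NF} \\
&R_1 := ABC - C = AB \text{ with key } A, \qquad R_2 := BC \text{ with key } B \\
&D = \{(AB, \{A\}),\; (BC, \{B\})\}
\end{align*}
```

Neither AB nor BC contains a further transitive dependency, so the loop would stop after this single step.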


Decomposition: Example /2

Initial relation schema R with Name, Vineyard, Price, Color, District, Region

Functional dependencies

f1: Name, Vineyard → Price
f2: Name, Vineyard → Vineyard
f3: Name, Vineyard → Name
f4: Name → Color
f5: Vineyard → District, Region
f6: District → Region


Decomposition: Assessment

Advantages: 3NF, lossless-join decomposition
Disadvantages: other criteria not fulfilled, depends on order, NP-hard (search for keys)


Synthesis Method

Principle: synthesis transforms the original set of FDs F into a resulting set of key dependencies G such that F ≡ G

"Dependency preservation" is built into the method
3NF and minimality are also achieved, independent of order
Computational complexity: quadratic


Comparison Decomposition — Synthesis

[Figure: schematic comparing decomposition and synthesis. Labels recoverable from the diagram: "R, K" with "FDs F"; "U, FDs F"; intermediate schemas (R′1, K′1), ..., (R′n, K′n) with FD sets F′ and F″; resulting schemas (R1, K1), ..., (Rn, Kn) for each method. Example relations shown: WINES (WineID, Name, Color, Vintage, Vineyard), PRODUCER (Vineyard, District, Region), WINZER (Vineyard, Name).]


Synthesis Method: Algorithm

Given: Relation schema R with FDs F

Wanted: Lossless-join and dependency-preserving decomposition into R1, ..., Rn where all Ri are in 3NF

Algorithm:

SYNTHESIZE(F):
    F̂ := MINIMALCOVER(F) /* Determine minimal cover */
    Compute equivalence classes Ci of FDs from F̂ with equal or equivalent left sides, i.e., Ci = {Xi → Ai1, Xi → Ai2, ...}
    For each equivalence class Ci create a schema of the form R_Ci = Xi ∪ {Ai1} ∪ {Ai2} ∪ ...
    if none of the schemas R_Ci contains a key of R
        then create an additional relation schema R_K whose attributes form a key of R
    return {R_K, R_C1, R_C2, ...}
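The steps above can be sketched in Python as follows. This is an illustrative reading of the slide's pseudocode, not a reference implementation: the function names, the encoding of FDs as pairs of attribute strings, and the greedy way of picking one key are assumptions. The minimal cover it computes may differ from a hand-derived one, because minimal covers are not unique.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (pairs of attribute collections)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    """Split right sides, reduce left sides, then drop redundant FDs."""
    cover = [(frozenset(lhs), frozenset([a])) for lhs, rhs in fds for a in rhs]
    reduced = []
    for lhs, rhs in cover:
        lhs = set(lhs)
        for attr in sorted(lhs):
            # attr is extraneous if the remaining left side still determines rhs
            if len(lhs) > 1 and rhs <= closure(lhs - {attr}, cover):
                lhs.discard(attr)
        reduced.append((frozenset(lhs), rhs))
    cover = list(dict.fromkeys(reduced))  # remove duplicates, keep order
    for fd in list(cover):
        rest = [g for g in cover if g != fd]
        if fd[1] <= closure(fd[0], rest):
            cover = rest  # fd follows from the others
    return cover


def synthesize(attributes, fds):
    """Return a list of relation schemas (frozensets of attributes)."""
    cover = minimal_cover(fds)
    classes = []  # each class: FDs with equal or equivalent left sides
    for lhs, rhs in cover:
        for cls in classes:
            other = cls[0][0]
            if lhs <= closure(other, cover) and other <= closure(lhs, cover):
                cls.append((lhs, rhs))
                break
        else:
            classes.append([(lhs, rhs)])
    schemas = [frozenset().union(*(l | r for l, r in cls)) for cls in classes]
    universe = set(attributes)
    if not any(closure(s, cover) == universe for s in schemas):
        key = set(universe)  # shrink the full attribute set to one key
        for attr in sorted(universe):
            if closure(key - {attr}, cover) == universe:
                key.discard(attr)
        schemas.append(frozenset(key))
    return schemas


# FD set from the slide "Equivalence Classes: Example", over attributes ABCE.
F = [("A", "B"), ("AB", "C"), ("A", "C"), ("B", "A"), ("C", "E")]
print(["".join(sorted(s)) for s in synthesize("ABCE", F)])  # ['ABC', 'CE']
```

Grouping compares each left side only with the first member of a class; this is sufficient because mutual functional determination is transitive.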


Equivalence Classes

Class of FDs whose left sides are equal or equivalent
Left sides are equivalent if they determine each other functionally
Relation schema R with Xi, Y ⊂ R: a set of FDs Xi → Xj and Xi → Y with 1 ≤ i, j ≤ n can be expressed as

(X1, X2, ..., Xn) → Y

[Figure: diagram of the mutually dependent left sides X1, X2, X3, X4 and the right side Y]


Equivalence Classes: Example

Set of FDs

F = {A → B, AB → C, A → C, B → A, C → E}

Minimal cover

F̂ = {A → B, B → C, B → A, C → E}

Aggregation into equivalence classes

C1 = {A → B, B → C, B → A}
C2 = {C → E}

Result of synthesis

(ABC, {{A}, {B}}), (CE, {C})
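The grouping in this example rests on the fact that A and B determine each other, while C does not determine A or B. A small sketch of that test follows; `equivalent_lhs` is a hypothetical helper name, and FDs are encoded as pairs of attribute strings for illustration.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (pairs of attribute collections)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def equivalent_lhs(x, y, fds):
    """True if the left sides x and y determine each other functionally."""
    return set(y) <= closure(x, fds) and set(x) <= closure(y, fds)


# Minimal cover from the example above.
cover = [("A", "B"), ("B", "C"), ("B", "A"), ("C", "E")]

print(equivalent_lhs("A", "B", cover))  # True: A -> B and B -> A
print(equivalent_lhs("B", "C", cover))  # False: C does not determine B

# Both A and B determine all of ABC, which is why ABC has the keys {A} and {B}.
print(set("ABC") <= closure("A", cover), set("ABC") <= closure("B", cover))
```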


Achieving a Lossless-Join Decomposition

Achieve a lossless-join decomposition by a simple “trick”:
- Extend the original set of FDs F with U → δ, where δ is a dummy attribute
- δ is removed after synthesis

Example: {A → B, C → E}
- Result of synthesis (AB, {A}), (CE, {C}) is not lossless, because the universal key is not part of any schema
- Dummy FD ABCE → δ; reduced to AC → δ
- Yields third relation schema (AC, {AC})
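The reduction of ABCE → δ to AC → δ amounts to shrinking the full attribute set to a key. The sketch below reproduces that step for the example; `reduce_to_key` is a hypothetical helper, and the greedy removal order (alphabetical) is an assumption that happens to be irrelevant here because the key is unique.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (pairs of attribute collections)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def reduce_to_key(universe, fds):
    """Drop attributes from the universe while the rest still determines everything."""
    universe = set(universe)
    key = set(universe)
    for attr in sorted(universe):
        if closure(key - {attr}, fds) == universe:
            key.discard(attr)
    return key


F = [("A", "B"), ("C", "E")]
U = set("ABCE")
schemas = [set("AB"), set("CE")]

# Neither synthesized schema determines all of U, so neither contains a key.
print([closure(s, F) == U for s in schemas])  # [False, False]

# Left side of the dummy FD ABCE -> δ after reduction.
print(sorted(reduce_to_key(U, F)))  # ['A', 'C']
```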


Synthesis: Example

Relation schema and set of FDs from Slide 5-49 (Decomposition: Example /2)

Steps

1 Minimal cover: removal of f2, f3 as well as Region in f5
2 Equivalence classes:
    C1 = {Name, Vineyard → Price}
    C2 = {Name → Color}
    C3 = {Vineyard → District}
    C4 = {District → Region}
3 Derivation of relation schemas
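Step 3 can be sketched as follows: each schema is the union of the attributes in its equivalence class, followed by a check whether some schema already contains a key. The variable names and the encoding of FDs as pairs of sets are assumptions made for illustration; the resulting schemas are computed from the classes above, not quoted from the slides.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (pairs of attribute sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


attributes = {"Name", "Vineyard", "Price", "Color", "District", "Region"}
classes = [
    [({"Name", "Vineyard"}, {"Price"})],  # C1
    [({"Name"}, {"Color"})],              # C2
    [({"Vineyard"}, {"District"})],       # C3
    [({"District"}, {"Region"})],         # C4
]
cover = [fd for cls in classes for fd in cls]

# One schema per class: all attributes on the left and right sides.
schemas = [set().union(*(lhs | rhs for lhs, rhs in cls)) for cls in classes]

# A schema contains a key exactly when its closure is the whole attribute set.
has_key = any(closure(s, cover) == attributes for s in schemas)

for s in schemas:
    print(sorted(s))
print(has_key)  # True: the schema built from C1 contains the key {Name, Vineyard}
```

Because the schema from C1 already contains a key, no additional key schema is needed in this example.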


Summary

Functional dependencies
Normal forms (1NF – 3NF, BCNF)
Dependency preservation and lossless-join decomposition
Design methods


Control Questions

What is the goal of normalizing relational schemas?
Which properties of relational schemas do the normal forms take into account?
What is the difference between 3NF and BCNF?
What does it mean for a decomposition to be dependency-preserving?
What is a lossless-join decomposition?
