Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)

2 December 2005 Introduction to Databases Relational Database Design Prof. Beat Signer Department of Computer Science Vrije Universiteit Brussel http://www.beatsigner.com

description

This lecture is part of an Introduction to Databases course given at the Vrije Universiteit Brussel.

Transcript of Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)

Page 1: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)

2 December 2005

Introduction to Databases Relational Database Design

Prof. Beat Signer

Department of Computer Science

Vrije Universiteit Brussel

http://www.beatsigner.com

Page 2: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)

Beat Signer - Department of Computer Science - [email protected]

2 March 7, 2014

Relational Database Design

There are two major relational database design approaches

Top-down design
- develop a conceptual model (e.g. ER model)
- reduction (mapping) of the conceptual model to relation schemas
- use normalisation as a validation technique to check the quality of the resulting relation schemas
  - a relational database schema resulting from the mapping of a good ER model (with the correct entity sets) normally requires no further normalisation

Bottom-up design
- design by decomposition
- use normalisation to iteratively create (decompose) a set of relations starting with a single relation

Page 3: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Relational Database Design ...

A relation schema might contain certain dependencies, in which case it should be decomposed (normalised) into multiple smaller relation schemas
- this normalisation process is based on functional dependencies and multivalued dependencies

Sometimes multiple relations resulting from an ER to relation schema reduction might be merged to save some join query operations
- we have to ensure that the resulting larger relation schema does not introduce new undesirable dependencies

Page 4: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Reduction

A conceptual ER model can be reduced to a set of

relation schemas (relational database schema)

The quality of the resulting set of relation schemas

depends on the quality of the original ER design

In the following we discuss the reduction of the different

ER model concepts introduced earlier

Page 5: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Strong Entity Sets

A strong entity set E with only simple attributes a1,..., an is mapped to a relation R with attributes a1,..., an
- the primary key of the entity set E becomes the primary key of the relation R

ER diagram: entity set Employees with the attributes id and name

Relation schema: Employee (id, name)

Relation employee = (Employee):

id    name
1234  Beat Signer
1576  Lode Hoste
3212  Sandra Trullemans
...   ...

Page 6: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Composite Attributes

For each component of a composite attribute, we create an attribute ai in the relation R
- no special attribute is created for the composite attribute itself

ER diagram: entity set Employees with the attributes id, name and the composite attribute address (street, city)

Employee (id, name, street, city)

Page 7: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Multivalued Attributes

Multivalued attributes are treated separately since a relation should only contain attributes with atomic values
- for each multivalued attribute ai of an entity set E, we create a new relation S containing the attribute ai as well as the primary key attributes of the relation R that is created for the entity set E
  - define a foreign key constraint to the original relation R

ER diagram: entity set Employees with the attributes id, name and the multivalued attribute phone

Phones (id, phone)

Relation phones = (Phones):

id    phone
1234  032 2 612 1337
1234  032 2 612 3123
1576  032 2 623 8765
...   ...
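The mapping of a multivalued attribute can be tried out with a minimal sketch based on Python's built-in sqlite3 module. The table and column names follow the slide. The column types, the composite primary key (id, phone), the in-memory database and the helper name `build_phones_db` are assumptions made for illustration only.

```python
import sqlite3

def build_phones_db():
    """Create Employee and Phones in an in-memory database (illustrative only)."""
    con = sqlite3.connect(":memory:")
    con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    con.execute("CREATE TABLE Employee (id INTEGER PRIMARY KEY, name TEXT)")
    # One row per (employee, phone) pair keeps every attribute value atomic.
    con.execute(
        "CREATE TABLE Phones ("
        " id INTEGER NOT NULL REFERENCES Employee(id),"
        " phone TEXT NOT NULL,"
        " PRIMARY KEY (id, phone))"  # assumed key: all attributes of Phones
    )
    con.executemany("INSERT INTO Employee VALUES (?, ?)",
                    [(1234, "Beat Signer"), (1576, "Lode Hoste")])
    con.executemany("INSERT INTO Phones VALUES (?, ?)",
                    [(1234, "032 2 612 1337"), (1234, "032 2 612 3123"),
                     (1576, "032 2 623 8765")])
    return con

con = build_phones_db()
# All phone numbers of employee 1234, one atomic value per row.
phones_of_1234 = [phone for (phone,) in con.execute(
    "SELECT phone FROM Phones WHERE id = ? ORDER BY phone", (1234,))]
```

With the foreign key constraint enabled, inserting a phone number for an id that does not exist in Employee is rejected by the database.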

Page 8: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Weak Entity Sets

A weak entity set E with attributes a1,..., an is mapped to a relation R with attributes a1,..., an combined with the primary key attributes b1,..., bm of the identifying entity set F
- the primary key of R is defined by the primary key attributes of the identifying entity set F combined with the discriminator of E
- a foreign key constraint is defined from the attributes b1,..., bm to the primary key of the relation that is created for the identifying entity set F

Page 9: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Weak Entity Sets ...

ER diagram: weak entity set Seats (number, colour) connected to the entity set Cinemas (id, name) via the Offers relationship set

Seat (id, number, colour)

Relation seat = (Seat):

id   number  colour
1    1       red
1    20      black
4    1       black
...  ...     ...

Page 10: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Relationship Sets

A relationship set over the entity sets E1,..., En with the

optional descriptive attributes b1,..., bm is mapped to a

relation R with the primary key attributes of E1,..., En

combined with b1,..., bm

The primary key of relation R is defined as follows

binary many-to-many relationship

- union of all primary key attributes of E1 and E2

binary one-to-one relationship

- choose the primary key of E1 or E2

binary one-to-many or many-to-one relationship

- choose the primary key of the entity set on the "many" side

Page 11: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Relationship Sets ...

The primary key of relation R is defined as follows ...

n-ary relationship without cardinality constraints

- union of all primary key attributes of E1,..., En

n-ary relationship with one 0..1 or 1..1 cardinality constraint over the entity set Ej

- union of all primary key attributes of E1,..., En , except the primary key of Ej

- note that we allow only one such 0..1 or 1..1 cardinality constraint for

n-ary relationships

A foreign key constraint is defined for each set of primary

key attributes (provided by the entity set Ei) to the

primary key of the corresponding relation that is defined

for Ei

Page 12: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Relationship Sets ...

ER diagram: relationship set LocatedAt (duration) between the entity sets Employees (id, name) and Offices (name, address, size)

LocatedAt (id, name, address, duration)

Relation locatedAt = (LocatedAt):

id    name    address      duration
1234  10F721  Pleinlaan 2  1
1576  10F733  Pleinlaan 2  1
...   ...     ...          ...

Page 13: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Relationship Sets ...

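ER diagram: relationship set LocatedAt (duration) between the entity sets Employees (id, name) and Offices (name, address, size), now with the cardinality constraints 1..1 and 0..*

LocatedAt (id, name, address, duration)

Relation locatedAt = (LocatedAt):

id    name    address      duration
1234  10F721  Pleinlaan 2  1
1576  10F733  Pleinlaan 2  1
...   ...     ...          ...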

Page 14: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Weak Entity Existence Relationship

The special relationship set from a weak entity set to its defining entity set is always a many-to-one relationship
- the special weak entity existence relationship does not have to be mapped to a separate relation since it is already covered by the relation that is created for the weak entity set
  - e.g. potential Offers relation schema already covered by Seat relation schema

Offers SeatsCinemas

id name number colour

Seat (id, number, colour)

Page 15: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Combination of Schemas

Relations resulting from the mapping of a relationship set with a total participation constraint can be integrated with the relation over which the constraint is defined
- key of the relation with the constraint (1..1) used as primary key
- also works for partial relationships (have to use null values)

ER diagram: relationship set LocatedAt (duration) between the entity sets Employees (id, name) and Offices (name, address, size) with the cardinality constraints 1..1 and 0..*

Employee (id, employeeName, duration, name, address)

Office (name, address, size)

Page 16: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Specialisation and Generalisation

Create a new relation R for each entity subset
- combine the attributes of the entity set with the primary key attributes of the superclass

ER diagram: entity set Persons (id, name) with the ISA specialisations Students (studentID) and Teachers (teachingHours)

Person (id, name)

Student (id, studentID)

Teacher (id, teachingHours)

Page 17: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Specialisation and Generalisation ...

For a disjoint and total ISA constraint we might omit the separate superclass relation
- saves some join operations but it is no longer possible to define a foreign key constraint on the id attribute (now at two places)

ER diagram: entity set Persons (id, name) with the disjoint ISA specialisations Students (studentID) and Teachers (teachingHours)

Student (id, name, studentID)

Teacher (id, name, teachingHours)

Page 18: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Aggregations

Like the regular relationship set mapping
- note that the name attribute is the one from the Companies entity set

WorksFor CompaniesEmployees

id name name address

Durationsfrom to

Manages

ManagersmId name

Manages (id, from, to, name, address, mId)

Page 19: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Relational Database Design

The goal of relational database design is to create a set of relation schemas that
- can be used to store information without unnecessary redundancy
- allow us to easily retrieve information

The quality of the set of schemas resulting from a reduction (top-down design) depends on how good the original ER design was

In a design by decomposition approach (bottom-up design) we need a way to reduce any redundancy via a decomposition process
- split large relations into multiple smaller relations

Page 20: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Update Anomalies

Insertion anomaly
- redundant information has to be kept consistent
  - e.g. insertion of a new order for an already existing CD
- information about a CD can only be inserted if there is an order, or we have to populate the customer information (i.e. name and street) with null values

Order (id, name, street, cdName, price)

Relation order = (Order):

id  name             street            cdName              price
1   Max Frisch       Bahnhofstrasse 7  Falling into Place  17.90
2   Eddy Merckx      Pleinlaan 25      Falling into Place  17.90
53  Albert Einstein  Bergstrasse 18    Chromatic           16.50
5   Max Frisch       Bahnhofstrasse 7  Carcassonne         15.50

Page 21: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Update Anomalies ...

Modification anomaly
- if we want to modify information about a particular CD, we have to ensure that the information is updated in all redundant entries
  - e.g. modification of the price of the CD named "Falling into Place"

Deletion anomaly
- if we delete a customer who is the only buyer of a specific CD, we also lose the information about that specific CD
  - e.g. deletion of the customer "Albert Einstein"

id  name             street            cdName              price
1   Max Frisch       Bahnhofstrasse 7  Falling into Place  17.90
2   Eddy Merckx      Pleinlaan 25      Falling into Place  17.90
53  Albert Einstein  Bergstrasse 18    Chromatic           16.50
5   Max Frisch       Bahnhofstrasse 7  Carcassonne         15.50

Page 22: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Normalisation

Normalisation is a formal method to analyse relation schemas based on their keys, functional dependencies (FD) as well as multivalued dependencies (MVD)
- remove redundancy
- prevent certain update anomalies
  - insertion, modification and deletion

There exists a set of rules to check if a relation is in a specific normal form
- original normal forms described by Codd

Normal forms, ordered from strongest to weakest:

Fifth Normal Form (5NF)
Fourth Normal Form (4NF)
Boyce-Codd Normal Form (BCNF)
Third Normal Form (3NF)
Second Normal Form (2NF)
First Normal Form (1NF)

Page 23: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Normalisation ...

A relation that does not conform to a certain degree of normalisation can be decomposed (lossless-join decomposition) into multiple relations that are in the desired normal form
- can be done automatically

Normalisation is often done in a stepwise manner
- a higher normal form means a more restricted format and fewer problems with update anomalies
- note that only the first normal form (1NF) is mandatory for the relational model and all the other normal forms are optional

Page 24: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


First Normal Form (1NF)

As we have seen earlier, the ER model supports complex attributes
- composite attributes
- multivalued attributes

In the reduction process, we remove this substructure from attributes to create a relational model with atomic attribute values only

A relation schema R is in first normal form (1NF) if the domains D1,..., Dn of all attributes a1,..., an of R are atomic
- no composite attributes or attributes with a set of values
- the intersection of each row and column contains one and only one value

Page 25: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Functional Dependencies

TeacherDept (teacherID, teacher, salary, department, building, budget)

In this example, there are various sets of attributes that uniquely identify a set of other attributes
- teacherID → teacher
- teacherID → salary
- teacherID → {teacher, salary}
- {teacherID, teacher} → {salary}
- department → {building, budget}
- ...

We say that there is a functional dependency (→) between these two sets of attributes
- a functional dependency should always hold on a relation schema and not just on a particular relation instance

Page 26: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Functional Dependencies ...

A functional dependency can be used to express constraints (generalisation of keys) over a set of attributes (determinant) that uniquely identify a set of other attributes (dependent attributes)

For a relation schema R with α ⊆ R and β ⊆ R, the functional dependency α → β holds on R if, for any relation r(R),

∀ t1, t2 ∈ r(R): t1[α] = t2[α] ⇒ t1[β] = t2[β]

Note that any K ⊆ R is a superkey if K → R
- we can use functional dependencies to check whether K is a superkey

Page 27: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Functional Dependencies ...

The relation r(R) satisfies the following set F of functional dependencies
- A → B
- C → E
- ...

Relation r(R):

A   B   C   D   E
a1  b1  c1  d1  e1
a2  b2  c2  d1  e2
a2  b2  c3  d1  e3
a3  b2  c4  d3  e3

A functional dependency α → β is trivial if β ⊆ α
- trivial dependencies are satisfied by all relations

A full functional dependency has a minimal determinant
- if the determinant is not minimal, we talk about a partial functional dependency (e.g. AD → B in the example)

For a relation r(R) with α → β and β → γ we say that γ is transitively dependent on α via β
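Whether a functional dependency is satisfied by a concrete relation instance can be checked mechanically. The following is a minimal sketch in which the function name `holds` and the list-of-dicts representation of r(R) are assumptions for illustration. Note that such a check can only refute a dependency: a dependency that happens to hold on one instance does not necessarily hold on the relation schema.

```python
def holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs is satisfied by this instance."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[b] for b in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True

# The example instance r(R) with the attributes A to E.
r = [
    dict(A="a1", B="b1", C="c1", D="d1", E="e1"),
    dict(A="a2", B="b2", C="c2", D="d1", E="e2"),
    dict(A="a2", B="b2", C="c3", D="d1", E="e3"),
    dict(A="a3", B="b2", C="c4", D="d3", E="e3"),
]

a_determines_b = holds(r, ["A"], ["B"])  # satisfied by the instance
b_determines_a = holds(r, ["B"], ["A"])  # violated: b2 occurs with a2 and a3
```

The partial dependency AD → B is also satisfied, since its determinant contains A and A → B already holds.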

Page 28: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Closure of Attributes

For a given relation schema R, a number of functional dependencies and a set of attributes α ⊆ R, the closure α+ is defined by all attributes Bi such that α → Bi

Computing the closure

Initialise the set s with the attributes of α
Repeat until the set s does not grow anymore {
  if there is a functional dependency β → γ and β ⊆ s, then add γ to the set s
}

If the closure α+ contains all attributes of the relation schema R, then the attributes α form a superkey of R
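The closure computation can be sketched in a few lines of Python. The function name `closure` and the representation of functional dependencies as (determinant, dependent) pairs of sets are assumptions for illustration. The example uses the TeacherDept schema and assumes that the two dependencies listed below are the only ones that hold.

```python
def closure(attrs, fds):
    """Compute the closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:  # repeat until the set does not grow anymore
        changed = False
        for lhs, rhs in fds:
            # If the whole determinant is in the set, add the dependent attributes.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

schema = {"teacherID", "teacher", "salary", "department", "building", "budget"}
fds = [
    ({"teacherID"}, {"teacher", "salary"}),
    ({"department"}, {"building", "budget"}),
]

teacher_closure = closure({"teacherID"}, fds)
is_superkey = closure({"teacherID", "department"}, fds) == schema
```

Under these assumptions teacherID alone is not a superkey of TeacherDept, whereas {teacherID, department} is.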

Page 29: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Computation of Candidate Keys

We can test whether α is a superkey for a given relation schema R by checking whether the closure α+ contains all attributes of R
- α is a candidate key if, in addition, no proper subset of α is a superkey

We can further use this approach to find all the candidate keys for a relation schema R and a given set of functional dependencies
- check for each set α ⊆ R of attributes whether the closure α+ contains all attributes
- the search process can be slightly optimised by starting with the smallest possible subsets
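The search for candidate keys can be sketched as a brute-force enumeration over attribute subsets in order of increasing size. The names `closure` and `candidate_keys` are illustrative, and the TeacherDept dependencies used in the example are assumed to be the complete set. The enumeration is exponential in the number of attributes, so this is only meant for small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(schema, fds):
    """Enumerate subsets by increasing size and keep the minimal superkeys."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) >= set(schema):
                keys.append(candidate)
    return keys

schema = {"teacherID", "teacher", "salary", "department", "building", "budget"}
fds = [
    ({"teacherID"}, {"teacher", "salary"}),
    ({"department"}, {"building", "budget"}),
]
keys = candidate_keys(schema, fds)
```

Starting with the smallest subsets means that any superset of an already found key can be skipped immediately.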

Page 30: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Functional Dependency Inference

For a given set F of functional dependencies we can derive new functional dependencies based on a set of axioms to compute the closure F+ of F
- the closure F+ includes all functional dependencies that are logically implied by F

Three rules (Armstrong's axioms) can be used to compute F+
- reflexivity
  - for a given set of attributes α and β ⊆ α, α → β holds (see trivial dependency)
- augmentation
  - for a given set of attributes γ: if α → β, then γα → γβ holds
- transitivity
  - if α → β and β → γ, then α → γ holds

Page 31: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Functional Dependency Inference ...

Armstrong's axioms are sound (produce only elements of F+) and complete (produce all elements in F+)
- since it may take a lot of time to compute F+ with Armstrong's axioms only, there exist some additional rules

Decomposition
- if α → βγ, then α → β and α → γ hold

Union
- if α → β and α → γ, then α → βγ holds

Trivial dependency rules
- if α → β, then α → α ∪ β holds
- if α → β, then α → α ∩ β holds
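The additional rules add no expressive power since each of them follows from Armstrong's axioms. As a worked example, writing α, β and γ for sets of attributes, the union rule can be derived from the two premises α → β and α → γ as follows:

```latex
\begin{align*}
\alpha \to \beta \;&\Rightarrow\; \alpha \to \alpha\beta
  && \text{(augmentation with } \alpha\text{)}\\
\alpha \to \gamma \;&\Rightarrow\; \alpha\beta \to \beta\gamma
  && \text{(augmentation with } \beta\text{)}\\
\alpha \to \alpha\beta,\ \alpha\beta \to \beta\gamma \;&\Rightarrow\; \alpha \to \beta\gamma
  && \text{(transitivity)}
\end{align*}
```

The decomposition rule follows in a similar way: reflexivity gives βγ → β, and transitivity applied to α → βγ and βγ → β yields α → β.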

Page 32: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Second Normal Form (2NF)

A relation schema R is in second normal form (2NF) if it is in 1NF and if there exists no non-prime attribute that is functionally dependent on a part of a candidate key
- every non-prime attribute has to be fully functionally dependent on a candidate key
- a non-prime attribute is an attribute that is not part of any candidate key
- the Lecturer relation schema shown in the example is not in 2NF since the office attribute functionally depends on the teacher attribute, which is only a part of the candidate key (teacher, course)

Lecturer (teacher, course, office)

Relation lecturer = (Lecturer):

teacher            course     office
Beat Signer        Databases  10G731d
Beat Signer        WIS        10G731d
Lode Hoste         Databases  10F716
Lode Hoste         ATIS       10F716
Sandra Trullemans  WIS        10G731e

Page 33: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Second Normal Form (2NF) ...

2NF normalisation process
- remove any partially dependent attributes from the relation and put them in a new relation together with their determinant

The original Lecturer relation can be losslessly decomposed into two relations which are both in 2NF
- relations with single attribute keys are automatically in 2NF

Lecturer (teacher, office)

Relation lecturer = (Lecturer):

teacher            office
Beat Signer        10G731d
Lode Hoste         10F716
Sandra Trullemans  10G731e

Course (teacher, course)

Relation course = (Course):

teacher            course
Beat Signer        Databases
Beat Signer        WIS
Lode Hoste         Databases
Lode Hoste         ATIS
Sandra Trullemans  WIS
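That this decomposition loses no information can be checked on the example data by projecting the original Lecturer relation onto the two new schemas and joining the results again. The helper names `project`, `natural_join` and `as_set` as well as the list-of-dicts representation are assumptions made for this sketch.

```python
def project(rows, attrs):
    """Projection: keep only attrs and eliminate duplicate tuples."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in unique]

def natural_join(left, right):
    """Natural join: combine tuples that agree on all common attributes."""
    joined = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in l.keys() & r.keys()):
                joined.append({**l, **r})
    return joined

def as_set(rows):
    """Order-independent representation used to compare two relations."""
    return {frozenset(row.items()) for row in rows}

# The original Lecturer relation before the decomposition.
original = [
    dict(teacher="Beat Signer", course="Databases", office="10G731d"),
    dict(teacher="Beat Signer", course="WIS", office="10G731d"),
    dict(teacher="Lode Hoste", course="Databases", office="10F716"),
    dict(teacher="Lode Hoste", course="ATIS", office="10F716"),
    dict(teacher="Sandra Trullemans", course="WIS", office="10G731e"),
]

lecturer = project(original, ["teacher", "office"])
course = project(original, ["teacher", "course"])
rejoined = natural_join(lecturer, course)
lossless_on_instance = as_set(rejoined) == as_set(original)
```

The projection onto (teacher, office) contains only three tuples, so each office is now stored once per teacher, and the join on teacher reproduces exactly the five original tuples.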

Page 34: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Lossless Decomposition

Given a relation schema R and a decomposition of R into the two relation schemas R1 and R2, we say that R1 and R2 form a lossless decomposition if πR1(r) ⋈ πR2(r) = r

Let F be a set of functional dependencies on R
- R1 and R2 form a lossless decomposition of R if either R1 ∩ R2 → R1 or R1 ∩ R2 → R2 is in F+
  - this means that R1 ∩ R2 is a superkey of R1 or R2
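The schema-level test can be sketched with the attribute closure: the common attributes of R1 and R2 determine all of R1 (or all of R2) exactly when the closure of the common attributes contains R1 (or R2). The function names `closure` and `is_lossless` are illustrative, and the example assumes that teacher → office is the only given dependency on the Lecturer schema.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_lossless(r1, r2, fds):
    """Check whether the common attributes of r1 and r2 are a superkey of r1 or r2."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure

fds = [({"teacher"}, {"office"})]

# Decomposition from the 2NF example: the common attribute teacher determines office.
good = is_lossless({"teacher", "office"}, {"teacher", "course"}, fds)
# Splitting on office instead fails the test since office determines nothing else.
bad = is_lossless({"teacher", "office"}, {"course", "office"}, fds)
```

A decomposition that fails this test is not guaranteed to be lossless for every relation on the schema, even if a particular instance happens to be reconstructed correctly by the join.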

Page 35: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Third Normal Form (3NF)

A relation schema R is in third normal form (3NF) if it is in 2NF and no non-prime attribute is transitively dependent on a candidate key, i.e. for all functional dependencies α → β in F+ one of the following has to hold
- α → β is a trivial functional dependency (i.e. β ⊆ α)
- α is a superkey of R
- each attribute Ai in β - α is contained in a candidate key of R
  - note that each Ai can be in different candidate keys

Each non-key attribute "must provide a fact about the key, the whole key, and nothing but the key" [Bill Kent]

Page 36: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Third Normal Form (3NF) ...

The Prize relation example schema is in 2NF

The Prize relation schema is not in 3NF since birthdate is functionally dependent on winner and none of the three conditions holds for this functional dependency
- birthdate is transitively dependent on the key (award, year)

Prize (award, year, winner, birthdate)

Relation prize = (Prize):

award              year  winner         birthdate
ACM Turing Award   1981  Edgar F. Codd  23.08.1923
Nobel Peace Prize  1979  Mother Teresa  26.08.1910
ACM Turing Award   1984  Niklaus Wirth  15.02.1934
Nobel Peace Prize  1984  Desmond Tutu   07.10.1931
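The three 3NF conditions can be checked mechanically for the Prize schema. The sketch below assumes the dependencies {award, year} → winner and winner → birthdate, and the function names `closure`, `candidate_keys` and `violations_3nf` are hypothetical. As a simplification it inspects only the dependencies listed in F rather than all dependencies in F+.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(schema, fds):
    """Minimal superkeys, found by enumerating subsets in increasing size."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # not minimal
            if closure(candidate, fds) >= set(schema):
                keys.append(candidate)
    return keys

def violations_3nf(schema, fds):
    """Return the given dependencies for which none of the three conditions holds."""
    prime = set().union(*candidate_keys(schema, fds))  # attributes in some candidate key
    bad = []
    for lhs, rhs in fds:
        trivial = rhs <= lhs
        superkey = closure(lhs, fds) >= set(schema)
        only_prime = (rhs - lhs) <= prime
        if not (trivial or superkey or only_prime):
            bad.append((lhs, rhs))
    return bad

prize = {"award", "year", "winner", "birthdate"}
prize_fds = [({"award", "year"}, {"winner"}), ({"winner"}, {"birthdate"})]
violations = violations_3nf(prize, prize_fds)
```

The only reported violation is winner → birthdate: it is not trivial, winner is not a superkey, and birthdate is not part of the single candidate key (award, year).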

Page 37: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Third Normal Form (3NF) ...

3NF normalisation process
- remove any transitively dependent attributes from the relation and place them in a new relation together with their determinant

Decomposition of the Prize relation schema into two 3NF relation schemas

Prize (award, year, winner)

Relation prize = (Prize):

award              year  winner
ACM Turing Award   1981  Edgar F. Codd
Nobel Peace Prize  1979  Mother Teresa
ACM Turing Award   1984  Niklaus Wirth
Nobel Peace Prize  1984  Desmond Tutu

Birthdate (winner, birthdate)

Relation bdate = (Birthdate):

winner         birthdate
Edgar F. Codd  23.08.1923
Mother Teresa  26.08.1910
Niklaus Wirth  15.02.1934
Desmond Tutu   07.10.1931

Page 38: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Boyce-Codd Normal Form (BCNF)

The Boyce-Codd normal form is a stronger form of 3NF

A relation schema R is in Boyce-Codd Normal Form (BCNF) if it is in 3NF and if every determinant is a candidate key, i.e. for all functional dependencies α → β in F+ one of the following holds
- α → β is a trivial functional dependency (i.e. β ⊆ α)
- α is a superkey of R

Any relation that is in BCNF is also in 3NF since the

BCNF conditions are equivalent to the first two 3NF

conditions

Page 39: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


BCNF Decomposition

If a relation R is not in BCNF, then there exists at least one nontrivial functional dependency α → β where α is not a superkey of R
- the relation R can then be decomposed into the two relation schemas R1 (α ∪ β) and R2 (R - (β - α))

We can for example apply the BCNF decomposition to the previous Prize relation schema example with the functional dependency winner → birthdate
- α ∪ β = (winner, birthdate)
- (R - (β - α)) = (award, year, winner)

Further details about the algorithms for BCNF and 3NF decomposition can be found in the course book
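A single BCNF decomposition step can be sketched as follows. The function names `closure` and `bcnf_step` are hypothetical, and the Prize dependencies {award, year} → winner and winner → birthdate are assumed. The sketch performs one split only and looks for a violation among the given dependencies. A complete algorithm would repeat the step on the resulting schemas using the dependencies that hold on each of them.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_step(schema, fds):
    """Split schema on the first violating dependency, or return None."""
    for lhs, rhs in fds:
        nontrivial = not rhs <= lhs
        superkey = closure(lhs, fds) >= set(schema)
        if nontrivial and not superkey:
            r1 = lhs | rhs                  # determinant together with its dependents
            r2 = set(schema) - (rhs - lhs)  # R without the moved dependent attributes
            return r1, r2
    return None

prize = {"award", "year", "winner", "birthdate"}
prize_fds = [({"award", "year"}, {"winner"}), ({"winner"}, {"birthdate"})]
step = bcnf_step(prize, prize_fds)
```

For the Prize schema the step splits on winner → birthdate and yields the schemas (winner, birthdate) and (award, year, winner).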

Page 40: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Multivalued Dependencies

Some relation schemas that are in BCNF may still contain redundant information

The fourth normal form (4NF) deals with some of these problems based on multivalued dependencies
- for a given relation schema R with α ⊆ R and β ⊆ R the multivalued dependency α ↠ β holds if for all pairs of tuples t1 and t2 in r(R) (with t1[α] = t2[α]) there exist tuples t3 and t4 in r(R) such that
  - t1[α] = t2[α] = t3[α] = t4[α]
  - t3[β] = t1[β]
  - t3[R - β] = t2[R - β]
  - t4[β] = t2[β]
  - t4[R - β] = t1[R - β]

    α        β          R - α - β
t1  a1...ai  ai+1...aj  aj+1...an
t2  a1...ai  bi+1...bj  bj+1...bn
t3  a1...ai  ai+1...aj  bj+1...bn
t4  a1...ai  bi+1...bj  aj+1...an
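The definition can be turned into a direct check on a relation instance. In the sketch below, the function name `mvd_holds`, the schema (course, teacher, book) and all sample tuples are hypothetical and only serve to illustrate the definition. For every ordered pair of tuples that agree on the determinant, the tuple t3 is constructed and looked up. The tuple t4 is covered when the same pair is visited in the opposite order.

```python
def mvd_holds(rows, schema, alpha, beta):
    """Check the multivalued dependency alpha ->> beta on a relation instance."""
    tuples = {tuple(row[a] for a in schema) for row in rows}
    index = {a: i for i, a in enumerate(schema)}
    for t1 in tuples:
        for t2 in tuples:
            if any(t1[index[a]] != t2[index[a]] for a in alpha):
                continue  # the pair does not agree on the determinant
            # t3 takes its beta values from t1 and all remaining values from t2.
            t3 = tuple(t1[index[a]] if a in beta else t2[index[a]] for a in schema)
            if t3 not in tuples:
                return False
    return True

schema = ["course", "teacher", "book"]  # hypothetical example schema
rows = [
    dict(course="Databases", teacher="T1", book="B1"),
    dict(course="Databases", teacher="T2", book="B1"),
    dict(course="Databases", teacher="T1", book="B2"),
    dict(course="Databases", teacher="T2", book="B2"),
]

complete = mvd_holds(rows, schema, ["course"], ["teacher"])
incomplete = mvd_holds(rows[:-1], schema, ["course"], ["teacher"])
```

In the complete instance every teacher of a course is combined with every book of that course, which is exactly the kind of redundancy that the fourth normal form addresses. Removing a single combination breaks the multivalued dependency.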

Page 41: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Multivalued Dependencies ...

Every functional dependency is also a multivalued dependency, i.e. if α → β then α ↠ β

Page 42: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Fourth Normal Form (4NF)

A relation schema R is in fourth normal form (4NF) if it is in BCNF and if any non-trivial multivalued dependency is a dependency on a candidate key, i.e. for all multivalued dependencies α ↠ β in D+ one of the following has to hold
- α ↠ β is a trivial multivalued dependency (i.e. β ⊆ α or α ∪ β = R)
- α is a superkey of R

Note that the fourth normal form is very similar to BCNF except that we use multivalued dependencies

4NF normalisation process
- remove any multivalued dependent attributes from the relation and place them in a new relation together with their determinant

Page 43: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Fifth Normal Form (5NF)

There are some forms of constraints called join dependencies that generalise multivalued dependencies
- leads to the project-join normal form or fifth normal form (5NF)
- not discussed in detail in this course

Page 44: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Normalisation Summary

Relations in higher normal forms are less vulnerable to update anomalies
- generally it is recommended that relations are at least in 3NF

Normalisation steps, from the weakest to the strongest normal form:

Unnormalised (UN)
- remove repeating groups
First Normal Form (1NF)
- remove partial dependencies
Second Normal Form (2NF)
- remove transitive dependencies
Third Normal Form (3NF)
- every determinant has to be a candidate key
Boyce-Codd Normal Form (BCNF)
- remove multivalued dependencies
Fourth Normal Form (4NF)
- remove join dependencies
Fifth Normal Form (5NF)

Page 45: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Denormalisation

Sometimes a database designer decides to store information in a redundant way to save join operations and improve the performance
- may result in additional work for insert, update and delete operations

An alternative is to keep the normalised schema and introduce additional materialised views

Page 46: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Homework

Study the following chapters of the Database System Concepts book
- chapter 7
  - sections 7.6 and 7.8.6
  - Reduction to Relation Schemas
- chapter 8
  - sections 8.1-8.9
  - Relational Database Design

Page 47: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


Exercise 4

Relational algebra

Relational database design
- ER to relational model reduction

Page 48: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)


References

A. Silberschatz, H. Korth and S. Sudarshan,

Database System Concepts (Sixth Edition),

McGraw-Hill, 2010

Page 49: Relational Database Design - Lecture 4 - Introduction to Databases (1007156ANR)

2 December 2005

Next Lecture Structured Query Language (SQL)