CREATE THE DIFFERENCE Normalisation (special thanks to Janet Francis for this presentation)

Page 1

CREATE THE DIFFERENCE

Normalisation

(special thanks to Janet Francis for this presentation)

Page 2

Aim

• To demonstrate the meaning of normalisation

• To demonstrate how normalisation can be used to good effect

Page 3

Normalisation

• A process which uses a set of rules for grouping data elements into logical entities (relations)

• If followed carefully, it will result in a robust database design

• Each stage in the process results in the production of a structure - a normal form.

• For most purposes, the first three stages (to 3rd normal form – 3NF) are sufficient

Page 4

Un-normalised data

• A list of fields needed for the system

• Scenario

– All staff are released for two hours a week for staff development.

– Employees work at their own pace in a lab.

– A total of six attributes are recorded about each employee including their normal office location (building and room), the date they joined the course and how many hours it is planned for them to work on it.

Page 5

Un-normalised data

Course ID
Course Name
Employee ID
Name
Building
Room ID
Date Joined Course
Allocated Hours

In this example, Course ID, Employee ID and Room ID are known to be unique.

Page 6

Problems

• There is no record of the employee until they have joined a course.

• Lots of duplicate employee data is created once employees start to join courses.
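
A minimal sketch of the duplication problem, assuming a few made-up sample rows in the un-normalised layout (the course names, employee names, dates and hours are hypothetical; only the field list and the room/building pairs come from the slides). It counts how many copies of each employee's details end up stored:

```python
# Field order of the un-normalised record, as listed on the earlier slide.
FIELDS = ("course_id", "course_name", "employee_id", "name",
          "building", "room_id", "date_joined", "allocated_hours")

# Hypothetical sample data: employee 101 has joined two courses.
rows = [
    (1, "Databases", 101, "A. Smith", "Octagon", "K342", "2024-01-08", 20),
    (2, "Networking", 101, "A. Smith", "Octagon", "K342", "2024-02-05", 10),
    (1, "Databases", 102, "B. Jones", "Beacon", "C312", "2024-01-15", 20),
]


def employee_copies(rows):
    """Count how many times each employee's details are stored."""
    counts = {}
    for row in rows:
        record = dict(zip(FIELDS, row))
        emp = record["employee_id"]
        counts[emp] = counts.get(emp, 0) + 1
    return counts


print(employee_copies(rows))  # {101: 2, 102: 1}
```

Employee 101's name, building and room are stored twice, once per course joined. An employee who has not yet joined any course has no row at all, which is the first problem listed above.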

Page 7

First Normal Form (1NF)

• An entity is in 1NF if it has an identifying key and there are no repeating attributes or repeating groups of attributes

• To get to 1NF we must remove all repeating groups

Page 8

Remember what we started with

• Course details and Employee details are repeating groups

Course ID
Course Name
Employee ID
Name
Building
Room ID
Date Joined Course
Allocated Hours

Page 9

We need to:

• Take one of the unique identifiers, e.g. “Course ID” (we could have used Room ID or Employee ID)

• For each of the other attributes, check if they have a one-to-one relationship with “Course ID”

• If so, keep them; if not, move them into a new entity.

• For the new entity, a unique identifier is required and is formed as a composite by using “Course ID” combined with another unique identifier. “Employee ID” is chosen for this example, though it could have been “Room ID”.

Page 10

Our Example

COURSE
Course ID
Course Name

EMP_ON_COURSE
Course ID*, Employee ID
Name
Building
Room ID
Date Joined Course
Allocated Hours

NB: Course ID is part of the composite Primary Key of the new entity “EMP_ON_COURSE”. It is also the Foreign Key providing a relationship with COURSE.
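
The two 1NF entities could be sketched as tables using Python's built-in sqlite3 module. This is an illustration only: the column names, column types and sample values are assumptions, not part of the original slides. It shows the composite primary key and the foreign key back to COURSE being enforced:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
CREATE TABLE course (
    course_id   INTEGER PRIMARY KEY,
    course_name TEXT NOT NULL
);
CREATE TABLE emp_on_course (
    course_id       INTEGER NOT NULL REFERENCES course(course_id),
    employee_id     INTEGER NOT NULL,
    name            TEXT,
    building        TEXT,
    room_id         TEXT,
    date_joined     TEXT,
    allocated_hours INTEGER,
    PRIMARY KEY (course_id, employee_id)   -- composite key
);
""")


def try_insert(sql):
    """Return True if the insert is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Hypothetical sample data.
try_insert("INSERT INTO course VALUES (1, 'Databases')")
try_insert("INSERT INTO emp_on_course VALUES "
           "(1, 101, 'A. Smith', 'Octagon', 'K342', '2024-01-08', 20)")

# Same (course_id, employee_id) pair again: rejected by the composite primary key.
duplicate_ok = try_insert("INSERT INTO emp_on_course VALUES "
                          "(1, 101, 'A. Smith', 'Octagon', 'K342', '2024-03-01', 5)")

# Course 9 does not exist: rejected by the foreign key.
orphan_ok = try_insert("INSERT INTO emp_on_course VALUES "
                       "(9, 101, 'A. Smith', 'Octagon', 'K342', '2024-03-01', 5)")

print(duplicate_ok, orphan_ok)  # False False
```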

Page 11

Problems

• There is still a problem with employee details

• There is less duplicate data: course details are now only entered once per course.

Page 12

Second Normal Form (2NF)

• An entity is in 2NF if it is in 1NF and has no attributes which require only part of the key to identify them uniquely

• To get to 2NF we remove part-key dependencies

• All data items must be dependent on the whole of the composite primary key

Page 13

Not all groups are in 2NF

• COURSE is already in 2NF

• EMP_ON_COURSE is not, because some attributes depend on only part of the key:

Attribute            Depends On
Name                 Employee ID
Building             Employee ID
Room ID              Employee ID
Date Joined Course   Employee ID + Course ID
Allocated Hours      Employee ID + Course ID
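
One way to test the "depends on" column against sample data is sketched below. The function name determines and the sample rows are hypothetical. A check like this can only show that a dependency is consistent with the rows you happen to have; the real dependencies come from the business rules.

```python
# Hypothetical EMP_ON_COURSE rows in 1NF.
rows = [
    {"course_id": 1, "employee_id": 101, "name": "A. Smith", "building": "Octagon",
     "room_id": "K342", "date_joined": "2024-01-08", "allocated_hours": 20},
    {"course_id": 2, "employee_id": 101, "name": "A. Smith", "building": "Octagon",
     "room_id": "K342", "date_joined": "2024-02-05", "allocated_hours": 10},
    {"course_id": 1, "employee_id": 102, "name": "B. Jones", "building": "Beacon",
     "room_id": "C312", "date_joined": "2024-01-15", "allocated_hours": 20},
]


def determines(rows, lhs, rhs):
    """True if rows that agree on the lhs fields always agree on the rhs field."""
    seen = {}
    for row in rows:
        key = tuple(row[field] for field in lhs)
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True


# Part-key dependencies: Employee ID alone is enough.
print(determines(rows, ["employee_id"], "name"))         # True
print(determines(rows, ["employee_id"], "room_id"))      # True

# These need the whole composite key.
print(determines(rows, ["employee_id"], "date_joined"))  # False
print(determines(rows, ["employee_id", "course_id"], "date_joined"))  # True
```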

Page 14

So we...

• Take out details that are linked only to “Employee ID” into a separate entity.

• If in any doubt, ask a question such as ‘Are these fields affected when an Employee joins a course?’

Attribute   Depends On
Name        Employee ID
Building    Employee ID
Room ID     Employee ID

Page 15

...end up with three entities

COURSE
Course ID
Course Name

EMP_ON_COURSE
Course ID*, Employee ID*
Date Joined Course
Allocated Hours

EMPLOYEE
Employee ID
Name
Building
Room ID

The two parts of the composite Primary Key in EMP_ON_COURSE are Foreign Keys that reference the linked tables, COURSE and EMPLOYEE.
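
The split into EMPLOYEE and a slimmer EMP_ON_COURSE can be sketched as a projection over the 1NF rows: keep a subset of the fields and drop duplicate rows. The helper name project and the sample rows are hypothetical.

```python
# Hypothetical EMP_ON_COURSE rows before the 2NF split.
rows = [
    {"course_id": 1, "employee_id": 101, "name": "A. Smith", "building": "Octagon",
     "room_id": "K342", "date_joined": "2024-01-08", "allocated_hours": 20},
    {"course_id": 2, "employee_id": 101, "name": "A. Smith", "building": "Octagon",
     "room_id": "K342", "date_joined": "2024-02-05", "allocated_hours": 10},
    {"course_id": 1, "employee_id": 102, "name": "B. Jones", "building": "Beacon",
     "room_id": "C312", "date_joined": "2024-01-15", "allocated_hours": 20},
]


def project(rows, fields):
    """Keep only the given fields and drop duplicate rows, preserving order."""
    result = []
    for row in rows:
        item = tuple(row[field] for field in fields)
        if item not in result:
            result.append(item)
    return result


# Attributes that depend on Employee ID alone move to EMPLOYEE.
employee = project(rows, ["employee_id", "name", "building", "room_id"])

# Attributes that need the whole key stay in EMP_ON_COURSE.
emp_on_course = project(rows, ["course_id", "employee_id",
                               "date_joined", "allocated_hours"])

print(employee)            # each employee now appears once
print(len(emp_on_course))  # 3
```

Employee 101 joined two courses but is stored once in the employee list, so the duplicate employee data from the earlier stages is gone.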

Page 16

Problems

• No problem with courses

• No problem with employees

• But

– Building and Room Number are related, in that a room is in a particular building. If one is updated the other will be affected.

– If a building name changes, then with the current structure every employee record that stores that building will have to be updated

Page 17

Third Normal Form (3NF)

• An entity is in 3NF if it is in 2NF and no non-key attribute depends on another non-key attribute.

• To get to 3NF we must remove attributes that depend on other non-key attributes, i.e. resolve the Room and Building problem

Page 18

We need to:

• Decide on the direction of the dependency between the attributes

• For example

– If, given a value for A, there is only one possible value for B, then

• A determines B

• B is dependent on A

– So for rooms at Staffordshire University, the room number is unique. We know, for example, that K342 is in the Octagon Building and C312 is in the Beacon Building.

– If you know the room, you can find out the building. The reverse is not true: if you know the building, you cannot determine the room.
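
The direction of the dependency can be checked with a small sketch. K342 and C312 are the rooms named above; K343 is a hypothetical extra room added so that one building has more than one room. The helper names are assumptions.

```python
from collections import defaultdict

# (room, building) pairs; K343 is hypothetical.
rooms = [("K342", "Octagon"), ("C312", "Beacon"), ("K343", "Octagon")]


def images(pairs):
    """Map each left-hand value to the set of right-hand values seen with it."""
    result = defaultdict(set)
    for left, right in pairs:
        result[left].add(right)
    return dict(result)


def left_determines_right(pairs):
    """True if every left-hand value is paired with exactly one right-hand value."""
    return all(len(values) == 1 for values in images(pairs).values())


room_determines_building = left_determines_right(rooms)
building_determines_room = left_determines_right([(b, r) for r, b in rooms])

print(room_determines_building)  # True: each room is in one building
print(building_determines_room)  # False: Octagon contains K342 and K343
```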

Page 19

And the solution is...

• Leave Room ID in the original entity as a foreign key, but move Building into a separate entity with Room ID as the Primary Key.

EMPLOYEE
Employee ID
Name
Room ID*

ROOM
Room ID
Building
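
A sketch of why this helps, using Python's built-in sqlite3 module. The column types, employee names and the new building name are assumptions made for illustration. After the split, renaming a building touches ROOM only, however many employees sit in that room:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE room (
    room_id  TEXT PRIMARY KEY,
    building TEXT NOT NULL
);
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    room_id     TEXT REFERENCES room(room_id)
);
INSERT INTO room VALUES ('K342', 'Octagon'), ('C312', 'Beacon');
INSERT INTO employee VALUES
    (101, 'A. Smith', 'K342'),
    (102, 'B. Jones', 'K342'),
    (103, 'C. Patel', 'C312');
""")

# Rename the building: one ROOM row changes, no EMPLOYEE rows are touched.
cursor = conn.execute(
    "UPDATE room SET building = 'Octagon North' WHERE building = 'Octagon'")
rows_changed = cursor.rowcount

# Every employee in that room sees the new name through the join.
buildings = conn.execute("""
    SELECT e.name, r.building
    FROM employee AS e
    JOIN room AS r ON r.room_id = e.room_id
    ORDER BY e.employee_id
""").fetchall()

print(rows_changed)  # 1
print(buildings)
```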

Page 20

Entity Relationship Modelling

[Entity relationship diagram showing the entities Course, Emp_On_Course, Employee and Room]

This is not perfect. Why? There are at least 3 reasons.

Page 21

To Normalise to 3NF

1. Remove all repeating data elements and ensure that everything is dependent on the Primary Key

2. Ensure data items are dependent on the whole of the composite primary key

3. Remove to new entities all fields dependent on non-key fields

This process is sometimes referred to as “the key, the whole key and nothing but the key”!

Page 22

Ways to represent Normalisation

To make it easier to write down

– # represents a numeric field

– Primary keys are underlined

– Foreign keys* are in italics with an asterisk

The entities we created would be represented as:

EMPLOYEE (#Employee ID, Name, #Room ID*)
ROOM (#Room ID, Building)
COURSE (#Course ID, Course Name)
EMP_ON_COURSE (#Course ID*, #Employee ID*, Date Joined Course, Allocated Hours)
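
As a closing sketch, the four 3NF entities could be created with Python's built-in sqlite3 module. The column names, types and sample data are assumptions. In particular, room_id is stored as text here because room codes such as K342 are not purely numeric. The example also shows that the first problem with the un-normalised data is solved: an employee can now be recorded before joining any course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE room (
    room_id  TEXT PRIMARY KEY,
    building TEXT NOT NULL
);
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    room_id     TEXT REFERENCES room(room_id)
);
CREATE TABLE course (
    course_id   INTEGER PRIMARY KEY,
    course_name TEXT NOT NULL
);
CREATE TABLE emp_on_course (
    course_id       INTEGER NOT NULL REFERENCES course(course_id),
    employee_id     INTEGER NOT NULL REFERENCES employee(employee_id),
    date_joined     TEXT,
    allocated_hours INTEGER,
    PRIMARY KEY (course_id, employee_id)
);
-- Hypothetical sample data; employee 103 has not joined a course yet.
INSERT INTO room VALUES ('K342', 'Octagon'), ('C312', 'Beacon');
INSERT INTO employee VALUES
    (101, 'A. Smith', 'K342'), (102, 'B. Jones', 'C312'), (103, 'C. Patel', 'C312');
INSERT INTO course VALUES (1, 'Databases'), (2, 'Networking');
INSERT INTO emp_on_course VALUES
    (1, 101, '2024-01-08', 20), (2, 101, '2024-02-05', 10), (1, 102, '2024-01-15', 20);
""")

# Employees with no course are still on record.
unenrolled = conn.execute("""
    SELECT e.name
    FROM employee AS e
    LEFT JOIN emp_on_course AS ec ON ec.employee_id = e.employee_id
    WHERE ec.course_id IS NULL
""").fetchall()

# The original flat list can be rebuilt on demand with joins.
flat = conn.execute("""
    SELECT c.course_id, c.course_name, e.employee_id, e.name,
           r.building, r.room_id, ec.date_joined, ec.allocated_hours
    FROM emp_on_course AS ec
    JOIN course   AS c ON c.course_id   = ec.course_id
    JOIN employee AS e ON e.employee_id = ec.employee_id
    JOIN room     AS r ON r.room_id     = e.room_id
    ORDER BY e.employee_id, c.course_id
""").fetchall()

print(unenrolled)  # [('C. Patel',)]
print(len(flat))   # 3
```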