DBMS Lecture 8 - Normalization
Uploaded by ericka-tagarda
Lec 08: Normalization
BCA 20: DATABASE MANAGEMENT SYSTEM AND PROGRAMMING
Department of Information Systems
College of Computer Studies
Xavier University – Ateneo de Cagayan
Review
1. What is a Table?
2. What is a Column?
3. What is a Row?
The Apparel Store Case Study
In preparation for next year’s sale event, a certain apparel shop is coming up with ideas for the Item database. Analyze the table on the succeeding slide, and see how you can improve on it.
tbl_Items
Item        Colors  Price  Tax
T-shirt     Red     12.00  0.60
Polo        Red     12.00  0.60
Sweatshirt  Blue    12.00  0.60
Anomalies
The above table might be sufficient for a simple database, but errors (or anomalies) can occur later on when using it.
There are three general types of anomalies: Update, Insertion, and Deletion.
tbl_Items
Item        Colors       Price  Tax
T-shirt     Red, Blue    12.00  0.60
Polo        Red, Yellow  12.00  0.60
T-shirt     Red, Black   12.00  0.60
Sweatshirt  Blue, Black  12.00  0.60
Update Anomaly
For example, if an item occurs two or more times in the table, updating its colors means updating that column in every one of its rows, or else the data will become inconsistent.
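The update anomaly can be reproduced with a minimal sketch using Python's built-in sqlite3 module. It assumes an in-memory copy of tbl_Items, and the new price of 15.00 is a hypothetical value chosen only for illustration. For simplicity it changes a price rather than a color; the same problem applies to any value that is repeated across rows.

```python
import sqlite3

# In-memory copy of the unnormalized tbl_Items from the slides.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_Items (Item TEXT, Colors TEXT, Price REAL, Tax REAL)")
conn.executemany(
    "INSERT INTO tbl_Items VALUES (?, ?, ?, ?)",
    [
        ("T-shirt", "Red, Blue", 12.00, 0.60),
        ("Polo", "Red, Yellow", 12.00, 0.60),
        ("T-shirt", "Red, Black", 12.00, 0.60),
        ("Sweatshirt", "Blue, Black", 12.00, 0.60),
    ],
)

# Update only ONE of the two T-shirt rows (15.00 is a hypothetical new price).
conn.execute(
    "UPDATE tbl_Items SET Price = 15.00 "
    "WHERE Item = 'T-shirt' AND Colors = 'Red, Blue'"
)

# The same item now carries two different prices: the data is inconsistent.
tshirt_prices = sorted(
    row[0] for row in conn.execute("SELECT Price FROM tbl_Items WHERE Item = 'T-shirt'")
)
print(tshirt_prices)  # [12.0, 15.0]
```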
Insertion Anomaly
Suppose we have the name and colors of a new item, but its price has not been decided yet. We then have to insert NULL for the price, leading to an Insertion Anomaly.
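A minimal sketch of the insertion anomaly, again assuming an in-memory tbl_Items in Python's sqlite3 module. The new item "Jacket" and its color are hypothetical; Python's None is stored as SQL NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_Items (Item TEXT, Colors TEXT, Price REAL, Tax REAL)")

# Hypothetical new item: its name and color are known, its price and tax are not.
# The only way to record it in this table is to store NULL in the unknown columns.
conn.execute(
    "INSERT INTO tbl_Items VALUES (?, ?, ?, ?)",
    ("Jacket", "Green", None, None),
)

new_row = conn.execute("SELECT * FROM tbl_Items WHERE Item = 'Jacket'").fetchone()
print(new_row)  # ('Jacket', 'Green', None, None)
```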
Deletion Anomaly
Likewise, if an item is to be dropped and we delete its row, the entire item record (its colors, price, and tax) is deleted along with it.
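A minimal sketch of the deletion anomaly under the same assumptions (an in-memory tbl_Items in Python's sqlite3 module). Deleting the Polo row also removes the only record of what a Polo costs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_Items (Item TEXT, Colors TEXT, Price REAL, Tax REAL)")
conn.executemany(
    "INSERT INTO tbl_Items VALUES (?, ?, ?, ?)",
    [
        ("T-shirt", "Red, Blue", 12.00, 0.60),
        ("Polo", "Red, Yellow", 12.00, 0.60),
        ("T-shirt", "Red, Black", 12.00, 0.60),
        ("Sweatshirt", "Blue, Black", 12.00, 0.60),
    ],
)

# Drop the Polo by deleting its row.
conn.execute("DELETE FROM tbl_Items WHERE Item = 'Polo'")

# Its price and tax are gone as well: the query finds no row at all.
polo_price = conn.execute("SELECT Price FROM tbl_Items WHERE Item = 'Polo'").fetchone()
print(polo_price)  # None
```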
The Solution
Through Normalization, we can make sure that the data are logically arranged. Five levels of normal forms are commonly described, but the third normal form is usually sufficient for most typical database applications.
There are three steps in the Normalization process:
1. First Normal Form (1NF);
2. Second Normal Form (2NF); and
3. Third Normal Form (3NF)
Normalization
Normalization is a technique of organizing data in a database through the systematic decomposition of tables in order to eliminate data redundancies and anomalies.
These anomalies are the Insertion, Update, and Deletion Anomalies.
Normalization ensures that redundant data is eliminated and that data is logically stored (i.e., data dependencies make sense).
The Apparel Store Case Study
Let’s see how we can apply normalization to the apparel store’s Item database.
First Normal Form
In First Normal Form, no row may contain a repeating group of information (i.e., each column must hold a single value in each row). Each table should be organized into rows, and each row should have a primary key.
The Primary Key
The Primary Key is a single column (or a combination of two or more columns) that uniquely identifies each row.
We will use primary keys to help us in the Normalization process.
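A minimal sketch of what "uniquely identifies each row" means in practice, using Python's built-in sqlite3 module. It assumes a one-color-per-row version of tbl_Items with a composite primary key of Item and Color. That choice of key is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed composite primary key: the combination (Item, Color) must be unique.
conn.execute(
    "CREATE TABLE tbl_Items ("
    "Item TEXT, Color TEXT, Price REAL, Tax REAL, "
    "PRIMARY KEY (Item, Color))"
)
conn.execute("INSERT INTO tbl_Items VALUES ('T-shirt', 'Red', 12.00, 0.60)")

# A second row with the same key values is rejected by the database.
try:
    conn.execute("INSERT INTO tbl_Items VALUES ('T-shirt', 'Red', 12.00, 0.60)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)  # True
```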
First Normal Form
Remember, in First Normal Form, no row may have a column in which more than one value is saved (for example, values separated with commas). Also, each row must be unique and distinguished by a primary key.
Let’s look at tbl_Items again:
tbl_Items
Item        Colors       Price  Tax
T-shirt     Red, Blue    12.00  0.60
Polo        Red, Yellow  12.00  0.60
T-shirt     Red, Black   12.00  0.60
Sweatshirt  Blue, Black  12.00  0.60
First Normal Form
The table is not in First Normal Form because:
- Multiple items in the Colors field
- Duplicate records / no primary key
SOLUTION: BREAK IT DOWN
tbl_Items
Item        Colors  Price  Tax
T-shirt     Red     12.00  0.60
T-shirt     Blue    12.00  0.60
Polo        Red     12.00  0.60
Polo        Yellow  12.00  0.60
Sweatshirt  Blue    12.00  0.60
Sweatshirt  Black   12.00  0.60
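The "break it down" step can be sketched in plain Python. The function name to_first_normal_form is hypothetical, and the input is the unnormalized table as it appears in the earlier slides. The sketch splits each comma-separated Colors value into one row per color and skips item/color pairs it has already seen.

```python
# Unnormalized rows: (Item, Colors, Price, Tax), as in the earlier slides.
unnormalized = [
    ("T-shirt", "Red,blue", 12.00, 0.60),
    ("Polo", "Red, Yellow", 12.00, 0.60),
    ("T-shirt", "Red, Black", 12.00, 0.60),
    ("Sweatshirt", "Blue, Black", 12.00, 0.60),
]


def to_first_normal_form(rows):
    """Return one (item, color, price, tax) row per distinct item/color pair."""
    seen = set()
    result = []
    for item, colors, price, tax in rows:
        for color in colors.split(","):
            color = color.strip().capitalize()  # tidy spacing and letter case
            if (item, color) not in seen:       # skip duplicate item/color pairs
                seen.add((item, color))
                result.append((item, color, price, tax))
    return result


rows_1nf = to_first_normal_form(unnormalized)
for row in rows_1nf:
    print(row)
```

Because the sketch keeps every pair found in the four unnormalized rows, its output also contains a T-shirt/Black row, which the table above does not list.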
Second Normal Form
A table in Second Normal Form must first be in First Normal Form, and it must not have any partial dependencies.
All non-key fields must depend on all components of the primary key. This is guaranteed when the primary key is a single field.
Partial Dependency
A Partial Dependency exists when a non-key attribute depends on only part of the primary key. This can only happen when the primary key is a composite primary key (one made of two or more columns).
Let’s take a look at the table and see how this applies.
tbl_Items
Item        Colors  Price  Tax
T-shirt     Red     12.00  0.60
T-shirt     Blue    12.00  0.60
Polo        Red     12.00  0.60
Polo        Yellow  12.00  0.60
Sweatshirt  Blue    12.00  0.60
Sweatshirt  Black   12.00  0.60
Second Normal Form
The table is not in Second Normal Form because:
- PRICE and TAX depend on ITEM, but not on COLOR
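One way to check a claimed dependency against sample rows is sketched below in Python. The helper depends_on is hypothetical. Sample data can only disprove a dependency, never prove it, so a True result means "not contradicted by these rows"; whether the dependency really holds is a business rule.

```python
def depends_on(rows, determinant, dependent):
    """Return True if equal determinant values always come with equal dependent values."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        value = tuple(row[col] for col in dependent)
        # Remember the first value seen for this key; any later mismatch refutes it.
        if seen.setdefault(key, value) != value:
            return False
    return True


rows = [
    {"Item": "T-shirt", "Color": "Red", "Price": 12.00, "Tax": 0.60},
    {"Item": "T-shirt", "Color": "Blue", "Price": 12.00, "Tax": 0.60},
    {"Item": "Polo", "Color": "Red", "Price": 12.00, "Tax": 0.60},
    {"Item": "Polo", "Color": "Yellow", "Price": 12.00, "Tax": 0.60},
    {"Item": "Sweatshirt", "Color": "Blue", "Price": 12.00, "Tax": 0.60},
    {"Item": "Sweatshirt", "Color": "Black", "Price": 12.00, "Tax": 0.60},
]

# Item alone cannot be the key: one item has several colors.
item_determines_color = depends_on(rows, ["Item"], ["Color"])
# Yet Price and Tax are fixed by Item alone, i.e. by only part of the key (Item, Color).
item_determines_price_and_tax = depends_on(rows, ["Item"], ["Price", "Tax"])

print(item_determines_color)          # False
print(item_determines_price_and_tax)  # True
```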
tbl_ColorItem
Item        Color
T-shirt     Red
T-shirt     Blue
Polo        Red
Polo        Yellow
Sweatshirt  Blue
Sweatshirt  Black
tbl_PriceItem
Item        Price  Tax
T-shirt     12.00  0.60
Polo        12.00  0.60
Sweatshirt  12.00  0.60
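The 2NF decomposition can be sketched with Python's built-in sqlite3 module. The table and column names follow the slides; the key and foreign-key declarations and the price of 15.00 are assumptions added for illustration. A join rebuilds the original six rows, and a price now has to be changed in only one place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE tbl_PriceItem (
        Item  TEXT PRIMARY KEY,      -- Price and Tax depend on Item alone
        Price REAL,
        Tax   REAL
    );
    CREATE TABLE tbl_ColorItem (
        Item  TEXT REFERENCES tbl_PriceItem (Item),
        Color TEXT,
        PRIMARY KEY (Item, Color)    -- composite key: one row per item/color pair
    );
    """
)
conn.executemany(
    "INSERT INTO tbl_PriceItem VALUES (?, ?, ?)",
    [("T-shirt", 12.00, 0.60), ("Polo", 12.00, 0.60), ("Sweatshirt", 12.00, 0.60)],
)
conn.executemany(
    "INSERT INTO tbl_ColorItem VALUES (?, ?)",
    [
        ("T-shirt", "Red"), ("T-shirt", "Blue"),
        ("Polo", "Red"), ("Polo", "Yellow"),
        ("Sweatshirt", "Blue"), ("Sweatshirt", "Black"),
    ],
)


def joined_rows():
    """Rebuild the single-table view by joining the two tables on Item."""
    return conn.execute(
        "SELECT c.Item, c.Color, p.Price, p.Tax "
        "FROM tbl_ColorItem AS c JOIN tbl_PriceItem AS p ON p.Item = c.Item "
        "ORDER BY c.Item, c.Color"
    ).fetchall()


joined = joined_rows()
print(len(joined))  # 6

# One UPDATE now changes the price for every color of the item (15.00 is hypothetical).
conn.execute("UPDATE tbl_PriceItem SET Price = 15.00 WHERE Item = 'T-shirt'")
tshirt_prices = {price for item, _, price, _ in joined_rows() if item == "T-shirt"}
print(tshirt_prices)  # {15.0}
```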
Third Normal Form
Tables in Third Normal Form must first be in Second Normal Form, and all non-prime attributes of each table must depend only on the primary key (i.e., there must be no transitive dependencies).
Transitive Dependency
A Transitive Dependency exists when a non-key attribute depends on another non-key attribute.
Let’s look at the tables again:
tbl_ColorItem
Item        Color
T-shirt     Red
T-shirt     Blue
Polo        Red
Polo        Yellow
Sweatshirt  Blue
Sweatshirt  Black
tbl_PriceItem
Item        Price  Tax
T-shirt     12.00  0.60
Polo        12.00  0.60
Sweatshirt  12.00  0.60
Third Normal Form
The tables are not in Third Normal Form because:
- TAX depends on PRICE, not on ITEM
tbl_ColorItem
Item        Color
T-shirt     Red
T-shirt     Blue
Polo        Red
Polo        Yellow
Sweatshirt  Blue
Sweatshirt  Black
tbl_PriceItem
Item        Price
T-shirt     12.00
Polo        12.00
Sweatshirt  12.00
tbl_Tax
Price  Tax
12.00  0.60
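The 3NF step can be sketched the same way with Python's sqlite3 module. Only tbl_PriceItem and tbl_Tax are shown; the key and foreign-key declarations are assumptions added for illustration. The tax for a given price is stored once and recovered with a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl_Tax (
        Price REAL PRIMARY KEY,   -- Tax depends on Price
        Tax   REAL
    );
    CREATE TABLE tbl_PriceItem (
        Item  TEXT PRIMARY KEY,   -- Price depends on Item
        Price REAL REFERENCES tbl_Tax (Price)
    );
    """
)
conn.execute("INSERT INTO tbl_Tax VALUES (12.00, 0.60)")
conn.executemany(
    "INSERT INTO tbl_PriceItem VALUES (?, ?)",
    [("T-shirt", 12.00), ("Polo", 12.00), ("Sweatshirt", 12.00)],
)

# The tax of an item is looked up through its price instead of being repeated per item.
item_tax = conn.execute(
    "SELECT p.Item, p.Price, t.Tax "
    "FROM tbl_PriceItem AS p JOIN tbl_Tax AS t ON t.Price = p.Price "
    "ORDER BY p.Item"
).fetchall()
tax_row_count = conn.execute("SELECT COUNT(*) FROM tbl_Tax").fetchone()[0]

print(item_tax)       # [('Polo', 12.0, 0.6), ('Sweatshirt', 12.0, 0.6), ('T-shirt', 12.0, 0.6)]
print(tax_row_count)  # 1
```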
Another Example
Table_Assignment
Name         Assignment A     Assignment B
Jeff Smith   Article Summary  Poetry Analysis
Nancy Jones  Article Summary  Reaction Paper
Jane Scott   Article Summary  Poetry Analysis
Problem: The table is not in First Normal Form because:
- Assignment field repeating
- First and last name in one field
- No (guaranteed unique) primary key field
Solution: Break down the field NAME into First Name and Last Name.
tbl_Assignment
First Name  Last Name  Assignment 1     Assignment 2
Jeff        Smith      Article Summary  Poetry Analysis
Nancy       Jones      Article Summary  Reaction Paper
Jane        Scott      Article Summary  Poetry Analysis
No Primary Key?
Answer: Create another field. In this case, name it Student ID.
tbl_Assignment
Student ID  First Name  Last Name  Assignment 1     Assignment 2
1           Jeff        Smith      Article Summary  Poetry Analysis
2           Nancy       Jones      Article Summary  Reaction Paper
3           Jane        Scott      Article Summary  Poetry Analysis
Seems okay, right?
Look at the table again.
Problem: The Assignment field is repeating (Assignment 1, Assignment 2).
Solution: Create new fields (Assignment ID and Description).
tbl_Assignment
Student ID  First Name  Last Name  Assignment ID  Description
1           Jeff        Smith      A              Article Summary
1           Jeff        Smith      B              Poetry Analysis
2           Nancy       Jones      A              Article Summary
2           Nancy       Jones      C              Reaction Paper
3           Jane        Scott      A              Article Summary
3           Jane        Scott      B              Poetry Analysis
The table is not in 2NF since:
- Description does not depend on Student ID
tbl_Student
Student ID  First Name  Last Name
1           Jeff        Smith
2           Nancy       Jones
3           Jane        Scott
tbl_Assignment
Student ID  Assignment ID  Description
1           A              Article Summary
1           B              Poetry Analysis
2           A              Article Summary
2           C              Reaction Paper
3           A              Article Summary
3           B              Poetry Analysis
The tables are not in 3NF since:
- Description still does not depend on Student ID (it depends only on Assignment ID)
- Data repetition
tbl_Student
Student ID  First Name  Last Name
1           Jeff        Smith
2           Nancy       Jones
3           Jane        Scott
tbl_Assignment
Student ID  Assignment ID
1           A
2           A
3           A
1           B
3           B
2           C
tbl_Descript
Assignment ID  Description
A              Article Summary
B              Poetry Analysis
C              Reaction Paper
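The final three-table design can be sketched with Python's sqlite3 module. Column names are written without spaces (StudentID, AssignmentID), the key and foreign-key declarations are assumptions, and the assignment IDs A, B, and C are the ones used in the earlier single-table version. Joining the three tables rebuilds that single table without storing any description or name twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl_Student (
        StudentID INTEGER PRIMARY KEY,
        FirstName TEXT,
        LastName  TEXT
    );
    CREATE TABLE tbl_Descript (
        AssignmentID TEXT PRIMARY KEY,
        Description  TEXT
    );
    CREATE TABLE tbl_Assignment (
        StudentID    INTEGER REFERENCES tbl_Student (StudentID),
        AssignmentID TEXT REFERENCES tbl_Descript (AssignmentID),
        PRIMARY KEY (StudentID, AssignmentID)
    );
    """
)
conn.executemany(
    "INSERT INTO tbl_Student VALUES (?, ?, ?)",
    [(1, "Jeff", "Smith"), (2, "Nancy", "Jones"), (3, "Jane", "Scott")],
)
conn.executemany(
    "INSERT INTO tbl_Descript VALUES (?, ?)",
    [("A", "Article Summary"), ("B", "Poetry Analysis"), ("C", "Reaction Paper")],
)
conn.executemany(
    "INSERT INTO tbl_Assignment VALUES (?, ?)",
    [(1, "A"), (2, "A"), (3, "A"), (1, "B"), (3, "B"), (2, "C")],
)

# Join the three tables back into the single-table view.
report = conn.execute(
    "SELECT s.StudentID, s.FirstName, s.LastName, a.AssignmentID, d.Description "
    "FROM tbl_Assignment AS a "
    "JOIN tbl_Student AS s ON s.StudentID = a.StudentID "
    "JOIN tbl_Descript AS d ON d.AssignmentID = a.AssignmentID "
    "ORDER BY s.StudentID, a.AssignmentID"
).fetchall()

for row in report:
    print(row)
```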
Normalization for Non-IT Professionals
While the process of Normalization can be tricky for non-IT students and professionals, everyone should still be able to create logically sound databases in the Third Normal Form.
Summary
Normalization is the systematic decomposition of tables in order to eliminate data redundancies and anomalies. This lecture covered three normal forms: 1NF, 2NF, and 3NF.
A Primary Key is a single column (or a combination of two or more columns) that uniquely identifies each row.
A Partial Dependency exists when a non-key attribute depends on only part of the primary key.
A Transitive Dependency exists when a non-key attribute depends on another non-key attribute.
Exercise 1
Normalize the following “Pet_Health” table to 3NF:
Pet_ID  Pet_Name      Pet_Type  Pet_Age  Owner
771     Rover         Dog       12       Sam Villa
204     Spot          Dog       2        Anna Dy
348     Mrs Whiskers  Cat       4        Sam Villa
Exercise 2
Item_ID  Item_Name  Item_Desc                     Supplier_Name           Address              PO_num  PO_date
A101     BckBP      One box of black ballpens     De Oro Office Supplies  Cagayan de Oro City  20986   12-11-2014
A102     BluBP      One box of blue ballpens      De Oro Office Supplies  Cagayan de Oro City  20986   12-11-2014
P100     SBP        One ream of short bond paper  King Papers             Cagayan de Oro City  1217    02-10-2011
P100     SBP        One ream of short bond paper  Office Depot            Iligan City          21044   01-05-2015
End
References: www.lib.ku.edu/instruction