Normalization. Rigorous technique used to break down data represented in a user view into a set of...

Normalization

• Rigorous technique used to break down data represented in a user view into a set of 2-dimensional tables where “all attributes in the relation are defined by the key, the whole key and nothing but the key”

• Resulting relations will correspond to the tables to be used in a relational database

• Identification of primary key for a relation is critical

Sample User View: Class List

Section: DBS201A Subject Name: Intro to Database Design

Instr. No: 213 Instr. Name: Belvedere

Stud. No: 111222333 Stud. Name: Joe Brown

Stud. No: 212121212 Stud. Name: Le Huang

Section: DBS201B Subject Name: Intro to Database Design

Instr. No: 222 Instr. Name: Langer

Stud. No: 323232323 Stud. Name: Ella Zeltserman

Stud. No: 555555555 Stud. Name: Maria Ramirez

Un-normalized Relation

• Identify all attributes presented in user view

• Choose a primary key (made up of 1 or more attributes) that best represents what user view describes

• Name the relation and list all attributes for the relation

• Indicate primary key by underlining 1 or more attributes

• Indicate if an attribute or group of related attributes can have more than one value for a given value of the primary key by enclosing within brace brackets { } – this is referred to as a ‘repeating group’

Class List Relation – Un-normalized Form

• CLASSLIST [ Subject Code, Section Code, Instructor No, Instructor Name, Subject Name, {Student Number, Student Name} ]
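As an illustration only (not part of the original slides), the un-normalized CLASSLIST relation could be sketched in Python as one record per section, with the repeating group held as a nested list. The field names are hypothetical, and splitting the label "DBS201A" into subject code "DBS201" and section code "A" is an assumption about how the sample view is coded.

```python
# Un-normalized CLASSLIST: one record per (subject_code, section_code).
# The repeating group {Student Number, Student Name} is a nested list.
# Assumption: "DBS201A" splits into subject code "DBS201" + section code "A".
classlist_unf = [
    {
        "subject_code": "DBS201",
        "section_code": "A",
        "instructor_no": 213,
        "instructor_name": "Belvedere",
        "subject_name": "Intro to Database Design",
        "students": [
            {"student_no": "111222333", "student_name": "Joe Brown"},
            {"student_no": "212121212", "student_name": "Le Huang"},
        ],
    },
    {
        "subject_code": "DBS201",
        "section_code": "B",
        "instructor_no": 222,
        "instructor_name": "Langer",
        "subject_name": "Intro to Database Design",
        "students": [
            {"student_no": "323232323", "student_name": "Ella Zeltserman"},
            {"student_no": "555555555", "student_name": "Maria Ramirez"},
        ],
    },
]

# One key value maps to several student values: that is the repeating group.
for row in classlist_unf:
    key = (row["subject_code"], row["section_code"])
    print(key, "->", len(row["students"]), "students")
```

The nested list is what keeps this structure out of 1st normal form: a single key value does not identify a single student.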

1st Normal Form Relation

• A relation is in 1st normal form when the primary key determines a single value of each attribute for all attributes in the relation (i.e. the relation contains no repeating groups)

• 2 different approaches can be used to take a relation from un-normalized to 1NF (both produce same results at the end of the normalization process!)

Un-normalized -> 1st Normal Form: Approach 1

• Un-normalized relation: CLASSLIST [ Subject Code, Section Code, Instructor No, Instructor Name, Subject Name, {Student Number, Student Name} ]

• Restate original un-normalized relation without repeating group: CLASSLIST [ Subject Code, Section Code, Instructor No, Instructor Name, Subject Name ]

• Create new relation consisting of the key of the original relation plus the attributes within the repeating group, adding to the key (here Student Number) to ensure uniqueness: CLASSLISTSTUDENT [ Subject Code, Section Code, Student Number, Student Name ]
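Approach 1 can be sketched in Python as a function that splits each un-normalized record into a parent row and one child row per member of the repeating group. The function name `to_1nf_split`, the field names and the subject/section split of "DBS201A" are assumptions made for this sketch.

```python
def to_1nf_split(unf_rows):
    """Approach 1: move the repeating group into its own relation."""
    classlist = []          # key: (subject_code, section_code)
    classlist_student = []  # key: (subject_code, section_code, student_no)
    for row in unf_rows:
        # Restate the original relation without the repeating group.
        classlist.append({k: v for k, v in row.items() if k != "students"})
        # New relation: key of the original plus the repeating-group attributes.
        for s in row["students"]:
            classlist_student.append({
                "subject_code": row["subject_code"],
                "section_code": row["section_code"],
                "student_no": s["student_no"],
                "student_name": s["student_name"],
            })
    return classlist, classlist_student


unf = [{
    "subject_code": "DBS201", "section_code": "A",
    "instructor_no": 213, "instructor_name": "Belvedere",
    "subject_name": "Intro to Database Design",
    "students": [
        {"student_no": "111222333", "student_name": "Joe Brown"},
        {"student_no": "212121212", "student_name": "Le Huang"},
    ],
}]

classlist, classlist_student = to_1nf_split(unf)
print(classlist)
print(classlist_student)
```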

Un-normalized -> 1st Normal Form: Approach 2

• Un-normalized Relation: CLASSLIST [ Subject Code, Section Code, Instructor No, Instructor Name, Subject Name, {Student Number, Student Name} ]

• Add to key of un-normalized relation to ensure primary key identifies 1 and only 1 value of each attribute in the relation (here Student Number is added to the key): CLASSLIST [ Subject Code, Section Code, Instructor No, Instructor Name, Subject Name, Student Number, Student Name ]

• Regardless of approach used you will now have 1 or more relations in which the primary key identifies 1 and only 1 value of each of the non-key attributes in the 1NF relation
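Approach 2 can be sketched as flattening: each member of the repeating group becomes its own row, and the key is extended with Student Number. The function name `to_1nf_flat` and the field names are hypothetical, as is the subject/section split of "DBS201A".

```python
def to_1nf_flat(unf_rows):
    """Approach 2: one flat row per member of the repeating group."""
    flat = []
    for row in unf_rows:
        parent = {k: v for k, v in row.items() if k != "students"}
        for s in row["students"]:
            flat.append({**parent, **s})
    return flat


unf = [{
    "subject_code": "DBS201", "section_code": "A",
    "instructor_no": 213, "instructor_name": "Belvedere",
    "subject_name": "Intro to Database Design",
    "students": [
        {"student_no": "111222333", "student_name": "Joe Brown"},
        {"student_no": "212121212", "student_name": "Le Huang"},
    ],
}]

classlist_1nf = to_1nf_flat(unf)

# The extended key must identify exactly one row.
keys = {(r["subject_code"], r["section_code"], r["student_no"])
        for r in classlist_1nf}
print(len(keys) == len(classlist_1nf))
```

Note that the instructor and subject values are repeated on every student row here; the later normal forms remove that redundancy.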

2nd Normal Form

• A 1NF relation is in 2NF when the entire primary key is needed to determine the value of each non-key attribute (i.e. relation has no partial dependencies – attributes whose values can be determined by knowing only part of the key)
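A rough way to look for partial dependencies in sample data is to test whether part of the key already fixes the value of a non-key attribute. The helper `determines` below is a hypothetical sketch; a check against sample rows can only refute a dependency, never prove that it holds for all possible data.

```python
def determines(rows, lhs, rhs):
    """True if, in these rows, equal lhs values always imply equal rhs values."""
    seen = {}
    for r in rows:
        left = tuple(r[a] for a in lhs)
        right = tuple(r[a] for a in rhs)
        # Remember the first rhs seen for this lhs; any later mismatch
        # means lhs does not determine rhs.
        if seen.setdefault(left, right) != right:
            return False
    return True


# 1NF CLASSLIST from Approach 1 (field names and code split are assumptions).
classlist_1nf = [
    {"subject_code": "DBS201", "section_code": "A", "instructor_no": 213,
     "instructor_name": "Belvedere", "subject_name": "Intro to Database Design"},
    {"subject_code": "DBS201", "section_code": "B", "instructor_no": 222,
     "instructor_name": "Langer", "subject_name": "Intro to Database Design"},
]

# Part of the key fixes subject_name: a partial dependency.
print(determines(classlist_1nf, ["subject_code"], ["subject_name"]))
# Part of the key does not fix instructor_no: the whole key is needed.
print(determines(classlist_1nf, ["subject_code"], ["instructor_no"]))
```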

1st Normal Form -> 2nd Normal Form

• 1NF relations: CLASSLIST [ Subject Code, Section Code, Instructor No, Instructor Name, Subject Name ] contains the partial dependency Subject Code -> Subject Name, and CLASSLISTSTUDENT [ Subject Code, Section Code, Student Number, Student Name ] contains the partial dependency Student Number -> Student Name, so neither relation is in 2NF

• Create new relation(s) consisting of part of the primary key and all attributes whose values are determined by this part of the primary key: SUBJECT [Subject Code, Subject Name ] and STUDENT [Student Number, Student Name ]

• Restate original relation(s) without partially dependent attributes: CLASSLISTSTUDENT [ Subject Code, Section Code, Student Number ] and CLASSLIST [ Subject Code, Section Code, Instructor No, Instructor Name ]
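The two steps above amount to projecting each 1NF relation onto subsets of its attributes and dropping duplicate rows. A minimal sketch, assuming a hypothetical helper `project` and illustrative field names:

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples (order preserved)."""
    seen = set()
    out = []
    for r in rows:
        t = tuple(r[a] for a in attrs)
        if t not in seen:
            seen.add(t)
            out.append(dict(zip(attrs, t)))
    return out


classlist_1nf = [
    {"subject_code": "DBS201", "section_code": "A", "instructor_no": 213,
     "instructor_name": "Belvedere", "subject_name": "Intro to Database Design"},
    {"subject_code": "DBS201", "section_code": "B", "instructor_no": 222,
     "instructor_name": "Langer", "subject_name": "Intro to Database Design"},
]
classlist_student_1nf = [
    {"subject_code": "DBS201", "section_code": "A",
     "student_no": "111222333", "student_name": "Joe Brown"},
    {"subject_code": "DBS201", "section_code": "A",
     "student_no": "212121212", "student_name": "Le Huang"},
]

# New relations: part of the key plus what that part determines.
subject = project(classlist_1nf, ["subject_code", "subject_name"])
student = project(classlist_student_1nf, ["student_no", "student_name"])

# Original relations restated without the partially dependent attributes.
classlist_2nf = project(
    classlist_1nf,
    ["subject_code", "section_code", "instructor_no", "instructor_name"])
classlist_student_2nf = project(
    classlist_student_1nf, ["subject_code", "section_code", "student_no"])

print(subject)  # the subject name is now stored once, not once per section
```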

3rd Normal Form

• A 2NF relation is in 3NF when the primary key and nothing but the primary key can be used to determine the value of each non-key attribute (i.e. relation has no transitive dependencies – attributes whose values can be determined by knowing something other than the key)

2NF Relations -> 3NF

• 2NF relations: CLASSLISTSTUDENT [ Subject Code, Section Code, Student Number ], CLASSLIST [ Subject Code, Section Code, Instructor No, Instructor Name ], SUBJECT [ Subject Code, Subject Name ] and STUDENT [ Student Number, Student Name ]

• Create new relation(s) consisting of the attribute(s) which are determined by something other than the primary key (transitive dependencies) and make the primary key of these new relation(s) the attribute that actually determines the value of these attributes. In CLASSLIST the Instructor Name is determined by Instructor No so create the new relation: INSTRUCTOR [Instructor No, Instructor Name ]

• Restate original relation(s) without transitively dependent attributes (Original relation will now contain a foreign key – a non-key attribute that relates to the primary key of the new relation) : CLASSLIST [ Subject Code, Section Code, Instructor No ] , CLASSLISTSTUDENT [ Subject Code, Section Code, Student Number ] , SUBJECT [Subject Code, Subject Name ] and STUDENT [Student Number, Student Name ]
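The removal of the transitive dependency in CLASSLIST can be sketched as follows. The field names are illustrative assumptions. The final check rebuilds the 2NF rows by following the foreign key, to show that the decomposition loses no information for this sample.

```python
classlist_2nf = [
    {"subject_code": "DBS201", "section_code": "A",
     "instructor_no": 213, "instructor_name": "Belvedere"},
    {"subject_code": "DBS201", "section_code": "B",
     "instructor_no": 222, "instructor_name": "Langer"},
]

# INSTRUCTOR: keyed by the attribute that actually determines the name.
instructor = {r["instructor_no"]: r["instructor_name"] for r in classlist_2nf}

# CLASSLIST restated without the transitively dependent attribute;
# instructor_no stays behind as a foreign key into INSTRUCTOR.
classlist_3nf = [
    {k: v for k, v in r.items() if k != "instructor_name"}
    for r in classlist_2nf
]

# Following the foreign key recovers the original 2NF rows.
rebuilt = [
    {**r, "instructor_name": instructor[r["instructor_no"]]}
    for r in classlist_3nf
]
print(rebuilt == classlist_2nf)  # prints True
```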

3NF Relations for ClassList User View

Set of 3NF relations for the Class List user view:

• CLASSLIST [ Subject Code, Section Code, Instructor No ]

• CLASSLISTSTUDENT [ Subject Code, Section Code, Student Number ]

• SUBJECT [ Subject Code, Subject Name ]

• STUDENT [ Student Number, Student Name ]

• INSTRUCTOR [ Instructor No, Instructor Name ]

• 1 un-normalized user view will always result in 1 or more relations in 1NF

• Each 1NF relation will result in 1 or more 2NF relations

• Each 2NF relation will result in 1 or more 3NF relations

• You can never lose (i.e. not include) an attribute – it must always be found in one of the relations at each step

• You can never lose a relation
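Since the resulting relations correspond to database tables, the five 3NF relations could be declared roughly as below, using Python's standard `sqlite3` module. The table and column names, the column types and the subject/section split of "DBS201A" are assumptions made for this sketch. The final query joins the tables to rebuild the original class list, which shows that no attribute was lost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE subject (
    subject_code TEXT PRIMARY KEY,
    subject_name TEXT NOT NULL
);
CREATE TABLE instructor (
    instructor_no INTEGER PRIMARY KEY,
    instructor_name TEXT NOT NULL
);
CREATE TABLE student (
    student_no TEXT PRIMARY KEY,
    student_name TEXT NOT NULL
);
CREATE TABLE classlist (
    subject_code TEXT NOT NULL REFERENCES subject(subject_code),
    section_code TEXT NOT NULL,
    instructor_no INTEGER NOT NULL REFERENCES instructor(instructor_no),
    PRIMARY KEY (subject_code, section_code)
);
CREATE TABLE classliststudent (
    subject_code TEXT NOT NULL,
    section_code TEXT NOT NULL,
    student_no TEXT NOT NULL REFERENCES student(student_no),
    PRIMARY KEY (subject_code, section_code, student_no),
    FOREIGN KEY (subject_code, section_code)
        REFERENCES classlist(subject_code, section_code)
);
""")

# Sample data from the Class List user view (parents before children).
conn.execute("INSERT INTO subject VALUES ('DBS201', 'Intro to Database Design')")
conn.executemany("INSERT INTO instructor VALUES (?, ?)",
                 [(213, "Belvedere"), (222, "Langer")])
conn.executemany("INSERT INTO student VALUES (?, ?)", [
    ("111222333", "Joe Brown"), ("212121212", "Le Huang"),
    ("323232323", "Ella Zeltserman"), ("555555555", "Maria Ramirez")])
conn.executemany("INSERT INTO classlist VALUES (?, ?, ?)",
                 [("DBS201", "A", 213), ("DBS201", "B", 222)])
conn.executemany("INSERT INTO classliststudent VALUES (?, ?, ?)", [
    ("DBS201", "A", "111222333"), ("DBS201", "A", "212121212"),
    ("DBS201", "B", "323232323"), ("DBS201", "B", "555555555")])

# Rebuild the user view by joining the 3NF tables.
rows = conn.execute("""
    SELECT c.subject_code || c.section_code, i.instructor_name, s.student_name
    FROM classliststudent cs
    JOIN classlist c ON c.subject_code = cs.subject_code
                    AND c.section_code = cs.section_code
    JOIN instructor i ON i.instructor_no = c.instructor_no
    JOIN student s ON s.student_no = cs.student_no
    ORDER BY c.section_code, s.student_no
""").fetchall()
for row in rows:
    print(row)
```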

Normalize Remaining User views

• Normalization process is then applied to each remaining user view (e.g. grade sheet, timetable request, …)

• A set of 3NF relations is produced for each user view

• The 3NF relations from each user view are then integrated to form one complete set of relations for the application