Www.monash.edu.au CSE4500 Information Retrieval Systems XML Schema – Part 1.
Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002...
-
Upload
natalie-summers -
Category
Documents
-
view
213 -
download
0
Transcript of Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002...
![Page 1: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/1.jpg)
www.monash.edu.au
Detailed Data Modelling:
Attribute Collection and Normalisation of Data
IMS1002 /CSE1205 Systems Analysis and Design
![Page 2: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/2.jpg)
www.sims.monash.edu.au
2
Detailed Data Modelling
The objective of detailed data modelling is to develop a detailed data structure that:
• has stability, minimum redundancy, and is flexible to allow for future change
• can be used as the basis for physical file and database design
• reflects the actual data requirements of the system
![Page 3: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/3.jpg)
www.sims.monash.edu.au
3
Detailed Data Modelling
• To expand the conceptual data model, we need to identify and describe the details of the entities and relationships
• Attributes (data elements) of entities and relationships are identified
• The attributes should be independent of implementation technology
• Each attribute should represent a single “fact”
• An organisation-wide perspective should be adopted to ensure minimum redundancy and inconsistency and to facilitate data sharing
![Page 4: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/4.jpg)
www.sims.monash.edu.au
4
• Detailed data modelling aims to identify and describe attributes, convert ER models to relations, and to normalise the relations to ensure that they are well structured
• Techniques:
>attribute collection
>convert ER models to relations
>normalisation
>convert to data structure diagram
>Data Dictionary
Detailed Data Modelling
![Page 5: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/5.jpg)
www.sims.monash.edu.au
5
Attributes
• An attribute is a named property or characteristic of an entity that is of interest to the organisation
• Use an initial capital letter followed by lower case letters in naming attributes
> eg Date, First Name, Suburb, Account No,
• Each attribute name must be unique and distinctive
![Page 6: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/6.jpg)
www.sims.monash.edu.au
6
Identifying Attributes
There are three main sources of attributes:
– data to support essential user functions
– data to support current operations
– data to measure performance against objectives
![Page 7: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/7.jpg)
www.sims.monash.edu.au
7
Identifying Attributes
• Data to support all essential user activities identified as system requirements:
– Involve all potential users
– Include future data requirements of userseg.The inventory control function needs:
ITEM Item NumberItem DescriptionItem Location Item Quantity-on-hand Item Re-order Quantity
![Page 8: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/8.jpg)
www.sims.monash.edu.au
8
Data to support current operations:
• examine all forms, documents, reports and files (computerised and manual) used in the current system
• check with users for accurate definitions of attributes
• ensure all attributes identified are still required in the new system
Identifying Attributes
![Page 9: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/9.jpg)
www.sims.monash.edu.au
9
Identifying Attributes
Data to measure business objectives and performance:
– ensure that data necessary to measure performance against objectives is identified
eg. Objective stock should be held in the warehouse for no more than
seven days
Requires Date Of Delivery Date Of Despatch
for each Item
![Page 10: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/10.jpg)
www.sims.monash.edu.au
10
Documenting Attributes
• All attributes should be defined and described in the Data Dictionary
• Information should include:> name> description> format, precision, example values> domain of values, range of values> synonyms> method of derivation> validation constraints> entities which this attribute describes
![Page 11: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/11.jpg)
www.sims.monash.edu.au
11
Data Dictionary -data element entry
ACADEMIC
DATA ATTRIBUTE Product code
Alias: Inventory number, product numberDescription: Number to identify and differentiate each product held in warehouse
Format: Must be a positive integer
Range: 00001 to 99999Method of derivation: Next unused number in sequence
Validation: No characters; 6 digits only;
ENTITY: PRODUCT
AUTHOR: David Ross
DATE: 14 Oct 2005
![Page 12: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/12.jpg)
www.sims.monash.edu.au
12
Attribute Names
• Attribute names should be clear, unambiguous, and meaningful
• Each attribute name should be unique– Price, Retail Price, Cost Price for Item
– Date, Delivery Date of Purchase Order
– Name, First Name, Last Name of Member
– Quantity-on-hand, Quantity Sold, Qty Ordered
![Page 13: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/13.jpg)
www.sims.monash.edu.au
13
Attribute Names
Potential problem areas are:• Synonyms
some attributes may be synonyms of otherse.g. retail price, sale price
• Homonyms
some attributes may be homonyms of otherse.g. price (for retail price)
price (for wholesale price)
• Derived attributes
some attributes may be derived from otherse.g. total salary for department
![Page 14: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/14.jpg)
www.sims.monash.edu.au
14
Attribute Definition
• Each attribute should convey a single fact for ease of processing and querying
e.g Pay Code A normal hourlyB normal salariedC overtime hourlyD overtime salaried
should be
Pay Factor 1 normal2 overtime
Job Category X hourlyY salaried
![Page 15: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/15.jpg)
www.sims.monash.edu.au
15
Attribute Collection exercise 1.Take an order
• Customers name• Customers address• Customers phone number• Product description• Quantity ordered• Date product required by• Date of order• Time of order
![Page 16: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/16.jpg)
www.sims.monash.edu.au
16
Attribute Collection exercise 2.Get product
• Product description• Product number• Location of product• Quantity on hand• Product supplier• Quantity ordered• Date product required by• Date of order
![Page 17: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/17.jpg)
www.sims.monash.edu.au
17
Attribute Collection exercise 3. Order product
• Product number• Supplier name• Supplier address• Supplier order date• Quantity ordered from supplier• Expected delivery date• Approximate cost of order
![Page 18: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/18.jpg)
www.sims.monash.edu.au
18
Attribute Collection exercise 4. Sell product
• Product description• Product sell price• Quantity sold• Customer• Invoice Number• Invoice date• Total cost• Sale number• Date of sale• Time of sale• Payment amount• Receipt number
![Page 19: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/19.jpg)
www.sims.monash.edu.au
19
Problem of transcribing attributes into database structure
• Customers name• Customers address• Customers phone number• Product description• Quantity ordered• Date product required by• Date of order• Time of order• Product description• Product number• Location of product• Quantity on hand• Product supplier• Quantity ordered• Date product required by• Date of order
• Product number• Supplier name• Supplier address• Supplier order date• Quantity ordered from supplier• Expected delivery date• Approximate cost of order• Product description• Product sell price• Quantity sold• Customer• Invoice Number• Invoice date• Total cost• Sale number• Date of sale• Time of sale• Payment amount• Receipt number
![Page 20: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/20.jpg)
www.sims.monash.edu.au
20
• Each attribute should convey a single fact
– avoid embedding extra information in ranges of values
e.g. Invoice Number
0000-1499 north-east region1500-2999 south-east region3000-4499 central region
Attribute Definition
![Page 21: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/21.jpg)
www.sims.monash.edu.au
21
• Each attribute should convey a single fact– avoid embedding extra information in
identifiers of attributes
e.g. Part Number (9 digits) consists of:
Part Type Code (1)Factory Code (2)
Part Seq Number (4) Year Of Manuf
(2)
Attribute Definition
![Page 22: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/22.jpg)
www.sims.monash.edu.au
22
Normalisation
• Normalisation is a process for converting complex data structures into simple, stable data structures in the form of relations.
• Data models consisting of normalised relations:
>are robust and stable
>have minimum redundancy
>are flexible
>are technology-independent (logical)
![Page 23: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/23.jpg)
www.sims.monash.edu.au
23
• Normalisation ensures that each attribute is attached to the appropriate relation
– each attribute is contained in the relation which represents the real world system object or concept that the attribute describes or is a property ofe.g. the attribute Student Name should be in the relation STUDENT which represents the real world object “student” of interest to a student records system
Normalisation
![Page 24: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/24.jpg)
www.sims.monash.edu.au
24
Normalisation
• Normalisation was originally developed as part of relational database theory by E.F. Codd (1970)
• Normalisation is accomplished in stages, each of which corresponds to a “normal form”
• Originally, Codd defined first, second and third normal forms. Third normal form is adequate for most business applications.
• Later extensions include Boyce-Codd, 4th, 5th and domain-key normal forms.
![Page 25: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/25.jpg)
www.sims.monash.edu.au
25
The Relational Database Model
• The Relational Database Model represents data in the form of tables or relations
• Important concepts are:>relation>primary key>foreign key>functional dependency
![Page 26: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/26.jpg)
www.sims.monash.edu.au
26
Relation
• A relation is a named, two-dimensional table of data
• Each relation consists of a set of named columns and an arbitrary number of rows
• Each column corresponds to an attribute of the relation
• Each row corresponds to an instance (or record) that contains values for that instance
![Page 27: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/27.jpg)
www.sims.monash.edu.au
27
Example Relation
• A relation generally corresponds to some real world object or concept of interest to the system (similar to an entity) e.g.:
Emp# Name Salary Dept
1247
1982
9314
Adams
Smith
Jones
24000
27000
33000
Finance
MIS
Finance
Employee
Employee (Emp#, Name, Salary, Dept)
![Page 28: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/28.jpg)
www.sims.monash.edu.au
28
![Page 29: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/29.jpg)
www.sims.monash.edu.au
29
Properties of Relations
• Relational tables are tables in which:
– data values are atomic (single-valued)
– data values in columns are from the same domain
– each row in the relation is unique
– the sequence of columns is insignificant
– the sequence of rows is insignificant
![Page 30: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/30.jpg)
www.sims.monash.edu.au
30
Primary Key
• An attribute or group of attributes which uniquely identifies a row of a relation
– Employee (Emp#, Name, Salary, Dept)– Order-item (Order#, Item#, Qty-ordered)– Book (Book#, ISBN#, Call#, Copy#, Title, Author)
• Entity integrity (Relational Database theory) requires that each relation has a non-null primary key
![Page 31: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/31.jpg)
www.sims.monash.edu.au
31
Primary Key
• Where several possible keys are identified, they are known as candidate keys - choose one to be the primary key
> stability> meaning> value
eg EMPLOYEE (Emp#, Office#, Name)
PERSON (TaxFile#, Medicare#, Lic#, SSec#)
…or assign a unique key
BOOK (Book#, ISBN, Call#, Title)
![Page 32: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/32.jpg)
www.sims.monash.edu.au
32
Foreign key
• A foreign key is an attribute in one relation that is also a primary key in another relation
• The referential integrity constraint (relational database theory) specifies that if an attribute value exists in one relation then it must also exist in a linked relation
• A foreign key must satisfy referential integrity
![Page 33: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/33.jpg)
www.sims.monash.edu.au
33
Foreign Key
• In the example below, if a given Dept# exists in an Employee relation then that Dept# must exist in the Department relation
Employee (Emp#, Name, Salary, Dept#)
Department (Dept#, Dname, Budget)
foreign key
![Page 34: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/34.jpg)
www.sims.monash.edu.au
34
Functional Dependency
• A functional dependency is a particular relationship between attributes in a relation
• For any relation R, attribute B is functionally dependent on attribute A if each value of A has only ONE value of B associated with it,
i.e. if for every valid instance of A, that value of A uniquely determines the value of B
A identifies B
A B
Emp# Emp-name
Emp# Salary
![Page 35: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/35.jpg)
www.sims.monash.edu.au
35
• Normalisation to Third Normal Form (3NF) is accomplished in 3 steps each corresponding to a basic normal form
• A normal form is a state of a relation that can be determined by applying simple rules concerning dependencies within that relation
• Each step of the normalisation process is applied to a single relation in sequence so that the relation is converted to 3NF
Steps in Normalisation
![Page 36: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/36.jpg)
www.sims.monash.edu.au
36
Steps in Normalisation
First Normal Form 1NF
Second Normal Form
2NF
Third Normal Form 3NF
Unnormalised form UNF
ID Keys, Remove repeating groups
Remove partial dependencies
Remove transitive dependencies
![Page 37: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/37.jpg)
www.sims.monash.edu.au
37
Well Structured Relations
• A well structured relation contains a minimum amount of redundancy and allows users to insert, modify, and delete rows in a table without errors or inconsistencies (known as anomalies)
• Three types of anomaly are possible:
>Insertion >Deletion >Modification
• 3NF relations are considered to be well structured relations
![Page 38: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/38.jpg)
www.sims.monash.edu.au
38
First Normal Form 1NF
• A relation is in 1NF if it contains no repeating data :
the value of the data at the intersection of each row and column must be single-valued
• Remove any repeating groups of attributes to convert a relation to 1NF (key of the removed group is a composite key)
Order (Order#, Customer#, (Item#, Desc, Qty))
Order-Item (Order#, Item#, Desc, Qty)
Order (Order#, Customer#)
![Page 39: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/39.jpg)
www.sims.monash.edu.au
39
Second Normal Form 2NF
• A relation is in 2NF if it is in 1NF and every non-key attribute is fully functionally dependent on the primary key
• A partial dependency exists if one or more non-key attributes are dependent on only part of a composite primary key
• Remove any partial dependencies to convert a 1NF relation to 2NF
• If the primary key consists of only one attribute or there are no non-key attributes, then a 1NF relation is automatically in 2NF
![Page 40: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/40.jpg)
www.sims.monash.edu.au
40
Second Normal Form 2NF
• Remove Partial Dependencies
–a non-key attribute cannot be identified by part of a composite key
Order (Order#, Item#, Desc, Qty-ordered)
Order-Item (Order#, Item#, Qty-ordered)
Item (Item#, Desc)
![Page 41: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/41.jpg)
www.sims.monash.edu.au
41
• If a relation has a composite key, it must be checked for dependencies within the key to determine whether they should be retained e.g.
DEPT (Dept#, Dept-name, (Emp#, Emp-name))
DEPT (Dept#, Dept-name), DEPT-EMP (Dept#, Emp#, Emp-name)
But DEPT-EMP (Dept#, Emp#, Emp-name)
Remove Dept# from the primary key and make it a foreign key
EMP (Emp# ,Emp-name, Dept#)
Primary Key Dependencies
![Page 42: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/42.jpg)
www.sims.monash.edu.au
42
Partial Dependency Anomalies
Order# Item# Item-desc Qty
27
28
28
873
402
873
nut
bolt
nut
2
1
10
Order-Item
30 495 washer 50
UPDATE change Item-desc in many places
DELETE data for last item lost when last order for that item is deleted
CREATE cannot add new item until it is ordered
![Page 43: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/43.jpg)
www.sims.monash.edu.au
43
Order# Item# Qty
27
28
28
873
402
873
2
1
10
Order
30 495 50
Item# Desc
873
402
nut
bolt
495 washer
delete last order for item, but item remains
add new item at any time
change item description in one place only
Item
Partial Dependency Anomalies
![Page 44: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/44.jpg)
www.sims.monash.edu.au
44
Third Normal Form 3NF
• A relation is in 3NF if it is in 2NF and no transitive dependencies exist
• A transitive dependency is a functional dependency between two or more non-key attributes
• Remove any transitive dependencies to convert a 2 NF relation to 3 NF
![Page 45: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/45.jpg)
www.sims.monash.edu.au
45
Third Normal Form 3NF
• Remove transitive dependencies– a non-key attribute cannot be identified by another non
key attribute
Employee (Emp#, Ename, Dept#, Dname)
Employee (Emp#, Ename, Dept#)
Department( Dept#, Dname)
(look for foreign keys and their attributes)
![Page 46: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/46.jpg)
www.sims.monash.edu.au
46
Transitive DependencyAnomalies
Emp# Emp-name Dept# Dname
10
20
25
Smith
Jones
Smith
D5
D7
D7
MIS
Finance
Finance
Employee
30 Black D8 Sales
UPDATE change dept name in many places
DELETE data for dept lost when last employee for that dept is deleted
CREATE cannot add new dept until an employee is allocated to it
![Page 47: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/47.jpg)
www.sims.monash.edu.au
47
Emp# Ename Dept#
10
20
25
Smith
Jones
Smith
D5
D7
D7
Employee
30 Black D8
Dept# Dname
D5
D7
MIS
Finance
D8 Sales
delete last emp in dept, but dept remains
add new dept at any time
change dept name in one place
Department
Transitive DependencyAnomalies
![Page 48: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/48.jpg)
www.sims.monash.edu.au
48
Normalisation to 3NF
• A relation is normalised if all attributes are fully functionally dependent on the primary key
>Remove repeating groups
>Remove partial dependencies
>Remove transitive dependencies
![Page 49: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/49.jpg)
www.sims.monash.edu.au
49
ER Diagrams and Relations
• Transforming an ER diagram into normalised relations, and then merging all the relations into one final, consolidated set of relations can be accomplished in four steps:
1. Represent entities as relations
2. Represent relationships as relations
3. Normalise each relation
4. Merge the relations
![Page 50: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/50.jpg)
www.sims.monash.edu.au
50
Entities and Relations
• Each entity in the ER model is transformed into a relation e.g.
CUSTOMER
Customer (Customer-no, Name, Address, City,
State, Postcode, Discount)
BECOMES
![Page 51: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/51.jpg)
www.sims.monash.edu.au
51
• 1. Binary Relationships (1:N, 1:1)
CUSTOMER ORDER
Customer (Customer-no, Name, Address, City, State, Postcode, Discount)
Order (Order-no, Order-date, Promised-date, Customer-no)
places
Relationships and Relations
![Page 52: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/52.jpg)
www.sims.monash.edu.au
52
1. Binary Relationships (1:N, 1:1)
• For 1:M, add the primary key of the entity on the ‘one’ side of the relationship as a foreign key in the relation that is on the ‘many’ side
• For 1:1 relationship involving entities A and B, choose from
add the primary key of A as a foreign key of B
add the primary key of B as a foreign key of A
both of the above
Relationships and Relations
![Page 53: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/53.jpg)
www.sims.monash.edu.au
53
• 2. Binary and Higher Degree Relationships (M:N)
PRODUCT ORDER
Where we wish to know the quantity of each product on each order, i.e. this attribute is an attribute of the relationship ‘requests’:
requests
Order (Order-no, Order-date, Promised-date)Order Line (Order-no, Product-no, Quantity-ordered)Product (Product-no, description, (other attributes))
Relationships and Relations
![Page 54: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/54.jpg)
www.sims.monash.edu.au
54
Relationships and Relations
• 2. Binary and Higher Degree Relationships (M:N)
For M:N, first create a relation for each for each of the entity types, and then create a relation for the relationship, with a composite primary key formed from the primary keys of the participating entity types.
![Page 55: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/55.jpg)
www.sims.monash.edu.au
55
• 3. Unary Relationships
EMPLOYEEITEM
Has component manages
reports to
is a component of
(1:N)(M:N)
Employee (Emp-id, Name, Birthdate, Manager-id)
Item (Item-no, Name, Cost)Item-Bill (Item-no, Component-no, Quantity)
Relationships and Relations
![Page 56: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/56.jpg)
www.sims.monash.edu.au
56
•4. IS-A Relationship (Class-Subclass or generalisation)
PROPERTY
BEACH PROPERTY
MOUNTAINPROPERTY
Property (Street-address, City-state-postcode, No-rooms, Typical-rent)Beach_Property (Street-address, City-state-postcode , Distance-to-beach)Mountain_Property (Street-address, City-state-postcode , Skiing)
Relationships and Relations
![Page 57: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/57.jpg)
www.sims.monash.edu.au
57
• During the normalisation process two or more relations with the same primary key may appear
• The set of 3NF relations must not contain any duplicate data
• Relations with the same primary key should be merged
Merging Normalised Relations
![Page 58: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/58.jpg)
www.sims.monash.edu.au
58
Normalisation of Relations
• Synonyms• Two or more attributes may have different names
but the same meaning• Either adopt one of the names as a standard or
choose a third nameSTUDENT1 (Student-id, Name, Phone-no)
STUDENT2 (VCE-no, Name, Address)
STUDENT (Student-id, Name, Address, Phone-no)
![Page 59: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/59.jpg)
www.sims.monash.edu.au
59
• Homonyms• Two or more attributes may have the same name but
different meanings• To resolve the conflict, new attribute names need to
be createdSTUDENT1 (Student-id, Name, Address)
STUDENT2 (Student-id , Name, Phone-no, Address)
STUDENT (Student-id, Name, Phone-no, Campus- address, Permanent-address )
Normalisation of Relations
![Page 60: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/60.jpg)
www.sims.monash.edu.au
60
Normalisation of Relations
• Transitive dependenciesWhen two 3NF relations are merged, transitive dependencies may result
STUDENT1 (Student-id, Major)STUDENT2 (Student-id , Advisor)
STUDENT (Student-id, Major, Advisor)
BUT Major Advisor
STUDENT1 (Student-id, Major)MAJOR (Major , Advisor)
![Page 61: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/61.jpg)
www.sims.monash.edu.au
61
Data Structure Diagram DSD
• A set of 3NF relations may be converted to a simple diagrammatic form to begin physical database design
• The conversion is simple:
1. Draw a named rectangle for each relation
2. Draw a relationship line between rectangles linked by foreign keys with a “many”
cardinality at the foreign key end of the relationship
![Page 62: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/62.jpg)
www.sims.monash.edu.au
62
Data Structure Diagram DSD
CUSTOMER (Cust#, Cname,Phone number)
SALES ORDER (Sord#, Sord-date, Cust#)
SALES ORDER-ITEM (Sord#, Item#, Qty)
ITEM (Item#, Item-desc)
CUSTOMER
SALESORDER
ITEM
SALES ORDERLINE
![Page 63: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/63.jpg)
www.sims.monash.edu.au
63
Data Structure Diagram DSD
• Eliminate Redundant Relationships
(Tourcode, ...)
(Tourcode, depdate, ...)
(Booking#, ... , Tourcode, depdate)
redundant
TOUR
DEPARTURE
BOOKING
![Page 64: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/64.jpg)
www.sims.monash.edu.au
64
Summary
• Detailed data modelling involves:– collecting detailed attributes for each entity and
relationship identified
– converting ER models to relations
– normalising the relations
– merging relations from each user viewpoint
– converting the normalised and merged relations to create a data structure diagram
![Page 65: Www.monash.edu.au Detailed Data Modelling: Attribute Collection and Normalisation of Data IMS1002 /CSE1205 Systems Analysis and Design.](https://reader035.fdocuments.in/reader035/viewer/2022070407/56649e235503460f94b10fcc/html5/thumbnails/65.jpg)
www.sims.monash.edu.au
65
Summary
Hoffer, J.A., George, J.F. and Valacich, (1999) 2nd edn., Modern Systems Analysis and Design, Benjamin-Cummings, MA USA.Chapter 16
Whitten, J.L. & Bentley, L.D. and Dittman, K.C., (2001) 5th edn., Systems Analysis and Design Methods, McGraw-Hill Irwin, Burr Ridge, IllinoisChapters 7, 12