Dr. Chaitali Basu Mukherji

Transcript of Dr. Chaitali Basu Mukherji

Page 1: Dr. Chaitali Basu Mukherji

Data Modelling

Dr. Chaitali Basu Mukherji

Page 2: Dr. Chaitali Basu Mukherji

Data Modeling Concepts

A data model is simply a diagram that describes the most important “things” in your business environment from a data-centric point of view.

Page 3: Dr. Chaitali Basu Mukherji

Data Model

• for representation of a part of the real world
• it is an abstraction of reality: ignores unnecessary details
• represents operational data about real-world events, entities, activities, etc.
• model may be at various levels depending on requirements:
  – logical or physical
  – external, conceptual, internal

Page 4: Dr. Chaitali Basu Mukherji

Data Model……

• a good model
  – is easy to understand
  – has a few concepts
  – permits top-down specifications
• model offers concepts, constructs and operations
• must capture meaning of data (data semantics), which helps us in interpreting and manipulating data

Page 5: Dr. Chaitali Basu Mukherji

Data Model……

• semantics captured through data types, inter-relationships and data integrity constraints
  – uniqueness
  – existence dependence
  – restrictions on some operations such as insertions, deletions
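The three kinds of integrity constraint listed above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table and column names (department, employee, dept_id) and the sample rows are assumptions, not something taken from the slides.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE department ("
        " dept_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL UNIQUE)"
    )
    conn.execute(
        "CREATE TABLE employee ("
        " emp_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " dept_id INTEGER NOT NULL"
        " REFERENCES department(dept_id) ON DELETE RESTRICT)"
    )
    conn.execute("INSERT INTO department VALUES (1, 'Sales')")
    conn.execute("INSERT INTO employee VALUES (10, 'Asha', 1)")
    return conn

def violates(conn, sql):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

conn = build_demo_db()
# Uniqueness: a second department named 'Sales' is rejected.
print(violates(conn, "INSERT INTO department VALUES (2, 'Sales')"))
# Existence dependence: an employee cannot reference a missing department.
print(violates(conn, "INSERT INTO employee VALUES (11, 'Ravi', 99)"))
# Restriction on deletion: a department that still has employees cannot be deleted.
print(violates(conn, "DELETE FROM department WHERE dept_id = 1"))
```

Each call prints True because the database itself refuses the operation, which is the sense in which the model "captures" the semantics rather than leaving them to application code.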

Page 6: Dr. Chaitali Basu Mukherji

Example : Data Model in a Programming Language (PL)

• data structuring concepts
  – data field/variable
  – data groups and arrays
  – record/structure: unit of file I/O
  – file: collection of records

Page 7: Dr. Chaitali Basu Mukherji

Example : Data Model in a PL …

• Operations
  – file level: open, close
  – record level:
    • read next/random
    • write next/random
  – field level: computations
• no inter-file and inter-record relationships
• no constraints except primary key for indexed files
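A minimal sketch of this record-and-file model, using Python's struct module and an in-memory file. The record layout (a 4-byte integer plus a 20-byte name) and the sample names are assumptions chosen for illustration.

```python
import io
import struct

# Hypothetical record layout: a 4-byte integer key and a 20-byte name field.
RECORD = struct.Struct("<i20s")

def write_next(f, emp_id, name):
    """Record level: write one record at the current file position."""
    f.write(RECORD.pack(emp_id, name.encode("ascii")))

def read_next(f):
    """Record level: read the record at the current position, or None at end of file."""
    data = f.read(RECORD.size)
    if len(data) < RECORD.size:
        return None
    emp_id, raw_name = RECORD.unpack(data)
    # struct pads short strings with zero bytes; strip them again.
    return emp_id, raw_name.rstrip(b"\x00").decode("ascii")

def read_random(f, index):
    """Record level: jump directly to the record at the given position."""
    f.seek(index * RECORD.size)
    return read_next(f)

f = io.BytesIO()  # file level: "open" (in memory, to keep the sketch self-contained)
for emp_id, name in [(1, "Asha"), (2, "Ravi"), (3, "Mina")]:
    write_next(f, emp_id, name)

f.seek(0)
print(read_next(f))       # first record, read sequentially
print(read_random(f, 2))  # third record, fetched directly
f.close()                 # file level: close
```

Nothing in this model stops two records from sharing a key or links a record in one file to a record in another; any such rule would have to live in the application code.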

Page 8: Dr. Chaitali Basu Mukherji

Why We Model

• We build models of complex systems because we cannot comprehend any such system in its entirety
• Need to develop a common understanding of the problem and the solution
• Cannot afford a trial-and-error approach
• to communicate the desired structure and behavior of our systems
• to visualize and control system’s architecture
• to understand the system we are building, often exposing opportunities for simplification and reuse

Page 9: Dr. Chaitali Basu Mukherji

Why We Model

• to manage risk
• The choice of which model we use has a profound influence on how a problem is attacked and how a solution is shaped
• No single model is sufficient; every complex system is best approached through a set of independent models
• Every model may be expressed at different levels of fidelity
• The best models are connected to reality

Page 10: Dr. Chaitali Basu Mukherji

Conceptual Data Modeling

Data can be modeled at many levels, including the conceptual, logical and physical levels.

Conceptual data modeling is a very high-level representation of organizational data. The purpose is to show the basic building blocks for the organization, i.e. the entities and rules about their meaning and interrelationships.

Logical data modeling adds more detail to conceptual modeling, but is still concerned only with how the organization/business uses data.

Physical data modeling adds more detail, but is especially concerned with the actual physical implementation of the data.

Page 11: Dr. Chaitali Basu Mukherji

Gathering Information

Two perspectives
• Top-down
  – Data model is derived from an intimate understanding of the business
• Bottom-up
  – Data model is derived by reviewing specifications and business documents

Page 12: Dr. Chaitali Basu Mukherji

Four Types of Data Models

• Project-Level
  – Current System: E-R diagram for system being replaced
  – Proposed System: Covers just data needed in the project’s application
• Enterprise-Level
  – Current System: An E-R diagram for the whole database from which data for the application system being replaced is drawn
  – Proposed System: An E-R diagram for the whole database from which the new application’s data are extracted

Page 13: Dr. Chaitali Basu Mukherji

Sample conceptual data model diagram

Page 14: Dr. Chaitali Basu Mukherji

Business Principles

Business Principles
• Organizational consensus
• Data integrity
• Implementation efficiency
• User friendliness
• Operational efficiency

IT Principles
• Scalability
• Compliance with IT standards

Page 15: Dr. Chaitali Basu Mukherji

Primary steps

Problem definition
• Do I need a data warehouse?
• What specific problems will it solve?
• What are my available resources (time, money, and personnel)?
• What criteria will I use to measure success?
• Should I outsource all, some, or none of the development and operation?
• Am I upgrading an existing system, converting from a legacy system, or developing from scratch?

Page 16: Dr. Chaitali Basu Mukherji

Primary steps

Requirement analysis
• Group and "bubble-up" requirements.
• Generate a prioritized requirements table listing the requirement, where it came from, the success criteria, and priority. Keep this table high-level. A table with a dozen requirements will be much easier to manage than one with hundreds.
• Produce a detailed development schedule including hardware, software, personnel, documentation, and reviews. Include outsourcing requirements and long lead-time items.
• Get a sign-off of the requirements, resource allocation, and schedule from top management before you go any further.

Page 17: Dr. Chaitali Basu Mukherji

Primary steps

Requirement analysis
• Clearly state the problem(s) you wish to solve.
• Identify all data sources and formats.
• Identify the users of the completed system.
• Formulate a specific budget: time, money, personnel.
• Ask identified users to specifically state what they expect the system to do.
• Ask management to specifically state their success criteria.
• Separate their requirements from their "desirements." Only design to requirements. The enhancement phase is where you address the "desirements."

Page 18: Dr. Chaitali Basu Mukherji

Primary steps

Conceptual design
• Information or data modeling
  – First level: ER Modeling
  – Second level: Normalization
• Design and prototyping
  – Rapid prototyping
  – Structured development

Page 19: Dr. Chaitali Basu Mukherji

Primary steps

• Development and Documentation
• Test and Review
• Deployment and Training
• Operation
• Enhancement
• Help Desk

Page 20: Dr. Chaitali Basu Mukherji

Objectives of ER Model

• Define terms related to entity relationship modeling, including entity, entity instance, attribute, relationship, cardinality, and primary key.
• Describe the entity modeling process.
• Discuss how to draw an entity relationship diagram.
• Describe how to recognize entities, attributes, relationships, and cardinalities.

Page 21: Dr. Chaitali Basu Mukherji

Database Model

A database can be modeled as:
  – a collection of entities,
  – relationships among entities.

Database systems are often modeled using an Entity Relationship (ER) diagram, the output of the design phase, which serves as the "blueprint" for how the actual data is stored.
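The idea of a database as a collection of entities plus relationships among them can be sketched as a small data structure. The class names (Entity, Relationship, DatabaseModel), the School and Student entities, and the attribute names below are assumptions made for illustration only.

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    name: str
    attributes: list = field(default_factory=list)

@dataclass
class Relationship:
    name: str
    source: Entity
    target: Entity

@dataclass
class DatabaseModel:
    entities: list = field(default_factory=list)
    relationships: list = field(default_factory=list)

    def related_to(self, entity_name):
        """Names of the entities linked to the given entity by any relationship."""
        linked = set()
        for rel in self.relationships:
            if rel.source.name == entity_name:
                linked.add(rel.target.name)
            if rel.target.name == entity_name:
                linked.add(rel.source.name)
        return sorted(linked)

# Hypothetical entities and attribute names.
school = Entity("School", ["school_id", "name"])
student = Entity("Student", ["student_id", "name"])
model = DatabaseModel([school, student], [Relationship("enrolls", school, student)])
print(model.related_to("Student"))
```

An ER diagram is a drawing of exactly this kind of structure: one box per entity and one connecting line per relationship.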

Page 22: Dr. Chaitali Basu Mukherji

Entity Relationship Diagram (ERD)

• ER model allows us to sketch database designs
• ERD is a graphical tool for modeling data.
• ERD is widely used in database design
• ERD is a graphical representation of the logical structure of a database
• ERD is a model that identifies the concepts or entities that exist in a system and the relationships between those entities

Page 23: Dr. Chaitali Basu Mukherji

Purposes of ERD

An ERD serves several purposes:
• The database analyst/designer gains a better understanding of the information to be contained in the database through the process of constructing the ERD.
• The ERD serves as a documentation tool.
• Finally, the ERD is used to communicate the logical structure of the database to users.

Page 24: Dr. Chaitali Basu Mukherji

Components of an ERD

An ERD typically consists of four different graphical components:
1. Entity
2. Relationship
3. Cardinality
4. Attribute

Page 25: Dr. Chaitali Basu Mukherji

Classification of Relationship

• Optional Relationship
  – An Employee may or may not be assigned to a Department
  – A Patient may or may not be assigned to a Bed
• Mandatory Relationship
  – Every Course must be taught by at least one Teacher
  – Every Mother has at least one Child

Page 26: Dr. Chaitali Basu Mukherji

Cardinality Constraints

• Express the number of entities to which another entity can be associated via a relationship set.
• Cardinality Constraints: the number of instances of one entity that can or must be associated with each instance of another entity.
• Minimum Cardinality
  – If zero, then optional
  – If one or more, then mandatory
• Maximum Cardinality
  – The maximum number of instances that can be associated
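One way to make the minimum/maximum rule concrete is a small value class. This is a sketch under assumptions: the class name Cardinality, the use of None for "many", and the maximum of one Bed per Patient are illustrative choices, not taken from the slides.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Cardinality:
    minimum: int            # 0 means optional, 1 or more means mandatory
    maximum: Optional[int]  # None stands for "many" (no upper bound)

    @property
    def is_optional(self):
        return self.minimum == 0

    @property
    def is_mandatory(self):
        return self.minimum >= 1

    def allows(self, count):
        """True if an instance linked to `count` partners satisfies the constraint."""
        if count < self.minimum:
            return False
        return self.maximum is None or count <= self.maximum

# A Patient may or may not be assigned to a Bed (maximum of one is assumed here).
bed_assignment = Cardinality(minimum=0, maximum=1)
# Every Course must be taught by at least one Teacher.
course_teachers = Cardinality(minimum=1, maximum=None)

print(bed_assignment.is_optional, bed_assignment.allows(2))
print(course_teachers.is_mandatory, course_teachers.allows(0))
```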

Page 27: Dr. Chaitali Basu Mukherji

Cardinality Constraints (Contd.)

• For a binary relationship set the mapping cardinality must be one of the following types:
  – One to one
    • A Manager heads one Department and vice versa
  – One to many (or many to one)
    • An Employee works in one Department, or one Department has many Employees
  – Many to many
    • A Teacher teaches many Students and a Student is taught by many Teachers
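The three mapping types can be told apart by counting partners on each side of the relationship. The function below is a rough sketch that infers the type from sample instance pairs; the function name and the sample data are assumptions. In practice the cardinality is a business rule, so sample data can only show what has been observed so far.

```python
from collections import defaultdict

def mapping_cardinality(pairs):
    """Classify a binary relationship from its (a, b) instance pairs."""
    b_per_a = defaultdict(set)
    a_per_b = defaultdict(set)
    for a, b in pairs:
        b_per_a[a].add(b)
        a_per_b[b].add(a)
    a_has_many = any(len(bs) > 1 for bs in b_per_a.values())
    b_has_many = any(len(as_) > 1 for as_ in a_per_b.values())
    if a_has_many and b_has_many:
        return "many to many"
    if a_has_many:
        return "one to many"
    if b_has_many:
        return "many to one"
    return "one to one"

# Hypothetical instance data for the three examples.
heads = [("M1", "Sales"), ("M2", "HR")]                    # Manager heads Department
works_in = [("E1", "Sales"), ("E2", "Sales"), ("E3", "HR")]  # Employee works in Department
teaches = [("T1", "S1"), ("T1", "S2"), ("T2", "S1")]       # Teacher teaches Student

print(mapping_cardinality(heads))
print(mapping_cardinality(works_in))
print(mapping_cardinality(teaches))
```

Read from the Employee side, works_in is many to one; swapping each pair to (department, employee) gives the one to many reading of the same relationship.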

Page 28: Dr. Chaitali Basu Mukherji

Cardinality Constraints (Contd.)

Page 29: Dr. Chaitali Basu Mukherji

Cardinality Constraints Example

• In our model, we wish to indicate that each school may enroll many students, or may not enroll any students at all.

• We also wish to indicate that each student attends exactly one school. The following diagram indicates this optionality and cardinality:

Page 30: Dr. Chaitali Basu Mukherji

Cardinality Constraints Example (Contd.)

[Diagram: SCHOOL and STUDENT entities joined by a relationship line with cardinality marks]

• Each school enrolls at least zero and at most many students
• Each student attends at least one and at most one school
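The two rules in the School/Student example can be checked against instance data. The sketch below is illustrative only: the function name, the school names, and the student names are assumptions.

```python
from collections import Counter

def check_enrollment(schools, students, attends):
    """Return a list of rule violations for the School/Student example.

    `attends` is a list of (student, school) pairs.
    """
    problems = []
    schools_per_student = Counter(student for student, _ in attends)
    for student in students:
        n = schools_per_student[student]
        if n != 1:  # each student attends at least one and at most one school
            problems.append(f"{student} attends {n} schools, expected exactly 1")
    for _, school in attends:
        if school not in schools:
            problems.append(f"unknown school {school}")
    # No check on students per school: zero, one, or many are all allowed.
    return problems

schools = {"North High", "South High"}
students = ["Asha", "Ravi", "Mina"]
attends = [("Asha", "North High"), ("Ravi", "North High")]

print(check_enrollment(schools, students, attends))
```

South High enrolls no students and raises no complaint, because the school side of the relationship is optional; Mina is reported because the student side is mandatory.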

Page 31: Dr. Chaitali Basu Mukherji

Developing an ERD

The process has ten steps:
1. Identify Entities
2. Find Relationships
3. Draw Rough ERD
4. Fill in Cardinality
5. Define Primary Keys
6. Draw Key-Based ERD
7. Identify Attributes
8. Map Attributes
9. Draw Fully Attributed ERD
10. Check Results

Page 32: Dr. Chaitali Basu Mukherji

A Simple Example

A company has several departments. Each department has a supervisor and at least one employee. Employees must be assigned to at least one, but possibly more departments. At least one employee is assigned to a project, but an employee may be on vacation and not assigned to any projects. The important data fields are the names of the departments, projects, supervisors and employees, as well as the supervisor and employee number and a unique project number.

Page 33: Dr. Chaitali Basu Mukherji

Identify entities

• One approach to this is to work through the information and highlight those words which you think correspond to entities.

• A company has several departments. Each department has a supervisor and at least one employee. Employees must be assigned to at least one, but possibly more departments. At least one employee is assigned to a project, but an employee may be on vacation and not assigned to any projects. The important data fields are the names of the departments, projects, supervisors and employees, as well as the supervisor and employee number and a unique project number.

• A true entity should have more than one instance

Page 34: Dr. Chaitali Basu Mukherji

Find Relationships

• Aim is to identify the associations, the connections between pairs of entities.

• A simple approach to do this is using a relationship matrix (table) that has rows and columns for each of the identified entities.

Page 35: Dr. Chaitali Basu Mukherji

Find Relationships (Contd.)

• Go through each cell and decide whether or not there is an association. For example, the first cell on the second row is used to indicate if there is a relationship between the entity "Employee" and the entity "Department".

Page 36: Dr. Chaitali Basu Mukherji

Identified Relationships

Names placed in the cells are meant to capture/describe the relationships. So you can use them like this:
• A department is assigned an employee
• A department is run by a supervisor
• An employee belongs to a department
• An employee works on a project
• A supervisor runs a department
• A project uses an employee
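The relationship matrix can be held as a nested dictionary and read back row by row. The matrix itself does not survive in this transcript, so the cell contents below are reconstructed from the list of identified relationships above; the helper names are assumptions.

```python
# Relationship matrix: row entity -> column entity -> relationship name.
# Pairs with no association are simply left out.
MATRIX = {
    "department": {"employee": "is assigned", "supervisor": "is run by"},
    "employee": {"department": "belongs to", "project": "works on"},
    "supervisor": {"department": "runs"},
    "project": {"employee": "uses"},
}

def article(noun):
    """Pick 'a' or 'an' for a lowercase noun."""
    return "an" if noun[0] in "aeiou" else "a"

def read_matrix(matrix):
    """Turn every filled cell into a sentence of the form 'A <row> <name> a <column>'."""
    sentences = []
    for row, cells in matrix.items():
        for column, name in cells.items():
            sentences.append(
                f"{article(row).capitalize()} {row} {name} {article(column)} {column}"
            )
    return sentences

for sentence in read_matrix(MATRIX):
    print(sentence)
```

Reading each filled cell aloud in this way is a quick check that the chosen relationship names make sense in both directions.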

Page 37: Dr. Chaitali Basu Mukherji

Draw Rough ERD

Draw a diagram and:
• Place all the entities in rectangles
• Use diamonds and lines to represent the relationships between entities.
• General Examples

Page 38: Dr. Chaitali Basu Mukherji

Drawing Rough ERD (Contd.)

Page 39: Dr. Chaitali Basu Mukherji

Drawing Rough ERD (Contd.)

Page 40: Dr. Chaitali Basu Mukherji

Drawing Rough ERD (Contd.)

Page 41: Dr. Chaitali Basu Mukherji

Fill in Cardinality

• Supervisor
  – Each department has one supervisor.
• Department
  – Each supervisor has one department.
  – Each employee can belong to one or more departments
• Employee
  – Each department must have one or more employees
  – Each project must have one or more employees
• Project
  – Each employee can have 0 or more projects.

Page 42: Dr. Chaitali Basu Mukherji

Fill in Cardinality (Contd.)

The cardinality of a relationship can only have the following values:
  – One and only one
  – One or more
  – Zero or more
  – Zero or one
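Because only four values are allowed, an enumeration is a natural fit. The sketch below encodes each value as a (minimum, maximum) pair and applies it to the company example; the names CardinalityValue, EXAMPLE, and describe are assumptions, and "one supervisor" is read here as "one and only one".

```python
from enum import Enum

class CardinalityValue(Enum):
    ONE_AND_ONLY_ONE = (1, 1)
    ONE_OR_MORE = (1, None)   # None stands for "many"
    ZERO_OR_MORE = (0, None)
    ZERO_OR_ONE = (0, 1)

    @property
    def minimum(self):
        return self.value[0]

    @property
    def maximum(self):
        return self.value[1]

# (subject, related entity) -> how many of the related entity each subject has
EXAMPLE = {
    ("department", "supervisor"): CardinalityValue.ONE_AND_ONLY_ONE,
    ("supervisor", "department"): CardinalityValue.ONE_AND_ONLY_ONE,
    ("employee", "department"): CardinalityValue.ONE_OR_MORE,
    ("department", "employee"): CardinalityValue.ONE_OR_MORE,
    ("project", "employee"): CardinalityValue.ONE_OR_MORE,
    ("employee", "project"): CardinalityValue.ZERO_OR_MORE,
}

def describe(subject, related):
    """Spell out the minimum and maximum for one direction of a relationship."""
    c = EXAMPLE[(subject, related)]
    upper = "many" if c.maximum is None else str(c.maximum)
    return f"Each {subject} has at least {c.minimum} and at most {upper} {related}(s)"

print(describe("employee", "project"))
print(describe("department", "supervisor"))
```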

Page 43: Dr. Chaitali Basu Mukherji

Cardinality Notation

Page 44: Dr. Chaitali Basu Mukherji

Cardinality Examples

[Four diagrams, each joining entity A to entity B with a different cardinality notation]

• Each instance of A is related to a minimum of zero and a maximum of one instance of B
• Each instance of B is related to a minimum of one and a maximum of one instance of A
• Each instance of A is related to a minimum of one and a maximum of many instances of B
• Each instance of B is related to a minimum of zero and a maximum of many instances of A

Page 45: Dr. Chaitali Basu Mukherji

ERD with cardinality

Page 46: Dr. Chaitali Basu Mukherji

Examples

Page 47: Dr. Chaitali Basu Mukherji

ERD for Course Enrollment

Page 48: Dr. Chaitali Basu Mukherji

ERD for Course Registration

Page 49: Dr. Chaitali Basu Mukherji

Rough ERD Plus Primary Keys

Page 50: Dr. Chaitali Basu Mukherji

Identify Attributes

• In this step we try to identify and name all the attributes essential to the system we are studying without trying to match them to particular entities.
• The best way to do this is to study the forms, files and reports currently kept by the users of the system and circle each data item on the paper copy.

Page 51: Dr. Chaitali Basu Mukherji

Identify Attributes

• Cross out those which will not be transferred to the new system, extraneous items such as signatures, and constant information which is the same for all instances of the form (e.g. your company name and address). The remaining circled items should represent the attributes you need. You should always verify these with your system users. (Sometimes forms or reports are out of date.)
• The only attributes indicated are the names of the departments, projects, supervisors and employees, as well as the supervisor and employee number and a unique project number.

Page 52: Dr. Chaitali Basu Mukherji

Map Attributes

• Each attribute must be matched with exactly one entity. Often it seems like an attribute should go with more than one entity (e.g. Name). In this case you need to add a modifier to the attribute name to make it unique (e.g. Customer Name, Employee Name, etc.) or determine which entity the attribute "best" describes.

• If you have attributes left over without corresponding entities, you may have missed an entity and its corresponding relationships. Identify these missed entities and add them to the relationship matrix now.
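The mapping step can be sketched as a lookup from attribute to entity, with anything unmatched reported as left over. The function name and the assignment below are assumptions based on the attributes named in the example; "Office Location" is a hypothetical extra attribute, added only to show what a left-over item looks like.

```python
def map_attributes(attributes, assignment, entities):
    """Match every attribute to exactly one entity.

    `assignment` maps attribute name -> entity name, so an attribute
    cannot be given to two entities at once.
    Returns (attributes per entity, attributes left over).
    """
    per_entity = {entity: [] for entity in entities}
    left_over = []
    for attribute in attributes:
        entity = assignment.get(attribute)
        if entity in per_entity:
            per_entity[entity].append(attribute)
        else:
            left_over.append(attribute)  # may point to a missed entity
    return per_entity, left_over

entities = ["Department", "Employee", "Supervisor", "Project"]
assignment = {
    "Department Name": "Department",
    "Employee Number": "Employee",
    "Employee Name": "Employee",
    "Supervisor Number": "Supervisor",
    "Supervisor Name": "Supervisor",
    "Project Number": "Project",
    "Project Name": "Project",
}
attributes = list(assignment) + ["Office Location"]  # hypothetical unmatched attribute

per_entity, left_over = map_attributes(attributes, assignment, entities)
print(per_entity["Employee"])
print(left_over)
```

A non-empty left-over list is the signal described above: either an entity was missed or the attribute does not belong in the system.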

Page 53: Dr. Chaitali Basu Mukherji

Map Attributes (Contd.)

Page 54: Dr. Chaitali Basu Mukherji

Draw Fully Attributed ERD

Page 55: Dr. Chaitali Basu Mukherji

Check ERD Results

• Look at your diagram from the point of view of a system owner or user. Is everything clear?

• Check through the cardinality pairs.
• Also, look over the list of attributes associated with each entity to see if anything has been omitted.