Copyright Irwin/McGraw-Hill 1998 1 Data Modeling Prepared by Kevin C. Dittman for Systems Analysis &...


Prepared by Kevin C. Dittman for Systems Analysis & Design Methods 4ed by J. L. Whitten & L. D. Bentley

Copyright Irwin/McGraw-Hill 1998

Data Modeling

An Introduction to Systems Modeling

Systems Modeling

One way to structure unstructured problems is to draw models.

A model is a representation of reality. Just as a picture is worth a thousand words, most system models are pictorial representations of reality.

Models can be built for existing systems as a way to better understand those systems, or for proposed systems as a way to document business requirements or technical designs.

What are Logical Models?

Logical models show what a system ‘is’ or ‘does’. They are implementation-independent; that is, they depict the system independent of any technical implementation. As such, logical models illustrate the essence of the system.


What are Physical Models?

Physical models show not only what a system ‘is’ or ‘does’, but also how the system is physically and technically implemented. They are implementation-dependent because they reflect technology choices, and the limitations of those technology choices.

Systems analysts use logical system models to depict business requirements, and physical system models to depict technical designs.


Data modeling is a technique for defining business requirements for a database.

Data modeling is a technique for organizing and documenting a system’s DATA. Data modeling is sometimes called database modeling because a data model is usually implemented as a database. It is sometimes called information modeling.

Many experts consider data modeling to be the most important of the modeling techniques.

Data is a resource to be shared by as many processes as possible.


CUSTOMER

Customer Number (PK)
Customer Name
Shipping Address
Billing Address
Balance Due

ORDER

Order Number (PK)
Order Date
Order Total Cost
Customer Number (FK)

INVENTORY PRODUCT

Product Number (PK)
Product Name
Product Unit of Measure
Product Unit Price

ORDERED PRODUCT

Ordered Product ID (PK)
. Order Number (FK)
. Product Number (FK)
Quantity Ordered
Unit Price at Time of Order

Relationship labels in the figure: has placed; sold; sold as
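The sample data model above could be implemented as a relational database roughly as follows. This is only a sketch using Python's built-in sqlite3 module: the table names, column names, and column types are assumptions derived from the attribute names in the figure, and ORDERED PRODUCT's identifier is modeled as a composite of its two foreign keys.

```python
import sqlite3

# Hypothetical table and column names, derived from the figure's attribute names.
SCHEMA = """
CREATE TABLE customer (
    customer_number  INTEGER PRIMARY KEY,
    customer_name    TEXT NOT NULL,
    shipping_address TEXT,
    billing_address  TEXT,
    balance_due      REAL DEFAULT 0
);
CREATE TABLE customer_order (
    order_number     INTEGER PRIMARY KEY,
    order_date       TEXT,
    order_total_cost REAL,
    customer_number  INTEGER NOT NULL REFERENCES customer (customer_number)
);
CREATE TABLE inventory_product (
    product_number          INTEGER PRIMARY KEY,
    product_name            TEXT NOT NULL,
    product_unit_of_measure TEXT,
    product_unit_price      REAL
);
CREATE TABLE ordered_product (
    order_number   INTEGER NOT NULL REFERENCES customer_order (order_number),
    product_number INTEGER NOT NULL REFERENCES inventory_product (product_number),
    quantity_ordered            INTEGER,
    unit_price_at_time_of_order REAL,
    PRIMARY KEY (order_number, product_number)  -- assumed composite identifier
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Illustrative data only.
conn.execute("INSERT INTO customer (customer_number, customer_name) VALUES (1, 'Acme')")
conn.execute(
    "INSERT INTO customer_order (order_number, order_date, customer_number) "
    "VALUES (100, '01151998', 1)"
)

# The foreign key lets us navigate from ORDER back to CUSTOMER.
rows = conn.execute(
    "SELECT c.customer_name, o.order_number "
    "FROM customer c JOIN customer_order o "
    "ON o.customer_number = c.customer_number"
).fetchall()
```

Note that the logical model itself says nothing about SQLite or column types; those are physical choices made here purely for illustration.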


System Concepts for Data Modeling

System Concepts

Most systems analysis techniques are strongly rooted in systems thinking.

Systems thinking is the application of formal systems theory and concepts to systems problem solving.


Entities

All systems contain data. Data describes ‘things’.

A concept to abstractly represent all instances of a group of similar ‘things’ is called an entity. An entity is something about which we want to store data. Synonyms include entity type and entity class.

An entity is a class of persons, places, objects, events, or concepts about which we need to capture and store data.

An entity instance is a single occurrence of an entity.

STUDENT

An entity


Attributes

The pieces of data that we want to store about each instance of a given entity are called attributes. An attribute is a descriptive property or characteristic of an entity. Synonyms include element, property, and field.

Some attributes can be logically grouped into super-attributes called compound attributes. A compound attribute is one that actually consists of more primitive attributes. Synonyms in different data modeling languages are numerous: concatenated attribute, composite attribute, and data structure.

STUDENT

Name
. Last Name
. First Name
. Middle Initial
Address
. Street Address
. City
. State or Province
. Country
. Postal Code
Phone Number
. Area Code
. Exchange Number
. Number Within Exchange
Date of Birth
Gender
Race
Major
Grade Point Average

Attributes and compound attributes
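One way to picture compound attributes is as nested records. The sketch below uses Python dataclasses for a subset of the STUDENT attributes; the class names, field names, and the helper `primitive_attributes` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, fields, is_dataclass

@dataclass
class Name:  # compound attribute
    last_name: str
    first_name: str
    middle_initial: str = ""

@dataclass
class Address:  # compound attribute
    street_address: str
    city: str
    state_or_province: str
    country: str
    postal_code: str

@dataclass
class Student:  # entity, with only a subset of the attributes shown above
    name: Name
    address: Address
    major: str

def primitive_attributes(obj, prefix=""):
    """Flatten compound attributes into dotted names of primitive attributes."""
    names = []
    for f in fields(obj):
        value = getattr(obj, f.name)
        if is_dataclass(value):
            names.extend(primitive_attributes(value, prefix + f.name + "."))
        else:
            names.append(prefix + f.name)
    return names

# One entity instance (illustrative values).
student = Student(
    Name("Doe", "Jane", "Q"),
    Address("1 Main St", "Lafayette", "IN", "USA", "47901"),
    "Computer Technology",
)
flat = primitive_attributes(student)
```

Here `flat` lists each primitive attribute once, for example `name.last_name` and `address.city`, which mirrors the dotted indentation used in the figure.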


Attributes - Domains

The values for each attribute are defined in terms of three properties: data type, domain, and default.

• The data type for an attribute defines what class of data can be stored in that attribute.

• For purposes of systems analysis and business requirements definition, it is useful to declare logical (non-technical) data types for our business attributes.

• An attribute’s data type determines its domain.

– The domain of an attribute defines what values an attribute can legitimately take on.

• Every attribute should have a logical default value.

– The default value for an attribute is that value which will be recorded if not specified by the user.


Logical Data Type: Logical Business Meaning

NUMBER: Any number, real or integer.

TEXT: A string of characters, inclusive of numbers. When numbers are included in a TEXT attribute, it means we do not expect to perform arithmetic or comparisons with those numbers.

MEMO: Same as TEXT but of an indeterminate size. Some business systems require the ability to attach a potentially lengthy note to a given database record.

DATE: Any date in any format.

TIME: Any time in any format.

YES/NO: An attribute that can only assume one of these two values.

VALUE SET: A finite set of values. In most cases, a coding scheme would be established (e.g., FR=freshman, SO=sophomore, JR=junior, SR=senior, etc.).

IMAGE: Any picture or image.


Data Type / Domain / Examples

Data Type: NUMBER
Domain: For integers, specify the range: {minimum - maximum}. For real numbers, specify the range and precision: {minimum.precision - maximum.precision}
Examples: {10 - 99}, {1.000 - 799.999}

Data Type: TEXT
Domain: TEXT (maximum size of attribute). Actual values are usually infinite; however, users may specify certain narrative restrictions.
Examples: TEXT (30)

Data Type: MEMO
Domain: Not applicable. There are no restrictions on size or content.
Examples: Not applicable.

Data Type: DATE
Domain: Variation on the MMDDYYYY format. To accommodate the year 2000, do not abbreviate year to YY. Formatting characters are rarely stored; therefore, do not include hyphens or slashes.
Examples: MMDDYYYY, MMYYYY, YYYY

Data Type: TIME
Domain: For AM/PM times: HHMMT, or HHMM
Examples: HHMMT, HHMM
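The domain rules above can be read as simple validity checks. The following is a minimal sketch in Python; the function names are hypothetical, and the DATE check only tests the stored MMDDYYYY shape and rough month and day ranges, not whether the day actually exists in that month.

```python
import re

def in_number_domain(value, minimum, maximum):
    """NUMBER domain {minimum - maximum}: the value must lie within the range."""
    return minimum <= value <= maximum

def in_text_domain(value, max_size):
    """TEXT (max_size): any string of at most max_size characters."""
    return isinstance(value, str) and len(value) <= max_size

def in_date_domain(value):
    """DATE stored as MMDDYYYY: eight digits, four-digit year, no hyphens or slashes."""
    if not re.fullmatch(r"[0-9]{8}", value):
        return False
    month, day = int(value[:2]), int(value[2:4])
    return 1 <= month <= 12 and 1 <= day <= 31

checks = [
    in_number_domain(42, 10, 99),      # inside {10 - 99}
    in_number_domain(100, 10, 99),     # outside {10 - 99}
    in_text_domain("Jane Doe", 30),    # fits TEXT (30)
    in_date_domain("01152000"),        # MMDDYYYY with a four-digit year
    in_date_domain("01/15/00"),        # slashes and a two-digit year are rejected
]
```

In a real project such rules would normally live in the data dictionary or the database itself; the functions here only illustrate what "legitimate values" means for each logical data type.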


Default Value / Interpretation / Examples

Default Value: A legal value from the domain (as described above)
Interpretation: For an instance of the attribute, if the user does not specify a value, then use this value.
Examples: 0, 1.00, FR

Default Value: NONE or NULL
Interpretation: For an instance of the attribute, if the user does not specify a value, then leave it blank.
Examples: NONE, NULL

Default Value: REQUIRED or NOT NULL
Interpretation: For an instance of the attribute, require the user to enter a legal value from the domain. (This is used when no value in the domain is common enough to be a default, but some value must be entered.)
Examples: REQUIRED, NOT NULL
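The three kinds of default can be sketched as a small function that completes a record. Everything named here (`REQUIRED`, `apply_defaults`, `STUDENT_DEFAULTS`, and the attribute names) is hypothetical and only illustrates the interpretation column above.

```python
REQUIRED = object()  # sentinel for REQUIRED / NOT NULL: the user must supply a value

def apply_defaults(record, defaults):
    """Return a copy of record with unspecified attributes filled per the default rules."""
    completed = dict(record)
    for attribute, default in defaults.items():
        if completed.get(attribute) is not None:
            continue  # the user specified a value
        if default is REQUIRED:
            raise ValueError(f"{attribute} is required")
        completed[attribute] = default  # a legal value, or None for NONE / NULL
    return completed

STUDENT_DEFAULTS = {
    "student_number": REQUIRED,  # REQUIRED: no sensible default exists
    "class_standing": "FR",      # a legal value from a VALUE SET domain
    "credits_earned": 0,         # a legal value from a NUMBER domain
    "middle_initial": None,      # NONE / NULL: leave blank
}

completed = apply_defaults({"student_number": 1001}, STUDENT_DEFAULTS)
```

Calling `apply_defaults({}, STUDENT_DEFAULTS)` raises `ValueError`, because the required attribute was not supplied.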


Attributes - Identification

An entity typically has many instances, perhaps thousands or millions, and there exists a need to uniquely identify each instance based on the data value of one or more attributes.

Every entity must have an identifier or key.

• A key is an attribute, or a group of attributes, which assumes a unique value for each entity instance. It is sometimes called an identifier.

Sometimes more than one attribute is required to uniquely identify an instance of an entity.

• A group of attributes that uniquely identifies an instance of an entity is called a concatenated key. Synonyms include composite key and compound key.
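The uniqueness requirement for a key, including a concatenated key, can be checked against sample instances. This is a sketch only: the function `is_key` and the enrollment data are hypothetical, and passing the check on sample data merely shows the data does not contradict the choice of key; whether an attribute is truly a key is a business rule.

```python
def is_key(instances, attributes):
    """True if the attribute combination is non-null and unique across all instances."""
    seen = set()
    for instance in instances:
        value = tuple(instance.get(a) for a in attributes)
        if None in value or value in seen:
            return False
        seen.add(value)
    return True

# Illustrative instances of a hypothetical ENROLLMENT entity.
enrollments = [
    {"student_number": 1001, "course_code": "CPT280", "grade": "A"},
    {"student_number": 1001, "course_code": "CPT380", "grade": "B"},
    {"student_number": 1002, "course_code": "CPT280", "grade": "A"},
]

single = is_key(enrollments, ["student_number"])                       # repeats, so not a key
concatenated = is_key(enrollments, ["student_number", "course_code"])  # unique together
```

Neither attribute identifies an enrollment on its own, but the two together do, which is the situation a concatenated key describes.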


Frequently, an entity may have more than one key. Each of these attributes is called a candidate key.

• A candidate key is a ‘candidate to become the primary identifier’ of instances of an entity. It is sometimes called a candidate identifier. (Note: A candidate key may be a single attribute or a concatenated key.)

• A primary key is that candidate key which will most commonly be used to uniquely identify a single entity instance.

• Any candidate key that is not selected to become the primary key is called an alternate key.


Sometimes, it is also necessary to identify a subset of entity instances as opposed to a single instance.

• For example, we may require a simple way to identify all male students, and all female students.

• A subsetting criterion is an attribute (or concatenated attribute) whose finite values divide all entity instances into useful subsets. Some methods call this an inversion entry.

STUDENT

Student Number (Primary Key 1)
Name (Alternate Key 1)
. Last Name
. First Name
. Middle Initial
Address
. Street Address
. City
. State or Province
. Country
. Postal Code
Phone Number
. Area Code
. Exchange Number
. Number Within Exchange
Date of Birth
Gender (Subsetting Criteria 1)
Race (Subsetting Criteria 2)
Major (Subsetting Criteria 3)
Grade Point Average

Keys and subsetting criteria
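Where a key picks out one instance, a subsetting criterion partitions the instances into groups. A minimal sketch, with a hypothetical helper `subset_by` and made-up student data:

```python
from collections import defaultdict

def subset_by(instances, criterion):
    """Group student numbers by the value of a subsetting criterion."""
    groups = defaultdict(list)
    for instance in instances:
        groups[instance[criterion]].append(instance["student_number"])
    return dict(groups)

# Illustrative STUDENT instances; the values are invented.
students = [
    {"student_number": 1001, "gender": "F", "major": "CPT"},
    {"student_number": 1002, "gender": "M", "major": "CPT"},
    {"student_number": 1003, "gender": "F", "major": "MATH"},
]

by_gender = subset_by(students, "gender")
by_major = subset_by(students, "major")
```

Because the criterion has a finite set of values, the number of subsets is bounded no matter how many instances exist.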


Relationships

Conceptually, entities and attributes do not exist in isolation. Entities interact with, and impact, one another via relationships to support the business mission.

A relationship is a natural business association that exists between one or more entities. The relationship may represent an event that links the entities, or merely a logical affinity that exists between the entities.

A connecting line between two entities on an ERD represents a relationship.

A verb phrase describes the relationship.

• All relationships are implicitly bidirectional, meaning that they can be interpreted in both directions.


STUDENT is enrolled in CURRICULUM

CURRICULUM is being studied by STUDENT


Relationships - Foreign Keys

A relationship implies that instances of one entity are related to instances of another entity.

To be able to identify those instances for any given entity, the primary key of one entity must be migrated into the other entity as a foreign key.

• A foreign key is a primary key of one entity that is contributed to (duplicated in) another entity for the purpose of identifying instances of a relationship. A foreign key (always in a child entity) always matches the primary key (in a parent entity).


CURRICULUM

Program of Study Code (Primary Key)
Title of Program
Type of Degree Awarded (Subsetting Criteria 1)
Department Number (Foreign Key)

DEPARTMENT

Department Number (Primary Key)
Department Name

DEPARTMENT offers CURRICULUM

CURRICULUM is offered by DEPARTMENT
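The DEPARTMENT and CURRICULUM example shows the primary key of the parent migrating into the child as a foreign key. As a sketch of what that matching rule means in practice, the following uses Python's sqlite3 module; the table names, column names, types, and sample rows are assumptions, and SQLite only enforces foreign keys once the `foreign_keys` pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE department (
    department_number INTEGER PRIMARY KEY,
    department_name   TEXT NOT NULL
);
CREATE TABLE curriculum (
    program_of_study_code  TEXT PRIMARY KEY,
    title_of_program       TEXT NOT NULL,
    type_of_degree_awarded TEXT,
    -- foreign key: must match a primary key value in the parent entity
    department_number      INTEGER NOT NULL REFERENCES department (department_number)
);
""")

# Illustrative rows only.
conn.execute("INSERT INTO department VALUES (10, 'Computer Technology')")
conn.execute("INSERT INTO curriculum VALUES ('CPT-BS', 'Computer Technology', 'BS', 10)")

def try_insert_orphan():
    """Attempt to add a curriculum whose department does not exist."""
    try:
        conn.execute("INSERT INTO curriculum VALUES ('XXX-BS', 'Orphan Program', 'BS', 99)")
    except sqlite3.IntegrityError:
        return False  # rejected: no parent DEPARTMENT with number 99
    return True

orphan_accepted = try_insert_orphan()
```

The second curriculum is rejected because its foreign key value has no matching primary key in the parent entity.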


How to Construct Data Models

1st Step - Entity Discovery

The first task in data modeling is to discover those fundamental entities in the system that are or might be described by data.

There are several techniques that may be used to identify entities.

• During interviews with system owners and users, pay attention to key words in their discussion.

• During interviews, specifically ask the system owners and users to identify things about which they would like to capture, store, and produce information.

• Study existing forms and files.

• Some CASE tools can reverse engineer existing files and databases into physical data models.


2nd Step - The Context Data Model

The second task in data modeling is to construct the context data model.

The context data model includes the fundamental or independent entities that were previously discovered.

• An independent entity is one which exists regardless of the existence of any other entity. Its primary key contains no attributes that would make it dependent on the existence of another entity.

• Independent entities are almost always the first entities discovered in your conversations with the users.

Relationships should be named with verb phrases that, when combined with the entity names, form simple business sentences or assertions.

• Always name the relationship from parent-to-child.


Entities in the figure: MEMBER ORDER, PRODUCT, MEMBER, PROMOTION, AGREEMENT, CLUB

Relationship labels in the figure: responds to; is featured in; places; establishes; sponsors; belongs to; sells; binds; generates


3rd Step - The Key-Based Data Model

The third task is to identify the keys of each entity. The following guidelines are suggested for keys:

• The value of a key should not change over the lifetime of each entity instance.

• The value of a key cannot be null.

• Controls must be installed to ensure that the value of a key is valid.


4th Step - Generalized Hierarchies

At this time, it would be useful to identify any generalization hierarchies in a business problem.


5th Step - The Fully Attributed Data Model

The fifth task is to identify the remaining data attributes.

The following guidelines are offered for attribution.

• Many organizations have naming standards and approved abbreviations.

– The data or repository administrator usually maintains such standards.

• Many attributes share common base names such as NAME, ADDRESS, DATE.

– Unless the attributes can be generalized into a supertype, it is best to give each variation a unique name such as:

CUSTOMER NAME vs SUPPLIER NAME

– Names must be distinguishable across projects.

• Logical attribute names should not be abbreviated.


• For attributes that have only YES or NO values, name them as questions.

– For example, CANDIDATE FOR A DEGREE?

• Each attribute should be mapped to only one entity.

– Foreign keys are the exception – they identify associated instances of related entities.

• An attribute’s domain should not be based on logic.


6th Step - The Fully Described Model

The last task is to fully describe the data model.

This task is the most time consuming.

This task can be started in parallel with the key-based model or fully attributed model, but it is usually the last data modeling task completed.

At this time the descriptions for the attributes are still incomplete – they require domains.


Additional descriptive properties may be recorded for attributes, such as:

• Who should be able to create, delete, update, and access each attribute?

• How long should each attribute (or entity) be kept before the data is deleted or archived?


The Next Generation

Data modeling should remain a value-added skill for many years.

The demand for data modeling as a skill is dependent on two factors: (1) the need for databases, and (2) the use of relational database management system technology to implement those databases.

Internet