Logical Data Modeling Guide


Page 1: Logical Data Modeling Guide

The data models are maintained in the shared folder SHARED\MODELS on server ev010dataaegm.

Revision: 1.0 04/18/2009 Owner:

IS THIS THE LATEST REVISION OF THIS DOCUMENT?

Before proceeding, please verify that you are using the latest approved revision of this document. If you are not sure how to do this, please ask your Enterprise Data Architect.

Revision Information

Modification Log:

Date | Revision # | Person Updating | Section: Brief Description of Issue

Table of Contents

1 INTRODUCTION 4

2 LOGICAL DATA MODELING 4

2.1 Project Deliverables Required for Logical Data Modeling 4

2.1.1 Technical Architecture Document (TAD) 4

2.1.2 Functional Specifications (FS) 4

2.1.3 Business Requirements Document (BRD) 4

2.1.4 Data Requirements Document (DRD) 4

2.2 Conceptual Data Model 5

2.3 Logical Data Model – General Guidelines 6

2.4 Additional Guidelines for Data Marts 6

2.4.1 Dimensional Data Modeling Guide 6

2.5 Model Naming Standard 6

2.6 Entities 7

2.6.1 Creating New Entities 7

2.6.2 Entity Naming 7

2.6.3 Entity Definition 8

2.6.4 Primary Key & Alternate Keys 9

2.6.5 Foreign Key Relationship 10

2.7 Attributes 10

2.7.1 Attribute Naming 10

2.7.2 Attribute Data Types 12

2.7.3 Attribute Constraints 12

2.7.4 Attribute Definition 12

2.8 Logical Data Model Review Checklist 13

3 ABBREVIATIONS 13

4 ORACLE 10G RESERVED WORDS 13

5 APPENDIX A - GLOSSARY 13

1 Introduction

In the Logical Data Modeling section, this document describes the inputs a data modeler needs from the project team (business analyst) to begin logical modeling, the naming standards for entities and attributes, the standards for identifying and creating keys for an entity, and the standards for creating foreign key relationships. It also provides a checklist that the data modeler uses to verify the data model before it is taken to a model review.

2 Logical Data Modeling

2.1 Project Deliverables Required for Logical Data Modeling

To ensure the success of logical data modeling, please make sure that the following deliverables are provided:

Technical Architecture Document (TAD)

Functional Specifications (FS)

Business Requirements Document (BRD)

Data Requirements Document (DRD)

Completed Source Inventory file

2.1.1 Technical Architecture Document (TAD)

See the link to the standard TAD template:

http://supportcentral.XX.com/*TAD

2.1.2 Functional Specifications (FS)

See the link to the standard FS template:

http://libraries.XX.com/download?fileid=22710683101&entity_id=3209860101&sid=101

2.1.3 Business Requirements Document (BRD)

The BRD must provide data requirements that include, but are not limited to, all source data, reporting, application and extract requirements. One output of the BRD will be the documentation of the source data structures using SourceInventoryTemplate.xls. This template is a working document that starts with the BRD and is refined in the DRD.

2.1.4 Data Requirements Document (DRD)

All data requirements needed to satisfy the Functional Specifications and Business Requirements, including the reporting requirements, must be identified in the DRD. The DRD is a more detailed refinement of the data requirements that may already have been addressed in the FS and the BRD. In some instances, many of these requirements will already be identified, to some level of granularity, in the BRD. The DRD can stand alone or be referenced in a section of the BRD. Often, not all the data requirements are sorted out at the time the BRD is finalized. The DRD is the primary responsibility of the Data Analyst (DA); however, this may be a shared responsibility of the DA, BA, DM and/or the Data Architect, with the guidance of the Business.

A template for data requirements is provided at DRD Template.doc

The DRD will define:

All sources and inputs to the project

Technical and business definitions for all data elements

All the outputs or extracts (if any) that will feed other systems or that will be downloaded to local applications, including those that may be used to populate Excel spreadsheets on the user's desktop.

Dimensions and measures will be identified as they relate to reporting requirements

Key subject areas will be identified in a conceptual data model, e.g., Customer, Location, Shipments, and Orders. One can use these to create a conceptual data model of the data.

Data Requirements Definition. The data requirement has to be defined in an appropriate format. For data integration projects, please document the source data structures using SourceInventoryTemplate.xls.

2.2 Conceptual Data Model

If the complexity of the requirements is high, a conceptual data model needs to be created before attempting to derive a logical data model.

A conceptual data model has the following features:

It includes the important entities and the relationships among them.

No attribute is specified.

No primary key is specified.

At this level, the data modeler attempts to identify the highest-level relationships among the different entities.

The conceptual model can be created with ERwin. BPwin may be a superior tool for conceptual modeling, but tool evaluation is out of scope for this effort. We will evaluate the tool and provide guidance later. Until then, please feel free to use ERwin for conceptual model design.

For an overview of how to use BPwin, please refer to the presentation.

http://data.supportcentral.XX.com/upload/17786/doc_1177818.ppt

2.3 Logical Data Model – General Guidelines

The intent of the Logical Data Model is to explicitly document a comprehensive understanding of the data to be stored and delivered in an application. In the logical data model, the data modeler attempts to describe the data in as much detail as possible, without regard to how it will be physically implemented in the database.

The logical data model has to emphasize the following:

The logical data model has to be at least in 3rd normal form.

A separate subject area has to be created to group entities that belong to a unique business function. When creating the subject area, please bring all the first-level parent entities into the subject area.

Includes all entities and relationships among them.

All attributes for each entity are specified.

Audit columns are not required in the logical data model. Create them as Physical ONLY columns.

The primary key for each entity is specified.

Foreign keys (keys identifying the relationship between different entities) are specified.

Please refer to the model snippets in sections 2.6.2 and 2.7.1 for examples of how entities (tables) and attributes (columns) should be named.

For a detailed discussion of design standards for entities, attributes and relationships, please refer to the Entities and Attributes sections of this document.

2.4 Additional Guidelines for Data Marts

2.4.1 Dimensional Data Modeling Guide

Dimensional data models are useful for representing star schemas, that is, fact tables to be used for reporting purposes. While this is not a function of logical modeling, some known facts and dimensions may be started at this stage. It is the job of the physical modeler, however, to take the suggestions, if any, and add to or finalize the fact tables in the physical model. Dimensional models are often used to depict structures used in data marts. The PMM should follow the dimensional data modeling methodology. The Dimensional Data Modeling Guide provides the information and the TSG standards for dimensional data modeling that the PMM should follow:

http://data.supportcentral.XX.com/upload/17786/doc_690968.doc

2.5 Model Naming Standard

The data model must be named per the following naming convention:

SSS + XXX + v.<n>.erwin

where:

SSS = Schema

XXX = DEV, QA, PROD (development, QA, production)

n = Model version number as in Model Mart (starting with 1)

All sections must be separated by an underscore ("_").

E.g., BDW_QA_v.2.erwin

For work-in-progress models, you may add a date after the Version number to track the changes in the model. But you should remove the date part upon finalizing the model.
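The convention above can be checked mechanically. The following is a minimal Python sketch, not part of the standard itself: the function names are hypothetical, the schema part is assumed to consist of letters and digits only, and the optional work-in-progress date suffix is deliberately not handled.

```python
import re

# Environments listed in the naming standard.
ENVIRONMENTS = ("DEV", "QA", "PROD")

# Assumption: the schema part contains letters and digits only.
MODEL_NAME = re.compile(
    r"^(?P<schema>[A-Za-z0-9]+)_(?P<env>DEV|QA|PROD)_v\.(?P<version>[1-9][0-9]*)\.erwin$"
)


def build_model_name(schema: str, env: str, version: int) -> str:
    """Build a model file name such as BDW_QA_v.2.erwin."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"environment must be one of {ENVIRONMENTS}")
    if version < 1:
        raise ValueError("version numbers start with 1")
    name = f"{schema}_{env}_v.{version}.erwin"
    if not MODEL_NAME.match(name):
        raise ValueError(f"schema {schema!r} must be letters and digits only")
    return name


def is_valid_model_name(name: str) -> bool:
    """Return True if the file name follows SSS_XXX_v.<n>.erwin."""
    return MODEL_NAME.match(name) is not None
```

With these assumptions, build_model_name("BDW", "QA", 2) yields the example name from this section, and a finalized model whose name still carries a date suffix fails is_valid_model_name.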

2.6 Entities

2.6.1 Creating New Entities

To speed up creation of new entities, copy the ‘template entity’ from the template data model. This entity has all of the necessary settings pre-defined. You can always override the settings if required.

OLTP Data Model ERwin Template: OLTP Template V3.erwin

Dimensional Data Model ERwin Template: Dimensional Template v4.erwin

2.6.2 Entity Naming

Entity names must be meaningful and must not conflict with other entity names. The entity name should be self-explanatory and should reflect the correct business meaning.

Entity names should not be prefixed with the schema name and should use spaces, not underscores (e.g., SALES ORDER LINE).

Entity name should not be too generic with the exception of a super type entity (E.g.: An entity for a sales transaction cannot be called “Transaction”, it must be called “Sales Order”).

Avoid abbreviations in the entity name unless they are commonly used in the business community.

When abbreviations are used, the full words must be obvious; in other words, only standard abbreviations should be used. For XX standard abbreviations, see the Abbreviations section.

There is no restriction to the length of the entity name but unusually long names should not be used. The entity name should be long enough to describe the business meaning of the entity.

Entities should be represented by singular names.

Entity names should be mixed case (i.e., lowercase with leading caps) with spaces between the words (e.g., “Customer Name”).

Special characters except spaces should not be used in entity names.
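Some of the entity naming rules above lend themselves to an automated check. The following Python sketch is illustrative only; the function name and the messages are hypothetical, and rules that need human judgment (meaningful names, singular form, standard abbreviations) are not covered.

```python
import re

# Letters, digits and single spaces only; no underscores or other special characters.
ALLOWED = re.compile(r"^[A-Za-z0-9]+( [A-Za-z0-9]+)*$")


def entity_name_issues(name: str, schema: str = "") -> list[str]:
    """Return the naming rules from this section that the name appears to break."""
    issues = []
    if "_" in name:
        issues.append("use spaces, not underscores")
    elif not ALLOWED.match(name):
        issues.append("no special characters other than spaces")
    if schema and name.upper().startswith(schema.upper() + " "):
        issues.append("do not prefix with the schema name")
    for word in name.split():
        # Mixed case: each word starts with a capital letter.
        # All-caps words are accepted because they may be abbreviations.
        if word[0].isalpha() and not word[0].isupper():
            issues.append(f"word {word!r} should start with a capital letter")
    return issues
```

An empty result means that none of the mechanical rules were violated; it does not mean the name is a good business name.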

The following model snippet is an example of good entity naming.

[Figure: ERwin model snippet. Relationship verb phrases shown in the diagram: contains; classifies; describes; is charged on; is the child of; is associated with. For each entity, the key attributes are listed first, followed by the remaining attributes.]

INVOICE

Key: Invoice Num

Attributes: Invoice Created Dttm; Invoice Description Txt; Host Invoice Num; Invoice Type Cd (FK); Invoice Transmittal Mode Cd (FK); Consolidated Invoice Id (FK); Invoice Source Type Cd (FK); Invoice Status Object Id (FK); Invoice Desc2 Txt; Invoice Desc3 Txt; Collection District Cd

INVOICE AMOUNT

Key: Invoice Amount Type Cd (FK); Sales Invoice Line Id (FK); Invoice Amount Dttm; Sales Invoice Num (FK)

Attributes: Invoice Amount Global Amt; Invoice Amount Transaction Amt; Invoice Amount Local Amt; Invoice Amount Currency Cd (FK)

SALES ORDER LINE

Key: Sales Order Id (FK); Sales Order Line Num

Attributes: Parent Sales Order Id (FK); Parent Sales Order Line Num (FK); Sales Order Line Create Dttm; Sales Order Line Item Id (FK); Sales Order Line Item Qty; Sales Order Line Qty UOM Cd (FK); Sales Order Line Unit Standard Price Amt; Sales Order Line Unit Average Cost Amt; Sales Order Line Unit Price Amt; Sales Order Line Currency Cd (FK); Sales Order Line Forecasted Revenue Dttm; Sales Order Line Ship To Party Id (FK); Sales Order Line Ship To Address Id (FK); Sales Order Line Contract Id (FK); Sales Order Line Sales Associate Id (FK); Sales Order Line Web Visit Id (FK); Sales Order Line Page View Sequence Num (FK); Sales Order Line Requested Ship To Location Id (FK); Sales Order Line Status Object Id (FK); Billing Rates Entered Flag; Estimated Final Cost Margin; Order Entered By Name Txt; Partial Billing Flag; Quote Created By Name; Verbal PO Flag; PO Received Flag; PO Hard Copy Received Flag; Pricing Basis Type Cd (FK); Requestor Telephone Number Id (FK); Sales Order Line Extended Global Amt; Sales Order Line Extended Local Amt; Ship To Attention; Ship To Contact Name; Sales Order Line Ship To Telephone Id (FK); Sales Order Line Ship To Fax Id (FK); Sales Order Line Ship To Email Address Id (FK); Warranty Reason Id (FK); Forced Outage Flag

INVOICE TYPE

Key: Invoice Type Cd

Attributes: Invoice Type Desc

SALE INVOICE ORDER LINE

Key: Sales Invoice Num (FK); Sales Invoice Line Id (FK); Sales Order Line Num (FK); Sales Order Id (FK)

Attributes: Sales Invoice Line Order Line Qty; Sales Invoice Line Order Line UOM Cd (FK)

SALES INVOICE LINE

Key: Sales Invoice Num (FK); Sales Invoice Line Id

Attributes: Sales Invoice Line Item Id (FK); Sales Invoice Line Desc; Host Invoice Line Num; Sales Invoice Line Type Cd (FK); Sales Invoice Line Status Object Id (FK); Bill To Intraco Billing Cd

SALE INVOICE

Key: Sales Invoice Num (FK)

Attributes: Sale Invoice Type Cd (FK); Sale Invoice Reason Type Cd (FK); Sale Invoice Ledger Batch Id (FK); Sales Invoice Chart Of Account Id (FK); Invoice Comment Line2 Txt (FK); Invoice Comment Line3 Txt (FK); Account Distribution Number Segment Id (FK); GL Account Overwrite Transaction Type Cd (FK); Alternate Invoice Account (FK)

INVOICE AMOUNT TYPE

Key: Invoice Amount Type Cd

Attributes: Invoice Amount Type Desc

2.6.3 Entity Definition

The entity definition must be at least one complete sentence with a meaningful textual description of the meaning or use of the entity, up to a maximum of 4000 characters.

Simply repeating the full entity name is not an acceptable definition.

The definition should help developers and users better understand the application and its functionality.

Refrain from giving a dictionary definition for the entity. Dictionary definitions can serve as a good starting point, but they do not put the concept in the perspective of the application or the functional use of the data.

Do not use abbreviations in the definition. The individuals reading these manuals may not know the meanings of abbreviations or acronyms that are common in the development environment.

For more details, refer to the attached document.

The definition entered in the logical Entity Editor will automatically appear on the physical table as comments.
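The measurable parts of these definition rules can be verified before a model review. The following Python sketch is one possible approach, not a prescribed tool: the function names are hypothetical, and the test for a "complete sentence" is only a rough heuristic (at least three words and closing punctuation). The same check could be applied to attribute definitions.

```python
MAX_DEFINITION_LENGTH = 4000  # limit stated in this guide


def _normalize(text: str) -> str:
    """Lowercase the text and keep only letters and digits."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def definition_issues(name: str, definition: str) -> list[str]:
    """Return the definition rules that the text appears to break."""
    issues = []
    text = definition.strip()
    if len(text) > MAX_DEFINITION_LENGTH:
        issues.append("longer than 4000 characters")
    # Heuristic for "at least one complete sentence".
    if len(text.split()) < 3 or not text.endswith((".", "!", "?")):
        issues.append("not a complete sentence")
    # Catch definitions that only repeat the entity or attribute name.
    if _normalize(text) == _normalize(name):
        issues.append("only repeats the name")
    return issues
```

Whether a definition avoids abbreviations and reflects the functional use of the data still has to be judged by the reviewer.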

The following is a good example. It defines an entity called “Organization” which is a subtype of the entity called “Party” in a typical Manufacturing domain.

____________________________________________________________________

In the context of XX Energy the organization is used to track details of internal department and customer. SOURCE: GIB, PLP, AND COPICS

Description:

This is a subtype entity to PARTY. An organization is any group of individuals (or one individual) or other organizations formed for a specific purpose. An organization can be either an internal organization or an external organization.

This definition does not include groups of individuals that form a segment, nor does it include households (a grouping of individuals created by the enterprise for marketing purposes.)

____________________________________________________________________

2.6.4 Primary Key & Alternate Keys

Each entity should have a unique primary key.

Each entity should have at least one candidate key identified. A candidate key is an attribute or a set of attributes that can uniquely identify a record in the entity. Among the identified candidate keys, select the most appropriate one (the candidate key that is most static over its life) as the primary key. All the other candidate keys must be created as Alternate Keys. Do not define the primary key also as an Alternate Key in the data model.

All the Alternate Keys will result in UNIQUE indexes in the physical data model.

If no candidate primary key is available, discuss it with the EDA. One alternative is to create a new attribute using an Oracle-generated sequence number in the format: “LogicalEntityName Seq Id” (follow the attribute naming standards discussed below). Primary key attributes will be set to “NOT NULL”.

For Oracle-generated sequence number attributes, please use “Sequence Identifier” as the domain.

Sequence IDs are preferred to multi-column keys in most cases, especially if the entity has many child tables through relationships.
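The key rules in this section can be summarized as a small decision procedure. The Python sketch below is illustrative only: the class and function names are hypothetical, the candidate keys are assumed to be listed by the modeler with the most static one first, and the fallback to a sequence attribute assumes the EDA has already agreed to it.

```python
from dataclasses import dataclass, field


@dataclass
class EntityKeys:
    primary_key: tuple[str, ...]
    alternate_keys: list[tuple[str, ...]] = field(default_factory=list)
    not_null: set[str] = field(default_factory=set)


def assign_keys(entity_name: str, candidate_keys: list[tuple[str, ...]]) -> EntityKeys:
    """Assign primary and alternate keys from the candidate keys.

    Assumption: candidate keys are ordered with the most static one first.
    """
    if not candidate_keys:
        # No candidate key: use a sequence-generated attribute named
        # "<LogicalEntityName> Seq Id", as described in this section.
        surrogate = f"{entity_name} Seq Id"
        return EntityKeys(primary_key=(surrogate,), not_null={surrogate})
    primary, *others = candidate_keys
    # The primary key must not also be defined as an alternate key.
    alternates = [key for key in others if key != primary]
    # Primary key attributes are always NOT NULL.
    return EntityKeys(primary_key=primary, alternate_keys=alternates, not_null=set(primary))
```

For example, for a hypothetical entity with the candidate keys ("Invoice Type Cd",) and ("Invoice Type Desc",), the first becomes the primary key and the second an alternate key, which would become a UNIQUE index in the physical model.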

2.6.5 Foreign Key Relationship

If a logical relationship exists between two entities, it is recommended to create the relationship and the resulting constraint in the ERwin data model, unless there is a valid reason for not creating it (e.g., legacy data which doesn’t satisfy the constraint and can’t be cleaned up).

Relationships in the logical data model should have names that are verb phrases and are lowercase (e.g., “is evaluated by”). The relationship verb phrase is required ONLY for the parent-to-child relationship. The child-to-parent relationship is apparent from the parent-to-child relationship, so its verb phrase should be left empty.

If a foreign key relationship does not accurately define the role in the associated entity, use a role name in the relationship definition.
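The verb phrase rules can be expressed as a short check. This is a minimal Python sketch with a hypothetical function name; it cannot tell whether the text is actually a verb phrase, only whether it is present and lowercase.

```python
def relationship_issues(parent_to_child: str, child_to_parent: str = "") -> list[str]:
    """Return the relationship naming rules that appear to be broken."""
    issues = []
    phrase = parent_to_child.strip()
    if not phrase:
        issues.append("parent-to-child verb phrase is required")
    elif phrase != phrase.lower():
        issues.append("verb phrase should be lowercase")
    if child_to_parent.strip():
        issues.append("child-to-parent verb phrase should be left empty")
    return issues
```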

2.7 Attributes

2.7.1 Attribute Naming

Attribute names must be meaningful, self-explanatory and self-documenting, and should reflect the correct business meaning of the data represented by the attribute. For example, in the Sales Transaction entity, Sales Transaction Priority Cd clearly depicts the attribute definition.

The attribute names should be "qualified" so that the domain of values is properly represented and ambiguity is avoided. For instance, an attribute should not just be named "date"; rather, it should be named "Hire Date" or "Birth Date", as applicable.

The logical attribute names are usually used as column titles in reports.

Attribute names should not be prefixed with the entity name unless the attribute name is too generic. For example, in the entity Vendor, it is better to call an attribute “Vendor Name” than “Name”.

Attribute names must be short, but still logical and self-descriptive.

Always end the attribute name with a domain name (for example, Cd, Id, Dttm or Amt, as in the model snippets in this guide). The idea is to make the attribute data unambiguous to the user (reader) of a logical model.

The words must be separated by spaces and NOT by underscores.

Avoid abbreviations in the logical attribute name unless they are commonly used in the business community. When abbreviations are used, the full words must be obvious; in other words, only standard abbreviations should be used. For XX standard abbreviations, see the Abbreviations section.

There is no restriction on the length of the attribute name, but unusually long names should not be used. The attribute name should be long enough to describe the business meaning of the attribute.

Attributes should be represented by singular names.

Attribute names should be mixed case (i.e., lowercase with leading caps) with spaces between the words (e.g., “Customer Name”).

Special characters except spaces should not be used in attribute names.

When the attribute is a foreign key, please make sure to give an appropriate and meaningful Role Name to the attribute.
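The attribute naming rules that can be tested mechanically are sketched below in Python. The function name is hypothetical, and the set of domain names is an assumption: it is collected from the names used in this guide, whereas the authoritative list is the set of domains in the logical data model template.

```python
# Assumption: domain names taken from the examples in this guide;
# the authoritative list lives in the logical data model template.
DOMAIN_NAMES = {"Cd", "Id", "Num", "Dttm", "Amt", "Qty", "Txt", "Desc", "Flag", "Name", "Date"}


def attribute_name_issues(name: str) -> list[str]:
    """Return the attribute naming rules that the name appears to break."""
    issues = []
    words = name.split(" ")
    if "_" in name:
        issues.append("separate words with spaces, not underscores")
    if words[-1] not in DOMAIN_NAMES:
        issues.append("name should end with a domain name")
    elif len(words) == 1:
        # A bare domain name such as "Name" or "Date" is too generic.
        issues.append("qualify the domain name with a business term")
    return issues
```

Under these assumptions, "Vendor Name" and "Sales Transaction Priority Cd" pass, while "Name" alone is reported as too generic.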

Following is an example of proper attribute naming:

[Figure: ERwin model snippet. Relationship verb phrases shown in the diagram: describes; has; becomes; is status of. For each entity, the key attributes are listed first, followed by the remaining attributes.]

SALES TRANSACTION

Key: Sales Transaction Id

Attributes: Sales Transaction Visit Id (FK); Sales Transaction Type Cd (FK); Sales Transaction Create Dttm; Sales Transaction Priority Cd (FK); Sales Transaction Created By Party Id (FK); Sales Transaction Transmittal Mode Cd (FK); Sales Transaction Status Object Id (FK)

SALES TRANSACTION TYPE

Key: Sales Transaction Type Cd

Attributes: Sales Transaction Type Desc

QUOTE

Key: Quote Id (FK)

Attributes: Quote Party Id (FK); Quote Address Id (FK); Quote Num; Quote Received Location Txt; Quote Created By

QUOTE LINE

Key: Quote Id (FK); Quote Line Num

Attributes: Quote Line Dttm; Quote Line Sales Associate Id (FK); Quote Item Id (FK); Quote Line Item Qty; Quote Line UOM Cd (FK); Quote Line Unit Std Price Amt; Quote Line Unit Std Price Currency Cd (FK); Quote Line Requested Shipment Dttm; Quote Line Valid From Dttm; Quote Line Valid To Dttm; Quote Line Ship To Party Id (FK); Quote Line Expected Purchase Dttm; Quote Line Web Visit Id (FK); Quote Line Page View Sequence Num (FK); Quote Line Status Object Id (FK); Quote Line Cost Margin; Quote Line Extended Amt

QUOTE LINE ORDER LINE

Key: Quote Id (FK); Quote Line Num (FK); Sales Order Id (FK); Sales Order Line Num (FK)

Attributes: Quote Line Order Line Item Qty; Quote Line Order Line UOM Cd (FK)

QUOTE LINE STATUS OBJECT

Key: Quote Line Status Object Id (FK)

2.7.2 Attribute Data Types

All attributes must belong to a data domain that is already created and available in the logical data model template.

If you cannot find an appropriate data domain, please work with your EDA to create a domain in the appropriate template and also in the logical data model you are working on.

2.7.3 Attribute Constraints

You are required to capture all the applicable constraints, such as validation constraints, null constraints and default constraints, in the logical data model. Commonly used constraints are available in the ERwin template you use.

If you need additional constraints to be created, please work with your EDA to create them in the ERwin template and also in the data model you are working on. Please take care not to create duplicate constraint rules with different names.

All the primary key attributes must be set to “NOT NULL”.

All the non-key attributes can be set to “NULL” or “NOT NULL” as desired.

2.7.4 Attribute Definition

Attribute comments must be at least one complete sentence with a meaningful textual description of the meaning or use of the attribute, up to a maximum of 4000 characters.

Simply repeating the full attribute name is not an acceptable definition.

The definition should help developers and users better understand the application and its functionality.

Refrain from giving a dictionary definition for the attribute. Dictionary definitions can serve as a good starting point, but they do not put the concept in the perspective of the application or the functional use of the data.

Do not use abbreviations in the definition. The individuals reading these manuals may not know the meanings of abbreviations or acronyms that are common in the development environment.

As far as possible, provide examples in the comments. This will assist in understanding the usage of the entity/attribute.

The definitions entered in the logical Attribute Editor are set up in the template data model to automatically appear in the physical column comments.

For more details, refer to the attached document.

2.8 Logical Data Model Review Checklist

Once the logical data model is created and is ready to be reviewed, the data modeler is required to run the checklist provided in Logical Data Model Checklist.doc. The checklist helps ensure that the data model review is completed in the least amount of time with minimal repetition.

3 Abbreviations

For commonly used abbreviations, check:

http://data.supportcentral.XX.com/upload/17786/doc_2381791.xls

4 Oracle 10g Reserved Words

List of Oracle 10g Reserved Words

5 Appendix A - Glossary

Term | Description
APL | Application Project Leader
C.A | Computer Associates
E/R Diagram | Entity Relationship Diagram
EDA | Enterprise Data Architect
Kintana | IT Project Management Tool
Data Model Mart / Data Model manager | A place where all data models are stored.
MODELS shared drive | Place to store data models, documents and scripts. Used before introducing Data Model manager.
PM | Project Manager
PMM | Project Master Data Modeler
SAFE | Simple Algorithm for Fragmentation Elimination
TSG | XX Technology Services Group

 

   

   

   
