(Kajal Maam) Principles of Dimensional Modeling
![Page 1: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/1.jpg)
Principles Of Dimensional Modeling
![Page 2: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/2.jpg)
Design Requirements
Design of the DW must directly reflect the way the managers look at the business
It should capture the measurements of importance along with the parameters by which these measurements are viewed
It must facilitate data analysis, i.e., answering business questions
![Page 3: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/3.jpg)
What is Dimensional Modeling (DM)?
DM is a logical design technique that seeks to present the data in a standard, intuitive framework that allows for high-performance access.
Can be implemented using a relational or a multidimensional DBMS, with some restrictions.
It is different from ER modeling.
Every dimensional model is composed of one table with a multipart key, called the fact table, and a set of smaller tables called dimension tables.
Each dimension table has a single-part primary key that corresponds exactly to one of the components of the multipart key in the fact table.
This characteristic "star-like" structure is often called a star schema.
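As a minimal sketch of this structure, the following Python snippet builds a tiny star schema in an in-memory SQLite database and runs one analytical query against it. The table names, column names, and sample rows (`dim_date`, `dim_product`, `fact_sales`, and so on) are illustrative assumptions and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: each has a single-part primary key.
CREATE TABLE dim_date (
    date_key      INTEGER PRIMARY KEY,
    calendar_date TEXT,
    month         TEXT
);
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT
);
-- Fact table: its multipart key is made of the dimension keys.
CREATE TABLE fact_sales (
    date_key     INTEGER REFERENCES dim_date(date_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    units_sold   INTEGER,
    sales_amount REAL,
    PRIMARY KEY (date_key, product_key)
);
""")
conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?, ?)",
    [(1, "2024-01-15", "2024-01"), (2, "2024-02-10", "2024-02")],
)
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(10, "Pen", "Stationery"), (11, "Notebook", "Stationery"), (12, "Mug", "Kitchen")],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [(1, 10, 5, 10.0), (1, 12, 2, 16.0), (2, 10, 3, 6.0), (2, 11, 4, 12.0)],
)

# A typical star join: the fact table joined to each dimension it needs.
rows = conn.execute("""
    SELECT p.category, d.month, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_date AS d ON d.date_key = f.date_key
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category, d.month
    ORDER BY p.category, d.month
""").fetchall()
```

Every query follows the same shape: start from the fact table and join outward to whichever dimensions supply the "by" attributes.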
![Page 4: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/4.jpg)
ER Modeling
A logical design technique that seeks to eliminate data redundancy
Illuminates the microscopic relationships among data elements
Perfect for OLTP systems
Responsible for the success of transaction processing in relational databases
![Page 5: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/5.jpg)
Problems with ER Model
Why are ER models NOT suitable for a DW?
End users cannot understand or remember an ER model
Many DWs have failed because of overly complex ER designs
Not optimized for complex, ad-hoc queries
Data retrieval becomes difficult due to normalization
Browsing becomes difficult
![Page 6: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/6.jpg)
ER vs Dimensional Modeling
ER models are constituted to:
Remove redundant data (normalization)
Facilitate retrieval of individual records having certain critical identifiers
Thereby optimizing OLTP performance
Dimensional model supports the reporting and analytical needs of a data warehouse system.
![Page 7: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/7.jpg)
Comparison between the ER & Dimensional Model

| Dimensional Modeling | ER Model |
| --- | --- |
| Supports ad hoc querying for business analyses and complex analyses | Supports OLTP |
| The data model is multidimensional | The data model has two dimensions |
| It is asymmetric | It is symmetric |
| Permits redundancy | Removes redundancy |
| It is extensible; applications are not changed | If the model is modified, applications are modified |
| It can be done independent of expected query patterns | It is variable in structure and very vulnerable to changes in the user's querying habits |
| Easy and understandable | Hard for people to visualize |
| Models the business practically | Models the micro relationships among data elements |

![Page 8: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/8.jpg)
Dimensional Modeling Concepts
Design goals: user understandability, query performance, resilience to change
Components of DM: fact tables and dimension tables
![Page 9: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/9.jpg)
Inside Dimension table
Dimension table key
Large number of attributes
Textual attributes
Attributes not directly related
Flattened table, not normalized
Ability to drill down/roll up
Multiple hierarchies
Fewer records
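To illustrate how a flattened dimension table supports drill down and roll up, here is a small sketch in plain Python. The product hierarchy (product, brand, category), the sample data, and the helper `total_by` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Flattened (not normalized) product dimension: the whole hierarchy
# product -> brand -> category is repeated in each row.
dim_product = {
    1: {"product": "Pen", "brand": "Acme", "category": "Stationery"},
    2: {"product": "Pencil", "brand": "Acme", "category": "Stationery"},
    3: {"product": "Mug", "brand": "Bolt", "category": "Kitchen"},
}

# Fact rows as (product_key, sales_amount).
fact_sales = [(1, 10.0), (2, 4.0), (3, 16.0), (1, 6.0)]


def total_by(level):
    """Sum sales at the given hierarchy level of the product dimension."""
    totals = defaultdict(float)
    for product_key, amount in fact_sales:
        totals[dim_product[product_key][level]] += amount
    return dict(totals)


rolled_up = total_by("category")    # coarse level of the hierarchy
drilled_down = total_by("product")  # finest level of the hierarchy
```

Because every level of the hierarchy sits in the same row, moving between levels only changes the grouping attribute and needs no extra lookup tables.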
![Page 10: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/10.jpg)
Inside Fact Table
Concatenated fact table key
Grain/level of data identified
Fully additive measures: can be summed across all dimensions
Semi-additive measures: can be summed across some dimensions only
Large number of records
Few attributes
Sparsity of data
Degenerate dimensions
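The difference between fully additive and semi-additive measures can be sketched as follows. The account example is an assumption made for illustration: a deposit amount is taken to be fully additive, while an end-of-day balance is taken to be additive across accounts but not across dates.

```python
# Fact rows as (date, account, deposit_amount, end_of_day_balance).
fact_account = [
    ("2024-01-01", "A", 100.0, 100.0),
    ("2024-01-01", "B", 50.0, 50.0),
    ("2024-01-02", "A", 20.0, 120.0),
    ("2024-01-02", "B", 0.0, 50.0),
]


def total_deposits(rows):
    """Fully additive: deposits may be summed over every dimension."""
    return sum(deposit for _, _, deposit, _ in rows)


def total_balance_on(rows, date):
    """Semi-additive: balances may be summed over accounts for ONE date."""
    return sum(balance for d, _, _, balance in rows if d == date)


def naive_balance_sum(rows):
    """Summing balances over dates as well double-counts money."""
    return sum(balance for _, _, _, balance in rows)
```

Here `total_balance_on(fact_account, "2024-01-02")` is a meaningful figure, whereas `naive_balance_sum(fact_account)` adds the same money once per day and has no business meaning.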
![Page 11: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/11.jpg)
Factless Fact Table
Some fact tables have no measured facts
Useful to describe events and coverage; the tables record that something has or has not happened
Often used to represent many-to-many relationships
The only thing they contain is a concatenated key
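A factless fact table can be sketched as a set of key tuples with no measures. The attendance scenario, the key values, and the helper names below are hypothetical: one table records the event (a student attended a class on a date) and a second, coverage-style table records what could have happened (enrollment).

```python
# Factless fact table: only keys (date_key, student_key, class_key), no measures.
attendance = {
    (20240101, "s1", "math"),
    (20240101, "s2", "math"),
    (20240102, "s1", "math"),
}

# Coverage table: which (student_key, class_key) pairs are expected.
enrollment = {("s1", "math"), ("s2", "math")}


def attendance_count(date_key):
    """Events are analyzed by counting rows, since there is nothing to sum."""
    return sum(1 for d, _, _ in attendance if d == date_key)


def absentees(date_key):
    """What did NOT happen: coverage rows with no matching event row."""
    present = {(student, cls) for d, student, cls in attendance if d == date_key}
    return sorted(enrollment - present)
```

The many-to-many relationship between students and classes is carried entirely by the key combinations in the rows.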
![Page 12: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/12.jpg)
Star Schema keys
Primary keys
Surrogate keys
Foreign keys
![Page 13: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/13.jpg)
Modeling Design Process
1. Identify the Business Process: source of “measurements”
2. Identify the Grain: what does one row in the fact table represent or mean?
3. Identify the Dimensions: descriptive context, true to the grain
4. Identify the Facts: numeric additive measurements, true to the grain
![Page 14: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/14.jpg)
Step 1 - Identify the Business Process
This is a business activity typically tied to a source system.
Not to be confused with a business department or function. An Orders dimensional model should support the activities of both Sales and Marketing.
“If we establish departmentally bound dimensional models, we’ll inevitably duplicate data with different labels and terminology.”
![Page 15: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/15.jpg)
Step 2 - Identify the Grain
The level of detail associated with the fact table measurements.
A critical step necessary before steps 3 and 4.
Preferably it should be at the most atomic level possible.
“How do you describe a single row in the fact table?”
![Page 16: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/16.jpg)
Step 3 - Identify the Dimensions
The list of all the discrete, text-like attributes that emanate from the fact table.
They are the “by” words used to describe the requirements.
Each dimension could be thought of as an analytical “entry point” to the facts.
“How do business people describe the data that results from the business process?”
![Page 17: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/17.jpg)
Step 4 - Identify the Facts
Must be true to the grain defined in step 2.
Typical facts are numeric additive figures.
Facts that belong to a different grain belong in a separate fact table.
Facts are determined by answering the question, “What are we measuring?”
Percentages and ratios, such as gross margin, are non-additive. The numerator and denominator should be stored in the fact table.
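The point about ratios can be checked with a few lines of Python. The figures and field names below are invented for illustration; gross margin is taken here as gross profit divided by revenue, so those two values are the stored numerator and denominator.

```python
# Each fact row stores the numerator and denominator, not the ratio itself.
fact_rows = [
    {"product": "Pen", "gross_profit": 40.0, "revenue": 100.0},
    {"product": "Mug", "gross_profit": 30.0, "revenue": 300.0},
]


def gross_margin(rows):
    """Correct: sum the additive parts first, then divide once."""
    profit = sum(r["gross_profit"] for r in rows)
    revenue = sum(r["revenue"] for r in rows)
    return profit / revenue


def average_of_row_margins(rows):
    """Misleading: averaging per-row ratios ignores each row's weight."""
    margins = [r["gross_profit"] / r["revenue"] for r in rows]
    return sum(margins) / len(margins)
```

With this data the true overall margin is 70 / 400 = 0.175, while the average of the per-row margins (0.4 and 0.1) is 0.25, which overstates it.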
![Page 18: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/18.jpg)
Advantages of star schema
Easy for users to understand
Optimizes navigation
Most suitable for query processing
![Page 19: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/19.jpg)
DM: Advanced Topics
Slowly Changing Dimensions
Type 1 changes: correction of errors
Used when the old value of the attribute has no significance or can be discarded.
Easy and fast
Type 2 changes: preservation of history
Partitions history so that fact tables properly reflect original values.
Requires use of surrogate keys
Causes table growth due to additional history rows
Users must be aware of the added complexity
Effective dates are used, secondary to the cleaner fact joins
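Type 1 and Type 2 handling might be sketched as below. This is only one possible arrangement under stated assumptions: the `CustomerRow` layout, the column names, and the helper functions are hypothetical, and the surrogate key is generated as a simple running integer.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CustomerRow:
    customer_key: int                   # surrogate key, used in fact-table joins
    customer_id: str                    # natural key from the source system
    city: str
    effective_from: str
    effective_to: Optional[str] = None  # None marks the current row


def apply_type1(dim: List[CustomerRow], customer_id: str, new_city: str) -> None:
    """Type 1: overwrite in place; the old value is discarded."""
    for row in dim:
        if row.customer_id == customer_id:
            row.city = new_city


def apply_type2(dim: List[CustomerRow], customer_id: str,
                new_city: str, change_date: str) -> int:
    """Type 2: close the current row and add a new row with a new surrogate key."""
    for row in dim:
        if row.customer_id == customer_id and row.effective_to is None:
            row.effective_to = change_date
    next_key = max((r.customer_key for r in dim), default=0) + 1
    dim.append(CustomerRow(next_key, customer_id, new_city, change_date))
    return next_key


dim_customer = [CustomerRow(1, "C100", "Pune", "2020-01-01")]
apply_type1(dim_customer, "C100", "Mumbai")                      # error correction
new_key = apply_type2(dim_customer, "C100", "Delhi", "2024-06-01")  # real change
```

Facts loaded before the change keep pointing at surrogate key 1, so they continue to reflect the original value, while new facts use the new key.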
![Page 20: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/20.jpg)
Type 3 changes: tentative soft revisions
An additional attribute is used to capture changes.
Used less frequently than Type 1 or 2.
Relate to tentative changes in the source systems.
Used to compare performances.
Ability to track forward and backward
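A Type 3 change can be sketched as adding a "previous value" attribute next to the current one. The sales-rep example, the `previous_` naming convention, and the helper `apply_type3` are assumptions made for illustration.

```python
def apply_type3(row, attribute, new_value):
    """Type 3: keep the old value in an extra attribute beside the new one."""
    updated = dict(row)  # leave the caller's row untouched
    updated[f"previous_{attribute}"] = row[attribute]
    updated[attribute] = new_value
    return updated


rep = {"rep_key": 1, "rep_name": "Asha", "territory": "West"}
rep_after = apply_type3(rep, "territory", "North")
```

Because both `territory` and `previous_territory` sit in the same row, the same facts can be grouped by either value, which is what makes forward and backward comparison possible. No new row or surrogate key is created.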
![Page 21: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/21.jpg)
Large dimensions
Rapidly changing dimensions
![Page 22: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/22.jpg)
Snowflake Schema
Snowflaking is a method of normalizing the dimension tables in a star schema
Advantages:
Small savings in storage space
Normalized structures are easier to update and maintain
Disadvantages:
Schema is less intuitive and end users are put off by the complexity
Ability to browse through the contents is difficult
Degraded query performance because of additional joins
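The extra join introduced by snowflaking can be seen in a small SQLite sketch. All table names, column names, and rows are illustrative assumptions: the same product dimension is stored once flattened (star) and once normalized into product and category tables (snowflake).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star: the category name is stored directly in the product dimension.
CREATE TABLE star_product (
    product_key   INTEGER PRIMARY KEY,
    product_name  TEXT,
    category_name TEXT
);
-- Snowflake: the category is normalized into its own table.
CREATE TABLE snow_category (
    category_key  INTEGER PRIMARY KEY,
    category_name TEXT
);
CREATE TABLE snow_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category_key INTEGER REFERENCES snow_category(category_key)
);
CREATE TABLE fact_sales (product_key INTEGER, sales_amount REAL);

INSERT INTO star_product VALUES
    (1, 'Pen', 'Stationery'), (2, 'Pencil', 'Stationery'), (3, 'Mug', 'Kitchen');
INSERT INTO snow_category VALUES (1, 'Stationery'), (2, 'Kitchen');
INSERT INTO snow_product VALUES (1, 'Pen', 1), (2, 'Pencil', 1), (3, 'Mug', 2);
INSERT INTO fact_sales VALUES (1, 10.0), (2, 4.0), (3, 16.0);
""")

# Star: one join from the fact table to the dimension.
STAR_QUERY = """
    SELECT p.category_name, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN star_product AS p ON p.product_key = f.product_key
    GROUP BY p.category_name
    ORDER BY p.category_name
"""

# Snowflake: the same question needs a second join to reach the category.
SNOWFLAKE_QUERY = """
    SELECT c.category_name, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN snow_product AS p ON p.product_key = f.product_key
    JOIN snow_category AS c ON c.category_key = p.category_key
    GROUP BY c.category_name
    ORDER BY c.category_name
"""

star_rows = conn.execute(STAR_QUERY).fetchall()
snowflake_rows = conn.execute(SNOWFLAKE_QUERY).fetchall()
```

Both queries return the same totals; the snowflake version stores each category name once but pays for it with one more join per normalized level.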
![Page 23: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/23.jpg)
Star Schema
![Page 24: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/24.jpg)
Flattened Star
![Page 25: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/25.jpg)
CSE 5331/7331, F'07
Normalized Star
![Page 26: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/26.jpg)
Snowflake Schema
![Page 27: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/27.jpg)

| | Snowflake Schema | Star Schema |
| --- | --- | --- |
| Joins | Higher number of joins | Fewer joins |
| Ease of use | More complex queries and hence less easy to understand | Less complex queries and hence easier to understand |
| Query performance | More foreign keys and hence longer query execution time | Fewer foreign keys and hence shorter query execution time |
| Ease of maintenance/change | No redundancy and hence easier to maintain and change | Has redundant data and hence less easy to maintain/change |
| Type of data warehouse | Good to use for small data warehouses/data marts | Good for large data warehouses |
| Dimension table | May have more than one dimension table for each dimension | Contains only a single dimension table for each dimension |
| Dimension table normalization | Third normal form (3NF) | Second normal form (2NF), denormalized |

![Page 28: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/28.jpg)
Fact Constellation schema
It is shaped like a constellation of stars.
For each star schema or snowflake schema it is possible to construct a fact constellation schema.
This schema is more complex than the star or snowflake architecture because it contains multiple fact tables.
It allows dimension tables to be shared amongst many fact tables.
The solution is very flexible; however, it may be hard to manage and support.
The main disadvantage of the fact constellation schema is a more complicated design, because many variants of aggregation must be considered.
Different fact tables are explicitly assigned to the dimensions that are relevant for the given facts. This may be useful in cases when some facts are associated with a given dimension level and other facts with a deeper dimension level.
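A fact constellation can be sketched as two fact tables that share one dimension table. The sales and shipments example, the data, and the helper names are hypothetical and serve only to show how a shared dimension lets results from separate fact tables be lined up side by side.

```python
from collections import defaultdict

# Shared dimension: product_key -> product name.
dim_product = {1: "Pen", 2: "Mug"}

# Two separate fact tables, both keyed by the shared product dimension.
fact_sales = [(1, 5), (2, 2), (1, 3)]   # (product_key, units_sold)
fact_shipments = [(1, 6), (2, 2)]       # (product_key, units_shipped)


def units_by_product(fact_rows):
    """Aggregate one fact table to the product level."""
    totals = defaultdict(int)
    for product_key, units in fact_rows:
        totals[product_key] += units
    return totals


def sold_vs_shipped():
    """Aggregate each fact table separately, then align on the shared dimension."""
    sold = units_by_product(fact_sales)
    shipped = units_by_product(fact_shipments)
    return {
        name: (sold.get(key, 0), shipped.get(key, 0))
        for key, name in dim_product.items()
    }
```

Each fact table is aggregated on its own before the results are combined, which avoids joining the two fact tables directly to each other.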
![Page 29: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/29.jpg)
Dimensional Model Star Schema
![Page 30: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/30.jpg)
Snow-Flake Schema in Dimensional Modeling
![Page 31: (Kajal Maam(Principles of Dimensional Modeling](https://reader035.fdocuments.in/reader035/viewer/2022081419/5517ba2c49795968658b4719/html5/thumbnails/31.jpg)
Fact Constellation Schema