Modelling Slowly Changing Dimensions


description

SCD in OBIEE

Transcript of Modelling Slowly Changing Dimensions

Page 1: Modelling Slowly Changing Dimensions

Dimensional Modeling - Slowly Changing Dimensions

About

In data warehousing, tracking changes to dimension data is referred to as slowly changing dimensions (SCD).

Many changes are made in the source system every day:

new customers are added, addresses are modified, new regional hierarchies are implemented, or product descriptions and packaging simply change.

These changes need to be reflected in the dimension tables and, in several cases, the history of the changes also needs to be tracked.

By keeping history, we can look at historical data and compare it with the current situation.


Techniques

Type 1 - Overwrite Original Value

With a Type 1 SCD, the change does not require tracking: the new value simply overwrites the original value and no history is kept.
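A minimal sketch of a Type 1 update, assuming a hypothetical in-memory customer dimension keyed by a business key (the names `customer_dim` and `apply_type1` are illustrative and not from the original):

```python
def apply_type1(dimension, business_key, changes):
    """Overwrite attributes in place; the previous values are lost."""
    row = dimension[business_key]
    row.update(changes)
    return row


# Hypothetical customer dimension keyed by a business key.
customer_dim = {"C001": {"name": "Alice", "city": "Paris"}}

# The address change overwrites the old city; "Paris" is no longer recoverable.
apply_type1(customer_dim, "C001", {"city": "Lyon"})
```

The row count stays the same after the update, which is why Type 1 cannot answer questions about the past.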

Type 2 - Add a new record

With a Type 2 SCD, a new version of the dimension record is created and the existing version is marked as history.

Each row therefore does not correspond to a different instance of an entity, but to a different "state": a "snapshot" of the instance at a point in time.


To accommodate this, extra metadata is required for the dimension table, including an effective date column and an expiration date column. These columns are used to differentiate a current version from a historical version as follows:

The effective date column stores the effective date of the version, also known as the start date. The expiration date column stores the expiration date of the version, also known as the end date. The expiration date of the current version is always set to NULL or to a default date value.

The user must identify the columns whose history will be tracked: whenever the value of one of these columns changes, a new version is created. These columns are known as trigger columns.
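The Type 2 mechanics described above can be sketched as follows. This is only an illustration under stated assumptions: the dimension is a hypothetical list of dictionaries, `city` is the only trigger column, the surrogate key is a simple counter, and NULL is represented by `None`.

```python
from datetime import date

# Hypothetical choice: only changes to "city" create a new version.
TRIGGER_COLUMNS = ("city",)


def apply_type2(rows, business_key, incoming, change_date):
    """Expire the current version and append a new one if a trigger column changed."""
    current = next(
        (r for r in rows
         if r["customer_id"] == business_key and r["expiration_date"] is None),
        None,
    )
    if current is not None:
        if all(current[c] == incoming[c] for c in TRIGGER_COLUMNS):
            return rows  # no tracked change: this sketch leaves the row untouched
        current["expiration_date"] = change_date  # mark the old version as history
    rows.append({
        "surrogate_key": len(rows) + 1,
        "customer_id": business_key,
        **incoming,
        "effective_date": change_date,
        "expiration_date": None,  # None (NULL) marks the current version
    })
    return rows


customer_dim = []
apply_type2(customer_dim, "C001", {"name": "Alice", "city": "Paris"}, date(2020, 1, 1))
apply_type2(customer_dim, "C001", {"name": "Alice", "city": "Lyon"}, date(2021, 6, 1))
```

After the second call the dimension holds two rows for the same customer: the Paris version, expired on the change date, and the Lyon version with an empty expiration date.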

Type 3 - Add a new column to store the previous value

With a Type 3 SCD, the current value of a dimension attribute is kept in a separate field from its previous value.

To accomplish this, two columns are created for each data field: one storing the current value and one storing the previous value.
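A minimal sketch of the two-column approach, assuming a hypothetical naming convention of `current_<field>` and `previous_<field>` (the convention and the function name are illustrative only):

```python
def apply_type3(row, column, new_value):
    """Shift the current value into the 'previous' column, then store the new value."""
    current_column = f"current_{column}"
    previous_column = f"previous_{column}"
    if row[current_column] != new_value:
        row[previous_column] = row[current_column]
        row[current_column] = new_value
    return row


customer = {"customer_id": "C001", "current_city": "Paris", "previous_city": None}
apply_type3(customer, "city", "Lyon")
apply_type3(customer, "city", "Nice")
```

After the second change only "Lyon" survives as the previous value; "Paris" is gone. Type 3 keeps exactly one step of history per tracked field.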

One of our previous clients had a similar business problem: the business wanted to change an SCD type 1 into an SCD type 2.

If you are modelling the SCD so that the business can see a comparative analysis of current items and history items in a line order, then we may need an SCD type 1 for this kind of analysis.

It is always recommended to have a validity flag on an SCD. For example, in an Item dimension I would also include an Item_Valid field to support what-if analysis. If the business wants to compare the valid items of the current year against the items of the previous year, which may also contain discontinued items, then we may need both a type 1 SCD and a type 2 SCD based on the effective and expiration dates.
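As an illustration of the year-over-year comparison, the sketch below selects the items valid on a given date from a Type 2 style item dimension. The sample rows, the column names and the `valid_on` helper are assumptions made for the example; the expiration date is treated as exclusive.

```python
from datetime import date


def valid_on(rows, as_of):
    """Return the ids of the items whose version is valid on the given date."""
    return {
        r["item_id"]
        for r in rows
        if r["effective_date"] <= as_of
        and (r["expiration_date"] is None or as_of < r["expiration_date"])
    }


# Hypothetical item dimension with effective and expiration dates.
item_dim = [
    {"item_id": "I1", "effective_date": date(2021, 1, 1), "expiration_date": None},
    {"item_id": "I2", "effective_date": date(2021, 3, 1), "expiration_date": date(2023, 2, 1)},
    {"item_id": "I3", "effective_date": date(2023, 5, 1), "expiration_date": None},
]

previous_year_items = valid_on(item_dim, date(2022, 12, 31))
current_year_items = valid_on(item_dim, date(2023, 12, 31))

# Items valid last year but no longer valid now, i.e. discontinued items.
discontinued = previous_year_items - current_year_items
```

With these sample rows, item I2 is the only one that was valid at the end of the previous year and is no longer valid at the end of the current year.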

When creating a business model in a tool, if we want to see an item according to its validity, we can use the Oracle Last function to aggregate on a dimension. Suppose we have an Item Id and an Item Effective date: we would then model the business logic so that the Item Effective date is aggregated as last, using the Last function.
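The idea of keeping, for each Item Id, the row with the last Item Effective date can be emulated outside the tool. The sketch below is not the Oracle function itself, only a plain Python approximation of "take the last version by effective date"; the row layout and the function name are assumptions.

```python
from datetime import date


def latest_version_per_item(rows):
    """Keep, for each item id, the row with the most recent effective date."""
    latest = {}
    for row in rows:
        item_id = row["item_id"]
        if (item_id not in latest
                or row["effective_date"] > latest[item_id]["effective_date"]):
            latest[item_id] = row
    return latest


# Hypothetical item versions: item I1 changed its description once.
item_versions = [
    {"item_id": "I1", "description": "Blue box", "effective_date": date(2021, 1, 1)},
    {"item_id": "I1", "description": "Green box", "effective_date": date(2022, 7, 1)},
    {"item_id": "I2", "description": "Red bag", "effective_date": date(2021, 3, 1)},
]

current_items = latest_version_per_item(item_versions)
```

Each item then appears once, with the attributes of its most recent version.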

If possible, also model a flag field for historical validity; this gives more flexibility when building the reports.

Also, it is fairly easy to model an SCD type 2 and later change it to an SCD type 1, but the reverse is not simple.