Granularity in the Data Warehouse - Building the Data Warehouse
Data Warehouse File
Object Oriented Databases & DataWarehouse
Multi Dimensional Modeling
Richa Sharma
Roll Number: 511/IS/2010
1 Problem Statement
2 Requirement Analysis
3 Star Schema For Restaurant Footfall
4 Object Oriented Model for Restaurant Star Schema
5 Details
5.1 Information Package
5.2 Business dimension
5.3 Level of detail
5.4 Fact table
5.5 Relationships
5.6 Star Schema
6 Restaurant Problem Domain
6.1 Dimension Tables
6.2 Fact Tables
1 Problem Statement
The task is to build a dimensional model for the data warehouse of a restaurant chain company that wants to store information about all of its restaurants, which are located in different cities and in different parts of a city.
The company wants to store historical information on various aspects of the business to support strategic decisions.
The basic measures the company wants to analyse are the footfall and the profit of each restaurant.
The areas by which the company wants to analyse footfall and profit are Food, Time, Location, Employees, Category, and Demography of Location.
2 Requirement Analysis
In multidimensional modelling, requirements are gathered in the form of information packages.
Information Subject: Restaurant Footfall
Dimensions and their attributes:
Food: Type, Category, Name, Price per unit
Location: Country, State, Distt., City Name, Restaurant Name
Time: Year, Quarter, Month, Day, Season, Holiday Flag
Employee: Restaurant Name, Designation, Employee Name, Address
Demographic Age Group: Country, State, Distt., City Name, Population Index, Income Group, Age Group, Sex
Facts: Footfall, Profit
[Figure: star schema for restaurant footfall with shared dimensions. The base table FOOT FALL FACTS and an aggregate fact table keyed by Category Key both carry the measures Footfall and Profit and are linked to the dimension tables.]
3 Star Schema For Restaurant Footfall
Dimension tables:
FOOD: Item Key, Type, Category, Food Item Name, Price Per Unit
LOCATION: Restaurant Code, Country, State, Distt., City Name, Restaurant Name
DEMOGRAPHIC AGE GROUP: Location Key, Population Index, Income Group, Age Group, Sex
TIME: Time Key, Year, Quarter, Month, Day, Season, Holiday Flag
EMPLOYEE: Restaurant Id, Restaurant Name, Designation, Employee Id, Name
CATEGORY (dimension derived from FOOD): Category Key, Category, Food Item Name, Price Per Unit
Fact tables:
FOOT FALL FACTS (base table): Item Key, Location Key, Restaurant Code, Time Key, Restaurant Id; measures Footfall, Profit
ONE WAY AGGREGATE: Category Key, Restaurant Code, Location Key, Time Key, Restaurant Id; measures Footfall, Profit
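The base part of the star schema above can be sketched as table definitions. This is only an illustrative sketch using Python's built-in sqlite3 module: the snake_case table and column names, the column types, the name time_dim and the choice of an in-memory database are assumptions made for the example, not part of the original design.

```python
import sqlite3

# Assumed DDL for the five dimension tables and the base fact table.
# Column types and snake_case names are illustrative choices only.
DDL = """
CREATE TABLE food (
    item_key INTEGER PRIMARY KEY, type TEXT, category TEXT,
    food_item_name TEXT, price_per_unit REAL);
CREATE TABLE location (
    restaurant_code INTEGER PRIMARY KEY, country TEXT, state TEXT,
    district TEXT, city_name TEXT, restaurant_name TEXT);
CREATE TABLE demographic_age_group (
    location_key INTEGER PRIMARY KEY, population_index REAL,
    income_group TEXT, age_group TEXT, sex TEXT);
CREATE TABLE time_dim (
    time_key INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER,
    month INTEGER, day INTEGER, season TEXT, holiday_flag INTEGER);
CREATE TABLE employee (
    restaurant_id INTEGER PRIMARY KEY, restaurant_name TEXT,
    designation TEXT, employee_id INTEGER, name TEXT);
CREATE TABLE footfall_facts (
    item_key INTEGER REFERENCES food(item_key),
    location_key INTEGER REFERENCES demographic_age_group(location_key),
    restaurant_code INTEGER REFERENCES location(restaurant_code),
    time_key INTEGER REFERENCES time_dim(time_key),
    restaurant_id INTEGER REFERENCES employee(restaurant_id),
    footfall INTEGER,
    profit REAL,
    -- The fact table key is the concatenation of the dimension keys.
    PRIMARY KEY (item_key, location_key, restaurant_code,
                 time_key, restaurant_id));
"""


def create_star_schema():
    """Create the sketch schema in a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    return conn


conn = create_star_schema()
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
)
```

The fact table holds only dimension keys and the two measures, while every descriptive attribute stays in a dimension table.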
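4 Object Oriented Model for Restaurant Star Schema
[Figure: object oriented model of the star schema. The classes FOOTFALL FACTS and AGGREGATED FACT TABLE (measures Footfall and Profit) are associated with the dimension classes below; the associations carry the multiplicities 1 and *.]
LOCATION: Restaurant Code, Country, State, Distt., City Name, Restaurant Name
TIME: Time Key, Year, Quarter, Month, Day, Season
EMPLOYEE: Restaurant Id, Restaurant Name, Designation, Employee Id, Name
DEMOGRAPHIC AGE GROUP: Location Key, Population Index, Income Group, Age Group, Sex
CATEGORY: Category Key, Category, Food Item Name, Price Per Unit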
5 Details
5.1 Information Package
Information packages make it possible to combine common subject areas. They are used to:
o Define the common subject areas
o Design key business metrics
o Decide how data must be presented
o Determine how users will aggregate or roll up
o Decide the data quantity for user analysis or query
o Decide how data will be accessed
o Establish data granularity
o Estimate data warehouse size
o Ascertain how information must be packaged
5.2 Business dimension
Business dimensions form the underlying basis for requirement definition. Data must be stored so that it supports the business dimensions, and the business dimensions and their hierarchical levels form the basis for all further phases. In the data models for a data warehouse, the business dimensions along which the users analyse the business metrics must be featured prominently. The usefulness of the warehouse or the mart is directly related to the accuracy of the data model. Dimensions are the attributes or areas along which the users analyse the corresponding metrics in order to make strategic decisions.
The data structures for storing business dimensions are the dimension tables.
A dimension table has the following characteristics:
o Dimension table key
o Table is wide
o Textual attributes
o Attributes not directly related
o Flattened out, not normalized
o Ability to drill down/roll up
o Multiple hierarchies
o Fewer records
5.3 Level of detail
For analysis at various levels of granularity, the data model should support drill down and roll up. The data grain, also called data granularity, is the level of detail of the measurements or metrics; it represents the level of detail in the fact table. If the fact table is at the lowest grain, the facts or metrics are at the lowest possible level at which they can be captured from the operational systems.
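Roll up and drill down over a day-grain fact table can be sketched in a few lines of Python. The rows and the function names roll_up_to_month and drill_down are hypothetical and exist only to illustrate the idea; a real warehouse would do this in the query layer.

```python
from collections import defaultdict

# Hypothetical fact rows at the lowest (day) grain: (year, month, day, footfall)
daily_facts = [
    (2010, 2, 1, 120),
    (2010, 2, 2, 95),
    (2010, 3, 1, 130),
]


def roll_up_to_month(rows):
    """Roll up: sum day-grain footfall to the coarser (year, month) grain."""
    totals = defaultdict(int)
    for year, month, _day, footfall in rows:
        totals[(year, month)] += footfall
    return dict(totals)


def drill_down(rows, year, month):
    """Drill down: return the day-level rows behind one monthly total."""
    return [(d, f) for (y, m, d, f) in rows if y == year and m == month]


monthly = roll_up_to_month(daily_facts)
february_days = drill_down(daily_facts, 2010, 2)
```

Drill down is only possible because the facts were stored at the day grain; had only monthly totals been stored, the daily figures could not be recovered.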
5.4 Fact table
Fact tables are the data structures where the measurements are kept. The details may be kept at the lowest level or as summary data, depending on the level of detail at which users need to analyse them. Fact tables have the following properties:
o Concatenated key
o Data grain
o Fully additive measures
o Semi-additive measures
o Table is deep, not wide
o Sparse data
o Degenerate dimensions
5.5 Relationships
The relationship from a dimension table to the fact table is mostly one to many. At times it can be many to many. Dimension tables are not linked to each other directly; through the fact table, the relationship between any two dimension tables is many to many.
5.6 Star Schema
The star schema allows the query processor to use better execution plans and enables specific performance schemes to be applied to queries. The star schema arrangement is eminently suitable for special performance techniques such as the STAR join and the STAR index.
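The query shape that a star schema invites can be sketched as follows: the fact table is joined to each dimension it needs, filtered on dimension attributes, and the measures are summed. The sketch uses Python's sqlite3 module with a cut-down schema; the sample rows, the city names and the name time_dim are assumptions for illustration. It shows an ordinary SQL join over a star schema, not the STAR join or STAR index optimisations themselves, which are features of particular database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down, hypothetical version of the schema with invented sample rows.
conn.executescript("""
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE location (restaurant_code INTEGER PRIMARY KEY, city_name TEXT);
CREATE TABLE footfall_facts (
    time_key INTEGER, restaurant_code INTEGER, footfall INTEGER, profit REAL);
INSERT INTO time_dim VALUES (1, 2010, 1), (2, 2010, 2);
INSERT INTO location VALUES (10, 'Delhi'), (20, 'Mumbai');
INSERT INTO footfall_facts VALUES
    (1, 10, 100, 500.0), (2, 10, 80, 400.0),
    (2, 20, 60, 300.0), (2, 10, 20, 100.0);
""")

# Join the fact table to two dimensions, constrain on the time dimension,
# and aggregate the measures per city.
STAR_QUERY = """
SELECT l.city_name, SUM(f.footfall), SUM(f.profit)
FROM footfall_facts AS f
JOIN time_dim AS t ON t.time_key = f.time_key
JOIN location AS l ON l.restaurant_code = f.restaurant_code
WHERE t.year = ? AND t.month = ?
GROUP BY l.city_name
ORDER BY l.city_name
"""

february = conn.execute(STAR_QUERY, (2010, 2)).fetchall()
```

Every join runs from the central fact table to one dimension table, which is the regular pattern that lets a query processor plan such queries efficiently.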
6 Restaurant Problem Domain
In my problem domain, I have chosen the following dimensions for the information package.
6.1 Dimension Tables
Food
o It is the dimension with the attributes Item Key, Type, Category, Food Item Name. This dimension helps in analysing the metrics footfall and profit in terms of the food items sold and their properties, for example category wise or type wise.
Time
o It is the dimension with the attributes Time Key, Year, Quarter, Month, Day, and Season. It helps the analyst to view metrics at different grains of time, for example the footfall in the month of February or the day wise footfall.
Location
o It is the dimension that helps the user to view metrics according to where the restaurant is situated. It has the attributes Restaurant Code, Country, State, Distt., City Name, Restaurant Name.
Employee
o It is the table with the attributes Restaurant Id, Restaurant Name, Designation, Employee Id, Name. It helps in analysing the footfall according to the employees in any particular restaurant, for example the profit and the footfall according to designation.
Demographic Age Group
o This dimension has the attributes Location Key, Population Index, Income Group, Age Group, Sex. It helps the analyst to analyse, for example, the footfall for a particular age group.
6.2 Fact Tables
This problem domain has two fact tables.
Basic Fact Table
o The basic fact table includes the keys of all dimension tables and the measures Footfall and Profit.
Aggregated Fact Table
o The aggregated fact table shows one way aggregation along the dimension Food. It stores the summarized metrics according to the category of food and the hierarchy levels below it.
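Deriving the aggregated fact table from the basic one can be sketched as a one way aggregation in which only the Food dimension is rolled up to category, while the other keys keep their grain. The rows, category names and the function name aggregate_by_category are hypothetical, and the fact rows are shortened to three keys for brevity.

```python
from collections import defaultdict

# Hypothetical lookup from the Food dimension: item_key -> category
item_to_category = {1: "Beverage", 2: "Beverage", 3: "Dessert"}

# Hypothetical base fact rows:
# (item_key, restaurant_code, time_key, footfall, profit)
base_facts = [
    (1, 10, 1, 30, 90.0),
    (2, 10, 1, 20, 50.0),
    (3, 10, 1, 10, 40.0),
    (1, 20, 1, 5, 15.0),
]


def aggregate_by_category(facts, categories):
    """Sum the measures per (category, restaurant_code, time_key)."""
    totals = defaultdict(lambda: [0, 0.0])
    for item_key, restaurant_code, time_key, footfall, profit in facts:
        key = (categories[item_key], restaurant_code, time_key)
        totals[key][0] += footfall
        totals[key][1] += profit
    return {key: tuple(value) for key, value in totals.items()}


aggregated = aggregate_by_category(base_facts, item_to_category)
```

Queries at category level can then read the smaller aggregated table instead of summing the item level rows each time.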