Building the cube – Chapter 9 & 10 Let’s be over with it.
Uploaded by hester-porter
Two tools
SSMS (SQL Server Management Studio)
• Important for any task that deals with databases
• Use this to make sure the MaxMinManufacturingDM database is in working order; follow the instructions I posted
• It is a data mart
SSDT (SQL Server Data Tools)
• Essentially Visual Studio 2010
• Multidimensional BI Semantic Model (OLAP) projects and DM projects are created with the same tool
Key Steps
• Step 1 – you need to know where to find this tool; you may have to start a new project
Key Steps
• Step 1 – you may want to find the right directory to store your projects
• Step 8 – you may have to create a new connection
Key Steps
• Step 8 – Impersonation: pick the first option
• You can have multiple data sources for the same project
Measures and related
Measure Group
• The table where the measure comes from
• The data in the table is the source for the measure
Other factors to consider
Granularity
• How detailed a view do we need? For example: day, month, quarter, year; or professor, department, division, college, university
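To make granularity concrete, here is a minimal Python sketch that rolls the same daily facts up to coarser levels. The dates, quantities, and the `roll_up` helper are assumptions made up for illustration; a cube does this through its time hierarchy rather than through code like this.

```python
from collections import defaultdict

# Made-up daily facts: (ISO date string, quantity).
daily_facts = [("2024-01-03", 10), ("2024-01-20", 5), ("2024-02-01", 7)]

# Number of leading characters of an ISO date that identify each level.
LEVEL_WIDTH = {"day": 10, "month": 7, "year": 4}

def roll_up(rows, level):
    """Sum quantities at the requested grain (day, month, or year)."""
    totals = defaultdict(int)
    for day, qty in rows:
        totals[day[:LEVEL_WIDTH[level]]] += qty
    return dict(totals)

by_month = roll_up(daily_facts, "month")
by_year = roll_up(daily_facts, "year")
print(by_month, by_year)
```

The finer the stored grain, the more levels can be derived from it; the reverse is not possible.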
Calculated measures
• New measures generated through calculations with existing ones
• For example, total goods produced = goods passed QA + goods failed QA
• Step 24 on page 345 is another example
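The arithmetic behind a calculated measure can be sketched in a few lines of Python. The field names (`accepted`, `rejected`) and the quantities are hypothetical; in the cube this would be defined as a calculated member, not as Python code.

```python
# Hypothetical fact rows; field names and quantities are illustrative only.
fact_rows = [
    {"product": "A", "accepted": 95, "rejected": 5},
    {"product": "B", "accepted": 180, "rejected": 20},
]

def total_produced(row):
    """Calculated measure: goods that passed QA plus goods that failed QA."""
    return row["accepted"] + row["rejected"]

totals = {row["product"]: total_produced(row) for row in fact_rows}
print(totals)
```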
Measure Aggregates besides SUM
• Look into the AggregateFunction property; you will see a list of selections, because not every measure aggregates by simple summation
• For example, inventory level is not additive along the time dimension, but it is additive along the product dimension
Aggregate Functions
Selection: Description
Sum: Specifies the sum of members. This is the default aggregation function.
Count: Specifies the count of measure members.
Min: Specifies the minimum value of members.
Max: Specifies the maximum value of members.
DistinctCount: Specifies the count of distinct measure members.
None: No aggregations are performed on any dimension; data is only available on the leaf cells. If no value from the fact table has been read in for a member, then the cell value for the member is considered to be Null.
ByAccount: Specifies that the aggregation used will be determined for each CurrentMember of the Account dimension according to its account type. Unmapped account types aggregate as SUM.
AverageOfChildren: Specifies average of leaf descendants in time. Average does not count an empty value as 0.
FirstChild: Specifies the first child member along Time dimension.
LastChild: Specifies the last child member along Time dimension.
FirstNonEmpty: Specifies the first non-empty child member along Time dimension.
LastNonEmpty: Specifies the last non-empty child member along Time dimension.
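The difference between a few of these selections shows up as soon as a time period is empty. The Python sketch below is a simplified approximation of how Sum, LastChild, LastNonEmpty, and AverageOfChildren behave along the time dimension for a single product; the inventory numbers and function names are assumptions for illustration, not Analysis Services code.

```python
# Monthly inventory levels for one product; None marks a month with no fact rows.
# The values are made up for illustration.
monthly_inventory = [120, 90, None, 75, None]

def agg_sum(values):
    # Sum: adds every non-empty period (misleading for inventory levels).
    return sum(v for v in values if v is not None)

def agg_last_child(values):
    # LastChild: value of the last period, even if that period is empty.
    return values[-1]

def agg_last_non_empty(values):
    # LastNonEmpty: value of the last period that actually has data.
    for v in reversed(values):
        if v is not None:
            return v
    return None

def agg_average_of_children(values):
    # AverageOfChildren: empty periods are skipped, not counted as 0.
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None

print(agg_sum(monthly_inventory), agg_last_non_empty(monthly_inventory))
```

For an inventory measure, LastNonEmpty along time is usually the meaningful answer, while Sum is still appropriate across products.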
Adding a new measure group
• True, we can add new measure groups later, but the general belief is that it is better to plan ahead and add all measure groups at the very beginning
What is a measure group?
• It is basically a fact table
Types of dimensions
Fact dimensions
• Dimensions that come from the fact table
Parent-child dimensions
• Two columns in the same table
• Self-referencing
• For example, employees and managers both come from the employee table
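A parent-child relationship can be sketched as a table whose rows point back into the same table. The employee rows and the helper names below are hypothetical; the sketch only shows how a self-reference yields a hierarchy.

```python
# Hypothetical employee table: manager_id refers to id in the same table.
employees = [
    {"id": 1, "name": "Ann", "manager_id": None},
    {"id": 2, "name": "Bob", "manager_id": 1},
    {"id": 3, "name": "Cy", "manager_id": 1},
    {"id": 4, "name": "Dee", "manager_id": 2},
]

def build_children(rows):
    """Group rows by the manager they report to."""
    children = {}
    for row in rows:
        children.setdefault(row["manager_id"], []).append(row)
    return children

def walk(children, parent_id=None, depth=0):
    """Return (depth, name) pairs in top-down hierarchy order."""
    out = []
    for row in children.get(parent_id, []):
        out.append((depth, row["name"]))
        out.extend(walk(children, row["id"], depth + 1))
    return out

hierarchy = walk(build_children(employees))
for depth, name in hierarchy:
    print("  " * depth + name)
```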
Types of dimensions
Role-playing dimensions
• The same dimension can be related to the same measure group multiple times, through different columns
• For example, a time dimension can be related to a sales measure group several times: order date, shipment date, received date, payment received date
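As a rough illustration of role playing, the Python sketch below joins one shared date dimension to the same sales facts through two different key columns. The keys, column names, and amounts are assumptions invented for the example.

```python
from datetime import date

# One shared date dimension, keyed by a hypothetical integer date key.
date_dim = {
    20240105: {"date": date(2024, 1, 5), "month": "2024-01"},
    20240212: {"date": date(2024, 2, 12), "month": "2024-02"},
    20240301: {"date": date(2024, 3, 1), "month": "2024-03"},
}

# Each fact row references the date dimension twice, once per role.
sales_facts = [
    {"amount": 100, "order_date_key": 20240105, "ship_date_key": 20240212},
    {"amount": 250, "order_date_key": 20240212, "ship_date_key": 20240301},
]

def total_by_month(facts, role_key):
    """Total sales by month, using the chosen role of the date dimension."""
    totals = {}
    for fact in facts:
        month = date_dim[fact[role_key]]["month"]
        totals[month] = totals.get(month, 0) + fact["amount"]
    return totals

by_order_month = total_by_month(sales_facts, "order_date_key")
by_ship_month = total_by_month(sales_facts, "ship_date_key")
print(by_order_month, by_ship_month)
```

The dimension table exists once; only the column used to reach it changes with the role.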
Types of dimensions
Reference dimensions
• A reference dimension is related to the measure group through another dimension
• For example, the Geography dimension is related to InternetSales through Customer, and is therefore a reference dimension
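The two-hop relationship can be sketched as follows: the sales facts carry only a customer key, and the customer row carries the geography key. The table contents and the `sales_by_country` helper are hypothetical.

```python
# Hypothetical tables; keys and values are made up for illustration.
geography = {10: {"country": "US"}, 20: {"country": "CA"}}
customer = {
    1: {"name": "C1", "geography_key": 10},
    2: {"name": "C2", "geography_key": 20},
    3: {"name": "C3", "geography_key": 10},
}
internet_sales = [
    {"customer_key": 1, "amount": 50},
    {"customer_key": 2, "amount": 30},
    {"customer_key": 3, "amount": 20},
]

def sales_by_country(facts):
    """Reach Geography from the facts by going through Customer."""
    totals = {}
    for fact in facts:
        geo_key = customer[fact["customer_key"]]["geography_key"]
        country = geography[geo_key]["country"]
        totals[country] = totals.get(country, 0) + fact["amount"]
    return totals

country_totals = sales_by_country(internet_sales)
print(country_totals)
```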
DM dimensions, M:N dimensions, and slowly changing dimensions
Data mining (DM) dimension
• The values of the dimension come from data mining algorithms
Many-to-many dimension
• Not to be used
Slowly changing dimension
• Type 1
• Type 2
• Type 3
Slowly Changing Dimension
As the name suggests
• An employee got promoted in December 2012; she is now the GM, but was a vendor manager before. How do we reflect that?
• There are many ways to deal with this. We introduce three common approaches, named Type I, II, and III.
The discussions here are based on Wikipedia
Slowly Changing Dimension – Type I
Before
Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State
123           ABC            Acme Supply Co  CA
After
Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State
123           ABC            Acme Supply Co  IL
Then, the “After” info is all you are going to see
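A Type I change is simply an overwrite. The minimal Python sketch below uses the supplier row from the slide; the dictionary layout and the `apply_type1` helper are assumptions for illustration.

```python
# Supplier dimension keyed by Supplier_Key (layout is illustrative).
supplier_dim = {
    123: {"Supplier_Code": "ABC", "Supplier_Name": "Acme Supply Co", "Supplier_State": "CA"},
}

def apply_type1(dim, key, column, new_value):
    """Type I: overwrite the attribute in place; the old value is lost."""
    dim[key][column] = new_value

apply_type1(supplier_dim, 123, "Supplier_State", "IL")
print(supplier_dim[123])
```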
Slowly Changing Dimension – Type II
Before
Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State
123           ABC            Acme Supply Co  CA
After
Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State
123           ABC            Acme Supply Co  IL
Then, add additional info
Using a version number:
Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State  Version
123           ABC            Acme Supply Co  CA              0
124           ABC            Acme Supply Co  IL              1
Using start and end dates:
Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State  Start_Date   End_Date
123           ABC            Acme Supply Co  CA              01-Jan-2000  21-Dec-2004
124           ABC            Acme Supply Co  IL              22-Dec-2004
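The start-date and end-date variant of Type II can be sketched in Python as "close the current row, then append a new one". The list-of-dictionaries layout, the `apply_type2` helper, and the rule of taking the next surrogate key as the maximum key plus one are assumptions for illustration; the dates follow the slide.

```python
from datetime import date, timedelta

supplier_dim = [
    {"Supplier_Key": 123, "Supplier_Code": "ABC", "Supplier_Name": "Acme Supply Co",
     "Supplier_State": "CA", "Start_Date": date(2000, 1, 1), "End_Date": None},
]

def apply_type2(dim, code, new_state, change_date):
    """Type II: end-date the current row and add a new row with a new key."""
    current = next(r for r in dim
                   if r["Supplier_Code"] == code and r["End_Date"] is None)
    current["End_Date"] = change_date - timedelta(days=1)
    new_row = dict(
        current,
        Supplier_Key=max(r["Supplier_Key"] for r in dim) + 1,  # assumed key rule
        Supplier_State=new_state,
        Start_Date=change_date,
        End_Date=None,  # an open end date marks the current version
    )
    dim.append(new_row)
    return new_row

apply_type2(supplier_dim, "ABC", "IL", date(2004, 12, 22))
for row in supplier_dim:
    print(row)
```

Every further change adds one more row, so the full history is preserved.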
Slowly Changing Dimension – Type III
Before
Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State
123           ABC            Acme Supply Co  CA
After
Supplier_Key  Supplier_Code  Supplier_Name   Supplier_State
123           ABC            Acme Supply Co  IL
Then, add additional info
Supplier_Key  Supplier_Code  Supplier_Name   Original_Supplier_State  Effective_Date  Current_Supplier_State
123           ABC            Acme Supply Co  CA                       22-Dec-2004     IL
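Type III keeps history in extra columns rather than extra rows. The Python sketch below handles only the first change to a supplier's state, which is the case shown on the slide; the `apply_type3` helper is a hypothetical name.

```python
from datetime import date

supplier = {"Supplier_Key": 123, "Supplier_Code": "ABC",
            "Supplier_Name": "Acme Supply Co", "Supplier_State": "CA"}

def apply_type3(row, new_state, effective_date):
    """Type III (first change only): keep the original value in its own column."""
    return {
        "Supplier_Key": row["Supplier_Key"],
        "Supplier_Code": row["Supplier_Code"],
        "Supplier_Name": row["Supplier_Name"],
        "Original_Supplier_State": row["Supplier_State"],
        "Effective_Date": effective_date,
        "Current_Supplier_State": new_state,
    }

updated = apply_type3(supplier, "IL", date(2004, 12, 22))
print(updated)
```

The row count and the key stay the same, which is why only a fixed number of past values can be kept.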
Slowly Changing Dimension – Type IV
[Figures: Type IV; Type II]
SCD – another example
Per http://www.learndatamodeling.com
First price
Second price
SCD – another example: Type I
• Use the second price to replace the first one everywhere; the first price will no longer be in the DM
SCD – another example: Type II
• Approach I – use product ID and year as the key
• Approach II – convert year to an effective date (Effective DT)
SCD – another example: Type III
• Add columns for the previous price and its year
The difference between Type III and Type II
When we add more product price changes:
• Type II can handle an unlimited number of changes by just adding records
• Type III can only handle a limited number of changes, whether that is the first and last, the last two, or some other fixed set
Deploying and Processing
Deploying
• Send your definitions to Analysis Services
Processing
• Perform all the prescribed calculations
Tools are
• SSDT – to deploy and process (trigger these activities)
• SSMS – to check the results
• Analysis Services Deployment Wizard – a more advanced tool that generates scripts for automated deployment to production (skip)
Other Bells and Whistles
Linked objects – especially linked measures
• Allow you to combine objects from other cubes into an existing one
BI wizard
• A tool that lets you add a number of capabilities easily
Define time intelligence
• Period-to-date calculation, rolling average, period-over-period growth
Define currency conversion
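The three time intelligence calculations named above can be sketched in plain Python to show what each one computes. The monthly figures and function names are made up for illustration; the BI wizard generates these calculations inside the cube rather than in code like this.

```python
# Made-up monthly sales within one year.
monthly_sales = [100, 120, 90, 150]

def period_to_date(values):
    """Running total from the start of the period (e.g. year to date)."""
    total, out = 0, []
    for v in values:
        total += v
        out.append(total)
    return out

def rolling_average(values, window):
    """Average over the current and up to (window - 1) preceding periods."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def period_over_period_growth(values):
    """Fractional change versus the previous period; None for the first one."""
    return [None] + [(cur - prev) / prev for prev, cur in zip(values, values[1:])]

print(period_to_date(monthly_sales))
print(rolling_average(monthly_sales, 2))
print(period_over_period_growth(monthly_sales))
```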
Other Bells and Whistles
KPI
• The dashboard that indicates how things are going
Actions
• Make the cube even fancier by allowing other activities, such as following a URL, drilling through, or launching a Reporting Services report
Partitions
• Break a cube's measure group data into several partitions that can be processed concurrently to improve performance
Partitions and storage options
• MOLAP, ROLAP
Other Bells and Whistles
Aggregation Design
• How much aggregation is performed
• Two approaches: usage based or manual
• Usage based is determined by checking the usage log
• Manual is where the developers specify the aggregations
Perspectives – very much like views
Translations – at the metadata level