8/17/2019 DDB_presentation3Skinny Fact Tables
1/26
• As the fact table contains the vast volume of
records it is important that it is memory spaceefficient
• Foreign keys are usually represented in integer
form and do not require much memory space
• Facts too are often numeric properties and canusually be represented as integers (contrast todimensional attributes which are usually long textstrings)
• This space efficiency is critical to the memoryspace consumption of the data warehouse
Skinny fact tables
8/17/2019 DDB_presentation3Skinny Fact Tables
2/26
Keys
• hoose the data warehouse keys to be
meaningless surrogate keys! "et a surrogate key be a simple integer
! #$byte ($$$$$$$$%$$$$$$$$%$$$$$$$$%$$$$$$$$)
can contain &'& values ( & billion positiveintegers% starting with )
Keys
8/17/2019 DDB_presentation3Skinny Fact Tables
3/26
Key
• *se surrogate keys also for the Time dimension
! +,"$based date key% is typically - bytes% so # bytesare wasted! bypassing .oins leads to embedding knowledge of thecalendar in the application% rather than reading itfrom the time dimension
! it is not possible to encode a data stamp as /0 do notknow1% /0t has not happen yet1% etc
• Avoid smart keys• Avoid production keys! production may decide to reuse keys! the company may acquire a competitor and therebychange the key building rules
! changed record% but deliberately not changed key
8/17/2019 DDB_presentation3Skinny Fact Tables
4/26
2ntity$relationship modelling$ a logical design technique to eliminate data
redundancy to keep consistency and storageefficiency
$ makes transaction simple and deterministic$ 23 models for enterprise are usually complex%
e4g4 they often have hundreds% or eventhousands% of entities5tables
6imensional modelling$ a logical design technique that present data in a
intuitive way and that allow high$performanceaccess$ aims at model decision support data$ easier to navigate for the user and high
performance
Dimensional modelling vs. ER-modelling
8/17/2019 DDB_presentation3Skinny Fact Tables
5/26
Why dimensional modelling?
• the logical model is easy understand• a predictable standard framework for end user
applications• the logical design can be done nearly independent of
expected query pattern• handle changes easy $ at least adding new dimensional
attributes• high performance /browsing1 across the attributes%
eliminating .oins and make use bit vector indexes• strategy to handling aggregates% e4g4 summery records that
are logical redundant with base table to enhance query
performance• the database engine can make strong assumption how tooptimise
• strategies for handling slowly changing dimensions%heterogeneous products% event$handling (/factless fact
tables1)
8/17/2019 DDB_presentation3Skinny Fact Tables
6/26
Steps in the Design Process
4 hoose a business process to model
A business process is a ma.or operational process in an organisation% that is supported by some kind
of a legacy system(s) from which data can be
collected% e4g4% orders% invoices% shipments%
inventory4
&4 hoose the grain of the business process
The grains is the level of detail at which the data isrepresented in the 674 Typical grains are
individual transactions% individual daily (monthly)
snapshots4
8/17/2019 DDB_presentation3Skinny Fact Tables
7/26
eps n e es gn rocess
'4 hoose the dimensions that will apply toeach fact table record4 Typicaldimensions are time% product% customer%
store% etc4
#4hoose the measured facts that will
populate fact table 24g4% quantity sold%dollars sold
8/17/2019 DDB_presentation3Skinny Fact Tables
8/26
Eample of a Dimensional !odel
8/17/2019 DDB_presentation3Skinny Fact Tables
9/26
"ggregation• Aggregations can be created on$the$fly or by theprocess of pre$aggregation
• An aggregate is a fact table record representing asummarisation of base$level fact table records
! ategory$level product aggregates by store by day! 6istrict$level store aggregates by product by day! 8onthly sales aggregates by product by store! ategory$level product aggregates by store district
by day! ategory$level product aggregates by store districtby month
8/17/2019 DDB_presentation3Skinny Fact Tables
10/26
#$%P vs. #$"P
8/17/2019 DDB_presentation3Skinny Fact Tables
11/26
How to store Aggregates?
• as new "evel fields in an already existing
Fact table
• as new fact tables
8/17/2019 DDB_presentation3Skinny Fact Tables
12/26
8/17/2019 DDB_presentation3Skinny Fact Tables
13/26
"n Eample
9 :sold napkin5day9 :sold tissue5day9 :sold paper5day
8/17/2019 DDB_presentation3Skinny Fact Tables
14/26
8/17/2019 DDB_presentation3Skinny Fact Tables
15/26
8/17/2019 DDB_presentation3Skinny Fact Tables
16/26
;ew Fact table for Aggregates
The creation of aggregate fact tablerequires the creation of<a derivative dimensionan artificial key for each new derivative dimension
8/17/2019 DDB_presentation3Skinny Fact Tables
17/26
&o' to store "ggregates
• as new "evel fields in an already existing Fact table
! problems with double count! visible for the users
• as new fact tables
= no problems with double count= invisible for the users= are easily introduced and5or reduced at differentpoints in time
= simpler metadata= simpler choice of key= the si>e of the field for the summarised data doesnot increase the si>e of the field for the basic data
8/17/2019 DDB_presentation3Skinny Fact Tables
18/26
"ggregation (avigator
8/17/2019 DDB_presentation3Skinny Fact Tables
19/26
"n Eample of S)$ )*ery
A E l f SQL Q (2)
8/17/2019 DDB_presentation3Skinny Fact Tables
20/26
An Example of SQL Query (2)
8/17/2019 DDB_presentation3Skinny Fact Tables
21/26
Aggregation Navigator
• 0nsulates end user applications from thechanging portfolio of aggregates
• Allows the 6?A to dynamically andseamlessly for the end user ad.ust theaggregates without having to roll overthe applications base
8/17/2019 DDB_presentation3Skinny Fact Tables
22/26
Aggregation Summary
• @re$aggregation demands more storagespace but provides better queryperformance
• "owest level of aggregation is determinedby the granularity of the fact table
• Aggregation is easier when facts are alladditive
8/17/2019 DDB_presentation3Skinny Fact Tables
23/26
Fats an! A!!itive "roperty
8/17/2019 DDB_presentation3Skinny Fact Tables
24/26
Semia!!itive fat Example
8/17/2019 DDB_presentation3Skinny Fact Tables
25/26
Numeri #easures of $ntensity
8/17/2019 DDB_presentation3Skinny Fact Tables
26/26
En!
Top Related