Data Vault Affecto Nordics Webinar Q4 2013

Data Vault Modeling & Approach DW2.0 & Unstructured Data Master Data Management Agile DW Data Vault Modeling the Agile Data Warehouse © 2013 Genesee Academy, LLC Hans P. Hultgren Webinar Event Q4 2013 gohansgo


Awareness Sessions for Data Vault Data Modeling for the Agile Data Warehouse. Includes DWBI Agility, Ensemble Modeling and DV core concepts.

Transcript of Data Vault Affecto Nordics Webinar Q4 2013

Page 1: Data Vault Affecto Nordics Webinar Q4 2013

Data Vault Modeling & Approach; DW2.0 & Unstructured Data; Master Data Management; Agile DW

Data Vault

Modeling the Agile Data Warehouse

© 2013 Genesee Academy, LLC

Hans P. Hultgren

Webinar Event Q4 2013

gohansgo

Page 2: Data Vault Affecto Nordics Webinar Q4 2013


Data Vault: Modeling the Agile DW

Agenda
• Welcome
• Background
• Unified Decomposition & Modeling Ensemble
• Data Vault Hubs, Links and Satellites
• Working with Data Vault
• Extreme Data Warehouse Agility
• Architecture
• Information Modeling
• Succeeding with the Agile Data Warehouse

About Hans Hultgren: Author, Advisor, Speaker & Industry Analyst; President Genesee Academy LLC, Principal at TopofMinds

Affecto Webinar Event Q4 2013

Book available on Amazon.com

Page 3: Data Vault Affecto Nordics Webinar Q4 2013


The Data Vault modeling approach

• The Data Vault is a data modeling approach, so it fits into the family of modeling approaches:
– 3rd Normal Form
– Data Vault
– Dimensional

• While 3rd Normal Form is optimal for Operational Systems, and Dimensional is optimal for Data Marts, the Data Vault is optimal for the Data Warehouse (EDW)

Page 4: Data Vault Affecto Nordics Webinar Q4 2013

Data Vault Benefits

• Business
– Ability to adapt quickly to new business needs
– Data is traceable, allowing for a fully auditable, integrated data store
– Allows the EDW to absorb all data all of the time
– Easily adapts to new data sources and changing business rules without expensive re-engineering
– Results in a Data Warehouse with lower total cost of ownership (TCO)

• Projects
– Ideal for agile development techniques, resulting in lower project risk and more frequent deliverables
– Can be built incrementally without compromising the core architecture

• Architecture
– Parallel loading and restartability
– Architecture that supports future expanded scope
– Can scale to virtually any size without breaking down

Page 5: Data Vault Affecto Nordics Webinar Q4 2013

A Saga of Data Warehousing

Once upon a time data warehousing was becoming more popular and everyone was eager to build their own. But whenever they tried they failed. They called upon their best to fix this but they just couldn’t solve the problem.

They discovered that meeting the needs of the data warehouse meant that the tables got too big and too hard to work with. They just could not handle changes over time. If the smallest thing changed it always meant they had to change the entire table. When just a single attribute was updated they had to insert a record for all of the attributes. All seemed lost.

But around the world there were rebels who questioned the conventional wisdom. And their voices were finally heard: Why not separate the things that change from the things that don’t change?

Page 6: Data Vault Affecto Nordics Webinar Q4 2013

Unified Decomposition™

• Separating the things that change from the things that don’t change.
• Break things out into component parts for flexibility, and to capture things that
– are interpreted in different ways or
– change independently of each other

Page 7: Data Vault Affecto Nordics Webinar Q4 2013

Ensemble Modeling™


• The constellation of component parts acts as a whole – an Ensemble.

• With Ensemble Modeling the Core Business Concepts that we define and model are represented as a whole – an ensemble – including all of the component parts.

All the parts of a thing taken together, so that each part is considered only in relation to the whole.

Page 8: Data Vault Affecto Nordics Webinar Q4 2013

The Data Vault Ensemble

• The Data Vault Ensemble conforms to a single key – embodied in the Hub construct.

• The component parts for the Data Vault Ensemble include:
– Hub: The Natural Business Key
– Link: The Natural Business Relationships
– Satellite: All Context, Descriptive Data and History

Page 9: Data Vault Affecto Nordics Webinar Q4 2013

Hubs

– A Hub Construct in Data Vault
• contains Business Key
• only the Business Key
• contains No Context
• is always 1:1 with EWBK (Enterprise Wide Business Key)

– A Hub Table contains only
• Business Key
• Surrogate Key (Data Warehouse)
• Load Date / Time Stamp
• Record Source

[Figure: H_Customer hub table with columns H_Customer_SID, Business Key, Date/Time Stamp, Record source]
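The hub rules above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the presenter's implementation: the class names HubRow and CustomerHub, the in-memory dictionary, and the sequential surrogate key are hypothetical choices made for this example. The point it shows is that a hub row carries the business key, a warehouse surrogate key, a load date/time stamp and a record source, and that a given business key is stored only once.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from itertools import count
from typing import Dict, Optional


@dataclass(frozen=True)
class HubRow:
    h_customer_sid: int      # surrogate key assigned by the data warehouse
    business_key: str        # the business key, with no context attached
    load_dts: datetime       # load date / time stamp
    record_source: str       # system that first delivered this key


class CustomerHub:
    """Hypothetical in-memory stand-in for an H_Customer hub table."""

    def __init__(self) -> None:
        self._rows: Dict[str, HubRow] = {}
        self._next_sid = count(1)  # assumed: simple sequential surrogate keys

    def load(self, business_key: str, record_source: str,
             load_dts: Optional[datetime] = None) -> HubRow:
        # A key that is already present is never inserted again.
        existing = self._rows.get(business_key)
        if existing is not None:
            return existing
        row = HubRow(
            h_customer_sid=next(self._next_sid),
            business_key=business_key,
            load_dts=load_dts or datetime.now(timezone.utc),
            record_source=record_source,
        )
        self._rows[business_key] = row
        return row

    def __len__(self) -> int:
        return len(self._rows)


hub = CustomerHub()
first = hub.load("CUST-100", record_source="CRM")
again = hub.load("CUST-100", record_source="ERP")   # same key, second source
other = hub.load("CUST-200", record_source="CRM")
```

In this sketch the second load of CUST-100 returns the row created by the first load, so the hub keeps the original record source and load stamp for that key.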

Page 10: Data Vault Affecto Nordics Webinar Q4 2013

Links

– A Link Construct in Data Vault
• contains Relationship
• only a Relationship
• contains No Context
• is always 1:1 with Relationship

– A Link Table contains only
• 2-n FKs for the Relationship
• Surrogate Key (Data Warehouse)
• Load Date / Time Stamp
• Record Source

– Business Relationship
• Unique
• Specific
• Natural

[Figure: L_Cust_Class link table (L_Cust_Class_SID, H_Customer_SID, H_Sequence2_SID, Date/Time Stamp, Record source) connected to the H_Customer hub table (H_Customer_SID, Business Key, Date/Time Stamp, Record source)]
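As a rough sketch of the link rules above, the following Python example stores one row per unique combination of hub surrogate keys. The names LinkRow and CustClassLink, the tuple of hub keys, and the sequential surrogate key are assumptions made for illustration; the slide itself only lists the columns.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from itertools import count
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class LinkRow:
    l_cust_class_sid: int        # surrogate key assigned by the data warehouse
    hub_sids: Tuple[int, ...]    # 2-n foreign keys to hubs, no context
    load_dts: datetime           # load date / time stamp
    record_source: str


class CustClassLink:
    """Hypothetical in-memory stand-in for an L_Cust_Class link table."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[int, ...], LinkRow] = {}
        self._next_sid = count(1)  # assumed: simple sequential surrogate keys

    def load(self, hub_sids: Tuple[int, ...], record_source: str,
             load_dts: Optional[datetime] = None) -> LinkRow:
        key = tuple(hub_sids)
        if len(key) < 2:
            raise ValueError("a link needs at least two hub keys")
        # Each unique relationship is stored once.
        existing = self._rows.get(key)
        if existing is not None:
            return existing
        row = LinkRow(next(self._next_sid), key,
                      load_dts or datetime.now(timezone.utc), record_source)
        self._rows[key] = row
        return row


link = CustClassLink()
rel_a = link.load((1, 7), record_source="CRM")    # customer 1 in class 7
rel_a2 = link.load((1, 7), record_source="ERP")   # same relationship again
rel_b = link.load((1, 8), record_source="CRM")    # customer 1 moves to class 8
```

A changed relationship appears here as a new link row rather than an update of the old one, which is one way to keep the relationship history intact.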

Page 11: Data Vault Affecto Nordics Webinar Q4 2013

Satellites

– A Satellite Construct in Data Vault
• contains Context only
• has no FKs (no relationships)
• Designed by
  * Rate of Change
  * Type of Data
  * System…

– A Satellite Table contains only
• Business Key FK +
• Load Date / Time Stamp
• Context Data…
• Record Source

[Figure: S_Customer satellite table (H_Customer_SID, Date/Time Stamp, Context A, Context B, Context C, Context D, Record source) attached to the H_Customer hub table (H_Customer_SID, Business Key, Date/Time Stamp, Record source)]
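The satellite rules above can be sketched in Python as follows. This is an illustrative assumption about how history might be kept, not a method stated on the slide: the class names SatRow and CustomerSatellite are hypothetical, and the rule "insert a new row only when the context differs from the latest row" is a common loading pattern chosen for this example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass(frozen=True)
class SatRow:
    h_customer_sid: int      # key of the parent hub row
    load_dts: datetime       # load date / time stamp, part of the row identity
    context: dict            # descriptive attributes (Context A, B, C, ...)
    record_source: str


class CustomerSatellite:
    """Hypothetical in-memory stand-in for an S_Customer satellite table."""

    def __init__(self) -> None:
        self._history: Dict[int, List[SatRow]] = {}

    def load(self, h_customer_sid: int, context: dict,
             record_source: str, load_dts: datetime) -> bool:
        rows = self._history.setdefault(h_customer_sid, [])
        # Assumed pattern: skip the insert when nothing has changed.
        if rows and rows[-1].context == context:
            return False
        rows.append(SatRow(h_customer_sid, load_dts, dict(context), record_source))
        return True

    def current(self, h_customer_sid: int) -> Optional[SatRow]:
        rows = self._history.get(h_customer_sid, [])
        return rows[-1] if rows else None

    def history(self, h_customer_sid: int) -> List[SatRow]:
        return list(self._history.get(h_customer_sid, []))


sat = CustomerSatellite()
sat.load(1, {"rating": "C"}, "CRM", datetime(2013, 1, 1))
sat.load(1, {"rating": "B"}, "CRM", datetime(2013, 4, 1))
unchanged = sat.load(1, {"rating": "B"}, "CRM", datetime(2013, 5, 1))
sat.load(1, {"rating": "A"}, "CRM", datetime(2013, 9, 1))
```

Because existing rows are never updated, every earlier rating stays available through history(), while current() returns only the latest one.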

Page 12: Data Vault Affecto Nordics Webinar Q4 2013

Sample: Sales Data Vault Model

Page 13: Data Vault Affecto Nordics Webinar Q4 2013

Sales DV Model - Backbone

[Figure: Sample Model]

Page 14: Data Vault Affecto Nordics Webinar Q4 2013

Data Vault means thinking differently

• The minimal construct then for an “entity” such as “Customer” is now a Hub with a set of Satellites

[Figure: Customer]
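To make the "Hub with a set of Satellites" idea concrete, here is a small Python sketch that reassembles a current Customer record from its parts. Everything in it is hypothetical: the function current_view, the satellite names S_Customer_Rating and S_Customer_Address, and the dictionary layout are assumptions made for illustration only.

```python
from datetime import datetime


def current_view(hub_row, satellites):
    """Merge the latest row of each satellite onto the hub's business key."""
    view = {"business_key": hub_row["business_key"]}
    for rows in satellites.values():
        # Keep only the rows that belong to this hub entry.
        mine = [r for r in rows if r["h_customer_sid"] == hub_row["h_customer_sid"]]
        if mine:
            latest = max(mine, key=lambda r: r["load_dts"])
            view.update(latest["context"])
    return view


hub_row = {"h_customer_sid": 1, "business_key": "CUST-100"}
satellites = {
    "S_Customer_Rating": [
        {"h_customer_sid": 1, "load_dts": datetime(2013, 1, 1), "context": {"rating": "B"}},
        {"h_customer_sid": 1, "load_dts": datetime(2013, 6, 1), "context": {"rating": "A"}},
    ],
    "S_Customer_Address": [
        {"h_customer_sid": 1, "load_dts": datetime(2013, 2, 1), "context": {"city": "Helsinki"}},
        {"h_customer_sid": 2, "load_dts": datetime(2013, 9, 1), "context": {"city": "Oslo"}},
    ],
}
view = current_view(hub_row, satellites)
```

The rating and address satellites change at different rates, yet the sketch can still present them together as one Customer, since each part is tied back to the same hub key.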

Page 15: Data Vault Affecto Nordics Webinar Q4 2013

Comparing the Models

[Figure columns: Operational; Data Warehouse; Data Mart]

Page 16: Data Vault Affecto Nordics Webinar Q4 2013

A Customer Rating Changes 3 times…

[Figure columns: Operational; Data Warehouse; Data Mart]

Page 17: Data Vault Affecto Nordics Webinar Q4 2013

A New Attribute is Added to Address…

[Figure columns: Operational; Data Warehouse; Data Mart]

Page 18: Data Vault Affecto Nordics Webinar Q4 2013

Relationship to Cust_Class Changes…

[Figure columns: Operational; Data Warehouse; Data Mart]

Page 19: Data Vault Affecto Nordics Webinar Q4 2013

Fundamental Architecture

[Figure: architecture diagram. Labels: Extract; Staging; Load; Load D/T Stamp; Integrate; EDW; Raw; BDW; Transform, Calculate, Convert, Cleanse, Profile, Validate; Information Model; Data Mart (three instances)]

Page 20: Data Vault Affecto Nordics Webinar Q4 2013

Succeeding with the Agile DW

[Figure labels: Data Marts; Data Warehouse; Enterprise Data Warehouse]

Applying an agile modeling methodology. This can only be accomplished if the program considers the people, processes, tools and techniques together.

Page 21: Data Vault Affecto Nordics Webinar Q4 2013

About Data Vault Ensemble


Estimated 800 Data Vault based Data Warehouses around the world
