Copyright © 1991 - 2015 R20/Consultancy B.V., The Hague, The Netherlands. All rights
reserved. No part of this material may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photographic, or
otherwise, without the explicit written permission of the copyright owners.
by
Rick F. van der Lans
R20/Consultancy BV
Twitter @rick_vanderlans
www.r20.nl
Data Vault + Data Virtualization = Double Flexibility
Rick F. van der Lans
Rick F. van der Lans is an independent consultant, lecturer, and author. He specializes in data warehousing, business intelligence, database technology, and data virtualization. He is managing director of R20/Consultancy B.V. Rick has been involved in various projects in which data warehousing and integration technology were applied.
Rick van der Lans is an internationally acclaimed lecturer. He has lectured professionally for the last twenty-five years in many European and Middle Eastern countries, the USA, South America, and Australia. He has been invited by several major software vendors to present keynote speeches.
He is the author of several books on computing, including his new Data Virtualization for Business Intelligence Systems. Some of these books are available in different languages. The popular Introduction to SQL is available in English, Dutch, Italian, Chinese, and German and is sold worldwide. He also authored The SQL Guide to Ingres and SQL for MySQL Developers.
As an author for TechTarget.com and BeyeNetwork.com, writer of whitepapers, chairman of the annual European Enterprise Data and Business Intelligence Conference, and columnist for a few IT magazines, he has close contacts with many vendors.
R20/Consultancy B.V. is located in The Hague, The Netherlands, www.r20.nl. You can get in touch with Rick via:
Email: [email protected]
Twitter: @Rick_vanderlans
LinkedIn: http://www.linkedin.com/pub/rick-van-der-lans/9/207/223
Reporting on a Data Vault DW?
[Diagram labels: production databases, staging area, DV EDW, reporting and analytics]
Flexibility is Gone!
[Diagram labels: production databases, staging area, DV EDW, eight data stores]
Physical Data Marts
Define data structures
Define ETL logic
Install a database instance
Create a database
Implement the tables
Design physical database structure
Initial load of the tables
Periodic load of the tables
Tune and optimize the database (regularly)
Tune and optimize ETL logic
Monitor database usage
Develop and run backup and recovery processes
Unload data
Change data structure
Change ETL logic
Tune and optimize physical database design
Tune and optimize ETL logic
Reload data
…
Remarks on Data Marts and Cubes
Gartner in Data Management Cost-Cutting Tips, March 10, 2008:
Consolidate data marts into an application-neutral data warehouse or smaller data marts to reduce the cost and complexity of the data integration processes feeding the data marts. Gartner predicts this could save you 50 percent of what you're spending to support the siloed data marts.
Flexibility Through Data Virtualization
[Diagram labels: production databases, staging area, DV EDW, Data Virtualization Server]
Data Virtualization Overview (1)
[Diagram: a Data Virtualization Server between data consumers and data sources]
Data consumers: production application, website, analytics & reporting, mobile app, internal portal, dashboard
Data sources: production databases, streaming databases, social media data, big data stores, ESB, unstructured data, data warehouse & data marts, external data, private data, applications
Data Virtualization Overview (2)
[Diagram: a Data Virtualization Server between data consumers and data sources, with the interfaces on both sides]
Data consumers: production application, website, analytics & reporting, mobile app, internal portal, dashboard
Consumer-side interfaces: ODBC/SQL, JDBC/SQL, XML/SOAP, REST/JSON, XQuery, MDX/DAX
Source-side interfaces: JMS, SQL, SQL+, XSLT, Hive, Prop., Excel, JSON, CICS, SOAP
Data sources: production databases, streaming databases, social media data, big data stores, ESB, unstructured data, data warehouse & data marts, external data, private data, applications
Arrow labels: SQL statement; JMS message, SQL statement, SOAP message
Indonesian “Rijsttafel”
The Service Hatch
Data Virtualization as Service Hatch
Food: kitchen, service hatch, restaurant
Data: data sources, data virtualization server, end users
The Market of Data Virtualization Servers
Cirro Data Hub
Cisco/Composite Information Server
Denodo Platform
IBM InfoSphere Federation Server
Informatica Data Services
Information Builders EII
Oracle Data Services Integrator
Progress Easyl
Red Hat Teiid and JBoss Data Virtualization
Stone Bond Enterprise Enabler Virtuoso
And many more …
Gartner on Integration Tools
Source: Gartner 2014: Modernize Your Data Integration Capabilities for Diverse Use-Cases, Ted Friedman
Developing Virtual Tables
[Diagram labels: source table, virtual table, data consumer]
Virtual table: may contain row selections, column selections, column concatenations, transformations, column and table name changes, groupings, aggregations, data cleansing, …
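As a rough illustration of what a virtual table definition can contain, the sketch below uses an ordinary SQLite view as a stand-in for a virtual table in a data virtualization server. The table SRC_CUSTOMER, the view V_ACTIVE_CUSTOMER, and all rows are hypothetical and not taken from the deck.

```python
import sqlite3

# Hypothetical source table; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SRC_CUSTOMER (
    ID INTEGER, FIRST_NAME TEXT, LAST_NAME TEXT, CITY TEXT, ACTIVE INTEGER);
INSERT INTO SRC_CUSTOMER VALUES
    (1, 'Ann', 'Jones', ' the hague ', 1),
    (2, 'Bob', 'Smith', 'Delft', 0),
    (3, 'Cas', 'Visser', 'Delft', 1);

-- Stand-in for a virtual table: row selection, column selection,
-- column concatenation, column renaming, and a simple cleansing
-- transformation (trim whitespace, upper-case the city).
CREATE VIEW V_ACTIVE_CUSTOMER (CUSTOMER_ID, FULL_NAME, CITY) AS
SELECT ID, FIRST_NAME || ' ' || LAST_NAME, UPPER(TRIM(CITY))
FROM   SRC_CUSTOMER
WHERE  ACTIVE = 1;
""")

# The data consumer queries the virtual table, not the source table.
virtual_rows = conn.execute(
    "SELECT CUSTOMER_ID, FULL_NAME, CITY "
    "FROM V_ACTIVE_CUSTOMER ORDER BY CUSTOMER_ID"
).fetchall()
print(virtual_rows)
```

The consumer sees only the renamed and cleansed columns; the source table is left unchanged.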
Nesting Virtual Tables
[Diagram labels: source table, virtual table, nested virtual table]
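Nesting can be sketched the same way: one virtual table is defined on top of another instead of on a source table. SQLite views again stand in for virtual tables, and SRC_ORDER, V_PAID_ORDER, and V_REVENUE_PER_CUSTOMER are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SRC_ORDER (
    ORDER_ID INTEGER, CUSTOMER_ID INTEGER, AMOUNT REAL, STATUS TEXT);
INSERT INTO SRC_ORDER VALUES
    (1, 10, 100.0, 'PAID'),
    (2, 10,  50.0, 'PAID'),
    (3, 11,  75.0, 'CANCELLED'),
    (4, 11,  20.0, 'PAID');

-- First-level virtual table: a filter on the source table.
CREATE VIEW V_PAID_ORDER AS
SELECT ORDER_ID, CUSTOMER_ID, AMOUNT
FROM   SRC_ORDER
WHERE  STATUS = 'PAID';

-- Nested virtual table: defined on V_PAID_ORDER, not on the source.
CREATE VIEW V_REVENUE_PER_CUSTOMER AS
SELECT CUSTOMER_ID, SUM(AMOUNT) AS REVENUE
FROM   V_PAID_ORDER
GROUP  BY CUSTOMER_ID;
""")

nested_rows = conn.execute(
    "SELECT CUSTOMER_ID, REVENUE "
    "FROM V_REVENUE_PER_CUSTOMER ORDER BY CUSTOMER_ID"
).fetchall()
print(nested_rows)
```

Because the filter lives in one place, changing the definition of a paid order in V_PAID_ORDER changes every virtual table nested on it.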
Layers of Virtual Tables
[Diagram: layers of virtual tables in the Data Virtualization Server, defined on Database 1, Database 2, Database 3, and Database 4]
Caches Minimize Access to Data Stores
[Diagram labels: virtual table with cache, virtual table without cache]
[Screenshots of cache settings, with callouts: enable caching, table where cache should be stored, refresh specification]
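The three cache settings above (enable caching, cache table, refresh specification) can be mimicked in a minimal sketch, assuming a full refresh that copies the contents of a view into a physical table. The function refresh_cache and all table and view names are hypothetical; real data virtualization servers configure this declaratively and offer more refresh options.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SRC_SALES (REGION TEXT, AMOUNT INTEGER);
INSERT INTO SRC_SALES VALUES ('North', 10), ('North', 5), ('South', 7);

CREATE VIEW V_SALES_PER_REGION AS
SELECT REGION, SUM(AMOUNT) AS TOTAL FROM SRC_SALES GROUP BY REGION;
""")

def refresh_cache(conn, view_name, cache_table):
    """Full refresh: replace the cache table with the view's current contents."""
    # Identifiers cannot be bound as parameters; pass trusted names only.
    conn.execute(f"DROP TABLE IF EXISTS {cache_table}")
    conn.execute(f"CREATE TABLE {cache_table} AS SELECT * FROM {view_name}")
    conn.commit()

def totals(conn, table):
    """Read the totals from either the view or its cache table."""
    return conn.execute(
        f"SELECT REGION, TOTAL FROM {table} ORDER BY REGION").fetchall()

refresh_cache(conn, "V_SALES_PER_REGION", "CACHE_SALES_PER_REGION")
conn.execute("INSERT INTO SRC_SALES VALUES ('South', 3)")

stale = totals(conn, "CACHE_SALES_PER_REGION")  # cache: no source access
live = totals(conn, "V_SALES_PER_REGION")       # view: reads the source
refresh_cache(conn, "V_SALES_PER_REGION", "CACHE_SALES_PER_REGION")
fresh = totals(conn, "CACHE_SALES_PER_REGION")  # cache after a refresh
```

Queries on the cache table never touch the source, at the price of data that is only as current as the last refresh.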
Data Virtualization and Data Vault
[Diagram: data consumers (production application, website, analytics & reporting, mobile app, internal portal, dashboard) access the Data Virtualization Server through ODBC/SQL, JDBC/SQL, XML/SOAP, REST/JSON, XQuery, and MDX/DAX; the server accesses the Data Vault EDW through SQL]
Solution With Data Virtualization
[Diagram layers: Operational Systems, Data Vault EDW, SuperNova Layer, Extended SuperNova Layer, Data Delivery Layer, Users and Reports]
[Other labels: Data Vault, PDB (four times), data storage, data virtualization]
The Challenge: The Versions
Example: A Data Vault Model
[Diagram of an example Data Vault model; layer labels: Operational Systems, Data Vault EDW, SuperNova Layer, Extended SuperNova Layer, Data Delivery Layer]
Example: The SuperNova Model
All the satellite data is added to hubs and links
A record in a hub table represents a version of a hub object
A record in a link table represents a version of a link object
The hub/link id + startdate are the primary keys
Why the Name SuperNova?
Determining Versions of Hubs
Satellite 1 records for hub object 1:
HUB_ID  META_LOAD_DTS        META_LOAD_END_DTS
1       2012-06-01 00:00:00  2013-11-14 23:59:59
1       2013-11-15 00:00:00  2014-03-06 23:59:59
1       2014-03-07 00:00:00  9999-12-31 00:00:00

Satellite 2 records for hub object 1:
HUB_ID  META_LOAD_DTS        META_LOAD_END_DTS
1       2013-06-21 00:00:00  2013-07-20 23:59:59
1       2013-07-21 00:00:00  2013-11-12 23:59:59
1       2013-11-13 00:00:00  9999-12-31 00:00:00

Merged result showing all versions of hub 1:
HUB_ID  STARTDATE            ENDDATE
1       2012-06-01 00:00:00  2013-06-20 23:59:59
1       2013-06-21 00:00:00  2013-07-20 23:59:59
1       2013-07-21 00:00:00  2013-11-12 23:59:59
1       2013-11-13 00:00:00  2013-11-14 23:59:59
1       2013-11-15 00:00:00  2014-03-06 23:59:59
1       2014-03-07 00:00:00  9999-12-31 00:00:00
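The idea behind the merge can be sketched in a few lines of Python before turning to SQL: every distinct start date in any satellite opens a new version of the hub, and each version ends one second before the next one starts. This is a simplified sketch, not the deck's SQL; it assumes the satellite records of a hub are contiguous and that the last record is open-ended. The function merge_versions is a hypothetical helper.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"
END_OF_TIME = "9999-12-31 00:00:00"

# (META_LOAD_DTS, META_LOAD_END_DTS) pairs for hub object 1.
sat1 = [("2012-06-01 00:00:00", "2013-11-14 23:59:59"),
        ("2013-11-15 00:00:00", "2014-03-06 23:59:59"),
        ("2014-03-07 00:00:00", END_OF_TIME)]
sat2 = [("2013-06-21 00:00:00", "2013-07-20 23:59:59"),
        ("2013-07-21 00:00:00", "2013-11-12 23:59:59"),
        ("2013-11-13 00:00:00", END_OF_TIME)]

def merge_versions(*satellites):
    """Return (start, end) pairs, one per distinct start date.

    Assumes contiguous, open-ended histories: a version ends one
    second before the next start date, the last one never ends.
    """
    # ISO-formatted timestamps sort correctly as plain strings.
    starts = sorted({start for sat in satellites for start, _ in sat})
    versions = []
    for i, start in enumerate(starts):
        if i + 1 < len(starts):
            next_start = datetime.strptime(starts[i + 1], FMT)
            end = (next_start - timedelta(seconds=1)).strftime(FMT)
        else:
            end = END_OF_TIME
        versions.append((start, end))
    return versions

hub1_versions = merge_versions(sat1, sat2)
for start, end in hub1_versions:
    print(1, start, end)
```

With three records in each satellite and no shared start dates, hub 1 ends up with six versions.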
Visualization of Merge Process
[Timeline from 1900-01-01 to 9999-12-31: the versions of hub 1 from the satellite1 table and the versions of hub 1 from the satellite2 table are combined (+). Marked dates: 2012-06-01, 2013-06-21, 2013-07-21, 2013-11-13, 2013-11-15, 2014-03-07.]
Step 1 of Determining Hub Versions
Merge all the satellites (with a union operator):

(SELECT HUB_ID, META_LOAD_DTS AS STARTDATE, META_LOAD_END_DTS AS ENDDATE
 FROM   HUB1_SATELLITE1
 UNION
 SELECT HUB_ID, META_LOAD_DTS, META_LOAD_END_DTS
 FROM   HUB1_SATELLITE2)

Intermediate result:
SATELLITES
HUB_ID  STARTDATE            ENDDATE
1       2012-06-01 00:00:00  2013-11-14 23:59:59
1       2013-06-21 00:00:00  2013-07-20 23:59:59
1       2013-07-21 00:00:00  2013-11-12 23:59:59
1       2013-11-13 00:00:00  9999-12-31 00:00:00
1       2013-11-15 00:00:00  2014-03-06 23:59:59
1       2014-03-07 00:00:00  9999-12-31 00:00:00
2       2011-03-20 00:00:00  2012-02-25 23:59:59
2       2012-02-26 00:00:00  2014-02-25 23:59:59
2       2012-02-26 00:00:00  9999-12-31 00:00:00
2       2014-02-26 00:00:00  9999-12-31 00:00:00
3       2013-09-09 00:00:00  2013-11-11 00:00:00
3       2013-11-12 00:00:00  2013-11-12 00:00:00

Note that this result does not include hub object 4, because it has no satellite data.
Step 2 of Determining Hub Versions
Join with the original Hub table and get the business key(s):

SELECT HUB1.HUB_ID, SATELLITES.STARTDATE, SATELLITES.ENDDATE, HUB1.BUSINESS_KEY
FROM   HUB1 LEFT OUTER JOIN
       (SELECT HUB_ID, META_LOAD_DTS AS STARTDATE, META_LOAD_END_DTS AS ENDDATE
        FROM   HUB1_SATELLITE1
        UNION
        SELECT HUB_ID, META_LOAD_DTS, META_LOAD_END_DTS
        FROM   HUB1_SATELLITE2) AS SATELLITES
       ON HUB1.HUB_ID = SATELLITES.HUB_ID

Intermediate result:
STARTDATES
HUB_ID  STARTDATE            ENDDATE              BUSINESS_KEY
1       2012-06-01 00:00:00  2013-11-14 23:59:59  b1
1       2013-06-21 00:00:00  2013-07-20 23:59:59  b1
1       2013-07-21 00:00:00  2013-11-12 23:59:59  b1
1       2013-11-13 00:00:00  9999-12-31 00:00:00  b1
1       2013-11-15 00:00:00  2014-03-06 23:59:59  b1
1       2014-03-07 00:00:00  9999-12-31 00:00:00  b1
2       2011-03-20 00:00:00  2012-02-25 23:59:59  b2
2       2012-02-26 00:00:00  2014-02-25 23:59:59  b2
2       2012-02-26 00:00:00  9999-12-31 00:00:00  b2
2       2014-02-26 00:00:00  9999-12-31 00:00:00  b2
3       2013-09-09 00:00:00  2013-11-11 00:00:00  b3
3       2013-11-12 00:00:00  2013-11-12 00:00:00  b3
4       NULL                 NULL                 b4
-1      NULL                 NULL                 Unknown
-2      NULL                 NULL                 N.a.
Step 3 of Determining Hub Versions
Find for each hub the correct versions:
HUB1_VERSIONS
HUB_ID  STARTDATE            ENDDATE              HUB_BUSINESS_KEY
1       2012-06-01 00:00:00  2013-07-20 23:59:59  b1
1       2013-07-21 00:00:00  2013-11-12 23:59:59  b1
1       2013-11-13 00:00:00  2013-11-14 23:59:59  b1
1       2013-11-15 00:00:00  2014-03-06 23:59:59  b1
1       2014-03-07 00:00:00  9999-12-31 00:00:00  b1
2       2011-03-20 00:00:00  2012-02-25 23:59:59  b2
2       2012-02-26 00:00:00  2014-02-25 23:59:59  b2
2       2014-02-26 00:00:00  9999-12-31 00:00:00  b2
3       2013-09-09 00:00:00  2013-11-11 23:59:59  b3
3       2013-11-12 00:00:00  2013-11-30 00:00:00  b3
4       1900-01-01 00:00:00  9999-12-31 00:00:00  b4
-1      1900-01-01 00:00:00  9999-12-31 00:00:00  Unknown
-2      1900-01-01 00:00:00  9999-12-31 00:00:00  N.a.
The Three Steps Combined
CREATE VIEW HUB1_VERSIONS AS
WITH STARTDATES (HUB_ID, STARTDATE, ENDDATE, BUSINESS_KEY) AS (
   SELECT HUB1.HUB_ID, SATELLITES.STARTDATE, SATELLITES.ENDDATE, HUB1.BUSINESS_KEY
   FROM   HUB1 LEFT OUTER JOIN
          (SELECT HUB_ID, META_LOAD_DTS AS STARTDATE, META_LOAD_END_DTS AS ENDDATE
           FROM   HUB1_SATELLITE1
           UNION
           SELECT HUB_ID, META_LOAD_DTS, META_LOAD_END_DTS
           FROM   HUB1_SATELLITE2) AS SATELLITES
          ON HUB1.HUB_ID = SATELLITES.HUB_ID)
SELECT DISTINCT HUB_ID, STARTDATE,
       CASE WHEN ENDDATE_NEW <= ENDDATE_OLD THEN ENDDATE_NEW ELSE ENDDATE_OLD END AS ENDDATE,
       BUSINESS_KEY
FROM   (SELECT S1.HUB_ID,
               ISNULL(S1.STARTDATE, '1900-01-01 00:00:00') AS STARTDATE,
               (SELECT ISNULL(MIN(STARTDATE - INTERVAL '1' SECOND), '9999-12-31 00:00:00')
                FROM   STARTDATES AS S2
                WHERE  S1.HUB_ID = S2.HUB_ID
                AND    S1.STARTDATE < S2.STARTDATE) AS ENDDATE_NEW,
               ISNULL(S1.ENDDATE, '9999-12-31 00:00:00') AS ENDDATE_OLD,
               S1.BUSINESS_KEY
        FROM   STARTDATES AS S1) AS S3
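To try the combined view on a small scale, the following sketch adapts it to SQLite through Python's sqlite3 module. It is an adaptation under stated assumptions, not the statement from the deck: ISNULL is replaced by SQLite's IFNULL, the interval arithmetic by datetime(..., '-1 second'), timestamps are stored as ISO-formatted text, and only hub objects 1 and 4 are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE HUB1 (HUB_ID INTEGER, BUSINESS_KEY TEXT);
CREATE TABLE HUB1_SATELLITE1 (HUB_ID INTEGER, META_LOAD_DTS TEXT, META_LOAD_END_DTS TEXT);
CREATE TABLE HUB1_SATELLITE2 (HUB_ID INTEGER, META_LOAD_DTS TEXT, META_LOAD_END_DTS TEXT);

INSERT INTO HUB1 VALUES (1, 'b1'), (4, 'b4');   -- hub 4 has no satellite data
INSERT INTO HUB1_SATELLITE1 VALUES
    (1, '2012-06-01 00:00:00', '2013-11-14 23:59:59'),
    (1, '2013-11-15 00:00:00', '2014-03-06 23:59:59'),
    (1, '2014-03-07 00:00:00', '9999-12-31 00:00:00');
INSERT INTO HUB1_SATELLITE2 VALUES
    (1, '2013-06-21 00:00:00', '2013-07-20 23:59:59'),
    (1, '2013-07-21 00:00:00', '2013-11-12 23:59:59'),
    (1, '2013-11-13 00:00:00', '9999-12-31 00:00:00');

CREATE VIEW HUB1_VERSIONS AS
WITH STARTDATES (HUB_ID, STARTDATE, ENDDATE, BUSINESS_KEY) AS (
    -- Steps 1 and 2: union the satellites, then join with the hub.
    SELECT HUB1.HUB_ID, SATELLITES.STARTDATE, SATELLITES.ENDDATE, HUB1.BUSINESS_KEY
    FROM   HUB1 LEFT OUTER JOIN
           (SELECT HUB_ID, META_LOAD_DTS AS STARTDATE, META_LOAD_END_DTS AS ENDDATE
            FROM   HUB1_SATELLITE1
            UNION
            SELECT HUB_ID, META_LOAD_DTS, META_LOAD_END_DTS
            FROM   HUB1_SATELLITE2) AS SATELLITES
           ON HUB1.HUB_ID = SATELLITES.HUB_ID)
-- Step 3: a version ends at its own end date or one second before
-- the next start date of the same hub, whichever comes first.
SELECT DISTINCT HUB_ID, STARTDATE,
       CASE WHEN ENDDATE_NEW <= ENDDATE_OLD THEN ENDDATE_NEW ELSE ENDDATE_OLD END AS ENDDATE,
       BUSINESS_KEY
FROM   (SELECT S1.HUB_ID,
               IFNULL(S1.STARTDATE, '1900-01-01 00:00:00') AS STARTDATE,
               (SELECT IFNULL(datetime(MIN(S2.STARTDATE), '-1 second'),
                              '9999-12-31 00:00:00')
                FROM   STARTDATES AS S2
                WHERE  S1.HUB_ID = S2.HUB_ID
                AND    S1.STARTDATE < S2.STARTDATE) AS ENDDATE_NEW,
               IFNULL(S1.ENDDATE, '9999-12-31 00:00:00') AS ENDDATE_OLD,
               S1.BUSINESS_KEY
        FROM   STARTDATES AS S1) AS S3;
""")

hub1_rows = conn.execute(
    "SELECT STARTDATE, ENDDATE FROM HUB1_VERSIONS "
    "WHERE HUB_ID = 1 ORDER BY STARTDATE").fetchall()
hub4_rows = conn.execute(
    "SELECT STARTDATE, ENDDATE, BUSINESS_KEY FROM HUB1_VERSIONS "
    "WHERE HUB_ID = 4").fetchall()
for row in hub1_rows + hub4_rows:
    print(row)
```

Hub object 4 survives the left outer join with NULL dates, which the view turns into a single version running from 1900-01-01 to 9999-12-31.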
Hubs with Less Than Two Satellites
Hubs with no satellites:

CREATE VIEW HUB3_VERSIONS (HUB_ID, STARTDATE, ENDDATE, BUSINESS_KEY) AS
SELECT HUB_ID, ISNULL(META_LOAD_DTS, '1900-01-01 00:00:00'),
       '9999-12-31 00:00:00', BUSINESS_KEY
FROM   HUB3

Hubs with one satellite:

CREATE VIEW HUB2_VERSIONS (HUB_ID, STARTDATE, ENDDATE, BUSINESS_KEY) AS
SELECT HUB2.HUB_ID,
       ISNULL(HUB2_SATELLITE1.META_LOAD_DTS, '1900-01-01 00:00:00'),
       ISNULL(HUB2_SATELLITE1.META_LOAD_END_DTS, '9999-12-31 00:00:00'),
       HUB2.BUSINESS_KEY
FROM   HUB2 LEFT OUTER JOIN HUB2_SATELLITE1
       ON HUB2.HUB_ID = HUB2_SATELLITE1.HUB_ID
Creating the SuperNova Hub Views
A hub is joined with all its satellites using the data in the hub_versions views:

CREATE VIEW SUPERNOVA_HUB1
   (HUB_ID, STARTDATE, ENDDATE, BUSINESS_KEY, ATTRIBUTE1, ATTRIBUTE2) AS
SELECT HUB1_VERSIONS.HUB_ID, HUB1_VERSIONS.STARTDATE, HUB1_VERSIONS.ENDDATE,
       HUB1_VERSIONS.BUSINESS_KEY, HUB1_SATELLITE1.ATTRIBUTE, HUB1_SATELLITE2.ATTRIBUTE
FROM   HUB1_VERSIONS
       LEFT OUTER JOIN HUB1_SATELLITE1
       ON  HUB1_VERSIONS.HUB_ID = HUB1_SATELLITE1.HUB_ID
       AND (HUB1_VERSIONS.STARTDATE <= HUB1_SATELLITE1.META_LOAD_END_DTS
            AND HUB1_VERSIONS.ENDDATE >= HUB1_SATELLITE1.META_LOAD_DTS)
       LEFT OUTER JOIN HUB1_SATELLITE2
       ON  HUB1_VERSIONS.HUB_ID = HUB1_SATELLITE2.HUB_ID
       AND (HUB1_VERSIONS.STARTDATE <= HUB1_SATELLITE2.META_LOAD_END_DTS
            AND HUB1_VERSIONS.ENDDATE >= HUB1_SATELLITE2.META_LOAD_DTS)
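The join conditions in this view are interval-overlap tests: a satellite record belongs to a version when the version starts no later than the record ends and ends no earlier than the record starts. The sketch below runs the view on SQLite for hub object 1 only. As an assumption made for brevity, HUB1_VERSIONS is loaded as a plain table instead of being defined as a view, and the sample rows follow the hub 1 example used throughout this deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumption: precomputed versions stored as a table to keep this short.
CREATE TABLE HUB1_VERSIONS (HUB_ID INTEGER, STARTDATE TEXT, ENDDATE TEXT, BUSINESS_KEY TEXT);
INSERT INTO HUB1_VERSIONS VALUES
    (1, '2012-06-01 00:00:00', '2013-06-20 23:59:59', 'b1'),
    (1, '2013-06-21 00:00:00', '2013-07-20 23:59:59', 'b1'),
    (1, '2013-07-21 00:00:00', '2013-11-12 23:59:59', 'b1'),
    (1, '2013-11-13 00:00:00', '2013-11-14 23:59:59', 'b1'),
    (1, '2013-11-15 00:00:00', '2014-03-06 23:59:59', 'b1'),
    (1, '2014-03-07 00:00:00', '9999-12-31 00:00:00', 'b1');

CREATE TABLE HUB1_SATELLITE1
    (HUB_ID INTEGER, META_LOAD_DTS TEXT, META_LOAD_END_DTS TEXT, ATTRIBUTE TEXT);
INSERT INTO HUB1_SATELLITE1 VALUES
    (1, '2012-06-01 00:00:00', '2013-11-14 23:59:59', 'a1'),
    (1, '2013-11-15 00:00:00', '2014-03-06 23:59:59', 'a2'),
    (1, '2014-03-07 00:00:00', '9999-12-31 00:00:00', 'a3');

CREATE TABLE HUB1_SATELLITE2
    (HUB_ID INTEGER, META_LOAD_DTS TEXT, META_LOAD_END_DTS TEXT, ATTRIBUTE TEXT);
INSERT INTO HUB1_SATELLITE2 VALUES
    (1, '2013-06-21 00:00:00', '2013-07-20 23:59:59', 'a7'),
    (1, '2013-07-21 00:00:00', '2013-11-12 23:59:59', 'a8'),
    (1, '2013-11-13 00:00:00', '9999-12-31 00:00:00', 'a9');

CREATE VIEW SUPERNOVA_HUB1
    (HUB_ID, STARTDATE, ENDDATE, BUSINESS_KEY, ATTRIBUTE1, ATTRIBUTE2) AS
SELECT V.HUB_ID, V.STARTDATE, V.ENDDATE, V.BUSINESS_KEY, S1.ATTRIBUTE, S2.ATTRIBUTE
FROM   HUB1_VERSIONS AS V
       LEFT OUTER JOIN HUB1_SATELLITE1 AS S1
       ON  V.HUB_ID = S1.HUB_ID
       AND V.STARTDATE <= S1.META_LOAD_END_DTS   -- version starts before record ends
       AND V.ENDDATE   >= S1.META_LOAD_DTS       -- version ends after record starts
       LEFT OUTER JOIN HUB1_SATELLITE2 AS S2
       ON  V.HUB_ID = S2.HUB_ID
       AND V.STARTDATE <= S2.META_LOAD_END_DTS
       AND V.ENDDATE   >= S2.META_LOAD_DTS;
""")

attributes = conn.execute(
    "SELECT ATTRIBUTE1, ATTRIBUTE2 FROM SUPERNOVA_HUB1 ORDER BY STARTDATE"
).fetchall()
for row in attributes:
    print(row)
```

Because the version boundaries coincide with the satellite boundaries, each version overlaps at most one record per satellite, so the join yields exactly one row per version. In this sample the first version has no ATTRIBUTE2 value, since satellite 2 has no record before 2013-06-21.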
Virtual Contents of the SuperNova Hub
SUPERNOVA_HUB1
HUB_ID  STARTDATE            ENDDATE              HUB_BUSINESS_KEY  ATTRIBUTE1  ATTRIBUTE2
1       2012-06-01 00:00:00  2013-07-20 23:59:59  b1                a1          a7
1       2013-07-21 00:00:00  2013-11-12 23:59:59  b1                a1          a8
1       2013-11-13 00:00:00  2013-11-14 23:59:59  b1                a1          a9
1       2013-11-15 00:00:00  2014-03-06 23:59:59  b1                a2          a9
1       2014-03-07 00:00:00  9999-12-31 00:00:00  b1                a3          a9
2       2011-03-20 00:00:00  2012-02-25 23:59:59  b2                a4          a10
2       2012-02-26 00:00:00  2014-02-25 23:59:59  b2                a5          a11
2       2014-02-26 00:00:00  9999-12-31 00:00:00  b2                a6          a11
3       2013-09-09 00:00:00  2013-11-11 23:59:59  b3                NULL        a12
3       2013-11-12 00:00:00  2013-11-30 00:00:00  b3                NULL        a13
4       1900-01-01 00:00:00  9999-12-31 00:00:00  b4                NULL        NULL
-1      1900-01-01 00:00:00  9999-12-31 00:00:00  Unknown           NULL        NULL
-2      1900-01-01 00:00:00  9999-12-31 00:00:00  N.a.              NULL        NULL
Creating Version Views for Links
CREATE VIEW LINK_VERSIONS AS
WITH STARTDATES (LINK_ID, STARTDATE, ENDDATE, HUB1_ID, HUB2_ID, EVENTDATE) AS (
   SELECT LINK.LINK_ID, SATELLITES.STARTDATE, SATELLITES.ENDDATE,
          LINK.HUB1_ID, LINK.HUB2_ID, LINK.EVENTDATE
   FROM   LINK LEFT OUTER JOIN
          (SELECT LINK_ID, META_LOAD_DTS AS STARTDATE, META_LOAD_END_DTS AS ENDDATE
           FROM   LINK_SATELLITE1
           UNION
           SELECT LINK_ID, META_LOAD_DTS, META_LOAD_END_DTS
           FROM   LINK_SATELLITE2) AS SATELLITES
          ON LINK.LINK_ID = SATELLITES.LINK_ID)
SELECT DISTINCT LINK_ID, STARTDATE,
       CASE WHEN ENDDATE_NEW <= ENDDATE_OLD THEN ENDDATE_NEW ELSE ENDDATE_OLD END AS ENDDATE,
       HUB1_ID, HUB2_ID, EVENTDATE
FROM   (SELECT S1.LINK_ID,
               ISNULL(S1.STARTDATE, '1900-01-01') AS STARTDATE,
               (SELECT ISNULL(MIN(STARTDATE - INTERVAL '1' SECOND), '9999-12-31 00:00:00')
                FROM   STARTDATES AS S2
                WHERE  S1.LINK_ID = S2.LINK_ID
                AND    S1.STARTDATE < S2.STARTDATE) AS ENDDATE_NEW,
               ISNULL(S1.ENDDATE, '9999-12-31') AS ENDDATE_OLD,
               S1.HUB1_ID, S1.HUB2_ID, S1.EVENTDATE
        FROM   STARTDATES AS S1) AS S3
Creating the SuperNova Link Views
A link is joined with all its satellites using the data in the link_versions views:

CREATE VIEW SUPERNOVA_LINK
   (LINK_ID, HUB1_ID, HUB2_ID, STARTDATE, ENDDATE, EVENTDATE, ATTRIBUTE1, ATTRIBUTE2) AS
SELECT LINK_VERSIONS.LINK_ID, LINK_VERSIONS.HUB1_ID, LINK_VERSIONS.HUB2_ID,
       LINK_VERSIONS.STARTDATE, LINK_VERSIONS.ENDDATE, LINK_VERSIONS.EVENTDATE,
       LINK_SATELLITE1.ATTRIBUTE, LINK_SATELLITE2.ATTRIBUTE
FROM   LINK_VERSIONS
       LEFT OUTER JOIN LINK_SATELLITE1
       ON  LINK_VERSIONS.LINK_ID = LINK_SATELLITE1.LINK_ID
       AND (LINK_VERSIONS.STARTDATE <= LINK_SATELLITE1.META_LOAD_END_DTS
            AND LINK_VERSIONS.ENDDATE >= LINK_SATELLITE1.META_LOAD_DTS)
       LEFT OUTER JOIN LINK_SATELLITE2
       ON  LINK_VERSIONS.LINK_ID = LINK_SATELLITE2.LINK_ID
       AND (LINK_VERSIONS.STARTDATE <= LINK_SATELLITE2.META_LOAD_END_DTS
            AND LINK_VERSIONS.ENDDATE >= LINK_SATELLITE2.META_LOAD_DTS)
Virtual Contents of the SuperNova Link
LINK_VERSIONS
LINK_ID  STARTDATE            ENDDATE              HUB1_ID  HUB2_ID  EVENTDATE
1        2013-12-01 00:00:00  2013-12-24 23:59:59  1        5        2013-12-01
1        2013-12-25 00:00:00  2014-01-23 23:59:59  1        5        2013-12-01
1        2014-01-24 00:00:00  9999-12-31 00:00:00  1        5        2013-12-01
2        2014-03-12 00:00:00  9999-12-31 00:00:00  1        6        2014-01-01
3        2013-12-27 00:00:00  2014-02-01 23:59:59  2        6        2013-12-25
3        2014-02-02 00:00:00  9999-12-31 00:00:00  2        6        2013-12-25
4        2013-12-08 00:00:00  9999-12-31 00:00:00  3        -1       2013-06-24
Lineage Analysis of All Views
Defining Primary and Foreign Keys
Caching of SuperNova Views
The Extended SuperNova Model
Add derived data
Transform data
Reuse of definitions
Always use the XSN layer
[Diagram layers: Operational Systems, Data Vault EDW, SuperNova Layer, Extended SuperNova Layer, Data Delivery Layer]
The Data Delivery Model
Data is shown in a filtered manner
Data is shown in aggregated form
Data is shown in one large, highly denormalized table
Data is shown in a star schema form
Data is shown with a service interface
…
[Diagram layers: Operational Systems, Data Vault EDW, SuperNova Layer, Extended SuperNova Layer, Data Delivery Layer]
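Two of these delivery styles, filtered and aggregated, can be sketched as views on top of a SuperNova hub. This is a simplified illustration: the views DD_HUB1_CURRENT and DD_HUB1_VERSION_COUNT are hypothetical names, SUPERNOVA_HUB1 is loaded as a plain table with a few sample rows, and the views are defined directly on it rather than on an Extended SuperNova layer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumption: a small SuperNova hub stored as a table for this sketch.
CREATE TABLE SUPERNOVA_HUB1 (
    HUB_ID INTEGER, STARTDATE TEXT, ENDDATE TEXT,
    BUSINESS_KEY TEXT, ATTRIBUTE1 TEXT, ATTRIBUTE2 TEXT);
INSERT INTO SUPERNOVA_HUB1 VALUES
    (1, '2013-11-15 00:00:00', '2014-03-06 23:59:59', 'b1', 'a2', 'a9'),
    (1, '2014-03-07 00:00:00', '9999-12-31 00:00:00', 'b1', 'a3', 'a9'),
    (2, '2012-02-26 00:00:00', '2014-02-25 23:59:59', 'b2', 'a5', 'a11'),
    (2, '2014-02-26 00:00:00', '9999-12-31 00:00:00', 'b2', 'a6', 'a11');

-- Filtered delivery: only the current version of each hub object.
CREATE VIEW DD_HUB1_CURRENT AS
SELECT HUB_ID, BUSINESS_KEY, ATTRIBUTE1, ATTRIBUTE2
FROM   SUPERNOVA_HUB1
WHERE  ENDDATE = '9999-12-31 00:00:00';

-- Aggregated delivery: number of versions per business key.
CREATE VIEW DD_HUB1_VERSION_COUNT AS
SELECT BUSINESS_KEY, COUNT(*) AS NUMBER_OF_VERSIONS
FROM   SUPERNOVA_HUB1
GROUP  BY BUSINESS_KEY;
""")

current = conn.execute(
    "SELECT * FROM DD_HUB1_CURRENT ORDER BY HUB_ID").fetchall()
counts = conn.execute(
    "SELECT * FROM DD_HUB1_VERSION_COUNT ORDER BY BUSINESS_KEY").fetchall()
print(current)
print(counts)
```

Both views read the same underlying definitions, so a report that needs history and a report that needs only current data stay consistent without a second physical copy.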
Virtual Data Marts
Define data structures
Define ETL/DV logic
Install a database instance
Create a database
Implement the tables
Design physical database structure
Initial load of the tables
Periodic load of the tables
Tune and optimize the database (regularly)
Tune and optimize ETL logic
Monitor database usage
Develop and run backup and recovery processes
Unload data
Change data structure
Change ETL/DV logic
Tune and optimize physical database design
Tune and optimize ETL logic
Reload data
…
Why Not Database Views?
Not database server independent
More advanced distributed join features
More advanced heterogeneous join features
More advanced caching/refreshing features
Database views offer no lineage/impact analysis
Database views offer only one API: SQL
No versioning of joins
No data cleansing features
No business glossary
…
The Whitepaper
Download: www.r20.nl or http://www.cisco.com/web/services/enterprise-it-services/data-virtualization/documents/whitepaper-cisco-datavaul.pdf
Closing Remarks
Data Vault offers data model extensibility and report reproducibility
Data Vault is half the solution
SuperNova (with data virtualization) is the other half
With data virtualization a more flexible reporting and analytical environment can be developed (quickly)
Avoid the (physical) data mart explosion! Go virtual!