The Data Warehouse Automation Platform Architecture

WWW.DMERLIN.COM


Data Sources
• OLTP databases
• Enterprise applications
• Web apps
• Third-party
• Other

Data Transformation
• ETL or ELT
• Data Lake or Hadoop
• EDW
• Logical Datamarts

Data Use Cases


Data Architecture / System Architecture

• Enterprise Data Sources: 3NF modelled, >1000 tables
• Staging Area: transient tables, organized but no data model
• Data Lake: permanent tables, additional metadata, organized but no data model
• Core DWH: Data Vault model, super-/sub-type, Anchor (or 3NF, Inmon); historical data, finest granularity
• Information Marts: (virtual) dimensional model (Kimball); historical data, finest granularity
• Sandbox / Data Store: large result tables organized by use case, data project

Subject areas: Finance, HR, Sales

New use cases
• UC1: Churn Indicator
• UC2: Sales Forecast
• UC3: prediction for KPI

Process steps: Extract / Load, Transform, Derive/Aggregate
• Extract phase
• Transformation phase: DataMerlin primary focus

* Could be implemented in DataMerlin (optionally)
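As a rough illustration, the layers named on this slide can be written down as a small Python structure. The class name `Layer`, its fields, and the order of the list are assumptions made for this sketch; the slide itself only names the layers and describes how each one is modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """One layer from the slide (hypothetical structure for illustration)."""
    name: str
    persistence: str  # "transient", "permanent", or "" where the slide does not say
    modelling: str    # modelling note as given on the slide


# The order of this list is an assumption; the slide text does not state it.
LAYERS = [
    Layer("Enterprise Data Sources", "", "3NF modelled, >1000 tables"),
    Layer("Staging Area", "transient", "organized but no data model"),
    Layer("Data Lake", "permanent", "organized but no data model, additional metadata"),
    Layer("Core DWH", "", "Data Vault, super-/sub-type, Anchor (or 3NF, Inmon)"),
    Layer("Information Marts", "", "(virtual) dimensional model (Kimball)"),
    Layer("Sandbox / Data Store", "", "large result tables by use case, data project"),
]


def layers_without_data_model(layers):
    """Return the names of layers the slide describes as having no data model."""
    return [layer.name for layer in layers if "no data model" in layer.modelling]


print(layers_without_data_model(LAYERS))
```

Listing the layers this way makes the contrast on the slide explicit: only the Staging Area and the Data Lake are described as unmodelled, while every later layer carries a named modelling approach.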


Development / Test Environment and Production Environment

Each environment has a metadata repository on Snowflake, an execution engine, and a data warehouse.

• Data Analyst creates mappings* that are stored in the metadata repository on Snowflake.

• DataMerlin automatically generates deployment scripts and ETL recipes based on templates.

• Automatic or manual deployment is handled by the DBA.

• Execution engine creates all required ETL objects based on DataMerlin recipes and takes care of the data load schedule. Examples: Python, database procedures, Snowflake tasks, IBM DataStage, Matillion, Airflow.

* DataMerlin automatically sends different recipes based on proven data warehouse methodology for the Snowflake database.
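To make the idea of generating ETL recipes from templates more concrete, here is a minimal Python sketch. It is not DataMerlin's implementation: the mapping layout, the template text, the function name `render_load`, and the table and column names are all hypothetical, chosen only to show how a stored mapping could be turned into a load statement.

```python
from string import Template

# Hypothetical recipe template: a plain INSERT ... SELECT load step.
LOAD_TEMPLATE = Template(
    "INSERT INTO $target ($target_cols)\n"
    "SELECT $source_cols\n"
    "FROM $source;"
)


def render_load(mapping: dict) -> str:
    """Fill the template from a mapping of source columns to target columns."""
    columns = mapping["columns"]  # list of (source_column, target_column) pairs
    return LOAD_TEMPLATE.substitute(
        target=mapping["target"],
        source=mapping["source"],
        target_cols=", ".join(target for _, target in columns),
        source_cols=", ".join(source for source, _ in columns),
    )


# Hypothetical mapping, as an analyst might record it in a metadata repository.
mapping = {
    "source": "staging.customer",
    "target": "core.hub_customer",
    "columns": [("cust_id", "customer_key"), ("load_ts", "load_date")],
}

print(render_load(mapping))
```

Keeping the mapping as data and the SQL shape in a template means that one template can serve many mappings, which is the general appeal of a template-driven approach.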


DataMerlin offers support through the whole DWH implementation lifecycle, from data mapping to ETL development and deployment.


• Assessment workshop (remote)

• Installation (can be on-site or remote)

• On-site visit for larger clients (usually 5 days)

• Installation (if not done remotely)

• Training

• Mentoring: implementation of first ETL mappings

• Mentoring: deployment of first ETL jobs to production

• Customisation possibilities


The Data Warehouse Automation Platform

WWW.DMERLIN.COM