The Data Warehouse Automation Platform Architecture
WWW.DMERLIN.COM
Logical Datamarts
EDW
Data Lake or Hadoop
ETL or ELT
OLTP databases
Enterprise applications
Web apps
Third-party
Other
Data Sources
Data Transformation
Data Use Cases
Staging Area: Transient Tables, organized but no data model
Enterprise Data Sources: 3NF modelled, >1000 Tables
Core DWH: Data Vault Model, Super-Sub Type, Anchor (or 3NF, Inmon); historical data, finest granularity
Information Marts: (Virtual) Dimensional Model (Kimball); historical data, finest granularity
Data Architecture
System Architecture
Finance
HR
Sales
Sandbox / Data Store: Large result tables organized by use case, data project
UC1 new: Churn Indicator
UC2 new: Sales Forecast
UC3 new: Prediction for KPI
Derive/Aggregate
Transform
Transformation phase: DataMerlin primary focus
Extract phase
Data Lake: Permanent Tables, additional Metadata, organized but no data model
* Could be implemented in DataMerlin (optionally)
Extract / Load
Development / Test Environment and Production Environment (each with: Meta data repository on Snowflake, Execution engine, Data Warehouse)
Data Analyst creates Mappings* that are stored in the Meta data repository on Snowflake.
* DataMerlin automatically sends different recipes based on proven data warehouse methodology for the Snowflake database.
DataMerlin automatically generates deployment scripts and ETL recipes based on templates.
Automatic or manual deployment is handled by the DBA.
The execution engine creates all required ETL objects based on DataMerlin recipes and takes care of the data load schedule.
Execution engine examples: Python, database procedures, Snowflake tasks, IBM DataStage, Matillion, Airflow
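To make the idea of template-based generation more concrete, here is a minimal Python sketch of how a stored mapping could be turned into a load script. It is an illustration only, not DataMerlin's actual implementation: the `Mapping` record, the `LOAD_TEMPLATE` text, the function `render_load_script`, and the table and column names are all hypothetical, and the real metadata schema and recipe format are not shown in the slides.

```python
from dataclasses import dataclass
from string import Template


# Hypothetical mapping record; the real DataMerlin metadata schema is not
# described in the source.
@dataclass(frozen=True)
class Mapping:
    source_table: str
    target_table: str
    columns: dict[str, str]  # target column -> source expression


# Hypothetical "recipe" template for a simple full load.
LOAD_TEMPLATE = Template(
    "INSERT INTO $target ($target_cols)\n"
    "SELECT $source_exprs\n"
    "FROM $source;"
)


def render_load_script(mapping: Mapping) -> str:
    """Fill the template with one mapping to produce a SQL load statement."""
    return LOAD_TEMPLATE.substitute(
        target=mapping.target_table,
        target_cols=", ".join(mapping.columns),
        source_exprs=", ".join(mapping.columns.values()),
        source=mapping.source_table,
    )


# Example mapping with made-up table and column names.
example = Mapping(
    source_table="staging.customer",
    target_table="core.hub_customer",
    columns={"customer_id": "id", "customer_name": "UPPER(name)"},
)
script = render_load_script(example)
print(script)
```

In a setup like the one on the slide, the rendered text would be handed to whichever execution engine is in use (for example a Python job or a Snowflake task). Keeping the template separate from the mapping data is what lets one recipe serve many mappings.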
DataMerlin offers support through the whole DWH implementation lifecycle, from data mapping to ETL development and deployment.
• Assessment workshop (remote)
• Installation (can be on-site or remote)
• On-site visit for larger clients (usually 5 days)
• Installation (if not done remotely)
• Training
• Mentoring: implementation of first ETL mappings
• Mentoring: deployment of first ETL jobs to production
• Customisation possibilities