Building a Big Data Warehouse
GoDataDriven, proudly part of the Xebia Group
Building a Big Data Warehouse
Joris Bontje, Big Data Hacker
About Me
Big Data Hacker
Data Driven Solution Architect
Hadoop Trainer
About GoDataDriven
Data Warehouse Evolution
http://en.wikipedia.org/wiki/Data_warehouse
In computing, a data warehouse is a database used for reporting and data analysis.
Database Architecture (1.0)
[Diagram labels: Products, Customers, Orders, Inventory, Sales, DB]
Analytical Database (2.0)
[Diagram labels: Sales, Inventory, Customers, Products, Orders]
Basic DWH Architecture
[Diagram labels: TX DB, ETL, Analytical DB, BI]
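The slide only names the components (TX DB, ETL, Analytical DB, BI), so the following is a minimal sketch of how such a flow might look, not the speaker's implementation. It assumes two in-memory SQLite databases standing in for the transactional and analytical stores; the table names `orders` and `sales_by_product` and the sample rows are hypothetical.

```python
import sqlite3


def extract(tx_db):
    """Read raw order lines from the transactional database."""
    return tx_db.execute(
        "SELECT product, quantity, unit_price FROM orders"
    ).fetchall()


def transform(rows):
    """Aggregate order lines into revenue per product."""
    totals = {}
    for product, quantity, unit_price in rows:
        totals[product] = totals.get(product, 0.0) + quantity * unit_price
    return sorted(totals.items())


def load(analytical_db, totals):
    """Write the aggregated rows into the analytical database."""
    analytical_db.execute(
        "CREATE TABLE IF NOT EXISTS sales_by_product "
        "(product TEXT PRIMARY KEY, revenue REAL)"
    )
    analytical_db.executemany(
        "INSERT OR REPLACE INTO sales_by_product VALUES (?, ?)", totals
    )
    analytical_db.commit()


# Hypothetical transactional database (TX DB) with a few order lines.
tx_db = sqlite3.connect(":memory:")
tx_db.execute(
    "CREATE TABLE orders (product TEXT, quantity INTEGER, unit_price REAL)"
)
tx_db.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("widget", 2, 5.0), ("gadget", 1, 20.0), ("widget", 3, 5.0)],
)

# Hypothetical analytical database, filled by the ETL step.
analytical_db = sqlite3.connect(":memory:")
load(analytical_db, transform(extract(tx_db)))

# A BI tool would query the analytical database, never the TX DB.
report = analytical_db.execute(
    "SELECT product, revenue FROM sales_by_product ORDER BY product"
).fetchall()
print(report)
```

The point of the separation is that reporting queries run against the pre-aggregated analytical copy, so they do not compete with transactional workloads.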
Data Marts
[Diagram labels: TX DB, DW, Sales, Mktg, Prch, BI]
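As a rough illustration of the data mart idea, the sketch below derives three department-specific views from one warehouse table. Everything here is an assumption for illustration: the `orders` table, its columns, the view names, and the reading of "Mktg" and "Prch" as marketing and purchasing. SQLite views stand in for whatever mart technology is actually used.

```python
import sqlite3

# Hypothetical data warehouse (DW) with a single wide fact table.
dw = sqlite3.connect(":memory:")
dw.execute(
    "CREATE TABLE orders "
    "(product TEXT, campaign TEXT, supplier TEXT, quantity INTEGER, revenue REAL)"
)
dw.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        ("widget", "spring", "acme", 2, 10.0),
        ("gadget", "spring", "globex", 1, 20.0),
        ("widget", "summer", "acme", 3, 15.0),
    ],
)

# Each mart exposes only the slice one department cares about.
MARTS = {
    "sales_mart": "SELECT product, SUM(revenue) AS revenue "
                  "FROM orders GROUP BY product",
    "mktg_mart": "SELECT campaign, COUNT(*) AS order_count "
                 "FROM orders GROUP BY campaign",
    "prch_mart": "SELECT supplier, SUM(quantity) AS units "
                 "FROM orders GROUP BY supplier",
}
for name, query in MARTS.items():
    dw.execute(f"CREATE VIEW {name} AS {query}")

sales = dw.execute(
    "SELECT product, revenue FROM sales_mart ORDER BY product"
).fetchall()
mktg = dw.execute(
    "SELECT campaign, order_count FROM mktg_mart ORDER BY campaign"
).fetchall()
prch = dw.execute(
    "SELECT supplier, units FROM prch_mart ORDER BY supplier"
).fetchall()
print(sales, mktg, prch)
```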
Multiple Data-Sources
[Diagram labels: other, Files, TX DB, DW, Sales, Mktg, Prch, BI]
Operational Data Store
[Diagram labels: other, Files, TX DB, ODS, DW, Sales, Mktg, Prch, BI]
Hadoop
No Hadoop
[Diagram labels: other, Files, TX DB, ODS, DW, Sales, Mktg, Prch, BI]
ETL Engine
[Diagram labels: other, Files, TX DB, DW, Sales, Mktg, Prch, BI]
Tiered Data Warehouse
[Diagram labels: other, Files, TX DB, Sales, Mktg, Prch, BI]
Analytical Query Engine
[Diagram labels: other, Files, TX DB, BI]
Tools
Tools Applied
Considerations
Big Data is dirty
Automate everything
Monitoring and QA become the same thing
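One way to read "Monitoring and QA become the same thing" is that automated data-quality checks run on every batch and their results are treated as operational metrics. The sketch below is an assumption-laden illustration of that reading, not something shown on the slides: the record fields `customer_id` and `amount`, the function names, and the threshold are all hypothetical.

```python
def check_batch(rows):
    """Compute simple quality metrics for one batch of records."""
    return {
        "total": len(rows),
        # Records with no customer reference count as dirty.
        "missing_customer": sum(1 for r in rows if not r.get("customer_id")),
        # Negative amounts are treated as suspicious in this sketch.
        "negative_amount": sum(
            1
            for r in rows
            if isinstance(r.get("amount"), (int, float)) and r["amount"] < 0
        ),
    }


def alerts(metrics, max_bad_ratio=0.2):
    """Return the names of checks whose failure ratio exceeds the threshold."""
    total = metrics["total"]
    if total == 0:
        return ["empty_batch"]
    return [
        name
        for name, count in metrics.items()
        if name != "total" and count / total > max_bad_ratio
    ]


# Hypothetical batch: one record lacks a customer, one has a negative amount.
batch = [
    {"customer_id": "c1", "amount": 10.0},
    {"customer_id": None, "amount": 5.0},
    {"customer_id": "c3", "amount": -2.0},
    {"customer_id": "c4", "amount": 7.5},
]

metrics = check_batch(batch)
raised = alerts(metrics)
print(metrics, raised)
```

Run on every load, the same numbers serve as a QA report and as a monitoring signal, which is also what makes the checks easy to automate.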
My Past Trends
Big Data Forum 2012
Cloud / On-demand
Hadoop Hardware
Batch → Real-Time
New Trends
XebiCon 2013
Impala
Open source, real-time query engine for Hadoop
De facto standard for Hadoop metadata
Simple Database Architecture
[Diagram labels: Products, Customers, Orders, Inventory, Sales, DB]
The future?
[Diagram labels: Products, Customers, Orders, Inventory, Sales]
We’re hiring / Questions? / Thank you!
Joris Bontje, Big Data Hacker