Case Study: Cloud-based DWH Solution Using Amazon Redshift

Cloud-based DWH Solution Using Amazon Redshift
CeBIT | 12 March 2014
Ionut Hedesiu, Senior Software Engineer


What is Big Data?

Big. What does it stand for?

Does it really matter?

What if?

affordable and intuitive framework

complete ETL flow ready in minutes

no 3rd-party licensing royalties

any amount of data

no single point of failure

Approach

inexpensive, highly performant data warehousing

strictly proven open source technologies

horizontally and vertically scalable

Solution

independent, metadata-driven modules

collection of Python modules

deployed and tested on enterprise/commodity hardware and Amazon cloud solutions

Implementation

• simple virtual Linux boxes

• instance auto-spawn

• SQL code on the fly

• AMQP standard messaging

• detailed logging, Splunk

• fully configurable
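The "SQL code on the fly" point can be illustrated with a minimal sketch that renders an Amazon Redshift COPY statement from a metadata record. The `LoadStep` fields, the helper names, and the example bucket and role are assumptions made for illustration; the slides do not show the framework's actual metadata schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadStep:
    """Hypothetical metadata record describing one load into the warehouse."""
    target_table: str      # unqualified table name
    s3_path: str           # e.g. "s3://example-bucket/sales/"
    iam_role: str          # role ARN that Redshift assumes to read from S3
    file_format: str = "CSV"
    ignore_header: int = 1


def quote_ident(name: str) -> str:
    """Double-quote an identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Single-quote a string literal, escaping embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def render_copy(step: LoadStep) -> str:
    """Render a Redshift COPY statement from a metadata record."""
    if step.file_format not in {"CSV", "JSON"}:
        raise ValueError(f"unsupported format: {step.file_format}")
    parts = [
        f"COPY {quote_ident(step.target_table)}",
        f"FROM {quote_literal(step.s3_path)}",
        f"IAM_ROLE {quote_literal(step.iam_role)}",
    ]
    if step.file_format == "CSV":
        parts.append("FORMAT AS CSV")
        parts.append(f"IGNOREHEADER {int(step.ignore_header)}")
    else:
        parts.append("FORMAT AS JSON 'auto'")
    return "\n".join(parts) + ";"


if __name__ == "__main__":
    step = LoadStep(
        target_table="sales",
        s3_path="s3://example-bucket/sales/",
        iam_role="arn:aws:iam::123456789012:role/example-redshift-role",
    )
    print(render_copy(step))
```

Keeping the statement template in code and the specifics in metadata means a new source can be added by inserting a record instead of writing a new script.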

Features

enterprise messaging

metadata-driven ETL flows

multiple work queues

detailed logging in multiple destinations

secure user access

alerts based on user-defined formulas
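One way "alerts based on user-defined formulas" could work is sketched below: a small evaluator that parses a formula with Python's `ast` module and allows only arithmetic, comparisons, and `and`/`or` over named metrics. The formula syntax and the metric names (`failed_rows`, `total_rows`) are assumptions; the slides do not describe how the framework evaluates formulas.

```python
import ast
import operator

# Whitelisted operators; anything else in a formula is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_COMPARE = {
    ast.Gt: operator.gt,
    ast.GtE: operator.ge,
    ast.Lt: operator.lt,
    ast.LtE: operator.le,
    ast.Eq: operator.eq,
    ast.NotEq: operator.ne,
}


def evaluate(formula, metrics):
    """Evaluate a user-defined formula against a dict of metric values."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return metrics[node.id]  # KeyError if the metric is unknown
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if (isinstance(node, ast.Compare) and len(node.ops) == 1
                and type(node.ops[0]) in _COMPARE):
            return _COMPARE[type(node.ops[0])](
                walk(node.left), walk(node.comparators[0]))
        if isinstance(node, ast.BoolOp):
            values = [walk(v) for v in node.values]
            return all(values) if isinstance(node.op, ast.And) else any(values)
        raise ValueError(f"unsupported expression: {type(node).__name__}")

    return walk(ast.parse(formula, mode="eval"))


if __name__ == "__main__":
    metrics = {"failed_rows": 10, "total_rows": 100}
    if evaluate("failed_rows / total_rows > 0.05", metrics):
        print("ALERT: failure ratio above threshold")
```

Walking the syntax tree with a whitelist, instead of calling `eval`, keeps function calls and attribute access out of user-supplied formulas.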

Benefits

SCALABLE
• vertical and horizontal
• auto scalability and load balancing

CUSTOMISABLE
• platform and database agnostic
• quick module addition or removal

COST-EFFICIENT
• minimal cost and development time
• very low maintenance cost

Benefits

POWERFUL
• real-time data analytics
• massive parallel processing
• intensive data mining and cleansing

ROBUST
• 99.5% availability
• minimal or no maintenance
• lightweight framework

FLEXIBLE
• one central point of control
• metadata driven

Case Study – Global Media Organisation

• 500+ source systems
• 3 database vendors
• local batch processing

• no global data overview
• no data integration

Implementation Overview

• centralised data repository
• real time processing
• metadata driven
• customised to client needs

• Python
• RabbitMQ
• Amazon Redshift
• Tableau
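The message-driven design (independent modules connected by work queues) can be sketched in-process as follows. This is only an illustration under stated assumptions: the standard-library `queue.Queue` stands in for RabbitMQ queues, and the `FLOW` list, the handler functions, and the message fields are hypothetical, not taken from the deployed system.

```python
import queue
import threading

# Hypothetical flow metadata: the ordered steps of one ETL flow.
FLOW = ["extract", "transform", "load"]


def extract(msg):
    # Stand-in for reading raw rows from a source system.
    return {**msg, "rows": [" 1", "2 ", "x"]}


def transform(msg):
    # Keep only rows that parse as integers.
    rows = [int(r.strip()) for r in msg["rows"] if r.strip().isdigit()]
    return {**msg, "rows": rows}


def load(msg):
    # Stand-in for loading rows into the warehouse.
    return {**msg, "loaded": len(msg["rows"])}


HANDLERS = {"extract": extract, "transform": transform, "load": load}
_STOP = object()  # sentinel that shuts the pipeline down in order


def run_flow(messages):
    """Push messages through one work queue per step and collect results."""
    queues = {step: queue.Queue() for step in FLOW}
    done = queue.Queue()

    def worker(index):
        step = FLOW[index]
        nxt = queues[FLOW[index + 1]] if index + 1 < len(FLOW) else done
        while True:
            msg = queues[step].get()
            if msg is _STOP:
                nxt.put(_STOP)  # propagate shutdown to the next stage
                return
            nxt.put(HANDLERS[step](msg))

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(len(FLOW))]
    for t in threads:
        t.start()
    for m in messages:
        queues[FLOW[0]].put(m)
    queues[FLOW[0]].put(_STOP)
    for t in threads:
        t.join()

    results = []
    while True:
        item = done.get()
        if item is _STOP:
            break
        results.append(item)
    return results


if __name__ == "__main__":
    for result in run_flow([{"source": "system_a"}, {"source": "system_b"}]):
        print(result["source"], result["loaded"])
```

In a real deployment each worker would be a separate process consuming from a broker queue, which is what allows stages to be scaled out independently.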

Benefits & Results

• tenfold cost reduction
• intuitive and easy to use
• secure and simple to administer

• real time analytics
• improved decision-making
• minimal to no maintenance
• high scalability