Post on 02-Jul-2015
JovianDATA © 2009 Confidential & Proprietary Information, Slide - 1
2460 North First Street, Suite 170, San Jose, CA 95131 408-433-9383 www.joviandata.com
Analytics at the Speed of Thought
Satya Ramachandran, Vice President of Engineering
Anupam Singh, Chief Technology Officer
April 14, 2010
JovianDATA Mission
Technology platform to optimize your conversion funnel at the lowest cost
Why move to the cloud?
Customer: Media Conglomerate
• Problem: Generating 5TB of data per quarter
• Solution: A conventional data warehousing stack
• Impact: Capital expenditure of more than a million dollars; maintaining a terabyte-scale enterprise stack has recurring expenditure
Customer: Agency
• Problem: Getting 2TB of DoubleClick data per quarter per advertiser
• Solution: Sample 5000 users and use SAS for data mining
• Impact: Loss of analytic richness; build tech expertise to maintain a warehouse
Customer: Portal
• Problem: 200 Terabytes of data; large number of physical nodes in a datacenter
• Solution: Use ‘NoSQL’ (Hadoop etc.) to develop an analytics practice in house
• Impact: Deployment and SLA maintenance is impossible in a single, monolithic cluster; NoSQL does not solve issues of application provisioning
Considering AWS actively but not sure about
• Cap Ex benefits
• Current stack’s cloud readiness
• Application provisioning challenges
Introducing JovianDATA
Extremely low TCO
Billions of Impressions, Clicks & Conversions (100’s of TB)
No sampling
Multi-dimensional analytics
In-Flight
Fast Time-to-Value SaaS
Data sources:
• Ad Server Data, Search Engine Data
• Sales/Conversion Data
• Site/Web Analytics Data
• Customer/3rd Party Data
• Other Data Sources
Transforming Data to Actionable Insights
[Figure: Campaign Heat Map over a Fully Materialized Data Cube. Labels: Publishers, Time. Legend: Engagement (High, Medium, Low)]
• Incremental updates
• Multi-dimensional indexes
• Multi-dimensional partitions
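To make "fully materialized data cube" and "incremental updates" concrete, here is a minimal sketch in plain Python. It is not JovianDATA's implementation: the fact rows, the two dimensions (`publisher`, `day`), and the function names are all hypothetical, chosen only for illustration. The cube stores a precomputed sum for every subset of dimensions, and an increment is folded into each stored group-by instead of rebuilding the cube.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (publisher, day, impressions). Illustrative data only.
FACTS = [
    ("pub_a", "mon", 100),
    ("pub_a", "tue", 150),
    ("pub_b", "mon", 200),
]
DIMENSIONS = ("publisher", "day")


def materialize_cube(facts, dimensions):
    """Precompute the sum of the measure for every subset of dimensions."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for group in combinations(range(len(dimensions)), size):
            totals = defaultdict(int)
            for row in facts:
                key = tuple(row[i] for i in group)
                totals[key] += row[-1]  # the measure is the last column
            cube[tuple(dimensions[i] for i in group)] = dict(totals)
    return cube


def apply_increment(cube, dimensions, new_rows):
    """Fold newly arrived rows into every materialized group-by in place."""
    for group_names, totals in cube.items():
        idx = [dimensions.index(name) for name in group_names]
        for row in new_rows:
            key = tuple(row[i] for i in idx)
            totals[key] = totals.get(key, 0) + row[-1]


cube = materialize_cube(FACTS, DIMENSIONS)
apply_increment(cube, DIMENSIONS, [("pub_b", "tue", 50)])
print(cube[("publisher",)])
```

With every group-by precomputed, a heat-map cell becomes a dictionary lookup rather than a scan. The trade-off is storage, since the number of group-bys grows as 2^d with the number of dimensions d.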
Agenda
JovianDATA Company Overview
JovianInsights – The Power of Analytics
Analytics Lifecycle Management
Innovations in Cloud Infrastructure Management
JovianDATA Cube Storage
Innovations in Advanced Analytics using commodity clusters
Avoiding Expensive Data Processing
Reduce Disk I/O
• By Materializing Expensive Groups
• Usage based Automatic View Materialization
Avoid Network I/O
• Multi-Dimensional Partitioning
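One way to picture usage-based view materialization is a cache that counts how often each group-by is requested and keeps the result once a threshold is reached. The sketch below is an illustration under stated assumptions, not the product's mechanism: the class name `UsageBasedCache`, the threshold `MATERIALIZE_AFTER`, and the sample rows are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical threshold: keep a group-by once it has been requested this often.
MATERIALIZE_AFTER = 2


class UsageBasedCache:
    """Answers sum-by-dimensions queries and materializes the popular ones."""

    def __init__(self, rows, measure):
        self.rows = rows          # list of dicts, one per fact row
        self.measure = measure    # name of the numeric column to sum
        self.usage = Counter()    # how often each group-by was scanned
        self.views = {}           # materialized group-by results
        self.scans = 0            # full scans over the base rows (the costly part)

    def _scan(self, dims):
        self.scans += 1
        totals = defaultdict(int)
        for row in self.rows:
            totals[tuple(row[d] for d in dims)] += row[self.measure]
        return dict(totals)

    def query(self, *dims):
        dims = tuple(sorted(dims))
        if dims in self.views:            # served from the materialized view
            return self.views[dims]
        self.usage[dims] += 1
        result = self._scan(dims)
        if self.usage[dims] >= MATERIALIZE_AFTER:
            self.views[dims] = result     # popular enough: keep it
        return result


rows = [
    {"site": "news", "section": "sports", "clicks": 3},
    {"site": "news", "section": "finance", "clicks": 5},
    {"site": "blog", "section": "sports", "clicks": 2},
]
cache = UsageBasedCache(rows, "clicks")
for _ in range(3):
    by_site = cache.query("site")
print(by_site, "scans:", cache.scans)
```

The third request is served from the stored view, so the base rows are scanned twice rather than three times. A real system would also need to refresh or invalidate views when new data arrives, which this sketch leaves out.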
Why move to the cloud?
Problem: Capital Expenditure
• Current Solution: Cap Ex takes long approval cycles
• Solution: JovianDATA enables the IT department to complement and extend cloud infrastructure on commodity machines
• Impact: Reduce IT cost by having a migration path to a low cost commodity cluster environment
Problem: Over Provisioning
• Current Solution: Resources are provisioned for the peak, leading to massive underutilization
• Solution: JovianDATA enables extra load to be handled by dynamically provisioning virtual instances
• Impact: TCO is tightly fitted to usage rather than to peak
Problem: Application Isolation
• Current Solution: Configurations for applications are guessed, resulting in expensive re-config cycles while deploying the application in production
• Solution: JovianDATA provides a configuration and deployment framework to isolate applications in their own set of instances
• Impact: Prototyped application can be deployed in production without interrupting other applications
Agenda
Reducing Capex
Application Isolation
Dynamic Provisioning
Managing CapEx with Role Based Clusters
SINGLE CLUSTER FOR DATA CLEANSING, LOAD AND QUERY
15TB, 100 NODES
Monthly Cost = $28,800
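The slide does not state a per-node price, but one can be backed out of the monthly figure. The short calculation below assumes the 100 nodes run around the clock for a 30-day month; that assumption and the variable names are ours, not the deck's.

```python
NODES = 100
MONTHLY_COST_USD = 28_800   # figure from the slide
HOURS_PER_MONTH = 24 * 30   # assumption: always on, 30-day month

node_hours = NODES * HOURS_PER_MONTH
implied_rate = MONTHLY_COST_USD / node_hours  # dollars per node-hour

print(f"{node_hours} node-hours, about ${implied_rate:.2f} per node-hour")
```

Under those assumptions the figure corresponds to 72,000 node-hours at roughly $0.40 per node-hour.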
Managing CapEx with Role Based Clusters
[Diagram labels: Ad Server Data, Search Engine Data; DATA CLEANSING; LOAD MODEL; HIBERNATE MODEL; QUERY; UI]
• 2 hours daily for load on 10 nodes
• 8 hours daily for query on 5 nodes
Monthly Cost = $2,052
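The slide's dollar figure is not broken down, so the sketch below compares node-hours only. It contrasts the role-based schedule above with the single 100-node cluster from the preceding slide, assuming a 30-day month; the dictionary layout and names are illustrative.

```python
DAYS_PER_MONTH = 30  # assumption

# Single always-on cluster from the preceding slide: 100 nodes, 24 hours a day.
always_on_node_hours = 100 * 24 * DAYS_PER_MONTH

# Role-based clusters from this slide: role -> (hours per day, nodes).
roles = {"load": (2, 10), "query": (8, 5)}
role_based_node_hours = sum(h * n for h, n in roles.values()) * DAYS_PER_MONTH

print(always_on_node_hours, role_based_node_hours)
print("reduction factor:", always_on_node_hours / role_based_node_hours)
```

Node-hours are only part of the bill. The stated monthly cost may include items this count does not cover, which is why the sketch stops short of a dollar amount.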
Selective Replication for on demand perf
• Power analyst needs to perform a complex, heavy number-crunching query that typically takes 8 - 10 hours
• Solution
  • FlexRestore™
  • Adds two new temporary nodes (Temp1, Temp2)
  • Creates new replicas for hot partitions and redistributes across nodes
[Diagram: replicas of partitions P1, P3, P12, P22 and P34 distributed across Nodeset1 (Node1, Node2, Node3, Node4) and the temporary nodes Temp1 and Temp2]
With Replication Factor = 1: Site Section Analytics = 10 minutes
With Replication Factor = 10: Site Section Analytics = 30 seconds
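The replication step can be sketched as a placement update: add the temporary nodes, then copy each hot partition onto one of them. The code below is an illustration under assumptions, not FlexRestore itself. The starting placement, the choice of hot partitions, and the round-robin policy are hypothetical; only the node and partition names come from the slide.

```python
def add_replicas(placement, hot_partitions, new_nodes):
    """Return a new placement with one extra replica of each hot partition,
    spread round-robin over the newly added nodes."""
    updated = {node: set(parts) for node, parts in placement.items()}
    for node in new_nodes:
        updated.setdefault(node, set())
    for i, partition in enumerate(sorted(hot_partitions)):
        target = new_nodes[i % len(new_nodes)]
        updated[target].add(partition)
    return updated


def replication_factor(placement, partition):
    """Count how many nodes hold a copy of the partition."""
    return sum(partition in parts for parts in placement.values())


# Hypothetical starting placement, one copy of each partition.
placement = {
    "Node1": {"P1"},
    "Node2": {"P3", "P12"},
    "Node3": {"P22"},
    "Node4": {"P34"},
}
scaled = add_replicas(placement, {"P1", "P34"}, ["Temp1", "Temp2"])
print(replication_factor(scaled, "P1"), sorted(scaled["Temp1"]))
```

Bringing the temporary nodes down again amounts to dropping their entries from the placement, which leaves the original replicas untouched.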
Reduce replication to maintain cost
• When the analysis is done and the extra performance is not needed, the SLA Controller brings down the two temporary nodes (and the extra replicas)
• Benefits
  • High performance computing power when you need it
  • But only when you need it, to hold down operating costs
[Diagram: partitions P1, P3, P12, P22 and P34 across Nodeset1 (Node1, Node2, Node3, Node4) and the temporary nodes Temp1 and Temp2]
Provision Tera Scale Applications in Minutes
FUNNEL ANALYSIS FOR CLIENT
Campaign Manager needs to run heavy duty reports for a Big Advertiser
Without Application Isolation: data for all advertisers is kept ‘live’ on 50 nodes
50 live nodes per month = $14,400
Provision Tera Scale Applications in Minutes
FUNNEL ANALYSIS FOR CLIENT
HIBERNATED MODEL
Campaign Manager requests Application Provisioning for a Specific Advertiser
Application is provisioned in parallel from S3/EBS into EC2
50 nodes for fortnightly analysis = $320
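The two 50-node figures can be related with a little arithmetic. The calculation below assumes a 30-day month for the always-live case and attributes the whole $320 to node-hours, ignoring any storage charges; both are our assumptions, as are the variable names. It backs out the implied hourly rate, then the number of hours per month the hibernated application would be running.

```python
NODES = 50
HOURS_PER_MONTH = 24 * 30      # assumption: 30-day month
ALWAYS_ON_COST_USD = 14_400    # slide figure: 50 live nodes per month
HIBERNATED_COST_USD = 320      # slide figure: 50 nodes, fortnightly analysis

implied_rate = ALWAYS_ON_COST_USD / (NODES * HOURS_PER_MONTH)
implied_hours = HIBERNATED_COST_USD / (NODES * implied_rate)
savings_ratio = ALWAYS_ON_COST_USD / HIBERNATED_COST_USD

print(f"rate ${implied_rate:.2f}/node-hour, "
      f"{implied_hours:.0f} cluster-hours per month, "
      f"{savings_ratio:.0f}x cheaper")
```

Under these assumptions the hibernated application is live for about 16 hours a month, which would fit two analysis sessions of roughly 8 hours each.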
Summary
Dynamic Provisioning with Selective Replication on EC2
10x Performance on EC2 replication
Reducing CapEx with Role based Temporary Clusters on EC2
10x Cost Savings with EC2 usage
Application Isolation with Application Hibernation on S3/EBS
100x Cost Savings with EC2-S3
Thank You