"Interactive Deep Analytics" Dashboard

Post on 14-Jul-2015

Named leader in report.

Founded in 2009  Acquired by AOL in 2014  

Using Big Data stack since 2009

75 people - 30 R&D

1.5T of daily data >100 data sources

Helping marketers to optimize their spend

Across: Channels | Devices Online + Offline

Convertro

✓ Clear

✓ Actionable

✓ Great UX/I

Successful Dashboard

Rendering time  

Storage  

Cost  

UX/I  

Considerations

Processing time  

Insights  

Comparison  

Over time

Explore  

“Sticky” configuration

RT  metrics  

Integrate  

USE CASE #1

Speed   Batch  

Materialization to one table is too costly (belated massive updates)

Leverage Vertica’s sorted data structure

Join data at run time (O(n))

Query  

Spend    Batch  

Revenue  Speed  

Query* Merged  Structure  

Spend    Batch  

Revenue  Speed  

* λ architecture
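The run-time join above can be sketched as a merge join: when both inputs are already sorted on the join key, a single pass over each side is enough, which is where the O(n) comes from. This is a minimal Python sketch, not Vertica's implementation; the function name `merge_join`, the channel keys, and the sample values are assumptions made for illustration.

```python
def merge_join(spend_rows, revenue_rows):
    """Join two lists of (key, value) pairs, each sorted by key with unique keys.

    Returns (key, spend, revenue) tuples; a side with no row for a key yields 0.0.
    Runs in O(n + m) because each input is scanned exactly once.
    """
    merged = []
    i = j = 0
    while i < len(spend_rows) and j < len(revenue_rows):
        s_key, spend = spend_rows[i]
        r_key, revenue = revenue_rows[j]
        if s_key == r_key:
            merged.append((s_key, spend, revenue))
            i += 1
            j += 1
        elif s_key < r_key:
            merged.append((s_key, spend, 0.0))
            i += 1
        else:
            merged.append((r_key, 0.0, revenue))
            j += 1
    # Drain whichever side still has rows.
    merged.extend((k, v, 0.0) for k, v in spend_rows[i:])
    merged.extend((k, 0.0, v) for k, v in revenue_rows[j:])
    return merged


# Hypothetical data: spend from the batch layer, revenue from the speed layer.
spend_batch = [("display", 120.0), ("email", 15.0), ("search", 300.0)]
revenue_speed = [("email", 90.0), ("search", 750.0), ("social", 40.0)]
result = merge_join(spend_batch, revenue_speed)
```

Neither table is rewritten when the other changes, so late-arriving revenue does not trigger a massive update of a materialized table.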

USE CASE #2

Different metrics with 1:N relationships

Avoid joins at query time (if possible)

Pre-join and aggregate by dimensions

Pre-joining does not necessarily explode your data store

Visits,    Conversions,  Impressions  

Conversions   Impressions  ⨝   ⨝  

Σ  

Visits  
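One way to read the pre-joining advice: aggregate each metric by the shared dimensions first, then join the aggregates. Each table contributes at most one row per dimension key, so the 1:N relationships between raw events never multiply rows. The sketch below assumes a single `channel` dimension and made-up numbers; `aggregate` and `prejoin` are hypothetical helper names.

```python
from collections import defaultdict


def aggregate(rows, dims, metric):
    """Sum `metric` over rows grouped by the `dims` columns."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dims)
        totals[key] += row[metric]
    return totals


def prejoin(tables, dims):
    """Aggregate each (rows, metric) table by `dims`, then join on the dimension key.

    A metric with no rows for a key is reported as 0.0.
    """
    joined = defaultdict(dict)
    for rows, metric in tables:
        for key, total in aggregate(rows, dims, metric).items():
            joined[key][metric] = total
    metrics = [metric for _, metric in tables]
    return {key: {m: vals.get(m, 0.0) for m in metrics} for key, vals in joined.items()}


# Hypothetical raw events; one visit can relate to many impressions.
impressions = [
    {"channel": "display", "impressions": 1000},
    {"channel": "display", "impressions": 500},
    {"channel": "search", "impressions": 200},
]
visits = [
    {"channel": "display", "visits": 30},
    {"channel": "search", "visits": 50},
    {"channel": "search", "visits": 10},
]
conversions = [{"channel": "search", "conversions": 4}]

summary = prejoin(
    [(impressions, "impressions"), (visits, "visits"), (conversions, "conversions")],
    dims=["channel"],
)
```

The result has one row per channel regardless of how many raw events each table holds.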

USE CASE #3

Many  Dimensions    

Limit the number of records returned to the screen: visualize the most significant data

Allow dumping the data with a different QoS

Allow choosing up to X dimensions, not all

For each page, allow choosing different relevant dimensions

Build different data structures for different pages
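A minimal sketch of the first and third points: cap the number of dimensions a request may group by, and return only the top rows by the chosen metric. The cap of 3, the field names, and the sample rows are assumptions; the slides only say "up to X".

```python
import heapq

MAX_DIMENSIONS = 3  # hypothetical value for "up to X dimensions"


def top_rows(rows, dims, metric, limit):
    """Group `rows` by `dims`, sum `metric`, and return the `limit` largest groups."""
    if len(dims) > MAX_DIMENSIONS:
        raise ValueError(f"choose at most {MAX_DIMENSIONS} dimensions")
    totals = {}
    for row in rows:
        key = tuple(row[d] for d in dims)
        totals[key] = totals.get(key, 0) + row[metric]
    # nlargest avoids sorting every group when only a screenful is needed.
    return heapq.nlargest(limit, totals.items(), key=lambda kv: kv[1])


# Hypothetical rows.
rows = [
    {"channel": "search", "device": "mobile", "revenue": 500},
    {"channel": "search", "device": "desktop", "revenue": 400},
    {"channel": "email", "device": "desktop", "revenue": 300},
    {"channel": "display", "device": "mobile", "revenue": 100},
    {"channel": "social", "device": "mobile", "revenue": 50},
]
screen = top_rows(rows, ["channel"], "revenue", limit=2)
```

A full export would go through a separate path with no row limit and a lower priority, which is one reading of "different QoS".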

USE CASE #4

Same  data  different  rendering  

Query locality caching  

Backend does data rendering    

Shared configuration across widgets    

MPP has a limited query scheduler

Table   Query   Cache  

Σ   Widget  1  

Widget  2  
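The Table, Query, Cache, Widget flow can be sketched as a result cache keyed by the query text and the shared configuration: two widgets that use the same configuration issue one database query, and the backend renders the cached rows differently for each. `QueryCache`, `fake_run_query`, the SQL text, and the date range are all hypothetical stand-ins.

```python
class QueryCache:
    """Cache query results by (sql, params) so widgets can share one execution."""

    def __init__(self, run_query):
        self._run_query = run_query
        self._results = {}
        self.misses = 0  # number of queries that actually reached the database

    def fetch(self, sql, params=()):
        key = (sql, tuple(params))
        if key not in self._results:
            self.misses += 1
            self._results[key] = self._run_query(sql, params)
        return self._results[key]


def fake_run_query(sql, params):
    # Stand-in for a real database call; returns (channel, revenue) rows.
    return [("search", 900), ("email", 300), ("display", 100)]


def render_table(rows):
    """Backend rendering for a table widget."""
    return [{"channel": channel, "revenue": revenue} for channel, revenue in rows]


def render_pie(rows):
    """Backend rendering for a pie widget: share of total per channel."""
    total = sum(revenue for _, revenue in rows)
    return {channel: round(revenue / total, 3) for channel, revenue in rows}


cache = QueryCache(fake_run_query)
shared_config = ("2015-07-01", "2015-07-14")  # date range shared across widgets
sql = "SELECT channel, SUM(revenue) FROM revenue WHERE day BETWEEN ? AND ? GROUP BY channel"

table_widget = render_table(cache.fetch(sql, shared_config))
pie_widget = render_pie(cache.fetch(sql, shared_config))
```

Because an MPP database can run only a limited number of concurrent queries, collapsing identical widget queries into one leaves scheduler slots free for other users.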

USE CASE #5

Real time data points ticker

Sometimes you don’t have to be 100% accurate or consistent

try using:

Extrapolation

Sampling

Different data stores

Heuristics

logs   Speed  layer   Ticker  Every  X  minutes  

Real time extrapolation
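Extrapolation for a ticker can be as simple as the sketch below: the speed layer publishes a cumulative count every X minutes, and between snapshots the ticker advances at the rate observed over the last interval. This assumes a linear rate, which trades accuracy for a value that always looks live; the function name and the numbers are illustrative.

```python
def extrapolate(prev, last, now):
    """Estimate a cumulative count at time `now` from the two latest snapshots.

    `prev` and `last` are (timestamp_seconds, cumulative_count) pairs.
    """
    (t0, c0), (t1, c1) = prev, last
    if t1 <= t0:
        return c1  # no usable interval; fall back to the last known value
    rate = (c1 - c0) / (t1 - t0)  # events per second over the last interval
    return c1 + rate * max(0.0, now - t1)


# Hypothetical snapshots taken 5 minutes apart.
prev_snapshot = (0, 10_000)
last_snapshot = (300, 10_600)
estimate = extrapolate(prev_snapshot, last_snapshot, now=360)
```

When the next snapshot arrives the ticker is corrected to the real value, so the error never accumulates beyond one interval.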

Hydro – Data Rendering Service

Hydro

EXTRACT  

TRANSFORM

RENDER  

ETL  Web/App  Server  

API  

DB1  

DB2  

Connect to any data source

Multi level caching and invalidation

Applying data transformation and rendering

Logic sharing
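The extract, transform, render flow with two cache levels might look like the sketch below. This is not Hydro's actual API, which the slides do not show; `RenderingService` and every function here are hypothetical. Level 1 caches extracted rows per source and query, level 2 caches rendered output, and invalidating a source clears both.

```python
class RenderingService:
    """Toy extract/transform/render service with a two-level cache."""

    def __init__(self, sources):
        self._sources = sources  # source name -> callable(query) returning rows
        self._raw = {}           # level 1: (source, query) -> extracted rows
        self._rendered = {}      # level 2: (source, query, transform, renderer) -> output
        self.extract_calls = 0

    def _extract(self, source, query):
        key = (source, query)
        if key not in self._raw:
            self.extract_calls += 1
            self._raw[key] = self._sources[source](query)
        return self._raw[key]

    def render(self, source, query, transform, renderer):
        key = (source, query, transform.__name__, renderer.__name__)
        if key not in self._rendered:
            rows = self._extract(source, query)
            self._rendered[key] = renderer(transform(rows))
        return self._rendered[key]

    def invalidate(self, source):
        """Drop everything derived from `source` at both cache levels."""
        self._raw = {k: v for k, v in self._raw.items() if k[0] != source}
        self._rendered = {k: v for k, v in self._rendered.items() if k[0] != source}


def db1(query):
    # Stand-in for a real data source.
    return [("email", 300), ("search", 900)]


def sort_desc(rows):
    return sorted(rows, key=lambda row: row[1], reverse=True)


def as_series(rows):
    return {"categories": [c for c, _ in rows], "data": [v for _, v in rows]}


def as_text(rows):
    return ", ".join(f"{c}: {v}" for c, v in rows)


service = RenderingService({"DB1": db1})
chart = service.render("DB1", "revenue_by_channel", sort_desc, as_series)
label = service.render("DB1", "revenue_by_channel", sort_desc, as_text)
```

Both renderings share one extraction and one transformation function, which is the "logic sharing" point: the sort order is defined once on the server instead of once per widget.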

Understand the requirements

One technology doesn't fit all

One data structure doesn't fit all

Good UX takes into account Data and Technology considerations

yaniv@convertro.com

Data Processing and Mining

Analytics DB - Vertica

Built for analytics: storage / query engine / optimizer

Column  oriented  store  Sorted  

True  MPP  

 Deals  well  with  high  cardinality  and  sparse  data  

*not open source

Real Time metrics

Web Stack

Server: Pandas, Hydro

Client: Backbone, Marionette, RequireJS, Handlebars, Highcharts, D3, Underscore, Twitter Bootstrap, SlickGrid, ...

Architecture  

Visualization