NATC 2013 - Big Data Ecosystem at InMobi by Sharad Agarwal, InMobi
BIG DATA ECOSYSTEM AT INMOBI
Sharad Agarwal, Nasscom ATC 2013
Technology and Product have led to InMobi being recognized by MIT as one of the Top 50 Disruptive Companies for 2013
InMobi Global Reach And Scale
Leveraging Data
Decision Making by Machines
Reports
Data Driven Systems Data Driven Business Decisions
Increasing Value
Decision Making By Humans
Agile Reports & Analytics
Infrastructure Scaling
Data Sciences
Data Driven Decision Making
§ Campaign Delivery
§ Marketplace Health Optimization
Business Metrics
§ Adoption Metrics
§ Product Performance Metrics and Debugging
§ Planning and Strategy – Demand, Supply and others
Exploration of new opportunities
§ New Product / Feature Ideas
Data Sciences Driven Systems
Pricing
§ Conversion Based Pricing
§ Engagement Based Pricing
§ Determining the value of Supply
Prediction
§ Prediction of Click Through Rates and Conversion Rates
§ Forecasting and Planning – Inventory / Burn
§ Risk Mitigation and Management – Overburn / Fraud
Recommendation
§ App Recommendation Engine
§ Dynamic Personalization of Creatives
§ Bid Budget Recommendation
Targeting
§ Audience Segment based Targeting
§ Geo and Hyper local Targeting
§ Contextual Targeting
§ Look Alike Modelling
1. Access to Data
2. Ability to Process
3. Ability to Utilize
Data Flow
Data Systems
Reporting & Analytics
Feedback -> To power products
Ingest
Curate
Normalize
Store
Analyze
Data Ingestion
Data Consumption
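The stage names on the Data Flow slide (Ingest, Curate, Normalize, Store, Analyze) can be read as a chain of transformations. The following is a minimal Python sketch of that reading, not InMobi's implementation: the event fields, the cleaning rules, and the in-memory "store" are all hypothetical and chosen only to make each stage visible.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List

# Hypothetical raw signals; field names are illustrative only.
RAW_EVENTS = [
    {"type": "CLICK ", "country": "in"},
    {"type": "impression", "country": "US"},
    {"type": "", "country": "us"},  # malformed: no event type
    {"type": "click", "country": "IN"},
]


def ingest(source: Iterable[Dict[str, str]]) -> Iterator[Dict[str, str]]:
    """Ingest: pull raw events from a source (here, an in-memory list)."""
    yield from source


def curate(events: Iterable[Dict[str, str]]) -> Iterator[Dict[str, str]]:
    """Curate: drop events that have no usable type."""
    return (e for e in events if e.get("type", "").strip())


def normalize(events: Iterable[Dict[str, str]]) -> Iterator[Dict[str, str]]:
    """Normalize: bring fields to one canonical spelling."""
    for e in events:
        yield {
            "type": e["type"].strip().lower(),
            "country": e["country"].strip().upper(),
        }


def store(events: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Store: materialize the cleaned events (a list stands in for real storage)."""
    return list(events)


def analyze(stored: List[Dict[str, str]]) -> Counter:
    """Analyze: count events per (type, country) pair."""
    return Counter((e["type"], e["country"]) for e in stored)


def run_pipeline(source: Iterable[Dict[str, str]]) -> Counter:
    """Chain the five stages in the order named on the slide."""
    return analyze(store(normalize(curate(ingest(source)))))


if __name__ == "__main__":
    for key, count in sorted(run_pipeline(RAW_EVENTS).items()):
        print(key, count)
```

Because the first three stages are generators, events stream through them one at a time and are only materialized at the store step.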
Design: Data Platform Goal
Commoditize Data Access And Processing
By Providing Rich Abstractions
Signals
Actionable Insights
InMobi Big Data Platforms
DATA INGESTION
CONDUIT + PINTAIL
DATA MGMT
FALCON
ANALYTICS
GRILL
SDK
APPLICATIONS
DATA INFRASTRUCTURE
DASHBOARD
Hosted/On-Premise
Cloud (Public/Private)
Server Infrastructure
STORM
Conduit + PinTail
Collect signals (streaming, batch, multi-site) at scale and in real time
[Figure: streams A and B, split into partitions (A_part1, A_part2, B_part1, B_part3), moving between producers and consumers in DC1, DC2 and DC3. Legend: Control Flow, Data Flow.]
Apache Falcon
InMobi Incubated Its Hadoop Data Management Project in Apache
Apache Falcon
GRILL
Ad hoc Reporting on Logical Cube Abstraction Across Heterogeneous Storages
GRILL: Query on Cube using HQL
InMobi and Big Data – Metrics
1+ PB: Storage, Hadoop cluster
175 K: Hadoop jobs per day
240 TB: Amount of data read / written by systems in a day
8 Bn: HBase read-write throughput per day
10 Bn: Raw events per day
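For a rough sense of scale, the daily figures on this slide can be converted to average per-second rates. The Python sketch below makes three assumptions: the load is spread evenly across the day (real traffic will peak higher), TB means decimal terabytes, and the 10 Bn figure belongs to "Raw events per day", which the slide layout suggests but does not make certain.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(per_day: float) -> float:
    """Convert a daily total into an average per-second rate."""
    return per_day / SECONDS_PER_DAY


# Figures taken from the slide; pairing of 10 Bn with raw events is assumed.
raw_events_per_sec = per_second(10e9)    # 10 Bn raw events per day
hadoop_jobs_per_sec = per_second(175e3)  # 175 K Hadoop jobs per day
io_bytes_per_sec = per_second(240e12)    # 240 TB per day, decimal TB assumed

if __name__ == "__main__":
    print(f"raw events/s : {raw_events_per_sec:,.0f}")
    print(f"Hadoop jobs/s: {hadoop_jobs_per_sec:.2f}")
    print(f"I/O in GB/s  : {io_bytes_per_sec / 1e9:.2f}")
```

Under these assumptions the averages come to roughly 116 thousand raw events per second, about 2 Hadoop jobs started per second, and close to 2.8 GB of data read or written per second.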