Hadoop Summit Socialize & Splunk

Big Data at the Speed of Business
Isaac Mosquera, Director of Mobile, ShareThis
Clint Sharp, Principal Big Data Product Manager, Splunk

description

Splunk and Socialize discuss Big Data and how to process it efficiently.

Transcript of Hadoop Summit Socialize & Splunk

7/15/2019 Hadoop Summit Socialize & Splunk

http://slidepdf.com/reader/full/hadoop-summit-socialize-splunk 1/32

Big Data at the Speed of Business

Isaac Mosquera, Director of Mobile, ShareThis

Clint Sharp, Principal Big Data Product Manager, Splunk

What We’ll Talk About

• Our quest for visibility
• Analyzing at scale
• Splunk and Big Data
• Where do you start?
• Q&A

About Splunk

Company (NASDAQ: SPLK)
• Founded 2004, first software release ...
• HQ: San Francisco

Business Model / Products
• Industry-leading machine data platform
• On-premise, in the cloud and SaaS

5,600+ Customers
• 63 of the Fortune 100
• Largest license: 100 terabytes per ...

#1 Big Data Innovator*
*Fast Company's Most Innovative Companies ...

About ShareThis and Socialize

• ShareThis makes the world more connected, trusted and valuable through sharing
• Powers the social web, touching the lives of 95 percent of U.S. ...
• Acquires Socialize, which makes mobile and social more engaging
• Socialize integrated into thousands of iOS and Android apps
• Installed on 80M+ devices

Evaluating 20 Billion Ad Impressions Monthly

Little Bit About Real-Time Bidding

[Diagram labels: Ad Request, RTB, Bid Response, Winning Bidder's Ad, Ad Impression, Ad Click]

All this needs to happen in less than 100 milliseconds!
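The 100 millisecond limit can be read as a latency budget that every bid is measured against. The following is a minimal Python sketch of that idea, not the Socialize bidder itself: `timed_bid`, `flat_bid`, the request shape and the choice to discard late bids are all assumptions made for illustration.

```python
import time

BUDGET_MS = 100.0  # latency budget quoted on the slide


def timed_bid(compute_bid, request, budget_ms=BUDGET_MS):
    """Run compute_bid(request) and report whether it met the budget.

    Returns (bid, elapsed_ms, within_budget). A late bid is replaced by
    None, on the assumption that a late response is useless to the exchange.
    """
    start = time.perf_counter()
    bid = compute_bid(request)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    within_budget = elapsed_ms < budget_ms
    return (bid if within_budget else None), elapsed_ms, within_budget


def flat_bid(request):
    # Hypothetical strategy: bid a fixed CPM for every request.
    return {"campaign_id": request["campaign_id"], "cpm": 1.25}


bid, elapsed_ms, ok = timed_bid(flat_bid, {"campaign_id": "c1"})
```

Logging `elapsed_ms` for every request is what later makes a question such as "which bids are >100 ms" answerable from the event data.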

So What Are Some of the Problems?

Operational
• Ingesting more than 10,000 queries per second
• Which bids are >100 ms
• Quickly finding any errors within the system

Decision Making (Bid Algorithm)
• Campaign spend
• Campaign efficiency
• Dissect data by:
 – apps
 – users
 – devices

Analyzing Big Data Efficiently

1. Collection
2. Storage
3. Analyzation/Aggregation

Some Options

RDBMS: SQL functions like count() present problems at scale

RDBMS: Write operations too high for a single DB, as well as a single point of failure

NoSQL: Would work well for high inserts and queries, however we would need to build alerting, charting and reporting dashboards

Hadoop: Easy to set up and query using Hive, however we would have to set up new environments and learn new technology

Splunk Fits the Bill

Operational Reporting: Easily identify problems and prevent erroneous spending. When an alert goes off we hit a script which shuts off the bidder.

Ad Hoc Queries: Allows us to find patterns in the data to improve our bid algorithms.

Application Reporting: Instantly know campaign metrics for us and our clients.

Scalability: Adding new RTB service providers means billions of new ad requests. Scaling horizontally is key.
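The slide mentions an alert that triggers a shut-off of the bidder to prevent erroneous spending. A minimal Python sketch of such a guard follows. The `Bidder` class, the budget dictionaries and the rule "spend above budget" are hypothetical; the slides do not say what the real alert condition or shut-off mechanism looks like.

```python
def campaigns_over_budget(spend_by_campaign, budget_by_campaign):
    """Return the campaigns whose spend exceeds their budget, sorted by id.

    Assumption: a campaign with no configured budget has a budget of 0.0,
    so any spend on it counts as erroneous.
    """
    return sorted(
        campaign
        for campaign, spent in spend_by_campaign.items()
        if spent > budget_by_campaign.get(campaign, 0.0)
    )


class Bidder:
    """Hypothetical stand-in for the real bidder process."""

    def __init__(self):
        self.enabled = True

    def shut_off(self):
        self.enabled = False


def on_alert(bidder, spend_by_campaign, budget_by_campaign):
    """Handle an alert: shut the bidder off if any campaign overspent."""
    over = campaigns_over_budget(spend_by_campaign, budget_by_campaign)
    if over:
        bidder.shut_off()
    return over


bidder = Bidder()
over = on_alert(bidder, {"c1": 120.0, "c2": 10.0}, {"c1": 100.0, "c2": 50.0})
```

Shutting off the whole bidder is the coarse option described on the slide; pausing only the campaigns in `over` would be a finer-grained variant.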

Analysis/Aggregation

index=ad_events displayed_ad
| bin _time span=1m
| stats count(meta.displayed_ad) as displays
        sum(price/1000) as dollars_spent
        avg(price) as avg_cpm_price
        by campaign_id _time
| mysqloutput spec=ads-prod table=ads_analytics
    insert="campaign_id, stat_date, displays, dollars_spent, avg_cpm_price"

[Diagram labels: Search Head, Indexer (x3), RDBMS (Generated Reports)]
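For readers unfamiliar with SPL, the per-minute aggregation in the search above can be approximated in plain Python. This is only a sketch of the grouping logic. It assumes events arrive as dictionaries with `_time` in epoch seconds, that they are already filtered to displayed ads, and that `price` is a CPM price (hence the division by 1000). The function name `aggregate_displays` is made up for the example.

```python
from collections import defaultdict


def aggregate_displays(events):
    """Group events by (campaign_id, minute) and compute the three stats."""
    buckets = defaultdict(list)
    for event in events:
        minute = int(event["_time"]) // 60 * 60  # floor to the minute
        buckets[(event["campaign_id"], minute)].append(event["price"])

    rows = []
    for (campaign_id, minute), prices in sorted(buckets.items()):
        rows.append({
            "campaign_id": campaign_id,
            "stat_date": minute,
            "displays": len(prices),
            "dollars_spent": sum(p / 1000 for p in prices),
            "avg_cpm_price": sum(prices) / len(prices),
        })
    return rows


events = [
    {"_time": 0, "campaign_id": "a", "price": 2.0},
    {"_time": 30, "campaign_id": "a", "price": 4.0},
    {"_time": 61, "campaign_id": "a", "price": 1.0},
]
rows = aggregate_displays(events)
```

Each row has the same columns as the `insert=` list in the search, so it could be written to a reporting table in the same shape.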

Using Splunk to Analyze Operations

Interactive analysis with Search Processing Language:

source="nginx-prod.log"
| stats avg(ResponseTime) as avg_rtime,
        p95(ResponseTime) as p95_rtime,
        stdev(ResponseTime) as stdev_rtime

Easily digest information through charts
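The three statistics in the search above are easy to reproduce outside Splunk for a sanity check. The Python sketch below assumes response times are available as a list of numbers. It uses a nearest-rank 95th percentile and the sample standard deviation, which may not match Splunk's own estimators exactly. `response_time_stats` is a name chosen for this example.

```python
import statistics


def response_time_stats(times_ms):
    """Return average, nearest-rank p95 and sample standard deviation."""
    ordered = sorted(times_ms)
    # Nearest-rank percentile: smallest value with at least 95% of the
    # samples at or below it. Integer arithmetic avoids float rounding.
    rank = max(1, (95 * len(ordered) + 99) // 100)
    return {
        "avg_rtime": statistics.fmean(ordered),
        "p95_rtime": ordered[rank - 1],
        "stdev_rtime": statistics.stdev(ordered),  # needs >= 2 samples
    }


stats = response_time_stats(list(range(1, 101)))
```

The p95 value is usually the more useful number for latency work, because a handful of slow requests can hide behind a healthy-looking average.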

Final Architecture

[Architecture diagram labels: Socialize Bidder; Splunk (Search Head, three Indexers); RDBMS (Generated Reports); S3 Snapshots; Cache Cluster (Memcache nodes)]

So, What is Splunk?

Expanding Universe of Data Sources

Machine-generated Data, Business Application Data, Human ...

Highly Structured ... Arbitraril...

Sample event:
2012-12-05 07:04:44 Id=00Q000000Rd910EAJ City=New York Country=US CreatedDate=“2012-12-05 07:06:44” [email protected] Email_Opt_In_c Customer_Street_Address_c=“123 Main St.” purchased_product_id= product_i BD-01 twitter_username john_t_doe

Industry Leading Platform for Machine Data

Any Machine Data, Operational Intelligence

• HA Indexes and Storage
• Custom dashboards
• Monitor and alert
• Ad hoc search
• Report and analyze

Analyzing Heterogeneous Data

Universal Index
• No data normalization
• Automatically handles timestamps
• Parsers not required
• Index every term & pattern “blindly”
• No attempt to “understand” up front

Schema-on-the-fly
• Structure applied at search-time
• No brittle schema to work around
• Automatically find transactions, patterns and trends

Flex... Fast...
• No rma... needed
• Faster ...
• Easy se...
• Multip... same d...
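"Structure applied at search-time" means the raw event is stored unchanged and fields are pulled out only when a query needs them. The Python sketch below illustrates the idea with a simple key=value extractor. It is not how Splunk implements field extraction; the regular expression, the function name `extract_fields` and the sample event are assumptions for illustration.

```python
import re

# key=value, where the value is either "quoted text" or a run of
# non-space characters. Tokens without "=" are simply left in the raw text.
PAIR = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def extract_fields(raw_event):
    """Extract key=value pairs from a raw event at read time."""
    fields = {}
    for key, quoted, bare in PAIR.findall(raw_event):
        fields[key] = quoted if quoted else bare
    return fields


raw = ('2012-12-05 07:04:44 Country=US '
       'Customer_Street_Address_c="123 Main St." '
       'twitter_username=john_t_doe')
fields = extract_fields(raw)
```

Because nothing is parsed at write time, a new field that appears in tomorrow's events needs no schema migration. The trade-off is that extraction cost is paid on every search.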

Gain Critical Insights … in Real-Time

[Diagram labels. Sources: Order Processing, Middleware Error, Care IVR, Twitter. Fields: Order ID, Customer ID, Twitter ID, Time Waiting On Hold, Company’s Name]

Deep Visibility and Insight for IT and ...

• IT Operations Management
• Web Intelligence
• Business Analytics
• Application Management
• Security and Compliance
• Industrial Data / Internet of Things

Over 5,600 organizations using Splunk across IT and business ...

Driving Insights from Big Data

The ShareThis Insights Platform

On Father’s Day:
“What were the most shared topics?”
“What type of beers do people drink?”

[Diagram labels: Hadoop, API, EL, Pre-aggregation, Analytics, ?]

Finding the Optimal Approach

• Hadoop and MapReduce are great for complex data science ... data at rest – the previous architecture took 9 months with ... of engineers, data architects, etc.
• The Splunk platform delivers real-time, interactive analysis ... we can build many of the same insights within 1 hour

What should be the core focus or competency of your team?

Conclusion: find the optimal approach for the business

What About Ad Hoc Analysis?

PR Insights Example

• What was the situation? (e.g. fast moving business, needed real-time insights)
• What was the PR team struggling with? Difficult to find useful data to build interesting use-cases
• What did they want? They wanted a flexible real-time reporting environment to extract insights useful for the market
• How did my team help? Delivered a single dashboard that co... real-time data into the sharing behaviors across our network

PR Insights Dashboard

Let’s not forget the low-hanging fruit

Operational Analytics for an Online ...

Driving Superior Customer Experience

[Diagram labels: website, API, Notification, Feedback Processor, Google (GCM), Apple (APNS), Notifications Systems, Zone, Online Device Notifications]

• How many 500 errors have I had over time?
• Look for anomalies and spikes!
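Counting 500 errors over time and looking for spikes can be sketched in a few lines of Python. The example assumes events are `(epoch_seconds, http_status)` pairs, treats any 5xx status as a "500 error", and flags a minute as a spike when its count is more than two standard deviations above the mean. All of these choices, and the function names, are illustrative assumptions rather than anything stated in the talk.

```python
from collections import Counter
from statistics import mean, pstdev


def errors_per_minute(events):
    """Count HTTP 5xx responses per minute bucket."""
    counts = Counter()
    for timestamp, status in events:
        if 500 <= status <= 599:
            counts[int(timestamp) // 60] += 1
    return counts


def spikes(counts, threshold=2.0):
    """Return minutes whose count is > threshold std devs above the mean.

    Only minutes that had at least one error are considered.
    """
    values = list(counts.values())
    if len(values) < 2:
        return []
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return sorted(m for m, c in counts.items() if (c - mu) / sigma > threshold)


# Ten quiet minutes with one error each, then a burst of 20 in minute 10.
events = [(m * 60, 500) for m in range(10)]
events += [(600 + i, 500) for i in range(20)]
events += [(5, 200)]  # successful request, ignored
counts = errors_per_minute(events)
spike_minutes = spikes(counts)
```

A fixed z-score threshold is the simplest possible anomaly rule. Traffic with strong daily cycles would need a baseline per hour of day instead.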

One More Thing

Announcing Hunk: Splunk Analytics for Hadoop

New product from Splunk delivers interactive exploration, analysis and visualizations for Hadoop

Derive Actionable Insights from Raw ...

[Diagram labels: Hadoop Storage; steps 1 and 2; Point Splunk at Hadoop Cluster; Explore, Analyze, Visualize, Dashboards, Share]

Learn More

splunk.com/bigd

Questions