Splunk/Socialize at Hadoop Summit


description

How to efficiently analyze large amounts of data in real time using Splunk

Transcript of Splunk/Socialize at Hadoop Summit

Page 1: Splunk/Socialize at Hadoop Summit

Copyright © 2013 Splunk Inc.

Big Data at the Speed of Business

Isaac Mosquera, Director of Mobile, ShareThis

Clint Sharp, Principal Big Data Product Manager, Splunk


Page 2: Splunk/Socialize at Hadoop Summit

What We’ll Talk About

• Our quest for visibility

• Analyzing at scale

• Splunk and Big Data

• Where do you start?

• Q&A

Page 3: Splunk/Socialize at Hadoop Summit

About Splunk

Company (NASDAQ: SPLK)

Founded 2004, first software release in 2006

HQ: San Francisco

Business Model / Products

Industry-leading machine data platform

On-premise, in the cloud and SaaS

5,600+ Customers

63 of the Fortune 100

Largest license: 100 Terabytes per day

#1 Big Data Innovator*

* Fast Company's Most Innovative Companies Issue (March 2013)

Page 4: Splunk/Socialize at Hadoop Summit

About ShareThis and Socialize

ShareThis makes the world more connected, trusted and valuable through sharing

Powers the social web, touching the lives of 95 percent of U.S.

Acquires Socialize, which makes mobile and social more engaging

Socialize integrated into thousands of iOS and Android apps

Installed on 80M+ devices

Page 5: Splunk/Socialize at Hadoop Summit

Evaluating 20 Billion Ad Impressions Monthly

Page 6: Splunk/Socialize at Hadoop Summit


Page 7: Splunk/Socialize at Hadoop Summit


Page 8: Splunk/Socialize at Hadoop Summit


Page 9: Splunk/Socialize at Hadoop Summit


Page 10: Splunk/Socialize at Hadoop Summit


Page 11: Splunk/Socialize at Hadoop Summit


Page 12: Splunk/Socialize at Hadoop Summit


Page 13: Splunk/Socialize at Hadoop Summit

Final Architecture

[Architecture diagram: Socialize Bidder; Splunk (Search Head, three Indexers); Cache Cluster (three Memcache nodes); RDBMS (Generated Reports); S3 Snapshots]

Page 14: Splunk/Socialize at Hadoop Summit

So, What is Splunk?


Page 15: Splunk/Socialize at Hadoop Summit

Expanding Universe of Data Sources

Machine-generated Data, Business Application Data, Human-generated Data

Highly Structured / Arbitrarily Structured

2012-12-05 07:04:44 Id=00Q000000Rd910EAJ City=New York Country=US CreatedDate=“2012-12-05 07:06:44” [email protected] Email_Opt_In_c Customer_Street_Address_c=“123 Main St.” purchased_product_id= product_i BD-01 twitter_username john_t_doe

Page 16: Splunk/Socialize at Hadoop Summit

Industry Leading Platform for Machine Data

Any Machine Data Operational Intelligence

HA Indexes and Storage

Commodity Servers

Developer Platform

Custom dashboards

Monitor and alert

Ad hoc search

Report and analyze

Page 17: Splunk/Socialize at Hadoop Summit

Analyzing Heterogeneous Data

Universal Index

• No data normalization

• Automatically handles timestamps

• Parsers not required

• Index every term & pattern “blindly”

Schema-on-the-fly

• No attempt to “understand” up front

• Structure applied at search-time

• No brittle schema to work around

• Automatically find transactions, patterns and trends

Flexibility and Fast Time to Value

• Normalization as it’s needed

• Faster implementation

• Easy search language

• Multiple views into the same data
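The "structure applied at search-time" idea can be illustrated with a small Python sketch. This is not how Splunk is implemented; it only shows the principle that raw events are stored untouched and fields are pulled out when a query runs. The sample events, the key=value regular expression, and the function names `extract_fields` and `search` are all assumptions made for illustration.

```python
import re

# Raw events are kept exactly as received; no schema is imposed at ingest time.
# These sample lines are invented for the example.
RAW_EVENTS = [
    '2012-12-05 07:04:44 Id=00Q000000Rd910EAJ City="New York" Country=US',
    '2012-12-05 07:06:44 status=500 path=/api/notify device=ios',
    '2012-12-05 07:07:10 status=200 path=/api/notify device=android',
]

# Hypothetical pattern: key=value pairs whose value is quoted or a bare token.
KV_PATTERN = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def extract_fields(raw):
    """Derive a field dictionary from one raw event at query time."""
    fields = {}
    for key, quoted, bare in KV_PATTERN.findall(raw):
        fields[key] = quoted if quoted else bare
    return fields


def search(events, **criteria):
    """Return the extracted fields of every event matching all criteria."""
    results = []
    for raw in events:
        fields = extract_fields(raw)
        if all(fields.get(k) == v for k, v in criteria.items()):
            results.append(fields)
    return results


print(search(RAW_EVENTS, status="500"))
```

Because extraction happens inside `search`, a different query can apply a different pattern to the same stored events, which is the "multiple views into the same data" point.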

Page 18: Splunk/Socialize at Hadoop Summit

Gain Critical Insights … in Real-time

Sources: Twitter, Care IVR, Middleware Error, Order Processing

Fields highlighted: Order ID, Customer ID, Twitter ID, Product ID, Time Waiting On Hold, Customer’s Tweet, Company’s Name
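The slide links events from different sources through shared identifiers (Order ID, Customer ID, Twitter ID). A minimal Python sketch of that kind of correlation follows. The event records, field names, and the `correlate` function are hypothetical, and the sketch assumes the Twitter event already carries a customer ID (for instance from a lookup), which the slide does not specify.

```python
# Identifier fields that can tie events from different sources together.
ID_FIELDS = ("order_id", "customer_id", "twitter_id")

# Invented sample events, one or more per source named on the slide.
EVENTS = [
    {"source": "order_processing", "order_id": "O-1", "customer_id": "C-9", "product_id": "P-3"},
    {"source": "middleware_error", "order_id": "O-1", "error": "timeout"},
    {"source": "care_ivr", "customer_id": "C-9", "hold_seconds": 600},
    {"source": "twitter", "twitter_id": "@jd", "customer_id": "C-9", "tweet": "still on hold"},
    {"source": "order_processing", "order_id": "O-2", "customer_id": "C-5", "product_id": "P-1"},
]


def correlate(events, field, value):
    """Collect every event reachable from one identifier via shared IDs."""
    known = {(field, value)}
    linked = []
    remaining = list(events)
    changed = True
    while changed:  # repeat until no new event can be linked
        changed = False
        for event in list(remaining):
            ids = {(f, event[f]) for f in ID_FIELDS if f in event}
            if ids & known:
                known |= ids  # newly seen IDs may link further events
                linked.append(event)
                remaining.remove(event)
                changed = True
    return linked


for event in correlate(EVENTS, "order_id", "O-1"):
    print(event["source"])
```

Starting from one order ID, the loop picks up the middleware error for that order, then follows the customer ID to the IVR call and the tweet, while the unrelated order stays out of the result.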

Page 19: Splunk/Socialize at Hadoop Summit

Deep Visibility and Insight for IT and Business

IT Operations Management

Application Management

Security and Compliance

Web Intelligence

Business Analytics

Industrial Data / Internet of Things

Over 5,600 organizations using Splunk across IT and business users

Page 20: Splunk/Socialize at Hadoop Summit

Driving Insights from Big Data

Page 21: Splunk/Socialize at Hadoop Summit

The ShareThis Insights Platform

On Father’s Day: “What were the most shared-about topics?” “What types of beer do people drink?”

[Diagram: Hadoop, API, ETL, Pre-aggregation, Analytics, ?]

Page 22: Splunk/Socialize at Hadoop Summit

Finding the Optimal Approach

Hadoop and MapReduce are great for complex data science on data at rest; the previous architecture took 9 months with a team of engineers, data architects, etc.

The Splunk platform delivers real-time, interactive analysis; we can build many of the same insights within 1 hour

What should be the core focus or competency of your team?

Conclusion: find the optimal approach for the business

Page 23: Splunk/Socialize at Hadoop Summit

What About Ad Hoc Analysis?

Page 24: Splunk/Socialize at Hadoop Summit

PR Insights Example

What was the situation? (e.g. fast moving business, needed real-time insights)

What was the PR team struggling with? It was difficult to find useful data to build interesting use cases

What did they want? A flexible real-time reporting environment to extract insights useful for the market

How did my team help? Delivered a single dashboard with real-time data on sharing behaviors across our network

Page 25: Splunk/Socialize at Hadoop Summit

PR Insights Dashboard

Page 26: Splunk/Socialize at Hadoop Summit

Let’s not forget the low-hanging fruit

Page 27: Splunk/Socialize at Hadoop Summit

Operational Analytics for an Online World

[Diagram: website, API, Notification, Feedback Processor, Google (GCM), Apple (APNS), Notifications Systems]

Driving Superior Customer Experience

How many 500 errors have I had over time?

Look for anomalies and spikes!

Zone in directly to the customer!!

Online Device Notifications
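The question "How many 500 errors have I had over time?" and the advice to look for spikes can be sketched in a few lines of Python. This is an illustration only, not the search the speakers ran: the log line format, the sample data, the function names, and the rule "a spike is any minute with more than three times the median error count" are all assumptions.

```python
from collections import Counter
from statistics import median

# Invented log lines in an assumed "timestamp status path" format.
SAMPLE_LOG = (
    ["2013-06-26T10:00:05 500 /api/notify"]
    + ["2013-06-26T10:00:30 200 /api/notify"]
    + ["2013-06-26T10:01:10 500 /api/notify"]
    + ["2013-06-26T10:02:%02d 500 /api/notify" % s for s in range(6)]
    + ["2013-06-26T10:03:20 500 /api/notify"]
)


def errors_per_minute(log_lines, status="500"):
    """Count lines with the given status, bucketed by minute."""
    counts = Counter()
    for line in log_lines:
        timestamp, line_status = line.split()[:2]
        if line_status == status:
            counts[timestamp[:16]] += 1  # keep YYYY-MM-DDTHH:MM
    return counts


def find_spikes(counts, factor=3.0):
    """Return minutes whose count exceeds factor times the median count.

    Minutes with no errors never appear in counts, so the median is taken
    over minutes that had at least one error.
    """
    if not counts:
        return []
    baseline = median(counts.values())
    return sorted(m for m, c in counts.items() if c > factor * baseline)


counts = errors_per_minute(SAMPLE_LOG)
print(dict(counts))
print(find_spikes(counts))
```

A median baseline is used here because a single large spike shifts it far less than it would shift a mean; any other anomaly rule could be substituted.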

Page 28: Splunk/Socialize at Hadoop Summit

One More Thing …


Page 29: Splunk/Socialize at Hadoop Summit


Announcing Hunk Beta: Splunk Analytics for Hadoop

New product from Splunk delivers interactive data exploration, analysis and visualizations for Hadoop

Page 30: Splunk/Socialize at Hadoop Summit

Derive Actionable Insights from Raw Data

1. Point Splunk at Hadoop Cluster

2. Immediately start exploring, analyzing and visualizing raw data in Hadoop

[Diagram: Hadoop Storage; Explore, Analyze, Visualize, Dashboards, Share]

Page 31: Splunk/Socialize at Hadoop Summit

Learn More


splunk.com/bigdata

Page 32: Splunk/Socialize at Hadoop Summit


Questions?