Splunk/Socialize at Hadoop Summit

Post on 07-Jul-2015

description

How to efficiently analyze large amounts of data in real time using Splunk

Transcript of Splunk/Socialize at Hadoop Summit

Copyright © 2013 Splunk Inc.

Big Data at the Speed of Business

Isaac Mosquera, Director of Mobile, ShareThis

Clint Sharp, Principal Big Data Product Manager, Splunk

What We’ll Talk About

• Our quest for visibility

• Analyzing at scale

• Splunk and Big Data

• Where do you start?

• Q&A

About Splunk

Company (NASDAQ: SPLK)

Founded 2004, first software release in 2006

HQ: San Francisco

Business Model / Products

Industry-leading machine data platform

On-premise, in the cloud and SaaS

5,600+ Customers

63 of the Fortune 100

Largest license: 100 Terabytes per day

#1 Big Data Innovator*

* Fast Company's Most Innovative Companies Issue (March 2013)

About ShareThis and Socialize

ShareThis makes the world more connected, trusted and valuable through sharing

Powers the social web, touching the lives of 95 percent of U.S.

Acquires Socialize, which makes mobile and social more engaging

Socialize integrated into thousands of iOS and Android apps

Installed on 80M+ devices

Evaluating 20 Billion Ad Impressions Monthly

Final Architecture

Diagram components: Socialize Bidder; Splunk (Search Head, three Indexers); Cache Cluster (three Memcache nodes); RDBMS (Generated Reports); S3; Snapshots

So, What is Splunk?

Expanding Universe of Data Sources

Machine-generated Data; Business Application Data; Human-generated Data

Highly Structured; Arbitrarily Structured

2012-12-05 07:04:44 Id=00Q000000Rd910EAJ City=New York Country=US CreatedDate=“2012-12-05 07:06:44” Email.jdoe@gmail.com Email_Opt_In_c Customer_Street_Address_c=“123 Main St.” purchased_product_id= product_i BD-01 twitter_username john_t_doe

Industry Leading Platform for Machine Data

Any Machine Data; Operational Intelligence

HA Indexes and Storage

Commodity Servers

Developer Platform

Custom dashboards

Monitor and alert

Ad hoc search

Report and analyze

Analyzing Heterogeneous Data

Universal Index; Schema-on-the-fly; Flexibility and Fast Time to Value

• No data normalization

• Automatically handles timestamps

• Parsers not required

• Index every term & pattern “blindly”

• No attempt to “understand” up front

• Structure applied at search-time

• No brittle schema to work around

• Automatically find transactions, patterns and trends

• Normalization as it’s needed

• Faster implementation

• Easy search language

• Multiple views into the same data
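The bullets above can be illustrated with a minimal Python sketch of the schema-on-the-fly idea: raw events are stored untouched and fields are extracted only when a search runs. This is not Splunk's implementation; the sample events and the names `RAW_EVENTS`, `extract_fields` and `search` are hypothetical and exist only for illustration.

```python
import re
from collections import Counter

# Hypothetical raw events, kept exactly as they arrived (nothing parsed at ingest).
RAW_EVENTS = [
    '2012-12-05 07:04:44 status=500 customer_id=42 path=/checkout',
    '2012-12-05 07:04:51 status=200 customer_id=7 path="/product list"',
    '2012-12-05 07:05:02 user=jdoe action=login',
]

# Matches key=value or key="quoted value".
KV_PATTERN = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def extract_fields(raw):
    """Apply structure at search time: pull key=value pairs out of one raw event."""
    fields = {}
    for key, quoted, bare in KV_PATTERN.findall(raw):
        fields[key] = quoted if quoted else bare
    return fields


def search(events, **criteria):
    """Return the extracted fields of every event that matches all criteria."""
    results = []
    for raw in events:
        fields = extract_fields(raw)
        if all(fields.get(key) == value for key, value in criteria.items()):
            results.append(fields)
    return results


if __name__ == "__main__":
    # One view of the data: events with a 500 status.
    print(search(RAW_EVENTS, status="500"))
    # Another view of the same raw data: which field names occur, and how often.
    print(Counter(key for raw in RAW_EVENTS for key in extract_fields(raw)))
```

Because the events carry no fixed schema, the third event (with entirely different fields) needs no special handling; it simply does not match a search on `status`.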

Gain Critical Insights … in Real-time

Sources: Order Processing, Middleware Error, Care IVR, Twitter

Fields: Order ID, Customer ID, Product ID, Time Waiting On Hold, Twitter ID, Customer’s Tweet, Company’s Name
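The repeated Order ID, Customer ID and Twitter ID labels suggest that events from different sources are linked through shared identifiers. A rough Python sketch of that kind of correlation follows. It assumes fields have already been extracted; the events, field names and the `correlate` helper are hypothetical, and this is not how Splunk itself performs the correlation.

```python
# Hypothetical, already-extracted events from the four sources named on the slide.
EVENTS = [
    {"source": "order_processing", "order_id": "O-1001",
     "customer_id": "C-7", "product_id": "BD-01"},
    {"source": "middleware_error", "order_id": "O-1001", "error": "timeout"},
    {"source": "care_ivr", "customer_id": "C-7",
     "twitter_id": "john_t_doe", "hold_seconds": 600},
    {"source": "twitter", "twitter_id": "john_t_doe", "tweet": "Still on hold"},
    {"source": "order_processing", "order_id": "O-1002",
     "customer_id": "C-9", "product_id": "BD-02"},
]

# Fields that can link one event to another.
LINK_FIELDS = ("order_id", "customer_id", "twitter_id")


def correlate(events, seed_field, seed_value):
    """Collect every event reachable from a seed ID by following shared ID fields."""
    known = {(seed_field, seed_value)}
    matched = []
    remaining = list(events)
    changed = True
    while changed:  # repeat until a full pass links no new event
        changed = False
        for event in list(remaining):
            ids = {(f, event[f]) for f in LINK_FIELDS if f in event}
            if ids & known:
                known |= ids  # newly seen IDs may link further events
                matched.append(event)
                remaining.remove(event)
                changed = True
    return matched


if __name__ == "__main__":
    for event in correlate(EVENTS, "order_id", "O-1001"):
        print(event["source"], event)
```

Starting from a single order ID, the sketch walks from the order to the middleware error, then via the customer ID to the IVR call, and via the Twitter ID to the tweet, while the unrelated second order is left out.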

Deep Visibility and Insight for IT and Business

IT Operations Management

Web Intelligence

Business Analytics

Application Management

Security and Compliance

Industrial Data / Internet of Things

Over 5,600 organizations using Splunk across IT and business users

Driving Insights from Big Data

Hadoop

The ShareThis Insights Platform

On Father’s Day: “What were the most-shared topics?” “What types of beer do people drink?”

API ETL Pre-aggregation

Analytics

?

Finding the Optimal Approach

Hadoop and MapReduce are great for complex data science on data at rest – the previous architecture took 9 months to build with a team of engineers, data architects, etc.

The Splunk platform delivers real-time, interactive analysis – we can build many of the same insights within 1 hour

What should be the core focus or competency of your team?

Conclusion: find the optimal approach for the business

What About Ad Hoc Analysis?

PR Insights Example

What was the situation? (e.g. fast-moving business, needed real-time insights)

What was the PR team struggling with? It was difficult to find useful data to build interesting use cases

What did they want? A flexible real-time reporting environment to extract insights useful for the market

How did my team help? Delivered a single dashboard with real-time data on sharing behaviors across our network

PR Insights Dashboard

Let’s not forget the low-hanging fruit

Operational Analytics for an Online World

Notifications Systems diagram: website, API Notification, Feedback Processor, Google (GCM), Apple (APNS)

Driving Superior Customer Experience

How many 500 errors have I had over time?

Look for anomalies and spikes!

Zero in directly on the customer!
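The three steps above (count 500 errors over time, look for spikes, drill into the affected customer) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample data, the per-minute bucketing, the mean-plus-1.5-standard-deviations cutoff, and the helper names are all hypothetical and are not taken from the talk.

```python
from collections import Counter
from datetime import datetime
from statistics import mean, pstdev


def make_events():
    """Build hypothetical sample data: (ISO timestamp, status code, customer)."""
    errors_per_minute = {0: 1, 1: 2, 2: 1, 3: 8, 4: 1}
    events = []
    for minute, n in errors_per_minute.items():
        for i in range(n):
            customer = "acme" if minute == 3 else f"cust{i}"
            events.append((f"2013-06-26T07:{minute:02d}:{i:02d}", 500, customer))
        events.append((f"2013-06-26T07:{minute:02d}:30", 200, "cust0"))
    return events


def to_minute(ts):
    """Truncate an ISO timestamp to its minute bucket."""
    return datetime.fromisoformat(ts).replace(second=0)


def errors_by_minute(events, status=500):
    """How many errors with the given status occurred in each minute?"""
    counts = Counter(to_minute(ts) for ts, code, _ in events if code == status)
    return dict(sorted(counts.items()))


def find_spikes(counts, threshold=1.5):
    """Flag minutes whose count exceeds mean + threshold * standard deviation."""
    values = list(counts.values())
    if not values:
        return []
    cutoff = mean(values) + threshold * pstdev(values)
    return [minute for minute, n in counts.items() if n > cutoff]


def customers_in(events, minute, status=500):
    """Drill in: which customers produced the errors in one minute?"""
    return Counter(
        customer for ts, code, customer in events
        if code == status and to_minute(ts) == minute
    )


if __name__ == "__main__":
    events = make_events()
    counts = errors_by_minute(events)
    for spike in find_spikes(counts):
        print(spike, counts[spike], customers_in(events, spike).most_common(1))
```

A fixed statistical cutoff is the simplest possible choice here; a real dashboard would more likely compare against a rolling baseline.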

Online Device Notifications

One More Thing …

New product from Splunk delivers interactive data exploration, analysis and visualizations for Hadoop

Announcing Hunk Beta: Splunk Analytics for Hadoop

Derive Actionable Insights from Raw Data

Hadoop Storage

1. Point Splunk at Hadoop Cluster

2. Immediately start exploring, analyzing and visualizing raw data in Hadoop

Explore Analyze Visualize Dashboards Share

Learn More

splunk.com/bigdata

Questions?