Operation Real-Time: Analyzing Big Data Now

The Briefing Room

Transcript of Operation Real-Time: Analyzing Big Data Now

Twitter Tag: #briefr

Welcome

Host: Eric Kavanagh

Mission

• Reveal the essential characteristics of enterprise software, good and bad
• Provide a forum for detailed analysis of today’s innovative technologies
• Give vendors a chance to explain their product to savvy analysts
• Allow audience members to pose serious questions... and get answers!

MARCH: Operational Intelligence

April: INTELLIGENCE

May: INTEGRATION

June: DATABASE

Operational Intelligence

• Real-time, dynamic business analytics
• Visibility and insight into data, streaming events and business operations
• The ability to make decisions and act quickly
• Automated alerts and/or response

Analyst: Robin Bloor

Robin Bloor is Chief Analyst at The Bloor Group

Acunu

• Acunu offers a Cassandra-based real-time analytics platform
• Its platform allows Cassandra users to build and extend business applications without being database experts
• Acunu Analytics provides customizable and reusable analytic apps on top of its analytics layer

Tim Moreton

Tim is an expert in distributed file systems. He was previously a senior member of the technical team at Tideway (now BMC), where he led the creation of solutions for managing data centres at Fortune 500 clients. Previously, he was CEO of a consultancy delivering data solutions for the aviation sector. He holds a PhD in Computer Science from Cambridge University.

Real Time Analytics for Apache Cassandra

Monday, 4 March 13

New Big Data Sources, New Big Data Applications

Sources
• Machine Generated
• Mobile Phones
• RFID Tags
• Web Logs, e.g. 02:44:02 241.24.41 0.0.1 GET /index.html
• Crowd Generated

Operational Intelligence (data freshness, response time; query speed)
• Dashboards
• Real-time Decisions
• Alerting

Offline Exploratory Analytics (complex analysis, comprehensive datasets; query richness)
• Unstructured Warehouses
• Data Mining
• Machine Learning

Big Data Technology Timeline

Relational databases
✓ Rich, fast queries
✗ Not economically scalable for volume, velocity, variety

BigTable, NoSQL DBs
✓ Horizontally scalable
✓ Millisec query latencies
✗ Spartan K-V queries

MapReduce, Hadoop
✓ Horizontally scalable
✓ Very rich queries
✗ Jobs take mins or hours

✓ Rich queries
✓ “API real time”

Hadoop-EDW Hybrids (Impala+Trevni, Drill, Pivotal-HD)
✓ Very rich queries
✓ “Exploratory real time”

Apache Cassandra

• Multi-master: highly available by design

• Multi-data center optimised

• Very high write performance

• Atomic counters

• Numerous large production deployments

Ideal building block for real-time analytics

Virtual nodes, CQL v2

• Awkward data ingest interfaces

• Spartan query semantics

• Brittle data modeling leads to lack of flexibility

But building analytics on a key-value interface is hard

Continuous OLAP Cubing: Fresh, Instant Answers

[Diagram labels: event stream; Ingest; Processing; event store; roll-up cubes; API; dashboard queries; programmatic interface]
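The idea of continuous cubing can be illustrated with a small sketch: aggregates are updated as each event arrives, so a later query is a lookup rather than a scan. This is a minimal in-memory illustration under stated assumptions, not Acunu's implementation. The `RollupCube` class, the event fields (`ts`, `path`, `status`, `bytes`) and the one-minute bucket size are all hypothetical.

```python
from collections import defaultdict


class RollupCube:
    """Keeps running counts and sums per (time bucket, dimension values)."""

    def __init__(self, dimensions, measure="bytes", bucket_seconds=60):
        self.dimensions = tuple(dimensions)   # hypothetical dimension names
        self.measure = measure                # hypothetical numeric field to sum
        self.bucket_seconds = bucket_seconds
        self.counts = defaultdict(int)
        self.sums = defaultdict(float)

    def _bucket(self, timestamp):
        # Round the timestamp down to the start of its time bucket.
        return int(timestamp) // self.bucket_seconds * self.bucket_seconds

    def ingest(self, event):
        # Update the aggregates at write time, once per incoming event.
        key = (self._bucket(event["ts"]),) + tuple(
            event.get(d) for d in self.dimensions
        )
        self.counts[key] += 1
        self.sums[key] += event.get(self.measure, 0)

    def query(self, bucket, **dims):
        # Reading an aggregate is a dictionary lookup; no events are rescanned.
        key = (bucket,) + tuple(dims.get(d) for d in self.dimensions)
        return {"count": self.counts.get(key, 0), "sum": self.sums.get(key, 0.0)}


cube = RollupCube(dimensions=("path", "status"))
events = [
    {"ts": 120, "path": "/index.html", "status": 200, "bytes": 512},
    {"ts": 130, "path": "/index.html", "status": 200, "bytes": 256},
    {"ts": 185, "path": "/index.html", "status": 200, "bytes": 100},
    {"ts": 125, "path": "/about.html", "status": 404, "bytes": 0},
]
for event in events:
    cube.ingest(event)

result = cube.query(120, path="/index.html", status=200)
print(result)
```

The trade-off in this sketch is that the set of dimensions must be chosen before events arrive; keeping the raw events alongside the cubes is what would allow new roll-ups to be built later.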

Acunu Analytics

Simple, high-velocity event stream ingest
‣ RESTful HTTP
‣ JSON-based
‣ Apache Flume
‣ MQ sources

Pre-process event data
‣ Transform, filter, split streams
‣ Integrate other data sources

Aggregate query templates, continuously evaluated
‣ Rich aggregate analytic operators
‣ Probabilistic functions
‣ Joins, limits, group bys
‣ Link aggregates to raw events

Historic context
‣ Raw events and cubes stored in Cassandra

Threshold Alerts
‣ Comparisons vs historic baselines

Drive Applications
‣ AQL queries, JSON results
‣ Instant results enable reactive feedback loop

Beautiful, modular, live BI dashboards
‣ Rapidly visualize results
‣ Rich, flexible widgets
‣ Drill-down to raw events
‣ Create custom monitoring apps
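A threshold alert that compares a current value against a historic baseline might look roughly like the sketch below. It is an illustration only and does not reflect how Acunu Analytics evaluates alerts. The function name `check_threshold`, the use of a simple mean as the baseline, and the `ratio` and `min_history` parameters are assumptions.

```python
from statistics import mean


def check_threshold(history, current, ratio=2.0, min_history=3):
    """Return an alert dict if `current` exceeds ratio * historic mean, else None."""
    if len(history) < min_history:
        return None  # too little history to form a baseline
    baseline = mean(history)  # assumption: baseline is the plain mean of past buckets
    if baseline > 0 and current > ratio * baseline:
        return {
            "baseline": baseline,
            "current": current,
            "ratio": current / baseline,
        }
    return None


history = [100, 110, 90, 100]  # hypothetical counts from earlier time buckets
alert = check_threshold(history, current=250)
quiet = check_threshold(history, current=120)
print(alert, quiet)
```

In practice a baseline would more likely be taken from comparable periods (the same hour on previous days, for example) than from a flat mean; the comparison step stays the same.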

How Acunu Analytics is Used

DevOps Analytics:

• Unified real-time infrastructure, application and business metrics
• Live customer supply and driver availability powering in-app features
• Push new builds with confidence
• Faster issue resolution times

We feed in our data and just ask questions. We get immediate results. It's resilient and very flexible and fits into our service-based architecture.

• Analytics of Telco telemetry data

• Monitoring of large-scale compute and cloud infrastructures

• Continuous analytics for high-tech manufacturing production lines

• Real-time financial tick data analytics

• Powering social media analytics

• Real-time user engagement and advertising analytics in large web

• Funnel analysis, instant user journey optimization in social gaming

Apache, Apache Cassandra, Cassandra, Hadoop, and the eye and elephant logos are trademarks of the Apache Software Foundation.

Thank You.

Analyst: Robin Bloor

Perceptions & Questions

The Bloor Group

The Dawn of Operational Intelligence

We have noticed an emerging trend towards building business intelligence capabilities that are deployed and used in a real-time manner.

The term that has emerged to describe such capabilities is:

OPERATIONAL INTELLIGENCE

Criteria

• The OI capability is fed by streamed data or current data at low latency
• It involves immediate analysis of the (event) data to derive useable intelligence
• It is actioned immediately: either for automated use or to inform operational staff “just in time”
• It is integrated, to some degree, with other BI capabilities and other applications

OI Implementation

The Discovery Side Of The Coin

The Bottom Line

• OI capabilities are only just emerging
• They should not be “architecturally” divorced from other BI
• The business value of OI is usually dramatic

• How close to actual real time does Acunu get?
• Is Acunu tied entirely to Apache Cassandra, or is it likely to be added to other data platforms?
• There are many analytics approaches and algorithms. What is the breadth of Acunu’s capability?
• How rich is the historic context?

• In your view, is the “age of the data warehouse” over?
• Do you have a cloud offering?
• Which sectors/businesses are currently in Acunu’s “sweet spot”?
• Which companies/products do you regard as competitors/partners?

Upcoming Topics

April: INTELLIGENCE

May: INTEGRATION

June: DATABASE

www.insideanalysis.com

Thank You for Your Attention
