Windows of Opportunity: Big Data on Tap
Transcript of Windows of Opportunity: Big Data on Tap
The Briefing Room
Twitter Tag: #briefr
Mission
• Reveal the essential characteristics of enterprise software, good and bad
• Provide a forum for detailed analysis of today’s innovative technologies
• Give vendors a chance to explain their product to savvy analysts
• Allow audience members to pose serious questions... and get answers!
JANUARY: Big Data
February: Analytics
March: Open Source
April: Intelligence
Big Data
THERE IS NO MORE SMALL DATA
Analyst: Robin Bloor
Robin Bloor is Chief Analyst at The Bloor Group
SQLstream
• SQLstream is an enterprise software company focused on making businesses responsive to real-time big data assets.
• Its core s-Server streaming data management platform collects, analyzes and shares high-volume, high-velocity structured and unstructured data from any source, in any format.
• SQLstream recently introduced s-Server 3.0, which includes distributed streaming data processing, machine data collection, and integration with Google BigQuery and Hadoop HBase.
Damian Black
Damian Black is the founder and CEO of SQLstream, a pioneer in Streaming Big Data. Damian has worked for almost two decades in Silicon Valley, with senior roles in a variety of companies including Hewlett-Packard, Neustar, Xacct Technologies and Followap. He has always focused on real-time data platforms for the largest Internet scale applications. He has spoken at many conferences, and was on GigaOM’s first Big Data panel in 2008. Damian graduated from Manchester University and was one of the first research scientists to join HPLabs Europe. He was selected for the International Management Challenge in conjunction with the Financial Times and Ashridge business school while at Hewlett-Packard. Damian is the author of eleven granted patents with five more pending.
Copyright © 2013 SQLstream Inc.
Windows of Opportunity: Big Data on Tap™
January 2013 Damian Black
CEO, SQLstream
| 10 Copyright © 2013 1 Big Data on Tap™ | Damian Black | +1 877 571 5775 | [email protected]
SQLstream Vision
PROVEN ➔ Founded in 2003. ➔ 150+ engineering years. ➔ Over 25 customers and focused on Fortune 1000 companies.
OPEN ➔ 100% standard SQL. ➔ Dynamically extendable using C++, Java and more. ➔ Comprehensive set of adapters.
INNOVATIVE ➔ Leaders in Streaming Big Data Management. ➔ Best real-time technology and most complete platform. ➔ Holds 5 key streaming patents (with 3 pending).
IN 2013 STREAMING DATA MANAGEMENT WILL EMERGE AS THE CORE INTEGRATION AND OPERATIONAL INTELLIGENCE PLATFORM FOR REAL-TIME BIG DATA SOLUTIONS WITHIN THE ENTERPRISE.
What is Streaming Big Data Management?
REAL-TIME DATA
Log and Machine Data ✔ Cloud and Device health ✔
Sensor Networks ✔ Social Interaction & Feeds ✔
CDR and Service Data ✔ Automotive & Telematics ✔
Wireless Networks ✔ Streaming media QoS ✔
GPS and Location Data ✔ Application transactions ✔
DEFINITION Streaming Big Data = Big Data + Real-time Data Capture & Collection + Continuous Integration & ETL + Low Latency Transformation & Analysis
EFFECT Businesses become “real-time responsive” to Big Data. Unlocks the power and value of real-time Big Data.
WHERE ARE THE REAL-TIME DATA SOURCES?
Streaming Big Data in Action
Telematics: Device health monitoring with intelligent integration
Cloud: Real-time prediction of resource over-utilization
Intelligent Transportation: Real-time traffic flow analytics from vehicle GPS data feeds
Social Media: Real-time semantic streaming for QoE monitoring
Internet: Real-time content, activity and security event monitoring
Telecom: Real-time QoS and capacity monitoring from CDR data
HPC: Big Data log monitoring on a massive scale
Banking: Real-time fraud and security event prediction
High volume, high velocity, structured and unstructured data from software platforms, applications and systems.
Sources: Log Files, Sensors, Services, Markets, Internet, Location, Networks, Devices
Platform Requirements for Real-Time Big Data
Both On-Cloud and On-Premise Deployment
Continuous data analysis and integration using distributed streaming platform
Parallel Dataflow & Distributed Stream Processing Architecture
100% Standards-compliant with true SQL:2008
Streaming Big Data – Pain Points
DATA EXPLOSION
COMPLEXITY
BUSINESS AGILITY
• Too difficult to build & maintain real-time apps: SQLstream eliminates your development risk.
• Too costly to analyse voluminous real-time data: SQLstream slashes TCO for real-time analysis.
• Too slow to respond to new requirements: SQLstream allows you to add new apps easily.
The Real-time Data Management Headache…
TIME, MONEY, COMPLEXITY
Business Intelligence: Hadoop HBase & Data Warehouses
Supply Chain & ERP
Operations & Management
Finance & Accounting
CRM & Billing
The Real-time Data Management Headache…
STREAMING ANALYTICS AND AGGREGATION
STREAMING EVENT CORRELATION
STREAMING ALERTS & ALARMS
CONTINUOUS ETL
Moving from High Latency to Real-time Responsiveness
COLLECT
CLEANSE
ENRICH
ANALYZE
SHARE
LOW LATENCY
➔ Traditional ETL approach leads to high latency
➔ SQLstream Streaming Approach:
» Continuous Parallel Dataflow Execution
» Generate real-time answers immediately
» Deliver and share the results immediately
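The five stages above (collect, cleanse, enrich, analyze, share) can be sketched as a chain of Python generators, in which each record passes through every stage as soon as it arrives rather than waiting for a batch to complete. This is only an illustration of continuous dataflow in general, not SQLstream code: the stage functions, the `host`/`status` record fields and the `REGION_BY_HOST` lookup are all hypothetical.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical lookup table used by the ENRICH stage (assumption for illustration).
REGION_BY_HOST = {"a.example.com": "EU", "b.example.com": "US"}


def collect(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """COLLECT: turn raw 'host status' lines into records, one at a time."""
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            yield {"host": parts[0], "status": parts[1]}


def cleanse(records):
    """CLEANSE: drop records whose status is not a number."""
    for rec in records:
        if rec["status"].isdigit():
            yield rec


def enrich(records):
    """ENRICH: add a region from the lookup table."""
    for rec in records:
        yield {**rec, "region": REGION_BY_HOST.get(rec["host"], "unknown")}


def analyze(records):
    """ANALYZE: flag server errors (status 500 and above)."""
    for rec in records:
        yield {**rec, "is_error": int(rec["status"]) >= 500}


def share(records, sink):
    """SHARE: deliver each result to a sink as soon as it is produced."""
    for rec in records:
        sink.append(rec)


def run_pipeline(lines, sink):
    """Wire the stages together; nothing is buffered between them."""
    share(analyze(enrich(cleanse(collect(lines)))), sink)
```

Because generators are lazy, the first result reaches the sink before the second input line is even read, which is the low-latency property the slide contrasts with batch ETL.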
SQLstream Dataflow Technology
Pipelining and Superscalar Parallel Processing
Fine-grained parallelism: simple, massively scalable, super fast.
Query Processor =
A Streaming SQL Query
Cloud Infrastructure Monitoring with Bollinger Bands
Business need: Predict run-away applications before resource consumption becomes an issue.
SELECT STREAM ROWTIME, url, numErrorsLastMinute
FROM (
  SELECT STREAM ROWTIME, url, numErrorsLastMinute,
    AVG(numErrorsLastMinute) OVER lastMinute AS avgErrorsPerMinute,
    STDDEV(numErrorsLastMinute) OVER lastMinute AS stdDevErrorsPerMinute
  FROM ServiceRequestsPerMinute
  WINDOW lastMinute AS (PARTITION BY url RANGE INTERVAL '1' MINUTE PRECEDING)
) AS S
WHERE S.numErrorsLastMinute > S.avgErrorsPerMinute + 2 * S.stdDevErrorsPerMinute;
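For readers less familiar with windowed SQL, the logic of the query above can be approximated in plain Python: keep a sliding one-minute window per url, and emit a row when its error count exceeds the window average plus two standard deviations. This is a rough sketch, not the s-Server implementation. The function name, the tuple layout and the use of the sample standard deviation for `STDDEV` are assumptions.

```python
from collections import defaultdict, deque
from statistics import mean, stdev


def bollinger_alerts(rows, window_seconds=60, num_std=2.0):
    """Yield (rowtime, url, errors) for rows above the upper Bollinger band.

    rows: iterable of (rowtime_seconds, url, errors), ordered by rowtime.
    The window is kept per url and includes the current row.
    """
    windows = defaultdict(deque)  # url -> deque of (rowtime, errors)
    for rowtime, url, errors in rows:
        window = windows[url]
        window.append((rowtime, errors))
        # Evict rows that have fallen out of the time window.
        while window[0][0] < rowtime - window_seconds:
            window.popleft()
        values = [e for _, e in window]
        if len(values) < 2:
            continue  # sample standard deviation needs at least two points
        upper_band = mean(values) + num_std * stdev(values)
        if errors > upper_band:
            yield rowtime, url, errors
```

One consequence of including the current row in its own window: with a two-sigma band and the sample standard deviation, no row can be flagged until its window holds at least six rows, so the sketch stays silent while a url's window is still filling.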
Customer Benchmarked Example Application
[Diagram labels: Network Data (x5), Remote Agent (x5), SQLstream (ENRICH, ANALYZE, SHARE), External Data, Data Warehouse, External Systems]
PERFORMANCE STATISTICS
System Throughput: 1.35M events / sec
Server Configuration: 1 x 4-core CPU
Event Size: ~1KB
Data Sources: Many
SYSTEM CHARACTERISTICS
Collection: Intelligent Remote Agents (Distributed)
Enrichment: Streaming data augmentation
Analytics: Temporal & spatial pattern detection
Output: Data warehouse + applications (JDBC)
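To put the benchmark figures above in perspective, the short calculation below derives the implied data rate and per-core load. It is back-of-the-envelope arithmetic only: it assumes "~1KB" means 1,000 bytes and that work is spread evenly over the four cores, neither of which the slide states.

```python
EVENTS_PER_SEC = 1_350_000   # "1.35M events / sec" from the slide
EVENT_SIZE_BYTES = 1_000     # "~1KB", taken as 1000 bytes (assumption)
CORES = 4                    # "1 x 4-core CPU" from the slide


def throughput_summary(events_per_sec, event_size_bytes, cores):
    """Derive data rate and per-core load from the headline figures."""
    bytes_per_sec = events_per_sec * event_size_bytes
    return {
        "megabytes_per_sec": bytes_per_sec / 1_000_000,
        # Assumes an even spread of events over the cores (not stated in the source).
        "events_per_sec_per_core": events_per_sec / cores,
        "nanoseconds_per_event_per_core": 1e9 * cores / events_per_sec,
    }


summary = throughput_summary(EVENTS_PER_SEC, EVENT_SIZE_BYTES, CORES)
```

Under these assumptions the server ingests about 1,350 MB per second, and each core has roughly 3 microseconds per event.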
SQLstream Product Portfolio
➔ s-Server: Core Streaming Data Management and Integration Platform
➔ s-Analyzer: Real-time data stream visualization and dashboards
➔ s-Studio: Developer and administration console
➔ s-Cloud: Cloud-based EC2 offering
➔ s-Transport: GPS, location-based and geospatial analytics module
Real-time Web Server Log Monitoring
Mozilla (Google: “YouTube Mozilla Glow”)
Real-time monitoring across all download web servers across the world simultaneously.
➔ Collect
Remote agents transform log files into real-time streams
➔ Analyze
Real-time analysis & aggregation by location
➔ Share
Continuous ETL into Hadoop HBase
Internet ‘Glow’ app for real-time visualization
[Diagram labels: Web Server Log Files (Remote), Hadoop HBase]
Streaming collection, real-time analysis and continuous integration by location
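The "aggregation by location" step above can be illustrated with a simple tumbling-window count in Python. This is a generic sketch and not how the Mozilla deployment is implemented: the function name, the `(timestamp, location)` event shape and the one-minute window are assumptions made for the example.

```python
from collections import Counter, defaultdict


def downloads_per_location(events, window_seconds=60):
    """Count events per location in fixed (tumbling) time windows.

    events: iterable of (timestamp_seconds, location) pairs.
    Returns {window_start_seconds: Counter({location: count})}.
    """
    counts = defaultdict(Counter)
    for timestamp, location in events:
        # Map each event to the start of the window that contains it.
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[window_start][location] += 1
    return dict(counts)
```

Each per-window counter is the kind of small, continuously updated result that could be written onward to a store such as HBase or pushed to a visualization.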
Real-time Traffic Analytics
Transform GPS data into real-time traffic flow information
[Diagram labels: GPS Vehicle Data Feeds, Geo-DB]
Streaming transformation of GPS data into Traffic Flow and Congestion Prediction Events
Real-time traffic flow and congestion prediction from vehicle GPS data.
➔ Collect
Collect, cleanse and filter vehicle GPS data feeds
➔ Analyze
‘Snap-to-map’
Transform GPS records into traffic flow information
Prediction events for congestion alerts
➔ Share
Real-time Google Maps and Google Earth Displays
Web and Smartphone access
Real-time Operational Intelligence
Market Comparison
ENTERPRISE CAPABILITY | TRADITIONAL BIG DATA OPERATIONAL INTELLIGENCE TOOLS | SQLSTREAM STREAMING BIG DATA PLATFORM | SQLSTREAM BENEFITS
True Real-time | Moderate to high latency. Incomplete answers. | Real-time low latency. Complete answers. | Instant results.
Sophisticated Analytics | Simple patterns. No real power. | Full SQL power. Very high-level, concise. | Elegantly handle every business need.
Joins & Correlation | Operates on a single feed only. | Join & correlate across multiple different feeds. | Compare and combine info in real-time.
Data Enrichment & Integration | Weak, simplistic. Incomplete. | Continuous, powerful. Comprehensive. | Create complete answers continuously.
Big Data Scalability | Limited scalability. Cost prohibitive. No parallel processing. | Massively scalable. Inexpensive. Massively parallel. | Delivers low cost, high performance needed for real-time big data.
Development Ease | Proprietary and low level. Expensive. Time consuming. | Standard SQL. Optimized. Parallel. | Instant productivity. No hidden obstacles.
A New Data Management Quadrant
Axis: High Level Declarative Language & Operation (SQL) vs. Low Level Procedural Language & Operation (C++, C#, Java, Pig, JCL, etc.)
Axis: Historical Analysis, Periodic Batches vs. Continuous Analysis, Real-time Processing
DATA WAREHOUSES: Stale snapshots. Not real-time. Costly recalculations.
BATCHED BIG DATA: Batch processing. Low-level but scalable. Extensive coding.
MESSAGING MIDDLEWARE: Low-level software. Scattered business logic. Brittle with high TCO.
STREAMING BIG DATA: Always current. Streaming integration. Rapid development.
BIG DATA ON TAP™ – Delivered.
DATA EXPLOSION
COMPLEXITY
BUSINESS AGILITY
Eliminates the development risk and pain.
• Real-time parallel processing made simple, scalable and fast.
Slashes the TCO for real-time analysis.
• Scales easily and continuously processes data in real-time.
Makes adding new apps easy.
• Create powerful real-time apps, and share results easily.
Windows of Opportunity: Big Data on Tap™
Thanks! Damian Black
CEO, SQLstream
Analyst: Robin Bloor
Perceptions & Questions
The Bloor Group
Harnessing Data Flow
Hadoop Is The Reservoir
• Because of its flexibility and scalability as a data store, Hadoop has become the natural reservoir for data, but…
– Hadoop is a multi-purpose engine, but not a performance engine – do not be fooled by its parallelism
– Sometimes you don’t have time to drop the data into Hadoop first; it is not necessarily the first port of call for data
– Sometimes it may be better to leave data where it is, and just replicate
Event Processing
Event Stream Processing
Operational Intelligence
• Real-time BI could also be called Operational Intelligence (OI)
• It poses three problems:
– How to establish the stream data flow (at an acceptable speed)
– How to process the data
– How to manage the data
A Side Comment…
• We are familiar with the issue of “Data Life Cycle”
• This issue didn’t just evaporate with the advent of Big Data and Streams Processing – it became more important
• Does the platform include its own database?
• You use an enhanced SQL for streams processing. Can it handle unstructured data (such as a tweet stream)?
• You characterize your analytics as being “advanced.” Can you expand on what you mean by that? What analytic capabilities does it include?
• It wasn’t entirely clear to me how you integrate with legacy data warehouse data flows. Consider data cleansing, for example. How does the SQLstream Platform accommodate that?
• Which sectors/businesses do you expect to be able to make best use of this technology?
• Which companies/products do you regard as competitors (either directly or close competitors)?
• Which companies/products do you partner with?
• How is the product/platform priced? How is the cloud version priced?
Upcoming Topics
This month: Big Data
February: Analytics
March: Open Source
April: Intelligence
www.insideanalysis.com
Thank You for Your Attention