WSO2Con EU 2015: An Introduction to the WSO2 Data Analytics Platform
Introduction to WSO2 Analytics Platform
Srinath Perera VP Research WSO2 Inc.
Analytics is Growing Up
▪ It is no longer about doing your first analytics use case.
▪ It is about
  - How to do it every day, efficiently?
  - How to recover?
  - How to make decisions?
  - How to do other forms like real-time, interactive, and predictive analytics?
Analytics 2.0 Platform
▪ One platform for all four forms of analytics
▪ Single consistent programming model
▪ One analytics archive format
▪ Support for the lifecycle of analytics apps
Integrates well with the rest of the enterprise!
Collect Data
▪ One Sensor API to publish events
  - REST, Thrift, JMS, Kafka
  - Java clients, JavaScript clients*
▪ First you define streams (think of a stream as an infinite table in a SQL DB)
▪ Then send events via the Sensor API
Events can be sent to the batch pipeline, the realtime pipeline, or both via configuration!
Collecting Data: Example
▪ Java example: create and send events
▪ Events are sent asynchronously
▪ See the client given in http://goo.gl/vIJzqc for more info
// Initialize Agent
Agent agent = new Agent(agentConfiguration);
publisher = new AsyncDataPublisher("tcp://hostname:7612", .. );

// Define Stream
StreamDefinition definition = new StreamDefinition(STREAM_NAME, VERSION);
definition.addPayloadData("sid", STRING);
...
publisher.addStreamDefinition(definition);
...

// Send events
Event event = new Event();
event.setPayloadData(eventData);
publisher.publish(STREAM_NAME, VERSION, event);
Analysis: Batch Analytics
Complex Event Processing
Analytics Logic with SQL-like Queries
▪ Both BAM and CEP provide a SQL-like data processing language
▪ Since many people understand SQL, these languages have made large-scale (Big Data) processing accessible to many
▪ Expressive, short, and sweet
▪ Define core operations that cover 90% of problems
▪ Let experts dig in when they like (via user-defined functions)
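To make the idea of a "core operation" concrete, here is a minimal plain-Java sketch of what a short SQL-like streaming query might compute: keep only events at or above a threshold, hold the last few of them in a window, and emit the running average. The class name `SlidingAverageQuery` and its methods are hypothetical and are not part of the BAM or CEP APIs; the point is only to show how much hand-written logic a one-line declarative query can stand in for.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration only; this is not the WSO2 CEP or BAM API.
final class SlidingAverageQuery {
    private final int windowLength;   // how many recent events to keep
    private final double minValue;    // filter threshold (plays the role of a WHERE clause)
    private final Deque<Double> window = new ArrayDeque<>();
    private double sum = 0.0;

    SlidingAverageQuery(int windowLength, double minValue) {
        this.windowLength = windowLength;
        this.minValue = minValue;
    }

    // Processes one event; returns the updated window average,
    // or null when the event is dropped by the filter.
    Double onEvent(double value) {
        if (value < minValue) {
            return null;
        }
        window.addLast(value);
        sum += value;
        if (window.size() > windowLength) {
            sum -= window.removeFirst(); // evict the oldest event
        }
        return sum / window.size();
    }

    // Runs the query over a finite list of values and collects the emitted averages.
    static List<Double> run(double[] values, int windowLength, double minValue) {
        SlidingAverageQuery query = new SlidingAverageQuery(windowLength, minValue);
        List<Double> output = new ArrayList<>();
        for (double value : values) {
            Double average = query.onEvent(value);
            if (average != null) {
                output.add(average);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        System.out.println(run(new double[] {10, 5, 20, 30, 40}, 3, 10));
    }
}
```

A user-defined function would plug in where `onEvent` computes the average, replacing it with custom logic.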
Scaling CEP Queries on top of Storm
▪ Accepts CEP queries with hints about how to partition streams
▪ Partitions the streams, builds an Apache Storm topology running CEP nodes as Storm Spouts, and runs it (see http://goo.gl/pP3kdX)
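The partitioning step can be sketched in plain Java as follows. This is an illustration under assumptions, not the actual WSO2 or Storm code: the class `StreamPartitioner` is hypothetical, and hash-based routing on a partition key is assumed as the partitioning scheme. It shows the property that matters: all events with the same key land on the same worker, so each worker can run its CEP query independently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not the WSO2 CEP or Apache Storm API.
final class StreamPartitioner {
    // Maps a partition key to a worker index in [0, workers).
    static int partitionFor(String key, int workers) {
        return Math.floorMod(key.hashCode(), workers);
    }

    // Splits a stream of keyed events into one sub-stream per worker.
    static List<List<String>> partition(List<String> eventKeys, int workers) {
        List<List<String>> subStreams = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            subStreams.add(new ArrayList<>());
        }
        for (String key : eventKeys) {
            subStreams.get(partitionFor(key, workers)).add(key);
        }
        return subStreams;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("house-1", "house-2", "house-1", "house-3", "house-2");
        List<List<String>> subStreams = partition(keys, 4);
        for (int i = 0; i < subStreams.size(); i++) {
            System.out.println("worker " + i + ": " + subStreams.get(i));
        }
    }
}
```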
Predictive Analytics
▪ Predictive analytics learns a decision function (a model) using examples
  - Is this fraud?
  - How to drive?
  - Handwritten text
▪ Build models and use them with WSO2 CEP, BAM, and ESB using the WSO2 Machine Learner product (2015 Q3)
▪ Build models using R, export them as PMML, and use them within WSO2 CEP
WSO2 Machine Learner
▪ A wizard to sample, explore, and understand data through visualizations
▪ A wizard to configure and train machine learning models, and select the best model
▪ Find and use those models with WSO2 CEP, BAM, and ESB
▪ Powered by Apache Spark MLlib
Communicate: Dashboards
▪ The idea is to give an “overall idea” at a glance (e.g. a car dashboard)
▪ Support for personalization: you can build your own dashboard
▪ Also the entry point for drill-down
▪ How to build?
  - Dashboard via Google Gadget and content via HTML5 + JavaScript
  - Use charting libraries like Vega or D3
Communicate: Alerts
▪ Detecting conditions can be done via CEP queries
▪ The key is the “last mile”
  - Email
  - SMS
  - Push notifications to a UI
  - Pager
  - Trigger a physical alarm
▪ How?
  - Select the email sender “Output Adaptor” from CEP, or send from CEP to the ESB, which has many connectors
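The alerting flow above (detect a condition, then fan out over "last mile" channels) can be sketched in plain Java. The `AlertRouter` class and the idea of registering channels as callbacks are assumptions made for illustration; they are not the CEP Output Adaptor API or an ESB connector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.DoublePredicate;

// Hypothetical illustration only; not the WSO2 CEP output adaptor or ESB connector API.
final class AlertRouter {
    private final DoublePredicate condition;                      // the detected condition
    private final List<Consumer<String>> channels = new ArrayList<>(); // "last mile" senders

    AlertRouter(DoublePredicate condition) {
        this.condition = condition;
    }

    // Registers a delivery channel such as email, SMS, or a UI push.
    AlertRouter addChannel(Consumer<String> channel) {
        channels.add(channel);
        return this;
    }

    // Evaluates one reading; when the condition holds, sends the alert to every channel.
    boolean onReading(String source, double value) {
        if (!condition.test(value)) {
            return false;
        }
        String message = "ALERT " + source + " value=" + value;
        for (Consumer<String> channel : channels) {
            channel.accept(message);
        }
        return true;
    }

    public static void main(String[] args) {
        AlertRouter router = new AlertRouter(v -> v > 100.0)
                .addChannel(m -> System.out.println("email: " + m))
                .addChannel(m -> System.out.println("sms: " + m));
        router.onReading("sensor-1", 50.0);   // below threshold, nothing sent
        router.onReading("sensor-1", 120.5);  // above threshold, both channels fire
    }
}
```

Keeping the condition separate from the channels mirrors the split on the slide: detection stays in the query, delivery is chosen independently.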
Communicate: APIs
▪ With mobile apps, most data is exposed and shared as APIs (REST/JSON) to end users.
▪ Need to expose analytics results as APIs
▪ Following are some challenges
  - Security and permissions
  - API discovery
  - Billing, throttling, quotas & SLA
▪ How?
  - Write data to a database from CEP event tables
  - Build services via WSO2 Data Service
  - Expose them as APIs via API Manager
Event Stream Store
▪ One-stop place for all event stream definitions
▪ Lets users
  - Publish and consume through multiple protocols like REST, JMS, Thrift, Web Sockets, etc.
  - Discover event streams
▪ Enforce security and authorization
▪ Per-pay subscriptions
▪ Effectively an event stream marketplace!
▪ This will automate API creation as discussed in the previous slide.
What is it good for?
▪ Batch Analytics
▪ Realtime Streaming analytics
▪ Realtime Interactive analytics
▪ Lambda Architecture
▪ Train and use an ML model
▪ Selective Detailed Analysis
Selective Detailed Analysis
• Too expensive to do detailed analysis on all the data
• Instead detect the condition, and dig into related data
• Fraud toolbox
• Other use cases
– Dynamic offers at Retail Site
– Weather
Lambda Architecture
• Same code in both batch and realtime layers
• Idea is to fill the time between two batch runs
• Batch layer writes the data to a DB
• Realtime layer merges with batch data via Event Tables
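A minimal plain-Java sketch of the merge described above, using per-key counts as the analytic. Everything here is hypothetical (`LambdaView` and its methods are not WSO2 APIs), and it makes one simplifying assumption: a batch run covers every event seen up to the moment it completes, so the realtime deltas can be discarded at that point.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the WSO2 Event Table API.
final class LambdaView {
    private final Map<String, Long> batchCounts = new HashMap<>();    // written by the batch layer
    private final Map<String, Long> realtimeCounts = new HashMap<>(); // events since the last batch run

    // Realtime layer: count each event as it arrives.
    void onRealtimeEvent(String key) {
        realtimeCounts.merge(key, 1L, Long::sum);
    }

    // Batch layer: publish fresh results. Assumes the run covered all events
    // seen so far, so the realtime deltas start again from zero.
    void onBatchRunCompleted(Map<String, Long> freshBatchCounts) {
        batchCounts.clear();
        batchCounts.putAll(freshBatchCounts);
        realtimeCounts.clear();
    }

    // Query: batch result plus whatever arrived since the last batch run.
    long query(String key) {
        return batchCounts.getOrDefault(key, 0L) + realtimeCounts.getOrDefault(key, 0L);
    }

    public static void main(String[] args) {
        LambdaView view = new LambdaView();
        Map<String, Long> batch = new HashMap<>();
        batch.put("taxi-1", 3L);
        view.onBatchRunCompleted(batch);
        view.onRealtimeEvent("taxi-1");
        view.onRealtimeEvent("taxi-1");
        System.out.println(view.query("taxi-1"));
    }
}
```

The realtime map only ever holds the gap between two batch runs, which is why it can stay small even when the full history is large.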
Real Life Use Cases
▪ Health, Smart Parking solutions
▪ Financial Monitoring
▪ Smart City project, Vehicle tracking, Building monitoring
▪ Railway monitoring
▪ Throttling and Anomaly Detection
▪ API Analytics (13+ customers)
▪ Connected Car
Case Study: DEBS Grand Challenges
▪ The DEBS (Distributed Event Based Systems) Grand Challenge is a yearly event processing challenge.
▪ 2014 Challenge:
  - Smart home electricity data: 2000 sensors, 40 houses, 4 billion events. We posted 400K events/sec, and close to one million distributed throughput with 4 nodes.
  - One of the four finalists
▪ 2015 Challenge:
  - Based on taxi activities collected from New York City over the year 2013: 14,144 taxis, 173 million taxi trip records. We posted 300K/sec on a single node and were one of the finalists.
https://www.flickr.com/photos/shedboy/3681317392/
Case Study: Realtime Soccer Analysis
Watch at: https://www.youtube.com/watch?v=nRI6buQ0NOM
Case Study: TFL Traffic Analysis
Built using TFL (Transport for London) open data feeds.
http://goo.gl/04tX6k
http://goo.gl/9xNiCm
Select the Product
Product | Features
WSO2 Data Analytics Server (DAS) | Everything: Batch, Realtime, Interactive, and Predictive Analytics
WSO2 Complex Event Processor (CEP) | Realtime Analytics only
WSO2 Machine Learner | Predictive Analytics only
Questions?
Thank You