DEBS 2015 Tutorial : Patterns for Realtime Streaming Analytics
Streaming Analytics and Internet of Things - Geesara Prathap
Transcript of Streaming Analytics and Internet of Things - Geesara Prathap
Challenges
• How fast do we need results?
• How much data to keep?
• Common language?
• Do we have centralized data storage and processing units?
• Knowledge of the past data only?
Analytics Platform
The WSO2 Analytics platform combines simultaneous real-time and batch analytics with predictive analytics to turn data from IoT, mobile, and web apps into actionable insights.
Analytics Strategy
A single platform to address all analytics styles:
● Batch Analytics: analytics on data at rest, typically running every hour or every day, and focused on historical analytics dashboards and reports
● Real-time Analytics: analyzes event streams in real time and detects patterns and conditions
● Predictive Analytics: leverages machine learning to create a mathematical model that predicts future behavior
● Interactive Analytics: executes queries on the fly on top of data at rest
Streaming Analytics in Other Words
● Gather data from multiple sources
● Correlate data streams over time
● Find interesting occurrences
● Notify
Basic Building Blocks
● Receivers: data collection point, associated with a specific data connector
● Publishers: data publishing point, associated with a specific data connector
● Event Streams: event data flowing through the system
● Execution Plans: execution pipeline applied to event streams
● Siddhi: codename for the streaming engine
● SiddhiQL: SQL-like query language
Event Streams
● An event stream is a sequence of events
● Event streams are defined by a stream definition
● Event streams have inflows and outflows
● Inflows can be from
○ Event receivers
○ Execution plans
● Outflows are to
○ Event publishers
○ Execution plans
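The flow described above (receiver, stream, execution plan, publisher) can be sketched in plain Python. This is only an illustration under stated assumptions, not the WSO2 or Siddhi API: `StreamDefinition`, `EventStream`, the stream names, the attributes, and the 100 °F threshold are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Event = Dict[str, object]


@dataclass
class StreamDefinition:
    """Hypothetical stream definition: a name plus typed attributes."""
    name: str
    attributes: Dict[str, type]

    def validate(self, event: Event) -> None:
        # Reject events that do not match the declared attribute types.
        for attr, typ in self.attributes.items():
            if not isinstance(event.get(attr), typ):
                raise TypeError(f"{self.name}.{attr} must be {typ.__name__}")


@dataclass
class EventStream:
    """Inflows call send(); outflows are the subscribed consumers."""
    definition: StreamDefinition
    subscribers: List[Callable[[Event], None]] = field(default_factory=list)

    def subscribe(self, consumer: Callable[[Event], None]) -> None:
        self.subscribers.append(consumer)

    def send(self, event: Event) -> None:
        self.definition.validate(event)
        for consumer in self.subscribers:
            consumer(event)


schema = {"sensor": str, "temp_f": float}
temperature = EventStream(StreamDefinition("TemperatureStream", schema))
alerts = EventStream(StreamDefinition("AlertStream", schema))


def high_temperature_plan(event: Event) -> None:
    """Stand-in for an execution plan: reads one stream, writes to another."""
    if event["temp_f"] > 100.0:  # assumed threshold
        alerts.send(event)


published: List[Event] = []
temperature.subscribe(high_temperature_plan)  # outflow to an execution plan
alerts.subscribe(published.append)            # outflow to a "publisher"

# Stand-in for an event receiver feeding the inflow.
for incoming in [{"sensor": "s1", "temp_f": 72.0},
                 {"sensor": "s2", "temp_f": 104.5}]:
    temperature.send(incoming)
```

After the loop, `published` holds only the event from sensor `s2`, since it is the only one above the assumed threshold.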
Data Connectors
● The following connectors are available out of the box:
○ Source: Email, File, JMS, Kafka, MQTT, SOAP, WebSocket, Thrift, Binary, Log, and JMX receivers
○ Sink: RDBMS, Cassandra, SMS, Email, File, HTTP, JMS, Kafka, MQTT, SOAP, WebSocket, Thrift, Binary
● Incoming/outgoing data can be mapped using XPath, regular expressions, or JSON paths
● Data connectors are common across the analytics platform
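To make the mapping step concrete, here is a minimal Python sketch of turning an incoming JSON message or text line into a flat event. It does not use the platform's mapping configuration: `get_path` is a hypothetical helper that handles only simple dotted paths (not full JSONPath), and the message layouts and attribute names are assumptions.

```python
import json
import re


def get_path(document, path):
    """Resolve a simplified dotted path such as 'payload.temp'."""
    value = document
    for key in path.split("."):
        value = value[key]
    return value


# Hypothetical mapping: event attribute -> path inside the incoming JSON.
JSON_MAPPING = {"sensor": "device.id", "temp_f": "payload.temp"}


def map_json(message: str) -> dict:
    """Map a JSON message to a flat event using the paths above."""
    document = json.loads(message)
    return {attr: get_path(document, path) for attr, path in JSON_MAPPING.items()}


# Hypothetical regular-expression mapping for plain-text input.
TEXT_PATTERN = re.compile(r"sensor=(?P<sensor>\w+)\s+temp=(?P<temp_f>-?\d+(?:\.\d+)?)")


def map_text(line: str) -> dict:
    """Map a text line such as 'sensor=s2 temp=101.5' to a flat event."""
    match = TEXT_PATTERN.search(line)
    if match is None:
        raise ValueError(f"line does not match the expected layout: {line!r}")
    return {"sensor": match.group("sensor"), "temp_f": float(match.group("temp_f"))}
```

Both functions produce the same event shape, which is the point of a mapping layer: downstream queries do not need to know which connector the data arrived on.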
Real-time Analytics Patterns
● Simple counting (e.g. failure count)
● Counting with windows (e.g. failure count every hour)
● Preprocessing: filtering, transformations (e.g. data cleanup)
● Alerts, thresholds (e.g. alarm on high temperature)
● Data correlation, detecting missing events, detecting erroneous data (e.g. detecting failed sensors)
● Joining event streams (e.g. detecting a hit on a soccer ball)
● Merging with data in a database; collecting and updating data conditionally
Real-time Analytics Patterns
● Detecting event sequence patterns (e.g. a small transaction followed by a large transaction)
● Tracking: following a related entity’s state in space, time, etc. (e.g. location of airline baggage, vehicles, wildlife tracking)
● Detecting trends: rise, turn, fall, outliers, and complex trends such as triple bottom (e.g. algorithmic trading, SLA, load balancing)
● Learning a model (e.g. predictive maintenance)
● Predicting the next value and taking corrective actions (e.g. automated car)
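As a small illustration of the "rise, turn, fall" idea in the trend-detection item, the following Python sketch reports the points where a series changes direction. It is a hedged toy, not how any particular CEP engine implements trend detection; `detect_turns` and the sample values are hypothetical.

```python
def detect_turns(values):
    """Return (index, kind) pairs where the series turns.

    kind is 'peak' when a rise is followed by a fall and 'trough' when a
    fall is followed by a rise. Flat steps are ignored, so a plateau is
    reported at its last index.
    """
    turns = []
    previous_direction = 0  # +1 rising, -1 falling, 0 unknown so far
    for i in range(1, len(values)):
        delta = values[i] - values[i - 1]
        direction = (delta > 0) - (delta < 0)
        if direction == 0:
            continue
        if previous_direction > 0 and direction < 0:
            turns.append((i - 1, "peak"))
        elif previous_direction < 0 and direction > 0:
            turns.append((i - 1, "trough"))
        previous_direction = direction
    return turns


readings = [1, 2, 3, 2, 1, 2, 4, 3]
turning_points = detect_turns(readings)
```

More complex trends such as a triple bottom could be built on top of this by matching a sequence of troughs at similar levels, which is where a pattern operator becomes useful.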
CEP = SQL for Real-time Analytics
● Easy to follow from SQL
● Expressive, short, and sweet
● Defines core operations that cover 90% of problems
● Lets experts dig in when they like!
Let’s look at the core operations
Operators: Filters
Assume a temperature stream. Here weather:convertFtoC() is a user-defined function; such functions are used to extend the language.
Use cases:
- Alerts, thresholds (e.g. alarm on high temperature)
- Preprocessing: filtering, transformation (e.g. data cleanup)
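The query shown on the slide is not part of this transcript, so the following is only a Python sketch of the same idea: a filter over a temperature stream that calls a user-defined conversion function. `convert_f_to_c` stands in for `weather:convertFtoC()`; the event attributes and the 30 °C threshold are assumptions.

```python
def convert_f_to_c(temp_f: float) -> float:
    """User-defined function: convert Fahrenheit to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def high_temperature(events, threshold_c=30.0):
    """Filter: pass through only events hotter than threshold_c,
    transformed to carry the Celsius reading."""
    for event in events:
        temp_c = convert_f_to_c(event["temp_f"])
        if temp_c > threshold_c:
            yield {"sensor": event["sensor"], "temp_c": round(temp_c, 1)}


temperature_stream = [
    {"sensor": "s1", "temp_f": 68.0},
    {"sensor": "s2", "temp_f": 104.0},
    {"sensor": "s3", "temp_f": 50.0},
]
alarms = list(high_temperature(temperature_stream))
```

The generator covers both use cases at once: the `if` is the threshold alert, and the rebuilt dictionary is the preprocessing/transformation step.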
Operators: Windows and Aggregation
Supports many window types: batch windows, sliding windows, custom windows.
Use cases:
- Simple counting (e.g. failure count)
- Counting with windows (e.g. failure count every hour)
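The difference between batch and sliding windows is easiest to see on a small example. The Python sketch below counts failures per hour both ways; the function names and timestamps are hypothetical, and timestamps are assumed to be in seconds and sorted in ascending order.

```python
from collections import Counter, deque


def batch_window_counts(timestamps, window_seconds):
    """Batch (tumbling) window: one count per fixed, non-overlapping interval.

    Returns (window_start, count) pairs for intervals that saw any event.
    """
    counts = Counter(int(ts // window_seconds) for ts in timestamps)
    return [(index * window_seconds, counts[index]) for index in sorted(counts)]


def sliding_window_counts(timestamps, window_seconds):
    """Sliding window: after each event, count the events that fall in the
    interval (ts - window_seconds, ts]."""
    window = deque()
    result = []
    for ts in timestamps:
        window.append(ts)
        # Expire events that have slid out of the window.
        while window[0] <= ts - window_seconds:
            window.popleft()
        result.append(len(window))
    return result


failure_times = [10, 1800, 3500, 3700, 7300]
per_hour = batch_window_counts(failure_times, 3600)
rolling = sliding_window_counts(failure_times, 3600)
```

The batch version emits one result per hour, while the sliding version emits an updated count on every event, which is usually what a threshold alert needs.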
Operators: Patterns
Models a followed-by relation, e.g. event A followed by event B. A very powerful tool for tracking and detecting patterns.
Use cases:
- Detecting event sequence patterns
- Tracking
- Detecting trends
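A followed-by relation can be sketched in Python with a small piece of per-key state. The example uses the "small transaction followed by a large transaction" case mentioned earlier in the deck; the function name, the amount thresholds, and the 600-second limit are all assumptions made for illustration.

```python
def small_then_large(transactions, small=10.0, large=1000.0, within=600):
    """Match event A (small amount) followed by event B (large amount)
    on the same card within `within` seconds.

    Transactions must be ordered by timestamp. Returns a list of
    (card, ts_of_small, ts_of_large) tuples.
    """
    last_small = {}  # card -> timestamp of the most recent small transaction
    matches = []
    for tx in transactions:
        card, amount, ts = tx["card"], tx["amount"], tx["ts"]
        if amount < small:
            last_small[card] = ts
        elif amount > large:
            started = last_small.pop(card, None)
            if started is not None and ts - started <= within:
                matches.append((card, started, ts))
    return matches


transactions = [
    {"card": "c1", "amount": 2.0, "ts": 0},
    {"card": "c2", "amount": 5000.0, "ts": 30},    # no small transaction before it
    {"card": "c1", "amount": 40.0, "ts": 60},      # neither small nor large
    {"card": "c1", "amount": 2500.0, "ts": 120},   # completes the pattern for c1
    {"card": "c3", "amount": 1.0, "ts": 200},
    {"card": "c3", "amount": 9000.0, "ts": 2000},  # too late to count
]
suspicious = small_then_large(transactions)
```

A CEP engine expresses the same thing declaratively and manages the partial-match state itself; the dictionary here plays that role by hand.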
Operators: Joins
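For joining event streams, the deck's earlier example is detecting a hit on a soccer ball. A minimal Python sketch of a windowed join under that scenario follows. It is not the engine's join implementation: `windowed_join`, the `left`/`right` stream labels, the event attributes, the 1-second window, and the 1-metre distance are all assumptions.

```python
import math
from collections import deque


def windowed_join(events, window_seconds, condition):
    """Join two time-ordered streams.

    `events` is an iterable of (side, event) with side 'left' or 'right'
    and each event carrying a 'ts' field. Every arriving event is compared
    with the events of the opposite stream that are at most
    `window_seconds` older; pairs satisfying `condition` are returned.
    """
    recent = {"left": deque(), "right": deque()}
    joined = []
    for side, event in events:
        other = "right" if side == "left" else "left"
        # Expire opposite-side events that fell out of the window.
        while recent[other] and recent[other][0]["ts"] < event["ts"] - window_seconds:
            recent[other].popleft()
        for candidate in recent[other]:
            left, right = (event, candidate) if side == "left" else (candidate, event)
            if condition(left, right):
                joined.append((left, right))
        recent[side].append(event)
    return joined


def near(player, ball, max_distance=1.0):
    """Join condition: player and ball are within max_distance metres."""
    return math.hypot(player["x"] - ball["x"], player["y"] - ball["y"]) <= max_distance


merged = [
    ("left", {"ts": 0.0, "player": "p1", "x": 0.0, "y": 0.0}),
    ("right", {"ts": 0.5, "x": 0.5, "y": 0.0}),    # close to p1 in time and space
    ("left", {"ts": 1.0, "player": "p2", "x": 10.0, "y": 10.0}),
    ("right", {"ts": 5.0, "x": 10.2, "y": 10.0}),  # near p2, but too late
]
hits = windowed_join(merged, window_seconds=1.0, condition=near)
hit_summary = [(player["player"], ball["ts"]) for player, ball in hits]
```

The window bounds how much of each stream must be kept in memory, which is why stream joins are normally defined together with a window.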