WSO2 Product Release Webinar - Introducing the WSO2 Complex Event Processor

WSO2 Product Release Webinar WSO2 Complex Event Processor 2.0.1 Simplifying High Performant Data Processing S. Suhothayan (Suho) Software Engineer, Data Technologies Team.

Transcript of WSO2 Product Release Webinar - Introducing the WSO2 Complex Event Processor


Outline
• What is Complex Event Processing?
• WSO2 CEP Server & SOA integration
• The Siddhi Runtime CEP Engine
• High availability, persistence and scalability of WSO2 CEP
• How CEP can be combined with Business Activity Monitoring (BAM)
• Demo

Complex Event Processing?

Complex Event Processing is about listening to events and detecting patterns in near real-time without storing all events.


CEP Is & Is NOT!
• Is NOT!
  o Simple filters - Simple Event Processing
    - E.g. Is this a gold or platinum customer?
  o Joining multiple event streams - Event Stream Processing
• Is!
  o Processing multiple event streams
  o Identifying meaningful patterns among streams
  o Using temporal windows
    - E.g. Notify if there is a 10% increase in overall trading activity AND the average price of commodities has fallen 2% in the last 4 hours

WSO2 CEP Server
• Enterprise grade server for CEP runtimes
• Supports several transports (network access)
• Supports several data formats
• Support for multiple CEP runtimes
• Governance
• Monitoring
• Tools (WSO2 Dev Studio)

WSO2 CEP Architecture

CEP Brokers
• A broker is an adaptor for receiving and publishing events
• It holds the configuration needed to connect to external endpoints
• Brokers are many-to-many with CEP engines

CEP Brokers
• Support for several transports (network access) and data formats
  o SOAP/WS-Eventing
    - XML messages
  o REST
    - JSON messages
  o JMS
    - Map messages
    - XML messages
    - Text messages
  o SMTP (Email)
    - Text messages
  o Thrift - WSO2 data format
    High Performant Event Capturing & Delivery Framework; supports Java/C/C++/C# via Thrift language bindings
    - WSO2 Event
• Brokers are pluggable!

CEP Buckets
• A bucket is an isolated logical execution unit
• Each CEP bucket has a set of
  o Queries
  o Input & output event mappings
• A bucket is one-to-one with a CEP backend runtime engine

Open source CEP Runtimes for Buckets
• Siddhi
  o Apache License, a Java library, tuple based event model
  o Supports distributed processing
  o Supports multiple query models
    - Based on a SQL-like language
    - Filters, Windows, Joins, Ordering and others
• Esper, http://esper.codehaus.org (Deprecated)
  o GPLv2 License, a Java library, events can be XML, Map, Object
  o Supports multiple query models
    - Based on a SQL-like language
    - Filters, Windows, Joins, Ordering and others
• Drools Fusion (Deprecated)
  o Apache License, a Java library
  o Support for temporal reasoning + windows

Management UI
• To define, manage & monitor
  o buckets
  o brokers (data adapters)

Developer Studio UI
• Eclipse based tool to define buckets
• Can manage the configurations throughout the production lifecycle
• Note: 2.1.0 still does not support Text Output Mapping

Monitoring
• Provides real-time statistical visual illustrations of request & response counts over time, per CEP server, bucket, broker and topic.

Understanding Siddhi CEP Runtime Engine

Siddhi Queries
• Filters and Projection
• Windows
  o Events are processed within temporal windows (e.g. for aggregation and joins). Time window vs. length window.
• Joins
  o Join two streams
• Event ordering
  o Identify event sequences and patterns

Filters
• Filters the events by conditions
• Conditions
  o >, <, ==, >=, <=, !=
  o contains, instanceof
  o and, or, not
• Example

    from <stream-name> [<conditions>]*
    insert into <stream-name>

    from cseEventStream[price >= 20 and symbol == 'IBM']
    insert into StockQuote symbol, volume
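To make the filter semantics concrete, here is a minimal plain-Java sketch of what the example query does: it keeps only events with price >= 20 and symbol IBM, and projects symbol and volume. The class `FilterSketch`, the `StockEvent` type and the comma-joined output format are illustrative assumptions; none of them are part of the Siddhi API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration of filter-and-project semantics; hypothetical names, not the Siddhi API.
class FilterSketch {
    // Hypothetical event type mirroring the attributes used in the slide's query.
    static final class StockEvent {
        final String symbol;
        final double price;
        final long volume;

        StockEvent(String symbol, double price, long volume) {
            this.symbol = symbol;
            this.price = price;
            this.volume = volume;
        }
    }

    // Keeps events with price >= 20 and symbol == "IBM", projecting "symbol,volume".
    static List<String> stockQuote(List<StockEvent> cseEventStream) {
        List<String> out = new ArrayList<>();
        for (StockEvent e : cseEventStream) {
            if (e.price >= 20 && "IBM".equals(e.symbol)) {
                out.add(e.symbol + "," + e.volume);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<StockEvent> in = Arrays.asList(
                new StockEvent("IBM", 25.0, 100),
                new StockEvent("IBM", 15.0, 200),
                new StockEvent("MSFT", 30.0, 300));
        System.out.println(stockQuote(in)); // only the first event passes the filter
    }
}
```

A real engine evaluates the condition per arriving event rather than over a list; the list is used here only to keep the sketch self-contained.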

Window
• Types of windows
  o (Time | Length) (Sliding | Batch) windows
• Types of aggregate functions
  o sum, avg, max, min
• Example

    from <stream-name> [<conditions>]#window.<window-name>(<parameters>)
    insert [<output-type>] into <stream-name>

    from cseEventStream[price >= 20]#window.lengthBatch(50)
    insert into StockQuote symbol, avg(price) as avgPrice
    group by symbol
    having avgPrice > 50
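The length-batch example can be approximated in plain Java as follows. This is a simplified sketch, assuming that output is produced once per group when a batch fills up; Siddhi's actual emission behaviour may differ. `LengthBatchSketch` and its `onEvent` method are hypothetical names, not Siddhi classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of a length-batch window; hypothetical names, not the Siddhi API.
class LengthBatchSketch {
    private final int batchSize;
    private final List<String> symbols = new ArrayList<>();
    private final List<Double> prices = new ArrayList<>();

    LengthBatchSketch(int batchSize) {
        this.batchSize = batchSize;
    }

    // Feeds one event; returns symbol -> avgPrice for groups with avgPrice > 50
    // when the batch fills up, and an empty map otherwise.
    Map<String, Double> onEvent(String symbol, double price) {
        Map<String, Double> result = new LinkedHashMap<>();
        if (price < 20) {
            return result; // filter: only events with price >= 20 enter the window
        }
        symbols.add(symbol);
        prices.add(price);
        if (symbols.size() < batchSize) {
            return result; // batch not complete yet
        }
        Map<String, double[]> acc = new LinkedHashMap<>(); // symbol -> {sum, count}
        for (int i = 0; i < symbols.size(); i++) {
            double[] a = acc.computeIfAbsent(symbols.get(i), k -> new double[2]);
            a[0] += prices.get(i);
            a[1] += 1;
        }
        for (Map.Entry<String, double[]> e : acc.entrySet()) {
            double avg = e.getValue()[0] / e.getValue()[1];
            if (avg > 50) {
                result.put(e.getKey(), avg); // having avgPrice > 50
            }
        }
        symbols.clear(); // a batch window starts empty after each batch
        prices.clear();
        return result;
    }

    public static void main(String[] args) {
        LengthBatchSketch w = new LengthBatchSketch(4);
        w.onEvent("IBM", 60);
        w.onEvent("IBM", 80);
        w.onEvent("MSFT", 30);
        System.out.println(w.onEvent("MSFT", 40)); // the batch of 4 completes here
    }
}
```

A sliding window would instead evict the oldest event on each arrival and recompute the aggregate, rather than clearing the whole batch.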

Join
• Joins two streams based on a condition and window
• Unidirectional: only events arriving on the unidirectional stream trigger the join
• Example

    from <stream>#<window> [unidirectional] join <stream>#<window>
    on <condition> within <time>
    insert into <stream>

    from TickEvent[symbol == 'IBM']#window.length(2000) join NewsEvent#window.time(5 min)
    insert into JoinStream *
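The join example keeps a length window on one stream and a time window on the other, and an arrival on either side is matched against the current contents of the opposite window. The sketch below illustrates that idea in plain Java under stated assumptions: there is no join condition (as in the slide's example), timestamps are passed in explicitly so the behaviour is deterministic, and `WindowJoinSketch` with its methods is a hypothetical name, not the Siddhi API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Plain-Java illustration of a windowed join; hypothetical names, not the Siddhi API.
class WindowJoinSketch {
    private static final class News {
        final long timestamp;
        final String text;

        News(long timestamp, String text) {
            this.timestamp = timestamp;
            this.text = text;
        }
    }

    private final int tickLength;  // size of the length window on ticks
    private final long newsMillis; // span of the time window on news
    private final Deque<String> ticks = new ArrayDeque<>();
    private final Deque<News> news = new ArrayDeque<>();

    WindowJoinSketch(int tickLength, long newsMillis) {
        this.tickLength = tickLength;
        this.newsMillis = newsMillis;
    }

    // A tick arrives: add it to the length window and join it with all live news.
    List<String> onTick(long now, String tick) {
        ticks.addLast(tick);
        if (ticks.size() > tickLength) {
            ticks.removeFirst(); // length window evicts the oldest tick
        }
        expireNews(now);
        List<String> out = new ArrayList<>();
        for (News n : news) {
            out.add(tick + "|" + n.text);
        }
        return out;
    }

    // A news item arrives: add it to the time window and join it with all held ticks.
    List<String> onNews(long now, String text) {
        expireNews(now);
        news.addLast(new News(now, text));
        List<String> out = new ArrayList<>();
        for (String t : ticks) {
            out.add(t + "|" + text);
        }
        return out;
    }

    private void expireNews(long now) {
        while (!news.isEmpty() && now - news.peekFirst().timestamp >= newsMillis) {
            news.removeFirst(); // time window evicts news older than the span
        }
    }

    public static void main(String[] args) {
        WindowJoinSketch j = new WindowJoinSketch(2, 1000);
        j.onTick(0, "t1");
        j.onTick(10, "t2");
        System.out.println(j.onNews(30, "n1")); // joins n1 with both held ticks
    }
}
```

In a unidirectional join only one of the two `on...` methods would produce output; the other would just update its window.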

Pattern
• Checks whether condition A happens before/after condition B
• Can do iterative checks via the "every" keyword
• With "within <time>", Siddhi emits only events that are within that time of each other
• Example

    from [every] <condition> -> [every] <condition> … <condition>
    within <time>
    insert into <stream-name> (<attribute-name>* | * )

    from every (a1 = purchase[price < 10]) -> a2 = purchase[price > 10000 and a1.cardNo == a2.cardNo]
    within 1 day
    insert into potentialFraud a1.cardNo as cardNo, a2.price as price, a2.place as place

Event sequence illustration: a1 x1 k5 a2 n7 y1
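The fraud pattern above can be read as a small state machine: every cheap purchase opens a pending match, and a later expensive purchase on the same card within one day completes it. The following plain-Java sketch illustrates that reading. It assumes each pending match fires at most once and that unrelated events may arrive in between; `FraudPatternSketch` and `Purchase` are hypothetical names, not Siddhi classes.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Plain-Java illustration of "every a1 -> a2 within 1 day"; hypothetical names, not the Siddhi API.
class FraudPatternSketch {
    static final long ONE_DAY_MS = 24L * 60 * 60 * 1000;

    static final class Purchase {
        final String cardNo;
        final double price;
        final String place;
        final long timestamp;

        Purchase(String cardNo, double price, String place, long timestamp) {
            this.cardNo = cardNo;
            this.price = price;
            this.place = place;
            this.timestamp = timestamp;
        }
    }

    // Cheap purchases (a1) still waiting for a matching expensive purchase (a2).
    private final List<Purchase> pending = new ArrayList<>();

    // Feeds one purchase; returns "cardNo,price,place" alerts for completed matches.
    List<String> onPurchase(Purchase p) {
        List<String> alerts = new ArrayList<>();
        Iterator<Purchase> it = pending.iterator();
        while (it.hasNext()) {
            Purchase a1 = it.next();
            if (p.timestamp - a1.timestamp > ONE_DAY_MS) {
                it.remove(); // "within 1 day" has elapsed for this partial match
            } else if (p.price > 10000 && a1.cardNo.equals(p.cardNo)) {
                alerts.add(a1.cardNo + "," + p.price + "," + p.place);
                it.remove(); // this partial match is complete
            }
        }
        if (p.price < 10) {
            pending.add(p); // "every": each cheap purchase starts a new partial match
        }
        return alerts;
    }

    public static void main(String[] args) {
        FraudPatternSketch s = new FraudPatternSketch();
        s.onPurchase(new Purchase("A", 5, "Colombo", 0));
        s.onPurchase(new Purchase("B", 50, "London", 1000));
        System.out.println(s.onPurchase(new Purchase("A", 20000, "Paris", 2000)));
    }
}
```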

Sequence
• Regular expressions supported
  o * - Zero or more matches (reluctant)
  o + - One or more matches (reluctant)
  o ? - Zero or one match (reluctant)
  o or - Either event
• Events returned by * and + must be referenced with square brackets to access a specific occurrence of that event

    from <event-regular-expression> within <time>
    insert into <stream>

    from a1 = requestOrder[action == "buy"],
         b1 = cseEventStream[price > a1.price and symbol == a1.symbol]+,
         b2 = cseEventStream[price < b1.price]
    insert into purchaseOrder a1.symbol as symbol, b1[0].price as firstPrice, b2.price as orderPrice

Event sequence illustration: a1 b1 b1 b2 n7 y1

• We compared Siddhi with Esper, the widely used open source CEP engine
• For the evaluation, we set up different queries on both systems, pushed events into each system, and measured the time until all of them were processed
• We used an Intel(R) Xeon(R) X3440 @ 2.53GHz, 4 cores, 8M cache, 8GB RAM, running Debian with the 2.6.32-5-amd64 kernel
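The measurement approach described above (push a fixed number of events, time how long processing takes, derive events per second) can be sketched as below. This is only an illustration of the method using a trivial in-process filter and synthetic prices; it is not the harness used in the webinar, and the numbers it prints say nothing about Siddhi or Esper. `ThroughputSketch` and `measure` are hypothetical names.

```java
// Plain-Java illustration of a push-and-time throughput measurement; hypothetical names.
class ThroughputSketch {
    // Pushes `events` synthetic ticks through the filter price > 6 and
    // returns {number of matching events, events processed per second}.
    static double[] measure(int events) {
        long matched = 0;
        long start = System.nanoTime();
        for (int i = 0; i < events; i++) {
            double price = i % 10; // synthetic prices cycling through 0..9
            if (price > 6) {
                matched++;
            }
        }
        // Guard against a zero reading from a coarse timer on very small runs.
        long elapsedNanos = Math.max(1L, System.nanoTime() - start);
        double perSecond = events * 1_000_000_000.0 / elapsedNanos;
        return new double[] {matched, perSecond};
    }

    public static void main(String[] args) {
        double[] r = measure(1_000_000);
        System.out.println("matched=" + (long) r[0] + " events/sec=" + r[1]);
    }
}
```

For meaningful numbers a real benchmark would also warm up the JVM and repeat the run several times; the sketch omits that for brevity.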

Performance Results

Simple filter without window

Performance Comparison With ESPER

from StockTick[price > 6] return symbol, price

State machine query for pattern matching

Performance Comparison With ESPER

From f=FraudWarningEvent -> p=PINChangeEvent(accountNumber=f.accountNumber) return accountNumber;

Performance of WSO2 CEP
• Here we published data from two client publisher nodes to the CEP server node and sent the notifications triggered by CEP to a client subscriber node.
• To test the worst-case scenario, 100% of the data published to CEP is received at the subscriber node after processing (no data is filtered out).
• For running CEP we used an Intel® Core™ i7-2630QM CPU @ 2.00GHz, 8 cores, 8GB RAM, running Ubuntu 12.04 with the 3.2.0-32-generic kernel. For the three client nodes we used Intel® Core™ i3-2350M CPUs @ 2.30GHz, 4 cores, 4GB RAM, running Ubuntu 12.04 with the 3.2.0-32-generic kernel.

Simple filter without window

Performance of WSO2 CEP

from StockTick[price > 6] return symbol, price

WSO2 CEP Throughput (average, kilo events/sec, by number of clients)

# Clients        1    2    3    4    5    6    7    8    9   10   50  100
Avg kEvents/s   67  135  181  210  212  232  245  250  234  186  187  112

HA / Persistence
• Ability to recover runtime state in the case of a failure.
• Enables queries to span lifetimes much greater than server uptime.
• Takes periodic snapshots and stores all state information in a scalable persistence store (Apache Cassandra).
• Supports pluggable persistence stores.
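The snapshot-and-recover idea can be sketched in plain Java as follows. The sketch assumes a pluggable store interface and substitutes a simple in-memory map where the slides use Apache Cassandra; `PersistenceStore`, `InMemoryStore` and `SumWindow` are hypothetical names and do not reflect the actual WSO2 CEP or Siddhi persistence API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of snapshot-based recovery; hypothetical names, not the WSO2 CEP API.
class SnapshotSketch {
    // Pluggable store abstraction (the slides name Apache Cassandra as the backing store).
    interface PersistenceStore {
        void save(String queryId, List<Double> state);

        List<Double> load(String queryId); // returns null when nothing was saved
    }

    // Stand-in store for the sketch: keeps snapshots in a map instead of a database.
    static final class InMemoryStore implements PersistenceStore {
        private final Map<String, List<Double>> data = new HashMap<>();

        public void save(String queryId, List<Double> state) {
            data.put(queryId, new ArrayList<>(state)); // defensive copy = the snapshot
        }

        public List<Double> load(String queryId) {
            return data.get(queryId);
        }
    }

    // A stateful query element whose window contents must survive a restart.
    static final class SumWindow {
        private final List<Double> values = new ArrayList<>();

        void add(double v) {
            values.add(v);
        }

        double sum() {
            double s = 0;
            for (double v : values) {
                s += v;
            }
            return s;
        }

        void snapshot(PersistenceStore store, String queryId) {
            store.save(queryId, values);
        }

        static SumWindow restore(PersistenceStore store, String queryId) {
            SumWindow w = new SumWindow();
            List<Double> saved = store.load(queryId);
            if (saved != null) {
                w.values.addAll(saved);
            }
            return w;
        }
    }

    public static void main(String[] args) {
        PersistenceStore store = new InMemoryStore();
        SumWindow w = new SumWindow();
        w.add(1);
        w.add(2);
        w.snapshot(store, "q1"); // periodic snapshot
        w.add(3);                // arrives after the snapshot and is lost on failure
        SumWindow recovered = SumWindow.restore(store, "q1");
        System.out.println(recovered.sum());
    }
}
```

Events that arrive between the last snapshot and a failure are not recovered in this sketch, which is the usual trade-off of periodic snapshots.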

Scaling
• Vertically scaling
  o Can be distributed as a pipeline
• Horizontally scaling
  o Queries like windows, patterns, and joins have shared state, hence hard to distribute!
  o Use a distributed cache (Hazelcast) to achieve this - shared memory and batch processing

Event Recording
• Ability to record all/some of the events for future processing
• A few options
  o Publish them to a Cassandra cluster using the WSO2 data bridge API or BAM (data in Cassandra can be processed with Hadoop using WSO2 BAM)
  o Write them to a distributed cache
  o Custom Thrift based event recorder

WSO2 BAM

Diagram labels: Data Publishing, Data Receiving, Data Analyzing, Data Presentation

CEP Role within WSO2 Platform

DEMO

Scenario
• Monitoring a stock exchange for game changing moments
• Two input event streams
  o Event stream of stock quotes from a stock exchange
  o Event stream of word counts on various company names from Twitter pages
• Check whether the last traded price of the stock has changed significantly (by 2%) within the last minute, and people are tweeting about that company (word count > 10) within the last minute

Example Scenario

Input events
• Input events are JMS Maps
  o Stock Exchange Stream

    Map<String, Object> map1 = new HashMap<String, Object>();
    map1.put("symbol", "MSFT");
    map1.put("price", 26.36);
    publisher.publish("AllStockQuotes", map1);

  o Twitter Stream

    Map<String, Object> map1 = new HashMap<String, Object>();
    map1.put("company", "MSFT");
    map1.put("wordCount", 8);
    publisher.publish("TwitterFeed", map1);

Queries

    from allStockQuotes[win.time(60000)]
    insert into fastMovingStockQuotes
    symbol, price, avg(price) as averagePrice
    group by symbol
    having ((price > averagePrice*1.02) or (averagePrice*0.98 > price))

    from twitterFeed[win.time(60000)]
    insert into highFrequentTweets
    company as company, sum(wordCount) as words
    group by company
    having (words > 10)

    from fastMovingStockQuotes[win.time(60000)] as fastMovingStockQuotes
    join highFrequentTweets[win.time(60000)] as highFrequentTweets
    on fastMovingStockQuotes.symbol == highFrequentTweets.company
    insert into predictedStockQuotes
    fastMovingStockQuotes.symbol as company, fastMovingStockQuotes.averagePrice as amount, highFrequentTweets.words as words
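The first query can be read as: per symbol, keep the quotes of the last 60 seconds, and flag a quote whose price is more than 2% above or below the window average. The plain-Java sketch below illustrates that condition. It assumes the arriving quote is itself included in the average and takes timestamps as explicit arguments; `FastMovingSketch` and `onQuote` are hypothetical names, not part of Siddhi.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Plain-Java illustration of the fastMovingStockQuotes condition; hypothetical names, not the Siddhi API.
class FastMovingSketch {
    private final long windowMillis;
    // symbol -> quotes in the current time window, each stored as {timestamp, price}
    private final Map<String, Deque<double[]>> bySymbol = new HashMap<>();

    FastMovingSketch(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // Returns true when the price deviates by more than 2% from the window average.
    boolean onQuote(long now, String symbol, double price) {
        Deque<double[]> window = bySymbol.computeIfAbsent(symbol, k -> new ArrayDeque<>());
        while (!window.isEmpty() && now - window.peekFirst()[0] >= windowMillis) {
            window.removeFirst(); // drop quotes older than the time window
        }
        window.addLast(new double[] {now, price});
        double sum = 0;
        for (double[] quote : window) {
            sum += quote[1];
        }
        double averagePrice = sum / window.size();
        return price > averagePrice * 1.02 || averagePrice * 0.98 > price;
    }

    public static void main(String[] args) {
        FastMovingSketch s = new FastMovingSketch(60000);
        s.onQuote(0, "MSFT", 100);
        s.onQuote(1000, "MSFT", 100);
        System.out.println(s.onQuote(2000, "MSFT", 110)); // 110 is well above the window average
    }
}
```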

Alert
• As an email

    Hi,
    Within the last minute, people have been tweeting about {company} {words} times, and the last traded price of {company} has changed by 2% and is now trading at ${amount}.
    From CEP
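The {company}, {words} and {amount} placeholders in the email are filled from the attributes of the output event. A minimal sketch of such a substitution in plain Java is shown below; it only illustrates the idea and is not how WSO2 CEP implements its output mapping. `AlertTemplateSketch` and `render` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java illustration of placeholder substitution; hypothetical names, not the WSO2 CEP API.
class AlertTemplateSketch {
    // Replaces each {name} placeholder in the template with the matching attribute value.
    static String render(String template, Map<String, String> attributes) {
        String out = template;
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            out = out.replace("{" + e.getKey() + "}", e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> event = new LinkedHashMap<>();
        event.put("company", "MSFT");
        event.put("words", "12");
        event.put("amount", "26.36");
        System.out.println(render(
                "People tweeted about {company} {words} times; now trading at ${amount}.", event));
    }
}
```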

Useful links
• WSO2 CEP 2.0.1
  http://wso2.com/products/complex-event-processor/
• Distributed Processing Sample With Siddhi CEP and ActiveMQ JMS Broker
  http://suhothayan.blogspot.com/2012/08/distributed-processing-sample-for-wso2.html
• Creating Custom Data Publishers to BAM/CEP
  http://wso2.org/library/articles/2012/07/creating-custom-agents-publish-events-bamcep
• WSO2 BAM 2.0.1
  http://wso2.com/products/business-activity-monitor/

Questions?

Thank you.