Processing Twitter Stream with Oracle Event Processing (OEP)
2013 © Trivadis
BASEL BERN BRUGG LAUSANNE ZUERICH DUESSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MUNICH STUTTGART VIENNA
Processing Twitter Stream with Oracle Event Processing (OEP) Guido Schmutz
OFM Partner Forum Malta
19.2.2014
19.02.2014 Processing Twitter Stream with Oracle Event Processing (OEP)
Guido Schmutz
• Working for Trivadis for more than 17 years
• Oracle ACE Director for Fusion Middleware and SOA
• Co-author of several books
• Consultant, trainer and software architect for Java, Oracle, SOA and Big Data / Fast Data
• Member of the Trivadis Architecture Board
• Technology Manager @ Trivadis
• More than 25 years of software development experience
• Contact: [email protected]
• Blog: http://guidoschmutz.wordpress.com
• Twitter: gschmutz
AGENDA
1. Introduction
2. Twitter Use Case
3. Processing with Oracle Event Processing (OEP)
4. Visualization with Oracle Business Activity Monitoring (BAM)
5. Store Information in Apache Cassandra
6. Summary
Big Data Definition (4 Vs)
+ Time to action ? – Big Data + Event Processing = Fast Data
Characteristics of Big Data: Its Volume, Velocity and Variety in combination
The world is changing …
The model of generating and consuming data has changed:
Old model: a few companies generate data, all others consume data
New model: all of us generate data, and all of us consume data
Who is generating Big Data?
Progress and innovation are no longer hindered by the ability to collect data, but by the ability to manage, analyze, summarize, visualize and discover knowledge from the collected data in a timely manner and in a scalable fashion.
• Social media and networks (all of us are generating data)
• Scientific instruments (collecting all sorts of data)
• Mobile devices (tracking all objects all the time)
• Sensor technology and networks (measuring all kinds of data)
Internet Of Things – Sensors are/will be everywhere
There are more devices tapping into the internet than people on earth
How do we prepare our systems/architecture for the future?
Source: Cisco; The Economist
Data as an Asset - Store Anything?
But then data is just too valuable to delete! We must store anything!
Nonsense! Just store the data you know you need today!
It depends … but Big Data technologies allow you to store the raw information from both new and existing data sources, so that you can later use it to create new data-driven products that you would not have thought of today.
2. Twitter Use Case
Retrieve Tweets and Visualize
Access to Tweets
Source                  | Limitations                                            | Cost
Twitter’s Search API    | 3200 / user; 5000 / keyword; 180 requests / 15 minutes | free
Twitter’s Streaming API | 1%-40% of total volume                                 | free
DataSift                | none                                                   | 0.15-0.20 $ / unit
Gnip                    | none                                                   | On request
How to design a stream (event) processing system?
[Figure: three design variants]
(a) Twitter Stream -> tweet -> Sensor -> tweet -> Persist (Queue) -> tweet -> Processing -> result
(b) Twitter Stream -> tweet -> Sensor -> tweet -> Processing -> result
(c) Twitter Stream -> tweet -> Receiving/Processing -> result
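The variant with a Persist (Queue) step between sensor and processing can be sketched in plain Java. This is only an illustration under stated assumptions: an in-memory `BlockingQueue` stands in for a durable queue, the "processing" is a placeholder that returns each tweet's length, and the class `QueuedPipeline` and its methods are hypothetical names, not part of the talk or of OEP.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch: a sensor thread hands tweets to a processor through a queue. */
final class QueuedPipeline {
    // Unique marker object telling the processor that the stream has ended.
    // It is compared by identity, so a real tweet with the text "END" is not mistaken for it.
    private static final String END = new String("END");

    /** Runs a sensor thread and a processing loop; returns one result per tweet, in order. */
    static List<Integer> run(List<String> tweets) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        // Sensor: only receives tweets and puts them on the queue.
        Thread sensor = new Thread(() -> {
            try {
                for (String tweet : tweets) {
                    queue.put(tweet);
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        sensor.start();

        // Processing: consumes from the queue at its own pace.
        List<Integer> results = new ArrayList<>();
        try {
            for (String tweet = queue.take(); tweet != END; tweet = queue.take()) {
                results.add(tweet.length()); // placeholder for real processing
            }
            sensor.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while processing", e);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("first tweet", "second tweet #malta")));
    }
}
```

The point of the queue is that a slow processor does not block the sensor; with a durable queue the tweets would also survive a restart of the processing side.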
3. Processing with Oracle Event Processing (OEP)
Oracle Event Processing (OEP) - Engine
Lightweight Java Application Server
• Full environment for running Java applications
• Module Framework - OSGi
High Throughput
• Hundreds of thousands of events/second
Event Processing Infrastructure
Easy-to-use development environment
• Service Framework – Spring DM, POJO
Enterprise Web 2.0 & Eclipse-based tooling
Multiple-choice VM
• JRockit or WebLogic RealTime
Oracle Event Processing – Event Processing Network Concept
Oracle Event Processing – In Memory, Continuous Queries
Event Processing Output
• Filtering
  - New stream filtered for specific criteria, e.g. stock price > $22
• Correlation & Aggregation
  - Scrolling, time-based window metrics, e.g. average # of stock trades in the last hour
• Pattern Matching
  - Notification of detected event patterns, e.g. price changes A, B and C occurred within a 15-minute window
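To make filtering and time-based window aggregation concrete, here is a minimal plain-Java sketch. It does not use OEP or CQL; the class `PriceWindow`, its methods, and the choice of an average price (in cents) as the window metric are assumptions made for illustration only.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of a filter and a time-based sliding-window average. */
final class PriceWindow {
    private final long windowMillis;
    // Each entry is {timestampMillis, priceCents}, oldest first.
    private final Deque<long[]> events = new ArrayDeque<>();
    private long sumCents;

    PriceWindow(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Filtering: only events with a price above the threshold pass. */
    static boolean passesFilter(long priceCents, long thresholdCents) {
        return priceCents > thresholdCents;
    }

    /**
     * Aggregation: adds an event, evicts everything that has fallen out of the
     * window ending at this event's timestamp, and returns the average price
     * (in cents) of the events still inside the window.
     */
    double add(long timestampMillis, long priceCents) {
        events.addLast(new long[] {timestampMillis, priceCents});
        sumCents += priceCents;
        while (events.peekFirst()[0] <= timestampMillis - windowMillis) {
            sumCents -= events.removeFirst()[1];
        }
        return (double) sumCents / events.size();
    }

    public static void main(String[] args) {
        PriceWindow lastHour = new PriceWindow(60L * 60L * 1000L);
        long[][] trades = {{0L, 2100L}, {1000L, 2300L}, {2000L, 2500L}};
        for (long[] trade : trades) {
            if (passesFilter(trade[1], 2200L)) { // stock price > $22
                System.out.println(lastHour.add(trade[0], trade[1]));
            }
        }
    }
}
```

The sketch assumes events arrive in timestamp order and that the window length is positive; a continuous query engine handles ordering and eviction for you.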
Oracle Event Processing - CQL
Initiative for a complete “continuous” query language
Start with SQL ’99 plus “continuous” query extensions
• Based on Stanford University research
Industry standards discussions
• Event Processing Technical Society (EPTS)
• ANSI SQL
• OMG
Adoption today
• ANSI SQL standards proposal for CQL pattern matching
  - Oracle, IBM, Stanford University
• Open-source adoption of CQL
• Oracle Complex Event Processor (CEP) release -> available in 11g
Oracle Event Processing – Visual Development Tools
Oracle Event Processing – Operation & Management
Implementation – complete picture
[Figure: the complete event processing network]
• Twitter Adapter emits each tweet, e.g. "@SOASimone @SOACommunity heard you couldn’t make it. We miss you! #ofmforum #malta"
• The tweet is passed on to BAM Tweet (via JMS) and Cassandra Tweet
• Mention Extractor: @SOASimone, @SOACommunity
• Author Extractor: robertvanmolken
• Hashtag Extractor: #ofmforum, #malta
• CounterProcessor (range 30 seconds, slide 30 seconds): #ofmforum,5; #malta,2; robertvanmolken,1; @SOASimone,1; @SOACommunity,5
• The counts are passed on to BAM Counter (via JMS) and Cassandra Counter
1) Creating a Twitter Adapter
[Figure: the Twitter Adapter emits each tweet as an event, e.g. "@SOASimone @SOACommunity heard you couldn’t make it. We miss you! #ofmforum #malta"]
2) Send Tweets to BAM
[Figure: the tweet from the Twitter Adapter is sent over JMS to BAM Tweet]
3) Extract interesting information from Tweet
[Figure: besides going to BAM Tweet over JMS, the tweet from the Twitter Adapter feeds three extractors: Mention Extractor (@SOASimone, @SOACommunity), Author Extractor (robertvanmolken) and Hashtag Extractor (#ofmforum, #malta)]
4) Count occurrences within period
[Figure: the CounterProcessor counts the values from the Mention, Author and Hashtag Extractors per window (range 30 seconds, slide 30 seconds), e.g. #ofmforum,5; #malta,2; robertvanmolken,1; @SOASimone,1; @SOACommunity,5, and sends the counts over JMS to BAM Counter]
5) Adding Cassandra NoSQL for storing results
[Figure: the network with two Cassandra outputs added: Cassandra Tweet next to BAM Tweet, and Cassandra Counter next to BAM Counter]
Implementing in Oracle Event Processing
[Figure: the network to implement in OEP: Twitter Adapter; Mention, Hashtag and Author Extractors; CounterProcessor (range 30 seconds, slide 30 seconds); BAM Tweet and BAM Counter, both fed over JMS]
1) Creating Twitter Adapter – Connecting to Twitter Stream
1) Creating Twitter Adapter – Tweet Event
1) Creating Twitter Adapter – Adapter Factory
1) Creating Twitter Adapter – Assembly
1) Creating Twitter Adapter – Export Adapter to server
1) Creating Twitter Adapter – Using Twitter Adapter
2) Sending Tweets to BAM
Using Oracle BAM Enterprise Message Sources (JMS) interface
2) Sending Tweets to BAM – Convert event to JMS MapMessage
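The code shown on this slide is not part of the transcript. As a rough idea of what such a conversion involves, the following plain-Java sketch flattens a tweet event into name/value pairs. Everything here is an assumption: the `Tweet` class, its field names and the key names are hypothetical, and a `java.util.Map` stands in for the JMS `MapMessage` so that the example runs without a JMS provider.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: flattening a tweet event into the name/value pairs of a map message. */
final class TweetToMap {

    /** Stand-in for the tweet event; the fields are assumptions, not the talk's TweetEvent. */
    static final class Tweet {
        final long id;
        final String author;
        final String text;

        Tweet(long id, String author, String text) {
            this.id = id;
            this.author = author;
            this.text = text;
        }
    }

    /**
     * Copies each field into a map entry. With a real javax.jms.MapMessage the
     * same fields would be written with setLong(name, value) and setString(name, value).
     */
    static Map<String, Object> toMap(Tweet tweet) {
        Map<String, Object> message = new LinkedHashMap<>();
        message.put("id", tweet.id);
        message.put("author", tweet.author);
        message.put("text", tweet.text);
        return message;
    }

    public static void main(String[] args) {
        Tweet tweet = new Tweet(1L, "robertvanmolken", "We miss you! #ofmforum #malta");
        System.out.println(toMap(tweet));
    }
}
```

On the BAM side, the key names would have to match whatever the Enterprise Message Source maps to the data object's fields.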
3) Extract information from Tweet – Extract Hashtags from TweetEvent
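The slide's own code is not included in the transcript. A minimal sketch of the extraction step, assuming simple regular expressions are good enough, could look like this; the class `TweetTokens` and its methods are hypothetical names, and the patterns are a simplification (for example, they would also match the part after "@" in an e-mail address).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: pulling hashtags and mentions out of a tweet's text. */
final class TweetTokens {
    // "#" or "@" followed by one or more word characters (letters, digits, underscore).
    private static final Pattern HASHTAG = Pattern.compile("#\\w+");
    private static final Pattern MENTION = Pattern.compile("@\\w+");

    /** Returns all hashtags in order of appearance, including the leading "#". */
    static List<String> hashtags(String text) {
        return findAll(HASHTAG, text);
    }

    /** Returns all mentions in order of appearance, including the leading "@". */
    static List<String> mentions(String text) {
        return findAll(MENTION, text);
    }

    private static List<String> findAll(Pattern pattern, String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String tweet = "@SOASimone @SOACommunity heard you couldn't make it. We miss you! #ofmforum #malta";
        System.out.println(hashtags(tweet));
        System.out.println(mentions(tweet));
    }
}
```

In the network from the talk, each extracted value would be emitted as its own event so that the CounterProcessor can count it.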
4) Count occurrences within period - Using CQL
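The CQL query from this slide is not in the transcript. A window with range 30 seconds and slide 30 seconds behaves like a tumbling window: every event belongs to exactly one 30-second bucket. The plain-Java sketch below illustrates that counting logic only; `TumblingCounter` and its methods are hypothetical names, and it assumes non-negative timestamps in milliseconds.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: counting values per tumbling window (range == slide). */
final class TumblingCounter {
    private final long windowMillis;
    // Window start time -> (value -> number of occurrences in that window).
    private final Map<Long, Map<String, Integer>> windows = new TreeMap<>();

    TumblingCounter(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Assigns the value to the window that contains the timestamp and increments its count. */
    void add(long timestampMillis, String value) {
        long windowStart = (timestampMillis / windowMillis) * windowMillis;
        windows.computeIfAbsent(windowStart, k -> new LinkedHashMap<>())
               .merge(value, 1, Integer::sum);
    }

    /** Returns the counts of the window starting at the given time (empty if none). */
    Map<String, Integer> counts(long windowStartMillis) {
        return windows.getOrDefault(windowStartMillis, Collections.<String, Integer>emptyMap());
    }

    public static void main(String[] args) {
        // In CQL the window itself is declared on the stream, as on the slide:
        // range 30 seconds slide 30 seconds. Here it is the constructor argument.
        TumblingCounter counter = new TumblingCounter(30_000L);
        counter.add(1_000L, "#ofmforum");
        counter.add(2_000L, "#ofmforum");
        counter.add(3_000L, "#malta");
        counter.add(31_000L, "#ofmforum");
        System.out.println(counter.counts(0L));
        System.out.println(counter.counts(30_000L));
    }
}
```

Unlike this sketch, a continuous query emits the counts by itself when each window closes and discards old windows.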
Implementation – Complete Picture
4. Visualization with Oracle Business Activity Monitoring (BAM)
Oracle BAM: Architected for Integration and Visualization
[Figure: Oracle BAM architecture. Labels shown in the diagram:]
• Clients: Internet, Mobile Devices
• BAM Dashboards / WebApplications: StartPage, ActiveViewer, ActiveStudio, Architect, Administrator
• ReportServer, iCommand
• Oracle Database (Grid): BAM Data & Metadata, External Data Objects
• Enterprise Integration Framework: Web Services, JMS Connector, BAM Adapter
• ADF: BAM DataControl, ADF Pages with DVT
• BAM Server: EventEngine (Actions & Escalations, Notification Services), ReportCache (Snapshots & Change Lists), ActiveDataCache (ViewSets, API, Kernel, DataSets, DataStorageEngine), Memory / Disk
• Sources and integrations: Application Server, BI, ODI, Databases (OLTP & Data Warehouses), BPEL, BPM, Message Queues, CEP, OESB
• Data & Metadata Import & Export
Oracle BAM – Create a Data Object
Oracle BAM Enterprise Message Source Configuration
5. Store Information in Apache Cassandra
Implementation – Storing information in NoSQL database
[Figure: the network with Cassandra outputs: Twitter Adapter; Mention, Hashtag and Author Extractors; CounterProcessor (range 30 seconds, slide 30 seconds); BAM Tweet and BAM Counter (both via JMS); Cassandra Tweet and Cassandra Counter]
Event Processing Network in OEP
The world is changing … new data stores
Problems of the traditional (R)DBMS approach:
• Complex object graph
• Schema evolution
• Semi-structured data
• Scaling
Polyglot persistence
• Using multiple data storage technologies (RDBMS + NoSQL + NewSQL + In-Memory)
• Selected based on the way data is being used by individual applications
  - Why use an RDBMS if there are better storage alternatives?
  - Key/Value, Column Family, Document, Graph-oriented, Relational, …
• Can occur both across the enterprise and within a single application
[Figure: an example order, spread over the relational tables ORDER, ADDRESS, CUSTOMER and ORDER_LINES]
Order ID: 1001
Order Date: 15.9.2012
Customer
  First Name: Peter
  Last Name: Sample
Billing Address
  Street: Somestreet 10
  City: Somewhere
  Postal Code: 55901
Line Items
  Name         | Quantity | Price
  Ipod Touch   | 1        | 220.95
  Monster Beat | 2        | 190.00
  Apple Mouse  | 1        | 69.90
Apache Cassandra – NoSQL database
• Developed at Facebook
• Open source distributed database management system
• Professional-grade support from a company called DataStax
• Main features
  - Real-time
  - Highly distributed
  - Support for multiple data centers
  - Highly scalable
  - No single point of failure
  - Fault tolerant
  - Tunable consistency
  - Cassandra Query Language (CQL)
Apache Cassandra - NoSQL Database
• Don’t think of a relational table => think of a sorted map
• Know your application => model around the queries
• De-normalize and duplicate for read performance
• Indexing is no longer an afterthought => index upfront
• Think of the physical storage structure
Row-key       | Columns ->
2013-08       | Day-1, keyword-1 => 100 | Day-2, keyword-1 => 150 | Day-3, keyword-1 => 170 | … | Day-31, keyword-1 => 170
2013-08-31    | Hour-1, keyword-1 => 10 | Hour-2, keyword-1 => 15 | Hour-3, keyword-1 => 17 | … | Hour-24, keyword-1 => 17
2013-08-31-10 | Minute-1, keyword-1 => 2 | Minute-2, keyword-1 => 3 | Minute-3, keyword-1 => 5 | … | Minute-60, keyword-1 => 2
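The row-key layout above means that one counted keyword lands in three rows of different granularity. The following sketch derives the row keys and column names for a given timestamp. It is an illustration under assumptions: `CounterBuckets` and its methods are hypothetical names, the hour in the finest row key is taken as the zero-padded 0-23 clock hour, and the column numbering is taken as 1-based (Hour-1 to Hour-24, Minute-1 to Minute-60) as the table suggests. No Cassandra driver is used.

```java
import java.time.LocalDateTime;
import java.util.Locale;

/** Hypothetical sketch: deriving time-bucketed row keys and column names for counters. */
final class CounterBuckets {

    /** Row key of the per-day counters, e.g. "2013-08". */
    static String monthRowKey(LocalDateTime t) {
        return String.format(Locale.ROOT, "%04d-%02d", t.getYear(), t.getMonthValue());
    }

    /** Row key of the per-hour counters, e.g. "2013-08-31". */
    static String dayRowKey(LocalDateTime t) {
        return String.format(Locale.ROOT, "%s-%02d", monthRowKey(t), t.getDayOfMonth());
    }

    /** Row key of the per-minute counters, e.g. "2013-08-31-10" (assumed 0-23 clock hour). */
    static String hourRowKey(LocalDateTime t) {
        return String.format(Locale.ROOT, "%s-%02d", dayRowKey(t), t.getHour());
    }

    /** Column in the month row, e.g. "Day-31, keyword-1". */
    static String dayColumn(LocalDateTime t, String keyword) {
        return "Day-" + t.getDayOfMonth() + ", " + keyword;
    }

    /** Column in the day row; assumed 1-based, so clock hour 0 becomes "Hour-1". */
    static String hourColumn(LocalDateTime t, String keyword) {
        return "Hour-" + (t.getHour() + 1) + ", " + keyword;
    }

    /** Column in the hour row; assumed 1-based, so minute 0 becomes "Minute-1". */
    static String minuteColumn(LocalDateTime t, String keyword) {
        return "Minute-" + (t.getMinute() + 1) + ", " + keyword;
    }

    public static void main(String[] args) {
        LocalDateTime t = LocalDateTime.of(2013, 8, 31, 10, 4);
        System.out.println(monthRowKey(t) + " / " + dayColumn(t, "keyword-1"));
        System.out.println(dayRowKey(t) + " / " + hourColumn(t, "keyword-1"));
        System.out.println(hourRowKey(t) + " / " + minuteColumn(t, "keyword-1"));
    }
}
```

Modelling the rows this way follows the "model around the queries" advice: a dashboard asking for one keyword's counts per day of a month reads a single row.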
Apache Cassandra – NoSQL database
6. Summary
Big Data Reference Architecture – Combine Streaming and Batch
Summary
Cassandra
• No single point of failure
• Forget your data modeling skills
• Model around the queries
• Query Language
Oracle Event Processing
• Very lightweight server
• Very easy to write adapters
• Very strong CQL language
Oracle Business Activity Monitoring
• 11g version a bit “old fashioned”
• Easy to integrate through JMS
Questions and answers ...
Guido Schmutz
Technology Manager