Toolkits Overview for IBM Streams V4.2
© 2016 IBM Corporation
Toolkits Overview
IBM Streams 4.2
Samantha Chan
IBM Streams Community Architect
For questions about this presentation contact: [email protected]
Important Disclaimer
THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY.
WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED.
IN ADDITION, THIS INFORMATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE.
IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION.
NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF:
• CREATING ANY WARRANTY OR REPRESENTATION FROM IBM (OR ITS AFFILIATES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS); OR
• ALTERING THE TERMS AND CONDITIONS OF THE APPLICABLE LICENSE AGREEMENT GOVERNING THE USE OF IBM SOFTWARE.
IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion. Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.
Agenda
What’s new in Streams GitHub Projects?
Toolkit Enhancements in Streams v4.2
What’s New? – Language Support
Topology Toolkit (streamsx.topology)
– You can now write your Streams application purely in Python
What’s New? - Adapters
Solr Toolkit (streamsx.solr)
• Solr is the popular, blazing-fast, open source enterprise search platform built
on Apache Lucene™
• SolrDocumentSink - This operator writes tuples as Solr documents to a
Solr collection.
• SolrQuery - This operator queries a Solr server. One of the incoming
attributes must be a Solr query.
• SolrStemmer - This operator stems words.
– For example, apples -> apple, walked -> walk, talked -> talk
http://ibmstreams.github.io/streamsx.solr/
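To make the stemming example concrete, here is a minimal Python sketch of suffix stripping. It is a toy written for illustration only: the function name toy_stem and its rules are assumptions, and it is not the algorithm that SolrStemmer or Solr itself uses.

```python
# Toy suffix-stripping stemmer. Illustrative only: this is NOT the
# stemming algorithm used by SolrStemmer; the rules are assumptions.
def toy_stem(word: str) -> str:
    """Strip a couple of common English suffixes from a word."""
    w = word.lower()
    # Past tense: "walked" -> "walk"
    if w.endswith("ed") and len(w) > 4:
        return w[:-2]
    # Simple plural: "apples" -> "apple" (leave words like "glass" alone)
    if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        return w[:-1]
    return w


if __name__ == "__main__":
    for token in ["apples", "walked", "talked"]:
        print(token, "->", toy_stem(token))
```

A real stemmer handles many more cases (irregular forms, doubled consonants, and so on), which is why the toolkit delegates this work to Solr.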
What’s New? - Adapters
Cassandra Toolkit (streamsx.cassandra)
• Newest toolkit in active development and production at The Weather
Company!
• Ability to write data to Cassandra from a Streams Application:
stream<rstring greeting....> Greeting = Beacon() {
    param
        iterations: 1000000u; // generate 1000000 tuples
        period    : 0.5;      // generate a tuple every 0.5 seconds
    output
        Greeting:
            greeting = "Hello Streams!",
            count    = IterationCount() + 1ul,
            testList = [1,2,3],
            testSet  = {4, 5, 6},
            testMap  = {7: true, 8 : false, 9: true},
            nInt     = -2147483647;
}
() as CoolStuff = com.weather.streamsx.cassandra::CassandraSink(Greeting) {
    param
        connectionConfigZNode: "/cassandra_config";
        nullMapZnode: "/null_values";
}
What’s New? - Adapters
HBase Toolkit (streamsx.hbase)
• Support for BigInsights 4.2
• The HBASEPut operator now uses the HBase caching mechanism to cache writes,
which improves performance when writing a lot of data to the HBase server.
() as putSink = HBASEPut(In1, In2)
{
    param
        tableName : "users" ;
        …
        enableBuffer: true;
}
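The general idea behind buffered writes can be sketched in a few lines of Python. This is a generic illustration of batching, not the HBase client or the operator's implementation; the class name BufferedWriter, the capacity value, and the (row key, value) shape are all hypothetical.

```python
from typing import Callable, List, Tuple

# Hypothetical row shape for the sketch: (row key, value).
Row = Tuple[str, str]


class BufferedWriter:
    """Collect writes and hand them to flush_fn in batches.

    Generic batching sketch; it does not use any HBase API.
    """

    def __init__(self, flush_fn: Callable[[List[Row]], None], capacity: int = 3):
        self._flush_fn = flush_fn    # called once per batch
        self._capacity = capacity    # batch size that triggers a flush
        self._pending: List[Row] = []

    def put(self, row: Row) -> None:
        self._pending.append(row)
        if len(self._pending) >= self._capacity:
            self.flush()

    def flush(self) -> None:
        # Send whatever is pending; do nothing if the buffer is empty.
        if self._pending:
            self._flush_fn(list(self._pending))
            self._pending.clear()


if __name__ == "__main__":
    sent: List[List[Row]] = []
    writer = BufferedWriter(sent.append, capacity=3)
    for i in range(7):
        writer.put((f"row{i}", "value"))
    writer.flush()  # push the final partial batch
    print([len(batch) for batch in sent])
```

The trade-off is the usual one for buffering: fewer round trips to the server, at the cost of rows sitting in memory until the next flush.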
What’s New? – Other interesting toolkits
Mail Toolkit (streamsx.mail)
– Sending and reading emails in a Streams application
Shell Toolkit (streamsx.shell)
– Utility toolkit to execute shell commands in a Streams application
OpenCV Toolkit (streamsx.opencv)
– Enables Streams applications to ingest and process images with the
OpenCV library.
What’s New? - Adapters
• HDFS Toolkit (streamsx.hdfs)
• Support for BigInsights 4.2
• TempFile support
– Specify a temporary file name for files that are being written, so you can tell
which files in HDFS are currently being written. When the file is closed, it is
renamed to the final file name. (Cannot be used in a consistent region.)
() as ToHDFS = HDFS2FileSink(input)
{
    param
        file : "Locations.csv" ;
        tempFile: "Locations.%TIME.tmp";
}
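The write-to-temporary-name-then-rename pattern described for tempFile can be illustrated with a short Python sketch against the local file system. It only shows the pattern; it does not talk to HDFS, and the function name write_then_rename and the file names are assumptions.

```python
import os
import tempfile


def write_then_rename(directory, final_name, temp_name, lines):
    """Write lines under a temporary name, then rename to the final name.

    While the file is open it is visible only as temp_name, so a reader
    can tell it is still being written. Local file system only.
    """
    temp_path = os.path.join(directory, temp_name)
    final_path = os.path.join(directory, final_name)
    with open(temp_path, "w", encoding="utf-8") as handle:
        for line in lines:
            handle.write(line + "\n")
    # The file is closed at this point; publish it under its final name.
    os.replace(temp_path, final_path)
    return final_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        path = write_then_rename(
            workdir, "Locations.csv", "Locations.tmp", ["id,lat,lon", "1,0.0,0.0"]
        )
        print(os.path.basename(path), sorted(os.listdir(workdir)))
```

Downstream consumers can then ignore anything ending in .tmp and process only files that have reached their final name.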
What’s New? - Adapters
• Messaging Toolkit (streamsx.messaging)
Support for AppConfig – You can supply credentials for MQTT, JMS, Kafka, and RabbitMQ through
application configurations, which lets you update credentials (and in some cases properties) from
the console or streamtool while a job continues to run.
Improved metrics – RabbitMQ and MQTT operators now have metrics for connection status and
number of connection attempts.
What’s New? – Analytics
• Weather Toolkit (streamsx.weather)
• Enables Streams applications to retrieve weather forecasts from the Weather
Company Data Bluemix service
• Provides the following operators:
– CurrentWeather
– ForecastDaily
– ForecastHourly
– HistoricalWeather
What’s New? – Analytics and Processing
Text Toolkit (streamsx.text)
– Apache UIMA is an open source project that facilitates the analysis of
unstructured content such as text, audio, and video.
– The toolkit integrates the Text Analytics component of Apache UIMA, which
provides a system for extracting information from text data.
– The Text Toolkit includes operators to extract information from text data and
provides operations for text analysis, such as lemmatization and text annotation
with UIMA Ruta scripts or existing project-specific UIMA PEAR files.
Healthcare Toolkit (streamsx.health)
– Added a microservice to support MLLP and HL7 OBX data
What’s New? – Data Visualization
Visualization Toolkit (streamsx.visualization)
– READ is a cloud-ready, developer-centric, API-friendly playground for visual
analytics.
– Create advanced, reactive, interactive, and real-time visualizations and
dashboards by combining data from Streams-based APIs, Watson APIs, and
any other APIs of your choice.
What’s New? – Streams 4.2 Specialized Toolkits
ODM Toolkit (com.ibm.streams.rules)
Introduces a rules compiler that compiles ODM rules into an SPL application
Better performance than the existing ODM operator
See the previous recording if you want more information
What’s New? – Streams 4.2 Specialized Toolkits
Text Toolkit (com.ibm.streams.text)
Includes the BigInsights web tool, so you can create extractors without needing a
BigInsights installation
Support for dynamic updates. If your application searches for a list of words or
keywords, you can update that list without having to restart your application.
What’s New? – Streams 4.2 Specialized Toolkits
Text Toolkit (com.ibm.streams.text)
Simplified sentiment extraction: NEW ExtractedSentiment operator!
Using the raw TextExtract operator
What’s New? – Streams 4.2 Specialized Toolkits
Geospatial Toolkit (com.ibm.streams.geospatial)
– PointMapMatcher shared map mode
• Requires the Boost library to enable shared map mode
• Maps can be shared across multiple PointMapMatcher operators
using shared memory
• Make sure all PointMapMatcher operators and MapStore
operators are colocated on the same host
• Use the new MapStore operator to read in the map
• Specify the map store name in the PointMapMatcher operator
What’s New? – Streams 4.2 Specialized Toolkits
IBM Watson Speech to Text Toolkit (com.ibm.streams.speech2text)
– This toolkit comes as a separate download from the same
location where you retrieve your Streams installation
• This is not available for the Streams QSE
– Once the package is downloaded and untarred, simply follow
the README instructions to build and deploy a sample
– This toolkit is only available for Intel x86 RHEL6/RHEL7
– The WatsonS2T operator cannot be fused with other copies of
itself (parallel WatsonS2T operators must run in separate
PEs)
What’s New? – Streams 4.2 Specialized Toolkits
IBM Watson Speech to Text Toolkit (com.ibm.streams.speech2text)
Sample output tuple:
{utteranceStartTime=0,
utteranceEndTime=14.96,
utteranceNumber=1,
utterance=" <s> ~SIL four score and seven years ago
~SIL our fathers ~SIL brought ~SIL forth on this
continent ~SIL a new nation ~SIL conceived in liberty
~SIL and dedicated to the proposition that ~SIL all
men are created equal </s>"}
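The utterance attribute in the sample tuple carries inline markers. A downstream step might strip them before further text processing. The Python sketch below assumes that ~SIL marks silence and that <s> and </s> delimit the utterance; the slide does not define these markers, and the function name clean_utterance is hypothetical.

```python
import re

# Assumption: "~SIL" marks silence and "<s>" / "</s>" delimit the utterance.
# The slide does not define these markers.
_MARKERS = re.compile(r"</s>|<s>|~SIL")


def clean_utterance(utterance: str) -> str:
    """Remove markers and collapse runs of whitespace into single spaces."""
    without_markers = _MARKERS.sub(" ", utterance)
    return " ".join(without_markers.split())


if __name__ == "__main__":
    raw = " <s> ~SIL four score and seven years ago ~SIL our fathers </s>"
    print(clean_utterance(raw))
```

Whether to drop the silence markers or keep them (for example, as candidate sentence breaks) depends on what the downstream analytics need.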
22 © 2016 IBM Corporation
What’s New? – Streams 4.2 Specialized Toolkits
Cybersecurity Toolkit (com.ibm.streams.cybersecurity)
– DNSTunneling - The DNSTunneling operator analyzes DNS
response traffic and reports suspicious behaviour that may
indicate the presence of DNS tunneling in the network.
– QRadarSink - This operator allows Streams applications to
send syslog messages to a QRadar host.
– BWListTagger (BlackWhiteListTagger) – improved algorithm for
checking whether a domain is part of a blacklist or
whitelist.
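As a rough illustration of what tagging a domain against a blacklist and a whitelist involves, the Python sketch below checks a domain and each of its parent domains against two sets. It is not the BWListTagger algorithm; the function name tag_domain, the return labels, and the rule that the most specific match wins (with the whitelist checked first at each level) are assumptions made for the sketch.

```python
def tag_domain(domain, blacklist, whitelist):
    """Return "white", "black", or "unknown" for a domain name.

    Checks the full name first, then each parent domain, so an entry
    for "evil.example" also matches "mail.evil.example". Assumed rules;
    not the toolkit's algorithm.
    """
    labels = domain.lower().rstrip(".").split(".")
    for start in range(len(labels)):
        candidate = ".".join(labels[start:])
        if candidate in whitelist:
            return "white"
        if candidate in blacklist:
            return "black"
    return "unknown"


if __name__ == "__main__":
    black = {"evil.example"}
    white = {"ibm.com"}
    for name in ["mail.evil.example", "www.ibm.com", "notevil.example"]:
        print(name, "->", tag_domain(name, black, white))
```

Matching on whole labels rather than raw string suffixes keeps "notevil.example" from matching an entry for "evil.example".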
23 © 2016 IBM Corporation
What’s New? – Streams 4.2 Specialized Toolkits
New GitHub Toolkits Included in Product
– JDBC Toolkit
• Lets you run SQL against any database that supports JDBC
– JSON Toolkit
• Converts data from JSON to SPL tuples and vice versa
– Datetime Toolkit
• Provides facilities to work with date and time data more easily
– Internet of Things Toolkit
• Microservice-style applications for connecting with the Internet of Things Platform on
Bluemix
– Network Toolkit
• Provides functions for processing network data, which makes it easier to work with the
Cybersecurity Toolkit.
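To illustrate the kind of conversion the JSON Toolkit bullet describes, here is a small Python analogue that maps a JSON object onto a typed tuple and back. It does not use the toolkit or SPL; the Reading schema and both function names are hypothetical.

```python
import json
from typing import NamedTuple


class Reading(NamedTuple):
    """Hypothetical tuple schema used only for this sketch."""
    id: str
    temperature: float


def json_to_tuple(text: str) -> Reading:
    """Parse a JSON object and coerce its fields to the tuple's types."""
    obj = json.loads(text)
    return Reading(id=str(obj["id"]), temperature=float(obj["temperature"]))


def tuple_to_json(reading: Reading) -> str:
    """Serialize the tuple back to a JSON object with stable key order."""
    return json.dumps(reading._asdict(), sort_keys=True)


if __name__ == "__main__":
    reading = json_to_tuple('{"id": "s1", "temperature": 21.5}')
    print(reading)
    print(tuple_to_json(reading))
```

The point of a fixed schema is that type mismatches surface at the conversion boundary instead of deeper in the application.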