Toolkits Overview for IBM Streams V4.2

© 2016 IBM Corporation Toolkits Overview IBM Streams 4.2 Samantha Chan IBM Streams Community Architect For questions about this presentation contact: [email protected]

Transcript of Toolkits Overview for IBM Streams V4.2

Important Disclaimer

THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY.

WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED.

IN ADDITION, THIS INFORMATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE.

IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION.

NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF:

• CREATING ANY WARRANTY OR REPRESENTATION FROM IBM (OR ITS AFFILIATES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS); OR

• ALTERING THE TERMS AND CONDITIONS OF THE APPLICABLE LICENSE AGREEMENT GOVERNING THE USE OF IBM SOFTWARE.

IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion. Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.

Agenda

What’s new in Streams GitHub Projects?

Toolkit Enhancements in Streams v4.2

What’s New? – Language Support

Topology Toolkit (streamsx.topology)

– You can now write your Streams application purely in Python

What’s New? - Adapters

Solr Toolkit (streamsx.solr)

• Solr is the popular, blazing-fast, open source enterprise search platform built on Apache Lucene™

• SolrDocumentSink - This operator is used for writing tuples as Solr documents to a Solr collection.

• SolrQuery - This operator is used for querying a Solr server. One of the incoming attributes must be a Solr query.

• SolrStemmer - This operator is used for stemming words. For example, apples -> apple, walked -> walk, talked -> talk

http://ibmstreams.github.io/streamsx.solr/
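To make the stemming examples concrete, here is a toy suffix-stripping sketch in Python. It is not the algorithm used by SolrStemmer or Lucene; the function name `naive_stem` and its two rules are illustrative assumptions that only cover the three examples on this slide.

```python
def naive_stem(word: str) -> str:
    """Toy stemmer: strips a trailing "ed" or a plural "s".

    Illustrative only. Real stemmers apply many more rules and
    exceptions than the two shown here.
    """
    lower = word.lower()
    # Past tense: walked -> walk, talked -> talk
    if lower.endswith("ed") and len(lower) > 4:
        return lower[:-2]
    # Simple plural: apples -> apple (words ending in "ss" are left alone)
    if lower.endswith("s") and not lower.endswith("ss") and len(lower) > 3:
        return lower[:-1]
    return lower


if __name__ == "__main__":
    for word in ["apples", "walked", "talked"]:
        print(word, "->", naive_stem(word))
```

A rule set this small will mis-stem many English words, which is why a production pipeline would rely on the analyzer that ships with the search platform.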

What’s New? - Adapters

Cassandra Toolkit (streamsx.cassandra)

• Newest toolkit in active development and production at The Weather Company!

• Ability to write data to Cassandra from a Streams Application:

stream<rstring greeting....> Greeting = Beacon() {
    param
        iterations: 1000000u; // generate 1000000 tuples
        period: 0.5; // generate a tuple every 0.5 seconds
    output
        Greeting:
            greeting = "Hello Streams!",
            count = IterationCount() + 1ul,
            testList = [1,2,3],
            testSet = {4, 5, 6},
            testMap = {7: true, 8: false, 9: true},
            nInt = -2147483647;
}

() as CoolStuff = com.weather.streamsx.cassandra::CassandraSink(Greeting) {
    param
        connectionConfigZNode: "/cassandra_config";
        nullMapZnode: "/null_values";
}

What’s New? - Adapters

HBase Toolkit (streamsx.hbase)

• Support for BigInsights 4.2

• The HBASEPut operator now uses the HBase caching mechanism to buffer writes, which improves performance when writing a lot of data to the HBase server.

() as putSink = HBASEPut(In1, In2)
{
    param
        tableName : "users" ;
        enableBuffer : true ;
}

What’s New? – Other interesting toolkits

Mail Toolkit (streamsx.mail)

– Sending and reading emails in a Streams application

Shell Toolkit (streamsx.shell)

– Utility toolkit to execute shell commands in a Streams application

OpenCV Toolkit (streamsx.opencv)

– Enables Streams applications to ingest and process images with the OpenCV library.

What’s New? - Adapters

• HDFS Toolkit (streamsx.hdfs)

• Support for BigInsights 4.2

• TempFile Support

– Specify a temporary file name for files that are being written, so you can tell which files in HDFS are still in progress. When the file is closed, it is renamed to the final file name. (Cannot be used in a consistent region.)

() as ToHDFS = HDFS2FileSink(input)
{
    param
        file : "Locations.csv" ;
        tempFile : "Locations.%TIME.tmp" ;
}
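The write-then-rename idea behind the tempFile parameter can be sketched on a local file system in Python. This is not the HDFS2FileSink implementation: the function name `write_with_temp_name` and the timestamp-based naming (loosely mirroring `Locations.%TIME.tmp`) are assumptions made for illustration.

```python
import os
import tempfile
import time


def write_with_temp_name(directory: str, final_name: str, lines: list) -> str:
    """Write under a temporary name, then rename to the final name on close."""
    # Hypothetical naming scheme: <stem>.<epoch seconds>.tmp
    stem = os.path.splitext(final_name)[0]
    temp_path = os.path.join(directory, f"{stem}.{int(time.time())}.tmp")
    final_path = os.path.join(directory, final_name)

    # While this block runs, anyone listing the directory sees only the
    # .tmp name and can tell the file is still being written.
    with open(temp_path, "w", encoding="utf-8") as handle:
        for line in lines:
            handle.write(line + "\n")

    # The file is closed; publish it under its final name.
    os.replace(temp_path, final_path)
    return final_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        path = write_with_temp_name(workdir, "Locations.csv", ["lat,lon", "43.6,-79.4"])
        print(os.path.basename(path), os.listdir(workdir))
```

Downstream readers can then skip anything ending in `.tmp` and treat every other file as complete.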

What’s New? - Adapters

• Messaging Toolkit (streamsx.messaging)

Support for AppConfig – You can supply credentials for MQTT, JMS, Kafka, and RabbitMQ through an application configuration, so credentials (and in some cases properties) can be updated from the console or streamtool while a job continues to run.

Improved metrics – RabbitMQ and MQTT operators now have metrics for connection status and number of connection attempts.

What’s New? – Analytics

• Weather Toolkit (streamsx.weather)

• Enables Streams applications to retrieve weather forecasts from the Weather Company Data Bluemix service

• Provides the following operators:

– CurrentWeather

– ForecastDaily

– ForecastHourly

– HistoricalWeather

What’s New? – Analytics and Processing

Text Toolkit (streamsx.text)

– Apache UIMA – Open source project to facilitate the analysis of unstructured content such as text, audio and video.

– Integrates the Text Analytics component of Apache UIMA, which provides a system for extracting information from text data.

– The Text Toolkit includes operators to extract information from text data and provides operations for text analysis, like lemmatization and text annotation with UIMA Ruta scripts or existing project-specific UIMA PEAR files.

Healthcare Toolkit (streamsx.health)

– Added microservice to support MLLP and HL7 OBX data

What’s New? – Data Visualization

Visualization Toolkit (streamsx.visualization)

– READ is a cloud-ready, developer-centric, API-friendly playground for visual analytics.

– Create advanced, reactive, interactive, and real-time visualizations and dashboards by combining data from Streams based APIs, Watson APIs, and any other APIs of your choice.

What’s New? – Streams 4.2 Specialized Toolkits

ODM Toolkit (com.ibm.streams.rules)

Introduction of a rules compiler that compiles ODM rules into an SPL application

Better performance than existing ODM operator

See previous recording if you want more information

What’s New? – Streams 4.2 Specialized Toolkits

Text Toolkit (com.ibm.streams.text)

Includes the BigInsights web tool, so you can create extractors without needing a BigInsights installation

Support for dynamic updates. This means that if you have a list of words or keywords that your application is searching for, you can update that list without having to restart your application.
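The general idea of updating a keyword list while processing continues can be sketched in plain Python. This does not show how com.ibm.streams.text implements dynamic updates; the class `KeywordMatcher` and its methods are hypothetical names used only to illustrate swapping a dictionary at runtime.

```python
import threading


class KeywordMatcher:
    """Finds keywords in text; the keyword list can be replaced at runtime."""

    def __init__(self, keywords):
        self._lock = threading.Lock()
        self._keywords = frozenset(k.lower() for k in keywords)

    def update_keywords(self, keywords):
        # Build the new set first, then swap it in under the lock, so a
        # concurrent find() sees either the old list or the new one.
        new_set = frozenset(k.lower() for k in keywords)
        with self._lock:
            self._keywords = new_set

    def find(self, text):
        with self._lock:
            current = self._keywords
        # Whitespace tokenization keeps the sketch short.
        return [token for token in text.lower().split() if token in current]


if __name__ == "__main__":
    matcher = KeywordMatcher(["storm"])
    print(matcher.find("Storm warning issued"))
    matcher.update_keywords(["warning"])  # no restart needed
    print(matcher.find("Storm warning issued"))
```

Because the set is immutable and replaced as a whole, readers never observe a half-updated list.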

What’s New? – Streams 4.2 Specialized Toolkits

Text Toolkit (com.ibm.streams.text)

Simplified sentiment extraction: NEW ExtractedSentiment operator!

Using raw Text Extract operator

What’s New? – Streams 4.2 Specialized Toolkits

Geospatial Toolkit (com.ibm.streams.geospatial)

– PointMapMatcher shared map mode

• Requires the Boost library to enable shared map mode

• Maps can be shared across multiple PointMapMatcher operators using shared memory

• Make sure all PointMapMatcher operators and MapStore operators are colocated on the same host

• Use the new MapStore operator to read in the map

• Specify the map store name in the PointMapMatcher operator

What’s New? – Streams 4.2 Specialized Toolkits

IBM Watson Speech to Text Toolkit (com.ibm.streams.speech2text)

– This toolkit comes as a separate download from the same location where you retrieve your Streams install

• This is not available for the Streams QSE

– Once the package is downloaded and untarred, simply follow the README instructions to build and deploy a sample

– This toolkit is only available for Intel x86 RHEL6/RHEL7

– The WatsonS2T operator cannot be fused with other copies of itself (parallel WatsonS2T operators must run in separate PEs)

What’s New? – Streams 4.2 Specialized Toolkits

IBM Watson Speech to Text Toolkit (com.ibm.streams.speech2text)

Sample output tuple:

{utteranceStartTime=0,
 utteranceEndTime=14.96,
 utteranceNumber=1,
 utterance=" <s> ~SIL four score and seven years ago ~SIL our fathers ~SIL brought ~SIL forth on this continent ~SIL a new nation ~SIL conceived in liberty ~SIL and dedicated to the proposition that ~SIL all men are created equal </s>"}
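A downstream step will often want the plain words from the utterance attribute. The Python sketch below strips the markers seen in the sample tuple. It assumes that `<s>` and `</s>` delimit the utterance and that `~SIL` marks silence (the slide does not define them), and `clean_utterance` is a hypothetical helper, not part of the toolkit.

```python
import re


def clean_utterance(utterance: str) -> str:
    """Remove <s>, </s>, and ~SIL markers and collapse extra whitespace."""
    without_markers = re.sub(r"</?s>|~SIL", " ", utterance)
    return " ".join(without_markers.split())


if __name__ == "__main__":
    sample = (
        " <s> ~SIL four score and seven years ago ~SIL our fathers "
        "~SIL brought ~SIL forth on this continent ~SIL a new nation "
        "~SIL conceived in liberty ~SIL and dedicated to the proposition "
        "that ~SIL all men are created equal </s>"
    )
    print(clean_utterance(sample))
```

If the silence positions matter, for example to guess at punctuation, splitting on `~SIL` rather than discarding it would keep that information.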

What’s New? – Streams 4.2 Specialized Toolkits

Cybersecurity Toolkit (com.ibm.streams.cybersecurity)

– DNSTunneling - The DNSTunneling operator analyzes DNS response traffic and reports suspicious behaviour that may indicate the presence of DNS tunneling in the network.

– QRadarSink - This operator allows Streams applications to send syslog messages to a QRadar host.

– BWListTagger (BlackWhiteListTagger) – Improved algorithm for checking whether a domain is part of a blacklist or whitelist.
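The slide does not describe the BWListTagger algorithm. As a rough sketch of what a domain list lookup can look like, the Python below checks a domain and each of its parent domains against a set. The function `list_match` and the choice of suffix matching are assumptions, not the toolkit's method.

```python
def list_match(domain: str, listed: set):
    """Return the listed entry that covers `domain`, or None.

    Checks the full name first, then each parent domain, so
    "mail.evil.example.com" matches a listed "evil.example.com".
    """
    labels = domain.lower().rstrip(".").split(".")
    for start in range(len(labels)):
        candidate = ".".join(labels[start:])
        if candidate in listed:
            return candidate
    return None


if __name__ == "__main__":
    blacklist = {"evil.example.com"}
    print(list_match("mail.evil.example.com", blacklist))
    print(list_match("good.example.com", blacklist))
```

Each lookup costs one set probe per label in the domain, independent of how many entries the list holds.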

What’s New? – Streams 4.2 Specialized Toolkits

New GitHub Toolkits Included in Product

– JDBC Toolkit

• Lets you run SQL against any database that supports JDBC

– JSON Toolkit

• Converts data from JSON to SPL tuples and vice versa

– Datetime Toolkit

• Provides facilities to work with date and time data more easily

– Internet of Things Toolkit

• Micro-service style applications for connecting with the Internet of Things Platform on Bluemix

– Network Toolkit

• Provides functions for processing network data, which makes it easier to work with the Cybersecurity Toolkit
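The JSON-to-tuple conversion mentioned for the JSON Toolkit can be illustrated outside Streams with a short Python sketch. The `Reading` type plays the role of an SPL tuple type; it and the two helper functions are hypothetical names, and the toolkit's own operators are not shown here.

```python
import json
from typing import NamedTuple


class Reading(NamedTuple):
    # Hypothetical schema standing in for an SPL tuple type.
    sensor: str
    value: float


def json_to_tuple(text: str) -> Reading:
    """Parse a JSON object and coerce its fields to the tuple schema."""
    data = json.loads(text)
    return Reading(sensor=str(data["sensor"]), value=float(data["value"]))


def tuple_to_json(reading: Reading) -> str:
    """Serialize the tuple back to a JSON object, one key per attribute."""
    return json.dumps(reading._asdict())


if __name__ == "__main__":
    reading = json_to_tuple('{"sensor": "s1", "value": 21.5}')
    print(reading)
    print(tuple_to_json(reading))
```

The point of the typed schema is that malformed input fails at the conversion step instead of deep inside later processing.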

Questions?