© 2015 IBM Corporation - Meetupfiles.meetup.com/7770922/Streams_Overview_12-4-16.pdf · © 2015...

© 2015 IBM Corporation


Agenda

• Introduction to Streams

• Use Cases / References / Samples

• Demo


IBM InfoSphere Streams for Context-Aware Stream Computing

Experience the power of now: secure, continuous, dynamic

Figure labels: Data, Context-Aware Analytics, Real-Time Action

• Acquire: Broadest range of data types

• Analyze: Continuous multimodal analytics

• Act: Right time, right method


Why InfoSphere Streams?

Top Performance

• Telco: 200 billion messages per day

• Trade application: 5.7 million messages per second, 30-microsecond latency

• Low CPU and memory footprint

Real-Time Analytics

• Text, Geospatial, Image/Video, Acoustic, Statistical, Natural language processing, Time series, Statistics/Mathematics, Predictive

Enterprise Ready

• Integration with existing architectures

• Privacy built in

• IBM services and support

Context Awareness

• Allows building context and profiles of entities and correlating streaming data with contextual information

• Look up historical data in databases and Big Data repositories

• Not just looking at each event or a small collection of events independently


Introduction to Stream Processing

Incremental tuple by tuple processing

Figure labels: FX rate, Internal Crossing, Weather, Exchange, Value Added Feed, Stream, Tuple, Operator
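
The idea of incremental tuple-by-tuple processing can be sketched outside Streams. What follows is a minimal Python illustration using generators, not SPL and not the Streams API. The operator names (`source`, `filter_op`, `enrich_with_fx`, `running_total`), the tuple schema, and the FX rates are all assumptions made up for this example. Each operator takes one tuple at a time from its input stream and keeps only the state it needs.

```python
from typing import Any, Callable, Dict, Iterable, Iterator

Record = Dict[str, Any]  # one tuple on a stream (hypothetical schema)


def source(trades: Iterable[Record]) -> Iterator[Record]:
    """Source operator: emits tuples one at a time."""
    for trade in trades:
        yield trade


def filter_op(stream: Iterable[Record],
              predicate: Callable[[Record], bool]) -> Iterator[Record]:
    """Filter operator: forwards only tuples that satisfy the predicate."""
    for tup in stream:
        if predicate(tup):
            yield tup


def enrich_with_fx(stream: Iterable[Record],
                   fx_rates: Dict[str, float]) -> Iterator[Record]:
    """Enrich operator: adds a USD price using a lookup table of FX rates."""
    for tup in stream:
        yield {**tup, "usd": tup["price"] * fx_rates[tup["currency"]]}


def running_total(stream: Iterable[Record]) -> Iterator[Record]:
    """Stateful operator: keeps a running sum, updated per tuple."""
    total = 0.0
    for tup in stream:
        total += tup["usd"]
        yield {**tup, "running_usd": total}


# Made-up input data and rates, for illustration only.
trades = [
    {"symbol": "IBM", "price": 100.0, "currency": "USD"},
    {"symbol": "SAP", "price": 50.0, "currency": "EUR"},
    {"symbol": "XYZ", "price": 0.0, "currency": "USD"},   # dropped by the filter
    {"symbol": "BMW", "price": 80.0, "currency": "EUR"},
]
fx_rates = {"USD": 1.0, "EUR": 2.0}

pipeline = running_total(
    enrich_with_fx(filter_op(source(trades), lambda t: t["price"] > 0), fx_rates)
)
results = list(pipeline)
```

Because every stage is a generator, no stage waits for the whole input: a tuple flows through all operators before the next one is read.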


InfoSphere Streams Application Pattern


Stages: Ingest, Prepare, Detect and Predict, Decide, Act, Store

• Ingest: Sensors, Social, Machine Data, Location, Audio, Video, Text

• Prepare: Transform, Filter, Correlate, Aggregate, Enrich

• Detect and Predict: Classification, Patterns, Anomalies, Scoring

• Decide: Business Rules, Conditional Logic

• Act: Notify, Publish, Execute, Visualize

• Store: Warehouse, Hadoop, Operational Store, Files
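
As a rough illustration of the application pattern, the stages can be written as small per-tuple steps. This is a plain Python sketch under stated assumptions, not Streams code: the `Reading` type, the `Pipeline` class, the deviation-from-running-mean score, and the threshold of 10.0 are all hypothetical choices made for the example, and the Store stage is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class Reading:
    sensor: str
    value: Optional[float]  # None models a missing measurement


@dataclass
class Pipeline:
    threshold: float = 10.0  # hypothetical business rule
    means: Dict[str, float] = field(default_factory=dict)
    counts: Dict[str, int] = field(default_factory=dict)
    alerts: List[str] = field(default_factory=list)

    def prepare(self, r: Reading) -> Optional[Reading]:
        """Prepare: filter out tuples with missing values."""
        return r if r.value is not None else None

    def detect(self, r: Reading) -> float:
        """Detect: score a reading by its deviation from the running mean."""
        n = self.counts.get(r.sensor, 0)
        mean = self.means.get(r.sensor, r.value)
        score = abs(r.value - mean)
        # Update the running mean incrementally, one tuple at a time.
        self.counts[r.sensor] = n + 1
        self.means[r.sensor] = mean + (r.value - mean) / (n + 1)
        return score

    def decide(self, score: float) -> bool:
        """Decide: apply the threshold rule."""
        return score > self.threshold

    def act(self, r: Reading, score: float) -> None:
        """Act: record a notification (a real system might publish it)."""
        self.alerts.append(f"{r.sensor}: value {r.value} deviates by {score:.1f}")

    def process(self, readings: Iterable[Reading]) -> None:
        for raw in readings:  # Ingest
            r = self.prepare(raw)
            if r is None:
                continue
            score = self.detect(r)
            if self.decide(score):
                self.act(r, score)


pipeline = Pipeline()
pipeline.process([
    Reading("t1", 20.0),
    Reading("t1", None),   # dropped in Prepare
    Reading("t1", 22.0),
    Reading("t1", 50.0),   # deviates from the running mean of 21.0
])
print(pipeline.alerts)
```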


InfoSphere Streams Overview

• Integrated Development Environment: Development and Management

• Scale-Out Runtime: Flexibility and Scalability

• Analytic Toolkits: Functional and Optimized

Available in the cloud and on premises for flexible deployment


Development Environment

Integrated Development Environment

Development and Management

Streams Processing Language

Visual Composition Tools


Integration with Analytic Tools


Integrated Development Environment

Development and Management

Streaming to Excel

Wrappers for legacy code written in Java, C++, Python, R, and MATLAB


Integration with Languages


Integrated Development Environment

Development and Management

Wrappers for legacy code written in Java, C++, R, and MATLAB


Monitoring and Debugging Support


Integrated Development Environment

Development and Management

Web based Monitoring Console


Runtime

Scale-Out Runtime

Flexibility and Scalability

• High-performance clustered runtime

• Large-scale deployment

• RHEL, CentOS, SUSE Linux Enterprise Server

• x86 and Power multicore hardware

• InfiniBand support

• Ethernet support


Streams Runtime Illustrated

Four x86 hosts

Optimizing scheduler assigns jobs to hosts, and continually manages resource allocation

Commodity hardware: laptop, blades or high-performance clusters

Figure labels: Meters, Company Filter, Usage Model, Usage Contract, Text Extract, Season Adjust, Daily Adjust, Temp Action


Streams Runtime Illustrated

Five x86 hosts

Optimizing scheduler assigns PEs (processing elements) to hosts, and continually manages resource allocation

Commodity hardware: laptop, blades or high-performance clusters

Dynamically add hosts and jobs

New jobs work with existing jobs

Figure labels: Meters, Company Filter, Usage Model, Usage Contract, Temp Action, Text Extract, Degree History, Compare History, Store History, Season Adjust, Daily Adjust


Runtime: Advanced Features


Scale-Out Runtime

Flexibility and Scalability

• User-Defined Parallelism

• Application Resiliency

• System High Availability


Streams Domain

Figure labels: Tooling; Domain (Domain Services, Domain Metadata Catalog); two Instances, each with Instance Services and an Instance Metadata Catalog; four Hosts, each with a Host Controller and two PECs


Automated System High Availability

“Without specialized HA skills, an administrator can quickly and easily configure Streams to be resilient and use a single console to manage multiple instances with common users and hosts.”

New next-generation architecture

◦ Simpler Setup & Administration

◦ More Secure

◦ More Resilient

◦ More Automatic

◦ More Dynamic

◦ New JMX API


Scenario 1: Management Host Failure

Services are running with an HA Count of 3

A Host failure is detected

If a Service on the Host was the leader, a standby takes over

A replacement service is started

Another Host becomes available and is tagged for management services

The Services are load balanced across the management hosts

Figure labels: Resources A, B and C, each tagged “Management” and running Service A (one leader, two standbys); Resource D, tagged “Management”, running Service A (standby); AUTOMATIC
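
The failover steps in Scenario 1 can be illustrated with a toy model. This is a hedged Python sketch, not how Streams implements its management services: the `ManagedService` class and its methods are hypothetical, and the final load-balancing step is not modelled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ManagedService:
    """Toy model of one management service with a leader and standbys."""
    name: str
    ha_count: int
    leader: Optional[str] = None
    standbys: List[str] = field(default_factory=list)

    def hosts(self) -> List[str]:
        return ([self.leader] if self.leader else []) + self.standbys

    def start_on(self, host: str) -> None:
        # The first copy becomes the leader; later copies are standbys.
        if self.leader is None:
            self.leader = host
        else:
            self.standbys.append(host)

    def host_failed(self, host: str) -> None:
        if host == self.leader:
            # A standby takes over as leader.
            self.leader = self.standbys.pop(0) if self.standbys else None
        elif host in self.standbys:
            self.standbys.remove(host)

    def reconcile(self, management_hosts: List[str]) -> None:
        """Start replacement copies on free hosts until ha_count is met."""
        for host in management_hosts:
            if len(self.hosts()) >= self.ha_count:
                break
            if host not in self.hosts():
                self.start_on(host)


hosts = ["A", "B", "C"]                  # tagged for management services
svc = ManagedService("Service A", ha_count=3)
svc.reconcile(hosts)                     # leader on A, standbys on B and C

hosts.remove("A")                        # host failure is detected
svc.host_failed("A")                     # standby on B becomes leader
svc.reconcile(hosts)                     # no free host yet: two copies run

hosts.append("D")                        # another host becomes available
svc.reconcile(hosts)                     # replacement standby starts on D
print(svc.leader, svc.standbys)
```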


Scenario 2: Application Host Failure

An application's PEs are running across several Hosts

A Host failure is detected

PEs are started on alternative application Hosts

Streams are reconnected

Figure labels: Resources A, B and C, each tagged “Application”; operators Source, Op 1, Op 2, Sink 1 and Sink 2; AUTOMATIC
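
The recovery steps in Scenario 2 can be sketched in the same spirit. The Python below is an illustration only, assuming a simple "fewest PEs wins" placement rule; the real Streams scheduler is not described in these slides, and the function names and the initial placement are hypothetical.

```python
from typing import Dict, List, Tuple


def restart_failed_pes(placement: Dict[str, str],
                       failed_host: str,
                       hosts: List[str]) -> Dict[str, str]:
    """Move every PE on the failed host to the least-loaded surviving host."""
    survivors = [h for h in hosts if h != failed_host]
    if not survivors:
        raise RuntimeError("no application hosts left")
    new_placement = dict(placement)
    for pe, host in placement.items():
        if host != failed_host:
            continue
        # Count PEs currently placed on each surviving host.
        load = {h: 0 for h in survivors}
        for h in new_placement.values():
            if h in load:
                load[h] += 1
        new_placement[pe] = min(survivors, key=lambda h: load[h])
    return new_placement


def resolve_connections(streams: List[Tuple[str, str]],
                        placement: Dict[str, str]) -> List[Tuple[str, str]]:
    """Reconnect streams: map each PE-to-PE stream to its host pair."""
    return [(placement[src], placement[dst]) for src, dst in streams]


# Hypothetical initial placement of PEs on application hosts.
placement = {"Source": "A", "Op 1": "B", "Op 2": "B", "Sink 1": "C", "Sink 2": "A"}
streams = [("Source", "Op 1"), ("Op 1", "Op 2"), ("Op 2", "Sink 1"), ("Op 2", "Sink 2")]

new_placement = restart_failed_pes(placement, failed_host="B", hosts=["A", "B", "C"])
connections = resolve_connections(streams, new_placement)
print(new_placement)
print(connections)
```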


Analytic Toolkits


Analytic Toolkits

Functional and Optimized


Toolkits and Accelerators to Speed Up Development

Standard Toolkit

• Relational Operators: Filter, Sort, Functor, Join, Punctor, Aggregate

• Adapter Operators: FileSource, UDPSource, FileSink, UDPSink, DirectoryScan, Export, TCPSource, Import, TCPSink, MetricsSink

• Utility Operators: Custom, Split, Beacon, DeDuplicate, Throttle, Union, Delay, ThreadedSplit, Barrier, DynamicFilter, Pair, Gate, JavaOp, Switch, Parse, Format, Decompress, CharacterTransform

• XML Operator: XMLParse

IBM Supported Toolkits

Database, DataStage, Big Data, Data Explorer, Messaging, Internet, Text Analytics, Mining, SPSS, CEP, Time Series, Geospatial, Financial, R, Kafka, MLlib

Open-Source Toolkits

JSON, HTTP/REST, OpenCV, Accumulo, HBase, Documents, ….

User-Defined Toolkits

Extend the language by adding user-defined operators, types, and functions


IBM Streams Community GitHub (http://ibmstreams.github.io/)

◦ 38 Active Open-Source Projects!

◦ More Agile

◦ Decouple from product release cycle

◦ In-sync with open-source ecosystem

◦ streamsx.hbase

◦ streamsx.thrift

◦ streamsx.mongoDB

◦ streamsx.document

◦ streamsx.inet

◦ streamsx.hdfs

◦ streamsx.messaging

◦ streamsx.json

◦ streamsx.bytes

◦ streamsx.datetime

◦ streamsx.math

IBM Streams GitHub - http://ibmstreams.github.io/

StreamsDev - https://developer.ibm.com/streamsdev/


Use Cases / References


University of Ontario Institute of Technology (UOIT) uses big data to improve quality of care for neonatal babies

Need

• Perform real-time analytics on physiological data from neonatal babies

• Continuously correlate data from medical monitors to detect subtle changes and alert hospital staff sooner

• Give caregivers early warning so they can deal with complications proactively

Benefits

• Detects life-threatening conditions 24 hours before symptoms are exhibited

• Lower morbidity and improved patient care


Operations Analysis

Only vendor combining at-rest vehicle data with real-time in-use data from vehicles for a single, integrated view and analysis within and outside of a Hadoop environment

• Predict demand for replacement parts and service

• Monetize telematics data

• Provide driver assistance

Advanced Condition Monitoring


Bharti Airtel reduces billing costs and improves customer satisfaction.

Need

Could not achieve real-time billing, which required handling billions of Call Detail Records (CDRs) per day and de-duplicating them against 15 days' worth of CDR data

Benefits

• Real-time mediation and analysis of 5B CDRs per day

• Data processing time reduced from 12 hrs to 1 min

• Hardware cost reduced to 1/8th

• Proactively address issues (e.g. dropped calls) impacting customer satisfaction
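
The de-duplication requirement described under Need, checking each CDR against a 15-day history, can be sketched as a windowed lookup. This is a minimal Python illustration, not the Bharti Airtel implementation and not the Streams DeDuplicate operator: the record schema `(key, timestamp)` is hypothetical, and the sketch assumes records arrive in timestamp order and that the key set fits in memory.

```python
from collections import OrderedDict
from datetime import datetime, timedelta
from typing import Iterable, Iterator, Tuple

Cdr = Tuple[str, datetime]  # (record key, event time), hypothetical schema


def deduplicate(cdrs: Iterable[Cdr],
                window: timedelta = timedelta(days=15)) -> Iterator[Cdr]:
    """Emit each CDR unless the same key was seen within the window."""
    seen: "OrderedDict[str, datetime]" = OrderedDict()  # oldest entry first
    for key, ts in cdrs:
        # Evict keys that have aged out of the window.
        while seen:
            _, oldest_ts = next(iter(seen.items()))
            if ts - oldest_ts > window:
                seen.popitem(last=False)
            else:
                break
        if key in seen:
            continue  # duplicate within the window: drop it
        seen[key] = ts
        yield key, ts


start = datetime(2015, 1, 1)
records = [
    ("call-1", start),
    ("call-2", start + timedelta(days=1)),
    ("call-1", start + timedelta(days=2)),    # duplicate, dropped
    ("call-1", start + timedelta(days=20)),   # outside the window, kept
]
unique = list(deduplicate(records))
print([key for key, _ in unique])
```

Evicting old keys as new records arrive keeps the state bounded by the window rather than by the total history.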


© 2016 IBM Corporation

“IBM Analytics gave us the power to break the big stories first, driving greater fan engagement.”

Alexandra Willis, Head of Digital and Content, AELTC

Wimbledon 2015

Real-time analytics helps share the moments that matter with tennis fans worldwide

Business challenge

To keep tennis fans engaged with its coverage of The Championships 2015, the AELTC wanted to use real-time match data to attract fans’ attention and encourage them to visit its digital platforms.

Transformation

Real-time analysis of match data instantly notifies the AELTC’s content team about key events such as record serves or career milestones, helping them break news online before competitors can react.

Known to millions of fans simply as “Wimbledon”, The Championships is the oldest of tennis’ four Grand Slams, and one of the world’s highest-profile sporting events. Organized by the All England Lawn Tennis Club (AELTC), it has been a global sporting and cultural institution since 1877.

Business benefits:

• Broke news of records and milestones within seconds, faster than competing media

• Boosted fans’ engagement with the tournament by sharing the moments that mattered

• 71 million visits to wimbledon.com proved the success of the digital strategy

Solution components

• IBM Streams

• IBM SPSS® Modeler

• IBM Emerging Technology Services

• IBM Global Business Services®

Media & Entertainment


CenterPoint Energy Advanced Metering System Deployment


Energy companies stand to save millions by using the power of Big Data to predict ice floes

Need

• 25% of the world’s remaining oil reserves are present in the Arctic, where the harsh and hostile environment presents significant challenges for energy companies like ConocoPhillips

• High Arctic exploration operations will require a sophisticated solution that applies various algorithms, analytics, and simulated models of satellite and metocean data to detect, track and forecast ice floe trajectories

Benefits

• Anticipates saving roughly USD 300 million per season by reducing drilling mobilization costs

• Estimates savings of USD 1 billion per production platform by optimizing design requirements and ice management operations


Only vendor providing a comprehensive Big Data platform supporting the world’s largest oil exploration and production company’s Global Analytics Platform initiative

This company is implementing a multi-step roadmap to gain visibility into trusted information and enable analytics capability across the enterprise. It achieved $2M in IT savings during the initial implementation of the combined Composite* Information Server and PureData for Analytics solution. It is now expanding into a wider set of enabling analytics to include both data-in-motion and data-at-rest analytics across different business use cases.

* IBM Business Partner Composite Software on InfoSphere Server


Demo