Applying the R Language to BI and Real Time Applications

Post on 16-Apr-2017

Lou Bajuk-Yorgan, Sr. Dir. Product Management, TIBCO Analytics, lbajuk@tibco.com, @loubajuk

Applying the R Language in Streaming Applications and Business Intelligence

Lou Bajuk-Yorgan, Sr. Dir., Product Management, TIBCO Analytics

Analytic Challenges for Enterprises
• Big Data
  – More and more data, and the expectation to do something with it
• Competitive Pressures
  – Deeper insights into data: apply Advanced Analytics
  – Smarter Decisions: broaden analytic usage to a wider community beyond Data Scientists
  – Faster Decisions, both human and automated
• Agile response to evolving opportunities and threats
  – Answers (and the questions to ask) change rapidly

R can help…
• Agile
  – Easy prototyping of new models and analysis
• Deeper insights
  – Huge array of analytic methods available
  – The “best” method to solve a given problem is likely available

…but has its own challenges
• Performance
  – Not designed for real time or Big Data applications
• Broader usage
  – Hard for non-Data Scientists to use directly
  – Challenging to integrate into enterprise applications
  – Performance, commercial support and Intellectual Property concerns
• Compromises which impact Agility
  – Recode in a new, less agile environment
  – Rewrite, use specialized R packages to solve one problem better

What would the ideal solution look like?
• A single environment that would allow you to prototype in R, and deploy to production in R
  – Without recoding, without delay, without compromises
  – Enable agile response to changing opportunities and threats

Requires
• Analytic flexibility, power and breadth of R
• High performance, scalable, robust platform
• Easy to embed in Business Intelligence, Real time and custom applications
• Fully supported for mission critical applications
• Allows R users to continue to work in their preferred development environments (e.g., RStudio)

TIBCO Enterprise Runtime for R (TERR)
• Unique, enterprise-grade statistics engine, architected from the ground up by TIBCO
  – Based on TIBCO’s long history and expertise with S+
  – Better performance and memory management than open source R
• Designed for R language compatibility
  – Wide range of built-in analytic methods
  – Extensible through R community packages
• Designed for commercial embeddability
  – TIBCO licensed & supported product
  – Not GPL, not a repackaging of the open source R engine
• TERR extends the reach of R in the enterprise
  – Develop code in open source R
  – Deploy on a commercially-supported and robust platform
  – Without the delay and cost of rewriting your code
  – Embed in Data Discovery, BI and real time applications

TERR Performance

Better performance and memory management than open source R
– Handles much larger data sets in memory
– Designed and architected for 64-bit platforms
– Linear, predictable performance as data set sizes increase

Summary
• Small to moderate size data sets
  – Many common operations
  – TERR: 2-10x as fast as open source R (OS R)
• Larger data sets
  – Common operations (e.g., model scoring) or complex, real-world scripts
  – TERR: 10-100x as fast as OS R

[Chart: Predictions using SVMs from the e1071 package]

Fitting and Scoring Generalized Linear Models

                             OS R         TERR        Speedup
Model Fitting on 5M rows     107.1 sec    17.5 sec    6.1x
Model Scoring on 20M rows    84.2 sec     1 sec       84.2x
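The kind of workload behind the GLM figures can be sketched in a few lines of base R. This is a minimal illustration, not the benchmark script from the slides: the data are synthetic, the row counts are scaled down so it runs quickly, and the names `make_data` and `speedup` are assumptions introduced here. Running the same script in open source R and in another engine and comparing the elapsed times is one way to reproduce a comparison of this shape.

```r
# Minimal sketch: fit a logistic GLM on one data set, then score a larger one.
# Data, sizes, and column names are synthetic assumptions, not the benchmark's.
set.seed(42)

make_data <- function(n) {
  x1 <- rnorm(n)
  x2 <- runif(n)
  y  <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x1 - 0.8 * x2))
  data.frame(x1 = x1, x2 = x2, y = y)
}

train <- make_data(5e4)   # scaled-down stand-in for the 5M-row fitting set
score <- make_data(2e5)   # scaled-down stand-in for the 20M-row scoring set

# Time the model fit.
fit_time <- system.time(
  model <- glm(y ~ x1 + x2, data = train, family = binomial())
)

# Time the scoring step (predicted probabilities for every row).
score_time <- system.time(
  pred <- predict(model, newdata = score, type = "response")
)

cat("fit elapsed (sec):  ", fit_time[["elapsed"]], "\n")
cat("score elapsed (sec):", score_time[["elapsed"]], "\n")

# Speedup is the ratio of elapsed times: baseline engine over the faster one.
speedup <- function(baseline_sec, new_sec) baseline_sec / new_sec
round(speedup(107.1, 17.5), 1)  # fitting row of the table: 6.1
```

Timings depend heavily on hardware and data shape, so the elapsed values printed here are not comparable to the figures in the table.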

© Copyright 2000-2015 TIBCO Software Inc.

Visual analytics empowering you to make strong decisions using your data
• All Users: Business Analysts, Data Scientists, App Developers, Sys Admins
• All Data: Historical & Real-Time; Internal & External; Structured, Unstructured & Semi-Structured
• Descriptive & Diagnostic Analytics, Predictive & Prescriptive Analytics, Content Analytics, Location Analytics, Event Analytics, Fast Data Analytics
• Self-Service Analytics without sacrificing strong Central Governance

Predictive Analytics Ecosystem
• Leverage existing analytic investments in a unified framework
• Create guided analytic applications
• Rapid start with easy-to-use tools
• Native scripting in R

• TIBCO Enterprise Runtime for R (TERR)
• Open Source R
• MATLAB®
• SAS®
• SQL/In-database Analytics
• Hadoop/Spark for Big Data
• S+
• KNIME®
• Lavastorm Analytics®

Example 1: Embedded TERR in Spotfire
• Spotfire: Data Discovery and Visualization platform for Business Users and Analysts
  – Separate analytics platform, independent of TERR/R
• Easily enhance Spotfire analyses and applications with R language scripts
• Extend the impact of the Data Scientist/R by making their analytic insights available to a wider audience

• Write R code directly in Spotfire; TERR executes locally or on server
• Manage TERR analytics locally or in Server to reuse across community
• Deploy TERR-powered applications to the web

Power of embedded Advanced Analytics

Advanced Analytic Applications in Spotfire
• Customer Churn
  – Retain your most profitable customers
  – Increase upsell, decrease churn
• Fraud Detection
  – Reduce losses due to fraudulent transactions
• Supply Chain Optimization
  – Anticipate peaks and lulls
  – Optimize distribution centers
• HR Planning
  – Predict employee attrition and optimize retention

Example 2: TERR in TIBCO’s Complex Event Processing
• TERR powers real-time advanced analytics in TIBCO “Fast Data”
• When an event is identified, the CEP application applies a predictive model, and then can trigger an automated business process
  – E.g., extend a mobile offer to a customer; stop a fraudulent transaction in process

• Analyze: Analyze data in Spotfire; uncover patterns, trends & correlations
• Model: Develop model; deploy via TERR in TIBCO Streambase or Business Events
• Act: Automatically monitor real-time transactions; automatically trigger action
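The model-then-act step of Example 2 can be illustrated with plain R. This is only a sketch under stated assumptions: the transaction data are synthetic, and `handle_event`, `fraud_model`, the column names, and the 0.5 threshold are hypothetical names and values chosen for illustration. It does not show the Streambase or Business Events integration itself, only the R-language part that such an application would call.

```r
# Minimal sketch of the Model and Act steps; all names here are hypothetical.
set.seed(1)

# Model: fit a fraud classifier offline on synthetic historical transactions.
n <- 5000
history <- data.frame(
  amount = rexp(n, rate = 1 / 80),   # transaction amount
  night  = rbinom(n, 1, 0.2)         # 1 if the transaction happened at night
)
history$fraud <- rbinom(
  n, 1, plogis(-5 + 0.02 * history$amount + 1.5 * history$night)
)
fraud_model <- glm(fraud ~ amount + night, data = history, family = binomial())

# Act: score one incoming event and map the score to an action.
handle_event <- function(event, model, threshold = 0.5) {
  p <- unname(predict(model, newdata = event, type = "response"))
  list(score = p, action = if (p >= threshold) "block" else "allow")
}

# Two example events, as one-row data frames with the model's columns.
small_event <- data.frame(amount = 20,  night = 0)
large_event <- data.frame(amount = 900, night = 1)

print(handle_event(small_event, fraud_model))
print(handle_event(large_event, fraud_model))
```

Keeping the fitted model object separate from the per-event function mirrors the split on the slide: the model is developed once, and the scoring function is what a real-time system would invoke for each event.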

Logistics Optimization
• Port Congestion Detection
  – Real time system triggers TERR
  – Analyzes port congestion
  – Recommends reduction of speed if no berths available
• Maritime Abnormality Detection
  – Based on Automatic Identification System info, TERR calculates likelihood of deviation from normal sailing routes
  – Alerts carrier & operator

Predictive Maintenance for Oil & Gas
• Oil & Gas Extraction
  – Maintenance Downtime and Equipment failures are costly
  – Engineers track sensor data to find leading indicators (temperature, vibration, etc.)
• Engineers usually use ad hoc rules on leading indicators
  – R/TERR used to develop predictive models for preventative maintenance
  – Deployed in real-time systems, alert when maintenance recommended

TERR Ecosystem
• TIBCO
  – Spotfire: BI and Data Discovery
  – Jaspersoft: pixel perfect reporting
  – Streambase: real time, streaming applications
• Lavastorm Analytics
  – Visual workflow tool for data management and analysis
  – Embedding TERR for R scripting and predictive tools
• RStudio IDE
  – Free, open source IDE widely used by the R Community
  – Fully compatible with TERR Developer Edition
• KNIME
  – Free, open source workflow tool for data management and analysis
  – TERR fully compatible with KNIME Interactive R Statistics Integration nodes

TERR for individual R users
• Empower R users
  – Enterprise platform for the deployment and integration of your work, without having to rewrite it!
• TERR Developer Edition
  – Full version of TERR engine for testing code prior to deployment
  – Compatible with RStudio & ESS Emacs
  – Free for non-production use
  – Supported through Community site
  – Available at tap.tibco.com

TERR is R for the Enterprise
• Develop code in open source R, deploy on commercially-supported and robust platforms
  – Without recoding, without compromises
  – Save time & money, quickly respond to new threats and opportunities
• Tightly & efficiently embed R language functionality
  – Extend the power of R to a wider audience, more applications

Learn more and Try it yourself
• TERR Community at community.tibco.com
  – Resources, Documentation, FAQs, Forums
• More info at spotfire.tibco.com/terr
• TERR Developer Edition
  – Full version of TERR engine for testing code prior to deployment
  – Supported through TIBCOmmunity, download via tap.tibco.com
• Spotfire Free Trial: http://spotfire.tibco.com/trial
• Presentations: http://www.slideshare.net/loubajukyorgan/presentations
  – Slides @loubajuk
• We’re hiring Data Scientists! Contact me at lbajuk@tibco.com
• R Consortium Founding Member: www.r-consortium.org