Revolution R: 100% R and more
Revolution Confidential
Revolution R: 100% R and More
Presented by: David Smith, VP Marketing, Revolution Analytics
August 24, 2011: Welcome!
Thanks for coming. Slides and replay available (soon) at:
http://bit.ly/railcj
David Smith
VP Marketing, Revolution Analytics
Editor, Revolutions blog: http://blog.revolutionanalytics.com
Twitter: @revodavid
In today’s webcast:
About Revolution Analytics and R
What Revolution R adds to R
Resources for getting more from R
Q&A
Introducing Revolution R
What is R?
• Data analysis software
• A programming language: development platform designed by and for statisticians
• An environment: huge library of algorithms for data access, data manipulation, analysis and graphics
• An open-source software project: free, open, and active
• A community: thousands of contributors, 2 million users; resources and help in every domain
R is Hot
Source: http://r4stats.com/popularity
R is exploding in popularity and functionality
Scholarly Activity: Google Scholar hits ('05-'09 CAGR)
Stata    10%
S-Plus    0%
SPSS    -27%
SAS     -11%
R        46%
[Chart: Package Growth. Number of R packages listed on CRAN, 2002 to 2010; vertical axis from 0 to 2500]
“A key benefit of R is that it provides near-instant availability of new and
experimental methods created by its user base — without waiting for the
development/release cycle of commercial software. SAS recognizes the value of R
to our customer base…”
Product Marketing Manager, SAS Institute, Inc.
“I’ve been astonished by the rate at which R has been adopted. Four years ago,
everyone in my economics department [at the University of Chicago] was using
Stata; now, as far as I can tell, R is the standard tool, and students learn it first.”
Deputy Editor for New Products at Forbes
[Chart: MSFT stock, 2009-01-02 to 2010-03-31, with panels for price (last 29.29), volume (millions), and Moving Average Convergence Divergence (12,26,9)]
3000+ R Packages from the Open Source community
Time Series analysis
Portfolio Optimization
Econometrics
Genomics
Clinical Trials
Bayesian Inference
Survival analysis
Social Networks
Data Visualization
Data APIs (Twitter)
... and more
R User Community
From: The R Ecosystem, bit.ly/R-ecosystem
Revolution R Enterprise is
R Productivity Environment (Windows)
• Script with type ahead and code snippets
• Solutions window for organizing code and data
• Packages installed and loaded
• Objects loaded in the R Environment
• Object details
• Sophisticated debugging with breakpoints, variable values, etc.
http://www.revolutionanalytics.com/demos/revolution-productivity-environment/demo.htm
Interactive Debugging
• One click to set a breakpoint in an R script
• Step in/out/over, inspect variables
• Eliminate the edit -> browser -> repair cycle
Coming soon: Revolution R GUI
• Accessible
• Powerful
• Extensible
Performance: Multi-threaded Math
Computation (4-core laptop)        Open Source R   Revolution R   Speedup
Linear Algebra (1)
  Matrix Multiply                  327 sec         13.4 sec       23x
  Cholesky Factorization           31.3 sec        1.8 sec        17x
  Linear Discriminant Analysis     216 sec         74.6 sec       2x
General R Benchmarks (2)
  R Benchmarks (Matrix Functions)  22 sec          3.5 sec        5x
  R Benchmarks (Program Control)   5.6 sec         5.4 sec        Not appreciable
(1) http://www.revolutionanalytics.com/why-revolution-r/benchmarks.php
(2) http://r.research.att.com/benchmarks/
Three Paradigms for Big Data
Standard R engine is constrained by capacity and performance
Revolution R Enterprise offers three methods for big data with R:
• Off-line: parallel out-of-memory analytics
• Off-line, distributed analytics
• On-line, in-database analytics
Hadoop Netezza
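Revolution R Enterprise with RevoScaleR
Big Data Statistics in R
www.revolutionanalytics.com/bigdata
Every US airline departure and arrival, 1987-2008
File: AirlineData87to08.xdf
Rows: 123.5 million
Variables: 29
Size on disk: 13.2 GB
# Arrival delay by day of week and scheduled departure hour.
# The data= argument is an assumption: the slide omits it, and the file named above is used here.
arrDelayLm2 <- rxLinMod(ArrDelay ~ DayOfWeek:F(CRSDepTime),
                        data = "AirlineData87to08.xdf", cube = TRUE)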
Example: Old Wives Census Analysis
http://info.revolutionanalytics.com/CensusOldWivesWhitePaper.html
RevoScaleR – Distributed Computing
[Diagram: one Master Node (RevoScaleR) and four Compute Nodes (RevoScaleR), each compute node with its own Data Partition]
• Portions of the data source are made available to each compute node
• RevoScaleR on the master node assigns a task to each compute node
• Each compute node independently processes its data and returns its intermediate results to the master node
• The master node aggregates the intermediate results from all compute nodes and produces the final result
*Available for Microsoft HPC Server, November 2011
Video demo: http://bit.ly/riUBgs
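The master/compute-node steps listed on this slide can be illustrated with a small sketch. It is written in plain Python purely to show the pattern and is not RevoScaleR code: the function names, the `Partial` record, and the sample delay values are all hypothetical, and a thread pool stands in for the compute nodes. Each "node" reduces its partition to a few sums, and the "master" combines those sums into a mean and sample variance without ever seeing the raw rows.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Partial:
    """Intermediate result returned by one compute node (hypothetical)."""
    n: int
    total: float
    total_sq: float


def process_partition(values):
    """Compute-node step: summarize one data partition independently."""
    return Partial(
        n=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def aggregate(partials):
    """Master-node step: combine intermediate results into the final result."""
    n = sum(p.n for p in partials)
    total = sum(p.total for p in partials)
    total_sq = sum(p.total_sq for p in partials)
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)  # sample variance
    return mean, variance


def distributed_mean_variance(partitions, workers=4):
    """Assign one task per partition, then aggregate on the 'master'."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_partition, partitions))
    return aggregate(partials)


# Made-up arrival delays (minutes), split into four partitions.
partitions = [[5, -3, 12], [0, 7], [20, -1, 4, 9], [2]]
mean, variance = distributed_mean_variance(partitions)
```

The point of the design is that only three numbers per partition travel back to the master, so the cost of the final aggregation does not grow with the number of rows.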
Revolution Analytics with Netezza Appliance
More info: http://bit.ly/R-Netezza
Revolution Analytics with Hadoop
[Diagram labels: R Client, Job Tracker, Task Node, Task Tracker, Map or Reduce, R, HDFS]
• Connectors to HDFS and HBase for interacting with data stores directly in R
• Hadoop Streaming package for executing MapReduce jobs from R
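For readers unfamiliar with the MapReduce model mentioned on this slide, here is a minimal local sketch of the streaming convention, in which mappers and reducers exchange tab-separated key/value lines and the framework sorts by key between the two steps. It is plain Python for illustration only and does not use the Revolution connectors or any Hadoop API; the record layout ("carrier,delay") and the sample values are assumptions.

```python
from itertools import groupby


def mapper(lines):
    """Map step: turn each 'carrier,delay' record into a 'key<TAB>value' line."""
    for line in lines:
        carrier, delay = line.strip().split(",")
        yield f"{carrier}\t{delay}"


def reducer(sorted_lines):
    """Reduce step: average the values in each run of identical keys."""
    pairs = (line.split("\t") for line in sorted_lines)
    for carrier, group in groupby(pairs, key=lambda kv: kv[0]):
        delays = [float(value) for _, value in group]
        yield f"{carrier}\t{sum(delays) / len(delays)}"


# Made-up input records; a real job would read them from HDFS.
records = ["AA,10", "UA,4", "AA,20", "UA,-2", "DL,7"]

# Hadoop sorts mapper output by key before the reduce step; sorted() stands in here.
result = list(reducer(sorted(mapper(records))))
```

Because the reducer only relies on its input being sorted by key, the same two functions could in principle be run as separate processes reading standard input, which is how streaming jobs are usually wired together.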
Enterprise Readiness: Revolution R Enterprise Server
Multi-User Support Production Applications
Integrate R analytics into Web-based applications:
• Data Analysis and Visualization
• Reporting
• Dashboards
• Interactive applications
Revolution R Enterprise Server with RevoDeployR
Deployment with Revolution R Enterprise
[Diagram labels:]
• End User, Application Developer, R Programmer
• Desktop Applications (e.g. Excel), Business Intelligence (e.g. Jaspersoft), Interactive Web Applications
• Client libraries (JavaScript, Java, .NET)
• HTTP/HTTPS – JSON/XML
• RevoDeployR Web Services
• Session Management, Authentication, Data/Script Management, Administration
• R
The Advanced Analytics Stack
Deployment / Consumption
Advanced Analytics
ETL
Data / Infrastructure
“Open Analytics Stack” White Paper: bit.ly/lC43Kw
On-Call Technical Support
Consulting: Migration | Analytics | Applications | Validation
Training: R | Revolution R | Statistical Topics
Systems Integration: BI | ERP | Databases | Cloud
Wrapping Up
Why R?
• Every data analysis technique at your fingertips
• Create beautiful and unique data visualizations
• Get better results faster
• Draw on the talents of data scientists worldwide
• R is hot, and growing fast
Revolution R Enterprise
• High-performance R for multiprocessor systems
• Modern Integrated Development Environment
• Statistical Analysis of Terabyte-Class Data Sets
• In-database R analytics with Hadoop (1) and Netezza
• Deploy R Applications via Web Services
• Telephone and email technical support
• Training and consulting services
• 100% compatible with R packages
• Easy-to-Use GUI (1)
Production-Grade Statistical Analysis for the Workplace
(1) Coming Soon
Further Reading
http://bit.ly/revo-r-pdf
http://bit.ly/r-is-hot
Revolution R Enterprise: Free to Academia
• Personal use
• Research
• Teaching
• Package development
Free Academic Download: www.revolutionanalytics.com/downloads/free-academic.php
Discounted Technical Support Subscriptions Available
Thank You!
Download slides, replay (from Aug 24): http://bit.ly/railcj
Learn more about Revolution R: revolutionanalytics.com/products
Keep up to date with R and Revolution news: revolutionanalytics.com/newsletter
Contact Revolution Analytics: http://bit.ly/hey-revo
The leading commercial provider of software and support for the popular open source R statistics language.
www.revolutionanalytics.com
+1 (650) 330 0553
Twitter: @RevolutionR