© 2014 The MathWorks, Inc.
Data Analytics with MATLAB
MathWorks India
What is Data Analytics?
“Data analytics solutions allow firms to discover, optimize, and deploy predictive models by analyzing data sources to improve business outcomes.”
Forrester Research
Challenges
• Overloaded by data
• Competition for better decision-making
• Big Data buzz, but no solution to make sense of the data (analytics) and put it to use (deployment)
Case Study: RITA Data Analytics
Goal: Create a data analytics tool to extract business intelligence from the RITA flight dataset. Data is available from 1987 to date.
Challenges:
• Huge volume of data (~30 GB)
• Selective data extraction and data mining
• Predictive analytics based on the data
• End users need to be isolated from the technicalities
Solution: Create an interactive application that
– Showcases critical statistics from the data
– Performs data mining and machine learning
– Allows the user to perform predictive analytics
– Creates detailed explanatory reports that give relevant intelligence
Key take-aways
• Unified end-to-end platform for data analytics
• Parallel processing to handle large data and intensive calculations
• Built-in machine learning algorithms for predictive analytics
• Support for building, deploying, and integrating analytics
Data Analytics Workflow
[Figure: workflow diagram. Labels: MATLAB Production Server(s), Web Server(s), Excel add-ins, Desktop, Web & Enterprise]
Customer Example – Tesco (Challenge)
Tesco (Required Solution)
Tesco – Data Analytics Requirement
[Figure labels: Temperature History, Weather Forecast, Sales History, Data Analytics Models, Enterprise Systems]
Tesco – Data Analytics Solution and Results
[Figure labels: Temperature History, Weather Forecast, Sales History, Data Analytics Models, Enterprise Systems, Automate, Business Results]
Tesco (Results Achieved)
Integrated Platform for Data Analytics
Challenges and the corresponding MATLAB solution:
• Handling large data: Distributed arrays and parallel computing
• Extract value from data: Built-in statistics and machine learning algorithms
• Time to deploy & integrate: Ease of deployment and leveraging enterprise
• Technology risk: High-quality libraries and support
Parallel and Distributed Computing
A scalable solution for big data computation!
Distributed Arrays: A PCT & MDCS Paradigm
[Figure: Multiple Flat Files → Distributed Arrays]
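The idea behind a distributed array (data split into partitions, each processed by a separate worker, with partial results combined at the end) can be sketched in a few lines. This is an illustrative Python sketch, not MATLAB or PCT/MDCS code: the function names, the worker count, and the delay values are all hypothetical, and a thread pool stands in for a real cluster of workers.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(values, n_parts):
    """Split a list into n_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(values), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def partial_sum(chunk):
    """Work done independently on one partition: (sum, count)."""
    return sum(chunk), len(chunk)


def distributed_mean(values, n_workers=4):
    """Mean of a non-empty list, computed from per-partition partial results."""
    chunks = [c for c in partition(values, n_workers) if c]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum, chunks))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


# Hypothetical flight delays in minutes (not taken from the RITA dataset).
delays = [5, 12, 0, 33, 7, 18, 2, 9, 41, 3]
print(distributed_mean(delays))  # 13.0
```

Only the small (sum, count) pairs travel back to the caller, which is why the same pattern scales to data that does not fit on one machine.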
Predictive Analytics - Overview
Machine Learning
Type of Learning: Categories of Algorithms
• Unsupervised Learning (group and interpret data based only on input data): Clustering
• Supervised Learning (develop a predictive model based on both input and output data): Classification, Regression
Predictive Analytics - Unsupervised Learning: Clustering with MATLAB
Unsupervised learning: group and interpret data based only on input data
Clustering algorithms:
• K-means, Fuzzy K-means
• Hierarchical
• Neural Network
• Gaussian Mixture
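To make "group data based only on input data" concrete, here is a minimal sketch of K-means (Lloyd's algorithm) on 2-D points. It is written in plain Python for illustration and is not the MATLAB implementation; the starting centroids and the sample points are hypothetical.

```python
def kmeans(points, centroids, n_iter=10):
    """Lloyd's algorithm on 2-D points, starting from the given centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for c, members in zip(centroids, clusters):
            if members:
                new_centroids.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                new_centroids.append(c)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Hypothetical data: two visually separate groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, [(0, 0), (10, 10)])
print(clusters[0])  # [(1, 1), (1, 2), (2, 1)]
print(clusters[1])  # [(8, 8), (9, 8), (8, 9)]
```

No labels are supplied anywhere: the grouping comes purely from the distances between the input points.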
Supervised Learning: Classification with MATLAB
Supervised learning: develop a predictive model based on both input and output data
Classification algorithms:
• Decision Tree
• Ensemble Method
• Neural Network
• Support Vector Machine
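As an illustration of learning from both inputs and known outputs, the sketch below fits a decision stump, the simplest possible decision tree: a single threshold on one feature. It is plain Python rather than MATLAB, the data are hypothetical, and it assumes class 1 lies above the threshold, which a full tree learner would not need to assume.

```python
def fit_stump(xs, labels):
    """Return the threshold t that minimizes training errors for the rule
    'predict 1 if x > t else 0'."""
    values = sorted(set(xs))
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_t, best_errors = None, len(xs) + 1
    for t in candidates:
        errors = sum((x > t) != bool(y) for x, y in zip(xs, labels))
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t


def predict(t, x):
    """Apply the learned one-threshold rule to a new input."""
    return 1 if x > t else 0


# Hypothetical training data: one numeric feature, binary class label.
xs = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
labels = [0, 0, 0, 1, 1, 1]
t = fit_stump(xs, labels)
print(t)                # 5.0
print(predict(t, 6.5))  # 1
print(predict(t, 4.0))  # 0
```

A full decision tree repeats this split search recursively on each side of the threshold, and an ensemble method combines many such trees.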
Supervised Learning: Regression with MATLAB
Supervised learning: develop a predictive model based on both input and output data
Regression algorithms:
• Linear
• Non-linear
• Non-parametric
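For the linear case, the model is a straight line fitted by ordinary least squares. The sketch below uses the closed-form solution for one predictor; it is plain Python for illustration only, and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations that lie exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)       # 2.0 1.0
print(slope * 5 + intercept)  # prediction for x = 5: 11.0
```

Non-linear and non-parametric regression replace the straight line with a more flexible function, but the fit-then-predict pattern stays the same.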
Key take-aways
• Unified end-to-end platform for data analytics
• Parallel processing to handle large data and intensive calculations
• Built-in machine learning algorithms for predictive analytics
• Support for building, deploying, and integrating analytics