Practical Machine Learning with R

@MatthewRenze #Microsoft

Job Postings for Machine Learning

Source: Indeed.com

Source: Stack Overflow 2017

Average Salary by Job Type (USA)

$108,000

$101,000

$100,000

Tool: language, platform, analytics

Source: O’Reilly 2015 Data Science Salary Survey

Overview

1. Intro to ML and R

2. Classification

3. Regression

4. Clustering

5. ML in Practice

How Does This Apply to Me?

Make decisions using data

Make predictions using data

Make recommendations using data

Automate these with code

Conceptual Model

Data → Machine Learning 𝑓(𝑥) → Prediction

About Me

Data Science Consultant

Education

B.S. in Computer Science

B.A. in Philosophy

Data Science specializations

Community

Public speaker

Pluralsight author

Microsoft MVP

Open source

Schedule

Lectures (10 min)

Demos (10 min)

Labs (20 min)

Breaks (5 min)

Logistics

Pairing for labs is optional

Ask questions if needed

Come and go as needed

Feedback at the end

A (Easy)

B (Hard)

Workshop URL

http://www.matthewrenze.com/workshops/practical-machine-learning-with-r/

Introduction to Machine Learning

What is machine learning?

Artificial Intelligence, Statistics, Machine Learning

Data → Function 𝑓(𝑥) → Prediction

Is cat? → 𝑓(𝑥) → Cat (Yes) or Not cat

What types of machine learning exist?

Types of Machine Learning

Supervised Learning Unsupervised Learning

Supervised Learning

Unsupervised Learning

How does machine learning work?

Training → Algorithm → Model

New Data → Model → Prediction

Super simplified version of machine learning!

What can machine learning do?

𝑓 𝑥

Source: YOLO: Real-Time Object Detection

Source: http://grail.cs.washington.edu/projects/AudioToObama/

Source: Nvidia

Source: Pouff - Grocery Trip

Source: Google Deep Mind

Source: Boston Dynamics

𝑓 𝑥 1.23

Disclaimer

Introduction to R

What is R?

Open source

Language and environment

Numerical and graphical

Cross platform

What is R?

Active development

Large user community

Modular and extensible

10,000+ extensions

Source: http://redmonk.com/sogrady/2016/07/20/language-rankings-6-16/

Tool: language, platform, analytics

Source: O’Reilly 2015 Data Science Salary Survey

Demo 1 R Language Basics

Lab 1 R Language Basics

Classification

Count of Spam Words

Data → Function 𝑓(𝑥) → Category

Classification Algorithms

k-Nearest Neighbors

Decision Tree Classifier

Naïve Bayes Classifier

Support Vector Machine

Neural Network Classifier

Classification Algorithms

is sex male?

is age > 9.5?

is family > 2.5?

SurvivedDied

Survived

k-Nearest Neighbors, Decision Tree, Neural Network

k-Nearest Neighbors Classifier

Count of Spam Words

Supervised learning

Uses class of neighbors

k specifies how many

Simple and easy
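The bullets above can be sketched in a few lines. This is a minimal illustration in Python (not the workshop's R demo code); the function name `knn_predict`, the two features, and the sample emails are all made up for the example.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs; features are numeric tuples.
    """
    # Sort the training points by Euclidean distance to the query.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Keep the labels of the k closest points and return the most common one.
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical emails: (count of spam words, correct-spelling ratio).
emails = [
    ((1, 0.95), "not spam"),
    ((2, 0.90), "not spam"),
    ((0, 0.99), "not spam"),
    ((8, 0.40), "spam"),
    ((9, 0.55), "spam"),
    ((7, 0.30), "spam"),
]

print(knn_predict(emails, (6, 0.50), k=3))
```

In this toy data the two features are on very different scales, so the word count dominates the distance. Real data would normally be rescaled first.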

Decision Tree Classifier

Is count of spam words > 5?

Is correct-spelling ratio > 50%?

Is known contact?

Spam / Not spam

Supervised learning

Tree of decisions

Information gain

Simple and easy

is sex male?

is age > 9.5?

is family > 2.5?

Survived / Died
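The slide names information gain without defining it. One common definition is the entropy of the labels before a split minus the size-weighted entropy after it, sketched below in Python (not the workshop's R code). The eight passengers are invented for illustration and are not the real Titanic data.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(labels, groups):
    """Entropy of `labels` minus the size-weighted entropy of the split `groups`."""
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - remainder

# Made-up passengers: (is_male, outcome).
passengers = [
    (True, "died"), (True, "died"), (True, "died"), (True, "survived"),
    (False, "survived"), (False, "survived"), (False, "survived"), (False, "died"),
]
labels = [outcome for _, outcome in passengers]
males = [outcome for is_male, outcome in passengers if is_male]
females = [outcome for is_male, outcome in passengers if not is_male]

# Gain from asking "is sex male?" on this toy data.
print(round(information_gain(labels, [males, females]), 3))
```

A tree builder would compute this for every candidate question and split on the one with the largest gain.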

Artificial Neuron

inputs → neuron → outputs

$y_k = \varphi\left(\sum_{j=0}^{m} w_{kj} x_j\right)$
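The neuron formula above is a weighted sum passed through an activation function. Below is a minimal Python sketch (not the workshop's R code), assuming tanh as the activation and the usual convention that the first input is fixed at 1 so its weight acts as a bias. The weights and inputs are hypothetical.

```python
import math

def neuron_output(weights, inputs, activation=math.tanh):
    """Compute activation(sum_j w_j * x_j) for a single neuron."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum)

# Hypothetical values; inputs[0] = 1.0 is the bias input.
weights = [0.5, -1.0, 2.0]
inputs = [1.0, 0.5, 0.25]

# Weighted sum is 0.5 - 0.5 + 0.5 = 0.5, so this prints tanh(0.5).
print(neuron_output(weights, inputs))
```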

Artificial Neural Network

input → hidden → output

Forward propagation

Backward propagation

Supervised learning

Neurons in a brain

Weighted connections

Complex

Real-World Examples

Should we approve this loan?

Will this customer buy from us?

Should we replace this part?

Does this person have cancer?

Iris Data Set

Iris Setosa Iris Versicolor Iris Virginica

Photos by Radomił Binek, Danielle Langlois, and Frank Mayfield

Fisher’s Iris Data

Species  Petal Length  Petal Width  Sepal Length  Sepal Width
setosa   1.1           0.1          4.3           3
setosa   1.4           0.2          4.4           2.9
setosa   1.3           0.2          4.4           3
setosa   1.3           0.2          4.4           3.2
setosa   1.3           0.3          4.5           2.3
…        …             …            …             …

Iris Data Set

Goal: Predict species based on petal and sepal measurements

Demo 2 - Classification

Insurance Policy Risk Data Set

Insurance Policy Risk

Gender  State  State Rate  Height (cm)  Weight (kg)  BMI   Age  Risk
Male    MA     0.01        184          67.8         20.0  77   High
Male    VA     0.14        163          89.4         33.6  82   High
Female  NY     0.09        170          81.2         28.1  31   Low
Male    TN     0.12        175          99.7         32.6  39   Low
Female  FL     0.11        184          72.1         21.3  68   High
…       …      …           …            …            …     …    …

Insurance Policy Rates Data Set

Insurance Policy Rates

Gender  State  State Rate  Height (cm)  Weight (kg)  BMI   Age  Rate
Male    MA     0.01        184          67.8         20.0  77   0.33
Male    VA     0.14        163          89.4         33.6  82   0.87
Female  NY     0.09        170          81.2         28.1  31   0.01
Male    TN     0.12        175          99.7         32.6  39   0.02
Female  FL     0.11        184          72.1         21.3  68   0.15
…       …      …           …            …            …     …    …

Lab 2A – Classification (Easy)

Goal: Predict species based on petal and sepal measurements

Lab 2B – Classification (Hard)

Goal: Predict the risk of an insurance policy

Regression

Data → Function 𝑓(𝑥) → Number (e.g., 1.23)

Regression Algorithms

Linear Regression

Polynomial Regression

Lasso Regression

ElasticNet Regression

Neural Network Regression

Regression Algorithms

Simple Linear, Multiple Linear, Neural Network

Simple Linear Regression

Relationship

Linear model

y = m · x + b

Parameters estimated
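"Parameters estimated" means the slope m and intercept b are computed from the data. A minimal Python sketch (not the workshop's R code) using ordinary least squares follows. The function name `fit_line` and the four data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Estimate slope m and intercept b of y = m*x + b by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # The fitted line always passes through the point of means.
    b = mean_y - m * mean_x
    return m, b

# Made-up points lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
m, b = fit_line(xs, ys)
print(m, b)
```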

Multiple Linear Regression

Similar to SLR

Multiple variables

Multiple slopes

Categorical variables

Neural Network Regression

Similar to NN classifier

Numeric output

Real-World Examples

How much profit will we make?

What will the price be tomorrow?

How many units will they buy?

How long until this part fails?

Demo 3 - Regression

Goal: Predict petal width

Lab 3A – Regression (Easy)

Goal: Predict petal width

Lab 3B – Regression (Hard)

Goal: Predict mortality rate

Clustering

Income

Data → Function 𝑓(𝑥) → Group

Clustering Algorithms

k-Means

Hierarchical clustering

Expectation maximization

k-Means Clustering

Income

Unsupervised learning

Specify k (# of clusters)

Algorithm finds centers

Random restarts

Source: Wikipedia
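The k-means steps above (choose k, let the algorithm find centers, repeat from several random starts) can be sketched as follows. This is a minimal Python illustration of Lloyd's algorithm (not the workshop's R code); the parameter names, default values, and sample points are assumptions made for the example.

```python
import math
import random

def kmeans(points, k, iterations=20, restarts=5, seed=0):
    """Lloyd's algorithm with random restarts.

    Returns (centers, total squared distance) for the best restart.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(restarts):
        centers = rng.sample(points, k)  # start from k distinct data points
        for _ in range(iterations):
            # Assignment step: attach each point to its nearest center.
            clusters = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
                clusters[nearest].append(p)
            # Update step: move each center to the mean of its cluster.
            centers = [
                tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centers[i]
                for i, cluster in enumerate(clusters)
            ]
        cost = sum(min(math.dist(p, c) for c in centers) ** 2 for p in points)
        # Keep the restart with the lowest total squared distance.
        if best is None or cost < best[1]:
            best = (centers, cost)
    return best

# Two made-up, well-separated groups of points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, cost = kmeans(points, k=2)
print(sorted(centers), cost)
```

Restarts matter because a single run can settle in a poor local optimum, so the sketch keeps the lowest-cost run.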

Hierarchical Clustering

a b c d e f

abcdef

Tree of connectedness

Cuts create clusters

Real-World Examples

What are our market segments?

How to group our documents?

Which products to recommend?

Demo 4 - Clustering

Goal: Group flowers by similarity

Lab 4A – Clustering (Easy)

Goal: Group flowers by similarity

Lab 4B – Clustering (Hard)

Goal: Group insurance policies

Ensemble Learning

Wisdom of the Crowds

𝑓1(𝑥), 𝑓2(𝑥), 𝑓3(𝑥)

Types of Ensembles

Same Type of Model Different Types of Models

Ensemble Creation Techniques

Bagging

Boosting

Stacking

Ensemble Aggregation Techniques

Averaging

Majority Vote

Weighted Average

Weighted Majority Vote
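The aggregation techniques listed above combine the outputs of several models into one prediction. A minimal Python sketch (not the workshop's R code) follows; the function names, labels, and weights are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the class label predicted by the most models."""
    return Counter(predictions).most_common(1)[0][0]

def weighted_majority_vote(predictions, weights):
    """Return the class label with the largest total weight behind it."""
    totals = {}
    for label, weight in zip(predictions, weights):
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)

def weighted_average(predictions, weights):
    """Combine numeric predictions; equal weights give a plain average."""
    return sum(p * w for p, w in zip(predictions, weights)) / sum(weights)

# Hypothetical outputs of three models for a single input.
print(majority_vote(["rock", "mine", "rock"]))
print(weighted_majority_vote(["rock", "mine", "rock"], [0.2, 0.7, 0.2]))
print(weighted_average([1.0, 2.0, 4.0], [1, 1, 2]))
```

Voting applies to classifiers and averaging to regressors. In this sketch the weights are supplied by hand; in practice they might reflect each model's validation accuracy.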

Random Forest Classifier

Multiple trees

Created by bagging

Majority vote

More robust

Why Use Ensemble Learning?

More accurate

More robust

More stable

More complex

More CPU time

More art than science

Ensemble Learning Demo

V1 V2 V3 … V58 V59 V60 Class

0.02 0.03 0.04 … 0.00 0.01 0.00 rock

0.04 0.05 0.08 … 0.00 0.01 0.00 mine

0.02 0.05 0.10 … 0.01 0.01 0.01 rock

0.01 0.01 0.06 … 0.00 0.00 0.01 rock

0.07 0.06 0.04 … 0.00 0.01 0.01 mine

0.02 0.04 0.02 … 0.00 0.01 0.00 rock

… … … … … … … …

Demo 5 – ML in Practice

Goal: Predict rock or mine

Lab 5A – ML in Practice (Easy)

Goal: Predict rock or mine

Lab 5B – ML in Practice (Hard)

Goal: Predict risk class

Deep Learning

Deep Neural Network

input → hidden 1 → hidden 2 → hidden 3 → output

Abstractness

Deep Learning Techniques

Fully connected (DNN)

Convolutional (CNN)

Recurrent (RNN)

Generative Adversarial (GAN)

Deep Q Learning (DQN)

Deep Neural Network

Neural network

Multiple hidden layers

Non-linear activation

Fully connected

Convolutional Neural Networks (CNN)

Sparse

Convolutions

Filters

Pooling

Why Use Deep Learning?

More powerful

More accurate

Data synthesis

More complex

More training

Less transparent

Deep Learning Demo

28 x 28

Label Pixel 0 Pixel 1 Pixel 2 … Pixel 781 Pixel 782 Pixel 783

3 0 0 0 … 0 0 0

5 0 0 0 … 0 0 0

0 0 0 0 … 0 0 0

4 0 0 0 … 0 0 0

1 0 0 0 … 0 0 0

9 0 0 0 … 0 0 0

… … … … … … … …

28 x 28

Convolution 1: 5x5 stride, 20 filters, tanh

Pool 1

Convolution 2: 5x5 stride, 50 filters, tanh

Pool 2
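It can help to trace how the 28 x 28 input shrinks through these layers. The Python sketch below (not the workshop's R code) does that arithmetic under assumptions the slide does not state: the 5x5 convolutions use stride 1 with no padding, and each pooling layer uses a 2x2 window with stride 2, as in the classic LeNet design.

```python
def output_size(size, kernel, stride=1, padding=0):
    """Spatial size after sliding a kernel x kernel window over a size x size input."""
    return (size + 2 * padding - kernel) // stride + 1

def lenet_sizes(input_size=28):
    """Trace the feature-map width through conv1, pool1, conv2, pool2.

    Assumed (not from the slide): stride-1, unpadded 5x5 convolutions
    and 2x2 pooling with stride 2.
    """
    conv1 = output_size(input_size, kernel=5)
    pool1 = output_size(conv1, kernel=2, stride=2)
    conv2 = output_size(pool1, kernel=5)
    pool2 = output_size(conv2, kernel=2, stride=2)
    return [input_size, conv1, pool1, conv2, pool2]

sizes = lenet_sizes()
print(sizes)

# The second convolution has 50 filters, each yielding a pool2 x pool2 map,
# so this is the number of values handed to the fully connected layers.
flattened = 50 * sizes[-1] ** 2
print(flattened)
```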

Demo 6 – Deep Learning

Goal: Predict handwritten digits with a deep neural network

Lab 6A – Deep Learning (Easy)

Goal: Predict handwritten digits with a deep neural network

Lab 6B – Deep Learning (Hard)

Goal: Predict handwritten digits with CNN (LeNet)

Reinforcement Learning

NOTE: Add video of RL playing video game

State → 𝑓(𝑥) → Action

Agent ⇄ Environment (action, reward)

Car ⇄ World (position, destination)

Action replay

Optimal policy

Discounted reward

Markov decision process

Reinforcement Learning Demo

Grid World

States

Actions

Rewards

Optimal Policy

States: s1, s2, s3, s4

Actions: up, down, left, right

Rewards: s1, s2, s3 = -1; s4 = 10

Policy: s1 = down; s2 = right; s3 = up
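The policy above can be reproduced with a small Python sketch (not the workshop's R code). It rests on several assumptions the slide does not give: a 2 x 2 grid with s1 top-left, s2 bottom-left, s3 bottom-right, and s4 top-right; a wall between s1 and s4; rewards received on entering a state; and a discount factor of 0.9. It also uses value iteration, which needs a known model of the environment, rather than whatever learning method the demo itself uses.

```python
ACTIONS = ["up", "down", "left", "right"]

# Assumed layout: legal moves per state; s4 is the terminal goal state.
MOVES = {
    "s1": {"down": "s2"},
    "s2": {"up": "s1", "right": "s3"},
    "s3": {"left": "s2", "up": "s4"},
}
REWARD = {"s1": -1, "s2": -1, "s3": -1, "s4": 10}  # reward for entering a state

def step(state, action):
    """Deterministic transition; blocked moves leave the agent where it is."""
    next_state = MOVES[state].get(action, state)
    return next_state, REWARD[next_state]

def value_iteration(gamma=0.9, sweeps=50):
    """Return (state values, greedy policy) for the assumed grid world."""
    values = {s: 0.0 for s in REWARD}  # terminal s4 keeps value 0
    for _ in range(sweeps):
        for s in MOVES:
            # Bellman update: best immediate reward plus discounted future value.
            values[s] = max(r + gamma * values[n] for n, r in (step(s, a) for a in ACTIONS))
    policy = {}
    for s in MOVES:
        policy[s] = max(ACTIONS, key=lambda a: step(s, a)[1] + gamma * values[step(s, a)[0]])
    return values, policy

values, policy = value_iteration()
print(policy)
print(values)
```

Under these assumptions the discounted values fall as states get farther from s4, and the greedy policy follows the path s1 → s2 → s3 → s4.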

Tic-Tac-Toe

ML in Practice

What is the machine learning process?

Find a question

Prepare the data

Train the model

Evaluate the model

Deploy the model

Monitor the model

Creating accurate and robust models is not easy

Goodness of Fit

Underfit, Good fit, Overfit

Curse of Dimensionality

Movie Break

Demo 8 – ML in Practice

Goal: Predict survivors of the Titanic

Lab 8A – ML in Practice (Easy)

Goal: Predict survivors of the Titanic

Lab 8B – ML in Practice (Hard)

Goal: Predict risk in practice

ML in Production

How to Deploy to Production

Deploy to web app (Shiny)

Deploy to cloud (Azure ML)

Deploy to server (ML Server)

Deploy to any app (ONNX)

Conclusion

This is just the tip of the iceberg!

Ensemble Learning

Deep Learning

Agent ⇄ Environment (action, reward)

Where do we go from here?

Where to Go Next

DataCamp: https://www.datacamp.com

Pluralsight: https://www.pluralsight.com

Coursera: https://www.coursera.org

Pluralsight Courses

Data Science with R

Data Science: The Big Picture

Deep Learning: The Big Picture

Exploratory Data Analysis with R

Data Visualization with R (3-part)

https://www.pluralsight.com/authors/matthew-renze

www.matthewrenze.com

Feedback

Very important to me!

What did you like?

What could I improve?

Conclusion

1. Intro to ML and R

2. Classification

3. Regression

4. Clustering

5. ML in Practice

Are you prepared?

Is your organization?

Is our world prepared?

Contact Info

Matthew Renze

Data Science Consultant

Renze Consulting

Twitter: @matthewrenze

Email: info@matthewrenze.com

Website: www.matthewrenze.com

Thank You! : )
