SciKit Learn: How to Standardize Your Data

How to

Standardize Your Data:

An ML Recipe

DAMIAN MINGLE, CHIEF DATA SCIENTIST, WPC Healthcare

@DamianMingle

GET THE FULL STORY

bit.ly/UseSciKitNow

What’s Standardization Anyway?

• A form of preprocessing, often described as “functions and transformers that change raw feature vectors into a representation that is more suitable for the downstream estimator”

• Shifting and rescaling each attribute so that it has a mean of 0 and a standard deviation of 1.
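Put as a formula, the definition above is z = (x - mean) / std, applied to each attribute (column) separately. A minimal sketch, assuming NumPy is available; the toy array `X` and the helper name `standardize` are made up for illustration and are not part of the original slides:

```python
import numpy as np

def standardize(X):
    """Return a copy of X with each column shifted to mean 0 and scaled to std 1."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)    # population standard deviation (ddof=0)
    std[std == 0] = 1.0    # leave constant columns unscaled to avoid dividing by zero
    return (X - mean) / std

# Hypothetical toy data: two attributes on very different scales.
X = np.array([[1.0, 100.0],
              [2.0, 200.0],
              [3.0, 300.0]])

Z = standardize(X)
print(Z.mean(axis=0))  # approximately 0 for each column
print(Z.std(axis=0))   # approximately 1 for each column
```

After the transformation both columns hold the same values, because they differed only in scale to begin with.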

Why Standardization Matters

• It’s a common requirement of many machine learning estimators

• Estimators may behave badly when attributes sit on very different scales

• It’s useful for models that rely on the distribution of attributes, such as Gaussian processes.
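One way a model can "behave badly" is when it relies on distances: an attribute measured in large units can swamp the others. The sketch below is only an illustration of that effect, using hypothetical (age, income) records and a made-up helper `age_share`; it assumes NumPy and is not taken from the slides:

```python
import numpy as np

# Hypothetical records: (age in years, income in dollars).
X = np.array([[25.0, 50_000.0],
              [60.0, 50_500.0],
              [26.0, 52_000.0]])

def age_share(u, v):
    """Fraction of the squared Euclidean distance between u and v that comes from age."""
    d2 = (u - v) ** 2
    return float(d2[0] / d2.sum())

# Standardize each column with the mean and std of these three records.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

raw_share = age_share(X[0], X[1])  # 35-year age gap vs. 500-dollar income gap
std_share = age_share(Z[0], Z[1])  # same two records after standardization

print(raw_share)  # well under 1% of the distance
print(std_share)  # most of the distance
```

On the raw values the 35-year age gap barely registers next to a 500-dollar income gap; after standardization it accounts for most of the distance between the two records.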

Power in SciKit Learn

• Preprocessing

• Clustering

• Regression

• Classification

• Dimensionality Reduction

• Model Selection

Let’s Look at an ML Recipe

Standardization

The Imports

from sklearn.datasets import load_iris

from sklearn import preprocessing

Separate Features from Target

iris = load_iris()

print(iris.data.shape)

X = iris.data

y = iris.target

Standardize the Features

standardized_X = preprocessing.scale(X)
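A quick way to confirm that the call did what the earlier slide describes is to inspect the column means and standard deviations of the result. A small self-contained check, assuming NumPy and scikit-learn are installed; the name `X_scaled` is chosen here for illustration:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn import preprocessing

X = load_iris().data
X_scaled = preprocessing.scale(X)

col_means = X_scaled.mean(axis=0)
col_stds = X_scaled.std(axis=0)

print(col_means)  # each value very close to 0 (up to floating-point error)
print(col_stds)   # each value very close to 1
```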

Standardization Recipe

# Standardize the data attributes for the Iris dataset.

from sklearn.datasets import load_iris

from sklearn import preprocessing

# load the iris dataset

iris = load_iris()

print(iris.data.shape)

# separate the data from the target attributes

X = iris.data

y = iris.target

# standardize the data attributes (mean 0, standard deviation 1 per column)

standardized_X = preprocessing.scale(X)
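The recipe scales the whole dataset in one call. When a model is later evaluated on held-out data, a common variation is to learn the mean and standard deviation from the training rows only and reuse them on the test rows. The sketch below shows that variation with scikit-learn's `StandardScaler`; it is not part of the original recipe, and the `test_size` and `random_state` values are arbitrary choices for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = load_iris()
X, y = iris.data, iris.target

# Arbitrary illustrative split; any split would do.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)  # learn mean/std from the training rows only
X_test_std = scaler.transform(X_test)        # reuse those statistics on the held-out rows

print(scaler.mean_)   # per-column means learned from X_train
print(scaler.scale_)  # per-column standard deviations learned from X_train
```

The training columns come out with mean 0 and standard deviation 1; the test columns will only be approximately so, since they are scaled with the training statistics.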

Resources

• Society of Data Scientists

• SciKit Learn

• Also:

  • Scaling features to a range (MinMaxScaler or MaxAbsScaler)

  • Scaling sparse data (MaxAbsScaler, or StandardScaler with with_mean=False)

  • Scaling data with outliers (RobustScaler)
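For a feel of how the scalers listed above differ, here is a small sketch that runs three of them, each with default settings, on the same column. The data is hypothetical, with one deliberate outlier, and the variable names are chosen for illustration only:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, MaxAbsScaler, RobustScaler

# Hypothetical single-feature data with one large outlier.
X = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])

X_minmax = MinMaxScaler().fit_transform(X)  # maps the column minimum to 0 and maximum to 1
X_maxabs = MaxAbsScaler().fit_transform(X)  # divides by the largest absolute value
X_robust = RobustScaler().fit_transform(X)  # centers on the median, scales by the interquartile range

print(X_minmax.ravel())
print(X_maxabs.ravel())
print(X_robust.ravel())
```

With the range-based scalers the outlier squeezes the other four values into a narrow band near 0, while RobustScaler keeps them spread out around the median.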
