Automatic Machine Learning, AutoML

Automatic Machine Learning

By: Himadri Mishra, 13074014

Overview: What is Machine Learning?

● Subfield of computer science

● Evolved from the study of pattern recognition and computational learning theory in artificial intelligence

● Gives computers the ability to learn without being explicitly programmed

● Explores the study and construction of algorithms that can learn from and make predictions on data

Basic Flow of Machine Learning

Overview: Why Machine Learning?

● Some tasks are difficult to define algorithmically. Example: Learning to recognize objects.

● High-value predictions that can guide better decisions and smart actions in real time without human intervention

● Machine learning is a technology that helps analyze large volumes of big data

What is Automatic Machine Learning?

● Research area that targets progressive automation of machine learning

● Also known as AutoML

● Focuses on end users without expert knowledge

● Offers new tools to Machine Learning experts.
○ Perform architecture search over deep representations
○ Analyse the importance of hyperparameters
○ Development of flexible software packages that can be instantiated automatically in a data-driven way

● Follows the paradigm of Programming by Optimization (PbO)

Examples of AutoML

● AutoWEKA: Approach for the simultaneous selection of a machine learning algorithm and its hyperparameters

● Deep Neural Networks: notoriously dependent on their hyperparameters; modern optimizers have achieved better results in setting them than humans (Bergstra et al., Snoek et al.).

● Making a science of model search: a complex computer vision architecture could automatically be instantiated to yield state-of-the-art results on 3 different tasks: face matching, face identification, and object recognition.

Methods of AutoML

● Bayesian optimization

● Regression models for structured data and big data

● Meta learning

● Transfer learning

● Combinatorial optimization

An AutoML Framework

Modules of AutoML Framework, unraveled

● Data Pre-Processing

● Problem Identification and Data Splitting

● Feature Engineering

● Feature Stacking

● Application of various models to data

● Decomposition

● Feature Selection

● Model selection and HyperParameter tuning

● Evaluation of Model

Data Pre-Processing

● Tabular data is the most common way of representing data in machine learning or data mining

● Data must be converted to a tabular form

Problem Identification and Data Splitting

Types of Labels

● Single column, binary values (Binary Classification)

● Single column, real values (Regression problem)

● Multiple column, binary values (Multi-Class Classification)

● Multiple column, real values (Multiple target Regression problem)

● Multilabel Classification

● Stratified KFold splitting for classification

● Plain KFold splitting for regression
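The label types and splitting rules above could be sketched roughly as follows with scikit-learn. The helper names `identify_problem` and `make_splitter` are hypothetical, and the one-hot heuristic used to tell multi-class from multilabel is an assumption made for illustration.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold


def identify_problem(y):
    """Guess the problem type from the shape and values of the label array."""
    y = np.asarray(y)
    is_binary = np.isin(y, (0, 1)).all()
    if y.ndim == 1 or y.shape[1] == 1:
        return "binary_classification" if is_binary else "regression"
    if not is_binary:
        return "multi_target_regression"
    # Assumption: one-hot rows (exactly one 1 per row) mean multi-class;
    # any other set of binary columns is treated as multilabel.
    return "multiclass" if (y.sum(axis=1) == 1).all() else "multilabel"


def make_splitter(problem, n_splits=5, seed=0):
    """Stratified folds for single-label classification, plain folds otherwise."""
    if problem in ("binary_classification", "multiclass"):
        # For one-hot multi-class labels, pass y.argmax(axis=1) to split().
        return StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return KFold(n_splits=n_splits, shuffle=True, random_state=seed)


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y_cls = np.array([0] * 80 + [1] * 20)  # imbalanced binary labels
y_reg = rng.normal(size=100)           # real-valued target

problem = identify_problem(y_cls)
splitter = make_splitter(problem)
# Stratification keeps the class ratio of the full data in every validation fold.
fold_positive_rates = [y_cls[val].mean() for _, val in splitter.split(X, y_cls)]
```

Stratification matters most for imbalanced labels, where a plain split can leave a fold with very few positive samples.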

Feature Engineering

Types of Variables

● Numerical Variables
○ No Processing Required

● Categorical Variables
○ Label Encoders
○ One Hot Encoders

● Text Variables
○ Count Vectorize
○ TF-IDF vectorize
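A minimal sketch of the encoders named above, using their scikit-learn counterparts. The toy arrays `colors` and `texts` are made up for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

colors = np.array(["red", "green", "blue", "green"])
texts = ["auto ml tunes models", "models need tuning", "ml models learn"]

# Label encoding: one integer per category (classes are sorted alphabetically).
label_codes = LabelEncoder().fit_transform(colors)

# One-hot encoding: one binary column per category; expects a 2-D input.
one_hot = OneHotEncoder(handle_unknown="ignore").fit_transform(colors.reshape(-1, 1))

# Text: raw token counts versus TF-IDF weights, both returned as sparse matrices.
counts = CountVectorizer().fit_transform(texts)
tfidf = TfidfVectorizer().fit_transform(texts)
```

Label encoding implies an ordering between categories, which tree-based models tolerate better than linear models; one-hot encoding avoids that implied ordering at the cost of more columns.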

Feature Stacking

● Two Kinds of Stacking
○ Model Stacking
■ An Ensemble Approach
■ Combines the power of diverse models into a single model
○ Feature Stacking
■ Different features, after processing, are combined

● Our Stacker Module is a feature stacker
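Feature stacking in this sense amounts to joining the processed feature blocks column-wise. The sketch below assumes one dense numeric block and one sparse TF-IDF block; the data is made up and this is not the stacker module itself.

```python
import numpy as np
from scipy import sparse
from sklearn.feature_extraction.text import TfidfVectorizer

numeric = np.array([[1.0, 20.0], [2.0, 30.0], [3.0, 40.0]])  # dense block
texts = ["good product", "bad product", "good service"]
text_features = TfidfVectorizer().fit_transform(texts)       # sparse block

# Stack column-wise; the result stays sparse, so large text blocks remain cheap.
stacked = sparse.hstack([sparse.csr_matrix(numeric), text_features]).tocsr()
```

When every block is dense, `numpy.hstack` does the same job.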

Application of models and Decomposition

● We should go for ensemble tree-based models:
○ Random Forest Regressor/Classifier
○ Extra Trees Regressor/Classifier
○ Gradient Boosting Machine Regressor/Classifier

● Can’t apply linear models without normalization
○ For dense features: Standard Scaler normalization
○ For sparse features: scale to unit variance only, without centering about the mean

● If the above steps give a “good” model, we can go to the hyperparameter optimization module, else continue

● For high-dimensional data, PCA is used to decompose

● For images, start with 10-15 components and increase the number as long as results improve

● For other kinds of data, start with 50-60 components

● For text data, we use Singular Value Decomposition after converting the text to a sparse matrix
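The normalization and decomposition rules above might look like this in scikit-learn. This is a sketch on synthetic data; the component counts are only starting points, as the slides suggest, and the toy documents are invented.

```python
import numpy as np
from sklearn.decomposition import PCA, TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
dense = rng.normal(loc=5.0, scale=3.0, size=(200, 100))

# Dense features: centre to zero mean and scale to unit variance.
dense_scaled = StandardScaler().fit_transform(dense)

# High-dimensional dense data: PCA, starting in the 50-60 component range.
dense_reduced = PCA(n_components=50, random_state=0).fit_transform(dense_scaled)

texts = [f"document {i} about topic {i % 7} and subject {i % 5}" for i in range(60)]
sparse_text = TfidfVectorizer().fit_transform(texts)

# Sparse features: with_mean=False skips centring, which would destroy sparsity.
sparse_scaled = StandardScaler(with_mean=False).fit_transform(sparse_text)

# Text: truncated SVD works directly on the sparse matrix.
text_reduced = TruncatedSVD(n_components=10, random_state=0).fit_transform(sparse_scaled)
```

`TruncatedSVD` is used for the text block because scikit-learn's `PCA` centres the data first, which is impractical for a large sparse matrix.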

Feature Selection

● Greedy Forward Selection
○ Selecting best features iteratively
○ Selecting features based on coefficients of model

● Greedy backward elimination

● Use GBM for normal features and Random Forest for sparse features for feature evaluation
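Greedy forward selection can be sketched as a loop that adds, at each step, the feature that most improves a cross-validated score. The function name, the logistic regression scorer, and the stopping rule below are assumptions for illustration, not the framework's actual implementation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def greedy_forward_selection(model, X, y, max_features=5, cv=3):
    """Add one feature at a time, keeping the one that most improves CV score."""
    selected, best_score = [], -np.inf
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        # Score every candidate feature together with the already selected ones.
        scores = {
            j: cross_val_score(model, X[:, selected + [j]], y, cv=cv).mean()
            for j in remaining
        }
        candidate = max(scores, key=scores.get)
        if scores[candidate] <= best_score:
            break  # no remaining feature improves the score
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = scores[candidate]
    return selected, best_score


X, y = make_classification(
    n_samples=200, n_features=10, n_informative=3, n_redundant=0, random_state=0
)
selected, score = greedy_forward_selection(LogisticRegression(max_iter=1000), X, y)
```

Backward elimination follows the same pattern in reverse: start from all features and drop the one whose removal hurts the score least.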

Model selection and HyperParameter tuning

Choice of Model and Hyperparameters

● Most important and fundamental process of Machine Learning

● Classification:
○ Random Forest
○ GBM
○ Logistic Regression
○ Naive Bayes
○ Support Vector Machines
○ k-Nearest Neighbors

● Regression:
○ Random Forest
○ GBM
○ Linear Regression
○ Ridge
○ Lasso
○ SVR
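Hyperparameter tuning for one of the listed models could be sketched with scikit-learn's `RandomizedSearchCV`. The search ranges below are hypothetical values chosen only for illustration, and random search stands in here for whichever optimizer the framework actually uses.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=300, n_features=20, random_state=0)

# Hypothetical search ranges, chosen only for illustration.
param_distributions = {
    "n_estimators": randint(20, 100),
    "max_depth": randint(2, 12),
    "min_samples_leaf": randint(1, 10),
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=8,            # number of sampled configurations
    cv=3,                # stratified 3-fold CV for classifiers
    scoring="roc_auc",
    random_state=0,
)
search.fit(X, y)
best_params, best_auc = search.best_params_, search.best_score_
```

Repeating the search for each candidate model and keeping the best cross-validated score gives a simple form of model selection.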

Evaluation of Model

Saving all Transformations on Train Data for reuse

Re-Use of saved transformations for Evaluation on validation set
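One way to save the transformations fitted on the training data and reuse them on the validation set is a scikit-learn `Pipeline` serialized with `pickle`. This is a sketch under that assumption; the steps chosen here are illustrative, not the framework's own.

```python
import pickle

import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=10, random_state=0)),
    ("model", RandomForestClassifier(n_estimators=50, random_state=0)),
])
pipeline.fit(X_train, y_train)  # transformations are learned on train data only

# Save the fitted transformations and model, then reload them for evaluation.
restored = pickle.loads(pickle.dumps(pipeline))
valid_auc = roc_auc_score(y_valid, restored.predict_proba(X_valid)[:, 1])
```

Fitting the scaler and PCA on the training split only, then applying them unchanged to the validation split, keeps validation data from leaking into the transformations.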

Current Research

Automatic Architecture selection for Neural Network

Automatically Tuned Neural Network

● Auto-Net is a system that automatically configures neural networks

● Achieved the best performance on two datasets in the human expert track of the recent ChaLearn AutoML Challenge

● Works by tuning:
○ layer-independent network hyperparameters
○ per-layer hyperparameters

● Auto-Net submission reached an AUC score of 90%, while the best human competitor (Ideal Intel Analytics) only reached 80%

● First time an automatically constructed neural network won a competition dataset
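The split between layer-independent and per-layer hyperparameters can be illustrated with a small configuration sampler. Everything in this sketch is hypothetical: the parameter names, the ranges, and the use of plain random sampling are assumptions for illustration and do not reproduce Auto-Net's actual search space or optimizer.

```python
import random


def sample_network_config(rng, max_layers=4):
    """Draw one random configuration from a hypothetical search space."""
    # Layer-independent hyperparameters: shared by the whole network.
    config = {
        "learning_rate": 10 ** rng.uniform(-5, -1),  # log-uniform
        "batch_size": rng.choice([32, 64, 128, 256]),
        "num_layers": rng.randint(1, max_layers),
    }
    # Per-layer hyperparameters: drawn separately for every layer.
    config["layers"] = [
        {
            "units": rng.choice([64, 128, 256, 512]),
            "activation": rng.choice(["relu", "tanh", "sigmoid"]),
            "dropout": rng.uniform(0.0, 0.5),
        }
        for _ in range(config["num_layers"])
    ]
    return config


rng = random.Random(0)
candidates = [sample_network_config(rng) for _ in range(5)]
```

An AutoML system would train a network for each candidate and let its optimizer propose the next configurations based on the validation scores.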

Conclusion

● Machine learning (ML) has achieved considerable successes in recent years and an ever-growing number of disciplines rely on it.

● However, its success crucially relies on human machine learning experts to perform various tasks manually

● The rapid growth of machine learning applications has created a demand for off-the-shelf machine learning methods that can be used easily and without expert knowledge

● AutoML is an open research topic and will very soon be challenging state-of-the-art results in various domains

Thank You
