
Learning to Warm-Start Bayesian Hyperparameter Optimization

and Task-Adaptive Ensemble of Meta-Learners for Few-Shot Classification

Jungtaek Kim (jtkim@postech.ac.kr)

Machine Learning Group, Department of Computer Science and Engineering, POSTECH,

77 Cheongam-ro, Nam-gu, Pohang 37673, Gyeongsangbuk-do, Republic of Korea

September 11, 2018


Table of Contents

Learning to Warm-Start Bayesian Hyperparameter Optimization
  Motivation
  Main Architecture
  Experiments

Task-Adaptive Ensemble of Meta-Learners for Few-Shot Classification
  Motivation
  Main Architecture
  Experiments


Learning to Warm-Start Bayesian Hyperparameter Optimization


Motivation

- Bayesian hyperparameter optimization usually starts from random initial points.

- Better initializations might help to speed up Bayesian hyperparameter optimization.

- A mapping from hyperparameters to validation error can be learned.

- We attempt to transfer prior knowledge about initializations to a new task, as sketched below.
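The idea can be summarized in a few lines of Python. This is a minimal sketch, not the method from the slides: `objective` stands in for the unknown validation error of a model trained with hyperparameters x, the GP-EI loop uses scikit-learn, and the "transferred" initial points are hard-coded here rather than predicted from dataset meta-features.

```python
# Minimal sketch: warm-starting Bayesian optimization with transferred
# initial points instead of random ones. `objective` is a stand-in for
# the true validation error (hypothetical, for illustration only).
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def objective(x):
    # placeholder for the validation error of a model trained with x
    return np.sin(3 * x[0]) + 0.1 * x[0] ** 2

def expected_improvement(gp, X_cand, y_best):
    mu, sigma = gp.predict(X_cand, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    gamma = (y_best - mu) / sigma          # minimization form
    return sigma * (gamma * norm.cdf(gamma) + norm.pdf(gamma))

def bayes_opt(initial_points, n_iter=20, bounds=(-2.0, 2.0)):
    X = [np.atleast_1d(x) for x in initial_points]
    y = [objective(x) for x in X]
    gp = GaussianProcessRegressor(normalize_y=True)
    for _ in range(n_iter):
        gp.fit(np.vstack(X), np.array(y))
        X_cand = np.random.uniform(*bounds, size=(1000, 1))
        ei = expected_improvement(gp, X_cand, min(y))
        x_next = X_cand[np.argmax(ei)]
        X.append(x_next)
        y.append(objective(x_next))
    return min(y)

# Cold start: random initial hyperparameters.
random_init = np.random.uniform(-2.0, 2.0, size=(3, 1))
# Warm start: initial points transferred from similar previous tasks
# (hard-coded here; the slides learn them from dataset meta-features).
transferred_init = np.array([[0.5], [-1.5], [1.0]])

print("random init:", bayes_opt(random_init))
print("warm start :", bayes_opt(transferred_init))
```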


Main Architecture

[Architecture diagram: two datasets are each passed through a deep feature extractor followed by a meta-feature extractor (fc layer, fc layer); all weights are shared between the two branches, and the output is the meta-feature distance between the two datasets.]
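A rough PyTorch sketch of how such a Siamese meta-feature extractor could look, under my reading of the diagram above; the layer sizes, the mean-pooling over examples, and the Euclidean distance are assumptions for illustration, not details given on the slide.

```python
# Sketch of a Siamese meta-feature extractor: both branches share weights,
# and the output is a distance between the meta-features of two datasets.
import torch
import torch.nn as nn

class MetaFeatureExtractor(nn.Module):
    def __init__(self, in_dim=512, meta_dim=64):
        super().__init__()
        # "Deep feature extractor": a CNN backbone in practice; a single
        # linear layer here for brevity (assumption).
        self.deep = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        # "fc layer / fc layer" from the diagram.
        self.fc = nn.Sequential(nn.Linear(256, 128), nn.ReLU(),
                                nn.Linear(128, meta_dim))

    def forward(self, dataset):
        # dataset: (num_examples, in_dim); mean-pool over examples so the
        # meta-feature is permutation-invariant (assumption).
        per_example = self.deep(dataset)
        return self.fc(per_example.mean(dim=0))

extractor = MetaFeatureExtractor()   # all weights are shared across branches

def meta_feature_distance(dataset_a, dataset_b):
    # Euclidean distance between the two datasets' meta-features.
    return torch.norm(extractor(dataset_a) - extractor(dataset_b))

d = meta_feature_distance(torch.randn(100, 512), torch.randn(80, 512))
print(d.item())
```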


Experiments (EI)
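For reference, the standard forms of the two acquisition functions compared in the following experiments, assuming a GP posterior with mean mu(x), standard deviation sigma(x), and incumbent best validation error y*; the exact settings used in the experiments, such as the value of kappa, are not given here.

```latex
% Expected improvement for minimization, with
% gamma(x) = (y^* - mu(x)) / sigma(x):
\mathrm{EI}(\mathbf{x}) = \sigma(\mathbf{x})
  \left[ \gamma(\mathbf{x}) \, \Phi(\gamma(\mathbf{x})) + \phi(\gamma(\mathbf{x})) \right]

% GP-UCB, written here in its lower-confidence-bound form for minimization,
% with exploration parameter kappa > 0:
\mathrm{UCB}(\mathbf{x}) = \mu(\mathbf{x}) - \kappa \, \sigma(\mathbf{x})
```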

[Figure: minimum validation error vs. iteration (0-20) with the EI acquisition function on eight datasets: (a) AwA2, (b) Caltech-101, (c) Caltech-256, (d) CIFAR-10, (e) CIFAR-100, (f) CUB200-2011, (g) MNIST, (h) VOC2012. Compared initializations: Random init. (Uniform), Random init. (Latin), Random init. (Halton), Nearest best init. (ADF), Nearest best init. (Bi-LSTM).]
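The Uniform, Latin, and Halton baselines in the legend are standard space-filling initial designs. A small sketch of how such initial hyperparameter points can be drawn, here with scipy.stats.qmc (my choice of library; the dimensions and bounds are hypothetical):

```python
# Three ways of drawing initial hyperparameter points compared in the plots:
# plain uniform sampling, Latin-hypercube sampling, and a Halton sequence.
import numpy as np
from scipy.stats import qmc

n_init, dim = 5, 3                       # e.g., 3 hyperparameters (assumption)
lower = np.array([1e-5, 0.0, 16.0])      # hypothetical bounds
upper = np.array([1e-1, 0.9, 512.0])

uniform = np.random.uniform(size=(n_init, dim))      # i.i.d. uniform in [0,1]^d
latin = qmc.LatinHypercube(d=dim).random(n_init)     # Latin-hypercube design
halton = qmc.Halton(d=dim).random(n_init)            # quasi-random Halton points

for name, unit in [("uniform", uniform), ("latin", latin), ("halton", halton)]:
    points = qmc.scale(unit, lower, upper)   # map [0,1]^d to the search bounds
    print(name, points.round(4))
```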


Experiments (UCB)

[Figure: minimum validation error vs. iteration (0-20) with the UCB acquisition function on the same eight datasets: (j) AwA2, (k) Caltech-101, (l) Caltech-256, (m) CIFAR-10, (n) CIFAR-100, (o) CUB200-2011, (p) MNIST, (q) VOC2012. Compared initializations: Random init. (Uniform), Random init. (Latin), Random init. (Halton), Nearest best init. (ADF), Nearest best init. (Bi-LSTM).]


Task-Adaptive Ensemble of Meta-Learners for Few-Shot Classification

Motivation

- Few-shot classification needs to generalize from training episodes and perform well on test episodes.

- The domain distribution seen by a meta-learner for few-shot classification is usually assumed to be fixed.

- In practice, the domain distribution can vary.

- We build an ensemble of several meta-learners, each of which is trained on episodes from a single dataset; see the sketch below.
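A minimal PyTorch sketch of a task-adaptive ensemble along these lines; the way the weights are computed here (a softmax over negative support-set losses) is an assumption for illustration, not necessarily the weighting used in the actual work.

```python
# Sketch of a task-adaptive ensemble: each meta-learner was trained on
# episodes from a single source dataset; for a new episode, their
# predictions on the query set are combined with task-dependent weights.
import torch
import torch.nn.functional as F

def task_adaptive_ensemble(meta_learners, support_x, support_y, query_x):
    losses, query_logits = [], []
    for learner in meta_learners:
        losses.append(F.cross_entropy(learner(support_x), support_y))
        query_logits.append(learner(query_x))
    # Learners that fit this task's support set better receive more weight
    # (assumed weighting scheme).
    weights = F.softmax(-torch.stack(losses), dim=0)
    stacked = torch.stack(query_logits)        # (n_learners, n_query, n_classes)
    return (weights[:, None, None] * stacked).sum(dim=0)

# Toy usage: three linear "meta-learners" on a 5-way episode (hypothetical).
meta_learners = [torch.nn.Linear(64, 5) for _ in range(3)]
support_x, support_y = torch.randn(25, 64), torch.randint(0, 5, (25,))
query_x = torch.randn(15, 64)
print(task_adaptive_ensemble(meta_learners, support_x, support_y, query_x).shape)
```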

Main Architecture

Experiments