
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

Chelsea Finn, Pieter Abbeel, Sergey Levine

Presented by: Teymur Azayev

CTU in Prague

17 January 2019

Deep Learning

- Very powerful, expressive differentiable models.

- Flexibility is a double-edged sword.

How do we reduce the number of required samples?

Use prior knowledge (not in a Bayesian sense). This can come in the form of:

- Model constraint

- Sampling strategy

- Update rule

- Loss function

- etc.

Meta-learning

Learning to learn fast. Essentially, learning a prior from a distribution of tasks. Several recent successful approaches:

- Model-based meta-learning [Adam Santoro et al.], [JX Wang et al.], [Yan Duan et al.]

- Metric-based meta-learning [Gregory Koch, Richard Zemel, and Ruslan Salakhutdinov], [Oriol Vinyals et al.]

- Optimization-based meta-learning [Sachin Ravi and Hugo Larochelle], [Marcin Andrychowicz et al.]

MAML

Model-Agnostic Meta-Learning

Main idea: learn a parameter initialization for a distribution of tasks, such that given a new task, a small number of examples (gradient updates) suffices.

Definitions

A task $\mathcal{T}_i \sim p(\mathcal{T})$ is defined as a tuple $(H_i, q_i, \mathcal{L}_{\mathcal{T}_i})$ consisting of:

- a time horizon $H_i$, where for supervised learning $H_i = 1$

- an initial state distribution $q_i(x_0)$ and a state transition distribution $q_i(x_{t+1} \mid x_t)$

- a task loss function $\mathcal{L}_{\mathcal{T}_i} \to \mathbb{R}$

- the task distribution $p(\mathcal{T})$

Losses

- $\theta^*_i$ is the optimal parameter vector for task $\mathcal{T}_i$

- $\theta'_i = \theta - \alpha \nabla_\theta \mathcal{L}_{\mathcal{T}_i}(f_\theta)$ are the parameters obtained for task $\mathcal{T}_i$ after a single gradient update (1)

- $\min_\theta \sum_{\mathcal{T}_i \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_i}(f_{\theta'_i})$ is the meta-objective (2)

Algorithm
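The original slide showed the MAML pseudocode from the paper, which did not survive the transcript. As a stand-in, below is a minimal sketch of the two nested updates, equations (1) and (2) above, in Python with JAX; the MLP architecture, step sizes alpha and beta, and the (x_train, y_train, x_test, y_test) task format are illustrative assumptions, not taken from the slides.

```python
# A minimal MAML sketch (illustrative, not the authors' code).
import jax
import jax.numpy as jnp

def init_params(key, sizes=(1, 40, 40, 1)):
    # One (W, b) pair per layer of a small MLP (architecture assumed).
    params = []
    for m, n in zip(sizes[:-1], sizes[1:]):
        key, sub = jax.random.split(key)
        params.append((0.1 * jax.random.normal(sub, (m, n)), jnp.zeros(n)))
    return params

def forward(params, x):
    for W, b in params[:-1]:
        x = jnp.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

def mse(params, x, y):
    return jnp.mean((forward(params, x) - y) ** 2)

def inner_update(params, x, y, alpha=0.01):
    # Equation (1): one gradient step on the task's training data.
    grads = jax.grad(mse)(params, x, y)
    return jax.tree_util.tree_map(lambda p, g: p - alpha * g, params, grads)

def maml_loss(params, task):
    # Equation (2): loss of the adapted parameters on held-out task data.
    x_train, y_train, x_test, y_test = task
    adapted = inner_update(params, x_train, y_train)
    return mse(adapted, x_test, y_test)

def meta_step(params, tasks, beta=0.001):
    # Outer update: gradient of the summed post-adaptation losses,
    # differentiated through the inner update itself.
    total = lambda p: sum(maml_loss(p, t) for t in tasks)
    grads = jax.grad(total)(params)
    return jax.tree_util.tree_map(lambda p, g: p - beta * g, params, grads)
```

Differentiating through inner_update inside meta_step is what produces the second-order term that distinguishes MAML from ordinary joint pre-training on all tasks.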

Reinforcement learning

In the RL setting, the task loss is the negative expected return, $\mathcal{L}_{\mathcal{T}_i}(f_\phi) = -\mathbb{E}\left[\sum_{t=1}^{H_i} R_i(x_t, a_t)\right]$, estimated from sampled trajectories, so minimizing the loss maximizes reward.

Reinforcement learning adaptation

Sine wave regression

Tasks: regressing randomly generated sine waves

- amplitudes ranging over [0.1, 5]

- phases ranging over [0, 2π]

- input points sampled uniformly from [−5, 5]
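As a concrete illustration of this task distribution, here is a small sketch of how one such regression task could be sampled; the shot count K = 10 is an assumed value, not from the slides.

```python
import numpy as np

def sample_sine_task(rng, K=10):
    # Amplitude and phase ranges as listed above.
    amplitude = rng.uniform(0.1, 5.0)
    phase = rng.uniform(0.0, 2.0 * np.pi)
    x = rng.uniform(-5.0, 5.0, size=(K, 1))  # inputs sampled uniformly
    y = amplitude * np.sin(x + phase)
    return x, y

rng = np.random.default_rng(0)
x_train, y_train = sample_sine_task(rng)
```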

Sine wave regression

Classification tasks

Omniglot

- 20 instances of 1623 characters from 50 different alphabets

- Each instance drawn by a different person

- Randomly select 1200 characters for training and the remaining for testing

MiniImagenet

- 64 training classes, 12 validation classes, and 24 test classes
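Both experiments use N-way, K-shot episodes: N classes are drawn per task, with K labelled examples to adapt on and a held-out query set to evaluate the adapted model. A rough sketch of episode sampling follows; the 5-way 1-shot setting and the class-to-image-list dict format are illustrative assumptions.

```python
import random

def sample_episode(dataset, n_way=5, k_shot=1, k_query=15):
    # dataset: dict mapping class name -> list of images (assumed format).
    classes = random.sample(sorted(dataset), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        images = random.sample(dataset[cls], k_shot + k_query)
        support += [(img, label) for img in images[:k_shot]]  # adaptation set
        query += [(img, label) for img in images[k_shot:]]    # evaluation set
    return support, query
```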

RL experiment

- rllab benchmark suite, MuJoCo simulator

- Gradient updates are computed using policy gradient algorithms.

- Tasks are defined by the agents having slightly different goals.

- Agents are expected to infer the new goal from the reward after receiving only one gradient update.
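To make the one-step adaptation concrete, here is a toy stand-in, not the rllab/MuJoCo setup: a 1-D Gaussian policy whose mean is adapted toward an unknown goal with a single REINFORCE-style gradient step. The environment, step size, and sample count are illustrative assumptions.

```python
import numpy as np

def adapt_to_goal(mean, goal, alpha=0.5, n_samples=500, sigma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    actions = rng.normal(mean, sigma, size=n_samples)
    rewards = -np.abs(actions - goal)  # reward: negative distance to goal
    # REINFORCE estimate of the gradient of expected reward w.r.t. the mean.
    grad = np.mean(rewards * (actions - mean) / sigma**2)
    return mean + alpha * grad         # one policy-gradient step

# A single gradient update, mirroring the adaptation setting above.
adapted_mean = adapt_to_goal(mean=0.0, goal=2.0)
```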

Conclusion

- Simple, effective meta-learning method

- A decent amount of follow-up work [Zhenguo Li et al.], [Risto Vuorio et al.]

- The concept extends to meta-learning other parts of the training procedure

Thank you for your attention

References

Marcin Andrychowicz et al. Learning to Learn by Gradient Descent by Gradient Descent. NIPS 2016.

Yan Duan et al. RL²: Fast Reinforcement Learning via Slow Reinforcement Learning. 2016.

Gregory Koch, Richard Zemel, and Ruslan Salakhutdinov. Siamese Neural Networks for One-shot Image Recognition. ICML 2015.

Zhenguo Li et al. Meta-SGD: Learning to Learn Quickly for Few-Shot Learning. 2017.

Matthias Plappert et al. Parameter Space Noise for Exploration. 2017.

Sachin Ravi and Hugo Larochelle. Optimization as a Model for Few-Shot Learning. ICLR 2017.

Adam Santoro et al. One-shot Learning with Memory-Augmented Neural Networks. ICML 2016.

Oriol Vinyals et al. Matching Networks for One Shot Learning. NIPS 2016.

JX Wang et al. Learning to Reinforcement Learn. 2016.

Risto Vuorio, Shao-Hua Sun, Hexiang Hu, and Joseph J. Lim. Model-Agnostic Meta-Learning for Multimodal Task Distributions. 2017.

Ignasi Clavera, Anusha Nagabandi, Simin Liu, Ronald S. Fearing, Pieter Abbeel, Sergey Levine, and Chelsea Finn. Learning to Adapt in Dynamic, Real-World Environments through Meta-Reinforcement Learning. 2018.