Download - Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

Transcript
Page 1: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

Variational Inference and Generative Models

CS 285: Deep Reinforcement Learning, Decision Making, and Control

Sergey Levine

Page 2: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

Class Notes

1. Homework 3 due next week!

Page 3: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

Where we are in the course

RL algorithms advanced topics

Page 4: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

1. Probabilistic latent variable models

2. Variational inference

3. Amortized variational inference

4. Generative models: variational autoencoders

• Goals• Understand latent variable models in deep learning

• Understand how to use (amortized) variational inference

Today’s Lecture

Page 5: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

Probabilistic models

Page 6: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

Latent variable models

mixtureelement

Page 7: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

Latent variable models in general

“easy” distribution(e.g., Gaussian)

“easy” distribution(e.g., Gaussian)

“easy” distribution(e.g., conditional Gaussian)

Page 8: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

Latent variable models in RLconditional latent variable

models for multi-modal policieslatent variable models for

model-based RL

Page 9: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

Other places we’ll see latent variable models

Mombaur et al. ‘09Muybridge (c. 1870) Ziebart ‘08Li & Todorov ‘06

Using RL/control + variational inference to model human behavior

Using generative models and variational inference for exploration

Page 10: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

How do we train latent variable models?

Page 11: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

Estimating the log-likelihood

Page 12: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

The variational approximation

Page 13: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

The variational approximation

Jensen’s inequality

Page 14: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

A brief aside…

Entropy:

Intuition 1: how random is the random variable?

Intuition 2: how large is the log probability in expectation under itself

high

low

this maximizes the first part

this also maximizes the second part(makes it as wide as possible)

Page 15: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

A brief aside…

KL-Divergence:

Intuition 1: how different are two distributions?

Intuition 2: how small is the expected log probability of one distribution under another, minus entropy?

why entropy?

this maximizes the first part

this also maximizes the second part(makes it as wide as possible)

Page 16: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

The variational approximation

Page 17: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

The variational approximation

Page 18: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

How do we use this?

how?

Page 19: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

What’s the problem?

Page 20: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

Break

Page 21: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

What’s the problem?

Page 22: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

Amortized variational inference

how do we calculate this?

Page 23: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

Amortized variational inference

look up formula for entropy of a Gaussian

can just use policy gradient!

What’s wrong with this gradient?

Page 24: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

The reparameterization trick

Is there a better way?

most autodiff software (e.g., TensorFlow) will compute this for you!

Page 25: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

Another way to look at it…

this often has a convenient analytical form (e.g., KL-divergence for Gaussians)

Page 26: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

Reparameterization trick vs. policy gradient

• Policy gradient• Can handle both discrete and

continuous latent variables

• High variance, requires multiple samples & small learning rates

• Reparameterization trick• Only continuous latent variables

• Very simple to implement

• Low variance

Page 27: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

The variational autoencoder

Page 28: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

Using the variational autoencoder

Page 29: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

Conditional models

Page 30: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

Examples

Page 31: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

1. collect data

2. learn embedding of image & dynamics model (jointly)

3. run iLQG to learn to reach image of goal

a type of variational autoencoder with temporally decomposed latent state!

Page 32: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

Local models with images

Page 33: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

Local models with images

variational autoencoder with stochastic dynamics

Page 34: Variational Inference and Generative Modelsrail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-13.pdf · 2019-11-18 · The variationalautoencoder. Using the variational autoencoder.

We’ll see more of this for…

Mombaur et al. ‘09Muybridge (c. 1870) Ziebart ‘08Li & Todorov ‘06

Using RL/control + variational inference to model human behavior

Using generative models and variational inference for exploration