
  • Variational Autoencoders - An Introduction

    Devon Graham

    University of British Columbia [email protected]

    Oct 31st, 2017

  • Table of contents

    Introduction

    Deep Learning Perspective

    Probabilistic Model Perspective

    Applications

    Conclusion

  • Introduction

    - Auto-Encoding Variational Bayes, Diederik P. Kingma and Max Welling, ICLR 2014

    - Generative model

    - Running example: want to generate realistic-looking MNIST digits (or celebrity faces, video game plants, cat pictures, etc.)

    - https://jaan.io/what-is-variational-autoencoder-vae-tutorial/

    - Deep Learning perspective and Probabilistic Model perspective

    3 / 30


  • Introduction - Autoencoders

    - Attempt to learn the identity function (a minimal code sketch follows this slide)

    - Constrained in some way (e.g., a small latent vector representation)

    - Can generate new images by giving different latent vectors to the trained network

    - Variational: use a probabilistic latent encoding

    4 / 30

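    To make the picture concrete, here is a minimal sketch of a plain (deterministic) autoencoder. The slides do not name a framework, so PyTorch is assumed, and the layer sizes (400 hidden units, a 20-dimensional latent vector) are illustrative choices rather than values from the talk.

      import torch
      import torch.nn as nn

      class Autoencoder(nn.Module):
          """Tries to reproduce its input through a small latent bottleneck."""
          def __init__(self, input_dim=784, hidden_dim=400, latent_dim=20):
              super().__init__()
              # Encoder: compress the 784-pixel image into a small latent vector z.
              self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU(),
                                           nn.Linear(hidden_dim, latent_dim))
              # Decoder: reconstruct the image from z.
              self.decoder = nn.Sequential(nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
                                           nn.Linear(hidden_dim, input_dim), nn.Sigmoid())

          def forward(self, x):
              z = self.encoder(x)              # the latent z is much smaller than x
              return self.decoder(z)

      model = Autoencoder()
      x = torch.rand(16, 784)                      # stand-in for a batch of MNIST digits
      x_tilde = model(x)
      loss = nn.functional.mse_loss(x_tilde, x)    # "learn the identity function"

    A VAE replaces this deterministic latent vector with a probabilistic encoding, as developed in the following slides.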

  • Deep Learning Perspective

    5 / 30

  • Deep Learning Perspective

    - Goal: Build a neural network that generates MNIST digits from random (Gaussian) noise (sketched in code after this slide)

    - Define two sub-networks: an Encoder and a Decoder

    - Define a Loss Function

    6 / 30

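    The stated goal, sketched in code: once a decoder has been trained, new digits are generated by feeding it random Gaussian noise. The decoder below is only a placeholder with assumed sizes (20-dimensional latent, 784 output pixels); untrained, it of course produces noise rather than digits.

      import torch
      import torch.nn as nn

      # Placeholder decoder: maps a latent vector z to 784 per-pixel probabilities.
      decoder = nn.Sequential(nn.Linear(20, 400), nn.ReLU(),
                              nn.Linear(400, 784), nn.Sigmoid())

      z = torch.randn(64, 20)                                  # random Gaussian noise
      x_probs = decoder(z)                                     # per-pixel Bernoulli parameters
      samples = torch.bernoulli(x_probs).view(-1, 1, 28, 28)   # 64 "generated" images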

  • Encoder

    - A neural network qθ(z|x) (a code sketch follows this slide)

    - Input: a datapoint x (e.g., a 28 × 28-pixel MNIST digit)

    - Output: an encoding z, drawn from a Gaussian density with parameters θ

    - |z| ≪ |x|

    7 / 30

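    A sketch of such an encoder, again assuming PyTorch and illustrative layer sizes: the network maps a flattened digit x to the mean and log-variance of a diagonal Gaussian over z, from which an encoding can be drawn.

      import torch
      import torch.nn as nn

      class Encoder(nn.Module):
          """q_theta(z|x): maps a flattened 28x28 image to a diagonal Gaussian over z."""
          def __init__(self, input_dim=784, hidden_dim=400, latent_dim=20):
              super().__init__()
              self.net = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
              self.mu = nn.Linear(hidden_dim, latent_dim)       # mean of the Gaussian
              self.logvar = nn.Linear(hidden_dim, latent_dim)   # log-variance of the Gaussian

          def forward(self, x):
              h = self.net(x)
              return self.mu(h), self.logvar(h)

      enc = Encoder()
      x = torch.rand(16, 784)                                   # a batch of flattened digits
      mu, logvar = enc(x)
      z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)   # draw z from the Gaussian

    Note that the assumed 20-dimensional encoding is far smaller than the 784-dimensional input, i.e., |z| ≪ |x|.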

  • Decoder

    - A neural network pφ(x|z), parameterized by φ (a code sketch follows this slide)

    - Input: the encoding z, i.e. the output from the encoder

    - Output: a reconstruction x̃, drawn from the distribution of the data

    - E.g., output the parameters for 28 × 28 Bernoulli variables

    8 / 30

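    A matching decoder sketch, under the same assumptions: it maps an encoding z to per-pixel Bernoulli parameters for a 28 × 28 image, from which a reconstruction x̃ can be sampled.

      import torch
      import torch.nn as nn

      class Decoder(nn.Module):
          """p_phi(x|z): maps an encoding z to parameters of 28x28 Bernoulli pixels."""
          def __init__(self, latent_dim=20, hidden_dim=400, output_dim=784):
              super().__init__()
              self.net = nn.Sequential(nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
                                       nn.Linear(hidden_dim, output_dim), nn.Sigmoid())

          def forward(self, z):
              return self.net(z)                       # per-pixel probabilities in [0, 1]

      dec = Decoder()
      z = torch.randn(16, 20)                          # encodings, e.g. from the encoder above
      x_probs = dec(z)                                 # Bernoulli parameters for each pixel
      x_tilde = torch.bernoulli(x_probs)               # one sampled reconstruction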

  • Loss Function

    - x̃ is reconstructed from z, where |z| ≪ |x̃|

    - How much information is lost when we go from x to z to x̃?

    - Measure this with the reconstruction log-likelihood: log pφ(x|z) (computed in the sketch after this slide)

    - Measures how effectively the decoder has learned to reconstruct x given the latent representation z

    9 / 30

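    For Bernoulli pixels, the reconstruction log-likelihood log pφ(x|z) is the negative binary cross-entropy between the decoder's pixel probabilities and the original image. A minimal sketch, showing only this reconstruction term and using stand-in tensors with the shapes assumed in the networks above:

      import torch
      import torch.nn.functional as F

      x = torch.rand(16, 784)                               # original images, values in [0, 1]
      x_probs = torch.rand(16, 784).clamp(1e-4, 1 - 1e-4)   # stand-in for decoder output p_phi(x|z)

      # log p_phi(x|z) for Bernoulli pixels = -(binary cross-entropy), summed over pixels
      recon_log_lik = -F.binary_cross_entropy(x_probs, x, reduction="none").sum(dim=1)
      print(recon_log_lik.shape)                             # one value per image in the batch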