CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a...

34
CSC412: Variational Autoencoders David Duvenaud

Transcript of CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a...

Page 1: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

CSC412: Variational Autoencoders

David Duvenaud

Page 2: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

Class Projects

• Focus is on research skills, providing context for results, evidence for claims

• Excuse to start larger project

Page 3: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

Variational Inference• Directly optimize the parameters phi of an

approximate distribution q(z|x, phi) to match p(z|x, theta)

• What if there is a local latent variable per-datapoint, and some global parameters? e.g. Bayesian PCA, generative image models, topic models

• Directly optimize the parameters phi_i of each approximate distribution q(z_i|x_i, phi_i) to match p(z_i|x_i, theta)

Page 4: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

SVI Algorithm:

• 1. Sample z from q(z|x, phi)

• 2. Return log p(z, x | theta) - log q(z | x, phi)

• In this setting, only one set of latent params z, as in a Bayesian neural net

Page 5: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

SVI w/ per-datapoint latents:

1. Sample x_i from dataset

2. Optimize phi_i to minimize KL(q(z_i | x_i, phi_i)|| p(z_i | x_i, theta))

3. Sample z from q(z_i|x_i, phi_i)

4. Return log p(z_i, x_i | theta) - log q(z_i | x_i, phi_i)

Page 6: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

Variational Autoencoder:

1. Sample x_i from dataset

2. Compute phi_i as a function of x, and recognition params phi_r: phi_i = f(x, phi_r)

3. Sample z from q(z_i|x_i, phi_i)

4. Return log p(z_i, x_i | theta) - log q(z_i | x_i, phi_i)

Page 7: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

Consequences of using a recognition network

• Don’t need to re-optimize phi_i each time theta changes - much faster

• Recognition net won’t necessary give optimal phi_i

• Can have fast test-time inference (vision)

• Can train recognition net with gradient descent

• Could also differentiate through optimization

Page 8: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

Simple but not obvious

• It took a long time get here!

• Independently developed as denoising autoencoders (Bengio et al.) and amortized inference (many others)

• Helmholtz machine - same idea in 1995 but used discrete latent variables

Page 9: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,
Page 10: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

Autoencoder Motivation

• Want compact representation of data

• x = dec(enc(x))

• Need to prevent enc = dec = identity

• So, add noise to encoding

• Gives VAE bound but with a free parameter

Page 11: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

Benefits of compact latent code

• http://www.dpkingma.com/sgvb_mnist_demo/demo.html

• Nearby z’s give similar x

• Recent work on ‘disentangling’ latent rep

Page 12: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

Code examples

• Show VAE code

Page 13: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

Variations: Decoder

• Orginally, p(x|z) = N(x | dec(z, theta), sigma I)

• Final step has independence assumption, causes blurry samples

• p(x|z) can be anything: rnn, pixelRNN, real NVP, de_convolutional net

Page 14: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,
Page 15: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

Variations• Decoder often looks like inverse of encoder

• Encoders can come from supervised learning

Learning Deconvolution Network for Semantic Segmentation http://arxiv.org/abs/1505.04366.

Page 16: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

Text autoencoders

• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz, Samy Bengio

Page 17: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

TextVAE-Interpola*on

Page 18: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

What is a molecule?Graph SMILES string

Page 19: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

Repurposing text autoencoders

c1ccccc1 c1ccccc1

ENCODERNeural Network

DECODERNeural Network

CONTINUOUS MOLECULARREPRESENTATION

Latent Space

Discrete Structure SMILES

Discrete Structure SMILES

Can be trained on unlabeled data

Page 20: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

Map of 220,000 Drugs

Page 21: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

Map of 100,000 OLEDs

Page 22: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

Random Organic LEDs

F S S F O OO S

N N N

HS S S S SH

HO SHH2N NH2

O

FHNH2N

NH3

F F

HO F

N NH2

OHSHSO

F S FH2S

OHHO OH

FHS

NH2

OH

O

N SH

O

NH2H2N FHF

SHS

S

F

H2O

SHSHS F

CH4

H2N OHN

F F

HO S S S S S S S S S S S SH

N

S N

NO

O

N

N

N N

N

O O

OO

S O

N

N N

NN N

N

NN

N

NN N

NN

NS

N

N N

N

N

N N

N

N

N

O

O

NN

S N

NNN

N

N NN

N

O

NN N

N

N

NNO

NN

NO

NN

N

NNN

N

Standard autoencoderVariational autoencoder

Page 23: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

Molecules near

O

NHN+

O

-O

Cl

Cl

NH

NH

N+

O

O-Cl

Cl

Br

N

N+

O

-OCl BrCl

O

NH

N+

O

O-Cl

Cl

SHCl

ON+

O

-O

NH

Cl

Cl

O

NH

N

N+

O

-OCl

F

O

ON+

O

-OCl

Cl

HN

O

NH

N+

O

O-Br

Cl O

ON+

O

O-Cl

Br O

NH

NH

N+

O

O-Cl

Cl

O

O

HN

N+

O

-O

HN Cl

Br

O

NH

N+

O

O-HO

Cl

Cl

ON+

O

O-Cl

Cl Br

ON+

O

O-Br

Cl

Cl

HN

N+

O

O-

Cl

Cl

O

O

NH

N

N+

O

O-Cl

Cl

Br

O

NH

N+

O

O-F

NH

Cl

O

O

HN

N+

O

O-HO

F

F

NHN+

O

-OF

NH

Cl

BrO

NH

N+

O

O-Br

Cl

Cl

O

O

N+

O

O-Cl

Cl Br

O

NH

N+

O

O-Cl

Br

O

ON+

O

O-Cl

Cl O

O

NH

N+

O

O-Cl

Cl

O

ON+

O

O-Cl

F

Cl

O

NH

N+

O

O-

Cl

Cl

Cl

O

NH

N+

O

O-Cl

Cl O

ON+

O

O-Cl

Cl

O

O

NH

N+

O

O-Cl

Cl

Br

O

NH

N

N+

O

O-Cl

Cl

Cl

O

NN+

O

O-

Br

Cl

Cl

O

NH

N+

O

O-Cl

O Cl

ON+

O

O-Cl

Cl

Cl

O

NH

N+

O

O-Cl

Cl Cl

O NH

N+O O-

Cl

N

O NH

N+O O-

Br

O

NH

N+

O

O-Br

Cl Cl

O

NH

N+

O

O-

NH

Cl

Br

O

N+

O

-O

N NH

Cl

O

O

NH

N+

O

O-Cl

NCl

Cl

O

O

N+

O

O-

NH

Cl Cl

O

NH

N+

O

O-Cl

Cl Br

HN

N+

O

O-

Cl

Cl

Br

O

NH

N+

O

O-Br

Cl

Br

O

NH

N+

O

O-

Cl

Cl

O

NH

N+

O

O-Cl

OCl

ON+

O

O-Cl

Cl

Br

O

NH

N

N+

O

O-Cl

Cl

Cl

O

NN+

O

O-

Cl

Cl Cl

O

NH

N+

O

O-Cl

Cl

OH

Page 24: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

NH2 O

N ONH

O

O

HN

HN O

O

O

O

N

O

O

O

HN

O

O

O

NH NO

O

O

ONH2

NHO

HN

HN

O

O

HN N

NHO

O

O

O

NH

O

O

O

O

NO

O

N

HN O

O

O O

NH

HNN

O

O

O

N

N

OO

O

NH2

N

HNO

O

OO

N

O

NHO

S

O

N

O

O

OO

OH

NH

O

OO

O

H2NS

HNO

O

O

OHS

NH

O

O

O

HO

SHN

HN O

O

O

O

NH2

NH

N

O

OH

OO

N

HN

O

O

O

F

N

N

O

O

O N

HN

NHOO

O

N

OO

HN

N

O

N

HNN

HNO

O

N

NO

OH

OO

NH

OHN

O

OO

H2N NN

HNH2NO

O

O

NH2

N

NHO

HN

Cl

O

O

N NH

OO

NHS

OH

F

N N

O

O

HN

HN

NHO

O

O

O

NN

NO

BrO

ONH

NN

OO

NH

O

OO

HO

HN NOO

O

N

NHO

O

N

Cl

Cl

O

O

ONH

HN

O

OO

O

HN

NHO

O

O

O

O

N

HO

HN

N O

O

HN

HN

NH

O

O

NH2

O

NH

N

HN

O

O

O

O

N

OHN

O

O

F

O

N

O

OO

NN

NH2

OO

NH2

NH

N

Cl

O

O

F

N

O

O

O

O

Molecules near

Page 25: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,
Page 26: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,
Page 27: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,
Page 28: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,
Page 29: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

No chemistry-specific design!

Page 30: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

Gradient-based optimization

Page 31: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

Gradient-based optimization

• Can’t necessarily start from given molecule, need to encode/decode

• Can’t go too far from start, wander into ‘holes’ or empty regions

Page 32: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

Encoder can look at decoder

• https://www.youtube.com/watch?v=Zt-7MI9eKEo

Page 33: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

Recent Extensions• Importance-Weighted Autoencoders (IWAE) Burda,

Grosse, Salakhudtinov

• Mixture distributions in posterior

• GAN-style ideas to avoid evaluating q(z|x), p(x|z), even p(z)

• Normalizing flows: Produce arbitrarily-complicated q(z|x)

• Incorporate HMC or local optimization to define q(z|x)

Page 34: CSC412: Variational Autoencodersduvenaud/courses/csc412/...• Generating Sentences from a Continuous Space. Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz,

Generative Adversarial Networks

• Also a latent-variable model: x = f(z), z from N(0,I)

• Trained adversarially, can also optimize p(x)

• Recent work on adversarially-trained VAEs:

• Match p(z, x) to q(z, x)

• Sample z from p(z), x from p(x|z)

• Sample x from data, z from q(z|x)