Composing Graphical Models with Neural Networks for...

Composing Graphical Models with Neural Networks for Structured Representations and Fast Inference

Matthew J. Johnson, David Duvenaud, Alexander B. Wiltschko, Sandeep R. Datta, Ryan P. Adams

Presented by Meet Taraviya, Joe Halabi, Derek Xu

Page 1: Composing Graphical Models with Neural Networks for ...web.cs.ucla.edu/~yzsun/classes/2020Winter_CS249/... · Composing Graphical Models with Neural Networks for Structured Representations

Motivation

Fitting Spiral Data

Compare two approaches

Probabilistic Graphical Models

● + Structured representations
● + Priors and uncertainty
● + Data and computational efficiency
● - Rigid assumptions
● - Feature engineering

Deep Learning

● + Flexible
● + Feature learning
● - Not interpretable
● - Difficult parameterization
● - Requires a lot of data

Proposal

SVAE: Combining PGMs with VAEs

● Use probabilistic graphical models (PGMs) to represent structured probability distributions over latent variables.
● Use variational autoencoders (VAEs) to learn the non-linear observation model.
● Thoroughly describe the inference and learning procedures for these models.

Background

Exponential family

Recall: CRF/MRF are special cases of the exponential family
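
As a concrete reminder of what "exponential family" means here, the following minimal sketch writes a 1-D Gaussian as exp(<eta, t(x)> - log Z(eta)) times a base measure and checks it against the usual density formula. The function names and the test values are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def natural_params(mu, sigma2):
    """Map (mean, variance) to the natural parameters eta of a 1-D Gaussian."""
    return np.array([mu / sigma2, -0.5 / sigma2])

def sufficient_stats(x):
    """Sufficient statistics t(x) = (x, x^2)."""
    return np.array([x, x * x])

def log_partition(eta):
    """Log-normalizer log Z(eta) for the parameterization above."""
    return -eta[0] ** 2 / (4.0 * eta[1]) - 0.5 * np.log(-2.0 * eta[1])

def expfam_logpdf(x, eta):
    """log p(x | eta) = <eta, t(x)> - log Z(eta) + log h(x), with h(x) = 1/sqrt(2*pi)."""
    return eta @ sufficient_stats(x) - log_partition(eta) - 0.5 * np.log(2.0 * np.pi)

def gaussian_logpdf(x, mu, sigma2):
    """Textbook Gaussian log-density, used only as a reference."""
    return -0.5 * np.log(2.0 * np.pi * sigma2) - 0.5 * (x - mu) ** 2 / sigma2

mu, sigma2, x = 1.3, 0.7, -0.4          # arbitrary illustrative values
eta = natural_params(mu, sigma2)
gap = abs(expfam_logpdf(x, eta) - gaussian_logpdf(x, mu, sigma2))
print(gap)                               # difference between the two forms
```

The two expressions should agree up to floating-point error, which is the sense in which the Gaussian "is" an exponential family.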

Variational Inference

← Intractable!

Variational Inference - Derivation
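
The slide's equations did not survive extraction. A standard version of the derivation, written with x for latent variables and y for observations (the convention used later in this deck) and with q(x) as an assumed variational distribution, can be sketched as:

```latex
\begin{aligned}
\log p(y) &= \log \int p(x, y)\, dx
           = \log \mathbb{E}_{q(x)}\!\left[\frac{p(x, y)}{q(x)}\right] \\
          &\ge \mathbb{E}_{q(x)}\!\left[\log \frac{p(x, y)}{q(x)}\right]
           \triangleq \mathcal{L}[q]
           \qquad \text{(Jensen's inequality)} \\
\log p(y) - \mathcal{L}[q] &= \mathrm{KL}\big(q(x)\,\|\,p(x \mid y)\big) \ge 0 .
\end{aligned}
```

Maximizing the lower bound L[q] over q is therefore equivalent to minimizing the KL divergence from q(x) to the intractable posterior p(x|y).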

Mean-field Variational Inference
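
To illustrate the mean-field idea, here is a small sketch of coordinate-ascent updates for a factorized approximation q(x1) q(x2) to a correlated 2-D Gaussian. This is a textbook toy example, not the model from the paper; the target parameters are arbitrary assumptions.

```python
import numpy as np

def mean_field_gaussian(mu, Sigma, n_sweeps=100):
    """Coordinate-ascent mean-field fit of q(x1) q(x2) to a 2-D Gaussian N(mu, Sigma)."""
    Lam = np.linalg.inv(Sigma)       # precision matrix of the target
    m = np.zeros(2)                  # variational means, arbitrary starting point
    for _ in range(n_sweeps):
        # Each update is the closed-form optimum of one factor with the other held fixed.
        m[0] = mu[0] - Lam[0, 1] / Lam[0, 0] * (m[1] - mu[1])
        m[1] = mu[1] - Lam[1, 0] / Lam[1, 1] * (m[0] - mu[0])
    q_var = 1.0 / np.diag(Lam)       # factor variances come from the precision diagonal
    return m, q_var

mu = np.array([1.0, -2.0])
Sigma = np.array([[1.0, 0.8], [0.8, 1.0]])
m, q_var = mean_field_gaussian(mu, Sigma)
print(m, q_var)
```

In this toy case the factorized means converge to the true means, while the factor variances come out smaller than the true marginal variances, a well-known side effect of the independence assumption.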

Autoencoder

Can we generate new data from an autoencoder?

Variational Autoencoder
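
A minimal sketch of the quantity a VAE optimizes: a one-sample estimate of the ELBO with a Gaussian q(x|y), a standard normal prior on x, and a unit-variance Gaussian likelihood. The linear `encoder` and `decoder` below are hypothetical stand-ins for neural networks, and all weights are random placeholders.

```python
import numpy as np

def encoder(y, W_enc):
    """Hypothetical linear recognition network: mean and log-variance of q(x | y)."""
    h = W_enc @ y
    d = h.shape[0] // 2
    return h[:d], h[d:]

def decoder(x, W_dec):
    """Hypothetical linear decoder: mean of p(y | x), assuming unit observation variance."""
    return W_dec @ x

def kl_to_standard_normal(mean, log_var):
    """KL( N(mean, diag(exp(log_var))) || N(0, I) ) in closed form."""
    return 0.5 * np.sum(np.exp(log_var) + mean ** 2 - 1.0 - log_var)

def elbo_one_sample(y, W_enc, W_dec, rng):
    """Single-sample ELBO estimate: reconstruction log-likelihood minus KL to the prior."""
    mean, log_var = encoder(y, W_enc)
    eps = rng.standard_normal(mean.shape)
    x = mean + np.exp(0.5 * log_var) * eps      # reparameterized sample from q(x | y)
    recon = decoder(x, W_dec)
    log_lik = -0.5 * np.sum((y - recon) ** 2) - 0.5 * y.size * np.log(2.0 * np.pi)
    return log_lik - kl_to_standard_normal(mean, log_var)

rng = np.random.default_rng(0)
latent_dim, obs_dim = 2, 5                       # illustrative sizes
W_enc = 0.1 * rng.standard_normal((2 * latent_dim, obs_dim))
W_dec = 0.1 * rng.standard_normal((obs_dim, latent_dim))
y = rng.standard_normal(obs_dim)
value = elbo_one_sample(y, W_enc, W_dec, rng)
print(value)
```

In an actual VAE both networks are nonlinear and their weights are trained by gradient ascent on this estimate, averaged over data.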

Back to the proposal

● p(θ): conjugate prior on global variables
● p(x|θ): exponential family on local variables
● p(γ): exponential family prior on observation parameters
● p(y|x,γ): neural network observation model

Example: 1-dimensional GMM with K clusters

How are these exponential family distributions?
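
One way to answer the question for the 1-D GMM: with fixed parameters, the joint log-density of a cluster label z and an observation y is linear in the statistics one_hot(z) ⊗ (y, y², 1). The sketch below checks this numerically; the function names and the mixture parameters are illustrative assumptions.

```python
import numpy as np

def gmm_joint_natural_params(pi, mu, sigma2):
    """Row k holds the coefficients that multiply (y, y^2, 1) when z = k."""
    return np.stack([
        mu / sigma2,
        -0.5 / sigma2,
        np.log(pi) - 0.5 * mu ** 2 / sigma2 - 0.5 * np.log(2.0 * np.pi * sigma2),
    ], axis=1)

def gmm_joint_logpdf(k, y, eta):
    """log p(z = k, y) as an inner product between eta and the joint statistics."""
    stats = np.outer(np.eye(eta.shape[0])[k], np.array([y, y * y, 1.0]))
    return np.sum(eta * stats)

def direct_logpdf(k, y, pi, mu, sigma2):
    """Reference: log pi_k plus the Gaussian log-density of component k."""
    return (np.log(pi[k]) - 0.5 * np.log(2.0 * np.pi * sigma2[k])
            - 0.5 * (y - mu[k]) ** 2 / sigma2[k])

pi = np.array([0.2, 0.5, 0.3])        # mixture weights (K = 3)
mu = np.array([-2.0, 0.0, 3.0])       # component means
sigma2 = np.array([0.5, 1.0, 2.0])    # component variances
eta = gmm_joint_natural_params(pi, mu, sigma2)
y = 0.7
gaps = [abs(gmm_joint_logpdf(k, y, eta) - direct_logpdf(k, y, pi, mu, sigma2))
        for k in range(pi.size)]
print(max(gaps))
```

Both forms should agree up to floating-point error for every cluster k.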

Generalizing to all graphical models

exponential families

recognition network

conjugate prior of p(x|θ)

Generalizing to all graphical models

Why?

Inference and Learning

Problem: Inference and Learning

- Standard variational inference does not work directly: letting p(y|x,γ) be an arbitrary distribution introduces nonlinearities, which make the other distributions difficult to find.

- Mean-field Variational Inference Objective (ELBO):

- Questions:
1. How can we find p(y|x,γ), if it is any general distribution?
2. How can we exploit p(x|θ), p(y|x,γ) to improve variational inference?
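
The objective equation on this slide was lost in extraction. Assuming the model factors p(θ), p(γ), p(x|θ), p(y|x,γ) from the earlier slides and a mean-field family q(θ) q(γ) q(x) (an assumption consistent with the slide title), the ELBO would take the form:

```latex
\mathcal{L}\big[q(\theta)\, q(\gamma)\, q(x)\big]
  = \mathbb{E}_{q(\theta)\, q(\gamma)\, q(x)}
    \left[ \log \frac{p(\theta)\, p(\gamma)\, p(x \mid \theta)\, p(y \mid x, \gamma)}
                     {q(\theta)\, q(\gamma)\, q(x)} \right]
  \le \log p(y) .
```

The term involving log p(y|x,γ) is the one that resists closed-form treatment when p(y|x,γ) is parameterized by a neural network.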

Solution: Overview

- “Why?” is answered by how we perform inference and learning

- “How can we find p(y|x,γ), if it is any general distribution?”
  - Extension of Variational Autoencoder (VAE)
  - Conclusion:

- “How can we exploit p(x|θ), p(y|x,γ) to improve variational inference?”
  - Extension of Variational Inference (VI)
  - Conclusion:

Solution: VAE Recognition Networks

- Intuition:

Solution: VAE Recognition Networks

- Use a recognition network to estimate the solution to the problem

- Estimate this complicated p(y|x,γ) with a “simpler” exponential family distribution

- Formally: estimate using a CRF with the same sufficient statistics as x:

Why?
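
One way to see why: if the recognition network outputs a potential of the form ψ(x; y, φ) = ⟨r(y; φ), t(x)⟩, where t(x) are the sufficient statistics of the prior on x, then combining it with the prior is just an addition of natural parameters, and the local posterior stays in closed form. The sketch below shows this for a 1-D Gaussian prior; `recognition_net`, its weights `phi`, and all numeric values are hypothetical.

```python
import numpy as np

def recognition_net(y, phi):
    """Hypothetical recognition network r(y; phi): natural parameters (h, -J/2) with J > 0."""
    a = phi @ y
    h = a[0]
    J = np.log1p(np.exp(a[1]))       # softplus keeps the precision contribution positive
    return np.array([h, -0.5 * J])

def to_mean_var(eta):
    """Convert Gaussian natural parameters (eta1, eta2) back to (mean, variance)."""
    var = -0.5 / eta[1]
    return eta[0] * var, var

m0, v0 = 0.5, 2.0                                        # prior N(m0, v0) on x
eta_prior = np.array([m0 / v0, -0.5 / v0])
phi = np.array([[0.3, -0.2, 0.1], [0.4, 0.0, -0.5]])     # hypothetical weights
y = np.array([1.0, 2.0, -1.0])                           # hypothetical observation

# Conjugate combination: natural parameters of prior and potential simply add.
eta_post = eta_prior + recognition_net(y, phi)
post_mean, post_var = to_mean_var(eta_post)

# Numerical check: normalize exp(<eta_post, t(x)>) on a grid and compare moments.
xs = np.linspace(-20.0, 20.0, 200001)
log_q = eta_post[0] * xs + eta_post[1] * xs ** 2
w = np.exp(log_q - log_q.max())
w /= w.sum()
num_mean = np.sum(w * xs)
num_var = np.sum(w * (xs - num_mean) ** 2)
print(post_mean, post_var, num_mean, num_var)
```

Because the potential shares the prior's sufficient statistics, no sampling or iterative optimization is needed for this local step in the toy example.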

Solution: VI Extension

- Intuition:

- The local variational parameters have a closed-form solution; the global variational parameters have a natural gradient (which can be computed from the optimal local variational parameters)

Solution: VI Extension

- Concretely:

- Step 1: find optimal local variational parameters

- Step 2: find natural gradients of global variational parameters (Newton’s Method)

“Hold my beer as we do some optimization.”
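
To make the natural-gradient step concrete, here is a sketch for a fully conjugate toy model: θ ~ N(m0, v0) and x_n ~ N(θ, obs_var) with known variance. This toy omits the neural network parts of the SVAE entirely; it only illustrates that, in natural parameters, the natural gradient is "prior + expected statistics - current parameters", so a unit step lands on the exact posterior. All names and values are illustrative assumptions.

```python
import numpy as np

def gaussian_natural(mean, var):
    """Natural parameters of a 1-D Gaussian."""
    return np.array([mean / var, -0.5 / var])

def gaussian_mean_var(eta):
    """Inverse map from natural parameters to (mean, variance)."""
    var = -0.5 / eta[1]
    return eta[0] * var, var

def natural_gradient(eta, eta_prior, x, obs_var):
    """Natural gradient of the ELBO w.r.t. the natural parameters eta of q(theta)."""
    # Each observation contributes (x_n / obs_var, -1 / (2 * obs_var)).
    stats = np.array([np.sum(x) / obs_var, -0.5 * x.size / obs_var])
    return eta_prior + stats - eta

rng = np.random.default_rng(0)
obs_var, m0, v0 = 0.5, 0.0, 4.0
x = 1.0 + np.sqrt(obs_var) * rng.standard_normal(20)    # synthetic data
eta_prior = gaussian_natural(m0, v0)

eta = gaussian_natural(0.0, 1.0)                         # arbitrary initial q(theta)
eta_one_step = eta + 1.0 * natural_gradient(eta, eta_prior, x, obs_var)

eta_damped = eta.copy()                                  # same update with step size 0.5
for _ in range(60):
    eta_damped = eta_damped + 0.5 * natural_gradient(eta_damped, eta_prior, x, obs_var)

# Closed-form conjugate posterior for comparison.
post_prec = 1.0 / v0 + x.size / obs_var
post_mean = (m0 / v0 + np.sum(x) / obs_var) / post_prec
print(gaussian_mean_var(eta_one_step), (post_mean, 1.0 / post_prec))
```

With smaller step sizes the iterates approach the same fixed point geometrically, which is how the update would typically be used with minibatches.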

Solution: VI Extension (Step 1)

- Proof: How can we find the optimal q(x)? [1/2]

Solution: VI Extension (Step 1)

- Proof: How can we find the optimal q(x)? [2/2]

In our case,

Solution: VI Extension (Step 2)

- Proof: Why does partial optimization give us the natural gradients? [1/2]

Solution: VI Extension (Step 2)

- Proof: Why does partial optimization give us the natural gradients? [2/2]

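i.e., a Newton-like update (the gradient is preconditioned by the inverse Fisher information rather than the inverse Hessian of the objective)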

Solution: VI Extension (Step 2)

- Reparameterization trick to find the recognition model parameters: [1/1]
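
The reparameterization trick can be illustrated on a toy expectation whose gradients are known exactly. The sketch below estimates the gradients of E[x²] under x ~ N(μ, σ²) by writing x = μ + σ·ε with ε ~ N(0, 1); the function `reparam_grads` and the test function f(x) = x² are assumptions chosen for illustration, not the paper's objective.

```python
import numpy as np

def reparam_grads(mu, sigma, n_samples=200_000, seed=0):
    """Monte Carlo estimates of d/dmu and d/dsigma of E_{x ~ N(mu, sigma^2)}[x^2]."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n_samples)
    x = mu + sigma * eps                  # x is a deterministic function of (mu, sigma) given eps
    df_dx = 2.0 * x                       # derivative of f(x) = x^2
    # Chain rule through the sample: dx/dmu = 1 and dx/dsigma = eps.
    return np.mean(df_dx), np.mean(df_dx * eps)

mu, sigma = 1.5, 0.5
g_mu, g_sigma = reparam_grads(mu, sigma)
# Analytically E[x^2] = mu^2 + sigma^2, so the exact gradients are 2*mu and 2*sigma.
print(g_mu, g_sigma)
```

The same idea lets gradients flow from a sampled latent variable back to the parameters of the network that produced its distribution.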

Overall Algorithm

Experiments

Modelling behaviour of mice

● Discrete latent variables z_n: a Markov chain (MC) describing the “state”, e.g. dart/pause/rear
● Continuous latent variables x_n: a linear dynamical system (LDS) describing motion in a particular state
● Observed y_n: depth images of the mouse

Latent switching linear dynamical system (SLDS)
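
A rough sketch of the generative process such a model describes: a discrete Markov chain z_n selects which linear dynamics drive the continuous state x_n, and a nonlinear decoder maps x_n to an observation y_n. The transition matrix, the dynamics matrices, the noise level, and the `tanh` decoder below are all hypothetical placeholders, not the values or network from the paper.

```python
import numpy as np

def rotation(angle, scale=0.95):
    """A slightly contracting 2-D rotation, used as a stand-in for per-state dynamics."""
    c, s = np.cos(angle), np.sin(angle)
    return scale * np.array([[c, -s], [s, c]])

def sample_slds(T, P, A, noise_std, decoder, rng):
    """Sample discrete states z, continuous states x, and observations y for T steps."""
    K, d = P.shape[0], A.shape[1]
    z = np.zeros(T, dtype=int)
    x = np.zeros((T, d))
    z[0] = rng.integers(K)                         # uniform initial discrete state
    x[0] = rng.standard_normal(d)
    for n in range(1, T):
        z[n] = rng.choice(K, p=P[z[n - 1]])        # Markov chain over discrete states
        x[n] = A[z[n]] @ x[n - 1] + noise_std * rng.standard_normal(d)  # state-specific LDS
    y = np.stack([decoder(xn) for xn in x])        # nonlinear observation model
    return z, x, y

rng = np.random.default_rng(0)
P = np.array([[0.95, 0.05], [0.10, 0.90]])         # hypothetical transition matrix
A = np.stack([rotation(0.1), rotation(-0.4)])      # one dynamics matrix per discrete state
W = rng.standard_normal((6, 2))                    # hypothetical decoder weights

def decoder(x):
    """Hypothetical one-layer decoder standing in for the neural network p(y | x, gamma)."""
    return np.tanh(W @ x)

z, x, y = sample_slds(50, P, A, 0.05, decoder, rng)
print(z[:10], y.shape)
```

In the experiment described on these slides, the observations would be depth-image frames rather than 6-dimensional vectors.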

Generating new frames

Predicting frames corresponding to a given state

Conclusions

● Structured variational autoencoders provide a general framework that combines some of the strengths of probabilistic graphical models and deep learning methods.
● Graphical models for latent structure and fast inference.
● Neural networks for non-linear observation models.

Questions

1. Why do we want to combine probabilistic graphical models and variational autoencoders?
2. How many neural networks does SVAE have?
3. Why can we not just use variational inference for the model?

References

● https://towardsdatascience.com/understanding-variational-autoencoders-vaes-f70510919f73

● Blei, David M., Alp Kucukelbir, and Jon D. McAuliffe. “Variational Inference: A Review for Statisticians.” Journal of the American Statistical Association 112.518 (2017): 859–877.

● https://simons.berkeley.edu/sites/default/files/docs/6626/berkeley.pdf

● Johnson, Matthew J., et al. “Composing Graphical Models with Neural Networks for Structured Representations and Fast Inference.” Advances in Neural Information Processing Systems. 2016.