Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit...

57
Gaussian Mixture Models and EM algorithm Radek Danecek

Transcript of Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit...

Page 1: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Gaussian Mixture Models and EM algorithm

Radek Danecek

Page 2: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Gaussian Mixture Model

• Unsupervised method

• Fit multimodal Gaussian distributions

Page 3: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Formal Definition

• The model is described as:

• The parameters of the model are:

• The training data is unlabeled – unsupervised setting

• Why not fit with MLE?

Page 4: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Optimization problem

• Model:

• Apply MLE:• Maximize:

• Difficult, non convex optimization with constraints

• Use EM algorithm instead

Page 5: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

EM Algorithm for GMMs

• Idea: • Objective function:

• Split optimization of the objective into to parts

• Algorithm:• Initialize model parameters (randomly):

• Iterate until convergence: • E-step

• Assign cluster probabilities (“soft labels”) to each sample

• M-step

• Solve the MLE using the soft labels

Page 6: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Initialization

• Initialize model parameters (randomly)

• Uniform for cluster probabilities

• Centers• Random

• K-means heuristics

• Covariances: • Spherical, according to empirical variance

Page 7: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

E-step

• For each data point and each cluster , compute the probabilitythat belongs to(given current model parameters)

“soft labels”

N

K

Probabilities of point n belonging to clusters 1…K (sum up to 1)

Sum determines the “weight” of cluster k

Page 8: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

E-step

• For each data point and each cluster , compute the probabilitythat belongs to(given current model parameters)

“soft labels”

Page 9: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

M-step

• Now we have “soft labels” for the data -> fall back to supervised MLE

• Optimize the log likelihood:• Instead of the original (difficult objective):

We optimize the following:

• Differentiate w.r.t.

Page 10: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

M-step

• Update model parameters:

• Update prior for each cluster: N

K

Probabilities of point n belonging to clusters 1…K (sum up to 1)

Sum determines the “weight” of cluster k

Normalized column-wise sum are priors for clusters 1…K

Page 11: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

M-step

• Update model parameters:

• Update mean and covariance of each cluster

Page 12: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

M-step

• Update model parameters:

• Update mean and covariance of each cluster

Page 13: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

EM Algorithm for GMMs

• Idea: • Objective function:

• Split optimization of the objective into to parts

• Algorithm:• Initialize model parameters (randomly):

• Iterate until convergence: • E-step

• Assign cluster probabilities (“soft labels”) to each sample

• M-step

• Find optimal parameters given the soft labels

Page 14: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Overlapping clusters

Page 15: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Overlapping clusters

Page 16: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Unequal cluster size

Page 17: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Imbalanced cluster size

Page 18: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Sensitivity to Initialization

Page 19: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Sensitivity to Initialization

Page 20: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Sensitivity to Initialization

Page 21: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Sensitivity to Initialization

Page 22: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Sensitivity to Initialization

Page 23: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Sensitivity to initialization

Page 24: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Degenerate covariance

• The determinant of the covariance matrix tends to 0

Page 25: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Practical Example – Color segmentation

• Input: an image

• Can be thought of as a dataset of 3D (color) samples

• Run 3D GMM clustering over

Page 26: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Practical Example – Color segmentation

Input Image

Page 27: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Practical Example – Color segmentation

3 clusters

Page 28: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Practical Example – Color segmentation

4 clusters

Page 29: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Practical Example – Color segmentation

5 clusters

Page 30: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Practical Example – Color segmentation

7 clusters

Page 31: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Practical Example – Color segmentation

8 clusters

Page 32: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Practical Example – Color segmentation

9 clusters

Page 33: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Practical Example – Color segmentation

10 clusters

Page 34: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Practical Example – Color segmentation

15 clusters

Page 35: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Practical Example – Color segmentation

20 clusters

Page 36: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Practical Example – Color segmentation

Input Image

Page 37: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

EM Algorithm for GMMs

• Idea: • Objective function:

• Split optimization of the objective into to parts

• Algorithm:• Initialize model parameters (randomly):

• Iterate until convergence: • E-step

• Assign cluster probabilities (“soft labels”) to each sample

• M-step

• Find optimal parameters given the soft labels

Page 38: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Generalized EM

• Idea: • Objective function:

• Split optimization of the objective into to parts

• Algorithm:• Initialize model parameters (randomly):

• Iterate until convergence: • E-step

• Assign cluster probabilities (“soft labels”) to each sample

• M-step

• Find optimal parameters given the soft labels

Page 39: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Generalized EM

• Idea: • Objective function:

• Split optimization of the objective into to parts

• Algorithm:• Initialize model parameters (randomly):

• Iterate until convergence: • E-step

• Assign cluster probabilities (“soft labels”) to each sample

• M-step

• Find optimal parameters given the soft labels

Page 40: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Generalized EM

• Idea: • Objective function:

• Split optimization of the objective into to parts

• Algorithm:• Initialize model parameters (randomly):

• Iterate until convergence: • E-step

• Assign cluster probabilities (“soft labels”) to each sample

• M-step

• Find optimal parameters given the soft labels

Page 41: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Generalized EM

• Idea: • Objective function:

• Split optimization of the objective into to parts

• Algorithm:• Initialize model parameters (randomly):

• Iterate until convergence: • E-step

• Assign cluster probabilities (“soft labels”) to each sample

• M-step

• Find optimal parameters given the soft labels

Page 42: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Generalized M-step

• What is the objective function?

• GMM:

• General:

Page 43: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Exercise

• Consider a mixture of K multivariate Bernoulli distributions with parameters , where

• Multivariate Bernoulli distribution:

• Question 1: Write down the equation for the E-step update

hint GMM: Answer:

Page 44: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Exercise

• Consider a mixture of K multivariate Bernoulli distributions with parameters , where

• Multivariate Bernoulli distribution:

• Question 1: Write down the equation for the E-step update

hint GMM: Answer:

Page 45: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Exercise

• Consider a mixture of K multivariate Bernoulli distributions with parameters , where

• Multivariate Bernoulli distribution:

• Question 2: Write down the EM objective:

Page 46: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Exercise

• Consider a mixture of K multivariate Bernoulli distributions with parameters , where

• Multivariate Bernoulli distribution:

• Question 2: Write down the EM objective:

Page 47: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Exercise

• Multivariate Bernoulli distribution:

• EM objective:

Page 48: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Exercise

• Multivariate Bernoulli distribution:

• EM objective:

Page 49: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Exercise

• Multivariate Bernoulli distribution:

• EM objective:

Page 50: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Exercise

• Multivariate Bernoulli distribution:

• EM objective:

Page 51: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Exercise

• Question 3: Write down the M-step update

• Differentiate wrt.:

Page 52: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Exercise

• Question 3: Write down the M-step update

• Differentiate wrt.:

Page 53: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Exercise

• Question 3: Write down the M-step update

• Differentiate wrt.:

Page 54: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Summary

• EM algorithm is useful for fitting GMMs (or other mixtures) in an unsupervised setting

• Can be used for: • Clustering

• Classification

• Distribution estimation

• Outlier detection

Page 55: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Other unsupervised clustering techniques

Source: https://scikit-learn.org/stable/modules/clustering.html

Page 56: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

Alternative for density estimation

• Kernel density estimation

Source: https://scikit-learn.org/stable/auto_examples/neighbors/plot_species_kde.html

Page 57: Gaussian Mixture Models and EM algorithm · Gaussian Mixture Model •Unsupervised method •Fit multimodal Gaussian distributions . Formal Definition •The model is described as:

References

• Lecture slides/videos

• https://www.python-course.eu/expectation_maximization_and_gaussian_mixture_models.php

• https://scikit-learn.org/stable/modules/clustering.html