Ouyang Ruofei Topic Model Latent Dirichlet Allocation Ouyang Ruofei May. 10 2013 LDA.

Page 1:

Ouyang Ruofei

Topic Model Latent Dirichlet Allocation

Ouyang Ruofei

May. 10 2013

LDA

Page 2:

Introduction

Parameters:

Inference:

data = latent pattern + noise

Page 3:

Introduction

Parametric model: the number of parameters is fixed w.r.t. sample size.

Nonparametric model: the number of parameters grows with sample size; the parameter space is infinite-dimensional.

Problem              Parameter
Density estimation   Distributions
Regression           Functions
Clustering           Partitions

Page 4:

Clustering

1. Ironman 2. Thor 3. Hulk

Indicator variable for each data point

Page 5:

Dirichlet process

Ironman: 3 times, Thor: 2 times, Hulk: 2 times

Without the likelihood, we know that:

1. There are three clusters

2. The distribution over the three clusters

New data

Page 6:

Dirichlet process

Dirichlet distribution:

pdf:

mean:

Example:

Dir(Ironman,Thor,Hulk)
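For reference, the standard textbook form of the Dirichlet density and its mean can be written as follows. The symbols θ, α_k and K are notation assumed here; they are not taken from the slide.

```latex
\[
p(\theta \mid \alpha)
  = \frac{\Gamma\!\left(\sum_{k=1}^{K}\alpha_k\right)}{\prod_{k=1}^{K}\Gamma(\alpha_k)}
    \prod_{k=1}^{K}\theta_k^{\alpha_k-1},
\qquad \theta_k \ge 0,\quad \sum_{k=1}^{K}\theta_k = 1
\]
\[
\mathbb{E}[\theta_k] = \frac{\alpha_k}{\sum_{j=1}^{K}\alpha_j}
\]
```

If the Avengers counts (Ironman 3, Thor 2, Hulk 2) are used as the parameters, the mean is (3/7, 2/7, 2/7).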

Page 7:

Dirichlet process

Dirichlet distribution:

Multinomial distribution:

Conjugate prior

Posterior:

Example:

             Ironman   Thor   Hulk
Prior        3         2      2      (pseudo count)
Likelihood   100       300    200
Posterior    103       302    202
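The posterior row is just the prior pseudo counts plus the observed counts. A minimal sketch with NumPy, using the numbers from the example (the variable names are chosen here for illustration):

```python
import numpy as np

# Counts from the slide's example, ordered (Ironman, Thor, Hulk).
prior = np.array([3, 2, 2])            # Dirichlet pseudo counts
observed = np.array([100, 300, 200])   # multinomial counts from the data

# Conjugacy: Dirichlet prior + multinomial counts gives a Dirichlet posterior
# whose parameters are the element-wise sums.
posterior = prior + observed

# Posterior mean probability of each cluster.
posterior_mean = posterior / posterior.sum()

print(posterior)
print(posterior_mean.round(3))
```

Because the observed counts are much larger than the pseudo counts, the posterior mean stays close to the empirical frequencies.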

Page 8:

Dirichlet process

In our Avengers model, K = 3 (Ironman, Thor, Hulk)

However, this guy comes…

The Dirichlet distribution can't model this stupid guy

Dirichlet process:

K = infinity

Nonparametric here means an infinite number of clusters

Page 9:

Dirichlet process

α: Pseudo counts in each cluster

G0: Base distribution of each cluster

A distribution over distributions

Dirichlet process:

Given any partition

Distribution template
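The partition property that defines the Dirichlet process is usually stated as below. This is the standard textbook definition, written with the slide's α and G0; the symbols G, A_1, ..., A_K and Θ are notation assumed here.

```latex
\[
G \sim \mathrm{DP}(\alpha, G_0)
\quad\Longleftrightarrow\quad
\bigl(G(A_1), \dots, G(A_K)\bigr)
  \sim \mathrm{Dir}\bigl(\alpha G_0(A_1), \dots, \alpha G_0(A_K)\bigr)
\]
\[
\text{for every finite measurable partition } A_1, \dots, A_K \text{ of the space } \Theta .
\]
```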

Page 10:

Dirichlet process

Construct the Dirichlet process by the Chinese restaurant process (CRP)

Chinese restaurant process:

In a restaurant, there are an infinite number of tables.

Customer 1 sits at an unoccupied table with p=1.

Customer N sits at table k with p=
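A minimal simulation sketch of the seating rule, assuming the usual CRP probabilities: an occupied table k is chosen with probability n_k / (N - 1 + α) and a new table with probability α / (N - 1 + α), where n_k is the number of customers already at table k. The function name `crp` and its parameters are chosen here for illustration.

```python
import random

def crp(num_customers, alpha, seed=0):
    """Simulate table assignments under a Chinese restaurant process."""
    rng = random.Random(seed)
    table_sizes = []   # table_sizes[k] = number of customers at table k
    assignments = []   # assignments[n] = table index of customer n
    for _ in range(num_customers):
        # Unnormalised weights: n_k for each occupied table, alpha for a new one.
        weights = table_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(table_sizes):
            table_sizes.append(1)   # open a new table
        else:
            table_sizes[k] += 1     # join an occupied table
        assignments.append(k)
    return assignments, table_sizes

assignments, table_sizes = crp(20, alpha=1.0)
print(assignments)
print(table_sizes)
```

The first customer always opens table 0, since the only available weight is α. A larger α makes new tables (clusters) more likely.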

Page 11:

Dirichlet process

Page 12:

Dirichlet process

Page 13:

Dirichlet process

Page 14:

Dirichlet process

Page 15:

Dirichlet process

Customers : data

Tables : clusters

Page 16:

Dirichlet process

Train the model by Gibbs sampling

Page 17:

Dirichlet process

Train the model by Gibbs sampling

Page 18:

Gibbs sampling

Gibbs sampling is an MCMC method for obtaining a sequence of observations from a multivariate distribution.

The intuition is to turn a multivariate problem into a sequence of univariate problems.

Multivariate:

Univariate:

In Dirichlet process,

Page 19:

Gibbs sampling

Gibbs sampling pseudo code:
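A minimal sketch of the idea, assuming a toy target chosen here for illustration (a standard bivariate normal with correlation rho, not the slide's model). Each sweep draws x from p(x | y) and then y from p(y | x); both conditionals are univariate normals with mean rho times the other variable and variance 1 - rho^2.

```python
import numpy as np

def gibbs_bivariate_normal(rho, num_samples, burn_in=500, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = np.random.default_rng(seed)
    x, y = 0.0, 0.0                        # arbitrary starting point
    cond_std = np.sqrt(1.0 - rho ** 2)     # std of each univariate conditional
    samples = np.empty((num_samples, 2))
    for t in range(burn_in + num_samples):
        x = rng.normal(rho * y, cond_std)  # sample x given the current y
        y = rng.normal(rho * x, cond_std)  # sample y given the new x
        if t >= burn_in:                   # keep only draws after burn-in
            samples[t - burn_in] = (x, y)
    return samples

samples = gibbs_bivariate_normal(rho=0.8, num_samples=5000)
print(np.corrcoef(samples.T)[0, 1])
```

The empirical correlation of the kept samples should be close to rho, which is a quick check that the chain is sampling the intended joint distribution.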

Page 20:

Topic model

Document: mixture of topics

Topics: latent variable

But we can read words

topics → words

Page 21:

Topic model

Page 22:

Topic model

Page 23:

Topic model

word/topic count

topic/doc count

topic of xij

observed word

other topics

other words
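These labels match the terms of the commonly used collapsed Gibbs update for LDA, which can be written as below. This is the standard form from the literature, not necessarily the exact equation on the slide. Assumed notation: x_ij is word i of document j, z_ij its topic, V the vocabulary size, and a superscript -ij means the count excludes the current word.

```latex
\[
p\bigl(z_{ij} = k \mid x_{ij} = w,\; z_{-ij},\; x_{-ij}\bigr)
\;\propto\;
\underbrace{\frac{n^{-ij}_{k,w} + \beta}{n^{-ij}_{k,\cdot} + V\beta}}_{\text{word/topic count}}
\;\times\;
\underbrace{\bigl(n^{-ij}_{j,k} + \alpha\bigr)}_{\text{topic/doc count}}
\]
```

Here n_{k,w} counts how often word w is assigned to topic k, n_{k,·} is the total number of words assigned to topic k, and n_{j,k} counts the words in document j assigned to topic k.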

Page 24:

Topic model

Apply the Dirichlet process to the topic model

Learn the distribution of topics in a document:

           Topic 1   Topic 2   Topic 3
Document   P1        P2        P3

Learn the distribution of topics for a word:

           Topic 1   Topic 2   Topic 3
Word       Q1        Q2        Q3

Page 25:

Topic model

topic/doc table:

       t1     t2     t3
d1
d2
d3

word/topic table:

       w1     w2     w3     w4
t1
t2
t3

Page 26:

Topic model

Latent Dirichlet allocation:

Dirichlet mixture model:
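For comparison, the two generative processes are commonly written as below. This is a sketch in assumed notation (π, θ_j, φ_k, z, x, F and G_0 are not taken from the slide), and the mixture model is shown in its finite form with a Dirichlet prior on the mixing weights.

```latex
% Mixture model with a Dirichlet prior: one set of mixing weights for all data
\[
\pi \sim \mathrm{Dir}(\alpha), \qquad
\phi_k \sim G_0, \qquad
z_i \mid \pi \sim \mathrm{Mult}(\pi), \qquad
x_i \mid z_i \sim F(\phi_{z_i})
\]

% Latent Dirichlet allocation: separate topic weights for each document j
\[
\theta_j \sim \mathrm{Dir}(\alpha), \qquad
\phi_k \sim \mathrm{Dir}(\beta), \qquad
z_{ij} \mid \theta_j \sim \mathrm{Mult}(\theta_j), \qquad
x_{ij} \mid z_{ij} \sim \mathrm{Mult}(\phi_{z_{ij}})
\]
```

The difference is that a mixture model assigns each data point to one cluster drawn from shared weights, while LDA gives every document its own mixture over shared topics.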

Page 27:

LDA Example

w: ipad apple itunes mirror queen joker ladygaga

t1: product

t2: story

t3: poker

d1: ipad apple itunes

d2: apple mirror queen

d3: queen joker ladygaga

d4: queen ladygaga mirror

In fact, the topics are latent

Page 28:

LDA example

d1: ipad apple itunes          topics: 1 2 3

d2: apple mirror queen         topics: 2 1 2

d3: queen joker ladygaga       topics: 3 3 1

d4: queen ladygaga mirror      topics: 2 1 2

word/topic counts:

       ipad   apple  itunes mirror queen  joker  ladygaga
t1     1      0      0      1      0      0      2
t2     0      2      0      1      2      0      0
t3     0      0      1      0      1      1      0
sum    1      2      1      2      3      1      2

topic/doc counts:

       t1     t2     t3
d1     1      1      1
d2     1      2      0
d3     1      0      2
d4     1      2      0

Page 29:

LDA example

d1: ipad apple itunes          topics: 1 2 3

d2: apple mirror queen         topics: 2 1 2

d3: joker ladygaga             topics: 3 1

d4: queen ladygaga mirror      topics: 2 1 2

queen (taken out of d3 for resampling)

word/topic counts:

       ipad   apple  itunes mirror queen  joker  ladygaga
t1     1      0      0      1      0      0      2
t2     0      2      0      1      2      0      0
t3     0      0      1      0      1      1      0
sum    1      2      1      2      3      1      2

topic/doc counts:

       t1     t2     t3
d1     1      1      1
d2     1      2      0
d3     1      0      2
d4     1      2      0

Page 30:

LDA example

d1: ipad apple itunes          topics: 1 2 3

d2: apple mirror queen         topics: 2 1 2

d3: joker ladygaga             topics: 3 1

d4: queen ladygaga mirror      topics: 2 1 2

queen (taken out of d3 for resampling)

word/topic counts:

       ipad   apple  itunes mirror queen  joker  ladygaga
t1     1      0      0      1      0      0      2
t2     0      2      0      1      2      0      0
t3     0      0      1      0      1-1    1      0
sum    1      2      1      2      3-1    1      2

topic/doc counts:

       t1     t2     t3
d1     1      1      1
d2     1      2      0
d3     1      0      2-1
d4     1      2      0

Page 31:

LDA example

d1: ipad apple itunes          topics: 1 2 3

d2: apple mirror queen         topics: 2 1 2

d3: joker ladygaga             topics: 3 1

d4: queen ladygaga mirror      topics: 2 1 2

queen (taken out of d3 for resampling)

word/topic counts:

       ipad   apple  itunes mirror queen  joker  ladygaga
t1     1      0      0      1      0      0      2
t2     0      2      0      1      2      0      0
t3     0      0      1      0      0      1      0
sum    1      2      1      2      2      1      2

topic/doc counts:

       t1     t2     t3
d1     1      1      1
d2     1      2      0
d3     1      0      1
d4     1      2      0

Page 32:

LDA example

d1: ipad apple itunes          topics: 1 2 3

d2: apple mirror queen         topics: 2 1 2

d3: joker ladygaga             topics: 3 1

d4: queen ladygaga mirror      topics: 2 1 2

queen: newly sampled topic 2

word/topic counts:

       ipad   apple  itunes mirror queen  joker  ladygaga
t1     1      0      0      1      0      0      2
t2     0      2      0      1      2+1    0      0
t3     0      0      1      0      0      1      0
sum    1      2      1      2      2+1    1      2

topic/doc counts:

       t1     t2     t3
d1     1      1      1
d2     1      2      0
d3     1      0+1    1
d4     1      2      0
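The steps of the example (remove the word's counts, compute the conditional, sample a topic, add the counts back) can be sketched as follows. The documents and topic assignments are those of the example, shifted to 0-based topic indices. The hyperparameter values `alpha` and `beta` and the helper name `resample` are assumptions made here, since the slides do not give them.

```python
import numpy as np

vocab = ["ipad", "apple", "itunes", "mirror", "queen", "joker", "ladygaga"]
docs = [["ipad", "apple", "itunes"],
        ["apple", "mirror", "queen"],
        ["queen", "joker", "ladygaga"],
        ["queen", "ladygaga", "mirror"]]
# Topic assignments from the example, 0-based (topic 1 -> index 0).
z = [[0, 1, 2], [1, 0, 1], [2, 2, 0], [1, 0, 1]]
K, V = 3, len(vocab)
alpha, beta = 0.1, 0.01   # assumed hyperparameters, not given on the slides

# Build the word/topic and topic/doc count tables from the assignments.
word_id = {w: i for i, w in enumerate(vocab)}
word_topic = np.zeros((K, V), dtype=int)
doc_topic = np.zeros((len(docs), K), dtype=int)
for d, words in enumerate(docs):
    for w, k in zip(words, z[d]):
        word_topic[k, word_id[w]] += 1
        doc_topic[d, k] += 1
initial_word_topic = word_topic.copy()

def resample(d, i, rng):
    """Resample the topic of word i in document d (collapsed Gibbs step)."""
    w, old = word_id[docs[d][i]], z[d][i]
    word_topic[old, w] -= 1          # remove the word's current assignment
    doc_topic[d, old] -= 1
    # Conditional over topics: word/topic term times topic/doc term.
    p = ((word_topic[:, w] + beta) / (word_topic.sum(axis=1) + V * beta)
         * (doc_topic[d] + alpha))
    p = p / p.sum()
    new = int(rng.choice(K, p=p))    # sample the new topic
    word_topic[new, w] += 1          # add the counts back under the new topic
    doc_topic[d, new] += 1
    z[d][i] = new
    return p

rng = np.random.default_rng(0)
probs = resample(2, 0, rng)          # "queen" in d3
print(probs.round(3))
```

With these assumed hyperparameters, topic 2 has the largest probability for "queen", because "queen" is already counted twice under topic 2 elsewhere in the corpus.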

Page 33:

Further

Dirichlet distribution prior: K topics

Dirichlet process prior: infinite topics

Alpha mainly controls the probability of a topic that has little training data in the document.

Beta mainly controls the probability of a topic that has little training data for the word.

Supervised

Unsupervised

Page 34:

Further

Unrealistic bag-of-words assumption: TNG, biLDA

Loses power-law behavior: Pitman-Yor language model

David Blei has done an extensive survey on topic models: http://home.etf.rs/~bfurlan/publications/SURVEY-1.pdf

Page 35:

Q&A