Topic Model Latent Dirichlet Allocation


Page 1: Topic Model  Latent  Dirichlet  Allocation

Topic Model Latent Dirichlet Allocation

Ouyang Ruofei

May 10, 2013

Page 2: Topic Model  Latent  Dirichlet  Allocation

Introduction

Parameters:

Inference:

data = latent pattern + noise

Page 3: Topic Model  Latent  Dirichlet  Allocation

Introduction

Parametric model: number of parameters is fixed w.r.t. sample size

Nonparametric model: number of parameters grows with sample size; infinite-dimensional parameter space

Problem              Parameter
Density estimation   Distributions
Regression           Functions
Clustering           Partitions

Page 4: Topic Model  Latent  Dirichlet  Allocation

Clustering

1. Ironman  2. Thor  3. Hulk

Indicator variable for each data point

Page 5: Topic Model  Latent  Dirichlet  Allocation

Dirichlet process

Ironman: 3 times Thor: 2 times Hulk: 2 times

Without the likelihood, we know that:

1. There are three clusters

2. The distribution over three clusters

New data

Page 6: Topic Model  Latent  Dirichlet  Allocation

Dirichlet process

Dirichlet distribution:

pdf:

mean:

Example:

Dir(Ironman,Thor,Hulk)
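For reference, the standard density and mean of a K-dimensional Dirichlet distribution can be written as below. The symbols $\theta$ (a probability vector over the K clusters) and $\alpha_k$ (the concentration parameter of cluster k) are notation assumed here, not taken from the slide.

```latex
p(\theta \mid \alpha)
  = \frac{\Gamma\!\left(\sum_{k=1}^{K} \alpha_k\right)}{\prod_{k=1}^{K} \Gamma(\alpha_k)}
    \prod_{k=1}^{K} \theta_k^{\alpha_k - 1},
\qquad
\mathbb{E}[\theta_k] = \frac{\alpha_k}{\sum_{j=1}^{K} \alpha_j}
```

If the Avengers counts (3, 2, 2) are used as the parameters, the mean of Dir(3, 2, 2) is (3/7, 2/7, 2/7).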

Page 7: Topic Model  Latent  Dirichlet  Allocation

Dirichlet process

Dirichlet distribution:

Multinomial distribution:

The Dirichlet distribution is the conjugate prior of the multinomial distribution

Posterior:

Example:

                       Ironman  Thor  Hulk
Prior (pseudo count)         3     2     2
Likelihood                 100   300   200
Posterior                  103   302   202
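The conjugate update in the example amounts to adding the observed counts to the prior pseudo counts. A minimal sketch of that arithmetic follows; the variable names are illustrative, and the posterior mean is an extra step not shown on the slide.

```python
# Prior pseudo counts and observed counts, copied from the example table.
prior = {"Ironman": 3, "Thor": 2, "Hulk": 2}
observed = {"Ironman": 100, "Thor": 300, "Hulk": 200}

# Dirichlet prior + multinomial counts -> Dirichlet posterior with summed parameters.
posterior = {name: prior[name] + observed[name] for name in prior}

# Posterior mean of each cluster probability: alpha_k / sum(alpha).
total = sum(posterior.values())
posterior_mean = {name: value / total for name, value in posterior.items()}

print(posterior)
print(posterior_mean)
```

Because the observed counts dominate the small pseudo counts, the posterior mean ends up close to the observed frequencies.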

Page 8: Topic Model  Latent  Dirichlet  Allocation

Dirichlet process

In our Avengers model, K = 3 (Ironman, Thor, Hulk)

However, a new character shows up…

A Dirichlet distribution with fixed K cannot model this newcomer

Dirichlet process: K = infinity

Nonparametric here means an infinite number of clusters

Page 9: Topic Model  Latent  Dirichlet  Allocation

Dirichlet process

Dirichlet process: a distribution over distributions

α: pseudo counts in each cluster

G0: base distribution of each cluster (distribution template)

Given any partition A
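The partition-based statement is usually written as follows. This is the standard textbook definition, given here as a reference; the notation $A_1, \dots, A_K$ for the cells of the partition is an assumption.

```latex
G \sim \mathrm{DP}(\alpha, G_0)
\iff
\bigl(G(A_1), \dots, G(A_K)\bigr)
  \sim \mathrm{Dir}\bigl(\alpha G_0(A_1), \dots, \alpha G_0(A_K)\bigr)
\quad \text{for every finite measurable partition } (A_1, \dots, A_K)
```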

Page 10: Topic Model  Latent  Dirichlet  Allocation

Dirichlet process

Construct the Dirichlet process by the Chinese restaurant process (CRP):

In a restaurant, there is an infinite number of tables.

Customer 1 sits at an unoccupied table with p = 1.

Customer N sits at table k with p =
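In the standard formulation of the Chinese restaurant process, customer N joins an occupied table k with probability n_k / (N - 1 + α), where n_k is the number of customers already at that table, and opens a new table with probability α / (N - 1 + α). The sketch below simulates that seating rule. The function name `chinese_restaurant_process` and its parameters are hypothetical, chosen only for illustration.

```python
import random

def chinese_restaurant_process(n_customers, alpha, rng):
    """Seat customers one by one; return (seating, table_sizes)."""
    table_sizes = []  # table_sizes[k] = customers currently at table k
    seating = []      # seating[n] = table index chosen by customer n
    for _ in range(n_customers):
        # Occupied tables are weighted by their size, a new table by alpha.
        weights = table_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(table_sizes):
            table_sizes.append(1)   # open a new table
        else:
            table_sizes[k] += 1     # join an existing table
        seating.append(k)
    return seating, table_sizes

rng = random.Random(0)
seating, table_sizes = chinese_restaurant_process(20, alpha=1.0, rng=rng)
print(seating)
print(table_sizes)
```

The first customer always opens table 0, since the only available weight is α. Larger values of α make new tables more likely.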

Page 11: Topic Model  Latent  Dirichlet  Allocation

Dirichlet process

Page 12: Topic Model  Latent  Dirichlet  Allocation

Dirichlet process

Page 13: Topic Model  Latent  Dirichlet  Allocation

Dirichlet process

Page 14: Topic Model  Latent  Dirichlet  Allocation

Dirichlet process

Page 15: Topic Model  Latent  Dirichlet  Allocation

Dirichlet process

Customers: data

Tables: clusters

Page 16: Topic Model  Latent  Dirichlet  Allocation

Dirichlet process

Train the model by Gibbs sampling

Page 17: Topic Model  Latent  Dirichlet  Allocation

Dirichlet process

Train the model by Gibbs sampling

Page 18: Topic Model  Latent  Dirichlet  Allocation

Gibbs sampling

Gibbs sampling is an MCMC method for obtaining a sequence of samples from a multivariate distribution.

The intuition is to turn one multivariate problem into a sequence of univariate problems.

Multivariate:

Univariate:

In the Dirichlet process,

Page 19: Topic Model  Latent  Dirichlet  Allocation

Gibbs sampling

Gibbs sampling pseudo code:
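As one hedged illustration of the idea, the sketch below runs a Gibbs sampler on a toy target that is not from the slides: a standard bivariate normal with correlation `rho`. Each sweep updates one coordinate at a time from its univariate conditional, x | y ~ N(rho·y, 1 - rho²) and likewise for y. The function name and parameters are hypothetical.

```python
import math
import random

def gibbs_bivariate_normal(rho, n_samples, burn_in, rng):
    """Sample (x, y) from a standard bivariate normal with correlation rho."""
    x, y = 0.0, 0.0                      # arbitrary starting point
    cond_sd = math.sqrt(1.0 - rho * rho)  # std. dev. of each conditional
    samples = []
    for step in range(burn_in + n_samples):
        x = rng.gauss(rho * y, cond_sd)   # draw x given the current y
        y = rng.gauss(rho * x, cond_sd)   # draw y given the new x
        if step >= burn_in:               # discard the warm-up sweeps
            samples.append((x, y))
    return samples

rng = random.Random(42)
samples = gibbs_bivariate_normal(rho=0.5, n_samples=20000, burn_in=500, rng=rng)
mean_x = sum(x for x, _ in samples) / len(samples)
mean_xy = sum(x * y for x, y in samples) / len(samples)
print(mean_x, mean_xy)
```

With enough sweeps, the sample mean should be near 0 and the average of x·y near `rho`. Topic models follow the same pattern, with each variable being the topic assignment of one word.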

Page 20: Topic Model  Latent  Dirichlet  Allocation

Topic model

Document: mixture of topics

Topics: latent variable

But we can read the words

[Diagram: topics and words]

Page 21: Topic Model  Latent  Dirichlet  Allocation

Topic model

Page 22: Topic Model  Latent  Dirichlet  Allocation

Topic model

Page 23: Topic Model  Latent  Dirichlet  Allocation

Topic model

[Formula annotations: word/topic count, topic/doc count, topic of x_ij, observed word, other topics, other words]
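These annotations match the terms of the collapsed Gibbs sampling conditional commonly used for LDA, which can be written as below. The notation is assumed rather than taken from the slide: $x_{ij}$ is the i-th word of document j, $z_{ij}$ its topic, $n_{k,w}$ the word/topic count, $n_{j,k}$ the topic/doc count, $n_k$ the number of words assigned to topic k, V the vocabulary size, and $\alpha$, $\beta$ symmetric Dirichlet hyperparameters. The superscript $-ij$ means the count excludes the current word.

```latex
p(z_{ij} = k \mid z_{-ij}, x)
  \;\propto\;
  \frac{n^{-ij}_{k,\,x_{ij}} + \beta}{n^{-ij}_{k} + V\beta}
  \,\bigl(n^{-ij}_{j,k} + \alpha\bigr)
```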

Page 24: Topic Model  Latent  Dirichlet  Allocation

Topic model

Apply the Dirichlet process in a topic model

           Topic 1  Topic 2  Topic 3
Document     P1       P2       P3

Learn the distribution of topics in a document

           Topic 1  Topic 2  Topic 3
Word         Q1       Q2       Q3

Learn the distribution of topics for a word

Page 25: Topic Model  Latent  Dirichlet  Allocation

Topic model

topic/doc table:

     t1  t2  t3
d1
d2
d3

word/topic table:

     w1  w2  w3  w4
t1
t2
t3

Page 26: Topic Model  Latent  Dirichlet  Allocation

Topic model

Latent Dirichlet allocation:

Dirichlet mixture model:
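Latent Dirichlet allocation is commonly stated as a generative process: each topic draws a word distribution from Dir(β), each document draws a topic distribution from Dir(α), and each word first draws a topic and then a word from that topic. The sketch below follows that textbook description. The function names, the symmetric hyperparameter values, and the fixed document length are assumptions made for illustration.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw from a Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def generate_corpus(n_docs, doc_len, vocab, n_topics, alpha, beta, rng):
    """Generate documents from the LDA generative process."""
    # One word distribution per topic.
    phi = [sample_dirichlet([beta] * len(vocab), rng) for _ in range(n_topics)]
    corpus = []
    for _ in range(n_docs):
        # One topic distribution per document.
        theta = sample_dirichlet([alpha] * n_topics, rng)
        doc = []
        for _ in range(doc_len):
            topic = rng.choices(range(n_topics), weights=theta)[0]
            word = rng.choices(vocab, weights=phi[topic])[0]
            doc.append(word)
        corpus.append(doc)
    return corpus

vocab = ["ipad", "apple", "itunes", "mirror", "queen", "joker", "ladygaga"]
rng = random.Random(1)
corpus = generate_corpus(4, 3, vocab, n_topics=3, alpha=0.5, beta=0.5, rng=rng)
for doc in corpus:
    print(doc)
```

Inference runs this process in reverse: given only the words, it recovers plausible topic assignments and distributions.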

Page 27: Topic Model  Latent  Dirichlet  Allocation

LDA example

w: ipad apple itunes mirror queen joker ladygaga

t1: product
t2: story
t3: poker

d1: ipad apple itunes
d2: apple mirror queen
d3: queen joker ladygaga
d4: queen ladygaga mirror

In fact, the topics are latent

Page 28: Topic Model  Latent  Dirichlet  Allocation

LDA example

d1: ipad apple itunes (topics: 1 2 3)
d2: apple mirror queen (topics: 2 1 2)
d3: queen joker ladygaga (topics: 3 3 1)
d4: queen ladygaga mirror (topics: 2 1 2)

word/topic table:

     ipad  apple  itunes  mirror  queen  joker  ladygaga
t1      1      0       0       1      0      0         2
t2      0      2       0       1      2      0         0
t3      0      0       1       0      1      1         0
sum     1      2       1       2      3      1         2

topic/doc table:

      t1   t2   t3
d1     1    1    1
d2     1    2    0
d3     1    0    2
d4     1    2    0

Page 29: Topic Model  Latent  Dirichlet  Allocation

LDA example

d1: ipad apple itunes (topics: 1 2 3)
d2: apple mirror queen (topics: 2 1 2)
d3: joker ladygaga (topics: 3 1)
d4: queen ladygaga mirror (topics: 2 1 2)

Taken out of d3: queen

word/topic table:

     ipad  apple  itunes  mirror  queen  joker  ladygaga
t1      1      0       0       1      0      0         2
t2      0      2       0       1      2      0         0
t3      0      0       1       0      1      1         0
sum     1      2       1       2      3      1         2

topic/doc table:

      t1   t2   t3
d1     1    1    1
d2     1    2    0
d3     1    0    2
d4     1    2    0

Page 30: Topic Model  Latent  Dirichlet  Allocation

LDA example

d1: ipad apple itunes (topics: 1 2 3)
d2: apple mirror queen (topics: 2 1 2)
d3: joker ladygaga (topics: 3 1)
d4: queen ladygaga mirror (topics: 2 1 2)

Taken out of d3: queen

word/topic table:

     ipad  apple  itunes  mirror  queen  joker  ladygaga
t1      1      0       0       1      0      0         2
t2      0      2       0       1      2      0         0
t3      0      0       1       0    1-1      1         0
sum     1      2       1       2    3-1      1         2

topic/doc table:

      t1   t2   t3
d1     1    1    1
d2     1    2    0
d3     1    0  2-1
d4     1    2    0

Page 31: Topic Model  Latent  Dirichlet  Allocation

LDA example

d1: ipad apple itunes (topics: 1 2 3)
d2: apple mirror queen (topics: 2 1 2)
d3: joker ladygaga (topics: 3 1)
d4: queen ladygaga mirror (topics: 2 1 2)

Taken out of d3: queen

word/topic table:

     ipad  apple  itunes  mirror  queen  joker  ladygaga
t1      1      0       0       1      0      0         2
t2      0      2       0       1      2      0         0
t3      0      0       1       0      0      1         0
sum     1      2       1       2      2      1         2

topic/doc table:

      t1   t2   t3
d1     1    1    1
d2     1    2    0
d3     1    0    1
d4     1    2    0

Page 32: Topic Model  Latent  Dirichlet  Allocation

LDA example

d1: ipad apple itunes (topics: 1 2 3)
d2: apple mirror queen (topics: 2 1 2)
d3: joker ladygaga (topics: 3 1)
d4: queen ladygaga mirror (topics: 2 1 2)

queen (from d3): newly assigned to topic 2

word/topic table:

     ipad  apple  itunes  mirror  queen  joker  ladygaga
t1      1      0       0       1      0      0         2
t2      0      2       0       1    2+1      0         0
t3      0      0       1       0      0      1         0
sum     1      2       1       2    2+1      1         2

topic/doc table:

      t1   t2   t3
d1     1    1    1
d2     1    2    0
d3     1  0+1    1
d4     1    2    0
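The remove, re-score, re-add cycle walked through in this example can be sketched in code. The corpus and topic assignments are copied from the example. The hyperparameters `ALPHA` and `BETA` and the helper names `update` and `topic_probs` are assumptions, and the scoring rule is the usual collapsed Gibbs conditional rather than something stated in the example itself.

```python
from collections import Counter

K = 3
ALPHA, BETA = 0.5, 0.1  # assumed symmetric hyperparameters (not from the slides)

docs = [
    ["ipad", "apple", "itunes"],
    ["apple", "mirror", "queen"],
    ["queen", "joker", "ladygaga"],
    ["queen", "ladygaga", "mirror"],
]
# Topic assignment of every word, copied from the example.
z = [[1, 2, 3], [2, 1, 2], [3, 3, 1], [2, 1, 2]]
V = len({w for doc in docs for w in doc})  # vocabulary size

# Build the word/topic and topic/doc count tables.
word_topic, doc_topic, topic_total = Counter(), Counter(), Counter()
for d, (words, topics) in enumerate(zip(docs, z)):
    for w, t in zip(words, topics):
        word_topic[w, t] += 1
        doc_topic[d, t] += 1
        topic_total[t] += 1

def update(word, d, t, delta):
    """Add (+1) or remove (-1) one word occurrence from all count tables."""
    word_topic[word, t] += delta
    doc_topic[d, t] += delta
    topic_total[t] += delta

def topic_probs(word, d):
    """Normalized conditional over topics for one held-out word."""
    weights = [
        (word_topic[word, t] + BETA) / (topic_total[t] + V * BETA)
        * (doc_topic[d, t] + ALPHA)
        for t in range(1, K + 1)
    ]
    norm = sum(weights)
    return [wt / norm for wt in weights]

d, i = 2, 0                      # "queen" in d3 (0-based document index 2)
word = docs[d][i]
update(word, d, z[d][i], -1)     # take the word out of the counts
probs = topic_probs(word, d)     # score each topic for the held-out word
new_topic = 2                    # outcome shown in the example; a sampler would draw from probs
update(word, d, new_topic, +1)   # put the word back under its new topic
z[d][i] = new_topic
print(probs)
```

Under these assumed hyperparameters topic 2 receives the largest probability, mostly because "queen" is already assigned to topic 2 twice elsewhere in the corpus. A full sampler repeats this step for every word over many sweeps.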

Page 33: Topic Model  Latent  Dirichlet  Allocation

Further

Dirichlet distribution prior: K topics

Dirichlet process prior: infinite topics

Alpha mainly controls the probability of a topic that has little training data in a document.

Beta mainly controls the probability of a topic that has little training data for a word.

Supervised

Unsupervised
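The effect of alpha on rarely seen topics can be illustrated with the smoothed estimate (n_k + α) / (N + Kα) of a document's topic proportions. The sketch below is an illustration under that assumed formula, using the topic/doc counts (1, 0, 2) of d3 from the LDA example; the function name is hypothetical.

```python
def smoothed_topic_probs(topic_counts, alpha):
    """Topic proportions of one document under a symmetric Dirichlet(alpha) prior."""
    total = sum(topic_counts) + alpha * len(topic_counts)
    return [(n + alpha) / total for n in topic_counts]

d3_counts = [1, 0, 2]  # topic/doc counts of d3 (t1, t2, t3)
for alpha in (0.01, 0.1, 1.0):
    probs = smoothed_topic_probs(d3_counts, alpha)
    print(alpha, [round(p, 3) for p in probs])
```

As alpha grows, the unseen topic t2 receives more probability mass. Beta plays the same role for words that a topic has rarely or never been assigned.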

Page 34: Topic Model  Latent  Dirichlet  Allocation

Further

Unrealistic bag-of-words assumption: TNG, biLDA

Loses power-law behavior: Pitman-Yor language model

David Blei has done an extensive survey on topic models: http://home.etf.rs/~bfurlan/publications/SURVEY-1.pdf

Page 35: Topic Model  Latent  Dirichlet  Allocation

Q&A
