Practical Collapsed Stochastic Variational Inference

6
Arnim Bleier, [email protected] CSS Workshop, GESIS, Köln, 16.12.2013 Practical Collapsed Stochastic Variational Inference

description

Recent advances have made it feasible to apply the stochastic variational paradigm to a collapsed representation of latent Dirichlet allocation (LDA). While the stochastic variational paradigm has successfully been applied to an uncollapsed representation of the hierarchical Dirichlet process (HDP), no attempts to apply this type of inference in a collapsed setting of non-parametric topic modeling have been put forward so far. In this paper we explore such a collapsed stochastic variational Bayes inference for the HDP. The proposed online algorithm is easy to implement and accounts for the inference of hyper-parameters. First experiments show a promising improvement in predictive performance. http://mimno.infosci.cornell.edu/nips2013ws/nips2013tm_submission_29.pdf

Transcript of Practical Collapsed Stochastic Variational Inference

Page 1: Practical Collapsed Stochastic Variational Inference

Arnim Bleier, [email protected] CSS Workshop, GESIS, Köln, 16.12.2013

Practical Collapsed Stochastic Variational Inference

Page 2: Practical Collapsed Stochastic Variational Inference

Classic Collapsed LDA Inference

nkwdi= nkwdi

� zdik

nkwdi= nkwdi

+ zdik

zdik / (ndk + ↵⇡k)nkwdi

+ �

nk. + V �

ndk = ndk � zdik

ndk = ndk + zdik

repeat for each document d do for each word i in d do

end for end for until stopping criterion is met.

Page 3: Practical Collapsed Stochastic Variational Inference

nkwdi= nkwdi

� zdik

nkwdi= nkwdi

+ zdik

zdik / (ndk + ↵⇡k)nkwdi

+ �

nk. + V �

ndk = ndk � zdik

ndk = ndk + zdik

repeat for each document d do for each word i in d do

end for end for until stopping criterion is met.

⇡k = ⇡̄k

k�1Y

l=1

(1� ⇡̄l)

vk = � +TX

l=k+1,d

E[ (ndl � 1)]

uk = 1 +X

d

E[ (ndk � 1)]

Sato, I., Kurihara, K., Nakagawa, H.: Practical Collapsed Variational Bayes Inference for Hierarchical Dirichlet Process. KDD (2012)

⇡̄k =uk

uk + vk; with

Stochastic Practical Collapsed Variational Inference for the HDP

Page 4: Practical Collapsed Stochastic Variational Inference

Stochastic Practical Collapsed Variational Inference for the HDP

nkwdi= nkwdi

� zdik

nkwdi= nkwdi

+ zdik

zdik / (ndk + ↵⇡k)nkwdi

+ �

nk. + V �

ndk = ndk � zdik

repeat for each document d do for each word i in d do

ndk (1� ⇢dt )ndk + ⇢dtNdzkndk = ndk + zdiknkw (1� ⇢ct)nkw + ⇢ctNzk [wdi = w]

⇡k = ⇡̄k

k�1Y

l=1

(1� ⇡̄l)

vk = � +TX

l=k+1,d

E[ (ndl � 1)]

uk = 1 +X

d

E[ (ndk � 1)]

⇡̄k =uk

uk + vk; with

Bleier, A.: Practical Collapsed Stochastic Variational Inference for the HDP. NIPS Topic Models Workshop (2013)

uk (1� ⇢ht )uk + ⇢ht (1 +DE[ (ndk � 1)])

vk (1� ⇢ht )vk + ⇢ht (� +DTX

l=k+1

E[ (ndl � 1)])

end for end for until stopping criterion is met.

Page 5: Practical Collapsed Stochastic Variational Inference

Stochastic Variational Inference for LDA

nkwdi= nkwdi

� zdik

nkwdi= nkwdi

+ zdik

zdik / (ndk + ↵⇡k)nkwdi

+ �

nk. + V �

ndk = ndk � zdik

repeat for each document d do for each word i in d do

ndk (1� ⇢dt )ndk + ⇢dtNdzkndk = ndk + zdiknkw (1� ⇢ct)nkw + ⇢ctNzk [wdi = w]

end for end for until stopping criterion is met.

Foulds, J., Boyles, L., Smyth, P., Welling, M.: Stochastic Collapsed Variational Bayesian Inference for LDA. KDD (2013)

Page 6: Practical Collapsed Stochastic Variational Inference

Evaluation

Predictive performance of the algorithm on the Associated Press (TREC-1) data. For the evaluation we compared the perplexity versus the number of documents seen for PCSVB0, SCVB0 and PCVB0. We trained the model on 80% of the documents. All held out documents were split; 70% of the tokens in each held out document were used to estimate the document parameters, the remaining 30% were used to compute the perplexity. Step-size schedule: Configuration: corpus-wide schedule : s = 10, = 1000 document schedule : s = 1, = 10 hyperparameter & stick weights : s = 5, = 100

⇢t =s

(⌧ + t)0.9

⇢ct ⌧⇢dt ⌧

⌧⇢ht