Fast and Accurate Inference for Topic Models
Transcript of Fast and Accurate Inference for Topic Models
![Page 1: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/1.jpg)
Fast and Accurate Inference for Topic Models
James Foulds, University of California, Santa Cruz
Presented at eBay Research Labs
![Page 2: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/2.jpg)
2
Motivation

• There is an ever-increasing wealth of digital information available:
– Wikipedia
– News articles
– Scientific articles
– Literature
– Debates
– Blogs, social media, …
• We would like automatic methods to help us understand this content
![Page 3: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/3.jpg)
3
Motivation
• Personalized recommender systems
• Social network analysis
• Exploratory tools for scientists
• The digital humanities
• …
![Page 4: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/4.jpg)
4
The Digital Humanities
![Page 5: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/5.jpg)
5
Dimensionality reduction
The quick brown fox jumps over the sly lazy dog
![Page 6: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/6.jpg)
6
Dimensionality reduction
The quick brown fox jumps over the sly lazy dog
[5 6 37 1 4 30 9 22 570 12]
![Page 7: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/7.jpg)
7
Dimensionality reduction
The quick brown fox jumps over the sly lazy dog
[5 6 37 1 4 30 9 22 570 12]

Foxes: 40%, Dogs: 40%, Jumping: 20%
![Page 8: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/8.jpg)
8
Latent Variable Models
Z: latent variables
Φ: parameters
X: observed data (data points)

dimensionality(X) >> dimensionality(Z)
Z is a bottleneck, which finds a compressed, low-dimensional representation of X.
![Page 9: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/9.jpg)
Latent Feature Models for Social Networks
Alice Bob
Claire
![Page 10: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/10.jpg)
Latent Feature Models for Social Networks

Alice: Cycling, Fishing, Running
Bob: Waltz, Running
Claire: Tango, Salsa
![Page 13: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/13.jpg)
Miller, Griffiths, Jordan (2009)
Latent Feature Relational Model

Alice: Cycling, Fishing, Running
Bob: Waltz, Running
Claire: Tango, Salsa

Z =

|        | Cycling | Fishing | Running | Tango | Salsa | Waltz |
|--------|---------|---------|---------|-------|-------|-------|
| Alice  | 1 | 1 | 1 |   |   |   |
| Bob    |   |   | 1 |   |   | 1 |
| Claire |   |   |   | 1 | 1 |   |
![Page 14: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/14.jpg)
14
Latent Representations
• Binary latent feature

|        | Cycling | Fishing | Running | Tango | Salsa | Waltz |
|--------|---------|---------|---------|-------|-------|-------|
| Alice  | 1 | 1 | 1 |   |   |   |
| Bob    |   |   | 1 |   |   | 1 |
| Claire |   |   |   | 1 | 1 |   |

• Latent class: each person is assigned to exactly one class (a single 1 per row)

• Mixed membership: each row is a distribution over classes

|        | Cycling | Fishing | Running | Tango | Salsa | Waltz |
|--------|---------|---------|---------|-------|-------|-------|
| Alice  | 0.2 | 0.4 | 0.4 |   |   |   |
| Bob    |   |   | 0.5 |   |   | 0.5 |
| Claire |   |   |   | 0.9 | 0.1 |   |
![Page 17: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/17.jpg)
17
Latent Variable Models as Matrix Factorization
![Page 20: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/20.jpg)
Miller, Griffiths, Jordan (2009)
Latent Feature Relational Model

Alice: Cycling, Fishing, Running
Bob: Waltz, Running
Claire: Tango, Salsa

Z =

|        | Cycling | Fishing | Running | Tango | Salsa | Waltz |
|--------|---------|---------|---------|-------|-------|-------|
| Alice  | 1 | 1 | 1 |   |   |   |
| Bob    |   |   | 1 |   |   | 1 |
| Claire |   |   |   | 1 | 1 |   |

E[Y] = σ(Z W Zᵀ)
![Page 21: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/21.jpg)
21
Topics
Topic 1: Reinforcement learning
Topic 2: Learning algorithms
Topic 3: Character recognition

Each topic is a distribution over all words in the dictionary: a vector of discrete probabilities that sums to one.
![Page 22: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/22.jpg)
22
Topics
Topic 1: Reinforcement learning
Topic 2: Learning algorithms
Topic 3: Character recognition

(Top 10 words shown for each topic.)
![Page 24: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/24.jpg)
24
Latent Dirichlet Allocation (Blei et al., 2003)

• For each document d:
– Draw its topic proportions θ(d) ~ Dirichlet(α)
– For each word w_{d,n}:
• Draw a topic assignment z_{d,n} ~ Discrete(θ(d))
• Draw the word from the chosen topic: w_{d,n} ~ Discrete(φ(z_{d,n}))
![Page 25: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/25.jpg)
25
Latent Dirichlet Allocation (Blei et al., 2003)

• For each topic k:
– Draw its distribution over words φ(k) ~ Dirichlet(β)
• For each word w_{d,n}:
– Draw a topic assignment z_{d,n} ~ Discrete(θ(d))
– Draw the word from the chosen topic: w_{d,n} ~ Discrete(φ(z_{d,n}))
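The generative process above can be sketched in Python. The corpus sizes, vocabulary size, and hyperparameter values below are illustrative assumptions, not numbers from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

K, V, D, N = 3, 10, 5, 20  # topics, vocabulary size, documents, words per document
alpha, beta = 0.1, 0.1     # Dirichlet hyperparameters

# For each topic k: draw its distribution over words phi^(k) ~ Dirichlet(beta)
phi = rng.dirichlet(np.full(V, beta), size=K)

docs = []
for d in range(D):
    # Draw the document's topic proportions theta^(d) ~ Dirichlet(alpha)
    theta = rng.dirichlet(np.full(K, alpha))
    words = []
    for n in range(N):
        z = rng.choice(K, p=theta)   # topic assignment z_{d,n}
        w = rng.choice(V, p=phi[z])  # word w_{d,n} from the chosen topic
        words.append(w)
    docs.append(words)

print(len(docs), len(docs[0]))  # D documents of N words each
```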
![Page 31: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/31.jpg)
31
LDA as Matrix Factorization
θ × φᵀ
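In equations, the factorization reads as follows, where W denotes the vocabulary size and the matrix dimensions are the standard ones for LDA:

```latex
p(w \mid d) \;=\; \sum_{k=1}^{K} \theta_{dk}\,\phi_{kw},
\qquad\text{i.e.}\qquad
\underbrace{P}_{D \times W} \;\approx\; \underbrace{\Theta}_{D \times K}\,\underbrace{\Phi^{\top}}_{K \times W}
```

Each row of Θ is a document's topic proportions and each row of Φ is a topic's word distribution, so the product reconstructs the per-document word probabilities.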
![Page 32: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/32.jpg)
32
Let’s say we want to build an LDA topic model on Wikipedia
![Page 33: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/33.jpg)
33
LDA on Wikipedia
[Plot: average log likelihood (y-axis, −780 to −600) vs. time in seconds (x-axis, 10² to 10⁵, log scale) for VB on 10,000 documents, with reference marks at 10 mins, 1 hour, 6 hours, and 12 hours.]
![Page 34: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/34.jpg)
34
LDA on Wikipedia
[Plot: average log likelihood vs. time (s, log scale) for VB (10,000 documents) and VB (100,000 documents).]
![Page 35: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/35.jpg)
35
LDA on Wikipedia
[Plot: average log likelihood vs. time (s, log scale) for VB (10,000 documents) and VB (100,000 documents). On the full corpus, 1 full iteration = 3.5 days!]
![Page 36: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/36.jpg)
36
LDA on Wikipedia
Stochastic variational inference
[Plot: average log likelihood vs. time (s, log scale), adding Stochastic VB (all documents) alongside VB (10,000 documents) and VB (100,000 documents).]
![Page 37: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/37.jpg)
37
LDA on Wikipedia
Stochastic collapsed variational inference
[Plot: average log likelihood vs. time (s, log scale) for SCVB0 (all documents), Stochastic VB (all documents), VB (10,000 documents), and VB (100,000 documents).]
![Page 38: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/38.jpg)
38
Available tools
|            | VB | Collapsed Gibbs sampling | Collapsed VB |
|------------|----|--------------------------|--------------|
| Batch      | Blei et al. (2003) | Griffiths and Steyvers (2004) | Teh et al. (2007), Asuncion et al. (2009) |
| Stochastic | Hoffman et al. (2010, 2013) | Mimno et al. (2012) (partially collapsed VB/Gibbs hybrid) | ??? |
![Page 40: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/40.jpg)
40
Collapsed Inference for LDA
Griffiths and Steyvers (2004)
• Marginalize out the parameters, and perform inference on the latent variables only
Z
𝛉
𝚽 Z
![Page 41: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/41.jpg)
41
Collapsed Inference for LDA
Griffiths and Steyvers (2004)
• Marginalize out the parameters, and perform inference on the latent variables only
– Simpler, faster and fewer update equations– Better mixing for Gibbs sampling
![Page 42: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/42.jpg)
42
Collapsed Inference for LDA
Griffiths and Steyvers (2004)

• Collapsed Gibbs sampler, built from three count statistics:
– Word-topic counts
– Document-topic counts
– Topic counts
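For reference, the standard collapsed Gibbs update for LDA combines exactly these three counts. The notation here is the usual one (assumed, since the slide equation is not in the transcript): $n^{\neg dn}$ denotes counts excluding the current token and V is the vocabulary size:

```latex
P\bigl(z_{d,n} = k \mid \mathbf{z}^{\neg dn}, \mathbf{w}\bigr)
\;\propto\;
\frac{n^{\neg dn}_{k,\,w_{d,n}} + \beta}{n^{\neg dn}_{k,\cdot} + V\beta}
\;\bigl(n^{\neg dn}_{d,k} + \alpha\bigr)
```

The first factor uses the word-topic and topic counts; the second uses the document-topic counts.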
![Page 46: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/46.jpg)
46
Stochastic Optimization for ML
Stochastic algorithms:
– While (not converged):
• Process a subset of the dataset, to estimate the update
• Update the parameters
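As a concrete instance of this loop, here is a minimal stochastic gradient descent example on a toy least-squares problem; all values (data, step size, iteration count) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: y = 3x plus noise; we recover the slope w by SGD.
x = rng.normal(size=1000)
y = 3.0 * x + 0.1 * rng.normal(size=1000)

w, rho = 0.0, 0.05  # initial parameter and step size
for t in range(2000):
    i = rng.integers(len(x))               # process a subset of the data (one point)
    grad = 2.0 * x[i] * (x[i] * w - y[i])  # estimate the update (here, the gradient)
    w -= rho * grad                        # update the parameter
print(w)  # close to 3.0
```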
![Page 47: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/47.jpg)
47
Stochastic Optimization for ML
• Stochastic gradient descent
– Estimate the gradient
• Stochastic variational inference (Hoffman et al., 2010, 2013)
– Estimate the natural gradient of the variational parameters
• Online EM (Cappé and Moulines, 2009)
– Estimate the E-step sufficient statistics
![Page 48: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/48.jpg)
48
Goal: Build a Fast, Accurate, Scalable Algorithm for LDA

• Collapsed LDA
– Easy to implement
– Fast
– Accurate
– Mixes well / propagates information quickly
• Stochastic algorithms
– Scalable
– Quickly forget the random initialization
– Memory requirements and update time independent of the size of the data set
– Can estimate topics before a single pass over the data is complete
• Our contribution: an algorithm that gets the best of both worlds
![Page 49: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/49.jpg)
49
Variational Bayesian Inference
• An optimization strategy for performing posterior inference, i.e. estimating Pr(Z|X)
[Figure: the variational distribution Q approximates the true posterior P by minimizing KL(Q || P).]
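The objective being optimized can be written with the standard VB identity, where Z collects the latent variables:

```latex
\log p(X)
\;=\;
\underbrace{\mathbb{E}_{Q}\bigl[\log p(X, Z)\bigr] - \mathbb{E}_{Q}\bigl[\log Q(Z)\bigr]}_{\mathcal{L}(Q)\text{, the evidence lower bound}}
\;+\;
\mathrm{KL}\bigl(Q(Z)\,\|\,\Pr(Z \mid X)\bigr)
```

Since log p(X) does not depend on Q, maximizing the lower bound L(Q) is equivalent to minimizing KL(Q || P).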
![Page 54: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/54.jpg)
54
Collapsed Variational Bayes (Teh et al., 2007)

• K-dimensional discrete variational distributions for each token
• Mean field assumption
• Improved variational bound
![Page 55: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/55.jpg)
55
Collapsed VB
Mean field assumption

|         | The  | Quick | Brown | Fox | Jumped | Over |
|---------|------|-------|-------|-----|--------|------|
| Foxes   | 0.33 | 0.5   | 0.5   | 1   | 0      | 0.2  |
| Dogs    | 0.33 | 0.3   | 0.5   | 0   | 0      | 0.2  |
| Jumping | 0.33 | 0.2   | 0     | 0   | 1      | 0.6  |

(Rows: topics. Columns: words.)
![Page 56: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/56.jpg)
56
Collapsed Variational Bayes (Teh et al., 2007)

• Collapsed Gibbs sampler

|         | The | Quick | Brown | Fox | Jumped | Over |
|---------|-----|-------|-------|-----|--------|------|
| Foxes   | 0   | 1     | 1     | 1   | 0      | 0    |
| Dogs    | 1   | 0     | 0     | 0   | 0      | 0    |
| Jumping | 0   | 0     | 0     | 0   | 1      | 1    |
![Page 57: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/57.jpg)
57
Collapsed Variational Bayes (Teh et al., 2007)

• Collapsed Gibbs sampler
• CVB0 (Asuncion et al., 2009)
![Page 58: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/58.jpg)
58
Collapsed Variational Bayes (Teh et al., 2007)

• CVB0 (Asuncion et al., 2009)

|         | The  | Quick | Brown | Fox | Jumped | Over |
|---------|------|-------|-------|-----|--------|------|
| Foxes   | 0.33 | 0.5   | 0.5   | 1   | 0      | 0.2  |
| Dogs    | 0.33 | 0.3   | 0.5   | 0   | 0      | 0.2  |
| Jumping | 0.33 | 0.2   | 0     | 0   | 1      | 0.6  |
![Page 60: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/60.jpg)
60
Collapsed Variational Bayes (Teh et al., 2007)

• CVB0 (Asuncion et al., 2009)

|         | The  | Quick | Brown | Fox | Jumped | Over |
|---------|------|-------|-------|-----|--------|------|
| Foxes   | 0.33 | 0.5   | 0.9   | 1   | 0      | 0.2  |
| Dogs    | 0.33 | 0.3   | 0.1   | 0   | 0      | 0.2  |
| Jumping | 0.33 | 0.2   | 0     | 0   | 1      | 0.6  |
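The CVB0 update that produces these revised values has a simple closed form (as in Asuncion et al., 2009; the notation here is assumed from that paper: $\gamma_{ijk}$ is the variational probability that token i of document j takes topic k, and $\neg ij$ means counts excluding that token):

```latex
\gamma_{ijk}
\;\propto\;
\frac{N^{\Phi,\neg ij}_{w_{ij},\,k} + \beta}{N^{Z,\neg ij}_{k} + V\beta}
\;\bigl(N^{\Theta,\neg ij}_{jk} + \alpha\bigr)
```

It has the same form as the collapsed Gibbs conditional, but with expected counts in place of sampled counts.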
![Page 61: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/61.jpg)
61
CVB0 Statistics
• Simple sums over the variational parameters
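Concretely, the three CVB0 statistics are plain sums over the variational parameters $\gamma_{ijk}$ (token i, document j, topic k; notation assumed from the SCVB0 paper):

```latex
N^{\Theta}_{jk} = \sum_{i} \gamma_{ijk},
\qquad
N^{\Phi}_{wk} = \sum_{i,j \,:\, w_{ij} = w} \gamma_{ijk},
\qquad
N^{Z}_{k} = \sum_{i,j} \gamma_{ijk}
```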
![Page 62: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/62.jpg)
62
Stochastic Optimization for ML
• Stochastic gradient descent
– Estimate the gradient
• Stochastic variational inference (Hoffman et al., 2010, 2013)
– Estimate the natural gradient of the variational parameters
• Online EM (Cappé and Moulines, 2009)
– Estimate the E-step sufficient statistics
• Stochastic CVB0
– Estimate the CVB0 statistics
![Page 64: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/64.jpg)
64
Estimating CVB0 Statistics
![Page 65: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/65.jpg)
65
Estimating CVB0 Statistics
• Pick a random word i from a random document j
![Page 66: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/66.jpg)
66
Estimating CVB0 Statistics
• Pick a random word i from a random document j
• An unbiased estimator is:
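Following the SCVB0 paper (Foulds et al., 2013), with the token drawn uniformly at random, scaled copies of its variational distribution $\gamma_{ij}$ give unbiased estimates of the CVB0 statistics; here C is the total number of tokens in the corpus and $C_j$ the length of document j:

```latex
\hat{N}^{\Theta}_{jk} = C_j\,\gamma_{ijk},
\qquad
\hat{N}^{\Phi}_{wk} = C\,\gamma_{ijk}\,\mathbb{1}[w_{ij} = w],
\qquad
\hat{N}^{Z}_{k} = C\,\gamma_{ijk}
```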
![Page 67: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/67.jpg)
67
Stochastic CVB0
• In an online algorithm, we cannot store the variational parameters
• But we can update them!
![Page 68: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/68.jpg)
68
Stochastic CVB0
• Keep an online average of the CVB0 statistics
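A minimal sketch of such an online average in Python; the target statistic, noise model, and step-size schedule below are illustrative assumptions, not the paper's actual quantities:

```python
import numpy as np

rng = np.random.default_rng(2)

# Suppose each processed token yields an unbiased but noisy estimate of a
# 3-dimensional statistic; we track it with a decaying online average,
# as SCVB0 does with the CVB0 statistics.
target = np.array([4.0, 1.0, 5.0])  # hypothetical true statistic
N_hat = np.zeros(3)                 # running estimate

for t in range(1, 5001):
    rho = 1.0 / (t + 10)                       # decaying step size
    noisy = target + rng.normal(size=3)        # one noisy per-token estimate
    N_hat = (1.0 - rho) * N_hat + rho * noisy  # online average update

print(np.round(N_hat, 1))  # close to the target statistic
```

With this step-size schedule the update is essentially a running mean, so the noise averages out as more tokens are processed.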
![Page 69: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/69.jpg)
69
Extra Refinements
• Optional burn-in passes per document
• Minibatches
• Operating on sparse counts
![Page 70: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/70.jpg)
70
Stochastic CVB0: Putting it all Together
![Page 71: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/71.jpg)
71
Experimental Results – Large Scale
![Page 73: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/73.jpg)
73
Experimental Results – Small Scale
• Real-time or near real-time results are important for exploratory data analysis (EDA) applications

• Human participants were shown the top ten words from each topic
![Page 74: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/74.jpg)
74
Experimental Results – Small Scale
[Bar chart: mean number of errors (scale 0 to 4.5) made by human participants, for SCVB0 vs. SVB on NIPS (5 seconds of training) and New York Times (60 seconds). Standard deviations: 1.1, 1.2, 1.0, 2.4.]
![Page 75: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/75.jpg)
75
Convergence Analysis
• Theorem: with an appropriate sequence of step sizes, SCVB0 converges to a stationary point of the MAP objective, with adjusted hyper-parameters
![Page 76: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/76.jpg)
76
Convergence Analysis
• Step 1) An alternative derivation of “batch SCVB0” as an EM algorithm for MAP
EM statistics:
E-step responsibilities
![Page 77: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/77.jpg)
77
Convergence Analysis
• Step 1) An alternative derivation of “batch SCVB0” as an EM algorithm for MAP
EM statistics:
E-step:
Equivalent to the SCVB0 update, but with hyper-parameters adjusted by one
![Page 78: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/78.jpg)
78
Convergence Analysis
• Step 1) An alternative derivation of “batch SCVB0” as an EM algorithm for MAP
EM statistics:
M-step:
E-step:
Synchronize the parameters (estimated EM statistics) with the EM statistics
![Page 79: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/79.jpg)
79
Convergence Analysis
• Step 2) Stochastic CVB0 is a Robbins Monro stochastic approximation algorithm for finding the fixed points of this EM algorithm
![Page 80: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/80.jpg)
80
Convergence Analysis
• Step 2) Stochastic CVB0 is a Robbins Monro stochastic approximation algorithm for finding the fixed points of this EM algorithm
Goal: Find the roots of a function
![Page 81: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/81.jpg)
81
Convergence Analysis
• Step 2) Stochastic CVB0 is a Robbins Monro stochastic approximation algorithm for finding the fixed points of this EM algorithm
Observe noisy measurement
![Page 82: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/82.jpg)
82
Convergence Analysis
• Step 2) Stochastic CVB0 is a Robbins Monro stochastic approximation algorithm for finding the fixed points of this EM algorithm
Observe noisy measurement
Move in the direction of the noisy measurement
![Page 83: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/83.jpg)
83
Convergence Analysis
• Step 2) Stochastic CVB0 is a Robbins Monro stochastic approximation algorithm for finding the fixed points of this EM algorithm
The step that the EM algorithm takes
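In symbols, a Robbins-Monro scheme seeks a root of a function h(x) using only noisy evaluations $\hat{h}$, with a step-size sequence satisfying the standard conditions:

```latex
x_{t+1} = x_t + \rho_t\,\hat{h}(x_t),
\qquad
\mathbb{E}\bigl[\hat{h}(x)\bigr] = h(x),
\qquad
\sum_{t} \rho_t = \infty,
\quad
\sum_{t} \rho_t^{2} < \infty
```

Here the "noisy measurement" is the minibatch estimate, and h is the step that the batch EM algorithm would take.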
![Page 84: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/84.jpg)
84
Convergence Analysis
• Step 3) Show that the stochastic approximation algorithm converges
• A Lyapunov function is an “objective function” for an SA algorithm.
• The existence of such a function, with certain conditions holding, is sufficient for convergence with an appropriate sequence of step sizes
![Page 85: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/85.jpg)
85
Convergence Analysis
• Step 3) Show that the stochastic approximation algorithm converges
• A Lyapunov function is an “objective function” for an SA algorithm.
• The existence of such a function, with certain properties holding, is sufficient for convergence with an appropriate sequence of step sizes
• We show that (the negative of the Lagrangian of) the EM lower bound is such a Lyapunov function
![Page 87: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/87.jpg)
87
Future work
• Exploit sparsity
• Parallelization
• Nonparametric extensions
• Generalizations to other models?
![Page 88: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/88.jpg)
88
Probabilistic Soft Logic (Lise Getoor’s research group; see psl.cs.umd.edu)
User-specified logical rules
![Page 91: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/91.jpg)
91
Probabilistic Soft Logic (Lise Getoor’s research group; see psl.cs.umd.edu)
User-specified logical rules
Probabilistic model
Fast inference
Structured prediction
Entity resolution
Collective classification
Link prediction
…
![Page 92: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/92.jpg)
92
Publications from my Thesis Work
Algorithm papers
• J. R. Foulds, L. Boyles, C. DuBois, P. Smyth and M. Welling. Stochastic collapsed variational Bayesian inference for latent Dirichlet allocation. KDD 2013.
• J. R. Foulds and P. Smyth. Annealing paths for the evaluation of topic models. UAI 2014.

Modeling papers
• J. R. Foulds and P. Smyth. Modeling scientific impact with topical influence regression. EMNLP 2013.
• J. R. Foulds, A. Asuncion, C. DuBois, C. T. Butts and P. Smyth. A dynamic relational infinite feature model for longitudinal social networks. AISTATS 2011.
![Page 93: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/93.jpg)
93
Other publications• C. DuBois, J. R. Foulds, P. Smyth. Latent set models for two-mode network data. ICWSM 2011.
• J. R. Foulds, N. Navaroli, P. Smyth, A. Ihler. Revisiting MAP estimation, message passing and perfect graphs. AISTATS 2011.
• J. R. Foulds and P. Smyth. Multi-instance mixture models and semi-supervised learning. SIAM SDM 2011.
• J. R. Foulds and E. Frank. Speeding up and boosting diverse density learning. Discovery Science, 2010.
• J. R. Foulds and E. Frank. A review of multi-instance learning assumptions. Knowledge Engineering Review, 25(1), 2010.
• J. R. Foulds and E. Frank. Revisiting multiple-instance learning via embedded instance selection. Australasian Joint Conference on Artificial Intelligence, 2008.
• J. R. Foulds and L. R. Foulds. A probabilistic dynamic programming model of rape seed harvesting. International Journal of Operational Research, 1(4), 2006.
• J. R. Foulds and L. R. Foulds, Bridge lane direction specification for sustainable traffic management. Asia-Pacific Journal of Operational Research, 23(2), 2006.
![Page 94: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/94.jpg)
94
Thanks to my Collaborators
• My PhD advisor, Padhraic Smyth
• SCVB0 is also joint work with:
– Levi Boyles
– Chris DuBois
– Max Welling
![Page 95: Fast and Accurate Inference for Topic Models](https://reader035.fdocuments.in/reader035/viewer/2022062408/56813cc2550346895da66e16/html5/thumbnails/95.jpg)
95
Thank You!
Questions?