An Introduction to Topic Modeling
Daniel W. Peterson
Department of Computer Science
University of Colorado at Boulder
April 24, 2013
Daniel Peterson (University of Colorado) Introduction to the HDP April 24, 2013 1 / 20
Latent Semantic Analysis
Documents x Terms matrix: large and sparse
Use SVD to decompose it into three matrices
Keep only the “important” dimensions
Assumptions:
Word order doesn’t matter
Words are orthogonal dimensions in a high-dimensional space
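The decomposition on this slide can be sketched with NumPy. This is only an illustration: the count matrix `X`, the choice `k = 2`, and every variable name are assumptions, not part of the original slides.

```python
import numpy as np

# Toy documents-by-terms count matrix (rows: documents, columns: terms).
# The counts are invented for illustration.
X = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 2],
], dtype=float)

# SVD factors X into three matrices: U (documents x dims),
# the singular values s (a diagonal matrix stored as a vector),
# and Vt (dims x terms).
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Keep only the k largest singular values (the "important" dimensions).
k = 2
doc_vectors = U[:, :k] * s[:k]      # each document as a k-dimensional vector
X_approx = doc_vectors @ Vt[:k, :]  # rank-k approximation of X
```

Documents that use similar terms should end up with similar rows in `doc_vectors`, even though the reduced space has fewer dimensions than the vocabulary.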
Probabilistic Latent Semantic Analysis
Documents are generated by a probabilistic process
Structure based on topics
Different topics make different words more likely
Assumptions:
Word order doesn’t matter
Each word is chosen as the result of exactly one topic
Probabilistic Latent Semantic Analysis
N documents
A document is L words long
Each entry has an assignment to one of K topics
Probabilistic Latent Semantic Analysis
How do we choose a topic?
We sample from a distribution over topics.
How do we choose a word?
We sample from a distribution over words.
Multinomial Distribution
Select one of several possible outcomes
Outcomes may be equally likely (like dice)
OR: some outcomes may be more likely than others (load the dice)
Looks like: a 1 × n vector of probabilities
[x1, x2, ..., xn]
x1 + x2 + ... + xn = 1
every xi > 0
A sample looks like: a number
The outcome of rolling the dice
Probability we get i is given by xi
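A small sketch of the dice picture, using only the Python standard library. The probabilities in `loaded` and the helper name `roll` are invented for illustration.

```python
import random
from collections import Counter

rng = random.Random(0)  # fixed seed so the run is repeatable

fair = [1 / 6] * 6                        # every outcome equally likely
loaded = [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]   # loaded die that favors 6

def roll(probs, rng):
    """Draw one outcome i (1-based) with probability probs[i - 1]."""
    outcomes = range(1, len(probs) + 1)
    return rng.choices(outcomes, weights=probs, k=1)[0]

# A single sample is just a number; many samples reveal the probabilities.
rolls = [roll(loaded, rng) for _ in range(10_000)]
freq = Counter(rolls)
```

With the loaded vector, roughly half of the rolls should come up 6, while each other face appears about one time in ten.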
Probabilistic Latent Semantic Analysis
θ is a distribution over topics in a document
One θ for each document
θ is a 1 × K vector
Sum of θ is 1
φ is a distribution over words in a topic
One φ for each topic
φ is a 1 × W vector
Sum of φ is 1
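Given θ and φ, generating a document might look like the sketch below: for each word position, pick a topic from θ, then pick a word from that topic's φ. The vocabulary, the numbers in `theta` and `phi`, and the function name are all made up for illustration.

```python
import random

rng = random.Random(0)

vocab = ["gene", "dna", "cell", "ball", "team", "goal"]  # W = 6, invented

# One phi per topic: a distribution over the W words (each row sums to 1).
phi = [
    [0.35, 0.30, 0.29, 0.02, 0.02, 0.02],  # topic 0 favors biology words
    [0.02, 0.02, 0.02, 0.30, 0.34, 0.30],  # topic 1 favors sports words
]

# One theta per document: a distribution over the K = 2 topics.
theta = [0.9, 0.1]

def generate_document(theta, phi, length, rng):
    """Return (topic, word) pairs: a topic drawn from theta, then a word from phi[topic]."""
    doc = []
    for _ in range(length):
        z = rng.choices(range(len(theta)), weights=theta, k=1)[0]
        w = rng.choices(range(len(phi[z])), weights=phi[z], k=1)[0]
        doc.append((z, vocab[w]))
    return doc

doc = generate_document(theta, phi, 8, rng)
```

Each word comes from exactly one topic, and word order carries no information, matching the two assumptions stated earlier.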
Probabilistic Latent Semantic Analysis
Fold θ into graphical model
Where do θ and φ come from?
Topic Modeling
Sample θ and φ from an appropriate distribution
Dirichlet: a distribution over distributions
Incorporating a Dirichlet prior provides smoothing
Dirichlet Distribution
Takes n parameters α1, α2, ..., αn
Distribution over 1 × n vectors with sum of 1
αi are called concentration parameters
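To get a feel for the concentration parameters, the sketch below draws from two symmetric Dirichlets. It uses the standard construction of normalizing independent Gamma(αi, 1) draws. The α values, sample sizes, and helper names are illustrative assumptions.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one vector from Dirichlet(alphas) by normalizing Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [g / total for g in draws]

def mean_max(samples):
    """Average size of the largest component; near 1 means mass piles onto one outcome."""
    return sum(max(x) for x in samples) / len(samples)

rng = random.Random(0)
sparse = [sample_dirichlet([0.1, 0.1, 0.1], rng) for _ in range(2000)]
dense = [sample_dirichlet([10.0, 10.0, 10.0], rng) for _ in range(2000)]

sparse_peak = mean_max(sparse)  # small alphas: samples sit near the corners
dense_peak = mean_max(dense)    # large alphas: samples sit near the uniform vector
```

Small αi tend to give vectors dominated by one outcome, while large αi give vectors close to uniform.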
Dirichlet Distribution with 2 Parameters
Figure: Image source: Wikipedia
Dirichlet Distribution with 3 Parameters
Figure: Image source: Yee Whye Teh
A Sample from a Dirichlet
A particular 1 × n vector with sum of 1
[x1, x2, ..., xn] such that x1 + x2 + ... + xn = 1
every xi > 0
A multinomial distribution
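The two levels can be made concrete in a few lines: first draw a probability vector from a Dirichlet, then use that vector as a multinomial and roll it. The α values and the number of rolls are arbitrary choices for this sketch.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one vector from Dirichlet(alphas) by normalizing Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [g / total for g in draws]

rng = random.Random(1)

# Level 1: a sample from the Dirichlet is itself a probability vector.
x = sample_dirichlet([2.0, 2.0, 2.0, 2.0], rng)

# Level 2: treat that vector as a loaded four-sided die and roll it 20 times.
rolls = rng.choices(range(1, 5), weights=x, k=20)
```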
Topic Modeling
Sample θ and φ from a Dirichlet distribution
This is important for when we turn the model around:
Dirichlet distribution is conjugate prior of multinomial:
Given a Dirichlet prior, and counts of topic assignments, the posterior is also Dirichlet
β and γ are smoothing parameters
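Conjugacy makes the update a matter of adding counts to the prior parameters. The sketch below assumes a symmetric prior with parameter β over K = 3 topics. The counts and the value of `beta` are invented.

```python
def dirichlet_posterior(prior, counts):
    """Conjugate update: Dirichlet(prior) plus multinomial counts gives Dirichlet(prior + counts)."""
    return [a + n for a, n in zip(prior, counts)]

def dirichlet_mean(alphas):
    """Mean of a Dirichlet: each parameter divided by the sum of all parameters."""
    total = sum(alphas)
    return [a / total for a in alphas]

K = 3
beta = 0.5                 # symmetric smoothing parameter (illustrative value)
topic_counts = [7, 3, 0]   # invented topic-assignment counts for one document

posterior = dirichlet_posterior([beta] * K, topic_counts)
theta_estimate = dirichlet_mean(posterior)
```

The third topic was never observed, yet its estimated probability is 0.5 / 11.5 rather than zero. That is the smoothing effect of the prior.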
Inference
Generative model explains how the data was created
Inference: trying to guess model parameters
Gibbs Sampling
Hard to determine most likely model parameters
Hard for even relatively likely parameters
Can’t sample from overall distribution: sample instead a single variable
Take a walk through distribution
One step (parameter) at a time
Spend more time walking around more likely areas
We can get to likely areas from anywhere
It doesn’t matter where we start!
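A toy version of this walk, on a joint distribution small enough to check by hand. The joint table, the starting state, and the step counts are invented. In a real topic model the joint is far too large to write down, which is why only the one-variable conditionals are used.

```python
import random
from collections import Counter

# Joint distribution over two binary variables (a, b); the numbers are invented.
joint = {(0, 0): 0.40, (0, 1): 0.10, (1, 0): 0.10, (1, 1): 0.40}

def sample_a_given_b(b, rng):
    """Resample a from P(a | b), which is proportional to joint[(a, b)]."""
    return rng.choices([0, 1], weights=[joint[(0, b)], joint[(1, b)]], k=1)[0]

def sample_b_given_a(a, rng):
    """Resample b from P(b | a), which is proportional to joint[(a, b)]."""
    return rng.choices([0, 1], weights=[joint[(a, 0)], joint[(a, 1)]], k=1)[0]

def gibbs(start, steps, burn_in, rng):
    """Walk through the state space one variable at a time and record where we spend time."""
    a, b = start
    visits = Counter()
    for step in range(steps):
        a = sample_a_given_b(b, rng)
        b = sample_b_given_a(a, rng)
        if step >= burn_in:          # discard early steps that still depend on the start
            visits[(a, b)] += 1
    kept = steps - burn_in
    return {state: visits[state] / kept for state in joint}

rng = random.Random(0)
estimate = gibbs(start=(1, 0), steps=50_000, burn_in=1_000, rng=rng)
```

The visit frequencies in `estimate` should land close to the entries of `joint`, even though the walk starts in one of the least likely states.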
Gibbs Sampling in a Topic Model
Start with random assignment of topics
For each <word, document> pair:
Sample θ based on counts and prior
Sample φ based on counts and prior
Choose k based on θ, φ, and w
Repeat the above many times
Smoothing (β and γ) very important
Bayes Rule
P(k|β,X) ∝ P(k|β)P(X|k)
Sampling from a conditional distribution can be broken down into sampling based on the parent nodes (prior, β) and the children (likelihood, X)
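Numerically, the rule amounts to multiplying prior by likelihood for each candidate k and then normalizing. The numbers below are invented. In a topic model the prior could be read as a document's θ and the likelihood as each topic's probability of the observed word, which is an interpretation assumed for this sketch.

```python
def posterior_over_topics(prior, likelihood):
    """Multiply P(k | beta) by P(X | k) for each k, then normalize so the result sums to 1."""
    unnormalized = [p * like for p, like in zip(prior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

prior = [0.7, 0.2, 0.1]           # P(k | beta) for three topics (invented)
likelihood = [0.01, 0.20, 0.05]   # P(X | k) for the observed word (invented)

posterior = posterior_over_topics(prior, likelihood)
```

Topic 0 has the largest prior, but the observed word is so unlikely under it that topic 1 ends up with the largest posterior probability.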
Blocked Gibbs Sampling in a Topic Model
Start with random assignment of topics
Repeat many times:
Sample all θ and φ from counts and prior
Choose k for a number of <word, document> pairs
More sampling, less counting
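One way the blocked sampler could be written is sketched below. It is not the author's implementation. The corpus, the hyperparameter values, and all names are assumptions; β is taken as the prior on θ and γ as the prior on φ; and this version resamples every pair in each sweep rather than a subset.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one vector from Dirichlet(alphas) by normalizing Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [g / total for g in draws]

def blocked_gibbs(docs, K, W, beta, gamma, sweeps, rng):
    """docs: list of documents, each a list of word ids in range(W)."""
    # Start with a random assignment of topics.
    z = [[rng.randrange(K) for _ in doc] for doc in docs]
    theta, phi = [], []
    for _ in range(sweeps):
        # Count topic use per document and word use per topic.
        doc_topic = [[0] * K for _ in docs]
        topic_word = [[0] * W for _ in range(K)]
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                doc_topic[d][z[d][i]] += 1
                topic_word[z[d][i]][w] += 1
        # Sample all theta and phi from counts and prior (Dirichlet posteriors).
        theta = [sample_dirichlet([n + beta for n in row], rng) for row in doc_topic]
        phi = [sample_dirichlet([n + gamma for n in row], rng) for row in topic_word]
        # Choose k for each <word, document> pair, holding theta and phi fixed.
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                weights = [theta[d][k] * phi[k][w] for k in range(K)]
                z[d][i] = rng.choices(range(K), weights=weights, k=1)[0]
    return z, theta, phi

# Tiny invented corpus: word ids 0-2 and 3-5 never appear in the same document.
docs = [[0, 1, 2, 0, 1], [0, 2, 1, 1], [3, 4, 5, 3], [4, 5, 3, 4, 5]]
z, theta, phi = blocked_gibbs(docs, K=2, W=6, beta=0.5, gamma=0.5,
                              sweeps=200, rng=random.Random(0))
```

Counting happens once per sweep instead of after every single assignment, and with θ and φ held fixed the topic choices within a sweep do not depend on each other.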
Collapsed Gibbs Sampling in a Topic Model
Integrate out θ and φ
Start with random assignment of topics
For each <word, document> pair:
Sample k directly from counts
Repeat many times
$$P(z_i = k \mid z_{-i}, w) \propto \frac{n^{(w_i)}_{-i,k} + \gamma}{n^{(\cdot)}_{-i,k} + W\gamma} \cdot \frac{n^{(d_i)}_{-i,k} + \beta}{n^{(d_i)}_{-i,\cdot} + K\beta}$$
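The conditional above translates fairly directly into code. The sketch below is one possible reading, not a reference implementation: `n_wk`, `n_k`, and `n_dk` are assumed names for the three count tables, and the corpus and hyperparameter values are invented. The second denominator does not depend on k, so it is dropped from the unnormalized weights.

```python
import random

def collapsed_gibbs(docs, K, W, beta, gamma, sweeps, rng):
    """docs: list of documents, each a list of word ids in range(W)."""
    # Start with a random assignment of topics and build the count tables.
    z = [[rng.randrange(K) for _ in doc] for doc in docs]
    n_wk = [[0] * K for _ in range(W)]   # times word w is assigned to topic k
    n_k = [0] * K                        # total words assigned to topic k
    n_dk = [[0] * K for _ in docs]       # words in document d assigned to topic k
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            n_wk[w][k] += 1
            n_k[k] += 1
            n_dk[d][k] += 1
    for _ in range(sweeps):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove token i from the counts (the "-i" subscript).
                k = z[d][i]
                n_wk[w][k] -= 1
                n_k[k] -= 1
                n_dk[d][k] -= 1
                # Sample k directly from counts; the constant denominator
                # n_dk[d] total + K * beta is omitted.
                weights = [
                    (n_wk[w][j] + gamma) / (n_k[j] + W * gamma) * (n_dk[d][j] + beta)
                    for j in range(K)
                ]
                k = rng.choices(range(K), weights=weights, k=1)[0]
                # Add the token back under its new topic.
                z[d][i] = k
                n_wk[w][k] += 1
                n_k[k] += 1
                n_dk[d][k] += 1
    return z, n_dk, n_wk, n_k

# Tiny invented corpus: word ids 0-2 and 3-5 never appear in the same document.
docs = [[0, 1, 2, 0, 1], [0, 2, 1, 1], [3, 4, 5, 3], [4, 5, 3, 4, 5]]
z, n_dk, n_wk, n_k = collapsed_gibbs(docs, K=2, W=6, beta=0.5, gamma=0.5,
                                     sweeps=200, rng=random.Random(0))
```

Because θ and φ are integrated out, the sampler never draws them. If estimates are wanted afterwards, they can be formed from the final counts plus the smoothing parameters.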