Spatial LDA


Spatial Latent Dirichlet Allocation

Authors: Xiaogang Wang and Eric Grimson

Review by: George Mathew (george2)

Applications

• Text Mining
  • Identifying similar chapters in a book
• Computer Vision
  • Face Recognition
• Colocation Mining
  • Identifying forest fires
• Music Search
  • Identifying the genre of a piece of music from a segment of the song

LDA - Overview

• A generative probabilistic model
• Represented as words, documents, corpus, and labels
  • word - the primary unit of discrete data
  • document - a sequence of words
  • corpus - the collection of all documents
  • label (output) - the class of the document

Wait a minute … So how are we going to perform computer vision applications using words and documents?

• Here, words are visual words, which could consist of:
  • image patches
  • spatial and temporal interest points
  • moving pixels, etc.
• The paper takes image classification as its running computer vision example.

Data Preprocessing

• The image is convolved with a bank of filters: 3 Gaussians, 4 Laplacians of Gaussians, and 4 first-order derivatives of Gaussians.
• A grid divides the image into local patches, and each patch is densely sampled for a local descriptor.
• The local descriptors of all patches across the entire image set are clustered using k-means and stored in an auxiliary data structure (let's call it a "Workbook"). A sketch of this pipeline is shown below.
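A minimal Python sketch of this preprocessing pipeline, using numpy, scipy, and scikit-learn. The filter scales, patch size, and Workbook size are illustrative assumptions (the slides do not give them), and the paper's exact 3 + 4 + 4 filter bank is approximated here with one Gaussian, one LoG, and two derivative-of-Gaussian filters per scale:

```python
import numpy as np
from scipy import ndimage
from sklearn.cluster import KMeans

def filter_responses(img, sigmas=(1.0, 2.0, 4.0)):
    """Per-pixel filter-bank responses: Gaussian, Laplacian of Gaussian,
    and first-order derivative-of-Gaussian filters at several scales."""
    chans = []
    for s in sigmas:
        chans.append(ndimage.gaussian_filter(img, s))                # Gaussian
        chans.append(ndimage.gaussian_laplace(img, s))               # Laplacian of Gaussian
        chans.append(ndimage.gaussian_filter(img, s, order=(0, 1)))  # d/dx
        chans.append(ndimage.gaussian_filter(img, s, order=(1, 0)))  # d/dy
    return np.stack(chans, axis=-1)  # H x W x n_filters

def build_workbook(images, patch=16, n_words=200):
    """Densely sample one descriptor per grid patch, then k-means the
    descriptors from all images into a visual-word Workbook."""
    descs = []
    for img in images:
        resp = filter_responses(img)
        for y in range(0, img.shape[0] - patch + 1, patch):
            for x in range(0, img.shape[1] - patch + 1, patch):
                # Descriptor: mean filter response over the patch.
                descs.append(resp[y:y + patch, x:x + patch].mean(axis=(0, 1)))
    return KMeans(n_clusters=n_words, n_init=10).fit(np.array(descs))
```

Each patch is then represented by the index of its nearest cluster center, i.e. its visual word.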

Clustering using LDA

• Framework:
  • M documents (images)
  • Each document j has Nj words
  • wji is the observed value of word i in document j
  • All words will be clustered into K topics
  • Each topic k is modeled as a multinomial distribution over the Workbook
  • 𝛼 and β are Dirichlet prior hyperparameters.
  • ɸk, ∏j, and zji are the hidden variables.
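As a concrete illustration of this framework, here is a minimal numpy simulation of the generative process described on the next slide. All dimensions and hyperparameter values are toy choices, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

V, K, M, N = 50, 4, 10, 100   # Workbook size, topics, documents, words per document
alpha, beta = 0.5, 0.1        # Dirichlet hyperparameters

# phi_k ~ Dir(beta): one multinomial over the Workbook per topic.
phi = rng.dirichlet(np.full(V, beta), size=K)

docs = []
for j in range(M):
    pi_j = rng.dirichlet(np.full(K, alpha))             # pi_j ~ Dir(alpha)
    z = rng.choice(K, size=N, p=pi_j)                   # z_ji ~ Discrete(pi_j)
    w = np.array([rng.choice(V, p=phi[k]) for k in z])  # w_ji ~ Discrete(phi_{z_ji})
    docs.append(w)
```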

Clustering using LDA (contd)

• Generative algorithm:
  • For a topic k, a multinomial parameter ɸk is sampled from the Dirichlet prior, ɸk ~ Dir(β)
  • For a document j, a multinomial parameter ∏j over K topics is sampled from the Dirichlet prior, ∏j ~ Dir(𝛼)
  • For a word i in document j, a topic label zji is sampled from the discrete distribution zji ~ Discrete(∏j)
  • The value wji of word i in document j is sampled from the discrete distribution of topic zji, wji ~ Discrete(ɸzji)
• zji is sampled through a Gibbs sampling procedure, as follows:

  • n(k)-ji,w represents the number of words in the corpus with value w assigned to topic k, excluding word i in document j
  • n(j)-ji,k represents the number of words in document j assigned to topic k, excluding word i in document j
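The sampling equation itself appears to have been an image that did not survive the transcript. Written with the counts just defined, the standard collapsed-Gibbs update for LDA (with W the Workbook size) is:

```latex
p(z_{ji} = k \mid \mathbf{z}_{-ji}, \mathbf{w})
  \;\propto\;
  \frac{n^{(k)}_{-ji,w_{ji}} + \beta}{\sum_{w} n^{(k)}_{-ji,w} + W\beta}
  \left( n^{(j)}_{-ji,k} + \alpha \right)
```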

What’s the issue with LDA?

• The spatial and temporal components of the visual words are not considered, so co-occurrence information is not utilized.
• Consider a set of images of animals with grass as the background. Since we assume an image to be a document, and since the animal is only a small part of the image, it would most likely be classified as grass.

How can we resolve it?

• Use a grid layout on each image; each region in the grid could be considered a document.
• But how would you handle the overlap of a patch between two regions?
• We could use overlapping regions as documents.
• But since several overlapping documents could contain the same patch, how would you decide which document it should belong to?
• So we could represent each document (region) by a point, and if a patch is closer to a particular point, we could assign it to that document (see the sketch below).
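A minimal sketch of that nearest-point assignment, using a Gaussian kernel so the weight decays smoothly with distance. In the actual model the assignment di is a sampled hidden variable rather than a hard choice, and the value of sigma here is an arbitrary illustration; this only conveys the intuition:

```python
import numpy as np

def assign_patches(patch_xy, doc_xy, sigma=20.0, hard=True):
    """patch_xy: (N, 2) patch centers; doc_xy: (D, 2) document points.
    Weight each patch against every document point with a Gaussian
    kernel on image-plane distance."""
    d2 = ((patch_xy[:, None, :] - doc_xy[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))       # (N, D) kernel weights
    if hard:
        return w.argmax(axis=1)                # index of the closest document point
    return w / w.sum(axis=1, keepdims=True)    # soft assignment probabilities
```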

Clustering using Spatial LDA

• Framework:
  • Besides the parameters used in LDA, spatial information is also captured.
  • A hidden variable di indicates the document to which word i is assigned.
  • Additionally, for each document j, (gdj, xdj, ydj) represents the image index, x coordinate, and y coordinate of the document, respectively.
  • Similarly, for each word i, (gi, xi, yi) represents the image index, x coordinate, and y coordinate of the word, respectively.
• Generative algorithm:
  • For a topic k, a multinomial parameter ɸk is sampled from the Dirichlet prior, ɸk ~ Dir(β)
  • For a document j, a multinomial parameter ∏j over K topics is sampled from the Dirichlet prior, ∏j ~ Dir(𝛼)
  • For a word i, a document assignment di is sampled from the prior p(di|η), indicating the document for word i.

Clustering using Spatial LDA (contd)

• Generative algorithm (contd):
  • The image index and location ci of word i are sampled from p(ci | cddi, 𝝈), where cddi is the location of the document di that word i is assigned to; a Gaussian kernel is chosen for this distribution.
  • For word i in document di, a topic label zi is sampled from the discrete distribution zi ~ Discrete(∏di)
  • The value wi of word i is sampled from the discrete distribution of topic zi, wi ~ Discrete(ɸzi)
• zi is sampled through a Gibbs sampling procedure, where n(k)-i,w represents the number of words in the corpus with value w assigned to topic k, excluding word i, and n(j)-i,k represents the number of words in document j assigned to topic k, excluding word i.
• The conditional distribution of di is as follows:
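The equation image is also missing from the transcript here. From the generative process above, the collapsed conditional for di should factor into the document prior, the spatial kernel, and the document's topic counts; the following is a reconstruction under that reading, not a verbatim formula from the paper:

```latex
p(d_i = j \mid \mathbf{z}, \mathbf{d}_{-i}, c_i)
  \;\propto\;
  p(d_i = j \mid \eta)\,
  p(c_i \mid c^{d}_{j}, \sigma)\,
  \frac{n^{(j)}_{-i,z_i} + \alpha}{\sum_{k} n^{(j)}_{-i,k} + K\alpha}
```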

Results

            cows    cars    faces   bicycles
LDA (D)     0.376   0.555   0.717   0.556
SLDA (D)    0.566   0.684   0.697   0.566
LDA (FA)    0.558   0.396   0.586   0.529
SLDA (FA)   0.033   0.244   0.371   0.422

(D = detection rate, FA = false-alarm rate; higher D and lower FA are better, so SLDA outperforms LDA on most classes.)

What the paper missed

• Comparisons with other standard clustering methods could have been included to highlight the efficiency of the algorithm.
• For the given experimental data, an intuition for selecting the input parameters 𝛼, β, and η could have been provided.
• For moving images, the temporal aspect is ignored. In the future, this could be added as a parameter and the algorithm updated accordingly.
• A few later works have built on the paper:
  • James Philbin, Josef Sivic, and Andrew Zisserman. Geometric Latent Dirichlet Allocation on a Matching Graph for Large-scale Image Datasets. International Journal of Computer Vision, 95(2):138-153, November 2011.

Libraries for LDA

• R - "lda" - http://cran.r-project.org/web/packages/lda/lda.pdf
• Python - lda v1.0.2 - https://pypi.python.org/pypi/lda
• C++ - GibbsLDA++ (with a Java port, JGibbLDA) - http://gibbslda.sourceforge.net/
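A minimal usage sketch for the Python lda package listed above, which implements collapsed Gibbs sampling. The random count matrix is a toy stand-in for a real documents-by-Workbook matrix of visual-word counts:

```python
import numpy as np
import lda

# X: documents x vocabulary matrix of integer word counts (toy stand-in).
X = np.random.randint(0, 5, size=(100, 500))

model = lda.LDA(n_topics=20, n_iter=1500, random_state=1)
model.fit(X)

topic_word = model.topic_word_   # K x V: one multinomial over words per topic
doc_topic = model.doc_topic_     # M x K: topic proportions per document
```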

References

• Xiaogang Wang and Eric Grimson. Spatial Latent Dirichlet Allocation. Advances in Neural Information Processing Systems 20 (NIPS 2007).
• D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet Allocation. Journal of Machine Learning Research, 3:993-1022, 2003.
• Diane J. Hu. Latent Dirichlet Allocation for Text, Images, and Music.