Post on 27-Jan-2015
Machine Vision Background | Region Annotation by Hierarchical Region Topic Discovery
Probabilistic Generative Models for Machine Vision
Davide Bacciu
bacciu@di.unipi.it
Dipartimento di Informatica, Università di Pisa
Artificial Intelligence course, Prof. Alessandro Sperduti
Università degli Studi di Padova, academic year 2008/2009
Outline
1 Machine Vision Background
  - Introduction
  - Visual Content Representation
  - Machine Vision Applications
2 Region Annotation by Hierarchical Region Topic Discovery
  - Visual Content Representation
  - Hierarchical Probabilistic Latent Semantic Analysis
  - Experimental Evaluation
Introduction
- Visual content representation
  - Visual features: how to identify and describe the single bits of visual information
  - Image representation: how to describe the visual information within a picture
- Machine vision applications
  - Scene and object recognition
  - Image annotation
- Focus on Probabilistic Generative Models and the Bag of Words image representation
Visual Features Descriptors
How do we describe the single bits of visual information?
- Global descriptors
  - Color histograms
  - Shape features
  - Texture
  - Spatial Envelope
- Local descriptors
  - Gabor filters
  - Differential filters
  - Distribution-based descriptors
Scale Invariant Feature Transform (SIFT)
- Calculate the gradient magnitude and orientation at a given keypoint (x, y) and scale σ
- Compute a 3D histogram of gradient magnitudes and orientations in a 4 × 4 neighborhood of the keypoint
- Angles are quantized to 8 directions
- Robust to geometric and illumination distortion
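The descriptor steps above can be sketched in a few lines. This is a simplified illustration, not Lowe's implementation: it assumes a grey-level image stored as a list of rows, a fixed 16 × 16 pixel window split into 4 × 4 cells, and it omits scale selection, Gaussian weighting, rotation normalization and bin interpolation. The function name `sift_like_descriptor` and its parameters are hypothetical.

```python
import math


def sift_like_descriptor(image, x, y, cell=4, grid=4, bins=8):
    """Simplified SIFT-style descriptor around pixel (x, y).

    image is a list of rows of grey levels. The window of grid*cell pixels
    per side (plus a one-pixel border) must lie inside the image.
    """
    half = cell * grid // 2
    hist = [0.0] * (grid * grid * bins)
    for dy in range(-half, half):
        for dx in range(-half, half):
            px, py = x + dx, y + dy
            # Central differences approximate the image gradient.
            gx = image[py][px + 1] - image[py][px - 1]
            gy = image[py + 1][px] - image[py - 1][px]
            magnitude = math.hypot(gx, gy)
            angle = math.atan2(gy, gx) % (2 * math.pi)
            # Quantize the orientation into one of `bins` directions.
            b = int(angle / (2 * math.pi) * bins) % bins
            # Locate the cell of the grid x grid layout this pixel falls in.
            cx = (dx + half) // cell
            cy = (dy + half) // cell
            hist[(cy * grid + cx) * bins + b] += magnitude
    # Normalize to unit length to reduce sensitivity to contrast changes.
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


# Toy image: a horizontal intensity ramp, so every gradient points along +x.
ramp = [[float(c) for c in range(20)] for _ in range(20)]
descriptor = sift_like_descriptor(ramp, 10, 10)
```

With the default 4 × 4 cells and 8 directions the descriptor has 128 entries; on the ramp image only the first direction of each cell receives any mass.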
Visual Features Detectors
How do we identify the interesting/informative parts of an image?
- Feature detectors
  - Segment/Blob detectors
  - Corner/Interest point detectors
  - Affine invariant keypoint detectors
  - Dense sampling
Image Representation
- Straightforward for global descriptors
  - An image corresponds to its descriptor
  - Images are compared, classified and indexed based on their global color and/or texture histograms
- Requires an additional step for local descriptors
  - Local descriptors provide information only on a portion of the scene
  - Probabilistic modeling of segments
  - Bag of words
Bag of Words Document Representation
- Count the occurrences of each dictionary word in your document
- Represent the document as a vector of word counts
Definition (Bag of Words Assumption)
Word order is not relevant for determining document semantics
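A minimal sketch of the bag of words representation, assuming whitespace tokenization and a fixed dictionary (the function name `bag_of_words` and the toy dictionary are illustrative only). Two documents with the same words in a different order map to the same count vector, which is exactly what the assumption states.

```python
from collections import Counter


def bag_of_words(text, dictionary):
    """Return the vector of occurrence counts of each dictionary word."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in dictionary]


dictionary = ["dog", "bites", "man"]
first = bag_of_words("dog bites man", dictionary)
second = bag_of_words("man bites dog", dictionary)
```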
Bag of Words Image Representation
- Each image is a document and each visual patch is a word
- Count the occurrences of each dictionary visual word (visterm) in your image
- Represent the image as a vector of visterm counts (histogram)

Figure: BOW pipeline (labels: image collection, k-means, codebook; input image, local patches, vector quantization, BOW).
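The codebook and quantization steps can be sketched as follows. This is an illustrative toy under stated assumptions: descriptors are short tuples of numbers, the codebook is learned by plain k-means with a deterministic initialization, and the helper names (`kmeans`, `nearest`, `bow_histogram`) are hypothetical rather than taken from any library.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, descriptor):
    """Index of the visual word (centroid) closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(codebook[k], descriptor))


def kmeans(descriptors, k, iterations=20):
    """Learn a codebook of k visual words from a collection of descriptors."""
    step = len(descriptors) // k
    # Deterministic initialization: k evenly spaced descriptors.
    codebook = [list(descriptors[i * step]) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest(codebook, d)].append(d)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                codebook[c] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def bow_histogram(codebook, descriptors):
    """Quantize each local descriptor and count the visterm occurrences."""
    hist = [0] * len(codebook)
    for d in descriptors:
        hist[nearest(codebook, d)] += 1
    return hist


# Descriptors pooled from the image collection (toy 2D values).
collection = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
codebook = kmeans(collection, k=2)
# Local patches of a new input image.
image_patches = [(0.2, 0.1), (9.5, 10.2), (10.8, 10.1)]
histogram = bow_histogram(codebook, image_patches)
```

The resulting histogram is the vector of visterm counts that stands in for the image in the models that follow.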
Image Understanding Applications
- Images are typically represented as vectors of features
- Discover the semantics of an image from its vectorial representation
  - Object recognition (e.g. Airplane)
  - Scene recognition (e.g. Forest)
  - Image annotation (e.g. Sky, Tree, Boat, Sea, ...)
Scene and Object Recognition
- Structured approaches
  - Object/scene categories are represented as collections of connected parts characterized by distinctive appearance and spatial position
  - Part-based probabilistic models
- Weakly structured approaches
  - Based on the bag-of-words assumption
  - Latent aspect models

Figure: levels of context (labels: pixel context, part context, object context, place context).
Latent Aspect Discovery
- Rooted in text processing
- Probabilistic models describing the generative process of image content using hidden (i.e. latent) variables
- Model likelihood is used to measure the fit of the unknown parameters with respect to the observations
- Model parameters are fitted by Expectation Maximization (EM) or variational methods
Probabilistic Latent Semantic Analysis (pLSA) (I)
Figure: pLSA graphical model d → z → w with factors P(d), P(z|d) and P(w|z) (plate labels: D, W_d).
- Observations are collected in the integer matrix n(w_j, d_i)
- Every observation is associated with a latent topic k by means of the hidden variable z_k
- Each hidden topic k denotes a particular visual theme (e.g. an object)
Probabilistic Latent Semantic Analysis (pLSA) (II)
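Figure: pLSA graphical model with factors P(d), P(z|d) and P(w|z) (plate labels: D, W_d).
- pLSA is trained by Expectation Maximization (EM)
- Conditional independence is used to decompose the model likelihood

\[ \mathcal{L}(N \mid \theta) = \prod_{i=1}^{D} \prod_{j=1}^{W} P(w_j, d_i)^{n(w_j, d_i)} \]

\[ P(w_j, d_i) = P(d_i)\, P(w_j \mid d_i) = P(d_i) \sum_{k=1}^{K} P(z_k \mid d_i)\, P(w_j \mid z_k) \]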
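A minimal EM sketch for this likelihood, written as an illustration rather than a reference implementation. It assumes the counts are given as a nested list n[i][j] = n(w_j, d_i), uses a random positive initialization, and leaves out the P(d_i) factor, which does not depend on the topic parameters. The function names `plsa_em` and `log_likelihood` are hypothetical.

```python
import math
import random


def normalise(values):
    """Scale non-negative values so that they sum to one."""
    total = sum(values)
    return [v / total for v in values]


def plsa_em(n, K, iterations, seed=0):
    """Fit P(z_k|d_i) (D x K) and P(w_j|z_k) (K x W) to the counts n[i][j]."""
    rng = random.Random(seed)
    D, W = len(n), len(n[0])
    p_z_d = [normalise([rng.random() + 0.1 for _ in range(K)]) for _ in range(D)]
    p_w_z = [normalise([rng.random() + 0.1 for _ in range(W)]) for _ in range(K)]
    for _ in range(iterations):
        acc_z_d = [[0.0] * K for _ in range(D)]
        acc_w_z = [[0.0] * W for _ in range(K)]
        for i in range(D):
            for j in range(W):
                if n[i][j] == 0:
                    continue
                # E-step: P(z_k|w_j,d_i) is proportional to P(z_k|d_i) P(w_j|z_k).
                joint = [p_z_d[i][k] * p_w_z[k][j] for k in range(K)]
                total = sum(joint)
                for k in range(K):
                    expected = n[i][j] * joint[k] / total
                    acc_z_d[i][k] += expected
                    acc_w_z[k][j] += expected
        # M-step: renormalise the expected counts.
        p_z_d = [normalise(row) for row in acc_z_d]
        p_w_z = [normalise(row) for row in acc_w_z]
    return p_z_d, p_w_z


def log_likelihood(n, p_z_d, p_w_z):
    """Sum of n(w_j, d_i) * log P(w_j|d_i); the P(d_i) factor is left out."""
    total = 0.0
    for i, row in enumerate(n):
        for j, count in enumerate(row):
            if count:
                p = sum(p_z_d[i][k] * p_w_z[k][j] for k in range(len(p_w_z)))
                total += count * math.log(p)
    return total


# Toy count matrix: 4 documents (rows) over a 4-word dictionary (columns).
counts = [[4, 3, 0, 0], [5, 2, 0, 0], [0, 0, 3, 4], [0, 1, 2, 5]]
p_z_d, p_w_z = plsa_em(counts, K=2, iterations=50)
```

Each EM iteration cannot decrease the log-likelihood, so comparing `log_likelihood` after 1 and after 50 iterations from the same seed is a quick sanity check.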
An Object Recognition Example
- Caltech-4 dataset: cars, faces, motorbikes and airplanes on a cluttered background
- Fit a pLSA model with 7 latent topics (4 objects + 3 background)
- Determine the most likely topic of each visual word in image d_i

\[ P(z_k \mid w_j, d_i) = \frac{P(z_k \mid d_i)\, P(w_j \mid z_k)}{\sum_{k'=1}^{K} P(z_{k'} \mid d_i)\, P(w_j \mid z_{k'})} \]

Figure: Images by Sivic et al. 2005
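The topic posterior of a visual word can be computed directly from fitted pLSA parameters. The sketch below assumes P(z|d_i) is a list of K values and P(w|z) a K × W nested list; the numbers and function names are made up for illustration.

```python
def topic_posterior(p_z_given_d, p_w_given_z, j):
    """P(z_k | w_j, d_i) for every topic k, by Bayes' rule."""
    joint = [p_z_given_d[k] * p_w_given_z[k][j] for k in range(len(p_z_given_d))]
    total = sum(joint)
    return [v / total for v in joint]


def most_likely_topic(p_z_given_d, p_w_given_z, j):
    """Index of the topic with the highest posterior for visual word j."""
    posterior = topic_posterior(p_z_given_d, p_w_given_z, j)
    return max(range(len(posterior)), key=posterior.__getitem__)


# Toy parameters: 2 topics, 2 visual words.
p_z_given_d = [0.5, 0.5]
p_w_given_z = [[0.9, 0.1], [0.2, 0.8]]
labels = [most_likely_topic(p_z_given_d, p_w_given_z, j) for j in range(2)]
```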
Latent Dirichlet Allocation (LDA)
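Figure: LDA graphical model (labels: α, θ, z, w, φ, β; plate labels: D, W_d, K).
- pLSA provides a generative model only for the training data
- LDA treats P(z_k | d_i) as a latent random variable, sampling it from a Dirichlet distribution P(θ | α)
- Maximizes the data likelihood

\[ P(w \mid \phi, \alpha, \beta) = \int \sum_{z} P(w \mid z, \phi)\, P(z \mid \theta)\, P(\theta \mid \alpha)\, P(\phi \mid \beta)\, d\theta \]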
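The generative process behind this likelihood can be sketched by sampling a synthetic document. This is an illustration under assumptions: symmetric Dirichlet priors with scalar concentrations `alpha` and `beta`, Dirichlet draws obtained by normalizing Gamma variates, and hypothetical function names. It shows sampling only, not inference.

```python
import random


def sample_dirichlet(rng, concentrations):
    """Draw from a Dirichlet distribution by normalizing Gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in concentrations]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(rng, phi, alpha, length):
    """Sample the topic proportions theta and the words of one document."""
    K, W = len(phi), len(phi[0])
    theta = sample_dirichlet(rng, [alpha] * K)       # theta ~ Dirichlet(alpha)
    words = []
    for _ in range(length):
        z = rng.choices(range(K), weights=theta)[0]  # topic z ~ P(z | theta)
        w = rng.choices(range(W), weights=phi[z])[0] # word w ~ P(w | z, phi)
        words.append(w)
    return theta, words


rng = random.Random(0)
K, W, alpha, beta = 3, 6, 0.5, 0.5
# One word distribution per topic: phi_k ~ Dirichlet(beta).
phi = [sample_dirichlet(rng, [beta] * W) for _ in range(K)]
theta, words = generate_document(rng, phi, alpha, length=20)
```

Because theta is sampled per document rather than stored as a parameter, the same procedure applies to documents outside the training set.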
Hierarchical Latent Topic Models...
Dependent Pachinko Allocation Model (Li and McCallum, 2006)
Image Annotation
- Learn the association between image content and text
- Typically exploits a region-based image representation
  - Image segmentation
  - Color and texture descriptors
- Non-probabilistic image annotation
  - Information retrieval approaches (e.g. vector based)
  - Multiple instance learning
- Probabilistic image annotation
  - Machine translation approach
  - Probabilistic generative
  - Latent topic models
Latent Aspect Models in Image Annotation
Figure: graphical models of GM-LDA and CORR-LDA (labels: α, θ, z, v, b, t, µ, σ, β; plate labels: D, B_d, M_d).
Based on a mixture-of-Gaussians model of the B_d image regions and a multinomial distribution over the M_d annotations
Image Annotation - Is it Worth the Hassle?
The Web is full of annotated multimedia information
How Informative is Image Representation?
- The BOW assumption is strong in the visual domain: spatial context matters!
- Region-based representation is often too coarse and unstable.
- How can we retain the robustness of BOW while introducing spatial context?
Introduction
- Go beyond flat region-based and keypoint-based approaches
- Structured representation conveying richer semantics and discriminative power than a BOW model, without the rigidity of a part-based approach
- Key idea
  - Introduce a multi-resolution representation of visual content drawing on both region-based and keypoint-based models
  - Introduce a multi-layered structure at the semantic level of latent topic discovery
Multi-Resolution Image Representation
- Multi-resolution V describing visual information at different spatial granularity
- The BOW representation is based on the assumption that the spatial distribution of visual keywords can be neglected when determining the overall image appearance
  - Biased view, holding mostly for corpora characterized by small variability of the pictorial content
  - Need for a structured approach that can isolate semantically different image portions
- Text representation
  - Documents are collections of paragraphs
  - Paragraphs are collections of words
Region-Keypoints Representation
Figure: a segmented image with regions r1 ... r5 (bag-of-regions), each region described by a bag-of-words w_r.
- Use an unsupervised learning model to extract perceptually coherent portions r_l of the image
- Use dense sampling to extract a set of local visual descriptors
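A rough sketch of how the two levels can be combined: keypoints sampled on a dense grid are assigned to the region that contains them, giving one bag of visual words per region. The segmentation label map, the grid step and the rule that assigns a visual word to each keypoint are stand-ins chosen for the example; in practice the labels would come from a segmentation algorithm and the words from descriptor quantization.

```python
def dense_grid(width, height, step):
    """Keypoint locations sampled on a regular grid."""
    return [(x, y) for y in range(0, height, step) for x in range(0, width, step)]


def region_histograms(segmentation, keypoints, vocabulary_size):
    """Build one bag of visual words per region.

    segmentation[y][x] is the region label of pixel (x, y);
    keypoints is a list of (x, y, visual_word_index) triples.
    """
    histograms = {}
    for x, y, word in keypoints:
        region = segmentation[y][x]
        hist = histograms.setdefault(region, [0] * vocabulary_size)
        hist[word] += 1
    return histograms


# Toy 4 x 4 label map: left half is region 0, right half is region 1.
segmentation = [[0, 0, 1, 1] for _ in range(4)]
# Stand-in quantization: the visual word index is derived from the position.
keypoints = [(x, y, (x + y) % 3) for x, y in dense_grid(4, 4, step=2)]
histograms = region_histograms(segmentation, keypoints, vocabulary_size=3)
```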
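Hierarchical Region-Topic Probabilistic Latent Semantic Analysis (HRPLSA)
Figure: HRPLSA graphical model (labels: d, z, t, w, φ; plate labels: D, T, r_d, W_r).
Complete model likelihood

\[ \mathcal{L}_C(N \mid \theta) = \prod_{i=1}^{D} \prod_{l=1}^{R_i} \sum_{k=1}^{K} P(z_k \mid d_i)^{Z_{ilk}} \prod_{j=1}^{W_l} \left[ \sum_{p=1}^{T} P(t_p \mid z_k)\, (\phi_{jp})^{T_{kjp}} \right]^{n_{ilj}} \]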
Model Fitting
HRPLSA parameters are fitted by applying Expectation Maximization to the complete log-likelihood, e.g.

\[ P(t_p \mid z_k) = \frac{\sum_{i=1}^{D} \sum_{l=1}^{R_i} P(z_k \mid d_i, r_l) \sum_{j=1}^{W_l} n_{ilj}\, P(t_p \mid z_k, w_j)}{\sum_{i'=1}^{D} \sum_{l'=1}^{R_{i'}} P(z_k \mid d_{i'}, r_{l'})\, n_{i'l'\cdot}} \]

- Model parameters θ
  - P(z_k | d_i): document-specific mixing proportions of region topics
  - P(t_p | z_k): mixing proportions of keyword topics given region aspects
  - φ_jp: multinomial keyword-topic distribution
- Probabilistic inference is extended to images outside the training pool by folding-in
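The update for P(t_p | z_k) can be written as a few nested sums. The sketch below only illustrates this one M-step equation and assumes the E-step posteriors are already available. For simplicity it flattens the (image, region) pairs into a single region index r, so `post_z[r][k]` stands for P(z_k | d_i, r_l), `counts[r][j]` for n_ilj and `post_t[k][j][p]` for P(t_p | z_k, w_j). These array layouts and the function name are assumptions made for the example.

```python
def update_p_t_given_z(post_z, counts, post_t):
    """M-step update of P(t_p | z_k) from E-step posteriors.

    post_z[r][k]    : P(z_k | d_i, r_l) for the flattened region index r
    counts[r][j]    : n_ilj, occurrences of visual word j in region r
    post_t[k][j][p] : P(t_p | z_k, w_j)
    """
    R, K = len(counts), len(post_t)
    W, T = len(post_t[0]), len(post_t[0][0])
    result = []
    for k in range(K):
        numerator = [0.0] * T
        denominator = 0.0
        for r in range(R):
            # n_{il.} is the total number of visual words in the region.
            denominator += post_z[r][k] * sum(counts[r])
            for j in range(W):
                for p in range(T):
                    numerator[p] += post_z[r][k] * counts[r][j] * post_t[k][j][p]
        result.append([v / denominator for v in numerator])
    return result


# Toy posteriors: 2 regions, 2 region topics, 2 visual words, 2 keyword topics.
post_z = [[0.8, 0.2], [0.3, 0.7]]
counts = [[2, 1], [0, 3]]
post_t = [[[0.6, 0.4], [0.1, 0.9]], [[0.5, 0.5], [0.2, 0.8]]]
p_t_given_z = update_p_t_given_z(post_z, counts, post_t)
```

Since P(t_p | z_k, w_j) sums to one over p, each row of the result is a proper distribution over keyword topics.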
Modeling Images and Words by HRPLSA (HRPLSA-ANN)
Figure: HRPLSA-ANN model. An image d with region topics z_1 ... z_R, keyword topics t and visual words w, linked to annotation words (e.g. sky, building, window, car, road) through P(a|t) and P(a|z).
The connection between region-topics (keyword-topics) and words is learned by maximizing the constrained annotation log-likelihood

\[ \mathcal{L}_{cal}(N \mid \theta') = \sum_{i=1}^{D} \sum_{f=1}^{F} m_{fi} \log \sum_{k=1}^{K} P(a_f \mid z_k)\, P(z_k \mid d_i) \]
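For concreteness, the annotation log-likelihood can be evaluated as below. The sketch only computes the objective; it does not perform the maximization. The array layouts are assumptions made for the example: `m[i][f]` holds the count m_fi of annotation word a_f in image d_i, `p_a_z[k][f]` holds P(a_f | z_k) and `p_z_d[i][k]` holds P(z_k | d_i).

```python
import math


def annotation_log_likelihood(m, p_a_z, p_z_d):
    """Sum over images i and words f of m_fi * log sum_k P(a_f|z_k) P(z_k|d_i)."""
    total = 0.0
    for i, row in enumerate(m):
        for f, count in enumerate(row):
            if count:
                mixture = sum(p_a_z[k][f] * p_z_d[i][k] for k in range(len(p_a_z)))
                total += count * math.log(mixture)
    return total


# Toy setting: 2 images, 2 annotation words, 2 region topics.
m = [[1, 0], [0, 2]]
p_a_z = [[0.9, 0.1], [0.2, 0.8]]
p_z_d = [[1.0, 0.0], [0.0, 1.0]]
value = annotation_log_likelihood(m, p_a_z, p_z_d)
```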
Probabilistic Inference: A Few Examples
- Most likely visual aspect within an image d_i

\[ z^{i} = \arg\max_{z_k \in Z} P(z_k \mid d_i) \]

- Most likely visual aspect z^* for an image segment r_l

\[ z^{*} = \arg\max_{z_k \in Z} P(z_k \mid d_i, r_l) \]

- Most likely annotation a^* for region r_l in image d_new

\[ a^{*} = \arg\max_{a_f \in A} \sum_{k=1}^{K} P(a_f \mid z_k)\, P(z_k \mid d_{new}) \]
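These arg max rules are simple to express once the distributions are available. The sketch assumes a topic distribution given as a list of K probabilities (for an image or for a segment) and P(a_f | z_k) as a K × F nested list; the vocabulary and the numbers are invented for the example.

```python
def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


def most_likely_aspect(p_z):
    """Index k maximizing P(z_k | d_i), or P(z_k | d_i, r_l) for a segment."""
    return argmax(p_z)


def most_likely_annotation(p_a_z, p_z, vocabulary):
    """Word a_f maximizing sum_k P(a_f | z_k) P(z_k | d_new)."""
    scores = [sum(p_a_z[k][f] * p_z[k] for k in range(len(p_z)))
              for f in range(len(vocabulary))]
    return vocabulary[argmax(scores)]


vocabulary = ["sky", "tree"]
p_a_z = [[0.9, 0.1], [0.2, 0.8]]   # rows: topics, columns: annotation words
p_z_new = [0.3, 0.7]               # topic proportions of a new image
aspect = most_likely_aspect(p_z_new)
annotation = most_likely_annotation(p_a_z, p_z_new, vocabulary)
```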
Unsupervised Discovery and Segmentation of Visual Classes
- Microsoft Research Cambridge dataset (MSRC-B1)
  - 240 images and 9 visual classes
  - Ground truth segmentation of visual classes
- Image segments are assigned to the most likely discovered aspect and segmentation overlap is measured

\[ \rho_{or} = \frac{|GT_o \cap R_r|}{|GT_o \cup R_r|} \]
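The overlap score can be illustrated with pixel sets. The sketch assumes that GT_o and R_r are sets of pixel coordinates and that the ratio is taken between the sizes of their intersection and union; the function name `overlap` is hypothetical.

```python
def overlap(ground_truth_pixels, region_pixels):
    """Intersection over union of two sets of (row, column) pixels."""
    gt, region = set(ground_truth_pixels), set(region_pixels)
    union = gt | region
    return len(gt & region) / len(union) if union else 0.0


# Two 2 x 2 squares sharing one column of pixels.
ground_truth = {(0, 0), (0, 1), (1, 0), (1, 1)}
region = {(0, 1), (1, 1), (0, 2), (1, 2)}
score = overlap(ground_truth, region)
```

A score of 1 means that the discovered region coincides with the ground truth object, and 0 means that they do not intersect.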
Semantic Segmentation
Discovery of semantic visual aspects corrects segmentationerrors
MSRC-B1 Semantic Segmentation Result
Table: Segmentation Overlap on the MSRC-B1 Dataset
Algorithm     Airplanes  Bicycles  Buildings  Cars  Cows  Faces  Grass  Tree  Sky   Average
dLDA          0.43       0.50      0.16       0.45  0.52  0.44   0.60   0.71  0.74  0.51
LDA 10        0.11       0.06      0.09       0.14  0.11  0.40   0.41   0.69  0.41  0.27
LDA 15        0.08       0.56      0.06       0.14  0.14  0.43   0.57   0.62  0.41  0.33
LDA 20        0.10       0.50      0.40       0.15  0.58  0.45   0.54   0.46  0.50  0.40
LDA 25        0.14       0.52      0.21       0.17  0.48  0.44   0.45   0.59  0.50  0.39
HRPLSA 10,15  0.32       0.37      0.59       0.68  0.74  0.44   0.87   0.81  0.90  0.62
HRPLSA 10,20  0.35       0.56      0.65       0.67  0.74  0.47   0.89   0.47  0.91  0.63
HRPLSA 10,25  0.35       0.52      0.75       0.63  0.73  0.50   0.89   0.71  0.86  0.66
HRPLSA 15,20  0.36       0.49      0.56       0.60  0.68  0.52   0.89   0.77  0.91  0.64
HRPLSA 15,25  0.42       0.52      0.61       0.68  0.69  0.45   0.90   0.61  0.92  0.65
HRPLSA 20,25  0.44       0.43      0.48       0.52  0.67  0.48   0.88   0.75  0.91  0.61
HRPLSA 10,25  0.16       0.32      0.15       0.40  0.43  0.22   0.28   0.33  0.38  0.30
HRPLSA 15,25  0.24       0.31      0.13       0.38  0.37  0.19   0.25   0.21  0.45  0.28

Comparative analysis
- HRPLSA
- Multi-segmentation Latent Dirichlet Allocation (LDA)
- Hierarchical Latent Dirichlet Allocation (hLDA)
MSRC-B1 Segmentation Example I
MSRC-B1 Segmentation Example II
MSRC-B1 Segmentation Example III
Image Annotation
- Washington Ground Truth dataset
  - 854 images manually annotated with 392 words
  - 427 training/test images
- LabelMe dataset
  - 2688 images annotated with ∼400 words
  - 800 training / 1888 test images
- Comparative analysis
  - HRPLSA-ANN
  - pLSA-FEATURES
  - Annotation Propagation (AP)
- Precision/recall analysis
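As a reminder of what the two measures capture, here is a small sketch for a single image. The slides do not state how precision and recall are aggregated over images or words, so the per-image definition used here is an assumption, and the function name and example words are illustrative.

```python
def precision_recall(predicted, truth):
    """Precision and recall of a predicted annotation set for one image."""
    predicted, truth = set(predicted), set(truth)
    hits = len(predicted & truth)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(truth) if truth else 0.0
    return precision, recall


predicted = ["sky", "tree", "car"]
truth = ["sky", "tree", "building", "road"]
precision, recall = precision_recall(predicted, truth)
```

Predicting more words per image typically raises recall at the expense of precision, which is why the two are reported together as a curve.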
Washington Dataset: Image Annotation
Figure: Precision/recall plot (x-axis: recall; y-axis: precision) with curves plsa245, hrplsa245, ap245, plsa188, hrplsa188, ap188, plsa158, hrplsa158 and ap158.
Comparative precision/recall results on the Washington dataset
Washington Dataset: Image Annotation Example
Washington Dataset: Region Annotation Example (I)
Examples of good HRPLSA-ANN annotations
Figure: three annotated images with region labels including clouds, people, house, sky, tree, sidewalk, building, window and clear.
Washington Dataset: Region Annotation Example (II)
Examples of bad HRPLSA-ANN annotations
Figure: three annotated images with region labels including flower, tree, bush, sky, clouds, water, house, people and overcast.
Washington Dataset: Annotated Segments
LabelMe Dataset: Image Annotation
Figure: Precision/recall plot (x-axis: recall; y-axis: precision) with curves plsa60, hrplsa60,80, plsa25, hrplsa25,50 and ap.
LabelMe Dataset: Example of Annotated Images
LabelMe Dataset: Recalling Specific Terms
Figure: precision and recall of hrplsa, plsa and ap for the terms "balcony" and "sign".
LabelMe Dataset: Region Annotation Issue I
Region annotation tends to predict only a limited number of words
Figure: bar chart over the annotation words building, car, field, mountain, person, road, sea, sky, skyscraper, tree, water and window, with one histogram per scene category (coast, forest, highway, city, mountain, country, street, tallbuilding).
LabelMe Dataset: Region Annotation Issue II
Distribution of region-topics in mountain caption
Figure: region topics associated with the mountain caption (Topic 40, Topic 20, Topic 11).
The unbalanced distribution of term occurrences introduces a bias in the generation of the segment caption, which leads to distinct region aspects being associated with the same frequently co-occurring word
LabelMe Dataset: Annotated Segments
Conclusion
Concluding Remarks | Bibliography and Online Resources
Wrapping up...
- Inspiration can be sought in related application fields
  - Bag of words representation
  - Generative models for text
  - Need to pay attention to the underlying assumptions of a model!
- Hierarchical multi-resolution latent aspect model (HRPLSA)
  - Unsupervised semantic segmentation of visual content
  - Multi-level image captioning
- Future challenges
  - Generalize the multi-resolution representation and topic hierarchy beyond a two-layered structure
  - Video/multimedia understanding
Online Resources
A notable introductory course on object recognition (with code)
- http://people.csail.mit.edu/torralba/shortCourseRLOC/
Code repositories
- http://www.cs.ubc.ca/~lowe/keypoints/ (original SIFT software by D. Lowe)
- http://vlfeat.org/ (SIFT, MSER and VQ routines)
- www.robots.ox.ac.uk/~vgg/research/affine/ (affine detectors and more)
- http://www.di.ens.fr/~russell/projects/mult_seg_discovery/index.html (multiple segmentation LDA)
Image annotation tool and repository
- http://labelme.csail.mit.edu/ (LabelMe)
Bibliography
D. Bacciu. A Perceptual Learning Model to Discover the Hierarchical Latent Structure of Image Collections. Ph.D. Thesis, 2008.
J. Sivic and A. Zisserman. Video Google: a text retrieval approach to object matching in videos. Proceedings of the Ninth IEEE International Conference on Computer Vision, ICCV 2003, vol. 2, pages 1470–1477, Oct. 2003.
J. Sivic, B. C. Russell, A. A. Efros, A. Zisserman, and W. T. Freeman. Discovering objects and their location in images. Proceedings of the Tenth IEEE International Conference on Computer Vision, ICCV 2005, pages 370–377, Oct. 2005.
J. Sivic, B. C. Russell, A. Zisserman, W. T. Freeman, and A. A. Efros. Unsupervised discovery of visual object class hierarchies. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2008.
K. Barnard, P. Duygulu, N. de Freitas, D. Forsyth, D. M. Blei, and M. I. Jordan. Matching words and pictures. Journal of Machine Learning Research, vol. 3, pages 1107–1135, Feb. 2003.
D. M. Blei and M. I. Jordan. Modeling annotated data. Proceedings of the International Conference on Research and Development in Information Retrieval, SIGIR 2003, Aug. 2003.
F. Monay and D. Gatica-Perez. Modeling semantic aspects for cross-media image indexing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(10):1802–1817, Oct. 2007.
Thank You