pump-priming: modeling scenes with invariant regions and aspect models
![Page 1: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/1.jpg)
pump-priming: modeling scenes with invariant regions and aspect models
daniel gatica-perez
IDIAP Research Institute, Martigny, Switzerland
![Page 2: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/2.jpg)
project details
two partners: IDIAP (Daniel Gatica-Perez, Jean-Marc Odobez); Catholic University of Leuven, KUL (Tinne Tuytelaars)
1 partly-funded PhD student per site (Pedro Quelhas, Mihai Osian) for one year
1 additional PhD student at IDIAP through additional non-PASCAL matching funds (Florent Monay)
project started in early fall 2004
research continued afterwards
![Page 3: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/3.jpg)
the original goal: scene classification
classify scene types: indoor/outdoor, city/landscape
traditional image representation: global features (whole image); class-specific (hand-picked) features
our work: invariant local descriptors, avoid class-specific features; use non-labeled data; learn latent structure with aspect models
![Page 4: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/4.jpg)
the final story
![Page 5: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/5.jpg)
the bag of visterms (Sivic ’03, Willamowski ’04)
image set → DoG + SIFT → SIFT local descriptors → K-means quantization → visterms
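The quantization step can be illustrated with a minimal sketch, assuming local descriptors have already been extracted. The function names (`kmeans`, `nearest`), the toy 2-D data and the deterministic initialisation from the first `k` descriptors are assumptions made for illustration; real SIFT descriptors are 128-dimensional and random or k-means++ initialisation is more usual.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centroids, descriptor):
    """Index of the centroid (visterm) closest to a descriptor."""
    return min(range(len(centroids)),
               key=lambda k: squared_distance(centroids[k], descriptor))

def kmeans(descriptors, k, iterations=20):
    """Plain Lloyd's K-means; returns k centroids (the visual vocabulary).

    Initialised from the first k descriptors for reproducibility
    (an assumption of this sketch, not the authors' setup).
    """
    centroids = [list(d) for d in descriptors[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest(centroids, d)].append(d)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

# Toy 2-D "descriptors" standing in for 128-D SIFT vectors (assumption).
toy_descriptors = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
                   (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
vocabulary = kmeans(toy_descriptors, k=2)

# Quantization: each descriptor is replaced by the index of its nearest centroid.
visterms = [nearest(vocabulary, d) for d in toy_descriptors]
```

Each descriptor ends up represented only by a visterm index, which is what the bag-of-visterms representation counts.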
![Page 6: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/6.jpg)
the bag of visterms (BOV)
spatial relationships are discarded
=> like bag-of-words in text
[figure: visterm count histograms; axes: visterm, count]
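A small sketch of how such a histogram could be built from the visterm indices of one image. The function name `bag_of_visterms` and the example indices are hypothetical; the point is only that the result does not depend on where the interest points lie.

```python
from collections import Counter

def bag_of_visterms(visterm_ids, vocabulary_size):
    """Histogram of visterm counts; positions in the image are ignored."""
    counts = Counter(visterm_ids)
    return [counts.get(v, 0) for v in range(vocabulary_size)]

# Hypothetical visterm indices for the interest points of one image.
image_visterms = [3, 0, 3, 1, 3, 0]
bov = bag_of_visterms(image_visterms, vocabulary_size=5)

# Reordering the points (a stand-in for moving them around the image)
# leaves the histogram unchanged.
shuffled = list(reversed(image_visterms))
same = bag_of_visterms(shuffled, vocabulary_size=5) == bov
```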
![Page 7: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/7.jpg)
visterms as words
polysemy, synonymy
goals: exploit co-occurrence information; disambiguated representation from bag-of-visterms
![Page 8: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/8.jpg)
PLSA
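For reference, a sketch of the standard PLSA decomposition (Hofmann's formulation, not necessarily the exact notation used on the slide), with d_i an image, v_j a visterm, z_k a latent aspect and n(d_i, v_j) the count of visterm v_j in image d_i:

```latex
P(v_j \mid d_i) = \sum_{k=1}^{K} P(v_j \mid z_k)\, P(z_k \mid d_i),
\qquad
\mathcal{L} = \sum_{i} \sum_{j} n(d_i, v_j)\, \log P(v_j \mid d_i)
```

The distributions P(v | z) and P(z | d) are typically fitted by EM so as to increase the log-likelihood L.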
![Page 9: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/9.jpg)
images as mixtures of aspects
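One way to obtain the mixture weights P(z | d) is EM on the image-by-visterm count matrix. The following is a minimal pure-Python sketch under stated assumptions: the function names `plsa` and `normalize`, the random initialisation, the tiny smoothing constant and the toy counts are all illustrative, not the authors' implementation.

```python
import random

def normalize(values):
    """Scale non-negative values so that they sum to one."""
    total = sum(values)
    return [x / total for x in values]

def plsa(counts, n_aspects, iterations=50, seed=0):
    """Fit PLSA by EM on a count matrix counts[d][v].

    Returns (p_v_given_z, p_z_given_d) as nested lists.
    """
    rng = random.Random(seed)
    n_docs, n_terms = len(counts), len(counts[0])
    p_v_z = [normalize([rng.random() + 0.1 for _ in range(n_terms)])
             for _ in range(n_aspects)]
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_aspects)])
             for _ in range(n_docs)]
    for _ in range(iterations):
        # Tiny floor avoids division by zero for unused aspects (assumption).
        new_v_z = [[1e-12] * n_terms for _ in range(n_aspects)]
        new_z_d = [[1e-12] * n_aspects for _ in range(n_docs)]
        for d in range(n_docs):
            for v in range(n_terms):
                if counts[d][v] == 0:
                    continue
                # E-step: posterior P(z | d, v) for this (image, visterm) pair.
                post = normalize([p_z_d[d][z] * p_v_z[z][v]
                                  for z in range(n_aspects)])
                # Accumulate expected counts per aspect for the M-step.
                for z in range(n_aspects):
                    expected = counts[d][v] * post[z]
                    new_v_z[z][v] += expected
                    new_z_d[d][z] += expected
        # M-step: renormalise the expected counts into distributions.
        p_v_z = [normalize(row) for row in new_v_z]
        p_z_d = [normalize(row) for row in new_z_d]
    return p_v_z, p_z_d

# Hypothetical counts: 4 images over a 4-visterm vocabulary.
toy_counts = [[5, 4, 0, 0], [4, 6, 0, 0], [0, 0, 5, 5], [0, 0, 6, 3]]
p_v_given_z, p_z_given_d = plsa(toy_counts, n_aspects=2)
```

Each row of `p_z_given_d` is the aspect mixture of one image and can be used as a low-dimensional image representation.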
![Page 10: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/10.jpg)
aspect-based image ranking (soft clustering)
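Ranking by aspect can be sketched as sorting images by P(z | d) for a chosen aspect. The function name `rank_images_by_aspect` and the mixture weights below are hypothetical; because the weights are soft, the same image can rank highly for more than one aspect.

```python
def rank_images_by_aspect(p_z_given_d, aspect, top_n=3):
    """Return indices of the images with the highest P(z = aspect | d)."""
    order = sorted(range(len(p_z_given_d)),
                   key=lambda d: p_z_given_d[d][aspect], reverse=True)
    return order[:top_n]

# Hypothetical mixture weights P(z | d) for four images and three aspects.
p_z_given_d = [
    [0.70, 0.20, 0.10],
    [0.10, 0.80, 0.10],
    [0.55, 0.05, 0.40],
    [0.05, 0.15, 0.80],
]
top_for_aspect_0 = rank_images_by_aspect(p_z_given_d, aspect=0, top_n=2)
top_for_aspect_2 = rank_images_by_aspect(p_z_given_d, aspect=2, top_n=2)
```

In this toy example image 2 appears in both rankings, which is the soft-clustering behaviour the slide title refers to.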
![Page 11: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/11.jpg)
aspect representation
CITY aspects: 4, 12, 14, 10, 5
LANDSCAPE aspects: 6, 8, 3, 16, 15
see webpage
![Page 12: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/12.jpg)
aspect 4, city
![Page 13: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/13.jpg)
aspect 3, landscape
![Page 14: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/14.jpg)
aspect 6, landscape
![Page 15: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/15.jpg)
some experiments
tasks: indoor/outdoor, city/landscape, indoor/city/landscape
data: 2700 indoor, 2500 city, 4200 landscape images
models: 1000 visterms in BOV (learned on additional data); 60 aspects in PLSA (learned on additional data)
classifier: SVM with Gaussian kernel, two-class and three-class
protocol: results computed over 10 data subsets; decreasing amount of training data
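As a reminder of what a Gaussian kernel computes on such feature vectors (BOV histograms or aspect distributions), here is a minimal sketch. The parametrisation with `gamma`, its value, and the feature vectors are assumptions; the slides do not state the kernel width or how it was chosen.

```python
import math

def gaussian_kernel(x, y, gamma=1.0):
    """k(x, y) = exp(-gamma * ||x - y||^2); gamma is a hypothetical setting."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def gram_matrix(vectors, gamma=1.0):
    """Kernel value for every pair of feature vectors."""
    return [[gaussian_kernel(x, y, gamma) for y in vectors] for x in vectors]

# Hypothetical two-aspect representations of three images.
features = [[0.5, 0.5], [0.9, 0.1], [0.5, 0.5]]
K = gram_matrix(features, gamma=2.0)
```

Identical vectors give a kernel value of 1, and the value decays towards 0 as the vectors move apart; an SVM then operates on this similarity matrix.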
![Page 16: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/16.jpg)
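classification error (%) by training data size

| task | method | 90% | 10% | 2.5% |
|---|---|---|---|---|
| indoor/outdoor | PLSA | 7.8 | 9.1 | 11.4 |
| indoor/outdoor | BOV | 7.6 | 9.7 | 12.2 |
| indoor/outdoor | Baseline | 10.4 | 15.9 | 23.0 |
| city/landscape | PLSA | 4.7 | 5.8 | 8.1 |
| city/landscape | BOV | 5.3 | 7.4 | 12.4 |
| city/landscape | Baseline | 8.3 | 9.5 | 11.5 |
| indoor/city/landscape | PLSA | 11.9 | 14.6 | 16.7 |
| indoor/city/landscape | BOV | 11.1 | 15.4 | 20.7 |
| indoor/city/landscape | Baseline | 15.9 | 19.7 | 29.0 |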
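To make the effect of shrinking the training set easier to read, the error figures above can be turned into the increase in error between 90% and 2.5% training data. The numbers are copied from the results; the helper name `degradation` is illustrative.

```python
# Classification errors at 90%, 10% and 2.5% training data, copied from the results.
errors = {
    "indoor/outdoor": {
        "PLSA": (7.8, 9.1, 11.4), "BOV": (7.6, 9.7, 12.2), "Baseline": (10.4, 15.9, 23.0)},
    "city/landscape": {
        "PLSA": (4.7, 5.8, 8.1), "BOV": (5.3, 7.4, 12.4), "Baseline": (8.3, 9.5, 11.5)},
    "indoor/city/landscape": {
        "PLSA": (11.9, 14.6, 16.7), "BOV": (11.1, 15.4, 20.7), "Baseline": (15.9, 19.7, 29.0)},
}

def degradation(task, method):
    """Increase in error when going from 90% to 2.5% of the training data."""
    full, _, small = errors[task][method]
    return round(small - full, 1)

summary = {task: {m: degradation(task, m) for m in methods}
           for task, methods in errors.items()}
```

With these figures PLSA loses 3.6, 3.4 and 4.8 points on the three tasks, against 4.6, 7.1 and 9.6 for BOV. The baseline loses 12.6 and 13.1 points on indoor/outdoor and the three-class task, and 3.2 on city/landscape, where it starts from a higher error.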
![Page 17: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/17.jpg)
aspect-based «segmentation»
visterms with high probability
p(v | z=4) ↔ ‘man-made’
p(v | z=6) ↔ ‘nature’
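A simplified sketch of the idea: label each visterm by the aspect under which it is most probable, then let every interest point inherit the label of its visterm. The probabilities, the five-visterm vocabulary and the function name `label_visterms` are hypothetical, and the procedure in the cited follow-up work may well differ from this argmax rule.

```python
def label_visterms(p_v_given_z, aspect_names):
    """Assign each visterm to the named aspect under which it is most probable.

    p_v_given_z maps an aspect index to a list of P(v | z) values.
    """
    aspects = list(aspect_names)
    n_terms = len(p_v_given_z[aspects[0]])
    labels = []
    for v in range(n_terms):
        best = max(aspects, key=lambda z: p_v_given_z[z][v])
        labels.append(aspect_names[best])
    return labels

# Hypothetical P(v | z) over a five-visterm vocabulary for aspects 4 and 6.
p_v_given_z = {
    4: [0.40, 0.30, 0.05, 0.05, 0.20],
    6: [0.05, 0.10, 0.45, 0.30, 0.10],
}
aspect_names = {4: "man-made", 6: "nature"}
visterm_labels = label_visterms(p_v_given_z, aspect_names)

# Each interest point in an image inherits the label of its visterm.
image_visterms = [0, 2, 2, 4, 3]
point_labels = [visterm_labels[v] for v in image_visterms]
```

Plotting the labelled points over the image gives a coarse man-made versus nature map, without any region-level annotation being used in training.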
![Page 18: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/18.jpg)
aspect-based segmentation
![Page 19: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/19.jpg)
aspect-based segmentation
![Page 20: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/20.jpg)
so, pump-priming?
![Page 21: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/21.jpg)
pump-priming, good timing
similar method applied to scenes: Fei-Fei et al. (Caltech), CVPR ’05; Quelhas et al., ICCV ’05
similar method applied to objects: Monay et al., PASCAL Workshop on Subspace, Latent Structure, and Feature Selection Techniques ’05, Bohinj; Sivic et al. (Oxford + MIT), ICCV ’05
![Page 22: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/22.jpg)
![Page 23: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/23.jpg)
extensions
aspect-based man-made vs. natural image segmentation (Monay et al., CVPR Workshop on Beyond Patches ’06)
color+texture visual vocabularies for scene classification (Quelhas, CIVR ’06)
scene classification for larger number of scene categories (Quelhas et al., PAMI ’07)
aspect-based image auto-annotation (Monay et al., PAMI ’07)
multi-level visual vocabularies for scene classification (Quelhas, CIVR ’07)
![Page 24: pump-priming: modeling scenes with invariant regions and aspect models](https://reader036.fdocuments.in/reader036/viewer/2022062410/56815a74550346895dc7dafc/html5/thumbnails/24.jpg)
conclusion
visual scenes as mixtures of aspects
aspect models on visual vocabularies: feature extraction ↔ classification; latent structure ↔ browsing, clustering; context ↔ segmentation; text-visual vocabularies ↔ annotation
the pump-priming grant worked for us