Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
Transcript of Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
![Page 1: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/1.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 1/53
Outline
1 Introduction
Feature Learning
Correspondence in Computer Vision
Relational feature learning
2 Learning relational features
Sparse Coding Review
Encoding relationsInference
Learning
3 Factorization, eigen-spaces and complex cells
Factorization
Eigen-spaces, energy models, complex cells
4 Applications
Applications
Conclusions
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 127 / 174
![Page 2: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/2.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 2/53
Outline
1 Introduction
Feature Learning
Correspondence in Computer Vision
Relational feature learning
2 Learning relational features
Sparse Coding Review
Encoding relationsInference
Learning
3 Factorization, eigen-spaces and complex cells
Factorization
Eigen-spaces, energy models, complex cells
4 Applications
Applications
Conclusions
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 128 / 174
![Page 3: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/3.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 3/53
Bag-Of-Warps
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 129 / 174
![Page 4: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/4.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 4/53
Bag-Of-Warps
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 130 / 174
![Page 5: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/5.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 5/53
KTH Actions dataset
Collapsing all hidden representations at monocular SIFT
keypoints (across all keypoints and time frames) and performing
logistic regression yields 80.56% correct.
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 131 / 174
![Page 6: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/6.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 6/53
Convolutional GBM
Convolutional GBM (Taylor et al., 2010):
Convolutional GBM on Hollywood2:
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 132 / 174
![Page 7: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/7.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 7/53
Stacked convolutional ISA
(Le, et al., 2011)Velocity tuning of the higher-order features:
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 133 / 174
![Page 8: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/8.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 8/53
ISA applied to action recognition
(Le, et al., 2011)KTH Hollywood2 UCF YouTube
until 2011 92.1 50.9 85.6 71.2
hierarchical ISA 93.9 53.3 86.5 75.8
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 134 / 174
![Page 9: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/9.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 9/53
Analogy making
A : A :: B : ?
Analogy making
1 Infer transformation from source images xsource,ysource:
z(xsource,ysource)
2 Apply the transformation to target image xtarget:
y(z,xtarget)
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 135 / 174
![Page 10: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/10.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 10/53
Analogy making
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 136 / 174
![Page 11: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/11.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 11/53
Filters learned from transforming faces
Filters learned from faces:
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 137 / 174
![Page 12: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/12.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 12/53
Metric learning and analogy making
Learning a gated Boltzmann machine on changing facialexpressions.
(Susskind, et al., 2011)
Joint density training allows for comparing compatibilities of
pairs.
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 138 / 174
![Page 13: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/13.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 13/53
Model/Task TFD
ID
TFD
Exp
PUBFIG
ID
AFFINE
cosine 0.848 0.663 0.649 0.721
RBM 0.869 0.656 0.647 0.799
conditional 0.805 0.634 0.557 0.825bilinear 0.905 0.637 0.774 0.812
3-way 0.932 0.705 0.771 0.930
3-way symm 0.951 0.695 0.762 0.931
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 139 / 174
![Page 14: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/14.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 14/53
Bi-linear classification
zk
z
y
x
xi
y j
Special case of a gated Boltzmann machine:Replace the output-“image” by a one-hot-encoded class-label.
This is a classifier, where each label can blend in it’s own model !
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 140 / 174
Bi li l ifi i
![Page 15: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/15.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 15/53
Bi-linear classification
zk
z
y
x
xi
y j
Marginalization is tractable in closed form
p(y|x) =
z
p(y, z|x) ∝
z
exp(xtwyz) =
z
exp(
ik
wyikxihk)
=
k
(1 + exp(
i
wyikxi))
It is also equivalent to a mixture of 2K logistic regressors (Nair,
2008), (Memisevic, et al.; 2010), (Warrell et al.; 2010)
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 141 / 174
Bi li l ifi ti
![Page 16: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/16.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 16/53
Bi-linear classification
xy
xi
z
zk
y j
We can factorize parameters like before.
This allows classes to share features.The activity of a factor, f , given class j, is now exactly equal to the
parameter value wy jf .
Thus the weights can be thought of as the responses of virtual
class-templates.
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 142 / 174
![Page 17: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/17.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 17/53
Rotated digit classification
Data-set from the “deep learning-challenge” [Larochelle et al.,
2007] like before.
Learned rotation-invariant filters:
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 143 / 174
Bi linear classification
![Page 18: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/18.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 18/53
Bi-linear classification
Deep Learning challenge (Larochelle et al., 2008).
SVMs NNet RBM DEEP GSM
dataset/model: SVMRBF SVMPOL NNet DBN1 DBN3 SAA3 GSM (unfact)
rectangles 2.15 2.15 7.16 4.71 2.60 2.41 0.83 (0.56)
rect.-images 24.04 24.05 33.20 23.69 22.50 24.05 22.51 (23.17)mnistplain 3.03 3.69 4.69 3.94 3.11 3.46 3.70 (3.98)
convexshapes 19.13 19.82 32.25 19.92 18.63 18.41 17.08 (21.03)
mnistbackrand 14.58 16.62 20.04 9.80 6.73 11.28 10.48 (11.89)
mnistbackimg 22.61 24.01 27.41 16.15 16.31 23.00 23.65 (22.07)
mnistrotbackimg 55.18 56.41 62.16 52.21 47.39 51.93 55.82 (55.16)
mnistrot 11.11 15.42 18.11 14.69 10.30 10.30 11.75 (16.15)
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 144 / 174
Extension to less than two frames
![Page 19: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/19.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 19/53
Extension to less than two frames
yy
yi
z
zk
y j
To train energy models on single images :Plug in the same image left and right.
Hiddens will model pixel covariance matrices.
Eg., (Ranzato et al., 2010), (Karklin, Lewicki; 2008)
Training can be finicky.
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 145 / 174
Extension to less than two frames
![Page 20: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/20.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 20/53
Extension to less than two frames
yy
yi
z
zk
y j
To train energy models on single images :Plug in the same image left and right.
Hiddens will model pixel covariance matrices.
Eg., (Ranzato et al., 2010), (Karklin, Lewicki; 2008)
Training can be finicky. Use a relational auto-encoder.
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 145 / 174
Extension to less than two frames
![Page 21: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/21.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 21/53
Extension to less than two frames
yy
yi
z
zk
y j
We can combine this with standard hidden units in one model.The combination tends to work better recognition (Ranzato et al.,
2010).
The vanilla hidden units then plays the role of
“higher-order-biases” (Memisevic, 2007).
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 146 / 174
Extension to less than two frames
![Page 22: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/22.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 22/53
Extension to less than two frames
yy
yi
z
zk
y j
Learning higher-order within-image structure has been suggestedto address the fact that ICA does not really yield independent
components...
Add layers to model correlations of filter responses.
Closely related to Deep Learning.
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 147 / 174
Some within image covariance and mean filters
![Page 23: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/23.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 23/53
Some within image covariance and mean filters
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 148 / 174
Within-image correlations
![Page 24: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/24.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 24/53
Within-image correlations
(Karklin, Lewicki; 2008), (Osindero et al., 2006), ...
ISA itself used mainly for modeling within-image structure.
(Ranzato et al., 2010) suggest combining covariance features andtraditional “mean” features, for example to generate images with
an MRF:
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 149 / 174
mcRBMs on TIMIT
![Page 25: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/25.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 25/53
mcRBMs on TIMIT
mcRBM applied to speech recognition (phones, speaker
independent, TIMIT)(Dahl, et al.; 2010)
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 150 / 174
Transparent motion
![Page 26: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/26.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 26/53
Transparent motion
180 240 300 0 60 120angle
0.0
0.02
0.04
0.06
0.08
0.10
0.12
p r o b a b i l i t y
Hidden variables make extracting multiple, simultaneous motions
easy.When they fail they do so in a similar way as humans:
Better disrimination at large angles, averaging at very small
angles, “motion repulsion”.
(eg., Treue et al., 2000)Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 151 / 174
Depth as a latent variable
![Page 27: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/27.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 27/53
Depth as a latent variable
Learning a dictionary for stereo:
Generate left-right camera pairs with known disparities.Predict disparity from the hidden units.
This gives rise to a three-layer network, that may be trained with
Hebbian-like learning.
rectified
not rectified
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 152 / 174
Hiddens learn to encode disparities
![Page 28: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/28.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 28/53
Hiddens learn to encode disparities
Can use this to encode 3d-structure implicitly, for example, for
multi-view recognition.
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 153 / 174
Norb stereo features
![Page 29: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/29.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 29/53
Norb stereo features
NORB training subset: NORB testset:
RBMmon RBMbin cc cc+bin RBMbin cc cc+bin73.65 60.43 34.85 31.48 63.28 38.91 36.80
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 154 / 174
“Harnessing the aperture problem”
![Page 30: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/30.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 30/53
g p p
Transformations are transformation invariant.
The 2-D subspace projections, however, are at the same time
affected by the aperture problem, so they are selective to othersources of variability, including object ID!
We can use the aperture effect to build invariant features:
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 155 / 174
Harnessing the aperture problem
![Page 31: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/31.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 31/53
g p p
y jxiyx
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 156 / 174
Harnessing the aperture problem
![Page 32: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/32.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 32/53
g p p
y jxiyx
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 156 / 174
Harnessing the aperture problem
![Page 33: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/33.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 33/53
g p p
y jxiyx
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 156 / 174
Harnessing the aperture problem
![Page 34: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/34.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 34/53
y jxi
z
zk
yx
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 156 / 174
Harnessing the aperture problem
![Page 35: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/35.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 35/53
y jxi
z
zk
yx
pose-independent, content-independent
pose-independent, content-dependent
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 156 / 174
Rotation “quadrature” filters
![Page 36: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/36.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 36/53
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 157 / 174
Rotation “quadrature” filters
![Page 37: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/37.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 37/53
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 158 / 174
Representing digits using rotation aperture
![Page 38: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/38.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 38/53
features
aperture feature similarities image similarities
0
01
1 2
2
Learn rotation features. Represent digits using aperture features.
No video available? Fill video buffer with copies of the same
image: Represent the non-transformation.
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 159 / 174
Rotated MNIST error rates
![Page 39: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/39.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 39/53
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 160 / 174
Video object features
![Page 40: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/40.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 40/53
Humans do not recognize still images but videos of objects.
The way in which an object changes can convey useful
information about the object, including 3-D structure.→ Learn features from videos not still images. For example,
(Lee and Soatto, 2011).
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 161 / 174
The “norbjects” video dataset
![Page 41: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/41.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 41/53
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 162 / 174
3-D rotation subspaces
![Page 42: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/42.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 42/53
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 163 / 174
3-D rotation subspaces
![Page 43: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/43.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 43/53
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 164 / 174
3-D rotation subspaces
![Page 44: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/44.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 44/53
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 165 / 174
3-D rotation subspaces
![Page 45: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/45.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 45/53
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 166 / 174
3-D rotation subspaces
![Page 46: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/46.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 46/53
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 167 / 174
3-D rotation subspaces
![Page 47: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/47.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 47/53
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 168 / 174
“Harnessing the aperture problem”
![Page 48: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/48.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 48/53
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 169 / 174
Mocap
![Page 49: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/49.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 49/53
(Taylor, Hinton; 2009), (Taylor,
et al.; 2010)
Learning models on mocap
instead of images makes it
possible to model motion style
and to perform tracking.
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 170 / 174
More Tracking
![Page 50: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/50.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 50/53
(Bazzani et al.), (Larochelle, Hinton, 2011)
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 171 / 174
Outline
![Page 51: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/51.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 51/53
1 Introduction
Feature Learning
Correspondence in Computer VisionRelational feature learning
2 Learning relational features
Sparse Coding Review
Encoding relations
Inference
Learning
3 Factorization, eigen-spaces and complex cells
Factorization
Eigen-spaces, energy models, complex cells4 Applications
Applications
Conclusions
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 172 / 174
Conclusions
![Page 52: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/52.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 52/53
Learning is a way to support simplicity and homogeneity of
complex, intelligent systems.Feature learning even more so.
Relational feature learning even more:
Learning “verbs”, not just “nouns”, can help address more taskswith a single kind of model.
This seems like a very good reason to have complex cells.
One reason, why looking for correspondences – across frames,
across views, across modalities, etc. – is a common operation, is
that mappings between modalities are often one-to-many .
The theory provides a strong inductive bias for products and/or
squaring non-linearities when building deep learning models.
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 173 / 174
Thank you
![Page 53: Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning](https://reader036.fdocuments.in/reader036/viewer/2022062401/577ce7db1a28abf10395ecfd/html5/thumbnails/53.jpg)
7/31/2019 Mvfl_part4 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart4-cvpr2012-spatio-temporal-and-higher-order-feature-learning 53/53
More info, code, links, etc. at
http://www.cs.toronto.edu/~rfm/
multiview-feature-learning-cvpr/index.html
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 174 / 174