Latent Variable / Hierarchical Models in Computational Neural Science. Ying Nian Wu, UCLA Department of Statistics. March 30, 2011.
[Page 1]
Latent Variable / Hierarchical Models in Computational Neural Science
Ying Nian Wu, UCLA Department of Statistics
March 30, 2011
[Page 2]
Outline
• Latent variable models in statistics
• Primary visual cortex (V1)
• Modeling and learning in V1
• Layered hierarchical models
• Joint work with Song-Chun Zhu and Zhangzhang Si
[Page 3]
Latent variable models
Hidden
Observed
Learning: Examples
Inference:
[Page 4]
Latent variable models Mixture model
Factor analysis
[Page 5]
Latent variable models
Hidden
Observed
Learning: Examples
Maximum likelihood
EM/gradient
Inference / explaining away
E-step / imputation
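To make the learning and inference steps concrete, here is a minimal sketch (ours, not from the talk) of EM for a two-component 1-D Gaussian mixture: the E-step imputes the hidden labels through responsibilities, and the M-step re-estimates the parameters by weighted maximum likelihood. The function names and the toy data are illustrative.

```python
import numpy as np

def em_gaussian_mixture(y, n_iter=100):
    """EM for a two-component 1-D Gaussian mixture: hidden z_i in {0, 1}, observed y_i."""
    rng = np.random.default_rng(0)
    pi = 0.5                                     # mixing weight p(z = 1)
    mu = rng.choice(y, size=2, replace=False)    # component means
    var = np.array([y.var(), y.var()])           # component variances
    for _ in range(n_iter):
        # E-step (inference / imputation): responsibilities p(z_i = k | y_i).
        dens = np.stack([
            np.exp(-0.5 * (y - mu[k]) ** 2 / var[k]) / np.sqrt(2 * np.pi * var[k])
            for k in range(2)
        ])                                       # shape (2, n)
        w = np.array([1.0 - pi, pi])[:, None] * dens
        r = w / w.sum(axis=0, keepdims=True)
        # M-step: maximum-likelihood re-estimation given the responsibilities.
        nk = r.sum(axis=1)
        mu = (r * y).sum(axis=1) / nk
        var = (r * (y - mu[:, None]) ** 2).sum(axis=1) / nk
        pi = nk[1] / len(y)
    return pi, mu, var

# Toy example: two overlapping clusters of observations.
y = np.concatenate([np.random.normal(-2, 1, 300), np.random.normal(3, 1, 700)])
print(em_gaussian_mixture(y))
```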
[Page 6]
Computational neural science
Z: Internal representation by neurons
Y: Sensory data from outside environment
Hidden
Observed
Connection weights
Hierarchical extension: model Z by another layer of hidden variables, which explain Z the same way Z explains Y
Inference / explaining away
[Page 7]
Source: Scientific American, 1999
Visual cortex: layered hierarchical architecture
V1: primary visual cortex (simple cells, complex cells)
bottom-up/top-down
[Page 8]
Simple V1 cells (Daugman, 1985)

Gabor wavelets: localized sine and cosine waves

$$G(x_1, x_2) \propto \exp\!\left\{-\frac{1}{2}\left[\frac{x_1^2}{\sigma_1^2} + \frac{x_2^2}{\sigma_2^2}\right]\right\} e^{i x_1}$$

Translation, rotation, dilation of the above function
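As a concrete illustration of the Gabor wavelet above, the following sketch (ours) builds a unit-norm complex Gabor and applies rotation and dilation to the mother function. The window size, sigma values, and the unit frequency of e^{i x1} are illustrative choices, not the parameters used in the talk's experiments.

```python
import numpy as np

def gabor(size=33, scale=1.0, angle=0.0, sigma1=3.0, sigma2=6.0):
    """Gabor wavelet: Gaussian envelope times e^{i x1}, rotated by `angle` and dilated by `scale`."""
    half = size // 2
    xx, yy = np.meshgrid(np.arange(-half, half + 1), np.arange(-half, half + 1))
    # Rotate and dilate the coordinate frame.
    x1 = ( np.cos(angle) * xx + np.sin(angle) * yy) / scale
    x2 = (-np.sin(angle) * xx + np.cos(angle) * yy) / scale
    g = np.exp(-0.5 * ((x1 / sigma1) ** 2 + (x2 / sigma2) ** 2)) * np.exp(1j * x1)
    return g / np.linalg.norm(g)   # unit-norm basis element B_{x,s,alpha}

# Even (cosine) and odd (sine) Gabor pair at 45 degrees:
B = gabor(angle=np.pi / 4)
B_even, B_odd = B.real, B.imag
```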
[Page 9]
$$\langle I, B_{x,s,\alpha}\rangle = \sum_{x'} I(x')\, B_{x,s,\alpha}(x')$$

Image pixels
V1 simple cells
The basis elements $B_{x,s,\alpha}$ respond to edges
[Page 10]
Complex V1 cells (Riesenhuber and Poggio, 1999)

$$A(x, s, \alpha) = \max_{x' \approx x} \big|\langle I, B_{x',s,\alpha}\rangle\big|^2$$

Image pixels
V1 simple cells
V1 complex cells
Local max
Local sum

• Larger receptive field
• Less sensitive to deformation
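A small sketch (ours) of the two pooling stages: squared simple-cell responses at every position, followed by a local max over nearby positions, which enlarges the receptive field and gives tolerance to small deformations. The function names and the neighborhood radius are our own; `B` is assumed to be a complex Gabor like the one sketched above.

```python
import numpy as np

def simple_cell_map(image, B):
    """Squared inner products |<I, B_{x,s,alpha}>|^2 at every position x
    (a 'valid' 2-D correlation with the Gabor B, done directly in numpy)."""
    H, W = image.shape
    h, w = B.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.abs(np.sum(image[i:i + h, j:j + w] * np.conj(B))) ** 2
    return out

def complex_cell_map(sum1, radius=3):
    """Local max of squared responses over nearby positions (complex-cell pooling)."""
    H, W = sum1.shape
    out = np.zeros_like(sum1)
    for i in range(H):
        for j in range(W):
            i0, i1 = max(0, i - radius), min(H, i + radius + 1)
            j0, j1 = max(0, j - radius), min(W, j + radius + 1)
            out[i, j] = sum1[i0:i1, j0:j1].max()
    return out
```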
[Page 11]
Independent Component Analysis (Bell and Sejnowski, 1996)

$$I = c_1 B_1 + \cdots + c_N B_N = \mathbf{B} C, \qquad N = \dim(I)$$

$$c_i \sim p(c) \ \text{independently}, \quad i = 1, \ldots, N; \qquad p(c): \text{Laplacian/Cauchy}$$

$$C = \mathbf{B}^{-1} I = \mathbf{A} I$$

For training images $I_m$: $\ I_m = c_{m,1} B_1 + \cdots + c_{m,N} B_N = \mathbf{B} C_m$, so $C_m = \mathbf{B}^{-1} I_m = \mathbf{A} I_m$
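For illustration only, the following sketch (ours) estimates ICA basis functions and filters from random image patches using scikit-learn's FastICA, which is a different estimator from the infomax algorithm of Bell and Sejnowski cited above. The patch size, number of patches, and all function names are our own choices.

```python
import numpy as np
from sklearn.decomposition import FastICA

def ica_filters(images, patch=12, n_patches=20000, seed=0):
    """Estimate ICA basis functions B and filters A from random patches, so that
    C = A I gives roughly independent, heavy-tailed (Laplacian-like) responses.
    `images` is a list of 2-D grayscale arrays larger than the patch size."""
    rng = np.random.default_rng(seed)
    X = []
    for _ in range(n_patches):
        img = images[rng.integers(len(images))]
        i = rng.integers(img.shape[0] - patch)
        j = rng.integers(img.shape[1] - patch)
        X.append(img[i:i + patch, j:j + patch].ravel())
    X = np.asarray(X, dtype=float)
    X -= X.mean(axis=0)                      # center the patch data
    ica = FastICA(n_components=patch * patch, max_iter=500)
    C = ica.fit_transform(X)                 # sources c_{m,i}, one row per patch m
    B = ica.mixing_                          # columns are basis functions B_i (I = B C)
    A = ica.components_                      # rows are filters (C = A I)
    return B, A, C
```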
[Page 12]
Hyvarinen, 2000
[Page 13]
Sparse coding (Olshausen and Field, 1996)

$$I = c_1 B_1 + \cdots + c_N B_N, \qquad N > \dim(I)$$

$$c_i \sim p(c) \ \text{independently}, \quad i = 1, \ldots, N; \qquad p(c): \text{Laplacian/Cauchy/mixture of Gaussians}$$

For training images: $I_m = c_{m,1} B_1 + \cdots + c_{m,N} B_N$
[Page 14]
Sparse coding / variable selection

A dictionary of representational elements (regressors):

$$I = c_1 B_1 + \cdots + c_N B_N, \qquad N > \dim(I), \qquad c_i \sim p(c) \ \text{independently}, \quad i = 1, \ldots, N$$

Learning: from training images, $I_m = c_{m,1} B_1 + \cdots + c_{m,N} B_N$

Inference: sparsification, non-linear
• lasso / basis pursuit / matching pursuit
• mode and uncertainty of p(C|I)
• explaining-away, lateral inhibition
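Among the inference options listed above, matching pursuit is the simplest to spell out. The sketch below (ours) greedily selects dictionary elements and subtracts each selected element's contribution from the residual, which is the explaining-away / lateral-inhibition effect mentioned on the slide. The dictionary is assumed to have unit-norm columns.

```python
import numpy as np

def matching_pursuit(I, dictionary, n_select):
    """Greedy sparse-coding inference in the style of Mallat and Zhang, 93.
    I: image flattened into a vector; dictionary: matrix whose columns B_i have unit norm."""
    residual = I.astype(float).copy()
    coeffs = np.zeros(dictionary.shape[1])
    selected = []
    for _ in range(n_select):
        # Score every element against the current residual ...
        scores = dictionary.T @ residual
        i = int(np.argmax(np.abs(scores)))
        # ... keep the best one and explain away its contribution (lateral inhibition).
        coeffs[i] += scores[i]
        residual -= scores[i] * dictionary[:, i]
        selected.append(i)
    return coeffs, selected, residual
```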
[Page 15]
Olshausen and Field, 1996
[Page 16]
Restricted Boltzmann Machine (Hinton, Osindero and Teh, 2006)

$$p(H, V) = \frac{1}{Z(W)} \exp\Big\{ \sum_{i,j} W_{ij}\, h_i v_j \Big\}$$

$h_i,\; i = 1, \ldots, N$: hidden, binary
$V$: visible

P(V|H) and P(H|V) are factorized: no explaining-away
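A minimal sketch (ours) of one contrastive-divergence (CD-1) update for the RBM above. Because both conditionals factorize, Gibbs sampling alternates exact draws of H given V and V given H. Bias terms are omitted to match the energy shown, and the learning rate and array shapes are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(V, W, lr=0.01, rng=None):
    """One CD-1 step for p(H, V) ~ exp{ sum_ij W_ij h_i v_j } with binary H and V.
    V: (batch, n_visible) array of 0/1 data; W: (n_hidden, n_visible) weights."""
    if rng is None:
        rng = np.random.default_rng(0)
    # Positive phase: p(h_i = 1 | V) is a factorized sigmoid (no explaining-away).
    ph = sigmoid(V @ W.T)
    H = (rng.random(ph.shape) < ph).astype(float)
    # Negative phase: one Gibbs step back to the visible layer and up again.
    pv = sigmoid(H @ W)
    V_neg = (rng.random(pv.shape) < pv).astype(float)
    ph_neg = sigmoid(V_neg @ W.T)
    # Gradient estimate: data correlations minus (one-step) model correlations.
    grad = (ph.T @ V - ph_neg.T @ V_neg) / V.shape[0]
    return W + lr * grad
```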
[Page 17]
Energy-based model (Teh, Welling, Osindero and Hinton, 2003)

$$p(I) = \frac{1}{Z} \exp\Big\{ \sum_{i} \lambda_i\big(\langle I, B_i\rangle\big) \Big\}$$

Features, no explaining-away
Maximum entropy with marginals / exponential family with sufficient statistics

Markov random field / Gibbs distribution (Zhu, Wu, and Mumford, 1997; Wu, Liu, and Zhu, 2000):

$$p(I) = \frac{1}{Z} \exp\Big\{ \sum_{x,s,\alpha} \lambda_{s}\big(\langle I, B_{x,s,\alpha}\rangle\big) \Big\}$$
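To make the exponential-family form concrete, the sketch below (ours) evaluates the unnormalized log-density as a sum of learned potentials lambda applied to binned filter responses. Representing each lambda as a lookup table is our simplification, and the normalizing constant Z is left out, as it is handled by MCMC in the actual learning algorithms.

```python
import numpy as np

def unnormalized_log_p(responses, lambdas, bin_edges):
    """Unnormalized log p(I): sum over filters of lambda(<I, B>) at every position.
    responses: dict {filter_name: array of responses <I, B_{x,...}> over positions x}
    lambdas:   dict {filter_name: 1-D array of potential values, one per response bin}
    bin_edges: shared bin edges used to discretize the responses."""
    total = 0.0
    for name, r in responses.items():
        # Look up the learned potential for each (binned) response value.
        bins = np.clip(np.digitize(r, bin_edges) - 1, 0, len(lambdas[name]) - 1)
        total += lambdas[name][bins].sum()
    return total  # equals log p(I) + log Z; Z itself is intractable
```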
[Page 18]
Zhu, Wu, and Mumford, 1997; Wu, Liu, and Zhu, 2000
[Page 19]
Source: Scientific American, 1999
Visual cortex: layered hierarchical architecture
bottom-up/top-down
What is beyond V1? Hierarchical model?
[Page 20]
Hierarchical ICA / energy-based model?
• Larger features
• Must introduce nonlinearities
• Purely bottom-up
[Page 21]
Hierarchical RBM (Hinton, Osindero and Teh, 2006)

P(V, H) = P(H) P(V|H); the prior P(H) is in turn modeled by another RBM P(V', H) with visible layer V' = H

Layers: I, V, H, V'

Discriminative correction by back-propagation
Unfolding, untying, re-learning
[Page 22]
Hierarchical sparse coding
$$I = c_1 B_1 + \cdots + c_N B_N, \qquad B_i = B_{x,s,\alpha}$$

Attributed sparse coding elements: transformation group, topological neighborhood system

$$I = \sum_{i=1}^{n} c_i\, B_{x_i, s_i, \alpha_i} + U$$
Layer above : further coding of the attributes of selected sparse coding elements
[Page 23]
Active basis model (Wu, Si, Gong, Zhu, 10; Zhu, Guo, Wang, Xu, 05)
n-stroke template, n = 40 to 60, box = 100x100
[Page 24]
Active basis model (Wu, Si, Gong, Zhu, 10; Zhu et al., 05)
Yuille, Hallinan, Cohen, 92
n-stroke template, n = 40 to 60, box = 100x100
[Page 25]
Simplicity
• Simplest AND-OR graph (Pearl, 84; Zhu, Mumford, 06): AND for composition, OR for perturbations or variations of basis elements
• Simplest shape model: average + residual
• Simplest modification of the Olshausen-Field model
• Further sparse coding of attributes of sparse coding elements
[Page 26]
Bottom layer: sketch against texture
Only need to pool a marginal q(c) as null hypothesis
• natural images: explicit q(I) of Zhu, Mumford, 97
• this image: explicit q(I) of Zhu, Wu, Mumford, 97

Maximum entropy (Della Pietra, Della Pietra, Lafferty, 97; Zhu, Wu, Mumford, 97; Jin, S. Geman, 06; Wu, Guo, Zhu, 08)
Special case: density substitution (Friedman, 87; Jin, S. Geman, 06)
p(C, U) = p(C) p(U|C) = p(C) q(U|C) = p(C) q(U,C)/q(C)
[Page 27]
Shared sketch algorithm: maximum likelihood learning
Prototype: shared matching pursuit (closed-form computation)
Step 1: two max to explain images by maximum likelihood; no early decision on edge detection
Step 2: arg-max for inferring hidden variables
Step 3: arg-max explains away, thus inhibits (matching pursuit: Mallat, Zhang, 93)

Finding n strokes to sketch M images simultaneously; n = 60, M = 9
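The prototype shared matching pursuit can be sketched as follows (ours): at each step, score every candidate element by its pooled squared response over all M residual images, add the best one to the common template, and let each image explain that element away from its own residual. The per-image perturbations of the full shared sketch algorithm are omitted here, and all names are our own.

```python
import numpy as np

def shared_matching_pursuit(images, dictionary, n_strokes):
    """Shared matching pursuit prototype: one shared dictionary element per step, chosen
    by its pooled squared response over all M training images. images: list of flattened
    vectors; the columns of `dictionary` are assumed to have unit norm."""
    residuals = [img.astype(float).copy() for img in images]
    coeffs = np.zeros((len(images), dictionary.shape[1]))
    template = []                                          # indices of the shared strokes
    for _ in range(n_strokes):
        R = np.stack([dictionary.T @ r for r in residuals])  # (M, N) responses
        pooled = (R ** 2).sum(axis=0)                        # pooled score per candidate
        i = int(np.argmax(pooled))
        template.append(i)
        for m in range(len(residuals)):
            c = R[m, i]                                      # per-image coefficient
            coeffs[m, i] = c
            residuals[m] = residuals[m] - c * dictionary[:, i]  # explaining-away
    return template, coeffs
```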
[Page 28]
Bottom-up sum-max scoring (no early edge decision)
Top-down arg-max sketching
1. Reinterpreting MAX1: OR-node of AND-OR; MAX stands in for ARG-MAX in the max-product algorithm
2. Stick to the Olshausen-Field sparse top-down model: AND-node of AND-OR; active basis, SUM2 layer, "neurons" memorize shapes by sparse connections to the MAX1 layer
Hierarchical, recursive AND-OR / SUM-MAX
Architecture: more top-down than bottom-up
Neurons: more representational than operational (OR-neurons / AND-neurons)
Cortex-like sum-max maps: maximum likelihood inference
SUM1 layer: simple V1 cells of Olshausen, Field, 96
MAX1 layer: complex V1 cells of Riesenhuber, Poggio, 99
Scan over multiple resolutions
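A schematic sketch (ours) of the bottom-up scoring for one learned template: SUM1 maps hold squared Gabor responses, MAX1 takes a local max around each element's nominal position, and SUM2 sums the weighted MAX1 values of the template's selected elements. The real model scores each element through its learned lambda transformation; a plain weighted sum is used here for brevity, and all names are our own.

```python
import numpy as np

def sum_max_score(sum1_maps, template, radius=3):
    """SUM2 scoring for one template. sum1_maps: dict {orientation: 2-D array of squared
    Gabor responses (SUM1)}. template: list of (dx, dy, orientation, weight) for the
    selected basis elements, offsets relative to the template center. Returns the SUM2
    map; its arg-max gives the detected location, and each element's local arg-max
    (inside its MAX1 window) gives the top-down sketch."""
    H, W = next(iter(sum1_maps.values())).shape
    sum2 = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            score = 0.0
            for dx, dy, ori, w in template:
                cy, cx = y + dy, x + dx
                y0, y1 = max(0, cy - radius), min(H, cy + radius + 1)
                x0, x1 = max(0, cx - radius), min(W, cx + radius + 1)
                if y0 < y1 and x0 < x1:
                    # MAX1: local max of SUM1 around the element's nominal position.
                    score += w * sum1_maps[ori][y0:y1, x0:x1].max()
            sum2[y, x] = score
    return sum2
```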
[Page 29]
Bottom-up detection
Top-down sketching
SUM1
MAX1
SUM2
arg MAX1
Sparse selective connection as a result of learning
Explaining-away in learning but not in inference
Bottom-up scoring and top-down sketching
[Page 30]

[Page 31]
Scan over multiple resolutions and orientations (rotating template)
[Page 32]
Classification based on log likelihood ratio score
Freund, Schapire, 95; Viola, Jones, 04
[Page 33]
Adjusting the Active Basis Model by L2 Regularized Logistic Regression (by Ruixun Zhang)

• Exponential family model with q(I) negatives: logistic regression for p(class | image), partial likelihood
• Generative learning without negative examples: basis elements and hidden variables
• Discriminative adjustment with hugely reduced dimensionality: correcting the conditional independence assumption

L2 regularized logistic regression: re-estimated lambdas

Conditional on (1) the selected basis elements and (2) the inferred hidden variables, where (1) and (2) come from generative learning
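A minimal sketch (ours) of the discriminative adjustment: the features are the responses of the basis elements selected by generative learning, computed after arg-max inference of the hidden variables, and an L2-regularized logistic regression re-estimates the per-element weights (the lambdas). scikit-learn's LogisticRegression is used as the solver here; the regularization strength is illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def adjust_lambdas(pos_features, neg_features, C=1.0):
    """Discriminative adjustment of an active basis template.
    pos_features / neg_features: (n_examples, n_elements) arrays of responses of the
    selected basis elements after arg-max inference of the hidden variables.
    Returns re-estimated per-element weights ("lambdas") and the intercept."""
    X = np.vstack([pos_features, neg_features])
    y = np.concatenate([np.ones(len(pos_features)), np.zeros(len(neg_features))])
    clf = LogisticRegression(penalty="l2", C=C, max_iter=1000)   # L2-regularized
    clf.fit(X, y)
    return clf.coef_.ravel(), clf.intercept_[0]
```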
[Page 34]
Active basis templates
Adaboost templates
# of negatives: 10556 7510 4552 1493 12217
• Arg-max inference and explaining-away, no reweighting
• Residual images neutralize existing elements; same set of training examples

• No arg-max inference or explaining-away inhibition
• Reweighted examples neutralize existing classifiers; changing set of examples
double # elements
same # elements
[Page 35]
Mixture model of active basis templates fitted by EM/maximum likelihood with random initialization
MNIST, 500 total
[Page 36]
Learning active basis models from non-aligned images: EM-type maximum likelihood learning, initialized by single-image learning
[Page 37]
Learning active basis models from non-aligned images
[Page 38]
Learning active basis models from non-aligned images
[Page 39]

[Page 40]
Hierarchical active basis by Zhangzhang Si et al.
• AND-OR graph: Pearl, 84; Zhu, Mumford, 06
• Compositionality and reusability: Geman, Potter, Chi, 02; L. Zhu, Lin, Huang, Chen, Yuille, 08
• Part-based method: everyone et al.
• Latent SVM: Felzenszwalb, McAllester, Ramanan, 08
• Constellation model: Weber, Welling, Perona, 00

Low log-likelihood
High log-likelihood
[Page 41]
Simplicity
• Simplest and purest recursive two-layer AND-OR graph
• Simplest generalization of the active basis model
[Page 42]
AND-OR graph and SUM-MAX maps: maximum likelihood inference

Cortex-like, related to Riesenhuber, Poggio, 99
• Bottom-up sum-max scoring
• Top-down arg-max sketching
[Page 43]
Hierarchical active basis by Zhangzhang Si et al.
[Page 44]

[Page 45]

[Page 46]

[Page 47]
Shape script by composing active basis shape motifs
Representing elementary geometric shapes (shape motifs) by active bases (Si, Wu, 10)
Geometry = sketch that can be parametrized
[Page 48]
$$I = \sum_{i=1}^{n} c_i\, B_{x_i, s_i, \alpha_i} + U$$

$$(x_i, \alpha_i,\; i = 1, \ldots, n) \;\longrightarrow\; \text{shape motif } k:\ (x_{ik}, \alpha_{ik},\; i = 1, \ldots, n)$$
Summary

Bottom layer: Olshausen-Field (foreground) + Zhu-Wu-Mumford (background)
Maximum entropy tilting (Della Pietra, Della Pietra, Lafferty, 97): white noise, texture (high entropy), sketch (low and mid entropy), reversing the central limit theorem effect of information scaling

Build up layers:
(1) AND-OR, SUM-MAX (top-down arg-MAX)
(2) Perpetual sparse coding: further coding of attributes of the current sparse coding elements
    (a) residuals of attributes: continuous OR-nodes
    (b) mixture model: discrete OR-nodes