1
Image Parsing: Unifying Segmentation, Detection,
and Recognition
Shai Bagon, Oren Boiman
2
Image Understanding
• A long-standing goal of Computer Vision
• Consists of understanding:
  – Objects and visual patterns
  – Context
  – State / actions of objects
  – Relations between objects
  – Physical layout
  – Etc.

A picture is worth a thousand words…
3
Natural Language Understanding

• Very far from being solved
• Even NL parsing (syntax) is problematic
• Ambiguities require high-level (semantic) knowledge
4
Image Parsing

• Decomposition into constituent visual patterns:
  – Edge Detection
  – Segmentation
  – Object Recognition
5
Image Parsing Framework
[Diagram: a generic framework relating the image I and its scene description S, unifying low-level tasks (segmentation, edge detection) with high-level tasks (object recognition, classification).]
6
Inference:

• Top-down (Generative): Constellation, Star-Model, etc.
  P(S|I) ∝ P(I|S) P(S)
  + Consistent solutions
  – Slow
• Bottom-up (Discriminative): SVM, Boosting, Neural Nets, etc.
  q_j(S | Tests_j(I))
  + Fast
  – Possibly inconsistent

The top-down generative formulation is the approach used in "Image Parsing".
7
Coming up next…
• Define a (Monstrous) Generative model for Image Parsing
• How to perform s-l-o-w inference on such models (MCMC)
• How to accelerate inference using bottom-up cues (DDMCMC)
P(S|I) ∝ P(I|S) P(S)
8
Image Parsing Generative Model

A parse S consists of:
– No. of regions K
– Region shapes L_i and types ζ_i
– Region parameters Θ_i

P(S)  (component priors: uniform, uniform, exp, exp)

P(I|S) = ∏_{i=1}^{K} P(I_{R_i} | L_i, ζ_i, Θ_i)
10
Generic Regions

• Constant up to Gaussian noise:
  gl ~ N(μ, σ²)
• Gray level histogram:
  gl ~ (h_1, …, h_G)
• Quadratic form (shading):
  gl(x, y) ~ N(μ(x, y), σ²),  μ(x, y) = ax² + bxy + cy² + dx + ey + f
11
Faces

• Use a PCA model (Eigen-faces)
• Estimate Cov. Σ and principal components V_1, …, V_n

F ~ N(c_1 V_1 + … + c_n V_n, Σ)
12
Text region shapes

• Use spline templates
• Allow affine transformations
• Allow small deformations of the control points
• Shading intensity model
13
Problem Formulation

• Now we can compute  P(S|I) ∝ P(I|S) P(S)
• We'd like to optimize  S* = argmax_S P(S|I)
• over the space of parse graphs

P(S) = p(K) ∏_{i=1}^{K} p(L_i) p(ζ_i | L_i) p(Θ_i | ζ_i)

P(I|S) = ∏_{i=1}^{K} p(I_{R_i} | L_i, ζ_i, Θ_i)
15
Optimizing P(S|I) is not easy…

• Hybrid state space: continuous & discrete
  → rules out gradient methods
• Enormous number of local maxima
• Graphical model structure is not pre-determined
  → rules out Belief Propagation
16
Optimize by Sampling!

• Monte Carlo principle – use random samples to optimize!
  – Let's say we're given N samples from P(S|I): S_1, …, S_N
  – Given S_i it is easy to compute P(S_i|I)
  – Choose the best S_i! (see the sketch below)
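A minimal sketch of the principle in Python. The 1-D target here is a stand-in for P(S|I), and the "sampler" is a crude placeholder rather than true posterior draws (MCMC, introduced next, supplies the real ones):

```python
import numpy as np

rng = np.random.default_rng(0)

def p_unnorm(x):
    # 1-D stand-in for the unnormalized posterior; in image parsing this
    # role is played by P(S|I) ∝ P(I|S)P(S) on a candidate parse S.
    return 0.5 * np.exp(-x**2 / 2) + 0.5 * np.exp(-np.abs(x - 4) / 2)

# Monte Carlo principle: given N samples S_1..S_N, score each one and
# keep the best.
samples = rng.normal(0.0, 5.0, size=1000)   # placeholder draws
best = samples[np.argmax(p_unnorm(samples))]
print(best)
```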
17
Detour: Sampling methods

• How to sample from a (very) complex probability space
• Sampling algorithms
• Why the "Markov Chain" in Monte Carlo?
18
Example

• Sample from a bimodal density:

  p(x) = C · [ ½ e^{−x²/2} + ½ e^{−|x−4|/2} ]

  an equal-weight mixture of a Gaussian and a Laplacian, with an unknown normalization constant C
19
Markov Chain

• A sequence of random variables:
  X = {s_1, s_2, s_3, …},  t = 1, 2, …
• Markov property:
  p(x_{t+1} | x_1, …, x_t) = p(x_{t+1} | x_t)
  Given the present, the future is independent of the past.
• Transition:
  p_{t+1} = p_t K,  K_{ij} = K(s_i → s_j)

  e.g.  K = [  0    1    0
               0   0.1  0.9
              0.6  0.4   0  ]
20
Markov Chain – cont.

• Under certain conditions the MC converges to a unique distribution
• Stationary distribution – the first (left) eigenvector of K, with eigenvalue 1:
  p̂ = p̂ K  (if p_t = p̂ then p_{t+1} = p_t K = p̂; computed in the sketch below)
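A minimal sketch, using the 3-state example transition matrix from the previous slide:

```python
import numpy as np

# The 3-state example transition matrix (each row sums to 1).
K = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.1, 0.9],
              [0.6, 0.4, 0.0]])

# The stationary distribution solves p = pK: the left eigenvector of K
# with eigenvalue 1, i.e. an eigenvector of K transposed.
vals, vecs = np.linalg.eig(K.T)
p_hat = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
p_hat = p_hat / p_hat.sum()        # normalize to a probability vector

# Sanity check: iterating p_{t+1} = p_t K converges to p_hat.
p = np.array([1.0, 0.0, 0.0])
for _ in range(500):
    p = p @ K
print(p_hat, p)                    # the two vectors should agree
```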
21
Markov Chain Monte Carlo

• Reminder:  p_{t+1} = p_t K;  if p_t = p̂ then p_{t+1} = p̂ K = p̂
• Had we wanted a sample from p̂: take the value of X_t for large t
• How to make our p the stationary distribution of the MC?
• How to guarantee convergence?
22
Markov Chain convergence

• Irreducibility:
  – The walk can reach any state starting at any state
• Non-periodicity:
  – The stationary distribution cannot depend on t
23
How to make p(x) Stationary

• Detailed balance: p(.) is the stationary distribution if, for all x, x*:

  p(x) K(x → x*) = p(x*) K(x* → x)

  (forward step and backward step carry the same distribution p(.))
• Written as a matrix product: p = pK
• Detailed balance is a sufficient condition for convergence to p(x):

  Σ_{x∈S} p(x) K(x → x*) = Σ_{x∈S} p(x*) K(x* → x) = p(x*) Σ_{x∈S} K(x* → x) = p(x*)

  (the transition probabilities out of x* sum to 1, and p(x*) is independent of x)
24
Kernel Selection

• Detailed balance requires a kernel K(x → x*)
• Metropolis-Hastings kernel:
  – Proposal q(x*|x): where to go next
  – Acceptance A(x, x*): should we go

    A(x, x*) = min( 1, [ p(x*) q(x|x*) ] / [ p(x) q(x*|x) ] )

• The MH kernel provides detailed balance

Among the ten most influential algorithms in science and engineering.
25
Metropolis Hastings

• Sample x* ~ q(x*|x_t)
• Compute the acceptance probability:

  A(x_t, x*) = min( 1, [ p(x*) q(x_t|x*) ] / [ p(x_t) q(x*|x_t) ] )

• If rand < A:  x_{t+1} = x*
• Else:  x_{t+1} = x_t

(see the sketch below)
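A minimal sketch of the algorithm in Python, run on the bimodal example target from slide 18 (the proposal width and step count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def p_unnorm(x):
    # Unnormalized bimodal target from the earlier example (reconstructed
    # form): an equal mixture of a Gaussian at 0 and a Laplacian at 4.
    return 0.5 * np.exp(-x**2 / 2) + 0.5 * np.exp(-np.abs(x - 4) / 2)

def metropolis_hastings(n_steps, sigma=1.0, x0=0.0):
    xs = np.empty(n_steps)
    x = x0
    for t in range(n_steps):
        x_star = rng.normal(x, sigma)      # proposal q(x*|x) = N(x, sigma^2)
        # Symmetric proposal, q(x|x*) = q(x*|x), so the acceptance ratio
        # reduces to p(x*)/p(x); normalization constants cancel.
        A = min(1.0, p_unnorm(x_star) / p_unnorm(x))
        if rng.random() < A:               # accept with probability A
            x = x_star
        xs[t] = x                          # on reject, the chain repeats x
    return xs

samples = metropolis_hastings(50_000)
print(samples.mean(), samples.std())
```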
26
Can we use any q(.) ?
1. Easy to sample from:
   – we sample from q(.) instead of p(.)
27
Can we use any q(.) ?
2. Supports p(x):
   p(x) > 0  ⇒  q(x) > 0

[Plot: p(x) and q(x).]
28
Can we use any q(.) ?
3. Explores p(x) wisely:
   – Too narrow q(.): q(x*|x) ~ N(x, 0.1)
   – Too wide q(.): q(x*|x) ~ N(0, 20)

[Plot: p(x) against a too-narrow and a too-wide q(x).]
29
Can we use any q(.) ?
1. Easy to sample from:
   – we sample from q(.) instead of p(.)
2. Supports p(x):
   – p(x) > 0 ⇒ q(x) > 0
3. Explores p(x) wisely:
   – q(.) too narrow → slow exploration
   – q(.) too wide → low acceptance

• The best q(.) is p(.) – but we can't sample p(.) directly.
30
Combining Kernels

• Suppose we have kernels K_i(x → x*), i = 1, …, m, each satisfying detailed balance with the same p(x):

  p(x) K_i(x → x*) = p(x*) K_i(x* → x)

• Then K = Σ_i w_i K_i (with Σ_i w_i = 1) also satisfies detailed balance.
31
Combining MH Kernels

• The same applies to Metropolis-Hastings kernels:
  – Combining MH kernels with different proposals
  – The MC will still converge to p(x) (sketch below)
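A minimal sketch: two symmetric random-walk MH sub-kernels over the earlier example target, mixed with fixed weights (all constants are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)

def p_unnorm(x):
    return 0.5 * np.exp(-x**2 / 2) + 0.5 * np.exp(-np.abs(x - 4) / 2)

# Two symmetric random-walk proposals with different step sizes; each one
# defines an MH sub-kernel in detailed balance with the same p(x).
step_sizes = [0.1, 5.0]     # local move, long-range move
weights = [0.7, 0.3]        # mixing weights w_i, summing to 1

def mixed_mh_step(x):
    sigma = step_sizes[rng.choice(len(step_sizes), p=weights)]
    x_star = rng.normal(x, sigma)
    # Symmetric proposal: acceptance reduces to the target ratio.
    if rng.random() < min(1.0, p_unnorm(x_star) / p_unnorm(x)):
        return x_star
    return x

x = 0.0
for _ in range(10_000):
    x = mixed_mh_step(x)    # the mixed chain still converges to p(x)
```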
32
Example Revisited

• Proposal distribution:  q(x*|x) = N(x, 0.25)
• Acceptance:

  A = min( 1, [ ½(N(x*) + L(x*)) · q(x|x*) ] / [ ½(N(x) + L(x)) · q(x*|x) ] )

  where N and L are the Gaussian and Laplacian components of p. Given x it is easy to compute p(x), and the normalization factor C cancels out.
33
Example – cont.
34
MAP Estimation

• MCMC converges to samples from p(x); we want  x* = argmax_x p(x)
• Simulated Annealing: sample from  p_i(x) ∝ p(x)^{1/T_i}
  – explore less – exploit more!
• As T_i → 0, the density is peaked at the global maxima
35
Annealing – example

• As T_i → 0, the density is peaked at the global maxima (see the sketch below)
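A minimal sketch, reusing the example target; the initial temperature and cooling rate are arbitrary choices, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(2)

def p_unnorm(x):
    return 0.5 * np.exp(-x**2 / 2) + 0.5 * np.exp(-np.abs(x - 4) / 2)

def simulated_annealing(n_steps, T0=5.0, alpha=0.999, sigma=1.0):
    x, T = 0.0, T0
    best_x, best_p = x, p_unnorm(x)
    for _ in range(n_steps):
        x_star = rng.normal(x, sigma)     # symmetric random-walk proposal
        # MH acceptance against the tempered target p(x)^(1/T_i).
        A = min(1.0, (p_unnorm(x_star) / p_unnorm(x)) ** (1.0 / T))
        if rng.random() < A:
            x = x_star
        if p_unnorm(x) > best_p:
            best_x, best_p = x, p_unnorm(x)
        T = max(alpha * T, 1e-3)          # cool: T_i -> 0
    return best_x

print(simulated_annealing(20_000))        # ends up near the global mode
```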
36
Model Selection

• Dimensionality varies in our space:
  – varying number of regions
  – varying types of explanations per region
• Cannot directly compare the density of different states!
37
• Pair-wise common measure
Jump across dimensions
38
Reversible Jumps

• Common measure:
  – Sample extensions u and u* s.t. dim(u) + dim(x) = dim(u*) + dim(x*)
  – Use the common dimension for comparison, via invertible deterministic functions h and h':
    h(x, u) = h'(x*, u*)
  – Explicitly allow the reverse jump x* → x (worked example below)
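As a concrete illustration (a standard construction, assumed here rather than taken from the slides): to split one scalar region parameter θ into two, sample u ~ q(u) and set (θ_i, θ_j) = h(θ, u) = (θ + u, θ − u). The merge move inverts it exactly: θ = (θ_i + θ_j)/2, u = (θ_i − θ_j)/2. The dimensions match, dim(θ) + dim(u) = 2 = dim(θ_i) + dim(θ_j), h is invertible, and the acceptance ratio picks up the Jacobian factor |∂(θ_i, θ_j)/∂(θ, u)| = 2.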
39
MCMC Summary

• Sample p(x) using a Markov Chain
• Proposal q(x*|x)
  – Supports p(x)
  – Guides the sampling
• Detailed balance
  – The MH kernel ensures convergence to p(x)
• Reversible jumps
  – Comparing across models and dimensions
40
MCMC – Take home message

If you want to make a new sample,
You should first learn how to propose.
Acceptance is random;
Eventually you'll get trapped in endless chains
until you become stationary.
Some say it is better to do reversible jumps between models.
41
Back to image parsing

• A state is a parse tree
• Moves between possible parses of the image:
  – varying number of regions
  – different region types: Text, Face and Generic
  – varying number of parameters
42
MCMC Moves

• Birth / Death of a Face / Text
• Split / Merge of a generic region
• Model switching for a region
• Region boundary evolution
44
Moves -> Kernel

The full transition kernel K(S*|S; I) combines sub-kernels:
• Text sub-kernel: TextBirth / TextDeath
• Face sub-kernel: FaceBirth / FaceDeath
• Generic sub-kernel: SplitRegion / MergeRegion
• ModelSwitching
• BoundaryEvolution

Dimensionality change: birth/death and split/merge must allow reversible jumps.
45
Using bottom-up cues

• So far we haven't stated the proposal probabilities q(.)
• If q(.) is uninformed of the image, convergence can be painfully slow
• Solution: use the image to propose moves

[Figure: face-birth kernel driven by image cues.]
46
Data Driven MCMC

• Define proposal probabilities q(x*|x; I)
• The proposal probabilities depend on discriminative tests:
  – Face detection
  – Text detection
  – Edge detection
  – Parameter clustering
• A generative model with discriminative proposals
47
Face/Text Detection

• Bottom-up cue: AdaBoost – hard classification:
  l(I) = sign( Tst_Ada(I) ),  Tst_Ada = Σ_i α_i h_i
• Estimate the posterior instead (sketch below):
  q(l | I) ∝ exp( l · Tst_Ada(I) ),  l ∈ {−1, +1}
• Run on sliding windows at several scales
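A sketch of the posterior-instead-of-sign idea; `boost_score` is a hypothetical stand-in for a trained AdaBoost evaluator, and a single window size stands in for the multi-scale scan:

```python
import numpy as np

def adaboost_posterior(score):
    """Turn a raw boosting score Tst_Ada(I) = sum_i alpha_i h_i(I) into an
    approximate posterior q(l=+1 | I) = e^s / (e^s + e^-s) = sigmoid(2s)."""
    return 1.0 / (1.0 + np.exp(-2.0 * score))

def scan(image, boost_score, win=24, stride=8):
    """Slide a window over the image and keep confident detections.
    `boost_score` is a stand-in for a trained AdaBoost evaluator."""
    h, w = image.shape
    proposals = []
    for y in range(0, h - win, stride):
        for x in range(0, w - win, stride):
            p = adaboost_posterior(boost_score(image[y:y+win, x:x+win]))
            if p > 0.5:
                proposals.append((x, y, win, p))  # candidate window + weight
    return proposals
```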
48
Edge Map

• Canny edge detection at several scales
• Only these edges are used to propose split / merge moves
49
Parameter clustering

• Estimate likely parameter settings in the image
• Cluster using Mean-Shift (sketch below)
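A sketch with scikit-learn's MeanShift; the per-patch parameter estimates and the bandwidth are placeholder assumptions:

```python
import numpy as np
from sklearn.cluster import MeanShift

# Stand-in for per-patch parameter estimates gathered from the image,
# e.g. local (mean, variance) of the gray levels.
params = np.random.rand(500, 2)

ms = MeanShift(bandwidth=0.2)      # bandwidth is a tuning assumption
labels = ms.fit_predict(params)
centers = ms.cluster_centers_      # likely parameter settings Theta
print(len(centers), "candidate parameter clusters")
```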
50
How to propose?

• q(S*|S, I) should approximate p(S*|I)
• Choose one sub-kernel at random (e.g., create face)
• Use bottom-up cues to generate proposals: S_1, S_2, …
• Weight each proposal according to p(S_i|I)
• Sample from the resulting discrete distribution (sketch below)
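A sketch of the weight-then-sample step; `posterior_ratio` is a hypothetical scorer approximating p(S_i|I) up to a constant:

```python
import numpy as np

rng = np.random.default_rng(3)

def propose(candidates, posterior_ratio):
    """Weight each candidate parse S_i by a score approximating p(S_i|I)
    (up to a constant), normalize, and sample the discrete distribution.
    `posterior_ratio` is a hypothetical scoring function."""
    w = np.array([posterior_ratio(c) for c in candidates], dtype=float)
    w /= w.sum()                              # weights -> probabilities
    i = rng.choice(len(candidates), p=w)
    return candidates[i], w[i]                # proposal and its probability
```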
51
Generic region – split/merge

• Split/merge according to the edge map
• Dimensionality change – must be reversible

[Figure: a split move S → S'.]
52
Generic region – split/merge

• Splitting k into i, j:  S_k → S_ij
• Proposals are weighted:

  w_split = P(S_ij | I) / P(S_k | I) = [ P(I | S_ij) P(S_ij) ] / [ P(I | S_k) P(S_k) ]

• Normalize the weights to probabilities
• Sample
54
Faces sub-kernel

• Adding a face: S → S'
• Take the AdaBoost proposals
• Compute weights w_i = P(S'|I) / P(S|I)
• Normalize the weights to probabilities
• Sample
• Reversible kernel – add/remove-face kernel
55
Accept / Reject

• We have the proposal q(S'|S; I)
• Check the Metropolis-Hastings acceptance:

  A = min( 1, [ q(S|S'; I) p(S'|I) ] / [ q(S'|S; I) p(S|I) ] )
56
Full diagram

Discriminative (bottom-up): the input image feeds Text Detection, Face Detection, Edge Detection and Parameter Clustering.

Generative (top-down): these cues drive the sub-kernels of K(S*|S; I):
• Text sub-kernel: TextBirth / TextDeath
• Face sub-kernel: FaceBirth / FaceDeath
• Generic sub-kernel: SplitRegion / MergeRegion
• ModelSwitching
• BoundaryEvolution
57
Results

[Slides 57–61: example image-parsing results (figures only).]
62
Limitations

• Scaling to a large number of objects:
  – algorithm design complexity
  – convergence speed
  – dealing with complex objects
• Good synthesis / detection, but not-so-good segmentation
63
Extensions

[Slides 63–65: extensions (figures only).]
66
Summary

• Image Parsing:
  – decomposition into constituent visual patterns
• Top-down generative model for parse graphs
• Optimization using DDMCMC:
  – MCMC
  – discriminative bottom-up proposals
67
References

• Z. Tu, X. Chen, A. L. Yuille, and S.-C. Zhu. Image Parsing: Unifying Segmentation, Detection, and Recognition. International Journal of Computer Vision, 2005.
• Z. Tu and S.-C. Zhu. Image Segmentation by Data-Driven Markov Chain Monte Carlo. IEEE Trans. Pattern Analysis and Machine Intelligence, 2002.
• Z. Tu, X. Chen, A. L. Yuille, and S.-C. Zhu. Image Parsing: Unifying Segmentation, Detection, and Recognition. IEEE International Conference on Computer Vision, 2003.
• C. Andrieu, N. de Freitas, A. Doucet, and M. I. Jordan. An Introduction to MCMC for Machine Learning. Machine Learning, 50:5–43, 2003.
68
Backups
70
Example

• Compute the posterior for a simple GMM:
  – Given one x, which component M of the mixture generated it?
    p(M|x) ∝ p(x|M) p(M)
  – Exhaustive search over the components (sketch below)
  – What if the space were larger?
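A sketch of the exhaustive computation for a hypothetical 3-component 1-D mixture:

```python
import numpy as np
from scipy.stats import norm

# Hypothetical 3-component 1-D GMM: mixing weights, means, stds.
weights = np.array([0.5, 0.3, 0.2])
means = np.array([-2.0, 0.0, 3.0])
stds = np.array([1.0, 0.5, 1.5])

def component_posterior(x):
    """p(M|x) ∝ p(x|M) p(M), normalized by exhaustive enumeration."""
    lik = norm.pdf(x, loc=means, scale=stds)  # p(x|M) for every component
    post = weights * lik
    return post / post.sum()

print(component_posterior(0.3))
```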
71
Example revisited
74
Binarization

• Extracting text boundaries
• Adaptive thresholding (sketch below):

  Thr_Window = mean_Window + 2 · std_Window
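A sketch of the windowed threshold, with the constant 2 taken from the slide's formula; the window size is an arbitrary choice:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def adaptive_threshold(img, win=15):
    """Per-pixel threshold Thr = local mean + 2 * local std over a
    win x win window (the constant 2 follows the slide's formula)."""
    img = img.astype(float)
    mean = uniform_filter(img, size=win)
    sq_mean = uniform_filter(img ** 2, size=win)
    std = np.sqrt(np.maximum(sq_mean - mean ** 2, 0.0))
    return img > (mean + 2.0 * std)    # binary mask of pixels above Thr
```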
75
What's so special about Text?

• The information lies in the boundary:
  – AdaBoost suggests a region
  – Adaptive binarization refines the boundary
76
Model selection

• The state space is a union of model subspaces of different dimensions
• How can we compare densities across dimensions?

[Plot: densities defined over subspaces of different dimensionality.]
77
Parameter clustering

• Each cluster in parameter space induces a saliency map

[Figure: saliency maps for shading and gray-level clusters.]
78
Generic region – split/merge

• Splitting k into i, j, or merging i, j into k
• Suggestions are weighted:

  w_split(k → ij) = [ q(R_i, R_j) · q(L_i) q(Θ_i) · q(L_j) q(Θ_j) ] / [ p(I_{R_k} | L_k, Θ_k) · p(L_k) p(Θ_k) ]

  w_merge(ij → k) = [ q(R_i, R_j) · q(L_k) q(Θ_k) ] / [ p(I_{R_i} | L_i, Θ_i) p(I_{R_j} | L_j, Θ_j) · p(L_i) p(Θ_i) p(L_j) p(Θ_j) ]

  where
  – q(R_i, R_j): region affinity
  – q(L): shape prior
  – q(Θ): parameter clustering
  – p(I_R | L, Θ): current region probability
  – p(L), p(Θ): current parameter probability
79
Switching a node's attributes

• No dimensionality change
• Weight the proposals by:

  w_change = [ q(L'_i) q(ζ'_i) q(Θ'_i) ] / [ p(I_{R_i} | L_i, ζ_i, Θ_i) · p(L_i, ζ_i, Θ_i) ]
80
Boundary Evolution Kernel

• Does not change dimensionality
• For two adjacent regions i and j, the boundary moves according to:
  – Log-likelihood ratio:  log [ p(I(v); ζ_i) / p(I(v); ζ_j) ]  for a boundary pixel v
  – Changes in area
  – Boundary curvature
  – Deviation from control points (text)
  – Brownian noise