MAP Examples - University at Buffalo


Transcript of MAP Examples - University at Buffalo

Page 1: MAP Examples - University at Buffalo


MAP Examples

Sargur Srihari, [email protected]

Page 2: MAP Examples - University at Buffalo


Topics


• Potts Model
• CRF for OCR
• Image segmentation based on energy minimization

Page 3: MAP Examples - University at Buffalo


Examples of MAP

• Many interesting examples of MAP inference are instances of structured prediction, which involves doing inference in a conditional random field (CRF) model p(y∣x).

• Example of OCR given next
  – Where x is a character image and y is its recognized label

Page 4: MAP Examples - University at Buffalo


Potts models

• A generalization of the Ising model and Boltzmann model to multiple values

Page 5: MAP Examples - University at Buffalo


MAP inference in 1-D structure


We are given images xi ∈ [0,1]^{d×d} of characters in the form of pixel matrices; MAP inference jointly recognizes the most likely word (yi)_{i=1}^{n} encoded by the images.

Figure: Chain-structured conditional random field for OCR, linking observed images xi to recognized characters yi.
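
One way to compute this chain MAP assignment exactly is max-product dynamic programming (Viterbi). Below is a minimal sketch; the unary and pairwise score tables are hypothetical stand-ins for learned CRF log-potentials, not values from the slides.

```python
import numpy as np

def viterbi_map(unary, pairwise):
    """MAP assignment for a chain CRF.

    unary:    (n, k) array of unary log-potentials log phi(y_i, x_i)
    pairwise: (k, k) array of pairwise log-potentials log phi(y_i, y_{i+1})
    Returns the jointly most likely label sequence of length n.
    """
    n, k = unary.shape
    score = np.zeros((n, k))              # best log-score of a prefix ending in each label
    backptr = np.zeros((n, k), dtype=int)
    score[0] = unary[0]
    for i in range(1, n):
        # cand[a, b] = best score ending step i-1 in label a, then moving to label b
        cand = score[i - 1][:, None] + pairwise + unary[i][None, :]
        backptr[i] = cand.argmax(axis=0)
        score[i] = cand.max(axis=0)
    # trace back the argmax path
    y = [int(score[-1].argmax())]
    for i in range(n - 1, 0, -1):
        y.append(int(backptr[i][y[-1]]))
    return y[::-1]

# Toy run: 4 character images, 3 candidate labels, random stand-in scores.
rng = np.random.default_rng(0)
unary = rng.normal(size=(4, 3))       # stand-in for log phi(y_i, x_i)
pairwise = rng.normal(size=(3, 3))    # stand-in for log phi(y_i, y_j)
print(viterbi_map(unary, pairwise))
```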

Page 6: MAP Examples - University at Buffalo


MAP inference in a 2-D structure


We are interested in locating an entity in an image and labeling all of its pixels. Our input x ∈ [0,1]^{d×d} is a matrix of image pixels, and our task is to predict the label y ∈ {0,1}^{d×d}, indicating whether each pixel belongs to the object we want to recover.

Intuitively, neighboring pixels should have similar values in y, i.e., pixels associated with the horse should form one continuous blob (rather than white noise).

Figure: horse image with observed pixels xi and predicted per-pixel labels yi.

Page 7: MAP Examples - University at Buffalo


Potts model

• Prior knowledge modeled as PGMs
• We can introduce potentials ϕ(yi, xi) that encode the likelihood that any given pixel belongs to our subject
• We then augment them with pairwise potentials ϕ(yi, yj) for neighboring yi, yj, which encourage adjacent y's to take the same value with higher probability


Figure: grid MRF over pixel labels with unary potentials ϕ(yi, xi) and pairwise potentials ϕ(yi, yj).
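
As a minimal sketch of such a pairwise term in Potts form, the table below assigns the higher value exp(w) when neighboring labels agree and 1 otherwise; the weight w is an assumed hyperparameter, not a value from the slides.

```python
import numpy as np

def potts_pairwise(k, w=1.0):
    """Pairwise Potts potential phi(yi, yj) over k labels: value exp(w)
    when the two neighboring labels agree, 1 otherwise, so agreement
    is preferred whenever w > 0."""
    return np.where(np.eye(k, dtype=bool), np.exp(w), 1.0)

print(potts_pairwise(k=3, w=2.0))
```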

Page 8: MAP Examples - University at Buffalo


Ising Model in Statistical Physics

• Energy of interacting atoms
  – Determined from their spin
• An atom's spin is the sum of its electron spins
• Each atom is associated with a binary random variable
  – Xi ∈ {+1, −1} whose value is the direction of the atom's spin
• Energy function has the parametric form εij(xi, xj) = −wij xi xj
  – Symmetric in Xi, Xj: note the scope is pairwise
  – Makes a contribution of −wij to the energy when Xi = Xj (same spin), +wij otherwise
• Probability distribution over atoms (via the energy function):
  – wij > 0: model prefers aligned spins (ferromagnetism)
  – wij < 0: antiferromagnetic
  – wij = 0: non-interacting

P(ξ) = (1/Z) exp( −Σ_{i<j} wij xi xj − Σ_i ui xi ),   ξ ∈ Val(X)
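
The sketch below scores all spin configurations of a tiny fully connected model using the pairwise energies εij(xi, xj) = −wij xi xj and node terms ui xi defined above, under the convention P(ξ) ∝ exp(−E(ξ)); the weights are made-up values, and the brute-force normalization only works at this toy scale.

```python
import itertools
import numpy as np

# Toy Ising model on 3 atoms; weights are made-up values.
w = {(0, 1): 1.0, (0, 2): 1.0, (1, 2): 1.0}   # wij > 0: ferromagnetic
u = np.zeros(3)                                # no external field

def energy(x):
    """E(x) = sum_{i<j} eps_ij(x_i, x_j) + sum_i u_i x_i,
    with eps_ij(x_i, x_j) = -w_ij * x_i * x_j."""
    return sum(-wij * x[i] * x[j] for (i, j), wij in w.items()) + float(u @ x)

states = list(itertools.product([-1, +1], repeat=3))
unnorm = np.array([np.exp(-energy(np.array(s))) for s in states])
P = unnorm / unnorm.sum()                      # brute-force partition function Z
for s, p in zip(states, P):
    print(s, round(float(p), 3))
# The aligned configurations (-1,-1,-1) and (+1,+1,+1) receive the most mass,
# illustrating the ferromagnetic preference when wij > 0.
```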

Page 9: MAP Examples - University at Buffalo


Boltzmann Distribution

• Variant of the Ising model
• Variables Xi have values {0, 1} instead of {+1, −1}
  – Energy function has the same parametric form εij(xi, xj) = −wij xi xj
  – Nonzero contribution −wij from edge Xi–Xj only when Xi = Xj = 1
• The Ising model instead has contribution −wij when the variables are the same and +wij when they are different
• Mapping 0 to −1 gives the same form of energy function as the Ising model

P(ξ) = (1/Z) exp( −Σ_{i<j} wij xi xj − Σ_i ui xi )
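
A quick numeric check (with an arbitrary weight) that under the {0, 1} encoding the edge term −wij xi xj is nonzero only when both endpoints are 1:

```python
w = 2.0  # arbitrary edge weight
for xi in (0, 1):
    for xj in (0, 1):
        print(xi, xj, -w * xi * xj)   # contribution is -w only for xi = xj = 1
```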

Page 10: MAP Examples - University at Buffalo



CRF for pixel labeling

• A conditional probability distribution p(x|y)
• Neighboring pixel labels xi and xj are strongly correlated

Figure: noisy image, where xi = unknown pixel label and yi = known noisy pixel value.

Page 11: MAP Examples - University at Buffalo



Energy Functions

• Graph has two types of pairwise cliques
  – {xi, yi} expresses the strength of the link between a label and its pixel value
    • Energy function −η xi yi
  – {xi, xj} are neighboring pixel labels
    • Choose energy function −β xi xj

Page 12: MAP Examples - University at Buffalo



Potential Function

• Complete energy function of the model
  – The h xi term biases towards pixel values that have one particular sign
  – Defines a joint distribution over x and y, given below
  – We now fix the elements of y to the observed values given by the pixels
  – This implicitly defines a conditional distribution p(x|y) over the labeled image

E(x, y) = h Σ_i xi − β Σ_{i,j} xi xj − η Σ_i xi yi

p(x, y) = (1/Z) exp{ −E(x, y) }

(the sum over {i, j} ranges over neighboring pixel pairs)
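
A minimal sketch of this energy for binary denoising with xi, yi ∈ {−1, +1} on a 4-connected grid, plus a greedy coordinate-descent (ICM-style) minimizer of the resulting conditional p(x|y); the coefficients h, β, η and the toy image are assumptions for illustration.

```python
import numpy as np

def energy(x, y, h=0.0, beta=1.0, eta=2.0):
    """E(x, y) = h*sum_i x_i - beta*sum_{i,j} x_i x_j - eta*sum_i x_i y_i
    over a 2-D grid, with {i, j} the 4-connected neighbor pairs."""
    pair = np.sum(x[:, :-1] * x[:, 1:]) + np.sum(x[:-1, :] * x[1:, :])
    return h * x.sum() - beta * pair - eta * np.sum(x * y)

def icm_denoise(y, sweeps=5, **kw):
    """Greedy MAP: set each x_i to whichever value lowers E(x, y), i.e.
    coordinate descent on p(x|y) obtained by fixing y to the observed pixels."""
    x = y.copy()
    for _ in range(sweeps):
        for i in range(x.shape[0]):
            for j in range(x.shape[1]):
                for v in (-1, +1):
                    x_try = x.copy()
                    x_try[i, j] = v
                    if energy(x_try, y, **kw) < energy(x, y, **kw):
                        x = x_try
    return x

# Noisy observation of an all-+1 image (10% of pixels flipped to -1).
rng = np.random.default_rng(1)
y = np.where(rng.random((8, 8)) < 0.1, -1, +1)
print((icm_denoise(y) == 1).mean())   # most pixels recovered as +1
```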

Page 13: MAP Examples - University at Buffalo


Metric MRF for Labeling

• Task:
  – We are given a graph with nodes X1,…,Xn and edges E
  – Task is to assign to each Xi a label in V = {v1,…,vk}
    • E.g., labeling super-pixels as v1 = cow, v2 = grass
• Each node, in isolation, has a preferred label (see the sketch below)
  – E.g., when pixel xi has color = brown and the label is cow, ϕi(xi) is high
  – When pixel xi has color = green and the label is cow, ϕi(xi) is low
  – However, we want a smoothness constraint over neighbors
    • Neighboring nodes should have “similar” values
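
A hedged sketch of such per-node preferences: a hypothetical table of node potentials ϕi indexed by a super-pixel's dominant color (the numbers are illustrative, not from the slides).

```python
# Hypothetical node potentials phi_i, indexed by the super-pixel's dominant
# color; a higher value means that label is preferred for the node in isolation.
node_potential = {
    "brown": {"cow": 0.9, "grass": 0.1},   # brown super-pixel: cow preferred
    "green": {"cow": 0.1, "grass": 0.9},   # green super-pixel: grass preferred
}
print(node_potential["brown"]["cow"])      # high
print(node_potential["green"]["cow"])      # low
```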

Page 14: MAP Examples - University at Buffalo


Solution for Labeling

• Solution: in a pairwise MRF
  – Encode node preferences as node potentials
    • ϕi takes a higher value when xi is brown and the label is cow
  – Encode smoothness preferences as edge potentials
  – Model in negative log-space, using energy functions
• Specify the energy function (given below)
  – For the MAP objective, we can ignore the partition function
• Goal: minimize the energy (MAP objective)
  – How to define smoothness? Next.

E(x1,…,xn) = Σ_i εi(xi) + Σ_{(i,j)∈E} εij(xi, xj)

argmin_{x1,…,xn} E(x1,…,xn)
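
A minimal brute-force version of this MAP objective on a toy chain graph; the node and edge energies are random stand-ins, and exhaustive argmin is only feasible at this scale (practical solvers use dynamic programming, graph cuts, or LP relaxations).

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, k = 4, 3                                  # 4 nodes, 3 labels
edges = [(0, 1), (1, 2), (2, 3)]
eps_node = rng.normal(size=(n, k))           # stand-ins for eps_i(x_i)
eps_edge = {e: rng.normal(size=(k, k)) for e in edges}  # eps_ij(x_i, x_j)

def E(x):
    """E(x_1,...,x_n) = sum_i eps_i(x_i) + sum_{(i,j) in E} eps_ij(x_i, x_j)."""
    return (sum(eps_node[i, x[i]] for i in range(n))
            + sum(eps_edge[i, j][x[i], x[j]] for i, j in edges))

# MAP objective: argmin of the energy over all k^n joint assignments.
best = min(itertools.product(range(k), repeat=n), key=E)
print(best, E(best))
```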

Page 15: MAP Examples - University at Buffalo


Smoothness for Metric MRF


Page 16: MAP Examples - University at Buffalo


Generalizations of Smoothness for Metric MRF

1. Potts model (when there are more than two labels)
2. Distance function on labels
   – Prefer neighboring nodes to have labels a smaller distance apart
   – Metric MRF
     • Need a metric µ(vk, vl) on labels (a common choice is sketched below)
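
One common choice of label metric (an assumption here, not stated on the slide) is the truncated linear distance, which penalizes small label disagreements linearly but caps the penalty for large ones; the Potts model is the 0–1 special case.

```python
def truncated_linear(vk, vl, c=2):
    """mu(vk, vl) = min(|vk - vl|, c): a metric on ordered labels that
    grows linearly with the disagreement but is capped at c."""
    return min(abs(vk - vl), c)

def potts(vk, vl):
    """Potts special case: mu(vk, vl) = 0 if the labels agree, 1 otherwise."""
    return 0 if vk == vl else 1

print(truncated_linear(0, 3), potts(0, 3))   # -> 2 1
```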