MAP Examples - University at Buffalo


Transcript of MAP Examples - University at Buffalo

Page 1: MAP Examples - University at Buffalo


MAP Examples

Sargur Srihari, [email protected]

Page 2: MAP Examples - University at Buffalo


Topics


• Potts Model
• CRF for OCR
• Image segmentation based on energy minimization

Page 3: MAP Examples - University at Buffalo


Examples of MAP

• Many interesting examples of MAP inference are instances of structured prediction, which involves doing inference in a conditional random field (CRF) model p(y∣x).

• Example of OCR given next
  – Where x is a character image and y is its recognized label

Page 4: MAP Examples - University at Buffalo


Potts models

• A generalization of the Ising model and Boltzmann model to multiple values

Page 5: MAP Examples - University at Buffalo


MAP inference in 1-D structure


We are given images xi ∈ [0,1]^{d×d} of characters in the form of pixel matrices; MAP inference jointly recognizes the most likely word (yi)_{i=1}^{n} encoded by the images.

Figure: Chain-structured conditional random field for OCR, linking observed images xi to recognized characters yi.
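
One way to compute this chain MAP assignment exactly is max-product dynamic programming (Viterbi). Below is a minimal sketch; the unary and pairwise score tables are hypothetical stand-ins for learned CRF log-potentials, not values from the slides.

```python
import numpy as np

def viterbi_map(unary, pairwise):
    """MAP assignment for a chain CRF.

    unary:    (n, k) array of unary log-potentials log phi(y_i, x_i)
    pairwise: (k, k) array of pairwise log-potentials log phi(y_i, y_{i+1})
    Returns the jointly most likely label sequence of length n.
    """
    n, k = unary.shape
    score = np.zeros((n, k))              # best log-score of a prefix ending in each label
    backptr = np.zeros((n, k), dtype=int)
    score[0] = unary[0]
    for i in range(1, n):
        # cand[a, b] = best score ending step i-1 in label a, then moving to label b
        cand = score[i - 1][:, None] + pairwise + unary[i][None, :]
        backptr[i] = cand.argmax(axis=0)
        score[i] = cand.max(axis=0)
    # trace back the argmax path
    y = [int(score[-1].argmax())]
    for i in range(n - 1, 0, -1):
        y.append(int(backptr[i][y[-1]]))
    return y[::-1]

# Toy run: 4 character images, 3 candidate labels, random stand-in scores.
rng = np.random.default_rng(0)
unary = rng.normal(size=(4, 3))       # stand-in for log phi(y_i, x_i)
pairwise = rng.normal(size=(3, 3))    # stand-in for log phi(y_i, y_j)
print(viterbi_map(unary, pairwise))
```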

Page 6: MAP Examples - University at Buffalo


MAP inference in a 2-D structure


We are interested in locating an entity in an image and labeling all of its pixels. Our input x ∈ [0,1]^{d×d} is a matrix of image pixels, and our task is to predict the label y ∈ {0,1}^{d×d}, indicating whether each pixel belongs to the object we want to recover.

Intuitively, neighboring pixels should have similar values in y, i.e., pixels associated with the horse should form one continuous blob (rather than white noise).

Figure: horse image with observed pixels xi and predicted per-pixel labels yi.

Page 7: MAP Examples - University at Buffalo


Potts model

• Prior knowledge modeled as PGMs
• We can introduce potentials ϕ(yi, xi) that encode the likelihood that any given pixel belongs to our subject
• We then augment them with pairwise potentials ϕ(yi, yj) for neighboring yi, yj, which encourage adjacent y's to take the same value with higher probability


Figure: grid MRF over pixel labels with unary potentials ϕ(yi, xi) and pairwise potentials ϕ(yi, yj).
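
As a minimal sketch of such a pairwise term in Potts form, the table below assigns the higher value exp(w) when neighboring labels agree and 1 otherwise; the weight w is an assumed hyperparameter, not a value from the slides.

```python
import numpy as np

def potts_pairwise(k, w=1.0):
    """Pairwise Potts potential phi(yi, yj) over k labels: value exp(w)
    when the two neighboring labels agree, 1 otherwise, so agreement
    is preferred whenever w > 0."""
    return np.where(np.eye(k, dtype=bool), np.exp(w), 1.0)

print(potts_pairwise(k=3, w=2.0))
```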

Page 8: MAP Examples - University at Buffalo


Ising Model in Statistical Physics

• Energy of interacting atoms
  – Determined from their spin
• An atom's spin is the sum of its electron spins
• Each atom is associated with a binary random variable
  – Xi ∈ {+1, −1} whose value is the direction of the atom's spin
• Energy function has the parametric form εij(xi, xj) = −wij xi xj
  – Symmetric in Xi, Xj: note the scope is pairwise
  – Makes a contribution of −wij to the energy when Xi = Xj (same spin), +wij otherwise
• Probability distribution over atoms (via the energy function):
  – wij > 0: model prefers aligned spins (ferromagnetism)
  – wij < 0: antiferromagnetic
  – wij = 0: non-interacting

P(ξ) = (1/Z) exp( −Σ_{i<j} wij xi xj − Σ_i ui xi ),   ξ ∈ Val(X)
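
The sketch below scores all spin configurations of a tiny fully connected model using the pairwise energies εij(xi, xj) = −wij xi xj and node terms ui xi defined above, under the convention P(ξ) ∝ exp(−E(ξ)); the weights are made-up values, and the brute-force normalization only works at this toy scale.

```python
import itertools
import numpy as np

# Toy Ising model on 3 atoms; weights are made-up values.
w = {(0, 1): 1.0, (0, 2): 1.0, (1, 2): 1.0}   # wij > 0: ferromagnetic
u = np.zeros(3)                                # no external field

def energy(x):
    """E(x) = sum_{i<j} eps_ij(x_i, x_j) + sum_i u_i x_i,
    with eps_ij(x_i, x_j) = -w_ij * x_i * x_j."""
    return sum(-wij * x[i] * x[j] for (i, j), wij in w.items()) + float(u @ x)

states = list(itertools.product([-1, +1], repeat=3))
unnorm = np.array([np.exp(-energy(np.array(s))) for s in states])
P = unnorm / unnorm.sum()                      # brute-force partition function Z
for s, p in zip(states, P):
    print(s, round(float(p), 3))
# The aligned configurations (-1,-1,-1) and (+1,+1,+1) receive the most mass,
# illustrating the ferromagnetic preference when wij > 0.
```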

Page 9: MAP Examples - University at Buffalo


Boltzmann Distribution

• Variant of the Ising model
• Variables Xi have values {0, 1} instead of {+1, −1}
  – Energy function has the same parametric form εij(xi, xj) = −wij xi xj
  – Nonzero contribution −wij from edge Xi–Xj only when Xi = Xj = 1
• The Ising model instead has contribution −wij when the variables are the same and +wij when they are different
• Mapping 0 to −1 gives the same form of energy function as the Ising model

P(ξ) = (1/Z) exp( −Σ_{i<j} wij xi xj − Σ_i ui xi )
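
A quick numeric check (with an arbitrary weight) that under the {0, 1} encoding the edge term −wij xi xj is nonzero only when both endpoints are 1:

```python
w = 2.0  # arbitrary edge weight
for xi in (0, 1):
    for xj in (0, 1):
        print(xi, xj, -w * xi * xj)   # contribution is -w only for xi = xj = 1
```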

Page 10: MAP Examples - University at Buffalo



CRF for pixel labeling

• A conditional probability distribution p(x|y)
• Neighboring pixel labels xi and xj are strongly correlated

Figure: noisy image, where xi = unknown pixel label and yi = known noisy pixel value.

Page 11: MAP Examples - University at Buffalo



Energy Functions

• Graph has two types of pairwise cliques
  – {xi, yi} expresses the strength of the link between a label and its pixel value
    • Energy function −η xi yi
  – {xi, xj} are neighboring pixel labels
    • Choose energy function −β xi xj

Page 12: MAP Examples - University at Buffalo



Potential Function

• Complete energy function of the model
  – The h xi term biases towards pixel values that have one particular sign
  – Defines a joint distribution over x and y, given below
  – We now fix the elements of y to the observed values given by the pixels
  – This implicitly defines a conditional distribution p(x|y) over the labeled image

E(x, y) = h Σ_i xi − β Σ_{i,j} xi xj − η Σ_i xi yi

p(x, y) = (1/Z) exp{ −E(x, y) }

(the sum over {i, j} ranges over neighboring pixel pairs)
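
A minimal sketch of this energy for binary denoising with xi, yi ∈ {−1, +1} on a 4-connected grid, plus a greedy coordinate-descent (ICM-style) minimizer of the resulting conditional p(x|y); the coefficients h, β, η and the toy image are assumptions for illustration.

```python
import numpy as np

def energy(x, y, h=0.0, beta=1.0, eta=2.0):
    """E(x, y) = h*sum_i x_i - beta*sum_{i,j} x_i x_j - eta*sum_i x_i y_i
    over a 2-D grid, with {i, j} the 4-connected neighbor pairs."""
    pair = np.sum(x[:, :-1] * x[:, 1:]) + np.sum(x[:-1, :] * x[1:, :])
    return h * x.sum() - beta * pair - eta * np.sum(x * y)

def icm_denoise(y, sweeps=5, **kw):
    """Greedy MAP: set each x_i to whichever value lowers E(x, y), i.e.
    coordinate descent on p(x|y) obtained by fixing y to the observed pixels."""
    x = y.copy()
    for _ in range(sweeps):
        for i in range(x.shape[0]):
            for j in range(x.shape[1]):
                for v in (-1, +1):
                    x_try = x.copy()
                    x_try[i, j] = v
                    if energy(x_try, y, **kw) < energy(x, y, **kw):
                        x = x_try
    return x

# Noisy observation of an all-+1 image (10% of pixels flipped to -1).
rng = np.random.default_rng(1)
y = np.where(rng.random((8, 8)) < 0.1, -1, +1)
print((icm_denoise(y) == 1).mean())   # most pixels recovered as +1
```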

Page 13: MAP Examples - University at Buffalo


Metric MRF for Labeling

• Task:
  – We are given a graph with nodes X1,…,Xn and edges E
  – Task is to assign to each Xi a label in V = {v1,…,vk}
    • E.g., labeling super-pixels as v1 = cow, v2 = grass
• Each node, in isolation, has a preferred label (see the sketch below)
  – E.g., when pixel xi has color = brown and the label is cow, ϕi(xi) is high
  – When pixel xi has color = green and the label is cow, ϕi(xi) is low
  – However, we want a smoothness constraint over neighbors
    • Neighboring nodes should have “similar” values
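
A hedged sketch of such per-node preferences: a hypothetical table of node potentials ϕi indexed by a super-pixel's dominant color (the numbers are illustrative, not from the slides).

```python
# Hypothetical node potentials phi_i, indexed by the super-pixel's dominant
# color; a higher value means that label is preferred for the node in isolation.
node_potential = {
    "brown": {"cow": 0.9, "grass": 0.1},   # brown super-pixel: cow preferred
    "green": {"cow": 0.1, "grass": 0.9},   # green super-pixel: grass preferred
}
print(node_potential["brown"]["cow"])      # high
print(node_potential["green"]["cow"])      # low
```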

Page 14: MAP Examples - University at Buffalo


Solution for Labeling

• Solution: in a pairwise MRF
  – Encode node preferences as node potentials
    • ϕi takes a higher value when xi is brown and the label is cow
  – Encode smoothness preferences as edge potentials
  – Model in negative log-space, using energy functions
• Specify the energy function (given below)
  – For the MAP objective, we can ignore the partition function
• Goal: minimize the energy (MAP objective)
  – How to define smoothness? Next.

E(x1,…,xn) = Σ_i εi(xi) + Σ_{(i,j)∈E} εij(xi, xj)

argmin_{x1,…,xn} E(x1,…,xn)
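
A minimal brute-force version of this MAP objective on a toy chain graph; the node and edge energies are random stand-ins, and exhaustive argmin is only feasible at this scale (practical solvers use dynamic programming, graph cuts, or LP relaxations).

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, k = 4, 3                                  # 4 nodes, 3 labels
edges = [(0, 1), (1, 2), (2, 3)]
eps_node = rng.normal(size=(n, k))           # stand-ins for eps_i(x_i)
eps_edge = {e: rng.normal(size=(k, k)) for e in edges}  # eps_ij(x_i, x_j)

def E(x):
    """E(x_1,...,x_n) = sum_i eps_i(x_i) + sum_{(i,j) in E} eps_ij(x_i, x_j)."""
    return (sum(eps_node[i, x[i]] for i in range(n))
            + sum(eps_edge[i, j][x[i], x[j]] for i, j in edges))

# MAP objective: argmin of the energy over all k^n joint assignments.
best = min(itertools.product(range(k), repeat=n), key=E)
print(best, E(best))
```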

Page 15: MAP Examples - University at Buffalo


Smoothness for Metric MRF


Page 16: MAP Examples - University at Buffalo


Generalizations of Smoothness for Metric MRF

1. Potts model (when there are more than two labels)
2. Distance function on labels
   – Prefer neighboring nodes to have labels a smaller distance apart
   – Metric MRF
     • Need a metric µ(vk, vl) on labels (a common choice is sketched below)
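
One common choice of label metric (an assumption here, not stated on the slide) is the truncated linear distance, which penalizes small label disagreements linearly but caps the penalty for large ones; the Potts model is the 0–1 special case.

```python
def truncated_linear(vk, vl, c=2):
    """mu(vk, vl) = min(|vk - vl|, c): a metric on ordered labels that
    grows linearly with the disagreement but is capped at c."""
    return min(abs(vk - vl), c)

def potts(vk, vl):
    """Potts special case: mu(vk, vl) = 0 if the labels agree, 1 otherwise."""
    return 0 if vk == vl else 1

print(truncated_linear(0, 3), potts(0, 3))   # -> 2 1
```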