Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical...

36
 Akshat Kumar & Shlomo Zilberstein.  MAP Estimation for graphical Models by Likelihood Maximization. 

Transcript of Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical...

Page 1: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical 

Models by Likelihood Maximization. 

Page 2: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

MAP problem for MRF

X3X1

X2

Task: find most probable assignmentsto all variables = MPE

Page 3: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Paper in a nutshell

● We want to solve MAP problem for Markov Random Fields

● Can formulate as Linear Programming problem

● Associate a marginal polytop with the MAP problem● Exponentially large constraints – difficult to solve

● Well­known LP relaxations to find the upper bound exist

● Outer bound on the marginal polytop● Authors propose relaxation to get a lower­bound

● Inner bound on the marginal polytop● Not convex – formulate as a mixture of BN, solve with EM

●  → Get fast and pretty accurate approximation method

Page 4: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

MAP problem for MRF as LP

X3X1

X2

Introduce µ ­ a vector of marginal probabilities for each node and edge of MRFcoming from some joint probability p

Page 5: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

MAP problem for MRF as LP

X3X1

X2

MAP problem is equivalent to solving:

Page 6: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

MAP problem for MRF as LP

X3X1

X2

 µ should satisfy constraints

MAP problem is equivalent to solving:

Page 7: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Page 8: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

MAP problem for MRF as LP relaxation (upper bound)

● We don't assume anymore that µ comes from some distribution p

● Only pairwise and singleton consistently is required

Marginal polytopeis expressed by:

Page 9: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Page 10: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Page 11: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Page 12: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Page 13: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Page 14: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

MAP as mixture of Bayes Nets

X3X1

X2

X1

X2

X1

X2

X3

X2

X1

X2

X1

X3

l=1 l=3l=2

l=1

l=2

l=3

We want to estimate marginal probabilities:

Page 15: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

MAP as mixture of Bayes Nets

X3X1

X2

X1

X2

X1

X2

X3

X2

X1

X2

X1

X3

l=1 l=3l=2

l=1

l=2

l=3

Example: for l=1

Full joint for a particular Bayes Net indicated by l:

Page 16: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Some notation:

X3X1

X2

X1

X2

X1

X2

X3

X2

X1

X2

X1

X3

l=1 l=3l=2

l=1

l=2

l=3

Page 17: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Some notation:

X3X1

X2

X1

X2

X1

X2

X3

X2

X1

X2

X1

X3

l=1 l=3l=2

l=1

l=2

l=3

Page 18: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Some notation:

X3X1

X2

X1

X2

X1

X2

X3

X2

X1

X2

X1

X3

l=1 l=3l=2

l=1

l=2

l=3

Page 19: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Some notation:

X3X1

X2

X1

X2

X1

X2

X3

X2

X1

X2

X1

X3

l=1 l=3l=2

l=1

l=2

l=3

Page 20: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

X3X1

X2

X1

X2

X1

X2

X3

X2

X1

X2

X1

X3

l=1 l=3l=2

Page 21: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Page 22: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

That's the likelihood of BN mixture from previous slide:

And that's our optimization criterion for lower bound:

Recalling that 

These are the same things

So, can maximize Lp and get MAP lower bound!

Let's use EM for that

Page 23: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Page 24: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Graphical illustration for auxiliary function

Page 25: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Page 26: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Update rulesExpected complete log­likelihood:

p – old parametersp* ­ new parameters

Page 27: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Update rulesExpected complete log­likelihood:

p – old parametersp* ­ new parameters

M­step

Page 28: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Page 29: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Page 30: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Algorithm

Complexity of computing a single message is O(d2), were d ­ domain size

Independent of degree ???

Page 31: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Page 32: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Experiment: solution quality

Page 33: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Page 34: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Page 35: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Experiments: compare with max­product

Page 36: Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical …dechter/courses/ics-295/winter... · 2011. 1. 30. · Akshat Kumar & Shlomo Zilberstein. MAP Estimation for graphical

   

Why is this method good?

● Average solution quality within 95% of optimal (for protein design)

● Increases lower bound on MAP rapidly● Embarassingly parallel 

● (assuming shared memory? o.w. too much communication)

● Doesn't work:

● Өmax

­ Өmin is large