Belief Propagation and Loop Series on Planar Graphs

Volodya Chernyak, Misha Chertkov and Razvan Teodorescu

Department of Chemistry, Wayne State University
Theory Division, Los Alamos National Laboratory

Acknowledgements

John Klein (Wayne State)

Outline

• Vertex model formulation for statistical inference problem

• Traces and graphical traces

• Gauge-invariant formulation of loop calculus

• From single-connected partition to dimer matching problem

• Pfaffian expression for the single partition

• Tractable models

• From general binary model to the dimer-monomer matching problem and Pfaffian series on planar graphs

• Fermion (Grassmann), continuous and supersymmetric cases

Statistical inference

Vertex model formulation

q-ary variables reside on edges

Probability of a configuration

Partition function

Reduced variables

Marginal probabilities can be expressed in terms of the derivatives of the free energy with respect to factor-functions

Forney ’01; Loeliger ’01; Baxter
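The slide's own formulas did not survive in the transcript. As a hedged reminder, the standard vertex-model definitions can be written as below; the symbols \sigma_{ab}, f_a, Z and F are assumed notation, not copied from the slide.

```latex
% Edge variables: \sigma_{ab} = \sigma_{ba} \in \{0, \dots, q-1\};
% \sigma_a collects the variables on all edges incident to vertex a.
p(\sigma) = Z^{-1} \prod_{a} f_a(\sigma_a), \qquad
Z = \sum_{\sigma} \prod_{a} f_a(\sigma_a), \qquad
\sigma_a = (\sigma_{ab} \,:\, b \sim a)

% Marginals as derivatives of the free energy F = -\ln Z:
p_a(\sigma_a) = \sum_{\sigma \setminus \sigma_a} p(\sigma)
             = -\frac{\partial F}{\partial \ln f_a(\sigma_a)}
```

The last identity follows because Z is linear in each value f_a(\sigma_a).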

Loop calculus (binary alphabet)

Belief Propagation (BP) is exact on a tree
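The exactness of BP on a tree can be checked numerically. The sketch below is illustrative only: the four-vertex tree, the random factor-functions and the helper names (`brute_force_Z`, `message`) are all assumptions. It passes messages along the edges of a vertex model and compares the result with a brute-force sum.

```python
import itertools
import numpy as np

q = 2  # binary alphabet
# Hypothetical tree: vertex 1 is the hub, vertices 0, 2, 3 are leaves.
# Axis k of f[a] is the edge variable shared with neighbors[a][k].
neighbors = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}
rng = np.random.default_rng(0)
f = {a: rng.uniform(0.5, 1.5, size=(q,) * len(nb)) for a, nb in neighbors.items()}
edges = [(a, b) for a in neighbors for b in neighbors[a] if a < b]

def brute_force_Z():
    """Sum the product of factor-functions over all edge configurations."""
    total = 0.0
    for values in itertools.product(range(q), repeat=len(edges)):
        sigma = dict(zip(edges, values))
        weight = 1.0
        for a, nb in neighbors.items():
            idx = tuple(sigma[(min(a, b), max(a, b))] for b in nb)
            weight *= f[a][idx]
        total += weight
    return total

def message(a, b):
    """Message from vertex a to neighbor b: a vector indexed by sigma_ab."""
    t = f[a]
    # Contract every axis except the one toward b with the incoming message.
    # Axes are visited in reverse so that lower axis positions stay valid.
    for k in reversed(range(len(neighbors[a]))):
        c = neighbors[a][k]
        if c != b:
            t = np.tensordot(t, message(c, a), axes=([k], [0]))
    return t

Z_brute = brute_force_Z()
# On a tree, Z is the scalar product of the two messages on any edge.
Z_bp = float(message(0, 1) @ message(1, 0))
print(Z_brute, Z_bp)
```

The recursion terminates only because the graph is a tree; on a graph with loops the same messages would have to be found iteratively and would give an approximation.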

Equivalent models: gauge fixing and transformations

Replace the model with an equivalent, more convenient model

Invariant approach

(i) Introduce an invariant object that describes the partition function Z
(ii) Different equivalent models correspond to different coordinate choices (gauge fixing)
(iii) Gauge transformations change the basis sets

Coordinate approach

(i) Introduce a set of gauge transformations that do not change Z
(ii) Gauge transformations build new equivalent models

General strategy (based on linear algebra)

(i) Replace the q-ary alphabet with a q-dimensional vector space (letters are basis vectors)
(ii) Represent Z by an invariant object, the graphical trace
(iii) Gauge fixing is a choice of basis set
(iv) Gauge transformations are linear transformations of the basis sets

Gauge invariance: matrix formulation

Gauge transformations of factor-functions

with orthogonality conditions

do not change the partition function
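A numerical check of this statement can be sketched as follows. Everything specific here is an assumption made for illustration: the graph (the complete graph on four vertices, which is valence-three), the random factor-functions, and the explicit form of the orthogonality condition, taken as G_ab^T G_ba = 1 on every edge.

```python
import numpy as np

q = 2
rng = np.random.default_rng(1)
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
# Axis order of each factor: its incident edges, in the order they appear in `edges`.
incident = {a: [e for e in edges if a in e] for a in range(4)}
f = {a: rng.uniform(0.5, 1.5, size=(q, q, q)) for a in range(4)}

def partition_function(factors):
    """Contract all factor-functions over the shared edge variables."""
    letters = {e: "ijklmn"[k] for k, e in enumerate(edges)}
    spec = ",".join("".join(letters[e] for e in incident[a]) for a in range(4))
    return float(np.einsum(spec + "->", *(factors[a] for a in range(4))))

def random_invertible(rng, q):
    """M = U diag(s) V^T with singular values in [0.5, 2], so M is safely invertible."""
    U, _ = np.linalg.qr(rng.normal(size=(q, q)))
    V, _ = np.linalg.qr(rng.normal(size=(q, q)))
    return U @ np.diag(rng.uniform(0.5, 2.0, size=q)) @ V.T

# One gauge matrix per directed edge, tied by G_ab^T G_ba = identity.
G = {}
for a, b in edges:
    M = random_invertible(rng, q)
    G[(a, b)] = M
    G[(b, a)] = np.linalg.inv(M).T

def gauge_transform(factors):
    """Apply G_ab to the axis of f_a that corresponds to edge (a, b)."""
    new = {}
    for a in range(4):
        t = factors[a]
        for axis, e in enumerate(incident[a]):
            b = e[0] if e[1] == a else e[1]
            # t'[.., s, ..] = sum over s' of G_ab[s, s'] * t[.., s', ..]
            t = np.moveaxis(np.tensordot(G[(a, b)], t, axes=([1], [axis])), 0, axis)
        new[a] = t
    return new

f_gauged = gauge_transform(f)
Z_original = partition_function(f)
Z_gauged = partition_function(f_gauged)
print(Z_original, Z_gauged)
```

The individual factor-functions change under the transformation, while the contraction over each edge cancels the pair of gauge matrices and leaves Z untouched.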

BP equations: matrix (coordinate) formulation

No-loose-ends condition

results in BP equations

[formulas and vertex-diagram labels not recoverable from the transcript]

BP equations: standard form

A standard form of BP equations

is reproduced using the following representation for the ground state

Side remark: relation to iterative BP

Graphical representation of trace and cyclic trace

Trace

$\mathrm{Tr}(f) = f^{j}_{\;j} = f_{ij}\, g^{ji}$

Cyclic trace

$\mathrm{Tr}(f^{(1)} f^{(2)} \cdots f^{(n)}) = f^{(1)}_{i_1 j_1}\, g^{j_1 i_2}\, f^{(2)}_{i_2 j_2}\, g^{j_2 i_3} \cdots f^{(n)}_{i_n j_n}\, g^{j_n i_1}$

Summation over repeating subscripts/superscripts

$g^{ij}$: scalar product

Graphic trace and partition function

Collection of tensors (poly-vectors), one per vertex, over the vector spaces $W_{ab}$ and $W_{ba}$ attached to the two ends of each edge

Scalar products $g_{ab}(u \otimes w)$, with $u \in W_{ab}$ and $w \in W_{ba}$

Graphic trace

Orthogonality condition

Tensors and factor-functions

$Z = \mathrm{Tr}(f)$

[remaining formulas and diagram labels not recoverable from the transcript]

Partition function and graphic trace: gauge invariance

Dual basis set of co-vectors (elements of the dual space)

Orthogonality condition (two equivalent forms)

[formulas and diagram labels not recoverable from the transcript]

Graphic trace: evaluate scalar products (which reside on edges) on tensors (which reside on vertices)

Gauge invariance: the graphic trace is an invariant object; factor-functions are basis-set dependent

“Gauge fixing” is a choice of an orthogonal basis set

Belief propagation gauge and BP equations

Introduce local ground and excited (painted) states

BP gauge: painted structures with loose ends should be forbidden (in particular, no painted structures are allowed in the tree case)

or, stated differently, results in BP equations in invariant form:

[formulas and diagram labels not recoverable from the transcript]

Loop decomposition: binary case

A generalized loop visualizes a single-configuration contribution to the partition function in BP gauge

Beliefs (marginal probabilities)

Homogeneous valence-three graphs

From arbitrary- to valence-three vertices

Transformation for Tanner graphs

Planar valence-three graphs dual to a triangulation

Single-connected partition: dimer model

Transformation to the extended graph (Fisher’s trick)

Single partition equivalent to dimer-matching problem on the extended graph

“Single” partition function (regular loops)

Pfaffian expression for dimer model

Kasteleyn representation as a Pfaffian (from symmetric to skew-symmetric matrix)
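A small worked instance may help here. The sketch below assumes a 2x3 grid graph with unit weights and a hand-picked Kasteleyn orientation (every inner face has an odd number of clockwise edges); the helper names are hypothetical. It compares the Pfaffian of the resulting skew-symmetric matrix with a brute-force count of dimer coverings.

```python
import itertools

def pfaffian(A):
    """Pfaffian of a skew-symmetric matrix (list of lists), expanded along the first row."""
    n = len(A)
    if n == 0:
        return 1
    if n % 2 == 1:
        return 0
    total = 0
    for j in range(1, n):
        if A[0][j] == 0:
            continue
        keep = [k for k in range(n) if k not in (0, j)]
        minor = [[A[r][c] for c in keep] for r in keep]
        total += (-1) ** (j - 1) * A[0][j] * pfaffian(minor)
    return total

def count_perfect_matchings(n_vertices, edges):
    """Brute force: count edge subsets that cover every vertex exactly once."""
    count = 0
    for subset in itertools.combinations(edges, n_vertices // 2):
        covered = {v for e in subset for v in e}
        if len(covered) == n_vertices:
            count += 1
    return count

# Grid layout:   0 - 1 - 2
#                |   |   |
#                3 - 4 - 5
# Orientation u -> v: horizontal edges point right, vertical edges alternate by column.
oriented = [(0, 1), (1, 2), (3, 4), (4, 5), (0, 3), (4, 1), (2, 5)]

K = [[0] * 6 for _ in range(6)]
for u, v in oriented:
    K[u][v] = 1    # skew-symmetric: +1 along the orientation,
    K[v][u] = -1   # -1 against it

pf = pfaffian(K)
n_matchings = count_perfect_matchings(6, oriented)
print(abs(pf), n_matchings)
```

The recursive expansion is exponential and only meant for tiny examples; the practical point of the Pfaffian form is that it can be evaluated in polynomial time through a determinant, since Pf(K)^2 = det(K).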

Tractable problems: reduction to single-connected

BP equations: loose ends (valence-one vertices) forbidden

Additional equations: valence-three vertices forbidden

The model is equivalent to the dimer-matching problem

Pfaffian series for monomer-dimer model

Expansion in dimers (triple colored vertices)

Each term is computationally tractable

Fermion representation and models

Grassmann variables

Berezin integral

Berezin-integral representation of a Pfaffian
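The slide's formulas are missing from the transcript. One common way to write these objects is sketched below; the ordering of the differentials is a convention chosen here, and other orderings change the result by a sign.

```latex
% Grassmann variables anticommute, and the Berezin integral is defined by two rules:
\theta_i \theta_j = -\theta_j \theta_i, \qquad
\int d\theta \; 1 = 0, \qquad \int d\theta \; \theta = 1

% Pfaffian of a skew-symmetric 2n x 2n matrix A (A^T = -A):
\mathrm{Pf}(A) = \int d\theta_{2n} \cdots d\theta_{1} \,
  \exp\!\Big( \tfrac{1}{2} \sum_{i,j=1}^{2n} \theta_i A_{ij} \theta_j \Big)

% Check for n = 1: the exponent is a\,\theta_1\theta_2 with a = A_{12}, so
% \int d\theta_2 \, d\theta_1 \, (1 + a\,\theta_1 \theta_2) = a = \mathrm{Pf}(A).
```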

Continuous and supersymmetric case: graphical sigma-models

Scalar product: the space of states and its dual are equivalent

No-loose-end requirement

Continuous version of BP equations

[formulas not recoverable from the transcript]

Supersymmetric sigma-models: supermanifolds

dimension

substrate (usual) manifold

additional Grassmann (anticommuting) variables

Functions on a supermanifold

Berezin integral (measure in a supermanifold)

Any function on a supermanifold can be represented as a sum of its even and odd components

Supersymmetric sigma models: graphic supertrace I

Natural assumption: factor-functions are even functions on [symbol lost in the transcript]

Introduce parities of the beliefs: $p_{ab} \in \{0, 1\}$

BP equations for parities

Edge parity is well-defined: $p_{ab} = p_{ba}$ (follows from the first two)

Euler characteristic: $B_0 - B_1 = \mathrm{card}(V) - \mathrm{card}(E)$, where $B_0$ is the number of connected components and card counts elements

Supersymmetric sigma models: graphic supertrace II

Decompose the vector spaces into reduced vector spaces

Graphic supertrace decomposition (generalizes the supertrace) results in a multi-reference loop expansion

Each term is the graphic trace (partition function) of a reduced model

Summary

• We have formulated the statistical inference problem in terms of a graphical trace, which leads to the invariance of the partition function under a set of gauge transformations.

• BP equations have been interpreted as a special choice of gauge.

• The generalized loop (net) expansion appears in a natural way in the BP gauge.

• For a planar graph we have performed the summation over single-connected loops.

• A class of tractable models has been identified.

• A Pfaffian series for the general binary vertex model on a planar graph has been formulated.

Bibliography

• M. Chertkov, V.Y. Chernyak, R. Teodorescu, Belief Propagation and Loop Series on Planar Graphs, arXiv:0802.3950v1 [cond-mat.stat-mech] (2008)

• V.Y. Chernyak, M. Chertkov, Loop Calculus and Belief Propagation for q-ary Alphabet: Loop Tower, Proceedings of ISIT 2007, June 2007, Nice; cs.IT/0701086

• M. Chertkov, V.Y. Chernyak, Loop Calculus Helps to Improve Belief Propagation and Linear Programming Decodings of Low-Density-Parity-Check Codes, 44th Allerton Conference (September 27-29, 2006, Allerton, IL); arXiv:cs.IT/0609154

• M. Chertkov, V.Y. Chernyak, Loop Calculus in Statistical Physics and Information Science, Phys. Rev. E 73, 065102(R) (2006); cond-mat/0601487

• M. Chertkov, V.Y. Chernyak, Loop series for discrete statistical models on graphs, J. Stat. Mech. (2006) P06009; cond-mat/0603189

Path forward: interplay of topological and geometrical equivalence

Topological structure: the graph

Geometrical structure: factor-functions

Use topologically equivalent models

Use geometrically equivalent models

Combine

e.g. Weitz ’06

• improving BP
• quantum version
• etc.

Homotopy approach to loop decomposition

Graph (arbitrary)

Bouquet of circles

Equivalent (same homotopy type) by contracting the tree to a point

$B_1 = 3$ “circles”

Both models are equivalent

Loop calculus for the bouquet model (independent loops) constitutes a resummation for the original model (generalized loops)
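The number of circles in the bouquet equals the number of independent loops of the graph, its first Betti number, which follows from the Euler characteristic B_0 - B_1 = card(V) - card(E). A minimal sketch of that count is given below; the function name and the example graphs are assumptions chosen for illustration.

```python
def betti_numbers(n_vertices, edges):
    """Return (B0, B1): connected components and independent loops of a graph."""
    parent = list(range(n_vertices))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv

    b0 = len({find(v) for v in range(n_vertices)})
    # Euler characteristic: B0 - B1 = card(V) - card(E)
    b1 = len(edges) - n_vertices + b0
    return b0, b1

complete_graph_4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
star_tree = [(0, 1), (1, 2), (1, 3)]
print(betti_numbers(4, complete_graph_4))  # contracts to a bouquet of 3 circles
print(betti_numbers(4, star_tree))         # a tree contracts to a point: no circles
```

Contracting a spanning tree removes card(V) - 1 edges from a connected graph, so the edges left over are exactly the B_1 circles of the bouquet.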

Loop towers for q-ary alphabet: first step

A generalized loop defines a vertex model on the corresponding subgraph with a (q-1)-ary alphabet (first story above the ground story)

Partition function for the subgraph model

q>2 (non-binary case): more than one local excited state

Loop-tower expansion for q-ary alphabet

Building the next level (story)

Loop tower

“Reduced Bethe free energy” (variational approach)

Reduced Bethe free energy

$F_0 = -\ln Z_0$

is an attempt to approximate the partition function Z in terms of the ground-state contribution $Z_0$ in a proper gauge; both depend on the choice of local ground states

BP equations are recovered by the stationary-point conditions of $F_0$, subject to a constraint [formulas not recoverable from the transcript]

Not a standard variational scheme: corrections can be of either sign

What is the relation of the introduced functional to the Bethe free energy (Yedidia, Freeman, Weiss ’01)?

Bethe free energy for q-ary alphabet

BP equations can be obtained as stationary points of the Bethe free energy functional of beliefs

with natural constraints
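The functional itself is missing from the transcript. For a vertex model, where every edge variable enters exactly two factor-functions, the Bethe free energy is usually written as sketched below; the belief symbols b_a and b_{ab} are assumed notation.

```latex
F_{\mathrm{Bethe}}(b) =
  \sum_{a} \sum_{\sigma_a} b_a(\sigma_a) \ln \frac{b_a(\sigma_a)}{f_a(\sigma_a)}
  \;-\; \sum_{(a,b)} \sum_{\sigma_{ab}} b_{ab}(\sigma_{ab}) \ln b_{ab}(\sigma_{ab})

% Natural constraints: normalization and consistency of vertex and edge beliefs
\sum_{\sigma_a} b_a(\sigma_a) = 1, \qquad
b_{ab}(\sigma_{ab}) = \sum_{\sigma_a \setminus \sigma_{ab}} b_a(\sigma_a)
                    = \sum_{\sigma_b \setminus \sigma_{ab}} b_b(\sigma_b)
```

On a tree, the stationary point reproduces the exact marginals and F_Bethe equals -ln Z there.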

Bethe effective Lagrangian

Variation of beliefs

Values of beliefs

Relation to Bethe free energy

Variation of the ground state

Variation of beliefs

Gauge fixing

Reduced Bethe free energy