Phase transitions in NP-complete problems: a challenge for probability, combinatorics, and computer science (tuvalu.santafe.edu/~moore/AMS.pdf)


Phase transitions in NP-complete problems: a challenge for probability, combinatorics, and computer science

Cristopher Moore, University of New Mexico and the Santa Fe Institute

Tuesday, October 12, 2010


Satisfiability

• n Boolean variables, x1, x2, ... xn

• m constraints, or “clauses”: e.g.

• Can we satisfy the entire formula, i.e., all the clauses at once?

• There are 2^n possible truth assignments—exhaustive search takes exponential time.

• Can we do better? Is there an algorithm that works in poly(n) time, i.e., O(nc) time for some constant c?

(x1 ∨ x4 ∨ x5) ∧ (x4 ∨ x7 ∨ x19) ∧ ···
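The exhaustive search over all 2^n assignments described above is easy to sketch. A minimal Python version (the signed-integer clause encoding is our own convention, not from the talk):

```python
from itertools import product

# Exhaustive search over all 2^n truth assignments.
# A literal +i means "x_i is true", -i means "x_i is false" (our own encoding).
def satisfiable(n, clauses):
    for bits in product([False, True], repeat=n):
        # A formula is satisfied when every clause has at least one true literal.
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

# (x1 v x2 v x3) ^ (~x1 v ~x2 v ~x3) is satisfiable, e.g. by x = (T, F, F)
print(satisfiable(3, [(1, 2, 3), (-1, -2, -3)]))   # True
print(satisfiable(1, [(1,), (-1,)]))               # False
```

The loop makes the exponential cost explicit: the outer `product` has 2^n iterations, which is exactly what a poly(n) algorithm would have to avoid.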


P vs. NP

• P: we can find the solution (or tell if there is one) in poly(n) time

• NP: we can check a solution in poly(n) time, e.g. 3-SAT

• If P=NP, we can find tours, tilings, proofs...

• Surely finding is harder than checking! Can we prove P≠NP?


• Gödel to von Neumann: let φ(n) be the time it takes the best possible algorithm to tell whether there exists a proof of length n or less for a given mathematical statement.

• In the modern age, proving that P≠NP is one of the outstanding open questions in mathematics.

P vs. NP

DRAFT -- DRAFT -- DRAFT -- DRAFT -- DRAFT --

WHAT IF P=NP?

6.1.3 The Demise of Creativity

We turn now to the most profound consequences of P=NP: that reasoning and the search for knowledge, with all its frustration and elation, would be much easier than we think.

Finding proofs of mathematical conjectures, or telling whether a proof exists, seems to be a very hard problem. On the other hand, checking a proof is not, as long as it is written in a careful formal language. Indeed, we invent formal systems like Euclidean geometry, or Zermelo–Fraenkel set theory, precisely to reduce proofs to a series of simple steps. Each step consists of applying a simple axiom, such as Modus Ponens, “If A and (A → B), then B.” This makes it easy to check that each line of the proof follows from the previous lines, and we can check the entire proof in polynomial time as a function of its length. Thus the following problem is in P:

PROOF CHECKING

Input: A statement S and a proof P

Question: Is P a valid proof of S?

This implies that the following problem is in NP:

SHORT PROOF

Input: A statement S and an integer n given in unary

Question: Does S have a proof of length n or less?

Note that we give the length of the proof in unary, since the running time of PROOF CHECKING is polynomial as a function of n.

Now suppose that P=NP. You can take your favorite unsolved mathematical problem—the Riemann Hypothesis, the Goldbach Conjecture, you name it—and use your polynomial-time algorithm for SHORT PROOF to search for proofs of less than, say, a billion lines. For that matter, if P=NP you can quickly search for solutions to most of the other Millennium Problems as well. The point is that no proof constructed by a human will be longer than a billion lines anyway, even when we go through the tedious process of writing it out axiomatically. So, if no such proof exists, we have no hope of finding one.

This point was raised long before the classes P and NP were defined in their modern forms. In 1956, the logician Kurt Gödel wrote a letter to John von Neumann. Turing had shown that the Entscheidungsproblem, the problem of whether a proof exists at all for a given mathematical statement, is undecidable—that is, as we will discuss in Chapter 7, it cannot be solved in finite time by any program. In his letter Gödel considered the bounded version of this problem, where we ask whether there is a proof of length n or less. He defined φ(n) as the time it takes the best possible algorithm to decide this, and wrote:

The question is, how fast does φ(n) grow for an optimal machine. One can show that φ(n) ≥ K·n. If there actually were a machine with φ(n) ∼ K·n (or even only φ(n) ∼ K·n²), this would have consequences of the greatest magnitude. That is to say, it would clearly indicate that, despite the unsolvability of the Entscheidungsproblem, the mental effort of the mathematician in the case of yes-or-no questions could be completely replaced by machines (footnote: apart from the postulation of axioms). One would simply have to select an n large enough


that, if the machine yields no result, there would then be no reason to think further about the problem.

Gödel goes on to point out that we can find proofs in polynomial time—in our terms, that P = NP—precisely if we can do much better than exhaustive search (dem blossen Probieren, meaning “simply trying”) among N alternatives, where N is exponentially large in n:

φ ∼ K·n (or ∼ K·n²) means, simply, that the number of steps vis-à-vis exhaustive search can be reduced from N to log N (or (log N)²).

Since there are an exponential number of possible proofs of length n, even excluding those which are obviously nonsense, this problem seems very hard. As mathematicians, we like to believe that we need to use all the tools at our disposal—drawing analogies with previous problems, visualizing and doodling, designing examples and counterexamples, and making wild guesses—to find our way through this search space to the right proof. But if P = NP, finding proofs is not much harder than checking them, and there is a polynomial-time algorithm that makes all this hard work unnecessary. As Gödel says, in that case we can be replaced by a simple machine.

Nor would the consequences of P = NP be limited to mathematics. Scientists in myriad fields spend their lives struggling to solve the following problem:

ELEGANT THEORY

Input: A set E of experimental data and an integer n given in unary

Question: Is there a theory T of length n or less that explains E ?

For instance, E could consist of a set of astronomical observations, T could consist of a mathematical model for planetary motion, and T could explain E to a given accuracy. An elegant theory is one whose length n, defined as the number of symbols it takes to express it in some mathematical language, is fairly small—such as Kepler’s laws or Newton’s law of gravity, along with the planets’ masses and initial positions.

Let’s assume that we can compute, in polynomial time, what predictions a theory T makes about the data. Of course, this disqualifies theories such as “because God felt like it,” and even for string theory this computation seems very difficult. Then again, if we can’t tell what predictions a theory makes, we can’t carry out the scientific method anyway.

With this assumption, and with a suitable formalization of this kind, ELEGANT THEORY is in NP. Therefore, if P = NP, the process of searching for patterns, postulating underlying mechanisms, and forming hypotheses can simply be automated. No longer do we need a Kepler to perceive elliptical orbits, or a Newton to guess the inverse-square law. Finding these theories becomes just as easy as checking that they fit the data.

We could go on, and point out that virtually any problem in design, engineering, pattern recognition, learning, or artificial intelligence could be solved in polynomial time if P = NP. But we hope our point is clear. At its root, our belief that P ≠ NP is about the nature of intelligence itself. We believe that the problems we face demand all our faculties as human beings—all the painstaking work, flashes of insight, and leaps of intuition that we can bring to bear. We believe that understanding matters—that each problem brings with it new challenges, and that meeting these challenges brings us to a deeper understanding


NP-completeness

• A problem A is NP-complete if any problem in NP can be translated (or “reduced”) to A in poly(n) time

• Problems are NP-complete because we can build computers out of them.

• Imagine a circuit that checks a proposed solution:

• We can express the claim that this circuit works, and that a solution exists, as a 3-SAT formula:

• If any NP-complete problem is in P, then P=NP.

(¬x1 ∨ y1) ∧ (¬x2 ∨ y1) ∧ (x1 ∨ x2 ∨ ¬y1) ∧ ··· ∧ z

[Figure: a Boolean circuit with AND, OR, and NOT gates, inputs x1, x2, internal wires y1, y2, y3, and output z]


NP-completeness

• If A is NP-complete, and we can translate A to B, then B is NP-complete too. Since 1972, the tree has grown...

Cosine Integral

Witness Existence

Planar SAT

3-SAT

Tiling

CA Predecessor

Subset Sum

Integer Partitioning

Exact Spanning Tree

Quadratic Diophantine Equation

Circuit SAT

NAE-3-SAT

Graph 3-Coloring

Vertex Cover

Independent Set

Max Clique

Max Cut

Max 2-SAT

Hamiltonian Path

3-Matching


Random formulas

• Instead of having an adversary choose the formula, generate it randomly.

• choose m clauses uniformly and independently: each clause picks three distinct variables xi, xj, xk, and negates each one with probability 1/2

• Denote these formulas F3(n,m), analogous to random graphs G(n,m).

• Sparse case: m=αn for some constant α

• Intuitively, the larger α is, the harder it is to satisfy all the clauses...
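This intuition is easy to test empirically: sample F3(n, m) and check satisfiability by brute force. A sketch (our own; the seed, n, and sample sizes are arbitrary illustrative choices):

```python
import random
from itertools import product

def random_formula(n, m, rng):
    """F3(n, m): each clause picks 3 distinct variables, negating each w.p. 1/2."""
    return [tuple(v if rng.random() < 0.5 else -v
                  for v in rng.sample(range(1, n + 1), 3))
            for _ in range(m)]

def satisfiable(n, clauses):
    # brute force over all 2^n assignments
    return any(all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses)
               for bits in product([False, True], repeat=n))

rng = random.Random(1)
n, samples = 10, 30
for alpha in (2.0, 6.0):
    sat = sum(satisfiable(n, random_formula(n, int(alpha * n), rng))
              for _ in range(samples))
    print(f"alpha = {alpha}: {sat}/{samples} satisfiable")
```

At small n the transition is smeared out, but the satisfiable fraction is already visibly much higher at α = 2 than at α = 6, matching the phase-transition plot on the next slide.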


A phase transition

[Figure: probability P that F3(n, m = αn) is satisfiable, plotted against α from 3 to 7, for n = 10, 15, 20, 30, 50, 100; the transition sharpens as n grows]


Search times

[Figure: number of DPLL calls (log scale, 10² to 10⁶) plotted against α from 1 to 8; search times peak near the transition]


A threshold conjecture

• k-SAT formulas have k variables per clause

• Define random formulas Fk(n,m) similarly.

• Conjecture: for each k ≥ 2, there is a constant αc such that

• Proving this seems quite hard, but we know something almost as good...

lim_{n→∞} Pr[Fk(n, m = αn) is satisfiable] = 1 if α < αc, 0 if α > αc


A threshold theorem, almost

• [Friedgut] For each k ≥ 2, there is a function αc(n) such that for any ε>0,

• Except for k=2, we don’t know that αc(n) converges to a constant.

• But we do have many theorems placing upper and lower bounds on αc, assuming that it exists, i.e., showing that Fk(n, m = αn) is w.h.p. satisfiable for α < α*, or w.h.p. unsatisfiable for α > α*.

lim_{n→∞} Pr[Fk(n, m = αn) is satisfiable] = 1 if α < (1−ε)αc(n), 0 if α > (1+ε)αc(n)


An easy upper bound

• Let Z be the number of satisfying assignments of F3(n,m)

• We can prove that random formulas are w.h.p. unsatisfiable using

• For each assignment σ, let 1_σ be 1 if σ satisfies the formula and 0 otherwise

• Since the clauses are independent and σ satisfies each one with probability 7/8,

Pr[Z > 0] ≤ E[Z]

E[Z] = Σ_{σ ∈ {0,1}^n} E[1_σ] = Σ_σ (7/8)^m = 2^n (7/8)^m
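The 7/8 can be checked by direct enumeration: fix any assignment σ and any three distinct variables; of the 8 equally likely sign patterns for the clause, exactly one, the pattern disagreeing with σ in all three positions, is violated. A small sketch of that count:

```python
from itertools import product

# Fix an assignment sigma and a clause on three distinct variables.
# Enumerate the 8 equally likely sign patterns; sigma violates the clause
# only when every literal disagrees with it, so 7 of 8 patterns are satisfied.
sigma = (True, False, True)   # any fixed assignment gives the same count
satisfied = sum(
    any(sigma[v] == sign for v, sign in zip((0, 1, 2), signs))
    for signs in product([True, False], repeat=3)
)
print(satisfied, "of 8 sign patterns satisfied")   # 7 of 8
```

Independence of the m clauses then gives Pr[σ satisfies all] = (7/8)^m, and linearity of expectation gives E[Z] = 2^n (7/8)^m.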


An easy upper bound

• Thus the expected number of satisfying assignments is

• Exponentially small when 2(7/8)^α < 1, so

• For general k, the same argument gives

• Hint: σ satisfies a random clause with probability

E[Z] = 2^n (7/8)^m = [2 (7/8)^α]^n

αc ≤ log_{8/7} 2 ≈ 5.191

αc ≤ 2^k ln 2

1 − 2^{-k}
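Numerically, the same first-moment argument gives the exact bound α ≤ ln 2 / (−ln(1 − 2^{−k})), which the slide rounds to 2^k ln 2. A quick check (our own script) that the exact bound matches 5.191 for k = 3 and approaches 2^k ln 2 from below:

```python
import math

# First moment: E[Z] = [2 (1 - 2**-k)**alpha]**n is exponentially small when
# alpha > ln 2 / -ln(1 - 2**-k), which approaches 2**k ln 2 as k grows.
for k in (3, 4, 10):
    exact = math.log(2) / -math.log(1 - 2.0**-k)
    approx = 2**k * math.log(2)
    print(k, round(exact, 3), round(approx, 3))
```

For k = 3 the exact value is log_{8/7} 2 ≈ 5.191; since −ln(1 − x) > x, the exact bound always lies below 2^k ln 2.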


Why isn’t this the right answer?

• The real threshold is αc ≈ 4.267

• In the range αc < α < 5.191, the expected number of solutions is exponentially large—even though most of the time there aren’t any.

• Z is not concentrated around its expectation—

• E[Z] is dominated by exponentially rare events where the formula has exponentially many solutions.

• For a long time, the best lower bound was of order 2^k/k, far below 2^k ln 2.

• Maybe we can control the variance?


The second moment method

• If Z is a nonnegative random variable,

• Now we have to think about pairs of solutions, and their correlations:

Pr[Z > 0] ≥ E[Z]² / E[Z²]

E[Z²] = E[(Σ_σ 1_σ)(Σ_τ 1_τ)] = Σ_{σ,τ} E[1_σ 1_τ] = Σ_{σ,τ} Pr[σ, τ both satisfying]


When correlations are weak

• In many cases the events associated with σ ≠ τ are independent. Then

E[Z²] = Σ_{σ,τ} Pr[σ, τ both satisfying] = Σ_σ Pr[σ] + Σ_{τ ≠ σ} Pr[σ] Pr[τ] ≤ E[Z] + E[Z]²

• so Var Z ≤ E[Z], like a Poisson random variable...

• if E[Z] is large, then Pr[Z > 0] ≥ E[Z]²/(E[Z] + E[Z]²) is close to 1


Entropy and correlations

• If σ and τ have overlap z = ζn, they both satisfy a random clause with probability

q(ζ) = 1 − 2·2^{-k} + ζ^k 2^{-k}

• Note that q(1/2) = (1 − 2^{-k})², just as if they were independent.

• Using C(n, ζn) ≈ e^{n h(ζ)} where h(ζ) = −ζ ln ζ − (1−ζ) ln(1−ζ),

E[Z²] = 2^n Σ_{z=0}^{n} C(n, z) q(z/n)^m ≈ 2^n ∫₀¹ e^{n g(ζ)} dζ

• where g(ζ) = h(ζ) + α ln q(ζ)


Asymptotic integrals

• When n is large, integrals of the form I = ∫₀¹ e^{n g(ζ)} dζ are dominated by the ζ that maximizes g(ζ)

• Up to polynomial terms, I ≈ e^{n g_max}

• In our case, if g(ζ) is maximized at ζ = 1/2, then E[Z²] ≈ C·E[Z]²

• The second moment method then gives Pr[Z > 0] ≥ E[Z]²/E[Z²] ≈ 1/C, and Friedgut raises this probability to 1

• But if g(ζ) > g(1/2) for some ζ, then E[Z²] is exponentially larger than E[Z]² and the second moment method fails

• So, where is g(ζ) maximized?

I = ∫₀¹ e^{n g(ζ)} dζ

I ≈ e^{n g_max}


Fatal attraction

• Correlations between σ and τ increase monotonically with the overlap ζ

• Attraction ⇒ correlation ⇒ large variance ⇒ second moment method fails.

[Figure: plot of g(ζ) for 0.40 ≤ ζ ≤ 0.60; g is increasing through ζ = 1/2, so the maximum lies at larger overlap]


Forcing symmetry

• Require that both σ and its complement are satisfying: “Not-All-Equal k-SAT”

[Figure: plot of g(ζ) for NAE-k-SAT over 0 ≤ ζ ≤ 1; forcing symmetry makes the curve symmetric about ζ = 1/2]


The unbearable lightness of being satisfying

• Let Z be the sum over all satisfying assignments of η^{# of true literals}, for some η < 1

[Figure: plot of g(ζ) for the weighted Z over 0.4 ≤ ζ ≤ 1.0]


Tight lower bounds

• Recall the first-moment upper bound: αc ≤ 2^k ln 2

• Second moment with symmetry gives αc ≥ 2^{k−1} ln 2 − O(1) [Achlioptas and Moore]

• Second moment with weighting gives αc ≥ 2^k ln 2 − O(k) [Achlioptas and Peres]


The geometry of solutions: first hints

• Many pairs of solutions with large overlap, others almost independent

[Figure: g(ζ) for 0.4 ≤ ζ ≤ 1.0 with two local maxima: pairs of solutions in the same cluster (large overlap) and pairs in different clusters (smaller overlap)]


Clustering, condensation, and criticality

• The set of solutions breaks up into clusters

• These condense into a few large clusters that dominate the set

• Finally, all the clusters shrink and disappear at αc

[Krzakała, Montanari, Ricci-Tersenghi, Semerjian, Zdeborová]


Figure 14.26: Phase transitions in random k-SAT. Below the clustering transition at α_clust, the set of satisfying assignments is essentially a single connected set. At α_clust it breaks apart into exponentially many clusters, each of which has exponential size, and which are widely separated from each other. At the condensation transition α_cond, a subexponential number of clusters dominates the set. Finally, at α_c all clusters disappear and there are no solutions at all.

Belief Propagation fails at a certain density because long-range correlations appear between the variables of a random formula. These correlations no longer decay with distance, so the messages that a clause or variable receives from its neighbors in the factor graph can no longer be considered independent—even though the roundabout paths that connect these neighbors to each other have length Θ(log n).

As we will see in this section, these correlations are caused by a surprising geometrical organization of the set of solutions. Well below the satisfiability transition, there is another transition at which the set of solutions breaks apart into exponentially many clusters. Each cluster contains solutions that are similar to each other, and the clusters have a large Hamming distance between them.

As the density increases further, there are additional transitions—condensation, where a few clusters come to dominate the set of solutions, and freezing, where many variables in each cluster are forced to take particular values. We believe that condensation is precisely the point at which BP fails to give the right entropy density, and that freezing is the density at which random k-SAT problems become exponentially hard.

In order to locate these transitions, we need to fix Belief Propagation so that it can deal with long-range correlations. The resulting method is known as Survey Propagation, and it is one of the main contributions of statistical physics to the interdisciplinary field of random constraint satisfaction problems.

But first we need to address a few basic questions. What is a cluster? What exactly are those blobs in Figure 14.26? And what do they have to do with correlations? To answer these questions we return to a classic physical system: the Ising model.

14.7.1 Macrostates, Clusters, and Correlations

The Milky Way is nothing else but a mass of innumerable stars planted together in clusters.

Galileo Galilei

We have already seen a kind of clustering transition in Section 12.1 where we discussed the Ising model of magnetism. Recall that each site i has a spin s_i = ±1. Above the critical temperature Tc, typical states of the model are homogeneous, with roughly equal numbers of spins pointing up and down. In a


What’s a cluster?

• The Ising model of magnetism: for T < Tc, the state space breaks up into two “macrostates,” mostly up and mostly down:

[Figure: two snapshots of a 640×640 Ising model applet (http://www.ibiblio.org/e-notes/Perc/ising640.htm) at T = 2.26, one initialized all +1 and one all −1]


What’s a cluster? The discrete math view

• Each cluster of satisfying assignments is connected under moves that flip one variable at a time

• Clusters have Hamming distance cn between them for some c > 0

• Clusters have diameter bn for some b < c

• “Energy barriers”: to get from one cluster to another, have to dissatisfy hn clauses for some h > 0

• N.B.: each of these might not be literally true in the physics picture


What’s a cluster? Mixing times

• If we want to sample a uniformly random satisfying assignment, clustering seems to make this hard

• Energy barriers ⇒ bottlenecks in state space ⇒ exponential time to get from one cluster to another, as in the Ising model for T < Tc.

• But that doesn’t mean it’s hard to find a single satisfying assignment

• For 3-Coloring, there are algorithms that work beyond the clustering transition [Achlioptas and Moore]


What’s a cluster? Fixed point of belief propagation

• Sparse random formulas, like sparse random graphs, are locally treelike

• For most vertices, the shortest cycle they lie on has length Θ(log n)

• If correlations decay with distance, then neighbors are roughly independent


Figure 14.21: Removing a vertex from a tree breaks it into independent subtrees. Here we remove a clause vertex a, creating subtrees T_i, T_j, T_k rooted at each of its variables i, j, k ∈ ∂a.

where x_∂a = {x_i, x_j, …} denotes the truth values of the k variables in ∂a and

C_a(x_∂a) = 1 if clause a is satisfied, 0 otherwise.

We will interpret Z_{i→a}(x_i) as a message that is sent by the variable i to the clause a. In terms of dynamic programming, we can think of it as the solution to the subproblem corresponding to the subtree T_i being passed toward the root.

The message Z_{i→a}(x_i) can be computed, in turn, from messages farther up the tree. Let ∂i denote the set of i’s neighbors in the factor graph, i.e., the set of clauses in which i appears. For each b ∈ ∂i, let Z_{b→i}(x_i) denote the number of satisfying assignments of the subtree rooted at b assuming that i has value x_i. Then

Zi"a (xi ) ="

b!! i#a

Zb"i (xi ) . (14.64)

The messages on the right-hand side of (14.64) are computed from messages even farther up the tree. Analogous to (14.63) we have

Za"i (xi ) =#

x! a#i

Ca (x! a )"

j!! a#i

Z j"a (x j ) . (14.65)

Thus variables send messages to clauses based on the messages they receive from the other clauses in which they appear, and clauses send messages to variables based on the messages they receive from the other variables they contain. (Whew!)

We evaluate (14.64) and (14.65) recursively until we reach a leaf, the base case of the recursion. Here the sets ∂i ∖ a and ∂a ∖ i are empty, but empty products evaluate to 1 by convention. Therefore the


What’s a cluster? Fixed point of belief propagation

• Each vertex (clause or variable) sends “messages” to its neighbors

• Marginal distribution based on its other neighbors

• “My other variables can’t satisfy me,” “I must be true to my other clauses”

Figure 14.22: The flow of messages from the leaves towards the root of the factor graph. Left, each variable sends a message to its parent clause based on the messages it receives from its other clauses. Right, each clause sends a message to its parent variable based on the messages it receives from its other variables. Square vertices a, b correspond to clauses and round vertices i, j correspond to variables.

messages from the leaves are

Z_{i→a}(x_i) = 1  and  Z_{a→i}(x_i) = C_a(x_i) .

Note that a leaf a is a unit clause, in which case C_a(x_i) is 1 or 0 depending on whether x_i satisfies a. If we start at the leaves and work our way to the root, then we know Z as soon as the root has received messages from all its neighbors. The total number of messages that we need to compute and send equals the number of edges in the factor graph—that is, the total number of literals in the formula—so our algorithm runs in essentially linear time.

But we don’t actually need to choose a root at all. Imagine a parallel version of this algorithm, where each vertex sends a message to its neighbor as soon as it has received incoming messages from all its other neighbors. As soon as any vertex has received messages from all its neighbors, we can compute Z. If this vertex is a clause we use (14.63). Similarly, if it is a variable i we can write

Z = Σ_{x_i} Z_i(x_i) = Σ_{x_i} ∏_{a ∈ ∂i} Z_{a→i}(x_i) ,   (14.66)

where Z_i(x_i) is the number of satisfying assignments where i has the value x_i. For that matter, once every vertex has received messages from all its neighbors, then every vertex “knows” the total number of assignments Z.

Figure 14.23 shows the resulting flow of messages. The leaves are the only vertices that can send a message without receiving one, so the flow starts at the leaves and moves inward. The center vertex is the first to receive messages from all its neighbors. In the next iteration it sends messages to all its neighbors, reversing the flow of messages from the center towards the leaves. At the end of the process, every vertex has received messages from all its neighbors.
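Equations (14.64)–(14.66) can be run directly on any formula whose factor graph is a tree. This sketch (our own toy formula and encoding, not from the text) computes Z by message passing and agrees with brute force:

```python
from itertools import product

# A tiny formula whose factor graph is a tree: clause 0 = (x0 v x1),
# clause 1 = (not x1 v x2). Literals are (variable, sign) pairs.
clauses = [[(0, True), (1, True)], [(1, False), (2, True)]]
n = 3

def Z_var_to_clause(i, a):
    # (14.64): product over i's other clauses b of Z_{b->i}(x_i)
    msg = {False: 1, True: 1}
    for b, clause in enumerate(clauses):
        if b != a and any(v == i for v, _ in clause):
            m = Z_clause_to_var(b, i)
            for x in (False, True):
                msg[x] *= m[x]
    return msg

def Z_clause_to_var(a, i):
    # (14.65): sum over the other variables' values, weighted by C_a
    others = [v for v, _ in clauses[a] if v != i]
    msgs = [Z_var_to_clause(v, a) for v in others]
    out = {False: 0, True: 0}
    for xi in (False, True):
        for vals in product([False, True], repeat=len(others)):
            assignment = {i: xi, **dict(zip(others, vals))}
            if any(assignment[v] == s for v, s in clauses[a]):   # C_a = 1
                w = 1
                for m, val in zip(msgs, vals):
                    w *= m[val]
                out[xi] += w
    return out

# (14.66): root at variable 1, which appears in both clauses
root = 1
Z = 0
for x in (False, True):
    w = 1
    for a, clause in enumerate(clauses):
        if any(v == root for v, _ in clause):
            w *= Z_clause_to_var(a, root)[x]
    Z += w

brute = sum(all(any(bits[v] == s for v, s in c) for c in clauses)
            for bits in product([False, True], repeat=n))
print(Z, brute)   # 4 4
```

On a tree every message terminates, because each call excludes the edge it arrived on. (A variable appearing in no clause would contribute a free factor of 2 to Z; none occurs here.)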


What’s a cluster? Fixed point of belief propagation

• In the Ising model, if most of your neighbors are up, you probably are too

• For T < Tc, distant spins are probably the same ⇒ long-range correlations

• But within each cluster, correlations decay

[Figure: two snapshots of the 640×640 Ising applet (http://www.ibiblio.org/e-notes/Perc/ising640.htm) at T = 2.26, one initialized all +1 and one all −1]


What’s a cluster? Reconstruction from boundaries

• Choose a random state, hide all but its boundary, and reconstruct its interior

• How much information can we recover? Is the new state uniformly random deep inside, or is it correlated with the initial state?

• Note: typical boundary conditions, not worst-case


See talks by Sly and Bhatnagar this afternoon!

Tuesday, October 12, 2010

Page 32

Freezing and hardness

• A cluster is frozen if there are qn variables, for some q > 0, that take the same value in every assignment in that cluster

• Search algorithms are doomed if they set any of these variables wrong

• Almost all clusters are frozen when (1 + ε)(2^k/k) ln k < α < (1 − ε) 2^k ln 2 [Achlioptas and Coja-Oghlan]

• But there is an algorithm that works for α < (1 − ε_k)(2^k/k) ln k [Coja-Oghlan]

[Excerpt from a draft, "Frozen variables and hardness," p. 795:]

α_clust   α_cond   α_rigid   α_c

Figure 14.30: A refined phase diagram of random k-SAT. Gray blobs represent frozen clusters, i.e., those where Θ(n) variables take fixed values. Above α_rigid almost all clusters are frozen, and we believe this is responsible for the average-case hardness of random k-SAT.

...clause renders this instance unsatisfiable with finite probability. Therefore we have α_c < α + ε for any ε > 0, a contradiction.

But even if there are no variables that are frozen in all solutions, we could certainly have variables frozen within a cluster. Let's say that a variable x_i is frozen in a cluster C if x_i takes the same value in every solution in C, and call a cluster frozen if it has γn frozen variables for some constant γ > 0.

Intuitively, these frozen clusters spell doom for local algorithms. Imagine a DPLL algorithm descending into its search tree. With every variable it sets, it contradicts any cluster in which this variable is frozen with the opposite value. If every cluster is frozen, then it contradicts a constant fraction of them at each step, until it has excluded every cluster from the branch ahead. This forces it to backtrack, taking exponential time.

It's also worth noting that if the clusters are a Hamming distance δn apart, then the DPLL algorithm is limited to a single cluster as soon as it reaches a certain depth in the search tree. Once it has set (1 − δ)n variables, all the assignments on the resulting subtree are within a Hamming distance δn of each other, so they can overlap with at most one cluster. If any of the variables it has already set are frozen in this cluster, and if it set any of them wrong, it is already doomed.

Recent rigorous results strongly suggest that this is exactly what's going on. In addition to the other properties of clusters established by Theorem 14.5 at densities above (2^k/k) ln k, one can show that almost all clusters have γn frozen variables for a constant γ > 0. Specifically, if we choose a cluster with probability proportional to its size, then it has γn frozen variables with high probability. Equivalently, if we choose a uniformly random satisfying assignment, then with high probability there are γn variables on which it agrees with every other solution in its cluster.

Conversely, it can be shown that algorithms based on setting one variable at a time using BP messages fail in this frozen region. But in a recent breakthrough, an algorithm was discovered which works at densities up to (1 − ε_k)(2^k/k) ln k where ε_k → 0 as k → ∞. Thus for large k, it seems that algorithms end precisely where the frozen phase begins.

For large k, clustering and freezing take place at roughly the same density. In contrast, for small k they are widely separated, which explains why some algorithms can probe deep into the clustered phase. Figure 14.30 shows a refined picture of random k-SAT that includes frozen clusters. The freezing transition is defined by the point α_rigid where the number of unfrozen clusters drops to zero.

For finite k we can determine the freezing point α_rigid numerically using Survey Propagation. A frozen variable corresponds to BP messages μ_{i→a}(x) = δ(x) or μ_{i→a}(x) = δ(1 − x), demanding that i take a particular value.
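The picture of a DPLL algorithm descending its search tree and backtracking out of contradicted branches can be made concrete with a bare-bones sketch. This is an illustrative toy solver, not any of the algorithms analyzed above; the naive branching rule (always split on the first literal) is my choice.

```python
def dpll(clauses):
    """Bare-bones DPLL.  Clauses are lists of nonzero ints: literal v means
    x_v = true, -v means x_v = false.  Returns {var: bool} or None.
    Each branch that sets a variable "wrong" dies by backtracking, the
    exponential-time mechanism described in the excerpt above."""
    def simplify(cls, lit):
        out = []
        for c in cls:
            if lit in c:
                continue                     # clause satisfied, drop it
            reduced = [l for l in c if l != -lit]
            if not reduced:
                return None                  # empty clause: dead branch
            out.append(reduced)
        return out

    def search(cls, assign):
        while True:                          # unit propagation
            units = [c[0] for c in cls if len(c) == 1]
            if not units:
                break
            lit = units[0]
            assign = dict(assign)
            assign[abs(lit)] = lit > 0
            cls = simplify(cls, lit)
            if cls is None:
                return None
        if not cls:
            return assign                    # all clauses satisfied
        lit = cls[0][0]                      # naive branching rule
        for choice in (lit, -lit):
            reduced = simplify(cls, choice)
            if reduced is not None:
                assign2 = dict(assign)
                assign2[abs(choice)] = choice > 0
                result = search(reduced, assign2)
                if result is not None:
                    return result
        return None

    return search([list(c) for c in clauses], {})
```

On a frozen instance, every wrong setting of a frozen variable forces this solver all the way down a doomed subtree before it backtracks.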


Page 33

On the other hand...

• Random XORSAT: each clause sets a parity, xi ⊕ xj ⊕ xk = 0 or 1

• Has clusters with frozen variables, just like k-SAT

• But we can solve it in polynomial time... it’s just a system of linear equations

• Gaussian elimination is not a local algorithm—it globally redefines the variables until the problem becomes trivial

• We believe that there is no analogous approach to k-SAT...

• ...but we don’t know that P ≠ NP.

[Cocco, Dubois, Mandler, Monasson]
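The "system of linear equations" view is easy to make explicit. Here is a minimal sketch of Gaussian elimination over GF(2) for XORSAT (the function name and clause encoding are mine): each parity constraint becomes a row of a 0/1 matrix, elimination runs in polynomial time, and an all-zero row with parity 1 signals unsatisfiability.

```python
def solve_xorsat(n, clauses):
    """Gaussian elimination over GF(2).  Each clause is (variables, parity),
    encoding x_{i1} XOR x_{i2} XOR ... = parity.  Returns a 0/1 assignment
    satisfying every clause, or None if the system is inconsistent."""
    # Pack each equation into an integer: bit i = variable i, bit n = parity.
    rows = []
    for variables, parity in clauses:
        r = parity << n
        for i in variables:
            r ^= 1 << i
        rows.append(r)

    pivot_row = {}                          # column -> index of its pivot row
    for k in range(len(rows)):
        row = rows[k]
        for c in range(n):
            if not (row >> c) & 1:
                continue
            if c in pivot_row:
                row ^= rows[pivot_row[c]]   # eliminate this column
            else:
                pivot_row[c] = k
                rows[k] = row
                break
        else:
            if (row >> n) & 1:              # reduced to "0 = 1": unsatisfiable
                return None

    # Back-substitute, leaving free variables at 0.
    x = [0] * n
    for c in sorted(pivot_row, reverse=True):
        row = rows[pivot_row[c]]
        val = (row >> n) & 1
        for j in range(c + 1, n):
            if (row >> j) & 1:
                val ^= x[j]
        x[c] = val
    return x
```

Note how the elimination step XORs whole rows together: it is exactly the global redefinition of variables that no local search algorithm performs.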


Page 34

Colorings with permutations

• A 3-coloring of a graph G=(V,E) is a function c:V→{red,blue,green} such that c(u)≠c(v) for any (u,v) ∈ E

• A threshold conjecture for 3-colorability:

• Conjecture: the threshold dc stays the same even if we put a random permutation π∈S3 on each edge, and demand that c(u)≠π(c(v))

• If this is true, second moment calculations are a lot easier, and we can bound dc within a constant

• Justification: correlation decay

lim_{n→∞} Pr[G(n, p = d/n) is 3-colorable]  =  1 if d < d_c,  0 if d > d_c

[Moore and Olson]
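The permuted constraint c(u) ≠ π(c(v)) is easy to state in code. Here is a brute-force check (illustrative only, exponential in n; the function name and encoding are mine) where each permutation is a tuple over the colors {0, 1, 2}, so the identity permutation on every edge recovers ordinary 3-coloring.

```python
from itertools import product

def permuted_3col(n, edges, perms):
    """Is there a coloring c : {0..n-1} -> {0,1,2} with c[u] != perm_e(c[v])
    for every edge e = (u, v)?  perms[e] is a permutation of (0, 1, 2)."""
    for c in product(range(3), repeat=n):
        if all(c[u] != perms[e][c[v]] for e, (u, v) in enumerate(edges)):
            return True
    return False
```

With identity permutations, a triangle is 3-colorable and K4 is not, as expected; the conjecture says that on sparse random graphs, randomizing the permutations leaves the threshold density d_c unchanged.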


Page 35

Shameless Plug

THE NATURE of COMPUTATION

Cristopher Moore and Stephan Mertens

Oxford University Press, 2011

This book rocks! You somehow manage to combine the fun of a popular book with the intellectual heft of a textbook.

— Scott Aaronson

A treasure trove of information on algorithms and complexity, presented in the most delightful way.

— Vijay Vazirani

A creative, insightful, and accessible introduction to the theory of computing, written with a keen eye toward the frontiers of the field and a vivid enthusiasm for the subject matter.

— Jon Kleinberg


Page 36

Acknowledgements
