Continuous-time model of structural balance

Seth A. Marvel^a, Jon Kleinberg^{b,1}, Robert D. Kleinberg^b, and Steven H. Strogatz^a

^a Center for Applied Mathematics, and ^b Department of Computer Science, Cornell University, Ithaca, NY 14853

Edited by Ronald L. Graham, University of California, San Diego, La Jolla, CA, and approved November 30, 2010 (received for review September 3, 2010)

It is not uncommon for certain social networks to divide into two opposing camps in response to stress. This happens, for example, in networks of political parties during winner-takes-all elections, in networks of companies competing to establish technical standards, and in networks of nations faced with mounting threats of war. A simple model for these two-sided separations is the dynamical system $dX/dt = X^2$, where $X$ is a matrix of the friendliness or unfriendliness between pairs of nodes in the network. Previous simulations suggested that only two types of behavior were possible for this system: Either all relationships become friendly or two hostile factions emerge. Here we prove that for generic initial conditions, these are indeed the only possible outcomes. Our analysis yields a closed-form expression for faction membership as a function of the initial conditions and implies that the initial amount of friendliness in large social networks (started from random initial conditions) determines whether they will end up in intractable conflict or global harmony.

random matrix theory ∣ polarization

The mathematical model that we want to study is best understood as an outgrowth of a theory from social psychology known as structural balance (1). So let us begin with a brief explanation of what this theory says.

Consider three individuals: Anna, Bill, and Carl, and suppose that Bill and Carl are friends with Anna, but are unfriendly with each other. If the sentiment in the relationships is strong enough, Bill may try to strengthen his friendship with Anna by encouraging her to turn against Carl, and Carl might likewise try to convince Anna to terminate her friendship with Bill. Anna, for her own part, may try to bring Bill and Carl together so they can reconcile and become friends. In abstract terms, relationship triangles containing exactly two friendships are prone to transition to triangles with either one or three friendships.

Alternatively, suppose that Anna, Bill, and Carl all view each other as rivals. In many such situations, there are incentives for the two people in the weakest rivalry to cooperate and form a working friendship or alliance against the third. In these cases, a single friendship may be prone to appear in a relationship triangle that initially has none.

These two thought experiments suggest a notion of stability, or balance, that can be traced back to the work of Heider (2). Heider's theory was expanded into a graph-theoretic framework by Cartwright and Harary (3), who considered graphs on $n$ nodes (representing people, countries, or corporations) with edges signed either positive (+) to denote friendship or negative (−) to denote rivalry. If a social network feels the proper social stresses (those felt by Anna, Bill, and Carl in the examples above), then Cartwright and Harary's theory predicts that in steady state the triangles in the graph should contain an odd number of positive edges: in other words, three positive edges or one positive edge and two negative edges. We refer to such triangles as balanced, and triangles with an even number of positive edges as unbalanced. Finally, we call a graph complete if it contains edges between all pairs of nodes, and we say that a complete graph with signs on its edges is balanced if all its triangles are balanced. (All graphs in our discussion will be complete.)
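The triangle criterion above can be sketched in a few lines of Python. This is an illustrative check, not code from the paper: the function name `is_balanced` and the encoding of the graph as a symmetric array of +1 and −1 entries are assumptions made here.

```python
from itertools import combinations
import numpy as np

def is_balanced(signs):
    """Return True if every triangle of a signed complete graph is balanced.

    `signs` is a symmetric array with entries +1 or -1 off the diagonal;
    the diagonal is ignored. (Hypothetical helper, for illustration only.)
    """
    n = signs.shape[0]
    for i, j, k in combinations(range(n), 3):
        # A triangle has an odd number of positive edges exactly when
        # the product of its three edge signs is positive.
        if signs[i, j] * signs[j, k] * signs[i, k] < 0:
            return False
    return True

# Anna (0), Bill (1), Carl (2).
two_friendships = np.array([[0, 1, 1], [1, 0, -1], [1, -1, 0]])    # unbalanced
one_friendship = np.array([[0, 1, -1], [1, 0, -1], [-1, -1, 0]])   # balanced
all_rivals = np.array([[0, -1, -1], [-1, 0, -1], [-1, -1, 0]])     # unbalanced
```

The check visits every triangle, so it costs on the order of $n^3$ operations, which is adequate for the small complete graphs considered here.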

As it turns out, these local notions of balance theory are closely related to the global structure of two opposing factions. In particular, suppose that the nodes of a complete graph are partitioned into two factions such that all edges inside each faction are positive and all edges between nodes in opposite factions are negative. (One of these factions may be empty, in which case the other faction includes all the nodes in the graph, and consequently all edges of the network are positive.) Note that this network must be balanced, because each triangle either has all three members in the same faction (yielding three positive edges) or has two members in one faction and the third member in the other faction (yielding one positive edge and two negative ones). In fact, a stronger and less obvious statement is true: Any balanced graph can be partitioned into two factions in this way, with one faction possibly empty (3). As a result, when we speak of balanced graphs, we can equivalently speak of networks with this type of two-faction structure.
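One way to make the equivalence concrete is to recover the two factions from the signs of the edges at a single node. The sketch below is an assumption-laden illustration (the function name, the choice of node 0 as the anchor, and the array encoding are all introduced here); it is not the construction used in ref. 3.

```python
import numpy as np

def factions_from_balanced(signs):
    """Split the nodes of a balanced signed complete graph into two factions.

    Node 0 anchors the first faction; every node joined to node 0 by a
    positive edge joins it, and the remaining nodes form the second faction.
    Raises ValueError if the split contradicts `signs`, which happens
    exactly when the graph is not balanced.
    """
    n = signs.shape[0]
    side = np.ones(n, dtype=int)
    side[1:] = np.where(signs[0, 1:] > 0, 1, -1)
    expected = np.outer(side, side)          # +1 within a faction, -1 across
    off_diagonal = ~np.eye(n, dtype=bool)
    if not np.array_equal(np.sign(signs)[off_diagonal], expected[off_diagonal]):
        raise ValueError("graph is not balanced")
    first = [i for i in range(n) if side[i] == 1]
    second = [i for i in range(n) if side[i] == -1]
    return first, second
```

The consistency test works because balance of every triangle containing node 0 forces the sign of edge $\{i,j\}$ to equal the product of the signs of edges $\{0,i\}$ and $\{0,j\}$.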

Model

Structural balance is a static theory: it posits what a “stable” signing of a social network should look like. However, its underlying motivation is dynamic, based on how unbalanced triangles ought to resolve to balanced ones. This situation has led naturally to a search for a full dynamic theory of structural balance. Yet finding systems that reliably guide networks to balance has proved to be a challenge in itself.

A first exploration of this issue was conducted by Antal et al. (4), who considered a family of discrete-time models. In one of the main models of this family, an edge of the graph is examined in each time step, and its sign is flipped if this produces more balanced triangles than unbalanced ones. Although a balanced graph is a stable point for these discrete dynamics, it turns out that many unbalanced graphs, called jammed states, are as well (4, 5).

Thus, the natural problem became to identify and rigorously analyze a simple system that could progress to balanced graphs from generic initial configurations. An approach to this problem was taken by Kułakowski et al. (6), who proposed a continuous-time model for structural balance. They represented the state of a completely connected social network using a real symmetric $n \times n$ matrix $X$ whose entry $x_{ij}$ represents the strength of the friendliness or unfriendliness between nodes $i$ and $j$ (a positive value denotes a friendly relationship and a negative value an unfriendly one). Note that for a given $X$, there is a signed complete graph with edge signs equal to the signs of the corresponding elements $x_{ij}$ in $X$. We will call $X$ balanced if this associated signed complete graph is balanced.

Kułakowski et al. considered variations on the following basic differential equation, which they proposed as a dynamical system governing the evolution of the relationships over time:

Author contributions: S.A.M., J.K., R.D.K., and S.H.S. designed research; S.A.M., J.K., R.D.K., and S.H.S. performed research; S.A.M. analyzed data; and S.A.M., J.K., R.D.K., and S.H.S. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Freely available online through the PNAS open access option.

See Commentary on page 1751.

^1 To whom correspondence should be addressed. E-mail: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1013213108/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1013213108 PNAS ∣ February 1, 2011 ∣ vol. 108 ∣ no. 5 ∣ 1771–1776

APPLIED MATHEMATICS ∣ SEE COMMENTARY


$$\frac{dX}{dt} = X^2. \tag{1}$$

Remarkably, simulations showed that for essentially any initial $X(0)$, the system reached a balanced pattern of edge signs in finite time.

Writing Eq. 1 directly in terms of the entries $x_{ij}$ gives a sense of why this differential equation should promote balance:

$$\frac{dx_{ij}}{dt} = \sum_{k} x_{ik}x_{kj}. \tag{2}$$

Notice that $x_{ij}$ is being pushed in a positive or negative direction based on the relationships that $i$ and $j$ have with $k$: If $x_{ik}$ and $x_{kj}$ have the same sign, their product guides the value of $x_{ij}$ in the positive direction, whereas if $x_{ik}$ and $x_{kj}$ have opposite signs, their product guides the value of $x_{ij}$ in the negative direction. In each case, this is the direction required to balance the triangle $\{i,j,k\}$. Note also that Eq. 2 applies for the case that $i = j$. Although this case is harder to interpret, the monotonic increase of $x_{ii}$ implied by Eq. 2 might be viewed in psychological terms as an increase of self-approval or self-confidence as $i$ becomes more resolute in its opinions about others in the network.
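As a small numerical illustration (not part of the original analysis), the entrywise sum in Eq. 2 can be compared against the matrix square in Eq. 1. The helper name `rhs_entrywise`, the matrix size, the seed, and the uniform entries are arbitrary choices made here.

```python
import numpy as np

def rhs_entrywise(X):
    """Right-hand side of Eq. 2: sum over k of x_ik * x_kj for each pair (i, j)."""
    n = X.shape[0]
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            out[i, j] = sum(X[i, k] * X[k, j] for k in range(n))
    return out

rng = np.random.default_rng(0)       # arbitrary seed, only for reproducibility
A = rng.uniform(-1.0, 1.0, size=(5, 5))
X0 = (A + A.T) / 2                   # symmetrize to obtain a valid state matrix

same_as_matrix_square = np.allclose(rhs_entrywise(X0), X0 @ X0)
# For i = j the rate is sum_k x_ik^2, so every diagonal entry is nondecreasing.
diagonal_rates = np.diag(X0 @ X0)
```

The nonnegative diagonal rates are the numerical counterpart of the monotonic increase of $x_{ii}$ noted in the text.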

For a network with just three nodes, it can be easily proved that a variant of these dynamics generically balances the single triangle in this network; such a three-node analysis has been given by Kułakowski et al. (6), and we describe a short proof in SI Text. What is much less clear, however, is how the system should behave with a larger number of nodes, when the effects governing any one edge $\{i,j\}$ are summed over all nodes $k$ to produce a single aggregate effect on $x_{ij}$.

It has therefore been an open problem to prove that Eq. 2 or any of the related systems studied by Kułakowski et al. will bring a generic initial matrix $X(0)$ to a balanced state. It has also been an open problem to characterize the structure of the balanced state that arises as a function of the starting state $X(0)$.

Results

In this paper, we resolve these two open problems. We first show that for a random initial matrix (with entries sampled independently from an absolutely continuous distribution with bounded support), the system reaches a balanced matrix in finite time with probability converging to 1 as the number of nodes $n$ grows. In addition, we provide a closed-form expression for this balanced matrix in terms of the initial one; essentially, we discover that the system of differential equations serves to “collapse” the starting matrix to a nearby rank-one matrix. We also characterize additional aspects of the process, giving, for example, a description of an “exceptional” set of matrices, of probability measure converging to 0 as $n$ grows, for which the dynamics are not necessarily guaranteed to produce a balanced state.

We then analyze the solutions of the system for classes of random matrices in the large-$n$ limit; in particular, we consider the case in which each unique matrix entry is drawn independently from a distribution with bounded support that is symmetric about a number $\mu$ (the mean value of the initial friendliness among the nodes). In this case, we find a transition in the solution as $\mu$ varies: When $\mu > 0$, the system evolves to an all-positive sign pattern, whereas when $\mu \le 0$, the system evolves to a state in which the network is divided evenly into two all-positive cliques connected entirely by negative edges. We end by discussing some implications of the model and the associated transition between harmony and conflict, including an evaluation of the model on empirical data and some potential connections to research on reconciliation in social psychology.

Behavior of the Model: Evolution to a Balanced State

Suppose we randomly select the $x_{ij}(0)$'s from a continuous distribution on the real line. Then the $x_{ij}(t)$'s found by numerical integration generally sort themselves in finite time into the sign pattern of two feuding factions. To reformulate this observation as a precise statement and explain why the behavior holds so pervasively, we now solve Eq. 1 explicitly.

Solution to the Model. The initial matrix $X(0)$ is real and symmetric by assumption, so we can write it as $QD(0)Q^T$, where $D(0)$ is the diagonal matrix with the eigenvalues of $X(0)$, denoted $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$, as diagonal entries ordered from largest to smallest, and $Q$ is the orthogonal matrix with the corresponding eigenvectors of $X(0)$, denoted $\omega_1, \omega_2, \dots, \omega_n$, as columns. The superscript $T$ signifies transposition.

The differential equation Eq. 1 is a special case of a general family of equations known as matrix Riccati equations (7). The analysis of the full family is complicated and not fully resolved, but we now show that the special case of concern to us, Eq. 1, has an explicit solution with a form that exposes its connections to structural balance. We proceed as follows. First, we observe that by separation of variables, the solution of the single-variable differential equation $\dot{x} = x^2$ (overdot representing differentiation by time) with initial condition $x(0) = \lambda_k$ is

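$$\ell_k(t) = \frac{\lambda_k}{1 - \lambda_k t}. \tag{3}$$

Therefore the diagonal matrix $D(t) = \mathrm{diag}[\ell_1(t), \ell_2(t), \dots, \ell_n(t)]$ is the solution of Eq. 1 for the initial condition $X(0) = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)$.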

Moreover $Y(t) = QD(t)Q^T$ is also a solution of Eq. 1 because $\dot{Y} = Q\dot{D}Q^T = Q(D^2)Q^T = (QDQ^T)^2 = Y^2$. But $Y(t)$ has the same initial condition as $X(t)$ in our original problem: $Y(0) = QD(0)Q^T = X(0)$. So by uniqueness, $Y(t) = QD(t)Q^T$ must be the solution we seek.
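The construction $X(t) = QD(t)Q^T$ translates directly into code. The following is a minimal sketch using NumPy; the function name, the test matrix, and the finite-difference check are assumptions introduced here for illustration rather than anything taken from the paper.

```python
import numpy as np

def solve_by_eigendecomposition(X0, t):
    """Evaluate X(t) = Q D(t) Q^T with D(t) = diag(lambda_k / (1 - lambda_k t))."""
    lam, Q = np.linalg.eigh(X0)      # eigh returns eigenvalues in ascending order
    ell = lam / (1.0 - lam * t)      # Eq. 3 applied to each eigenvalue
    return (Q * ell) @ Q.T           # equivalent to Q @ np.diag(ell) @ Q.T

rng = np.random.default_rng(1)       # arbitrary seed
A = rng.uniform(-1.0, 1.0, size=(6, 6))
X0 = (A + A.T) / 2

# Pick a time safely before any blow-up: |lambda_k t| <= 0.3 for all k.
t = 0.3 / np.max(np.abs(np.linalg.eigvalsh(X0)))
h = 1e-6

X = solve_by_eigendecomposition(X0, t)
# Central finite difference approximating dX/dt at time t.
dX = (solve_by_eigendecomposition(X0, t + h)
      - solve_by_eigendecomposition(X0, t - h)) / (2 * h)
residual = np.max(np.abs(dX - X @ X))   # should be tiny if Eq. 1 is satisfied
```

Note that `numpy.linalg.eigh` orders eigenvalues from smallest to largest, the reverse of the ordering used in the text, so $\lambda_1$ corresponds to the last entry of `lam`.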

Our solution $X(t)$ can also be written in a different way to mimic the solution of the one-dimensional equation $\dot{x} = x^2$. Because $x_{ij}(t) = \sum_{k=1}^{n} q_{ik}\ell_k(t)q_{jk}$, where $q_{ij}$ is the $(i,j)$th entry of $Q$, we can expand the denominators of the $\ell_k(t)$ functions in powers of $t$ to rewrite $X(t)$ as $X(0) + X(0)^2 t + X(0)^3 t^2 + \cdots$, or more concisely,

$$X(t) = X(0)[I - X(0)t]^{-1}. \tag{4}$$

(Note that the matrices $X(0)$ and $[I - X(0)t]^{-1}$ commute.) This equation is valid when $t$ is less than the radius of convergence of the series for every $\ell_k(t)$, that is, when $t < 1/\lambda_1$ (assuming $\lambda_1 > 0$).
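A quick numerical comparison of the three forms of the solution (Eq. 4, the eigendecomposition form, and the truncated power series) may help make the equivalence tangible. The matrix, the seed, the time $t$, and the truncation at 60 terms are illustrative assumptions; $t$ is chosen so that $|\lambda_k t| \le 1/2$ for every $k$, where all three forms are certainly valid.

```python
import numpy as np

rng = np.random.default_rng(2)       # arbitrary seed
n = 6
A = rng.uniform(-1.0, 1.0, size=(n, n))
X0 = (A + A.T) / 2
lam, Q = np.linalg.eigh(X0)
t = 0.5 / np.max(np.abs(lam))        # keeps |lambda_k t| <= 1/2

I = np.eye(n)
resolvent = np.linalg.inv(I - X0 * t)

# Eq. 4: X(t) = X(0) [I - X(0) t]^{-1}
closed_form = X0 @ resolvent

# Eigendecomposition form: Q diag(lambda_k / (1 - lambda_k t)) Q^T
eigen_form = (Q * (lam / (1.0 - lam * t))) @ Q.T

# Truncated power series X(0) + X(0)^2 t + X(0)^3 t^2 + ...
series = sum(np.linalg.matrix_power(X0, m + 1) * t**m for m in range(60))

# X(0) and [I - X(0) t]^{-1} commute, so this should vanish up to rounding.
commutator = X0 @ resolvent - resolvent @ X0
```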

Finally we note that the above method of solving Eq. 1 contains a reduction of the number of dynamical variables of the system from $\binom{n+1}{2}$ to $n$. The $\binom{n}{2}$ constants of motion generated by this reduction are just the off-diagonal elements of $Q^T X(t) Q = D(t)$, or $\sum_{k=1}^{n}\sum_{\ell=1}^{n} q_{ki}x_{k\ell}(t)q_{\ell j} = 0$ for all $1 \le i < j \le n$. Furthermore, the procedure for reducing $X(t)$ can be easily generalized to any system of the form $\dot{X} = f(X)$, where $f$ is a polynomial of $X$.

Behavior of the Solution. Let us now examine the behavior of our solution $X(t)$ to see why in the typical case it splits into two factions in finite time. It turns out that this is the guaranteed outcome if the following three conditions hold (and as we see below, they hold with probability converging to 1 as $n$ goes to infinity):

1. $\lambda_1 > 0$,
2. $\lambda_1 \ne \lambda_2$ (and hence $\lambda_1 > \lambda_2$), and
3. all components of $\omega_1$ are nonzero.

To see why these conditions imply a split into two factions, observe from Eq. 3 that each $\ell_k(t)$ diverges to infinity at $t = 1/\lambda_k$. Because $x_{ij}(t) = \sum_{k=1}^{n} q_{ik}\ell_k(t)q_{jk}$, all $x_{ij}$'s diverge to infinity when the $\ell_k$ with the smallest positive $1/\lambda_k$ does. Under the first and second conditions, this $\ell_k$ is $\ell_1$, so the blow-up time $t^*$ of Eq. 1 must be $1/\lambda_1$. To show that the nodes are partitioned into two factions as $t$ approaches $t^*$, let $\bar{X}(t) = X(t)/\|X(t)\|$ on the half-open interval $[0, t^*)$, where $\|X(t)\|$ denotes the Frobenius norm of $X(t)$. The matrix $\bar{X}(t)$ has the sign pattern of $X(t)$, and as $t$ approaches $t^*$ it converges to the rank-one matrix

$$\bar{X}^* = Q\,\mathrm{diag}(1,0,0,\dots,0)\,Q^T = \omega_1\omega_1^T. \tag{5}$$

Now let $\omega_{1k}$ denote the value of the $k$th coordinate of $\omega_1$, and let $S = \{k : \omega_{1k} > 0\}$ and $T = \{k : \omega_{1k} < 0\}$. Then $S$ and $T$ partition the node indices $1, 2, \dots, n$ by our condition that $\omega_1$ has no zero components. From Eq. 5, this partition must correspond to two cliques of friends joined by a complete bipartite graph of unfriendly ties.
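The prediction that faction membership is read off from the signs of $\omega_1$ can be checked numerically just before the blow-up time. The sketch below is illustrative only: the function name, the matrix size, the seed, and the added identity matrix (which shifts every eigenvalue by 1 without changing the eigenvectors, and is used here solely to guarantee $\lambda_1 > 0$) are assumptions made for this example.

```python
import numpy as np

def predicted_factions(X0):
    """Return S, T (index sets from the signs of omega_1) plus the eigen-data."""
    lam, Q = np.linalg.eigh(X0)
    omega1 = Q[:, -1]                       # eigenvector of the largest eigenvalue
    S = np.flatnonzero(omega1 > 0)
    T = np.flatnonzero(omega1 < 0)
    return S, T, lam, Q

rng = np.random.default_rng(3)              # arbitrary seed
n = 8
A = rng.uniform(-1.0, 1.0, size=(n, n))
X0 = (A + A.T) / 2 + np.eye(n)              # nonnegative trace, hence lambda_1 > 0

S, T, lam, Q = predicted_factions(X0)
t_star = 1.0 / lam[-1]                      # blow-up time 1 / lambda_1
t = t_star * (1.0 - 1e-12)                  # just before the blow-up
X_t = (Q * (lam / (1.0 - lam * t))) @ Q.T

side = np.where(Q[:, -1] > 0, 1, -1)        # +1 for nodes in S, -1 for nodes in T
signs_match = np.array_equal(np.sign(X_t), np.outer(side, side))
```

The overall sign of $\omega_1$ returned by an eigensolver is arbitrary, so $S$ and $T$ may come out swapped; the partition itself, and hence the predicted sign pattern, is unaffected.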

The Three Conditions. We now return to the three conditions above. We first show that the second and third hold with probability 1. We then show that the first condition holds with probability converging to 1 as $n$ goes to infinity. Finally, we analyze the behavior of the system in the unlikely event that the first condition does not hold. The fact that the conjunction of all three conditions holds with probability converging to 1 as $n$ grows large justifies our earlier claim that the behavior described above holds for almost all choices of initial conditions.

First we show why the second and third conditions hold with probability 1 so long as the (joint) distribution from which $X(0)$ is drawn is absolutely continuous with respect to Lebesgue measure (in other words, assigns probability zero to any set of matrices whose Lebesgue measure is zero). Our arguments below make use of the following two basic facts:

i. The set of zeros of a nontrivial multivariate polynomial has Lebesgue measure zero, and

ii. the existence of a common root of two univariate polynomials $P$ and $Q$ is equivalent to the vanishing of a multivariate polynomial in the coefficients of $P$ and $Q$ (specifically, it is equivalent to the vanishing of the determinant of the Sylvester matrix of $P$ and $Q$, also called the resultant of $P$ and $Q$).

To show that $\lambda_1 \ne \lambda_2$ with probability 1, let $P$ denote the characteristic polynomial of $X(0)$, and let $Q$ denote the derivative of $P$. Then $X(0)$ has a repeated eigenvalue if and only if $P$ has a repeated root, which it does if and only if $P$ and $Q$ have a common root. This condition is equivalent to the vanishing of the resultant of $P$ and $Q$, which is a multivariate polynomial in the entries of $X(0)$. The polynomial cannot be zero everywhere, because there is at least one symmetric matrix that does not have a repeated eigenvalue. So the set of matrices having a repeated eigenvalue has Lebesgue measure zero.

Similarly, to show that all components of $\omega_1$ are nonzero, let $P$ denote the characteristic polynomial of $X(0)$ and $P_i$ the characteristic polynomial of the $(n-1) \times (n-1)$ submatrix $X_i(0)$ obtained by deleting the $i$th row and $i$th column of $X(0)$. It is easy to check that if any eigenvector of $X(0)$ has a zero in its $i$th component, then the vector obtained by deleting that component is an eigenvector of $X_i(0)$ with the same eigenvalue. Consequently, $P$ and $P_i$ must have a common root, implying that the resultant of $P$ and $P_i$ vanishes. This resultant is once again a multivariate polynomial in the entries of $X(0)$, and once again it must be nonzero somewhere because there is at least one symmetric matrix whose eigenvectors all have nonzero entries. Hence, the set of matrices having an eigenvector with zero in its $i$th component has Lebesgue measure zero.
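The step described as easy to check can be spelled out as follows. The symbols $v$, $v'$, and $\lambda$ are names introduced here for the eigenvector, its truncation, and the eigenvalue; they do not appear in the original text.

```latex
Let $v$ be an eigenvector of $X(0)$ with eigenvalue $\lambda$ and $v_i = 0$.
For every row $j \neq i$,
\[
  \lambda v_j \;=\; \sum_{k=1}^{n} x_{jk}(0)\, v_k
            \;=\; \sum_{k \neq i} x_{jk}(0)\, v_k ,
\]
because the $k = i$ term is multiplied by $v_i = 0$. The right-hand side is the
$j$th row of $X_i(0)$ applied to $v'$, the vector $v$ with its $i$th component
deleted, so
\[
  X_i(0)\, v' = \lambda v' .
\]
Moreover $v' \neq 0$, since $v \neq 0$ and only a zero entry was removed.
Hence $\lambda$ is a common root of $P$ and $P_i$.
```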

Finally, to determine the likelihood of the first condition, we first must say a bit more about the way that $X(0)$ is selected. Suppose that the off-diagonal $x_{ij}(0)$'s are drawn randomly from a common distribution $F$ and the on-diagonal $x_{ii}(0)$'s are drawn randomly from a common distribution $G$. All selections are independent for $i \le j$. [For $i > j$, we let $x_{ij}(0) = x_{ji}(0)$, so that $X(0)$ is symmetric.] For this construction of $X(0)$, Arnold (8) has shown that with the remarkably weak additional assumption that $F$ has a finite second moment, Wigner's semicircle law holds in probability as $n$ grows to infinity. This in turn implies that $\lambda_1 > 0$ in probability in the same limit.
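A small Monte Carlo experiment can illustrate how quickly the first condition becomes typical as $n$ grows. This is only a sketch under assumptions made here: both $F$ and $G$ are taken to be uniform on $[-1, 1]$, and the function name, trial count, and seed are arbitrary.

```python
import numpy as np

def fraction_with_positive_top_eigenvalue(n, trials, rng):
    """Estimate P(lambda_1 > 0) for symmetric matrices with uniform[-1, 1] entries."""
    hits = 0
    for _ in range(trials):
        A = rng.uniform(-1.0, 1.0, size=(n, n))
        # Keep independent entries for i <= j and mirror them below the diagonal.
        X0 = np.triu(A) + np.triu(A, 1).T
        if np.linalg.eigvalsh(X0)[-1] > 0:
            hits += 1
    return hits / trials

rng = np.random.default_rng(4)       # arbitrary seed
for n in (2, 5, 50):
    print(n, fraction_with_positive_top_eigenvalue(n, 200, rng))
```

For this particular choice of distributions, $\lambda_1 \le 0$ requires every diagonal entry to be nonpositive, an event of probability at most $2^{-n}$, so the estimated fraction approaches 1 rapidly.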

Moreover, suppose we are in the low-probability case that $\lambda_1 \le 0$. In this case, the analysis above shows that all the functions $\ell_i(t)$ converge to 0 as $t \to \infty$. Thus, $\lim_{t\to\infty} D(t) = 0$, and because $X(t) = QD(t)Q^T$, we also have $\lim_{t\to\infty} X(t) = 0$.

Although the entries of $X(t)$ converge to zero when $\lambda_1 \le 0$, one might still want to know if the sign pattern of $X(t)$ is eventually constant (i.e., remains unchanged for all $t$ above some threshold value) and, if so, what determines this sign pattern. It is possible to answer this question, again assuming the second and third conditions. By expanding the function $\ell_i(t) = \lambda_i/(1 - \lambda_i t)$ in powers of $u = 1/t$, we obtain the asymptotic series

$$\ell_i(t) = -u - u^2\lambda_i^{-1} - O(u^3), \tag{6}$$

which implies

$$X(t) = QD(t)Q^T = -uI - u^2 X(0)^{-1} - O(u^3). \tag{7}$$
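As a numerical sanity check of Eq. 7, one can build a matrix whose eigenvalues are all negative and compare the exact solution at a large time with the two leading terms of the expansion. Everything specific below is an assumption for illustration: the seed, the size $n = 5$, the time $t = 10^4$, and the shift by $(n+1)I$, which is used only to force every eigenvalue below $-1$.

```python
import numpy as np

rng = np.random.default_rng(5)       # arbitrary seed
n = 5
A = rng.uniform(-1.0, 1.0, size=(n, n))
# Entries of the symmetric part lie in [-1, 1], so its eigenvalues lie in [-n, n];
# subtracting (n + 1) I pushes every eigenvalue of X0 to -1 or below.
X0 = (A + A.T) / 2 - (n + 1) * np.eye(n)

lam, Q = np.linalg.eigh(X0)
t = 1e4
u = 1.0 / t
X_t = (Q * (lam / (1.0 - lam * t))) @ Q.T           # exact solution at time t

X0_inv = np.linalg.inv(X0)
two_term = -u * np.eye(n) - u**2 * X0_inv           # leading terms of Eq. 7

off = ~np.eye(n, dtype=bool)
signs_agree = np.array_equal(np.sign(X_t[off]), np.sign(-X0_inv[off]))
max_error = np.max(np.abs(X_t - two_term))          # expected to be of order u^3
```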

In the limit of small $u$, the leading-order term of the diagonal entries of $X(t)$ is the linear term, which has negative sign. For the off-diagonal entries of $X(t)$, the leading-order term as $u$ tends to zero is the quadratic term, whose sign matches the sign of the corresponding off-diagonal entry of the matrix $-X(0)^{-1}$.

Behavior of the Model: From Factions to Unification

The analysis in the previous section tells us how to find both the blow-up time $t^*$ and final sign configuration of a network if we know its initial state $X(0)$. However, we might also want to know whether we can characterize the behavior of $X(t)$ in the large-$n$ limit in terms of statistical parameters of $X(0)$. This could, for example, help us forecast the behavior of groups of individuals when collecting complete relationship-level data is not feasible. Clearly if the underlying network is a complete graph, it is not realistic to consider $n$ that are too large, but we find fortunately in simulations that the asymptotic behavior we will derive in this section becomes apparent even for moderate values of $n$ (less than 100). As a result, these large-$n$ results are perhaps most usefully viewed as an approximate guide to what happens in medium-sized groups that are large enough to show predictable collective behavior but for which a completely connected set of relationships is still feasible to maintain.

In this section, we show that there is a transition from final states consisting of two factions to final states consisting of all-positive relations as the “mean friendliness” of $X(0)$ (the mean of the distributions used to generate the off-diagonal entries of $X(0)$) is increased from negative to positive values. This is consistent with the numerical simulations shown in Fig. 1; as noted above, the asymptotic behavior we are studying is already clear in these simulations, which are performed with $n = 90$.

Before discussing the details, we describe how $X(0)$ is selected in this section. We start by adopting the procedure of Füredi and Komlós (9): The elements $x_{ij}(0)$ are drawn independently from distributions $F_{ij}$ with zero mass outside of $[-K, K]$. The off-diagonal $F_{ij}$'s have a common expectation $\mu$ and finite variance $\sigma^2$, whereas the on-diagonal $F_{ii}$'s have a common expectation $\nu$ and variance $\tau^2$. In addition, we require that each off-diagonal distribution $F_{ij}$ be symmetric about $\mu$. Random matrix models of this type have attracted considerable recent interest (see, e.g., refs. 10 and 11), but we need only the basic results of Füredi and Komlós (9) for our purposes, and so we use these in the development below. We consider the three cases of positive, zero, and negative $\mu$.

Case 1: $\mu > 0$. The results of Füredi and Komlós (9) show that when $\mu > 0$, the deviation of $\omega_1$ from $(1,1,\dots,1)/\sqrt{n}$ vanishes in probability in the large-$n$ limit. Hence the final state of the system consists of one large clique of friends containing all but at most a vanishing fraction of the nodes. Moreover, by assuming a bound on $\sigma$ we can strengthen this statement further: If $\sigma < \mu/2$, then the findings of Füredi and Komlós imply that the final state consists of a single clique of friends, with no negative edges. These observations are consistent with the representative numerical trial shown in Fig. 1A. Moreover, Füredi and Komlós show that $\lambda_1$ grows asymptotically like $\mu n + O(1)$, and hence the blow-up time scales like $1/(\mu n)$.
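These statements can be illustrated with a single random draw. The sketch below assumes a specific distribution that the text does not prescribe: every entry is uniform on $[\mu - 0.5, \mu + 0.5]$ with $\mu = 1$, so that $\sigma = 1/\sqrt{12} \approx 0.29 < \mu/2$. The size $n = 200$ and the seed are likewise arbitrary.

```python
import numpy as np

rng = np.random.default_rng(6)       # arbitrary seed
n, mu = 200, 1.0
# Illustrative choice: entries uniform on [mu - 0.5, mu + 0.5], symmetric about mu.
A = rng.uniform(mu - 0.5, mu + 0.5, size=(n, n))
X0 = np.triu(A) + np.triu(A, 1).T    # independent entries for i <= j, mirrored below

lam, Q = np.linalg.eigh(X0)
omega1 = Q[:, -1]
omega1 = omega1 if omega1.sum() > 0 else -omega1    # fix the arbitrary overall sign
uniform_direction = np.ones(n) / np.sqrt(n)

ratio = lam[-1] / (mu * n)                  # near 1 if lambda_1 = mu n + O(1)
deviation = np.linalg.norm(omega1 - uniform_direction)
all_positive = bool(np.all(omega1 > 0))     # single clique: no sign changes in omega_1
blow_up_time = 1.0 / lam[-1]                # comparable to 1 / (mu n)
```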

We can gain insight into the behavior of the system for small $t$ using an informal Taylor series calculation: If we rescale time in Eq. 1 by inserting a $1/n$ before the summation, compute the Taylor expansion of $x_{ij}(t)$ term by term, and then take the expectation of each term, we obtain the geometric series $x(t) = \mu + \mu^2 t + \mu^3 t^2 + \cdots$, or

$$x(t) = \frac{\mu}{1 - \mu t}. \tag{8}$$

With significantly more work, it can be proved that every trajectory $x_{ij}(t)$ has this time dependence on $[0, 1/K)$ in the large-$n$ limit with probability 1 (see SI Text), so we may write

$$\lim_{n\to\infty} x_{ij}(t) = x_{ij}(0) - \mu + \frac{\mu}{1 - \mu t} \quad \text{with prob. } 1 \tag{9}$$

for all $t$ in $[0, 1/K)$. Observe that this limit has a blow-up time $t^*$ of $1/\mu$. Because our rescaling of time represents a zooming in or magnification of time by a factor of $n$, this $t^*$ corresponds to a blow-up time asymptotic to $1/(\mu n)$ for the unrescaled system, consistent with the results of Füredi and Komlós.

Case 2: $\mu = 0$. In the event that the network starts from a mean friendliness of zero, numerical experiments indicate that the system ends up with two factions of equal size in the large-$n$ limit (Fig. 1B). We now prove this to be the case. For the remainder of this discussion, we abbreviate $X(0)$ as $A$ and $x_{ij}(0)$ as $a_{ij}$.

Because the off-diagonal entries of $A$ have symmetric distributions by assumption, we have for any off-diagonal $a_{ij}$ and any interval $S_{ij}$ on the real line that $P(a_{ij} \in S_{ij}) = P(-a_{ij} \in S_{ij})$. Now let $D$ be a diagonal matrix with some sequence of $+1$ and $-1$ along its diagonal (where the $i$th diagonal entry is denoted by $d_i$). Then the random matrices $A$ and $B = DAD$ are identically distributed, as we now show.

To say that $A$ and $B$ are identically distributed means that for every Borel set of matrices $S$, $P(A \in S) = P(B \in S)$. To prove this, it suffices to consider the case in which $S$ is a product of intervals $S_{ij}$, because these product sets generate the Borel $\sigma$-algebra. The upper triangular entries of $A$ are independent, so $P(A \in S) = \prod_{i \le j} P(a_{ij} \in S_{ij})$. Similarly, $P(B \in S) = \prod_{i \le j} P(d_i a_{ij} d_j \in S_{ij})$. By the symmetry of the off-diagonal distributions, $\prod_{i \le j} P(a_{ij} \in S_{ij}) = \prod_{i \le j} P(d_i a_{ij} d_j \in S_{ij})$, which gives us $P(A \in S) = P(B \in S)$ as desired. (Note that when $i = j$, the factor $d_i d_j$ is 1, so the on-diagonal distributions need not be symmetric.)

Now consider the set $S$ of matrices with an $\omega_1$ consisting of all-positive components. The above demonstration implies that the probability of choosing an $A$ in this set is the same as choosing an $A$ such that $B$ is in this set. Regarding the latter event, $A(D\omega_i) = \lambda_i (D\omega_i)$ implies $B\omega_i = \lambda_i \omega_i$, so the $\lambda_1$ eigenvector of the $A$ used to compute $B$ is $D\omega_1$. This demonstrates that all sign patterns for the components of $\omega_1$ are equally likely. In other words, the distribution of the number of positive components in $\omega_1$ is the binomial distribution $B(n, 1/2)$ and the fraction of positive components in $\omega_1$ converges (in several senses) to $1/2$ as $n$ grows large.
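The sign-flipping argument can be checked numerically on a single sample. The sketch below (with illustrative variable names and parameter values) builds $B = DAD$ for a random diagonal sign matrix $D$, confirms that the top eigenvector of $B$ is $\pm D$ times the top eigenvector of $A$, and reports the fraction of positive components.

```python
import numpy as np

rng = np.random.default_rng(1)
n, sigma = 400, 1.0                            # illustrative values
half_width = np.sqrt(3.0) * sigma
g = rng.uniform(-half_width, half_width, size=(n, n))
a = np.triu(g) + np.triu(g, 1).T               # symmetric, entries symmetric about zero

d = rng.choice([-1.0, 1.0], size=n)            # diagonal of D
b = d[:, None] * a * d[None, :]                # B = D A D, entrywise d_i a_ij d_j

w_a = np.linalg.eigh(a)[1][:, -1]              # top eigenvector of A
w_b = np.linalg.eigh(b)[1][:, -1]              # top eigenvector of B

alignment = abs(np.dot(d * w_a, w_b))          # equals 1 when w_b = +/- D w_a
frac_positive = np.mean(w_a > 0)
print(f"alignment = {alignment:.6f}, fraction positive = {frac_positive:.3f}")
```

Because the overall sign of an eigenvector is arbitrary, only the split between the two sign classes is meaningful; a fraction near 1/2 is consistent with the binomial statement above.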

Additionally, we can consider how $\lambda_1$ varies with $n$ in the case that $\mu = 0$ to determine when the blowup will occur. Füredi and Komlós (9) found for this case that $\lambda_1 \in 2\sigma\sqrt{n} + O(n^{1/3} \log n)$ with probability tending to 1, so with probability tending to 1 the blow-up time shrinks to zero like $1/\sqrt{n}$, an order of $\sqrt{n}$ slower than in the $\mu > 0$ case.
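To see the $2\sigma\sqrt{n}$ scaling at finite $n$, one can compute the top eigenvalue for a few matrix sizes. This is a small sketch under assumed settings (uniform entries, a hypothetical helper `top_eigenvalue`), not a reproduction of the paper's simulations.

```python
import numpy as np

def top_eigenvalue(n, mu, sigma, rng):
    """Largest eigenvalue of a random symmetric matrix with uniform entries."""
    half_width = np.sqrt(3.0) * sigma
    g = rng.uniform(mu - half_width, mu + half_width, size=(n, n))
    a = np.triu(g) + np.triu(g, 1).T
    return np.linalg.eigvalsh(a)[-1]           # eigvalsh returns ascending eigenvalues

rng = np.random.default_rng(2)
sigma = 1.0
# Ratio of lambda_1 to 2 sigma sqrt(n) for mu = 0 at several sizes.
ratios = {n: top_eigenvalue(n, 0.0, sigma, rng) / (2 * sigma * np.sqrt(n))
          for n in (100, 400, 900)}
for n, r in ratios.items():
    print(f"n = {n}: lambda_1 / (2 sigma sqrt(n)) = {r:.3f}, blow-up time ~ {1 / (r * 2 * sigma * np.sqrt(n)):.4f}")
```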

Fig. 1. Representative large-$n$ plots of the model for (A) $\mu > 0$ ($\mu = 3/10$ in the plot shown), (B) $\mu = 0$, and (C) $\mu < 0$ ($\mu = -3$ in the plot shown). For all three plots, $\sigma = 1$ and $n = 90$. To reduce image complexity, only one randomly sampled fifth of the trajectories is included. In the second plot, $t^*$ denotes the time at which the system diverges, and $\varepsilon$ denotes a sufficiently small displacement. The white curves superimposed on the three plots are the large-$n$ trajectories $x_{ij}(t) = x_{ij}(0) - \mu + \mu/(1 - \mu n c t)$ for $x_{ij}(0) = \mu, \mu \pm 3\sigma/2$, where $c$ represents a rescaling of time. Because we want to fix the blow-up time $t^*$ near 1 and because $c t^* = 1/\lambda_1$ as found in the text, we choose $c = 1/(\mu n + \nu - \mu + \sigma^2/\mu)$ for A and $c = 1/(2\sigma\sqrt{n})$ for B and C using estimates of $\lambda_1$ taken from ref. 9. The black dotted lines mark the blow-up times $t^* = 1/(c\lambda_1)$.

1774 ∣ www.pnas.org/cgi/doi/10.1073/pnas.1013213108 Marvel et al.

Case 3: $\mu < 0$. For this final case, Füredi and Komlós (9) found that $\lambda_1 < 2\sigma\sqrt{n} + O(n^{1/3} \log n)$ with probability tending to 1. The semicircle law gives a lower bound: $\lambda_1 > 2\sigma\sqrt{n} + o(\sqrt{n})$ in probability. So the blow-up time goes to zero like $1/\sqrt{n}$ in the unrescaled system.

Note also that if we define a matrix $C = -A$, where $A$ is now the initial matrix $X(0)$ of Case 3, then $C$ satisfies the condition of Case 1, $\mu > 0$. Thus the distance between the top eigenvector of $C$ and $(1,1,\ldots,1)/\sqrt{n}$ declines to zero in probability just as in Case 1. Furthermore, every other eigenvector of $C$ is orthogonal to the largest one. Hence if $\sigma < |\mu|/2$, then with probability tending to 1, every other eigenvector acquires a mixture of positive and negative components in the large-$n$ limit, including the bottom eigenvector of $C$, which is the top eigenvector of $A$. This establishes that in the case that $\mu < 0$ and $\sigma < |\mu|/2$, the system ends up in a state with two factions with probability converging to 1 for all finite $n$.
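The orthogonality argument for Case 3 can be illustrated on one sample. In the sketch below (illustrative parameters with $\sigma < |\mu|/2$; uniform entries are an assumption), the bottom eigenvector of $A$, which is the top eigenvector of $C = -A$, is nearly uniform, and the top eigenvector of $A$ therefore has components of both signs.

```python
import numpy as np

rng = np.random.default_rng(3)
n, mu, sigma = 400, -3.0, 1.0                  # sigma < |mu| / 2
half_width = np.sqrt(3.0) * sigma
g = rng.uniform(mu - half_width, mu + half_width, size=(n, n))
a = np.triu(g) + np.triu(g, 1).T

vals, vecs = np.linalg.eigh(a)                 # ascending eigenvalues
bottom, top = vecs[:, 0], vecs[:, -1]          # bottom of A is the top of C = -A
uniform = np.ones(n) / np.sqrt(n)

overlap_bottom = abs(bottom @ uniform)         # expected to be close to 1
n_pos, n_neg = int(np.sum(top > 0)), int(np.sum(top < 0))
print(f"overlap of bottom eigenvector with uniform vector: {overlap_bottom:.4f}")
print(f"top eigenvector of A: {n_pos} positive and {n_neg} negative components")
```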

Numerical simulations of the case that $\mu < 0$ suggest the conjecture that the two factions are approximately equal in size for large $n$. Furthermore, the derivation of Eq. 9 is in fact valid for all $\mu$, so each trajectory rapidly decays from $x_{ij}(0)$ toward $x_{ij}(0) - \mu$ on $[0, 1/K)$ (Fig. 1C). This transient decay appears to extend beyond $t = 1/K$ in numerical simulations. So, for example, if time is rescaled by $1/\sqrt{n}$ instead of $1/n$, we would hypothesize that (i) each trajectory makes a complete jump from $x_{ij}(0)$ to $x_{ij}(0) - \mu$ in the large-$n$ limit and that (ii) from this point onward, the system behaves like an initial configuration of the $\mu = 0$ case and so separates into two equal factions en route to its blowup at $1/(2\sigma)$.

Discussion
In this final section, we review our results and their significance relative to previous work in structural balance theory. We then compare the predictions of the model with data, discuss potential criticisms of the model, and finish with some intriguing connections between the behavior of the model and recent social-psychological work on neutralizing two-sided conflicts.

Our first result is a demonstration that the model forms two factions in finite time across a broad set of initial conditions. As noted at the outset, similar demonstrations have not been possible for dynamic models of structural balance in earlier literature because these models contained so-called jammed states that could trap a social network before it reached a two-faction configuration (4, 5). The model of Kułakowski et al. by contrast has no such jammed states for generic initial conditions and hence provides a robust means for a social network to balance itself.

The second result of the paper is the discovery and characterization of a transition from global polarization to global harmony as the initial mean friendliness of the network crosses from nonpositive to positive values. Similar transitions have been observed in other models of structural balance but so far none has been characterized at a quantitative level. For example, Antal et al. (4) found a nonlinear transition from two cliques of equal size to a single unified clique as the fraction of positively signed edges at $t = 0$ was increased from 0 to 1 (see figure 5 of ref. 4). The authors provided a qualitative argument for this transition, but left open the problem of its quantitative detail. Our results both confirm the generality of their observations and provide a quantitative account of a transition analogous to theirs.

To complement the theoretical nature of our work and get a better sense of how the model behaves in practice, we can numerically integrate it for several cases of empirical social network data where the real-life outcomes of the time evolution are known. Our first example is based on a study by Zachary (12) who witnessed the breakup of a karate club into two smaller clubs. Prior to the separation, Zachary collected counts of the number of social contexts in which each pair of individuals interacted outside of the karate club, with the idea being that the more social contexts they shared, the greater the likelihood for information exchange. These counts, or capacities as Zachary called them, can be converted to estimates of friendliness and rivalry in many different ways. For a large class of such conversions, Eq. 1 predicts the same division that Zachary’s method found, which misclassified only 1 of the 34 club members (Fig. 2 A and B).

A second example can be constructed from the data of a study by Axelrod and Bennett (13) regarding the aggregation of Allied and Axis powers during World War II. If we simply take the entries of their propensity(i, j) · size(i) · size(j) matrix to be proportional to the friendliness felt between the various pairs of countries in the war, then running the model gives the correct Allied-Axis split for all countries except Denmark and Portugal (Fig. 2C).
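The way a faction split is read off from an initial friendliness matrix can be sketched in a few lines. The example below uses a small synthetic matrix with two planted groups (it is not Zachary's or Axelrod and Bennett's data), and it assumes the closed-form solution $X(t) = A\,(I - tA)^{-1}$ of $dX/dt = X^2$, so that near the blow-up time $1/\lambda_1$ the sign pattern is governed by the top eigenvector. The function names `predict_factions` and `exact_solution` are hypothetical.

```python
import numpy as np

def predict_factions(a):
    """Sign pattern (+1 / -1 per node) of the top eigenvector of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(a)
    w1 = vecs[:, -1]
    if w1[0] < 0:                               # fix the arbitrary overall sign: node 0 gets +1
        w1 = -w1
    return np.where(w1 >= 0, 1, -1)

def exact_solution(a, t):
    """X(t) = a (I - t a)^{-1}, which solves dX/dt = X^2 with X(0) = a."""
    return a @ np.linalg.inv(np.eye(a.shape[0]) - t * a)

# Synthetic (not empirical) friendliness matrix: two planted groups plus symmetric noise.
rng = np.random.default_rng(4)
planted = np.array([1, 1, 1, -1, -1, -1])
noise = rng.normal(0.0, 0.1, size=(6, 6))
a = np.outer(planted, planted) + (noise + noise.T) / 2

factions = predict_factions(a)
lam1 = np.linalg.eigvalsh(a)[-1]
x_late = exact_solution(a, 0.99 / lam1)         # shortly before the blow-up time 1 / lam1
print("factions:", factions)
print("sign pattern matches outer product:",
      np.array_equal(np.sign(x_late), np.outer(factions, factions)))
```

For empirical data, the conversion from raw counts to signed friendliness values remains a modeling choice, as discussed above.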

Nevertheless, the model clearly contains several strong simplifications of the underlying social processes. The first of these is inherent to structural balance theory itself; it is a framework restricted to capture a particular kind of social situation, in which the need for consistency among one’s friendships and rivalries brings about the emergence of two factions. Extensions of the theory have considered models in which it is possible to have multiple mutual enemies and hence more than two factions (14), and also networks that are not complete graphs (3). However, our focus here has been on the basic theory, because as we have seen, obtaining a satisfactory dynamics even for this simplest form of structural balance has been an elusive challenge. Moreover, the basic version of structural balance that we have considered here, with a complete graph of relationships and constraints leading to two factions, is relevant to a range of different situations. These span the kinds of settings discussed earlier in this section, including clubs, classrooms, and small organizations (15), as well as international relations during crisis (where a large set of nations can all mutually maintain friendly or unfriendly diplomatic relations) (13, 16).

Fig. 2. Tests of the model of Kułakowski et al. (Eq. 1) against two existing datasets. (A) The evolution of the model starting from Zachary’s capacity matrix with the capacity of each relationship reduced by 0.58. This is the minimal downward displacement necessary (to two significant figures) for the resulting separation to be correct for all but 1 of the 34 club members. For reasons described by Zachary (12), this is basically the best separation we can expect. (B) The evolution of the model from Zachary’s capacity matrix with the capacity of zero between the two club leaders replaced by −11; the resulting factions are identical to those in A. Substituted values less than −11 yield the same two factions, whereas greater values produce less accurate divisions. (C) The evolution of the model starting from Axelrod and Bennett’s 1939 propensity(i, j) · size(i) · size(j) matrix for the 17 countries involved in World War II (by Axelrod and Bennett’s definition). The model finds the correct split into Allied and Axis powers with the exceptions of Denmark and Portugal. Axelrod and Bennett’s own landscape theory of aggregation does slightly better: its only misclassification is Portugal.

Another consequence of the particular model studied here that has no direct analogue in real social situations is the divergence to infinity of the relationship strengths $x_{ij}$. However, because the purpose of the model is to study the pattern of signs that emerges, our main conclusions are based not on the actual magnitudes of these numbers but on the fact that the sign pattern eventually stabilizes at a point before the divergence. This stabilization of the sign pattern is our primary focus, and one could interpret the subsequent singularity as simply the straightforward and unimpeded “ramping up” of values caused by the system once all inconsistencies have been worked out of the social relations; the divergence itself can be viewed as taking place beyond the window of time over which the system corresponds to anything real. Alternately, one can imagine that as the community completes its separation into two groups, other social processes take over. For example, individuals with differing ideological views or social preferences may self-segregate, breaking the all-to-all assumption of the model. In other cases, mounting tensions may erupt into violence, reflecting a sort of bound on the relationship intensity achievable for pairs of nodes in the network.

Finally, we note that there is a large body of work in social psychology that studies issues such as the formation and reconciliation of factions from a much more empirical basis; see, for example, refs. 17 and 18. It is an interesting open problem to determine the extent to which the strictly mathematical development of the models here can be combined with the perspectives in this empirical body of literature, ultimately leading to a richer theory of these types of social processes.

ACKNOWLEDGMENTS. We thank Nick Trefethen and David Bindel for pointers to the literature on matrix Riccati equations. Research was supported in part by the John D. and Catherine T. MacArthur Foundation, a Google Research Grant, a Yahoo! Research Alliance Grant, an Alfred P. Sloan Foundation Fellowship, a Microsoft Research New Faculty Fellowship, a grant from the Air Force Office of Scientific Research, and National Science Foundation Grants CCF-0325453, BCS-0537606, CCF-0643934, IIS-0705774, and CISE-0835706.

1. Wasserman S, Faust K (1994) Social Network Analysis: Methods and Applications, Structural Analysis in the Social Sciences (Cambridge Univ Press, New York), pp 220–248.
2. Heider F (1946) Attitudes and cognitive organization. J Psychol 21:107–112.
3. Cartwright D, Harary F (1956) Structural balance: A generalization of Heider’s theory. Psychol Rev 63:277–293.
4. Antal T, Krapivsky PL, Redner S (2005) Dynamics of social balance on networks. Phys Rev E 72:036121.
5. Marvel SA, Strogatz SH, Kleinberg JM (2009) Energy landscape of social balance. Phys Rev Lett 103:198701.
6. Kułakowski K, Gawroński P, Gronek P (2005) The Heider balance—a continuous approach. Int J Mod Phys C 16:707–716.
7. Abou-Kandil H, Freiling G, Ionescu V, Jank G (2003) Matrix Riccati Equations in Control and Systems Theory (Birkhäuser, Basel, Switzerland) p 21.
8. Arnold L (1971) On Wigner’s semicircle law for the eigenvalues of random matrices. Probab Theory Rel 19:191.
9. Füredi Z, Komlós J (1981) The eigenvalues of random symmetric matrices. Combinatorica 1:233–241.
10. Vu V (2007) Spectral norm of random matrices. Combinatorica 27:721–736.
11. Tao T, Vu V (2010) Smooth analysis of the condition number and the least singular value. Math Comput 79:2333–2352.
12. Zachary WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33:452–473.
13. Axelrod R, Bennett DS (1993) A landscape theory of aggregation. Brit J Polit Sci 23:211–233.
14. Davis JA (1967) Clustering and structural balance in graphs. Hum Relat 20(2):181–187.
15. Doreian P, Mrvar A (1996) A partitioning approach to structural balance. Soc Networks 18:149–168.
16. Moore M (1978) An international application of Heider’s balance theory. Eur J Soc Psychol 8:401–405.
17. Pettigrew TF (1998) Intergroup contact theory. Annu Rev Psychol 49:65–85.
18. Pettigrew TF, Tropp LR (2006) A meta-analytic test of intergroup contact theory. J Pers Soc Psychol 90:751–783.


Supporting Information
Marvel et al. 10.1073/pnas.1013213108

SI Text
Referenced Results. In the introduction of our paper, we assert that a variant of the dynamics proposed by Kułakowski et al. (1) generically balances an isolated triangle. We explain what we mean here.

Theorem 1. The system $\dot{x}_{12} = x_{13}x_{23}$, $\dot{x}_{13} = x_{12}x_{23}$, $\dot{x}_{23} = x_{12}x_{13}$ achieves balance when the initial values $x_{12}(0)$, $x_{13}(0)$, and $x_{23}(0)$ are all unequal.

Proof: Multiplying each $\dot{x}_{ij}$ by $x_{ij}$ yields $x_{12}\dot{x}_{12} = x_{13}\dot{x}_{13} = x_{23}\dot{x}_{23}$. Integrating these equalities gives the constraints $x_{12}^2 - x_{13}^2 = C_1$ and $x_{12}^2 - x_{23}^2 = C_2$, which partition the three-dimensional space of $(x_{12}, x_{13}, x_{23})$ into trajectories (with the direction of flow given by the original dynamical system). Examination of this flow reveals that each initial condition $(x_{12}(0), x_{13}(0), x_{23}(0))$ with distinct coordinates flows into one of the four octants on which Heider balance holds, that is, where $x_{12}x_{13}x_{23} > 0$. Furthermore, these octants each act as separate trapping regions: Once a trajectory enters, it cannot leave. Hence, the theorem follows.
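The behavior described in Theorem 1 can be observed by integrating the triangle system directly. The following sketch uses a hand-written classical Runge-Kutta (RK4) integrator; the initial condition, step size, and stopping threshold are illustrative assumptions, and the threshold merely stands in for "shortly before blow-up".

```python
import numpy as np

def triangle_rhs(x):
    """Right-hand side of the triangle system; x = (x12, x13, x23)."""
    x12, x13, x23 = x
    return np.array([x13 * x23, x12 * x23, x12 * x13])

def integrate_triangle(x0, dt=1e-3, stop=10.0, max_steps=100_000):
    """Classical RK4 steps until some |x_ij| exceeds `stop` or max_steps is reached."""
    x = np.array(x0, dtype=float)
    for _ in range(max_steps):
        if np.max(np.abs(x)) > stop:
            break
        k1 = triangle_rhs(x)
        k2 = triangle_rhs(x + 0.5 * dt * k1)
        k3 = triangle_rhs(x + 0.5 * dt * k2)
        k4 = triangle_rhs(x + dt * k3)
        x = x + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return x

x0 = (1.0, 0.5, -0.2)              # two friendships and one rivalry: unbalanced
x_end = integrate_triangle(x0)
c1_start = x0[0] ** 2 - x0[1] ** 2
c1_end = x_end[0] ** 2 - x_end[1] ** 2
print("final state:", x_end, "balanced:", bool(np.prod(x_end) > 0))
print("conserved x12^2 - x13^2:", c1_start, "->", c1_end)
```

The near-constancy of $x_{12}^2 - x_{13}^2$ along the numerical trajectory reflects the constraint derived in the proof.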

The next theorem regards the main system of the paper with a rescaling of time: $\frac{d}{dt}X = n^{-1}X^2$, where $X$ is a real symmetric $n \times n$ matrix. Recall that $x_{ij}(t)$ denotes the $(i,j)$th element of the solution matrix $X(t)$ subject to the initial condition $X(0)$. In the following, we abbreviate $X(0)$ as $A$ and $x_{ij}(0)$ as $a_{ij}$. Suppose that the $a_{ij}$, $i \le j$, are drawn independently from distributions $F_{ij}$ with zero mass outside $[-K, K]$, and the off-diagonal distributions $F_{ij}$ have common expectation $\mu$ and variance $\sigma^2$.

Theorem 2. $\lim_{n\to\infty} x_{ij}(t) = a_{ij} - \mu + \mu/(1 - \mu t)$ with probability 1 for $t \in [0, 1/K)$.

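Proof: Regard each step of the limit $n \to \infty$ as a selection and concatenation of elements $\{a_{in}\}_{1 \le i \le n-1}$, $\{a_{nj}\}_{1 \le j \le n-1}$, $a_{nn}$ to the elements $\{a_{ij}\}_{1 \le i,j \le n-1}$ selected in preceding steps. Now consider the partial sum of the Taylor series expansion of $x_{ij}(t)$:

$$x_{ijnN}(t) = \sum_{k=0}^{N} \alpha_{kn} t^k \quad \text{where} \quad \alpha_{kn} = \frac{1}{k!} \left. \frac{d^k x_{ij}}{dt^k} \right|_{t=0}. \tag{S1}$$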

The first step of the proof of Theorem 2 consists of proving that $\lim_{N\to\infty} \lim_{n\to\infty} x_{ijnN}(t)$ converges to $a_{ij} - \mu + \mu/(1 - \mu t)$ with probability 1 on $[0, 1/|\mu|)$ (see Lemma 1). The second step of the proof consists of proving that $\lim_{N\to\infty} \lim_{n\to\infty} x_{ijnN}(t) = \lim_{n\to\infty} x_{ij}(t)$ with probability 1 on $[0, 1/K)$ (see Lemma 2). Because we can write $\lim_{n\to\infty} x_{ij}(t)$ as $\lim_{n\to\infty} \lim_{N\to\infty} x_{ijnN}(t)$, this amounts to showing that the two limits can be exchanged on $[0, 1/K)$. The above theorem then follows trivially by a union bound.

Lemma 1. Under the assumptions of Theorem 2, $\lim_{N\to\infty} \lim_{n\to\infty} x_{ijnN} = a_{ij} - \mu + \mu/(1 - \mu t)$ with probability 1 for $t \in [0, 1/|\mu|)$.

Proof: For the sake of generality, we present a proof with milder assumptions than those of the rest of the paper: We require only that the moments of the $F_{ij}$ distributions be finite (and that the off-diagonal distributions have mean $\mu$), not that the $a_{ij}$ values be bounded by $K$ with probability 1.

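Define $\alpha_{k\infty} = \lim_{n\to\infty} \alpha_{kn}$ (merely shorthand; we do not assume the limit exists). By a union bound, we have

$$\Pr\left( \bigcap_{k=1}^{\infty} \left[ \alpha_{k\infty} = \mu^{k+1} \right] \right) \ge 1 - \sum_{k=1}^{\infty} \left[ 1 - \Pr\left( \alpha_{k\infty} = \mu^{k+1} \right) \right]. \tag{S2}$$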

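So if we can show that $\Pr(\alpha_{k\infty} = \mu^{k+1}) = 1$ for all $k \ge 1$, then $\Pr(\bigcap_{k=1}^{\infty} [\alpha_{k\infty} = \mu^{k+1}]) = 1$. In this case, $\lim_{N\to\infty} \lim_{n\to\infty} x_{ijnN}$ has the convergent Taylor series $a_{ij} + \sum_{k=1}^{\infty} \mu^{k+1} t^k$ on $[0, 1/|\mu|)$ with probability 1 (a geometric series that sums to $a_{ij} - \mu + \mu/(1 - \mu t)$), which proves the lemma.

So our task reduces to showing that $\Pr(\alpha_{k\infty} = \mu^{k+1}) = 1$ for each $k \ge 1$. In order to do this, we need to compute the leading behavior of $\alpha_{kn}$ in $n$. To calculate the $k$ time derivatives of $x_{ij}$ in the formula for $\alpha_{kn}$ (see Eq. S1), we alternate between applying the chain rule of differential calculus and substituting in the right-hand side of $\dot{x}_{ij} = n^{-1} \sum_k x_{ik} x_{kj}$ (our system $\dot{X} = n^{-1} X^2$ written in elementwise fashion). This gives

$$\alpha_{kn} = n^{-k} \sum_{m_1=1}^{n} \sum_{m_2=1}^{n} \cdots \sum_{m_k=1}^{n} a_{i m_1} a_{m_1 m_2} \cdots a_{m_k j}, \tag{S3}$$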

where the factor $n^{-k}$ comes from the $k$ factors of $n^{-1}$ introduced by the $k$ derivatives, and the factor $1/k!$ in the formula for $\alpha_{kn}$ cancels with a factor $k!$ that arises from repeated applications of the chain rule. In Eq. S3, the dominant term is a sum of the edge value products of all simple length-$(k+1)$ paths between $i$ and $j$. This sum contains $(n-2)!/(n-2-k)!$ terms. All other paths include fewer intermediate nodes and thus have at least a factor of $n$ fewer terms in their sums.
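The nested sum in Eq. S3 is the $(i,j)$ entry of $A^{k+1}$ divided by $n^k$, which gives a quick way to evaluate $\alpha_{kn}$ numerically. The sketch below checks this identity by brute force on a small matrix and then compares $\alpha_{kn}$ with $\mu^{k+1}$ for a larger one. The function names, the uniform entry distribution, and the parameter values are assumptions for illustration.

```python
import itertools
import numpy as np

def alpha_bruteforce(a, i, j, k):
    """Eq. S3 literally: n^{-k} times the sum over all k-tuples of intermediate nodes."""
    n = a.shape[0]
    total = 0.0
    for ms in itertools.product(range(n), repeat=k):
        path = (i, *ms, j)
        total += np.prod([a[path[s], path[s + 1]] for s in range(k + 1)])
    return total / n**k

def alpha_matrix_power(a, i, j, k):
    """Same quantity, using that the nested sum is the (i, j) entry of a^(k+1)."""
    n = a.shape[0]
    return np.linalg.matrix_power(a, k + 1)[i, j] / n**k

def uniform_symmetric(n, mu, sigma, rng):
    """Symmetric matrix with uniform entries of mean mu and standard deviation sigma."""
    half_width = np.sqrt(3.0) * sigma
    g = rng.uniform(mu - half_width, mu + half_width, size=(n, n))
    return np.triu(g) + np.triu(g, 1).T

rng = np.random.default_rng(5)
mu, sigma = 0.5, 0.5                               # illustrative values
small = uniform_symmetric(5, mu, sigma, rng)       # small enough for brute force
big = uniform_symmetric(600, mu, sigma, rng)
alphas = [alpha_matrix_power(big, 0, 1, k) for k in (1, 2, 3)]
for k, alpha in zip((1, 2, 3), alphas):
    print(f"k = {k}: alpha_kn = {alpha:.4f}, mu^(k+1) = {mu ** (k + 1):.4f}")
```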

Our goal then for the remainder of the proof is to show that the first term of Eq. S3 is the only term that remains after taking $n$ to infinity, and that it converges to $\mu^{k+1}$ with probability 1. To simplify notation, let $\ell$ denote the product of the edge values $a_{ij}$ along a particular path of length $k+1$ (not necessarily simple) from $i$ to $j$, and let $L$ denote the set of all such products on paths with the same configuration, or pattern of connectivity. Denote the set of all $L$ by $\{L\}$, and let $S$ denote the one $L$ in $\{L\}$ consisting of simple paths of length $k+1$ from $i$ to $j$.

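Now observe that $\bigcap_{L \in \{L\} \setminus S} \left[ \lim_{n\to\infty} n^{-k} \sum_{\ell \in L} \ell = 0 \right] \cap \left[ \lim_{n\to\infty} n^{-k} \sum_{\ell \in S} \ell = \mu^{k+1} \right] \subset \left[ \alpha_{k\infty} = \mu^{k+1} \right]$. So by another union bound,

$$\Pr\left( \alpha_{k\infty} = \mu^{k+1} \right) \ge 1 - \Pr\left( \lim_{n\to\infty} n^{-k} \sum_{\ell \in S} \ell \ne \mu^{k+1} \right) - \sum_{\{L\} \setminus S} \Pr\left( \lim_{n\to\infty} n^{-k} \sum_{\ell \in L} \ell \ne 0 \right). \tag{S4}$$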

Hence, we are done if we can show that (i) $\Pr(\lim_{n\to\infty} n^{-k} \sum_{\ell \in S} \ell = \mu^{k+1}) = 1$ for $S$ and (ii) $\Pr(\lim_{n\to\infty} n^{-k} \sum_{\ell \in L} \ell = 0) = 1$ for all other $L$. Although $\sum_{\ell \in L} \ell$ is in general a sum of correlated random variables, it is possible to adapt a standard proof of the strong law of large numbers for uncorrelated random variables to prove both items. We do this next.

Let us prove (ii) first. For brevity, let $S_n = \sum_{\ell \in L} \ell$ and choose $v$ to denote the number of nodes in the path configuration of $L$. For any positive $\epsilon$ and $r = 1, 2, \ldots$, Markov’s inequality gives $\Pr(|S_n| \ge (n\epsilon)^k) \le E(|S_n|^r)/(n\epsilon)^{kr}$. So if we can find an $r$ such that $E(|S_n|^r)/(n\epsilon)^{kr} \le C/n^2$ for some constant $C$ (dependent on $\epsilon$), then $\sum_n \Pr(|S_n| \ge (n\epsilon)^k)$ converges, and by the first Borel–Cantelli lemma, $\Pr(|n^{-k} \sum_{\ell \in L} \ell| \ge \epsilon \text{ i.o.}) = 0$ for all $\epsilon > 0$ (where i.o. stands for infinitely often). Careful reflection reveals that $\bigcup_{\epsilon} [|n^{-k} \sum_{\ell \in L} \ell| \ge \epsilon \text{ i.o.}]$ (for, say, all rational $\epsilon$) is the complementary event of $[\lim_{n\to\infty} n^{-k} \sum_{\ell \in L} \ell = 0]$, and so we have arrived at the desired result (ii).

Hence, in order to actually show (ii), we need to find an $r$ such that $E(|S_n|^r)/(n\epsilon)^{kr} \le C/n^2$. Consider $r = 2$: $E(S_n^2) = \sum E(\ell_x \ell_y)$, where each index of the sum ranges independently over $L$. There are $(n-2)!/(n-2-v)!$ paths $\ell$ in $L$, so there are fewer than $n^{2v}$ terms in $\sum E(\ell_x \ell_y)$, and $E(S_n^2) \le D n^{2v}$ for some constant $D$. Because $v < k$ for all $L$ other than $S$, we have $E(|S_n|^2)/(n\epsilon)^{2k} \le D n^{2v}/(n\epsilon)^{2k} \le C/n^2$, where $C = D\epsilon^{-2k}$, and the proof of (ii) is complete.

Finally, to prove (i), start by replacing each factor $a_{xy}$ in $\ell$ with $b_{xy} + \mu$, where $b_{xy} = a_{xy} - \mu$. Now expand the result and cancel $\mu^{k+1}$ from both sides of $n^{-k} S_n = \mu^{k+1}$ to obtain $n^{-k} S_n' = 0$, where $S_n'$ is a sum over $S$ of a polynomial $Q$ with $2^{k+1} - 1$ terms, each of the form $\mu\mu\cdots\mu\, b_{uv} b_{wx} \cdots b_{yz}$, where at least one of the factors is a $b_{xy}$ and the total number of factors in the term is $k + 1$. Note that each place of $Q$ corresponds to a particular set of $b_{xy}$’s from the original simple path, e.g., the 14th place of $Q$ might have $b_{xy}$’s corresponding to the 1st, 4th, 5th, and 7th edges of the path, and $\mu$’s for the other edges. Now let $m_q$ denote the number of vertices (excluding $i$ and $j$) among the subscripts of the $b_{xy}$’s in a given term. The remaining $k - m_q$ nodes of the path not found in the term (supplanted by the $\mu$’s) can take any of $(n - 2 - m_q)!/(n - 2 - k)!$ permutations. Hence, there are no more than $n^{k - m_q}$ identical copies of any one term in $S_n'$ from the same place in $Q$.

Now consider one of the $(2^{k+1} - 1)^4$ ways that terms in the $2^{k+1} - 1$ places of $Q$ can be multiplied together in $(S_n')^4$. Note that this can produce no more than $n^{4k - \sum_{q=1}^{4} m_q}$ identical copies of the same term. Second, because the $b_{xy}$’s each have expectation zero, every $b_{xy}$ in the final term must appear to at least a power of two or the whole term has expectation zero. This implies that for each nonvanishing term, there must be some pattern of matching between the $b_{xy}$’s. The number of possible matchings is clearly a function of $k$ and not $n$ [it certainly is not more than the number of partitions of $4(k+1)$ edges], so consider one of these possible matchings. Now observe that if, as we stated above, we consider only one of the $(2^{k+1} - 1)^4$ ways that terms in the $2^{k+1} - 1$ places of $Q$ can be multiplied together in $(S_n')^4$, then no more than $n^{\sum_{q=1}^{4} m_q/2}$ distinct nonvanishing terms can be constructed per matching for any such way of combining terms. This holds because each $b_{xy}$ needs at least one match, and so the number of free nodes cannot exceed half the total number of $b_{xy}$’s in the final term. Thus we have shown the highest order of $n$ possible for $E((S_n')^4)$ is given by the maximum value of $n^{\sum_{q=1}^{4} m_q/2}\, n^{4k - \sum_{q=1}^{4} m_q}$. Because $m_q \ge 1$ for each $q = 1, \ldots, 4$, this can at most be $n^{4k-2}$, which by the above reasoning completes the proof of (i) and hence the full theorem.

Lemma 2. Under the assumptions of Theorem 2, $\lim_{n\to\infty} \lim_{N\to\infty} x_{ijnN} = \lim_{N\to\infty} \lim_{n\to\infty} x_{ijnN}$ with probability 1 for $t \in [0, 1/K)$.

Proof: We need three ingredients for this proof. We first describe the three ingredients and then show how they together prove Lemma 2. Throughout the following, all statements hold with probability 1 unless stated otherwise.

As we found in the course of the proof of Lemma 1, the limits $\lim_{n\to\infty} \alpha_{kn}$ exist for all $k$ and are $\mu^{k+1}$ on $[0, 1/|\mu|)$, so $\lim_{n\to\infty} \sum_{k=0}^{N} \alpha_{kn} t^k$ exists under the same conditions, and we call it $x_{ij\infty N}(t)$. This gives us the first ingredient: (i) $\lim_{n\to\infty} x_{ijnN}(t) = x_{ij\infty N}(t)$ for $t \in [0, 1/|\mu|)$ and any $N$. Additionally, from Lemma 1 we know that $\lim_{N\to\infty} x_{ij\infty N}(t)$ exists and is $a_{ij} - \mu + \mu/(1 - \mu t)$ on $[0, 1/|\mu|)$. We call this limit $x_{ij\infty\infty}(t)$ and write the second ingredient as (ii) $\lim_{N\to\infty} x_{ij\infty N}(t) = x_{ij\infty\infty}(t)$ for $t \in [0, 1/|\mu|)$.

Finally, as we saw in the proof of Lemma 1, $\alpha_{kn} = n^{-k} \sum a_{i m_1} a_{m_1 m_2} \cdots a_{m_k j}$ (by definition, not just with probability 1), where the $k$ indices $m_x$ each range independently from 1 to $n$. Because each $|a_{ij}| < K$, we must have that $|\alpha_{kn}| \le K^{k+1}$, which implies $|x_{ijn\infty}(t) - x_{ijnN}(t)| \le K(Kt)^{N+1}/(1 - Kt)$. So if $|Kt| < 1$, then for any $\epsilon > 0$, there is a sufficiently large $N_1$ independent of $n$ such that $|x_{ijn\infty}(t) - x_{ijnN}(t)| \le \epsilon$ for all $N \ge N_1$. This constitutes our third ingredient, that $x_{ijnN}(t)$ converges uniformly to $x_{ijn\infty}(t)$: (iii) For every $\epsilon > 0$, there exists an $N_1$ such that for all $N \ge N_1$ and all $n$, $|x_{ijn\infty}(t) - x_{ijnN}(t)| < \epsilon$.
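The uniform tail bound in ingredient (iii) is easy to check numerically. The sketch below compares the truncated Taylor series with the closed form $A\,(I - tA/n)^{-1}$ (assumed here as the exact solution of the rescaled system) and verifies that the gap stays below $K(Kt)^{N+1}/(1 - Kt)$. Function names and parameter values are illustrative.

```python
import numpy as np

def partial_sum(a, t, big_n):
    """x_{ijnN}(t) for all (i, j): sum over k = 0..N of t^k a^(k+1) / n^k."""
    n = a.shape[0]
    m = (t / n) * a
    term, total = a.copy(), a.copy()
    for _ in range(big_n):
        term = term @ m                        # after k steps: t^k a^(k+1) / n^k
        total = total + term
    return total

def tail_bound(k_bound, t, big_n):
    """K (K t)^(N+1) / (1 - K t), valid for 0 <= K t < 1."""
    return k_bound * (k_bound * t) ** (big_n + 1) / (1.0 - k_bound * t)

rng = np.random.default_rng(6)
n, k_bound, t, big_n = 50, 1.0, 0.5, 5         # entries bounded by K = 1, and K t < 1
g = rng.uniform(-k_bound, k_bound, size=(n, n))
a = np.triu(g) + np.triu(g, 1).T

exact = a @ np.linalg.inv(np.eye(n) - (t / n) * a)   # full series x_{ijn∞}(t)
gap = np.max(np.abs(exact - partial_sum(a, t, big_n)))
print(f"max truncation gap = {gap:.3e}, bound = {tail_bound(k_bound, t, big_n):.3e}")
```

The bound does not depend on $n$, which is the property used to exchange the two limits.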

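To complete the proof of Lemma 2, we need to show that $\lim_{n\to\infty} x_{ijn\infty}(t)$ exists and is just $x_{ij\infty\infty}(t)$ on $[0, 1/K)$. Start by picking an $\epsilon > 0$. Then by (iii), there exists an $N_1$ such that if $N \ge N_1$ then $|x_{ijn\infty}(t) - x_{ijnN}(t)| < \epsilon$ for all $n$. Similarly, (ii) implies that there exists an $N_2$ such that if $N \ge N_2$, then $|x_{ij\infty\infty}(t) - x_{ij\infty N}(t)| < \epsilon$. Finally, let $N_3 = \max\{N_1, N_2\}$. Then by (i), we may choose an $n_1$ such that if $n \ge n_1$, then $|x_{ij\infty N_3}(t) - x_{ijnN_3}(t)| < \epsilon$. Now define the following events:

$$\begin{aligned}
E_1 &= \left[\, |x_{ij\infty\infty}(t) - x_{ij\infty N_3}(t)| < \epsilon \,\right] \\
E_2 &= \left[\, |x_{ij\infty N_3}(t) - x_{ijnN_3}(t)| < \epsilon \,\right] \\
E_3 &= \left[\, |x_{ijnN_3}(t) - x_{ijn\infty}(t)| < \epsilon \,\right] \\
E_4 &= \left[\, |x_{ij\infty\infty}(t) - x_{ijn\infty}(t)| < 3\epsilon \,\right].
\end{aligned} \tag{S5}$$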

Observe that, in similar form to Eq. S4, $(E_1 \cap E_2 \cap E_3) \subset E_4$, so $\Pr(E_4) \ge \Pr(E_1 \cap E_2 \cap E_3) = 1 - \Pr(E_1' \cup E_2' \cup E_3') \ge 1 - \Pr(E_1') - \Pr(E_2') - \Pr(E_3') = 1$ for all $n \ge n_1$. Thus, $|x_{ij\infty\infty}(t) - x_{ijn\infty}(t)| < 3\epsilon$ for all $n \ge n_1$ and $t \in [0, 1/K)$.

Unreferenced Results. In the main paper, we establish that $\lambda_1 > 0$ in probability by way of Wigner’s semicircle law. However, we can also show that $\lambda_1 > 0$ with high probability under a different set of assumptions about how $A$ is selected. Loosely speaking, the selection of the diagonal entries is more constrained in this alternative approach, whereas the selection of the off-diagonal entries is less so.

Suppose that all $a_{ij}$, $|i - j| = 1$, are chosen from the same distribution $F$ and all $a_{ii}$ are chosen from the same distribution $G$. All selections are independent for $i \le j$. In addition, assume that the density of $a_{11} - a_{12}$ is not entirely confined to negative values. Then we have the following theorem:

Theorem 3. $\Pr(\lambda_1 \le 0)$ is exponentially small in $n$.

Proof: If $\lambda_1 \le 0$, then the corresponding matrix $A$ is negative semidefinite. Therefore $v^T A v \le 0$ for every vector $v$ ($T$ denotes transposition). Let $v_k$ denote the vector with $+1$ and $-1$ in its $(2k-1)$th and $2k$th components, respectively, and 0 in all other components. Then the event that $v_k^T A v_k \le 0$ is equivalent to the event that $a_{(2k-1)(2k-1)} - a_{(2k-1)2k} - a_{2k(2k-1)} + a_{2k\,2k} \le 0$. Note that the left-hand side of this final inequality has at least a constant probability of being positive. Now as $k$ ranges from 1 to $n/2$, we encounter $n/2$ independent events, each having at least a constant probability of failure. Hence the probability that $A$ is negative semidefinite is exponentially small in $n$.
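The witness vectors $v_k$ used in this proof are simple to evaluate in code. The sketch below computes all $v_k^T A v_k$ at once for a sample matrix; any positive value certifies that $A$ is not negative semidefinite and hence that $\lambda_1 > 0$. The Gaussian entries and the helper name `pair_quadratic_forms` are assumptions for illustration, not the distributions $F$ and $G$ of the theorem.

```python
import numpy as np

def pair_quadratic_forms(a):
    """v_k^T A v_k for v_k = e_{2k-1} - e_{2k} (1-based), k = 1, ..., floor(n/2)."""
    idx = np.arange(0, a.shape[0] - 1, 2)       # 0-based position of the +1 component
    return a[idx, idx] - a[idx, idx + 1] - a[idx + 1, idx] + a[idx + 1, idx + 1]

rng = np.random.default_rng(8)
n = 40
g = rng.normal(0.0, 1.0, size=(n, n))           # illustrative entry distribution
a = np.triu(g) + np.triu(g, 1).T

q = pair_quadratic_forms(a)
lam1 = np.linalg.eigvalsh(a)[-1]
print(f"{int(np.sum(q > 0))} of {len(q)} witness vectors give a positive quadratic form")
print(f"lambda_1 = {lam1:.3f}")
```

Since each $v_k$ has squared norm 2, the Rayleigh quotient gives $\lambda_1 \ge v_k^T A v_k / 2$ for every $k$.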


Very roughly, the final result of this supporting text says that in the case of negative $\mu$, the distribution of the sum of the components of $\omega_1 = (v_1, \ldots, v_n)$ retains no more than constant width in the large-$n$ limit. So, for example, the mean of the components of $\omega_1$ must shrink to zero at least as fast as $1/n$. (Note that this convergence is faster than the $1/\sqrt{n}$ convergence of means for independent and identically distributed random variables with finite mean and variance.) This result is consistent with the picture that when $\mu$ is negative, the system is destined for two-sided conflict in the large-$n$ limit.

To make the proof less cumbersome, suppose that the diagonal $F_{ii}$ have common expectation $\mu$ and variance $\sigma^2$ just like the off-diagonal $F_{ij}$.

Theorem 4. If $\mu < 0$, then $\lim_{n\to\infty} \Pr(|n^{-\delta} \sum_{i=1}^{n} v_i| < \epsilon) = 1$ for all $\delta, \epsilon > 0$.

Proof: Consider the eigenvalue equation $\sum_{j=1}^{n} a_{ij} v_j = \lambda_1 v_i$. Sum both sides over $i$ and then rearrange and switch index labels to obtain

$$\sum_{i=1}^{n} v_i \sum_{j=1}^{n} a_{ij} = \lambda_1 \sum_{i=1}^{n} v_i. \tag{S6}$$

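We anticipate from standard results of random matrix theory that the left side of Eq. S6 is asymptotic to $\mu n \sum_{i=1}^{n} v_i$ and the right side to $2\sigma\sqrt{n} \sum_{i=1}^{n} v_i$. So let us define the two events

$$\begin{aligned}
E_1 &= \left[\, \left| n^{-1-\delta} \sum_{i=1}^{n} v_i \sum_{j=1}^{n} a_{ij} - \frac{\mu}{n^{\delta}} \sum_{i=1}^{n} v_i \right| \le \epsilon \,\right] \\
E_2 &= \left[\, \left| \frac{\lambda_1}{n} \sum_{i=1}^{n} v_i - \frac{2\sigma}{\sqrt{n}} \sum_{i=1}^{n} v_i \right| < \epsilon \,\right],
\end{aligned} \tag{S7}$$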

where $[\cdot]$ denotes an event. Now, $\Pr(E_1 \cap E_2) = 1 - \Pr(E_1' \cup E_2') \ge 1 - \Pr(E_1') - \Pr(E_2')$, where we have written $E_x'$ for the complementary event of $E_x$. So if we can show that $\lim_{n\to\infty} \Pr(E_1) = 1$ and $\lim_{n\to\infty} \Pr(E_2) = 1$ for all $\delta, \epsilon > 0$, it follows that $\lim_{n\to\infty} \Pr(|\mu n^{-\delta} \sum_{i=1}^{n} v_i - 2\sigma n^{-1/2-\delta} \sum_{i=1}^{n} v_i| < \epsilon + \epsilon n^{-\delta}) = 1$, which implies Theorem 4.

Hence our task reduces to showing that $\lim_{n\to\infty} \Pr(E_1) = \lim_{n\to\infty} \Pr(E_2) = 1$. First consider $\Pr(E_1)$. By Jensen’s inequality (2) and normalization of $\omega_1$, $(n^{-1} \sum_{i=1}^{n} |v_i|)^2 \le n^{-1} \sum_{i=1}^{n} v_i^2 = n^{-1}$, so

$$\left| \sum_{i=1}^{n} v_i \right| \le \sum_{i=1}^{n} |v_i| \le \sqrt{n}. \tag{S8}$$

Now define the additional event:

$$E_3 = \left[\, \max_{1 \le i \le n} \left| n^{-1/2-\delta} \sum_{j=1}^{n} (a_{ij} - \mu) \right| \le \epsilon \,\right]. \tag{S9}$$

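By Eq. S8, we have $E_3 \subset [\max_{1 \le i \le n} |n^{-1/2-\delta} \sum_{j=1}^{n} (a_{ij} - \mu)|\, n^{-1/2} \sum_{i=1}^{n} |v_i| \le \epsilon] \subset [n^{-1-\delta} \sum_{i=1}^{n} |v_i| |\sum_{j=1}^{n} (a_{ij} - \mu)| \le \epsilon] \subset E_1$. The Bernstein inequality and a union bound together imply that $\lim_{n\to\infty} \Pr(E_3) = 1$, because they give that

$$\Pr(E_3') \le 2n \exp\left( \frac{-n^{2\delta} \epsilon^2 / 2}{\sigma^2 + (K + \mu)\, n^{-1/2+\delta} \epsilon / 3} \right). \tag{S10}$$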

So, $\lim_{n\to\infty} \Pr(E_1) = 1$ as desired.

Regarding $\Pr(E_2)$, we know that $\lambda_1 \in 2\sigma\sqrt{n} + o(\sqrt{n})$ in probability (3), which implies that $\lim_{n\to\infty} \Pr(|\lambda_1/\sqrt{n} - 2\sigma| < \epsilon) = 1$ and therefore $\lim_{n\to\infty} \Pr(E_2) = 1$ for all $\epsilon > 0$ by Eq. S8. This completes the proof.
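A single numerical sample can illustrate both the identity in Eq. S6 and the smallness of $\sum_i v_i$ when $\mu < 0$. The parameter values and uniform entry distribution below are assumptions for illustration; the check is not a substitute for the limit statement of Theorem 4.

```python
import numpy as np

rng = np.random.default_rng(7)
n, mu, sigma = 400, -3.0, 1.0                  # illustrative values with mu < 0
half_width = np.sqrt(3.0) * sigma
g = rng.uniform(mu - half_width, mu + half_width, size=(n, n))
a = np.triu(g) + np.triu(g, 1).T               # diagonal shares mean mu, as assumed for Theorem 4

vals, vecs = np.linalg.eigh(a)
lam1, v = vals[-1], vecs[:, -1]                # top eigenpair of A

lhs = v @ a.sum(axis=1)                        # sum_i v_i sum_j a_ij
rhs = lam1 * v.sum()                           # lambda_1 sum_i v_i
component_sum = abs(v.sum())
print(f"Eq. S6: lhs = {lhs:.6f}, rhs = {rhs:.6f}")
print(f"|sum_i v_i| = {component_sum:.4f}, compared with sqrt(n) = {np.sqrt(n):.1f} from Eq. S8")
```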

Additional References. Examples of two-sided social conflicts appear in a wide range of settings, including political party coalitions (4–8), interstate wars (9, 10), corporate standard setting (11), intertribal feuding (12), and even experiments in which group membership is based on arbitrary criteria (13–17).

1. Kułakowski K, Gawroński P, Gronek P (2005) The Heider balance—A continuous approach. Int J Mod Phys C 16:707–716.
2. Doob JL (1994) Measure Theory (Springer, New York), see p 87.
3. Füredi Z, Komlós J (1981) The eigenvalues of random symmetric matrices. Combinatorica 1:233–241.
4. Duverger M (1954) Political Parties: Their Organization and Activity in the Modern State (Wiley, New York), see p 217.
5. Riker WH (1962) The Theory of Political Coalitions (Yale Univ Press, New Haven, CT), see pp 174–189.
6. Riker WH (1982) The two-party system and Duverger’s law: An essay on the history of political science. Am Polit Sci Rev 76:753.
7. Benoit K (2007) Electoral laws as political consequences: Explaining the origins and change of electoral institutions. Annu Rev Polit Sci 10:363.
8. Sagar DJ, ed. (2009) Political Parties of the World (John Harper, London).
9. Moore M (1979) Structural balance and international relations. Eur J Soc Psychol 9:323.
10. Altfeld MF, de Mesquita BB (1979) Choosing sides in war. Int Stud Quart 23:87.
11. Axelrod R, Mitchell W, Thomas RE, Bennett DS, Bruderer E (1995) Coalition formation in standard-setting alliances. Manage Sci 41:1493.
12. Chagnon NA (1988) Life histories, blood revenge, and warfare in a tribal population. Science 239:985.
13. Sherif M (1966) In Common Predicament: Social Psychology of Intergroup Conflict and Cooperation (Houghton Mifflin, Boston).
14. Tajfel H, Turner JC (1979) The Social Psychology of Intergroup Relations, eds. Austin WG, Worchel S (Brooks/Cole, Monterey, CA), p 33. See pp 38–40.
15. Brewer MB (1979) In-group bias in the minimal intergroup situation: A cognitive-motivational analysis. Psychol Bull 86:307.
16. Fiske ST (2002) What we know now about bias and intergroup conflict, the problem of the century. Curr Directions Psychol Sci 11:123.
17. Wildschut T, Pinter B, Vevea JL, Insko CA, Schopler J (2003) Beyond the group mind: A quantitative review of the interindividual-intergroup discontinuity effect. Psychol Bull 129:698.
