# A critical point for random graphs with a given degree...

### Transcript of A critical point for random graphs with a given degree...

A Critical Point for Random Graphs - with a Given Degree Sequence

Michael Molloy Department of Mathematics, Carnegie Mellon University, Pittsburgh, PA 152 13

Bruce Reed Equipe Combinatoire, CNRS, Universite Pierre et Marie Curie, Paris, France

ABSTRACT

Given a sequence of nonnegative real numbers A,, A , , . . . which sum to 1, we consider random graphs having approximately Ain vertices of degree i . Essentially, we show that if C i(i - 2)A, > 0, then such graphs almost surely have a giant component, while if C i(i - 2)A, < 0, then almost surely all components in such graphs are small. We can apply these results to Gn,p, Gn,M, and other well-known models of random graphs. There are also applications related to the chromatic number of sparse random graphs. @ 1995 John Wiley & Sons, Inc.

1. INTRODUCTION AND OVERVIEW

In this paper we consider two parameters of certain random graphs: the number of vertices and the number of cycles in the largest component. Of course, the behavior of these parameters depends on the probability distribution from which the graphs are picked. In one standard model we pick a random graph G",,,, with n vertices and M edges where each graph is equally likely. We are interested in what happens when we choose M as a function of n and let n go to infinity. The point M = i n is referred to as the critical point or the double-jump threshold because of classical results due to Erdos and RCnyi [8] concerning the dramatic changes which occur to these parameters at this point. If M = cn + o(n) for c < +, then almost surely (i.e., with probability tending to 1 as n tends to infinity) G".,,,

Random Structures and Algorithms, Vol. 6, Nos. 2 and 3 (1995) @ 1995 John Wiley & Sons, Inc. CCC 1042-9832/95/030161-19

161

MOLLOY AND REED 162

has no component of size greater than O(log n), and no component has more than one cycle. If M = l n + o(n) , then almost surely (a.s.) the largest component of Gn.M has size O(n ). If M = c n for c > + , then there are constants ~ , 6 > O dependent on c such that a.s. Gn,M has a component on at least En vertices with at least 6n cycles, and no other component has more than O(1og n) vertices or more than one cycle. This component is referred to as the giant component of G , , M . For more specifics on these two parameters at and around M = i n , see [3], [ll], or

In this paper, we are interested in random graphs with a fixed degree sequence where each graph with that degree sequence is chosen with equal probability. Of course, we have to say what we mean by a degree sequence. If the number of vertices in our graph, n , is fixed, then a degree sequence is simply a sequence of n numbers. However, we are concerned here with what happens asymptotically as n tends to infinity, so we have to look at a “sequence of sequences.” Thus, we generalize the definition of degree sequence:

1 / 3

~ 4 1 .

Definition. A n asymptotic degree sequence is a sequence of integer-valued functions 9 = d,(n) , d , ( n ) , . . . such that

1. d,(n) = 0 for i I n ; 2. ClrO d,(n) = n.

Given an asymptotic degree sequence 9, we set 9,, to be the degree sequence {cl, c 2 , . . . , c , , } , where cj I C ~ + ~ and I { j : cj = i } l = d,(n) for each i 2 0 . Define Ran to be the set of all graphs with vertex set [n] with degree sequence 9,,. A random graph on n vertices with degree sequence 9 is a uniformly random member of Ran.

Definition. A n asymptotic degree sequence 9 is feasible if Ran # 0 for all n 2 1.

In this paper, we will only discuss feasible degree sequences. Because we wish to discuss asymptotic properties of random graphs with

degree sequence 9, we want the sequences 9,, to be in some sense similar. We do this by insisting that for any fixed i , the proportion of vertices of degree i is roughly the same in each sequence.

Definition. such that limn+- d, (n) /n = A i .

A n asymptotic degree sequence 9 is smooth if there exist constants Ai

Throughout this paper, all asymptotics will be taken as n tends to 00, and we only claim things to be true for sufficiently large n.

In the past, the most commonly studied random graphs of this type have been random regular graphs. Perhaps the most important recent result is by Robinson and Wormald [19,20], who proved that if G is a random k-regular graph for any constant k 2 3, then G is a.s. Hamiltonian.

Another motivation for studying random graphs on a fixed degree sequence comes from the analysis of the chromatic number of sparse random graphs. This is because a minimally (r + 1)-chromatic graph must have minimum degree at least

A CRITICAL POINT FOR RANDOM GRAPHS 163

r. In an attempt to determine how many edges were necessary to force a random graph to a.s. be not 3-colorable, Chvfital [7] studied the expected number of subgraphs of minimum degree 3 in random graphs with a linear number of edges. He showed that for c < c* = 1.442. . . , the expected number of such subgraphs in Gn,M=cn is exponentially small, while for c > c * the expected number of such subgraphs in Gn,M=cn is exponentially large. In the work that motivated the results of this paper, the authors used a special case of the main theorem of this paper to show that the probability that a random graph on n vertices with minimum degree three and at most 1.793n edges is minimally 4-chromatic is exponentially small [18]. We used this to show that, for c a little bit bigger than c*, the expected number of minimally 4-chromatic subgraphs of Gn,M=cn is exponentially small. This suggests that determining the minimum value of c for which a random graph with cn edges is a s . 4-chromatic may require more than a study of the subgraphs with minimum degree 3.

Recently tuczak [14] showed (among other things) that if G is a random graph on a fixed degree sequence*, with no vertices of degree less than 2, and at least @(n) vertices of degree greater than 2, then G a.s. has a unique giant component. Our main theorem also generalizes this result.

We set Q(9) = C i 2 , i(i - 2)Ai. Essentially, if Q(9) > 0, then a random graph with degree sequence 9 a s . has a giant component, while if Q(9) < 0, then all the components of such a random graph are a s . quite small. Note how closely this parallels the phenomenon in the more standard model Gn,M.

Note further that these results allow us to determine a similar threshold for any model of random graphs as long as: (i) We can determine the degree sequence of graphs in the model with reasonable accuracy, and (ii) once the degree sequence is determined, every graph on that degree sequence is equally likely. Gn,p is such a model, and thus (as we see later), our results can be used to verify the previously known threshold for Gn,p.

Before defining the parameter precisely, we give an intuitive explanation of why it determines whether or not a giant component exists. Suppose that 9,, has ( Ai + o( 1))n vertices of degree i for each i 2 0. Pick a random vertex in our graph and expose the component in which it lies using a branching process. In other words, expose its neighbors, and then the neighbors of its neighbors, repeating until the entire component is exposed. Now when a vertex of degree i is exposed, then the number of “unknown” neighbors increases by i - 2. The probability that a certain vertex is selected as a neighbor is proportional to its degree. Therefore, the expected increase in the number of unknown neighbors is (roughly) Cir, i(i - 2)Ai. This is, of course, Q(9).

Thus, if Q(9) is negative, then the component will a s . be exposed very quickly. However, if it is positive then the number of unknown neighbors, and thus the size of the component, might grow quite large. This gives the main thrust of our arguments. We will now begin to state all of this more formally.

There are a few caveats, so in order for our results to hold, we must insist that the asymptotic degree sequences we consider are well behaved. In particular, when the maximum degree in our degree sequence grows with n , we can run into some problems if things do not converge uniformly. For example, if d , ( n ) = n -

* He did not use the asymptotic degree sequence introduced here, but the results translate.

164 MOLLOY AND REED

[n”] , di(n) = [n”] if i = [GI, and d,(n) = 0 otherwise, then A, = 1, and A; = 0 for i > 1, and we get Q(9) = -1. However, this is deceiving as there are enough vertices of degree G to ensure that a giant component containing n - o(n) vertices a s . exists.

Definition. An asymptotic degree sequence 9 is well-behaved i f :

1. 9 is feasible and smooth. 2. i(i - 2)di(n)/n tends uniformly to i(i - 2)Ai; i.e., for all E > 0 there exists N

such that for all n > N and for all i L 0.

i(i - 2)d,(n) - i(i - 2)Ai I < E .

3.

L ( 9 ) =lim i(i - 2)di(n)/n n-P .

1 2 1

exists, and the sum approaches the limit uniformly; i.e.: ( a ) If L ( 9 ) is finite then for all E > 0 there exists i* , N such that for all

n > N : i*

i(r - 2)di(n)/n - L ( 9 ) < E .

( b ) I f L ( 9 ) is finite, then, for all T > 0, there exists i*, N such that for all

I? n > N

i’

i(i - 2)di(n)/n > T . i = l

We note that it is an easy exercise to show that if 9 is well behaved, then

L(9) = Q @ a > . It is not surprising that the threshold occurs when there are a linear number of edges in our degree sequence. We define such a degree sequence as sparse:

Definition. o( 1 ) for some constant K .

A n asymptotic degree sequence 9 is sparse if C i z 0 id,(n)/n = K +

Note that for a well-behaved asymptotic degree sequence 9, if Q(9) is finite, then 9 is sparse.

The main result in this paper is the following:

Theorem 1. Let 9 = d,(n), d,(n), . . . be a well-behaved sparse asymptotic degree sequence for which there exists E > 0 such that for all n and i > n1’4-r, di(n) = 0. Let G be a graph with n vertices, di(n) of which have degree i , chosen uniformly at random from among all such graphs. Then:

a. If Q(9) > 0 then there exist constants l,, l2 > 0 dependent on 9 such that G a s . has a component with at least l l n vertices and C2n cycles. Furthermore, i f

A CRITICAL POINT FOR RANDOM GRAPHS 165

Q(9) is finite, then G a.s. has exactly one component of size greater than y logn for some constant y dependent on 9.

b. I f Q(9) < O and for some function 0 5 o ( n ) 5 n”’-‘, d i (n) = 0 for all i 2 w(n), then, for some constant R dependent on Q(9), G a.s. has no component with at least Rw(n)* logn vertices, and a.s. has fewer than 2 Rw(n)’ log n cycles. Also, a.s. no component of G has more than one cycle.

Consistent with the model Gn,M, we call the component referred to in Theorem l a a giant component.

Note that if Q(9) < 0, then Q(9) is finite. Note also that Theorem 1 fails to cover the case where Q(9 ) = 0. This is analogous to the case M = f n + o(n) in the model Gn,M, and would be interesting to analyze.

One immediate application of Theorem 1 is that if G is a random graph on a fixed well-behaved degree sequence with cn + o(n) edges for any c > 1 then G a.s. has a giant component, as there is no solution to

x i A , > 2 , x i ( i - 2 ) A i < 0 , CA,=l, O S A , 5 1 . 1 2 1 i z l 1 2 1

A major difficulty in the study of random graphs on fixed degree sequences is that it is difficult to generate such graphs directly. Instead it has become standard to study random configurations on a fixed degree sequence, and use some lemmas which allow us to translate results from one model to the other. The configuration model was introduced by Bender and Canfield [2] and refined by Bollobis [3] and also Wormald [21].

In order to generate a random configuration with n vertices and a fixed degree sequence, we do the following:

1. Form a set L containing deg(u) distinct copies of each vertex u . 2. Choose a random matching of the elements of L.

Each configuration represents an underlying multigraph whose edges are defined by the pairs in the matching. We say that a configuration has a graphical property P if its underlying multigraph does.

Using the main result in [17], it follows that the underlying multigraph of a random configuration on a degree sequence meeting the conditions of Theorem 1 is simple with probability tending to e - ” ( ” ) , for some A ( 9 ) < O ( n ” ’ - ‘ ) . The condition di (n) = 0 for all i > n”4-e is needed to apply this result. If Q(9) is finite, then A ( 9 ) tends to a constant.

Also, any simple graph G can be represented by IIUEvcG,deg(u)! configura- tions, which is clearly equal for all graphs on the same degree sequence and the same number of vertices.

This gives us the following very useful lemmas:

Lemma 1. If a random configuration with a given degree sequence 9 meeting the conditions of Theorem 1 [with Q(9) possibly unbounded] has a property P with probability at least 1 - zn for some constant z < 1, then a random graph with the same degree sequence a.s. has P.

166 MOLLOY AND REED

Lemma 2. If a random configuration with a given degree sequence 9 meeting the conditions of Theorem 1 a.s. has a property P, and if Q(9) < CQ, then a random graph with the same degree sequence a s . has P.

Using these lemmas, it will be enough to prove Theorem 1 for a random configuration.

The configuration model is very similar to the pseudograph model developed independently by Bollobis and Frieze [ 5 ] , Flajolet, Knuth, and Pittel [lo], and Chvital [7]. Both models are very useful when working with random graphs on a given degree sequence.

Having defined the precise objects that we are interested in, and the model in which we are studying them, we can now give a more formal overview of the proof. The remainder of this section is devoted to this overview. In the following two sections we give all the details of the proof. In Section 4, we see some applications of Theorem 1 : the aforementioned work concerning the chromatic number of sparse random graphs, and a new proof of a classical double-jump theorem, showing that this work generalizes that result. A reader who is not interested in the details of the proof might want to just finish this section and then skip ahead to the last one.

In order to examine the components of our random configuration, we will be more specific regarding the order in which we expose the pairs of the random matching. Given 9, we will expose a random configuration F on n vertices, d,(n) of which have degree i as follows:

At each step, a vertex all of whose copies are in exposed pairs is entirely exposed. A vertex some but not all of whose copies are in exposed pairs is partially exposed. All other vertices are unexposed. The copies of partially exposed vertices which are not in exposed pairs are open.

1. Form a set of L consisting of i distinct copies of each of the d,(n) vertices which have degree i .

2. Repeat until L is empty: a. Expose a pair of F by first choosing any member of L, and then choosing

b. Repeat until there are no partially exposed vertices: its partner at random. Remove them from L.

Choose an open copy of a partially exposed vertex, and pair it with another randomly chosen member of L. Remove them both from L.

All random choices are made uniformly. Essentially we are exposing the random configuration one component at a

time. When any component is completely exposed, we move on to a new one; i.e., we repeat step 2a.

It is clear that every possible matching among the vertex-copies occurs with the same probability under this procedure, and hence this is a valid way to choose a random configuration.

Note that we have complete freedom as to which vertex we pick in Step 2a. In a few places in this paper, it will be important that we take advantage of this freedom, but in most cases we will pick it randomly in the same manner in which

A CRITICAL POINT FOR RANDOM GRAPHS 167

we pick all the other vertex-copies, i.e., unless we state otherwise, we will always just pick a uniformly random member of L.

Now, let Xi represent the number of open vertex-copies after the ith pair is exposed. If the neighbor of u chosen in step 2b is of degree d, then Xi goes up by d - 2. Each time a component is completely exposed and we repeat step 2a, if the pair exposed in step 2a involves vertices of degree d , and d,, then Xi is set to a value of d , + d , - 2.

Note that if the number of vertex-copies in L which are copies of vertices of degree d is r d , then the probability that we pick a copy of a vertex of degree d in step 2b is r d / C i 2 , r i . Therefore, initially the expected change in Xi is approximate- lY

jzl

Therefore, at least initially, if this value is positive then Xi follows a Markov process very close in distribution to the well-studied “drunkard’s walk,” with an expected change of Q(9))lK. Since Xi+l r X, - 1 always, a standard result of random walk theory (see, for example, [9]) implies that if Q(9 ) > 0, then after O(n) steps, Xi is a.s. of order O(n).

It follows that our random configuration a s . has at least one component on O(n) vertices. We will see that such a component a s . has at least O(n) cycles in it, and this will give us the first part of Theorem 1. We will also see that if Q(9 ) is bounded, then this giant component is a s . unique.

On the other hand, if Q(9) < 0, then Xi a s . returns to zero fairly quickly, and this will give us the other part of Theorem 1, as the sizes of the components of F are bounded above by the distances between values of i such that Xi = 0.

Of course, the random walk followed by Xi is not really as simple as this. There are three major complications:

1. A pure random walk can drop below 0. Whenever Xi reaches 0, it resets itself to a positive number.

2. We neglected to consider that the second vertex-copy chosen in Step 2b might be an open vertex-copy in which case Xi decreases by 2. We will call such a pair of vertex-copies a backedge.

3. As more and more vertices are exposed, the ratio of the members of L which are copies of vertices of degree d shifts, and the expected increase of X, changes.

These complications are handled as follows:

1. This will increase the probability of Xi growing large, and so this only poses a potential problem in proving part (b). In this case, we will show that the probability of a component growing too big is of order o(n-’ ) , and hence even if we “try again” n times, this will a.s. never happen.

2. We will see that this a s . doesn’t happen often enough to pose a serious problem, unless the partially exposed component is already of size @(n).

168 MOLLOY AND REED

3. In proving part a, we look at our component at a time when the expected increase in Xi is still at least + its original value. We will see that the component being exposed at this point is a s . a giant component. In proving part b, it is enough to consider the configuration after o(n) steps. At this point, the expected increase hasn’t changed significantly.

This is a rough outline of the proof. We will fill in the details in the next two sections.

2. GRAPHS WITH NO LARGE COMPONENTS

In this section we will prove that the analogue of Theorem l b holds for random configurations. Lemma 1 will then imply that it holds for random graphs. We will first prove that if F is a random configuration meeting the conditions given in Theorem lb , then F a s . does not have any large components.

Given Q(9) < 0, set v = -Q(9)//K and set R = 150/vz.

Lemma 3. Let F be a random configuration with n vertices and degree sequence 9, meeting the conditions of Theorem 1. If Q(9) < 0 and if, for some function 0 5 w(n) 5 nl”-‘, F has no vertices of de ree greater than w(n), then F a.s. has no components with more than a = [Rw(n) logn] vertices. B

The following theorem of Azuma will play an important role:

Azuma’s inequality [l]. Let 0 = X,,, . . . , X, be a martingale with

IX;,, - Xil 5 1

for all 0 5 i < n. Let A > 0 be arbitrary. Then

Pr[)X,I > A f i ] < e-*”’ . This yields the following very useful standard corollary.

Corollary. f(Z,, C,, . . . , 2,) be a random variable defined by these X i . I f for each i

Let C = C , , C,, . . . , Z, be a sequence of random events. Let f ( C ) =

max)E(f(Z)ICl ,Z, , . . . , C i , , ) - E ( f ( Z ) I Z l , ~ , , . . , Z i ) I 5 c i 2

where E( f ) denotes the expected value o f f , then the probability that I f - E( f ) l > t is at most

For more details on this corollary and an excellent discussion of martingale arguments see either [16] or [5 ] .

In order to prove Lemma 3, we will analyze the Markov process described in Section 1. Recall that Xi is the number of open vertex-copies after i pairs of our configuration have been exposed. Similarly, we let Y, be the number of backedges

A CRITICAL POINT FOR RANDOM GRAPHS 169

formed, and C, be the number of components that have been at least partially exposed during the first i steps. We also define W; to be the sum of deg(u) - 2 over all vertices u completely or partially exposed during the first i steps. We note that

Now W; “stalls” whenever a backedge is formed, and only changes whenever a new vertex is completely or partially exposed. For this reason, it is easier to analyze Wi when it is indexed not by the number of pairs exposed, but by the number of new vertices exposed. Thus we introduce another variable which does exactly this. We let Z j be the sum of deg(u) - 2 over the first j new vertices (partially or completely) exposed.

The reason that we are introducing Z, is that it has the same initial expected increase as X i , but behaves much more nicely. In particular, it is not affected by the first and second complications discussed at the end of Section 1. Specifically, if, after the first j vertices have been completely or partially exposed, there are exactly r i ( j ) unexposed vertices of degree i , then Z j + ] = Z j + (i - 2 ) with probability i r ; ( j ) / C i r i ( j ) .

Now in order to discuss Xi and Z j at the same time, we will introduce the random variable lj which is the number of pairs exposed by the time that the jth vertex is partially exposed; i.e., W, = Z j .

w; =xi + 2 y - 2 c i .

Recall that a = [Rw(n)’ log nl .

Lemma 4. probability that u lies on a component of size at least a is less than n-’.

Suppose that F is as described in Lemma 3 . Given any vertex u in F,

Proof. Here we will insist that u is the first vertex chosen in Step 2a. Therefore, the probability that u lies on a component that large is at most the probability that X I > 0 for all 1 I i I a. Thus, we will consider the probability of the latter.

Note that for any i , if C, = 1, then W, = X I + 2Y, - 2 I XI - 2. In fact, we can also get Z, 2 X I - 2. This is because at each iteration we either have a backedge or expose a new vertex. Thus in iteration i , we have exposed i - Y, new vertices, therefore, W, = ZIP,, , . Now Z, decreases by at most one at each step; therefore,

Now, if X I > 0 for all 1 I i I a, then C, = 1. Therefore, the probability that X I > 0 for all 1 I i I a is at most the probability that Z, > -2 . We will concen- trate on this probability, as Z, behaves much more predictably than X I .

Initially the expected increase in Z, is X I ? , i(i - 2)d,(n)/C, , , id,(n) = - V + o(1). We claim that, for j I a, the expected increase in Z, is less than - v / 2 .

This is true because the expected increase of Z, would be highest if the first j vertex-copies chosen were all copies of vertices of degree 1. If this were the case then the expected increase in Z, would be

z, 2zl-y, - y 2 w, - y rx , + r, - 2 r X , - 2.

- (d , (n) - j ) + C i(i - 2)d,(n) + o(1) = -v + o(1) I 2 2

( d , ( n ) -i) + c id ,@) 1 2 2

V I --

2 for sufficiently large n , as j = o(n) and id,(n)+ A, uniformly.

MOLLOY AND REED 170

Therefore, the expected value of Z,, is less than -;a + deg(v) < -;a. We will use the corollary of Azuma’s Inequality to show that 2, is a.s. very close to its expected value.

Z j will indicate the choice of the ith new vertex exposed, i = 1, . . . , a, and f ( Z ) = Z,. We need to bound

Suppose that we are choosing the ( i + 1)st vertex to be partially exposed. Let R be the set of unexposed vertices at this point. The size of R is n - i. For each x E R, define Ei+,(x) to be E(Z, 1 El , Z,, . . . , X i + , ) , where Z i + l is the event that x is the ( i + 1)st new vertex exposed. Consider any two vertices u, u E R. We will bound 1Ei+,(u) - Ej+,(u)l. Consider the order that the vertices in R - {u , u } are exposed. Note that the distribution of this order is unaffected by the positions of u , u .

Let S be the set of the first a - 2 vertices under this order, and let w be the next vertex. Now, Z, = Z j - , + (C,,, deg(x) - 2) + deg(y,) - 2 + deg(y,) - 2, where y , is the jth vertex exposed (either u or u ) and y, is either u , u , or w . Therefore, the most that choosing between u, u can affect the conditional expected value of Z, is twice the maximum degree, i.e., l E l + , ( u ) - E j + , ( v ) 1 ~ 2w(n).

Since

E ( f ( Z ) l Z , , X 2 , . . . , C j ) = Pr{xischosen} x Ej+,(x), X E n

we have that

IE(f (Z) 1x1, z,, . . . 3 Z;+ , ) - E ( f ( Z ) IC,, z,, * . . 3 C;)lS 2 4 n ) .

Therefore, by the corollary of Azuma’s Inequality, the probability that Z,, > 0 is at most

< n-’

And now Lemma 3 follows quite easily:

0

Proof of Lemma 3 . By Lemma 4, the expected number of vertices which lie on 0 components of size at least a is o(1). Therefore a.s. none exist.

We also get the following corollary:

Corollary 1. exposure of our configuration.

Under the same conditions as Lemma 3 , a s . Xi < 2a throughout the

Proof. Because Xi drops by at most 2 at each step, if it ever got that high, it 0 would not be able to reach 0 within Rw (n)’ log n steps.

A CRITICAL POINT FOR RANDOM GRAPHS 171

We will now show that F a s . does not have many cycles. First, we will see that it a s . has no multicyclic components.

Lemma 5. Lemma 3. F a.s. has no component with at least 2 cycles.

Let F be a random configuration meeting the same conditions as in

Proof. Choose any vertex u. Let E, be the event that u lies on a component of size at most a with more than one cycle, and that throughout the exposure of this component, X i < 2a.

We will insist that u is the first vertex examined under Step 2a. If the size of the first component is at most a, then the second backedge must be chosen within at most a + 2 steps. Therefore, the probability that E holds is less than

as w(n) < nl’+‘ . Therefore, the expected number of vertices for which E, holds is o( 1 ) and SO

the probability that E, holds for any u is o(1). Therefore, by Lemma 3 and Corollary 1, a.s. no components of F have more than one cycle. 0

We can now show that F a s . does not have many cycles, by showing that it a s . does not have many cyclic components.

Lemma 6. Lemma 3. F a.s. has less than 2a logn cycles.

Let F be a random configuration meeting the same conditions as in

Proof. We will show that a.s. throughout the exposure of F, at most 2a logn backedges are formed. The rest will then follow, since by Lemma 5 , a.s. no component contains more than one cycle, and so a s . the number of cycles in F is exactly the number of backedges.

First we must define a set Bj of unmatched vertex-copies: For each i , if there are more than 2a open vertex-copies at the ith iteration, then let B; consist of any 2a of them. Otherwise, let B; consist of the open vertex-copies and enough arbitrarily chosen members of L to bring the size of B; up to a. Of course, if L is too small to do this, then we will just add all of L to B;. Let T, be the event that a member of B; is chosen in step i .

Clearly the number of backedges formed is at most the number of successful T,’s, plus the number of backedges formed at times when X j > 2 a . Now, by Corollary 1 , we know that there are a.s. none of the latter type of backedges, so we will concentrate on the number of the former type.

Now the number of vertex-copies to choose from is C j , , jd,(n) - 2i + 1 = M - 2i + 1 . Therefore, the probability of Ti holding is &, for M - 2i + 1 2 2a and 1 otherwise.

Therefore, the expected value of T, the number of successful Tj’s is (M - 2a) / 2 2a

E ( T ) = 2 a = C M - 2 i + 1 = a log(M)( 1 + o(1)) . i = 1

172 MOLLOY AND REED

Now we will use a second moment argument to show that T is a s . not much bigger than E(T):

= (E(T)2 + E(T))(l + o(1)).

Therefore, by Chebyshev's inequality, the probability that T > 1.5a log(M) is at most 1/(4E(T))(1+ o(1)) = o(1).

Therefore, a s . the number of backedges formed is less than 1.5a log(M) < 2a log n , proving the result.

And now we can prove Theorem lb.

Proof of Theorem lb. This clearly follows from Lemmas 2,3,5, and 6. 0

3. GRAPHS WITH GIANT COMPONENTS

In this section we will prove the analogue of Theorem l a for random configura- tions. Lemmas 1 and 2 will then imply that Theorem l a holds.

First we will show that a giant component exists with high probability:

Lemma 7. Let F be a random configuration with n vertices and degree sequence 9" meeting the conditions of Theorem 1. I f Q(9) > 0, then there exist constants 5, , 5, > 0 dependent on 9 such that F a s . has a component with at least 5,n vertices and 12n cycles. Moreover, the probability of the converse is at most z" , for some fixed O < z < 1.

Throughout this section we will assume that the conditions of Lemma 7 hold. As in Section 2, we will prove Lemma 7 by analyzing the Markov process discussed in the previous section. Again, the key will be to concentrate on the random variable Z j .

Lemma 8. There exists 0 < E < 1, 0 < A < min( a, $) such that for all 0 < 6 < A a.s. Zlsn, >can. Moreover, the probability of the converse is at most (z,)", for somefixed O < z , < l .

Proof. For simplicity, we will assume that 6n is an integer. Initially, the probability that a vertex-copy of degree i is chosen as a partner is p,(n) = id,(n)/ C,,, jdj(n) = ih , /K + o(1).

Unlike in Section 2, we have to consider the behavior of our walk after O(n) steps. Thus we have to worry about the third complication described at the end of Section 2, i.e., the fact that the ratios of unexposed vertices of different degrees are shifting.

It turns out that this problem is much less serious if we can ignore vertices of high degree. So what we will do is show that we can find a value i*, such that if

we change Z j slightly by saying that every time a vertex of degree i > i* is chosen,

A CRITICAL POINT FOR RANDOM GRAPHS 173

we subtract 1 from Z, instead of adding i - 2 to it, then we will still have positive expected increase.

We will then show that we can find a sequence c$l, . . . , 4i. summing to one, such that for each 2 5 i 5 i * , 4; is a little less than the initial probability of a vertex of degree i being chosen. However, if we were to adjust Zj a little further by selecting a vertex of degree i with probability 4; at each step, then we would still have a positive expected increase.

We will call this “adjusted Zj” ZT. Clearly, if we find some J such that after J steps, the probability of choosing a vertex of degree i is still at least 4, for 2 5 i 5 i* , then the probability that Z, > R for any R is at least as big as the probability that ZT > R. We will concentrate on the second probability as ZT is much simpler to analyze.

More formally, what we wish to do is choose a sequence 41,. . . , A . such that:

1. c 4; = 1; 2. 0 < 4, < ih , /K , for 2 5 i 5 i*, unless 0 = 4, = ih , /K; 3. C l a l i(i - 2)4; > 0.

Note that

Set p i = ih , /K. Since 9 is well behaved and Q(9) > 0, there exists i* such that

Therefore, we can choose a sequence +1, . . . , c#+. such that for all 2 5 i 5 i*, and Cj:, ( i - l)~#+ = 1 +

Ei12 (i - l)p, > 1 + E ’ , for some E ’ > 0 and sufficiently large n.

p i > r$i > 0 unless p i = 4; = 0, 41 = 1 - 42 - 43 - * * - - ~ ‘ 1 2 . It follows that C i z l ( i - 2 ) 4 = ~ ‘ 1 2 .

Consider the random variable ZT which follows the following random walk: . z;=o 0 Z,?+l = ZT + (i - 2 ) with probability c#+, 1 5 i 5 i* .

For i = 2 , . . . , i*, choose any A i > O such that <ri, and set A = min{A2, . . . , Ai,, $}. Clearly, after at most A iterations, the probability of choosing a copy of a vertex of degree i L 2 is at least 4;. Therefore, for 0 5 j 5. An, the random variable Z j majorizes Z y ; i.e., for any R,

Pr[Zj > R] 2 Pr[ZT > R.]

Now the expected increase in ZT at any step is ~ ’ / 2 . Thus the lemma follows by letting E = ~ ‘ / 4 , as is well known (see, for example [9]) that Z,*, is a s . concentrated around its expected value which is 2&n, and that the probability of

0 deviating from the expected value by more than O(n) is as low as claimed.

We have just shown that Zj a.s. grows large. However, we really want to analyze Xi. In order to do this, recall that the random variable 4 is defined to be

174 MOLLOY AND REED

the number of pairs exposed by the time that the jth vertex is partially exposed; i.e., W, = Zi.

Lemma 9. There exists 0 < 6 ' < A such that for any 0 < 6 I 6 ' there a.s. exists some 1 I I 5 I, , , , , such that XI > yn, where y = min(~N2, +). Moreover, the probability of the converse is at most (z2),, for some 0 < z2 < 1, dependent on 6 .

Proof. For simplicity, we will assume that 6n is an integer. We will count W, the number of backedges formed before either X , > yn or I,, pairs have been exposed. We claim that we can choose 6 ' such that a.s. W < + n for 6 56'.

At any step i , 1 I i s I,,, the probability that an open vertex-copy is chosen is Kn 1 2, + o(l), regardless of the choices made previous to that step. Now I, I j + Y , , s ~ + $ I ( ~ + e a ) n .

Therefore, at each step, the probability that such a backedge is formed is 0 if X, > yn and at most

X

+ 6 E ' = K - 26 - 266

if Xi 5 yn. Thus the number of such copies chosen is majorized by the binomial variable

Therefore the lemma follows so long as PI,, s p ( 6 + ~ 6 ) n < $ n , which is

Now if Xi 5 yn for all 1 5 i < Z,, then W is equal to YIsn. Therefore XIS, = Z,, - 2YISn which with probability at least 1 - (2,)" is at least

0

B W P , 1 6 , ) .

equivalent to 46 + 466 < K, yielding 6 '.

S ~ / 2 n , which yields our result.

Now that we know that Xi a s . gets to be as large as @(n) , we can show that there is a s . a giant component:

Lemma 10. There exists ll, c2 > 0 such that the component being exposed at step I = I,,,,, will a.s. have at least l l n vertices and 12n cycles. Moreover, the probability of the converse is at most (z,)", for some fixed 0 < z3 < 1.

Proof. Note that at this point there are a.s. at least n - 26'n - yn > n / 5 unexposed vertices. Form a set p consisting of exactly one copy of each of them.

There is a set x of XI open vertex-copies whose partners must be exposed before this component is entirely exposed. We will show that a.s. at least l l n of these will be matched with members of p, and at least C2n of these will be matched with other open vertex-copies from x. Clearly this will prove the lemma.

Now there are M - 21 vertex-copies available to be matched. Our procedure for exposing F simply generates a random matching among them where each matching is equally probable. The expected number of pairs containing one vertex from each of x, p is at least +(A), and the expected number of pairs of open

The previous lemmas give us a lower bound of 211n,2&n on these numbers, and it follows from the Chernoff bounds that these numbers are a.s. at least half

vertex-copies which form an edge of F is (F - I)(&)'. X

A CRITICAL POINT FOR RANDOM GRAPHS 175

of their expected values with the probability of the converse as low as claimed. Therefore, the component a.s. has at least l l n vertices and at least 12n cycles. 0

And now Lemma 7 follows quite easily:

Proof of Lemma 7 . This is clearly a corollary of Lemma 10. 0

We will now see that F a s . has only one large component.

Lemma 11. If F is a random configuration as described in Lemma 7 , then F a s . has exactly one component on more than T log n vertices, for some constant T dependent on the degree sequence.

Proof. We have already shown that F a s . has at least one giant component of size at least 5,n. We will see here that no other components of F are large.

Consider any ordered pair of vertices (u , u ) . We say that (u , u ) has property A if u and u lie on components of size at least 5,n and T log n , respectively. We will show that for an appropriate choice of T, the probability that (u , u ) has property A is o(n-,), which is enough to prove the lemma.

Recall that we may choose any vertex-copy we wish to start the exposure with. We will choose u.

By Lemma 9, there a s . exists some Z S Z , ~ , ~ , , such that X, > y n , where y = min(+, y , i), and so we can assume this is to be the case. Note that if after I steps, we are not still exposing the first component, C,, or if we have exposed a copy of u , then (u , u ) does not have property A, so we will assume the contrary. Define x to be the set of open vertex-copies after I steps.

Here we will break from the standard method of exposure. We will start exposing u’s component, C,, immediately, and put off the exposure of the rest of C , until later. We will see that if C, gets too big, then it will a.s. include a member

We expose C, in the following way. We start by picking any copy of u , and exposing its partner. We continue exposing pairs, always choosing a copy of partially-exposed vertex is known to be in C, (if one is available), and exposing its partner. We check to see if this partner lies in x. This would imply that u lies in C,. Once C, is entirely exposed, if it is disjoint from C,, then we return to exposing the rest of C , and continue to expose F in the normal manner. Note that this is a valid way to expose F.

At each step, the probability that a member of x is chosen is at least y l K . Also, if u lies on a component of size greater than T log n, then it must take at least T log n steps to expose this component. Therefore, the probability that u lies on a component of size greater than T log n which is not C , is at most

of x.

T log n

(I - 5) = o(n-2)

for a suitable value of T.

zero as n + m , so a.s. none exist. Therefore, the expected number of pairs of vertices with property A tends to

0

1 76 MOLLOY AND REED

It only remains to be shown that F a.s. has no small components with more than one cycle.

Lemma 12. has no multicyclic component on at most T log n vertices, for any constant T.

If F is a random configuration as described in Lemma 7, then F a.s.

Proof. Consider the probability of some vertex u lying on such a component. We will insist that we expose an edge containing u in the first execution of Step 2a. Now, if this component contains at most T l o g n vertices, then it is entirely exposed after at most o(n1I4) steps as the maximum degree is n114-‘.

At each point during the exposure we can assume Xi < n114, as otherwise Xi would not be able to return to zero quickly enough. Therefore, at each step, the probability that a backedge is formed is at most o(l)n114/(M - 2n114) = o(L3I4) . Therefore, the probability that at least 2 cycles are formed is at most

Therefore the expected number of vertices lying on such components is o ( l ) , and hence a s . none exist. 0

And now we have Theorem la:

Proof of Theorem la . This clearly follows from Lemmas 1 ,2 ,7 ,11 , and 12. 0

It is worth noting that by analyzing the number of open vertices of each degree more carefully throughout the exposure of F, it is possible to compute the size of the giant component more precisely. In fact, we can find a ~ ( 9 ) such that the size of the giant component is a s . (1 + o(l)K(9)n. Details will appear in a future paper.

4. APPLICATIONS

Here are a few applications of Theorem 1. The first is a new proof of a classical result concerning the double-jump threshold:

Theorem 2. For Gn.M=cn a.s. does

Proof. It is well

c > 4, Gn.M=cn a.s. has a giant component, while for c < 3, not have one.

known (see, for example, [7]) that such a graph a s . has

0’ e -2cn + o(n.”) i!

vertices of degree i for each i 5 O(log nllog log n), and no vertices of higher degree.

Now expose G by first exposing its degree sequence, and then choosing a random graph on that degree sequence. We will a s . get a sequence 9 which satisfies all the conditions of Theorem 1 and for which

A CRITICAL POINT FOR RANDOM GRAPHS 177

(2c)' e-2c Q(9)=C i ( i - 2 ) - i! 7

i z l

which is positive for c > + and negative for c < 3. 0

Note that this only gives an upper bound of O( ~,~go~o;)~)2 ) for the size of the largest component of G for c < $, rather than the proper upper bound of O( log n).

As mentioned earlier, this work was motivated by the study of minimally 4-chromatic subgraphs of a random graph G. Recall that such a subgraph must have minimum degree at least 3. This is of interest mainly in the study of the chromatic number of sparse random graphs, as if x(G) 2 4 , then G must have a minimally 4-chromatic subgraph, H .

Chvatal [7] showed that if G is a random graph on n vertices and cn edges, then for c < c* = 1.442. . ., the expected number of subgraphs of G with minimum degree at least 3 tends to 0 with n, while, for c > c * , the expected number of such subgraphs is exponentially large in n.

The authors wished to find which such subgraphs could actually be minimally 4-chromatic graphs. We looked at the following condition of Gallai [12]:

Definition. ( L ( H ) ) is the subgraph induced by the vertices of degree r.

If H is a graph with minimum degree r , then the low graph of H

Theorem 3. then L ( H ) has no even cycles whose vertices d o not induce a clique.

If H is a minimally k-chromatic graph with minimum degree k - 1 ,

We used this to prove the following:

Theorem 4. Let H be a random graph on n vertices and at most 1.793n edges with minimal degree 3. H is a.s. not a minimally 4-chromatic graph. Moreover, the probability of failure is at most z", for some fixed 0 < z < 1.

Outline of Proof. L ( H ) is a graph whose vertices are all of degree 0 , 1 , 2 , 3 . We showed that the degree sequence of L ( H ) could be approximately determined by the edge-density of H , and that all graphs on that degree sequence were equally likely to appear as L ( H ) . It then followed from Theorem 1 that if H has edge-density at most 1.793, then L ( H ) a.s. has O(n) cycles. We then showed that a s . at least one of these cycles was even and of length at least 6, and the result followed from the fact that L ( H ) has no cliques on more than four vertices.

We used Theorem 4 to show:

Corollary 2. There exists S > 0 such that if G is a random graph o n n vertices and [Snl edges, then the expected number of minimally 4-chromatic subgraphs of G is exponentially small, while the expected number of subgraphs of G with minimum degree at least 3 is exponentially large in n.

Outline of Proof. It follows from the results of [7] that for c slightly larger than

MOLLOY AND REED 178

c*, the expected number of such subgraphs of G with edge-density at least 1.793 is exponentially small. 0

The details to Theorem 4 and Corollary 2 will appear in a future paper, and can also be found in [18].

It is worth noting that Frieze, Pittel, and the authors [18], used a different type of argument to show that for c < 1.756 a random graph with edge-density c a s . has no subgraph with minimum degree at least 3, and hence is a s . 3-colorable.

ACKNOWLEDGMENTS

The authors would like to thank Avrim Blum, Alan Frieze, Colin McCiarmid, Chris Small, and Mete Soner for their helpful comments and advice, and two anonymous referees for several suggestions of improvement.

REFERENCES

[l] K. Azuma, Weighted sums of certain dependent random variables, Tokuku Math. J.,

[2] E. A. Bender and E. R. Canfield, The asymptotic number of labelled graphs with

[3] B. Bollobas, Random Graphs, Academic, New York, 1985. [4] B. Bollobas, Martingales, Isoperimetric inequalities and random graphs, Cofloq.

[5] B. Bollobas and A. Frieze, On matchings and Hamiltonian cycles in random graphs,

[6] B. Bollobh and A. Thompson, Random graphs of small order, Random Graphs '83,

[7] V. Chvatal, Almost all graphs with 1.44n edges are 3-colorable, Random Struct. Alg.,

[8] P. Erdos and A. RCnyi, On the evolution of random graphs, Magayr Tud. Akad. Mat.

[Y] W. Feller, An Introduction to Probability Theory and its Applications, Vol. 1. Wiley,

[lo] P. Flajolet, D. Knuth, and B. Pittel, The first cycles in an evolving graph, Discrete

(111 P. Flajolet, D. Knuth, T. tuczak, and B. Pittel, The birth of the giant component,

[12] T. Gallai, Kritische graphen I, Magayr Tud. Akad. Mat. Kutato Int. Kozl. , 8 165-192

[13] T. tuczak, Component behaviour near the critical point of the random graph process,

[14] T. tuczak, Sparse random graphs with a given degree sequence, Random Graphs, 2,

[15] T. Luczak and J . Wierman, The chromatic number of random graphs at the double-

19, 357-367 (1967).

given degree sequences, J . Combinat. Theory (A), 24, 296-307 (1978).

Math. SOC. Jands Bofyai, 52, 113-139 (1987).

Ann. Discrete Math., 28, 23-46 (1985).

Ann. Discrete Math., 28, 47-97 (1985).

2, 11-28 (1991).

Kutato Int. Kozl., 5, 17-61 (1960).

New York, 1966.

Math. 75, 167-215 (1989).

Random Struct. Alg., 4, (1993).

(1963).

Random Struct. Alg., I, 287-310 (1990).

165-182 (1992).

jump threshold, Combinatorica, 9, 39-50 (1989).

A CRITICAL POINT FOR RANDOM GRAPHS 179

[16] C. McDiarmid, On the method of bounded differences, Surveys in Combinatorics,

[17] B. D. McKay, Asymptotics for symmetric 0-1 matrices with prescribed row sums, Ars

[18] M. Molloy, The chromatic number of sparse random graphs, Masters thesis,

[19] R. W. Robinson and N. C. Wormald, Almost all cubic graphs are Hamiltonian,

[20] R. W. Robinson and N. C. Wormald, Almost all regular graphs are Hamiltonian,

[21] N. C. Wormald, Some problems in the enumeration of labelled graphs, Doctoral

Proc. Twelfth British Combinatorial Conference, 1989, pp. 148-188.

Combinat., 19A, 15-25 (1985).

University of Waterloo, 1992.

Random Struct. Alg. 3, 117-126 (1992).

manuscript.

thesis, Newcastle University, 1978.

Received August 6, 1993 Accepted November 24, 1994