· 2019-02-07 · Fourth Colloquium on Mathematics and Computer Science DMTCS proc. AG, 2006,...

Fourth Colloquium on Mathematics and Computer Science DMTCS proc. AG, 2006, 59–118

Some exactly solvable modelsof urn process theory

Philippe Flajolet, Philippe Dumas, and Vincent Puyhaubert

Algorithms Project, INRIA, F-78153 Le Chesnay (France)

We establish a fundamental isomorphism between discrete-time balanced urn processes and certain ordinary differen-tial systems, which are nonlinear, autonomous, and of a simple monomial form. As a consequence, all balanced urnprocesses with balls of two colours are proved to be analytically solvable in finite terms. The corresponding generatingfunctions are expressed in terms of certain Abelian integrals over curves of the Fermat type (which are also hyper-geometric functions), together with their inverses. A consequence is the unification of the analyses of many classicalmodels, including those related to the coupon collector’s problem, particle transfer (the Ehrenfest model), Friedman’s“adverse campaign” and Polya’s contagion model, as well as the OK Corral model (a basic case of Lanchester’s theoryof conflicts). In each case, it is possible to quantify very precisely the probable composition of the urn at any discreteinstant. We study here in detail “semi-sacrificial” urns, for which the following are obtained: a Gaussian limitingdistribution with speed of convergence estimates as well as a characterization of the large and extreme large deviationregimes. We also work out explicitly the case of 2-dimensional triangular models, where local limit laws of the stabletype are obtained. A few models of dimension three or greater, e.g., “autistic” (generalized Polya), cyclic chambers(generalized Ehrenfest), generalized coupon-collector, and triangular urns, are also shown to be exactly solvable.

Contents1 Urns and differential systems 61

1.1 Urn models [α, β, γ, δ] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 611.2 The associated differential system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

2 Basic examples of 2× 2 urns with entries in 0,±1 662.1 The contagion model (Polya): [1, 0, 0, 1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 672.2 The adverse-campaign model (Friedman): [0, 1, 1, 0] . . . . . . . . . . . . . . . . . . . . . . . . . . . 682.3 The two-chambers urn (Ehrenfest): [−1, 1, 1,−1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 692.4 The coupon collector’s urn: [−1, 1, 0, 0] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 712.5 The sampling urns: [0, 0, 0, 0] and [−1, 0, 0,−1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 722.6 The records urn [1, 0, 1, 0] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 732.7 The all-ones urn: [1, 1, 1, 1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

3 The first integral of balanced 2× 2 urns 74

4 General solutions for 2× 2 sacrificial urns (α ≤ −1) 75

5 The class of 2× 2 semi-sacrificial urns (α ≤ −1, δ > 0) 815.1 The analytic structure of the base functions S, C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 815.2 Probabilistic properties of semi-sacrificial urns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

5.2.1 Extreme large deviations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 855.2.2 A Gaussian limit law with speed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 855.2.3 Moments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 875.2.4 Large deviations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

5.3 The algebraic family [−1, σ + 1, σ − 1, 1] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 895.4 The algebraic family [−1, σ + 1, 1, σ − 1] (σ even) . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

6 The class of 2× 2 fully sacrificial urns (α, δ < −1) 906.1 Analytic solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 906.2 Elliptic urns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

1365–8050 c© 2006 Discrete Mathematics and Theoretical Computer Science (DMTCS), Nancy, France

http://www.dmtcs.org/proceedings/

http://www.dmtcs.org/proceedings/dmAGind.html

60 Philippe Flajolet, Philippe Dumas, and Vincent Puyhaubert

7 Triangular 2× 2 urns (γ = 0) 937.1 Generating functions and the exact form of probabilities . . . . . . . . . . . . . . . . . . . . . . . . . 947.2 Moments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 957.3 Local limit laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 967.4 Mellin transforms and moments of fractional orders . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

8 Intermezzo: the OK Corral urn [0,−1,−1, 0] 99

9 Higher dimensional models 1029.1 The purely autistic (hyper-Polya) urn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1029.2 The cyclic chambers (hyper-Ehrenfest) urn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1039.3 The generalized coupon collector’s urn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1069.4 The pelican’s urn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

10 Triangular 3× 3 urns 10910.1 Generating functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10910.2 Expectation and second moment of Bn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11010.3 The distribution of Bn when α ≥ δ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11110.4 The distribution of Bn when α < δ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

11 Relations between discrete and continuous time models 112

12 Conclusions 114

IntroductionThe urn models which we consider are the ones where a single urn contains balls of different colours and afixed set of rules specifies the way the urn composition evolves: at each discrete instant, a ball is picked upat random, its colour is inspected, and, in compliance with the rules, a collection of balls of various coloursis added to the urn. In this study, we restrict attention to balanced urn models, where the total numberof balls added at any instant is a deterministic quantity. Such urn models have been intensely studied forabout three centuries: early contributions are due to Jacob Bernoulli and Laplace [75], the subject beingthen largely renewed by works of Polya in the first half of the twentieth century. The book by Johnson andKotz [64] (see also [73] for a more recent update) offers an interesting perspective on the simplest models,their history, and some of their applications in statistics, engineering, and genetics. For dimension 2 (i.e.,balls can be of either of two colours), the model is precisely described in §1.1 below.

Our main theme is the following. To any balanced urn model, one can associate directly an ordinarydifferential system of the same dimension. This differential system has particular features, in that it isautonomous, monomial, and homogeneous in degree. The algebra of such a simple differential systemturns out to be isomorphic to the combinatorics of a balanced urn model. In particular an explicit solutionof the differential systems provides automatically an analytic solution of the urn model. We develop thisfundamental property in §1.2 in the case of dimension 2 and later extend it in §9 to the case of urns withmore than two colours.

The fundamental isomorphism between balanced urn models and differential systems makes it possibleto derive in a direct and unified way most of the explicit solutions known for special urns with two colours—for instance those of Polya, Ehrenfest, Friedman, as well as the coupon collector’s urn: this is the subjectof §2. Perhaps a more important consequence is that balanced urns with two colours belong to the classof integrable systems: we establish in §3 that the differential systems, for dimension 2 at least, alwaysadmit of an explicit first integral, so that the associated generating functions can be constructed from afinite sequence of elementary algebraic operations combined with quadratures and functional inversion(this corresponds to “closed-form” functions in the terminology of Taylor [98]). These integrability resultsextend those obtained earlier by Flajolet, Gabarro, and Pekari [30], where the method of characteristics wasused to solve an associated partial differential equation. The new feature of treating urn models directly viaan ordinary differential system takes some of its roots in the recent combinatorial study [16].

As an illustration of the usefulness of explicit solutions, we examine in detail in §4 and §5 the class ofsemi-sacrificial urns of dimension 2, which are defined by the fact that one diagonal entry is positive, theother negative. For such urns, we obtain the following: a Gaussian limit law for the urn’s compositionat large instants; a characterization of the speed of convergence to the Gaussian limit; a quantification ofthe extreme large deviation regime; the large deviation rate function; the structure of moments. Theseproperties extend what was known earlier from [30] regarding fully sacrificial urns, where both diagonal

Some exactly solvable models of urn process theory 61

entries are negative. The results of [30] are briefly reviewed in §6, which also contains a brief discussion of2× 2 urns whose solution involves elliptic functions.

Next we treat triangular urns of dimension 2 in §7, where we report on unpublished results that paralleland supplement a recent independent study of Janson [63]. Section 8 is an intermezzo dedicated to theOK Corral problem: there, a natural time-reversal transformation applied to Friedman’s urn (the adverse-campaign model) combines in an interesting fashion with analytic methods and provides exact evaluationsas well as asymptotic estimates that appear to be new.

In dimension greater than two, the isomorphism with differential systems still holds; see §9. However,the systems are not in general (explicitly) integrable. We can nonetheless single out a few special classesof exactly solvable urn models with dimension three or more, including generalized Polya and generalizedcoupon collector’s urns, as well as a multi-chamber Ehrenfest model: see §9; triangular urns of dimension 3are also (somewhat) exactly solvable: we report preliminary results on these in §10.

Our methodology strongly relies on special functions, singularity analysis, quasi-powers approximations,and, more generally methods exposed in the forthcoming book Analytic Combinatorics [38]. The principleof looking for explicit solutions could hardly be better expressed than by the following quotation fromEilbeck, Mikhailov, Santini, and Zakharov [in an introduction to a semester on Integrable Systems, at theNewton Institute in Cambridge, 2001]:

Rather surprisingly, relatively sizable classes of nonlinear systems are found to have an extra property, in-tegrability, which changes the picture completely. Integrable systems have a rich mathematical structure,which means that many interesting exact solutions to the PDEs can be found. Although important in theirown right, these systems form an archipelago of solvable models in a sea of unknown, and can be used asstepping stones to investigate properties of “nearby” non-integrable systems.

(Our title is inspired by that of Baxter’s magnificent opus, Exactly solved models in statistical physics [8].)As we propose to demonstrate in subsequent works, the methods of analytic combinatorics can be furtheradapted to several urn models that no longer admit of explicit solutions.

Apart from works that treat isolated problems, we mention here a few researches that deal with urnmodels at a level comparable to the one we are aiming at here. Friedman [43] launched the study ofmodels determined by a general symmetric 2 × 2 (balanced) matrix. Subsequently, Bagchi and Pal [6]elucidated the structure of moments for balanced 2 × 2 urns and deduced a Gaussian limit law for a largeclass of models. Smythe [96] found a theorem of considerable generality based on the spectral propertiesof the urn’s matrix, which is applicable to arbitrary dimensions. Gouet [46, 47, 48] developed powerfulmartingale methods that lead to functional limit theorems, but are often of a nonconstructive (i.e., non-explicit) nature. Pouyanne [91, 90] has recently developed a widely applicable operator approach to thedetermination of moments and it would be of great interest to relate his work to the present study. Higueraset al. [52] elaborate on the asymptotic theory of urns via algebraic differential equations, in a way that leadsto results of appreciable generality whenever some almost sure convergence property is a priori knownto hold. Finally, a special mention is due to the works of Athreya, Karlin, and Ney [3, 4] based on anembedding into continuous time, which has the capacity of dealing with nonbalanced urn models (seealso [2]); a central contribution in this area is Janson’s recent study [62].

Our interest in this range of questions started from problems in discrete probability theory and the analysisof algorithms. For these aspects, we refer to Mahmoud’s survey [84].

This text is an expanded version of invited lectures given by P. Flajolet at the Workshop on Analy-sis of Algorithms (AOFA2006, Alden Biesen, Belgium, 2-8 July 2006) and at the Fourth Colloquium onMathematics and Computer Science—Algorithms, Trees, Combinatorics and Probabilities (Nancy, France,September 18–22, 2006). It represents to a large extent a report on work in progress mingled with an at-tempted survey of the field of urn processes examined under the perspective of exactly solvable models andanalytic combinatorics methods.

1 Urns and differential systems1.1 Urn models [α, β, γ, δ]

We consider here urns that may contain balls of two possible colours or types, which we call rather unpoet-ically x and y. An urn model is specified by a matrix with integer entries,

M =(α βγ δ

), α, δ ∈ Z, β, γ ∈ Z≥0, (1)


also abbreviated as [α, β, γ, δ]. At time 0, the urn contains initially

a0 balls of type x, b0 balls of type y. (2)

At regularly spaced discrete instants, the urn evolves according to the following rule:

Urn evolution rule. A ball is chosen from the urn, with all balls present in the urn at that time havingequal chances of being chosen, and its colour is inspected(i). If the ball chosen is of colour x, then αnew balls of type x and β new balls of type y are placed into the urn; if the ball chosen is of colour y,then γ new balls of type x and δ new balls of type y are placed into the urn.

By convention, a negative (diagonal) entry in the matrix M means that balls of the corresponding colourshould be taken out of the urn.

For instance, with the matrix M =(−1 32 0

), a possible evolution starting with the initial configura-

tion (a0, b0) = (3, 1), at time 0, is described by the following sequence

xxxy → xxyyyy → xxyyyyxx → xxyyyyxyyy → xyyyyxyyyyy.

There, the ball chosen at each step is underlined. If it is x, then this ball is removed from the urn (sinceα = −1) and 3 balls of type y are appended (since β = 3); if it is y, then then 2 balls of type x are appendedand the balls of type y are kept untouched (since γ = 2 and δ = 0). We call history any such unambiguousdescription of the evolution of an urn—a history is thus nothing but a complete description of a samplepath (or trajectory) of the urn process. The length of a history is the number of transitions (→ arrows)that it comprises, so that a history of length n specifies the possible evolution(ii) of an urn from time 0 totime n. This paper is entirely based on developing an enumerative approach to urn histories, in contrast toprobabilistic treatments on which most publications in the field rely.

Throughout this article, we limit ourselves to considering urns that are balanced in the sense that thematrix M has constant row sums, this sum being consistently denoted by σ,

σ = α+ β = γ + δ, (3)

and called the balance. For an urn which is balanced, the number sn of balls that it contains at discretetime n is a deterministic quantity satisfying

sn = a0 + b0 + nσ = s0 + nσ. (4)

For urns with negative diagonal entries, there is also an issue of tenability. Say we have α = −g withg ≥ 2. This means that, whenever an x-ball is picked up, a group of g balls should be taken out of the urn.In order to avoid configurations that would block the urn, one must postulate that the initial composition ofthe urn satisfies g|a0 and that the number of balls of the first kind added when a ball of the second kind ischosen satisfies g|γ. Similarly for y-balls.

Definition 1 An urn given by a matrix M is tenable when the following two conditions are met:

(i) if the diagonal entry α is such that α < −1, then |α| divides both a0 and γ;

(ii) if the diagonal entry δ is such that δ < −1, then |δ| divides both b0 and β.

Henceforth, all urn models involving negative diagonal entries are assumed to be tenable.

Our global purpose is to determine the composition of the urn at any given time n. Let Hn

(a0

b0

)be the

set of histories of length n when the urn has the initial configuration (a0, b0), and Hn

(a0 ab0 b

)the subset of

those which, at time n, correspond to an urn with a balls of type x and b balls of type y. The correspondingcardinalities are

Hn

(a0

b0

)= cardHn

(a0

b0

), Hn

(a0 ab0 b

)= cardHn

(a0 ab0 b

).

(i) Thus, the chosen ball is not removed from the urn: this is the convention adopted by virtually all authors in the field, one that entailsno loss of generality.

(ii) We may regard the transitions as occurring at half integer times, or, equivalently, one may consider the nth transition to occurimmediately before time n.


We shall approach the determination of these quantities via their generating functions (GFs):

H

(z

∣∣∣∣ a0

b0

)=∑

n

Hn

(a0

b0

)zn

n!, H

(x, y, z

∣∣∣∣ a0

b0

)=∑n,a,b

Hn

(a0 ab0 b

)xayb z

n

n!,

here taken to be of the exponential type. The trivariate GF is also called the complete generating functionof urn histories. For future reference, we state:

Proposition 1 For an urn of balance σ with σ > 0, the number of histories of length n satisfies

Hn

(a0

b0

)= s0(s0+σ) · · · (s0+(n−1)σ) = σn Γ(n+ s0/σ)

Γ(s0/σ)= n!σn

(n+ s0/σ − 1

n

), s0 := a0+b0,

and the corresponding exponential generating function is

H

(z

∣∣∣∣ a0

b0

)=

1(1− σz)(a0+b0)/σ

(σ > 0). (5)

The complete generating function of urn histories H(x, y, z

∣∣∣∣ a0

b0

)is, for any R ≥ 1, analytic in the

domain|x| ≤ R, |y| ≤ R, |z| < 1

σRσ.

An immediate consequence of Stirling’s formula is

Hn ≡ Hn

(a0

b0

)= n!σn ns0/σ−1

Γ(s0/σ)

(1 +

s0(s0 − σ)2σ2n

+O

(1n2

)), s0 = a0 + b0. (6)

Proof: The number of histories is determined by the observation that there are s0 possible ways to pick upa ball at step 1, s1 = s0 + σ ways at time 2, and so on. The univariate GF then plainly results from thebinomial theorem. For |x| ≤ R and |y| ≤ R, the trivariate GF H(x, y, z) has the coefficient of its zn termdominated by

(s0 + σn)Rs0+σn · σn Γ(s0/σ + n)Γ(s0/σ)Γ(n+ 1)

= O(Rσnσnns0/σ

),

from which the analyticity property results. 2

In what follows, we shall mostly use the short forms

Hn ≡ Hn

(a0

b0

), H(z) ≡ H

(z

∣∣∣∣ a0

b0

), H(x, y, z) ≡ H

(x, y, z

∣∣∣∣ a0

b0

),

whenever the initial conditions (the values of a0, b0) are understood from context.The univariate GF H(z) of all histories is by (5) a simple algebraic function. The trivariate (complete)

GF H(x, y, z) is the one that determines the urn composition; it is a (yet unknown) deformation of H(z),and Proposition 1 ensures that it is always a function of z analytic in a neighbourhood of the origin, oncethe values of x and y have been fixed to arbitrary finite values. This simple observation makes it possibleto conduct calculations in the following manner: pick up suitably convenient domains for x and y, thenrestrict z to be such that |z| is small enough; extend the result of the calculation to larger domains by virtueof unicity of analytic continuation.

Since the total number of balls at time n is the deterministic quantity s0 + nσ, all monomials enteringthe trivariate GF are of the form

xaybzn = xays0+nσ−azn,

so that one has the homogeneity relation:

H

(x, y, z

∣∣∣∣ a0

b0

)= ys0H

(x/y, 1, zyσ

∣∣∣∣ a0

b0

). (7)

From a logical standpoint, the variable y is unnecessary, and we can freely set it to the value y = 1,whenever appropriate.


We have left aside so far the cases where σ ≤ 0. When σ = 0, the number of all histories satisfies

Hn

(a0

b0

)= sn

0 , H

(z

∣∣∣∣ a0

b0

)= es0z.

When σ = −τ < 0 all histories are of finite length at most s0/τ . One has

Hn

(a0

b0

)= s0(s0 − τ) · · · (s0 − (n− 1)τ), H

(z

∣∣∣∣ a0

b0

)= (1 + τz)(a0+b0)/τ ,

with the trivariate GF then being a polynomial.Before embarking in the determination of the counting generating function H , we note that there is an

exact equivalence between the combinatorial model where all histories of a given length are consideredequally likely and the original probabilistic rules governing the urn’s behaviour.

Proposition 2 Let An (respectively Bn) be the random variable representing the number of balls of colourx (respectively y) of a balanced urn process at time n. Then, we have:

P An = a, Bn = b =[xaybzn]H(x, y, z)

[zn]H(1, 1, z).

In addition, at any given time, An and Bn determine one another: for an urn of balance σ, they satisfy therelation:

An +Bn = a0 + b0 + nσ. (8)

Proof: This equivalence between combinatorics and probability owes to the fact that we are dealing exclu-sively with balanced urns (it would badly fail otherwise): the size of the urn at time n is the deterministicquantity sn = s0 + nσ, and, since histories are unambiguous descriptions of sample paths, each history oflength n has equal probability (1/Hn) of being chosen. 2

1.2 The associated differential systemOur approach relies on observing dramatically simple relations between the action of differentiation oncertain expressions and the stochastic evolution of an urn. The starting point is the effect of differentiation,noted ∂, on monomials: the obvious relation ∂x[xn] = nxn−1, can be interpreted as

∂x

[xx · · ·x

]= (6 xx · · ·x) + (x6 x · · ·x) + · · ·+ (xx · · · 6 x),

meaning: “pick up in all possible ways a single occurrence of the formal variable x and delete it”. Simi-larly, the operator x∂x picks up all occurrences (underlined below for readability) of the variable x withoutdeleting them:

x∂x

[xx · · ·x

]= (xx · · ·x) + (xx · · ·x) + · · ·+ (xx · · ·x).

This interpretation of differentiation in terms of “marking” is classical in combinatorial analysis [9, 38, 49,76].

Guided by the previous principles, we then associate to an urn of matrix M the linear partial differentialoperator

D = xα+1yβ∂x + xγyδ+1∂y, (9)

where ∂x ≡ ∂∂x . The composition of an urn that contains at any given time a balls of type x and b balls of

type y can be encoded by the monomial m = xayb. It is easily seen that the quantity D[m] generates all thepossible evolution of the urn in one step:

D[xayb] = axa+αyb+β + bxa+γyb+δ.

Similarly, Dn[xa0yb0 ] is the generating polynomial of the configurations of the urn after n steps, when itsinitial configuration is (a0, b0),

Dn[xa0yb0 ] =∑a,b

Hn

(a0 ab0 b

)xayb,

so that, using operator notation,

H(x, y, z) ≡ H

(x, y, z

∣∣∣∣ a0

b0

)=∑n≥0

Dn[xa0yb0 ]zn

n!= ez D [xa0ya0 ] . (10)

Now, comes the central mathematical object of the present paper.


Definition 2 Given a 2 × 2 urn model M, the associated differential system is the ordinary differentialsystem

Σ :x = xα+1yβ

y = xγyδ+1 (11)

There, the notation x represents differentiation with respect to the independent variable, which is by con-vention taken to be t.

The system Σ is of a quite specific form: it is nonlinear (apart from some of the simplest cases), autonomous(meaning that independent variable t does not figure explicitly), monomial (in the sense that only monomialsappear in the right hand sides), and homogeneous with respect to total degree (which reflects the fact thatthe original urn model is balanced).

The observation that leads to our next statement is extremely simple. Let X = X(t) and Y = Y (t) beany particular solutions of the system (11). Consider the effect of a derivative ∂t on a monomial of the formXaY b. We have:

∂t

(XaY b

)= aXa−1XY b + bXaY b−1Y (by standard differentiation rules)= aXa+αY b+β + bXa+γY b+δ (by system Σ).

In other words: The effect of usual differentiation on solutions of the associated differential system Σmimicks the evolution of the urn M. Accordingly, we have

∂t

(XaY b

)= D

[xayb

]x→ Xy → Y

,

and upon repeating the process:

∂nt

(XaY b

)= Dn

[xayb

]x→ Xy → Y

. (12)

Theorem 1 (Basic isomorphism, 2× 2 urns) Consider a balanced urn M of the form (1), (3), which isinitialized with a0 balls of type x and b0 balls of type y. Let x0, y0 be complex numbers that are each

nonzero: x0y0 6= 0. Let X(t

∣∣∣∣x0

y0

)and Y

(t

∣∣∣∣x0

y0

)be the solution to the associated differential system Σ

with initial conditions x0, y0. The generating function of urn histories satisfies (for z small enough)

H(x0, y0, z) = X

(z

∣∣∣∣x0

y0

)a0

Y

(z

∣∣∣∣ x0

y0

)b0

.

Proof: The trivariate GF is given by (10) as

H(x, y, z) =∑n≥0

Dn[xa0yb0 ]zn

n!,

which by (12) and Taylor’s formula yields formally

H(X(t), Y (t), z) =∑n≥0

∂nt [X(t)a0Y (t)b0 ]

zn

n!= X(t+ z)a0Y (t+ z)b0 . (13)

Analytically, these manipulations are justified by the Cauchy-Kovalevskaya existence theorem, which guar-antees that X(t), Y (t) are well defined and analytic in a neighbourhood of the origin. As a consequence,Taylor’s formula can be applied and (13) is valid at least for some small enough values of z and t. Itthen suffices to set t = 0 to get the statement. (Observe that the condition x0y0 6= 0 is needed to ensureanalyticity, in the case of negative exponents in (11), that is, α+ 1 < 0 and/or δ + 1 < 0.) 2

Finding the trivariate GF H(x, y, z) thus reduces to solving the system Σ with “floating” initial condi-tions, (x0, y0). This task is doable directly for many of the most classical urn models, for which Theorem 1leads directly to explicit solutions, as we show in the next section (§2). A notable feature of the argu-ment is that it adapts to higher dimensional urn models, the ones that have balls of m possible colours, forany m ≥ 3, provided again that the models are balanced (§9 and Theorem 6, p. 102).


Note 1. The Partial Differential Equation (PDE) approach. From the exponential operator form of (10), it immediatelyresults that H satisfies the partial differential equation:

∂zH = D H, i.e.,∂

∂zH(x, y, z) = xα+1yβ ∂

∂xH(x, y, z) + xγyδ+1 ∂

∂yH(x, y, z). (14)

In addition, the monomials composing H satisfy a homogeneity relation: any such monomial xaybzn with a nonzerocoefficient is such that a + b = s0 + nσ (see (8)), which implies

(x∂x + y∂y − σz∂z) H(x, y, z) = s0H(x, y, z). (15)

Given (14) and (15), it becomes possible to eliminate the differential dependency on y (that is, ∂y), then set y = 1. Inthis way, it is found that the bivariate generating H(x, z) = H(x, 1, z) satisfies the fundamental PDE:“

1− σzxβ+σ”

∂zH +“xβ+σ+1 − x1−α

”∂xH − s0x

β+σH = 0. (16)

The solution of this equation by the method of characteristics lies at the basis of the earlier study by Flajolet, Gabarro,and Pekari [30] (see also [92] for other instances), but we do not pursue this line of investigation here. 2

Note 2. Combinatorics of urn histories as weighted paths. An urn history can be rendered as a weighted path in thediscrete Cartesian plane Z× Z. Indeed the sequence of values of the urn’s composition

(A0, B0), (A1, B1), . . . , (Ak, Bk), . . . , (An, Bn),

defines a polygonal line in Z2. A history is then completely specified by a sequence of integers

$0, $1, . . . , $k, . . . , $n−1,

where $k indicates which ball is picked up at time k (assuming some canonical numbering that identifies balls).If (Ak, Bk) = (x, y), then one should have ωk ∈ [1 . . x + y], as there are x + y possibilities to choose a ball. Theedges connecting any two consecutive points, the “steps”, are of either one of the two types,

v =

„αγ

«, w =

„βδ

«,

in accordance with the rules determined by the urn’s matrix M. A v-step is chosen with a weight of x (the number ofpossibilities for choosing an x ball), and similarly a w-step is chosen with a weight of y, weights being multiplicativequantities over the sequence of steps. This is summarized by the following diagram

px

y s@@I

v, weight: x

w, weight: y

for M = [−1, 3, 1, 1].This interpretation in terms of weighted paths is well known: for instance, Kotz, Mahmoud, and Robert [74] in-

troduce it to develop complicated multiple sum expressions for the probability distribution of the urn ’s composition.Though, in all generality, the weighted path approach is ineffective for solving urn models, it is of occasional interestto combinatorialists in situations where a simple transformations can yield lattice paths of a recognizable form (see forinstance [16]). This is in particular the case of the Ehrenfest urn (Note 5, p. 71), and of the OK Corral urn treated in§8, p. 99, by means of the time-reversal transformation of Kingman and Volkov [70]. Weighted lattice paths of a formsimilar to ours arise in Leroux and Viennot’s studies of the combinatorics of nonlinear differential systems (see, e.g,[77]). 2

2 Basic examples of 2× 2 urns with entries in 0,±1Theorem 1 makes it possible to investigate directly a number of urn models, whose analyses have beenpreviously approached by a variety of techniques, including probabilistic methods, combinatorial analysis,and the calculus of finite differences. We examine here in the light of Theorem 1 the collection of allbalanced urn models whose matrices have entries in the set −1, 0,+1.


20

20

0400

100

8060 100

40

80

60

20

20 40

40

60

8060 10000

80

Figure 1: Ten simulations of a Polya contagion urn over 100 steps, initialized with (1, 1) [left] and (1, 2) [right],indicate that the distribution of the urn’s composition is spread. The diagrams plot the evolution of An as a functionof n.

2.1 The contagion model (Polya): [1, 0, 0, 1]

The urn with matrix

M =(

1 00 1

), (17)

that is, the identity matrix, has a simple interpretation: one can imagine a population with two kinds of genes(or diseases) that randomly comes into contact with external elements to which their gene (or disease) istransmitted. The question is to investigate the way the initial proportion of genes of both kinds evolves withtime.

The differential system is x = x2

y = y2 ,

x(0) = x0

y(0) = y0.

Thanks to separation of variables, one finds xx−2 = 1, yy−2 = 1, which is integrated, resulting in thesolution pair (X(t), Y (t)): X(t) =

x0

1− tx0

Y (t) =y0

1− zy0

.

By Theorem 1, the complete GF of histories for this urn is

H(x, y, z) =xa0yb0

(1− zx)a0(1− zy)b0.

The urn has balance σ = 1 and the number of histories of length n is Hn = n!(n+s0−1

s0−1

). The probability

that, at time n, there are a balls of the first type and b balls of the second type (necessarily, a + b =a0 + b0 + n), is obtained from the trivariate GF by coefficient extraction.

Proposition 3 For the Polya contagion model (17), the composition of the urn at time n is

P(An = a,Bn = b) =

(a−1a0−1

)(b−1b0−1

)(a0+b0+n−1

a0+b0−1

) . (18)

It is not hard to establish this classical result by a direct reasoning, but the way it comes out mechanicallyfrom Theorem 1 is pleasant.

Figure 1 shows that the urn’s composition at time n varies widely. In fact, with (a0, b0) = (1, 1), onegets, in accordance with (18),

P(An = k) =1

n+ 1, k = 1, . . . , n+ 1


that is, the distribution of An is uniform over the set of all possible values [1 . . n + 1], so that An/n is inthe limit uniform over the real interval [0, 1]. Similarly, with (a0, b0) = (1, 2), the random variable An/nhas a limit distribution supported by [0, 1] with density 2(1− x), and so on. These results easily generalizeto higher dimensions: see §9.1, p. 102.

2.2 The adverse-campaign model (Friedman): [0, 1, 1, 0]

This corresponds to the antidiagonal matrix

M =(

0 11 0

), (19)

and it is a kind of dual to Polya’s model. This urn was studied by Friedman [43], and it can be viewed asmodelling a propaganda campaign in which candidates are so bad that the persons who listen to them areconvinced to vote for the opposite candidate. (This is perhaps not such an unrealistic situation!) The urnhas balance σ = 1. As is obvious from simulations, see Figure 2, this urn has a behaviour very differentfrom Polya’s urn. Indeed, any large enough deviation of the urn’s composition from proportions ( 1

2 ,12 )

tends to correct itself and bring the system back to equilibrium, and accordingly, the composition of the urnis concentrated around its mean value.

20 400

50

20

60

30

40

0

10

10080 20 400

60

40

800

50

60

10

100

20

30

Figure 2: Ten simulations of a Friedman’s urn (adverse campaign model) over 100 steps, initialized with (1, 1) [left]and (1, 2) [right], indicate that the distribution of the urn’s composition is concentrated.

The associated differential system readsx = xyy = xy

,

x(0) = x0

y(0) = y0.

We have x = y, so that x and y only differ by a constant: y = x− x0 + y0. Thus, x satisfies

x

x(x− x0 + y0)= 1,

which is easily integrated. The solution pair is then found to beX(t) =

x0(x0 − y0)x0 − y0et(x0−y0)

Y (t) =y0(y0 − x0)

y0 − x0et(y0−x0)

,

and the complete GF of histories is

H(x, y, z) =(

x(x− y)x− yez(x−y)

)a0(

y(y − x)y − xez(y−x)

)b0

. (20)

We single out here a simple consequence of our calculation:


0.6

0.4

0.3

0.2

10.80.40 0.2

0.5

0.1

0

Figure 3: The histograms of the probability distribution P(An = k) as a function of k/(n + 1), for the adverse-campaign urn initialized with (1, 0) and for n = 3 . . 50, illustrate convergence to a Gaussian regime.

Proposition 4 For Friedman’s adverse-campaign urn defined by (19) and for the initial condition (1, 0),the generating function of urn histories satisfies

H(x, 1, z) =x(x− 1)x− ez(x−1)

=x(x− 1)x− ez(x−1)

= x+ xz +12(x+ 1)xz2 +

16(x2 + 4x+ 1)xz3 + · · · . (21)

In particular n![znxk]H(x, 1, z) is an Eulerian number.

The bivariate generating function of Eulerian numbers is given in [15, 51]: it enumerates permutationsaccording to their number of rises (or equivalently, descents, falls, ascents, runs).

Note 3. Acquaintance groups and the combinatorics of Friedman’s urn. This urn has interesting ramifications relatedto the combinatorics of permutations and increasing Cayley trees, as well as to a growth model of acquaintance groups.Indeed, consider an evolving population, which at time n consists of individuals numbered 1, 2, . . . , n+1. At time nany of the individuals present, say individual numbered j, can recruit a new member (then numbered n + 2). In thatcase, an edge is drawn that connects n+2 to j. The collection of all edges at time n constitutes a tree, which has label 1at its root, and is such that labels along any branch stemming from the root are in increasing order—this is an increasingCayley tree [10]. The leaves of the tree, correspond to individuals who have not recruited, call them “neophytes”;the nonleaf nodes correspond to individuals that have recruited, call them “gurus”. The problem is to determine theproportion of neophytes in the group, equivalently, the number of leaves in the tree at time n.

Represent a neophyte by a ball of type x = •, a guru by a ball of type y = . Then, the two rules are

• =⇒ ——•, =⇒ ——•,

and it is easily recognized that it is exactly modelled by Friedman’s urn, with initial conditions (1, 0). The solutionis then provided by Equation (21). (The relation to rises in permutations can also be seen directly from existingcombinatorial bijections between increasing Cayley trees, binary increasing trees, and permutations, and it is useful inthe design and analysis of several classical data structures of computer science [10, 12, 38, 41, 102].) The connectionwith Eulerian numbers implies furthermore that the limiting distribution of balls of either kind is asymptotically normal(Figure 3), a property known for a long time in the context of run statistics in random permutations: see for instancethe book by David and Barton [18, Ch. 10]. 2

2.3 The two-chambers urn (Ehrenfest): [−1, 1, 1,−1]

Consider two chambers initially containing a finite number (s0 = N ) of particles. At any given time, aparticle migrates from one chamber to the other (Figure 4). This famous model was first introduced byPaul and Tatiana Ehrenfest [26] in 1907 in order to investigate the way a system can reach thermodynamicequilibrium.

Colour balls according to the chamber in which they are at any given time (say x for the left and y for theright chamber). The system is modelled by the urn process with matrix

M =(−1 11 −1

), (22)


Figure 4: The Ehrenfest model (two-chambers urn).

and its balance is σ = 0. The associated system is simplyx = yy = x

,

x(0) = x0

y(0) = y0

According to Theorem 1, it suffices to solve the system in order to determine the GF of urn histories. Thederived equalities x = x, y = y show that the solution is such that

X(t) = x0 cosh t+ y0 sinh tY (t) = x0 sinh t+ y0 cosh t .

Thus, the trivariate GF that solves the model is

H(x, y, z) =(x+ y

2ez +

x− y

2e−z

)a0(x+ y

2ez − x− y

2e−z

)b0

. (23)

In the standard version of the model, one starts from a0 = N and b0 = 0. In that case, the complete GFsimplifies to

H(x, y, z) = (x cosh z + y sinh z)N. (24)

The total number of histories of length n is Nn. The event that all particles have returned to the firstchamber is then obtained by setting x = 1 and y = 0 in H(x, y, z). Consequently:

Proposition 5 For Ehrenfest’s urn defined by (22), with N particles and initial configuration (N, 0), theprobability that, at time n, all particles have returned to the first chamber is

P(AN = N,Bn = 0) =n!Nn

[zn] coshN z =1

Nn2N

N∑k=0

(N

k

)(N − 2k)n.

The formula is obtained via a binomial expansion; it is valid irrespective of the parity of n. The calculationis easily extended to the determination of the probability that k particles have migrated (expand the resultof Equation (24)).

The way the solution of this famous problem comes out automatically is noteworthy. In the past it hasbeen approached in particular via matrix algebra, probability theory, combinatorial analysis, and continuedfraction theory. A discussion of higher dimensional generalizations appears in §9.2, p. 103.

Note 4. Matrix analysis of the Ehrenfest urn. An Ehrenfest urn can be viewed as a finite state Markov chain with 2N

states, when all balls are distinguished—it is then equivalent to a random walk on the N -dimensional hypercube; forN = 3, see Figure 5 (left). States of this complete chain can be aggregated, with only the number of balls in the leftchamber being retained, and this leads to a reduced Markov chain with transition probabilities represented in Figure 5(right).

Since the Markov chain is stationary, every state is recurrent and the system will (with probability 1) return infinitelyoften to its initial configuration. This seems to contradict the principles of thermodynamics according to which thesystem should be in equilibrium, with approximately N/2 balls in each chamber. The apparent contradiction is resolvedby observing that statistical mechanics only represents the ideal situation of a large N limit. The Ehrenfest model thenserves to quantify precisely the discrepancy between a finite system and its thermodynamic limit approximation.


1/3

000 001

011010

100

110 111

101

0 1 2 32/3

2/3

3/3 1/3

3/3

Figure 5: The complete and reduced Markov chains describing the Ehrenfest urn for N = 3.

The analytic solution given by Marc Kac in [66] is well known. It relies on the explicit diagonalization of the(N + 1)× (N + 1) tridiagonal matrix of the Markov chain, which is 1

NK, where

K =

0BBBBB@0 N · · ·1 0 N − 1 · · ·

2 0 N − 2 · · ·3 0 N − 3 · · ·

......

......

.... . .

1CCCCCA . (25)

This is a “simple matrix with many hidden treasures” in the terms of Edleman and Kostlan [25] who offer a vividdiscussion, including surprising connections with random polynomials. 2

Note 5. Combinatorial analysis of the Ehrenfest urn. Basic combinatorics provides an attractive alternative approachto the Ehrenfest model. Consider histories of the urn that start from state (N, 0) and return to state (N, 0) in n steps.This can be viewed as an ordered partition of the integer interval [1 . . n] into N (possibly empty) classes, where eachclass records the times at which a particle moved. For a particle that returns to its original chamber, its class must be ofeven cardinality. Then, the general theory of exponential generating functions (EGF) provides coshN z, as the EGF ofall such partitions. Similarly, if k out of N particles migrate in the course of a history, then the EGF is directly foundto be

N

k

!(cosh z)N−k (sinh z)k , (26)

which is in agreement with (24). (Proof: there are k classes of odd cardinality and N − k classes of even cardinality, tobe chosen in

`Nk

´ways; the odd [resp., even] classes are enumerated by sinh z [resp., cosh z].)

An alternative derivation of the formulae relative to the Ehrenfest urn can be based on Flajolet’s combinatorial theoryof continued fraction [28]: histories can be transcribed as altitude-weighted Dyck paths (simple excursions) to whichthe theory applies. As a consequence, the ordinary generating function (OGF) of histories corresponding to a cycleinvolves the continued fraction

1

1−1 ·N z2

1−2 · (N − 1) z2

1−3 · (N − 2) z2

. . .

. (27)

See Goulden-Jackson [50] for details of this approach in the case of Ehrenfest’s urn and Flajolet-Guillemin [33] for anextension of the continued fraction framework to continuous-time models. (The comparison of (26) and (27), whichare related by the Laplace transformation, gives then combinatorially a continued fraction first discovered analyticallyby Stieltjes.) 2

2.4 The coupon collector’s urn: [−1, 1, 0, 0]

This is the first of our models where ball colours play nonsymmetrical roles. The urn may contain balls ofeither white (x) or black (y) colour. If a white ball is drawn, it is taken out, painted in black, then placedback into the urn; if a black ball is drawn, it is placed back unchanged into the urn. The urn’s matrix is thus(

−1 10 0

). (28)


It is instantly realized that the particular initial conditions (a0, b0) = (N, 0) model the classical couponcollector problem: a black ball represents an item that has already been drawn, a white ball an item not yetdrawn. The associated differential system is

x = yy = y

,

x(0) = x0

y(0) = y0,

and it admits the solution X(t) = x0 − y0 + y0e

t

Y (t) = y0et .

Thus, the complete GF of urn histories is

H(x, y, z) = (x− y + yez)a0yb0eb0z.

The last expression specializes to give back the classical solution of the coupon collector problem. Uponsetting y = 1 and choosing the particular initial condition (N, 0), one finds:

H(x, 1, z) = (x+ ez − 1)N.

This can be expanded:

Proposition 6 For the coupon collector’s urn defined by the matrix (28) and initialized with (N, 0), theprobability distribution of the number of y balls at time n is

P(Bn = b) =b!Nn

(N

b

)n

b

,

where

nb

is a Stirling number of the partition type or of “second kind”.

See [15, 51] for the usual definition of Stirling numbers. Quite remarkably, this urn model already appearsexplicitly in Laplace’s Theorie analytique des probabilites [75], first published around 1811—Laplace in-deed assigns the original question to [Jacob] Bernoulli. Gani [44] has recently made use of this urn to modelcertain infectious phenomena (the propagation of diseases by means of infected syringes).

2.5 The sampling urns: [0, 0, 0, 0] and [−1, 0, 0,−1]

The urn with the null matrix (0 00 0

)models sampling with replacement. We are initially given a0 white (x) balls and b0 black (y) balls. A balldrawn is placed back into the urn, irrespective of its colour, so that the composition of the urn never changes.Thus the evolution is nothing but a succession of independent Bernoulli trials (with respective probabilitiesa0/s0 and b0/s0). The system is x = x, y = y, so that X(t) = x0e

t and Y (t) = y0et. The complete GF is

accordinglyH(x, y, z) = xa0yb0e(a0+b0)z.

(The binomial law arises when one counts the number of times a white balls has been drawn.)The urn with matrix (

−1 00 −1

)corresponds to sampling without replacement and is more interesting. The system is

x = 1y = 1 ,

x(0) = x0

y(0) = y0.

The complete GF of urn histories is then

H(x, y, z) = (x+ z)a0(y + z)b0 ,

from which the probability distribution of An, Bn reads off:

P(An = a,Bn = b) =

(a0a

)(b0b

)(a0+b0a+b

) ,with a+ b = n. This is again extremely classical.


2.6 The records urn [1, 0, 1, 0]

This urn is formed with identical rows, so that the composition of the urn at time n is entirely deterministic:

An = a0 + n, Bn = b0.

From the associated differential system, we find mechanically:

H(x, y, z) =xa0yb0

(1− zxy)a0+b0.

Though the composition of the urn at various instants is predictable, the number of times a ball of eithertype is chosen in the course of an evolution of length n is of interest.

Let the colours x and y be named white and black, respectively, and start with a single black ball, so thata0 = 0, b0 = 1. At time n, there are n white balls and exactly one black ball. The probability that the blackball is picked up is 1/(n + 1). Thus, the number Sn of times that the black ball is chosen in the course ofan n step evolution is a sum of independent Bernoulli random variables:

Sn = X1 +X2 + · · ·+Xn, Xj ∈ Bern(

1j, 1− 1

j

).

This random variable also represents the number of records in a random permutation of size n+1. (A directproof results from the classical construction of inversion tables [15].) For this special case, the differentialequation approach to urn models can be adapted as follows. It suffices to introduce a variable w that keepstrack of the choices of a black ball and define the (modified) differential system:

x = x2

y = wxy,

x(0) = x0

y(0) = y0.

The solution isX(t) =

x0

1− x0t, Y (t) = y0(1− x0t)−w.

The variables x0, y0 are redundant and we may freely set them to 1. For (a0, b0) = (0, 1), the outcome isthe bivariate generating function of the number of records (and cycles) in a permutation, which comes outas

1(1− z)w

,

a classical result for the generating function of Stirling cycle numbers [15, 51]. (This urn is also closelyrelated to what Pitman names the “Chinese Restaurant Process” [89] and to the Ewens sampling distributionof population genetics [54].)

2.7 The all-ones urn: [1, 1, 1, 1]

Like the previous one, this urn has identical rows, so that the composition of the urn at time n is determin-istic. The associated differential system is

x = x2yy = xy2 ,

x(0) = x0

y(0) = y0.

The solution is easily determined, once we notice that xx−1 = yy−1, which implies that x and y areproportional and y/y0 = x/x0. We find

X(t) =x0√

1− 2x0y0t, Y (t) =

y0√1− 2x0y0t

,

so that the complete GF of urn histories is

H(x, y, z) =xa0yb0

(1− 2zxy)(a0+b0)/2.

Like for the records urn, our methods extend to a characterization of the number of times a ball of each typeis chosen. The end result is similar to the Stirling distribution of the Records Urn.

This concludes the discussion of all the ten cases of balanced urns whose matrices have entries restrictedto −1, 0,+1, once we take into account the equivalence between urns(

α βγ δ

)∼=(δ γβ α

), (29)

corresponding to an interchange of colours x and y.


3 The first integral of balanced 2× 2 urnsIt turns out that urn models of dimension two (i.e., with two types of balls) that are balanced can alwaysbe solved in finite terms. This results from the existence of a first integral for the associated differentialsystem. An essential parameter of the urn is the dissimilarity index of the urn model or matrix, which wedefine by

p := γ − α = β − δ.

(The equivalence of the two forms requires the balance condition.) The dissimilarity parameter measuresthe differential benefit, seen from either ball, resulting from the colour of the ball chosen. It is then naturalto consider the following three categories.

— Altruistic urns correspond to p > 0. In that case, a ball chosen “gives” to the population of balls ofthe other colour.

— Neutral urns correspond to p = 0. In that case, the composition of the urn at any instant n is entirelydetermined (and is a linear function of time). Such models are of moderate interest since they canbe studied directly like in the case of the records urn and the all-ones urn of the previous section.Accordingly, we need not discuss them any further.

— Selfish urns correspond to p < 0. In that case a ball chosen serves its own kind more than the otherkind. In a sense, it reinforces its own population.

The family of all 2× 2 balanced urn matrices is a three parameter family. The quantities σ (the sum) andp (the dissimilarity) are two parameters, which, together with any single entry of the matrix, determine themodel. We also observe that the matrix M admits σ as an eigenvalue, with right eigenvector the columnvector (1, 1)T. The other eigenvalue is −p = α− γ with left eigenvector (1,−1).

Note 6. The spectrum of the urn’s matrix. In the classical theory of urn models, eigenvalues play an important role,since they determine to a large extent the urn’s probabilistic behaviour. For a balanced 2 × 2 urn with σ > 0, it hasbeen recognized for a long time that the ratio of eigenvalues,

ρ := − p

σ≡ α− γ

α + β,

is the basic discriminating quantity. One has a priori ρ ≤ 1, since ρ > 0 implies α > 0 and, in turn, (α−γ)/(α+β) ≤α/(α + β) ≤ 1. Gouet [47] uses martingale arguments to prove functional limit theorems to the effect that under thecondition,

ρ ≤ 1

2(together with βγ 6= 0, i.e., for nontriangular cases), the evolution of the urn is Gaussian—this includes all sacrificialurns discussed in §5 and §6 for which one has ρ < 0. (Several such results do extend to urns of higher dimensions asdescribed in §9; see for instance [62, 96].) 2

Finally, an urn model is said to be arithmetically irreducible if the four matrix entries have no commondivisor:

gcd(α, β, γ, δ) = 1. (30)

With little loss of generality, we shall normally confine ourselves to the study of such arithmetically irre-ducible urns. Roughly, if d > 1 is a common divisor of all matrix entries, it is possible to group balls bypacks of d and operate with the reduced matrix that has entries [α/d, β/d, γ/d, δ/d]. (The only price tobe paid is a minor restriction on initial conditions, namely d|a0 and d|b0, but our methods can easily beextended to cover such situations.)

Proposition 7 The differential system,x = xα+1yβ

y = xγyδ+1 , α+ β = γ + δ

admits xp − yp as a first integral, where p = β − α = γ − δ is the dissimilarity parameter. Precisely, let

X(t) = X

(t

∣∣∣∣x0

y0

), Y (t) = Y

(t

∣∣∣∣x0

y0

),

be the solution of the system corresponding to the initial conditions X(0) = x0 and Y (0) = y0, withx0y0 6= 0. Then, for t in some complex neighbourhood of 0, one has

X(t)p − Y (t)p = xp0 − yp

0 .


Proof: It suffices to compute

d

dt(xp − yp) = pxp−1x− pyp−1y = pxα+pyβ − pxγyδ+p ≡ 0,

which ensures that xp − yp is a constant. 2

In other words, the solutions X(t), Y (t) to the differential system provide a local parameterization of acurve of Fermat type,

Xp − Y p = Constant .

In a sense, this explains that the two-chambers urn (Ehrenfest urn), which has p = 2, gives rise to a solutioninvolving hyperbolic functions, which parametrize the hyperbola Y 2 −X2 = 1. The contagion urn (Polyaurn) has p = −1, so that X−1 − Y −1 = C (with C a constant), and it is accordingly described by linearfractional transformations.

4 General solutions for 2× 2 sacrificial urns (α ≤ −1)Equipped with the first integral of Proposition 7, we proceed to solve the associated differential system.Several cases are to be distinguished based, inter alia, on the sign of p. In this section, we operate with theassumption

α ≤ −1, (31)

implying p > 0, thus corresponding to an upper diagonal entry that is negative and implying an urn withpositive dissimilarity. Balls of colour x are said to be sacrificial since, whenever drawn, a certain numberof them is withdrawn. An urn with at least one negative diagonal entry is said to be sacrificial. It is said tobe semi-sacrificial, respectively fully sacrificial, if the number of its negative diagonal entries is exactly 1,respectively 2.

We start by restricting y0 in a neighbourhood of 1 and x0 in a neighbourhood of 0. Eventually, for theprobabilistic analysis, we shall set y0 = 1 (since the information provided by the variable y in the completeGF is redundant and tolerates setting y = 1) and take x0 to vary in domains containing 0. We make use ofthe following notations,

∆(x0, y0) := (yp0 − xp

0)1/p, ∆(x0) := (1− xp

0)1/p, (32)

with principal determinations of pth roots intended when x0 ≈ 0, y0 ≈ 1. So far, we have only made useof the property p > 0, a consequence of (31).

Our goal is to eliminate the “floating” initial conditions and introduce a standard set of functions by whichone can completely express the complete GF of urn histories. Given the first integral of Proposition 7, onecan eliminate y in the equation defining x:

x = xα+1yβ = xα+1 (xp − xp0 + yp

0)β/p = xα+1(xp + ∆p)β/p,

and similarly for y. This can be reduced to a form that no longer involves “floating” initial conditions. Set

X

(t

∣∣∣∣ x0

y0

)= ∆ξ(t∆σ), Y

(t

∣∣∣∣ x0

y0

)= ∆η(t∆σ). (33)

Then, the transformed system satisfied by the pair (ξ, η) isξ = ξα+1(ξp + 1)β/p

η = ηδ+1(ηp − 1)γ/p ,

ξ(0) = x0∆−1

η(0) = y0∆−1 . (34)

In particular, there is separation of variables,

ξ

ξα+1(1 + ξp)β/p= 1

so that on can solve for ξ:

t =∫ ξ(t)

x0/∆

w−α−1

(1 + wp)β/pdw. (35)


Urn matrix: M =

„α βγ δ

«;

Balance: σ := α + β = γ + δ;Dissimilarity index: p := γ − α = β − δ;Initial composition: (a0, b0);Size: s0 := a0 + b0; sn := s0 + nσ;Auxiliary quantities: r := − p

α; s := −p

δ;

Integral parameters: λ :=β

p; µ :=

γ

p.

Figure 6: A summary of notations for 2-colour urn models.

This defines the ξ function by inversion of an Abelian integral(iii). Hence the solution of the associatedsystem and the complete GF of histories are expressible by means of an Abelian integral over the Fermatcurve,

Fp : Y p −Xp = 1,

and its inverse functions (with respect to composition).At this stage, we make use of the assumption α + 1 ≤ 0 in (31). This means that balls of the x colour

“sacrifice” themselves in favour of balls of colour y. A change of variable w−α = ζ in the integral of (35)first reduces the definition of ξ to (remember that −α is a positive number)

t = − 1α

∫ ξ(t)−α

(x0/∆)−α

dζ

(1 + ζr)β/p, r := − p

α. (36)

Note that r is a positive integer, given the tenability conditions which require −α|γ. The desired standard-ization is provided by the following definition.

Definition 3 Let the parameters r and λ satisfy r ∈ Z>0 and λ ∈ Q≥0. The fundamental integral J = Jr,λ

is defined to be

J(u) :=∫ u

0

dζ

(1 + ζr)λ. (37)

The pseudo-sine function S = Sr,λ is defined in a complex neighbourhood of 0 as the inverse of J:

S(J(u)) = J(S(u)) = u. (38)

The pseudo-cosine function C = Cr,s,λ is defined in a complex neighbourhood of 0 as

C(z) = (1 + S(z)r)1/s, (39)

where the principal determination of the sth root is adopted.

We note that, near 0, J(u) is analytic and satisfies J(u) = u+O(ur+1), so that it is analytically invertiblenear 0. Thus, the function S is analytic at 0 and satisfies S(z) = z + O(zr+1). The function C(z) is alsoanalytic and satisfies C(z) = 1 +O(zr). Precisely:

J(u) = u− λ

r + 1ur+1 +

λ(λ+ 1)2(2r + 1)

u2r+1 − λ(λ+ 1)(λ+ 2)6(3r + 1)

u3r+1 + · · ·

S(z) = z +λ

r + 1zr+1 +O(z2r+1)

C(z) = 1 +1szr +O(z2r).

(40)

(By its very definition, J(u) only includes powers of the form ukr+1k≥0. As a consequence, S(z) andC(z) only include terms of the form zkr+1k≥0 and zkrk≥0, respectively.)

The reason for the chosen denomination is that the functions S,C can be thought of as higher degreeanalogues of trigonometric or hyperbolic functions. (In fact, minor variants of these functions have beenstudied in the nineteenth century by Lundberg [81] under the name of hypergoniometric functions, their

(iii) Given an algebraic curve defined by the polynomial equation P (x, y) = 0, an Abelian integral is any integral of the formRQ(x, y) dx, with Q a rational function.


study having been recently revitalized by Lindqvist and Peetre [78, 79].) In particular, when p = r = 2 andλ = 1/2, the functions J, S, C reduce to

J(u) =∫ u

0

dζ√1 + ζ2

= arcsinh(u), S(z) = sinh(z), C(z) = cosh(z),

which is how they appear in the two-chambers (Ehrenfest) urn model.We can now state:

Theorem 2 Consider a balanced urn model whose matrix M =(α βγ δ

)has α ≤ −1 and(iv) δ 6= 0.

Then the bivariate generating function of urn histories, for (x, z) in a neighbourhood of (0, 0) is given bythe analytic expression:

H(x, 1, z) = ∆s0S(−αz∆σ + J

(x−α∆α

))− a0α C

(−αz∆σ + J

(x−α∆α

))− b0δ ,

with ∆ ≡ ∆(x) = (1− xp)1/p.(41)

There, the usual notations are used,

σ = α+ β, p = β − α, s0 = a0 + b0,

and we have J = Jr,λ (the fundamental integral of (37)), S = J (−1) defined by S(J(u)) = J(S(u)) = u(the pseudo-sine of (38)), C defined by C(z) = (1+S(z)r)1/s (the pseudo-cosine of (39)), the parametersbeing

p := γ − α, λ =p

β, r := − p

α, s := −p

δ,

so that λ ∈ Q≥0, r ∈ Z≥1, and s ∈ Q.

Proof: We have from (36):−αt = J(ξ(t)−α)− J(x−α

0 ∆α).

This relation can be inverted to express ξ−α:

ξ−α = S(−αt+ J(x−α

0 ∆α)),

and henceX(t)−α, by Equation (33). Observe that, though ξ andX are not in general analytic at 0, ξ−α andX−α are. The function η−δ is similarly expressed in terms of C, since ηp − ξp = 1. The statement finallyresults from plugging the explicit forms of X,Y into the formulæ provided by Theorem 1, upon effectingthe substitution x0 → x, y0 → 1. The expression obtained is bivariate analytic in a neighbourhood of (0, 0),as it should, owing to the tenability conditions, which ensure that the exponent of the pseudo-sine, namely−a0/α, is necessarily an integer. 2

We present now in Examples 1–4 several examples that illustrate the variety of special functions that mayappear in 2× 2 urn models.

Example 1. The two-chambers urn (Ehrenfest), [−1, 1, 1,−1], revisited. This urn justifies our basic terminologyregarding pseudo-trigonometric functions. The matrix is„

−1 11 −1

«,

so that we haveσ = 0, α = −1, p = r = 2, β = 1.

The fundamental integral is

J(u) = J2,1/2(u) =

Z u

0

dζp1 + ζ2

= arcsinh(u).

Consequently, the pseudo-sine and pseudo-cosine are

S(z) = S2,1/2(z) = sinh(z), C(z) = C2,1/2(z) = cosh(z).

A mechanical application of Theorem 2 then provides, when s0 = a0 = N (and b0 = 0):

H(x, 1, z) = (1− x2)N/2 sinhN

„z + arcsinh

„x√

1− x2

««,

which is easily seen to be equivalent to the result of (24), given basic trigonometric identities. 2

(iv) The case δ = 0 corresponds to “antitriangular” urns and needs minor adjustments in definitions: see Example 2 below.


Example 2. The peaks-in-perms urn (Mahmoud): [−1, 2, 1, 0]. This urn has matrix„−1 21 0

«,

and parameters

σ = 1, p = r = 2, λ =1

2.

The fundamental integral is accordingly

J(u) = J2,1(u) =

Z u

0

dζ

1 + ζ2= arctan(u),

so that the pseudo-trigonometric functions are usual ones:

S(z) = tan(z), C(z) =q

1 + tan2(z) = sec(z) =1

cos(z).

In this case, we have δ = 0, so that s = ∞, which forces us to adopt, instead of C, the function C satisfyingCr − Sr = 1. Then the bivariate GF of urn histories is for a0 = 1 and b0 = 0:

H(x, 1, z) = ∆(x) tan

„z∆(x) + arctan

„x

∆(x)

««, ∆(x) :=

p1− x2. (42)

an expression that can be expanded to yield

H(x, 1, z) = x + z + 2xz2

2!+ (2 + 4x2)

z3

3!+ (16x + 8x3)

z4

4!+ (16 + 88x2 + 16x4)

z5

5!+ · · · .

Note that, if we set x = 0, we get the GF of alternating (or up-and-down or zigzag) permutations [15, pp. 258–259].We recognize in the last expansion a variant of the sequence EIS:A008303 in Sloane’s Encyclopedia of Integer

Sequences [95], where it is described as “Permutations of [n] with k peaks”. Indeed, values in a permutation σ =(σ1, . . . , σn) can be classified into four types known as peaks (σi−1 < σi > σi+1), valleys aka troughs (σi−1 > σi <σi+1), double rises (σi−1 < σi < σi+1), and double falls (σi−1 > σi > σi+1). See for instance the books of Davidand Barton [18, Ch. 10] or Stanley [97, p. 24], as well as the article by Francon and Viennot [42]. A well known GFdue to Carlitz enumerates permutations according to the number of elements of each type, and H in (42) is a specialcase. In fact, in an article dedicated to patterns in permutations and increasing binary trees(v) the authors of [32, p. 238]derived an expression essentially equivalent to that of (42) and deduced a local limit law of the Gaussian type for thecoefficients.

The correspondence between this urn model and increasing binary trees results from observations made by Mahmoudin the related context of binary search trees: Mahmoud explicitly introduced this urn in his review paper [84] and ourdescription is essentially an adaptation of his. First, it is well known that increasing binary trees under the uniformmodel can be obtained by a simple growth model: view external nodes (that are unlabelled) as “placeholders”; startfrom a single external node; at each discrete instant n + 1

2, pick up uniformly at random such an external node,

and attach to it an internal node labelled n + 1 with two dangling external nodes. In the standard bijection betweenpermutations and increasing binary trees [10, 38, 97], an element that is neither a peak nor a valley corresponds to aone-way branching internal node, hence to an external node whose brother is an internal node; a leaf (an internal nodeattached to two external nodes) corresponds to a peak. It is then possible to model the growth of an increasing binarytree by an urn model, while at the same time keeping track of peaks: Encode by a white ball (x) an external nodewhose brother is an internal node; encode by a black ball (y) an external node whose brother is an external node. It isthen easily seen that the inductive construction of the tree corresponds to the urn process with matrix [−1, 2, 1, 0]. Thecorrespondence between this urn model and peaks in permutation is thus established. 2

Example 3. The first algebraic urn: [−1, 3, 1, 1]. This urn has matrix and parameters given by

M =

„−1 31 1

«, σ = 2, p = r = 2, λ =

3

2.

Its interest is to admit a complete GF that is an algebraic function of low degree.The fundamental integral is explicit:

J(u) ≡ J2,3/2(u) =

Z u

0

dζ

(1 + ζ2)3/2=

u√1 + u2

.

(v) An increasing binary tree is a plane binary tree, which has internal nodes labelled by distinct integers and is such that labelsalong any branch stemming from the root go in increasing order. An increasing binary tree is in bijective correspondence with apermutation of the same size—simply read off the labels in infix order. See [97, p. 24] and [10] for instance. The model is equivalentto that of random binary search trees [94].


Then inversion provides for the pseudo-sine and pseudo-cosine:

S(z) =z√

1− z2, C(z) =

p1− z2.

Say we consider initial configurations with (a0, b0) = (0, N). The complete GF of urn histories is found to be

H(x, 1, z) = (1− x2)N/2 1

(1− (z(1− x2) + x)2)N/2.

Such an expression can be used to determine explicitly moments of the number of balls of the first colour. Take forinstance N = 1. It suffices to expand locally near x = 1:

H(x, 1, z) =1√

1− 2z+ (x− 1)

z(1− z)

(1− 2z)3/2+

(x− 1)2

2

z2(2− 4z + 3z2)

(1− 2z)5/2+ O

`(x− 1)2

Each partial derivative ∂rxH(x, 1, z)

˛x=1

is a simple radical function, and in particular, the total number of histories is(as anticipated)

Hn = n! · [zn]1√

1− 2z= 2nn!

n− 1

2

n

!= 1 · 3 · 5 · · · (2n− 1).

As a consequence: Each moment of the number of balls of a given colour is a rational function of n. Asymptotically, wefind easily (with the Help of Dr Salvy’s programme equivalent, under the symbolic manipulation system MAPLE):

E(An) =1

2n +

1

4+

1

8n+ O

„1

n2

«, V(An) =

1

4n +

1

8+

1

8n+ O

„1

n2

«,

where E and V represent expectation and variance, respectively, and An is as usual the number of balls of colour xpresent at time n. Thus the distribution is concentrated around its mean and there is convergence in probability:

An

n/2

P−→ 1.

We shall see in §6 that such closed form formulæ and simple asymptotic forms for moments are the rule. Theirderivation in the present context is almost immediate, this in sharp contrast with the fact they are often obtained in theliterature at the expense of costly recurrence manipulations(vi).

The probability of all balls being of the second colour at time n = 2ν also results from the expression of H . It isobtained by setting x = 0 in H(x, 1, z),

H(0, 1, z) =1√

1− z2,

so that, upon extracting coefficients:

P(A2ν = 0) =(2ν)!3

ν!2 (4ν)!∼ 2−2ν+1/2.

Like in the case of the generation-parity urn, the probability of such an extreme deviation is well quantified and turnsout to be exponentially small. (See §5 for a general theory of such urns and §5.3 for a class generalizing the presentexample.) 2

Example 4. The generation-parity model: [−1, 2, 2,−1]. Consider the urn with matrix„−1 22 −1

«,

and parameters

σ = 1, p = r = 3, λ =2

3.

This urn models a binary splitting process as follows. There are two types of particles, x and y. At a discrete instant,any of the particles present in the system has equal chances of disintegrating, with an x particle giving rise to two yparticles and a y particle giving rise to two x particles. Assume for simplicity that one starts at time 0 with one particleof type x, so that a0 = 1, b0 = 0. If particles are numbered according to the value of n determining the time (n + 1

2) at

which they disintegrate, the process is once more represented by an increasing binary tree. Balls of colour x correspond

(vi) Our point of view is that traditional recurrence manipulations in the case of balanced urns at least, are normally subsumed bymore synthetic generating function methods. This observation is perhaps to be somewhat nuanced, and for instance the fundamentalpaper of Bagchi and Pal [6] has to be singled out: by a shrewd use of recurrence techniques, the authors of [6] were first able todeduce the asymptotic form of centred moments of the urn’s composition and deduce a Gaussian limiting distribution for a largeclass of balanced urns. However, as we shall see later, the global structure of all moments can be unwound for large classes of urns.An elegant point of view on moment calculations is developed by Pouyanne in [91, 90].


to balls that belong to an even generation (as measured by the distance to the root node); balls of colour y belong to ageneration which is odd.


J(u) = J3,2/3(u) =

Z u

0

dζ

(1 + ζ3)2/3,

which is an Abelian over the Fermat curve F3 : Y 3 − X3 = 1. The inverse function has been studied by AlfredCardew Dixon (1865–1936) in a long article [20] and further examined by Conrad and Flajolet(vii) in [16] as regardscombinatorial properties. What is of special interest is the fact that smh := S and its companion cmh := C are ellipticfunctions (i.e., meromorphic doubly periodic functions [13, 103]). With

smh(z) := S3,2/3(z) = Inv

Z z

0

dζ

(1 + ζ3)2/3

= z + 4z4

4!+ 160

z7

7!+ 20800

z10

10!+ 6476800

z13

13!+ · · · ,

(43)

the inverse function(viii) of J(u) (our notation follows [16, 20]), Theorem 2 provides the complete GF of histories(Proposition 3 in [16]):

H(x, 1, z) = (1− x3)1/3 smh

(1− x3)1/3z +

Z x(1−x3)−1/3

0

ds

(1 + s3)2/3

!,

= x +z

1!+ 2x2 z2

2!+ (4x3 + 2x4)

z3

3!+ · · · .

(44)

The function smh, which has positive coefficients, has a dominant singularity on the positive real axis, by virtueof Pringsheim’s Theorem. Since J(u) is analytic at any finite positive point u > 0, where it satisfies J ′(u) 6= 0,it is invertible. Thus, the dominant singularity ρ > 0 of smh along the positive real axis can only be obtained withsmh(ρ) = ∞, that is, we must have

ρ = J(∞) =1

2π√

3Γ

„1

3

«3

, (45)

where the latter evaluation results from the Eulerian Beta integral [103].Here is a nice application of (43), (44), and (45). Say we want to estimate the probability that at time n, all balls be of

the second colour, that is, all particles alive in the system belong to some odd numbered generation. This probability isobtained by considering H(0, 1, z), so that P(An = 0) = [zn] smh(z). (This can only happen with nonzero probabilityat instants of the form n = 3ν + 1.) Then, by an easy analysis of singularities, we find

P(A3ν+1 = 0) ∼ cρ−3ν−1, ρ =1

2π√

3Γ

„1

3

«3

. (46)

For instance, we find to 10 decimal places (10D):„[z28] smh(z)

[z31] smh(z)

«1/3.= 1.76663 87490 · · · . ρ

.= 1.76663 87502 · · · .

Equation (46) thus quantifies accurately an extreme large deviation regime that has an exponentially small probabilityof occurrence. (See §6 for a general theory of such urns and, in particular, §6.2 for further elliptic connections.) 2

Note 7. The pseudo-cosine as an inverse Abelian integral. Coming back to the transformed system (34), we have

η

ηδ+1(ηp − 1)γ/p= 1.

This implies (compare with Equation (35))

t =

Z η(t)

y0/∆

w−δ−1

(wp − 1)γ/p,

and via a change of variables similar to (36)

t = −δ

Z ξ−δ

(y0/∆)−δ

dζ

(ζs − 1)µ, s = −p

δ, µ =

γ

p.

Thus: the pseudo-cosine is alternatively definable as the inverse of an Abelian integral. 2

(vii) Regarding the genesis of the present paper, an important starting point for the differential system approach to urn models is thedetailed study of the combinatorics of Dixonian elliptic functions in [16].

(viii) The coefficients constitute the sequence EIS:A104133.


Note 8. Hypergeometric form of the fundamental integrals. The hypergeometric function F = 2F1 is classicallydefined [103] as

2F1

»a, bc

˛z

–= 1 +

a · bc

z

1!+

a(a + 1) · b(b + 1)

c(c + 1)

z2

2!+ · · · .

The fundamental integral (37) is then a special hypergeometric function,

Jr,λ(u) = u · F»

λ, 1/r1 + 1/r

˛− ur

–,

as is seen from expanding and integrating term-by-term. Thus, with Inv representing inversion of functions with respectto composition, one has for an urn with matrix [α, β; γ, δ]:

S(z) = Inv

„z · F

»β/p,−α/p1− α/p

˛− zr

–«.

Thus: the pseudo-sine function is the compositional inverse of a special hypergeometric function. A similar propertyholds for the pseudo-cosine. We shall not however make much use of hypergeometric function theory in the presentpaper. 2

5 The class of 2× 2 semi-sacrificial urns (α ≤ −1, δ > 0)Recall that a colour is sacrificial if the corresponding diagonal entry in its matrix M is negative. The lastsection has started a general treatment of the case α < 0, that is, colour x is sacrificial. Starting from there,two subclasses of urns may be considered: (i) the fully sacrificial urn models for which α < 0 and δ < 0;(ii) the semi-sacrificial urns for which α < 0 and δ > 0. The former case has been treated fairly thoroughlyin the earlier article of Flajolet, Gabarro and Pekari [30] and is reviewed in the next section. The latter caseis the one of interest for us now. In summary, our present object of study is the class of matrices(ix)

M =(α βγ δ

), α < 0, δ > 0, βγ 6= 0, σ > 0. (47)

What we want to do is exploit the analytic consequences of Theorem 2 in a way dictated by the need ofasymptotic probability. As we shall see, the complex geometry of the problem is quite different from thatof [30]. A sufficient fragment of this geometry is worked out in the next subsection. After this, routinecalculations make it possible to derive precise estimates relative to extreme deviations, a Gaussian limit law(with speed of convergence estimates), the large deviation rate function, and the structure of moments.

5.1 The analytic structure of the base functions S, C

By the term of base functions, we mean here the two functions

S(z), C(z),

which serve to express H(x, 1, z), the bivariate generating function (BGF) of urn histories. As is usual inanalytic combinatorics, singularities are essential. The first fundamental result is the following.

Lemma 1 Let M be a semi-sacrificial urn model in the sense of the conditions (47).(i) The base functions S(z) and C(z) are analytic near 0. Their common radius of convergence ρ is

given by (p = γ − α, r = −p/α, s = −p/δ)

ρ = J(+∞) =1rB(

1r, 1− 1

r− 1s) =

1r

Γ( 1r )Γ(1− 1

r −1s )

Γ( 1s )

. (48)

(ii) The only dominant singularities (i.e., the singularities of smallest modulus) of S(z) and C(z) are atthe conjugate points

ρωj , for j = 0, 1, . . . , r − 1, where ω = e2iπ/r. (49)

Proof: The proof consists in establishing four properties: (A) the radius of convergence is a singularityand its form is explicit; (B) S(z)r avoids the value (−1) inside its disc of convergence; (C) |S(z)| remainsbounded as z tends radially to a boundary point that is not a singularity; (D) the function S(z) is analyticat end points ρeiθ except for a collection of r special directions, as described in the statement.(ix) Triangular urns corresponding to βγ = 0 will be dealt with in separate sections.


(A) Consider the BGF of histories H(x, 1, z) for an initial composition of the urn given by (a0, b0) =(−α, 0). Then, by Theorem 2, we have

H(0, 1, z) = S(−αz). (50)

This is the exponential GF of urn histories conditioned upon the fact that no ball of colour x is present inthe end. The equation shows that S(−αz), hence S(z), has nonnegative Taylor coefficients.

By Pringsheim’s theorem [38, 99], since S(z) has nonnegative coefficients, the set of all its dominantsingularities includes the value of the radius of convergence ρ of the series at 0. In particular, in orderto determine that radius, it suffices to examine the function S along the ray R≥0, starting from 0, till asingularity is encountered.

The main point is that S is defined as the inverse of J and J(u) is analytic at all points of the positivereal axis. At the same time, for any u ∈ R≥0, one has J ′(u) ≡ (1 + ur)−λ 6= 0. There results that J is ananalytically invertible mapping of the interval [0,+∞[ onto [0, ρ[, where ρ = J(+∞). The latter quantityis nothing but an Eulerian Beta integral whose evaluation in terms of Γ factors is classical [103]. Theseconsiderations justify the first assertion (i) of the lemma for S(z).

(B) We next prove that for all z such that |z| < ρ, we have S(z)r 6= −1. Observe that S(z) is a solutionof the differential equation

Y ′(z) = (1 + Y (z)r)λ,

with λ > 1, given the conditions (47). Now, it is easy to verify that a solution Y (z) cannot be analytic inthe neighbourhood of a point z0 such that Y (z0)r = −1. Indeed, if Y (z) were analytic, we would have

Y (z) ∼ η + cη(z − z0)a, Y ′(z) ∼ caη(z − z0)a−1, (1 + Y (z)r)λ ∼ (−rc)λ(z − z0)λa,

for some a ≥ 1, ηr = −1, and some c 6= 0. The equality Y ′ = (1 + Y r)λ is clearly impossible sinceλ = β/(β − δ) > 1, so that the orders of growth of Y ′ and (1 + Y r)λ are incompatible.

This avoidance property implies that C(z) is analytic in the disc |z| < ρ. A simple computation alsoshows that it is singular as z → ρ−. Thus, part (i) of the statement has been established for C(z).

Next, we observe, from the nature of the expansions of S and C at 0 that these functions have a rotationalsymmetry, namely

S(ωz) = ωS(z), C(z) = C(ωz).

Thus, it is sufficient to study S(z), C(z) in any sector rooted at 0 with opening angle 2π/r. Also, thissymmetry immediately implies that the conjugate quantities ρωj are amongst the dominant singularitiesof S and C. It takes however a bit of work to establish that there are no dominant singularities besidesthese.

(C) We now proceed to investigate the behaviour of S(z) along rays of angle θ, with 0 < θ < π/r,which is sufficient by rotational symmetry and conjugacy. Define the semi-open interval

I[θ] :=xeiθ

∣∣ 0 ≤ x < ρ.

The proof of boundedness properties of S makes use of the differential relations satisfied by S. Choosefirst an arbitrary R ∈ (0, ρ). The “daffodil lemma” of [38] expresses the fact that the directions of largestincrease of the modulus of an analytic function with nonnegative coefficients at 0 are to be found amongsta set of regularly spaced rays emanating from 0; furthermore these directions are determined by the period(or span) of the support of the coefficients of f (i.e., the set of n such that fn 6= 0). Here this lemma impliesthat, for R ∈ (0, ρ) and for any θ with 0 < θ < π/r, we have

∣∣S(Reiθ)∣∣ < S(R). In particular, there exists

an x0 < R such that∣∣S(Reiθ)

∣∣ = S(x0). We can then find an x1 in (x0, R), such that∣∣S(Reiθ)∣∣ < S(x1).

We then propose to compare for x ∈ [R, ρ[, the two functions

Y (x) := S(xeiθ), Z(x) := S(x1 + x−R),

which satisfy the autonomous differential equationsY ′(x) = (1 + Y (x)r)λeiθ

Y (R) = S(Reiθ) ,

Z ′(x) = (1 + Z(x)r)λ

Z(R) = S(x1).


Roughly, we can visualize the values of Y, Z as the trajectory of two mobiles whose movement is governedby similar differential equations. In terms of modulus, i.e., distance to the origin, Y is the tortoise and Zthe hare, but contrary to what happens in the fable, the tortoise never catches up since it starts late and itsspeed remains dominated by that of the hare (see [17] for another context). Consequently, till time x = ρ,the function Y (z) remains of modulus smaller than Z(ρ) = S(x1 −R+ ρ) < +∞. The argument is madeprecise by writing the differential systems in integral form:

Y (x) = Y (R) +∫ x

R

(1 + Y (ξ)r)λeiθ dξ

Z(x) = Z(R) +∫ x

R

(1 + Z(ξ)r)λ dξ.

(Note that the determination of the λth root is well defined by continuity since S(z)r avoids −1.) Assumea contrario that there exists an x with x < R+ ρ− x1 such that |Y (x)| = Z(x), and let x2 be the smallestsuch x. We would then have, by trivial inequalities,

|Y (x2)| ≤ |Y (R)|+∫ x2

R

(1 + |Y (ξ)|r)λdξ

< Z(R) +∫ x2

R

(1 + Y (ξ)r)λ dξ = Z(x2),

where the second inequality is strict since |Y (x)| < Z(x) at least for a small interval to the right of R. Acontradiction with the supposition that |Y (x2)| = Z(x2) has been reached. In other words, we have provedthat S(z) remains of bounded modulus along I[θ].

(D) We now establish that the end point ρeiθ of I[θ] is such that S(z) is analytic there. Say that adirection θ different from the singular directions of the statement is proper if the set of limit values ofS(xeiθ)r as x→ ρ includes at least one element that is different from (−1).

Consider first a proper direction, with 0 < θ < π/r, and let υ0 be any limit value of S(z) along I[θ]satisfying υr

0 6= −1. By classical existence theorems for ordinary differential equations, the equation

Υ′(z) = (1 + Υ(z)r)λ, Υ′(0) = υ,

admits a solution Υ(z) = Υ(z, υ) that is bivariate analytic in a neighbourhood Ω ⊂ C2 of (0, υ0), whichwe can fix to be the polydisc

|z| < δ, |υ − υ0| < ε.

Choose then z0 = ρ0eiθ such that

|ρeiθ − ρ0eiθ| < δ

2, |υ − υ0| < ε.

Then, we must haveS(z) = Υ(z − z0, S(z0)).

As a consequence of this equality, S(z) is itself analytic at ρeiθ. Thus, along any proper direction, S(z) isanalytically continuable beyond the boundary of its disc of convergence.

There only remains to exclude improper directions. Assume a contrario that there exists an improperdirection θ. Then, by definition of properness, S(z)r admits −1 as unique limit value. In addition, weclaim that there exists a particular rth root η of −1 such that S(z) tends to the limit η as z → ρeiθ. (Proof:otherwise, S, which is bounded, would oscillate between several roots of−1, but, then, its set of limit valueswould include quantities other than the rth roots of−1, a contradiction with the improperness assumption.)Supposing still θ to be an improper direction, let γ be the image curve by S of the open ray I[θ]. We have

Jγ(S(z)) = z for all z = xeiθ, with 0 ≤ x < ρ, (51)

where Jγ(u) means that the integral defining J is to be taken along the path γ. (Proof: the identity is truefor z near 0 and it extends to the whole of γ by analytic continuation.) But then, we reach a contradictionsince, all determinations of J are such that J(η) is infinite (the integral defining J diverges), while (51)implies that J must remain finite. Thus, no improper direction exists.

We have thus shown that all points of the disc of radius ρ other than ρ and its conjugate are regular pointsfor S. We have also shown that the value −1 is not approached by S(z)r, as z tends to the boundary of


the disc of radius ρ, so that the stated analyticity properties of S in the closed disc of radius ρ also holdfor C(z). The proof of the lemma is now complete. 2

The next step consists in characterizing the singular expansions of S(z) and C(z).

Lemma 2 Let σ > 0. In the complex plane slit along the ray R≥ρ, one has

S(z) =z→ρ

(r0(ρ− z))−1/r0

[1− λ

r1(r0(ρ− z))r/r0 +O

((ρ− z)2r/r0

)], (52)

where r0 = rλ− 1, r1 = r(λ+ 1)− 1. The conjugate expansions at the points ρωj of (49) are determinedby the fact that S(z)/z is an analytic function of zr at 0. Similarly for C(z):

C(z) =z→ρ

(r0(ρ− z))−r/(sr0)

[1 +

1s

(1− λr

r1

)(r0(ρ− z))r/r0 +O

((ρ− z)2r/r0

)], (53)

with the conjugate expansions being determined by the fact that C(z) is an analytic function of zr at 0.

Proof: This is a simple exercise in computational asymptotics. The expansion of J(u) as u→ +∞ resultsfrom the expansion of the integrand (1 + ζr)−λ:

J(u) =u→+∞

ρ− 1rλ− 1

1urλ−1

+1

r(λ+ 1)− 11

ur(λ+1)−1+O

(1

ur(λ+2)−1

). (54)

Thus, ρ − J(u) is of the order of 1/ur0 , which, given that J and S are inverse pair, means that ρ − z isof the order of 1/S(z)r0 as z → ρ. The main term of S(z) follows; further terms are obtained by seriesreversion.

This formal process provides the singular expansion of S(z) in the form

(r0(ρ− z))−1/r0 A(r0 (ρ− z))r/r0

),

with A(w) a formal power series in w with A(0) = 1. Adopting τ = (r0(ρ− z))r/r0 as uniformizingparameter and making use of the fact that the compositional inverse of an analytic function is analytic(provided, as is the case here, that the first derivative does not vanish), we see that A(w) is furthermoreanalytic at 0. The validity of the expansion in the slit plane results. 2

5.2 Probabilistic properties of semi-sacrificial urnsEquipped with a good understanding of the analytic structure of the basic functions S and C, which involvethe single complex variable z, we can now proceed with the analysis of the bivariate generating functionH(x, 1, z). The treatment follows the customary lines exposed in the book Analytic Combinatorics [38]and they parallel what was done earlier in [30] in the case of fully sacrificial urn models. We state here inplain words a collection of probabilistic properties whose detailed formulations constitute Propositions 8–11 below.

Theorem 3 (Semi-sacrificial urns) Consider a semi-sacrificial urn determined by a matrix satisfying theconditions of (47).

(i) The probability that, at time n, all balls are of the second type (An = 0) is exponentially small; theexponential rate is a quotient of Gamma function values taken at rational arguments.

(ii) The number of balls of the first type (An) converges in distribution, in the large n limit, to a normal.The speed of convergence to the Gaussian limit is a negative fractional power of n.

(iii) The numberAn of balls of the first type satisfies the mean and variance estimates: E(An) ∼ µn andV(An) ∼ ξ2n, for computable constants µ, ξ; in particular, An/n converges in probability to µ. Generally,any rth moment admits a closed-form expression, which is of hypergeometric type.

(iv) The number An of balls of the first type satisfies a large deviation principle, with the large deviationrate function being effectively computable from the fundamental integral J of the urn model.

The extreme large deviation result result (i) and its companion (iv) are, to the best of our knowledge,new; technically, they closely parallel the case of fully sacrificial urns discussed in [30]. The Gaussian limitlaw of (ii) is known from works of Bachi, Pal, and Smythe [6, 96], but the characterization of the speed ofconvergence seems to be new.


5.2.1 Extreme large deviationsLike before, we let (An, Bn) be the vector describing the urn’s composition at time n: there are An balls oftype x and Bn balls of type y. For a semi-sacrificial urn, balls of type y never disappear, so that Bn > 0 forall n ≥ 1. A boundary configuration is the one in which An = 0, corresponding to an extreme in the largedeviation regime (this terminology follows [30]). We state:

Proposition 8 The probability of an extreme large deviation, corresponding to the event An = 0, is expo-nentially small. One has for large n satisfying n ≡ −a0

α mod r:

P(An = 0) = CRnnκ(1 +O(n−e)

), e = min

(1,β − α

β + α

),

where C > 0 and

κ = s0

(1p− 1σ

), R =

(−αα+ β

)ρ−1, ρ ≡ 1

r

Γ( 1r )Γ(1− 1

r −1s )

Γ( 1s )

.

Proof: The number of configurations is obtained by setting x = 0 in the expression (41) of the BGF(bivariate generating function) H(x, 1, z). We have, by Theorem 2:

H(0, 1, z) = S(−αz)−a0/αC(−αz)−b0/δ, (55)

if the urn is initialized with (a0, b0). Consideration of the supports of S(z) and C(z) shows that An = 0can only happen at instants equal to −a0/α plus a multiple of r, a fact which may otherwise be confirmedby a combinatorial reasoning.

The statement then results from singularity analysis [34, 38] applied to the BGF H(0, 1, z), as givenby (55), and a trite calculation. 2

5.2.2 A Gaussian limit law with speedThe knowledge of the bivariate generating function of urn histories combined with the characterization ofsingularities of the fundamental functions S,C gives access to the asymptotic law of the urn’s composition(An, Bn) at every instant. We start with a purely analytic lemma describing the radius of convergence ofurn histories.

Lemma 3 Let ρ(x) be the radius of convergence of the bivariate generating function of urn historiesH(x, 1, z) considered as a function of z. There exists a complex neighbourhood of x = 1 in which thisradius is an analytic function of x and is given by the formula

ρ(x) = x−σ∑k≥0

(−λk

)1

σ + kp

(1− xp

xp

)k

. (56)

Proof: The radius of convergence ρ(x) is determined by

−αρ(x)∆(x)σ + J(x−α∆(x)α) = ρ, (57)

where ρ = J(+∞) is the common radius of convergence of S(z) and C(z). Define the quantity

e(x) =∆(x)x

=(1− xp)1/p

x.

The quantity e(x)α becomes infinite as e(x) → 0, that is as x → 1, since α ≤ −1 by assumption. Bycomparing (57) for general x and for x = 1, we find

e−σ (ρ− J(eα)) = e−σ

∫ ∞

eα

dζ

(1 + ζr)λ.

The change of variables ζ = υα followed by term-by-term integration of the series expansion of the inte-grand then yields the stated form of ρ(x). 2


Next, from the general expression of H(x, 1, z) and from the singular expansions of S(z) and C(z), wesee directly that

H(x, 1, z) ∼z→ρ(x)

σ−s0/σ (ρ(x)− z)−s0/σ. (58)

In other words, the function H(x, 1, z), regarded as a function of z with x a parameter, has a movablesingularity and a constant exponent. In particular, for an x suitably close to 1, one has(x)

[zn]H(x, 1, z) ∼ σ−s0/σ

Γ(s0/σ)ρ(x)−n−s0/σns0/σ−1, (59)

by virtue of singularity analysis theory [34, 38]. These estimates are seen to be consistent with the case x =1, which corresponds to the plain enumeration of urn histories.

A trite calculation suffices to enhance (58): one finds

H(x, 1, z) =z→ρ(x)

1

(σ(ρ(x)− z))s0/σ

[1 +

γb0 − βa0

p(p+ σ)(1− xp) (σ(ρ(x)− z))p/σ

+O((ρ(x)− z)2p/σ

)],

(60)

where the error term is uniform with respect to x in a sufficiently small neighbourhood of 1. It is knownthat singularity analysis preserves the uniform character of expansions with respect to a parameter (seecomments in [34] and [38, Ch. IX]). As a consequence, one has

[zn]H(x, 1, z) =n→∞

σ−s0/σ

Γ(s0/σ)ρ(x)−n−s0/σns0/σ−1

[1 +O

(n−p/σ

)], (61)

uniformly for x sufficiently close to 1. The expression simplifies further, when one considers the probabilitygenerating function of An. One has

χn(x) := E(xAn

)≡ [zn]H(x, 1, z)

[zn]H(1, 1, z)=

n→∞

(ρ(1)ρ(x)

)n+s0/σ [1 +O

(n−p/σ

)]. (62)

This say that χn(x) closely resembles the nth power of a fixed function, or, in probabilistic terms, that An

is formally like the sum of n independent identically distributed random variables. A Gaussian law isthen expected in the asymptotic limit. Precisely the Quasi-Powers Theorem of Hwang [55, 58] guaranteesthe existence of this limit law (the proof is an analytic analogue of the classical proof of the central limittheorem, based on the continuity theorem for characteristic functions); in addition, the speed of convergenceestimate expressed by (62) translates automatically into an upper bound on the speed of convergence to theGaussian limit (this fact results from a suitable use of the classical Berry-Esseen inequalities [27, 80]).

Note that the coefficients in the asymptotic forms of the mean and standard deviation ofAn are accessiblefrom ρ′(1) and ρ′′(1), themselves easily computable from the expansion of ρ(x) at 1 in (56). (For instance,ρ′(1) = −γ/(β + γ).) We then have:

Proposition 9 Consider a semi-sacrificial urn. The number An of x–balls at time n satisfies for x in anycompact set of the real line

P(An − µn

ξ√n

≤ x

)=

1√2π

∫ x

−∞e−t2/2 dt+O

(n−e

), e = min

(p

σ,12

),

where the centering and scaling constants are

µ := limn→∞

E(An)n

=σγ

β + γ, ξ2 := lim

n→∞

V(An)n

=βγ

−α+ β + 2γ

(α− γ

β + γ

)2

σ.

(x) The functions S, C do have conjugate singularities at ρj = ρωj . However, none of these, except ρ itself, plays a role. Indeed, thesingularity ρj of S, C induces for H(x, 1, z) a singularity at

ρj(x) =ρj − J(x−α∆(x)α)

−α∆(x)σ;

Apart from ρ0(x) ≡ ρ(x), all the other ρj(x) escape at infinity as x → 1, and, in particular, do not contribute in the scale of theproblem, once x is restricted to a sufficiently small neighbourhood of 1.


5.2.3 MomentsOur purpose here is to establish that moments of any order r ∈ Z≥1 of the urn’s composition An areexpressible in hypergeometric form in the following sense.

Definition 4 A function t(n) defined for integer values of its argument is said to be a hypergeometric termif t(n+1)/t(n) is a rational function of n. It is said to be of hypergeometric form or type if it is a C-linearcombination of a finite number of hypergeometric terms.

Bivariate generating functions are a convenient way to get access to moments. Set

χr(z) :=∂r

∂xrH(x, 1, z)

∣∣∣∣x=1

= r![(x− 1)r]H(x, 1, z),

where the last form indicates that H(x, 1, z) should be expanded at x = 1. The rth factorial moment of An

is then

E (An(An − 1) · · · (An − r + 1)) =[zn]χr(z)[zn]χ0(z)

.

The starting point of the analysis(xi) is an extended version of the expansion (60), namely

H(x, 1, z) = (σ(ρ(x)− z))−s0/σ A(σ(ρ(x)− z)p/σ

), (63)

where A(w) is an analytic function of w at 0. (Analyticity derives from the fact that A is obtained bycomposition and reversion of convergent Puiseux expansions; cf Lemma 2.) Write A(w) =

∑k≥0 akw

k

(one has a0 = 1 and a1 given by(60)). We have

H(x, 1, z) =∑k≥0

ak(1− xp)k (σ(ρ(x)− z))(−s0+kp)/σ. (64)

The rth moment is obtained by applying ∂rx followed by the substitution x → 1. Clearly, only the terms

corresponding to k = 0, . . . , r need to be considered.Now, we have the reduction

(σ(ρ(x)− z))−θ∣∣∣x=1

= (1− σz)−θ,

as is seen from setting x = 1 in (64), given the fact that H(1, 1, z) is the univariate GF of all urn historiesgiven by (5). It then results that, for any m ≥ 0, the quantity

∂xm (σ(ρ(x)− z)−θ

∣∣∣x=1

is a linear combination of terms of the form (1 − σz)−θ+`, for ` = 0 . .m. Here, the exponents θ to beconsidered are

θ ∈−s0 + kp

σ

∣∣ k = 0..r.

Thus, the quantity

∂xrH(x, 1, z)|x=1 =

r∑j=0

cj,r(1− σz)−θ+r,

is the sum of a finite number of terms of the form (1− σz)−ξ. Coefficient extraction based on the rule

[zn](1− σz)−ξ = σn

(n+ ξ − 1ξ − 1

)then shows the following:

Proposition 10 All moments of a semi-sacrificial urn are of hypergeometric form in the sense of Defini-tion 4.(xi) This derivation follows closely the treatment of fully sacrificial urns in [30, p. 1216–1217].


Note 9. On moments. Another way to gain access to moments is via the partial differential equation satisfied byH(x, 1, z). This has the advantage of applying to all balanced 2 × 2 urn models. Start from the partial differentialequation (the PDE of (14), with D the operator of (9)) ∂zH = DH , where, momentarily, H represents the trivariateGF of urn histories. Using homogeneity relations, we know that it is possible to eliminate the ∂yH term, and thenset y = 1. There results the equation (which repeats (16))

(xα+1 − xγ+1)∂xH = −s0xγH + (1− σxγz)∂zH, (65)

where H = H(x, 1, z). First, it is possible to set x = 1 in (65). We then retrieve the GF H(1, 1, z) via the solution of ahomogeneous differential equation and find, as expected H(1, 1, z) = (1−σz)−s0/σ . Next, if we apply differentiation∂x

r , we obtain a linear PDE for ∂xrH(x, 1, z), which, upon setting x = 1 provides χr(z) as the solution of an

inhomogeneous differential equation of order 1. The homogeneous equation is of the same type as that satisfied byH(1, 1, z). Then, the variation-of-constant method is applicable. Thus the GF of any moment for any 2 × 2 balancedurn model (not only sacrificial ones!) is expressible by quadratures. This is an illustration of the “moment pumping”method (see [35]), where moments are pumped out of a functional equation for a BGF. We do not pursue this threadhere since the method appears to be more involved than the plain perturbation techniques employed in the paper. (Itwould also be of interest to relate this approach to Pouyanne’s methods [91, 90].) 2

5.2.4 Large deviationsThe discussion of large deviations (see [19] for an accessible introduction) follows here the same lines asin the treatment of fully sacrificial urns by [30]. It is based on the general fact that knowledge of singular-ities of a bivariate generating functions as the secondary parameter x varies does provide exponential tailbounds [38, 39, 56, 57] on the associated distributions.

The exact representation

P(An ≤ k) = [xk](

11− x

[zn]H(x, 1, z)[zn]H(1, 1, z)

),

and the estimate[zn]H(x, 1, z)[zn]H(1, 1, z)

=C(x)C(1)

(ρ(1)ρ(x)

)n

(1 + o(1))

imply the family of bounds valid for x ∈ (0, 1):

P(An ≤ k) ≤ 1xk(1− x)

C(x)C(1)

(ρ(1)ρ(x)

)n

(1 + o(1)) . (66)

(Use the fact that, for φ an analytic function with nonnegative coefficients, one has [xk]φ(x) ≤ φ(x)x−k

when x > 0 lies within the disc of convergence of φ.) The large deviation case (for left tails, when k = κnis less than the mean E(An) ∼ µn, that is, κ < n), is quantified by choosing the particular value x0 of xthat optimizes the family of bounds (66). This motivates the following statement.

Proposition 11 For a semi-sacrificial urn and for a number κ < µ one has the large deviation estimate

limn→∞

1n

log P(An ≤ κn) = log(

ρ(1)xκ

0ρ(x0)

),

where x0 is the root in (0, 1) of the equation

−x0ρ′(x0)

ρ(x0)= κ.

A similar estimate holds for the right tails, P(An ≥ κn) with κ > µ.

Proof (sketch): The upper bound is provided by the preceding discussion. That the upper bound is tightfollows from an argument of Cramer-type, which has been placed within the general framework of Quasi-Powers approximations by Hwang [56, 57]. (The essential idea is to develop a central limit law, again basedon Quasi-Powers, for an exponentially shifted family of distributions that includes the distribution of An.)2


5.3 The algebraic family [−1, σ + 1, σ − 1, 1]

We now conduct a search for semi-sacrificial urns whose generating function is elementary. In particular,our goal is to find classes of urns such that the S and C functions reduce to classical functions of analysis.A sufficient condition is that the integral in (35), on which the J function of Definition 3 is based, be analgebraic function.

In accordance with (35), we consider the integral

I(u) :=∫ u

c

w−α−1

(1 + wp)β/pdw.

(The choice of c is immaterial here.) For a semi-sacrificial urn, we have α < 0. An immediate sufficientcondition for explicit integrability is that w−α−1dw be (up to a constant factor) the differential elementcorresponding to the quantity wp in the denominator. Such is the case if p = −α, that is

γ − α = −α,

implying γ = 0. This means that the matrix M is triangular—this case, which is incompatible with theconditions (47), requires a special treatment that is given in §7.

An alternative condition for explicit integrability is obtained by reduction to the previous case, after thechange of variables v = 1/w. Proceeding in this way, we find

I(u) =∫ 1/u

1/c

vα−1

(1 + v−p)β/pdv =

∫ 1/u

1/c

vα+β−1

(1 + vp)β/pdv.

Thus, explicit integrability is granted if σ = α+β and p = γ−α are equal. Algebraically, this correspondsto the two parameter family of matrices

(α γ−2αγ −α

). Now, since we have α < 0, the tenability conditions

require that |α| should divide γ, say γ = −dα. We only consider arithmetically irreducible matrices, sothat we only need to retain the case α = −1. The matrices of interest are thus of the form(

−1 σ + 1σ − 1 1

), (67)

and we know a priori that I , hence J , hence S and C, are each a simple algebraic function. It is pleasantto see that the general class of matrices (67) encapsulate the urn of Example 3 (p. 78), which is obtained bysetting σ = 2.

Example 5. The second algebraic urn [−1, 4, 2, 1]. We have corresponding to σ = 3

M =

„−1 42 1

«, p = 3, λ ≡ β

p=

4

3.


J(u) ≡Z u

0

dζ

(1 + ζ3)4/3=

u

(1 + u3)1/3.

The function S(z) is the solution of J(S(z)) = z, that is, S/(1 + S3)1/3 = z, which is readily solved to provide Sand then C:

S(z) =z

(1− z3)1/3, C(z) = (1− z3)1/3.

(Notice the striking parallel with the corresponding function in Example 3.) The GF of urn histories is then found to be

H(x, 1, z) =(1− x3)s0/3

`x + z(1− x3)

´a0`1− (x + z(1− x3))3

´s0/3.

The radius of convergence ρ(x), which determines the main characteristics of the distribution of An is then found to be

ρ(x) =1

1 + x + x2,

which, as predicted by Theorem 3 is an analytic function of x for x near 1. In this case, the large deviation rate functioncan be made fully explicit in terms of simple radicals. 2

The computations in the last example are typical. We state:

Proposition 12 Any urn having a matrix of the form (67) with σ ≥ 2 is such that the bivariate generatingfunction of histories is an algebraic function expressible by radicals:

H(x, 1, z) =(1− xσ)s0/σ (x+ z(1− xσ))a0

(1− (x+ z(1− xσ))σ)s0/σ.


5.4 The algebraic family [−1, σ + 1, 1, σ − 1] (σ even)When aiming at integrability, a good thing to do is look for small values of the parameter r since theintegrand of the fundamental integral J(u) is (1 + ζr)−λ. For semi-sacrificial urns, the case r = 1 isirrelevant, so that we try next r = 2.

Example 6. The third algebraic urn [−1, 5, 1, 3]. The parameters are r = 2 and λ = 5/2, the balance being σ = 4.The fundamental integral

J(u) =

Z u

0

dζ

(1 + ζ2)5/2=

u(3 + 2u2)

3(1 + u2)3/2

is then an algebraic function. Accordingly, the function S(z), which solves J(S(z)) = z, satisfies

S2(3 + 2S2)2

3(1 + S2)3= z2,

which means that its square S2 is a cubic algebraic function in z2. Its expansion could then be made explicit by meansof Lagrange inversion. 2

The previous example suggests considering the class(−1 σ + 11 σ − 1

), (68)

and, specifically, restrict attention to the case when σ is even. Note that the case σ = 2 also includes thefirst algebraic urn. (The case σ = 2 resorts to both of the schemes (67) and (68).) For σ = 2τ , it is seen(using integration by parts) that J(u) is of the form

J(u) =uPτ (u2)

(1 + u2)τ−1/2,

where Pτ is a polynomial of degree τ − 1. As a consequence:

Proposition 13 Any urn having a matrix of the form (68) with σ ≥ 2 is such that the bivariate generatingfunction of histories is an algebraic function. In particular, the pseudo-sine function is such that its squareis the inverse of a rational function.

6 The class of 2× 2 fully sacrificial urns (α, δ < −1)Fully sacrificial urns are such that both diagonal entries in the matrix are negative:

M =(α βγ δ

), α < 0, δ < 0, βγ 6= 0, σ > 0. (69)

These models are the main object of study of the article by Flajolet, Gabarro and Pekari [30], of which thepresent works constitutes in many ways a follow-up. We give below a brief account of this paper under ourcurrent perspective.

6.1 Analytic solutionsThe first integral of Proposition 7 (see §4, p. 74) applies to the fully sacrificial case, so that the formaltreatment of semi-sacrificial urns (§5) can be adapted easily to the fully sacrificial case. In fact, the formalanalysis of [30] was based on the partial differential equation of the complete GF of urn histories as dis-cussed in Note 1 (see p. 66 and Equation (14)). That partial differential equation was then solved by themethod of characteristics, an hitherto neglected tool in the analysis of urn models. In our view, the presenttreatment, via the Isomorphism Theorem (Theorem 1) allows for a more transparent connection betweenthe urn’s structure and its analytic solution, since it bypasses a somewhat opaque computational procedure.

The next phase of [30] consists in working out properties of the S,C functions as regards their singulari-ties. (The corresponding functions are called there ψI and ψII .) The fact that both α and δ are now negativeentails that the conformal mapping properties of the base functions are comparatively simple to elucidate:contrary to the situation encountered with semi-sacrificial urns in §5, there are no occurrences of multi-ple coverings, so that suitable sectors of the complex z-plane are found to be mapped to well-determinedquadrilaterals in the image complex plane (these are the “elementary kites” of [30]). In such a case, domi-nant singularities can be found directly, without resorting to the indirect argument employed in the proof ofLemma 1, p. 81.

The analytic machinery needed to grind probabilistic properties from singularities is then identical to theone exposed in the previous section, §5. One obtains in this way a result that closely parallels Theorem 3.


(M) (Σ) (D) Type

A`−2 34 −3

´x = x−1y3, y = x4y−2 x−1y3∂x + x4y−2∂y ℘ [30, 88]

B`−1 23 −2

´x = y2, y = x3y−1 y2∂x + x3y−1∂y Jacobian [30]

C`−1 22 −1

´x = y2, y = x2 y2∂x + x2∂y Dixonian [16]

D`−1 33 −1

´x = y3, y = x3 y3∂x + x3∂y lemniscatic [30]

E`−1 35 −3

´x = y3, y = x5y−2 y3∂x + x5y−2∂y

√℘

F`−1 45 −2

´x = y4, y = x5y−1 y4∂x + x5y−1∂y

Figure 7: Top: The six elliptic 2× 2 urns: matrix (M), system (Σ), partial differential operator (D), and elliptic type.Bottom: The corresponding six lattices

`A B CD E F

´.

Theorem 4 (Fully sacrificial urns) Consider a fully sacrificial urn determined by a matrix satisfying theconditions of (69).

(i) The probability that, at time n, all balls are of the second type (An = 0) is exponentially small; thisexponential rate is a quotient of Gamma function values taken at rational arguments.

(ii) The number of balls of the first type (An) converges in distribution, in the large n limit, to a Gaussianlaw. The speed of convergence to the Gaussian limit is a negative fractional power of n.

(iii) The number An of balls of the first type satisfies the mean and variance estimates: E(An) ∼ µnand V(An) ∼ ξ2n; in particular, An/n converges in probability to µ. Generally, any rth moment admits aclosed-form expression, which is of hypergeometric type.

(iv) The number An of balls of the first type satisfies a large deviation principle, with the large deviationrate function being effectively computable from the fundamental integral J of the urn model.

Detailed statements that are analogous to Propositions 8–11 above are found in [30].

6.2 Elliptic urnsAn elliptic function is a function that is meromorphic in the whole complex plane and is doubly periodic.There are several ways to develop the theory of such functions—accessible introductions appear in thebooks of Whittaker and Watson [103] and Chandrasekharan [13]. Elliptic functions also enjoy specialconformal mapping properties.

One of the results of [30] amounts to characterizing all 2 × 2 urn models whose solutions can be ex-pressed in terms of elliptic functions. It is found that there are exactly six such models (after arithmeticalirreducibility, in the sense of (30), p. 74, is imposed), which are listed in Figure 7 [top]. Such elliptic urnsare associated to the regular tilings of the Euclidean plane displayed in Figure 7 [bottom].

For our purposes, the most fruitful relations that determine elliptic functions are the differential ones. Forthe Weierstraß ℘-function, this is

℘′(z)2 = 4℘(z)3 − g2℘(z)− g3, implying ℘′′(z) = 6℘(z)2 − 12g2. (70)


0.05

0.1

0.15

0.2

0.25

0.3

0 0.2 0.4 0.6 0.8 1 0

0.05

0.1

0.15

0.2

0.25

0.3

0.1 0.2 0.3 0.4 0.5

Figure 8: The 2–3 tree urn model. Left: a plot of − 1n

log P(An = k)n+1k=0 against k/(n + 1), for n = 24 . . 96;

right: a comparison against the limit large deviation rate function.

see [103, Ch.XX]. The Jacobian elliptic function sn(z, k) satisfies the nonlinear differential equation(dy

dz

)2

= (1− y2)(1− k2y2); (71)

see [103, Ch.XXII]. Finally, an interesting differential system considered by Dixon [20], namelyx = y2

y = x2 ,

x(0) = 1y(0) = 0 , (72)

has solutions, written X(z) = cmh(z) and Y (z) = smh(z) in the notations of [16], that parameterize theFermat hyperbola X3 − Y 3 = 1 (this curve is of genus 1, hence elliptic). We now briefly review some ofthe elliptic models of Figure 7, under the angle of differential relations and the isomorphism theorem.

A. The urn

A =(−2 34 −3

),

(matrix A of Figure 7) is of great historical interest: it appears in the context of balanced 2–3 trees, adata structure of computer science used for sorting and searching [71]. The fringe analysis leads to thisurn, the “2–3 tree urn”; see [2, 5, 106] as well as [1] for a “process viewpoint” and [83] for relations torotation balanced data structures. Panholzer and Prodinger first derived the elliptic solution of this particularmodel [88], which then served as a prime motivation for the development of [30]. Explicit reductionsto elliptic functions are developed in [30, 88], one of which goes as follows: strat from the associateddifferential system,

x = x−1y3

y = x4y−2 ,

then notice that xx = y3, so that x2 = 2∫y3; similarly, we have y3 = 3

∫x4 which yields x2 =

∫ ∫x4.

Setting x2 = Ξ then shows that Ξ′′ = 6Ξ2, which is an avatar of the differential equation (70) satisfied bythe Weierstraß ℘ function.

Figure 8 displays numeric aspects of the large-deviation regime of such urns.B. The urn

B =(−1 23 −2

)has an associated differential system

x = y2

y = x3y−1 ,

which can be directly reduced to elliptic form like in the previous example. The second equation putunder the form yy = x3 implies that y2 = 2

∫x3. Then, the first equation becomes x =

∫x3, which by

differentiation gives x = 2x3. This is then integrated upon multiplication by x: we have xx = 2x3x and12 x

2 = 12x

4 + 12K for some constant K. The latter equation is of the Jacobian form (71).

C. The system (72) defining Dixonian functions is directly associated to the urn with matrix

C =(−1 22 −1

)


(matrix C of Figure 7) discussed as Example 4, p. 79 under the name of “generation-parity model”, withthe functions smh, cmh providing the pseudo-sine (S) and pseudo-cosine (C) of the general theory. Thecombinatorics of these functions, including the reduction to the Weierstraß ℘ form and further properties ofthe urn model, are examined in [16].

D. The case of matrix D is reduced to elliptic form in [30]. What appears in this case is the lemniscaticintegral

I(u) =∫ u

0

dζ√1− ζ4

.

E. For the urn and associated system

E =(−1 35 −3

),

x = y3

y = x5y−2 ,

one can start the reduction like in Case C above. One finds y3 = 3∫x5, so that x = 3

∫x5, implying

x = 3x5 and x2 = x6 + K. This is not immediately recognizable as being of elliptic type. However, setΞ = x2. We find Ξ2 = 4Ξ3 +4K, which is now seen to be of elliptic type. Thus, Ξ(t) is an elliptic functionand x is the square-root of an elliptic function. (It is still justified to name the urn “elliptic”, since the initialconditions a0 = 2, b0 = 0, for instance, lead to bona fide elliptic function solutions.)

F. The reduction of the matrix F to elliptic form is discussed briefly in [30].Let us finally mention that a 3 × 3 model, which is exactly solvable in terms of Jacobian functions, is

discussed in §9.4, p. 107 (the pelican’s urn).

7 Triangular 2× 2 urns (γ = 0)In this section, we examine balanced urn models whose replacement rules are determined by a triangularmatrix(xii):

M =(α σ − α0 σ

), 0 < α < σ. (73)

(Thus, with our general notations: γ = 0, and δ = σ is the balance of the urn.)Triangular urns may be viewed as a simple model for evolving species. Consider two populations, for

instance apes and human. At time n, a randomly chosen individual gives birth to σ children. Apes mayevolve and as a consequence give birth to children of both species. But since offsprings inherit the char-acteristics of their parents, a human can only give birth to a human (this is perhaps an optimistic model ofreality). The evolution of both populations can then be represented by such a triangular urn model.

We establish here the following properties, whose detailed statements constitute Propositions 14–17.

Theorem 5 (Triangular 2× 2 urns) Consider a triangular urn model determined by a matrix of the form (73).(i) The complete generating function of urn histories is an algebraic function of a special radical form.

A single combinatorial sum expresses exactly the composition of the urn at any instant.(ii) Moments of any order are expressible as finite linear combinations of products and quotients of

binomial coefficients (i.e., they are of hypergeometric type).(iii) Both a central and a local limit hold for the urn’s composition, the distribution being expressible in

terms of the stable laws of probability theory.(iv) Moments of fractional orders and the Mellin transform of the limit law are expressible as quotients

of Gamma function values.

Bagchi and Pal [6, p. 395] briefly mention the triangular urn model, which, in their terms, “requiresfurther investigation” and “presents some curious technical problems and appears to need a separate treat-ment”. The model was later studied by Gouet [47] who established convergence in distribution but did notcharacterize the limit, due to limitations arising from the non-constructive character of martingale theory.Kotz, Mahmoud and Robert [74] found the exact (hypergeometric) form of the expectation of the numberof balls of the first color at time n for the particular urn ( 1 3

0 4 ). An original analysis of 2× 2 triangular urnsalong the lines of the present treatment was included in the first version (dated March 2003) of [30], but itwas restricted to urns of the form

(σ−1 1

0 σ

), and for reasons of size, had to be kept out of the eventually pub-

lished version [30]. Puyhaubert and Flajolet then worked out the case of a general matrix of the form (73),(xii) The case α = σ leads back to the original urn studied by Polya; the case α = 0 is such that the urn’s composition is deterministic:

see the “records’ urn” in §2.6; the case α < 0 is such that balls of type x disappear almost surely, so that the evolution is eventuallytrivial. Higher dimensional urns described by a triangular matrix are also of interest since they form an integrable family.


with the corresponding results being reported in the thesis [92]. In the mean time, Janson [63] developeda wide-encompassing theory of triangular urn models, including non-balanced ones. To this class of trian-gular models, we contribute explicit formulae, speed of convergence estimates, as well as a local limit law.Note finally, that Gnedin and Pitman have recently encountered similar generating functions in [45].

7.1 Generating functions and the exact form of probabilitiesOur starting point is the triangular matrix (73) and the main tool is the isomorphism theorem (Theorem 1,p. 65). We assume a0 > 0, since otherwise the urn would never contain a ball of the first type. Regardingexact expressions of probabilities, we have:

Proposition 14 Consider a triangular urn of the form (73), whose composition at time 0 is 〈a0, b0〉. Thebivariate generating function of urn histories is an algebraic function expressible by radicals:

H(x, 1, z) = xa0 (1− σz)−b0/σ(1− xα

(1− (1− σz)α/σ

))−a0/α

. (74)

The probability that the urn contains a0 + kα black balls at time n is

P [An = a0 + kα] =Γ (n+ 1) Γ

(s0σ

)Γ(

s0σ + n

) (k + a0

α − 1k

) k∑i=0

(−1)i

(k

i

)(n+ b0−αi

σ − 1n

). (75)

Proof: Triangular urns are most conveniently treated by means of the associated differential system thatreads:

x = xα+1yσ−α

y = yσ+1 ,

x(0) = x0

y(0) = y0.

The system is triangular, so that one can first solve for y, resulting in

Y (t) = y0(1− σy−σ0 t)−1/σ.

Then X(t) must satisfy:xx−α−1 = yσ−α,

which is explicitly integrable. Hence, the solution to the system is explicit and it provides the algebraicfunction of (74).

By definition, only blocks of α balls can be added to the urn, so that An is necessarily of the formAn = a0+kα for some k ∈ Z≥0. Coefficient extraction from (74) based on the expansion of (1−X)−α0/a

then yields

[xa0+kα]H(x, 1, z) =(k + a0

α − 1k

)(1− σz)−b0/σ

(1− (1− σz)α/σ

)k

.

The formula (75) then follows after expanding the kth power, upon performing coefficient extraction withrespect to z. 2

Example 7. The simplest triangular urn: [1, 1; 0, 2]. The triangular matrix

M =

„1 10 2

«,

with σ = 2, gives rise to the simplest nontrivial triangular model. The BGF of urn histories is

H(x, 1, z) = xa0(1− 2z)−b0/2 `1− x`1−

√1− 2z

´´−a0,

which leads to

[xk]H(x, 1, z) =

k − 1

a0 − 1

!(1− 2z)−b0/2 `1−√1− 2z

´k−a0.

Consider the case b0 = 0 and (for convenience) a0 = 1. Now, the function φ(z) = 1 −√

1− 2z is such thatits powers can be explicitly expanded by means of the Lagrange Inversion Theorem, since it satisfies the implicitdefinition φ = z/(2− φ). We thus find (n ≥ 1 and 2 ≤ k ≤ n + 1)

P[An = k] =k − 1

n2k−1

`2n−kn−1

´`2nn

´ ,


under the initial condition (a0, b0) = (1, 0). Asymptotically, the distribution is of the Rayley type,

P`an = bξ

√nc´∼ ξ

2√

ne−ξ2/4,

uniformly for ξ in any compact set of R>0, this fact also constitutes a special case of Proposition 16 below. This urnwas studied independently by Janson who obtains a central limit result [63, Ex. 3.1]. 2

7.2 MomentsLike probabilities, moments of the number of x-balls admit closed-form expressions: these are even simpler,as each moment turns out to be of hypergeometric type, that is, a finite linear combination of products andquotients of Gamma function values (see Definition 4, p. 87). For an alternative approach, see [90, §3.2].

Proposition 15 The random variable An has expectation and standard deviation that satisfy

E [An] = a0

Γ(

t0s

)Γ(

t0+αs

) nα/s +O(nα/σ−1

),

(σ [An])2 = a0

(a0 + α)Γ( t0

s )Γ( t0+2α

s )− a0

(Γ(

t0s

)Γ(

t0+αs

))2n2α/s +O

(nα/σ

).

(76)

The general moment of order ` is a finite linear combination of terms of the form(t0+kα

s

)· · ·(

t0+kαs + n− 1

)t0s · · ·

(t0s + n− 1

) =

( t0+kαs +n−1

n

)( t0s +n−1

n

) , 0 ≤ k ≤ `,

and asymptotically,

E[An

`]

= α` Γ(

a0+` αα

)Γ(

t0s

)Γ(

a0α

)Γ(

t0+` αs

) n` α/s +O(n(`−1)α/σ). (77)

Proof: Given the probability generating function ofAn (Proposition 14), moments are obtained by repeateddifferentiation with respect to variable u followed by the assignment u 7→ 1. Precisely, the `th factorialmoment is equal to

E [An(An − 1) · · · (An − `+ 1)] =Γ (n+ 1) Γ

(t0σ

)σnΓ

(t0σ + n

) [zn][∂`

∂u`H(x, 1, z)

]u=1

.

For the first moment, this gives us

E [An] =Γ (n+ 1) Γ

(t0σ

)σnΓ

(t0σ + n

) [zn](∂H(x, 1, z)

∂u

)x=1

=Γ (n+ 1) Γ

(t0σ

)σnΓ

(t0σ + n

) [zn]a0∆t0+a

where ∆ denotes here the function ∆ = (1− σz)−1/σ . For the second moment, one finds

E[An

2]

=Γ (n+ 1) Γ

(t0σ

)σnΓ

(t0σ + n

) [zn](a0(a0 + a)∆t0+2a − a0 a∆t0+a

).

Coefficient extraction leads to binomial coefficient forms involving rational parameters. The asymptoticequivalents are obtained by Stirling’s formula, which proves (76).

Similarly, for higher moments, we start by expressing the function H(x, 1, z) as

H(x, 1, z) = ∆t0(1−∆a

(1− x−a

))−a0/a.

For x close enough to 1, this expression admits a series expansion given by

H(x, 1, z) = ∆t0

∞∑k=0

a0

a· · ·(a0

a+ k − 1

)∆ka (1− x−a)k

k!.


Introduce the collection of numbers (sk,`)k,`∈Z≥0 such that for x near 1,

(1− x−α)k =∞∑

`=k

sk,`(x− 1)`.

Then, sk,k = αk and upon exchanging summations, one finds

H(x, 1, z) = ∆t0

∞∑`=0

(x− 1)`

(∑k=0

sk,`

k!a0

α· · ·(a0

α+ k − 1

)∆kα

). (78)

From this expression, it is seen that the `th derivative of H(x, 1, z) at x = 1 is a polynomial in ∆, of degreet0 + `α, whose leading coefficient is a0 · · · (a0 + (k − 1)α) = α`Γ

(a0α + `

)/Γ(

a0α

). By linearity, this

property also extends to the `th power moment. Finally, since

[zn](1− z)−λ1 = o([zn](1− z)−λ2

)if λ1 < λ2,

the asymptotic equivalent of the general moment follows. 2

Example 8. The KMR triangular urn [1, 3; 0, 4] . In [74], Kotz, Mahmoud, and Robert discuss recurrences satisfied byprobabilities and moments in the triangular case and make explicit some first moment computations. In particular, theystudy a variant of the model

M =

„1 30 4

«,

and we name the latter the KMR triangular urn. We consider here the initial conditions (a0, b0) = (1, 0). The meannumber of balls of the x-type at time n satisfies

E[An] = 4n + 1− 1`n−3/4

n

´∼ 4n− π

√2

Γ(3/4)n3/4 + 1 + O(n−1/4).

(Similar forms have been computed by the authors of [74] using somewhat heavier recurrence manipulations.) Heremoments result effortlessly from taking partial derivatives of the BGF H(x, 1, z) at x = 1,

H(x, 1, z) =1

δ1/4, H ′

x(x, 1, z) =1

δ5/4− 1

δ,

H ′′xx(x, 1, z) = 5δ−9/4 − 8δ−2 + 4δ−7/4 − 5δ−5/4 + 4δ−1,

(79)

where δ = 1 − 4z, upon extracting coefficients. As a consequence of the last equation, the variance also admits anexplicit form involving binomial coefficients of fractional index, and its asymptotic value is

2

3π

8√

2− 3π

Γ(3/4)2n3/2 − 3π

√2

Γ(3/4)n3/4 + O(

√n).

The number of balls of the second colour has thus been found to be of average order n3/4, which coincides with theorder of growth of the standard deviation. This is a strong indication of the presence of a non-Gaussian law(xiii), aproperty that we justify in the next subsection. 2

7.3 Local limit lawsThe non-Gaussian character of the limit in triangular cases is obvious from the moment calculations of theprevious subsection. It was observed by Smythe in [96]. However, apparently the martingale argumentsof [96] do not lead to a determination of the limit law. Gouet [47] shows the existence of a limit distributionand remarks that “information on the moments of Z [the limit] can be obtained from the difference equa-tions characterizing the moments of Wn [our An]”, but he does not appear to have available a completecharacterization of this limit.

Proposition 16 The random variable An follows a local limit law whose density is expressed in terms ofMittag-Leffler functions. Precisely, for any compact set I of R>0, and any ξ in I such that ξ nα/σ is aninteger, one has

P[An = a0 + αξnα/σ

]=

1nα/σ

g(ξ) +O

(1

n2α/σ

)(xiii) We are grateful to Dr Philippe Robert for asking us, in 2003, to determine the variance and the corresponding distribution, which

eventually led to the developments of the present section, first reported in the 2003 version of [30] and in [92].


0 1 1n

R n

rn

Figure 9: Hankel integration contour

where the error term holds uniformly with respect to ξ ∈ I . The function g is defined on R>0 by means ofthe series expansion

g(ξ) =Γ(

t0σ

)Γ(

a0α

)ξa0/α−1∑k≥0

(−1)k

Γ(

b0−kασ

) ξk

k!. (80)

Hence, Γ(

a0α

)/Γ(

t0α

)(−ξ)1−a0/αg(−ξ) is the inverse Laplace transform of a

Mittag-Leffler function with parameters (−α/σ, b0/σ). If b0 = 0, it is exactly the density of a stablelaw of index α/σ.

Proof: In connection with the analytic inclination of this paper, it is of interest to note that the stable-likedistributions arise here for a bivariate generating function whose singular structure experiences a discon-tinuity at 1 (see [38, Ch. IX] for a perspective). The type of phenomenon encountered here resorts to theso-called critical composition scheme of [7]: this is described by a BGF of the form F (uG(z)) in the casewhere F,G have algebraic singularities, while G at its singularity assumes a value that creates a singularityfor F . The result thus follows from [7], but for completeness, we sketch the proof here.

By expanding the outer radical (·)−a0/a in the expression (74) of H , one finds

P [An = a0 + kα] =Γ (n+ 1) Γ

(t0σ

) (k+a0α −1k

)σnΓ

(t0σ + n

) [zn] (1− σz)−b0σ(1− (1− σz)

ασ

)k.

The coefficient of zn is estimated by means of Cauchy’s coefficient formula,

[zn] (1− σz)−b0/σ(1− (1− σz)α/σ

)k

=σn

2iπ

∫γ

(1− z)−b0σ

(1− (1− z)

ασ

)k dz

zn+1,

using a contour of integration γ that winds around 1 and is scaled according to the general principles ofsingularity analysis [34, 38]. The contour is displayed in Figure 9: we choose rn = 1 + Cn−α/σ and then,Rn =

√rn2 + (1/n)2, where C is a fixed positive real. Set k = ξnα/σ , where ξ lies in a fixed compact

subset I of R>0. When C is chosen sufficiently large (C > max(I) is enough), the contribution along thelarge circle is exponentially small for any ξ in I . Setting z = (1− t/n) then leads to the approximation,

P [An = a0 + ka] ∼Γ(

t0s

)Γ(

a0a

)ξa0/a−1 12πnα/σ

∫Ht−b0/σe−ξtα/σ+tdt,

where H is a clockwise oriented loop around the negative real axis. Next, keep the kernel t−b0/σet andexpand the other exponential; this gives us

12iπ

∫Ht−b0/σe−ξtα/σ+tdt =

12iπ

∞∑k=0

∫H

(−ξ)k

k!t

ka−b0s etdt

=1

2iπ

∞∑k=0

(−ξ)k

Γ(

ka−b0s

)k!

where the last equality derives from Hankel’s classical representation of the Gamma function [103]. Theerror estimates are obtained by expanding the integral to one more term. The connection with stable laws


x

3.532.521.510.5

y

0

0.25

0.2

0.15

0.1

0.05

0

x

3.532.521.51

y

0.5

0.5

0

0.4

0.3

0.2

0.1

0

Figure 10: Limit laws of triangular urn models with 2 types of balls.

finally results from a trite comparison with the classical expression of their densities as found in the booksof Feller [27] or Zolotarev [107]. 2

Figure 10 displays some of the limit densities of triangular urns for various initial conditions. The ploton the left corresponds to limit laws of

[α 20 α+2

]urns, α ∈ [2 . . 6], with initial conditions (α + 1, 1), the

plot on the right displays densities of the same urns, with initial conditions (α − 1, 1). Graphs with higherpeaks correspond to higher values of α.

7.4 Mellin transforms and moments of fractional ordersWe finally determine the Mellin transform of the limit law of An. Recall that, given a function h(ξ) definedover R≥0, its Mellin transform is defined by

h?(s) =∫ ∞

0

h(ξ)ξs−1 dξ.

In particular, if h(ξ) is the density of a nonnegative random variable Y , then h?(s) = E[Y s−1]. In otherwords, a Mellin transform provides moments of fractional (or even complex) orders, and its expressionmust extrapolate the values of the classical power moments of integral order.

Proposition 17 The density g(x) of the limit law of An admits a Mellin transform given, for any s suchthat <(s) > 1− a0/α, by

g∗(s) =Γ(

a0+(s−1)αα

)Γ(

t0σ

)Γ(

a0α

)Γ(

t0+(s−1)ασ

) . (81)

The moment of order κ of the limit law exists for any complex κ with < (κ) > −a0/a and is equal tog∗(κ+ 1), with g? provided by (81).

Proof: The Mellin transform of the density is given by

g∗(s) =∫ +∞

0

ξs−1

(Γ(

t0σ

)Γ(

a0α

)ξa0/α−1

∫Ht−b0/σe−ξtα/σ+tdt

)dξ

where H is a clockwise oriented loop around the negative real axis. Define

F (ξ, t) =Γ(

t0σ

)Γ(

a0α

)ξa0/α+σ−2t−b0/σe−ξtα/σ+t.

The integral∫ +∞0

ξs−1e−ξγdξ converges for any λ such that <(λ) > 0 and is equal to λ−sΓ(s). Thecontour H can is chosen as displayed below,


0 1

θ

with π2 < θ < σ

απ2 . Given this integration contour, for any t of H and any s satisfying <(s) > 1 − a0/α,

the integral∫ +∞0

F (ξ, t)dξ converges and its value is∫ +∞

0

F (ξ, t)dξ = Γ(a0 + (s− 1)α

α

)t−(t0+(s−1)α)/σ et.

It suffices to integrate this expression with respect to t, and the statement follows. 2

8 Intermezzo: the OK Corral urn [0,−1,−1, 0]The analytic approach to 2 × 2 urn models leads to the solution of several classical problems in a unifiedway. Instances that have been encountered in earlier sections include the coupon collector’s problem, theEhrenfest model, records and rises in permutations, leaves in binary increasing trees and trees of acquain-tance networks, as well as the fringe analysis of 2–3 trees. We briefly discuss here results obtained in theperiod 2004–2005 by two of us [36], which will give rise to a separate publication. These results exemplifyfurther the interest of an analytic approach to urn models, as well as the diversity of stochastic behavioursthat such models may lead to.

The model at stake here is described by a matrix

M =(

0 −1−1 0

), (82)

which is not of a type hitherto considered (here, β = γ = −1, while all previous results are based on theassumption β, γ ≥ 0). If an x ball is picked up, then a y ball should be taken out of the urn; if a y ball ispicked up, then a x ball should be taken out of the urn. The process stops once balls of one or the othercolour are exhausted.

A vivid interpretation of the urn defined by (82) is as follows. Consider two groups of gunwomen facingeach other. At time 0, the first group comprises m fighters and the second group comprises n fighters. (Weassume throughout m,n ≥ 1.) At each discrete instant, a fighter chosen uniformly at random shoots andkills a member of the opposite group. The problem consists in quantifying the chances for a team to winas well as the number of survivors. In other words: To what extent does a numerical advantage in the fieldincrease the chances of a favorable outcome? As pointed out by Kingman [69], this question is a basic onein the mathematical theory of warfare and conflicts initiated by Frederick William Lanchester (1868–1946).

The problem of analysing the urn (82) has been named the “OK Corral Problem” by reference to thefamous gunfight at the OK Corral Ranch of Tombstone in 1881. It makes sense both in the “fair” case wherem,n are equal (or nearly equal) and in the “unfair” case where m n or n m. It was introduced byWilliams and McIlroy in [105]. These authors gave a heuristic argument based on a deterministic differentialsystem and a diffusion process approximation suggesting for the mean number µ(m,n) of survivors theasymptotic estimates

µ(m,n) ∼√m2 − n2, for m n,

µ(m,m) ∼ 2Km3/4, with K := 3−1/4π−1/2Γ(3/4).(83)

In particular the numerical evidence for the correctness of the second estimate was overwhelming.Kingman [68] proved the estimate of µ(m,m) in (83) and a good deal more. By recurrence manipula-

tions, he obtained, for the fair case m = n, the asymptotic form of higher moments of the number S ofsurvivors, which leads to a characterization of the limit distribution of S:

P(S ≤ λn3/4) →√

2π

∫ λ2/3

0

e−x2/2 dx. (84)


100

80

60

40

20

0100806040200

100

80

60

40

20

0100806040200

Figure 11: Left: Fifty random evolutions in the fair case m = n = 50. Right: Random evolutions in the general casem + n = 100, with m ∈ [1 . . 99]. The horizontal axis represents time, the vertical axis corresponds to the size of thefirst group.

Remarkably, the methods of [68] also make it possible to justify the first approximation of (83) for theunfair fight, in the case when m − n is large compared to

√m. Finally Theorem 3 of [68] describes the

transition region when m − n ∼ x√m in terms of a Gaussian error function (erf). Kingman’s arguments

in [68] are based on moment methods, and accordingly they say little regarding speed of convergence tothe asymptotic limit in estimates like (84). This problem is revisited by Kingman and Volkov in [70] whoproved a local limit law, (i.e., convergence to the Gaussian density function) accompanied with error boundsfor the fair case m = n, in a form consistent with (84). The results of [70] are based on ingenious couplingarguments. Interestingly enough, they also lead to an exact expression involving a double alternating sumsfor the probabilities of interest.

The analytic results reported in [36] (see also [92] for a preliminary presentation) are the following.First, it is possible to obtain expressions for the probability distribution of the number of survivors thatare valid in all (fair/unfair) cases and are simpler than those of Kingman and Volkov, as they involve onlya single alternating sum. Then, an explicit single alternating sum representation for the probability ofsurvival of the first group, can be derived. The approach reveals a connection with Eulerian numbers and itprovides an independent proof of Kingman’s phase transition (84) as well as a precise and optimal speed ofconvergence estimate. If the ratio α = m/n satisfies α < 1, then the probability for the first group to winbecomes exponentially small, and one can precisely quantify this exponential smallness. Precise statementsare given now.

Proposition 18 (OK Corral: Exact forms) (i) The probability P(m,n, s) of observing s survivors fromthe first group in the OK Corral problem admits the exact expressions:

P(m,n, s) =s!

(m+ n)!

m∑k=1

(−1)m−k

(k − 1s− 1

)(m+ n

n+ k

)km+n−s

=s!

(m+ n)!

n∑k=1

(−1)n−k

(k + s− 1

k

)(m+ n

m+ k

)km+n−s.

(85)

(ii) The probability P1(m,n) of survival of the first group satisfies

P1(m,n) =1

(m+ n)!

n∑k=1

(−1)m−k

(m+ n

n+ k

)km+n (86)

=1

(m+ n)!

m∑k=1

Am+n,k, (87)

where An,k =∑

0≤j≤k(−1)j(n+1

j

)(k − j)n. In particular, the survival probability is equal to the proba-

bility that a random permutation of size m+ n has less than m rises.

Proof (sketch): (i) The starting point is the time-reversal transformation of Kingman and Volkov [70] (seeNote 2, p. 66 for background): it is possible to view histories of the OK Corral process as mirror images


0/5

0 1 2 32/3

2/3

3/3 1/3

3/31/3

0123450/5 1/5 2/5 3/5 4/5

4/5 3/5 2/5 1/5

Figure 12: The transition graphs of the Ehrenfest urn with N = 3 particles (top) and of the Mabinogion urn withN + 2 = 5 sheep (bottom) illustrate the duality of the two models, under time reversal.

of histories of Friedman’s urn, up to a normalizing factor. It then becomes possible to make use of thecomplete analytic solution of Friedman’s urn as given by (20), p. 68. Standard expansions provide the firstform stated in (85), while the seemingly similar second form results from a suitable integral representation.(ii) Generating function calculations make it possible to carry explicitly the summation needed to derivethe evaluation (86) from that of (85). 2

Proposition 19 (OK Corral: asymptotic forms) (i) Consider nearly fair fights defined by initial condi-

tions m,n satisfyingm− n√m+ n

= θ, where θ lies in a compact interval of the real line. Then, the survival

probability of the first group satisfies

P1(m,n) =1√2π

∫ θ√

3

−∞e−t2/2 dt+O

(1n

), (88)

the uniform error term O(n−1) being optimal.(ii) Consider the “unfair” regime m = αn, with α < 1, a constant. There exist a function W (α)

satisfying W (0) = −∞, W (1) = 0, such that the probability of survival of the first group satisfies

1n

log P1 (αn, n) = W (α) +O

(log nn

). (89)

The function W is defined for α ∈ (0, 1) by W (α) = −(1+α) log(

log xα

xα − 1

)−α log xα, the real number

xα being defined implicitly by the equationxα

xα − 1− 1

log xα=

α

1 + α.

Proof (sketch): (i) The estimate (88) is a consequence of (86), after a central limit estimate with speedis developed, based on quasi-powers approximation. (ii) The estimate (89) is a large deviation estimateestablished by a process similar to the one detailed earlier in the case of semi-sacrificial urns. 2

Other statements of [36] include exact representations for all power moments of the number of survivorsin the fair case m = n: these make precise and complement calculations of Kingman (who obtainedmoments of orders 4, 8, 12, . . .) and lead to another proof of (84). In particular, one gains direct access tothe mean itself, whose expression is a sum that involves the birthday paradox function,

Q(k) :=k

k+k(k − 1)k2

+ · · ·+ k!kk.

An integral representation of the Norlund-Rice type [37] then leads to an asymptotic expansion of the meannumber of survivors. It is of interest to note that, in this perspective, the asymptotic analysis relies on ananalytic extrapolation of the sequence (Q(k)) to the complex plane.

Note 10. The Mabinogion urn. The time reversal transformation that relates the OK Corral urn to Friedman’s canbe of interest in other contexts. For instance, the Mabinogion urn(xiv), described by Williams in [104, p. 159] andcorresponding to the matrix

M =

„1 −1−1 1

«,

analytically reduces to the Ehrenfest urn when time is reversed: see Figure 12 (details are omitted in this short abstract).2

(xiv) According to a medieval Welsh folk tale , the Mabinogion, there is a magical flock of sheep, some black, some white. Whenevera black sheep bleats, a white sheep becomes instantly black; whenever a white sheep bleats, a black sheep becomes instantly white.


9 Higher dimensional modelsWe briefly discuss here a few additional urn models involving balls of more than two colours. Let m be thenumber of colours. We now let xj be the formal variable associated to colour j. When there are only threetypes of balls (m = 3), we tend to use x := x1, y := x2, and u := x3, for readability.

Definition 5 For an urn with matrixM = (Mi,j), (90)

the associated differential system is the following collection of m ordinary differential equation:

Σ :

xi = xi

m∏j=1

xMi,j

j

i=1 . . m

, (91)

with initial conditionsxi(0) = xi.0i=1 . . m . (92)

By a slight abuse of notation, the system (91) can be rewritten as

X = XI+M ,

with I the identity matrix of the proper dimension. An analogue of Theorem 1 still holds for the associatedsystem.

Theorem 6 (Basic isomorphism, m×m urns) Consider an m×m urn with matrix M that is balanced,that is,

∑j Mi,j = σ for all j = 1 . .m. Let X(t) = (Xi(t))m

i=1 be the solution of the system (91) underthe initial conditions (92). Assume that

∏mi=1 xi,0 6= 0. Then, there exists a small neighbourhood of 0 in

the z-plane such that

H(x1,0, . . . , xm,0, z) =m∏

i=1

Xj(t)ai,0 ,

when the urn is initialized with ai,0 balls of colour i.

In general, form ≥ 3, we do not know of first integrals for this system (see Note 11 below). Accordingly,our discussion in this article will be limited to special cases, ones that usually involve symmetry. Fortu-nately, urn models naturally occurring in real life often possess some interesting structure—often, precisely,a symmetry of sorts.

Note 11. Nonintegrability(xv). The system

x = y2, y = u2, u = x2, (93)

together with its d dimensional generalization, has been considered by Jouanolou in [65] in the context of Pfaffian sys-tems. (This model does have combinatorial content since it corresponds to analysing the depth of nodes in binary searchtrees and increasing trees taken modulo d; it is thus a natural generalization of the generation-parity urn, Example 4.) Atheorem of [65] asserts the non-existence of a polynomial first integral for (93). Elementary proofs of non-integrability(in the sense that they are based on basic differential algebra rather than algebraic geometry) are discussed in [82, 86].These results are of great interest: they imply that the global approach that we developed for 2 × 2 urns (based on thefirst integral of Section 3) admits of no universal generalization in any dimension d ≥ 3. 2

9.1 The purely autistic (hyper-Polya) urnNobody cares about her neighbour: the matrix is the unit matrix, which for m = 3 is

M =

1 0 00 1 00 0 1

.

This is a generalization of the original Polya urn. For m = 3, still, the associated differential system reads

x = x2, y = y2, u = u2.

(xv) We are greatly indebted to Jacques-Arthur Weil and Aleksandar Sedoglavic for insightful discussions and references relative tothese questions. In particular A. Sedoglavic has mostly kindly provided the references to the works of Jouanolou, Moulin-Ollagnier,Nowicki et al [65, 82, 86].


III

I

II

40

30

20

030 40200

10

10

Figure 13: The cyclic chambers urn with m = 3 chambers. Left: overall structure. Right: a random evolution of theurn initialized with (a0, b0, c0) = (90, 0, 0) represented by the successive values of (Bn, Cn) for n = 0 . . 900.

Not surprisingly, variables separate. Taking into account the initial condition vector (x0, y0, u0), we have

x(t) =x0

1− x0t, y(t) =

y01− y0t

, u(t) =u0

1− u0t.

The complete GF of urn histories when starting from a configuration of composition (a0, b0, c0) is then

H(x0, y0, u0, z) =(

x0

1− x0z

)a0(

y01− y0z

)b0 ( u0

1− u0z

)c0

.

Coefficient extraction is immediate, and we get by the binomial theorem:

Proposition 20 For the purely autistic urn in dimension m = 3 initialized with (a0, b0, c0), one has

P(An = a,Bn = b, Cn = c) =

(a−1a0−1

)(b−1b0−1

)(c−1c0−1

)(a0+b0+c0+n−1

a0+b0+c0−1

) ,

assuming a+ b+ c = n+ a0 + b0 + c0.

This formula generalizes the case m = 2; see (18). It is of course easily derived directly (in combinatorialterms, we are talking of a shuffle of histories corresponding to each one of the various colours, and shufflestranslate into products of exponential generating functions), extends to any dimension m, and is very wellknown.

9.2 The cyclic chambers (hyper-Ehrenfest) urnThis is a natural generalization of the Ehrenfest urn introduced for dimensionm = 2 in §2.3, p. 69. Form =3, imagine that you have three chambers arranged cyclically at third roots of unity. Particles are allowedto migrate freely from one chamber to the next in clockwise order (Figure 13, left). (Thus the model has aprivileged direction for the migration of particles.) The matrix that models the system is

M =

−1 1 00 −1 11 0 −1

(Each particle becomes a ball whose colour represents the chamber the particle is currently in.)

The associated differential system is then

Σ : x = y, y = u, u = x,

so that x(t) satisfiesd3

dt3x(t) = x(t),


which implies that x(t) is a linear combination of

et, eωt, eω2t, with ω := e2iπ/3.

And similarly for y(t), u(t).The analysis is best developed for the general case of dimension m. Define the sectioned exponentials:

Er,m(t) :=∑

n≡r mod m

tn

n!,

so that Er,m(t) = O(tr) as t → 0. For instance: E0,2(t) = cosh t and E1,2(t) = sinh t. Equivalently, wehave

Er,m(t) =1m

m∑j=1

ω−jreωjt, ω := exp(

2πim

).

These functions with j = 0, . . . ,m− 1 clearly form a basis of the linear space of solutions of the equation

dm

dtmf(t)− f(t) = 0.

The system Σ with initial conditions x0 = (x1,0, x2,0, . . .) then has solutions

X1

(t∣∣ x0

)=

m∑j=1

xj,mEj−1,m(t),

and similarly for the other Xj’s (with a cyclic shift of the initial condition vector (x1,0, x2,0, . . .). Thisprovides the complete analytic solution of the urn model:

Proposition 21 For the cyclic chambers urn in dimension m initialized with a vector (a1,0, a2,0, . . .), thecomplete GF of histories satisfies

H(x0; z) =m∏

j=1

Xj

(z∣∣ x0

)aj,0,

where

Xj

(z∣∣ x0

)=

m∑r=1

xj+r−1,mEj−1,m(t), (94)

in which we interpret xm+1 ≡ x1, xm+2 ≡ x2, and so on. In particular, the probability that an urninitialized with (N, 0, 0) returns to its original state in n steps is

P(An = N) =n!Nn

[zn]E0,m(z)N . (95)

Speed of convergence to equilibrium. Let us first return to m = 3 and specialize the discussion toa system initialized with N balls of the first colour and no ball of any of the other colours. The initialcomposition vector is then (N, 0, 0). Let hn be the number of histories corresponding to the event that attime n, balls are gain all of the first colour, i.e., all particles have returned home. The EGF of the sequencehn is by (95):

h(z) = E0,3(z)N .

The multinomial expansion then provides

h(z) =1

3N

∑j1+j2+j3=N

(N

j1, j2, j3

)exp

((j1 + ωj2 + ω2j3)z

),

so that

hn =1

3N

∑j1+j2+j3=N

(N

j1, j2, j3

)(j1 + ωj2 + ω2j3

)N. (96)

(This quantity is nonzero only if n ≡ 0 mod 3.) The expression above is seen to nicely extend the analysisof the standard Ehrenfest model.


-0.5

0.5

1

-0.5

0 0.50

1

1

-0.5

0.5

0-1

-0.5

0.5

-1

010.5

0

-0.5

0.5

0-0.5

Figure 14: The spectrum of the Markov chain describing the cyclic m-chambers (hyper-Ehrenfest) urn, when m =3, 4, 5.

The system is, like in dimension 2, represented probabilistically by finite Markov chains—the completeone (analogous to the hypercube) and the reduced one (see §2.3 and Figure 5, p. 71). The reduced chain,which keeps track only of the population of each chamber, has

(N+2

2

)states and a transition matrix of the

same dimension. The spectrum of that matrix is also visible on an expression (96): the eigenvalues areprecisely the quantities

λj1,j2 =j1N

+j2Nω +

(1− j1

N− j2N

)ω2. (97)

The reason why the enumerative result (96) “contains” all eigenvalues is the following. For any directedgraph, the number of closed paths with a root vertex distinguished having length n is TrAn with A theadjacency matrix of the graph and Tr the trace operator—this quantity is none other than the nth powersum symmetric function of the eigenvalues (with possible multiplicities): see for instance [11, p. 12] or [38,Ch. V]. For the graph of the complete Markov chain (with 3N states, where the identities of the balls arepreserved, see the discussion of §2.3), the number of closed paths is equal to 3Nhn, by the fundamentalsymmetry of the model. Since the reduced chain is obtained by aggregating states of the complete chain,the sets of eigenvalues of both matrices are the same, only multiplicities differ. Figure 14 (left) displays thecorresponding eigenvalues when m = 3 and N = 8.

The second largest eigenvalues dictate the speed of convergence to equilibrium. In Figure 13 (right),we see a random trajectory represented by the pair (Bn, Cn) of the urn initialized with (N, 0, 0): the urnfirst evolves by transferring balls from Chamber I to Chamber II [the nearly horizontal part of the pathin Figure 13], then enough balls migrate from Chamber II to Chamber III [the slanted zigzag part of thepath], after which the urn tends to spend most of the time near the “equilibrium” value (N/3, N/3) [the blobaround the point (30, 30)]. The common modulus of the second eigenvalues is made fully explicit by thecalculation (97). These correspond to j2 = 1, j3 = 0 or j2 = 0, j3 = 1 so that their common modulus µ[3]

is

µ[3] =∣∣∣∣(1− 1

N

)+

1Ne2iπ/3

∣∣∣∣ =√

1− 3N

+3N2

.

In dimension m (i.e., there are m cyclically arranged chambers), this modulus is made fully explicit by asimilar calculation.

Proposition 22 For the Markov chain associated to the cyclic chambers urn with m cyclic chambers, themodulus of the second largest eigenvalues is

µ[m] =∣∣∣∣(1− 1

N

)+

1Ne2iπ/m

∣∣∣∣ =√(

1− 2N

)2

+4N

(1− 1

N

)cos2

π

m. (98)

The formula (98) is also compatible with what we know of the standard Ehrenfest urn (m = 2), for whichthe spectrum is N/N, (N − 2)/N, . . . ,−(N − 2)/N,−N/N , so that the convergence to equilibrium isan exponential with rate µ[2] = (N − 2)/N . The spectrum for N = 8 and m = 3, 4, 5 is represented inFigure 14. (It is unclear to us how Kac’s treatment could be extended to cover such higher dimensionalcases—for relevant techniques, see Mohar’s survey [85].)


Note 12. Combinatorial analysis of the cyclic chambers urn. The formulae above are not all that surprising from acombinatorial point of view: let the urn be, as above, initialized with N balls of the first type and return to its initial stateafter n moves. A corresponding history can be regarded as a partition of the set 1, . . . , n into an ordered collectionof N (possibly empty) disjoint classes, each of which has size a multiple of 3: the jth class records the various timesat which ball j has been active. Given general principles of combinatorial analysis, this makes it obvious why thegenerating function is an N th power of a sectioned exponential. (This generalizes earlier observations of Flajolet,Francon, Goulden, and Jackson [14, 29, 50] relative to the case m = 2.) 2

The differential system approach can also be extended to urns involving random transitions, providedthat in all cases the model remains balanced. In that case the system becomes polynomial (instead of beingplainly monomial). To illustrate this point, we briefly discuss an undirected version of the cyclic chambersmodel.

Example 9. The symmetrical three-chambers model. The case where each of three chambers communicates with itstwo neighbours can be modelled by a matrix of the form

M =

0@ −1 Z 1− ZZ −1 1− ZZ 1− Z −1

1A ,

where the Bernoulli random variable Z assumes the values 0, 1 with equal probability, and each transition of the urninvolves an independent draw of a variable of type Z. A simple extension of Theorem 6 then shows that one shouldconsider the differential system 8<:

x = 12(y + u)

y = 12(u + x)

u = 12(x + y)

,

8<:x(0) = x0

y(0) = y0

u(0) = u0.(99)

The sum S := x + y + u satisfies S = S, so that

S(t) = (x0 + y0 + z0)et,

and x(t) satisfies the differential equation x + 12x = 1

2S. Thus, the first base function is

X(z) =1

3(x0 + y0 + u0)e

z +1

3(2x0 − y0 − u0)e

−z/2, (100)

and symmetrically for Y (z), U(z). In particular, starting with the initial configuration (N, 0, 0), the probability that allparticles have returned to the first chamber at time n, obtained by setting y0 = u0 = 0 in (100), is

P(An = N) =n!

Nn[zn]

„1

3ez +

2

3e−z/2

«N

.

For a fixed value of N , this quantity is, asymptotically in n,

P(An = N) =1

3N

»1 + O

„„1− 1

2N

«n«–.

This last estimate reflects the fact that the symmetrical urn reaches equilibrium faster than its nonsymmetrical counter-part.

More generally, models of multichamber systems where communication between chambers is described by a finitegraph equipped with probabilities [67] are amenable to such a treatment. Relations between urn models and interactingparticle systems as considered by Itoh and his coauthors [59, 60, 61] seem to deserve further study. 2

9.3 The generalized coupon collector’s urnStart with the double coupon collector problem: Alice and her little brother Bob collect images or couponsfound in chocolate tablets, and Alice who comes first passes all her duplicates to Bob. They want each tohave a complete collection, knowing that there are initially N different coupons. How much time do theyneed, on average, in probability?

The problem is modelled by the urn

M =

−1 1 00 −1 10 0 0

.


Initially there are N balls all of the first colour (x). As soon as an x-ball has been picked up, it becomes aball of the second colour (y). A y ball, when picked up, becomes of the third colour (u) and remains so forever. It is easily realized that the double coupon collector problem is adequately modelled in this way.

The associated differential system is

x = y, y = u, u = u.

It is linear with constant coefficients and of a triangular form: one can then solve for u, which gives asimple exponential, then solve backwards by successive integrations. Given the initial condition vector(x0, y0, u0), the solution is

U(z) = u0ez, Y (z) = y0 + u0(ez − 1), X(z) = x0 + y0z + u0(ez − 1− z).

The complete GF of histories when the urn is started at time 0 with the vector (N, 0, 0) is then

H(x0, y0, u0, z) = (x0 + y0z + u0(ez − 1− z))N.

What is of interest is the time at which a double collection is obtained, that is, all balls of x, y have disap-peared. Such configurations correspond to setting (x0, y0, u0) to (0, 0, 1). Thus the probability that at timen two collections have (already) been obtained is

n!Nn

[zn] (ez − 1− z)N.

These numbers are higher order generalizations of Stirling partition numbers.The results above are also extremely well known and in accordance with what either poissonization

arguments or combinatorial enumeration techniques otherwise provide [31, 38, 40, 87]. They generalizeeasily to higher dimensions, thereby providing a partial unification of balls-and-bins models (random allo-cations [72]) and urn processes of the type studied in this paper—this aspect is otherwise discussed in thecontext of continuous time models by Holst [53].

The foregoing considerations are also of interest to researchers working in analysis of algorithms. Indeed,it can be verified that the algorithm known as random probing hashing(xvi) in a paging environment can bemodelled by an urn of the multiple coupon collector’s type. This observation has been made by Durand andViola in 2003–2004 (unpublished) and a preliminary account of the analysis is to be found in Chapter 3 ofMarianne Durand’s PhD thesis [24]. (The treatment of Durand and Viola makes use of the earlier techniqueof characteristics, as adapted from [30].)

9.4 The pelican’s urnThere is a tradition to be found in several cultures (Egyptian and Christian, inter alia): The mother pelicanhad to feed her little baby pelicans in time of famine; she would tear her breast open to feed her young withher entrails. This is rendered as follows by Alfred de Musset in his famous poem, La nuit de mai.

Lorsque le pelican, lasse d’un long voyage,Dans les brouillards du soir retourne a ses roseaux,Ses petits affames courent sur le rivageEn le voyant au loin s’abattre sur les eaux.Deja, croyant saisir et partager leur proie,Ils courent a leur pere avec des cris de joieEn secouant leurs becs sur leurs goitres hideux.Lui, gagnant a pas lents une roche elevee,De son aile pendante abritant sa couvee,Pecheur melancolique, il regarde les cieux.

Le sang coule a longs flots de sa poitrine ouverte ;En vain il a des mers fouille la profondeur ;L’Ocean etait vide et la plage deserte ;Pour toute nourriture il apporte son cœur.Sombre et silencieux, etendu sur la pierre,Partageant a ses fils ses entrailles de pere,Dans son amour sublime il berce sa douleur,Et, regardant couler sa sanglante mamelle,Sur son festin de mort il s’affaisse et chancelle,Ivre de volupte, de tendresse et d’horreur.

The preceding considerations perhaps justify naming as pelican’s the urn whose matrix is, for m = 3,

M =

−1 1 11 −1 11 1 −1

.

(xvi) See Knuth’s description of the model in [71, Ex. 6.4.48]. The random probing model is a companion to the uniform probingmodel described in [71, pp. 534–535], and both serve to estimate the efficiency of double hashing [71, pp. 528–529]. See also thediscussion in [101, pp. 507–509].


The associated differential system is

x = yz, y = ux, u = xy. (101)

Observe that the system is strangely reminiscent of a system of differential equation satisfied by the Jaco-bian elliptic functions [103, pp. 492–493]. To wit:

d

dtsn t = cn t dn t,

d

dtsn t = −dn t sn t,

d

dtsn t = −k2 sn t cn t, (102)

where k is known as the modulus. The elliptic connection is easily elicited. In (101), multiply the firstequation by x, the second by y, the third by u, resulting in

xx = yy = uu = xyu. (103)

This gives us a system of two first integrals:

x2 − x20 = y2 − y2

0 = z2 − z20 ,

which is sufficient to guarantee integrability. In particular, one can express y, u in terms of x, resulting inthe following differential equation for x:

dx√(x2 − x2

0 + y20)(x2 − x2

0 + u20)

= dt.

This shows that indeed x(t) is of the Jacobian elliptic type. The reduction to the standard normal form ofJacobian elliptic functions, namely,

dw

dt=√

(1− w2)(1− k2w2),

is then easily obtained upon a double change of scale:

X(t) =1√

x20 − y2

0

sn

(t√

x20 − y2

0

+K,

√x2

0 − y20

x20 − u2

0

).

(The second argument is the explicitly written modulus; the constantK is determined by initial conditions.)One can then set x0 = 1 as well as y0 = y, u0 = u, to obtain the three fundamental functions of thepelican’s urn.

It is interesting to note that the pelican’s urn is related to investigations carried out by Schett in the 1970s:Schett introduced the trivariate operator

D = yu∂x + ux∂y + xy∂u,

which is precisely the partial differential operator of the pelican’s urn, and he was led to doing so forthe purpose of finding efficient algorithms for calculating the Taylor expansions of the Jacobian ellipticfunctions [93]. Starting from there, Dumont and Viennot were then able to deduce new combinatorialinterpretations of coefficients of the Jacobian elliptic functions [21, 22, 23, 100]. To this panoply, ourcurrent developments thus add histories of the pelican’s urn as a new combinatorial interpretation.

The hyperpelican urn and hyperelliptic functions. Integrability is also granted for higher dimensionalversions of the pelican’s urn. For instance, in dimension m = 4, the matrix

M =

−1 1 1 11 −1 1 11 1 −1 11 1 1 −1

will give rise to a hyperelliptic equation

dx√(x2 − a2)(x2 − b2)(x2 − c2)

= dt,

since (compare with (103))xx = yy = uu = vv = xyuv.

We are not currently aware of works dedicated to the combinatorial interpretation of Abelian integrals (e.g.,Abel’s addition theorem) over curves of genus g > 1, though some interesting combinatorics might arise.Urn models at least indicate that hyperrelliptic integrals and their inverses may well have combinatorialcontent.


10 Triangular 3× 3 urnsThe theme of this section is the study of urn models with a 3× 3 matrix of the triangular form

M =

α β σ − α− β0 δ σ − δ0 0 σ

. (104)

Such urns present a specific interest since they do not satisfy the irreducibility condition assumed in mostother works. (For a notable exception, see Janson’s recent study [63].)

A 3 × 3 triangular urn serves as yet another simple model of evolving species. There are now threespecies, corresponding to three types of balls, say x, y, u, with respective populations An, Bn, Cn at time n.Evolution is irreversible, following the direction x → y → u, in accordance with the following diagram:

The evolution of balls of the first colour, x, is described by a 2×2 triangular urn process, namely(α σ − α0 σ

)(simply aggregate colours y, u and view the corresponding balls as being of a single colour), so that we mayfreely restrict our attention to the random variables Bn and Cn representing at time n the number of ballsof colour y and u, respectively. In order to avoid degenerate cases, we further assume the conditions

αβδ 6= 0. (105)

We now present a classification of triangular urn models with three types of balls under the assump-tions (104) and (105). It turns out that three cases need to be considered: in each case, we establishconvergence in distribution of Bn and fully characterize its limit law. (Similar methods would apply toestimate Cn, while An is simply described by a balanced 2× 2 triangular model whose behaviour is knownfrom §7.) The results of this section have been already reported in the thesis [92], to which we refer fordetailed proofs.

10.1 Generating functionsA function I(z) serves to express all generating functions relative to 3 × 3 urns models—it is in fact aspecial hypergeometric function. Its main analytic property is described by the following lemma.

Lemma 4 Let λ = α+β−δδ . For any complex z satisfying |z| < 1, the function I(z), defined by

I(z) =∑

n≥0 , n 6=α/δ

(−1)n α

α− nδ

(n− λ− 1

n

)zn, (106)

admits a unique analytic continuation to the slit complex plane C \ R≤−1.

Proof (sketch): The function I(z) is clearly analytic in the unit disc. Define Iz0(z) by

Iz0(z) = −αzα/δ

∫ z1/δ

z0

(1 + tδ)λ

tα+1dt, (107)

where z0 is a fixed real from ]0; 1]. By adopting as path of integration a straight line from z0 to z, we seethat Iz0 exists as an analytic function in C \ R≤ 0.

Termwise integration of the expansion of the integrand of (107) provides a function which, up to anormalizing term of low degree, coincides with I(z) near the origin. (In the case when δ divides α, afurther logarithmic term is to be taken into account.) 2


Proposition 23 Consider a 3 × 3 urn of the form (104), satisfying the condition (105) and initializedwith (a0, b0, c0). The 4-variable generating function of urn histories is

H(x, y, u; z) = Xa0Y b0U c0 . (108)

The functions Y and U are given by

U = u(1− s uσz)−1/σ, Y = y

(1− yδ

(1uδ− 1Uδ

))−1/δ

.

The function U is determined by cases as follows:

— If α/δ 6∈ Z≥1:

X = x

(1− xα

(I

((yu

)δ

− 1)y−α − I

((Y

U

)δ

− 1

)Y −α

))−1/α

.

— If there exists m ∈ Z≥1 such that α = mδ, then

X = x

(1− xα

(I

((yu

)δ

− 1)y−α − I

((Y

U

)δ

− 1

)Y −α

−(m− λ− 1

m

)(1yδ− 1uδ

)m

log(Y/y)α

))−1/α

.

Proof (sketch): We make use of the fundamental isomorphism theorem in its multidimensional version. (Aproof based instead on partial differential equations and the method of characteristics appears in [92].) Theassociated differential system is triangular: x = xα+1yβuσ−α−β

y = yδ+1uσ−δ

u = uσ+1

x(0) = x0

y(0) = y0u(0) = u0.

The u-component can be solved directly,

u(t) = u0(1− σuσ0 t)

−1/σ,

and the y-component then results by integration (since yy−δ−1 = uσ−δ),

y(t) = y0(1− yδ

0

(u−δ

0 − u(t)−δ))−1/δ

,

just like in the two-dimensional case. The x-component is finally determined by another integration (sincexx−α−1 = yβuσ−α−β), which leads to the special function I(z) defined in (106). 2

10.2 Expectation and second moment of Bn

The probability generating function of the random variable Bn, giving the number of y-balls at time n isencoded into the GF H(1, y, 1; z). Then, moments of any order are obtained as usual by differentiatingwith respect to y, then setting y = 1, and finally extracting coefficients by means of the [zn] operation. Let∆ = (1− szσ)−1/σ . The derivatives of H with respect to y at y = 1 are seen to be polynomials in ∆ andlog ∆, whose coefficients are easily determined. Three cases emerge from the analysis.

Proposition 24 Let Bn be the random variable giving the number of y-balls present in the urn at time n.Then the first and second moments of Bn belong to one of three distinct regimes:

— If α < δ: E[Bn] =

(b0 + a0

δδ−α

)Γ( t0

σ )Γ( t0+δ

σ ) nδ/σ +O

(n(δ−1)/σ

)E[(Bn)2] =

[K

Γ(t0σ )

Γ(t0+2α

σ )

]n2δ/σ +O

(n(2δ−1)/σ

)where K is

K =(b0 + a0

δ

δ − α

)2

+ δb0 + a0

(αδ2

(δ − α)2+ δ − α− 2δ +

2δ(δ + δ)2δ − α

).


— If α = δ:E[Bn] = a0 δ

α

Γ( t0σ )

Γ( t0+ασ ) n

α/σ log n+O(nα/σ

)E[(Bn)2] ∼

(a0 δα

)2 [(1 + α

a0

)Γ(

t0σ )

Γ(t0+2α

σ )

]n2α/σ (log n)2 +O

(n2α/σ log n

).

— If α > δ: E[Bn] = a0 δ

α−δ

Γ( t0σ )

Γ( t0+ασ ) n

α/σ +O(n(α−1)/σ

)E[(Bn)2] =

(a0 δα−δ

)2[(

1 + αa0

)Γ(

t0σ )

Γ(t0+2α

σ )

]n2α/σ +O

(n2(α−1)/σ

).

This result, when accompanied by similar analyses of Cn (the number of u-balls) and An (the numberof x-balls), provides an overall picture of the urn’s probable evolutions. First, since An is governed bya 2 × 2 triangular urn process, we have with high probability [w.h.p.]: An = Θ(nα/σ), given what wealready know regarding 2× 2 triangular urns (§7, p. 93). Naturally, we have An +Bn +Cn = σn+O(1),which implies Bn + Cn = σn + o(n) [w.h.p]; next Bn = o(n) holds according to Proposition 24, so thatCn = σn + o(n) [w.h.p.]. We may then regard intuitively the diagonal quantities as reproduction ratesand interpret the situation as follows: (i) if α < δ, the reproduction rate of x-balls is smaller than thatof y-balls, so that y balls evolve essentially at their own pace, which corresponds toBn being of order nδ/σ.(ii) if α = δ, the reproduction rates of x- and y-balls are the same, and from this confluence of values, alogarithmic term is induced; (iii) if α > δ, then x-balls, which reproduce faster, drive the game for y-balls.

10.3 The distribution of Bn when α ≥ δ

This is the case where the evolution of y-balls is essentially driven by that of x-balls. We treat it first as it istechnically simpler. We have:

Proposition 25 For a triangular urn with α ≥ δ, the random variable Bn giving the number of y-balls attime n admits moments that satisfy the following asymptotic estimates:

— If α > δ:

E[Bn

`]

=(

αδ

α− δ

)` Γ(

a0+`αα

)Γ(

t0σ

)Γ(

a0α

)Γ(

t0+`ασ

)n`α/σ +O(n`(α−1)/σ

)(109)

— If α = δ, then

E[Bn

`]

= δ` Γ(

a0+`αα

)Γ(

t0σ

)Γ(

a0α

)Γ(

t0+`ασ

)n`a/σ(log n)` +O(n`α/σ(log n)`−1

). (110)

In particular, the normalized variable Bn defined by

Bn =α

δ

Bn

nα/δ log nif α = δ, Bn =

α− δ

δ

Bn

nα/δotherwise.

converges in distribution to a limit low whose density is given by (80),

Proof (sketch): The moment calculations are detailed in [92]. A comparison with the moments obtainedfor the 2× 2 case shows that they are the same to main asymptotic order, up to scaling. The convergence indistribution then follows from the classical moment convergence theorem. 2

10.4 The distribution of Bn when α < δ

In this case, the functionH(1, y, 1; z) admits a series expansion at y = 1 whose coefficients are polynomialsin ∆ and log ∆; the coefficient of (y−1)` has degree `δ in ∆ and its leading term involves a rational functionof a0, b0, α, β, δ. Hence, each moment of the normalized random variable Bn = Bn/n

δ/σ converges to afinite value. The first few terms are easily obtained by series expansions (and computer algebra!), but wehave been unable to find a simple expression for the general term. The following proposition establishes aconvergence in distribution for this random variable as well as an integral representation of the characteristicfunction. (This representation can then be used to calculate moments at will, via successive differentiations.)


Proposition 26 For a triangular urn with α < δ, the normalized random variable

Bn =Bn

nδ/σ

converges in distribution to the law whose characteristic function admits the integral representation

φ(ξ) =1

2iπΓ(t0σ

)∫Ht−c0/σ (tρ − iδξ)−(a0+b0)/δ

I

(iδξ

tρ − iδξ

)−a0/α

etdt. (111)

For any ξ in [−ξ0; ξ0], the contour H can be any clockwise oriented contour surrounding the negative realaxis at a distance at least

(1 + 1

R

)1/ρ (δξ0)1/ρ, where R is such that I−1/α is well defined and bounded for

|z| ≤ R.

Proof (sketch): We only treat the case of an urn initialized with one x-ball at time 0, the general casebeing similar. The idea consists in extracting the coefficient of zn in H(1, y, 1; z) by means of Cauchy’scoefficient formula: this gives the probability generating function of Cn, up to a factor of Hn. For thiscoefficient extraction, a Hankel contour in accordance with the principles of singularity analysis is adopted.At the same time, since convergence in law depends on a suitably scaled version of Bn, the substitution

y = eiξ/nδ/σ

,

is to be used. This renormalization, combined with standard asymptotic approximations, then gives rise tothe integral representation (111).

Precisely, set ρ = δ/σ. The characteristic function φn(t) of Bn is

φn(ξ) ≡ E[eiξBn

]=

n!1 · (1 + σ) · · · (1 + (n− 1)σ)

[zn]H(1, eiξ/nρ

, 1; z).

By the continuity theorem of characteristic functions, it suffices to show that the function φn converges toφ, as given by (111). Let ξ be a fixed real number taken from an interval [−ξ0, ξ0]. The coefficient of [zn]is obtained by means of Cauchy’s coefficient formula, which gives (after the change of variable σz → z)

φn(ξ) =n!

1σ · · ·

(1σ + n− 1

) 12iπ

∮h

(z,

ξ

nβ

)− 1α dz

zn+1,

where the function h is

h(z, θ) = 1−I(eiδθ − 1

)eaθ

+ I

((1− z)−ρ

(1− e−iδθ

)1− (1− z)−ρ (1− e−iδθ)

)(e−iδθ − 1 + (1− z)ρ

)α/δ.

As already announced, we choose the integration contour to be of the Hankel type, as in Figure 9. Then werenormalize z by setting z = 1− t/n, in accordance with the general principles of singularity analysis. Onethen finds the main approximation:

h

(1− t

n,ξ

nρ

)= I

(iδξ +O

(1

nρ

)tρ − iδξ +O

(1

nρ

))(tρ − iδξ +O

(1nρ

))α/δ

+O

(1

n(δ−α)/σ

).

Integration of this last approximation over a complete Hankel contour only introduces negligible error terms(see [92] for details), a fact which concludes the proof of the statement. 2

The results above have some striking formal similarities with the analysis given by Janson of unbalanced2×2 urn models. Perhaps this could be explained to some extent by considering that the x- and y-balls playbetween themselves a game that resembles the one described by an unbalanced 2× 2 model.

11 Relations between discrete and continuous time modelsSince a large portion of the literature dedicated to urns appeals to a continuous time model, it is of obviousinterest to investigate whether a dictionary exists between the continuous and discrete frameworks. Ele-ments for this discussion can be found in the book by Athreya and Ney [4] (see p. 219–224 for a specific


treatment of the relationships between urns and branching processes), the article of Athreya and Karlin [3],as well as in the dedicated calculation of [16, §3.4].

The continuous time model assumes that each ball is equipped with an exponentially distributed, Exp(1),clock. In other words, the random time T at which a ball fires a transition (after it has been born) satisfies

P(T > t) = e−t.

The exponential distribution satisfies the well-known memoryless property, namely

P(T > t+ τ | T > τ) = P(T > t),

which entails a fundamental equivalence principle (see [4, p. 221] and [3, Th. 1]): The discrete time urnprocess at any instant n = 0, 1, 2, . . . and the continuous time version stopped after the nth firing areequivalent.

Consider a balanced urn process given as usual by a matrix M with balance σ. Let Z(t) representthe continuous time urn process at time t, and let Yn be the discrete urn process at time n (our notationsfollow [4]). We can now state a quantitative version of the equivalence principle.

Theorem 7 (Discrete-to-continuous transfer) Consider a balanced urn model with balance σ. Let E bean event that is defined by a set of histories(xvii), also denoted by E . Define the continuous time probabilities:

Ω(t) := P(Z(t) ∈ E)

and the history countsΞn := card

h ∈ E

∣∣ |h| = n.

Let the urn be initialized with s0 balls at time 0. Then, the continuous time probabilities are given by

Ω(t) = e−s0tΞ(

1σ

(1− e−σt

)), (112)

where Ξ(z) is the exponential generating function of the sequence of history counts (Ξn):

Ξ(z) :=∑n≥0

Ξnzn

n!.

Proof: The proof boils down to the determination of the distribution of the urn’s size in continuous time,the principles having been folklore knowledge for a long time (see [4, p. 109] for the generalized Yuleprocess, [63, Lemma 5.1], and [16, §3.4] for a calculation analogous to the present one). We have, uponconditioning on size,

P(Z(t) ∈ E) =∑n≥0

P(S(t) = n)Ξn

Hn, (113)

where S(t) is the random variable representing the size of the continuous time urn at time t.Regarding the distribution of S(t), we have

P[S(t+ dt) = n+ σ

]= P

[S(t) = n+ σ

]· (1− (n+ σ)dt) + P

[S(t) = n

]· ndt. (114)

(Either size has not changed as none of the (n + σ) balls fired in the interval [t, t + dt], or size waspreviously n and one of the n balls fired. This is an instance of the Kolmogorov backward equations.) Set$n(t) = P(S(t) = n). Then, Equation (115) can be rephrased as the differential recurrence

$′n+σ(t) + (n+ σ)$n+σ(t) = n$n(t). (115)

The differential equation satisfied by $n+σ is then solved by the variation-of-constant method, which pro-vides an integral recurrence:

$n+σ(t) = ne−(n+σ)t

∫ t

0

e(n+σ)t$n(t) dt, (116)

(xvii) Acceptable events are for instance: (i) the urn contains k balls of the first type; (ii) the urn contains k balls of the first type attime n; (iii) the urn has more balls of the first type than of the second type at all even instants; (iv) a ball of the second type hasnever been drawn; (v) all balls of the second type have disappeared in the end; and so on.


whose solution is easily verified to be

$n(t) ≡ P(S(t) = n) = e−s0t(1− e−σt

)n(n+ s0/σ − 1n

). (117)

It then suffices to observe that Hn, as expressed by Proposition 1 (p. 63), equals n!σn(n+s0/σ−1

n

), and

the proof is completed by combining (113) and the evaluation (117). 2

We can first conduct a few verification. Let E = ε, i.e., only the “empty” history of size 0 is in E .We have Ξ(z) = 1 and Theorem 7 predicts Ω(t) = e−s0t, which is consistent with the fact that no firingamongst the s0 original balls should have taken place till time t. Next, let E be the set of all histories: thegenerating function is, by Proposition 1, Ξ(z) ≡ H(z) = [1− σz]−s0/σ . Substitution then shows that

Ω(t) = e−s0t[1− (1− e−σt)

]−s0/σ ≡ 1,

as it should be. Naturally, the statement of Theorem 7 can be extended by linearity to generating functions.For instance, we have:

Proposition 27 Let (A(t), B(t)) be the composition of a balanced 2 × 2 urn at time t and let, as usual,H(x, y, z) be the trivariate generating function of urn histories. The continuous and discrete models arerelated by ∑

a,b

P [(A(t), B(t)) = (a, b)]xayb = e−s0tH

(x, y,

1σ

(1− e−σt

)).

As a consequence of Theorem 7 and 27, essentially all our previous calculations relative to balancedurns in discrete time can be translated back into the framework of continuous time. Take for instancethe generation-parity model of Example 4, p. 79, initialized with one x-ball. Under continuous time, theprobability that, at some large time t, all balls be of an odd generation—this is an “extreme” large deviationevent—is

e−t smh(1− e−t)

(smh(z) is defined in (43)), and is thus exponentially small, being asymptotic to e−t smh(1), wheresmh(1) .= 1.20541.

12 ConclusionsA first conclusion of this preliminary study is that some order can be put amongst those classical urnmodels that admit of explicit solutions. More generally, the methods of analytic combinatorics, based ongenerating functions, complex asymptotics and singularities, as well as the Quasi-Powers Theorem (forGaussian laws) and specific contour integration (for non-Gaussian laws) can provide a wealth of valuableinformation. In this way, we have been able to obtain precise estimates of speed of convergence, some locallimit convergence results, large deviation rate functions, and, in a few cases even, limit laws that seem to benew.

Many more problems deserve further investigation. We list them here in order of increasing difficulty.

— The study of nonsacrificial urns (either altruistic or selfish in the sense of §3, p. 74) should be con-ducted in a way that parallels our study of semi-sacrificial urns. We foresee no particular obstacle inthe case where the urn’s composition is Gaussian, as the Quasi-Powers Theorem should be applica-ble. Non-Gaussian cases should be amenable to treatment by methods of contour integration perhapsanalogous to the ones employed for triangular models (see [38, Ch. IX] for a discussion of somerelevant methods).

— The study of higher dimensional models might be amenable to a perturbative analysis of singularities,at least in the case where the urn has a Gaussian behaviour, and despite the fact that there is no hopeto obtain general solutions because of nonintegrability considerations (Note 11, p. 102).

— The elucidation of the behaviour of at least some of the non-balanced models seems to be the mostchallenging problem posed to the analytic approach at this stage.

Acknowledgements. The authors are grateful to Philippe Chassaing, Yoshiaki Itoh, Nicolas Pouyanne,Philippe Robert, Aleksandar Sedoglavic, Brigitte Vallee, and Jacques-Arthur Weil for many stimulatingdiscussions and valuable bibliographical indications.


References[1] David Aldous, Asymptotic fringe distributions for general families of random trees, Annals of Applied Probabil-

ity 1 (1991), 228–266.

[2] David Aldous, Barry Flannery, and Jose Luis Palacios, Two applications of urn processes—the fringe analysis oftrees and the simulation of quasi-stationary distributions of Markov chains, Probability in the Engineering andInformational Sciences 2 (1988), 293–307.

[3] Krishna B. Athreya and Samuel Karlin, Embedding of urn schemes into continuous time Markov branchingprocesses and related limit theorems, Ann. Math. Statist. 39 (1968), 1801–1817. MR MR0232455 (38 #780)

[4] Krishna B. Athreya and Peter E. Ney, Branching processes, Springer-Verlag, New York, 1972, Die Grundlehrender mathematischen Wissenschaften, Band 196.

[5] Ricardo A. Baeza-Yates, Fringe analysis revisited, ACM Computing Surveys 27 (1995), no. 1, 109–119.

[6] A. Bagchi and A. K. Pal, Asymptotic normality in the generalized Polya-Eggenberger urn model, with an appli-cation to computer data structures, SIAM Journal on Algebraic and Discrete Methods 6 (1985), no. 3, 394–405.

[7] Cyril Banderier, Philippe Flajolet, Gilles Schaeffer, and Michele Soria, Random maps, coalescing saddles, sin-gularity analysis, and Airy phenomena, Random Structures & Algorithms 19 (2001), no. 3/4, 194–246.

[8] Rodney J. Baxter, Exactly solved models in statistical mechanics, Academic Press Inc. [Harcourt Brace Jo-vanovich Publishers], London, 1982.

[9] F. Bergeron, G. Labelle, and P. Leroux, Combinatorial species and tree-like structures, Cambridge UniversityPress, Cambridge, 1998.

[10] Francois Bergeron, Philippe Flajolet, and Bruno Salvy, Varieties of increasing trees, CAAP’92 (J.-C. Raoult,ed.), Lecture Notes in Computer Science, vol. 581, 1992, Proceedings of the 17th Colloquium on Trees inAlgebra and Programming, Rennes, France, February 1992., pp. 24–48.

[11] Norman L. Biggs, Algebraic graph theory, Cambridge University Press, 1974.

[12] William H. Burge, An analysis of a tree sorting method and some properties of a set of trees, First USA-JAPANComputer Conference Proceedings, AFIPS and IPSJ, October 1972, pp. 372–378.

[13] K. Chandrasekharan, Elliptic functions, Grundlehren der Mathematischen Wissenschaften [Fundamental Princi-ples of Mathematical Sciences], vol. 281, Springer-Verlag, Berlin, 1985.

[14] Laurent Cheno, Philippe Flajolet, Jean Francon, Claude Puech, and Jean Vuillemin, Dynamic data structures:Finite files, limiting profiles and variance analysis, Eighteenth Annual Conference on Communication, Control,and Computing, The University of Illinois at Urbana–Champaign, 1980, pp. 223–232.

[15] Louis Comtet, Advanced combinatorics, Reidel, Dordrecht, 1974.

[16] Eric van Fossen Conrad and Philippe Flajolet, The Fermat cubic, elliptic functions, continued fractions, and acombinatorial excursion, Seminaire Lotharingien de Combinatoire 54 (2006), no. B54g, 1–44.

[17] Benoıt Daireaux, Veronique Maume-Deschamps, and Brigitte Vallee, The Lyapunov tortoise and the dyadichare, 2005 International Conference on Analysis of Algorithms, Discrete Mathematics & Theoretical ComputerScience Proceedings, vol. AD, 2005, pp. 71–94 (electronic).

[18] F. N. David and D. E. Barton, Combinatorial chance, Charles Griffin, London, 1962.

[19] Frank den Hollander, Large deviations, American Mathematical Society, Providence, RI, 2000.

[20] Alfred Cardew Dixon, On the doubly periodic functions arising out of the curve x3 + y3 − 3αxy = 1, TheQuarterly Journal of Pure and Applied Mathematics 24 (1890), 167–233.

[21] Dominique Dumont, A combinatorial interpretation for the Schett recurrence on the Jacobian elliptic functions,Mathematics of Computation 33 (1979), 1293–1297.

[22] , Une approche combinatoire des fonctions elliptiques de Jacobi, Advances in Mathematics 1 (1981),1–39.

[23] , Grammaires de William Chen et derivations dans les arbres et arborescences, Seminaire Lotharingiende Combinatoire 37 (1996), Art. B37a, 21 pp. (electronic).

[24] Marianne Durand, Combinatoire analytique et algorithmique des ensembles de donnees, Ph.D. thesis, EcolePolytechnique, France, 2004.

[25] A. Edelman and E. Kostlan, The road from Kac’s matrix to Kac’s random polynomials, Proceedings of the 1994SIAM Applied Linear Algebra Conference (Philadelphia) (J.G. Lewis, ed.), SIAM, 1994, pp. 503–507.

[26] Paul Ehrenfest and Tatiana Ehrenfest, Uber zwei bekannte Einwande gegen das Boltzmannsche H-Theorem,Physikalische Zeitschrift 8 (1907), no. 9, 311–314.

[27] W. Feller, An introduction to probability theory and its applications, vol. 2, John Wiley, 1971.


[28] Philippe Flajolet, Combinatorial aspects of continued fractions, Discrete Mathematics 32 (1980), 125–161,Reprinted in the 35th Special Anniversary Issue of Discrete Mathematics, Volume 306, Issue 10–11, Pages992-1021 (2006).

[29] Philippe Flajolet and J. Francon, Structures de donnees dynamiques en reservoir borne, III Journees Algorith-miques (J. Morgenstern, ed.), Universite de Nice, 1980, 14 pages. (Proceedings of a meeting, June 1980).

[30] Philippe Flajolet, Joaquim Gabarro, and Helmut Pekari, Analytic urns, Annals of Probability 33 (2005), no. 3,1200–1233, Available from ArXiv:math.PR/0407098.

[31] Philippe Flajolet, Daniele Gardy, and Loys Thimonier, Birthday paradox, coupon collectors, caching algorithms,and self–organizing search, Discrete Applied Mathematics 39 (1992), 207–229.

[32] Philippe Flajolet, Xavier Gourdon, and Conrado Martınez, Patterns in random binary search trees, RandomStructures & Algorithms 11 (1997), no. 3, 223–244.

[33] Philippe Flajolet and Fabrice Guillemin, The formal theory of birth-and-death processes, lattice path combina-torics, and continued fractions, Advances in Applied Probability 32 (2000), 750–778.

[34] Philippe Flajolet and Andrew M. Odlyzko, Singularity analysis of generating functions, SIAM Journal on Alge-braic and Discrete Methods 3 (1990), no. 2, 216–240.

[35] Philippe Flajolet, Patricio Poblete, and Alfredo Viola, On the analysis of linear probing hashing, Algorithmica22 (1998), no. 4, 490–515.

[36] Philippe Flajolet and Vincent Puyhaubert, Analytic combinatorics at OK Corral, Technical memorandum, 2005,To be submitted.

[37] Philippe Flajolet and Robert Sedgewick, Mellin transforms and asymptotics: finite differences and Rice’s inte-grals, Theoretical Computer Science 144 (1995), no. 1–2, 101–124.

[38] , Analytic combinatorics, April 2006, Chapters I–IX of a book to be published by Cambridge UniversityPress, 717p.+x, available electronically from P. Flajolet’s home page.

[39] Philippe Flajolet and Michele Soria, General combinatorial schemas: Gaussian limit distributions and exponen-tial tails, Discrete Mathematics 114 (1993), 159–180.

[40] Dominique Foata, Bodo Lass, and Guo-Niu Han, Les nombres hyperharmoniques et la fratrie du collectionneurde vignettes, Seminaire Lotharingien de Combinatoire 47 (2001), no. B47a, 1–20.

[41] J. Francon, G. Viennot, and Jean Vuillemin, Description and analysis of an efficient priority queue representa-tion, Proceedings of the 19th Annual Symposium on Foundations of Computer Science, IEEE Computer Press,1978, pp. 1–7.

[42] Jean Francon and Gerard Viennot, Permutations selon leurs pics, creux, doubles montees et doubles descentes,nombres d’Euler et de Genocchi, Discrete Mathematics 28 (1979), 21–35.

[43] Bernard Friedman, A simple urn model, Communications in Pure and Applied Mathematics 2 (1949), 59–70.

[44] J. Gani, Random-allocation and urn models, Journal of Applied Probability 41A (2004), 313–320, Stochasticmethods and their applications.

[45] Alexander Gnedin and Jim Pitman, Exchangeable Gibbs partitions and Stirling triangles, Preprint available onArXiv, 2004, arXiv.org:math/0412494.

[46] Raul Gouet, A martingale approach to strong convergence in a generalized Polya-Eggenberger urn model, Statis-tics & Probability Letters 8 (1989), no. 3, 225–228. MR MR1024032 (91d:60075)

[47] , Martingale functional central limit theorems for a generalized Polya urn, The Annals of Probability 21(1993), no. 3, 1624–1639.

[48] , Strong convergence of proportions in a multicolor Polya urn, Journal of Applied Probability 34 (1997),no. 2, 426–435.

[49] Ian P. Goulden and David M. Jackson, Combinatorial enumeration, John Wiley, New York, 1983.

[50] , Distributions, continued fractions, and the Ehrenfest urn model, Journal of Combinatorial Theory.Series A 41 (1986), no. 1, 21–31.

[51] Ronald L. Graham, Donald E. Knuth, and Oren Patashnik, Concrete mathematics, Addison Wesley, 1989.

[52] I. Higueras, J. Moler, F. Plo, and M. San Miguel, Urn models and differential algebraic equations, Journal ofApplied Probability 40 (2003), no. 2, 401–412.

[53] Lars Holst, A unified approach to limit theorems for urn processes, Journal of Applied Probability 16 (1979),154–162.

[54] Fred M. Hoppe, Polya-like urns and the Ewens sampling formula, J. Math. Biology 20 (1984), 91–94.

[55] Hsien-Kuei Hwang, Theoremes limites pour les structures combinatoires et les fonctions arithmetiques, Ph.D.thesis, Ecole Polytechnique, December 1994.


[56] , Large deviations for combinatorial distributions. I. Central limit theorems, The Annals of AppliedProbability 6 (1996), no. 1, 297–319.

[57] , Large deviations of combinatorial distributions. II. Local limit theorems, The Annals of Applied Prob-ability 8 (1998), no. 1, 163–181.

[58] , On convergence rates in the central limit theorems for combinatorial structures, European Journal ofCombinatorics 19 (1998), no. 3, 329–343.

[59] Yoshiaki Itoh, On a ruin problem with interaction, Annals of the Institute of Statistical Mathematics 25 (1973),635–641. MR MR0346951 (49 #11671)

[60] , Random collision models in oriented graphs, Journal of Applied Probability 16 (1979), no. 1, 36–44.

[61] Yoshiaki Itoh, Colin Mallows, and Larry Shepp, Explicit sufficient invariants for an interacting particle system,Journal of Applied Probability 35 (1998), no. 3, 633–641.

[62] Svante Janson, Functional limit theorems for multitype branching processes and generalized Polya urns,Stochastic Processes and Applications 110 (2004), no. 2, 177–245.

[63] , Limit theorems for triangular urn schemes, Probability Theory and Related Fields 134 (2006), 417–452.

[64] Norman L. Johnson and Samuel Kotz, Urn models and their application, John Wiley, 1977.

[65] J. P. Jouanolou, Equations de Pfaff algebriques, Lecture Notes in Mathematics, vol. 708, Springer, Berlin, 1979.

[66] Mark Kac, Random walk and the theory of Brownian motion, American Mathematical Monthly 54 (1947), 369–391.

[67] Samuel Karlin and James McGregor, Ehrenfest urn models, Journal of Applied Probability 2 (1965), 352–376.

[68] J. F. C. Kingman, Martingales in the OK Corral, The Bulletin of the London Mathematical Society 31 (1999),no. 5, 601–606.

[69] , Stochastic aspects of Lanchester’s theory of warfare, Journal of Applied Probability 39 (2002), no. 3,455–465.

[70] J. F. C. Kingman and S. E. Volkov, Solution to the OK Corral model via decoupling of Friedman’s urn, Journalof Theoretical Probability 16 (2003), no. 1, 267–276.

[71] Donald E. Knuth, The art of computer programming, 2nd ed., vol. 3: Sorting and Searching, Addison-Wesley,1998.

[72] Valentin F. Kolchin, Boris A. Sevastyanov, and Vladimir P. Chistyakov, Random allocations, John Wiley andSons, New York, 1978, Translated from the Russian original Slucajnye Razmesceniya.

[73] Samuel Kotz and N. Balakrishnan, Advances in urn models during the past two decades, Advances in combi-natorial methods and applications to probability and statistics, Stat. Ind. Technol., Birkhauser Boston, Boston,MA, 1997, pp. 203–257. MR MR1456736 (98f:60012)

[74] Samuel Kotz, Hosam Mahmoud, and Philippe Robert, On generalized Polya urn models, Statistics & ProbabilityLetters 49 (2000), no. 2, 163–173.

[75] Pierre-Simon Laplace, Theorie analytique des probabilites. Vol. I, II, Editions Jacques Gabay, Paris, 1995,Reprint of the 1819 and 1820 editions.

[76] P. Leroux and G. X. Viennot, Combinatorial resolution of systems of differential equations I: Ordinary differen-tial equations, Combinatoire Enumerative (G. Labelle and P. Leroux, eds.), Lecture Notes in Mathematics, no.1234, Springer-Verlag, 1986, pp. 210–245.

[77] , A Combinatorial Approach to Non Linear Functional Expansions: An Introduction With an Example,Proceedings of the 27th IEEE Conference on Decision and Control (Austin, Texas), December 1988, pp. 1314–1319.

[78] Peter Lindqvist, Some remarkable sine and cosine functions, Ricerche di Matematica XLIV (1995), no. 2, 269–290.

[79] Peter Lindqvist and Jaak Peetre, Two remarkable identities, called twos, for inverses to some Abelian integrals,The American Mathematical Monthly 108 (2001), no. 5, 403–410.

[80] Eugene Lukacs, Characteristic functions, Griffin, London, 1970.

[81] Erik Lundberg, Om hypergoniometriskafunktioner af komplexa variabla, Manuscript, 1879, English translation,“On hypergoniometric functions of complex variables” available from Jaak Peetre’s web page.

[82] Andrzej J. Maciejewski, Jean Moulin Ollagnier, Andrzej Nowicki, and Jean-Marie Strelcyn, Around Jouanolounon-integrability theorem, Indagationes Mathematicae, New Series 11 (2000), no. 2, 239–254.

[83] Hosam Mahmoud, On rotations in fringe-balanced binary trees, Information Processing Letters 65 (1998), 41–46.


[84] , Urn models and connections to random trees: a review, Journal of the Iranian Mathematical Society 2(2003), 53–114.

[85] Bojan Mohar, Some applications of Laplace eigenvalues of graphs, Graph Symmetry: Algebraic Methods andApplications (G. Hahn and G. Sabidussi, eds.), NATO ASI Ser., vol. C 497, Kluwer, 1997, pp. 225–275.

[86] Jean Moulin Ollagnier, Andrzej Nowicki, and Jean-Marie Strelcyn, On the non-existence of constants of deriva-tions: the proof of a theorem of Jouanolou and its development, Bulletin des Sciences Mathematiques 119(1995), no. 3, 195–233.

[87] Donald J. Newman and Lawrence Shepp, The double dixie cup problem, American Mathematical Monthly 67(1960), 58–61.

[88] Alois Panholzer and Helmut Prodinger, An analytic approach for the analysis of rotations in fringe-balancedbinary search trees, Annals of Combinatorics 2 (1998), 173–184.

[89] Jim Pitman, Combinatorial stochastic processes, Technical Report 621, 2002, Lecture Notes for Saint-FlourCourse, 231 pages.

[90] Nicolas Pouyanne, An algebraic approach of large Polya processes, To appear in Annales de l’Institut HenriPoincare, 40 pages, in press.

[91] , Classification of large Polya-Eggenberger urns with regard to their asymptotics, 2005 InternationalConference on Analysis of Algorithms, Discrete Mathematics and Theoretical Computer Science (Proceedings),vol. AD, 2005, pp. 275–285 (electronic).

[92] Vincent Puyhaubert, Modeles d’urnes et phenomenes de seuil en combinatoire analytique, PhD thesis, EcolePolytechnique, Palaiseau, France, 2005.

[93] Alois Schett, Properties of the Taylor series expansion coefficients of the Jacobian elliptic functions, Mathemat-ics of Computation 30 (1976), no. 133, 143–147.

[94] Robert Sedgewick and Philippe Flajolet, An introduction to the analysis of algorithms, Addison-Wesley Publish-ing Company, 1996.

[95] N. J. A. Sloane, The On-Line Encyclopedia of Integer Sequences, 2006, Published electronically atwww.research.att.com/˜njas/sequences/.

[96] R. T. Smythe, Central limit theorems for urn models, Stochastic Process. Appl. 65 (1996), no. 1, 115–137.

[97] Richard P. Stanley, Enumerative combinatorics, vol. I, Wadsworth & Brooks/Cole, 1986.

[98] Michael E. Taylor, Partial differential equations. I, Applied Mathematical Sciences, vol. 115, Springer-Verlag,New York, 1996.

[99] E. C. Titchmarsh, The theory of functions, second ed., Oxford University Press, 1939.

[100] Gerard Viennot, Une interpretation combinatoire des coefficients des developpements en serie entiere des fonc-tions elliptiques de Jacobi, Journal of Combinatorial Theory, series A 29 (1980), 121–133.

[101] Jeffrey Scott Vitter and Philippe Flajolet, Analysis of algorithms and data structures, Handbook of TheoreticalComputer Science (J. van Leeuwen, ed.), vol. A: Algorithms and Complexity, North Holland, 1990, pp. 431–524.

[102] J. Vuillemin, A unifying look at data structures, Communications of the ACM 23 (1980), no. 4, 229–239.

[103] E. T. Whittaker and G. N. Watson, A course of modern analysis, fourth ed., Cambridge University Press, 1927,Reprinted 1973.

[104] David Williams, Probability with martingales, Cambridge Mathematical Textbooks, Cambridge UniversityPress, Cambridge, 1991.

[105] David Williams and Paul McIlroy, The OK Corral and the power of the law (a curious Poisson-kernel formulafor a parabolic equation), The Bulletin of the London Mathematical Society 30 (1998), no. 2, 166–170.

[106] Andrew Chi Chih Yao, On random 2–3 trees, Acta Informatica 9 (1978), no. 2, 159–170.

[107] V. M. Zolotarev, One-dimensional stable distributions, American Mathematical Society, Providence, RI, 1986,Translated from the Russian by H. H. McFaden, Translation edited by Ben Silver.

· 2019-02-07 · Fourth Colloquium on Mathematics and Computer Science DMTCS proc. AG, 2006,...

Documents

Transcript of · 2019-02-07 · Fourth Colloquium on Mathematics and Computer Science DMTCS proc. AG, 2006,...