Appendix A
A Quick Primer on Discrete Probability
In this appendix, we develop some basic ideas in discrete probability theory. We note from the outset that some of the definitions given here are no longer correct in the setting of continuous probability theory.
Let $\Omega$ be a finite or countably infinite set, and let $2^\Omega$ denote the set of subsets of $\Omega$. An element $A \in 2^\Omega$ is simply a subset of $\Omega$, but in the language of probability it is called an event. A probability measure on $\Omega$ is a function $P: 2^\Omega \to [0,1]$ satisfying $P(\emptyset) = 0$, $P(\Omega) = 1$, and which is $\sigma$-additive; that is, for any $1 \le N \le \infty$, one has $P(\cup_{n=1}^N A_n) = \sum_{n=1}^N P(A_n)$, whenever the events $\{A_n\}_{n=1}^N$ are disjoint. From this $\sigma$-additivity, it follows that $P$ is uniquely determined by $\{P(\{x\})\}_{x \in \Omega}$. Using the $\sigma$-additivity on disjoint events, it is not hard to prove that $P$ is $\sigma$-sub-additive on arbitrary events; that is, $P(\cup_{n=1}^N A_n) \le \sum_{n=1}^N P(A_n)$, for arbitrary events $\{A_n\}_{n=1}^N$. See Exercise A.1. The pair $(\Omega, P)$ is called a probability space.
If $C$ and $D$ are events and $P(C) > 0$, then the conditional probability of $D$ given $C$ is denoted by $P(D \mid C)$ and is defined by
\[
P(D \mid C) = \frac{P(C \cap D)}{P(C)}.
\]
Note that $P(\cdot \mid C)$ is itself a probability measure on $\Omega$. Two events $C$ and $D$ are called independent if $P(C \cap D) = P(C)P(D)$. Clearly then, $C$ and $D$ are independent if either $P(C) = 0$ or $P(D) = 0$. If $P(C), P(D) > 0$, it is easy to check that independence is equivalent to either of the following two equalities: $P(D \mid C) = P(D)$ or $P(C \mid D) = P(C)$. Consider a collection $\{C_n\}_{n=1}^N$ of events, with $1 \le N \le \infty$. This collection of events is said to be independent if for any finite subset $\{C_{n_j}\}_{j=1}^m$ of the events, one has $P(\cap_{j=1}^m C_{n_j}) = \prod_{j=1}^m P(C_{n_j})$.
Let $(\Omega, P)$ be a probability space. A function $X: \Omega \to \mathbb{R}$ is called a (discrete, real-valued) random variable. For $B \subset \mathbb{R}$, we write $\{X \in B\}$ to denote the event $X^{-1}(B) = \{\omega \in \Omega : X(\omega) \in B\}$, the inverse image of $B$. When considering the probability of the event $\{X \in B\}$ or the event $\{X = x\}$, we write $P(X \in B)$ or $P(X = x)$, instead of $P(\{X \in B\})$ or $P(\{X = x\})$. The distribution of the random variable $X$ is the probability measure $\mu_X$ on $\mathbb{R}$ defined by $\mu_X(B) = P(X \in B)$, for $B \subset \mathbb{R}$. The function $p_X(x) := P(X = x)$ is called the probability function or the discrete density function for $X$.

R.G. Pinsky, Problems from the Discrete to the Continuous, Universitext, DOI 10.1007/978-3-319-07965-3, © Springer International Publishing Switzerland 2014
The expected value or expectation $EX$ of a random variable $X$ is defined by
\[
EX = \sum_{x \in \mathbb{R}} x\, P(X = x) = \sum_{x \in \mathbb{R}} x\, p_X(x), \quad \text{if } \sum_{x \in \mathbb{R}} |x|\, P(X = x) < \infty.
\]
Note that the set of $x \in \mathbb{R}$ for which $P(X = x) > 0$ is either finite or countably infinite; thus, these summations are well defined. We frequently denote $EX$ by $\mu$. If $P(X \ge 0) = 1$ and the condition above in the definition of $EX$ does not hold, then we write $EX = \infty$. In the sequel, when we say that the expectation of $X$ "exists," we mean that $\sum_{x \in \mathbb{R}} |x|\, P(X = x) < \infty$.
Given a function $g: \mathbb{R} \to \mathbb{R}$ and a random variable $X$, we can define a new random variable $Y = g(X)$. One can calculate $EY$ according to the definition of expectation above or in the following equivalent way:
\[
EY = \sum_{x \in \mathbb{R}} g(x)\, P(X = x), \quad \text{if } \sum_{x \in \mathbb{R}} |g(x)|\, P(X = x) < \infty.
\]
For $n \in \mathbb{N}$, the $n$th moment of $X$ is defined by
\[
EX^n = \sum_{x \in \mathbb{R}} x^n P(X = x), \quad \text{if } \sum_{x \in \mathbb{R}} |x|^n P(X = x) < \infty.
\]
If $\mu = EX$ exists, then one defines the variance of $X$, denoted by $\sigma^2$ or $\sigma^2(X)$ or $\mathrm{Var}(X)$, by
\[
\sigma^2 = E(X - \mu)^2 = \sum_{x \in \mathbb{R}} (x - \mu)^2 P(X = x).
\]
Of course, it is possible to have $\sigma^2 = \infty$. It is easy to check that
\[
\sigma^2(X) = EX^2 - \mu^2. \tag{A.1}
\]
Chebyshev's inequality is a fundamental inequality involving the expected value and the variance.

Proposition A.1 (Chebyshev's Inequality). Let $X$ be a random variable with expectation $\mu$ and finite variance $\sigma^2$. Then for all $\lambda > 0$,
\[
P(|X - \mu| \ge \lambda) \le \frac{\sigma^2}{\lambda^2}.
\]
Proof.
\[
P(|X - \mu| \ge \lambda) = \sum_{x \in \mathbb{R}:\, |x - \mu| \ge \lambda} P(X = x) \le \sum_{x \in \mathbb{R}:\, |x - \mu| \ge \lambda} \frac{(x - \mu)^2}{\lambda^2}\, P(X = x) \le \sum_{x \in \mathbb{R}} \frac{(x - \mu)^2}{\lambda^2}\, P(X = x) = \frac{\sigma^2}{\lambda^2}. \qquad \square
\]
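As a quick numerical sanity check (a sketch, not part of the text), both sides of Chebyshev's inequality can be computed exactly for a simple distribution, here a fair six-sided die; the helper names are mine.

```python
from fractions import Fraction

# Fair die: P(X = x) = 1/6 for x in {1,...,6}; mu = 7/2, sigma^2 = 35/12.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
mu = sum(x * p for x, p in pmf.items())
var = sum((x - mu) ** 2 * p for x, p in pmf.items())

def tail(lam):
    """Exact P(|X - mu| >= lam)."""
    return sum(p for x, p in pmf.items() if abs(x - mu) >= lam)

# Chebyshev: P(|X - mu| >= lam) <= sigma^2 / lam^2, for every lam > 0.
for lam in [Fraction(3, 2), Fraction(2), Fraction(5, 2)]:
    assert tail(lam) <= var / lam**2
```

The bound is crude: at $\lambda = 2$ the true tail probability is $1/3$, while the bound is $35/48$.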
Let $\{X_j\}_{j=1}^n$ be a finite collection of random variables on a probability space $(\Omega, P)$. We call $X = (X_1, \ldots, X_n)$ a random vector. The joint probability function of these random variables, or equivalently, the probability function of the random vector, is given by
\[
p_X(x) = p_X(x_1, \ldots, x_n) := P(X_1 = x_1, \ldots, X_n = x_n) = P(X = x), \quad x_i \in \mathbb{R},\ i = 1, \ldots, n, \text{ where } x = (x_1, \ldots, x_n).
\]
It follows that $\sum_{x_j \in \mathbb{R},\, j \in [n] - \{i\}} p_X(x) = P(X_i = x_i)$. For any function $H: \mathbb{R}^n \to \mathbb{R}$, we define
\[
EH(X) = \sum_{x \in \mathbb{R}^n} H(x)\, p_X(x), \quad \text{if } \sum_{x \in \mathbb{R}^n} |H(x)|\, p_X(x) < \infty.
\]
In particular then, if $EX_j$ exists, it can be written as $EX_j = \sum_{x \in \mathbb{R}^n} x_j\, p_X(x)$.
Similarly, if $EX_k$ exists, for all $k$, then we have
\[
E\sum_{k=1}^n c_k X_k = \sum_{x \in \mathbb{R}^n} \Big(\sum_{k=1}^n c_k x_k\Big)\, p_X(x) = \sum_{k=1}^n c_k \Big(\sum_{x \in \mathbb{R}^n} x_k\, p_X(x)\Big).
\]
It follows from this that the expectation is linear; that is, if $EX_k$ exists for $k = 1, \ldots, n$, then
\[
E\sum_{k=1}^n c_k X_k = \sum_{k=1}^n c_k\, EX_k,
\]
for any real numbers $\{c_k\}_{k=1}^n$.

Let $\{X_j\}_{j=1}^N$ be a collection of random variables on a probability space $(\Omega, P)$, where $1 \le N \le \infty$. The random variables are called independent if for every finite $n \le N$, one has
\[
P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) = \prod_{j=1}^n P(X_j = x_j), \quad \text{for all } x_j \in \mathbb{R},\ j = 1, 2, \ldots, n.
\]
Let $\{f_i\}_{i=1}^n$ be real-valued functions with $f_i$ defined at least on the set $\{x \in \mathbb{R} : P(X_i = x) > 0\}$. Assume that $E|f_i(X_i)| < \infty$, for $i = 1, \ldots, n$. From the definition of independence it is easy to show that if $\{X_j\}_{j=1}^n$ are independent, then
\[
E\prod_{i=1}^n f_i(X_i) = \prod_{i=1}^n E f_i(X_i). \tag{A.2}
\]
The variance is of course not linear. However, the variance of a sum of independent random variables is equal to the sum of the variances of the random variables: if $\{X_i\}_{i=1}^n$ are independent random variables, then
\[
\sigma^2\Big(\sum_{i=1}^n X_i\Big) = \sum_{i=1}^n \sigma^2(X_i). \tag{A.3}
\]
It suffices to prove (A.3) for $n = 2$ and then use induction. Let $\mu_i = EX_i$, $i = 1, 2$. We have
\[
\sigma^2(X_1 + X_2) = E\big(X_1 + X_2 - E(X_1 + X_2)\big)^2 = E\big((X_1 - \mu_1) + (X_2 - \mu_2)\big)^2 = E(X_1 - \mu_1)^2 + E(X_2 - \mu_2)^2 + 2E(X_1 - \mu_1)(X_2 - \mu_2) = \sigma^2(X_1) + \sigma^2(X_2),
\]
where the last equality follows because (A.2) shows that $E(X_1 - \mu_1)(X_2 - \mu_2) = E(X_1 - \mu_1)\, E(X_2 - \mu_2) = 0$.
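To see (A.3) in action on a concrete case (an illustrative sketch, with probability functions chosen by me, not taken from the text), one can enumerate the joint distribution of two independent discrete random variables exactly and compare the variance of the sum with the sum of the variances.

```python
from fractions import Fraction
from itertools import product

# Two independent discrete random variables, given by their probability functions.
pX = {0: Fraction(1, 4), 1: Fraction(3, 4)}      # Ber(3/4)
pY = {-1: Fraction(1, 3), 2: Fraction(2, 3)}

def var_of(pmf):
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# Joint pmf of (X, Y) under independence: p(x, y) = pX(x) * pY(y);
# collapse it onto the value of the sum X + Y.
pSum = {}
for (x, px), (y, py) in product(pX.items(), pY.items()):
    pSum[x + y] = pSum.get(x + y, 0) + px * py

assert var_of(pSum) == var_of(pX) + var_of(pY)   # (A.3) for n = 2
```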
Chebyshev's inequality and (A.3) allow for an exceedingly short proof of an important result: the weak law of large numbers for sums of independent, identically distributed (IID) random variables.
Theorem A.1. Let $\{X_n\}_{n=1}^\infty$ be a sequence of independent, identically distributed random variables and assume that their common variance $\sigma^2$ is finite. Denote their common expectation by $\mu$. Let $S_n = \sum_{j=1}^n X_j$. Then for any $\epsilon > 0$,
\[
\lim_{n \to \infty} P\Big(\Big|\frac{S_n}{n} - \mu\Big| \ge \epsilon\Big) = 0.
\]
Proof. We have $ES_n = n\mu$, and since the random variables are independent and identically distributed, it follows from (A.3) that $\sigma^2(S_n) = n\sigma^2$. Now applying Chebyshev's inequality to $S_n$ with $\lambda = n\epsilon$ gives
\[
P(|S_n - n\mu| \ge n\epsilon) \le \frac{n\sigma^2}{(n\epsilon)^2},
\]
which proves the theorem. $\square$
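The phenomenon is easy to watch numerically. The following Monte Carlo sketch (illustrative only, not from the text; constants and helper names are mine) simulates $S_n/n$ for IID $\mathrm{Ber}(1/2)$ variables and records how often the sample mean deviates from $\mu = 1/2$ by at least $\epsilon$.

```python
import random

# Weak law illustration for IID Ber(1/2) variables (mu = 1/2, sigma^2 = 1/4):
# the frequency of trials with |S_n/n - mu| >= eps shrinks as n grows.
random.seed(0)
MU, EPS, TRIALS = 0.5, 0.05, 200

def deviation_freq(n):
    """Fraction of TRIALS in which |S_n/n - MU| >= EPS."""
    bad = 0
    for _ in range(TRIALS):
        s = sum(random.randint(0, 1) for _ in range(n))
        if abs(s / n - MU) >= EPS:
            bad += 1
    return bad / TRIALS

freqs = {n: deviation_freq(n) for n in (10, 100, 1000)}
# Chebyshev's bound sigma^2/(n * eps^2) equals 100/n here, so it only becomes
# informative for n > 100; the true decay is much faster.
```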
Remark. The weak law of large numbers is a first moment result. It holds even without the finite variance assumption, but the proof is much more involved.

The above weak law of large numbers is actually a particular case of the following weak law of large numbers.
Proposition A.2. Let $\{Y_n\}_{n=1}^\infty$ be random variables. Assume that
\[
\sigma^2(Y_n) = o\big((EY_n)^2\big), \quad \text{as } n \to \infty.
\]
Then for any $\epsilon > 0$,
\[
\lim_{n \to \infty} P\Big(\Big|\frac{Y_n}{EY_n} - 1\Big| \ge \epsilon\Big) = 0.
\]
Proof. By Chebyshev's inequality, we have
\[
P(|Y_n - EY_n| \ge \epsilon |EY_n|) \le \frac{\sigma^2(Y_n)}{\epsilon^2 (EY_n)^2}. \qquad \square
\]
If $X$ and $Y$ are random variables on a probability space $(\Omega, P)$, and if $P(X = x) > 0$, then the conditional probability function of $Y$ given $X = x$ is defined by
\[
p_{Y|X}(y|x) := P(Y = y \mid X = x) = \frac{P(X = x, Y = y)}{P(X = x)}.
\]
The conditional expectation of $Y$ given $X = x$ is defined by
\[
E(Y \mid X = x) = \sum_{y \in \mathbb{R}} y\, P(Y = y \mid X = x) = \sum_{y \in \mathbb{R}} y\, p_{Y|X}(y|x), \quad \text{if } \sum_{y \in \mathbb{R}} |y|\, P(Y = y \mid X = x) < \infty.
\]
It is easy to verify that
\[
EY = \sum_{x \in \mathbb{R}} E(Y \mid X = x)\, P(X = x),
\]
where $E(Y \mid X = x)\, P(X = x) := 0$, if $P(X = x) = 0$.
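The last identity can be checked by direct enumeration on any small joint distribution. In the sketch below (the joint probabilities are hypothetical numbers of my choosing, used only for illustration), both sides are computed in exact arithmetic.

```python
from fractions import Fraction

# A small joint probability function p(x, y).
joint = {(0, 1): Fraction(1, 8), (0, 3): Fraction(3, 8),
         (1, 1): Fraction(1, 4), (1, 5): Fraction(1, 4)}

# Marginal probability function of X.
pX = {}
for (x, y), p in joint.items():
    pX[x] = pX.get(x, 0) + p

def cond_exp_Y(x):
    """E(Y | X = x) = sum_y y * p(x, y) / P(X = x)."""
    return sum(y * p for (xx, y), p in joint.items() if xx == x) / pX[x]

EY_direct = sum(y * p for (_, y), p in joint.items())
EY_tower = sum(cond_exp_Y(x) * pX[x] for x in pX)
assert EY_direct == EY_tower
```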
A random variable $X$ that takes on only two values, 0 and 1, with $P(X = 1) = p$ and $P(X = 0) = 1 - p$, for some $p \in [0,1]$, is called a Bernoulli random variable. One writes $X \sim \mathrm{Ber}(p)$. It is trivial to check that $EX = p$ and $\sigma^2(X) = p(1 - p)$.
Let $n \in \mathbb{N}$ and let $p \in [0,1]$. A random variable $X$ satisfying
\[
P(X = j) = \binom{n}{j} p^j (1 - p)^{n - j}, \quad j = 0, 1, \ldots, n,
\]
is called a binomial random variable, and one writes $X \sim \mathrm{Bin}(n, p)$. The random variable $X$ can be thought of as the number of "successes" in $n$ independent trials, where on each trial there are two possible outcomes, "success" and "failure," and the probability of "success" is $p$ on each trial. Letting $\{Z_i\}_{i=1}^n$ be independent, identically distributed random variables distributed according to $\mathrm{Ber}(p)$, it follows that $X$ can be realized as $X = \sum_{i=1}^n Z_i$. From the formula for the expected value and variance of a Bernoulli random variable, and from the linearity of the expectation and (A.3), the above representation immediately yields $EX = np$ and $\sigma^2(X) = np(1 - p)$.
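The representation $X = \sum_{i=1}^n Z_i$ can be verified computationally by summing over all $2^n$ outcomes of the $n$ Bernoulli trials (a sketch with parameters of my choosing, not from the text):

```python
from fractions import Fraction
from itertools import product
from math import comb

n, p = 5, Fraction(1, 3)

# Build the Bin(n, p) probability function by summing over all 2^n outcomes
# of n independent Ber(p) trials -- the representation X = Z_1 + ... + Z_n.
pmf = {}
for outcome in product([0, 1], repeat=n):
    prob = Fraction(1)
    for z in outcome:
        prob *= p if z == 1 else 1 - p
    k = sum(outcome)
    pmf[k] = pmf.get(k, 0) + prob

# Agrees with the closed form C(n, j) p^j (1-p)^(n-j) ...
assert all(pmf[j] == comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(n + 1))

# ... and has mean np and variance np(1-p).
mean = sum(j * q for j, q in pmf.items())
var = sum((j - mean) ** 2 * q for j, q in pmf.items())
assert mean == n * p and var == n * p * (1 - p)
```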
A random variable $X$ satisfying
\[
P(X = n) = e^{-\lambda} \frac{\lambda^n}{n!}, \quad n = 0, 1, \ldots,
\]
where $\lambda > 0$, is called a Poisson random variable, and one writes $X \sim \mathrm{Pois}(\lambda)$. One can check easily that $EX = \lambda$ and $\sigma^2(X) = \lambda$.
Proposition A.3 (Poisson Approximation to the Binomial Distribution). For $n \in \mathbb{N}$ and $p \in [0,1]$, let $X_{n,p} \sim \mathrm{Bin}(n, p)$. For $\lambda > 0$, let $X_\lambda \sim \mathrm{Pois}(\lambda)$. Then
\[
\lim_{n \to \infty,\, p \to 0,\, np \to \lambda} P(X_{n,p} = j) = P(X_\lambda = j), \quad j = 0, 1, \ldots. \tag{A.4}
\]
Proof. By assumption, we have $p = \frac{\lambda_n}{n}$, where $\lim_{n \to \infty} \lambda_n = \lambda$. We have
\[
P(X_{n,p} = j) = \binom{n}{j} p^j (1 - p)^{n - j} = \frac{n(n-1)\cdots(n-j+1)}{j!} \Big(\frac{\lambda_n}{n}\Big)^j \Big(1 - \frac{\lambda_n}{n}\Big)^{n-j} = \frac{1}{j!}\, \lambda_n^j\, \frac{n(n-1)\cdots(n-j+1)}{n^j} \Big(1 - \frac{\lambda_n}{n}\Big)^{n-j};
\]
thus,
\[
\lim_{n \to \infty,\, p \to 0,\, np \to \lambda} P(X_{n,p} = j) = e^{-\lambda} \frac{\lambda^j}{j!} = P(X_\lambda = j). \qquad \square
\]
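The rate of this approximation is easy to observe numerically. The sketch below (illustrative only; function names are mine) compares the $\mathrm{Bin}(n, \lambda/n)$ probability function with the $\mathrm{Pois}(\lambda)$ one as $n$ grows.

```python
from math import comb, exp, factorial

def binom_pmf(n, p, j):
    return comb(n, j) * p**j * (1 - p) ** (n - j)

def pois_pmf(lam, j):
    return exp(-lam) * lam**j / factorial(j)

lam = 2.0
# Maximum pointwise discrepancy over j = 0,...,9, for p = lam/n and growing n;
# it shrinks roughly like 1/n.
errors = [max(abs(binom_pmf(n, lam / n, j) - pois_pmf(lam, j)) for j in range(10))
          for n in (10, 100, 1000)]
```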
Equation (A.4) is an example of weak convergence of random variables or distributions. In general, if $\{X_n\}_{n=1}^\infty$ are random variables with distributions $\{\mu_{X_n}\}_{n=1}^\infty$, and $X$ is a random variable with distribution $\mu_X$, then we say that $X_n$ converges weakly to $X$, or $\mu_{X_n}$ converges weakly to $\mu_X$, if $\lim_{n \to \infty} P(X_n \le x) = P(X \le x)$, for all $x \in \mathbb{R}$ for which $P(X = x) = 0$, or equivalently, if $\lim_{n \to \infty} \mu_{X_n}((-\infty, x]) = \mu_X((-\infty, x])$, for all $x \in \mathbb{R}$ for which $\mu_X(\{x\}) = 0$. Thus, for example, if $P(X_n = \frac{1}{n}) = P(X_n = 1 + \frac{1}{n}) = \frac{1}{2}$, for $n = 1, 2, \ldots$, and $P(X = 0) = P(X = 1) = \frac{1}{2}$, then $X_n$ converges weakly to $X$, since $\lim_{n \to \infty} P(X_n \le x) = P(X \le x)$, for all $x \in \mathbb{R} - \{0, 1\}$. See also Exercise A.4.
Exercise A.1. Use the $\sigma$-additivity property of probability measures on disjoint sets to prove $\sigma$-sub-additivity on arbitrary sets; that is, $P(\cup_{n=1}^N A_n) \le \sum_{n=1}^N P(A_n)$, for arbitrary events $\{A_n\}_{n=1}^N$, where $1 \le N \le \infty$. (Hint: Rewrite $\cup_{n=1}^N A_n$ as a disjoint union $\cup_{n=1}^N B_n$, by letting $B_1 = A_1$, $B_2 = A_2 - A_1$, $B_3 = A_3 - A_2 - A_1$, etc.)
Exercise A.2. Prove that $P(A_1 \cup A_2) = P(A_1) + P(A_2) - P(A_1 \cap A_2)$, for arbitrary events $A_1, A_2$. Then prove more generally that for any finite $n$ and arbitrary events $\{A_k\}_{k=1}^n$, one has
\[
P(\cup_{k=1}^n A_k) = \sum_{1 \le i \le n} P(A_i) - \sum_{1 \le i < j \le n} P(A_i \cap A_j) + \sum_{1 \le i < j < k \le n} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n-1} P(A_1 \cap A_2 \cap \cdots \cap A_n).
\]
This result is known as the principle of inclusion–exclusion.
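A mechanical check of the principle is straightforward with `itertools`: enumerate all nonempty index subsets, attach the sign $(-1)^{k-1}$, and compare with the probability of the union. The sets below are arbitrary examples of my choosing.

```python
from itertools import combinations

# Three concrete events inside a uniform sample space Omega = {0, ..., 11},
# so P(A) = |A| / 12.
omega = set(range(12))
events = [set(range(0, 6)), set(range(4, 9)), {0, 3, 8, 11}]
P = lambda S: len(S) / len(omega)

union = set().union(*events)

# Inclusion-exclusion: sum over nonempty index subsets with sign (-1)^(k-1).
ie = 0.0
for k in range(1, len(events) + 1):
    for idx in combinations(range(len(events)), k):
        inter = set.intersection(*(events[i] for i in idx))
        ie += (-1) ** (k - 1) * P(inter)

assert abs(P(union) - ie) < 1e-12
```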
Exercise A.3. Let $(\Omega, P)$ be a probability space and let $R \ge 2$ be an integer. For $A \subset \Omega$, recall that the complement $A^c$ of $A$ is defined by $A^c = \Omega - A$. Prove that if the events $\{A_k\}_{k=1}^R$ are independent, then the complementary events $\{A_k^c\}_{k=1}^R$ are also independent. (Hint: By the definition of independence, we have
\[
P(\cap_{j=1}^\ell B_j) = \prod_{j=1}^\ell P(B_j), \quad \text{for any } \ell \le R \text{ and any sub-collection } \{B_j\}_{j=1}^\ell \text{ of } \{A_k\}_{k=1}^R. \tag{A.5}
\]
Using this, we need to prove that $P(\cap_{j=1}^\ell B_j^c) = \prod_{j=1}^\ell P(B_j^c)$, for any sub-collection $\{B_j^c\}_{j=1}^\ell$ of $\{A_k^c\}_{k=1}^R$. Let $p_j = P(B_j)$ and $p = P(\cap_{j=1}^\ell B_j^c)$. Then we need to prove that $p = \prod_{j=1}^\ell (1 - p_j)$. Write
\[
\prod_{j=1}^\ell (1 - p_j) = 1 - \sum_{1 \le i \le \ell} p_i + \sum_{1 \le i < j \le \ell} p_i p_j - \cdots,
\]
and use (A.5) along with the principle of inclusion–exclusion, which appears in Exercise A.2.)
Exercise A.4. Using (A.4), show that
\[
\lim_{n \to \infty,\, p \to 0,\, np \to \lambda} P(X_{n,p} \le x) = P(X_\lambda \le x), \quad \text{for all } x \in \mathbb{R}.
\]
Appendix B
Power Series and Generating Functions
We review without proof some basic results concerning power series. For more details, the reader should consult an advanced calculus or undergraduate analysis text. We also illustrate the utility of generating functions by analyzing the one that arises from the Fibonacci sequence.
Let $\{a_n\}_{n=0}^\infty$ be a sequence of real numbers. Define formally the generating function $F(t)$ of $\{a_n\}_{n=0}^\infty$ by
\[
F(t) = \sum_{n=0}^\infty a_n t^n, \tag{B.1}
\]
where $t \in \mathbb{R}$. We say "formally" because we have made the definition before determining for which values of $t$ the power series on the right hand side above converges. The power series converges trivially for $t = 0$, and it is possible that it converges only for $t = 0$, for example, if $a_n = n!$.
The power series $\sum_{n=0}^\infty a_n t^n$ converges absolutely if $\sum_{n=0}^\infty |a_n t^n| < \infty$. The power series is uniformly, absolutely convergent for $|t| \le \rho$ if
\[
\lim_{N \to \infty} \sup_{|t| \le \rho} \sum_{n=N}^\infty |a_n t^n| = 0;
\]
that is, if the tail of the series $\sum_{n=0}^\infty |a_n t^n|$ converges to 0 uniformly over $|t| \le \rho$. We state four fundamental results concerning the convergence of power series:

1. If the power series converges for some number $t_0 \ne 0$, then necessarily the power series converges absolutely and uniformly for $|t| \le \rho$, for all $\rho < |t_0|$.
2. There exists an extended real number $r_0 \in [0, \infty]$ such that the power series $\sum_{n=0}^\infty a_n t^n$ converges absolutely if $t \in [0, r_0)$ and diverges if $t > r_0$.

The number $r_0$ in (2) is called the radius of convergence of the power series.
3. The radius of convergence is given by the formula
\[
r_0 = \frac{1}{\limsup_{n \to \infty} \sqrt[n]{|a_n|}}.
\]
4. If the power series is uniformly, absolutely convergent for $|t| \le \rho$, then the function $F(t)$ in (B.1) is infinitely differentiable for $|t| < \rho$, and its derivatives are obtained via term by term differentiation in the power series; in particular, $F'(t) = \sum_{n=0}^\infty n a_n t^{n-1}$.
The generating function often provides an efficient method for obtaining information about the sequence $\{a_n\}_{n=0}^\infty$. Typically, this will occur when the generating function can be written in a nice closed form and analyzed. This analysis then allows one to obtain information about the coefficients in the generating function's power series expansion, and these coefficients are of course $\{a_n\}_{n=0}^\infty$. We illustrate this in the case of the famous Fibonacci sequence.
Recall that the sequence of Fibonacci numbers is defined recursively by $f_0 = 0$, $f_1 = 1$ and
\[
f_n = f_{n-1} + f_{n-2}, \quad \text{for } n \ge 2. \tag{B.2}
\]
The first few Fibonacci numbers are 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144. We will obtain a closed form for the generating function
\[
F(t) = \sum_{n=0}^\infty f_n t^n \tag{B.3}
\]
of the Fibonacci numbers. Multiply both sides of (B.2) by $t^n$ and then sum both sides over $n$, with $n$ running from 2 to $\infty$. This gives us
\[
\sum_{n=2}^\infty f_n t^n = \sum_{n=2}^\infty f_{n-1} t^n + \sum_{n=2}^\infty f_{n-2} t^n.
\]
Since $f_0 = 0$ and $f_1 = 1$, the left hand side above is equal to $F(t) - t$. Factoring out $t$ from the first term and $t^2$ from the second term on the right hand side above, and using the fact that $f_0 = 0$, one sees that the right hand side above is equal to $tF(t) + t^2 F(t)$. Thus, we obtain the equation
\[
F(t) - t = tF(t) + t^2 F(t),
\]
which gives a closed form expression for $F$; namely, $F(t) = \frac{t}{1 - t - t^2}$. Up until now we have ignored the question of convergence. However, the above formula gives us the answer. The roots of the polynomial $t^2 + t - 1$ are $r_+ := \frac{-1 + \sqrt{5}}{2}$ and $r_- := \frac{-1 - \sqrt{5}}{2}$. Since $|r_+| < |r_-|$, we conclude that the generating function $F(t)$ has radius of convergence $|r_+| = \frac{\sqrt{5} - 1}{2}$. Thus, the generating function of the Fibonacci sequence is given by
\[
F(t) = \frac{t}{1 - t - t^2}, \quad |t| < \frac{\sqrt{5} - 1}{2}. \tag{B.4}
\]
We now use the method of partial fractions to represent the function $\frac{t}{1 - t - t^2}$ as an explicit power series. Using the fact that $r_+ r_- = -1$, we write
\[
t^2 + t - 1 = (t - r_+)(t - r_-) = -(t r_- + 1)(t r_+ + 1);
\]
thus,
\[
\frac{t}{1 - t - t^2} = \frac{t}{(t r_- + 1)(t r_+ + 1)}. \tag{B.5}
\]
For unknown $A$ and $B$, we write
\[
\frac{t}{(t r_- + 1)(t r_+ + 1)} = \frac{A}{t r_- + 1} + \frac{B}{t r_+ + 1} = \frac{t(A r_+ + B r_-) + (A + B)}{(t r_- + 1)(t r_+ + 1)}. \tag{B.6}
\]
Comparing the left-most and right-most terms in (B.6), we conclude that $A + B = 0$ and $A r_+ + B r_- = 1$. Solving for $A$ and $B$, we obtain $A = \frac{1}{r_+ - r_-} = \frac{1}{\sqrt{5}}$ and $B = \frac{1}{r_- - r_+} = -\frac{1}{\sqrt{5}}$. Thus, from (B.5) and the first equality in (B.6), we arrive at the partial fraction representation
\[
\frac{t}{1 - t - t^2} = \frac{1}{\sqrt{5}} \Big(\frac{1}{1 + t r_-} - \frac{1}{1 + t r_+}\Big). \tag{B.7}
\]
Since $|r_-| > |r_+|$, both $\frac{1}{1 + t r_-}$ and $\frac{1}{1 + t r_+}$ can be written as geometric series if $|t| < \frac{1}{|r_-|} = \frac{2}{1 + \sqrt{5}} = \frac{\sqrt{5} - 1}{2}$. We have
\[
\frac{1}{1 + t r_-} = \sum_{n=0}^\infty (-1)^n (r_-)^n t^n = \sum_{n=0}^\infty \Big(\frac{1 + \sqrt{5}}{2}\Big)^n t^n; \qquad \frac{1}{1 + t r_+} = \sum_{n=0}^\infty (-1)^n (r_+)^n t^n = \sum_{n=0}^\infty \Big(\frac{1 - \sqrt{5}}{2}\Big)^n t^n. \tag{B.8}
\]
Thus, from (B.4), (B.7), and (B.8), we obtain
\[
F(t) = \sum_{n=0}^\infty \frac{1}{\sqrt{5}} \bigg(\Big(\frac{1 + \sqrt{5}}{2}\Big)^n - \Big(\frac{1 - \sqrt{5}}{2}\Big)^n\bigg) t^n. \tag{B.9}
\]
Comparing (B.3) with (B.9), we conclude that the $n$th Fibonacci number $f_n$ is given explicitly by
\[
f_n = \frac{1}{\sqrt{5}} \bigg(\Big(\frac{1 + \sqrt{5}}{2}\Big)^n - \Big(\frac{1 - \sqrt{5}}{2}\Big)^n\bigg). \tag{B.10}
\]
From the explicit formula in (B.10), the asymptotic behavior of $f_n$ is clear:
\[
f_n \sim \frac{1}{\sqrt{5}} \Big(\frac{1 + \sqrt{5}}{2}\Big)^n, \quad \text{as } n \to \infty.
\]
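The closed form (B.10) is easy to test against the recursion (B.2) itself; in floating point the formula is exact after rounding for moderate $n$. (An illustrative sketch; function names are mine.)

```python
from math import sqrt

def fib_rec(n):
    """Fibonacci numbers via the recursion (B.2), starting from f_0 = 0, f_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_binet(n):
    """The closed form (B.10), rounded to the nearest integer."""
    s5 = sqrt(5)
    return round((((1 + s5) / 2) ** n - ((1 - s5) / 2) ** n) / s5)

assert all(fib_rec(n) == fib_binet(n) for n in range(30))
```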
Appendix C
A Proof of Stirling's Formula
Stirling's formula states that
\[
n! \sim n^n e^{-n} \sqrt{2\pi n}, \quad \text{as } n \to \infty. \tag{C.1}
\]
In order to obtain an asymptotic formula for the discrete quantity $n!$, it is extremely useful to be able to embed this quantity in a function of a continuous variable. Integrating by parts and then applying induction shows that $n! = \Gamma(n + 1)$, $n \in \mathbb{N}$, where the gamma function $\Gamma(t)$ is defined by
\[
\Gamma(t) = \int_0^\infty x^{t-1} e^{-x}\, dx, \quad t > 0.
\]
Thus, one proves Stirling's formula in the following form.

Theorem C.1 (Stirling's Formula).
\[
\Gamma(t + 1) \sim t^t e^{-t} \sqrt{2\pi t}, \quad \text{as } t \to \infty. \tag{C.2}
\]
Proof. In the literature one can find literally dozens of proofs of Stirling's formula. We present here an elementary proof that uses Laplace's asymptotic method [14]. We begin by giving the intuition for the method. We write
\[
\Gamma(t + 1) = \int_0^\infty e^{\psi_t(x)}\, dx, \tag{C.3}
\]
where
\[
\psi_t(x) = t \log x - x.
\]
Now $\psi_t$ takes on its maximum at $x = t$, and the Taylor expansion of $\psi_t$ about $x = t$ starts out as
\[
t \log t - t - \frac{(x - t)^2}{2t} =: \hat\psi_t(x).
\]
Replacing $\psi_t$ by $\hat\psi_t$, we calculate that
\[
\int_0^\infty e^{\hat\psi_t(x)}\, dx = \int_0^\infty e^{t \log t - t - \frac{(x-t)^2}{2t}}\, dx = t^t e^{-t} \int_0^\infty e^{-\frac{(x-t)^2}{2t}}\, dx.
\]
Making the substitution $z = \frac{x - t}{\sqrt{t}}$ gives
\[
\int_0^\infty e^{-\frac{(x-t)^2}{2t}}\, dx = \sqrt{t} \int_{-\sqrt{t}}^\infty e^{-\frac{1}{2} z^2}\, dz.
\]
Since $\int_{-\infty}^\infty e^{-\frac{1}{2} z^2}\, dz = \sqrt{2\pi}$, we conclude that
\[
\int_0^\infty e^{\hat\psi_t(x)}\, dx \sim t^t e^{-t} \sqrt{2\pi t}, \quad \text{as } t \to \infty.
\]
We now turn to the rigorous proof. We can write $\psi_t$ exactly as
\[
\psi_t(t + y) = t \log t - t - t g\Big(\frac{y}{t}\Big),
\]
where
\[
g(v) = v - \log(1 + v).
\]
Substituting this in (C.3) and making the change of variables $x = y + t$, we obtain
\[
\Gamma(t + 1) = t^t e^{-t} \int_{-t}^\infty e^{-t g(\frac{y}{t})}\, dy.
\]
Making the change of variables $y = \sqrt{t}\, z$, we have
\[
\Gamma(t + 1) = t^t e^{-t} \sqrt{2\pi t}\, N(t), \tag{C.4}
\]
where
\[
N(t) = \frac{1}{\sqrt{2\pi}} \int_{-\sqrt{t}}^\infty e^{-t g(\frac{z}{\sqrt{t}})}\, dz.
\]
We will show that
\[
\lim_{t \to \infty} N(t) = 1. \tag{C.5}
\]
Now (C.2) follows from (C.4) and (C.5). Fix $L > 0$ and write
\[
N(t) = N_L(t) + \frac{1}{\sqrt{2\pi}}\, T_L^+(t) + \frac{1}{\sqrt{2\pi}}\, T_L^-(t), \tag{C.6}
\]
where
\[
N_L(t) = \frac{1}{\sqrt{2\pi}} \int_{-L}^L e^{-t g(\frac{z}{\sqrt{t}})}\, dz
\]
and
\[
T_L^+(t) = \int_L^\infty e^{-t g(\frac{z}{\sqrt{t}})}\, dz, \qquad T_L^-(t) = \int_{-\sqrt{t}}^{-L} e^{-t g(\frac{z}{\sqrt{t}})}\, dz.
\]
From Taylor's remainder formula it follows that for any $\epsilon > 0$ and sufficiently small $v$, one has
\[
\frac{1}{2}(1 - \epsilon) v^2 \le g(v) \le \frac{1}{2}(1 + \epsilon) v^2.
\]
Thus, $\lim_{t \to \infty} t g(\frac{z}{\sqrt{t}}) = \frac{1}{2} z^2$, uniformly over $z \in [-L, L]$; consequently,
\[
\lim_{t \to \infty} N_L(t) = \frac{1}{\sqrt{2\pi}} \int_{-L}^L e^{-\frac{1}{2} z^2}\, dz. \tag{C.7}
\]
Since
\[
t \Big(g\Big(\frac{z}{\sqrt{t}}\Big)\Big)' = \sqrt{t} \Big(1 - \frac{1}{1 + \frac{z}{\sqrt{t}}}\Big) = \frac{\sqrt{t}\, z}{\sqrt{t} + z}
\]
is increasing in $z$, we have
\[
T_L^+(t) \le \frac{\sqrt{t} + L}{\sqrt{t}\, L} \int_L^\infty t \Big(g\Big(\frac{z}{\sqrt{t}}\Big)\Big)' e^{-t g(\frac{z}{\sqrt{t}})}\, dz = \frac{\sqrt{t} + L}{\sqrt{t}\, L}\, e^{-t g(\frac{L}{\sqrt{t}})} = \frac{\sqrt{t} + L}{\sqrt{t}\, L}\, e^{-t [\frac{L}{\sqrt{t}} - \log(1 + \frac{L}{\sqrt{t}})]}.
\]
By Taylor's formula, we have $\log(1 + \frac{L}{\sqrt{t}}) = \frac{L}{\sqrt{t}} - \frac{L^2}{2t} + O(t^{-\frac{3}{2}})$ as $t \to \infty$; thus,
\[
\limsup_{t \to \infty} T_L^+(t) \le \frac{1}{L}\, e^{-\frac{1}{2} L^2}. \tag{C.8}
\]
A very similar argument gives
\[
\limsup_{t \to \infty} T_L^-(t) \le \frac{1}{L}\, e^{-\frac{1}{2} L^2}. \tag{C.9}
\]
Now from (C.6)–(C.9), we obtain
\[
\frac{1}{\sqrt{2\pi}} \int_{-L}^L e^{-\frac{1}{2} z^2}\, dz \le \liminf_{t \to \infty} N(t) \le \limsup_{t \to \infty} N(t) \le \frac{1}{\sqrt{2\pi}} \int_{-L}^L e^{-\frac{1}{2} z^2}\, dz + \frac{2}{L \sqrt{2\pi}}\, e^{-\frac{1}{2} L^2}.
\]
Since $N(t)$ is independent of $L$, letting $L \to \infty$ above gives (C.5). $\square$
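The quality of the approximation (C.1) can be inspected directly: the ratio $n!/(n^n e^{-n}\sqrt{2\pi n})$ decreases toward 1 (in fact it equals $1 + \frac{1}{12n} + O(n^{-2})$, a refinement not proved here). A small numerical sketch, illustrative only:

```python
from math import factorial, exp, pi, sqrt

def stirling(n):
    """The asymptotic approximation n^n e^{-n} sqrt(2 pi n) from (C.1)."""
    return n**n * exp(-n) * sqrt(2 * pi * n)

# Ratios n! / stirling(n) for growing n; they decrease toward 1 from above.
ratios = [factorial(n) / stirling(n) for n in (1, 5, 10, 50)]
```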
Appendix D
An Elementary Proof of $\sum_{n=1}^\infty \frac{1}{n^2} = \frac{\pi^2}{6}$
The standard way to prove the identity in the title of this appendix is via Fourier series. We give a completely elementary proof, following [1]. Consider the double integral
\[
I = \int_0^1 \int_0^1 \frac{1}{1 - xy}\, dx\, dy. \tag{D.1}
\]
(Actually, the expression on the right hand side of (D.1) is an improper integral, because the integrand blows up at $(x, y) = (1, 1)$. Thus, $\int_0^1 \int_0^1 \frac{1}{1 - xy}\, dx\, dy := \lim_{\epsilon \to 0^+} \int_0^{1-\epsilon} \int_0^{1-\epsilon} \frac{1}{1 - xy}\, dx\, dy$. Since the integrand is nonnegative, there is no problem applying the standard rules of calculus directly to $\int_0^1 \int_0^1 \frac{1}{1 - xy}\, dx\, dy$.) On the one hand, expanding the integrand in a geometric series and integrating term by term gives
\[
I = \int_0^1 \int_0^1 \sum_{n=0}^\infty (xy)^n\, dx\, dy = \sum_{n=0}^\infty \int_0^1 \int_0^1 x^n y^n\, dx\, dy = \sum_{n=0}^\infty \Big(\int_0^1 x^n\, dx\Big) \Big(\int_0^1 y^n\, dy\Big) = \sum_{n=0}^\infty \frac{1}{(n+1)^2} = \sum_{n=1}^\infty \frac{1}{n^2}. \tag{D.2}
\]
(The interchanging of the order of the integration and the summation is justified by the fact that all the summands are nonnegative.)
On the other hand, consider the change of variables $u = \frac{y + x}{2}$, $v = \frac{y - x}{2}$. This transformation rotates the square $[0,1] \times [0,1]$ clockwise by $45^\circ$ and shrinks its sides by the factor $\sqrt{2}$. The new domain is $\{(u, v) : 0 \le u \le \frac{1}{2},\ -u \le v \le u\} \cup \{(u, v) : \frac{1}{2} \le u \le 1,\ u - 1 \le v \le 1 - u\}$. The Jacobian $\frac{\partial(x, y)}{\partial(u, v)}$ of the transformation is equal to 2, so the area element $dx\, dy$ gets replaced by $2\, du\, dv$. The function $\frac{1}{1 - xy}$ becomes $\frac{1}{1 - u^2 + v^2}$. Since the function and the domain are symmetric with respect to the $u$-axis, we have
\[
I = 4 \int_0^{1/2} \Big(\int_0^u \frac{dv}{1 - u^2 + v^2}\Big)\, du + 4 \int_{1/2}^1 \Big(\int_0^{1-u} \frac{dv}{1 - u^2 + v^2}\Big)\, du.
\]
Using the integration formula $\int \frac{dx}{x^2 + a^2} = \frac{1}{a} \arctan \frac{x}{a}$, we obtain
\[
I = 4 \int_0^{1/2} \frac{1}{\sqrt{1 - u^2}} \arctan\Big(\frac{u}{\sqrt{1 - u^2}}\Big)\, du + 4 \int_{1/2}^1 \frac{1}{\sqrt{1 - u^2}} \arctan\Big(\frac{1 - u}{\sqrt{1 - u^2}}\Big)\, du.
\]
Now the derivative of $g(u) := \arctan\big(\frac{u}{\sqrt{1 - u^2}}\big)$ is $\frac{1}{\sqrt{1 - u^2}}$, and the derivative of $h(u) := \arctan\big(\frac{1 - u}{\sqrt{1 - u^2}}\big) = \arctan\big(\sqrt{\frac{1 - u}{1 + u}}\big)$ is $-\frac{1}{2}\, \frac{1}{\sqrt{1 - u^2}}$. Thus, we conclude that
\[
I = 4 \int_0^{1/2} g(u) g'(u)\, du - 8 \int_{1/2}^1 h(u) h'(u)\, du = 2 g^2(u) \Big|_0^{1/2} - 4 h^2(u) \Big|_{1/2}^1 = 2\Big(\arctan^2 \frac{1}{\sqrt{3}} - \arctan^2 0\Big) - 4\Big(\arctan^2 0 - \arctan^2 \frac{1}{\sqrt{3}}\Big) = 6 \arctan^2 \frac{1}{\sqrt{3}} = 6 \Big(\frac{\pi}{6}\Big)^2 = \frac{\pi^2}{6}. \tag{D.3}
\]
Comparing (D.2) and (D.3) gives
\[
\sum_{n=1}^\infty \frac{1}{n^2} = \frac{\pi^2}{6}. \qquad \square
\]
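The convergence of the partial sums to $\pi^2/6$ is slow but easy to observe: the tail beyond $N$ is bounded above by $\int_N^\infty x^{-2}\, dx = 1/N$. A short numerical illustration (not part of the text):

```python
from math import pi

def partial_sum(N):
    """Partial sum of 1/n^2 up to n = N."""
    return sum(1 / n**2 for n in range(1, N + 1))

target = pi**2 / 6
# The error after N terms is positive and roughly 1/N.
errs = [target - partial_sum(N) for N in (10, 100, 1000)]
```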
References

1. Aigner, M., Ziegler, G.: Proofs from the Book, 4th edn. Springer, Berlin (2010)
2. Alon, N., Spencer, J.: The Probabilistic Method, 3rd edn. Wiley-Interscience Series in Discrete Mathematics and Optimization. Wiley, Hoboken (2008)
3. Alon, N., Krivelevich, M., Sudakov, B.: Finding a large hidden clique in a random graph. In: Proceedings of the 9th Annual ACM-SIAM Symposium on Discrete Algorithms (San Francisco, CA, 1998), pp. 594–598. ACM, New York (1998)
4. Andrews, G.: The Theory of Partitions, reprint of the 1976 original. Cambridge University Press, Cambridge (1998)
5. Apostol, T.: Introduction to Analytic Number Theory. Undergraduate Texts in Mathematics. Springer, New York (1976)
6. Arratia, R., Barbour, A.D., Tavaré, S.: Logarithmic Combinatorial Structures: A Probabilistic Approach. EMS Monographs in Mathematics. European Mathematical Society, Zürich (2003)
7. Athreya, K., Ney, P.: Branching Processes, reprint of the 1963 original [Springer, Berlin]. Dover Publications, Inc., Mineola (2004)
8. Bollobás, B.: The evolution of random graphs. Trans. Am. Math. Soc. 286, 257–274 (1984)
9. Bollobás, B.: Modern Graph Theory. Graduate Texts in Mathematics, vol. 184. Springer, New York (1998)
10. Bollobás, B.: Random Graphs, 2nd edn. Cambridge Studies in Advanced Mathematics, vol. 73. Cambridge University Press, Cambridge (2001)
11. Brauer, A.: On a problem of partitions. Am. J. Math. 64, 299–312 (1942)
12. Conlon, D.: A new upper bound for diagonal Ramsey numbers. Ann. Math. 170, 941–960 (2009)
13. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications, 2nd edn. Springer, New York (1998)
14. Diaconis, P., Freedman, D.: An elementary proof of Stirling's formula. Am. Math. Mon. 93, 123–125 (1986)
15. Doyle, P., Snell, J.L.: Random Walks and Electric Networks. Carus Mathematical Monographs, vol. 22. Mathematical Association of America, Washington (1984)
16. Durrett, R.: Probability: Theory and Examples, 4th edn. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge (2010)
17. Dwass, M.: The number of increases in a random permutation. J. Combin. Theor. Ser. A 15, 192–199 (1973)
18. Erdős, P., Rényi, A.: On the evolution of random graphs. Magyar Tud. Akad. Mat. Kutató Int. Közl. 5, 17–61 (1960)
19. Feller, W.: An Introduction to Probability Theory and Its Applications, 3rd edn, vol. I. Wiley, New York (1968)
20. Flajolet, P., Sedgewick, R.: Analytic Combinatorics. Cambridge University Press, Cambridge (2009)
21. Flory, P.J.: Intramolecular reaction between neighboring substituents of vinyl polymers. J. Am. Chem. Soc. 61, 1518–1521 (1939)
22. Graham, R., Rothschild, B., Spencer, J.: Ramsey Theory, 2nd edn. Wiley-Interscience Series in Discrete Mathematics and Optimization. Wiley, New York (1990)
23. Hardy, G.H., Ramanujan, S.: Asymptotic formulae in combinatory analysis. Proc. London Math. Soc. 17, 75–115 (1918)
24. Harris, T.: The Theory of Branching Processes, corrected reprint of the 1963 original [Springer, Berlin]. Dover Publications, Inc., Mineola (2002)
25. Jameson, G.J.O.: The Prime Number Theorem. London Mathematical Society Student Texts, vol. 53. Cambridge University Press, Cambridge (2003)
26. Montgomery, H., Vaughan, R.: Multiplicative Number Theory. I. Classical Theory. Cambridge Studies in Advanced Mathematics, vol. 97. Cambridge University Press, Cambridge (2007)
27. Nathanson, M.: Elementary Methods in Number Theory. Graduate Texts in Mathematics, vol. 195. Springer, New York (2000)
28. Page, E.S.: The distribution of vacancies on a line. J. Roy. Stat. Soc. Ser. B 21, 364–374 (1959)
29. Pinsky, R.: Detecting tampering in a random hypercube. Electron. J. Probab. 18, 1–12 (2013)
30. Pitman, J.: Combinatorial Stochastic Processes. Lectures from the 32nd Summer School on Probability Theory held in Saint-Flour, 7–24 July 2002. Lecture Notes in Mathematics, vol. 1875. Springer, Berlin (2006)
31. Rényi, A.: On a one-dimensional problem concerning random space filling (Hungarian; English summary). Magyar Tud. Akad. Mat. Kutató Int. Közl. 3, 109–127 (1958)
32. Spitzer, F.: Principles of Random Walk, 2nd edn. Graduate Texts in Mathematics, vol. 34. Springer, New York (1976)
33. Tenenbaum, G.: Introduction to Analytic and Probabilistic Number Theory. Cambridge Studies in Advanced Mathematics, vol. 46. Cambridge University Press, Cambridge (1995)
34. Wilf, H.: Generating Functionology, 3rd edn. A K Peters, Ltd., Wellesley (2006)
Index

A
Abel summation, 77
arcsine distribution, 37
average order, 13

B
Bernoulli random variable, 138
binomial random variable, 138
branching process – see Galton–Watson branching process, 117

C
Chebyshev's ψ-function, 70
Chebyshev's θ-function, 68
Chebyshev's inequality, 134
Chebyshev's theorem, 68
Chinese remainder theorem, 19
clique, 89
coloring of a graph, 104
composition of an integer, 5
cycle index, 58
cycle type, 51

D
derangement, 49
Dyck path, 40

E
Erdős–Rényi graph, 89
Euler φ-function, 11
Euler product formula, 19
Ewens sampling formula, 52
expected value, 134
extinction, 117

F
Fibonacci sequence, 142
finite graph, 89

G
Galton–Watson branching process, 117
generating function, 141
giant component, 110

H
Hardy–Ramanujan theorem, 81

I
independent events, 133
independent random variables, 135

L
large deviations, 113

M
Mertens' theorems, 75
Möbius function, 8
Möbius inversion, 10
multiplicative function, 9

P
p-adic, 71
partition of an integer, 1
Poisson approximation to the binomial distribution, 138
Poisson random variable, 138
prime number theorem, 67
probabilistic method, 107
probability generating function, 54
probability space, 133

R
Ramsey number, 105
random variable, 133
relative entropy, 115, 131
restricted partition of an integer, 1

S
sieve method, 19
simple, symmetric random walk, 35
square-free integer, 8
Stirling numbers of the first kind, 54
Stirling's formula, 145
survival, 117

T
tampering detection, 99
total variation distance, 99

V
variance, 134

W
weak convergence, 139
weak law of large numbers, 136, 137