Analysis of Execution Costs for QuickSelect
by
Takehiko Nakama
A dissertation submitted to The Johns Hopkins University in conformity with the
requirements for the degree of Doctor of Philosophy.
Baltimore, Maryland
August, 2009
© Takehiko Nakama 2009
All rights reserved
Abstract
QuickSelect is a search algorithm widely used for finding a key of target rank in
a file of keys. When the algorithm compares keys during its execution, it must operate on
the keys’ representations or internal structures, which were ignored by previous studies
that quantified the execution cost of the algorithm in terms of the number of required key
comparisons. In this dissertation, we conduct analyses of running costs for the algorithm
that take into account not only the number of key comparisons but also the cost of each
key comparison. First we suppose that keys are independent and uniformly distributed
in the unit interval (0, 1) and are represented as their binary expansions, and we derive
exact and asymptotic expressions for the expected number of bit comparisons required by
QuickSelect. We also establish a closed formula for the expectation that only involves
finite summation and use it to compute the expected cost for various values of the target
rank and the number of keys. Then we investigate execution costs for the algorithm applied
to keys that are represented by more general sequences of symbols, and we identify limiting
distributions associated with the costs. Further, we derive integral and series expressions for
the expectations of the limiting distributions and use them to recapture previously obtained
results on the number of key comparisons required by QuickSelect.
Primary Reader: Professor James Allen Fill
Secondary Reader: Professor S. Rao Kosaraju
Acknowledgements
It has truly been an honor and a great pleasure working with Professor James Allen Fill.
I thank him for constantly inspiring me not only with his exceptional brilliance but also with
his character. I am grateful for Professor Brigitte Vallée's collaboration on the limiting-
distribution analysis described in Chapters 5–8 and for Professor Rao Kosaraju's helpful
comments on a draft of this dissertation. I am also wholeheartedly grateful to numerous
people who have provided me with incalculable support; I would not have been able to
conduct graduate studies without them. Finally, I thank Johns Hopkins. I have never taken
for granted the privilege of pursuing my various intellectual interests at this remarkable
institution, which has provided me with wonderful opportunities to learn.
Contents

Abstract
Acknowledgements
List of Figures
1 Introduction and summary
2 Analysis of µ(1, n)
2.1 Preliminaries
2.2 Exact computation of µ(1, n)
2.3 Asymptotic analysis of µ(1, n)
3 Analysis of µ(m̄, n)
3.1 Exact computation of µ(m̄, n)
3.1.1 Exact computation of µ_1(m̄, n)
3.1.2 Exact computation of µ_2(m̄, n) and µ(m̄, n)
3.2 Asymptotic analysis of µ(m̄, n)
4 Derivation of a closed formula for µ(m, n)
5 Background and preliminaries for limiting-distribution analysis
5.1 Probabilistic source models for the keys
5.2 Known results for the numbers of key and symbol comparisons
5.3 QuickQuant and QuickVal
6 Analysis of QuickVal
6.1 Preliminaries
6.2 Convergence of S^V_n/n in L^p for 1 ≤ p < ∞
6.3 Almost sure convergence of S^V_n/n
7 Analysis of QuickQuant
7.1 Preliminaries
7.2 Convergence of S^Q_n/n in L^p for 1 ≤ p < ∞
8 Analysis of ES
8.1 Computation of ES: an integral expression
8.2 Computation of ES: a series expression
A Proof of (2.28)
B Tractable expression for the measure ν
Bibliography
Vita

List of Figures

4.1 Expected number of bit comparisons for QuickSelect
Chapter 1
Introduction and summary
QuickSelect, introduced by Hoare [12] in 1961 and also known as Find, is a search
algorithm widely used for finding a key of target rank in a file of keys. Suppose that there
are n distinct keys (the keys are typically assumed to be distinct, but the algorithm still
works—with a minor adjustment—even if they are not distinct) and that the target rank
is m, where 1 ≤ m ≤ n. The algorithm finds the target key in a recursive and random
fashion. First, it selects a pivot uniformly at random from n keys. Let k denote the rank
of the pivot. If k = m, then the algorithm returns the pivot. If k > m, then the algorithm
recursively operates on the set of keys smaller than the pivot and returns the rank-m key.
Similarly, if k < m, then the algorithm recursively operates on the set of keys larger than
the pivot and returns the (m− k)-th smallest key from the subset.
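The recursion just described can be sketched in code as follows (a minimal illustration; the function name and the list-based partitioning are ours, and the keys are assumed distinct):

```python
import random

def quickselect(keys, m):
    """Return the key of rank m (1-based) in a list of distinct keys."""
    pivot = random.choice(keys)              # pivot chosen uniformly at random
    smaller = [x for x in keys if x < pivot]
    larger = [x for x in keys if x > pivot]
    k = len(smaller) + 1                     # rank of the pivot
    if k == m:
        return pivot
    if k > m:
        return quickselect(smaller, m)       # recurse on the smaller keys
    return quickselect(larger, m - k)        # target is the (m - k)-th smallest there
```

For example, `quickselect([0.31, 0.11, 0.75, 0.52, 0.98], 2)` returns 0.31 regardless of the pivot choices.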
Many studies have examined QuickSelect to quantify its execution costs (a non-
exhaustive list of references is Knuth [15], Mahmoud et al. [19], Prodinger [22], Grübel
and U. Rösler [11], Lent and Mahmoud [18], Grübel [10], Mahmoud and Smythe [20],
Devroye [4], Hwang and Tsai [14], Fill and Nakama [6], and Vallée et al. [27]), and all of
them except for Fill and Nakama [6] and Vallée et al. [27] have conducted the quantification
with regard to the number of key comparisons required by the algorithm to achieve its task.
As a result, most of the theoretical results on the complexity of QuickSelect are about
expectations or distributions for the number of required key comparisons. (We will soon
return to [6] and [27].)
However, one can reasonably argue that analyses of QuickSelect in terms of the
number of key comparisons cannot fully quantify its complexity. For instance, if keys
are represented as binary strings, then individual bits of the strings must be compared
in order for QuickSelect to complete its task. Consider applying QuickSelect to
find the smallest key among three keys k1, k2, and k3 whose binary representations are
.01001100..., .00110101..., and .00101010..., respectively. If the algorithm selects k3 as
a pivot, then it compares each of k1 and k2 to k3 in order to determine the rank of k3. When
k1 and k3 are compared, the algorithm requires 2 bit comparisons to determine that k3 is
smaller than k1 because the two keys have the same first digit and differ at the second digit.
Similarly, when k2 and k3 are compared, the algorithm requires 4 bit comparisons to deter-
mine that k3 is smaller than k2. After these comparisons, k3 has been identified as smallest.
Hence the search for the smallest key requires a total of 6 bit comparisons (resulting from
the two key comparisons). This simple case illustrates that the number of bit comparisons
required by the algorithm more accurately reflects the actual execution cost. (We will con-
sider bit comparisons as an example of symbol comparisons.) When QuickSelect (or
any other algorithm) compares keys during its execution, it must operate on the keys’ rep-
resentations or internal structures, so these should not be ignored in fully characterizing
the performance of the algorithm. Also, symbol-complexity analysis allows us to compare
key-based algorithms such as QuickSelect and QuickSort with digital algorithms
such as those utilizing digital search trees.
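The bit counts in the three-key example above can be checked mechanically (a small illustration; the helper function is ours):

```python
def bit_comparisons(a, b):
    """Bits inspected when comparing two distinct binary expansions:
    the position of the first bit at which they differ."""
    i = 0
    while a[i] == b[i]:
        i += 1
    return i + 1

k1, k2, k3 = "01001100", "00110101", "00101010"   # keys from the example
assert bit_comparisons(k1, k3) == 2   # k1 and k3 differ at the second bit
assert bit_comparisons(k2, k3) == 4   # k2 and k3 differ at the fourth bit
# total cost of ranking the pivot k3: 6 bit comparisons
assert bit_comparisons(k1, k3) + bit_comparisons(k2, k3) == 6
```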
Fill and Janson [5] pioneered symbol-complexity analysis by analyzing the expected
number of bit comparisons required by QuickSort. This well-known and popular sorting
algorithm was also invented by Hoare [13], and, like QuickSelect, it uses a divide-and-
conquer strategy. (In fact, QuickSelect is a “one-sided version” of QuickSort.) The
algorithm picks a pivot key uniformly at random from the given file of keys and compares
the other keys with the pivot to form two subfiles; one of them consists of keys larger than
the pivot, and the other subfile consists of keys smaller than the pivot. QuickSort then
recursively performs the same procedure on each of the resulting subfiles containing more
than one key. Fill and Janson assumed that QuickSort is applied to keys that are i.i.d.
(independent and identically distributed) from the uniform distribution over (0, 1) and rep-
resented (via their binary expansions) as binary strings, and that the algorithm operates on
individual bits in order to sort the keys. They found that the expected number of bit compar-
isons required by QuickSort to sort n keys is asymptotically equivalent to n(lnn)(lg n)
(where lg denotes binary logarithm), whereas the lead-order term of the expected number
of key comparisons is 2n lnn, smaller by a factor of order log n. In their Section 6 they also
considered i.i.d. keys drawn from other distributions with density on (0, 1).
Their symbol-complexity analysis was closely followed by Fill and Nakama [6], who
investigated the expected number of bit comparisons required by QuickSelect. Chap-
ters 2–4 of this dissertation describe the results of [6] in detail. In these chapters, we assume
that the algorithm is applied to n keys that are i.i.d. uniform(0,1) and that the keys are repre-
sented as their binary expansions. We let µ(m,n) denote the expected number of bit com-
parisons required to find the rank-m key in a file of n keys by QuickSelect. By sym-
metry, µ(m,n) = µ(n+1−m,n). In this dissertation, we will refer to QuickSelect for
finding the smallest key and the largest key as QuickMin and QuickMax, respectively.
First, we develop exact and asymptotic formulae for µ(1, n) = µ(n, n), as summarized in
the following theorem.
Theorem 1.1. The expected number µ(1, n) of bit comparisons required by QuickMin
has the following exact and asymptotic expressions:

µ(1, n) = 2n(H_n − 1) + 2 \sum_{j=2}^{n−1} \frac{B_j \bigl[ n − j + 1 − \binom{n}{j} \bigr]}{j(j − 1)(1 − 2^{−j})}   (1.1)

 = cn − \frac{1}{\ln 2} (\ln n)^2 − \left( \frac{2}{\ln 2} + 1 \right) \ln n + O(1),   (1.2)

where H_n and B_j denote harmonic and Bernoulli numbers, respectively, and

c := 2 \sum_{k=0}^{\infty} \left( 1 + 2^{−k} \sum_{j=1}^{2^k} \ln \frac{j}{2^k} \right) ≐ 5.27938.   (1.3)

With χ_k := \frac{2πik}{\ln 2} and γ := Euler's constant ≐ 0.57722, the constant c can alternatively be
expressed as

c = \frac{28}{9} + \frac{17 − 6γ}{9 \ln 2} − \frac{4}{\ln 2} \sum_{k ∈ Z\{0}} \frac{ζ(1 − χ_k) Γ(1 − χ_k)}{Γ(4 − χ_k)(1 − χ_k)}.   (1.4)
The asymptotic formula shows that the expected number of bit comparisons is asymptotically
linear in n with lead-order coefficient approximately equal to 5.27938. Hence the expected
number of bit comparisons differs asymptotically from the expected number of key comparisons
required to find the smallest key only by a constant factor (the expectation for key
comparisons is asymptotically 2n). Details of the derivations of the formulae are described
in Chapter 2.
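The series form (1.3) of the constant converges quickly and is easy to evaluate numerically (a sanity check; the code is ours, with ln((2^k)!) computed via lgamma):

```python
from math import lgamma, log

# Inner sum in (1.3): sum_{j=1}^{2^k} ln(j / 2^k) = ln((2^k)!) - 2^k * k * ln 2,
# where lgamma(m + 1) = ln(m!).
c = 2 * sum(1 + (lgamma(2**k + 1) - 2**k * k * log(2)) / 2**k
            for k in range(40))
assert abs(c - 5.27938) < 1e-3
```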
Complex-analytical methods are utilized to obtain the asymptotic formula (1.2) with c
in the form (1.4) and seem to be indispensable for obtaining asymptotics beyond the lead
term. [We remark that, although it involves the imaginary numbers χk, the expression (1.4)
is real because the terms with indices k and −k are complex conjugates.] In Section 2.3
we again use complex-analytical methods to reexpress (1.4) in the form (1.3). Having
done all this, we suspected that there must be a purely real-analytical way to obtain directly
the lead-order asymptotics µ(1, n) ∼ cn with c in the form (1.3). Indeed, there is: See
Remark 2.3.
In Chapter 3, we move on to derive exact and asymptotic expressions for the expected
number of bit comparisons for the average case, in which the target rank is chosen uniformly
at random from {1, 2, . . . , n}. We denote this expectation by µ(m̄, n); hence

µ(m̄, n) = \frac{1}{n} \sum_{m=1}^{n} µ(m, n).

QuickSelect for finding a key of random rank will be referred to as
QuickRand. The derived asymptotic formula shows that µ(m̄, n) is also asymptotically
linear in n; see (3.45). More detailed results for µ(m̄, n) are described in Chapter 3.
In Chapter 4, we derive an exact expression of µ(m, n) for each fixed m that is suited for
computations. Our preliminary exact formula for µ(m, n) [shown in (2.8)] entails infinite
summation and integration. As a result, it is not a desirable form for numerically computing
the expected number of bit comparisons. Hence we establish another exact formula that
only requires finite summation and use it to compute µ(m, n) for m = 1, . . . , n and n =
2, . . . , 25. The computation leads to the following conjectures: (i) for fixed n, µ(m, n)
increases in m for m ≤ (n + 1)/2 and is symmetric about (n + 1)/2; and (ii) for fixed m,
µ(m, n) increases in n (asymptotically linearly).
Vallée et al. [27] extended the results of Fill and Janson [5] and Fill and Nakama [6]
by investigating QuickSort and QuickSelect with respect to the number of required
comparisons of more general symbols. They considered keys represented by sequences of
symbols generated by any of a wide variety of sources that include memoryless, Markov,
and other dynamical sources. (In both Fill and Nakama [6] and all but Section 6 of Fill and
Janson [5], the symbols that compose each key representation are assumed to be random
uniform bits, generated by a memoryless source.) We will provide details of their proba-
bilistic source models in Section 5.1, since we will use them in formulating our limiting-
distribution results for QuickSelect. Roughly summarized, Vallée et al. showed that the
expected number of symbol comparisons in processing a file of n keys is of order n log² n
for QuickSort and of order n for QuickSelect (applied to find a key of target rank
mn with mn/n → α for any given α ∈ [0, 1]) if symbols are generated by a suitably
nice source. (For example, all memoryless sources are suitably nice.) For a more detailed
discussion of sources and the results of Vallée et al. [27] for QuickSelect, see Sec-
tion 5.1. In this dissertation, we further deepen the quantification of the performance of
QuickSelect in terms of the number of symbol comparisons used by the algorithm.
In Chapters 5–8, we investigate limiting distributions for the number of symbol com-
parisons required by QuickSelect. We believe that our study is the first to establish
a limiting distribution for the number of symbol comparisons required by any key-based
algorithm. Our elementary approach allows us to handle rather general kinds of “cost” for
comparing two keys, and in particular to recover in a rather direct way known results about
the number of key comparisons. There is no disadvantage to allowing general costs, since
our results rely only on broad restrictions on the nature of the cost.
In the last four chapters, we shall be concerned primarily with QuickQuant ≡
QuickQuant(n, α), which is what we call the algorithm QuickSelect when ap-
plied to find the key of rank mn in a file of size n, where we are given 0 ≤ α ≤ 1
and a sequence (mn) such that mn/n → α. It turns out to be convenient mathemati-
cally to analyze a close cousin to QuickQuant introduced by Vallée et al. [27], namely,
QuickVal ≡ QuickVal(n, α) (this algorithm is fully described in Section 5.3), and then
treat QuickQuant by comparison. For the same value of α, the operation of QuickVal
is quite close to that of QuickQuant, and execution costs of the two algorithms are ex-
pected to be close; in fact, we will show that if S^Q_n ≡ S^Q_n(α) and S^V_n ≡ S^V_n(α) denote the
total costs of executing QuickQuant and QuickVal, respectively, then S^Q_n/n and S^V_n/n
have the same limiting distribution under proper conditions on the cost function. In order to
prove the convergence in distribution, we will define all the random variables S^Q_1, S^Q_2, . . .
and S^V_1, S^V_2, . . . on a common probability space and show that S^Q_n/n and S^V_n/n both
converge in L^p to a common limit, call it S, for 1 ≤ p < ∞. The limit S is defined at (6.7);
see the end of Section 6.1. (Under a technical assumption, we will also prove almost sure
convergence for QuickVal in Section 6.3.)
In order to establish limiting-distribution results for QuickVal and QuickQuant,
we first provide a careful description of the probabilistic models used to govern the gener-
ation of keys in Section 5.1. Then we review known results about key and symbol com-
parisons in Section 5.2 and describe QuickVal in Section 5.3. In Chapter 6, we present
convergence results for QuickVal; see Theorems 6.1 and 6.4. Finally we move on to
QuickQuant in Chapter 7. Theorem 7.1 is the main result of our limiting-distribution
analysis. We will also analyze ES in Chapter 8 and derive integral and series expressions
for it. These expressions will be used to recover the previously obtained results reviewed
in Section 5.2.
Chapter 2
Analysis of µ(1, n)
In this chapter, we analyze the expected number of bit comparisons required by
QuickMin (QuickSelect for finding the smallest key). By symmetry, the expecta-
tion equals that for QuickMax (QuickSelect for finding the largest key). As described
in Chapter 1, we assume that QuickMin is applied to n distinct keys that are represented
as bit strings and that the algorithm operates on individual bits in order to compare keys and
find the target key. It is further assumed that the n keys are uniformly and independently
distributed in (0, 1). In Section 2.1, we describe the framework and notation for our bit-
complexity analysis. In Section 2.2, we derive an exact expression for µ(1, n). We conduct
an asymptotic analysis of this expression in Section 2.3.
2.1 Preliminaries
To investigate the bit complexity of QuickSelect, we follow the general ap-
proach developed by Fill and Janson [5]. Let U1, . . . , Un denote the n keys uniformly
and independently distributed on (0, 1), and let U(i) denote the rank-i key. Then, for
1 ≤ i < j ≤ n (assume n ≥ 2),

P\{U_{(i)} \text{ and } U_{(j)} \text{ are compared}\} =
\begin{cases}
\frac{2}{j − m + 1} & \text{if } m ≤ i, \\
\frac{2}{j − i + 1} & \text{if } i < m < j, \\
\frac{2}{m − i + 1} & \text{if } j ≤ m.
\end{cases}   (2.1)
To determine the first probability in (2.1), note that U(m), . . . , U(j) remain in the same subset
until the first time that one of them is chosen as a pivot. Therefore, U(i) and U(j) are
compared if and only if the first pivot chosen from U(m), . . . , U(j) is either U(i) or U(j).
Analogous arguments establish the other two cases.
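The probabilities in (2.1) can be spot-checked by simulation (a rough Monte Carlo illustration; all names are ours):

```python
import random

def compared_pairs(ranks, m):
    """Run QuickSelect on a set of distinct ranks, searching for rank m,
    and record which pairs of ranks get compared."""
    pairs = set()
    def rec(keys, target):
        pivot = random.choice(keys)
        for x in keys:
            if x != pivot:
                pairs.add((min(x, pivot), max(x, pivot)))
        smaller = [x for x in keys if x < pivot]
        k = len(smaller) + 1
        if k == target:
            return
        if k > target:
            rec(smaller, target)
        else:
            rec([x for x in keys if x > pivot], target - k)
    rec(ranks, m)
    return pairs

random.seed(1)
n, m, trials = 6, 3, 50_000
hits = sum((1, 5) in compared_pairs(list(range(1, n + 1)), m)
           for _ in range(trials))
# i = 1 < m = 3 < j = 5: predicted probability 2/(j - i + 1) = 2/5
assert abs(hits / trials - 2 / 5) < 0.02
```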
For 0 < s < t < 1, it is well known that the joint density function of U(i) and U(j)
is given by

f_{U_{(i)},U_{(j)}}(s, t) := \binom{n}{i−1,\,1,\,j−i−1,\,1,\,n−j} s^{i−1} (t − s)^{j−i−1} (1 − t)^{n−j}.   (2.2)
Clearly, the event that U(i) and U(j) are compared is independent of the random variables
U(i) and U(j). Hence, defining

P_1(s, t, m, n) = \sum_{m ≤ i < j ≤ n} \frac{2}{j − m + 1} f_{U_{(i)},U_{(j)}}(s, t),   (2.3)

P_2(s, t, m, n) = \sum_{1 ≤ i < m < j ≤ n} \frac{2}{j − i + 1} f_{U_{(i)},U_{(j)}}(s, t),   (2.4)

P_3(s, t, m, n) = \sum_{1 ≤ i < j ≤ m} \frac{2}{m − i + 1} f_{U_{(i)},U_{(j)}}(s, t),   (2.5)

P(s, t, m, n) = P_1(s, t, m, n) + P_2(s, t, m, n) + P_3(s, t, m, n)   (2.6)

[the sums in (2.3)–(2.5) are double sums over i and j], and letting β(s, t) denote the index
of the first bit at which the keys s and t differ, we can write the expectation µ(m, n) of the
number of bit comparisons required to find the rank-m key in a file of n keys as

µ(m, n) = \int_0^1 \int_s^1 β(s, t) P(s, t, m, n) \, dt \, ds   (2.7)

 = \sum_{k=0}^{\infty} \sum_{l=1}^{2^k} \int_{(l−1)2^{−k}}^{(l−\frac{1}{2})2^{−k}} \int_{(l−\frac{1}{2})2^{−k}}^{l 2^{−k}} (k + 1) P(s, t, m, n) \, dt \, ds;   (2.8)

in this expression, note that k represents the last bit at which s and t agree.
2.2 Exact computation of µ(1, n)
Since the contribution of P_2(s, t, m, n) or P_3(s, t, m, n) to P(s, t, m, n) is zero for
m = 1, we have P(s, t, 1, n) = P_1(s, t, 1, n) [see (2.4) through (2.6)]. Let x := s,
y := t − s, z := 1 − t. Then
P_1(s, t, 1, n) = z^n \sum_{1 ≤ i < j ≤ n} \frac{2}{j} \binom{n}{i−1,\,1,\,j−i−1,\,1,\,n−j} x^{i−1} y^{j−i−1} z^{−j}

 = 2 z^n \int_z^{\infty} η^{−n−1} \sum_{1 ≤ i < j ≤ n} \binom{n}{i−1,\,1,\,j−i−1,\,1,\,n−j} x^{i−1} y^{j−i−1} η^{n−j} \, dη

 = 2 z^n \int_z^{\infty} η^{−n−1} n(n − 1)(x + y + η)^{n−2} \, dη

 = 2 z^n n(n − 1) \int_z^{\infty} η^{−3} \left( \frac{t}{η} + 1 \right)^{n−2} dη.   (2.9)
Making the change of variables v = \frac{t}{η} + 1 and integrating, and recalling z = 1 − t, we find,
after some calculation,

P_1(s, t, 1, n) = 2 \sum_{j=2}^{n} (−1)^j \binom{n}{j} t^{j−2}.   (2.10)
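The closed form (2.10) can be checked against the defining sum (2.3) at sample points (a numerical check; the helper names are ours):

```python
from math import comb, factorial

def multinom(n, parts):
    """Multinomial coefficient n! / (p1! p2! ...)."""
    out = factorial(n)
    for p in parts:
        out //= factorial(p)
    return out

def P1_def(s, t, n):
    """P_1(s, t, 1, n) computed directly from (2.3) with m = 1."""
    total = 0.0
    for i in range(1, n):
        for j in range(i + 1, n + 1):
            dens = (multinom(n, [i - 1, 1, j - i - 1, 1, n - j])
                    * s**(i - 1) * (t - s)**(j - i - 1) * (1 - t)**(n - j))
            total += 2 / j * dens
    return total

def P1_closed(t, n):
    """The closed form (2.10); note that it depends on t only."""
    return 2 * sum((-1)**j * comb(n, j) * t**(j - 2) for j in range(2, n + 1))

assert abs(P1_def(0.2, 0.6, 7) - P1_closed(0.6, 7)) < 1e-9
```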
From (2.8) and (2.10),

µ(1, n) = \sum_{k=0}^{\infty} (k + 1) \sum_{l=1}^{2^k} \int_{(l−1)2^{−k}}^{(l−\frac{1}{2})2^{−k}} \int_{(l−\frac{1}{2})2^{−k}}^{l 2^{−k}} P_1(s, t, 1, n) \, dt \, ds

 = 2 \sum_{k=0}^{\infty} (k + 1) \sum_{l=1}^{2^k} \int_{(l−1)2^{−k}}^{(l−\frac{1}{2})2^{−k}} \int_{(l−\frac{1}{2})2^{−k}}^{l 2^{−k}} \sum_{j=2}^{n} (−1)^j \binom{n}{j} t^{j−2} \, dt \, ds

 = 2 \sum_{k=0}^{\infty} (k + 1) \sum_{l=1}^{2^k} \sum_{j=2}^{n} (−1)^j \binom{n}{j} \int_{(l−\frac{1}{2})2^{−k}}^{l 2^{−k}} t^{j−2} \bigl[ (l − \tfrac{1}{2})2^{−k} − (l − 1)2^{−k} \bigr] \, dt

 = \sum_{k=0}^{\infty} (k + 1) \sum_{l=1}^{2^k} \sum_{j=2}^{n} \frac{(−1)^j \binom{n}{j}}{j − 1} \, 2^{−k} \bigl\{ (l 2^{−k})^{j−1} − [(l − \tfrac{1}{2})2^{−k}]^{j−1} \bigr\}

 = \sum_{j=2}^{n} \frac{(−1)^j \binom{n}{j}}{j − 1} \sum_{k=0}^{\infty} (k + 1) 2^{−kj} \sum_{l=1}^{2^k} [l^{j−1} − (l − \tfrac{1}{2})^{j−1}].   (2.11)
To further transform (2.11), define

a_{j,r} =
\begin{cases}
\frac{B_r}{r} \binom{j−1}{r−1} & \text{if } r ≥ 2 \\
\frac{1}{2} & \text{if } r = 1 \\
\frac{1}{j} & \text{if } r = 0,
\end{cases}   (2.12)

where B_r denotes the r-th Bernoulli number. Let S_{n,j} := \sum_{l=1}^{n} l^{j−1}. Then
S_{n,j} = \sum_{r=0}^{j−1} a_{j,r} n^{j−r} (see Knuth [17]), and

\sum_{l=1}^{2^k} [l^{j−1} − (l − \tfrac{1}{2})^{j−1}] = S_{2^k,j} − 2^{−(j−1)} \sum_{l=1}^{2^k} (2l − 1)^{j−1}

 = S_{2^k,j} − 2^{−(j−1)} (S_{2^{k+1},j} − 2^{j−1} S_{2^k,j}) = 2 S_{2^k,j} − 2^{−(j−1)} S_{2^{k+1},j}

 = 2 \sum_{r=0}^{j−1} a_{j,r} 2^{k(j−r)} − 2^{−(j−1)} \sum_{r=0}^{j−1} a_{j,r} 2^{(k+1)(j−r)} = 2 \sum_{r=1}^{j−1} a_{j,r} 2^{k(j−r)} (1 − 2^{−r}).   (2.13)
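Identity (2.13) is exact and can be verified in rational arithmetic for particular j and k (a spot check; the code is ours, using the known values B_2 = 1/6 and B_3 = 0):

```python
from fractions import Fraction
from math import comb

# Check (2.13) for j = 4, k = 3.
j, k = 4, 3
a = {1: Fraction(1, 2),                        # a_{j,1} = 1/2
     2: Fraction(1, 6) / 2 * comb(j - 1, 1),   # a_{j,2} = (B_2/2) C(j-1, 1)
     3: Fraction(0)}                           # B_3 = 0
half = Fraction(1, 2)
lhs = sum(Fraction(l)**(j - 1) - (l - half)**(j - 1) for l in range(1, 2**k + 1))
rhs = 2 * sum(a[r] * Fraction(2)**(k * (j - r)) * (1 - Fraction(1, 2**r))
              for r in range(1, j))
assert lhs == rhs
```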
From (2.11) and (2.13),

µ(1, n) = 2 \sum_{j=2}^{n} \frac{(−1)^j \binom{n}{j}}{j − 1} \sum_{k=0}^{\infty} (k + 1) 2^{−kj} \sum_{r=1}^{j−1} a_{j,r} 2^{k(j−r)} (1 − 2^{−r}).

Here

\sum_{k=0}^{\infty} (k + 1) 2^{−kj} \sum_{r=1}^{j−1} a_{j,r} 2^{k(j−r)} (1 − 2^{−r}) = \sum_{k=0}^{\infty} (k + 1) \sum_{r=1}^{j−1} a_{j,r} 2^{−kr} (1 − 2^{−r})

 = \sum_{r=1}^{j−1} a_{j,r} (1 − 2^{−r}) \sum_{k=0}^{\infty} (k + 1) 2^{−kr} = \sum_{r=1}^{j−1} a_{j,r} (1 − 2^{−r})^{−1}.
Hence

µ(1, n) = 2 \sum_{j=2}^{n} \frac{(−1)^j \binom{n}{j}}{j − 1} \sum_{r=1}^{j−1} a_{j,r} (1 − 2^{−r})^{−1} = 2 \sum_{r=1}^{n−1} (1 − 2^{−r})^{−1} \sum_{j=r+1}^{n} \frac{(−1)^j \binom{n}{j}}{j − 1} a_{j,r}

 = 2 \sum_{j=2}^{n} \frac{(−1)^j \binom{n}{j}}{j − 1} + 2 \sum_{r=2}^{n−1} (1 − 2^{−r})^{−1} \frac{B_r}{r} \sum_{j=r+1}^{n} \frac{(−1)^j \binom{n}{j} \binom{j−1}{r−1}}{j − 1}

 = 2 \sum_{j=2}^{n} \frac{(−1)^j \binom{n}{j}}{j − 1} + 2 \sum_{r=2}^{n−1} (1 − 2^{−r})^{−1} \frac{B_r}{r} \left[ \sum_{j=r}^{n} \frac{(−1)^j \binom{n}{j} \binom{j−1}{r−1}}{j − 1} − \frac{(−1)^r \binom{n}{r}}{r − 1} \right].   (2.14)
The sum \sum_{j=r}^{n} \frac{(−1)^j \binom{n}{j} \binom{j−1}{r−1}}{j − 1} can be simplified as follows:

\sum_{j=r}^{n} \frac{(−1)^j}{j − 1} \binom{n}{j} \binom{j−1}{r−1} = \frac{1}{r − 1} \sum_{j=r}^{n} (−1)^j \binom{n}{j} \binom{j−2}{r−2}

 = \frac{1}{r − 1} \sum_{j=r}^{n} (−1)^j \binom{n}{j} \binom{j−2}{j−r}

 = \frac{(−1)^r}{r − 1} \sum_{j=0}^{n} \binom{n}{n−j} \binom{1−r}{j−r} = \frac{(−1)^r}{r − 1} \binom{n+1−r}{n−r} = (−1)^r \, \frac{n + 1 − r}{r − 1}.   (2.15)
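The binomial identity (2.15) is easy to confirm in exact arithmetic over a range of n and r (a spot check; the code is ours):

```python
from fractions import Fraction
from math import comb

for n in range(5, 10):
    for r in range(2, n):
        lhs = sum(Fraction((-1)**j * comb(n, j) * comb(j - 1, r - 1), j - 1)
                  for j in range(r, n + 1))
        assert lhs == Fraction((-1)**r * (n + 1 - r), r - 1)
```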
Plugging (2.15) into (2.14) and recalling B_{2k+1} = 0 for k ≥ 1, we finally obtain

µ(1, n) = 2 \sum_{j=2}^{n} \frac{(−1)^j \binom{n}{j}}{j − 1} + 2 \sum_{r=2}^{n−1} (1 − 2^{−r})^{−1} \frac{B_r}{r} \left[ \frac{(−1)^r (n − r + 1)}{r − 1} − \frac{(−1)^r \binom{n}{r}}{r − 1} \right]

 = 2 \sum_{j=2}^{n} \frac{(−1)^j \binom{n}{j}}{j − 1} + 2 \sum_{j=2}^{n−1} \frac{B_j \bigl[ n − j + 1 − \binom{n}{j} \bigr]}{j(j − 1)(1 − 2^{−j})}

 = 2n(H_n − 1) + 2 t_n,   (2.16)

where H_n denotes the n-th harmonic number and

t_n := \sum_{j=2}^{n−1} \frac{B_j}{j(1 − 2^{−j})} \left[ \frac{n − \binom{n}{j}}{j − 1} − 1 \right].   (2.17)
The last equality in (2.16) follows from the easy identity

\sum_{k=1}^{n} \frac{(−1)^{k−1} \binom{n}{k}}{k} = H_n.
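The closed form (2.16)–(2.17) involves only finite sums and can be evaluated exactly (a sketch; the function names are ours). For instance it gives µ(1, 2) = 2, matching the direct computation E[β(U_1, U_2)] = Σ_{k≥0} 2^{−k} = 2 for the single key comparison made when n = 2:

```python
from fractions import Fraction
from math import comb

def bernoulli(m):
    """B_0..B_m via the standard recurrence (only even indices matter here)."""
    B = [Fraction(1)]
    for i in range(1, m + 1):
        B.append(-Fraction(1, i + 1) * sum(comb(i + 1, k) * B[k] for k in range(i)))
    return B

B = bernoulli(30)

def mu1(n):
    """mu(1, n) from (2.16)-(2.17), in exact rational arithmetic."""
    H = sum(Fraction(1, i) for i in range(1, n + 1))
    t = sum(B[j] / (j * (1 - Fraction(1, 2**j))) *
            (Fraction(n - comb(n, j), j - 1) - 1)
            for j in range(2, n))
    return 2 * n * (H - 1) + 2 * t

assert mu1(2) == 2
assert mu1(3) == Fraction(43, 9)
```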
2.3 Asymptotic analysis of µ(1, n)
In order to obtain an asymptotic expression for µ(1, n), we analyze t_n in (2.16)–
(2.17). The following lemma provides an exact expression for t_n that easily leads to an
asymptotic expression for µ(1, n):

Lemma 2.1. For n ≥ 2, let u_n := t_{n+1} − t_n (with t_2 = 0) and v_n := u_{n+1} − u_n. Let γ
denote Euler's constant (≐ 0.57722), and define χ_k := \frac{2πik}{\ln 2}. Then

(i)

v_n = − \frac{1}{n + 1} + \frac{\frac{H_{n+2}}{\ln 2} − \left( \frac{γ}{\ln 2} − \frac{1}{2} \right)}{(n + 1)(n + 2)} − Σ_n,

where

Σ_n := \sum_{k ∈ Z\{0}} \frac{ζ(1 − χ_k) Γ(n + 1) Γ(1 − χ_k)}{(\ln 2)\, Γ(n + 3 − χ_k)};

(ii)

u_n = −H_n + a − \frac{H_{n+1}}{(\ln 2)(n + 1)} + \left( \frac{γ − 1}{\ln 2} − \frac{1}{2} \right) \frac{1}{n + 1} + Σ_n,

where

a := \frac{14}{9} + \frac{17 − 6γ}{18 \ln 2} − \frac{2}{\ln 2} \sum_{k ∈ Z\{0}} \frac{ζ(1 − χ_k) Γ(1 − χ_k)}{Γ(4 − χ_k)(1 − χ_k)},

Σ_n := \sum_{k ∈ Z\{0}} \frac{ζ(1 − χ_k) Γ(1 − χ_k)}{(\ln 2)(1 − χ_k)} \, \frac{Γ(n + 1)}{Γ(n + 2 − χ_k)};

(iii)

t_n = −(nH_n − n − 1) + a(n − 2) − \frac{1}{2 \ln 2} \left[ H_n^2 + H_n^{(2)} − \frac{7}{2} \right] + \left( \frac{γ − 1}{\ln 2} − \frac{1}{2} \right) \left( H_n − \frac{3}{2} \right) + b − Σ̃_n,

where

b := \sum_{k ∈ Z\{0}} \frac{2 ζ(1 − χ_k) Γ(−χ_k)}{(\ln 2)(1 − χ_k) Γ(3 − χ_k)},

Σ̃_n := \sum_{k ∈ Z\{0}} \frac{ζ(1 − χ_k) Γ(−χ_k) Γ(n + 1)}{(\ln 2)(1 − χ_k) Γ(n + 1 − χ_k)},

and H_n^{(2)} denotes the n-th harmonic number of order 2, i.e., H_n^{(2)} := \sum_{i=1}^{n} \frac{1}{i^2}.
In this lemma, u_n and v_n are derived in order to obtain the exact expression for t_n in (iii).
From (2.16), the exact expression for t_n also provides an alternative exact expression for
µ(1, n).
Before proving Lemma 2.1, we complete the proof of Theorem 1.1 using part (iii).
We know

H_n = \ln n + γ + \frac{1}{2n} − \frac{1}{12n^2} + O(n^{−4}),   (2.18)

H_n^{(2)} = \frac{π^2}{6} − \frac{1}{n} + \frac{1}{2n^2} + O(n^{−3}).   (2.19)

Combining (2.18)–(2.19) with (2.16) and Lemma 2.1(iii), we obtain an asymptotic expression
for µ(1, n):

µ(1, n) = 2an − \frac{1}{\ln 2} (\ln n)^2 − \left( \frac{2}{\ln 2} + 1 \right) \ln n + O(1).   (2.20)
The term O(1) in (2.20) has fluctuations of small magnitude due to Σ̃_n, which is periodic in
log n with amplitude smaller than 0.00110. Thus, as shown in Theorem 1.1, the asymptotic
slope in (2.20) is

c = 2a = \frac{28}{9} + \frac{17 − 6γ}{9 \ln 2} − \frac{4}{\ln 2} \sum_{k ∈ Z\{0}} \frac{ζ(1 − χ_k) Γ(1 − χ_k)}{Γ(4 − χ_k)(1 − χ_k)}.   (2.21)
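The harmonic-number expansions (2.18)–(2.19) used in this step can be sanity-checked numerically (an illustration; γ is truncated to double precision):

```python
from math import log, pi

n = 10_000
gamma = 0.5772156649015329
H = sum(1 / i for i in range(1, n + 1))
H2 = sum(1 / i**2 for i in range(1, n + 1))
assert abs(H - (log(n) + gamma + 1/(2*n) - 1/(12*n**2))) < 1e-10
assert abs(H2 - (pi**2/6 - 1/n + 1/(2*n**2))) < 1e-10
```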
Let S denote the sum in c:

S := \sum_{k ∈ Z\{0}} \frac{ζ(1 − χ_k) Γ(1 − χ_k)}{Γ(4 − χ_k)(1 − χ_k)} = \sum_{k ∈ Z\{0}} \frac{ζ(1 − χ_k)}{(3 − χ_k)(2 − χ_k)(1 − χ_k)^2},   (2.22)

where the formula Γ(1 + x) = xΓ(x) is used to derive the second expression. Both expressions
involve the imaginary numbers χ_k, but S is a real number. We investigate S and
express it using only real functions. We have the following result:

Theorem 2.2. Let S be the sum defined at (2.22). Then

\frac{S}{\ln 2} = S̄ − ρ,   (2.23)

where

S̄ := \sum_{k=0}^{\infty} 2^{−k} h(2^k), \qquad h(m) := \frac{1}{2}(m \ln m − \ln m! − m) + \frac{3}{8} − \frac{1}{24} m^{−1},   (2.24)

and

ρ := − \frac{17 − 6γ}{36 \ln 2} − \frac{1}{12}.
Proof of Theorem 2.2. Choose and fix 0 < θ < 1. We show that the integral

J := \int_{θ−i∞}^{θ+i∞} \frac{ζ(1 − s)}{(1 − 2^{−s})(3 − s)(2 − s)(1 − s)^2} \, ds

equals 2πi S̄ on the one hand and equals 2πi[ρ + (S/\ln 2)] on the other hand. Equating
these two expressions gives the desired result.
To get the first expression for J, we calculate

J = \sum_{k=0}^{\infty} \int_{θ−i∞}^{θ+i∞} \frac{ζ(1 − s)\, 2^{−ks}}{(3 − s)(2 − s)(1 − s)^2} \, ds = \sum_{k=0}^{\infty} 2^{−k} \int_{1−θ−i∞}^{1−θ+i∞} \frac{ζ(t)\, 2^{kt}}{t^2 (1 + t)(2 + t)} \, dt.
But, for any positive integer m and any α > 1,

\int_{1−θ−i∞}^{1−θ+i∞} \frac{ζ(t)\, m^t}{t^2 (1 + t)(2 + t)} \, dt = \int_{α−i∞}^{α+i∞} \frac{ζ(t)\, m^t}{t^2 (1 + t)(2 + t)} \, dt − 2πi \, \mathrm{Res}_{t=1} \left[ \frac{ζ(t)\, m^t}{t^2 (1 + t)(2 + t)} \right],

which follows from residue calculus, taking into account the contribution of the simple
pole of the integrand at 1. Here

\mathrm{Res}_{t=1} \left[ \frac{ζ(t)\, m^t}{t^2 (1 + t)(2 + t)} \right] = \frac{m}{6}

and

\int_{α−i∞}^{α+i∞} \frac{ζ(t)\, m^t}{t^2 (1 + t)(2 + t)} \, dt = \sum_{j=1}^{\infty} \int_{α−i∞}^{α+i∞} \frac{(j/m)^{−t}}{t^2 (1 + t)(2 + t)} \, dt.

Further, since

\frac{1}{t^2 (t + 1)(t + 2)} = \frac{−3/4}{t} + \frac{1/2}{t^2} + \frac{1}{t + 1} − \frac{1/4}{t + 2},

we have by Mellin inversion that

\frac{1}{2πi} \int_{α−i∞}^{α+i∞} \frac{x^{−t}}{t^2 (1 + t)(2 + t)} \, dt

equals

f(x) := − \frac{3}{4} − \frac{1}{2} \ln x + x − \frac{1}{4} x^2

for 0 ≤ x ≤ 1 and equals 0 for x ≥ 1. (Note that this requires only α > 0.) So

\int_{α−i∞}^{α+i∞} \frac{ζ(t)\, m^t}{t^2 (1 + t)(2 + t)} \, dt = 2πi \sum f\!\left( \frac{j}{m} \right) = 2πi \left[ \frac{1}{2}(m \ln m − \ln m!) − \frac{1}{3} m + \frac{3}{8} − \frac{1}{24} m^{−1} \right] = 2πi \left[ h(m) + \frac{m}{6} \right],

where the sum is over 1 ≤ j ≤ m (or 1 ≤ j ≤ m − 1), and therefore

\int_{1−θ−i∞}^{1−θ+i∞} \frac{ζ(t)\, m^t}{t^2 (1 + t)(2 + t)} \, dt = 2πi \, h(m).
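The evaluation of Σ_j f(j/m) above can be checked directly (a numerical check; the names are ours, with lgamma(m + 1) = ln(m!)):

```python
from math import lgamma, log

def f(x):
    return -3/4 - log(x)/2 + x - x*x/4

def h(m):
    # h(m) as defined at (2.24)
    return (m*log(m) - lgamma(m + 1) - m)/2 + 3/8 - 1/(24*m)

m = 50
assert abs(sum(f(j/m) for j in range(1, m + 1)) - (h(m) + m/6)) < 1e-9
```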
Thus we obtain our first expression for J. Before we proceed, we examine the series
expression (2.24) for S̄. Applying Stirling's formula to \ln m!, we obtain

\ln m! = m \ln m − m + \frac{1}{2} \ln m + \frac{1}{2} \ln(2π) + \frac{1}{12m} − \frac{1}{360m^3} + O(m^{−4}).

Thus

h(m) = − \frac{1}{4} \ln m + \left( \frac{3}{8} − \frac{1}{4} \ln(2π) \right) − \frac{1}{12m} + \frac{1}{720m^3} + O(m^{−4}),

and the series S̄ converges geometrically rapidly.
To obtain the second expression for J we move the horizontal (i.e., real) coordinate
of the vertical line of integration over from θ to −C, where C is a large positive number
(C → ∞). By residue calculus, we find

J = 2πi \left\{ \mathrm{Res}_{s=0} \left[ \frac{ζ(1 − s)}{(1 − 2^{−s})(3 − s)(2 − s)(1 − s)^2} \right] + \frac{S}{\ln 2} \right\} = 2πi \left( ρ + \frac{S}{\ln 2} \right),

as desired.
Using Theorem 2.2 it is straightforward to derive the alternative expression

c = 2 \sum_{k=0}^{\infty} \left( 1 + 2^{−k} \sum_{j=1}^{2^k} \ln \frac{j}{2^k} \right)   (2.25)

for the linear coefficient c in (2.21). Grabner and Prodinger [9] obtained an earlier draft of
this manuscript and independently conducted a similar analysis of S leading to (2.25). They
also showed how to compute c efficiently to high precision and in particular computed c to
50 decimal places.
Remark 2.3. The lead-order asymptotics µ(1, n) ∼ cn with c in the form (2.25) can also
be obtained simply using real-analytical arguments. Start with (2.7) with m = 1 and recall
that P(s, t, 1, n) = P_1(s, t, 1, n) is given by (2.10) to see that

µ(1, n) = 2 \int_0^1 \int_0^t β(s, t)\, t^{−2} [(1 − t)^n − 1 + nt] \, ds \, dt.

An easy dominated-convergence argument then shows that µ(1, n) ∼ cn with c given in
the integral form

c = 2 \int_0^1 \int_0^t β(s, t)\, t^{−1} \, ds \, dt.

Writing

β(s, t) = \sum_{k=0}^{\infty} 1\{s \text{ and } t \text{ agree in their first } k \text{ bits}\}

and breaking up the double integral according to the first k bits of t leads to the summation
form (2.25) of c. Vallée et al. [27] followed and further generalized this approach. We omit
the details. We do not know how to obtain asymptotics for µ(1, n) beyond the lead term by
this sort of approach.
Now we prove Lemma 2.1:
Proof of Lemma 2.1. (i) Since

u_n = t_{n+1} − t_n = \sum_{j=2}^{n} \frac{B_j}{j(1 − 2^{−j})} \left[ \frac{(n + 1) − \binom{n+1}{j}}{j − 1} − 1 \right] − \sum_{j=2}^{n−1} \frac{B_j}{j(1 − 2^{−j})} \left[ \frac{n − \binom{n}{j}}{j − 1} − 1 \right]

 = − \sum_{j=2}^{n} \frac{B_j}{j(j − 1)(1 − 2^{−j})} \left[ \binom{n}{j − 1} − 1 \right],
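The first simplification in the proof, the closed form for u_n = t_{n+1} − t_n, can be verified exactly for small n (a sketch; the helper names are ours):

```python
from fractions import Fraction
from math import comb

def bernoulli(m):
    B = [Fraction(1)]
    for i in range(1, m + 1):
        B.append(-Fraction(1, i + 1) * sum(comb(i + 1, k) * B[k] for k in range(i)))
    return B

B = bernoulli(16)

def t(n):  # eq. (2.17); the sum is empty for n = 2, so t(2) = 0
    return sum(B[j] / (j * (1 - Fraction(1, 2**j))) *
               (Fraction(n - comb(n, j), j - 1) - 1) for j in range(2, n))

def u(n):  # the closed form for u_n derived above
    return -sum(B[j] * (comb(n, j - 1) - 1) /
                (j * (j - 1) * (1 - Fraction(1, 2**j))) for j in range(2, n + 1))

for n in range(2, 12):
    assert u(n) == t(n + 1) - t(n)
assert u(2) == -Fraction(1, 9)   # matches u_2 = t_3 = -1/9 used in part (ii)
```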
it follows that

v_n = u_{n+1} − u_n = − \sum_{j=2}^{n+1} \frac{B_j}{j(j − 1)(1 − 2^{−j})} \left[ \binom{n+1}{j − 1} − 1 \right] + \sum_{j=2}^{n+1} \frac{B_j}{j(j − 1)(1 − 2^{−j})} \left[ \binom{n}{j − 1} − 1 \right]

 = − \sum_{k=0}^{n−1} \binom{n}{k} \frac{B_{k+2}}{(k + 2)(k + 1)[1 − 2^{−(k+2)}]}

 = \sum_{k=0}^{n−1} (−1)^k \binom{n}{k} \frac{ζ(−1 − k)}{(k + 1)[1 − 2^{−(k+2)}]}   (2.26)

 = \frac{(−1)^n}{2πi} \int_C \frac{ζ(−1 − s)}{(s + 1)[1 − 2^{−(s+2)}]} \, \frac{n!}{s(s − 1) \cdots (s − n)} \, ds,   (2.27)

where C is a positively oriented closed curve that encircles the integers 0, . . . , n − 1 and
does not include or encircle any of the following points: −2 + χ_k (where χ_k := \frac{2πik}{\ln 2}),
k ∈ Z; −1; and n. Equality (2.26) follows from the fact that the Bernoulli numbers are
extrapolated by the Riemann zeta function taken at nonpositive integers: B_k = −kζ(1 − k).
[The coefficients (−1)^k do not concern us since the Bernoulli numbers of odd index greater
than 1 vanish.] Equality (2.27) follows from a direct application of residue calculus, taking
into account contributions of the simple poles at the integers 0, . . . , n − 1.

Let φ(s) denote the integrand in (2.27):

φ(s) = \frac{ζ(−1 − s)}{(s + 1)[1 − 2^{−(s+2)}]} \, \frac{n!}{s(s − 1) \cdots (s − n)}.
We consider a positively oriented rectangular contour C_l with horizontal sides Im(s) = λ_l
and Im(s) = −λ_l, where λ_l := \frac{(2l+1)π}{\ln 2}, l ∈ Z^+, and vertical sides Re(s) = n − θ and
Re(s) = −λ_l, where 0 < θ < 1. By elementary bounds on φ(s) along C_l and the fact that

\int_{n−θ−i∞}^{n−θ+i∞} φ(s) \, ds = 0   (2.28)

[this is implicit on page 113 of Flajolet and Sedgewick [8] and explicitly proved in Appendix A],
one can show that

\lim_{l→∞} \int_{C_l} φ(s) \, ds = 0.

Accounting for residues due to the poles encircled by C_l, we obtain

v_n = (−1)^{n+1} \left[ \mathrm{Res}_{s=−1} φ(s) + \mathrm{Res}_{s=−2} φ(s) + \sum_{k ∈ Z\{0}} \mathrm{Res}_{s=−2+χ_k} φ(s) \right]

 = − \frac{1}{n + 1} + \frac{\frac{H_{n+2}}{\ln 2} − \left( \frac{γ}{\ln 2} − \frac{1}{2} \right)}{(n + 1)(n + 2)} − Σ_n,   (2.29)

where

Σ_n := \sum_{k ∈ Z\{0}} \frac{ζ(1 − χ_k) Γ(n + 1) Γ(1 − χ_k)}{(\ln 2)\, Γ(n + 3 − χ_k)}.   (2.30)
(ii) We have u_2 = t_3 − t_2 = t_3 = −\frac{1}{9}. Hence, from (i),

u_n = u_2 + \sum_{j=2}^{n−1} v_j = − \frac{1}{9} + \sum_{j=2}^{n−1} v_j

 = − \frac{1}{9} − \sum_{j=2}^{n−1} \frac{1}{j + 1} + \frac{1}{\ln 2} \sum_{j=2}^{n−1} \frac{H_{j+2}}{(j + 1)(j + 2)} − \left( \frac{γ}{\ln 2} − \frac{1}{2} \right) \sum_{j=2}^{n−1} \frac{1}{(j + 1)(j + 2)} − \sum_{j=2}^{n−1} Σ_j

 = − \frac{1}{9} − (H_n − H_2) + \frac{1}{\ln 2} \sum_{j=2}^{n−1} \frac{H_{j+2}}{(j + 1)(j + 2)} − \left( \frac{γ}{\ln 2} − \frac{1}{2} \right) \left( \frac{1}{3} − \frac{1}{n + 1} \right) − \sum_{j=2}^{n−1} Σ_j

 = \frac{14}{9} − \frac{γ}{3 \ln 2} − H_n + \left( \frac{γ}{\ln 2} − \frac{1}{2} \right) \frac{1}{n + 1} + \frac{1}{\ln 2} \sum_{j=2}^{n−1} \frac{H_{j+2}}{(j + 1)(j + 2)} − \sum_{j=2}^{n−1} Σ_j.   (2.31)
Here

\sum_{j=2}^{n−1} \frac{H_{j+2}}{(j + 1)(j + 2)} = \sum_{j=3}^{n} \frac{H_{j+1}}{j} − \sum_{j=4}^{n+1} \frac{H_j}{j}

 = \frac{H_4}{3} + \sum_{j=4}^{n} \frac{H_{j+1} − H_j}{j} − \frac{H_{n+1}}{n + 1}   (2.32)

 = \frac{17}{18} − \frac{H_{n+1} + 1}{n + 1},   (2.33)

where we assume n ≥ 3 for (2.32), but (2.33) holds also for n = 2. In regard to \sum_{j=2}^{n−1} Σ_j,
note that

Σ_n = − \sum_{k ∈ Z\{0}} \frac{ζ(1 − χ_k) Γ(1 − χ_k)}{(\ln 2)(1 − χ_k)} \left[ \frac{Γ(n + 2)}{Γ(n + 3 − χ_k)} − \frac{Γ(n + 1)}{Γ(n + 2 − χ_k)} \right],
so that
n−1∑j=2
Σj = −∑
k∈Z\{0}
ζ(1− χk)Γ(1− χk)(ln 2)(1− χk)
[Γ(n+ 1)
Γ(n+ 2− χk)− Γ(3)
Γ(4− χk)
]. (2.34)
Define
Σn :=∑
k∈Z\{0}
ζ(1− χk)Γ(1− χk)(ln 2)(1− χk)
Γ(n+ 1)
Γ(n+ 2− χk). (2.35)
Then, combining (2.31), (2.33), and (2.34), we obtain
un = −Hn + a− Hn+1
(ln 2)(n+ 1)+
(γ − 1
ln 2− 1
2
)1
n+ 1+ Σn,
where
a :=14
9+
17− 6γ
18 ln 2− 2
ln 2
∑k∈Z\{0}
ζ(1− χk)Γ(1− χk)Γ(4− χk)(1− χk)
. (2.36)
(iii) Closely following the derivation of $u_n$ described above, we obtain (for $n \ge 2$)
\[
\begin{aligned}
t_n = t_2 + \sum_{j=2}^{n-1} u_j = \sum_{j=2}^{n-1} u_j
&= -\sum_{j=2}^{n-1} H_j + a(n-2) - \frac{1}{\ln 2}\sum_{j=3}^{n}\frac{H_j}{j} + \left(\frac{\gamma-1}{\ln 2}-\frac12\right)\left(H_n - \frac32\right) + \sum_{j=2}^{n-1}\widehat{\Sigma}_j \\
&= -(nH_n - n - 1) + a(n-2) - \frac{1}{2\ln 2}\left[H_n^2 + H_n^{(2)} - \frac72\right] + \left(\frac{\gamma-1}{\ln 2}-\frac12\right)\left(H_n - \frac32\right) + b - \widetilde{\Sigma}_n,
\end{aligned}
\tag{2.37}
\]
where
\[
b := \sum_{k\in\mathbb{Z}\setminus\{0\}} \frac{2\,\zeta(1-\chi_k)\,\Gamma(-\chi_k)}{(\ln 2)(1-\chi_k)\,\Gamma(3-\chi_k)}, \tag{2.38}
\]
\[
\widetilde{\Sigma}_n := \sum_{k\in\mathbb{Z}\setminus\{0\}} \frac{\zeta(1-\chi_k)\,\Gamma(-\chi_k)\,\Gamma(n+1)}{(\ln 2)(1-\chi_k)\,\Gamma(n+1-\chi_k)}. \tag{2.39}
\]
Our analysis shows that the expected number of bit comparisons required by QuickMin is asymptotically linear in $n$, with asymptotic slope approximately equal to 5.27938. Hence it asymptotically differs from the expected number of key comparisons needed for the same task only by a constant factor. (The expected number of key comparisons is asymptotically $2n$; see Knuth [15] and Mahmoud et al. [19].) This is in sharp contrast to the Quicksort case, in which (see Fill and Janson [5]) the expected number of bit comparisons is asymptotically $n(\ln n)(\lg n)$, whereas the expected number of key comparisons is asymptotically $2n\ln n$.
Chapter 3

Analysis of $\mu(\bar m, n)$
In this chapter, we derive exact and asymptotic expressions for the expected number of bit comparisons required by QuickRand, i.e., QuickSelect applied to a file of $n$ keys in order to find a key whose target rank is chosen uniformly at random from $\{1, 2, \ldots, n\}$. We will continue to use the framework and notation established in Chapter 2.

3.1 Exact computation of $\mu(\bar m, n)$

We average $\mu(m,n)$ over $m$ while the parameter $n$ is fixed. Using the notation defined in (2.3) through (2.7), we have
\[
\mu(\bar m, n) = \frac1n \sum_{m=1}^{n} \mu(m,n) = \frac1n \sum_{m=1}^{n} \int_0^1\!\!\int_s^1 \beta(s,t)\,P(s,t,m,n)\, dt\, ds
= \int_0^1\!\!\int_s^1 \beta(s,t)\,\frac1n \sum_{m=1}^{n} P(s,t,m,n)\, dt\, ds = \mu_1(\bar m, n) + \mu_2(\bar m, n) + \mu_3(\bar m, n),
\]
where, for $l = 1, 2, 3$,
\[
\mu_l(\bar m, n) = \int_0^1\!\!\int_s^1 \beta(s,t)\,\frac1n \sum_{m=1}^{n} P_l(s,t,m,n)\, dt\, ds. \tag{3.1}
\]
Here $\mu_1(\bar m, n) = \mu_3(\bar m, n)$, since
\[
P_3(1-t',\, 1-s',\, n-m'+1,\, n) = P_1(s', t', m', n)
\]
by an easy symmetry argument we omit, and so
\[
\begin{aligned}
\mu_3(\bar m, n) &= \int_0^1\!\!\int_s^1 \beta(s,t)\,\frac1n \sum_{m=1}^{n} P_3(s,t,m,n)\, dt\, ds \\
&= \int_0^1\!\!\int_{s'}^1 \beta(1-t', 1-s')\,\frac1n \sum_{m'=1}^{n} P_3(1-t', 1-s', n-m'+1, n)\, dt'\, ds' \\
&= \int_0^1\!\!\int_{s'}^1 \beta(s', t')\,\frac1n \sum_{m'=1}^{n} P_1(s', t', m', n)\, dt'\, ds' = \mu_1(\bar m, n).
\end{aligned}
\]
Therefore
\[
\mu(\bar m, n) = 2\mu_1(\bar m, n) + \mu_2(\bar m, n), \tag{3.2}
\]
and we will compute $\mu_1(\bar m, n)$ and $\mu_2(\bar m, n)$ exactly in Sections 3.1.1 and 3.1.2.
3.1.1 Exact computation of $\mu_1(\bar m, n)$

We use the following lemma in order to compute $\mu_1(\bar m, n)$ exactly:

Lemma 3.1.
\[
\begin{aligned}
&\int_0^1\!\!\int_s^1 \beta(s,t)\,\frac1n \sum_{m=2}^{n} P_1(s,t,m,n)\, dt\, ds \\
&\quad= 2\sum_{j=2}^{n-1} \frac{(-1)^j\binom{n-1}{j}}{j(j-1)} + \frac29 \sum_{j=2}^{n-1} \frac{(-1)^j\binom{n-1}{j}}{j-1}
- 2\sum_{j=3}^{n-1}\left[\frac{B_j}{n-j+1} - \frac{\binom{n-1}{j-1}}{j(j-1)(j-2)(1-2^{-j})}\right]
- 2\sum_{j=2}^{n-1} \frac{(-1)^j\binom{n-1}{j}}{(j+1)j(j-1)(1-2^{-j})}.
\end{aligned}
\]
Before proving the lemma, we complete the computation of $\mu_1(\bar m, n)$. Note that
\[
\begin{aligned}
\mu_1(\bar m, n) &= \int_0^1\!\!\int_s^1 \beta(s,t)\,\frac1n \sum_{m=1}^{n} P_1(s,t,m,n)\, dt\, ds \\
&= \frac1n \int_0^1\!\!\int_s^1 \beta(s,t)\,P_1(s,t,1,n)\, dt\, ds + \int_0^1\!\!\int_s^1 \beta(s,t)\,\frac1n \sum_{m=2}^{n} P_1(s,t,m,n)\, dt\, ds \\
&= \frac1n\,\mu(1,n) + \int_0^1\!\!\int_s^1 \beta(s,t)\,\frac1n \sum_{m=2}^{n} P_1(s,t,m,n)\, dt\, ds.
\end{aligned}
\]
Therefore, by (2.16) and Lemma 3.1, we obtain
\[
\begin{aligned}
\mu_1(\bar m, n) &= \frac2n \sum_{j=2}^{n} \frac{(-1)^j\binom{n}{j}}{j-1} + \frac2n \sum_{j=2}^{n-1}\left[\frac{B_j}{n-j+1} - \frac{\binom{n}{j}}{j(j-1)(1-2^{-j})}\right]
+ 2\sum_{j=2}^{n-1} \frac{(-1)^j\binom{n-1}{j}}{j(j-1)} + \frac29 \sum_{j=2}^{n-1} \frac{(-1)^j\binom{n-1}{j}}{j-1} \\
&\qquad - 2\sum_{j=3}^{n-1}\left[\frac{B_j}{n-j+1} - \frac{\binom{n-1}{j-1}}{j(j-1)(j-2)(1-2^{-j})}\right]
- 2\sum_{j=2}^{n-1} \frac{(-1)^j\binom{n-1}{j}}{(j+1)j(j-1)(1-2^{-j})} \\
&= n - 1 - 4\sum_{j=3}^{n} \frac{(-1)^j\binom{n-1}{j-1}}{j(j-1)(j-2)}
+ \frac2n \sum_{j=2}^{n-1}\left[\frac{B_j}{n-j+1} - \frac{\binom{n}{j}}{j(j-1)(1-2^{-j})}\right]
+ \frac29 \sum_{j=2}^{n-1} \frac{(-1)^j\binom{n-1}{j}}{j-1} \\
&\qquad - 2\sum_{j=3}^{n-1}\left[\frac{B_j}{n-j+1} - \frac{\binom{n-1}{j-1}}{j(j-1)(j-2)(1-2^{-j})}\right]
- 2\sum_{j=2}^{n-1} \frac{(-1)^j\binom{n-1}{j}}{(j+1)j(j-1)(1-2^{-j})},
\end{aligned}
\tag{3.3}
\]
where the second equality holds since
\[
\begin{aligned}
\frac2n \sum_{j=2}^{n} \frac{(-1)^j\binom{n}{j}}{j-1} + 2\sum_{j=2}^{n-1} \frac{(-1)^j\binom{n-1}{j}}{j(j-1)}
&= 2\sum_{j=2}^{n} \frac{(-1)^j (n-1)!}{j!\,(n-j)!\,(j-1)} - 2\sum_{j=3}^{n} \frac{(-1)^j (n-1)!}{(j-1)!\,(n-j)!\,(j-1)(j-2)} \\
&= n - 1 + 2\sum_{j=3}^{n} \frac{(-1)^j (n-1)!}{(j-1)!\,(n-j)!\,(j-1)}\left[\frac1j - \frac{1}{j-2}\right] \\
&= n - 1 - 4\sum_{j=3}^{n} \frac{(-1)^j\binom{n-1}{j-1}}{j(j-1)(j-2)}.
\end{aligned}
\]
In Section 3.1.2 we combine the expression for $\mu_1(\bar m, n)$ in (3.3) with a similar expression for $\mu_2(\bar m, n)$ to obtain an exact expression for $\mu(\bar m, n)$. The remainder of this section is devoted to proving Lemma 3.1. For this, the following expression for $P_1(s,t,m,n)$ will prove useful:
Lemma 3.2. Let $m \ge 2$ and let $x := s$, $y := t-s$, $z := 1-t$. Then the quantity $P_1(s,t,m,n)$ defined at (2.3) satisfies
\[
P_1(s,t,m,n) = 2n \int_0^x \frac{1}{(\xi+y)^2}\left[\Upsilon_1(m,n,\xi,x,y,z) - \Upsilon_2(m,n,\xi,x,y,z) + \Upsilon_3(m,n,\xi,x,y,z)\right] d\xi, \tag{3.4}
\]
where
\[
\begin{aligned}
\Upsilon_1(m,n,\xi,x,y,z) &:= \binom{n-1}{m-2}(x-\xi)^{m-2}(n-m)(\xi+y+z)^{n-m+1}, \\
\Upsilon_2(m,n,\xi,x,y,z) &:= \binom{n-1}{m-2}(x-\xi)^{m-2}(n-m+1)\,z\,(\xi+y+z)^{n-m}, \\
\Upsilon_3(m,n,\xi,x,y,z) &:= \binom{n-1}{m-2}(x-\xi)^{m-2}\,z^{n-m+1}.
\end{aligned}
\]
Proof of Lemma 3.2. By (2.2)–(2.3),
\[
\begin{aligned}
P_1(s,t,m,n) &= \sum_{m\le i<j\le n} \frac{2}{j-m+1}\,\frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!}\,x^{i-1}y^{j-i-1}z^{n-j} \\
&= \sum_{m\le i<j\le n} \frac{2}{j-m+1}\,\frac{n!}{(n-m-1)!}\binom{n-m-1}{i-m,\; j-i-1,\; n-j}\frac{(i-m)!}{(i-1)!}\,x^{i-1}y^{j-i-1}z^{n-j} \\
&= \frac{2\,n!}{(n-m-1)!}\sum_{m\le i<j\le n} \frac{1}{j-m+1}\binom{n-m-1}{i-m,\; j-i-1,\; n-j}\frac{(i-m)!}{(i-1)!}\,x^{i-1}y^{j-i-1}z^{n-j}.
\end{aligned}
\tag{3.5}
\]
In order to compactly describe the derivation of (3.4), we define the following indefinite integration operator $T$:
\[
T(f(x)) := \int_0^x f(\xi)\, d\xi.
\]
We really should write $(Tf)(x)$ rather than $T(f(x))$, but we would like to use shorthand such as $T(x^j) = \frac{x^{j+1}}{j+1}$ when $j > -1$. The operator $T$ treats its argument $f$ as a function of $x$; the other variables involved in $f$ (namely, $y$ and $z$) are treated as constants. The notation $T^l$ will denote the $l$-th iterate of $T$. In this notation, for $m < i$,
\[
\frac{(i-m)!}{(i-1)!}\,x^{i-1} = T^{m-1}(x^{i-m}),
\]
and the sum in (3.5) equals
\[
T^{m-1}\left(\sum_{m\le i<j\le n} \frac{1}{j-m+1}\binom{n-m-1}{i-m,\; j-i-1,\; n-j}\, x^{i-m}y^{j-i-1}z^{n-j}\right).
\]
Here
\[
\frac{1}{j-m+1}\,z^{n-j} = z^{n-m+1}\int_z^\infty \eta^{-(j-m+1)-1}\, d\eta,
\]
so
\[
\begin{aligned}
&T^{m-1}\left(\sum_{m\le i<j\le n} \frac{1}{j-m+1}\binom{n-m-1}{i-m,\; j-i-1,\; n-j}\, x^{i-m}y^{j-i-1}z^{n-j}\right) \\
&\quad= z^{n-m+1}\, T^{m-1}\left(\int_z^\infty \left[\sum_{m\le i<j\le n}\binom{n-m-1}{i-m,\; j-i-1,\; n-j}\, x^{i-m}y^{j-i-1}\eta^{-j+m-2}\right] d\eta\right) \\
&\quad= z^{n-m+1}\, T^{m-1}\left(\int_z^\infty \eta^{-n+m-2}(x+y+\eta)^{n-m-1}\, d\eta\right)
= z^{n-m+1}\, T^{m-1}\left(\int_z^\infty \eta^{-3}\left(\frac{t}{\eta}+1\right)^{n-m-1} d\eta\right)
\end{aligned}
\tag{3.6}
\]
(note that $x+y = t$). Making the change of variables $v = \frac{t}{\eta}+1$ and integrating, we obtain, after some computation,
\[
\int_z^\infty \eta^{-3}\left(\frac{t}{\eta}+1\right)^{n-m-1} d\eta
= \frac{1}{t^2(n-m+1)(n-m)}\left[(n-m)\left(1+\frac{t}{z}\right)^{n-m+1} - (n-m+1)\left(1+\frac{t}{z}\right)^{n-m} + 1\right]. \tag{3.7}
\]
From (3.5) and (3.6)–(3.7),
\[
P_1(s,t,m,n) = \frac{2\,n!}{(n-m+1)!}\, T^{m-1}\!\left(t^{-2}\left[(n-m)(z+t)^{n-m+1} - (n-m+1)\,z\,(z+t)^{n-m} + z^{n-m+1}\right]\right). \tag{3.8}
\]
Here
\[
t^{-2}\left[(n-m)(z+t)^{n-m+1} - (n-m+1)\,z\,(z+t)^{n-m} + z^{n-m+1}\right] = \sum_{r=2}^{n-m+1} t^{r-2}\,\Upsilon(m,n,r,z), \tag{3.9}
\]
where
\[
\Upsilon(m,n,r,z) := (n-m)\binom{n-m+1}{r} z^{n-m+1-r} - (n-m+1)\binom{n-m}{r} z^{n-m+1-r}. \tag{3.10}
\]
Then, since $t = x+y$,
\[
\sum_{r=2}^{n-m+1} t^{r-2}\,\Upsilon(m,n,r,z) = \sum_{r=2}^{n-m+1} \Upsilon(m,n,r,z) \sum_{j=0}^{r-2}\binom{r-2}{j} x^j y^{r-2-j}. \tag{3.11}
\]
From (3.8)–(3.11),
\[
\begin{aligned}
P_1(s,t,m,n) &= \frac{2\,n!}{(n-m+1)!}\, T^{m-1}\left(\sum_{r=2}^{n-m+1} \Upsilon(m,n,r,z) \sum_{j=0}^{r-2}\binom{r-2}{j} x^j y^{r-2-j}\right) \\
&= \frac{2\,n!}{(n-m+1)!} \sum_{r=2}^{n-m+1} \Upsilon(m,n,r,z) \sum_{j=0}^{r-2}\binom{r-2}{j} y^{r-2-j}\, T^{m-1}(x^j) \\
&= \frac{2\,n!}{(n-m+1)!} \sum_{r=2}^{n-m+1} \Upsilon(m,n,r,z) \sum_{j=0}^{r-2}\binom{r-2}{j} y^{r-2-j}\, \frac{x^{j+m-1}}{(j+1)\cdots(j+m-1)}.
\end{aligned}
\tag{3.12}
\]
Because of the partial fraction expansion
\[
\frac{1}{(j+1)\cdots(j+m-1)} = \frac{1}{(m-2)!}\sum_{l=0}^{m-2} \frac{(-1)^l\binom{m-2}{l}}{j+l+1},
\]
it follows that
\[
\begin{aligned}
\sum_{j=0}^{r-2}\binom{r-2}{j} y^{r-2-j}\, \frac{x^{j+m-1}}{(j+1)\cdots(j+m-1)}
&= \sum_{j=0}^{r-2}\binom{r-2}{j} y^{r-2-j}\, \frac{x^{j+m-1}}{(m-2)!}\sum_{l=0}^{m-2} \frac{(-1)^l\binom{m-2}{l}}{j+l+1} \\
&= \frac{1}{(m-2)!}\sum_{l=0}^{m-2} (-1)^l\binom{m-2}{l} x^{m-2-l}\int_0^x \xi^l \sum_{j=0}^{r-2}\binom{r-2}{j} y^{r-2-j}\xi^j\, d\xi \\
&= \frac{1}{(m-2)!}\sum_{l=0}^{m-2} (-1)^l\binom{m-2}{l} x^{m-2-l}\int_0^x \xi^l (\xi+y)^{r-2}\, d\xi \\
&= \frac{1}{(m-2)!}\int_0^x (x-\xi)^{m-2}(\xi+y)^{r-2}\, d\xi.
\end{aligned}
\tag{3.13}
\]
From (3.12)–(3.13),
\[
\begin{aligned}
P_1(s,t,m,n) &= \frac{2\,n!}{(n-m+1)!\,(m-2)!} \sum_{r=2}^{n-m+1} \Upsilon(m,n,r,z) \int_0^x (x-\xi)^{m-2}(\xi+y)^{r-2}\, d\xi \\
&= 2n\binom{n-1}{m-2}\int_0^x \sum_{r=2}^{n-m+1} \Upsilon(m,n,r,z)\,(x-\xi)^{m-2}(\xi+y)^{r-2}\, d\xi \\
&= 2n\binom{n-1}{m-2}\int_0^x \frac{(x-\xi)^{m-2}}{(\xi+y)^2} \sum_{r=2}^{n-m+1} \Upsilon(m,n,r,z)\,(\xi+y)^r\, d\xi.
\end{aligned}
\tag{3.14}
\]
Here, by (3.10),
\[
\begin{aligned}
\sum_{r=2}^{n-m+1} \Upsilon(m,n,r,z)\,(\xi+y)^r
&= (n-m)\sum_{r=2}^{n-m+1}\binom{n-m+1}{r}(\xi+y)^r z^{n-m+1-r} - (n-m+1)\sum_{r=2}^{n-m+1}\binom{n-m}{r}(\xi+y)^r z^{n-m+1-r} \\
&= (n-m)\left[(\xi+y+z)^{n-m+1} - z^{n-m+1} - (n-m+1)(\xi+y)z^{n-m}\right] \\
&\qquad - (n-m+1)\,z\left[(\xi+y+z)^{n-m} - z^{n-m} - (n-m)(\xi+y)z^{n-m-1}\right] \\
&= (n-m)(\xi+y+z)^{n-m+1} - (n-m+1)\,z\,(\xi+y+z)^{n-m} + z^{n-m+1}.
\end{aligned}
\tag{3.15}
\]
Substitution of (3.15) into (3.14) gives the desired (3.4).
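The polynomial identity (3.15) can be checked exactly for small values of $N := n-m$. The sketch below is ours (the names `lhs`, `rhs`, and the abbreviations $w := \xi+y$, $N := n-m$ are not the dissertation's notation); it uses exact rational arithmetic.

```python
from fractions import Fraction
from math import comb

def lhs(N, w, z):
    # sum_{r=2}^{N+1} [N*C(N+1,r) - (N+1)*C(N,r)] * w^r * z^(N+1-r),
    # i.e. sum_r Upsilon(m,n,r,z) * w^r with N = n - m and w = xi + y
    return sum((N * comb(N + 1, r) - (N + 1) * comb(N, r)) * w**r * z**(N + 1 - r)
               for r in range(2, N + 2))

def rhs(N, w, z):
    # closed form claimed in (3.15)
    return N * (w + z)**(N + 1) - (N + 1) * z * (w + z)**N + z**(N + 1)

w, z = Fraction(2, 7), Fraction(3, 5)
for N in range(1, 12):
    assert lhs(N, w, z) == rhs(N, w, z)
print("identity (3.15) verified for N = 1, ..., 11")
```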
Proof of Lemma 3.1. From Lemma 3.2, we have
\[
\frac1n \sum_{m=2}^{n} P_1(s,t,m,n) = 2\int_0^x \frac{1}{(\xi+y)^2}\sum_{m=2}^{n}\left[\Upsilon_1(m,n,\xi,x,y,z) - \Upsilon_2(m,n,\xi,x,y,z) + \Upsilon_3(m,n,\xi,x,y,z)\right] d\xi. \tag{3.16}
\]
Here
\[
\begin{aligned}
\sum_{m=2}^{n}\Upsilon_1(m,n,\xi,x,y,z)
&= (\xi+y+z)^2\, \frac{d}{dw}\left[\sum_{m=2}^{n}\binom{n-1}{m-2}(x-\xi)^{m-2}\, w^{n-m}\right]\Bigg|_{w=\xi+y+z} \\
&= (\xi+y+z)^2\, \frac{d}{dw}\left\{w^{-1}\left[(x-\xi+w)^{n-1} - (x-\xi)^{n-1}\right]\right\}\Big|_{w=\xi+y+z} \\
&= (\xi+y+z)^2\left\{-w^{-2}\left[(x-\xi+w)^{n-1} - (x-\xi)^{n-1}\right] + w^{-1}(n-1)(x-\xi+w)^{n-2}\right\}\Big|_{w=\xi+y+z} \\
&= (x-\xi)^{n-1} - 1 + (n-1)(\xi+y+z)
\end{aligned}
\]
(note that $x+y+z = 1$). Similarly,
\[
\begin{aligned}
\sum_{m=2}^{n}\Upsilon_2(m,n,\xi,x,y,z)
&= z\,\frac{d}{dw}\left[\sum_{m=2}^{n}\binom{n-1}{m-2}(x-\xi)^{m-2}\, w^{n-m+1}\right]\Bigg|_{w=\xi+y+z} \\
&= z\,\frac{d}{dw}\left[(x-\xi+w)^{n-1} - (x-\xi)^{n-1}\right]\Big|_{w=\xi+y+z} \\
&= z\left[(n-1)(x-\xi+w)^{n-2}\right]\Big|_{w=\xi+y+z} = z(n-1),
\end{aligned}
\]
and
\[
\sum_{m=2}^{n}\Upsilon_3(m,n,\xi,x,y,z) = \sum_{m=2}^{n}\binom{n-1}{m-2}(x-\xi)^{m-2}\,z^{n-m+1} = (x-\xi+z)^{n-1} - (x-\xi)^{n-1}.
\]
Hence
\[
\sum_{m=2}^{n}\left[\Upsilon_1(m,n,\xi,x,y,z) - \Upsilon_2(m,n,\xi,x,y,z) + \Upsilon_3(m,n,\xi,x,y,z)\right] = (n-1)(\xi+y) - 1 + (x-\xi+z)^{n-1}. \tag{3.17}
\]
Therefore, from (3.16) and (3.17), we obtain
\[
\begin{aligned}
\frac1n \sum_{m=2}^{n} P_1(s,t,m,n)
&= 2\int_0^x \frac{1}{(\xi+y)^2}\left[(n-1)(\xi+y) - 1 + (x-\xi+z)^{n-1}\right] d\xi \\
&= 2\int_0^x \frac{1}{(\xi+y)^2}\left\{(n-1)(\xi+y) - 1 + \left[1-(\xi+y)\right]^{n-1}\right\} d\xi \\
&= 2\int_0^x \frac{1}{(\xi+y)^2}\sum_{j=2}^{n-1} (-1)^j\binom{n-1}{j}(\xi+y)^j\, d\xi \\
&= 2\sum_{j=2}^{n-1} (-1)^j\binom{n-1}{j}\int_0^x (\xi+y)^{j-2}\, d\xi \\
&= 2\sum_{j=2}^{n-1} (-1)^j\binom{n-1}{j}\frac{(x+y)^{j-1} - y^{j-1}}{j-1} \\
&= 2\sum_{j=2}^{n-1} (-1)^j\binom{n-1}{j}\frac{t^{j-1} - (t-s)^{j-1}}{j-1}.
\end{aligned}
\tag{3.18}
\]
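The binomial-expansion step used in (3.18), namely that $(n-1)v - 1 + (1-v)^{n-1}$ equals $\sum_{j=2}^{n-1}(-1)^j\binom{n-1}{j}v^j$ (the $j=0$ and $j=1$ terms of the binomial theorem cancel the other two summands, and the $j=n-1$ term cancels as well only in appearance; it is absorbed for $j \le n-1$), can be checked exactly. This sketch is ours:

```python
from fractions import Fraction
from math import comb

v = Fraction(3, 10)
for n in range(2, 15):
    lhs = (n - 1) * v - 1 + (1 - v)**(n - 1)
    rhs = sum((-1)**j * comb(n - 1, j) * v**j for j in range(2, n))
    assert lhs == rhs, n
print("binomial step in (3.18) verified for n = 2, ..., 14")
```

Here the sum on the right runs through $j = n-1$ inclusive (`range(2, n)`), matching the summation limits in (3.18).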
We complete the proof by using (3.18) to compute
\[
\int_0^1\!\!\int_s^1 \beta(s,t)\,\frac1n \sum_{m=2}^{n} P_1(s,t,m,n)\, dt\, ds.
\]
We have
\[
\begin{aligned}
\int_0^1\!\!\int_s^1 \beta(s,t)\,\frac1n \sum_{m=2}^{n} P_1(s,t,m,n)\, dt\, ds
&= 2\int_0^1\!\!\int_s^1 \beta(s,t)\sum_{j=2}^{n-1} (-1)^j\binom{n-1}{j}\frac{t^{j-1} - (t-s)^{j-1}}{j-1}\, dt\, ds \\
&= 2\int_0^1\!\!\int_s^1 \beta(s,t)\sum_{j=2}^{n-1} (-1)^j\binom{n-1}{j}\frac{t^{j-1}}{j-1}\, dt\, ds \\
&\qquad - 2\int_0^1\!\!\int_s^1 \beta(s,t)\sum_{j=2}^{n-1} (-1)^j\binom{n-1}{j}\frac{(t-s)^{j-1}}{j-1}\, dt\, ds.
\end{aligned}
\tag{3.19}
\]
Closely following the derivations shown in (2.11)–(2.16), one can show that
\[
\int_0^1\!\!\int_s^1 \beta(s,t)\sum_{j=2}^{n-1} (-1)^j\binom{n-1}{j}\frac{t^{j-1}}{j-1}\, dt\, ds
= \sum_{j=2}^{n-1} \frac{(-1)^j\binom{n-1}{j}}{j(j-1)} + \frac19 \sum_{j=2}^{n-1} \frac{(-1)^j\binom{n-1}{j}}{j-1}
- \sum_{j=3}^{n-1}\left[\frac{B_j}{n-j+1} - \frac{\binom{n-1}{j-1}}{j(j-1)(j-2)(1-2^{-j})}\right]. \tag{3.20}
\]
Thus, in order to complete the proof, it remains to show that
\[
\int_0^1\!\!\int_s^1 \beta(s,t)\sum_{j=2}^{n-1} (-1)^j\binom{n-1}{j}\frac{(t-s)^{j-1}}{j-1}\, dt\, ds
= \sum_{j=2}^{n-1} \frac{(-1)^j\binom{n-1}{j}}{(j+1)j(j-1)(1-2^{-j})}. \tag{3.21}
\]
Indeed, we have
\[
\begin{aligned}
&\int_0^1\!\!\int_s^1 \beta(s,t)\sum_{j=2}^{n-1} (-1)^j\binom{n-1}{j}\frac{(t-s)^{j-1}}{j-1}\, dt\, ds \\
&\quad= \sum_{k=0}^{\infty} (k+1)2^k \int_0^{2^{-(k+1)}}\!\!\int_{2^{-(k+1)}}^{2^{-k}} \sum_{j=2}^{n-1} (-1)^j\binom{n-1}{j}\frac{(t-s)^{j-1}}{j-1}\, dt\, ds \\
&\quad= \sum_{k=0}^{\infty} (k+1)2^k \int_0^{2^{-(k+1)}}\!\!\int_{2^{-(k+1)}-s}^{2^{-k}-s} \sum_{j=2}^{n-1} (-1)^j\binom{n-1}{j}\frac{v^{j-1}}{j-1}\, dv\, ds \\
&\quad= \sum_{k=0}^{\infty} (k+1)2^k \sum_{j=2}^{n-1} (-1)^j\binom{n-1}{j}\int_0^{2^{-k}} \frac{v^{j-1}}{j-1}\int_{\left[2^{-(k+1)}-v\right]\vee 0}^{\left(2^{-k}-v\right)\wedge 2^{-(k+1)}} ds\, dv.
\end{aligned}
\tag{3.22}
\]
Here
\[
\int_{\left[2^{-(k+1)}-v\right]\vee 0}^{\left(2^{-k}-v\right)\wedge 2^{-(k+1)}} ds =
\begin{cases}
v & \text{if } 0 \le v \le 2^{-(k+1)}, \\
2^{-k} - v & \text{if } 2^{-(k+1)} < v \le 2^{-k}.
\end{cases}
\]
Thus
\[
\int_0^{2^{-k}} \frac{v^{j-1}}{j-1}\int_{\left[2^{-(k+1)}-v\right]\vee 0}^{\left(2^{-k}-v\right)\wedge 2^{-(k+1)}} ds\, dv
= \frac{1}{j-1}\left[\int_0^{2^{-(k+1)}} v^j\, dv + \int_{2^{-(k+1)}}^{2^{-k}} v^{j-1}\left(2^{-k}-v\right) dv\right]
= \frac{2^{-k(j+1)}(1-2^{-j})}{(j+1)j(j-1)}. \tag{3.23}
\]
From (3.22) and (3.23), we obtain
\[
\begin{aligned}
\int_0^1\!\!\int_s^1 \beta(s,t)\sum_{j=2}^{n-1} (-1)^j\binom{n-1}{j}\frac{(t-s)^{j-1}}{j-1}\, dt\, ds
&= \sum_{k=0}^{\infty} (k+1)2^k \sum_{j=2}^{n-1} (-1)^j\binom{n-1}{j}\frac{2^{-k(j+1)}(1-2^{-j})}{(j+1)j(j-1)} \\
&= \sum_{j=2}^{n-1} (-1)^j\binom{n-1}{j}\frac{1-2^{-j}}{(j+1)j(j-1)}\sum_{k=0}^{\infty} (k+1)2^{-kj} \\
&= \sum_{j=2}^{n-1} (-1)^j\binom{n-1}{j}\frac{1-2^{-j}}{(j+1)j(j-1)}\,\frac{1}{(1-2^{-j})^2} \\
&= \sum_{j=2}^{n-1} \frac{(-1)^j\binom{n-1}{j}}{(j+1)j(j-1)(1-2^{-j})},
\end{aligned}
\tag{3.24}
\]
and (3.21) is proved.
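Both computational steps above, the inner integral (3.23) and the geometric-series evaluation $\sum_{k\ge 0}(k+1)x^k = (1-x)^{-2}$ with $x = 2^{-j}$, can be checked numerically. The following sketch is ours; the polynomial integrals are evaluated in closed form with exact rationals.

```python
from fractions import Fraction

def inner(k, j):
    # (1/(j-1)) [ int_0^{2^{-(k+1)}} v^j dv
    #           + int_{2^{-(k+1)}}^{2^{-k}} v^{j-1} (2^{-k} - v) dv ]
    a, b = Fraction(1, 2**(k + 1)), Fraction(1, 2**k)
    first = a**(j + 1) / (j + 1)
    second = b * (b**j - a**j) / j - (b**(j + 1) - a**(j + 1)) / (j + 1)
    return (first + second) / (j - 1)

for k in range(0, 5):
    for j in range(2, 8):
        closed = Fraction(1, 2**(k * (j + 1))) * (1 - Fraction(1, 2**j)) / ((j + 1) * j * (j - 1))
        assert inner(k, j) == closed, (k, j)

# geometric series: sum_{k>=0} (k+1) x^k = (1-x)^{-2} at x = 2^{-j}
j = 3
x = 2.0**(-j)
partial = sum((k + 1) * x**k for k in range(200))
assert abs(partial - (1 - x)**(-2)) < 1e-12
print("(3.23) and the geometric-series step in (3.24) verified")
```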
3.1.2 Exact computation of $\mu_2(\bar m, n)$ and $\mu(\bar m, n)$

The derivations for obtaining a computationally preferable exact expression for $\mu_2(\bar m, n)$ are entirely analogous to those for $\mu_1(\bar m, n)$ described in the previous section (Section 3.1.1). Thus we omit the details. As described in Section 2.2, $P_2(s,t,m,n)$ is zero for $m = 1$ and for $m = n$, so, from (3.1),
\[
\mu_2(\bar m, n) = \int_0^1\!\!\int_s^1 \beta(s,t)\,\frac1n \sum_{m=2}^{n-1} P_2(s,t,m,n)\, dt\, ds. \tag{3.25}
\]
Therefore we first derive a computationally desirable expression for $\frac1n\sum_{m=2}^{n-1} P_2(s,t,m,n)$. Again, let $x := s$, $y := t-s$, $z := 1-t$. Then
\[
\begin{aligned}
\frac1n \sum_{m=2}^{n-1} P_2(s,t,m,n)
&= \frac1n \sum_{m=2}^{n-1}\;\sum_{1\le i\le m<j\le n} \frac{2}{j-i+1}\binom{n}{i-1,\,1,\,j-i-1,\,1,\,n-j}\, x^{i-1}y^{j-i-1}z^{n-j} \\
&= \frac1n \sum_{m=2}^{n-1} S_1(m,n,x,y,z) - \frac1n \sum_{m=2}^{n-1} S_2(m,n,x,y,z) - \frac1n \sum_{m=2}^{n-1} S_3(m,n,x,y,z),
\end{aligned}
\tag{3.26}
\]
where
\[
\begin{aligned}
S_1(m,n,x,y,z) &:= \sum_{1\le i<j\le n} \frac{2}{j-i+1}\binom{n}{i-1,\,1,\,j-i-1,\,1,\,n-j}\, x^{i-1}y^{j-i-1}z^{n-j}, \\
S_2(m,n,x,y,z) &:= \sum_{m\le i<j\le n} \frac{2}{j-i+1}\binom{n}{i-1,\,1,\,j-i-1,\,1,\,n-j}\, x^{i-1}y^{j-i-1}z^{n-j}, \\
S_3(m,n,x,y,z) &:= \sum_{1\le i<j\le m} \frac{2}{j-i+1}\binom{n}{i-1,\,1,\,j-i-1,\,1,\,n-j}\, x^{i-1}y^{j-i-1}z^{n-j}.
\end{aligned}
\]
Fill and Janson [5] showed that $S_1(m,n,x,y,z) = 2\sum_{j=2}^{n}(-1)^j\binom{n}{j}(t-s)^{j-2}$. Hence
\[
\frac1n \sum_{m=2}^{n-1} S_1(m,n,x,y,z) = \frac{2(n-2)}{n}\sum_{j=2}^{n}(-1)^j\binom{n}{j}(t-s)^{j-2}. \tag{3.27}
\]
Following the derivations shown in (3.5) through (3.18), one can show that
\[
\frac1n \sum_{m=2}^{n-1} S_2(m,n,x,y,z) = 2y^{-2}x\left[(x+z)^{n-1} - 1 + y(n-1)\right] \tag{3.28}
\]
\[
= 2(t-s)^{-2}s\left\{\left[1-(t-s)\right]^{n-1} - 1 + (t-s)(n-1)\right\}
= 2s\sum_{j=2}^{n-1}(-1)^j\binom{n-1}{j}(t-s)^{j-2}. \tag{3.29}
\]
To obtain a similar expression for $\frac1n\sum_{m=2}^{n-1} S_3(m,n,x,y,z)$, we note that, letting $m' := n+1-m$, $i' := n+1-j$, $j' := n+1-i$,
\[
S_3(m,n,x,y,z) = \sum_{m'\le i'<j'\le n} \frac{2}{j'-i'+1}\binom{n}{n-j',\,1,\,j'-i'-1,\,1,\,i'-1}\, x^{n-j'}y^{j'-i'-1}z^{i'-1} = S_2(n+1-m,\,n,\,z,\,y,\,x).
\]
Thus
\[
\frac1n \sum_{m=2}^{n-1} S_3(m,n,x,y,z) = \frac1n \sum_{m=2}^{n-1} S_2(n+1-m,n,z,y,x) = \frac1n \sum_{m=2}^{n-1} S_2(m,n,z,y,x). \tag{3.30}
\]
Inspecting (3.28)–(3.30), we find
\[
\frac1n \sum_{m=2}^{n-1} S_3(m,n,x,y,z) = 2(1-t)\sum_{j=2}^{n-1}(-1)^j\binom{n-1}{j}(t-s)^{j-2}. \tag{3.31}
\]
From (3.26), (3.27), (3.29), and (3.31),
\[
\begin{aligned}
\frac1n \sum_{m=2}^{n-1} P_2(s,t,m,n)
&= \frac{2(n-2)}{n}\sum_{j=2}^{n}(-1)^j\binom{n}{j}(t-s)^{j-2} - 2s\sum_{j=2}^{n-1}(-1)^j\binom{n-1}{j}(t-s)^{j-2} \\
&\qquad - 2(1-t)\sum_{j=2}^{n-1}(-1)^j\binom{n-1}{j}(t-s)^{j-2} \\
&= \frac{2(n-2)}{n}\sum_{j=2}^{n}(-1)^j\binom{n}{j}(t-s)^{j-2} - 2\sum_{j=2}^{n-1}(-1)^j\binom{n-1}{j}(t-s)^{j-2} \\
&\qquad + 2\sum_{j=2}^{n-1}(-1)^j\binom{n-1}{j}(t-s)^{j-1},
\end{aligned}
\tag{3.32}
\]
where the second equality uses $s + (1-t) = 1-(t-s)$.
Hence, from (3.25) and (3.32),
\[
\begin{aligned}
\mu_2(\bar m, n) &= \frac{2(n-2)}{n}\int_0^1\!\!\int_s^1 \beta(s,t)\sum_{j=2}^{n}(-1)^j\binom{n}{j}(t-s)^{j-2}\, dt\, ds \\
&\qquad - 2\int_0^1\!\!\int_s^1 \beta(s,t)\sum_{j=2}^{n-1}(-1)^j\binom{n-1}{j}(t-s)^{j-2}\, dt\, ds \\
&\qquad + 2\int_0^1\!\!\int_s^1 \beta(s,t)\sum_{j=2}^{n-1}(-1)^j\binom{n-1}{j}(t-s)^{j-1}\, dt\, ds.
\end{aligned}
\tag{3.33}
\]
Fill and Janson [5] showed that
\[
\int_0^1\!\!\int_s^1 \beta(s,t)\sum_{j=2}^{n}(-1)^j\binom{n}{j}(t-s)^{j-2}\, dt\, ds = \sum_{j=2}^{n} \frac{(-1)^j\binom{n}{j}}{j(j-1)\left[1-2^{-(j-1)}\right]}. \tag{3.34}
\]
A careful term-by-term inspection of the derivations shown in (3.22)–(3.24) reveals that
\[
\int_0^1\!\!\int_s^1 \beta(s,t)\sum_{j=2}^{n-1}(-1)^j\binom{n-1}{j}(t-s)^{j-2}\, dt\, ds = \sum_{j=2}^{n-1} \frac{(-1)^j\binom{n-1}{j}}{j(j-1)\left[1-2^{-(j-1)}\right]}, \tag{3.35}
\]
\[
\int_0^1\!\!\int_s^1 \beta(s,t)\sum_{j=2}^{n-1}(-1)^j\binom{n-1}{j}(t-s)^{j-1}\, dt\, ds = \sum_{j=2}^{n-1} \frac{(-1)^j\binom{n-1}{j}}{j(j+1)(1-2^{-j})}. \tag{3.36}
\]
Combining (3.33)–(3.36), we obtain
\[
\begin{aligned}
\mu_2(\bar m, n) &= \frac{2(n-2)}{n}\sum_{j=0}^{n-2} \frac{(-1)^j\binom{n}{j+2}}{(j+1)(j+2)\left[1-2^{-(j+1)}\right]} - 2\sum_{j=2}^{n} \frac{(-1)^j\binom{n}{j}}{j(j-1)\left[1-2^{-(j-1)}\right]} + 2(n-1) \\
&= -\frac4n \sum_{j=2}^{n} \frac{(-1)^j\binom{n}{j}}{j(j-1)\left[1-2^{-(j-1)}\right]} + 2(n-1).
\end{aligned}
\tag{3.37}
\]
Finally, we complete the exact computation of $\mu(\bar m, n)$. From (3.2), (3.3), and (3.37), we have
\[
\begin{aligned}
\mu(\bar m, n) &= 2\mu_1(\bar m, n) + \mu_2(\bar m, n) \\
&= 2(n-1) - 8\sum_{j=3}^{n} \frac{(-1)^j\binom{n-1}{j-1}}{j(j-1)(j-2)}
+ \frac4n \sum_{j=2}^{n-1}\left[\frac{B_j}{n-j+1} - \frac{\binom{n}{j}}{j(j-1)(1-2^{-j})}\right]
+ \frac49 \sum_{j=2}^{n-1} \frac{(-1)^j\binom{n-1}{j}}{j-1} \\
&\qquad - 4\sum_{j=3}^{n-1}\left[\frac{B_j}{n-j+1} - \frac{\binom{n-1}{j-1}}{j(j-1)(j-2)(1-2^{-j})}\right]
- 4\sum_{j=2}^{n-1} \frac{(-1)^j\binom{n-1}{j}}{(j+1)j(j-1)(1-2^{-j})} \\
&\qquad - \frac4n \sum_{j=2}^{n} \frac{(-1)^j\binom{n}{j}}{j(j-1)\left[1-2^{-(j-1)}\right]} + 2(n-1).
\end{aligned}
\tag{3.38}
\]
We rewrite or combine some of the terms in (3.38) for the asymptotic analysis of $\mu(\bar m, n)$ described in the next section. We define
\[
\begin{aligned}
F_1(n) &:= \sum_{j=3}^{n} \frac{(-1)^j\binom{n}{j}}{(j-1)(j-2)}, \\
F_2(n) &:= \sum_{j=2}^{n-1} \frac{B_j}{j(1-2^{-j})}\left[\frac{n-\binom{n}{j}}{j-1} - 1\right], \\
F_3(n) &:= \sum_{j=2}^{n-1} \frac{(-1)^j\binom{n-1}{j}}{j-1}, \\
F_4(n) &:= \sum_{j=3}^{n-1} \frac{B_j}{j(j-1)(1-2^{-j})}\left[\frac{n-1-\binom{n-1}{j-1}}{j-2} - 1\right], \\
F_5(n) &:= \sum_{j=3}^{n} \frac{(-1)^j\binom{n}{j}}{j(j-1)(j-2)\left[1-2^{-(j-1)}\right]}.
\end{aligned}
\]
The second, third, fourth, and fifth terms in (3.38) can be written as $-\frac8n F_1(n)$, $\frac4n F_2(n)$, $\frac49 F_3(n)$, and $-4F_4(n)$, respectively. The last three terms in (3.38) can be combined as follows:
\[
\begin{aligned}
&-4\sum_{j=2}^{n-1} \frac{(-1)^j\binom{n-1}{j}}{(j+1)j(j-1)(1-2^{-j})} - \frac4n \sum_{j=2}^{n} \frac{(-1)^j\binom{n}{j}}{j(j-1)\left[1-2^{-(j-1)}\right]} + 2(n-1) \\
&\quad= \frac4n \sum_{j=3}^{n} \frac{(-1)^j\binom{n}{j}}{(j-1)(j-2)\left[1-2^{-(j-1)}\right]} - \frac4n \sum_{j=2}^{n} \frac{(-1)^j\binom{n}{j}}{j(j-1)\left[1-2^{-(j-1)}\right]} + 2(n-1) \\
&\quad= \frac8n \sum_{j=3}^{n} \frac{(-1)^j\binom{n}{j}}{j(j-1)(j-2)\left[1-2^{-(j-1)}\right]} = \frac8n F_5(n).
\end{aligned}
\]
Therefore
\[
\mu(\bar m, n) = 2(n-1) - \frac8n F_1(n) + \frac4n F_2(n) + \frac49 F_3(n) - 4F_4(n) + \frac8n F_5(n). \tag{3.39}
\]
3.2 Asymptotic analysis of $\mu(\bar m, n)$

We derive an asymptotic expression for $\mu(\bar m, n)$ from (3.39). The computations described in this section are analogous to those in Section 2.3, so we merely sketch the details. First, we analyze $F_1(n)$. A routine complex-analytic argument similar to (but much easier than) the one described in Section 2.3 shows that
\[
\begin{aligned}
F_1(n) &= (-1)^{n+1}\sum_{k=0}^{2} \operatorname{Res}_{s=k}\left[\frac{n!}{s(s-1)^2(s-2)^2(s-3)\cdots(s-n)}\right] \\
&= (-1)^{n+1}\left[\frac{(-1)^n}{2} + (-1)^n n H_{n-1} + \frac{(-1)^n}{2}\,n(n-1)\left(H_{n-2} - \frac52\right)\right] \\
&= -\frac12\, n(n-1)H_{n-2} + \frac54\, n(n-1) - nH_{n-1} - \frac12 \\
&= -\frac12\, n^2\ln n + \left(\frac54 - \frac{\gamma}{2}\right)n^2 - \frac{n\ln n}{2} - \frac{(\gamma+1)n}{2} + O(1).
\end{aligned}
\tag{3.40}
\]
Since $F_2(n)$ is equal to $t_n$, which is defined at (2.17) and analyzed in Section 2.3, we already have an asymptotic expression for $F_2(n)$. Next we derive an asymptotic expression for $F_3(n)$:
\[
\begin{aligned}
F_3(n) &= (-1)^n\sum_{k=0}^{1} \operatorname{Res}_{s=k}\left\{\frac{(n-1)!}{s(s-1)^2(s-2)\cdots[s-(n-1)]}\right\} = nH_{n-2} - n - H_{n-2} + 2 \\
&= n\ln n + (\gamma-1)n - \ln n + O(1).
\end{aligned}
\tag{3.41}
\]
To obtain an asymptotic expression for $F_4(n)$, we closely follow the approach of Section 2.3. Let $u_n := F_4(n+1) - F_4(n)$. Then
\[
u_n = -\sum_{j=3}^{n} \frac{B_j}{j(j-1)(j-2)(1-2^{-j})}\left[\binom{n-1}{j-2} - 1\right].
\]
Let $v_n := u_{n+1} - u_n$. Then, by computations similar to those performed for $v_n$ in Section 2.3,
\[
\begin{aligned}
v_n &= -\sum_{k=0}^{n-2} \frac{(-1)^k\,\zeta(-2-k)}{(k+2)(k+1)\left[1-2^{-(k+3)}\right]}\binom{n-1}{k} \\
&= (-1)^{n+1}\sum_{k=1}^{3} \operatorname{Res}_{s=-k}\left\{\frac{\zeta(-2-s)}{(s+2)(s+1)\left[1-2^{-(s+3)}\right]}\,\frac{(n-1)!}{s(s-1)\cdots[s-(n-1)]}\right\} \\
&\qquad + (-1)^{n+1}\sum_{k\in\mathbb{Z}\setminus\{0\}} \operatorname{Res}_{s=-3+\chi_k}\left\{\frac{\zeta(-2-s)}{(s+2)(s+1)\left[1-2^{-(s+3)}\right]}\,\frac{(n-1)!}{s(s-1)\cdots[s-(n-1)]}\right\} \\
&= \frac{1}{9n} - \frac{1}{n(n+1)} - \frac{1}{n(n+1)(n+2)}\left[\frac{\gamma}{\ln 2} - \frac12 - \frac{H_{n+2}}{n+2}\right] - \xi_n,
\end{aligned}
\]
where
\[
\xi_n := \sum_{k\in\mathbb{Z}\setminus\{0\}} \frac{\zeta(1-\chi_k)\,\Gamma(1-\chi_k)\,\Gamma(n)}{(\ln 2)\,\Gamma(n+3-\chi_k)}.
\]
Hence
\[
u_n = u_2 + \sum_{j=2}^{n-1} v_j
= \frac19 H_{n-1} + a + \widehat{\xi}_n - \frac{1}{2\ln 2}\left(\frac{H_n}{n} - \frac{H_{n+1}}{n+1}\right) + \frac1n - \frac{3+\ln 2-2\gamma}{4\ln 2}\,\frac{1}{n(n+1)},
\]
where
\[
a := \frac{7}{36\ln 2} - \frac{41}{72} - \frac{\gamma}{12\ln 2} - \sum_{k\in\mathbb{Z}\setminus\{0\}} \frac{\zeta(1-\chi_k)\,\Gamma(1-\chi_k)}{(\ln 2)(2-\chi_k)\,\Gamma(4-\chi_k)},
\]
\[
\widehat{\xi}_n := \sum_{k\in\mathbb{Z}\setminus\{0\}} \frac{\zeta(1-\chi_k)\,\Gamma(1-\chi_k)\,\Gamma(n)}{(\ln 2)(2-\chi_k)\,\Gamma(n+2-\chi_k)}.
\]
Thus
\[
\begin{aligned}
F_4(n) = F_4(2) + \sum_{j=2}^{n-1} u_j
&= \frac19\, nH_{n-1} + \frac89\, H_{n-1} + \left(a - \frac19\right)n - \frac89 - \frac{3}{8\ln 2} - \frac{3+\ln 2-2\gamma}{8\ln 2} \\
&\qquad - 2a + b - \widetilde{\xi}_n + \frac{1}{2\ln 2}\,\frac{H_n}{n} + \frac{3+\ln 2-2\gamma}{4\ln 2}\,\frac1n,
\end{aligned}
\tag{3.42}
\]
where
\[
b := \sum_{k\in\mathbb{Z}\setminus\{0\}} \frac{\zeta(1-\chi_k)\,\Gamma(1-\chi_k)}{(\ln 2)(2-\chi_k)(1-\chi_k)\,\Gamma(3-\chi_k)},
\]
\[
\widetilde{\xi}_n := \sum_{k\in\mathbb{Z}\setminus\{0\}} \frac{\zeta(1-\chi_k)\,\Gamma(1-\chi_k)\,\Gamma(n)}{(\ln 2)(2-\chi_k)(1-\chi_k)\,\Gamma(n+1-\chi_k)}.
\]
Therefore
\[
F_4(n) = \frac19\, n\ln n + \left(a + \frac{\gamma}{9} - \frac19\right)n + \frac89\,\ln n + O(1). \tag{3.43}
\]
Finally, we analyze $F_5(n)$. Let
\[
G_n(s) := \frac{n!}{\left[1-2^{-(s-1)}\right]s^2(s-1)^2(s-2)^2(s-3)\cdots(s-n)}.
\]
Then, by computations that are entirely analogous to those performed for $F_1(n)$, $F_2(n)$, and $F_4(n)$,
\[
\begin{aligned}
F_5(n) &= (-1)^{n+1}\sum_{k=0}^{2} \operatorname{Res}_{s=k} G_n(s) + (-1)^{n+1}\sum_{k\in\mathbb{Z}\setminus\{0\}} \operatorname{Res}_{s=1+\chi_k} G_n(s) \\
&= \frac14\left(2H_n + 3 + 4\ln 2\right) - \frac{n(n-1)}{2}\left(H_{n-2} - \ln 2 - 3\right) \\
&\qquad - n\left[\frac{1}{2\ln 2}(H_{n-1})^2 + \left(\frac12 - \frac{1}{\ln 2}\right)H_{n-1} + \frac{1}{2\ln 2}H_{n-1}^{(2)} + \frac{2}{\ln 2} + \frac{\ln 2}{12} - \frac12\right] \\
&\qquad + \sum_{k\in\mathbb{Z}\setminus\{0\}} \frac{\Gamma(-1-\chi_k)\,\Gamma(n+1)}{(\ln 2)\,\chi_k(\chi_k^2-1)\,\Gamma(n-\chi_k)} \\
&= -\frac12\, n^2\ln n + \frac{3+\ln 2-\gamma}{2}\,n^2 - \frac{1}{2\ln 2}\,n(\ln n)^2 + \left(\frac{1}{\ln 2} - \frac12\right)n\ln n + O(n).
\end{aligned}
\tag{3.44}
\]
Therefore, from (3.39)–(3.41) and (3.43)–(3.44), we obtain the following asymptotic formula for $\mu(\bar m, n)$:
\[
\mu(\bar m, n) = 4(1+\ln 2-a)\,n - \frac{4}{\ln 2}(\ln n)^2 + 4\left(\frac{2}{\ln 2}-1\right)\ln n + O(1). \tag{3.45}
\]
The asymptotic slope $4(1+\ln 2-a)$ is approximately 8.20731. Thus the expected number of bit comparisons required by QuickRand remains asymptotically linear in $n$. As in the QuickMin case, the expectation differs asymptotically from that for key comparisons only by a constant factor. (The expected number of key comparisons required by QuickRand is asymptotically $3n$; see Mahmoud et al. [19].)
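The $3n$ asymptotics for key comparisons under a uniformly random target rank can be observed numerically by averaging Knuth's closed formula (quoted in Chapter 4) over $m$. This sketch is ours; the function names are illustrative, and the bounds in the assertions are deliberately loose.

```python
def H(n):
    # n-th harmonic number (floating point)
    return sum(1.0 / i for i in range(1, n + 1))

def key_cmp(m, n):
    # Knuth's closed formula for the expected number of key comparisons
    # made by QuickSelect when finding the key of rank m among n keys
    return 2 * (n + 3 + (n + 1) * H(n) - (m + 2) * H(m) - (n + 3 - m) * H(n + 1 - m))

def grand_avg(n):
    # expectation when the target rank is uniform on {1,...,n} (QuickRand)
    return sum(key_cmp(m, n) for m in range(1, n + 1)) / n

r100, r2000 = grand_avg(100) / 100, grand_avg(2000) / 2000
assert r100 < r2000 < 3.0   # the ratio approaches 3 from below
assert r2000 > 2.8
print(r100, r2000)
```

The second-order term is logarithmic, which is why the ratio is still visibly below 3 at these modest values of $n$.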
Chapter 4
Derivation of a closed formula for µ(m,n)
The exact expression for $\mu(m,n)$ obtained in Section 2.1 [see (2.8)] involves infinite summation and integration. Hence it is not a preferable form for numerically computing the expectation. In this chapter, we establish another exact expression for $\mu(m,n)$ that involves only finite summation. We also use the formula to compute $\mu(m,n)$ for $m = 1, \ldots, n$ and $n = 2, \ldots, 20$.

As described in Section 2.1, it follows from equations (2.6)–(2.8) that
\[
\mu(m,n) = \mu_1(m,n) + \mu_2(m,n) + \mu_3(m,n), \tag{4.1}
\]
where, for $q = 1, 2, 3$,
\[
\mu_q(m,n) := \sum_{k=0}^{\infty}\sum_{l=1}^{2^k} \int_{(l-1)2^{-k}}^{(l-\frac12)2^{-k}}\int_{(l-\frac12)2^{-k}}^{l\,2^{-k}} (k+1)\,P_q(s,t,m,n)\, dt\, ds. \tag{4.2}
\]
The same technique can be applied to eliminate the infinite summation and integration from each $\mu_q(m,n)$. We describe the technique for obtaining a closed expression for $\mu_1(m,n)$ in detail.

First, we transform $P_1(s,t,m,n)$ shown in (2.3) so that we can eliminate the integration in $\mu_1(m,n)$. Define
\[
C_1(i,j) := \mathbf{1}(1\le m\le i<j\le n)\,\frac{2}{j-m+1}\binom{n}{i-1,\,1,\,j-i-1,\,1,\,n-j}, \tag{4.3}
\]
where $\mathbf{1}(1\le m\le i<j\le n)$ is an indicator function that equals 1 if the stated event holds and 0 otherwise. Since
\[
s^{i-1}(t-s)^{j-i-1}(1-t)^{n-j} = s^{i-1}\sum_{u=0}^{j-i-1}\binom{j-i-1}{u}t^u(-1)^{j-i-1-u}s^{j-i-1-u}\sum_{v=0}^{n-j}\binom{n-j}{v}(-1)^{n-j-v}t^{n-j-v},
\]
it follows that
\[
\begin{aligned}
P_1(s,t,m,n)
&= \sum_{m\le i<j\le n} C_1(i,j)\sum_{u=0}^{j-i-1}\sum_{v=0}^{n-j}\binom{j-i-1}{u}\binom{n-j}{v}\, s^{j-u-2}\, t^{n-j-v+u}\,(-1)^{n-i-u-v-1} \\
&= \sum_{m\le i<j\le n} C_1(i,j)\sum_{f=i-1}^{j-2}\;\sum_{h=j-f-2}^{n-f-2} s^f t^h \binom{j-i-1}{f-i+1}\binom{n-j}{h-j+f+2}(-1)^{h-i-j+1} \\
&= \sum_{f=m-1}^{n-2}\sum_{h=0}^{n-f-2} s^f t^h\, C_2(f,h),
\end{aligned}
\tag{4.4}
\]
where
\[
C_2(f,h) := \sum_{i=m}^{f+1}\sum_{j=f+2}^{f+h+2} C_1(i,j)\binom{j-i-1}{f-i+1}\binom{n-j}{h-j+f+2}(-1)^{h-i-j+1}.
\]
Thus, from (4.2) and (4.4), we can eliminate the integration in $\mu_1(m,n)$ and express it using polynomials in $l$:
\[
\mu_1(m,n) = \sum_{f=m-1}^{n-2}\sum_{h=0}^{n-f-2} C_3(f,h)\sum_{k=0}^{\infty}(k+1)\sum_{l=1}^{2^k} 2^{-k(f+h+2)}\left[l^{h+1} - \left(l-\tfrac12\right)^{h+1}\right]\left[\left(l-\tfrac12\right)^{f+1} - (l-1)^{f+1}\right], \tag{4.5}
\]
where
\[
C_3(f,h) := \frac{1}{(h+1)(f+1)}\,C_2(f,h),
\]
the factor $\frac{1}{(h+1)(f+1)}$ arising from the termwise integration of $s^f$ and $t^h$ in (4.2).
Note that
\[
l^{h+1} - \left(l-\tfrac12\right)^{h+1} = -\sum_{j=0}^{h}\binom{h+1}{j}\, l^j \left(-\tfrac12\right)^{h+1-j},
\]
\[
\left(l-\tfrac12\right)^{f+1} - (l-1)^{f+1} = -\sum_{j'=0}^{f}\binom{f+1}{j'}\, l^{j'}\,(-1)^{f+1-j'}\left[1-\left(\tfrac12\right)^{f+1-j'}\right].
\]
Hence
\[
\left[l^{h+1} - \left(l-\tfrac12\right)^{h+1}\right]\left[\left(l-\tfrac12\right)^{f+1} - (l-1)^{f+1}\right]
= \sum_{j'=0}^{f}\sum_{j=0}^{h}\binom{f+1}{j'}\binom{h+1}{j}(-1)^{f+h-j'-j}\left[1-\left(\tfrac12\right)^{f+1-j'}\right]\left(\tfrac12\right)^{h+1-j} l^{j'+j},
\]
which can be rearranged to
\[
\sum_{j=1}^{f+h+1} C_4(f,h,j)\, l^{j-1}, \tag{4.6}
\]
where
\[
C_4(f,h,j) := (-1)^{f+h-j+1}\left(\tfrac12\right)^{h-j+2}\sum_{j'=0\vee(j-1-h)}^{(j-1)\wedge f}\binom{f+1}{j'}\binom{h+1}{j-1-j'}\left[1-\left(\tfrac12\right)^{f+1-j'}\right]\left(\tfrac12\right)^{j'}.
\]
Therefore, from (4.5)–(4.6), we obtain
\[
\begin{aligned}
\mu_1(m,n) &= \sum_{f=m-1}^{n-2}\sum_{h=0}^{n-f-2} C_3(f,h)\sum_{k=0}^{\infty}(k+1)\sum_{l=1}^{2^k} 2^{-k(f+h+2)}\sum_{j=1}^{f+h+1} C_4(f,h,j)\, l^{j-1} \\
&= \sum_{f=m-1}^{n-2}\sum_{h=0}^{n-f-2}\sum_{j=1}^{f+h+1} C_5(f,h,j)\sum_{k=0}^{\infty}(k+1)\,2^{-k(f+h+2)}\sum_{l=1}^{2^k} l^{j-1},
\end{aligned}
\]
where
\[
C_5(f,h,j) := C_3(f,h)\cdot C_4(f,h,j).
\]
Here, as described in Section 2.2,
\[
\sum_{l=1}^{2^k} l^{j-1} = \sum_{r=0}^{j-1} a_{j,r}\, 2^{k(j-r)},
\]
where $a_{j,r}$ is defined by (2.12). Now define
\[
C_6(f,h,j,r) := a_{j,r}\, C_5(f,h,j).
\]
Then
\[
\begin{aligned}
\mu_1(m,n) &= \sum_{f=m-1}^{n-2}\sum_{h=0}^{n-f-2}\sum_{j=1}^{f+h+1}\sum_{r=0}^{j-1} a_{j,r}\, C_5(f,h,j)\sum_{k=0}^{\infty}(k+1)\,2^{-k(f+h+2+r-j)} \\
&= \sum_{f=m-1}^{n-2}\sum_{h=0}^{n-f-2}\sum_{j=1}^{f+h+1}\sum_{r=0}^{j-1} C_6(f,h,j,r)\left[1-2^{-(f+h+2+r-j)}\right]^{-2} \\
&= \sum_{a=1}^{n-1} C_7(a)\,(1-2^{-a})^{-2},
\end{aligned}
\tag{4.7}
\]
where
\[
C_7(a) := \sum_{f=m-1}^{n-2}\sum_{h=\alpha}^{n-f-2}\sum_{j=\beta}^{f+h+1} C_6\bigl(f,h,j,\,a+j-(f+h+2)\bigr),
\]
in which $\alpha := 0\vee(a-f-1)$ and $\beta := 1\vee(f+h+2-a)$.
The procedure described above can be applied to derive analogous exact formulae for $\mu_2(m,n)$ and $\mu_3(m,n)$. To derive the formula for $\mu_2(m,n)$, one need only start the derivation by changing the indicator function in $C_1(i,j)$ [see (4.3)] to $\mathbf{1}(1\le i<m<j\le n)$ and follow each step of the procedure; for $\mu_3(m,n)$, start the derivation by changing the indicator function to $\mathbf{1}(1\le i<j\le m\le n)$.
[Figure 4.1 appears here: a surface plot of the expectation of bit comparisons over the $(m,n)$ grid.]

Figure 4.1: Expected number of bit comparisons for QuickSelect. The closed formulae for $\mu_1(m,n)$, $\mu_2(m,n)$, and $\mu_3(m,n)$ were used to compute $\mu(m,n)$ for $n = 2, \ldots, 20$ ($n$ represents the number of keys) and $m = 1, 2, \ldots, n$ ($m$ represents the rank of the target key).
Using the closed exact formulae for $\mu_1(m,n)$, $\mu_2(m,n)$, and $\mu_3(m,n)$, we computed $\mu(m,n)$ for $n = 2, 3, \ldots, 20$ and $m = 1, 2, \ldots, n$. Figure 4.1 shows the results, which suggest the following: (i) for fixed $n$, $\mu(m,n)$ increases in $m$ for $m \le \frac{n+1}{2}$ and is symmetric about $\frac{n+1}{2}$; (ii) for fixed $m$, $\mu(m,n)$ increases in $n$; (iii) $\max_m \mu(m,n)$ is asymptotically linear in $n$.

Written as a single expression, the closed formula for $\mu(m,n)$ is a seven-fold sum of rather elementary terms, with each sum having order $n$ terms (in the worst case); in this sense, the running time of the algorithm for computing $\mu(m,n)$ is of order $n^7$. This formula for $\mu(m,n)$ does not allow us to prove the three conjectures above. By contrast, a concise formula has been established for the expected number of key comparisons required by QuickSelect to find the $m$-th key from a set of $n$ keys; Knuth [15] showed that the expectation equals $2\left[n + 3 + (n+1)H_n - (m+2)H_m - (n+3-m)H_{n+1-m}\right]$.
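Knuth's closed formula can be verified against the standard distributional recurrence for QuickSelect's key comparisons (partitioning costs $n-1$ comparisons; the pivot rank is uniform; recursion continues on the side containing the target). This sketch is ours and uses exact rational arithmetic; the recurrence itself is the textbook one, not a construction specific to this dissertation.

```python
from fractions import Fraction
from functools import lru_cache

def H(n):
    # n-th harmonic number, exactly
    return sum(Fraction(1, i) for i in range(1, n + 1))

@lru_cache(maxsize=None)
def C(m, n):
    # expected key comparisons to find the key of rank m among n keys:
    # pivot rank p is uniform on {1,...,n}; partitioning costs n-1;
    # if p > m recurse on the left block, if p < m on the right block
    if n <= 1:
        return Fraction(0)
    rec = sum(C(m, p - 1) for p in range(m + 1, n + 1)) \
        + sum(C(m - p, n - p) for p in range(1, m))
    return (n - 1) + rec / n

def knuth(m, n):
    # Knuth's closed formula
    return 2 * (n + 3 + (n + 1) * H(n) - (m + 2) * H(m) - (n + 3 - m) * H(n + 1 - m))

for n in range(1, 12):
    for m in range(1, n + 1):
        assert C(m, n) == knuth(m, n), (m, n)
print("Knuth's formula matches the recurrence exactly for n <= 11")
```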
Chapter 5
Background and preliminaries for limiting-distribution analysis
In this chapter, we formulate a framework that will be used to establish the limiting-distribution results for QuickVal and QuickQuant presented in Chapters 6–8. We describe the probabilistic models that characterize the generation of keys in Section 5.1 and review known results about key and symbol comparisons in Section 5.2. QuickVal and QuickQuant are explained in Section 5.3.

5.1 Probabilistic source models for the keys

In this section we describe what is meant by a probabilistic source, our model for how the i.i.d. keys are generated, using the terminology and notation of Vallée et al. [27].
Let $\Sigma$ denote a totally ordered alphabet (set of symbols), assumed to be isomorphic either to $\{0, \ldots, r-1\}$ for some finite $r$ or to the full set of nonnegative integers, in either case with the natural order; a word is then an element of $\Sigma^\infty$, i.e., an infinite sequence (or "string") of symbols. We will follow the customary practice of denoting a word $w = (w_1, w_2, \ldots)$ more simply by $w_1 w_2 \cdots$.

We will use the word "prefix" in two closely related ways. First, the symbol strings belonging to $\Sigma^k$ are called prefixes of length $k$, and so $\Sigma^* := \cup_{0\le k<\infty}\Sigma^k$ denotes the set of all prefixes of any nonnegative finite length. Second, if $w = w_1 w_2 \cdots$ is a word, then we will call
\[
w(k) := w_1 w_2 \cdots w_k \in \Sigma^k \tag{5.1}
\]
its prefix of length $k$.

Lexicographic order is the linear order (to be denoted in the strict sense by $\prec$ and in the weak sense by $\preceq$) on the set of words specified by declaring that $w \prec w'$ if (and only if) for some $0 \le k < \infty$ the prefixes of $w$ and $w'$ of length $k$ are equal but $w_{k+1} < w'_{k+1}$. We denote the cost of determining $w \prec w'$ when comparing distinct words $w$ and $w'$ by $c(w,w')$; we will always assume that the function $c$ is symmetric and nonnegative.
Example 5.1. Here is an example of a natural class of cost functions. Start with nonnegative symmetric functions $c_i : \Sigma\times\Sigma \to [0,\infty)$, $i = 1, 2, \ldots$, modeling the cost of comparing symbols in the respective $i$th positions of two words. This allows the symbol-comparison costs to depend both on the positions of the symbols in the words and on the symbols themselves. Then, for comparisons of distinct words, define
\[
c(w,w') := \sum_{i=1}^{k+1} c_i(w_i, w'_i) = \sum_{i=1}^{k} c_i(w_i, w_i) + c_{k+1}(w_{k+1}, w'_{k+1}),
\]
where $k$ is the length of the longest common prefix of $w$ and $w'$. If $c_i \equiv \delta_{i_0,i}$ (independent of the symbols being compared) for a given positive integer $i_0$, then $c$ is the cost used in counting comparisons of symbols in position $i_0$; in particular, if $i_0 = 1$ then $c \equiv 1$ is the cost used in counting key comparisons. On the other hand, if $c_i \equiv 1$ for all $i$, then $c \equiv k+1$ is the cost used in counting symbol comparisons.
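The two special cases just described, key-comparison cost and symbol-comparison cost, can be illustrated concretely. The sketch below is ours: the function names are illustrative, and words are modeled as functions from 1-indexed positions to symbols.

```python
from itertools import count

def lcp_len(w, wp):
    # length k of the longest common prefix of two DISTINCT words,
    # each given as a function i -> symbol (positions are 1-indexed)
    for k in count(0):
        if w(k + 1) != wp(k + 1):
            return k

def cost(w, wp, c_i):
    # c(w, w') = sum_{i=1}^{k+1} c_i(i, w_i, w'_i), per Example 5.1
    k = lcp_len(w, wp)
    return sum(c_i(i, w(i), wp(i)) for i in range(1, k + 2))

# two binary words: 0 1 0 1 0 1 ... and 0 1 1 0 0 1 ... (first differ at position 3)
w  = lambda i: [0, 1, 0, 1][(i - 1) % 4]
wp = lambda i: [0, 1, 1, 0][(i - 1) % 4]

key_cmp    = cost(w, wp, lambda i, a, b: 1 if i == 1 else 0)  # c_i = delta_{1,i}
symbol_cmp = cost(w, wp, lambda i, a, b: 1)                    # c_i = 1 for all i

assert (key_cmp, symbol_cmp) == (1, 3)   # k = 2, so symbol cost is k + 1 = 3
print(key_cmp, symbol_cmp)
```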
A probabilistic source is simply a stochastic process $W = W_1 W_2 \cdots$ with state space $\Sigma$ (endowed with its total $\sigma$-field) or, equivalently, a random variable $W$ taking values in $\Sigma^\infty$ (with the product $\sigma$-field). According to Kolmogorov's consistency criterion, the distributions $\mu$ of such processes are in one-to-one correspondence with consistent specifications of finite-dimensional marginals, that is, of the probabilities
\[
p_w := \mu(\{w_1\cdots w_k\}\times\Sigma^\infty), \qquad w = w_1 w_2\cdots w_k \in \Sigma^*.
\]
Here the fundamental probability $p_w$ is the probability that a word drawn from $\mu$ has $w_1\cdots w_k$ as its length-$k$ prefix.

Because the analysis of QuickSelect is significantly more complicated when its input keys are not all distinct, we will restrict attention to probabilistic sources with continuous distributions $\mu$. Expressed equivalently in terms of fundamental probabilities, our continuity assumption is that for any $w = w_1 w_2 \cdots \in \Sigma^\infty$ we have $p_{w(k)} \to 0$ as $k \to \infty$, recalling the prefix notation (5.1).
Example 5.2. We present a few classical examples of sources. For more examples, and for further discussion, see Section 3 of [27].

(a) In computer science jargon, a memoryless source is one with $W_1, W_2, \ldots$ i.i.d. Then the fundamental probabilities $p_w$ have the product form
\[
p_w = p_{w_1} p_{w_2}\cdots p_{w_k}, \qquad w = w_1 w_2\cdots w_k \in \Sigma^*.
\]

(b) A Markov source is one for which $W_1 W_2 \cdots$ is a Markov chain.

(c) An intermittent source over the finite alphabet $\Sigma = \{0, \ldots, r-1\}$ is defined by specifying the conditional distributions $\mathcal{L}(W_j \mid W_1, \ldots, W_{j-1})$ ($j \ge 2$) in a way that pays special attention to a particular symbol $\sigma$. The source is said to be intermittent of exponent $\gamma > 0$ with respect to $\sigma$ if $\mathcal{L}(W_j \mid W_1, \ldots, W_{j-1})$ depends only on the maximum value $k$ such that the last $k$ symbols in the prefix $W_1\cdots W_{j-1}$ are all $\sigma$, and (i) is the uniform distribution on $\Sigma$ if $k = 0$; (ii) if $1 \le k \le j-1$, assigns mass $[k/(k+1)]^\gamma$ to $\sigma$ and distributes the remaining mass uniformly over the remaining elements of $\Sigma$.
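The conditional law defining an intermittent source is easy to write down explicitly. The following sketch is ours (integer exponent $\gamma$ is used so that the masses stay exact rationals):

```python
from fractions import Fraction

def intermittent_dist(prefix, r, sigma, gamma):
    # conditional distribution of the next symbol given the prefix, for a
    # source intermittent of exponent gamma with respect to symbol sigma
    k = 0
    while k < len(prefix) and prefix[len(prefix) - 1 - k] == sigma:
        k += 1   # k = length of the maximal run of sigma ending the prefix
    if k == 0:
        return {s: Fraction(1, r) for s in range(r)}
    mass_sigma = Fraction(k, k + 1)**gamma        # [k/(k+1)]^gamma
    rest = (1 - mass_sigma) / (r - 1)             # uniform on the other symbols
    return {s: (mass_sigma if s == sigma else rest) for s in range(r)}

d = intermittent_dist((0, 0), r=3, sigma=0, gamma=2)
assert d[0] == Fraction(4, 9)                     # (2/3)^2 with k = 2
assert sum(d.values()) == 1
print(d)
```

Continuing a run of $\sigma$ becomes ever more likely as the run grows, which is what makes the word $\sigma^k$ the most probable prefix for large $k$ (see Remark 5.5(b)).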
We next present an equivalent description of probabilistic sources (with a corresponding equivalent condition for continuity) that will prove convenient because it allows us to treat all sources within a uniform framework. If $M$ is any measurable mapping from $(0,1)$ (with its Borel $\sigma$-field) into $\Sigma^\infty$ and $U$ is distributed unif$(0,1)$, then $M(U)$ is a probabilistic source. Conversely, given any probability measure $\mu$ on $\Sigma^\infty$ there exists a monotone measurable mapping $M$ such that $M(U)$ has distribution $\mu$ when $U \sim \mathrm{unif}(0,1)$; here (weakly) monotone means that $M(t) \preceq M(u)$ whenever $t \le u$. Indeed, if $F$ is the distribution function
\[
F(w) := \mu\{w'\in\Sigma^\infty : w' \preceq w\}, \qquad w \in \Sigma^\infty,
\]
for $\mu$, then we can always use the inverse probability transform
\[
M(u) := \inf\{w\in\Sigma^\infty : u \le F(w)\}, \qquad u \in (0,1),
\]
for $M$. The measure $\mu$ is continuous if and only if this $M$ is strictly monotone.
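For the symmetric memoryless source on $\Sigma = \{0,1\}$, the inverse probability transform $M$ is simply the binary-expansion map, so a seed's word is its binary expansion. The sketch below is ours and computes only a finite prefix of $M(u)$:

```python
def M_prefix(u, k):
    # first k symbols of M(u) for the memoryless source with
    # Sigma = {0,1} and p_0 = p_1 = 1/2: the binary expansion of u
    word = []
    for _ in range(k):
        u *= 2
        bit = int(u)      # next binary digit, 0 or 1
        word.append(bit)
        u -= bit
    return word

assert M_prefix(0.625, 3) == [1, 0, 1]   # 0.625 = (0.101)_2
# monotonicity: t <= u implies M(t) lexicographically precedes-or-equals M(u);
# Python compares lists lexicographically
assert M_prefix(0.3, 8) <= M_prefix(0.7, 8)
print(M_prefix(0.625, 5))
```

For this source the fundamental interval $I_w$ of a prefix $w$ is a dyadic interval of length $2^{-|w|}$, consistent with $p_w = b_w - a_w$ below.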
So henceforth we will assume that our keys are generated as $M(U_1), \ldots, M(U_n)$, where $M : (0,1) \to \Sigma^\infty$ is strictly monotone and $U_1, \ldots, U_n$ (we will call these the "seeds" of the keys) are i.i.d. unif$(0,1)$. Given a specification of costs $c(w,w')$ for comparing words, we can now define a source-specific notion of cost by setting
\[
\beta(u,t) := c(M(u), M(t)).
\]
In our main application, $\beta_{\mathrm{symb}}(u,t)$ represents the number of symbol comparisons required to compare words with seeds $u$ and $t$.

The following associated terminology and notation from [27] will also prove useful. For each prefix $w \in \Sigma^*$, we let $I_w = (a_w, b_w)$ denote the interval that contains all seeds whose corresponding words begin with $w$, and $\mu_w := (a_w+b_w)/2$ its midpoint. We call $I_w$ the fundamental interval associated with $w$. (There is no need to be fussy as to whether the interval is open or closed or half-open, because the probability that a random seed $U$ takes any particular value is 0. Also, we always assume that $a_w < b_w$, since the case $a_w = b_w$ will not concern us.) The fundamental probability $p_w$ can be expressed as $b_w - a_w$.
59
CHAPTER 5. PRELIMINARIES FOR DISTRIBUTIONAL CONVERGENCEANALYSIS
The fundamental triangle of prefix w, denoted by Tw, is the triangular region
Tw := {(u, t) : aw < u < t < bw},
and when w is the empty prefix we denote this triangle by T :
T := {(u, t) : 0 < u < t < 1}.
For some of our results, the quantity
πk := max{pw : w ∈ Σk} (5.2)
will play an important role. The following definition of a Π-tamed probabilistic source is
taken (with slight modification) from [27]:
Definition 5.3. Let 0 < γ < ∞ and 0 < A < ∞. We say that the source is Π-tamed (with parameters γ and A) if the sequence (π_k) at (5.2) satisfies
\[
\pi_k \le A (k + 1)^{-\gamma} \quad \text{for every } k \ge 0.
\]
Observe that a Π-tamed source is always continuous. There is a related condition
for cost functions β that will be assumed (for suitable values of the parameters) in some of
our results:
Definition 5.4. Let 0 < ε < ∞ and 0 < c < ∞. We say that the symmetric cost function β ≥ 0 is tamed (with parameters ε and c) if
\[
\beta(u, t) \le c\,(t - u)^{-\varepsilon} \quad \text{for all } (u, t) \in T.
\]
We say that β is ε-tamed if it is tamed with parameters ε and c for some c.
We leave it to the reader to make the simple verification that a source is Π-tamed with parameters γ and A if and only if β_symb is tamed with parameters ε = 1/γ and c = A^{1/γ}.
Remark 5.5. (a) Many common sources have geometric decrease in π_k (call these “g-tamed”) and so for any γ are Π-tamed with parameters γ and A for suitably chosen A ≡ A_γ [equivalently, the symbol-comparison cost β_symb is ε-tamed for any ε; in fact, if π_k ≤ b^{−k} for every k with b > 1, then
\[
\beta_{\mathrm{symb}}(u, t) \le 1 + \log_b \frac{1}{t - u} \quad \text{for all } (u, t) \in T].
\]
For example, a memoryless source satisfies π_k = p_max^k, where
\[
p_{\max} := \sup_{w \in \Sigma^1} p_w
\]
satisfies p_max < 1 except in the highly degenerate case of an essentially single-symbol alphabet. We also have π_k ≤ p_max^k for any Markov source, where now p_max is the supremum of all one-step transition probabilities, and so such a source is g-tamed provided p_max < 1. Expanding dynamical sources are also g-tamed.
(b) For an intermittent source as in Example 5.2(c), for all large k the maximum probability π_k is attained by the word σ^k and equals
\[
\pi_k = r^{-1} k^{-\gamma}.
\]
Intermittent sources are therefore examples of Π-tamed sources for which π_k decays at a truly inverse-polynomial rate, not an exponential rate as in the case of g-tamed sources.
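For the memoryless binary source, where b = 2 in the bound of Remark 5.5(a), β_symb can be computed by walking the two binary expansions bit by bit. The following sketch is our own illustration (the function name, the bit cutoff, and the test points are assumptions, not from the text); it spot-checks the bound β_symb(u, t) ≤ 1 + log_2(1/(t − u)):

```python
import math

def symbol_comparisons(u, t, max_bits=64):
    """beta_symb(u, t) for the binary memoryless source: one symbol comparison
    per shared leading bit of the two expansions, plus one for the first
    differing bit (capped at max_bits)."""
    count = 0
    for _ in range(max_bits):
        bu, bt = int(u * 2), int(t * 2)  # next bit of each expansion
        count += 1
        if bu != bt:
            break
        u, t = u * 2 - bu, t * 2 - bt
    return count

# Spot-check the bound beta_symb(u, t) <= 1 + log_2(1/(t - u)) for b = 2.
pairs = [(0.2, 0.7), (0.40625, 0.40627), (0.1, 0.100001)]
checks = [symbol_comparisons(u, t) <= 1 + math.log2(1 / (t - u)) for u, t in pairs]
```

The bound holds because two seeds agreeing in their first k bits lie in a common dyadic interval of length 2^{−k}, forcing t − u < 2^{−k}.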
5.2 Known results for the numbers of key and symbol comparisons
In this subsection we give for QuickSelect an abbreviated review of what is already known about the distribution of the number of key comparisons (β ≡ 1 in our notation) and (from Vallée et al. [27]) about the expected number of symbol comparisons (β = β_symb). To our knowledge, no other cost functions have previously been considered, nor has there been any treatment of the full distribution of the number of symbol comparisons.
Let K_{n,m} denote the number of key comparisons required by the algorithm to find a key of rank m in a file of n keys (with 1 ≤ m ≤ n). Thus K_{n,1} and K_{n,n} represent the key-comparison costs required by QuickMin and QuickMax, respectively. (Clearly K_{n,1} and K_{n,n} are equal in law.) It has been shown (see Mahmoud et al. [19], Hwang and Tsai [14]) that as n → ∞, K_{n,1}/n converges in law to the Dickman distribution, which can be described as the distribution of the perpetuity
\[
1 + \sum_{k \ge 1} U_1 \cdots U_k,
\]
where the U_k are i.i.d. uniform(0, 1). Mahmoud et al. [19] established a fixed-point equation for the limiting distribution of the normalized (by dividing by n) number of key comparisons required by QuickRand and also explicitly identified this limiting distribution.
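The perpetuity can be sampled directly, since the partial products U_1 · · · U_k shrink geometrically and the tail of the sum is negligible once the running product is tiny. A minimal sketch (ours; the sample size and truncation threshold are arbitrary choices), whose empirical mean should be near the exact mean 1 + Σ_{k≥1} 2^{−k} = 2:

```python
import random

def dickman_sample(rng, tol=1e-12):
    """One (truncated) sample of the perpetuity 1 + sum_{k>=1} U_1*...*U_k."""
    total, prod = 1.0, 1.0
    while prod > tol:
        prod *= rng.random()  # multiply in the next i.i.d. uniform(0,1) factor
        total += prod
    return total

rng = random.Random(42)
mean = sum(dickman_sample(rng) for _ in range(20000)) / 20000
```

Each factor halves the product on average, so roughly log(1/tol) ≈ 28 terms suffice per sample.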
By using process-convergence techniques, Grübel and Rösler [11, Theorem 8] identified, for each 0 ≤ α < 1, a nondegenerate random variable K(α) to which K_{n,⌊αn⌋+1}/n
converges in distribution; see also the fixed-point equation in their Theorem 10, and
Grübel [10], who used a Markov chain approach and characterized the limiting distribution in his Theorem 3. Earlier, Devroye [3] had shown that
\[
\sup_{n \ge 1} \max_{1 \le m \le n} \mathbb{P}(K_{n,m} \ge t n) \le C \rho^t
\]
for any ρ > 3/4 and some C ≡ C(ρ).
Concerning moments, Grübel and Rösler [11, Theorem 11] showed that E K(α) = 2[1 − α ln α − (1 − α) ln(1 − α)], and Paulsen [21] calculated higher-order moments of K(α). Grübel [10, end of Section 2] proved convergence of the moments for finite n to the corresponding moments of the limiting K(α).
Prior to the present paper, only expectations have been studied for the number of symbol comparisons for QuickQuant. The current state of knowledge is summarized by part (i) of Theorem 2 in Vallée et al. [27] (see also their accompanying Figures 1–3); we refer the reader to [27] for the other parts of the theorem, which routinely specialize part (i) to QuickMin, QuickMax, and QuickRand.
To review their result we need the notation and terminology of Section 5.1 and a bit more. Using the non-standard abbreviations y_+ := (1/2) + y and y_− := (1/2) − y and the convention 0 ln 0 := 0, we define
\[
H(y) :=
\begin{cases}
-(y_+ \ln y_+ + y_- \ln y_-), & \text{if } 0 \le y \le 1/2,\\
y_- (\ln y_+ - \ln |y_-|), & \text{if } y \ge 1/2,
\end{cases}
\]
and then set L(y) := 2[1 + H(y)]. According to Theorem 2(i) in [27], for any Π-tamed source the mean number of symbol comparisons for QuickQuant(n, α) is asymptotically ρn + O(n^{1−δ}) for some δ > 0. Here ρ ≡ ρ(α) and δ both depend on the probabilistic source, with
\[
\rho := \sum_{w \in \Sigma^*} p_w\, L\!\left(\left|\frac{\alpha - \mu_w}{p_w}\right|\right). \tag{5.3}
\]
They derive (5.3) by first proving the equality
\[
\rho = \int_T \beta(u, t)\, [(\alpha \vee t) - (\alpha \wedge u)]^{-1}\, du\, dt \tag{5.4}
\]
for Π-tamed sources with γ > 1.
5.3 QuickQuant and QuickVal

Let S_n^Q ≡ S_n^Q(α) denote the total cost required by QuickQuant(n, α). To prove convergence of S_n^Q/n (in suitable senses to be made precise later), we exploit an idea introduced by Vallée et al. [27] and begin with the study of a related algorithm, called QuickVal ≡ QuickVal(n, α), which we now describe. QuickVal is admittedly somewhat artificial and inefficient; it is important to keep in mind that we study it mainly as an aid to studying QuickQuant.
Having generated n seeds and then n keys M_1, . . . , M_n (say) using our probabilistic source, QuickVal is a recursive randomized algorithm to find the rank of the additional word M(α) in the set {M_1, . . . , M_n, M(α)}; thus, while QuickQuant finds the value of the α-quantile in the sample of keys, QuickVal dually finds the rank of the population
α-quantile in the augmented set. First, QuickVal selects a pivot uniformly at random
from the set of keys {M1, . . . ,Mn} and finds the rank of the pivot by (a) comparing the
pivot with each of the other keys (we will count these comparisons) and (b) comparing the
pivot with M(α) (we will find it convenient not to count the cost of this comparison in the
total cost). With probability one, the pivot key will differ from the word M(α). If M(α) is
smaller than the pivot key, then the algorithm operates recursively on the set of keys smaller
than the pivot and determines the rank of the word M(α) in the set M_smaller ∪ {M(α)}, where M_smaller denotes the set of keys smaller than the pivot. Similarly, if M(α) is greater
than the pivot key, then the algorithm operates recursively on the set of keys larger than the
pivot [together with the word M(α)]. Eventually the set of words on which the algorithm
operates reduces to the singleton {M(α)}, and the algorithm terminates.
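The recursive procedure just described can be sketched as follows (our own illustrative code: real numbers stand in for words, random.choice plays the role of uniform pivot selection, and the pivot-vs-M(α) comparison is left uncounted, as in the text):

```python
import random

def quickval(keys, target):
    """Find the rank of `target` in keys + [target], QuickVal-style.

    Returns (rank, cost), where cost counts only pivot-vs-key comparisons;
    the single pivot-vs-target comparison per round is deliberately uncounted.
    """
    if not keys:
        return 1, 0
    pivot = random.choice(keys)
    rest = [k for k in keys if k != pivot]
    smaller = [k for k in rest if k < pivot]
    larger = [k for k in rest if k > pivot]
    cost = len(rest)  # the pivot is compared with each of the other keys
    if target < pivot:
        rank, sub = quickval(smaller, target)
        return rank, cost + sub
    rank, sub = quickval(larger, target)
    # keys at or below the pivot all precede target in the augmented set
    return len(smaller) + 1 + rank, cost + sub

random.seed(0)
keys = [random.random() for _ in range(200)]
rank, cost = quickval(keys, 0.5)
```

Since every key is compared with the first pivot, the cost is always at least n − 1.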
Notice that the operation of QuickVal is quite close to that of QuickQuant, for the same value of α; we expect running costs of the two algorithms to be close, since when n is large the rank of M(α) in {M_1, . . . , M_n, M(α)} should be close (in relative error terms) to αn. In fact, we will show that if S_n^V ≡ S_n^V(α) denotes the total cost of executing QuickVal(n, α), then S_n^Q/n and S_n^V/n have the same limiting distribution, assuming only that the cost function β is ε-tamed for suitably small ε. We will show that when all the random variables S_1^Q, S_2^Q, . . . and S_1^V, S_2^V, . . . are strategically defined on a common probability space, then S_n^Q/n and S_n^V/n both converge in L^p to a common limit for 1 ≤ p < ∞.
Chapter 6
Analysis of QuickVal
Following some preliminaries in Section 6.1, in Section 6.2 we show that for 1 ≤ p < ∞, a suitably defined S_n^V/n converges in L^p to a certain random variable S (defined at the end of Section 6.1) under a certain technical condition which reduces to E S < ∞ when p = 1. We also show that, when the cost function is suitably tamed, S_n^V/n converges almost surely to S; see Theorem 6.4 in Section 6.3.
6.1 Preliminaries
Our goal is to establish a limit, in various senses, for the ratio to n of the total cost required by QuickVal when applied to a file of n keys. It will be both natural and convenient to define all these total costs, one for each value of n, in terms of a single infinite sequence (U_i)_{i≥1} of seeds that are i.i.d. uniform(0, 1). Indeed, let L_0 := 0 and R_0 := 1.
For k ≥ 1, inductively define
\begin{align}
\tau_k &:= \inf\{i : L_{k-1} < U_i < R_{k-1}\}, \tag{6.1}\\
L_k &:= \mathbf{1}(U_{\tau_k} < \alpha)\, U_{\tau_k} + \mathbf{1}(U_{\tau_k} > \alpha)\, L_{k-1}, \tag{6.2}\\
R_k &:= \mathbf{1}(U_{\tau_k} < \alpha)\, R_{k-1} + \mathbf{1}(U_{\tau_k} > \alpha)\, U_{\tau_k}, \tag{6.3}\\
S_{n,k} &:= \sum_{i:\, \tau_k < i \le n} \mathbf{1}(L_{k-1} < U_i < R_{k-1})\, \beta(U_i, U_{\tau_k}). \tag{6.4}
\end{align}
(Note that S_{n,k} vanishes if τ_k ≥ n.) We then claim that, for each n,
\[
S_n^V := \sum_{k \ge 1} S_{n,k} \tag{6.5}
\]
has the distribution of the total cost required by QuickVal(n, α).
We offer some explanation here. For each k ≥ 1, the random interval (Lk−1, Rk−1)
(whose length decreases monotonically in k) contains both the target seed α and the seed
U_{τ_k} corresponding to the kth pivot; the interval contains precisely those seed values still
under consideration after k − 1 pivots have been performed. The only difference between
how we have defined S_n^V and how it is usually defined is that we have chosen the initial
pivot seed to be the first seed rather than a random one, and have made this same change
recursively. But our change is permissible because of the following basic probabilistic fact:
If U_1, . . . , U_N, M are independent random variables with U_1, . . . , U_N i.i.d. uniform(0, 1) and M uniformly distributed on {1, . . . , N}, then U_M, like U_1, is distributed uniform(0, 1). Thus the conditional distribution of U_{τ_k} given (L_{k−1}, R_{k−1}) is uniform(L_{k−1}, R_{k−1}).
We illustrate our notation for the first two pivots. First, τ_1 = 1; that is, the seed of the first pivot is the uniform(0, 1) random variable U_1. After that, if α < U_1 then the seed U_{τ_2} of the second pivot is chosen as the first seed falling in (0, U_1), while if α > U_1 then U_{τ_2} is the first seed falling in (U_1, 1). We note that if α = 0 (which means that we are dealing with the total cost required by QuickMin), then the first of these two cases is always the one that applies and so for every k ≥ 1 we have L_k = 0 and R_k = U_{τ_k}; we then have that U_{τ_k} is just the kth record low value among U_1, U_2, . . . .
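The recursion (6.1)–(6.4) translates directly into a simulation of S_n^V. The sketch below (ours; β ≡ 1, α = 0, and all parameter choices arbitrary) estimates E[S_n^V/n], which by Remark 6.3 and Section 5.2 should be near the Dickman mean 2:

```python
import random

def quickval_total_cost(seeds, alpha):
    """S_n^V of (6.5) with beta = 1: successive pivots chosen via (6.1)-(6.3),
    each round counting the later seeds inside (L_{k-1}, R_{k-1})."""
    n = len(seeds)
    lo, hi = 0.0, 1.0
    total = 0
    while True:
        tau = next((i for i in range(n) if lo < seeds[i] < hi), None)
        if tau is None:         # no eligible seed remains
            return total
        total += sum(1 for i in range(tau + 1, n) if lo < seeds[i] < hi)
        if seeds[tau] < alpha:  # update (6.2)-(6.3)
            lo = seeds[tau]
        else:
            hi = seeds[tau]

rng = random.Random(7)
n, reps = 1000, 200
avg = sum(quickval_total_cost([rng.random() for _ in range(n)], 0.0) / n
          for _ in range(reps)) / reps
```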
In order to describe the limit of S_n^V/n, we let
\begin{align}
I(t, x, y) &:= \int_x^y \beta(u, t)\, du,\nonumber\\
I_k &:= I(U_{\tau_k}, L_{k-1}, R_{k-1}), \tag{6.6}\\
S &:= \sum_{k \ge 1} I_k. \tag{6.7}
\end{align}
Notice that in the case β ≡ 1 of key comparisons we have I(t, x, y) ≡ y − x and so I_k = R_{k−1} − L_{k−1}.
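For β ≡ 1, where I_k = R_{k−1} − L_{k−1}, the limit S can be sampled by drawing each pivot seed uniform on the current interval, as justified above. In this sketch (ours; the truncation threshold and sample size are arbitrary) the empirical mean should approach E K(α) = 2[1 − α ln α − (1 − α) ln(1 − α)] from Section 5.2:

```python
import math
import random

def sample_S(rng, alpha, tol=1e-12):
    """One (truncated) sample of S = sum_{k>=1} (R_{k-1} - L_{k-1})."""
    lo, hi, s = 0.0, 1.0, 0.0
    while hi - lo > tol:
        s += hi - lo
        u = rng.uniform(lo, hi)  # pivot seed is uniform on the current interval
        if u < alpha:
            lo = u
        else:
            hi = u
    return s

rng = random.Random(3)
alpha = 0.3
est = sum(sample_S(rng, alpha) for _ in range(30000)) / 30000
exact = 2 * (1 - alpha * math.log(alpha) - (1 - alpha) * math.log(1 - alpha))
```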
In Section 6.2 we show that, for 1 ≤ p < ∞, S_n^V/n converges in L^p to S as n → ∞ under a suitable technical condition. Under a stronger assumption, we will also prove almost sure convergence in Section 6.3.
6.2 Convergence of S_n^V/n in L^p for 1 ≤ p < ∞
Theorem 6.1 is our main result concerning QuickVal. To state the result, we need the following notation, extending that of (6.6):
\begin{align}
I_p(t, x, y) &:= \int_x^y \beta^p(u, t)\, du,\nonumber\\
I_{p,k} &:= I_p(U_{\tau_k}, L_{k-1}, R_{k-1}). \tag{6.8}
\end{align}
Theorem 6.1. If 1 ≤ p < ∞ and
\[
\sum_{k \ge 1} (\mathbb{E}\, I_{p,k})^{1/p} < \infty, \tag{6.9}
\]
then S_n^V/n converges in L^p (and therefore also in probability and in distribution) to S as n → ∞.
Remark 6.2. For p = 1, notice that the assumption of Theorem 6.1 requires precisely that E S < ∞.
Proof. We use ‖ · ‖ to denote L^p-norm. As background, we recall that the L^p law of large numbers (L^pLLN) states that for 1 ≤ p < ∞ and i.i.d. random variables ξ_1, ξ_2, . . . with finite L^p-norm, the sample means ξ̄_n = n^{−1} Σ_{i=1}^n ξ_i converge in L^p to the expectation. To prove this, we may assume with no loss of generality that the expectation is 0, and then the bound
\[
\mathbb{P}\big(|\bar{\xi}_n|^p > c\big) \le c^{-1}\, \mathbb{E}\,|\bar{\xi}_n|^p \le c^{-1} \|\xi_1\|_p^p,
\]
following from Markov's inequality and the triangle inequality for L^p-norm, shows that the sequence (|ξ̄_n|^p) is uniformly integrable. So the L^pLLN follows from the better-known strong law of large numbers.
Returning to the setting of the theorem, fix k. Conditionally given the quadruple C_k := (L_{k−1}, R_{k−1}, τ_k, U_{τ_k}), the random variables U_i with i > τ_k are i.i.d. uniform(0, 1). By the L^pLLN we have [using the convention 0/0 = 0 for S_{n,k}/(n − τ_k) when n = τ_k]
\[
\mathbb{E}\left[\,\left|\frac{S_{n,k}}{n-\tau_k} - I_k\right|^p \,\middle|\, \mathcal{C}_k\right] \xrightarrow{\text{a.s.}} 0 \quad \text{as } n \to \infty \tag{6.10}
\]
since, with U uniformly distributed and independent of all the U_i's,
\[
\mathbb{E}\big[\mathbf{1}(L_{k-1} < U < R_{k-1})\, \beta(U, U_{\tau_k}) \,\big|\, \mathcal{C}_k\big] = I_k. \tag{6.11}
\]
For our conditional application of the L^pLLN in (6.10), it is sufficient to assume only that the probabilistic source and the cost function β ≥ 0 are such that I_{p,k} is a.s. finite, and this clearly holds by (6.9).
Our next goal is to show that the left side of (6.10) is dominated by a single random variable (depending on the fixed value of k) with finite expectation, and then we will apply the dominated convergence theorem. For every n, using the convexity of x^p for x > 0 we obtain
\[
\mathbb{E}\left[\,\left|\frac{S_{n,k}}{n-\tau_k} - I_k\right|^p \,\middle|\, \mathcal{C}_k\right] \le 2^{p-1}\left(\mathbb{E}\left[\left(\frac{S_{n,k}}{n-\tau_k}\right)^{\!p} \,\middle|\, \mathcal{C}_k\right] + I_k^p\right).
\]
We claim that each of the two terms multiplying 2^{p−1} on the right here is bounded by I_{p,k}. First, using the triangle inequality for conditional L^p-norm given C_k, the fact that the random variables summed to obtain S_{n,k} are conditionally i.i.d. given C_k, and the definition (6.8) of I_{p,k}, we can bound the pth root of the first term by
\begin{align}
\left\{\mathbb{E}\left[\left(\frac{S_{n,k}}{n-\tau_k}\right)^{\!p} \,\middle|\, \mathcal{C}_k\right]\right\}^{1/p}
&\le \frac{1}{n-\tau_k}\sum_{i:\,\tau_k < i \le n}\left\{\mathbb{E}\big[\mathbf{1}(L_{k-1} < U_i < R_{k-1})\, \beta^p(U_i, U_{\tau_k}) \,\big|\, \mathcal{C}_k\big]\right\}^{1/p}\nonumber\\
&= \left\{\mathbb{E}\big[\mathbf{1}(L_{k-1} < U < R_{k-1})\, \beta^p(U, U_{\tau_k}) \,\big|\, \mathcal{C}_k\big]\right\}^{1/p} = I_{p,k}^{1/p} \tag{6.12}
\end{align}
with U as at (6.11). For the second term we observe that [I_k/(R_{k−1} − L_{k−1})]^p is the pth power of the absolute value of a uniform average and so is bounded by the corresponding uniform average of absolute values of pth powers, namely, I_{p,k}/(R_{k−1} − L_{k−1}); thus
\[
I_k^p \le (R_{k-1} - L_{k-1})^{p-1}\, I_{p,k} \le I_{p,k}. \tag{6.13}
\]
So we conclude that
\[
\mathbb{E}\left[\,\left|\frac{S_{n,k}}{n-\tau_k} - I_k\right|^p \,\middle|\, \mathcal{C}_k\right] \le 2^p\, I_{p,k}.
\]
Thus it follows from E I_{p,k} < ∞ [which follows from (6.9)] and the dominated convergence theorem that
\[
\mathbb{E}\left|\frac{S_{n,k}}{n-\tau_k} - I_k\right|^p \to 0 \quad \text{as } n \to \infty. \tag{6.14}
\]
Next, we will show from (6.14) that, for each k,
\[
\mathbb{E}\left|\frac{S_{n,k}}{n} - I_k\right|^p \to 0 \quad \text{as } n \to \infty \tag{6.15}
\]
by proving that
\[
d_{n,k} \equiv d_{p,n,k} := \mathbb{E}\left|\frac{S_{n,k}}{n} - \frac{S_{n,k}}{n-\tau_k}\right|^p = \mathbb{E}\left(\frac{\tau_k}{n} \cdot \frac{S_{n,k}}{n-\tau_k}\right)^{\!p}
\]
vanishes in the limit as n → ∞. Indeed, the corresponding conditional expectation given C_k is
\[
\mathbf{1}(\tau_k < n)\left(\frac{\tau_k}{n}\right)^{\!p}\, \mathbb{E}\left[\left(\frac{S_{n,k}}{n-\tau_k}\right)^{\!p} \,\middle|\, \mathcal{C}_k\right] \le \mathbf{1}(\tau_k < n)\left(\frac{\tau_k}{n}\right)^{\!p} I_{p,k},
\]
recalling the inequality (6.12). So again using E I_{p,k} < ∞ and applying the dominated convergence theorem we find that d_{n,k} → 0, as desired.
Finally, we show that S_n^V/n converges to S in L^p. Since we have termwise L^p-convergence of S_n^V/n to S by (6.15), the triangle inequality for L^p-norm and the dominated convergence theorem for sums imply that S_n^V/n converges in L^p to S provided we can find a summable sequence b_k such that
\[
\max\left\{\sup_{n \ge 1}\left\|\frac{S_{n,k}}{n}\right\|_p,\ \|I_k\|_p\right\} \le b_k.
\]
But, for any n ≥ 1, we have [by taking pth powers in (6.12), then taking expectations, then taking pth roots]
\[
\left\|\frac{S_{n,k}}{n}\right\|_p \le \left\|\frac{S_{n,k}}{n-\tau_k}\right\|_p \le (\mathbb{E}\, I_{p,k})^{1/p}.
\]
Further, ‖I_k‖_p ≤ (E I_{p,k})^{1/p} follows from (6.13). Finally, b_k := (E I_{p,k})^{1/p} is assumed to be summable. Thus S_n^V/n converges to S in L^p. □
Remark 6.3. Letting K_n denote the number of key comparisons required by QuickVal(n, α), we find from Theorem 6.1 with β ≡ 1 and p = 1 that K_n/n converges in L^1 to
\[
K := \sum_{k \ge 0} (R_k - L_k).
\]
(In Section 8.1, we will explicitly show the required condition that E K < ∞; see Remark 8.2.)

Suppose α = 0; then the number of key comparisons K_n for QuickVal(n, α) is the same as for QuickMin. In this case Theorem 6.1 with p = 1 gives
\[
\frac{K_n}{n} \xrightarrow{L^1} K = 1 + \sum_{k \ge 1} U_{\tau_k}. \tag{6.16}
\]
The limiting random variable K has the same so-called Dickman distribution as the perpetuity
\[
1 + \sum_{k \ge 1} U_1 \cdots U_k. \tag{6.17}
\]
That (6.16)–(6.17) holds is well known (e.g., Mahmoud et al. [19], Hwang and Tsai [14]).
6.3 Almost sure convergence of S_n^V/n
Under a tameness assumption, we can also show that S_n^V/n converges to S almost surely. (Recall Definition 5.4.)

Theorem 6.4. Suppose that the cost β is ε-tamed for some ε < 1/4. Then S_n^V/n defined at (6.5) converges to S almost surely.
Before proving this theorem, we establish three lemmas bounding various quantities of interest.

Lemma 6.5. For any p > 0 and k ≥ 1, we have
\[
\mathbb{E}(R_k - L_k)^p \le \left(\frac{2 - 2^{-p}}{p+1}\right)^{\!k}.
\]
Here note that for all p > 0 we have
\[
0 < \frac{2 - 2^{-p}}{p+1} < 1. \tag{6.18}
\]
Proof. Fix p > 0 and k ≥ 1. Since R_0 − L_0 = 1, it is sufficient to prove that
\[
\mathbb{E}\big[(R_k - L_k)^p \,\big|\, L_{k-1}, R_{k-1}\big] \le \frac{2 - 2^{-p}}{p+1}\,(R_{k-1} - L_{k-1})^p.
\]
Condition on (L_{k−1}, R_{k−1}); then with U uniformly distributed over (L_{k−1}, R_{k−1}) we have the stochastic inequality
\[
R_k - L_k \le_{\mathrm{st}} \max\{U - L_{k-1},\, R_{k-1} - U\}.
\]
Thus for L_{k−1} ≠ R_{k−1}, with A_{k−1} := (L_{k−1} + R_{k−1})/2, we have
\begin{align*}
\mathbb{E}\big[(R_k - L_k)^p \,\big|\, L_{k-1}, R_{k-1}\big]
&\le \mathbb{E}\big[(\max\{U - L_{k-1}, R_{k-1} - U\})^p \,\big|\, L_{k-1}, R_{k-1}\big]\\
&= (R_{k-1} - L_{k-1})^{-1}\left[\int_{L_{k-1}}^{A_{k-1}} (R_{k-1} - u)^p\, du + \int_{A_{k-1}}^{R_{k-1}} (u - L_{k-1})^p\, du\right]\\
&= \frac{2 - 2^{-p}}{p+1}\,(R_{k-1} - L_{k-1})^p,
\end{align*}
as desired. □
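Lemma 6.5 is easy to probe by Monte Carlo (our own sketch; parameter choices arbitrary). For α = 0 the length ratio at each step is uniform(0, 1), so E(R_k − L_k)^p = (p + 1)^{−k} exactly, well inside the bound; for an interior α such as 0.3 the bound is closer to sharp but should still hold:

```python
import random

def interval_length(rng, alpha, k):
    """R_k - L_k after k pivot steps; each pivot seed is drawn uniform on the
    current interval, as the conditional distribution in Section 6.1 allows."""
    lo, hi = 0.0, 1.0
    for _ in range(k):
        u = rng.uniform(lo, hi)
        if u < alpha:
            lo = u
        else:
            hi = u
    return hi - lo

rng = random.Random(1)
p, k, trials = 2.0, 5, 50000
bound = ((2 - 2 ** (-p)) / (p + 1)) ** k
est0 = sum(interval_length(rng, 0.0, k) ** p for _ in range(trials)) / trials
est3 = sum(interval_length(rng, 0.3, k) ** p for _ in range(trials)) / trials
```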
Lemma 6.6. Suppose that the cost β is tamed with parameters ε and c. Then for any interval (a, b) ⊆ (0, 1), any t ∈ (a, b), and any 0 ≤ q < 1/ε, we have
\[
\int_a^b \beta^q(u, t)\, du \le \frac{2^{q\varepsilon} c^q}{1 - q\varepsilon}\,(b - a)^{1 - q\varepsilon}.
\]
Proof. Using the tameness assumption, integration immediately gives
\[
\int_a^b \beta^q(u, t)\, du \le \frac{c^q}{1 - q\varepsilon}\left[(t - a)^{1 - q\varepsilon} + (b - t)^{1 - q\varepsilon}\right].
\]
The lemma now follows from the concavity of x^{1−qε} for x > 0. □
The next lemma is a simple consequence of the preceding two.

Lemma 6.7. Suppose that the cost β is tamed with parameters ε < 1 and c. Then for any k ≥ 1 and any q > 0, we have
\[
\mathbb{E}\, I_k^q \le \left(\frac{2^\varepsilon c}{1-\varepsilon}\right)^{\!q}\left(\frac{2 - 2^{-q(1-\varepsilon)}}{q(1-\varepsilon)+1}\right)^{\!k-1},
\]
and so Σ_k E I_k^q < ∞ geometrically quickly.

Proof. Recalling
\[
I_k = \int_{L_{k-1}}^{R_{k-1}} \beta(u, U_{\tau_k})\, du,
\]
we find from Lemma 6.6 (with q = 1) that
\[
I_k \le \frac{2^\varepsilon c}{1-\varepsilon}\,(R_{k-1} - L_{k-1})^{1-\varepsilon}.
\]
By application of Lemma 6.5 we thus obtain the desired bound on E I_k^q. The series-convergence assertion follows from the observation (6.18). □
Now we prove Theorem 6.4.

Proof of Theorem 6.4. Clearly it suffices to show that
\[
\frac{S_n^V}{n} - \frac{\bar{S}_n}{n} \xrightarrow{\text{a.s.}} 0 \tag{6.19}
\]
and
\[
\frac{\bar{S}_n}{n} - S \xrightarrow{\text{a.s.}} 0, \tag{6.20}
\]
where
\[
\bar{S}_n := \sum_{k \ge 1} (n - \tau_k)^+ I_k.
\]
We tackle (6.20) first and then (6.19).

By the monotone convergence theorem, S̄_n/n ↑ S almost surely. But from Lemma 6.7 (using only ε < 1) we have E S = Σ_{k≥1} E I_k < ∞, which implies that S < ∞ almost surely. Hence (6.20) follows.
Our proof of (6.19) both is inspired by and follows along the same lines as the
“fourth-moment proof” of the strong law of large numbers described in Ross [26, Chap-
ter 8]; as in that proof, we prefer easy calculations involving fourth moments to more
difficult ones involving tail probabilities, perhaps at the expense that the value 1/4 in
the statement of Theorem 6.4 could be raised by more sophisticated arguments. For (6.19)
it suffices to show that, for any δ > 0,
\[
\mathbb{P}\left(\left|\frac{S_n^V}{n} - \frac{\bar{S}_n}{n}\right| > \delta \text{ i.o.}\right) = 0,
\]
for which it is sufficient by the first Borel–Cantelli lemma and Markov's inequality to show that
\[
\sum_{n \ge 1} \mathbb{E}\left(\frac{S_n^V}{n} - \frac{\bar{S}_n}{n}\right)^{\!4} < \infty. \tag{6.21}
\]
Here, by the triangle inequality for the L^4-norm,
\begin{align}
\sum_{n \ge 1} \mathbb{E}\left(\frac{S_n^V}{n} - \frac{\bar{S}_n}{n}\right)^{\!4}
&\le \sum_{n \ge 1}\left[\sum_{k \ge 1}\left\|\frac{S_{n,k}}{n} - \frac{(n-\tau_k)^+}{n}\, I_k\right\|_4\right]^4\nonumber\\
&= \sum_{n \ge 1}\left[\sum_{k \ge 1}\left\|\frac{(n-\tau_k)^+}{n}\left(\frac{S_{n,k}}{n-\tau_k} - I_k\right)\right\|_4\right]^4, \tag{6.22}
\end{align}
where we again use the convention 0/0 = 0 for S_{n,k}/(n − τ_k) when n = τ_k. As in the proof of Theorem 6.1, we let C_k denote the quadruple (L_{k−1}, R_{k−1}, τ_k, U_{τ_k}). Also we define
\[
\bar{I}_k := \mathbf{1}(L_{k-1} < U < R_{k-1})\, \beta(U, U_{\tau_k})
\]
and
\[
M_m(k) := \mathbb{E}\big[(\bar{I}_k - I_k)^m \,\big|\, \mathcal{C}_k\big],
\]
where U is unif(0, 1) and independent of C_k. Then routine calculation (see Ross [26, Section 8.4]) shows that
\begin{align}
\mathbb{E}\left[\frac{(n-\tau_k)^+}{n}\left(\frac{S_{n,k}}{n-\tau_k} - I_k\right)\right]^4
&= \mathbb{E}\left[\mathbb{E}\left[\left\{\frac{(n-\tau_k)^+}{n}\left(\frac{S_{n,k}}{n-\tau_k} - I_k\right)\right\}^4 \,\middle|\, \mathcal{C}_k\right]\right]\nonumber\\
&= \mathbb{E}\left\{\left[\frac{(n-\tau_k)^+}{n}\right]^4 \frac{(n-\tau_k)^+ M_4(k) + 3 (n-\tau_k)^+ (n-\tau_k-1)^+ M_2^2(k)}{[(n-\tau_k)^+]^4}\right\}\nonumber\\
&\le \mathbb{E}\left\{n^{-4}\left[n M_4(k) + 3 n (n-1) M_4(k)\right]\right\} \le 3 n^{-2}\, \mathbb{E}\, M_4(k), \tag{6.23}
\end{align}
where the first inequality holds because M_4(k) ≥ M_2^2(k).

We will show that E M_4(k) decays geometrically and then use that fact to prove (6.21). Since (a − b)^4 ≤ 8(a^4 + b^4) for any real a and b, we have
\[
M_4(k) \le 8\left(\mathbb{E}\big[\bar{I}_k^{\,4} \,\big|\, \mathcal{C}_k\big] + I_k^4\right). \tag{6.24}
\]
First, using Lemma 6.7 (which requires only ε < 1) we find that E I_k^4 decays geometrically:
\[
\mathbb{E}\, I_k^4 \le \left(\frac{2^\varepsilon c}{1-\varepsilon}\right)^{\!4}\left(\frac{2 - 2^{-4(1-\varepsilon)}}{5 - 4\varepsilon}\right)^{\!k-1}. \tag{6.25}
\]
Now we analyze, in similar fashion, E[Ī_k^4 | C_k] in (6.24). Using the assumption 0 < ε < 1/4 and Lemma 6.6 we find
\[
\mathbb{E}\big[\bar{I}_k^{\,4} \,\big|\, \mathcal{C}_k\big] \le \frac{2^{4\varepsilon} c^4}{1 - 4\varepsilon}\,(R_{k-1} - L_{k-1})^{1-4\varepsilon}.
\]
Applying Lemma 6.5 thus gives the geometric decay
\[
\mathbb{E}\, \bar{I}_k^{\,4} \le \frac{2^{4\varepsilon} c^4}{1 - 4\varepsilon}\left(\frac{2 - 2^{-(1-4\varepsilon)}}{2 - 4\varepsilon}\right)^{\!k-1}. \tag{6.26}
\]
Therefore, it follows from (6.22)–(6.23) and (6.25)–(6.26) that (6.21) holds:
\[
\sum_{n \ge 1} \mathbb{E}\left(\frac{S_n^V}{n} - \frac{\bar{S}_n}{n}\right)^{\!4} \le 3\left(\sum_{n \ge 1} n^{-2}\right)\left[\sum_{k \ge 1} (\mathbb{E}\, M_4(k))^{1/4}\right]^4 < \infty.
\]
This completes the proof of Theorem 6.4. □
Chapter 7
Analysis of QuickQuant
As described in Chapter 5, our analysis of QuickQuant is closely related to that of QuickVal described in Chapter 6. Therefore we will continue to use or extend the framework and notation established in Chapters 5–6, including the limit S of S_n^V/n defined at (6.7).

Following some preliminaries in Section 7.1, in Section 7.2 we show that a suitably defined S_n^Q/n converges in L^p to S for 1 ≤ p < ∞ provided that the cost function β is ε-tamed with ε < 1/p; hence S_n^Q/n and S_n^V/n have the same limiting distribution provided only that the cost function β is ε-tamed for suitably small ε. We will examine the expectation of S in Chapter 8 and recover previously obtained results reviewed in Section 5.2.
7.1 Preliminaries
We will closely follow the framework described in Chapter 6 for the analysis of QuickVal and construct a random variable, call it S_n^Q, that has the distribution of the total cost required by QuickQuant when applied to a file of n keys. Our goal is to show that, under suitable technical conditions, S_n^Q/n converges in L^p to S defined at (6.7).
Again, we define S_n^Q in terms of an infinite sequence (U_i)_{i≥1} of seeds that are i.i.d. uniform(0, 1). Let m_n (with m_n/n → α) denote our target rank for QuickQuant. Let τ_k(n) denote the index of the seed that corresponds to the kth pivot. As in Section 6.1 we will set the first pivot index τ_1(n) to 1 rather than to a randomly chosen integer from {1, . . . , n}. For k ≥ 1, we will use L_{k−1}(n) and R_{k−1}(n), as defined below, to denote the lower and upper bounds, respectively, of seeds of words that are eligible to be compared with the kth pivot. [Notice that τ_k(n), L_k(n), and R_k(n) are analogous to τ_k, L_k, and R_k defined in Section 6.1; see (6.1)–(6.3).] Hence we let L_0(n) := 0 and R_0(n) := 1, and for k ≥ 1 we inductively define
\[
\tau_k(n) := \inf\{i \le n : L_{k-1}(n) < U_i < R_{k-1}(n)\},
\]
and
\begin{align*}
L_k(n) &:= \mathbf{1}(\mathrm{pivrank}_k(n) \le m_n)\, U_{\tau_k(n)} + \mathbf{1}(\mathrm{pivrank}_k(n) > m_n)\, L_{k-1}(n),\\
R_k(n) &:= \mathbf{1}(\mathrm{pivrank}_k(n) \ge m_n)\, U_{\tau_k(n)} + \mathbf{1}(\mathrm{pivrank}_k(n) < m_n)\, R_{k-1}(n)
\end{align*}
if τ_k(n) < ∞, but
\[
(L_k(n), R_k(n)) := (L_{k-1}(n), R_{k-1}(n))
\]
if τ_k(n) = ∞. Here pivrank_k(n) denotes the rank of the kth pivot seed U_{τ_k(n)} if τ_k(n) < ∞ and m_n otherwise. Recall that the infimum of the empty set is ∞; hence τ_k(n) = ∞ if and only if L_{k−1}(n) = R_{k−1}(n).
Using this notation, let
\[
S_{n,k}^Q := \sum_{i:\, \tau_k(n) < i \le n} \mathbf{1}(L_{k-1}(n) < U_i < R_{k-1}(n))\, \beta(U_i, U_{\tau_k(n)})
\]
be the total cost of all comparisons (for the first n keys) with the kth pivot key. Then
\[
S_n^Q := \sum_{k \ge 1} S_{n,k}^Q \tag{7.1}
\]
has the distribution of the total cost required by QuickQuant.
Notice that the expression (7.1) is analogous to (6.5). In fact, we will prove the L^p-convergence of S_n^Q/n to S for 1 ≤ p < ∞ by comparing the corresponding expressions for QuickVal and QuickQuant.
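The construction of S_n^Q can be simulated in the same style as for QuickVal. The sketch below (ours; β ≡ 1, with arbitrary parameter choices) estimates E[S_n^Q/n] and compares it with the key-comparison limit E K(α) = 2[1 − α ln α − (1 − α) ln(1 − α)] reviewed in Section 5.2:

```python
import math
import random

def quickquant_total_cost(seeds, m):
    """S_n^Q of (7.1) with beta = 1: the pivot is the first eligible seed, and
    the interval is updated via pivrank_k(n) as in Section 7.1."""
    n = len(seeds)
    lo, hi = 0.0, 1.0
    total = 0
    while lo < hi:
        tau = next((i for i in range(n) if lo < seeds[i] < hi), None)
        if tau is None:
            break
        pivot = seeds[tau]
        total += sum(1 for i in range(tau + 1, n) if lo < seeds[i] < hi)
        rank = 1 + sum(1 for u in seeds if u < pivot)  # pivrank_k(n)
        if rank <= m:
            lo = pivot
        if rank >= m:  # rank == m sets lo = hi = pivot, ending the search
            hi = pivot
    return total

rng = random.Random(11)
n, reps, alpha = 1000, 150, 0.5
m = int(alpha * n) + 1
avg = sum(quickquant_total_cost([rng.random() for _ in range(n)], m) / n
          for _ in range(reps)) / reps
limit = 2 * (1 - alpha * math.log(alpha) - (1 - alpha) * math.log(1 - alpha))
```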
7.2 Convergence of S_n^Q/n in L^p for 1 ≤ p < ∞
The following is our main theorem regarding QuickQuant.
Theorem 7.1. Let 1 ≤ p < ∞. Suppose that the cost function β is ε-tamed with ε < 1/p. Then S_n^Q/n converges in L^p to S.

Remark 7.2. Note that as p increases, getting L^p-convergence requires the increasingly stronger condition ε < 1/p. Thus we have convergence of moments of all orders provided the source is Π-tamed (with parameter γ) for every γ > 0; for example, if it is g-tamed as in Remark 5.5(a) (the source is g-tamed for memoryless and most Markov sources).
The proof of Theorem 7.1 will make use of the following analogue of Lemma 6.5, whose proof is essentially the same and therefore omitted.

Lemma 7.3. For any p > 0, k ≥ 1, and n ≥ 1, we have
\[
\mathbb{E}(R_k(n) - L_k(n))^p \le \left(\frac{2 - 2^{-p}}{p+1}\right)^{\!k}.
\]
Proof of Theorem 7.1. Part of our strategy in proving this theorem is to compare QuickQuant with QuickVal. Hence we will frequently refer to the notation established in Section 6.1 for the analysis of QuickVal. For each k, observe that as n → ∞ we have
\[
\tau_k(n) \xrightarrow{\text{a.s.}} \tau_k, \qquad U_{\tau_k(n)} \xrightarrow{\text{a.s.}} U_{\tau_k}, \qquad L_k(n) \xrightarrow{\text{a.s.}} L_k, \qquad R_k(n) \xrightarrow{\text{a.s.}} R_k,
\]
where τ_k, L_k, and R_k are defined in Section 6.1 [see (6.1)–(6.3)]. (In fact, in each of these four cases of convergence, the left-hand side almost surely becomes equal to its limit for all sufficiently large n.) Thus for each k ≥ 1 we have
\[
S_{n,k}^Q - S_{n,k} \xrightarrow{\text{a.s.}} 0, \tag{7.2}
\]
where S_{n,k} is defined at (6.4); indeed, again the difference almost surely vanishes for all sufficiently large n. In proving Theorem 6.1, we showed [at (6.15)] that
\[
\frac{S_{n,k}}{n} \xrightarrow{L^p} I_k,
\]
where I_k is defined at (6.6), and it is somewhat easier (by means of conditional application of the strong law of large numbers, rather than the L^p law of large numbers, together with Fubini's theorem) to show that
\[
\frac{S_{n,k}}{n} \xrightarrow{\text{a.s.}} I_k. \tag{7.3}
\]
Combining (7.2) and (7.3), for each k ≥ 1 we have
\[
\frac{S_{n,k}^Q}{n} \xrightarrow{\text{a.s.}} I_k. \tag{7.4}
\]
What we want to show is that
\[
\frac{S_n^Q}{n} = \sum_{k \ge 1} \frac{S_{n,k}^Q}{n} \xrightarrow{L^p} \sum_{k \ge 1} I_k = S. \tag{7.5}
\]
Choose any sequence (a_k)_{k≥1} of positive numbers summing to 1, and let A be the probability measure on the positive integers with this probability mass function. Then, once again using the fact that the pth power of the absolute value of an average is bounded by the average of pth powers of absolute values,
\[
\left|\frac{S_n^Q}{n} - S\right|^p \le \left[\sum_{k \ge 1}\left|\frac{S_{n,k}^Q}{n} - I_k\right|\right]^p = \left[\sum_{k \ge 1} a_k\, a_k^{-1}\left|\frac{S_{n,k}^Q}{n} - I_k\right|\right]^p \le \sum_{k \ge 1} a_k\, a_k^{-p}\left|\frac{S_{n,k}^Q}{n} - I_k\right|^p.
\]
So for (7.5) it suffices to prove that, with respect to the product probability P × A, as n → ∞ the sequence
\[
a_k^{-p}\left|\frac{S_{n,k}^Q}{n} - I_k\right|^p
\]
converges in L^1 to 0. What we know from (7.4) is that the sequence converges almost surely with respect to P × A.
Now almost sure convergence together with boundedness in L^{1+δ} is, for any δ > 0, sufficient for convergence in L^1, because the boundedness condition implies uniform integrability (e.g., Chung [2, Exercise 4.5.8]). Thus our proof is reduced to showing that, for some q > p, the sequence
\[
\sum_{k \ge 1} a_k^{1-q}\, \mathbb{E}\left|\frac{S_{n,k}^Q}{n} - I_k\right|^q
\]
is bounded in n, for a suitably chosen probability mass function (a_k). Indeed, by convexity of the qth power,
\[
2^{1-q}\sum_{k \ge 1} a_k^{1-q}\, \mathbb{E}\left|\frac{S_{n,k}^Q}{n} - I_k\right|^q \le \sum_{k \ge 1} a_k^{1-q}\, \mathbb{E}\left|\frac{S_{n,k}^Q}{n}\right|^q + \sum_{k \ge 1} a_k^{1-q}\, \mathbb{E}\, I_k^q, \tag{7.6}
\]
and we will show that each sum on the right-hand side of (7.6) is bounded in order to prove the theorem. The value of q that we use can be any satisfying ε < 1/q < 1/p.
First we recall from Lemma 6.7 that
\[
\mathbb{E}\, I_k^q \le \left(\frac{2^\varepsilon c}{1-\varepsilon}\right)^{\!q}\left(\frac{2 - 2^{-q(1-\varepsilon)}}{q(1-\varepsilon)+1}\right)^{\!k-1}, \qquad k \ge 1, \tag{7.7}
\]
with geometric decay. Thus the second sum on the right in (7.6) is finite if the cost is ε-tamed with ε < 1 and the sequence (a_k) is suitably chosen not to decay too quickly.

Next we analyze E|S_{n,k}^Q/n|^q for the first sum on the right in (7.6). Let
\[
\nu_{k-1}(n) := |\{i : L_{k-1}(n) < U_i < R_{k-1}(n),\ \tau_k(n) < i \le n\}|.
\]
Until further notice our calculations are done only over the event {ν_{k−1}(n) > 0}. Then, bounding the qth power of the absolute value of an average by the average of qth powers of
absolute values,
\begin{align}
\left|\frac{S_{n,k}^Q}{n}\right|^q
&= \left|\frac{1}{\nu_{k-1}(n)}\sum_{i:\,L_{k-1}(n) < U_i < R_{k-1}(n)} \mathbf{1}(\tau_k(n) < i \le n)\, \beta(U_i, U_{\tau_k(n)})\right|^q \times \left(\frac{\nu_{k-1}(n)}{n}\right)^{\!q}\nonumber\\
&\le \frac{1}{\nu_{k-1}(n)}\sum_{i:\,L_{k-1}(n) < U_i < R_{k-1}(n)} \mathbf{1}(\tau_k(n) < i \le n)\, \beta^q(U_i, U_{\tau_k(n)}) \times \left(\frac{\nu_{k-1}(n)}{n}\right)^{\!q}. \tag{7.8}
\end{align}
Let D_k(n) denote the quintuple (L_{k−1}(n), R_{k−1}(n), τ_k(n), U_{τ_k(n)}, ν_{k−1}(n)), and notice that, conditionally given D_k(n), the ν_{k−1}(n) values U_i appearing in (7.8) are i.i.d. unif(L_{k−1}(n), R_{k−1}(n)). Using (7.8), we bound the conditional expectation of |S_{n,k}^Q/n|^q given D_k(n). We have
\[
\mathbb{E}\left[\left|\frac{S_{n,k}^Q}{n}\right|^q \,\middle|\, D_k(n)\right] \le [R_{k-1}(n) - L_{k-1}(n)]^{-1}\int_{L_{k-1}(n)}^{R_{k-1}(n)} \beta^q(u, U_{\tau_k(n)})\, du \times \left(\frac{\nu_{k-1}(n)}{n}\right)^{\!q}. \tag{7.9}
\]
Under ε-tameness of β with ε < 1/q, we find from Lemma 6.6 that
\[
\int_{L_{k-1}(n)}^{R_{k-1}(n)} \beta^q(u, U_{\tau_k(n)})\, du \le \frac{2^{q\varepsilon} c^q}{1 - q\varepsilon}\,[R_{k-1}(n) - L_{k-1}(n)]^{1-q\varepsilon}. \tag{7.10}
\]
From (7.9)–(7.10), it follows that if ε < 1/q, then
\[
\mathbb{E}\left[\left|\frac{S_{n,k}^Q}{n}\right|^q \,\middle|\, D_k(n)\right] \le \frac{2^{q\varepsilon} c^q}{1 - q\varepsilon}\,[R_{k-1}(n) - L_{k-1}(n)]^{q-q\varepsilon}\left(\frac{\nu_{k-1}(n)}{n\,(R_{k-1}(n) - L_{k-1}(n))}\right)^{\!q}.
\]
Until this point we have worked only over the event {ν_{k−1}(n) > 0}, but now we enlarge our scope to the event {L_{k−1}(n) < R_{k−1}(n)} and note that the preceding inequality holds there, as well.
Next notice that, conditionally given the triple
\[
\bar{D}_k(n) := (L_{k-1}(n), R_{k-1}(n), \tau_k(n)),
\]
the values U_i with τ_k(n) < i ≤ n are i.i.d. unif(0, 1), and so the number of them falling in the interval (L_{k−1}(n), R_{k−1}(n)) is distributed binomial(m, t) with m = n − τ_k(n) and t = R_{k−1}(n) − L_{k−1}(n), and (representing a binomial as a sum of independent Bernoulli random variables and applying the triangle inequality for L^q) has moment of order q bounded by m^q t. Thus
\[
\mathbb{E}\left[\left(\frac{\nu_{k-1}(n)}{n\,(R_{k-1}(n) - L_{k-1}(n))}\right)^{\!q} \,\middle|\, \bar{D}_k(n)\right] \le [R_{k-1}(n) - L_{k-1}(n)]^{1-q},
\]
so that
\[
\mathbb{E}\left[\left|\frac{S_{n,k}^Q}{n}\right|^q \,\middle|\, \bar{D}_k(n)\right] \le \frac{2^{q\varepsilon} c^q}{1 - q\varepsilon}\,[R_{k-1}(n) - L_{k-1}(n)]^{1-q\varepsilon}.
\]
Since this inequality holds even when L_{k−1}(n) = R_{k−1}(n), we can take expectations to conclude
\[
\mathbb{E}\left|\frac{S_{n,k}^Q}{n}\right|^q \le \frac{2^{q\varepsilon} c^q}{1 - q\varepsilon}\,\mathbb{E}\,[R_{k-1}(n) - L_{k-1}(n)]^{1-q\varepsilon} \le \frac{2^{q\varepsilon} c^q}{1 - q\varepsilon}\left(\frac{2 - 2^{-(1-q\varepsilon)}}{2 - q\varepsilon}\right)^{\!k-1}, \tag{7.11}
\]
where at the second inequality we have employed Lemma 7.3.
where at the second inequality we have employed Lemma 7.3.
From (7.7) and (7.11) we see that we can choose (a_k) to be the geometric distribution a_k = (1 − θ)θ^{k−1}, k ≥ 1, with
\[
\frac{2 - 2^{-q(1-\varepsilon)}}{q(1-\varepsilon)+1} < \theta < 1.
\]
We then conclude that Σ_{k≥1} a_k^{1−q} E|(S_{n,k}^Q/n) − I_k|^q is bounded in n, and therefore that S_n^Q/n converges to S in L^p, if the cost function is ε-tamed with ε < 1/p. □
Remark 7.4. Although the contraction method has been used in finding limiting distributions for the number of key comparisons required by recursive algorithms such as QuickSort (e.g., Rösler [24], Rösler and Rüschendorf [25]), our analysis of QuickVal and QuickQuant does not depend on it. In examining convergence for the number of key comparisons used by QuickQuant, Grübel and Rösler [11] mentioned that they did not use the contraction method due to the parameter that represents target rank. (However, they did engage in contraction arguments to characterize the limiting distribution.) Interestingly, Mahmoud et al. [19] succeeded in establishing fixed-point equations to identify the limiting distributions of the normalized numbers of key comparisons required by QuickRand, QuickMin, and QuickMax. Régnier [23] used martingales to show convergence for the number of key comparisons required by QuickSort.
Chapter 8
Analysis of ES
In this chapter, we examine the expectation of the common limit S of S_n^V/n and S_n^Q/n described in Chapters 6–7. First, in Section 8.1, we derive an integral expression for E S that is valid for any symmetric cost function β ≥ 0. Then, in Section 8.2, we restrict to the cost function β_symb corresponding to the number of symbol comparisons and derive a series expression for E S. We use these expressions to recapture the previously known results reviewed in Section 5.2. The integral expression will be used to verify the assumption of Theorem 6.1 as promised in Remark 6.3, which recovers the result of [11, Theorem 8] (in a cosmetically different, but equivalent, form; compare [10, Theorem 3]) for the limiting distribution of the number of key comparisons. In Remark 8.2 we again use the integral expression to recover first-moment information for the same. Finally, recalling that L^1-convergence implies convergence of means, we use the series expression for E S to recover at least the leading-order terms in the asymptotics of Fill and Nakama [6] and also of Vallée et al. [27] discussed at (5.3).
8.1 Computation of ES: an integral expression
In this section we derive the following simple double-integral expression for ES in
terms of the cost function β.
Theorem 8.1. For any symmetric cost function β ≥ 0 we have
\[
ES = 2 \int\!\!\int_{0 < u < t < 1} \beta(u, t)\, [(\alpha \vee t) - (\alpha \wedge u)]^{-1}\,du\,dt.
\]
Proof. Recall that $ES = \sum_{k \ge 1} E I_k$, where
\[
I_k = \int_{L_{k-1}}^{R_{k-1}} \beta(u, U_{\tau_k})\,du.
\]
Recall also that, for each k, the conditional distribution of $U_{\tau_k}$ given $L_{k-1}$ and $R_{k-1}$ is uniform$(L_{k-1}, R_{k-1})$. Thus
\[
\begin{split}
E I_k &= E \int_{L_{k-1}}^{R_{k-1}} (R_{k-1} - L_{k-1})^{-1} \int_{L_{k-1}}^{R_{k-1}} \beta(u, w)\,dw\,du \\
&= \int_{0 < w, u < 1} \beta(w, u)\, E\bigl[(R_{k-1} - L_{k-1})^{-1}\, 1(L_{k-1} < u, w < R_{k-1})\bigr]\,dw\,du \\
&= 2 \int_{0 < w < u < 1} \beta(w, u) \int_{0 \le x < \alpha < y \le 1} (y - x)^{-1}\, 1(x < w < u < y)\, P(L_{k-1} \in dx, R_{k-1} \in dy)\,dw\,du.
\end{split}
\]
Hence
\[
ES = 2 \int_{0 < w < u < 1} \beta(w, u) \int_{0 \le x < \alpha < y \le 1} (y - x)^{-1}\, 1(x < w < u < y)\, \nu(dx, dy)\,dw\,du, \tag{8.1}
\]
where ν is the measure
\[
\nu(dx, dy) := \sum_{k \ge 0} P(L_k \in dx, R_k \in dy). \tag{8.2}
\]
As established in Appendix B (see Proposition B.1), one has the tractable expression
\[
\nu(dx, dy) = \delta_0(dx)\,\delta_1(dy) + (1 - x)^{-1}\,dx\,\delta_1(dy) + \delta_0(dx)\,y^{-1}\,dy + 2(y - x)^{-2}\,dx\,dy.
\]
Using this expression, we compute the inner integral in (8.1). For 0 < w < u < 1,
\[
\begin{split}
\int_{0 \le x < \alpha < y \le 1} (y - x)^{-1}\, &1(x < w < u < y)\, \nu(dx, dy) \\
&= 1 + \int_0^{\alpha \wedge w} (1 - x)^{-2}\,dx + \int_{\alpha \vee u}^1 y^{-2}\,dy \\
&\quad + 2 \int_{0 \le x < \alpha < y \le 1} (y - x)^{-3}\, 1(x < w < u < y)\,dx\,dy.
\end{split} \tag{8.3}
\]
Since
\[
\int_0^{\alpha \wedge w} (1 - x)^{-2}\,dx = [1 - (\alpha \wedge w)]^{-1} - 1, \qquad
\int_{\alpha \vee u}^1 y^{-2}\,dy = (\alpha \vee u)^{-1} - 1,
\]
and
\[
\begin{split}
2 \int_{0 \le x < \alpha < y \le 1} (y - x)^{-3}\, 1(x < w < u < y)\,dx\,dy
&= 2 \int_{\alpha \vee u}^1 \int_0^{\alpha \wedge w} (y - x)^{-3}\,dx\,dy \\
&= \int_{\alpha \vee u}^1 \bigl\{[y - (\alpha \wedge w)]^{-2} - y^{-2}\bigr\}\,dy \\
&= [(\alpha \vee u) - (\alpha \wedge w)]^{-1} - [1 - (\alpha \wedge w)]^{-1} - (\alpha \vee u)^{-1} + 1,
\end{split}
\]
it follows from (8.3) that
\[
\int_{0 \le x < \alpha < y \le 1} (y - x)^{-1}\, 1(x < w < u < y)\, \nu(dx, dy) = [(\alpha \vee u) - (\alpha \wedge w)]^{-1}. \tag{8.4}
\]
Substitute (8.4) into (8.1) to complete the proof of the theorem.
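As an illustrative sanity check of Theorem 8.1 (not part of the proof), the chain $(L_k, R_k)$ is easy to simulate. The sketch below — all names and parameter choices are ours — uses the symmetric cost β(u, t) = |t − u|, for which a short calculation reduces the double integral of the theorem to the closed form $2\alpha(1-\alpha) + [\alpha^2 + (1-\alpha)^2]/2$.

```python
import random

def mc_ES(alpha, reps=20000, tol=1e-9, seed=12345):
    """Monte Carlo estimate of ES for beta(u, t) = |t - u|, simulating the
    interval chain (L_k, R_k): the pivot U is uniform on (L_{k-1}, R_{k-1})
    and becomes the new right endpoint if U > alpha, else the new left one."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        L, R, s = 0.0, 1.0, 0.0
        while R - L > tol:
            U = L + (R - L) * rng.random()
            # I_k = int_L^R |u - U| du, available in closed form
            s += ((U - L) ** 2 + (R - U) ** 2) / 2.0
            if U > alpha:
                R = U
            else:
                L = U
        total += s
    return total / reps

alpha = 0.3
est = mc_ES(alpha)
# closed-form value of the Theorem 8.1 integral for this beta (our calculation)
exact = 2 * alpha * (1 - alpha) + (alpha ** 2 + (1 - alpha) ** 2) / 2
```

Truncating the chain once R − L falls below `tol` loses only a negligible tail, since the interval lengths decay geometrically.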
Remark 8.2. We now let β ≡ 1 and use Theorem 8.1 to analyze the expectation of the number $K_n$ of key comparisons required by QuickQuant(n, α). Then the expected value in Theorem 8.1 is
\[
2 \int\!\!\int_{0 < u < t < 1} [(\alpha \vee t) - (\alpha \wedge u)]^{-1}\,du\,dt = 2[1 - \alpha \ln \alpha - (1 - \alpha) \ln(1 - \alpha)] < \infty. \tag{8.5}
\]
Hence the expected number of key comparisons required by QuickQuant(n, α) is asymptotically equal to $2[1 - \alpha \ln \alpha - (1 - \alpha) \ln(1 - \alpha)]\,n$. Blum et al. [1] established a deterministic selection algorithm (for finding a key of target rank in a file of n keys) that requires at most 5.4305n key comparisons (this bound improves for extreme target ranks), and it is informative to compare these results. It follows by (8.5) that for α = 0 we have
\[
\lim_{n \to \infty} E K_n / n = 2,
\]
which is well known since $K_n$ in this case represents the number of key comparisons required by QuickMin applied to a file of n keys (e.g., Mahmoud et al. [19]). Thus we are now able to conclude that for any α (0 ≤ α ≤ 1), $E K_n / n$ converges to the simple constant in (8.5). Also notice that by (8.5) we have verified the hypothesis of Theorem 6.1 for p = 1 (see also Remark 6.2), as we promised in Remark 6.3.
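The constant in (8.5) is easy to evaluate; the sketch below (function name is ours) also illustrates the comparison with Blum et al. [1]: the lead-order coefficient is maximized at α = 1/2, where it equals 2(1 + ln 2) ≈ 3.39, well below 5.4305.

```python
import math

def key_comparison_constant(alpha):
    """Lead-order coefficient 2[1 - a ln a - (1 - a) ln(1 - a)] from (8.5),
    with the convention x ln x -> 0 as x -> 0 at the endpoints."""
    xlnx = lambda x: 0.0 if x == 0.0 else x * math.log(x)
    return 2.0 * (1.0 - xlnx(alpha) - xlnx(1.0 - alpha))
```

At α = 0 (or α = 1) this recovers the QuickMin/QuickMax constant 2.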
8.2 Computation of ES: a series expression
We now restrict to the cost function $\beta_{\mathrm{sym}}$ and use Theorem 8.1 to derive a series expression for ES. In the notation of Section 5.1, we have
\[
\tfrac{1}{2}\, ES = \sum_{w \in \Sigma^*} \int_{T_w} [(\alpha \vee t) - (\alpha \wedge u)]^{-1}\,du\,dt,
\]
which is easily obtained by noting that for u < t we have
\[
\beta(u, t) = \sum_{w \in \Sigma^*} 1(a_w < u < t < b_w). \tag{8.6}
\]
Define
\[
J(w) := \int_{T_w} [(\alpha \vee t) - (\alpha \wedge u)]^{-1}\,du\,dt.
\]
Then routine calculation shows that
\[
J(w) = p_w\, L\!\left(\left|\frac{\alpha - \mu_w}{p_w}\right|\right).
\]
Thus
\[
ES = \sum_{w \in \Sigma^*} p_w\, L\!\left(\left|\frac{\alpha - \mu_w}{p_w}\right|\right), \tag{8.7}
\]
in agreement with Theorem 2(i) of Vallée et al. [27]. (See also their Figure 1.)
This formula allows us to compute the lead-order linear coefficient of $E S^V_n$, and we will illustrate its effectiveness with examples. If α = 0, then the random variable $S^V_n$ represents the number of symbol comparisons required by QuickMin. In this case using (8.7) we find
\[
ES = 2 \sum_{w \in \Sigma^*} p_w \left(1 - \frac{a_w}{p_w} \ln \frac{b_w}{a_w}\right)
= 2 \sum_{w \in \Sigma^*} p_w \left[1 - \frac{p^{(-)}_w}{p_w} \ln\!\left(1 + \frac{p_w}{p^{(-)}_w}\right)\right], \tag{8.8}
\]
where, in notation to be used here and at (8.9) below,
\[
p^{(+)}_w := 1 - b_w, \qquad p^{(-)}_w := a_w.
\]
This is in agreement with Theorem 2(ii) of Vallée et al. [27], after a typo there is corrected. (The “min/max constants” in their Figure 1 should be twice as large as stated.)
Suppose further that the alphabet is r-ary (Σ = {0, 1, . . . , r − 1}) and that each key is simply the r-ary representation of its seed. In this case, for each k ≥ 0 and w ∈ Σ^k, we have $b_w - a_w = 1/r^k$. Hence it follows from (8.8) that
\[
\begin{split}
ES &= 2 \sum_{k \ge 0} \sum_{w \in \Sigma^k} r^{-k} \left(1 - \frac{a_w}{r^{-k}} \ln \frac{b_w}{a_w}\right)
= 2 \sum_{k \ge 0} \sum_{i=1}^{r^k} r^{-k} \left[1 - (i - 1) \ln \frac{i}{i - 1}\right] \\
&= 2 \sum_{k \ge 0} \left[1 - r^{-k} \sum_{i=1}^{r^k} (i - 1) \ln \frac{i}{i - 1}\right]
= 2 \sum_{k \ge 0} \left[1 + r^{-k} \sum_{i=1}^{r^k} \ln \frac{i}{r^k}\right].
\end{split}
\]
Fill and Nakama [6] and Grabner and Prodinger [9] independently obtained this expression in the binary-alphabet case (see also Vallée et al. [27]), and Grabner and Prodinger also used it to efficiently compute the lead-order coefficient of the expected number of symbol comparisons to 50 decimal places (ES = 5.279378...).
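The last series is cheap to evaluate numerically in the binary case; a sketch (truncation depth and function name are ours), writing $\sum_{i=1}^{m} \ln(i/m) = \ln m! - m \ln m$ and using `math.lgamma`:

```python
import math

def symbol_comparison_constant(r=2, kmax=60):
    """Truncation of ES = 2 sum_{k>=0} [1 + r^{-k} sum_{i=1}^{r^k} ln(i/r^k)];
    the inner sum equals ln((r^k)!) - r^k ln(r^k), computed via lgamma."""
    total = 0.0
    for k in range(kmax + 1):
        m = r ** k
        inner = math.lgamma(m + 1) - m * math.log(m)   # sum_{i=1}^m ln(i/m)
        total += 1.0 + inner / m
    return 2.0 * total
```

The terms decay roughly like $k \ln r / r^k$, so `kmax = 60` is far more than enough for double precision, and the truncation reproduces the tabulated value to many decimals.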
As a final example, one way to compute the (uniform) average of the values ES over α ∈ [0, 1] is to integrate the expression (8.7). However, we find it somewhat easier to integrate the integral expression of Theorem 8.1. Integrating first with respect to α we find
\[
\int_0^1 ES(\alpha)\,d\alpha = 2 \int\!\!\int_{0 < u < t < 1} \beta(u, t)\,[1 - 2 \ln(t - u) + \ln t + \ln(1 - u)]\,du\,dt.
\]
Substituting (8.6) and integrating we find, after some rearrangement, that
\[
\int_0^1 ES(\alpha)\,d\alpha = \sum_{w \in \Sigma^*} p_w^2 \left[2 + \frac{1}{p_w} + \sum_{\varepsilon = \pm} \left\{\ln\!\left(1 + \frac{p^{(\varepsilon)}_w}{p_w}\right) - \left(\frac{p^{(\varepsilon)}_w}{p_w}\right)^{2} \ln\!\left(1 + \frac{p_w}{p^{(\varepsilon)}_w}\right)\right\}\right]. \tag{8.9}
\]
This is in agreement with Theorem 2(iii) of Vallée et al. [27].
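The α-integration step used above, $\int_0^1 [(\alpha \vee t) - (\alpha \wedge u)]^{-1}\,d\alpha = 1 - 2\ln(t-u) + \ln t + \ln(1-u)$, can be spot-checked numerically; a sketch with arbitrarily chosen test points (u, t):

```python
import math

def alpha_integral(u, t, n=20000):
    """Midpoint-rule approximation of int_0^1 [(a v t) - (a ^ u)]^{-1} da
    for fixed 0 < u < t < 1; the integrand is bounded by (t - u)^{-1}."""
    s = 0.0
    for i in range(n):
        a = (i + 0.5) / n
        s += 1.0 / (max(a, t) - min(a, u))
    return s / n

u, t = 0.3, 0.7
closed_form = 1 - 2 * math.log(t - u) + math.log(t) + math.log(1 - u)
```

The integrand is piecewise smooth with kinks at α = u and α = t, so the composite midpoint rule converges quickly.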
Appendix A
Proof of (2.28)
In order to prove (2.28), it suffices to show that, for any positive integer m,
\[
\int_{n - \theta - i\infty}^{n - \theta + i\infty} \frac{\zeta(-1 - s)\, m^{-s}\,ds}{(s + 1)s \cdots (s - n)} = 0
\]
(note that n ≥ 2 and 0 < θ < 1). Letting t := −1 − s, it is thus sufficient to show that
\[
J := \int_{-(n+1)+\theta-i\infty}^{-(n+1)+\theta+i\infty} \frac{\zeta(t)\, m^t\, dt}{t(t + 1) \cdots [t + (n + 1)]} = 0.
\]
Using the residue theorem, we obtain
\[
J = -2\pi i \left[\sum_{k=0}^{n} \frac{(-1)^k\, \zeta(-k)\, m^{-k}}{k!\,(n + 1 - k)!} + \frac{m}{(n + 2)!}\right]
+ \int_{2 - i\infty}^{2 + i\infty} \frac{\zeta(t)\, m^t\, dt}{t(t + 1) \cdots [t + (n + 1)]}; \tag{A.1}
\]
the “2” in the second term here could just as well be any real number exceeding 1. Here
\[
\sum_{k=0}^{n} \frac{(-1)^k\, \zeta(-k)\, m^{-k}}{k!\,(n + 1 - k)!}
= -\frac{1}{2\,(n + 1)!} + \sum_{k=1}^{n} \frac{B_{k+1}\, m^{-k}}{(k + 1)!\,(n + 1 - k)!}
= \sum_{k=1}^{n+1} \frac{B_k\, m^{-(k-1)}}{k!\,(n + 2 - k)!}.
\]
Therefore
\[
\begin{split}
\sum_{k=0}^{n} \frac{(-1)^k\, \zeta(-k)\, m^{-k}}{k!\,(n + 1 - k)!} + \frac{m}{(n + 2)!}
&= \frac{m^{-(n+1)}}{(n + 1)!} \left[\sum_{k=1}^{n+1} \frac{B_k\,(n + 1)!}{k!\,(n + 2 - k)!}\, m^{n + 2 - k} + \frac{m^{n+2}}{n + 2}\right] \\
&= \frac{m^{-(n+1)}}{(n + 1)!} \sum_{k=1}^{m-1} k^{n+1}
= \frac{1}{(n + 1)!} \sum_{k=1}^{m-1} \left(1 - \frac{k}{m}\right)^{n+1};
\end{split} \tag{A.2}
\]
for the second equality, see Knuth [16] (Exercise 1.2.11.2-4). On the other hand, Flajolet et al. [7] showed that
\[
\int_{2 - i\infty}^{2 + i\infty} \frac{\zeta(t)\, m^t\, dt}{t(t + 1) \cdots [t + (n + 1)]}
= \frac{2\pi i}{(n + 1)!} \sum_{k=1}^{m-1} \left(1 - \frac{k}{m}\right)^{n+1}. \tag{A.3}
\]
Thus it follows from (A.1)–(A.3) that J = 0.
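The identity (A.2) can be verified exactly for small n and m with rational arithmetic, using ζ(0) = −1/2 and ζ(−k) = −B_{k+1}/(k+1); a sketch (function names are ours; equality with the (1 − k/m)^{n+1} form then follows by the symmetry k ↔ m − k):

```python
from fractions import Fraction
from math import comb, factorial

def bernoulli(n):
    """B_0, ..., B_n (convention B_1 = -1/2), from the standard recurrence
    sum_{j=0}^{m} C(m+1, j) B_j = 0 for m >= 1."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B

def check_A2(n, m):
    """Exact check that the left side of (A.2) equals
    m^{-(n+1)}/(n+1)! * sum_{k=1}^{m-1} k^{n+1}."""
    B = bernoulli(n + 2)
    zeta_neg = lambda k: Fraction(-1, 2) if k == 0 else -B[k + 1] / (k + 1)
    lhs = sum((-1) ** k * zeta_neg(k) * Fraction(1, m) ** k
              / (factorial(k) * factorial(n + 1 - k)) for k in range(n + 1)) \
        + Fraction(m, factorial(n + 2))
    rhs = Fraction(sum(k ** (n + 1) for k in range(1, m)),
                   m ** (n + 1) * factorial(n + 1))
    return lhs == rhs
```

For example, n = 2 and m = 3 give 1/18 on both sides.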
Appendix B
Tractable expression for the measure ν
The purpose of this appendix is to prove the following proposition used in the com-
putation of ES in Section 8.1.
Proposition B.1. With (L_k, R_k) defined at (6.2)–(6.3) as the interval of values eligible to be compared with the kth pivot chosen by QuickVal, and with
\[
\nu(dx, dy) := \sum_{k \ge 0} P(L_k \in dx, R_k \in dy)
\]
as defined at (8.2), we have
\[
\nu(dx, dy) = \delta_0(dx)\,\delta_1(dy) + (1 - x)^{-1}\,dx\,\delta_1(dy) + \delta_0(dx)\,y^{-1}\,dy + 2(y - x)^{-2}\,dx\,dy.
\]
Proof. To begin, since $L_0 := 0$ and $R_0 := 1$ we have
\[
P(L_0 \in dx, R_0 \in dy) = \delta_0(dx)\,\delta_1(dy), \tag{B.1}
\]
where $\delta_z$ denotes the probability measure concentrated at z. Now assume k ≥ 1. If
$0 \le \lambda < \alpha < \rho \le 1$, then
\[
P(L_k \in dx, R_k \in dy \mid L_{k-1} = \lambda, R_{k-1} = \rho)
= \delta_\rho(dy)\, 1(\lambda < x < \alpha)\,(\rho - \lambda)^{-1}\,dx
+ \delta_\lambda(dx)\, 1(\alpha < y < \rho)\,(\rho - \lambda)^{-1}\,dy.
\]
Hence
\[
\begin{split}
P(L_k \in dx, R_k \in dy) = \int \bigl[&\delta_\rho(dy)\, 1(\lambda < x < \alpha)\,(\rho - \lambda)^{-1}\,dx \\
&+ \delta_\lambda(dx)\, 1(\alpha < y < \rho)\,(\rho - \lambda)^{-1}\,dy\bigr]\, P(L_{k-1} \in d\lambda, R_{k-1} \in d\rho).
\end{split} \tag{B.2}
\]
We can infer [and inductively prove using (B.2)] that, for k ≥ 1,
\[
P(L_k \in dx, R_k \in dy) = \delta_1(dy)\, f_k(x)\,dx + \delta_0(dx)\, g_k(y)\,dy + h_k(x, y)\,dx\,dy, \tag{B.3}
\]
where
\[
f_1(x) = 1(0 \le x < \alpha), \qquad g_1(y) = 1(\alpha < y \le 1), \qquad h_1(x, y) = 0,
\]
and, for k ≥ 2,
\[
f_k(x) = 1(0 \le x < \alpha) \int 1(0 \le \lambda < x)\,(1 - \lambda)^{-1} f_{k-1}(\lambda)\,d\lambda, \tag{B.4}
\]
\[
g_k(y) = 1(\alpha < y \le 1) \int 1(y < \rho \le 1)\, \rho^{-1} g_{k-1}(\rho)\,d\rho, \tag{B.5}
\]
\[
\begin{split}
h_k(x, y) = 1(0 \le x < \alpha < y \le 1) \Bigl[&(1 - x)^{-1} f_{k-1}(x) + y^{-1} g_{k-1}(y) \\
&+ \int 1(0 \le \lambda < x)\,(y - \lambda)^{-1} h_{k-1}(\lambda, y)\,d\lambda \\
&+ \int 1(y < \rho \le 1)\,(\rho - x)^{-1} h_{k-1}(x, \rho)\,d\rho\Bigr].
\end{split} \tag{B.6}
\]
Henceforth suppose $0 \le x < \alpha < y \le 1$. From (B.5) we obtain
\[
g_k(y) = \frac{(-\ln y)^{k-1}}{(k - 1)!}, \qquad k \ge 1, \tag{B.7}
\]
whence
\[
\sum_{k \ge 1} g_k(y) = y^{-1}. \tag{B.8}
\]
By recognizing symmetry between (B.4) and (B.5), we also find
\[
f_k(x) = \frac{[-\ln(1 - x)]^{k-1}}{(k - 1)!}, \qquad k \ge 1, \tag{B.9}
\]
and so
\[
\sum_{k \ge 1} f_k(x) = (1 - x)^{-1}. \tag{B.10}
\]
In order to compute $\sum_{k \ge 1} h_k(x, y)$, we consider the generating function
\[
H(x, y, z) := \sum_{k \ge 1} h_k(x, y)\, z^k. \tag{B.11}
\]
From (B.6),
\[
\begin{split}
H(x, y, z) = z \Bigl[&(1 - x)^{-1} \sum_{k \ge 1} f_k(x)\, z^k + y^{-1} \sum_{k \ge 1} g_k(y)\, z^k \\
&+ \int_0^x (y - \lambda)^{-1} H(\lambda, y, z)\,d\lambda + \int_y^1 (\rho - x)^{-1} H(x, \rho, z)\,d\rho\Bigr].
\end{split} \tag{B.12}
\]
Using this integral equation, we will show via a series of lemmas culminating in Lemma B.10 that
\[
H(x, y) := H(x, y, 1) = \sum_{k \ge 1} h_k(x, y) \quad \text{equals} \quad 2(y - x)^{-2}. \tag{B.13}
\]
Combining equations (B.3), (B.8), (B.10), and (B.13), we obtain the desired expression
for ν.
Throughout the remainder of this appendix, whenever we refer to H(x, y) we tacitly suppose that 0 ≤ x < α < y ≤ 1.
Lemma B.2. H(x, y) < ∞ almost everywhere.
Proof. We revisit Remarks 6.3 and 8.2 and consider the number of key comparisons required by QuickVal(n, α). As shown at (8.5), we have ES < ∞ in this case. On the other hand, with β ≡ 1, from (8.1)–(8.2), (B.1), (B.3), and (B.8)–(B.10), we have
\[
\begin{split}
ES = 2 \int_{0 < w < u < 1} \Bigl[1 &+ \int_{0 \le x < \alpha} (1 - x)^{-2}\, 1(x < w)\,dx + \int_{\alpha < y \le 1} y^{-2}\, 1(y > u)\,dy \\
&+ \int_{0 \le x < \alpha < y \le 1} (y - x)^{-1}\, 1(x < w < u < y)\, H(x, y)\,dx\,dy\Bigr]\,dw\,du.
\end{split}
\]
Thus H(x, y) < ∞ almost everywhere.
The next lemma establishes monotonicity properties of H(x, y).
Lemma B.3. H(x, y) is increasing in x and decreasing in y.
Proof. For each k ≥ 1, we see from (B.9) that $f_k(x)$ is increasing in x and from (B.7) that $g_k(y)$ is decreasing in y. Since $h_1 \equiv 0$, it follows by induction on k from (B.6) that $h_k(x, y)$ is increasing in x and decreasing in y for each k. Thus $H(x, y) = \sum_{k \ge 1} h_k(x, y)$ enjoys the same monotonicity properties.
Lemma B.4. H(x, y) < ∞ for all x and y.
Proof. This is immediate from Lemmas B.2–B.3.
Lemma B.5. The generating function H(x, y, z) at (B.11) is (with $h_0 :\equiv 0$) the unique power-series solution $\tilde H(x, y, z) = \sum_{k \ge 0} \tilde h_k(x, y)\, z^k$ (in 0 ≤ z ≤ 1) to the integral equation (B.12) such that $0 \le \tilde h_k(x, y) \le h_k(x, y)$ for all k, x, y.

Proof. We have already seen that H is such a solution. Conversely, if $\tilde H$ is such a solution, then equating coefficients of $z^k$ in the integral equation [which is valid because we know by Lemma B.4 that H(x, y, z), and hence also $\tilde H(x, y, z)$, is finite for 0 ≤ z ≤ 1] we find that the functions $\tilde h_k(x, y)$ satisfy $\tilde h_k \equiv 0$ for k = 0, 1 and the recurrence relation (B.6) for k ≥ 2. It then follows by induction that $\tilde h_k(x, y) = h_k(x, y)$ for all k, x, y.
Next we let $H_0(x, y, z) :\equiv 0$ and, for 0 ≤ z ≤ 1, inductively define $H_n(x, y, z)$ by applying successive substitutions to the integral equation (B.12); that is, for each n ≥ 1 we define
\[
\begin{split}
H_n(x, y, z) := z \Bigl[&(1 - x)^{-1} \sum_{k \ge 1} f_k(x)\, z^k + y^{-1} \sum_{k \ge 1} g_k(y)\, z^k \\
&+ \int_0^x (y - \lambda)^{-1} H_{n-1}(\lambda, y, z)\,d\lambda + \int_y^1 (\rho - x)^{-1} H_{n-1}(x, \rho, z)\,d\rho\Bigr].
\end{split} \tag{B.14}
\]
Let $[z^k]H_n(x, y, z)$ denote the coefficient of $z^k$ in $H_n(x, y, z)$. Then for each k ≥ 1 and n ≥ 1, it follows from (B.14) that
\[
\begin{split}
[z^k]H_n(x, y, z) = (1 - x)^{-1} f_{k-1}(x) &+ y^{-1} g_{k-1}(y)
+ \int_0^x (y - \lambda)^{-1} \{[z^{k-1}]H_{n-1}(\lambda, y, z)\}\,d\lambda \\
&+ \int_y^1 (\rho - x)^{-1} \{[z^{k-1}]H_{n-1}(x, \rho, z)\}\,d\rho.
\end{split} \tag{B.15}
\]
Regarding $(H_n)_{n \ge 0}$, we have the following lemma:

Lemma B.6. For each k ≥ 1, $[z^k]H_n(x, y, z)$ is nondecreasing in n ≥ 0.
Proof. The inequality $[z^k]H_n(x, y, z) \ge [z^k]H_{n-1}(x, y, z)$ is proved easily by induction on n ≥ 1, as follows. Clearly
\[
[z^k]H_1(x, y, z) \ge 0 = [z^k]H_0(x, y, z).
\]
Assume the induction hypothesis $[z^k]H_n(x, y, z) \ge [z^k]H_{n-1}(x, y, z)$. Then by (B.15),
\[
\begin{split}
[z^k]H_{n+1}(x, y, z)
&= (1 - x)^{-1} f_{k-1}(x) + y^{-1} g_{k-1}(y) + \int_0^x (y - \lambda)^{-1} \{[z^{k-1}]H_n(\lambda, y, z)\}\,d\lambda \\
&\qquad + \int_y^1 (\rho - x)^{-1} \{[z^{k-1}]H_n(x, \rho, z)\}\,d\rho \\
&\ge (1 - x)^{-1} f_{k-1}(x) + y^{-1} g_{k-1}(y) + \int_0^x (y - \lambda)^{-1} \{[z^{k-1}]H_{n-1}(\lambda, y, z)\}\,d\lambda \\
&\qquad + \int_y^1 (\rho - x)^{-1} \{[z^{k-1}]H_{n-1}(x, \rho, z)\}\,d\rho
= [z^k]H_n(x, y, z).
\end{split}
\]
According to the next lemma, H dominates each Hn.
Lemma B.7. For all n ≥ 0 and k ≥ 1 we have
\[
0 \le [z^k]H_n(x, y, z) \le h_k(x, y). \tag{B.16}
\]
Proof. Lemma B.6 establishes the first inequality, and the second is proved easily by induction on n, as follows. We have $h_k(x, y) \ge 0 = [z^k]H_0(x, y, z)$. Assume the induction hypothesis
\[
h_k(x, y) \ge [z^k]H_n(x, y, z).
\]
Then by (B.15),
\[
\begin{split}
[z^k]H_{n+1}(x, y, z)
&= (1 - x)^{-1} f_{k-1}(x) + y^{-1} g_{k-1}(y) + \int_0^x (y - \lambda)^{-1} \{[z^{k-1}]H_n(\lambda, y, z)\}\,d\lambda \\
&\qquad + \int_y^1 (\rho - x)^{-1} \{[z^{k-1}]H_n(x, \rho, z)\}\,d\rho \\
&\le (1 - x)^{-1} f_{k-1}(x) + y^{-1} g_{k-1}(y) + \int_0^x (y - \lambda)^{-1} h_{k-1}(\lambda, y)\,d\lambda \\
&\qquad + \int_y^1 (\rho - x)^{-1} h_{k-1}(x, \rho)\,d\rho
= h_k(x, y).
\end{split}
\]
Lemmas B.5–B.7 lead to the following lemma:
Lemma B.8. For 0 ≤ x < α < y ≤ 1 and 0 ≤ z ≤ 1 we have
\[
H_n(x, y, z) \uparrow H(x, y, z) \quad \text{as } n \uparrow \infty.
\]
Proof. Recalling Lemmas B.6–B.7, define $H^*(x, y, z)$ to be the power series in z with coefficient of $z^k$ equal to $h^*_k(x, y) := \lim_{n \uparrow \infty} [z^k]H_n(x, y, z)$, which satisfies $0 \le h^*_k(x, y) \le h_k(x, y)$. On the other hand, $H^*$ satisfies the integral equation (B.12), by applying the monotone convergence theorem to (B.14). Thus it follows from Lemma B.5 that $H^* = H$. Finally, another application of the monotone convergence theorem shows that $H(x, y, z) = \lim_{n \uparrow \infty} H_n(x, y, z)$.
Our next lemma, when combined with the preceding one, immediately leads to
inequality in one direction in (B.13).
Lemma B.9. For 0 ≤ x < α < y ≤ 1 and all n ≥ 0,
\[
H_n(x, y, 1) \le 2(y - x)^{-2}.
\]
Proof. We will prove this lemma by induction on n, starting with
\[
H_0(x, y, 1) = 0 \le 2(y - x)^{-2}.
\]
Suppose that the claim holds for n − 1. Then from (B.14), (B.8), and (B.10) we have
\[
H_n(x, y, 1) \le (1 - x)^{-2} + y^{-2} + \bigl[(y - x)^{-2} - y^{-2}\bigr] + \bigl[(y - x)^{-2} - (1 - x)^{-2}\bigr] = 2(y - x)^{-2}.
\]
Finally we are ready to prove (B.13).
Lemma B.10. For 0 ≤ x < α < y ≤ 1,
\[
H(x, y, 1) = 2(y - x)^{-2}.
\]
Proof. Define
\[
\overline H(x, y) := 2(y - x)^{-2} - H(x, y).
\]
Then to prove the desired equality it suffices to show that for any integer r ≥ 0 we have
\[
0 \le \overline H(x, y) \le \left(\tfrac{2}{3}\right)^{r} \times 2(y - x)^{-3}. \tag{B.17}
\]
As remarked earlier, the nonnegativity of $\overline H$ follows from Lemmas B.8–B.9. We prove the upper bound on $\overline H$ in (B.17) by induction on r. The bound clearly holds for r = 0 (since 0 < y − x < 1). Notice that by substituting z = 1 and $H(x, y) = 2(y - x)^{-2} - \overline H(x, y)$ into the integral equation (B.12) we find
\[
\begin{split}
\overline H(x, y) &= 2(y - x)^{-2} - H(x, y) \\
&= 2(y - x)^{-2} - \Bigl\{(1 - x)^{-2} + y^{-2} + \int_0^x (y - \lambda)^{-1} \bigl[2(y - \lambda)^{-2} - \overline H(\lambda, y)\bigr]\,d\lambda \\
&\qquad\qquad + \int_y^1 (\rho - x)^{-1} \bigl[2(\rho - x)^{-2} - \overline H(x, \rho)\bigr]\,d\rho\Bigr\} \\
&= \int_0^x (y - \lambda)^{-1}\, \overline H(\lambda, y)\,d\lambda + \int_y^1 (\rho - x)^{-1}\, \overline H(x, \rho)\,d\rho.
\end{split}
\]
Thus if we assume that the upper bound in (B.17) holds for r − 1, then
\[
\overline H(x, y) \le \left(\tfrac{2}{3}\right)^{r-1} \times 2 \left[\int_0^x (y - \lambda)^{-4}\,d\lambda + \int_y^1 (\rho - x)^{-4}\,d\rho\right] \le \left(\tfrac{2}{3}\right)^{r} \times 2(y - x)^{-3}.
\]
Hence (B.17) holds for any nonnegative integer r.
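Lemma B.10 can also be cross-checked numerically (an illustrative sketch, not part of the proof) by iterating the successive-substitution map (B.14) at z = 1, where the two series reduce to (1 − x)^{-2} and y^{-2} by (B.8) and (B.10). Grid sizes, α = 1/2, and the test point below are arbitrary choices of ours; the iterates approach the claimed fixed point 2(y − x)^{-2}.

```python
def trap(vals, grid):
    """Trapezoid rule over matching sample points."""
    return sum((vals[i] + vals[i + 1]) * (grid[i + 1] - grid[i]) / 2.0
               for i in range(len(vals) - 1))

nx = ny = 33                                          # alpha = 0.5
xs = [0.45 * i / (nx - 1) for i in range(nx)]         # x-grid in [0, 0.45]
ys = [0.55 + 0.45 * j / (ny - 1) for j in range(ny)]  # y-grid in [0.55, 1]
H = [[0.0] * ny for _ in range(nx)]                   # H_0 := 0

for _ in range(40):                                   # H_n from H_{n-1} via (B.14), z = 1
    new = [[0.0] * ny for _ in range(nx)]
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            lam = trap([H[k][j] / (y - xs[k]) for k in range(i + 1)], xs[:i + 1])
            rho = trap([H[i][k] / (ys[k] - x) for k in range(j, ny)], ys[j:])
            new[i][j] = (1 - x) ** -2 + y ** -2 + lam + rho
    H = new

i = j = (nx - 1) // 2                  # test point (x, y) = (0.225, 0.775)
approx = H[i][j]
exact = 2.0 / (ys[j] - xs[i]) ** 2     # claimed fixed point 2 (y - x)^{-2}
```

By the contraction estimate in the proof, 40 iterations leave only quadrature error, which stays small at interior test points because the recursion only consults points with larger gap y − x.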
Bibliography
[1] M. Blum, R. Floyd, V. Pratt, R. Rivest, and R. Tarjan. Time bounds for selection.
Journal of Computer and System Sciences, 1973.
[2] K. L. Chung. A Course in Probability Theory. Academic Press, London, 3rd edition,
2001.
[3] L. Devroye. Exponential bounds for the running time of a selection algorithm. Journal
of Computer and System Sciences, 29:1–7, 1984.
[4] L. Devroye. On the probabilistic worst-case time of “Find”. Algorithmica, 31:291–303, 2001.
[5] J. A. Fill and S. Janson. The number of bit comparisons used by Quicksort: An
average-case analysis. Proceedings of the ACM-SIAM Symposium on Discrete Algo-
rithms, pages 293–300, 2004.
[6] J. A. Fill and T. Nakama. Analysis of the expected number of bit comparisons required
by Quickselect. To appear in Algorithmica, 2009.
[7] P. Flajolet, P. Grabner, P. Kirschenhofer, H. Prodinger, and R. F. Tichy. Mellin trans-
forms and asymptotics: digital sums. Theoretical Computer Science, 123:291–314,
1994.
[8] P. Flajolet and R. Sedgewick. Mellin transforms and asymptotics: Finite differences
and Rice’s integrals. Theoretical Computer Science, pages 101–124, 1995.
[9] P. J. Grabner and H. Prodinger. On a constant arising in the analysis of bit comparisons
in Quickselect. Quaestiones Mathematicae, 31:303–306, 2008.
[10] R. Grübel. Hoare’s selection algorithm: a Markov chain approach. Journal of Applied Probability, 35:36–45, 1998.

[11] R. Grübel and U. Rösler. Asymptotic distribution theory for Hoare’s selection algorithm. Advances in Applied Probability, 28:252–269, 1996.
[12] C. A. R. Hoare. Find (algorithm 65). Communications of the ACM, 4:321–322, 1961.

[13] C. A. R. Hoare. Quicksort. Computer Journal, 5:10–15, 1962.
[14] H. Hwang and T. Tsai. Quickselect and the Dickman function. Combinatorics, Prob-
ability and Computing, 11:353–371, 2002.
[15] D. E. Knuth. Mathematical analysis of algorithms. In Information Processing 71
(Proceedings of IFIP Congress, Ljubljana, 1971), pages 19–27. North-Holland, Am-
sterdam, 1972.
[16] D. E. Knuth. The Art of Computer Programming. Volume 1: Fundamental Algorithms.
Addison-Wesley, Reading, Massachusetts, 1998.
[17] D. E. Knuth. The Art of Computer Programming. Volume 3: Sorting and Searching.
Addison-Wesley, Reading, Massachusetts, 1998.
[18] J. Lent and H. M. Mahmoud. Average-case analysis of multiple Quickselect: An
algorithm for finding order statistics. Statistics and Probability Letters, 28:299–310,
1996.
[19] H. M. Mahmoud, R. Modarres, and R. T. Smythe. Analysis of Quickselect: An algorithm for order statistics. RAIRO Informatique Théorique et Applications, 29:255–276, 1995.
[20] H. M. Mahmoud and R. T. Smythe. Probabilistic analysis of multiple Quickselect.
Algorithmica, 22:569–584, 1998.
[21] V. Paulsen. The moments of FIND. Journal of Applied Probability, 34(4):1079–1082, 1997.
[22] H. Prodinger. Multiple Quickselect—Hoare’s Find algorithm for several elements.
Information Processing Letters, 56:123–129, 1995.
[23] M. Régnier. A limiting distribution of Quicksort. RAIRO Informatique Théorique et Applications, 23:335–343, 1989.

[24] U. Rösler. A limit theorem for Quicksort. RAIRO Informatique Théorique et Applications, 25:85–100, 1991.
[25] U. Rösler and L. Rüschendorf. The contraction method for recursive algorithms. Algorithmica, 29(1):3–33, 2001.
[26] S. Ross. A First Course in Probability. Prentice Hall, Upper Saddle River, NJ, 6th
edition, 2002.
[27] B. Vallée, J. Clément, J. A. Fill, and P. Flajolet. The number of symbol comparisons in
Quicksort and Quickselect. In S. Albers, A. Marchetti-Spaccamela, Y. Matias, S. E.
Nikoletseas, and W. Thomas, editors, 36th International Colloquium on Automata,
Languages and Programming (ICALP 2009), Part I, LNCS 5555, pages 750–763.
Springer–Verlag, 2009.
Vita
Takehiko Nakama
PERSONAL INFORMATION
Birth: October 13, 1970, Kyoto, Japan
Home: 3039 Saint Paul Street, Baltimore, MD 21218 USA
Keihoku-Torii-cho, Kyoto, Kyoto 6010323 Japan
EDUCATION
Johns Hopkins University, Baltimore, Maryland
• Ph.D. candidate in Applied Mathematics and Statistics, August 2003–present (GPA:
4.10/4)
Dissertation: Analysis of execution costs for Quickselect
• Ph.D. in Psychological and Brain Sciences, August 2003 (GPA: 3.98/4)
Dissertation: System identification analysis of neuronal responses to bimanual
stimulation in the second somatosensory cortex
• Completion of the NIMH doctoral training program in Perceptual and Cognitive
Neuroscience, May 2003
• Master of Science in Engineering, Department of Applied Mathematics and
Statistics, May 2003 (GPA: 4.04/4)
University of Tsukuba, Tsukuba, Ibaraki, Japan
• Bachelor of Human Science, March 1996 (GPA: 3.51/4)
State University of New York College at Oswego, Oswego, New York
• Bachelor of Arts, summa cum laude, May 1995 (GPA: 3.87/4)
(While I was an undergraduate at the University of Tsukuba, I spent two years at the
State University of New York College at Oswego as an exchange student and
received a Bachelor’s degree.)
FIELDS OF INTEREST
Probability theory, stochastic processes, stochastic optimization (evolutionary
computation in particular), neural networks, complex analysis, graph theory,
combinatorics, parameter estimation, classification, pattern recognition, information
theory, system identification in neuroscience
RELEVANT COURSEWORK
550.440 Stochastic Calculus 550.662 Optimization Algorithms
550.620 Probability Theory I 550.671 Combinatorial Analysis
550.621 Probability Theory II 550.672 Graph Theory
550.630 Statistical Theory 550.681 Numerical Analysis
550.631 Statistical Inference 550.692 Matrix Analysis
550.661 Foundation of Optimization 550.723 Markov Chains
TEACHING EXPERIENCE
At Johns Hopkins, I assisted teaching for the following courses:
Courses in Applied Mathematics and Statistics: Probability Theory I (550.620) and II
(550.621), Stochastic Processes (550.426), Introduction to Probability (550.420),
Combinatorial Analysis (550.671), Introduction to Statistics (550.430), Time Series
Analysis (550.439), Information, Statistics, and Vision (550.437), Dynamical Systems
(550.391)
Courses in Neuroscience: Systems Neuroscience (080.205), Functional Human
Neuroanatomy (200.372), Neuropsychopharmacology (200.376), Biological Mechanisms
of Learning and Memory (200.357)
During the summers of 2005–2007, I worked for The Johns Hopkins University Center for
Talented Youth as an instructor, teaching a college-level course in Cryptology to junior high school students.
TECHNICAL SKILLS
Programming languages: C++, MatLab, S-PLUS (R), SAS, SPSS, SYSTAT, Mathematica
RESEARCH EXPERIENCE
• Department of Applied Mathematics and Statistics, Johns Hopkins University,
Baltimore, Maryland, January 2004–present
Conducting dissertation research in probability with Professor James Allen Fill.
• Kennedy Krieger Institute, Johns Hopkins University, Baltimore, Maryland,
August–September 2008
Assisted with statistical analysis examining how parents of autistic children cope with emotional stress.
• Johns Hopkins University School of Medicine, Baltimore, Maryland,
August–October 2005
Assisted with statistical analysis to identify behavioral prototypes of children with attention deficit hyperactivity disorder.
• Krieger Mind/Brain Institute, Johns Hopkins University, Baltimore,
Maryland, June 1998–August 2003
Conducted dissertation research in single-unit neurophysiology.
• National Institute of Bioscience and Human Technology, Tsukuba, Ibaraki,
Japan, June 1995–May 1996
Conducted research on the effects of background familiarity in visual search.
• State University of New York College at Oswego, Oswego, New York, August
1994–May 1995
Assisted Dr. Ronald Gerrard’s research on tactile perception.
• National Institute of Bioscience and Human Technology, Tsukuba, Ibaraki,
Japan, May–August 1993
Participated in a research project on the effects of aging in visual information
processing.
HONORS AND AWARDS
• Award for Excellence in Teaching, Department of Applied Mathematics and
Statistics, 2009
• Best Paper Award, 2008 International Symposium on Neural Networks
• Best Paper Award, 2008 ACIS International Conference on Software Engineering,
Artificial Intelligence, Networking, and Parallel/Distributed Computing
• George M.L. Sommerman Engineering Graduate Teaching Assistant Award, 2006
(a university-wide award)
• Award for Excellence in Teaching, Department of Applied Mathematics and
Statistics, 2004
• AT&T Asia/Pacific Leadership Award, 1996
• Summa Cum Laude (highest distinction), State University of New York College at
Oswego, 1995
• President’s List for the Spring 1995 semester, State University of New York College
at Oswego
• President’s List for the Fall 1994 semester, State University of New York College at
Oswego
PUBLICATIONS
Nakama, T. (2009). Properties of genetic algorithms applied to fitness functions perturbed
concurrently by multiple sources of noise. (submitted for publication)
Nakama, T. (2009). Theoretical analysis of genetic algorithms in noisy environments
based on a Markov model. (submitted for publication)
Fill, J.A., & Nakama, T. (2009). Analysis of the expected number of bit comparisons
required by Quickselect. Proceedings of the Fourth Workshop on Analytic Algorithmics
and Combinatorics, 249–256; full paper to appear in Algorithmica.
Nakama, T. (2009). Theoretical analysis of batch and on-line training for gradient descent
learning in neural networks. To appear in Neurocomputing. (This paper was selected as
one of the best papers of 2008 International Symposium on Neural Networks.)
Nakama, T. (2009). Using genetic algorithms to find a globally optimal solution in
uncertain environments with multiple sources of noise. Proceedings of the International
Conference on Computational and Experimental Engineering and Sciences, 113–122; full
paper to appear in Computers, Materials & Continua.
Nakama, T. (2009). A Markov chain that models genetic algorithms in noisy
environments. To appear in Nonlinear Analysis Series A: Theory, Methods & Applications.
Nakama, T. (2009). Markov chain analysis of genetic algorithms in a wide variety of
noisy environments. Proceedings of the 2009 Genetic and Evolutionary Computation
Conference, 827–834.
Nakama, T. (2009). Transition and convergence properties of genetic algorithms applied
to fitness functions perturbed concurrently by additive and multiplicative noise.
Proceedings of IEEE Congress on Evolutionary Computation, 2662–2669.
Nakama, T. (2008). Markov chain analysis of genetic algorithms applied to fitness
functions perturbed by multiple sources of noise. Studies in Computational Intelligence,
149, 123–136. (This paper was selected as one of the 19 best papers of 2008 ACIS
International Conference on Software Engineering, Artificial Intelligence, Networking,
and Parallel/Distributed Computing.)
Nakama, T. (2008). Theoretical analysis of genetic algorithms in noisy environments
based on a Markov model. Proceedings of the 2008 Genetic and Evolutionary
Computation Conference, 1001–1008.
Nakama, T. (2008). Markov chain analysis of genetic algorithms in a noisy environment.
Proceedings of the 2008 International Conference on Engineering Optimization,
Evolutionary Techniques, Paper Code 516, 1–10.
Egeth, H.E., Folk, C.L., Leber, A.B., Nakama, T., & Hendel, S. (2001). Attentional
capture in the temporal and spatial domains. In C.L. Folk & B.S. Gibson (Eds.), Advances
in Psychology XXX - Attraction, Distraction, and Action: Multiple Perspectives on
Attentional Capture. Amsterdam: Elsevier Science B.V.
Yantis, S., & Nakama, T. (1998). Visual interactions in the path of apparent motion.
Nature Neuroscience, 1, 508–512.
MANUSCRIPTS IN PREPARATION
Fill, J. A., & Nakama, T. Distributional convergence for the number of symbol
comparisons used by Quickselect.
Nakama, T., Lane, J.W., Yantis, S., & Hsiao, S.S. Receptive field structures and bimanual
responsiveness of neurons in the second somatosensory cortex.
Nakama, T., Maiste, P., Yantis, S., & Hsiao, S.S. Regression analysis of ipsilateral,
contralateral, and bimanual interactions in the second somatosensory cortex.
Nakama, T., Lane, J.W., & Hsiao, S.S. Information transmission for ipsilateral,
contralateral, and bimanual stimulation in the second somatosensory cortex.
Nakama, T., Lane, J.W., Fitzgerald, P.J., & Hsiao, S.S. Orientation selectivity for
ipsilateral and contralateral stimulation in the second somatosensory cortex.
Nakama, T., Fitzgerald, P.J., Lane, J.W., & Hsiao, S.S. Attentional modulation of neuronal
activity for orientation discrimination in the second somatosensory cortex.
Nakama, T., Fitzgerald, P.J., Sripati, A., & Hsiao, S.S. Attentional modulation of bimanual
information transmission in the second somatosensory cortex.
CONFERENCE PRESENTATIONS
Nakama, T. (2009, July). Markov chain analysis of genetic algorithms in a wide variety of
noisy environments. Paper presented at the 2009 Genetic and Evolutionary Computation
Conference (GECCO 2009), Montreal, Canada.
Nakama, T. (2009, May). Transition and convergence properties of genetic algorithms
applied to fitness functions perturbed concurrently by additive and multiplicative noise.
Paper presented at IEEE Congress on Evolutionary Computation (CEC 2009), Trondheim,
Norway.
Nakama, T. (2009, April). Using genetic algorithms to find a globally optimal solution in
uncertain environments with multiple sources of noise. Paper presented at the
International Conference on Computational and Experimental Engineering and Sciences
(ICCES 2009), Phuket, Thailand.
Nakama, T. (2008, September). Theoretical analysis of batch and on-line training for
gradient descent learning in neural networks. Paper presented at the Fifth International
Symposium on Neural Networks (ISNN 2008), Beijing, China. (This paper was selected
as one of the best papers of this conference.)
Nakama, T. (2008, August). Markov chain analysis of genetic algorithms applied to
fitness functions perturbed by multiple sources of noise. Paper presented at the Ninth
ACIS International Conference on Software Engineering, Artificial Intelligence,
Networking, and Parallel/Distributed Computing (SNPD 2008), Phuket, Thailand. (This
paper was selected as one of the 19 best papers of this conference.)
Nakama, T. (2008, July). Theoretical analysis of genetic algorithms in noisy environments
based on a Markov model. Paper presented at the 2008 Genetic and Evolutionary
Computation Conference (GECCO 2008), Atlanta, GA.
Nakama, T. (2008, July). A Markov chain that models genetic algorithms in noisy
environments. Paper presented at the Fifth World Congress of Nonlinear Analysts
(WCNA 2008), Orlando, FL.
Nakama, T. (2008, June). Markov chain analysis of genetic algorithms in a noisy
environment. Paper presented at the International Conference on Engineering
Optimization (EngOpt 2008), Rio de Janeiro, Brazil.
Fill, J. A., & Nakama, T. (2008, January). Analysis of the expected number of bit
comparisons required by Quickselect. Paper presented at the Fourth Workshop on
Analytic Algorithmics and Combinatorics (ANALCO 2008), San Francisco, CA.
Liu, T., Slotnick, S.D., Nakama, T., & Yantis, S. (2002, November). Filling in the path of
apparent motion in human cortex. Paper presented at the 32nd Annual Meeting of the
Society for Neuroscience, Orlando, FL.
Nakama, T., Lane, J.W., Fitzgerald, P.J., Johnson, K.O., & Hsiao, S.S. (2001, November).
Linear regression analysis of bimanual neuronal responses to oriented bars in the second
somatosensory cortex. Paper presented at the 31st Annual Meeting of the Society for
Neuroscience, San Diego, CA.
Nakama, T., Sripati, A., Lane, J.W., Fitzgerald, P.J., Johnson, K.O., Yantis, S., & Hsiao,
S.S. (2000, November). Information theoretic analysis of attentional modulation in the
secondary somatosensory cortex during an orientation discrimination task. Paper
presented at the 2000 Society for Neuroscience Baltimore Chapter Meeting, Baltimore,
MD.
Nakama, T., Lane, J.W., Fitzgerald, P.J., Sripati, A., Johnson, K.O., Yantis, S., & Hsiao,
S.S. (2000, November). Attentional modulation of bilateral neuronal responses in the
secondary somatosensory cortex during an orientation discrimination task. Paper
presented at the 30th Annual Meeting of the Society for Neuroscience, New Orleans, LA.
Egeth, H., Folk, C.L., Leber, A., & Nakama, T. (2000). Contingent attentional capture and
the attentional blink. Paper presented at Attraction, Distraction, and Action: An
Interdisciplinary Conference and Workshop on Attentional Capture, Villanova, PA.
Nakama, T. & Egeth, H. (2000, March). An attentional blink at negative stimulus-onset
asynchronies. Paper presented at the 2000 Eastern Psychological Association Vision and
Attention Meeting.
Fitzgerald, P.J., Lane, J.W., Yoshioka, T., Nakama, T., & Hsiao, S.S. (1999, October).
Multi-digit receptive field structures and orientation tuning properties of neurons in SII
cortex of the awake monkey. Poster presented at the 29th Annual Meeting of the Society
for Neuroscience, Miami, FL.
Yantis, S., & Nakama, T. (1999, May). Visual feedback as revealed by motion masking.
Poster presented at the 3rd Annual Vision Research Conference, Fort Lauderdale, FL.
Nakama, T. & Egeth, H. (1999, May). Dependence of attentional blink on the temporal
position of target in RSVP. Poster presented at the 39th Annual Meeting of the
Association for Research in Vision and Ophthalmology, Fort Lauderdale, FL.
Egeth, H. & Nakama, T. (1999, May). Pop-out line detection affected by string length of
concurrent RSVP task and “attentional capture” by initial item. Poster presented at the
39th Annual Meeting of the Association for Research in Vision and Ophthalmology, Fort
Lauderdale, FL.
Nakama, T., Yantis, S., & Rudd, M. (1998, May). Motion masking reflected in detection
threshold. Poster presented at the 38th Annual Meeting of the Association for Research in
Vision and Ophthalmology, Fort Lauderdale, FL.
Yantis, S., & Nakama, T. (1997, November). Real masking by apparent motion. Paper
presented at the 38th annual meeting of the Psychonomic Society, Philadelphia, PA.
REFEREEING
Journal of Theoretical Probability
PROFESSIONAL SOCIETY MEMBERSHIPS
IEEE
Association for Computing Machinery