PROBABILITY INTEGRALS OF THE MULTIVARIATE DISTRIBUTION · 2006-07-08 · Application of...

CANADIAN APPLIED

MATHEMATICS QUARTERLY

Volume 13, Number 1, Spring 2005

PROBABILITY INTEGRALS OF THE

MULTIVARIATE t DISTRIBUTION

SARALEES NADARAJAH AND SAMUEL KOTZ

ABSTRACT. Results on probability integrals of multivariate t distributions are reviewed. We believe that this review will serve as an important reference and encourage further research activities in the area.

1 Introduction A p-dimensional random vector

\[
X^T = (X_1, \ldots, X_p)
\]

is said to have the t distribution with degrees of freedom $\nu$, mean vector $\mu$ and correlation matrix $R$ if its joint pdf is given by

\[
f(x) = \frac{\Gamma((\nu+p)/2)}{(\pi\nu)^{p/2}\,\Gamma(\nu/2)\,|R|^{1/2}}
\left[1 + \frac{1}{\nu}(x-\mu)^T R^{-1}(x-\mu)\right]^{-(\nu+p)/2}.
\tag{1}
\]

The degrees of freedom parameter $\nu$ is also referred to as the shape parameter, as the peakedness of (1) may be diminished, preserved or increased by varying $\nu$ (Jensen [31]). The distribution is said to be central if $\mu = 0$. Note that if $p = 1$, $\mu = 0$ and $R = 1$, then (1) reduces to the univariate Student's t distribution. If $p = 2$, then (1) is a slight modification of the bivariate surface of Pearson [43]. If $\nu = 1$, then (1) is the p-variate Cauchy distribution. If $(\nu + p)/2 = m$, an integer, then (1) is the p-variate Pearson type VII distribution. The limiting form of (1) as $\nu \to \infty$ is the joint pdf of the p-variate normal distribution with mean vector $\mu$ and covariance matrix $R$. The particular case of (1) for $\mu = 0$ and $R = I_p$ is a mixture, in the scale parameter $v$, of the normal density with zero means and covariance matrix $vI_p$.

AMS subject classification: 62E99.
Keywords: Multivariate normal distribution, multivariate t distribution, probability integrals.
Copyright ©Applied Mathematics Institute, University of Alberta.
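As a small illustration of the density (1), the following sketch evaluates it at a single point and compares the $p = 1$, $\mu = 0$, $R = 1$ case with the univariate Student's t density. The function names `mvt_pdf` and `student_t_pdf` are illustrative choices, not notation from the paper, and the use of NumPy is an assumption of this sketch.

```python
import math
import numpy as np

def mvt_pdf(x, mu, R, nu):
    """Evaluate the joint pdf (1) at a single point x (illustrative helper)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    mu = np.atleast_1d(np.asarray(mu, dtype=float))
    R = np.atleast_2d(np.asarray(R, dtype=float))
    p = x.size
    diff = x - mu
    quad = float(diff @ np.linalg.solve(R, diff))   # (x - mu)^T R^{-1} (x - mu)
    _, logdet = np.linalg.slogdet(R)                # log |R|
    log_const = (math.lgamma((nu + p) / 2) - math.lgamma(nu / 2)
                 - 0.5 * p * math.log(math.pi * nu) - 0.5 * logdet)
    return math.exp(log_const - 0.5 * (nu + p) * math.log1p(quad / nu))

def student_t_pdf(t, nu):
    """Univariate Student's t density, used to check the p = 1 special case."""
    const = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return const * (1 + t * t / nu) ** (-(nu + 1) / 2)

# p = 1 reduces to Student's t; nu = 1, p = 2, R = I gives 1/(2*pi) at the origin.
print(mvt_pdf([0.8], [0.0], [[1.0]], 5), student_t_pdf(0.8, 5))
print(mvt_pdf([0.0, 0.0], [0.0, 0.0], np.eye(2), 1), 1 / (2 * math.pi))
```

Working on the log scale with `lgamma` is a design choice of the sketch that keeps the normalizing constant stable for large $\nu$ or $p$.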

Multivariate t distributions are of increasing importance in classical as well as in Bayesian statistical modeling; however, relatively little is known by means of mathematical properties or statistical methods. These distributions have been perhaps unjustly overshadowed by the multivariate normal distribution. Both the multivariate t and the multivariate normal are members of the general family of elliptically symmetric distributions. However, we feel that it is desirable to focus on these distributions separately for several reasons:

• Multivariate t distributions are generalizations of the classical univariate Student t distribution, which is of central importance in statistical inference. The possible structures are numerous, and each one possesses special characteristics as far as potential and current applications are concerned.

• Application of multivariate t distributions is a very promising approach in multivariate analysis. Classical multivariate analysis is soundly and rigidly tilted toward the multivariate normal distribution while multivariate t distributions offer a more viable alternative with respect to real-world data, particularly because their tails are more realistic. We have seen recently some unexpected applications in novel areas such as cluster analysis, discriminant analysis, multiple regression, robust projection indices, and missing data imputation.

• Multivariate t distributions for the past 20 to 30 years have played a crucial role in Bayesian analysis of multivariate data. They serve by now as the most popular prior distribution (because elicitation of prior information in various physical, engineering, and financial phenomena is closely associated with multivariate t distributions) and generate meaningful posterior distributions.

There has been some amount of research carried out on probability integrals of multivariate t distributions. Most of the work was done during the pre-computer era, but recently several computer programs have been written to evaluate probability integrals. The aim of this paper is to provide a comprehensive review of the known results. We believe that this review will serve as an important reference and encourage further research activities in the area.

The paper is organized as follows. Sections 2 to 8 by now may have lost some of their usefulness but are still of substantial historical interest in addition to their mathematical value. We have decided to record these results in some detail in spite of the fact that some of the expressions are quite lengthy and cumbersome. Sections 9 to 14 contain more practically relevant and modern results.

2 Dunnett and Sobel's probability integrals One of the earliest results on probability integrals is that due to Dunnett and Sobel [14]. Let $(X_1, X_2)$ have the central bivariate t distribution with degrees of freedom $\nu$ and the equicorrelation structure $r_{ij} = \rho$, $i \neq j$. The corresponding bivariate pdf is

\[
f(x_1, x_2; \nu, \rho) = \frac{1}{2\pi\sqrt{1-\rho^2}}
\left[1 + \frac{x_1^2 + x_2^2 - 2\rho x_1 x_2}{\nu(1-\rho^2)}\right]^{-(\nu+2)/2}
\tag{2}
\]

with the probability integral

\[
P(y_1, y_2; \nu, \rho) = \int_{-\infty}^{y_2}\int_{-\infty}^{y_1} f(x_1, x_2; \nu, \rho)\, dx_1\, dx_2.
\tag{3}
\]

Let

\[
x(m, y_1, y_2) = \frac{(y_1 - \rho y_2)^2}{(y_1 - \rho y_2)^2 + (1-\rho^2)(m + y_2^2)},
\]

and let

\[
I_{x(m,y_1,y_2)}(a, b) = \int_0^{x(m,y_1,y_2)} \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\, y^{a-1}(1-y)^{b-1}\, dy
\]

denote the incomplete beta function. Dunnett and Sobel [14] evaluated exact expressions for (3) when $\nu$ takes on positive integer values. For even $\nu$ and odd $\nu$, they obtained

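\[
\begin{aligned}
P(y_1, y_2; \nu, \rho) ={}& \frac{1}{2\pi}\arctan\frac{\sqrt{1-\rho^2}}{-\rho} \\
&+ \frac{y_2}{4\sqrt{\nu\pi}} \sum_{j=1}^{\nu/2} \frac{\Gamma(j-1/2)}{\Gamma(j)}
\left(1 + \frac{y_2^2}{\nu}\right)^{1/2-j}
\left[1 + \operatorname{sgn}(y_1 - \rho y_2)\, I_{x(\nu,y_1,y_2)}\!\left(\frac{1}{2},\, j - \frac{1}{2}\right)\right] \\
&+ \frac{y_1}{4\sqrt{\nu\pi}} \sum_{j=1}^{\nu/2} \frac{\Gamma(j-1/2)}{\Gamma(j)}
\left(1 + \frac{y_1^2}{\nu}\right)^{1/2-j}
\left[1 + \operatorname{sgn}(y_2 - \rho y_1)\, I_{x(\nu,y_2,y_1)}\!\left(\frac{1}{2},\, j - \frac{1}{2}\right)\right]
\end{aligned}
\tag{4}
\]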


and

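\[
\begin{aligned}
P(y_1, y_2; \nu, \rho) ={}& \frac{1}{2\pi}\arctan\left\{-\sqrt{\nu}\left[\frac{\alpha\beta + \gamma\delta}{\gamma\beta - \nu\alpha\delta}\right]\right\} \\
&+ \frac{y_2}{4\sqrt{\nu\pi}} \sum_{j=1}^{(\nu-1)/2} \frac{\Gamma(j)}{\Gamma(j+1/2)}
\left(1 + \frac{y_2^2}{\nu}\right)^{-j}
\left[1 + \operatorname{sgn}(y_1 - \rho y_2)\, I_{x(\nu,y_1,y_2)}\!\left(\frac{1}{2},\, j\right)\right] \\
&+ \frac{y_1}{4\sqrt{\nu\pi}} \sum_{j=1}^{(\nu-1)/2} \frac{\Gamma(j)}{\Gamma(j+1/2)}
\left(1 + \frac{y_1^2}{\nu}\right)^{-j}
\left[1 + \operatorname{sgn}(y_2 - \rho y_1)\, I_{x(\nu,y_2,y_1)}\!\left(\frac{1}{2},\, j\right)\right],
\end{aligned}
\tag{5}
\]

respectively. Here,

\[
\alpha = y_1 + y_2, \qquad \beta = y_1 y_2 + \rho\nu, \qquad \gamma = y_1 y_2 - \nu,
\]

and

\[
\delta = \sqrt{y_1^2 - 2\rho y_1 y_2 + y_2^2 + \nu(1-\rho^2)}.
\]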

In the special case $y_1 = y_2 = 0$, both (4) and (5) reduce to the neat expression

\[
P(0, 0; \nu, \rho) = \frac{1}{2\pi}\arctan\frac{\sqrt{1-\rho^2}}{-\rho},
\tag{6}
\]

which is independent of $\nu$ and is therefore identical with the corresponding result for the bivariate normal integral. Since the number of terms in (4) and (5) increases with $\nu$, the usefulness of these expressions is confined to small values of $\nu$. Dunnett and Sobel [14] also derived an asymptotic expansion in powers of $1/\nu$, the first few terms of which yield a good approximation to the probability integral even for moderately small values of $\nu$. The method of derivation is essentially the same as that used by Fisher [17] to approximate the probability integral of the univariate Student's t distribution: Express the difference $f(x_1, x_2; \nu, \rho) - f(x_1, x_2; \infty, \rho)$ as a power series in $1/\nu$ and then integrate this series term by term over the desired region of integration. Setting

\[
r^2 = \frac{y_1^2 - 2\rho y_1 y_2 + y_2^2}{1-\rho^2},
\]


Dunnett and Sobel obtained

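\[
\begin{aligned}
\frac{f(y_1, y_2; \nu, \rho)}{f(y_1, y_2; \infty, \rho)}
={}& 1 + \left(\frac{r^4}{4} - r^2\right)\frac{1}{\nu}
+ \left(\frac{r^8}{32} - \frac{5r^6}{12} + r^4\right)\frac{1}{\nu^2} \\
&+ \left(\frac{r^{12}}{384} - \frac{7r^{10}}{96} + \frac{13r^8}{24} - r^6\right)\frac{1}{\nu^3} \\
&+ \left(\frac{r^{16}}{6144} - \frac{r^{14}}{128} + \frac{17r^{12}}{144} - \frac{77r^{10}}{120} + r^8\right)\frac{1}{\nu^4} + \cdots \\
={}& 1 + D(r),
\end{aligned}
\]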

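say. Thus, the desired probability integral is

\[
\begin{aligned}
P(y_1, y_2; \nu, \rho) ={}& \int_{-\infty}^{y_2}\int_{-\infty}^{y_1} f(x_1, x_2; \infty, \rho)\, dx_1\, dx_2 \\
&+ \int_{-\infty}^{y_2}\int_{-\infty}^{y_1} D(r)\, f(x_1, x_2; \infty, \rho)\, dx_1\, dx_2.
\end{aligned}
\tag{7}
\]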
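The expansion $1 + D(r)$ can be checked numerically. Since (2) and its normal limit share the constant $1/(2\pi\sqrt{1-\rho^2})$, the density ratio is $(1 + r^2/\nu)^{-(\nu+2)/2}\exp(r^2/2)$. The sketch below compares this closed form with the series truncated after the $1/\nu^4$ term, with coefficients $r^4/4 - r^2$, $r^8/32 - 5r^6/12 + r^4$, $r^{12}/384 - 7r^{10}/96 + 13r^8/24 - r^6$ and $r^{16}/6144 - r^{14}/128 + 17r^{12}/144 - 77r^{10}/120 + r^8$ (obtained by expanding the logarithm of the ratio). The function names are illustrative only.

```python
import math

def ratio_exact(r, nu):
    """f(.; nu, rho) / f(.; infinity, rho), written as a function of r and nu."""
    return (1 + r * r / nu) ** (-(nu + 2) / 2) * math.exp(r * r / 2)

def ratio_series(r, nu):
    """1 + D(r): the power series in 1/nu truncated after the 1/nu^4 term."""
    u = r * r
    d1 = u**2 / 4 - u
    d2 = u**4 / 32 - 5 * u**3 / 12 + u**2
    d3 = u**6 / 384 - 7 * u**5 / 96 + 13 * u**4 / 24 - u**3
    d4 = u**8 / 6144 - u**7 / 128 + 17 * u**6 / 144 - 77 * u**5 / 120 + u**4
    return 1 + d1 / nu + d2 / nu**2 + d3 / nu**3 + d4 / nu**4

# The truncation error shrinks roughly like 1/nu^5 as nu grows.
for nu in (20, 200):
    print(nu, ratio_exact(1.3, nu) - ratio_series(1.3, nu))
```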

The first term on the right-hand side of (7) is the integral of the bivariate normal pdf, and it has been tabulated by Pearson [44] with a series of correction terms. The second term can be integrated term by term to obtain an asymptotic expansion in powers of $1/\nu$. Dunnett and Sobel gave expressions for the coefficients $A_k$ of the terms $1/\nu^k$ for $k = 1, 2, 3, 4$. The first of these coefficients takes the form

\[
A_1 = \frac{a y_2}{4}\phi(a)\phi(y_2) + \frac{b y_1}{4}\phi(b)\phi(y_1)
- \frac{y_2(y_2^2 + 1)}{4}\phi(y_2)\Phi(a) - \frac{y_1(y_1^2 + 1)}{4}\phi(y_1)\Phi(b),
\]

where $\phi$ and $\Phi$ are, respectively, the pdf and the cdf of the standard normal distribution, and

\[
a = \frac{y_1 - \rho y_2}{\sqrt{1-\rho^2}}, \qquad b = \frac{y_2 - \rho y_1}{\sqrt{1-\rho^2}}.
\]

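In the special case $y_1 = y_2 = y$, (7) reduces to

\[
P(y, y; \nu, \rho) = \int_{-\infty}^{y}\int_{-\infty}^{y} f(x_1, x_2; \infty, \rho)\, dx_1\, dx_2
+ \frac{A_1}{\nu} + \frac{A_2}{\nu^2} + \frac{A_3}{\nu^3} + \frac{A_4}{\nu^4} + \cdots,
\tag{8}
\]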


with the first two coefficients $A_1$ and $A_2$ now taking the forms

\[
A_1 = -\frac{y\phi(y)}{2}\left[(y^2 + 1)\Phi(cy) - y\Phi'(cy)\right]
\]

and

\[
A_2 = -\frac{y\phi(y)}{48}\left\{(3y^6 - 7y^4 - 5y^2 - 3)\Phi(cy)
- y\Phi'(cy)\left[3y^4(c^4 + 3c^2 + 3) - y^2(c^2 + 5) - 3\right]\right\},
\]

where $c = \sqrt{(1-\rho)/(1+\rho)}$ and $\Phi'(cy)$ denotes the derivative of $\Phi(cy)$ with respect to $y$, that is, $c\phi(cy)$. In this special case, Dunnett and Sobel [14] tabulated numerical values of the coefficients $A_k$ for selected values of $\rho$, $y$, and $\nu$. The following table gives the values for $\rho = 0.5$.

Coefficients of the asymptotic expansion (8) for ρ = 0.5

   y     ν       A1           A2           A3           A4
 0.25    4   −0.025870     0.003371     0.003816    −0.001050
 0.50    4   −0.057784     0.008999     0.006868    −0.002155
 0.75    6   −0.100016     0.021983     0.006891    −0.001879
 1.00    5   −0.150182     0.047374    −0.006835     0.007991
 1.25    6   −0.198378     0.079687    −0.033130     0.036817
 1.50    6   −0.231628     0.096254    −0.038696     0.032808
 1.75    9   −0.240531     0.067469     0.052274    −0.191482
 2.00   12   −0.223682    −0.020268     0.293449    −0.819219
 2.25   13   −0.187525    −0.149011     0.623867    −1.618705
 2.50   12   −0.142571    −0.276255     0.858993    −1.765249
 3.00   18   −0.062685    −0.376815     0.432592     2.236773

These values can be used to construct tables for the probability integral in (8).
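The closed forms for $A_1$ and $A_2$ in the case $y_1 = y_2 = y$ can be evaluated directly and compared with the $A_1$ and $A_2$ columns of the table for $\rho = 0.5$. The sketch below does this; it reads $\Phi'(cy)$ as the derivative of $\Phi(cy)$ with respect to $y$, namely $c\phi(cy)$, which is an interpretation adopted here because it reproduces the tabulated values. The function name `coefficients_A1_A2` is illustrative.

```python
import math

def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def Phi(x):
    """Standard normal cdf."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def coefficients_A1_A2(y, rho):
    """First two coefficients of expansion (8) for y1 = y2 = y."""
    c = math.sqrt((1 - rho) / (1 + rho))
    dPhi = c * phi(c * y)          # Phi'(cy), read as d/dy of Phi(c*y)
    A1 = -(y * phi(y) / 2) * ((y * y + 1) * Phi(c * y) - y * dPhi)
    A2 = -(y * phi(y) / 48) * (
        (3 * y**6 - 7 * y**4 - 5 * y**2 - 3) * Phi(c * y)
        - y * dPhi * (3 * y**4 * (c**4 + 3 * c**2 + 3) - y**2 * (c**2 + 5) - 3)
    )
    return A1, A2

# Compare with the tabulated rows y = 0.25, 1.00, 2.00 (rho = 0.5).
for y in (0.25, 1.00, 2.00):
    print(y, coefficients_A1_A2(y, 0.5))
```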

3 Gupta and Sobel's probability integrals Gupta and Sobel [29] investigated the special case when $X$ follows the central p-variate t distribution with degrees of freedom $\nu$ and the correlation structure $r_{ij} = \rho = 1/2$, $i \neq j$. If $Y_1, Y_2, \ldots, Y_p, Y$ are independent normal random variables with common mean and common variance $\sigma^2$, and if $\nu S^2/\sigma^2$ is a chi-squared random variable with degrees of freedom $\nu$, independent of $Y_1, Y_2, \ldots, Y_p, Y$, then one can rewrite the probability integral as

\[
\begin{aligned}
P(d) &= \int_{-\infty}^{d}\cdots\int_{-\infty}^{d} f(x_1, \ldots, x_p; \nu, \rho)\, dx_p \cdots dx_1 \\
&= \Pr\left\{\frac{\max(Y_1, Y_2, \ldots, Y_p) - Y}{S} \le \sqrt{2}\, d\right\} \\
&= \Pr\left(\frac{M_p - Y}{S} < \sqrt{2}\, d\right) \\
&= \Pr\left(Z < \sqrt{2}\, d\right),
\end{aligned}
\tag{9}
\]

where $M_p = \max(Y_1, Y_2, \ldots, Y_p)$ and $Z = (M_p - Y)/S$. Gupta and Sobel [29] provided four useful expressions for $P(d)$. These are by now classical results applicable in statistical inference. The first expression is derived by fixing $Y$ and $S$ in (9) and integrating with respect to $M_p$:

\[
P(d) = \int_0^{\infty} h(s)\left[\int_{-\infty}^{\infty} \Phi^p(y)\,\phi\!\left(y - \sqrt{\frac{2}{\nu}}\, d s\right) dy\right] ds,
\tag{10}
\]

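where $\phi$ and $\Phi$ are, respectively, the pdf and cdf of the standard normal distribution and $h$ is the pdf of the chi-squared distribution with $\nu$ degrees of freedom. Based on the fact that the pdf $\phi$ admits an expansion about $d = 0$, it is easy to justify a term-by-term integration of (10) to obtain the second expression

\[
P(d) = \frac{1}{p+1}\sum_{k=0}^{\infty} \frac{2^{k/2} d^k}{k!}\, A_k\,
E\left\{H_k\!\left(\frac{\max(X_1, X_2, \ldots, X_{p+1})}{\sigma}\right)\right\},
\]

where

\[
A_k = E\left(\frac{S}{\sigma}\right)^k = \frac{\Gamma\!\left(\frac{\nu+k}{2}\right)}{\left(\frac{\nu}{2}\right)^{k/2}\Gamma\!\left(\frac{\nu}{2}\right)}
\tag{11}
\]

is the kth moment of $\chi_\nu/\sqrt{\nu}$ (provided that $k > -\nu$) and $H_k$ is the kth Hermite polynomial defined by

\[
\left(-\frac{d}{dx}\right)^k \exp\left(-\frac{x^2}{2}\right) = H_k(x)\exp\left(-\frac{x^2}{2}\right).
\tag{12}
\]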
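The moments $A_k$ in (11) are simple to compute on the log scale. The sketch below (the name `chi_moment` is illustrative) evaluates (11) and checks three facts that follow from it: $A_0 = 1$, $A_2 = 1$ because $E(\chi_\nu^2/\nu) = 1$, and $A_{-2} = \nu/(\nu - 2)$ for $\nu > 2$.

```python
import math

def chi_moment(k, nu):
    """A_k in (11): the kth moment of chi_nu / sqrt(nu), valid for k > -nu."""
    if k <= -nu:
        raise ValueError("moment exists only for k > -nu")
    return math.exp(math.lgamma((nu + k) / 2) - math.lgamma(nu / 2)
                    - (k / 2) * math.log(nu / 2))

nu = 7
print(chi_moment(0, nu), chi_moment(2, nu))     # both equal 1 up to rounding
print(chi_moment(-2, nu), nu / (nu - 2))        # negative moment vs closed form
print(chi_moment(1, 1000))                      # tends to 1 as nu grows
```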


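A third expression for $P(d)$ is derived by first expanding $\phi$ about $S = \sigma$ and then integrating term by term, obtaining

\[
\begin{aligned}
P(d) ={}& \int_{-\infty}^{\infty} \Phi^p(y)\left[\phi\!\left(y - \sqrt{2}\, d\right)
- \sqrt{2}\, d\,\phi^{(1)}\!\left(y - \sqrt{2}\, d\right) E\left(\frac{S}{\sigma} - 1\right)
+ d^2 \phi^{(2)}\!\left(y - \sqrt{2}\, d\right) E\left(\frac{S}{\sigma} - 1\right)^2 - \cdots\right] dy \\
={}& \int_{-\infty}^{\infty} \Phi^p(y)\,\phi\!\left(y - \sqrt{2}\, d\right) dy
- \sqrt{2}\, d\,(1 - A_1)\int_{-\infty}^{\infty} \left(y - \sqrt{2}\, d\right)\Phi^p(y)\,\phi\!\left(y - \sqrt{2}\, d\right) dy \\
&+ 2d^2 (1 - A_1)\int_{-\infty}^{\infty} \left\{y^2 - 2\sqrt{2}\, d y + 2d^2 - 1\right\}\Phi^p(y)\,\phi\!\left(y - \sqrt{2}\, d\right) dy + \cdots,
\end{aligned}
\]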

where $A_1$ is given by (11). Each of the integrals above can be evaluated by expanding the pdf $\phi$ about $d = 0$, as was done in (10). The fourth and final expression for $P(d)$ given by Gupta and Sobel [29] uses the result of Seal [47] that the distribution of $D = (M_p - Y)/\sigma$ is asymptotically normal as $p$ tends to infinity. It follows directly from Seal's result that the third and higher central moments of $D$ tend to the corresponding moments of the standard normal distribution. Since the coefficients involving $\nu$ in $A_{-k}$ in (11) tend to unity as $\nu \to \infty$, it follows that the third and higher central moments of $Z = (M_p - Y)/S$ tend to the corresponding moments of the standard normal distribution as both $\nu$ and $p$ tend to infinity. It is therefore reasonable to approximate the distribution of $W = (Z - E(Z))/\sqrt{\operatorname{Var}(Z)}$ by a Gram-Charlier expansion in the Edgeworth form, where

\[
E(Z) = A_{-1}\, a_{p,1}
\]

and

\[
\operatorname{Var}(Z) = A_{-2}\,(a_{p,2} + 1) - (A_{-1}\, a_{p,1})^2.
\]

Here, $a_{p,i}$ denotes the ith moment of the largest of $p$ independent standard normal random variables. Using equation (17.7.3) of Cramer [12] and letting $d_s = (\sqrt{2}\, d - E(Z))/\sqrt{\operatorname{Var}(Z)}$, Gupta and Sobel obtained

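\[
\begin{aligned}
P(d) &= \Pr\left(Z < \sqrt{2}\, d\right) = \Pr(W < d_s) \\
&= \Phi(d_s) - \frac{\alpha_3}{3!}\phi^{(2)}(d_s) + \frac{\alpha_4}{4!}\phi^{(3)}(d_s) + \frac{10\alpha_3^2}{6!}\phi^{(5)}(d_s) \\
&\quad - \frac{\alpha_5}{5!}\phi^{(4)}(d_s) - \frac{35\alpha_3\alpha_4}{7!}\phi^{(6)}(d_s) - \frac{280\alpha_3^3}{9!}\phi^{(8)}(d_s) + \cdots,
\end{aligned}
\]

where

\[
\alpha_k = \frac{\kappa_k}{\left(\sqrt{\kappa_2}\right)^{k}}
\tag{13}
\]

is the kth standardized cumulant of $Z$ obtained from the moments around the origin.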

In a related development, Gupta [27] studied the above case $\rho = 1/2$ and showed that $P(d) = P(d; \nu)$ satisfies

\[
\frac{dP(d; \nu)}{dd} + \nu\left\{P(d; \nu) - P(d; \nu + 2)\right\} = 0,
\tag{14}
\]

which is Hartley's differential-difference equation for the probability integral of a general class of statistics known as Studentized statistics. Using Hartley's solution (obtained using the theory of characteristics), Gupta obtained an approximation for $P(d; \nu)$ in powers of $1/\nu$ and remarked that it can be computed by using the Gauss-Hermite quadrature. Gupta et al. [28] extended this result for any $\rho > 0$ and showed that $P(d)$ satisfies (14) in this case too. In this case the approximation for $P(d)$ in powers of $1/\nu$ is

\[
P(d) = G(d, \ldots, d) + \sum_{k=1}^{m} L_k(d),
\tag{15}
\]

where $L_k$ is the kth correction term and $G$ is the joint cdf of a p-variate normal distribution with zero means, common variance $\sigma^2$, and the equicorrelation structure $r_{ij} = \rho$, $i \neq j$. Letting $G^{(k)}(d)$ denote the kth-order derivative of $G(d, \ldots, d)$ with respect to $d$, the first four correction terms can be written as

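\[
\begin{aligned}
L_1(d) &= \frac{1}{\nu}\left\{\alpha^{(2)} - \alpha^{(1)}\right\}, \\
L_2(d) &= \frac{1}{6\nu^2}\left\{3\alpha^{(4)} - 10\alpha^{(3)} + 9\alpha^{(2)} - 2\alpha^{(1)}\right\}, \\
L_3(d) &= \frac{1}{6\nu^3}\left\{\alpha^{(6)} - 7\alpha^{(5)} + 17\alpha^{(4)} - 17\alpha^{(3)} + 6\alpha^{(2)}\right\},
\end{aligned}
\]

and

\[
L_4(d) = \frac{1}{360\nu^4}\left\{15\alpha^{(8)} - 180\alpha^{(7)} + 830\alpha^{(6)} - 1848\alpha^{(5)}
+ 2015\alpha^{(4)} - 900\alpha^{(3)} + 20\alpha^{(2)} + 48\alpha^{(1)}\right\},
\]

where

\[
\alpha^{(k)} = \frac{1}{2^k}\varphi^{(k)}(d), \qquad k = 1, 2, \ldots, 8,
\]

and the first eight $\varphi^{(k)}(d)$ are

\[
\begin{aligned}
\varphi^{(1)}(d) &= dG^{(1)}(d), \\
\varphi^{(2)}(d) &= d^2G^{(2)}(d) + dG^{(1)}(d), \\
\varphi^{(3)}(d) &= d^3G^{(3)}(d) + 3d^2G^{(2)}(d) + dG^{(1)}(d), \\
\varphi^{(4)}(d) &= d^4G^{(4)}(d) + 6d^3G^{(3)}(d) + 7d^2G^{(2)}(d) + dG^{(1)}(d), \\
\varphi^{(5)}(d) &= d^5G^{(5)}(d) + 10d^4G^{(4)}(d) + 25d^3G^{(3)}(d) + 15d^2G^{(2)}(d) + dG^{(1)}(d), \\
\varphi^{(6)}(d) &= d^6G^{(6)}(d) + 15d^5G^{(5)}(d) + 65d^4G^{(4)}(d) + 90d^3G^{(3)}(d) + 31d^2G^{(2)}(d) + dG^{(1)}(d), \\
\varphi^{(7)}(d) &= d^7G^{(7)}(d) + 21d^6G^{(6)}(d) + 140d^5G^{(5)}(d) + 350d^4G^{(4)}(d) + 301d^3G^{(3)}(d) \\
&\quad + 63d^2G^{(2)}(d) + dG^{(1)}(d), \\
\varphi^{(8)}(d) &= d^8G^{(8)}(d) + 28d^7G^{(7)}(d) + 266d^6G^{(6)}(d) + 1050d^5G^{(5)}(d) + 1701d^4G^{(4)}(d) \\
&\quad + 966d^3G^{(3)}(d) + 127d^2G^{(2)}(d) + dG^{(1)}(d).
\end{aligned}
\tag{16}
\]

Thus the evaluation of $P(d)$ in (15) involves that of $G^{(k)}$ for $k = 0, 1, \ldots, 8$.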
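The integer coefficients in (16) coincide with the Stirling numbers of the second kind $S(k, j)$, which is what one gets by applying the operator $d\,\frac{d}{dd}$ repeatedly, so that $\varphi^{(k)}(d) = \sum_{j=1}^{k} S(k, j)\, d^j G^{(j)}(d)$. This reading is an observation about the listed coefficients, not a statement made in the source. The sketch below generates the rows with the standard recurrence $S(n, j) = S(n-1, j-1) + j\,S(n-1, j)$; the function name is illustrative.

```python
def stirling2_row(k):
    """Return [S(k,1), ..., S(k,k)], Stirling numbers of the second kind."""
    row = [1]                                   # row for k = 1
    for n in range(2, k + 1):
        new = []
        for j in range(1, n + 1):
            left = row[j - 2] if j >= 2 else 0      # S(n-1, j-1)
            same = row[j - 1] if j <= n - 1 else 0  # S(n-1, j)
            new.append(left + j * same)
        row = new
    return row

# Row k lists the coefficients of d G', d^2 G'', ..., d^k G^(k) in phi^(k)(d).
for k in range(1, 9):
    print(k, stirling2_row(k))
```

Reading each printed row from right to left gives the coefficients in the order in which (16) lists them.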

4 John's probability integrals John [33] provided alternative formulas for the evaluation of the probability integral. Although the method is discussed in detail only for the bivariate case, it has wider applicability in the sense that it can be adopted to obtain the probability integral of the multivariate t distribution for any dimension.

Let $X$ be a p-variate vector having the central t distribution with degrees of freedom $\nu$ and correlation matrix $R$. Using the definition that $X$ can be represented as $(Z_1/S, Z_2/S, \ldots, Z_p/S)$, where $Z$ is a p-variate normal random vector with correlation matrix $R$ and $\nu S^2/\sigma^2$ is an independent chi-squared random variable with degrees of freedom $\nu$, one can show that the characteristic function of $X$ is

\[
E\left\{\exp\left(it^T X\right)\right\}
= E\left[E\left\{\exp\left(it^T Z/s\right) \mid S = s\right\}\right]
= \frac{1}{\Gamma(\nu/2)}\int_0^{\infty} x^{\nu/2-1}\exp\left(-x - \frac{\nu}{4x}\, t^T R\, t\right) dx.
\]

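In the case $p = 2$ with the equicorrelation structure $r_{ij} = \rho$, $i \neq j$, the above expression reduces to

\[
E\left\{\exp\left(it_1X_1 + it_2X_2\right)\right\}
= \frac{1}{\Gamma(\nu/2)}\int_0^{\infty} x^{\nu/2-1}
\sum_{i=0}^{\infty}\frac{1}{i!}\left(-\frac{\nu\rho}{2x}\right)^i t_1^i t_2^i
\exp\left\{-x - \frac{\nu\left(t_1^2 + t_2^2\right)}{4x}\right\} dx.
\]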

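By the inversion theorem, John [33] derived the corresponding joint pdf as an infinite series of one-dimensional integrals. Integrating the infinite series term by term, the probability integral becomes

\[
P(y_1, y_2; \nu, \rho) = y_{\nu,0}(y_1, y_2) + \frac{1}{2\pi}\sum_{i=1}^{\infty}\frac{\rho^i}{i!}\, y_{\nu,i}(y_1, y_2),
\]

where

\[
y_{\nu,0}(y_1, y_2) = \frac{1}{\Gamma(\nu/2)}\int_0^{\infty} x^{\nu/2-1}\exp(-x)\,
\Phi\!\left(\frac{\sqrt{2x}\, y_1}{\sqrt{\nu}}\right)\Phi\!\left(\frac{\sqrt{2x}\, y_2}{\sqrt{\nu}}\right) dx
\]

and

\[
y_{\nu,i}(y_1, y_2) = \frac{1}{\Gamma(\nu/2)}\int_0^{\infty} x^{\nu/2-1}
\exp\left[-x\left\{1 + \frac{y_1^2 + y_2^2}{\nu}\right\}\right]
H_{i-1}\!\left(\frac{\sqrt{2x}\, y_1}{\sqrt{\nu}}\right) H_{i-1}\!\left(\frac{\sqrt{2x}\, y_2}{\sqrt{\nu}}\right) dx
\]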

for $i = 1, 2, \ldots$. Here, $\Phi(\cdot)$ is the cdf of the standard normal distribution and $H_k$ denotes the Hermite polynomial of order $k$ defined by (12). John provided explicit algebraic expressions for $y_{\nu,i}$ for $i = 1, 2, \ldots, 6$. The first three of them are

\[
y_{\nu,1}(y_1, y_2) = z^{-\nu/2},
\]

\[
y_{\nu,2}(y_1, y_2) = y_1 y_2\, z^{-(\nu/2+1)},
\]

and

\[
y_{\nu,3}(y_1, y_2) = \left(1 + \frac{2}{\nu}\right) y_1^2 y_2^2\, z^{-(\nu/2+2)}
- \left(y_1^2 + y_2^2\right) z^{-(\nu/2+1)} + z^{-\nu/2},
\]

where $z = (y_1^2 + y_2^2)/\nu + 1$. In principle, explicit expressions for $y_{\nu,i}$ can be obtained for any $i \ge 1$. To evaluate $y_{\nu,0}$, the integration has to be done numerically. John tabulated values of this quantity for $\nu = 11, 12$ using Gauss' formula for a numerical quadrature (Kopal [36], p. 371). He also provided several useful recursion relations. For example, values of $y_{\nu,0}(y_1, y_2)$ for $y_1$ negative or $y_2$ negative or both negative can be found from the formulas

\[
y_{\nu,0}(y_1, y_2) = T_\nu(y_2) - y_{\nu,0}(-y_1, y_2),
\]

\[
y_{\nu,0}(y_1, y_2) = T_\nu(y_1) - y_{\nu,0}(y_1, -y_2),
\]

and

\[
y_{\nu,0}(y_1, y_2) = 1 + y_{\nu,0}(-y_1, -y_2) - T_\nu(-y_1) - T_\nu(-y_2),
\]

where $T_\nu$ is the cdf of the Student's t distribution with $\nu$ degrees of freedom.
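The closed forms for $y_{\nu,1}$, $y_{\nu,2}$ and $y_{\nu,3}$ can be checked against the defining integral. The sketch below does so with Gauss-Laguerre quadrature after the substitution $t = xz$; for even $\nu$ the integrand is then a polynomial in $t$ times $e^{-t}$, so the rule is exact up to rounding. It assumes NumPy, uses the probabilists' Hermite polynomials (which is what definition (12) gives), and the function names are illustrative.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermeval
from numpy.polynomial.laguerre import laggauss

def y_nu_i_quadrature(i, y1, y2, nu, n_nodes=20):
    """y_{nu,i}(y1, y2) for i >= 1 from its integral definition."""
    z = 1.0 + (y1**2 + y2**2) / nu
    t, w = laggauss(n_nodes)            # rule for int_0^inf exp(-t) g(t) dt
    x = t / z                           # substitution t = x * z
    coeffs = np.zeros(i)
    coeffs[i - 1] = 1.0                 # selects the Hermite polynomial H_{i-1}
    h1 = hermeval(np.sqrt(2 * x) * y1 / math.sqrt(nu), coeffs)
    h2 = hermeval(np.sqrt(2 * x) * y2 / math.sqrt(nu), coeffs)
    integrand = x ** (nu / 2 - 1) * h1 * h2 / z
    return float(np.sum(w * integrand)) / math.gamma(nu / 2)

def y_nu_i_closed(i, y1, y2, nu):
    """The three explicit expressions quoted in the text (i = 1, 2, 3)."""
    z = 1.0 + (y1**2 + y2**2) / nu
    if i == 1:
        return z ** (-nu / 2)
    if i == 2:
        return y1 * y2 * z ** (-(nu / 2 + 1))
    if i == 3:
        return ((1 + 2 / nu) * y1**2 * y2**2 * z ** (-(nu / 2 + 2))
                - (y1**2 + y2**2) * z ** (-(nu / 2 + 1)) + z ** (-nu / 2))
    raise ValueError("only i = 1, 2, 3 are implemented in this sketch")

for i in (1, 2, 3):
    print(i, y_nu_i_quadrature(i, 0.7, -0.3, 4), y_nu_i_closed(i, 0.7, -0.3, 4))
```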


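5 Amos and Bulgren's probability integrals In a widely quoted paper, Amos and Bulgren [4] derived several representations for (3) in terms of series and simple one-dimensional quadratures, together with efficient computational procedures for the special functions used in their numerical evaluation. One of the quadrature formulas given is

\[
\begin{aligned}
P ={}& \frac{1}{2\pi(\nu+1)\left(1 + \gamma_1^2 + \gamma_2^2\right)^{\nu/2}}
\int_{\theta_1}^{\theta_2} {}_2F_1\!\left(1, \frac{\nu}{2}; \frac{\nu+3}{2}; 1 - c^2\cos^2(\theta - \phi)\right) d\theta \\
&- \frac{\Gamma((\nu+1)/2)}{\sqrt{\pi}\,\Gamma(\nu/2)\left(1 + \gamma_1^2 + \gamma_2^2\right)^{\nu/2}}
\int_{\theta_1}^{\theta_2} \frac{I\{\cos(\theta - \phi) < 0\}\cos(\theta - \phi)}{\left\{1 - c^2\cos^2(\theta - \phi)\right\}^{(\nu+1)/2}}\, d\theta,
\end{aligned}
\]

where ${}_2F_1$ is the Gauss hypergeometric function, $I\{\cdot\}$ is the indicator function,

\[
c = \sqrt{\frac{\gamma_1^2 + \gamma_2^2}{1 + \gamma_1^2 + \gamma_2^2}}, \qquad
\gamma_1 = (y_2 + y_1)\sqrt{\frac{\lambda_1}{2\nu}}, \qquad
\gamma_2 = (y_2 - y_1)\sqrt{\frac{\lambda_2}{2\nu}},
\]

\[
\theta_1 = \pi - \arctan\sqrt{\frac{1+\rho}{1-\rho}}, \qquad
\theta_2 = \pi + \arctan\sqrt{\frac{1+\rho}{1-\rho}},
\]

\[
\phi = \begin{cases}
\arctan(\gamma_2/\gamma_1), & \text{if } \gamma_1 > 0, \\
\pi + \arctan(\gamma_2/\gamma_1), & \text{if } \gamma_1 < 0,
\end{cases}
\qquad
\lambda_1 = \frac{1}{1+\rho}, \quad\text{and}\quad \lambda_2 = \frac{1}{1-\rho}.
\]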


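One of the series formulas given is

\[
P = \frac{1}{2\sqrt{\pi}\,\Gamma(\nu/2)}\sum_{k=0}^{\infty}
\frac{(-c)^k}{\left(1 + \gamma_1^2 + \gamma_2^2\right)^{\nu/2}}\,
\frac{\Gamma((\nu+k)/2)}{\Gamma((1+k)/2)}
\int_{\theta_1}^{\theta_2}\cos^k(\theta - \phi)\, d\theta.
\tag{17}
\]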

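For the special case $\nu = 1$, $P$ can be reduced to the closed-form expression

\[
P = \frac{1}{\pi}\arctan\left(\frac{2v}{u^2 + v^2 - 1}\right) + I\left\{u^2 + v^2 < 1\right\},
\]

where

\[
u = \frac{2r\sin\phi}{A\left(1 + r^2 + 2r\cos\phi\right)}, \qquad
v = \frac{1 - r^2}{A\left(1 + r^2 + 2r\cos\phi\right)},
\]

\[
r = \frac{\sqrt{\gamma_1^2 + \gamma_2^2}}{1 + \sqrt{1 + \gamma_1^2 + \gamma_2^2}},
\]

and

\[
A = \tan\left(\frac{\theta_2 - \pi}{2}\right).
\]

If in addition $\rho = 0$, then the expression for $P$ reduces further to

\[
P = \frac{1}{2\pi}\left\{\arctan\left(\frac{y_1 y_2}{\sqrt{1 + y_1^2 + y_2^2}}\right) + \arctan y_1 + \arctan y_2 + \frac{\pi}{2}\right\}.
\]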
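The closed form for $\nu = 1$, $\rho = 0$ is easy to sanity-check: at the origin it must give the orthant probability $1/4$, and letting $y_2 \to \infty$ must recover the univariate Cauchy cdf $1/2 + \arctan(y_1)/\pi$. A minimal sketch (the function name `cauchy2_cdf` is illustrative):

```python
import math

def cauchy2_cdf(y1, y2):
    """P(X1 <= y1, X2 <= y2) for the central bivariate t with nu = 1, rho = 0."""
    joint = math.atan(y1 * y2 / math.sqrt(1 + y1 * y1 + y2 * y2))
    return (joint + math.atan(y1) + math.atan(y2) + math.pi / 2) / (2 * math.pi)

print(cauchy2_cdf(0.0, 0.0))                                   # orthant probability
print(cauchy2_cdf(0.8, 1e9), 0.5 + math.atan(0.8) / math.pi)   # marginal limit
```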

The advantage of these expressions over the ones given by Dunnett and Sobel [14] is that these are easier to compute, especially for large degrees of freedom. For instance, the integral in $\theta$ in (17) can be expressed in terms of incomplete beta functions that are extensively tabulated. Amos and Bulgren [4] numerically evaluated values of $P$ for all combinations of $\rho = -0.9, -0.5, 0, 0.5, 0.9$ and $\nu = 1, 2, 5, 10, 25, 50$.


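6 Steffens' noncentral probabilities Consider the p-variate noncentral t distribution defined by

\[
\begin{aligned}
f(x) ={}& \exp\left\{-\frac{1}{2}\xi^T R\,\xi\right\}
\frac{\Gamma((\nu+p)/2)}{(\nu\pi)^{p/2}\,\Gamma(\nu/2)\,|R|^{1/2}}
\left\{1 + \frac{1}{\nu}x^T R^{-1} x\right\}^{-(\nu+p)/2} \\
&\times \sum_{k=0}^{\infty}\frac{\Gamma((\nu+p+k)/2)}{k!\,\Gamma((\nu+p)/2)}
\left\{\frac{\sqrt{2}\, x^T R^{-1}\xi}{\sqrt{\nu + x^T R^{-1} x}}\right\}^k,
\end{aligned}
\tag{18}
\]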

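where $\xi = \mu/\sigma$ (this reduces to the standard p-variate t distribution when $\mu = 0$). Motivated by the Studentized maximum and minimum modulus tests, Steffens [61] studied the particular case for $p = 2$ and $R = I_p$. In this case, the joint pdf (18) reduces to

\[
f(x_1, x_2) = \exp\left(-\frac{\xi_1^2 + \xi_2^2}{2}\right)\frac{1}{\pi\,\Gamma(\nu/2)}
\sum_{k=0}^{\infty}\sum_{l=0}^{\infty}
\frac{\Gamma((\nu+k+l)/2 + 1)}{k!\, l!\, \nu^{(k+l)/2+1}}
\cdot\frac{\left(\sqrt{2}\,\xi_1 x_1\right)^k\left(\sqrt{2}\,\xi_2 x_2\right)^l}{\left(1 + \frac{x_1^2}{\nu} + \frac{x_2^2}{\nu}\right)^{(\nu+k+l+2)/2}},
\]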

where $\xi_j = \mu_j/\sigma$ are the noncentrality parameters and $\nu$ denotes the degrees of freedom. The testing procedures involve maximum or minimum values of the components $X_1$ and $X_2$ and the computation of the corresponding probabilities. For this reason, Steffens [61] derived series representations for probabilities of the form

\[
P_1 = \Pr(|X_1| \le A,\ |X_2| \le A) \quad\text{and}\quad P_2 = \Pr(|X_1| > A,\ |X_2| > A).
\]

It is seen that

\[
P_1 = 2\exp\left(-\frac{\xi_1^2 + \xi_2^2}{2}\right)\sum_{k=0}^{\infty}\sum_{l=0}^{\infty}
\frac{\left(\xi_1^2/2\right)^k\left(\xi_2^2/2\right)^l}{k!\, l!\, B(k + 1/2,\, l + 1/2)}
\int_0^{\pi/4}\left(\sin^{2k} v\cos^{2l} v + \sin^{2l} v\cos^{2k} v\right) I_\alpha\!\left(k + l + 1, \frac{\nu}{2}\right) dv
\]

and

\[
P_2 = 2\exp\left(-\frac{\xi_1^2 + \xi_2^2}{2}\right)\sum_{k=0}^{\infty}\sum_{l=0}^{\infty}
\frac{\left(\xi_1^2/2\right)^k\left(\xi_2^2/2\right)^l}{k!\, l!\, B(k + 1/2,\, l + 1/2)}
\int_0^{\pi/4}\left(\sin^{2k} v\cos^{2l} v + \sin^{2l} v\cos^{2k} v\right)\left\{1 - I_\beta\!\left(k + l + 1, \frac{\nu}{2}\right)\right\} dv,
\]

where $I_x$ denotes the incomplete beta function ratio and

\[
\alpha = \frac{A^2\sec^2 v}{\nu + A^2\sec^2 v} \quad\text{and}\quad \beta = \frac{A^2\operatorname{cosec}^2 v}{\nu + A^2\operatorname{cosec}^2 v}.
\]

Using these representations, Steffens estimated values of the critical points $A$ for all combinations of $\nu = 1, 2, 5, 10, 20, 50, \infty$ and $\xi_1, \xi_2 = 0(1)5$ for the significance level 0.05. In a more recent development, Bohrer et al. [8] developed a flexible algorithm to compute probabilities of the form $\Pr(c_{11} \le X_1 \le c_{21}, \ldots, c_{1p} \le X_p \le c_{2p})$ associated with the noncentral p-variate t distribution (18).

7 Dutt's probability integrals Dutt [16] obtained a Fourier transform representation for the probability integral of a central p-variate t distribution with degrees of freedom $\nu$ and correlation matrix $R$:

\[
P(y_1, \ldots, y_p) = \int_{-\infty}^{y_1}\cdots\int_{-\infty}^{y_p} f(x_1, \ldots, x_p; \nu)\, dx_p \cdots dx_1.
\tag{19}
\]

Using the definition of multivariate t, one can rewrite (19) as

\[
P(y_1, \ldots, y_p) = \frac{2}{2^{\nu/2}\Gamma(\nu/2)}\int_0^{\infty} z^{\nu-1}\exp\left(-z^2/2\right) G(\bar{y}_1, \ldots, \bar{y}_p)\, dz,
\tag{20}
\]

where $\bar{y}_k = y_k z/\sqrt{\nu}$, $k = 1, \ldots, p$, and $G$ is the joint cdf of the multivariate normal distribution with zero means and correlation matrix $R$. In the case $y_k = 0$ for all $k$, one has $P$ independent of $\nu$ and

\[
P(0, \ldots, 0) = G(0, \ldots, 0).
\]
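Representation (20) is a one-dimensional mixture of normal probabilities over a chi-distributed scale. The sketch below evaluates it with a plain midpoint rule in the special case $R = I_p$, where $G$ factorizes into a product of univariate normal cdfs. That restriction, the midpoint rule, the truncation point `z_max`, and the function names are all simplifying assumptions of this illustration and not part of Dutt's method, which handles general $R$ through the D-functions.

```python
import math

def normal_cdf(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def mvt_cdf_identity(y, nu, n=20000):
    """P(y_1, ..., y_p) from (20), assuming R = I_p so that G is a product."""
    z_max = math.sqrt(nu) + 12.0        # the chi density is negligible beyond this
    h = z_max / n
    log_const = math.log(2) - (nu / 2) * math.log(2) - math.lgamma(nu / 2)
    total = 0.0
    for i in range(n):
        z = (i + 0.5) * h               # midpoint rule avoids the endpoint z = 0
        weight = math.exp(log_const + (nu - 1) * math.log(z) - 0.5 * z * z)
        G = 1.0
        for yk in y:
            G *= normal_cdf(yk * z / math.sqrt(nu))
        total += weight * G
    return total * h

print(mvt_cdf_identity([0.0, 0.0, 0.0], 4))   # orthant probability, free of nu
print(mvt_cdf_identity([1.0], 1))             # univariate Cauchy cdf at 1
```

For $p = 2$ and $\nu = 1$ the result can also be compared with the closed form $\frac{1}{2\pi}\{\arctan(y_1y_2/\sqrt{1+y_1^2+y_2^2}) + \arctan y_1 + \arctan y_2 + \pi/2\}$ for the bivariate Cauchy case with $\rho = 0$.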


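Explicit forms of $G$ for $p = 2, 3, 4$ in terms of the D-functions are given in Dutt [15]. The D-functions are integral forms over $(-\infty, \infty)$ defined by

\[
D_k(t_1, \ldots, t_p; R) = \frac{\left|i^k\right|}{(2\pi)^k}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}
\frac{d_k}{s_1\cdots s_k}\exp\left(i\sum_{l=1}^{k} t_l s_l - \sum_{l=1}^{k} s_l^2/2\right) ds_k\cdots ds_1,
\]

where the first five $d_k$ are

\[
\begin{aligned}
d_1 ={}& 1, \\
d_2 ={}& d_{12}, \\
d_3 ={}& d_{12+13+23} - (d_{12} + d_{13} + d_{23}), \\
d_4 ={}& -d_{12+13+23+14+24+34} + d_{12+13+23} + d_{12+14+24} + d_{13+14+34} + d_{23+24+34} \\
& - (d_{12} + d_{13} + d_{23} + d_{14} + d_{24} + d_{34}), \\
d_5 ={}& -d_{12+13+23+14+24+34+15+25+35+45} + d_{12+13+23+14+24+34} \\
& + d_{12+13+23+15+25+35} + d_{12+14+24+15+25+45} \\
& + d_{13+14+34+15+35+45} + d_{23+24+34+25+35+45} \\
& - \left(d_{12+13+23} + d_{12+14+24} + d_{12+15+25} + d_{13+14+34} + d_{13+15+35} + d_{14+15+45} \right. \\
& \left. \quad + d_{23+24+34} + d_{23+25+35} + d_{24+25+45} + d_{34+35+45}\right) + d_{12} + d_{13} + \cdots + d_{45},
\end{aligned}
\]

and

\[
d_{p_1q_1+\cdots+p_mq_m} = 1 - \exp\left\{-\left(r_{p_1q_1}s_{p_1}s_{q_1} + \cdots + r_{p_mq_m}s_{p_m}s_{q_m}\right)\right\}.
\]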

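Using the notation

\[
D_{k:j_1,\ldots,j_k} = D_k\left\{t_{j_1}, \ldots, t_{j_k}; R(t_{j_1}, \ldots, t_{j_k})\right\},
\]

where $R(t_{j_1}, \ldots, t_{j_k})$ is the correlation matrix based on the subscripts $j_1, \ldots, j_k$, Dutt [15] provided the following explicit forms for $G$:

\[
G(t_1, t_2) = \{1 - \Phi(t_1)\}\{1 - \Phi(t_2)\} + D_{2:1,2},
\]

\[
\begin{aligned}
G(t_1, t_2, t_3) ={}& \{1 - \Phi(t_1)\}\{1 - \Phi(t_2)\}\{1 - \Phi(t_3)\} \\
&+ \{1 - \Phi(t_1)\}D_{2:2,3} + \{1 - \Phi(t_2)\}D_{2:1,3} + \{1 - \Phi(t_3)\}D_{2:1,2} + D_{3:1,2,3},
\end{aligned}
\]

and

\[
\begin{aligned}
G(t_1, t_2, t_3, t_4) ={}& \prod_{k=1}^{4}\{1 - \Phi(t_k)\} + \{1 - \Phi(t_1)\}\{1 - \Phi(t_2)\}D_{2:3,4} \\
&+ \{1 - \Phi(t_1)\}\{1 - \Phi(t_3)\}D_{2:2,4} + \{1 - \Phi(t_2)\}\{1 - \Phi(t_3)\}D_{2:1,4} \\
&+ \{1 - \Phi(t_1)\}\{1 - \Phi(t_4)\}D_{2:2,3} + \{1 - \Phi(t_2)\}\{1 - \Phi(t_4)\}D_{2:1,3} \\
&+ \{1 - \Phi(t_3)\}\{1 - \Phi(t_4)\}D_{2:1,2} \\
&+ \{1 - \Phi(t_1)\}D_{3:2,3,4} + \{1 - \Phi(t_2)\}D_{3:1,3,4} \\
&+ \{1 - \Phi(t_3)\}D_{3:1,2,4} + \{1 - \Phi(t_4)\}D_{3:1,2,3} + D_{4:1,2,3,4}.
\end{aligned}
\]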

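A much simplified representation for $G$ in terms of the error function, $\operatorname{erf}(\cdot)$, and integral forms over $(0, \infty)$, denoted as the $D^*$-functions, is given in a later paper by Dutt [16]. These $D^*$-functions are defined by

\[
D^*_k(t_1, \ldots, t_p; R) = \frac{2}{(2\pi)^k}\int_0^{\infty}\cdots\int_0^{\infty}
\frac{d^*_k}{s_1\cdots s_k}\exp\left(-\sum_{l=1}^{k} s_l^2/2\right) ds_k\cdots ds_1,
\tag{21}
\]

where the first few $d^*_k$ are

\[
\begin{aligned}
d^*_1 ={}& \sin(t_1 s_1), \\
d^*_2 ={}& e_{-12}\cos_{1-2} - e_{12}\cos_{1+2}, \\
d^*_4 ={}& e_{12+13+23+14+24+34}\cos_{1+2+3+4} + e_{12-13-23-14-24+34}\cos_{-1-2+3+4} \\
&+ e_{-12+13-23-14+24-34}\cos_{-1+2-3+4} + e_{-12-13+23+14-24-34}\cos_{1-2-3+4} \\
&- e_{-12-13+23-14+24+34}\cos_{-1+2+3+4} - e_{-12+13-23+14-24+34}\cos_{1-2+3+4} \\
&- e_{12-13-23+14+24-34}\cos_{1+2-3+4} - e_{12+13+23-14-24-34}\cos_{1+2+3-4},
\end{aligned}
\]

with the notation

\[
\begin{aligned}
e_{p_1q_1+\cdots+p_mq_m} &= \exp\left\{-\left(r_{p_1q_1}s_{p_1}s_{q_1} + \cdots + r_{p_mq_m}s_{p_m}s_{q_m}\right)\right\}, \\
\sin_{p_1+\cdots+p_m} &= \sin\left(t_{p_1}s_{p_1} + \cdots + t_{p_m}s_{p_m}\right), \\
\cos_{p_1+\cdots+p_m} &= \cos\left(t_{p_1}s_{p_1} + \cdots + t_{p_m}s_{p_m}\right).
\end{aligned}
\]

(A negative sign on the index $p_1q_1$ corresponds to $+r_{p_1q_1}s_{p_1}s_{q_1}$, and $-p_1$ corresponds to $-t_{p_1}s_{p_1}$.) Important special cases of these functions are

\[
D^*_1(y) = \frac{1}{2}\operatorname{erf}\left(\frac{y}{\sqrt{2}}\right),
\]

\[
D^*_2(0, 0; R) = \frac{1}{2\pi}\arcsin(r_{12}),
\]

and

\[
D^*_k(0; R) \equiv 0 \quad\text{for } k \text{ odd}.
\]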

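Using the abbreviation that

\[
D^*_{k:j_1,\ldots,j_k} = D^*_k\left\{t_{j_1}, \ldots, t_{j_k}; R(t_{j_1}, \ldots, t_{j_k})\right\},
\]

Dutt [16] provided the following representation for $G$:

\[
G(t_1, \ldots, t_p) = \left(\frac{1}{2}\right)^p - \left(\frac{1}{2}\right)^{p-1}\sum_{k=1}^{p} D^*_{1:k}
+ \left(\frac{1}{2}\right)^{p-2}\sum_{k<l=1}^{p} D^*_{2:kl}
+ \left(\frac{1}{2}\right)^{p-3}\sum_{k<l<m=1}^{p} D^*_{3:klm} + \cdots + D^*_{p:1,\ldots,p}.
\]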

Hence, by (20), the computation of $P$ in (19) can be achieved by successive applications of the Gauss-Hermite quadrature formula using only positive Hermite zeros (Abramowitz and Stegun [1], page 924). There are several advantages to this approach. First, it is not necessary to invert the correlation matrix. In addition, (21) permits the use of Gauss quadrature formulas that are remarkably effective in estimating the value of an integral from a few points, provided that the integral excluding the weighting function can be accurately approximated by a polynomial. Moreover, often the integrand separates as a product of two functions, one depending only on correlation coefficients and the other on the original limits of integration.

For selected correlation structures and several values of $\nu$ and $y = y_k$, $k = 1, \ldots, p$, Dutt [16] computed values of $P$ accurate up to six decimal places.

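8 Amos' probability integral For the equicorrelation structure $r_{ij} = \rho$, $i \neq j$, considered by Gupta and Sobel [29] and Gupta [27], but with the common $\rho$ taken to be any positive real number less than 1, Amos [3] derived the following simpler expression for the probability integral:

\[
P(d) = \frac{2^{\nu-3/2}\,\Gamma((\nu+1)/2)}{\sqrt{\pi}\left(1 + b^2\right)^{\nu/2}}
\int_{-\infty}^{\infty}\exp\left(-\frac{dx^2}{2}\right)\Phi^p(x)\operatorname{erfc}\left(-\frac{cx}{\sqrt{2}}\right) dx,
\tag{22}
\]

where $\operatorname{erfc}(\cdot)$ is the complementary error function defined by

\[
\operatorname{erfc}(x) = \frac{2}{\sqrt{\pi}}\int_x^{\infty}\exp\left(-z^2\right) dz,
\]

and $a$, $b$, $c$ and $d$ are constants given by

\[
a = \sqrt{\frac{1-\rho}{\rho}}, \qquad b = \frac{d}{\sqrt{\rho\nu}}, \qquad
c = \frac{ab}{\sqrt{1 + b^2}}, \qquad d = \frac{a^2}{1 + b^2}.
\]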

The reduction to (22) was obtained by means of a relationship between the parabolic cylinder function and the complementary error function. Amos [3] suggested computing the integral (22) by locating the $x_0$ for which the derivative of the integrand is zero and then summing quadratures on intervals of length $h$ to the left and right of $x_0$ until a limit of integration is reached or the truncation error is small enough. The motivation for this procedure comes from the fact that $x_0$ can vary widely with extreme parameter values, and $h$, which estimates the spread of the integrand, can be small or large. Thus, $x_0$ and $h$ accommodate the parameters, producing meaningful results by preventing quadratures over tails that are negligible or preventing gross misjudgments of the scale of integration. Letting $g(x)$ denote the integrand of (22), Amos [3] showed that the derivative of $\log g(x)$ decreases monotonically from $\infty$ to $-\infty$ as $x$ traverses $(-\infty, \infty)$, guaranteeing a unique root $x_0$ of $g'(x) = 0$.
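The mode-centered strategy described above can be sketched as follows for an integrand of the form $g(x) = \exp(-dx^2/2)\,\Phi^p(x)\operatorname{erfc}(-cx/\sqrt{2})$. Several details are assumptions of this sketch rather than Amos' choices: the mode is found by bisection on a finite-difference slope of $\log g$, the panel length $h$ is taken as $1/\sqrt{-(\log g)''(x_0)}$, each panel uses composite Simpson's rule, and the constants $p$, $c$, $d$ are passed in directly as hypothetical values instead of being derived from $\rho$ and $\nu$.

```python
import math

SQRT2 = math.sqrt(2.0)

def log_g(x, p, c, d):
    """Log of exp(-d x^2 / 2) * Phi(x)^p * erfc(-c x / sqrt(2))."""
    log_Phi = math.log(0.5 * math.erfc(-x / SQRT2))
    return -0.5 * d * x * x + p * log_Phi + math.log(math.erfc(-c * x / SQRT2))

def find_mode(p, c, d, lo=-20.0, hi=20.0, eps=1e-6):
    """Bisection on the sign of (log g)', which decreases monotonically."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        slope = (log_g(mid + eps, p, c, d) - log_g(mid - eps, p, c, d)) / (2 * eps)
        if slope > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def simpson(f, a, b, n=40):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    step = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * step)
    return s * step / 3

def integrate_outward(p, c, d, tol=1e-13, max_panels=400):
    """Sum panels of length h to the right and left of the mode x0."""
    g = lambda x: math.exp(log_g(x, p, c, d))
    x0 = find_mode(p, c, d)
    e = 1e-3
    curvature = (log_g(x0 + e, p, c, d) - 2 * log_g(x0, p, c, d)
                 + log_g(x0 - e, p, c, d)) / e**2
    h = 1.0 / math.sqrt(-curvature)     # assumed estimate of the spread
    total = 0.0
    for sign in (1, -1):
        for k in range(max_panels):
            a = x0 + sign * k * h
            b = a + sign * h
            part = simpson(g, min(a, b), max(a, b))
            total += part
            if part < tol * total:      # stop once the tail panel is negligible
                break
    return x0, h, total

print(integrate_outward(3, 0.5, 0.8))
```

Stopping on the relative size of the latest panel plays the role of the "truncation error is small enough" criterion; a finite limit of integration would simply cap the panel loop.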

9 Fujikoshi's probability integrals Fujikoshi [19] provided asymptotic expansions as well as error bounds for the probability integral (19) when the correlation matrix $R = I_p$, the $p \times p$ identity matrix. Specifically, letting

\[
a_{\delta,j}(y_1, \ldots, y_p) = \left.\frac{d^j}{ds^j}\left\{\Phi\left(s^{-\delta/2}y_1\right)\cdots\Phi\left(s^{-\delta/2}y_p\right)\right\}\right|_{s=1},
\]

where $\delta = -1, 1$, and $\Phi$ denotes the cdf of the standard normal distribution, Fujikoshi established the following approximation for the probability integral:

\[
P(y_1, \ldots, y_p) = \Phi(y_1)\cdots\Phi(y_p)
+ \sum_{j=1}^{k-1}\frac{1}{j!}\, a_{\delta,j}(y_1, \ldots, y_p)\,
E\left[\left\{\left(\frac{\chi_\nu^2}{\nu}\right)^{\delta} - 1\right\}^j\right],
\tag{23}
\]

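which we shall denote by $A_{\delta,k}(y_1, \ldots, y_p)$. Fujikoshi also derived uniform and nonuniform error bounds for this approximation. Under the assumptions that

\[
\bar{a}_{\delta,k} = \sup_y\left|a_{\delta,k}(y_1, \ldots, y_p)\right| < \infty,
\]

and

\[
E\left(\frac{\chi_\nu^2}{\nu}\right)^k < \infty, \qquad E\left(\frac{\nu}{\chi_\nu^2}\right)^k < \infty,
\]

the uniform bound takes the form

\[
\sup_y\left|P(y_1, \ldots, y_p) - A_{\delta,k}(y_1, \ldots, y_p)\right|
\le \frac{1}{k!}\,\bar{a}_{\delta,k}\,
E\left[\left\{\left(\frac{\chi_\nu^2}{\nu}\right)\vee\left(\frac{\nu}{\chi_\nu^2}\right) - 1\right\}^k\right].
\]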

Under the assumptions that

\[
\bar{a}_{\delta,k}(l) = \sup_y\left(1 + \|y\|^l\right)\left|a_{\delta,k}(y_1, \ldots, y_p)\right| < \infty
\]

and

\[
E\left(\frac{\chi_\nu^2}{\nu}\right)^{k+l/2} < \infty, \qquad E\left(\frac{\nu}{\chi_\nu^2}\right)^k < \infty,
\]

the nonuniform bound takes the form

\[
\left|P(y_1, \ldots, y_p) - A_{\delta,k}(y_1, \ldots, y_p)\right|
\le \frac{1}{k!}\left(1 + \|y\|^l\right)^{-1}\bar{a}_{\delta,k}(l)\,
E\left[\left(\frac{\chi_\nu^2}{\nu}\right)^{l/2}\left|\frac{\chi_\nu^2}{\nu} - 1\right|^k + \left|\frac{\nu}{\chi_\nu^2}\right|^k\right].
\]

Clearly the latter bounds are improvements on the uniform bounds in the tail part of the multivariate t distribution. In the case $p = 1$, these results provide useful approximations for the univariate Student's t distribution; see Fujikoshi [18] and Fujikoshi and Shimizu [22]. The special case of (23) for $y_j = y$ has been investigated more recently by Fujikoshi [19, 20, 21], Fujikoshi and Shimizu [23], and Shimizu and Fujikoshi [48].


FIGURE 1: The sets $A_r(c)$ and $E(c) \cap \{\|z\| = r\}$ in two dimensions.

10 Probabilities of cone Consider the p-dimensional set

\[
A_r(c) = \left\{x : z^T x \le r\|z\|, \text{ all } z \text{ in } E(c)\right\},
\tag{24}
\]

where $E(c) = \{z : z_1 \ge c\|z\|\}$, $\|z\| = \sqrt{z^T z}$, and $c$ is a nonnegative constant. The set $E(c)$ is the cone, with vertex at the origin, which intersects origin-centered spheres in spherical caps. This is illustrated in Figure 1 for $p = 2$.

Bohrer [6] studied the analytical shape of $A_r(c)$ and the associated probability

\[
p(c, r, p, \nu) = \Pr\left(X \in A_r(c)\right)
\]

when $X$ has the p-variate t distribution with mean vector 0, covariance matrix $\sigma^2 I_p$, and degrees of freedom $\nu$. The evaluation of $p(c, r, p, \nu)$ is of statistical interest and use in the construction of confidence bounds (Wynn and Bloomfield [68], Section 3; Bohrer and Francis [7], equation (2.3)) and in testing multivariate hypotheses (Kudo [39], Theorem 3.1, Section 5; Barlow et al. [5], pages 136ff, 177).


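As regards the shape, Bohrer showed that every two-dimensional section of $A_r$ containing the $z_1$-axis is exactly the two-dimensional version of $A_r$ illustrated in Figure 1. Thus, $A_r$ is the solid of revolution about the $z_1$-axis that is swept out by the $A_r$ in Figure 1. To express this more precisely in mathematical terms, for a $p \times 1$ vector $v$ define polar coordinates $R_v$ and $\theta_v = \{\theta_{vi}\}$, with $-\pi < \theta_{vi} \le \pi$, by

\[
v_1 = R_v\cos\theta_{v1},
\]

\[
v_i = R_v\cos\theta_{vi}\prod_{j=1}^{i-1}\sin\theta_{vj}, \qquad i = 2, \ldots, p - 1,
\]

and

\[
v_p = R_v\prod_{j=1}^{p-1}\sin\theta_{vj}.
\]

Also define

\[
\begin{aligned}
\theta^* &= \arccos c, \\
T_1 &= \left\{x : |\theta_{x1}| \le \theta^*,\ R_x \le r\right\}, \\
T_2 &= \left\{x : \theta_{x1} - \theta^* \in \left(0, \frac{\pi}{2}\right],\ R_x\cos\left(\theta_{x1} - \theta^*\right) \le r\right\}, \\
T_3 &= \left\{x : \theta_{x1} + \theta^* \in \left[-\frac{\pi}{2}, 0\right),\ R_x\cos\left(\theta_{x1} + \theta^*\right) \le r\right\},
\end{aligned}
\]

and

\[
T_4 = \left\{x : |\theta_{x1}| > \theta^* + \frac{\pi}{2}\right\}.
\]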

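Then the set $A_r$ is the union of the disjoint sets $T_1, \ldots, T_4$. As regards evaluating the probability $p(c, r, p, \nu)$, Bohrer [6] derived the following expression:

\[
\begin{aligned}
p(c, r, p, \nu) ={}& \frac{k(\theta^*)}{k(\pi/2)}\Pr\left(F_{p,\nu} \le \frac{r^2}{\sigma^2}\right) + \frac{k(\pi/2 - \theta^*)}{k(\pi/2)} \\
&+ \frac{1}{2k(\pi/2)}\sum_{j=0}^{p-2} B\left(\frac{j+1}{2}, \frac{p-j-1}{2}\right)
\binom{p-2}{j} c^j\left(1 - c^2\right)^{(p-2-j)/2}\Pr\left(F_{p-2-j,\nu} \le \frac{r^2}{\sigma^2}\right),
\end{aligned}
\]

where $k(\theta)$ is given by

\[
k(\theta) = \frac{(2^m m!)^2}{p!}(1 - \cos\theta) - \frac{\cos\theta\sin^{2m}\theta}{p}
- \sum_{l=1}^{m-1}\frac{\sin^{2(m-l)}\theta\cos\theta}{p - 2l}\prod_{j=1}^{l}\frac{p - 2j}{p + 1 - 2j}
\]

when $p = 2m + 1$ is odd, and by

\[
k(\theta) = \frac{(p-1)!\,\theta}{2^{p-1}(m-1)!\,m!} - \frac{\sin^{p-1-2m}\theta\cos\theta}{p}
- \sum_{l=1}^{m-1}\frac{\sin^{p-1-2l}\theta\cos\theta}{p - 2l}\prod_{j=1}^{l}\frac{p + 1 - 2j}{p + 2 - 2j}
\]

when $p = 2m$ is even. The statistical questions that motivate this work ask what radius $r$ is required so that $p(c, r, p, \nu) = \alpha$ for preassigned values of $\alpha$. For $p \le 5$, Bohrer [6] provided tables of these percentiles for $\alpha = 0.95$ and $0.99$ and for a range of $(c, \nu)$ pairs.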

11 Probabilities of convex polyhedra It is well known (Nichol-son [41]; Cadwell [10]; Owen [42]) that probabilities of polygons underbivariate normal distributions can be evaluated in terms of probabilitiesof right-angled triangles with vertices (0, 0), (y1, 0), (y1, y2), yj > 0,j = 1, 2 under bivariate normal distributions with zero correlation.John [34] proved an analogous result that probabilities of polygonaland angular regions for a given bivariate t distribution can be expressedin terms of Vν(y1, y2), the integral of

$$f(x_1, x_2; \nu) = \frac{\Gamma((\nu + 2)/2)}{\nu\pi\,\Gamma(\nu/2)} \left[1 + \frac{x_1^2 + x_2^2}{\nu}\right]^{-(\nu+2)/2}$$

over the right-angled triangle with vertices $(0, 0)$, $(y_1, 0)$, and $(y_1, y_2)$. John [34] also provided several formulas for evaluating $V_\nu(y_1, y_2)$. A formula in terms of the incomplete beta function is

$$V_\nu(y_1, y_2) = \frac{1}{2\pi} \arctan\left(\frac{y_2}{y_1}\right) - \frac{y_1 c^{\nu/2}}{4\pi\sqrt{\nu + y_1^2}} \sum_{k=0}^{\infty} c^k B_u\left(\frac{\nu}{2} + k + \frac{1}{2}, \frac{1}{2}\right), \tag{25}$$

where

$$c = \frac{\nu}{\nu + y_1^2}, \qquad u = \frac{\nu + y_1^2}{\nu + y_1^2 + y_2^2},$$

and

$$B_x(a, b) = \int_x^1 w^{a-1}(1 - w)^{b-1} \, dw$$

is the incomplete beta function. This series converges slowly unless $y_1$ is large in relation to $\nu$. In the two cases $\nu$ odd and $\nu$ even, (25) can be reduced considerably. If $\nu = 2m$ for a positive integer $m$, then

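$$V_{2m}(y_1, y_2) = \frac{\sqrt{1 - c}}{4\pi} \sum_{k=0}^{m-1} c^k B_u\left(k + \frac{1}{2}, \frac{1}{2}\right), \tag{26}$$

while if $\nu = 2m + 1$ for a nonnegative integer $m$, then

$$V_{2m+1}(y_1, y_2) = \frac{1}{2\pi} \arctan\left(\frac{y_2}{y_1}\right) - \frac{1}{4\pi} B_v\left(\frac{1}{2}, \frac{1}{2}\right) + \frac{\sqrt{c(1 - c)}}{4\pi} \sum_{k=0}^{m-1} c^k B_u\left(k + 1, \frac{1}{2}\right), \tag{27}$$

where

$$v = \frac{\nu\left(\nu + y_1^2 + y_2^2\right)}{\nu\left(\nu + y_1^2 + y_2^2\right) + y_1^2 y_2^2}.$$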

An attractive feature of (26) and (27) is that evaluating $V_{2m}$ or $V_{2m+1}$ also produces the values for all lower $m$: if one performs the summations in the order indicated in the formulas, the addition of each term yields the value of $V_{2m}$ or $V_{2m+1}$ for the next higher value of $m$. This feature makes the formulas particularly suitable for use in preparing tables.
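As an illustration of (26), the following minimal Python sketch evaluates $V_{2m}(y_1, y_2)$ using only the standard library. The function names are hypothetical and not from John [34]. The incomplete beta function $B_u(a, 1/2)$ is computed by Simpson's rule after the substitution $w = 1 - s^2$, which is an implementation choice made here and not part of the source. For $m = 1$ the formula reduces to $V_2 = \frac{y_1}{2\pi\sqrt{2 + y_1^2}} \arctan\left(y_2/\sqrt{2 + y_1^2}\right)$, because $B_u(1/2, 1/2) = 2\arccos\sqrt{u}$; this gives a convenient check.

```python
import math


def upper_incomplete_beta_half(a, x, n=2000):
    """B_x(a, 1/2) = integral from x to 1 of w^(a-1) (1-w)^(-1/2) dw.

    With w = 1 - s^2 this equals 2 * integral_0^sqrt(1-x) (1 - s^2)^(a-1) ds,
    which has no endpoint singularity for x > 0. Composite Simpson's rule
    with n (even) subintervals is used; this is an illustrative choice.
    """
    upper = math.sqrt(max(0.0, 1.0 - x))
    if upper == 0.0:
        return 0.0
    h = upper / n
    total = 0.0
    for i in range(n + 1):
        s = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * (1.0 - s * s) ** (a - 1.0)
    return 2.0 * total * h / 3.0


def v_2m(y1, y2, m):
    """V_nu(y1, y2) for even nu = 2m, following formula (26)."""
    nu = 2 * m
    c = nu / (nu + y1 * y1)
    u = (nu + y1 * y1) / (nu + y1 * y1 + y2 * y2)
    series = sum(c ** k * upper_incomplete_beta_half(k + 0.5, u) for k in range(m))
    return math.sqrt(1.0 - c) / (4.0 * math.pi) * series


def v_2_closed_form(y1, y2):
    """Closed form of (26) for m = 1, used only as a cross-check."""
    root = math.sqrt(2.0 + y1 * y1)
    return y1 / (2.0 * math.pi * root) * math.atan(y2 / root)


if __name__ == "__main__":
    print(v_2m(1.0, 1.0, 1), v_2_closed_form(1.0, 1.0))
```

Since the density is symmetric in $x_1$ and $x_2$, the probability of the square $|x_1| \le 1$, $|x_2| \le 1$ for $\nu = 2$ is $8 V_2(1, 1)$ under this formula.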


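A second formula for $V_\nu(y_1, y_2)$ given in John [34] is an expansion in powers of $1/\nu$:

$$V_\nu(y_1, y_2) = V_\infty(y_1, y_2) - \frac{\exp\left(-y_1^2/2\right)}{2\pi} \sum_{k=1}^{\infty} \frac{\nu^{-k}}{k!} U_k(y_1, y_2), \tag{28}$$

where the first three $U_k$ are given by

$$U_1(y_1, y_2) = \frac{y_1^4}{4} W_1(y_1, y_2),$$

$$U_2(y_1, y_2) = y_1^6 \left\{ -\frac{1}{3} W_2(y_1, y_2) + \frac{y_1^2}{16} W_3(y_1, y_2) \right\},$$

and

$$U_3(y_1, y_2) = y_1^8 \left\{ \frac{3}{4} W_3(y_1, y_2) - \frac{y_1^2}{4} W_4(y_1, y_2) + \frac{y_1^4}{64} W_5(y_1, y_2) \right\},$$

where

$$W_\nu(y_1, y_2) = \int_0^{y_2/y_1} \left(1 + t^2\right)^{\nu} \exp\left(-\frac{y_1^2 t^2}{2}\right) dt.$$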

The term $V_\infty$ in (28) is the integral of $\exp\{-(x_1^2 + x_2^2)/2\}/(2\pi)$ over the right-angled triangle with vertices $(0, 0)$, $(y_1, 0)$, and $(y_1, y_2)$. The method of derivation for (28) is similar to the classical method employed by Fisher [17] for expanding the probability integral of Student's t. Although (28) is more complex than (25), (28) should be preferred if $\nu$ is sufficiently large. The first two or three terms of (28) can then be expected to provide fairly accurate values of $V_\nu$.

John [34] also provided a recurrence relation and an approximation for $V_\nu(y_1, y_2)$; the latter proved to be satisfactory only when either $\nu$ is too small or $y_2/y_1$ is too large. In a subsequent paper, John [35] extended this result to higher dimensions by showing that the probabilities of the p-dimensional convex polyhedra with vertices $(0, 0, 0, 0, \ldots, 0)$, $(y_1, 0, 0, 0, \ldots, 0)$, $(y_1, y_2, 0, 0, \ldots, 0)$, $\ldots$, $(y_1, y_2, y_3, y_4, \ldots, y_p)$, $y_j > 0$, $j = 1, 2, \ldots, p$, under a p-variate t distribution with $\nu$ degrees of freedom can be expressed in terms of the function $V_\nu(y_1, y_2, \ldots, y_p)$, the integral of the p-variate t pdf

$$f(x_1, x_2, \ldots, x_p; \nu) = \frac{\Gamma((\nu + p)/2)}{(\nu\pi)^{p/2}\,\Gamma(\nu/2)} \left[1 + \frac{x_1^2 + x_2^2 + \cdots + x_p^2}{\nu}\right]^{-(\nu+p)/2}$$

over the same p-dimensional convex polyhedra. John also provided an important asymptotic expansion in powers of $1/\nu$ connecting $V_\nu(y_1, y_2, \ldots, y_p)$ with $V(y_1, y_2, \ldots, y_p)$, the integral of the p-variate normal pdf

$$f(x_1, x_2, \ldots, x_p; \infty) = (2\pi)^{-p/2} \exp\left\{-\left(x_1^2 + x_2^2 + \cdots + x_p^2\right)/2\right\}$$

over the same polyhedra discussed above. Up to the order of the term $O(1/\nu^2)$, the expansion is

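$$\begin{aligned}
V_\nu(y_1, y_2, \ldots, y_p) = {} & V(y_1, y_2, \ldots, y_p) \\
& + \frac{1}{4\nu} \Big\{ y_1 y_2 f(y_1, y_2) V(y_3, y_4, \ldots, y_p) - y_1\left(1 + y_1^2\right) f(y_1) V(y_2, y_3, \ldots, y_p) \Big\} \\
& + \frac{1}{96\nu^2} \Big\{ 3 y_1 y_2 y_3 y_4 f(y_1, y_2, y_3, y_4) V(y_5, y_6, \ldots, y_p) \\
& \qquad - y_1 y_2 y_3 \left(2 + 9y_1^2 + 6y_2^2 + 3y_3^2\right) f(y_1, y_2, y_3) V(y_4, \ldots, y_p) \\
& \qquad - y_1 y_2 \left(3 + 5y_1^2 + y_2^2 - 9y_1^4 - 9y_1^2 y_2^2 - 3y_2^4\right) V(y_3, \ldots, y_p) f(y_1, y_2) \\
& \qquad + y_1 \left(3 + 5y_1^2 + 7y_1^4 - 3y_1^6\right) f(y_1) V(y_2, \ldots, y_p) \Big\} \\
& + o\left(\frac{1}{\nu^2}\right).
\end{aligned}$$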

In this formula, $V(y_m, y_{m+1}, \ldots, y_p)$ is to be replaced by 0 if $m \ge p + 2$ and by 1 if $m = p + 1$. In principle, there is no difficulty in determining further terms of this expansion, but the coefficients of higher powers of $1/\nu$ have rather complicated expressions. Other useful results given by John [35] include recursion formulas connecting $V_\nu(y_1, y_2, \ldots, y_p)$ with $V_{\nu \pm 2}(y_1, y_2, \ldots, y_p)$.

More recently, several authors have looked into the problem of computing multivariate t probabilities of the form

$$P = \int_A f(\mathbf{x}; \nu) \, d\mathbf{x}, \tag{29}$$

where $\mathbf{X}$ has the central multivariate t distribution with correlation matrix $R$ and $A$ is any convex region. Somerville [50, 51, 52, 53] developed the first known procedures for evaluating $P$ in (29). Let $MM^T$ be the Cholesky decomposition of $R$ (where $M$ is a lower triangular matrix) and set $\mathbf{X} = M\mathbf{W}$. Then $\mathbf{W}$ is multivariate t with correlation matrix $I_p$. If one further sets $r^2 = \mathbf{W}^T \mathbf{W}$, then $F = r^2/p$ has the well-known F distribution with degrees of freedom $p$ and $\nu$. Let $A$ be the region bounded by $p$ hyperplanes and described by

$$G\mathbf{W} \le \mathbf{d},$$

where $G = (\mathbf{g}_1, \ldots, \mathbf{g}_p)$ and the jth hyperplane is $\mathbf{g}_j^T \mathbf{W} = d_j$. For a random direction $\mathbf{c}$, let $r$ be the distance from the origin to the boundary of $A$ in that direction, that is, the smallest positive distance from the origin to the jth plane, $j = 1, \ldots, p$. Then an unbiased estimate of the integral $P$ in (29) is

$$\Pr\left(F \le r^2/p\right). \tag{30}$$

To implement the procedure, Somerville chose successive random directions $\mathbf{c}$ and obtained corresponding estimates of (30). The value of $P$ was then taken as the arithmetic mean of the individual estimates.
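The random-direction procedure can be sketched in a few lines of Python for the special case $p = 2$ and $R = I_2$. This is a minimal illustration under stated assumptions, not Somerville's Fortran implementation: the function names are hypothetical, the origin is assumed to lie inside $A$ (all $d_j > 0$), and the closed form $\Pr(F_{2,\nu} \le x) = 1 - (1 + 2x/\nu)^{-\nu/2}$ is used so that no statistics library is needed.

```python
import math
import random


def f_cdf_2_nu(x, nu):
    """cdf of the F distribution with (2, nu) degrees of freedom."""
    if x <= 0.0:
        return 0.0
    return 1.0 - (1.0 + 2.0 * x / nu) ** (-nu / 2.0)


def boundary_distance(direction, G, d):
    """Distance from the origin to the boundary of {w : g_j^T w <= d_j} along direction.

    Only hyperplanes that the ray actually approaches (g_j^T c > 0) can be hit.
    Returns math.inf if the region is unbounded in this direction.
    """
    best = math.inf
    for g, dj in zip(G, d):
        slope = sum(gi * ci for gi, ci in zip(g, direction))
        if slope > 0.0:
            best = min(best, dj / slope)
    return best


def random_direction_estimate(G, d, nu, n_dir=20000, seed=0):
    """Average of Pr(F <= r^2 / p) over uniformly random directions, with p = 2."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_dir):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        c = (math.cos(angle), math.sin(angle))
        r = boundary_distance(c, G, d)
        total += 1.0 if math.isinf(r) else f_cdf_2_nu(r * r / 2.0, nu)
    return total / n_dir


if __name__ == "__main__":
    # Square |w1| <= 1, |w2| <= 1 written as four half-planes.
    G = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
    d = [1.0, 1.0, 1.0, 1.0]
    print(random_direction_estimate(G, d, nu=2))
```

For this square and $\nu = 2$, the estimate can be compared with $8 V_2(1, 1) = 2/(3\sqrt{3})$ obtained from (26).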

Somerville [54, 56] provided the following modification of the above procedure. Let $r^*$ be the minimum distance from the origin to the boundary of $A$, that is, the smallest of the $r$ over all random directions $\mathbf{c}$. Divide $A$ into two regions: the portion inside the hypersphere of radius $r^*$ centered at the origin, and the region outside. The probability content of the hypersphere is

$$P_1 = \Pr\left(F \le r^{*2}/p\right),$$

and this can be estimated as in Somerville [50, 51, 52, 53]. If $E(v)$ and $e(v)$, respectively, denote the cdf and the pdf of $v = 1/r$ (the reciprocal of the distance from the origin to the boundary of $A$), then the probability content of the outer region is

$$P_2 = \int_0^{1/r^*} E(v) e(v) \, dv.$$

Since $F = r^2/p$, the pdf of $v$ is

$$e(v) = \frac{2\nu^{\nu/2}\,\Gamma((\nu + p)/2)}{\Gamma(\nu/2)\,\Gamma(p/2)} \cdot \frac{v^{\nu-1}}{\left(1 + \nu v^2\right)^{(\nu+p)/2}}.$$

The strategy is to use some numerical method to estimate $E(v)$ and then evaluate the integral $P_2$ using Gauss-Legendre quadrature. The approaches of Somerville [54, 56] differ in that Somerville [54] applied Monte Carlo techniques to estimate $E(v)$ while Somerville [56] used a binning procedure. It should be noted, however, that an approach similar to these had been introduced earlier by Deak [13].

Somerville [57] provided an extension of the above methodologies to evaluate $P$ in (29) when $A$ is an ellipsoidal region. This has potential applications in the field of reliability (in particular relating to the computation of the tolerance factor for multivariate normal populations) and to the calculation of probabilities for linear combinations of central and noncentral chi-squared and F variables. In the coordinate system of the transformed variables $\mathbf{W}$, assume, without loss of generality, that the axes of the ellipsoid are parallel to the coordinate axes and the ellipsoid has the equation $(\mathbf{w} - \mathbf{u})^T B^{-1} (\mathbf{w} - \mathbf{u}) = 1$, where $B$ is a diagonal matrix with the ith diagonal element given by $b_i$. If the ellipsoid contains the origin, then for each random direction $\mathbf{c}$ there is a unique distance $r$ to the boundary. An unbiased estimate of $P$ is then given by

$$\Pr\left(F \le r^2/p\right).$$

If the ellipsoid does not contain the origin, then, for a random direction, a line from the origin in that direction either intersects the boundary of the ellipsoid at two points (at distances $r \ge r_*$, say) or does not intersect it at all. If the line intersects the boundary, then an unbiased estimate of $P$ is given by the difference

$$\Pr\left(F \le r^2/p\right) - \Pr\left(F \le r_*^2/p\right).$$

If the line does not intersect the ellipsoid, an unbiased estimate is 0. As in the first procedure described above, this is repeated for successive random directions $\mathbf{c}$, each providing an unbiased estimate. The value of $P$ is then taken as the arithmetic average. A modification of this procedure along the lines of Somerville [54, 56] is described in Somerville [57].

Somerville [58] provided an application of his methods for multiple testing and comparisons by taking $A$ in (29) to be

$$A = \left\{ \mathbf{x} \in \mathbb{R}^p : \max \mathbf{c}^T \mathbf{x} \le \frac{q}{\sqrt{2}}, \ \mathbf{c} \in B \right\},$$

where $B$ is the set of contrasts corresponding to the different hypotheses and $q > 0$. The purpose is to calculate the value of $q$ for arbitrary $R$ and $\nu$ and arbitrary sets $B$ such that the probability content of $A$ has a preassigned value $\gamma$. Somerville and Bretz [60] have written two Fortran 90 programs (QBATCH4.FOR and QINTER4.FOR) and two SAS-IML programs (QBATCH4.SAS and QINTER4.SAS) for this purpose. QINTER4.FOR and QINTER4.SAS are interactive programs, while the other two are batch programs. A compiled version of the Fortran 90 programs that should run on any PC with Windows 95 or later can be found at

http://pegasus.cc.ucf.edu/~somervil/home.html

These programs implement the methodology described above to evaluate the probability content of $A$. (A Fortran 90 program, MVI3.FOR, for evaluating multivariate t integrals over any convex region is described in Somerville [55]. An extended Fortran 90 program, MVELPS.FOR, for evaluating multivariate t integrals over any ellipsoidal region is described in Somerville [59]. The average running times for the latter program range from 0.075 and 0.109 seconds for p = 2 and 3, respectively, to 0.379 and 0.843 seconds for p = 10 and 20, respectively.) The so-called "Brent's method," an iterative procedure described in Press [45], is used to solve for the value of $q$. The time to estimate the $q$ values (with a standard error of 0.01) using QINTER4 or QBATCH4 ranges from 10 seconds for Dunnett's multiple comparisons procedure to 52 seconds for Tukey's procedure, using a 486-33 processor.

A problem that frequently arises in statistical analysis is to compute (29) when $A$ is a rectangular region, that is,

$$P = \int_{a_1}^{b_1} \int_{a_2}^{b_2} \cdots \int_{a_p}^{b_p} f(x_1, x_2, \ldots, x_p) \, dx_p \cdots dx_2 \, dx_1. \tag{31}$$

Wang and Kennedy [67] employed numerical interval analysis to compute $P$. The method is similar to the approaches of Corliss and Rall [11] for univariate normal probabilities and Wang and Kennedy [66] for bivariate normal probabilities. The basic idea is to apply the multivariate Taylor expansion to the joint pdf $f$. Letting $c_j = (a_j + b_j)/2$, the Taylor expansion of $f$ at the midpoint $(c_1, c_2, \ldots, c_p)$ is

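$$\begin{aligned}
f(x_1, x_2, \ldots, x_p) = {} & \sum_{k=0}^{m-1} \sum_{]k[} \frac{1}{k_1! \cdots k_p!} \cdot \frac{\partial^k f(c_1, c_2, \ldots, c_p)}{\partial x_1^{k_1} \partial x_2^{k_2} \cdots \partial x_p^{k_p}} \prod_{j=1}^{p} (x_j - c_j)^{k_j} \\
& + \sum_{]m[} \frac{1}{m_1! \cdots m_p!} \cdot \frac{\partial^m f(\xi_1, \xi_2, \ldots, \xi_p)}{\partial x_1^{m_1} \partial x_2^{m_2} \cdots \partial x_p^{m_p}} \prod_{j=1}^{p} (x_j - c_j)^{m_j},
\end{aligned} \tag{32}$$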


where $\xi_j$ is contained in the integration region $[a_j, b_j]$ and $]k[$ denotes all possible partitions of $k$ into $p$ parts. For example, in the case $p = 3$, $]2[$ will result in 6 possible partitions of '2' into $(k_1, k_2, k_3)$: $(0, 0, 2)$, $(0, 1, 1)$, $(0, 2, 0)$, $(1, 0, 1)$, $(1, 1, 0)$, and $(2, 0, 0)$. The main problem with computing (32) is the presence of high-order partial derivatives of $f$. Defining

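$$(f)_{k_1 k_2 \cdots k_p} = \frac{1}{k_1! k_2! \cdots k_p!} \cdot \frac{\partial^{k_1 + k_2 + \cdots + k_p} f}{\partial x_1^{k_1} \partial x_2^{k_2} \cdots \partial x_p^{k_p}}, \tag{33}$$

Wang and Kennedy derived the following recursive formula

$$(f)_{k_1 k_2 \cdots k_p} = -\frac{1}{k_1} \left(1 + \frac{\mathbf{x}^T R^{-1} \mathbf{x}}{\nu}\right)^{-1} \sum_{l_1=0}^{k_1} \sum_{l_2=0}^{k_2} \cdots \sum_{l_p=0}^{k_p} \left[\frac{p + \nu}{2}(k_1 - l_1) + l_1\right] (f)_{l_1 l_2 \cdots l_p} \left(1 + \frac{\mathbf{x}^T R^{-1} \mathbf{x}}{\nu}\right)_{k_1 - l_1, k_2 - l_2, \ldots, k_p - l_p}.$$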

With regard to the last quadratic term, it should be noted that all partial derivatives of order higher than two are zero. To carry out the computation of (33) for a given $(k_1, k_2, \ldots, k_p)$, one can

• first let one $l_j$ be $k_j - 1$ (if this $k_j \ne 1$) and all the other $l_j$'s be their corresponding $k_j$'s;

• next let $l_r$ and $l_s$ be $k_r - 1$ and $k_s - 1$, respectively (if $k_r \ne 1$ and $k_s \ne 1$), while all the other $l_j$'s take their corresponding $k_j$'s;

• finally, let some $l_j$ be $k_j - 2$ (if $k_j \ge 2$) and all other $l_j$'s be the corresponding $k_j$'s.

The total number of terms that contribute to computing $(f)_{k_1 k_2 \cdots k_p}$ is at most $p(p + 3)/2$. Compared to the multivariate normal distribution, this number is larger (Wang and Kennedy [66]). The following table gives the running times and the accuracy for computing (31) with $\nu = 10$.


Running time and accuracy for computing P in (31)

Accuracy columns, left to right: (aj, bj) = (−0.5, 0.5), (−0.4, 0.4), (−0.3, 0.3), (−0.2, 0.2)

p     Running time (min)     Accuracy (entries as listed, left to right)
10    80                     2 sig, 4 sig
9     70                     3 sig, 7 sig
8     85                     0 sig, 5 sig, 10 sig
7     90                     3 sig, 8 sig
6     110                    3 sig, 8 sig
5     180                    10 sig

Another point to note about Wang and Kennedy's method is that when the integration region is near the origin it works better for larger $\nu$, while when the integration region is off the origin it works better for smaller $\nu$.
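The index sets $]k[$ that appear in the Taylor expansion (32) can be enumerated mechanically. The short Python sketch below lists all ways of writing $k$ as an ordered sum of $p$ non-negative integers; the function name `compositions` is a hypothetical label chosen here, not notation from Wang and Kennedy [67].

```python
from math import comb


def compositions(k, p):
    """Yield every tuple (k_1, ..., k_p) of non-negative integers with sum k."""
    if p == 1:
        yield (k,)
        return
    for first in range(k + 1):
        for rest in compositions(k - first, p - 1):
            yield (first,) + rest


if __name__ == "__main__":
    for parts in compositions(2, 3):
        print(parts)
    # The number of such tuples is the binomial coefficient C(k + p - 1, p - 1).
    print(len(list(compositions(2, 3))), comb(2 + 3 - 1, 3 - 1))
```

For $p = 3$ and $k = 2$ this reproduces the six partitions listed for $]2[$. The count grows as $\binom{k + p - 1}{p - 1}$, which gives a sense of how many derivative terms a high-order expansion in many dimensions involves.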

The main problem with Wang and Kennedy's [67] method is that the calculation times required are too large even for low-accuracy results (see the table above). Genz and Bretz [25] proposed a new method for computing (31) by transforming the p-variate integrand into a product of univariate integrands. The method is similar to the one used by Genz [24] for the multivariate normal integral.

Letting $MM^T$ be the Cholesky decomposition of $R$, with elements $m_{j,k}$, define the following transformations

$$X_j = \sum_{k=1}^{p} m_{j,k} Y_k,$$

$$Y_j = U_j \sqrt{\frac{\nu + \sum_{k=1}^{j-1} Y_k^2}{\nu + j - 1}},$$

$$U_j = T^{-1}_{\nu+j-1}(Z_j),$$

and

$$Z_j = d_j + W_j (e_j - d_j),$$

where $T_\tau$ denotes the cdf of the univariate Student's t distribution with degrees of freedom $\tau$,

$$d_j = T_{\nu+j-1}(\hat{a}_j), \qquad e_j = T_{\nu+j-1}(\hat{b}_j),$$

$$\hat{a}_j = a'_j \sqrt{\frac{\nu + j - 1}{\nu + \sum_{k=1}^{j-1} y_k^2}}, \qquad \hat{b}_j = b'_j \sqrt{\frac{\nu + j - 1}{\nu + \sum_{k=1}^{j-1} y_k^2}},$$

$$a'_j = \frac{a_j - \sum_{k=1}^{j-1} m_{j,k} Y_k}{m_{j,j}},$$

and

$$b'_j = \frac{b_j - \sum_{k=1}^{j-1} m_{j,k} Y_k}{m_{j,j}}.$$

Applying the above transformations successively, Genz and Bretz reduced (31) to

$$P = (e_1 - d_1) \int_0^1 (e_2 - d_2) \cdots \int_0^1 (e_p - d_p) \int_0^1 d\mathbf{w} \tag{34}$$

$$= \int_0^1 \int_0^1 \cdots \int_0^1 f(\mathbf{w}) \, d\mathbf{w}. \tag{35}$$

The transformation has the effect of flattening the surface of the original function, and $P$ becomes an integral of $f(\mathbf{w}) = (e_1 - d_1) \cdots (e_p - d_p)$ over the $(p - 1)$-dimensional unit hypercube. Hence, one has improved numerical tractability, and (35) can be evaluated with different multidimensional numerical computation methods. Genz and Bretz considered three numerical algorithms for this: an acceptance-rejection sampling algorithm, a crude Monte Carlo algorithm, and a lattice rule algorithm.

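• Acceptance-rejection sampling algorithm: Generate p-dimensional uniform random vectors $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_N$ and estimate $P$ by

$$\hat{P} = \frac{1}{N} \sum_{l=1}^{N} h(M\mathbf{y}_l),$$

where

$$h(\mathbf{x}) = \begin{cases} 1, & \text{if } a_j \le x_j \le b_j, \ j = 1, 2, \ldots, p, \\ 0, & \text{otherwise} \end{cases}$$

and

$$y_{l,j} = T^{-1}_{\nu+j-1}(w_{l,j}) \sqrt{\frac{\nu + \sum_{k=1}^{j-1} y_{l,k}^2}{\nu + j - 1}}, \quad j = 1, 2, \ldots, p, \ l = 1, 2, \ldots, N.$$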

• A crude Monte Carlo algorithm: Generate $(p - 1)$-dimensional uniform random vectors $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_N$ and estimate $P$ by

$$\hat{P} = \frac{1}{N} \sum_{l=1}^{N} f(\mathbf{w}_l),$$

an unbiased estimator of the integral (35).

• A lattice rule algorithm (Joe [32]; Sloan and Joe [49]): Generate $(p - 1)$-dimensional uniform random vectors $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_N$ and estimate $P$ by

$$\hat{P} = \frac{1}{Nq} \sum_{l=1}^{N} \sum_{j=1}^{q} f\left(\left| 2\left\{\frac{j}{q}\mathbf{z} + \mathbf{w}_l\right\} - \mathbf{1}_p \right|\right).$$

Here $N$ is the simulation size, usually very small, $q$ corresponds to the fineness of the lattice, and $\mathbf{z} \in \mathbb{R}^{p-1}$ denotes a strategically chosen lattice vector. Braces around vectors indicate that each component has to be replaced by its fractional part. One possible choice of $\mathbf{z}$ follows the good lattice points; see, for example, Sloan and Joe [49].

For all three algorithms, one may control the simulation error by using the usual error estimate of the means. Perhaps the most intuitive of the three is the acceptance-rejection method. However, Deak [13] showed that, among various methods, it is the one with the worst efficiency. Genz and Bretz [26] proposed the use of the lattice rule algorithm. Bretz et al. [9] provided an application of this algorithm for multiple comparison procedures.
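The lattice rule estimator above can be illustrated with a generic integrand on the unit cube. The Python sketch below is only a minimal illustration: the function name, the test integrand, and the two-dimensional Fibonacci lattice ($q = 89$, $\mathbf{z} = (1, 55)$) are assumptions chosen here for the example, not the lattice vectors used by Genz and Bretz. In an application to (35), `f` would be the transformed integrand $(e_1 - d_1) \cdots (e_p - d_p)$.

```python
import random


def lattice_rule_estimate(f, z, q, n_shifts, seed=0):
    """Randomly shifted lattice rule with the |2{x} - 1| periodizing transform.

    f        : integrand taking a list of coordinates in [0, 1]
    z        : lattice generating vector (one integer per dimension)
    q        : number of lattice points
    n_shifts : number N of independent uniform random shifts w_l
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_shifts):
        shift = [rng.random() for _ in z]
        for j in range(1, q + 1):
            point = []
            for zi, wi in zip(z, shift):
                frac = (j * zi / q + wi) % 1.0      # fractional part {j z / q + w}
                point.append(abs(2.0 * frac - 1.0))  # fold back into [0, 1]
            total += f(point)
    return total / (n_shifts * q)


if __name__ == "__main__":
    # Toy integrand with known integral 1/4 over the unit square.
    estimate = lattice_rule_estimate(lambda w: w[0] * w[1], z=(1, 55), q=89, n_shifts=10)
    print(estimate)
```

Because each shifted and folded point is uniformly distributed on the unit cube, every term has expectation equal to the integral, so the average is unbiased; the spread of the $N$ per-shift averages provides the usual standard error.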

The method of Genz and Bretz [25] described above also includes an efficient evaluation of probabilities of the form

$$P = \int_{\mathbf{a}}^{\mathbf{b}} g(\mathbf{x}) f(\mathbf{x}) \, d\mathbf{x},$$

where $g(\mathbf{x})$ is some nuisance function. Fortran and SAS-IML codes to implement the method for $p \le 100$ are available from the Web sites with URLs


http://www.bioinf.uni-hannover.de/~betz/

and

http://www.sci.wsu.edu/math/faculty/genz/homepage.

12 Probabilities of linear inequalities Let $X$ be a random variable characterizing the "load," and let $Y$ be a random variable determining the "strength" of a component. Then the probability that a system is "trouble-free" is $\Pr(Y > X)$. In a more complicated situation, the operation of the system may depend on a linear combination of random vectors, say $\mathbf{a}_1^T \mathbf{X}_1 + \mathbf{a}_2^T \mathbf{X}_2 + b$, and the probability of a trouble-free operation will be

$$\Pr\left(\mathbf{a}_1^T \mathbf{X}_1 + \mathbf{a}_2^T \mathbf{X}_2 + b > 0\right), \tag{36}$$

where $\mathbf{X}_j$ are independent $k_j$-dimensional random vectors, $\mathbf{a}_j$ are $k_j$-dimensional constant vectors, and $b$ is a scalar constant. Abusev and Kolegova [2] studied the problem of constructing unbiased, maximum likelihood, and Bayesian estimators of the probability (36) when $\mathbf{X}_j$ is assumed to have the multivariate t distribution with mean vector $\boldsymbol{\mu}_j$ and correlation matrix $R_j$. If $\mathbf{x}_{11}, \ldots, \mathbf{x}_{1n_1}$ and $\mathbf{x}_{21}, \ldots, \mathbf{x}_{2n_2}$ are iid samples from the two multivariate t distributions, then, in the case where both $\boldsymbol{\mu}_j$ and $R_j$ are unknown, it was established that the unbiased and the maximum likelihood estimators are

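$$\Pr\left(\mathbf{a}_1^T \mathbf{X}_1 + \mathbf{a}_2^T \mathbf{X}_2 + b > 0\right) = \frac{\Gamma(n_1/2)\,\Gamma(n_2/2)}{\pi\,\Gamma((n_1 - 1)/2)\,\Gamma((n_2 - 1)/2)} \int_{\Omega_1} \prod_{j=1}^{2} \left(1 - \nu_j^2\right)^{(n_j - 3)/2} d\nu_1 \, d\nu_2$$

and

$$\Pr\left(\mathbf{a}_1^T \mathbf{X}_1 + \mathbf{a}_2^T \mathbf{X}_2 + b > 0\right) = \Phi\left(\frac{\mathbf{a}_1^T \bar{\mathbf{x}}_{n_1} + \mathbf{a}_2^T \bar{\mathbf{x}}_{n_2} + b}{\sqrt{\mathbf{a}_1^T S_{n_1+1} \mathbf{a}_1 + \mathbf{a}_2^T S_{n_2+1} \mathbf{a}_2}}\right),$$

respectively, where

$$\Omega_1 = \left\{ \nu_j^2 < 1, \ j = 1, 2, \ \sum_{j=1}^{2} \nu_j \sqrt{n_j \mathbf{a}_j^T S_{n_j+1} \mathbf{a}_j} + \sum_{j=1}^{2} \mathbf{a}_j^T \bar{\mathbf{x}}_j + b > 0 \right\},$$

$$\bar{\mathbf{x}}_{n_j} = \frac{1}{n_j} \sum_{m=1}^{n_j} \mathbf{x}_m, \qquad (n_j + 1)\bar{\mathbf{x}}_j = \sum_{m=1}^{n_j+1} \mathbf{x}_{jm},$$

$$(n_j + 1) S_{n_j+1} = \sum_{m=1}^{n_j+1} \left(\mathbf{x}_{jm} - \bar{\mathbf{x}}_j\right)\left(\mathbf{x}_{jm} - \bar{\mathbf{x}}_j\right)^T,$$

and $\mathbf{x}_{n_j+1} = \mathbf{x}$. A Bayesian estimator of (36) with unknown parameters $\boldsymbol{\mu}_j$ and $R_j$ and the Lebesgue measure $p(\theta)\,d\theta = d\boldsymbol{\mu}\,dR$ was calculated to be

$$\Pr\nolimits_B\left(\mathbf{a}_1^T \mathbf{X}_1 + \mathbf{a}_2^T \mathbf{X}_2 + b > 0\right) = \prod_{j=1}^{2} \frac{\Gamma((n_j - k_j)/2)\, n_j^{k_j}}{\pi\,\Gamma((n_j - 1)/2)\,(n_j + 1)^{k_j}} \prod_{j=1}^{2} \frac{\Gamma((n_j - k_j - 1)/2)}{\Gamma((n_j - 2k_j - 1)/2)} \int_{\Omega_2} \prod_{j=1}^{2} \left(1 - z_j^2\right)^{(n_j - 3)/2} dz_1 \, dz_2,$$

where

$$\Omega_2 = \left\{ z_j^2 < 1, \ j = 1, 2, \ \sum_{j=1}^{2} z_j \sqrt{n_j \mathbf{a}_j^T S_{n_j+1} \mathbf{a}_j} + \sum_{j=1}^{2} \mathbf{a}_j^T \bar{\mathbf{x}}_j + b > 0 \right\}.$$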

This Bayesian estimator is biased and is related to the unbiased estimator via the relation

$$\Pr\nolimits_B\left(\mathbf{a}_1^T \mathbf{X}_1 + \mathbf{a}_2^T \mathbf{X}_2 + b > 0\right) = A \Pr\left(\mathbf{a}_1^T \mathbf{X}_1 + \mathbf{a}_2^T \mathbf{X}_2 + b > 0\right),$$

where

$$A = \prod_{j=1}^{2} \frac{\Gamma((n_j - k_j - 1)/2)\,\Gamma((n_j - k)/2)}{\Gamma((n_j - 2k_j - 1)/2)\,\Gamma(n_j/2)} \cdot \frac{(n_j + 1)^{k_j}}{n_j^{k_j}}.$$

The coefficient $A$ can be expanded as

$$A = 1 + \frac{k}{n} - \frac{k}{n - k} + O\left(\frac{1}{n^2}\right),$$

where $n = \max(n_1, n_2)$ and $k = \max(k_1, k_2)$. Therefore, the Bayesian estimator is asymptotically unbiased as $n \to \infty$.

Substantial literature is now available on problems concerning probabilities of the form (36) for various distributions. For a comprehensive and up-to-date summary, the reader is referred to Kotz et al. [38].


13 Maximum probability content Let $\mathbf{X}$ be a bivariate random vector with the joint pdf of the form

$$f(\mathbf{x}) = g\left((\mathbf{x} - \boldsymbol{\mu})^T R^{-1} (\mathbf{x} - \boldsymbol{\mu})\right), \tag{37}$$

which, of course, includes the bivariate t pdf. Consider the class of rectangles

$$R(a) = \left\{ (x_1, x_2) : |x_1| \le a, \ |x_2| \le \frac{\lambda}{4a} \right\}$$

with the area equal to $\lambda$. Kunte and Rattihalli [40] studied the problem of characterizing the region $R$ in this class for which the probability $P(R(a)) = \Pr(\mathbf{X} \in R(a))$ is maximum. As noted in Rattihalli [46], the characterization of such regions is useful for obtaining Bayes regional estimators when (i) the decision space is the class of rectangular regions and (ii) the loss function is a linear combination of the area of the region and the indicator of the noncoverage of the region. It was shown that, for any fixed $\lambda > 0$, the maximal set is

$$\left\{ (x_1, x_2) : |x_1 - \mu_1| \le c, \ |x_2 - \mu_2| \le \frac{\lambda}{4c} \right\},$$

where $c$ is given by

$$c = \sqrt{\frac{\lambda}{4} \sqrt{\frac{r_{22}}{r_{11}}}}.$$

Here, $r_{ij}$ denotes the $(i, j)$th element of the inverse of $R$. In particular, if $\boldsymbol{\mu} = \mathbf{0}$, $r_{12} = r_{21} = \rho$ and $|\rho| < 1$ in (37), then $P(R(a))$ is increasing for $a < \sqrt{\lambda}/2$ and is decreasing for $a > \sqrt{\lambda}/2$.
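The monotonicity statement can be checked numerically in a simple special case. The Python sketch below assumes an uncorrelated central bivariate t distribution with $\nu = 2$, for which formula (26) with $m = 1$ gives the closed form $V_2(y_1, y_2) = \frac{y_1}{2\pi\sqrt{2 + y_1^2}} \arctan\left(y_2/\sqrt{2 + y_1^2}\right)$. The quadrant rectangle $[0, a] \times [0, b]$ splits into two right-angled triangles, so $P(R(a)) = 4\{V_2(a, b) + V_2(b, a)\}$ with $b = \lambda/(4a)$. The function names and the grid search are illustrative choices, not part of Kunte and Rattihalli [40].

```python
import math


def v2(y1, y2):
    """V_2(y1, y2): bivariate t (nu = 2) probability of the triangle (0,0), (y1,0), (y1,y2)."""
    root = math.sqrt(2.0 + y1 * y1)
    return y1 / (2.0 * math.pi * root) * math.atan(y2 / root)


def rectangle_probability(a, lam):
    """P(|X1| <= a, |X2| <= lam / (4a)) for the uncorrelated bivariate t with nu = 2."""
    b = lam / (4.0 * a)
    return 4.0 * (v2(a, b) + v2(b, a))


def best_half_width(lam, grid=2000):
    """Grid search for the half-width a that maximizes the probability content."""
    lo, hi = 0.05 * math.sqrt(lam), 5.0 * math.sqrt(lam)
    candidates = [lo + (hi - lo) * i / grid for i in range(grid + 1)]
    return max(candidates, key=lambda a: rectangle_probability(a, lam))


if __name__ == "__main__":
    lam = 4.0
    print(best_half_width(lam), math.sqrt(lam) / 2.0)
```

With $\lambda = 4$ the search settles near $a = 1 = \sqrt{\lambda}/2$, the square, in agreement with the result quoted above.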

14 Monte Carlo evaluation Let $\mathbf{X}$ be a central p-variate t random vector with correlation matrix $R$ and degrees of freedom $\nu$. Vijverberg [63] developed a family of simulators of the multivariate t probability $p = \Pr(\mathbf{X} \le \mathbf{X}_0)$ based on Monte Carlo simulation and recursive importance sampling. We shall provide the basic steps of this rather complicated but powerful procedure.

Define $\mathbf{Z} = A\mathbf{X}$, where $A$ is an upper triangular matrix such that $A^T A = R$. Then it is well known that the pdf of $\mathbf{Z}$ can be expressed as a product of univariate Student's t pdfs

$$f(\mathbf{z}) = \prod_{k=1}^{p} f_1\left(z_k; \sigma_k^2, \nu_k\right),$$

where

$$\nu_k = \nu + p - k, \quad k = 1, 2, \ldots, p,$$

$$\sigma_k^2 = \frac{\nu + z_{k+1}^2 + \cdots + z_p^2}{\nu_k}, \quad k = 1, 2, \ldots, p - 1,$$

$$\sigma_p^2 = 1,$$

and

$$f_1\left(x; \sigma^2, \nu\right) = \frac{\Gamma((\nu + 1)/2)}{\sqrt{\nu\pi}\,\sigma\,\Gamma(\nu/2)} \left[1 + \frac{1}{\nu} \frac{x^2}{\sigma^2}\right]^{-(\nu+1)/2}.$$

We shall denote by $F_1(x; \sigma^2, \nu)$ the cdf corresponding to $f_1$. For convenience denote $A^{-1} = B = \{b_{ij}\}$, where $B$ is an upper triangular matrix with $b_{pp} = 1$ and $b_{jj} > 0$ for all $j$. Then, since the integral over $\mathbf{X}$ covers the region $\mathbf{X} \le \mathbf{X}_0$, the integral over $\mathbf{Z}$ is determined by the inequality $B\mathbf{Z} < \mathbf{X}_0$, and the bounds can be written as

$$z_p < x_{p0} \equiv z_{p0}$$

and

$$z_k < b_{kk}^{-1} \left(x_{k0} - \sum_{i=k+1}^{p} b_{ki} z_i\right) \equiv z_{k0}\left(x_{k0}, z_{k+1}, \ldots, z_p\right)$$

for $k = 1, 2, \ldots, p - 1$. Utilizing this transformation, the probability $p$ can be written as $p = J_p$, where

$$\begin{aligned}
J_k &= \int_{-\infty}^{z_{k0}} f_1\left(z_k; \sigma_k^2, \nu_k\right) J_{k-1} \, dz_k \\
&= \int_{-\infty}^{z_{k0}} F_1\left(z_{k0}; \sigma_k^2, \nu_k\right) J_{k-1} f_1^c\left(z_k; \sigma_k^2, \nu_k\right) dz_k \\
&= E_{f_1^c}\left[F_1\left(z_{k0}; \sigma_k^2, \nu_k\right) J_{k-1}\right], \quad k = 2, 3, \ldots, p,
\end{aligned} \tag{38}$$

where

$$f_1^c\left(z_k; \sigma_k^2, \nu_k\right) = \frac{f_1\left(z_k; \sigma_k^2, \nu_k\right)}{F_1\left(z_{k0}; \sigma_k^2, \nu_k\right)}$$

is the univariate conditional t pdf for $z_k \le z_{k0}$ and $J_1 = F_1(z_{10}; \sigma_1^2, \nu_1)$. Hence, $J_k$ is the probability over the range of $(z_1, \ldots, z_k)$ conditional on the values for $(z_{k+1}, \ldots, z_p)$.


The Monte Carlo simulation starts off by drawing random values of $z_p$ from the distribution $f_1^c(\cdot; z_{p0}, \sigma_p^2, \nu_p)$, which we shall denote by $z_{p,r}$, $r = 1, \ldots, R$. Each of these yields a different bound $z_{p-1,0,r}$ and parameter value $\sigma_{p-1,r}^2$ for each draw of $z_{p-1}$; $z_{p-1,r}$ is then drawn from the distribution $f_1^c(\cdot; z_{p-1,0,r}, \sigma_{p-1,r}^2, \nu_{p-1})$. This process continues until $z_{2,r}$ is drawn and $J_1 = F_1(z_{10,r}; \sigma_{1,r}^2, \nu_1)$ is computed with a commonly available approximation routine for the univariate Student's t cdf. The simulated estimate $\hat{p}$ of $p$ is then found as the sample average of the $J_p$ values across the simulated sample of $R$ elements

$$\hat{p} \equiv \hat{J}_p = \frac{1}{R} \sum_{r=1}^{R} F_1\left(z_{p0,r}; \sigma_{p,r}^2, \nu_p\right) J_{p-1},$$

where $J_k = F_1(z_{k0,r}; \sigma_{k,r}^2, \nu_k) J_{k-1}$ for $k = 2, \ldots, p - 1$. It is more efficient to estimate $J_p$ by averaging over a large number of elements than to obtain close approximations of its components $J_k$ for $k < p$. Therefore, a better estimate for $p$ is

$$\hat{p} = \frac{1}{R} \sum_{r=1}^{R} \prod_{k=1}^{p} F_1\left(z_{k0,r}; \sigma_{k,r}^2, \nu_k\right).$$

The right-hand side of (38) remains unchanged if the integrand is divided or multiplied by any nonzero function of $\mathbf{z}$. Let $g_p$ be a p-dimensional pdf such that

$$g_p(\mathbf{z}; \nu) = \prod_{k=1}^{p} g_1\left(z_k; \tau_k^2\right),$$

where $g_1$ is a univariate pdf of a type to be mentioned below with $\mathrm{Var}(z_k) = \tau_k^2 = \sigma_k^2 \nu_k/(\nu_k - 2)$, and $\sigma_k^2$ and $\nu_k$ are as defined above. Let $G_1(z_k; \tau_k^2)$ be the associated cdf, and let

$$g_1^c\left(z_k; z_{k0}, \tau_k^2\right) = \frac{g_1\left(z_k; \tau_k^2\right)}{G_1\left(z_{k0}; \tau_k^2\right)}$$

be the conditional pdf. Finally, let

$$g_p^c(\mathbf{z}; \mathbf{z}_0, \nu) = \prod_{k=1}^{p} g_1^c\left(z_k; z_{k0}, \tau_k^2\right).$$


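With these definitions, one can write $p = J_p$ in terms of

$$\begin{aligned}
J_k &= \int_{-\infty}^{z_{k0}} \frac{f_1\left(z_k; \sigma_k^2, \nu_k\right)}{g_1^c\left(z_k; z_{k0}, \tau_k^2\right)} J_{k-1} \, g_1^c\left(z_k; z_{k0}, \tau_k^2\right) dz_k \\
&= E_{g_1^c}\left[\frac{f_1\left(z_k; \sigma_k^2, \nu_k\right)}{g_1^c\left(z_k; z_{k0}, \tau_k^2\right)} J_{k-1}\right], \quad k = 2, \ldots, p,
\end{aligned}$$

and $J_1 = F_1(z_{10}; \sigma_1^2, \nu_1)$. Clearly, $g_p^c$, and, more particularly, $g_1^c$, is an importance sampling density (see, for example, Hammersley and Handscomb [30]). To evaluate $p$, the procedure is as follows: generate random drawings $z_{p,r}$ for $r = 1, \ldots, R$ from the distribution $g_1^c(\cdot; z_{p0}, \tau_p^2)$; compute the implied values $z_{p-1,0,r}$ and $\tau_{p-1,r}^2$ for each drawing of $z_{p-1}$; draw $z_{p-1,r}$ from the distribution $g_1^c(\cdot; z_{p-1,0,r}, \tau_{p-1,r}^2, \nu_p)$; and continue on until $z_{2,r}$ is drawn and $J_1$ is computed. Based on this procedure, $\hat{p}$ may be written in the form

$$\hat{p} = \frac{1}{R} \sum_{r=1}^{R} \left( F_1\left(z_{10,r}; \tau_{1,r}^2\right) \prod_{k=1}^{p} \frac{f_1\left(z_{k,r}; z_{k0,r}, \sigma_{k,r}^2, \nu_k\right)}{g_1^c\left(z_{k,r}; z_{k0,r}, \tau_{k,r}^2\right)} \right).$$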

Three suitable choices for the importance density function are

• the logit with

$$g_1(x) = \frac{\lambda}{\tau} q(1 - q),$$

where $q = [1 + \exp(-\lambda x/\tau)]^{-1}$ and $\lambda = \pi/\sqrt{3}$;

• transformed beta (2, 2) density (Vijverberg [62]) with

$$g_1(x) = 6z^2(1 - z)^2,$$

where

$$z = \frac{\exp(x/\sigma)}{1 + \exp(x/\sigma)};$$

• the normal $N(0, \sigma^2)$ density.

Vijverberg [64, 65] has developed a new family of simulators that extends the above research on the simulation of high-order probabilities. For instance, Vijverberg [65] has reported that the gain in precision using the new family translates into a 40% savings in computational time.


REFERENCES

1. M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions, Dover, New York, 1965.

2. R. A. Abusev and N. V. Kolegova, On estimation of probabilities of linear inequalities for multivariate t distributions, Journal of Mathematical Sciences 103 (2001), 542–546.

3. D. E. Amos, Evaluation of some cumulative distribution functions by numerical evaluation, SIAM Review 20 (1978), 778–800.

4. D. E. Amos and W. G. Bulgren, On the computation of a bivariate t distribution, Mathematics and Computation 23 (1969), 319–333.

5. R. E. Barlow, D. J. Bartholomew, J. M. Bremner and H. D. Brunk, Statistical Inference under Order Restrictions, John Wiley and Sons, Chichester, 1972.

6. R. Bohrer, A multivariate t probability integral, Biometrika 60 (1973), 647–654.

7. R. Bohrer and G. K. Francis, Sharp one-sided confidence bounds over positive regions, Annals of Mathematical Statistics 43 (1972), 1541–1548.

8. R. Bohrer, M. Schervish and J. Sheft, Algorithm AS 184: Non-central studentized maximum and related multiple-t probabilities, Applied Statistics 31 (1982), 309–317.

9. F. Bretz, A. Genz and L. A. Hothorn, On the numerical availability of multiple comparison procedures, Biometrical Journal 43 (2001), 645–656.

10. J. H. Cadwell, The bivariate normal integral, Biometrika 38 (1951), 475–481.

11. G. F. Corliss and L. B. Rall, Adaptive, self-validating numerical quadrature, SIAM Journal on Scientific and Statistical Computing 8 (1987), 831–847.

12. H. Cramer, Mathematical Methods of Statistics, Princeton University Press, 1951.

13. I. Deak, Random Number Generators and Simulation, Akademiai Kiado, Budapest, 1990.

14. C. W. Dunnett and M. Sobel, A bivariate generalization of Student's t-distribution with tables for certain special cases, Biometrika 41 (1954), 153–169.

15. J. E. Dutt, A representation of multivariate normal probability integrals by integral transforms, Biometrika 60 (1973), 637–645.

16. J. E. Dutt, On computing the probability integral of a general multivariate t, Biometrika 62 (1975), 201–205.

17. R. A. Fisher, Expansion of Student's integral in power of $n^{-1}$, Metron 5 (1925), 109.

18. Y. Fujikoshi, Error bounds for asymptotic expansions of scale mixtures of distributions, Hiroshima Mathematics Journal 17 (1987), 309–324.

19. Y. Fujikoshi, Non-uniform error bounds for asymptotic expansions of scale mixtures of distributions, Journal of Multivariate Analysis 27 (1988), 194–205.

20. Y. Fujikoshi, Error bounds for asymptotic expansions of the maximums of the multivariate t- and F-variables with common denominator, Hiroshima Mathematics Journal 19 (1989), 319–327.

21. Y. Fujikoshi, Error bounds for asymptotic approximations of some distribution functions, in Multivariate Analysis: Future Directions, (C. R. Rao, ed.), North-Holland, Amsterdam, 1993, 181–208.

22. Y. Fujikoshi and R. Shimizu, Error bounds for asymptotic expansions of scale mixtures of univariate and multivariate distributions, Journal of Multivariate Analysis 30 (1989), 279–291.

23. Y. Fujikoshi and R. Shimizu, Asymptotic expansions of some distributions and their error bounds–the distributions of sums of independent random variables and scale mixtures, Sugaku Expositions 3 (1990), 75–96.


24. A. Genz, Numerical computation of the multivariate normal probabilities, Journal of Computational and Graphical Statistics 1 (1992), 141–150.

25. A. Genz and F. Bretz, Numerical computation of multivariate t probabilities with application to power calculation of multiple contrasts, Journal of Statistical Computation and Simulation 63 (1999), 361–378.

26. A. Genz and F. Bretz, Methods for the computation of multivariate t-probabilities, Journal of Computational and Graphical Statistics (2001).

27. S. S. Gupta, Probability integrals of multivariate normal and multivariate t, Annals of Mathematical Statistics 34 (1963), 792–828.

28. S. S. Gupta, S. Panchapakesan and J. K. Sohn, On the distribution of the studentized maximum of equally correlated normal random variables, Communications in Statistics - Simulation and Computation 14 (1985), 103–135.

29. S. S. Gupta and M. Sobel, On a statistic which arises in selection and ranking problems, Annals of Mathematical Statistics 28 (1957), 957–967.

30. J. M. Hammersley and D. C. Handscomb, Monte Carlo Methods, Methuen & Co. Ltd, London, 1964.

31. D. R. Jensen, Closure of multivariate t and related distributions, Statistics and Probability Letters 20 (1994), 307–312.

32. S. Joe, Randomization of lattice rules for numerical multiple integration, Journal of Computational and Applied Mathematics 31 (1990), 299–304.

33. S. John, On the evaluation of the probability integral of the multivariate t distribution, Biometrika 48 (1961), 409–417.

34. S. John, Methods for the evaluation of probabilities of polygonal and angular regions when the distribution is bivariate t, Sankhya A 26 (1964), 47–54.

35. S. John, On the evaluation of probabilities of convex polyhedra under multivariate normal and t distributions, Journal of the Royal Statistical Society B 28 (1966), 366–369.

36. Z. Kopal, Numerical Analysis, Chapman and Hall, London, 1955.

37. S. Kotz, N. Balakrishnan and N. L. Johnson, Continuous Multivariate Distributions, Volume 1: Models and Applications, second edition, John Wiley and Sons, New York, 2000.

38. S. Kotz, Y. Lumelskii and M. Pensky, The Stress-Strength Model and Its Generalizations: Theory and Applications, World Scientific Publishing Co., River Edge, New Jersey, 2003.

39. A. Kudo, A multivariate analogue of the one-sided test, Biometrika 50 (1963), 403–418.

40. S. Kunte and R. N. Rattihalli, Rectangular regions of maximum probability content, Annals of Statistics 12 (1984), 1106–1108.

41. C. Nicholson, The probability integral for two variables, Biometrika 33 (1943), 59–72.

42. D. B. Owen, Tables for computing bivariate normal probabilities, Annals of Mathematical Statistics 27 (1956), 1075–1090.

43. K. Pearson, On non-skew frequency surfaces, Biometrika 15 (1923), 231.

44. K. Pearson, Tables for Statisticians and Biometricians, Part II, Cambridge University Press for the Biometrika Trust, London, 1931.

45. S. J. Press and J. E. Rolph, Empirical Bayes estimation of the mean in a multivariate normal distribution, Communications in Statist. Theory and Methods 15 (1986), 2201–2228.

46. R. N. Rattihalli, Regions of maximum probability content and their applications, Ph.D. Thesis, University of Poona, India, 1981.

47. K. C. Seal, On a class of decision procedures for ranking means, Institute of Statistics Mimeograph Series No. 109, University of North Carolina at Chapel Hill, 1954.


48. R. Shimizu and Y. Fujikoshi, Sharp error bounds for asymptotic expansions ofthe distribution functions for scale mixtures, Annals of the Institute of Statis-tical Mathematics 49 (1997), 285–297.

49. I. H. Sloan and S. Joe, Lattice Methods for Multiple Integration, ClarendonPress, Oxford, 1994.

50. P. N. Somerville, Simultaneous confidence intervals (General linear model),Bulletin of the International Statistical Institute 2 (1993), 427–428.

51. P. N. Somerville, Exact all-pairwise multiple comparisons for the general linearmodel, in Proceedings of the 25th Symposium on the Interface, ComputingScience and Statistics (1993), (Interface Foundation, Virginia), 352–356.

52. P. N. Somerville, Simultaneous multiple orderings, Technical Report TR-93-1 (1993), Department of Statistics, University of Central Florida, Orlando.

53. P. N. Somerville, Multiple comparisons, Technical Report TR-94-1 (1994), Department of Statistics, University of Central Florida, Orlando.

54. P. N. Somerville, Multiple testing and simultaneous confidence intervals: Calculation of constants, Computational Statistics and Data Analysis 25 (1997), 217–233.

55. P. N. Somerville, A Fortran 90 program for evaluation of multivariate normal and multivariate-t integrals over convex regions, Journal of Statistical Software (1998), http://www.stat.ucla.edu/journals/jss/v03/i04.

56. P. N. Somerville, Numerical computation of multivariate normal and multivariate t probabilities over convex regions, Journal of Computational and Graphical Statistics 7 (1998), 529–544.

57. P. N. Somerville, Numerical evaluation of multivariate integrals over ellipsoidal regions, Bulletin of the International Statistical Institute (1999).

58. P. N. Somerville, Critical values for multiple testing and comparisons: one step and step down procedures, Journal of Statistical Planning and Inference 82 (1999), 129–138.

59. P. N. Somerville, Numerical computation of multivariate normal and multivariate t probabilities over ellipsoidal regions, Journal of Statistical Software (2001), http://www.stat.ucla.edu/www.jstatsoft.org/v06/i08.

60. P. N. Somerville and F. Bretz, Fortran 90 and SAS-IML programs for computation of critical values for multiple testing and simultaneous confidence intervals, Journal of Statistical Software (2001), http://www.stat.ucla.edu/www.jstatsoft.org/v06/i05.

61. F. E. Steffens, Power of bivariate studentized maximum and minimum modulus tests, Journal of the American Statistical Association 65 (1970), 1639–1644.

62. W. P. M. Vijverberg, Monte Carlo evaluation of multivariate normal probabilities, Journal of Econometrics (1995).

63. W. P. M. Vijverberg, Monte Carlo evaluation of multivariate Student’s t probabilities, Economics Letters 52 (1996), 1–6.

64. W. P. M. Vijverberg, Monte Carlo evaluation of multivariate normal probabilities, Journal of Econometrics 76 (1997), 281–307.

65. W. P. M. Vijverberg, Rectangular and wedge-shaped multivariate normal probabilities, Economics Letters 68 (2000), 13–20.

66. O. Wang and W. J. Kennedy, Comparison of algorithms for bivariate normal probability over a rectangle based on self-validating results from interval analysis, Journal of Statistical Computation and Simulation 37 (1990), 13–25.

67. O. Wang and W. J. Kennedy, Application of numerical interval analysis to obtain self-validating results for multivariate probabilities in a massively parallel environment, Statistics and Computing 7 (1997), 163–171.

68. H. P. Wynn and P. Bloomfield, Simultaneous confidence bands for regression analysis (with discussion), Journal of the Royal Statistical Society B 33 (1997), 202–217.


Department of Statistics, University of Nebraska, Lincoln, NE 68583
E-mail address: [email protected]

Department of Engineering Management and Systems Engineering, The George Washington University, Washington, D.C. 20052
