
manuscript No.

(will be inserted by the editor)

Generalized conditioning based approaches to computing confidence intervals for solutions to stochastic variational inequalities

Michael Lamm · Shu Lu

Received: date / Accepted: date

Abstract Stochastic variational inequalities (SVI) provide a unified framework for the study of a general class of nonlinear optimization and Nash-type equilibrium problems with uncertain model data. Often the true solution to an SVI cannot be found directly and must be approximated. This paper considers the use of a sample average approximation (SAA), and proposes a new method to compute confidence intervals for individual components of the true SVI solution based on the asymptotic distribution of SAA solutions. We estimate the asymptotic distribution based on one SAA solution instead of generating multiple SAA solutions, and can handle inequality constraints without requiring the strict complementarity condition in the standard nonlinear programming setting. The method in this paper uses the confidence regions to guide the selection of a single piece of a piecewise linear function that governs the asymptotic distribution of SAA solutions, and does not rely on convergence rates of the SAA solutions in probability. It also provides options to control the computation procedure and investigate effects of certain key estimates on the intervals.

Mathematics Subject Classification (2010) 90C33 · 90C15 · 65K10 · 62F25

1 Introduction

Variational inequalities (VI) have been used as a tool to study the mathematical structure and develop algorithms for a general class of nonlinear optimization and Nash-type equilibrium problems. For a comprehensive treatment on this subject see the book [11]. For the various applications of variational inequalities, see [3,13,14,17,

M. Lamm
SAS Institute Inc, SAS Campus Drive, Cary, NC 27513
E-mail: [email protected]

S. Lu
Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, 355 Hanes Hall, CB#3260, Chapel Hill, NC 27599-3260
E-mail: [email protected]


This is a post-peer-review, pre-copyedit version of an article published in Mathematical Programming. The final authenticated version is available online at: https://link.springer.com/article/10.1007/s10107-018-1279-z

18,21,35] and references therein. A solution to the variational inequality defined by a closed convex set $S \subset \mathbb{R}^n$ and a function $f: S \to \mathbb{R}^n$ is a point $x \in S$ that satisfies the inclusion $-f(x) \in N_S(x)$, where $N_S(x) \subset \mathbb{R}^n$ is the normal cone to $S$ at $x$:
\[ N_S(x) = \{ v \in \mathbb{R}^n \mid \langle v, s - x \rangle \le 0 \text{ for each } s \in S \}. \]

For many problems of interest, data defining the problem are subject to uncertainty. Several formulations of stochastic variational inequalities (SVI) have been proposed to incorporate uncertain model data into the VI framework, including the expected value formulation considered in [20,24,25,46,52], the expected residual minimization formulation studied in [1,4,6,7,12,33,54], and two-stage or multistage SVI formulations [5,42].

In this paper we consider the expected value formulation of an SVI. To give its definition, let $(\Omega, \mathcal{F}, P)$ be a probability space, and $\xi$ be a $d$-dimensional random vector defined on $\Omega$ and supported on a closed subset $\Xi$ of $\mathbb{R}^d$. Let $O$ be an open subset of $\mathbb{R}^n$ and $F$ be a measurable function from $O \times \Xi$ to $\mathbb{R}^n$, such that for each $x \in O$ the expectation $E\|F(x,\xi)\| < \infty$. Let $f_0(x) = E[F(x,\xi)]$ for each $x \in O$. The SVI problem is to find a point $x \in S \cap O$ that satisfies
\[ 0 \in f_0(x) + N_S(x). \qquad (1) \]

The SVI in the formulation (1) represents the first order necessary conditions of a large class of single stage stochastic optimization and stochastic equilibrium problems. This SVI formulation is closely related to statistical estimation such as M-estimation, and has been used to study statistical inference for sparse penalized regression; see [23,25,32,49,51,53] and references therein.

Throughout this paper we assume that the set $S$ defining the SVI (1) is a polyhedral convex set in $\mathbb{R}^n$. A solution to (1) will be referred to as a true solution and will be denoted by $x_0$. The expectation used to define the function $f_0$ often cannot be easily evaluated, and the true solution $x_0$ must be approximated. This paper considers the use of a sample average approximation (SAA) of (1), defined formally in §2, and the computation of confidence intervals for individual components of the true solution based on the solution to an SAA problem.

1.1 Motivation and related work

For the extensive study on stability and inference for stochastic optimization and stochastic variational inequalities, see [2,8,10,20,25,27,28,36,43,45–47,50,52] and references therein. Our work aims to complement the existing literature with the following features: (1) our interest is in confidence regions and intervals for the true SVI solution; (2) we estimate the asymptotic distribution based on an SAA solution without generating multiple SAA solutions; (3) we can handle inequality constraints and allow the true solution to be degenerate in the sense that the corresponding normal map solution of (1), defined formally in §2, is allowed to lie on the boundary of a cell in the normal manifold; in other words, the true solution does not need to satisfy strict complementarity in the standard nonlinear programming setting.


A general polyhedral convex set, if not affine, has a nonsmooth boundary that includes all of its proper faces. A consequence of such nonsmoothness is a piecewise linear structure in the asymptotic distribution of solutions to the SAA problems when the true solution is degenerate in the sense described above. Traditional approaches to performing inference are not applicable due to this piecewise structure, and alternative techniques are needed for developing and analyzing new inferential methods.

In [26,29–31] the authors have applied sensitivity analysis of constrained optimization and variational inequalities to exploit properties of the piecewise linear structure to construct asymptotically exact confidence regions and intervals. In [26], methods were proposed for computing generally valid individual confidence intervals for solutions to the SVI and its normal map formulation. These methods addressed the challenges posed by the piecewise structure of the SAA solutions' asymptotic distribution by using a conditioning based approach described in §2. However, the methods developed in [26] face two limitations. First, they require exponential convergence rates of the SAA solutions to obtain a consistent estimator of a certain piecewise linear function needed to compute the confidence intervals. Second, the computation of the confidence intervals can be sensitive to misestimation of the piecewise structure of the asymptotic distribution. Addressing these limitations serves as the primary motivation for the methods proposed in this paper.

1.2 Contributions

The methods proposed in this paper generalize the conditioning based approaches for computing individual confidence intervals proposed in [26]. Instead of working with an estimation of a global piecewise linear function as in [26], in this paper we directly target the estimation at a single piece of that piecewise linear function. Moreover, the methods that we propose are designed to allow for separate consideration of properties of the estimates for the cone corresponding to the selection function and the matrix representation of the selection function.

By allowing for a separate consideration of properties of these estimates, the proposed methods can incorporate a greater level of flexibility. As a result, instead of requiring the estimate of the selection function to be consistent, we allow an incorrect estimation to occur under a certain probability and then adjust for the potential misestimation during the interval computation stage. The new methods in this paper therefore do not rely on exponential convergence rates of the SAA solutions. The increased flexibility can also be exploited to generate a family of confidence intervals to account for different possibilities for the cone and the potentially piecewise structure, and to investigate the sensitivity of the intervals to certain estimates. These options are especially beneficial at moderate sample sizes, when certain asymptotic properties of the estimates may not be sufficiently realized.

The proposed methods are justified with weak convergence results in Theorems 1 and 2. Key to implementing these results is the ability to estimate from the sample data a cone, or family of cones, that satisfies the conditions of Theorems 1 and 2. In Theorem 3 we show how such a cone can be obtained by using confidence regions for the true solution proposed in [30]. Combined, these results allow the methods developed in this paper to overcome the limitations of the methods proposed in [26].

The remainder of this paper is organized as follows. In §2 we formally introduce the SAA formulation of (1) and the normal map formulation of a variational inequality, and review previous work on the computation of confidence sets for the true solution. Section 3 introduces the new methods for computing individual confidence intervals and contains the main theoretical results of this paper. In §4 we apply the proposed methods to a numerical example. For this example we are able to compute the true solution and evaluate the performance of the proposed methods; we observe that both the interval coverage rates and the interval lengths are in line with our expectations.

2 Background

In this section we formally define the SAA problem, introduce the normal map formulation of variational inequalities, discuss pertinent properties of piecewise linear functions, and review previous work on the construction of confidence sets for the true solution. Throughout, we use $N(\mu,\Sigma)$ to denote a normal random vector with mean $\mu$ and covariance matrix $\Sigma$, $Z \sim N(0, I_n)$ to denote a standard normal random vector, and $Y_n \Rightarrow Y$ to denote the weak convergence of random variables $Y_n$ to $Y$. The probability of an event $A$ is denoted by $\Pr(A)$, and the $(1-\alpha)*100\%$ percentile of a $\chi^2$ random variable with $l$ degrees of freedom is denoted by $\chi^2_l(\alpha)$. We denote the $j$th component of a vector $x$ by $(x)_j$ and the $j$th row of a matrix $M$ by $(M)_j$. For a set $C \subset \mathbb{R}^n$, we use $\operatorname{cone}(C)$ to denote the cone generated by $C$: that is, the set $\{\lambda x \mid \lambda \ge 0,\, x \in C\}$. We use $\operatorname{ri} C$ to denote the relative interior of $C$. Finally, $\|\cdot\|$ will denote the norm of an element in a normed space; unless a specific norm is stated it can be any norm, as long as the same norm is used in all related contexts.

The SAA problem considers a variational inequality where the function $f_0$ in (1) is estimated with a sample average function. Let $\xi^1, \ldots, \xi^N$ be independent and identically distributed (i.i.d.) random variables with the same distribution as that of $\xi$. The sample average function $f_N(x,\omega): O \times \Omega \to \mathbb{R}^n$ is defined to be
\[ f_N(x,\omega) = N^{-1} \sum_{i=1}^{N} F(x, \xi^i(\omega)). \qquad (2) \]
The sample average function gives rise to the SAA problem of finding a point $x \in S \cap O$ such that
\[ 0 \in f_N(x) + N_S(x). \qquad (3) \]
Throughout this paper, we shall refer to a point that satisfies (3) as an SAA solution and will denote it by $x_N$. Conditions under which the SAA solutions are known to be consistent estimators of $x_0$ have been well studied; see [10,19,20,25,46]. In addition to consistency, results about the exponential rate of convergence of SAA solutions in probability were obtained in [52]; see also [48].
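To make the construction of $f_N$ and the SAA problem (3) concrete, the following minimal sketch builds the sample average map from stored scenarios and approximates an SAA solution by a projected fixed-point iteration. The solver, step size, toy integrand and sample are illustrative assumptions (convergence of this simple iteration requires additional structure, e.g. strong monotonicity of $f_N$); none of this code is from the paper.

```python
import numpy as np

def solve_saa_vi(F, xis, project, x0, step=0.05, tol=1e-10, max_iter=50000):
    """Approximately solve 0 in f_N(x) + N_S(x) by the projected iteration
    x <- P_S(x - step * f_N(x)).  Illustrative sketch, not a general solver."""
    def f_N(x):
        # sample average function, equation (2)
        return np.mean([F(x, xi) for xi in xis], axis=0)
    x = np.array(x0, dtype=float)
    for _ in range(max_iter):
        x_next = project(x - step * f_N(x))
        if np.linalg.norm(x_next - x) < tol:
            return x_next
        x = x_next
    return x

# Toy usage with S = R^2_+ (so P_S is the componentwise positive part) and a
# hypothetical affine integrand; the data below are placeholder choices.
rng = np.random.default_rng(0)
xis = rng.uniform(0.0, 2.0, size=(200, 2))
F = lambda x, xi: np.array([[1.0, 0.5], [1.0, 2.0]]) @ x + xi - 1.0
x_N = solve_saa_vi(F, xis, project=lambda v: np.maximum(v, 0.0), x0=np.ones(2))
```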

A tool we shall use extensively for analyzing the asymptotic behavior of SAA solutions and for the computation of confidence intervals is the normal map formulation of a variational inequality. Taking $f_0$, $S$ and $O$ as above, the normal map [11,37,40,44] induced by $f_0$ and $S$ is a function $f^{\mathrm{nor}}_{0,S}: \Pi_S^{-1}(O) \to \mathbb{R}^n$ defined as
\[ f^{\mathrm{nor}}_{0,S}(z) = f_0 \circ \Pi_S(z) + z - \Pi_S(z). \]
Here $\Pi_S$ denotes the Euclidean projector onto the set $S$ and $f_0 \circ \Pi_S$ is the composite function of $f_0$ and $\Pi_S$. A point $x \in O \cap S$ is a solution to the SVI (1) only if the point $z = x - f_0(x)$ satisfies
\[ f^{\mathrm{nor}}_{0,S}(z) = 0. \qquad (4) \]

When the above equality is satisfied, one also has $\Pi_S(z) = x$. We refer to (4) as the normal map formulation of (1) and denote its solution by $z_0$. Similarly, we shall use $f^{\mathrm{nor}}_{N,S}$ to denote the normal map induced by $f_N$ and $S$, and denote the solution to the normal map formulation of (3) by $z_N$.

Since $S$ is a polyhedral convex set, the Euclidean projector $\Pi_S$ is a piecewise affine function. For a piecewise affine function $\psi$ we denote its selection functions by $\psi_j$, $j = 1,\ldots,k$. Every piecewise affine function admits a corresponding polyhedral subdivision $\Gamma = \{P_1,\ldots,P_k\}$ [44, Proposition 2.2.3]. Such a subdivision is a finite collection of $n$-dimensional polyhedral sets whose union equals $\mathbb{R}^n$, while the intersection of any two distinct sets is either empty or a proper face of both sets. When each set $P_j$ is a polyhedral cone the subdivision is referred to as a conical subdivision. A polyhedral subdivision $\Gamma$ corresponds to the piecewise affine function $\psi$ if the restriction of $\psi$ to each set $P_j \in \Gamma$ agrees with its selection function $\psi_j$. We will denote the selection function of $\psi$ corresponding to the set $P_j \in \Gamma$ by $\psi|_{P_j}$. When each selection function $\psi_j$ is linear, $\psi$ is piecewise linear and the corresponding subdivision is conical.

For the Euclidean projector onto a polyhedral convex set $S$, the corresponding polyhedral subdivision can be obtained from the $n$-cells in the normal manifold of $S$. The $n$-cells in the normal manifold of $S$ are constructed from the nonempty faces of $S$. Let $F$ be a nonempty face of $S$. Then on the relative interior of $F$ the normal cone to $S$ is a constant cone, denoted as $N_S(\operatorname{ri} F)$, and the corresponding $n$-cell $C_F$ is defined to be $C_F = F + N_S(\operatorname{ri} F)$. Each $k$-dimensional face of an $n$-cell is called a $k$-cell in the normal manifold for $k = 0,1,\ldots,n$. The relative interiors of all cells in the normal manifold of $S$ form a partition of $\mathbb{R}^n$.

A piecewise affine function $\psi$ will not have a linear Fréchet derivative (F-derivative) at points $x$ such that $\psi(x) = \psi_j(x) = \psi_k(x)$ for two distinct selection functions $\psi_j$ and $\psi_k$. While not F-differentiable at all points, a piecewise affine function will be B-differentiable everywhere. A function $h: \mathbb{R}^n \to \mathbb{R}^m$ is B-differentiable at a point $x$ if there exists a positive homogeneous function $H: \mathbb{R}^n \to \mathbb{R}^m$ such that
\[ h(x+v) = h(x) + H(v) + o(v). \]
The function $H$ is then called the B-derivative of $h$ at $x$, and we denote it by $dh(x)$. If $dh(x)$ is also linear, then it is the classical F-derivative. For a piecewise affine function $h$ with selection functions $h_i$ and the polyhedral subdivision $\Gamma$, we can give a specific representation of $dh(x)$ by writing $\Gamma(x) = \{P_i \in \Gamma \mid x \in P_i\}$, $I = \{i \mid P_i \in \Gamma(x)\}$ and $\Gamma'(x) = \{K_i = \operatorname{cone}(P_i - x) \mid i \in I\}$. That is, $\Gamma(x)$ is the collection of elements in $\Gamma$ that contain $x$, and $\Gamma'(x)$ is the "globalization" of $\Gamma(x)$ along with a shift of the origin. With this notation, the B-derivative $dh(x)$ is piecewise linear with the family of selection functions given by $\{dh_i(x) \mid i \in I\}$ and the corresponding conical subdivision given by $\Gamma'(x)$.

For a polyhedral convex set $S$, the form of the B-derivative $d\Pi_S(x)(\cdot)$ is also closely related to the normal manifold of $S$. First we define the tangent cone to a polyhedral convex set $S$ at $x \in S$ to be
\[ T_S(x) = \{ v \in \mathbb{R}^n \mid \text{there exists } t > 0 \text{ such that } x + tv \in S \}, \]
and the critical cone to $S$ at a point $z \in \mathbb{R}^n$ to be
\[ K(z) = T_S(\Pi_S(z)) \cap \{ z - \Pi_S(z) \}^{\perp}. \]
As shown in [39, Corollary 4.5] and [34, Lemma 5], for any point $z \in \mathbb{R}^n$ and any sufficiently small $h \in \mathbb{R}^n$ the equality
\[ \Pi_S(z+h) = \Pi_S(z) + \Pi_{K(z)}(h) \qquad (5) \]
holds, which implies
\[ d\Pi_S(z) = \Pi_{K(z)}. \qquad (6) \]
The connection to the normal manifold of $S$ follows from the fact that for all points $z$ in the relative interior of a $k$-cell the critical cone $K(z)$ is a constant cone; see [31, Theorem 8]. The B-derivative $d\Pi_S(z)(\cdot)$ is therefore the same function for all $z$ in the relative interior of a $k$-cell. For points $z$ and $z'$ in the relative interiors of different $k$-cells, $d\Pi_S(z)(\cdot)$ and $d\Pi_S(z')(\cdot)$ can be quite different. As a result, small changes in the choice of $z$ can result in significant changes in the form of $d\Pi_S(z)(\cdot)$, and the B-derivative is not continuous with respect to the point $z$ at which it is taken.
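To make the critical cone and the identity (6) concrete, the sketch below applies $\Pi_{K(z)}$ componentwise when $S$ is the nonnegative orthant. The componentwise rule (free coordinate when $z_j > 0$, fixed at zero when $z_j < 0$, positive part when $z_j = 0$) is our reading of the definitions above for this particular $S$, not code from the paper.

```python
import numpy as np

def critical_cone_projection(z, h, tol=1e-12):
    """Apply dP_S(z) = P_{K(z)} to a direction h when S = R^n_+.
    For this S, K(z) = T_S(P_S(z)) ∩ {z - P_S(z)}^⊥ is a product of R (where
    z_j > 0), {0} (where z_j < 0) and R_+ (where z_j = 0), so the projection
    acts coordinate by coordinate."""
    z = np.asarray(z, float)
    h = np.asarray(h, float)
    out = np.empty_like(h)
    for j in range(len(z)):
        if z[j] > tol:        # coordinate of P_S(z) is positive: free direction
            out[j] = h[j]
        elif z[j] < -tol:     # coordinate is pinned at zero
            out[j] = 0.0
        else:                 # boundary coordinate: project onto R_+
            out[j] = max(h[j], 0.0)
    return out

# Example: at z = (0.3, -0.1, 0.0), dP_S(z) keeps the first coordinate,
# kills the second, and takes the positive part of the third.
print(critical_cone_projection([0.3, -0.1, 0.0], [1.0, 1.0, -2.0]))  # [1. 0. 0.]
```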

Next, we introduce the two assumptions used in this paper. Note that the derivatives that appear in the assumptions are the classic F-derivatives.

Assumption 1 (a) $E\|F(x,\xi)\|^2 < \infty$ for all $x \in O$.
(b) The map $x \mapsto F(x,\xi(\omega))$ is continuously differentiable on $O$ for a.e. $\omega \in \Omega$.
(c) There exists a square integrable random variable $C$ such that
\[ \|F(x,\xi(\omega)) - F(x_0,\xi(\omega))\| + \|d_xF(x,\xi(\omega)) - d_xF(x_0,\xi(\omega))\| \le C(\omega)\,\|x - x_0\| \]
for a.e. $\omega \in \Omega$.

Assumption 1 implies that $f_0$ is continuously differentiable on $O$ with $df_0(x) = E[d_xF(x,\xi)]$; see, e.g., [46, Theorem 7.44]. By the chain rule of B-differentiability, $f^{\mathrm{nor}}_{0,S}$ is B-differentiable with its B-derivative at $z_0$ given by
\[ df^{\mathrm{nor}}_{0,S}(z_0)(h) = df_0(x_0) \circ d\Pi_S(z_0)(h) + h - d\Pi_S(z_0)(h). \qquad (7) \]
For any nonempty compact subset $X$ of $O$, let $C^1(X,\mathbb{R}^n)$ be the Banach space of continuously differentiable mappings $f: X \to \mathbb{R}^n$, equipped with the norm
\[ \|f\|_{1,X} = \sup_{x \in X} \|f(x)\| + \sup_{x \in X} \|df(x)\|. \]
In addition to ensuring the differentiability of $f_N$, Assumption 1 also guarantees the almost sure convergence of the sample average function $f_N$ to $f_0$ as an element of $C^1(X,\mathbb{R}^n)$ [46, Theorem 7.48], as well as the weak convergence of $\sqrt{N}(f_N - f_0)$ in $C^1(X,\mathbb{R}^n)$ as shown in [31, Theorem 5]. In the rest of this paper, let $X$ be a compact neighborhood of $x_0$ in $O$.


Assumption 2 Suppose that $x_0$ solves the variational inequality (1). Let $z_0 = x_0 - f_0(x_0)$, $L = df_0(x_0)$, $K_0 = T_S(x_0) \cap \{z_0 - x_0\}^{\perp}$, and assume that $L^{\mathrm{nor}}_{K_0}$ is a homeomorphism from $\mathbb{R}^n$ to $\mathbb{R}^n$, where $L^{\mathrm{nor}}_{K_0}$ is the normal map induced by $L$ and $K_0$.

Since $x_0$ solves (1) and $z_0$ is defined as $x_0 - f_0(x_0)$, we have $x_0 = \Pi_S(z_0)$. In view of (5), $\Pi_S(z) - x_0$ coincides with $\Pi_{K_0}(z - z_0)$ for $z$ in a neighborhood of $z_0$. It follows that $L^{\mathrm{nor}}_{K_0}$ is exactly $df^{\mathrm{nor}}_{0,S}(z_0)$, the B-derivative of $f^{\mathrm{nor}}_{0,S}$ at $z_0$, and that $L^{\mathrm{nor}}_{K_0}$ is a global homeomorphism if and only if $L^{\mathrm{nor}}_{S}(\cdot)$ is a local homeomorphism at $z_0$ (see also [40, Theorem 5.2]). Moreover, based on the relation between variational inequalities and normal maps, it can be shown that $L^{\mathrm{nor}}_{S}(\cdot)$ is a local homeomorphism at $z_0$ if and only if there exist neighborhoods $X$ of $x_0$ in $\mathbb{R}^n$ and $V$ of $0$ in $\mathbb{R}^n$ such that the following linear variational inequality
\[ v \in f_0(x_0) + L(x - x_0) + N_S(x) \]
has a unique solution in $X$ for any $v \in V$, and that solution is Lipschitz continuous with respect to $v$. The latter condition is known as strong regularity, and is a sufficient condition for the variational inequality (1) to have a locally unique solution in a neighborhood of $x_0$ under small perturbations of $f_0$ [38,41]. This sufficient condition is also necessary when the perturbation to $f_0$ includes a parameter on the right-hand side of (1) [9]. In this sense, Assumption 2 is the weakest condition possible to ensure that (1) has a locally unique solution under small perturbations of $f_0$.

It was shown in [37,40,44] that $L^{\mathrm{nor}}_{K_0}$ is a global homeomorphism if and only if it is coherent oriented, meaning that all matrices representing its linear selection functions are of the same nonzero determinantal sign. In particular, if the restriction of $L$ to the linear span of $K_0$ is positive definite, then $L^{\mathrm{nor}}_{K_0}$ is coherent oriented.

To relate the homeomorphism condition in Assumption 2 with other conditions used in nonlinear programming, consider a nonlinear program in $\mathbb{R}^n$ with $p$ inequality and $q$ equality constraints, with its objective and constraints being twice continuously differentiable. The corresponding Karush-Kuhn-Tucker (KKT) system can be written as a variational inequality (1) with $S = \mathbb{R}^n \times \mathbb{R}^p_+ \times \mathbb{R}^q$. Suppose that the KKT system has a solution. If that solution satisfies the linear independence constraint qualification (LICQ) and a strong second-order sufficient condition, then the homeomorphism condition holds [9,38]. In particular, Assumption 2 does not require the KKT solution to satisfy the strict complementarity condition, which requires multipliers corresponding to active constraints to be nonzero. Under strict complementarity, the normal map solution $z_0$ corresponding to the KKT solution lies in the interior of a cell in the normal manifold of $S$, $K_0$ is a subspace, $L^{\mathrm{nor}}_{K_0}$ is a linear map, and $f^{\mathrm{nor}}_{0,S}$ is differentiable at $z_0$. By relaxing strict complementarity, we allow $z_0$ to be located on the boundary of a cell in the normal manifold of $S$, $L^{\mathrm{nor}}_{K_0}$ to be piecewise linear, and $f^{\mathrm{nor}}_{0,S}$ to be nondifferentiable (but B-differentiable) at $z_0$.

Assumptions 1 and 2 are used in [31, Theorem 7] to show that for a.e. $\omega \in \Omega$ there exists an $N_\omega$ such that for $N \ge N_\omega$ the SAA problems have locally unique solutions $x_N$ and $z_N$ that converge to $x_0$ and $z_0$ respectively, with
\[ \sqrt{N}\,(z_N - z_0) \Rightarrow (L^{\mathrm{nor}}_{K_0})^{-1}(Y_0), \qquad (8) \]
and
\[ \sqrt{N}\,(\Pi_S(z_N) - \Pi_S(z_0)) \Rightarrow \Pi_{K_0} \circ (L^{\mathrm{nor}}_{K_0})^{-1}(Y_0). \qquad (9) \]

Here, $Y_0$ is a normal random vector in $\mathbb{R}^n$ with zero mean and the same covariance matrix as $F(x_0,\xi)$. For related work on the weak convergence of SAA solutions of SVIs and optimal solutions to constrained stochastic optimization problems, see [10,25,46] and references therein.

The above weak convergence results have been used to obtain asymptotically exact confidence regions for $z_0$ in [29–31]. Below we review some results that will be used in the next section. Let $\Sigma_0$ be the covariance matrix of $Y_0$ and $\Sigma_N$ be the sample covariance matrix of $\{F(x_N,\xi^i)\}_{i=1}^N$; then $\Sigma_N$ converges almost surely to $\Sigma_0$ [30, Lemma 3.6]. Following the notation used earlier, let $\Gamma(z_0)$ denote the collection of $n$-cells in the normal manifold of $S$ that contain $z_0$. [30, Proposition 3.5] says that under Assumptions 1 and 2,
\[ \lim_{N\to\infty} \Pr\Bigl( z_N \in \bigcup_{P_j \in \Gamma(z_0)} \operatorname{ri}(P_j) \Bigr) = 1, \qquad (10) \]

which implies that $d\Pi_S(z_N)$ is a linear function with high probability even if $d\Pi_S(z_0)$ is piecewise linear. Accordingly, the B-derivative $df^{\mathrm{nor}}_{N,S}(z_N)$ of $f^{\mathrm{nor}}_{N,S}$ at $z_N$, defined as
\[ df^{\mathrm{nor}}_{N,S}(z_N)(h) = df_N(x_N) \circ d\Pi_S(z_N)(h) + h - d\Pi_S(z_N)(h), \]
is a linear function with high probability and therefore not a consistent estimator of $df^{\mathrm{nor}}_{0,S}(z_0)$ in general.

If $\Sigma_0$ is nonsingular, then $\Sigma_N$ is nonsingular for large $N$ almost surely, and the following set (which is an ellipsoid with high probability)
\[ Q_N = \Bigl\{ z \in \mathbb{R}^n \,\Big|\, N \bigl[df^{\mathrm{nor}}_{N,S}(z_N)(z - z_N)\bigr]^T \Sigma_N^{-1} \bigl[df^{\mathrm{nor}}_{N,S}(z_N)(z - z_N)\bigr] \le \chi^2_n(\alpha) \Bigr\} \]
satisfies $\lim_{N\to\infty} \Pr\{z_0 \in Q_N\} = 1-\alpha$, as shown in [30]. In general, one can decompose $\Sigma_N$ as
\[ \Sigma_N = U_N^T D_N U_N, \]
where $U_N$ is an orthogonal $n \times n$ matrix and $D_N$ is a diagonal matrix with monotonically decreasing elements. Let $r > 0$ be the minimum of all positive eigenvalues of $\Sigma_0$ and $\bar D_N$ be the upper-left submatrix of $D_N$ whose diagonal elements are at least $r/2$. Let $l_N$ be the number of rows in $\bar D_N$, $(U_N)_1$ be the submatrix of $U_N$ that consists of its first $l_N$ rows, and $(U_N)_2$ be the submatrix that consists of the remaining rows of $U_N$. It was shown in [30] that for any $\varepsilon > 0$ and integer $N$, the set $R_{N,\varepsilon}$ defined as
\[ R_{N,\varepsilon} = \left\{ z \in \mathbb{R}^n \,\middle|\, \begin{array}{l} N \bigl[df^{\mathrm{nor}}_{N,S}(z_N)(z - z_N)\bigr]^T (U_N)_1^T \bar D_N^{-1} (U_N)_1 \bigl[df^{\mathrm{nor}}_{N,S}(z_N)(z - z_N)\bigr] \le \chi^2_{l_N}(\alpha) \\[4pt] \bigl\| \sqrt{N}\,(U_N)_2\, df^{\mathrm{nor}}_{N,S}(z_N)(z - z_N) \bigr\|_{\infty} \le \varepsilon \end{array} \right\} \qquad (11) \]
satisfies $\lim_{N\to\infty} \Pr\{z_0 \in R_{N,\varepsilon}\} = 1-\alpha$. Since the nonsingular case can be treated as a specialization of the singular case with $l_N = n$, in the rest of this paper we refer only to the singular case and consider regions $R_{N,\varepsilon}$.
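As a small illustration of how the region $Q_N$ is used in practice, the following sketch tests whether a candidate point lies in $Q_N$ in the nonsingular case, with the linear map $df^{\mathrm{nor}}_{N,S}(z_N)$ passed as a matrix. The function name, arguments and the use of SciPy's chi-square quantile are our own assumptions for illustration; the degenerate case would replace the quadratic form by the two conditions in (11).

```python
import numpy as np
from scipy.stats import chi2

def in_Q_N(z, z_N, Gamma_N, Sigma_N, N, alpha):
    """Membership test for the ellipsoidal region Q_N, assuming Sigma_N is
    invertible and df^nor_{N,S}(z_N) is linear with matrix Gamma_N.
    Illustrative sketch only."""
    r = Gamma_N @ (np.asarray(z, float) - np.asarray(z_N, float))
    stat = N * r @ np.linalg.solve(Sigma_N, r)
    # chi2.ppf(1 - alpha, n) is the (1 - alpha)*100% percentile chi^2_n(alpha)
    return stat <= chi2.ppf(1.0 - alpha, df=len(z_N))
```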

While $df^{\mathrm{nor}}_{N,S}(z_N)$ can be used to obtain asymptotically exact confidence regions as above, it does not lead to asymptotically exact confidence intervals for individual components of $z_0$, because it is not a consistent estimator of $df^{\mathrm{nor}}_{0,S}(z_0)$ and therefore does not provide sufficient information on the distribution of individual components of $z_N$. It was shown in [30] that the following interval
\[ (z_N)_j \pm \sqrt{ \frac{ \chi^2_1(\alpha)\, \Bigl[ df^{\mathrm{nor}}_{N,S}(z_N)^{-1}\, \Sigma_N\, df^{\mathrm{nor}}_{N,S}(z_N)^{-T} \Bigr]_{j,j} }{N} } \qquad (12) \]
is not guaranteed to be asymptotically exact without more restrictive assumptions on the true solution $z_0$.

Methods for computing asymptotically exact individual confidence intervals for $(z_0)_j$ and $(x_0)_j$ are proposed in [26] by using a consistent estimator for $df^{\mathrm{nor}}_{0,S}(z_0)$ developed in [29]. To reduce the amount of computation associated with the piecewise structure, the authors propose a conditioning-based approach in [26] to eliminate calculations for irrelevant regions. This is done by defining two functions $h^{\alpha}_j(\cdot,\cdot)$ and $\bar h^{\alpha}_j(\cdot,\cdot,\cdot)$ to be used to compute confidence intervals for $(z_0)_j$ and $(x_0)_j$ respectively. To define the function $h^{\alpha}_j(\cdot,\cdot)$, let $\psi: \mathbb{R}^n \to \mathbb{R}^n$ be a piecewise linear homeomorphism with a family of selection functions $\{M_1,\ldots,M_l\}$ and the corresponding conical subdivision $\{K_1,\ldots,K_l\}$. For any choice of cone $K_i$, $i = 1,\ldots,l$, component $j = 1,\ldots,n$ and $\alpha \in (0,1)$, we first define $h^{\alpha}_j(\psi,x)$ for points $x \in \operatorname{int} K_i$ as the unique and strictly positive number satisfying
\[ \Pr\Bigl( \bigl|\bigl(\psi^{-1}(Z)\bigr)_j\bigr| \le h^{\alpha}_j(\psi,x),\ \psi^{-1}(Z) \in K_i \Bigr) = (1-\alpha)\,\Pr\bigl( \psi^{-1}(Z) \in K_i \bigr), \qquad (13) \]
where $Z$ denotes a standard normal random vector and $\bigl(\psi^{-1}(Z)\bigr)_j$ stands for the $j$th component of the random variable $\psi^{-1}(Z)$. For points $x \in \bigcap_{s=1}^{k} K_{i_s}$ define $h^{\alpha}_j(\psi,x) = \max_{s=1,\ldots,k} h^{\alpha}_j(\psi,x_{i_s})$, where $x_{i_s} \in \operatorname{int} K_{i_s}$. The definition of $\bar h^{\alpha}_j(\cdot,\cdot,\cdot)$ is analogous. Let $\psi$ and $\nu$ be piecewise linear functions from $\mathbb{R}^n$ to $\mathbb{R}^n$ that share a common conical subdivision, $\{K_1,\ldots,K_l\}$, with $\nu$ invertible. Then for any choice of cone $K_i$, $i = 1,\ldots,l$, component $j = 1,\ldots,n$ and $\alpha \in (0,1)$, define $\bar h^{\alpha}_j(\psi,\nu,x)$ for points $x \in \operatorname{int} K_i$ to be
\[ \bar h^{\alpha}_j(\psi,\nu,x) = \inf\left\{ \ell \ge 0 \,\middle|\, \frac{\Pr\bigl( |(\psi\circ\nu^{-1}(Z))_j| \le \ell \text{ and } \nu^{-1}(Z) \in K_i \bigr)}{\Pr\bigl( \nu^{-1}(Z) \in K_i \bigr)} \ge (1-\alpha) \right\}. \qquad (14) \]
For points $x \in \bigcap_{s=1}^{v} K_{i_s}$ define $\bar h^{\alpha}_j(\psi,\nu,x) = \max_{s=1,\ldots,v} \bar h^{\alpha}_j(\psi,\nu,x_{i_s})$, where $x_{i_s} \in \operatorname{int} K_{i_s}$.

In [26], the confidence intervals for $(z_0)_j$ are obtained by evaluating $h^{\alpha}_j(\psi,x)$, where $\psi$ is a transformation of a consistent estimator of $df^{\mathrm{nor}}_{0,S}(z_0)$ so that the distribution of $\psi^{-1}(Z)$ approximates that of $\sqrt{N}(z_N - z_0)$, and the location of $x$ is determined by $z_N$ and that estimator of $df^{\mathrm{nor}}_{0,S}(z_0)$ and is used to identify the cone in the conical subdivision associated with $df^{\mathrm{nor}}_{0,S}(z_0)$ that contains $z_N - z_0$. Confidence intervals of $(x_0)_j$ are obtained by evaluations of $\bar h^{\alpha}_j$, which is analogous to $h^{\alpha}_j$ but more involved. This approach to computing individual confidence intervals is dependent on the consistency of the estimator for $df^{\mathrm{nor}}_{0,S}(z_0)$ and the accurate estimate for the piecewise structure of the asymptotic distribution of SAA solutions. The consistency of that estimator is ensured by requiring $z_N$ to converge to $z_0$ at an exponential rate, which is in turn guaranteed by replacing Assumption 1 with a stronger assumption on the moment generating functions of $F(x,\xi) - f_0(x)$ and $d_xF(x,\xi) - df_0(x)$.

In the next section we will develop new methods that do not use that consistent estimator and do not require the exponential convergence rate of $z_N$. The conditioning-based approach will be extended with generalizations of $h^{\alpha}_j$ and $\bar h^{\alpha}_j$. The new methods also provide a natural mechanism for investigating the sensitivity of interval widths to different estimates for the piecewise structure in the asymptotic distribution of the SAA solutions.

3 New methods on computation of individual confidence intervals

In this section we develop new methods for computing confidence intervals for $(z_0)_j$ and $(x_0)_j$. The basic idea is to estimate the selection function of $df^{\mathrm{nor}}_{0,S}(z_0)$ that contains $z_N - z_0$ in its corresponding cone, and then compute confidence intervals based on that estimate. In contrast to the methods in [26], which first obtain an estimate of the function $df^{\mathrm{nor}}_{0,S}(z_0)$ and then identify a selection function of it, the new methods directly target the estimation of that selection function and not the global piecewise structure. As a result, the new methods are more flexible than the methods in [26]. In particular, the new methods do not require the exponential convergence rate of $z_N$. The new methods also allow a misestimation of the selection function to occur within a given probability, and allow for the use of multiple cones to estimate the cone corresponding to the selection function.

Below, Section 3.1 discusses how to estimate the selection function of $df^{\mathrm{nor}}_{0,S}(z_0)$, Section 3.2 introduces the functions $h^{\alpha}_j$ and $\bar h^{\alpha_2}_j$ to be used for interval computation and proves the limiting coverage properties of the intervals, and Section 3.3 provides a specific method to estimate the cone corresponding to the selection function.

3.1 Estimation of a selection function of $df^{\mathrm{nor}}_{0,S}(z_0)$

For a given sample we need to estimate the selection function of $df^{\mathrm{nor}}_{0,S}(z_0)$ that contains $z_N - z_0$ in its corresponding cone. Using the notation from the previous section, let $\Gamma(z_0) = \{P_1,\ldots,P_l\}$ denote the $n$-cells in the normal manifold of $S$ that contain $z_0$. Then the cones $K_i = \operatorname{cone}(P_i - z_0)$, $i = 1,\ldots,l$, make up the elements of the conical subdivision associated with $df^{\mathrm{nor}}_{0,S}(z_0)$. Let $\Gamma'(z_0) = \{K_1,\ldots,K_l\}$ and let $K_{i(z_N)}$ denote the element of $\Gamma'(z_0)$ with
\[ z_N - z_0 \in K_{i(z_N)}; \]
by (10) the choice of $K_{i(z_N)}$ is unique with high probability. Let $M_{i(z_N)}$ denote the matrix representation for the selection function corresponding to the cone $K_{i(z_N)}$. Then by definition, for all $h \in K_{i(z_N)}$,
\[ M_{i(z_N)} h = df^{\mathrm{nor}}_{0,S}(z_0)(h) = df_0(x_0)\circ d\Pi_S(z_0)(h) + h - d\Pi_S(z_0)(h). \]

The proposed methods will require an estimate $M_N(\omega)$ for $M_{i(z_N)}$ that satisfies
\[ \lim_{N\to\infty} \Pr\left( \sup_{h \in \mathbb{R}^n,\, h \ne 0} \frac{\|M_N h - M_{i(z_N)} h\|}{\|h\|} \le \varepsilon \right) = 1 \qquad (15) \]


for all $\varepsilon > 0$. To obtain such an estimate $M_N$ we make use of the following lemma.

Lemma 1 Suppose that Assumptions 1 and 2 hold. Then the estimate
\[ M_N := df^{\mathrm{nor}}_{N,S}(z_N) = df_N(x_N)\circ d\Pi_S(z_N) + I_n - d\Pi_S(z_N) \]
satisfies (15).

Proof Let $\Gamma(z_0)$ denote the set of $n$-cells in the normal manifold of $S$ that contain $z_0$, and for each $N$ define the event
\[ A_N = \Bigl\{ \omega \,\Big|\, z_N \in \bigcup_{P_j \in \Gamma(z_0)} \operatorname{ri}(P_j) \Bigr\}. \]
When $A_N$ occurs the SAA solution is contained in the interior of an $n$-cell $P_N(\omega) \in \Gamma(z_0)$. For any $z \in P_N(\omega)$, the restriction of $d\Pi_S(z)$ to the cone $K_z(\omega) = \operatorname{cone}(P_N(\omega) - z)$ satisfies $d\Pi_S(z)|_{K_z(\omega)} = d\Pi_S(z_N)$.

By [30, Proposition 3.5], under Assumptions 1 and 2, $\lim_{N\to\infty} \Pr(A_N) = 1$. Therefore
\[ \lim_{N\to\infty} \Pr\Bigl( d\Pi_S(z_0)|_{K_{i(z_N)}} = d\Pi_S(z_N) \Bigr) = 1 \qquad (16) \]
and
\[ \lim_{N\to\infty} \Pr\left( \sup_{h \in \mathbb{R}^n,\, h \ne 0} \frac{\|M_N h - M_{i(z_N)} h\|}{\|h\|} \le \varepsilon \right) = \lim_{N\to\infty} \Pr\left( \sup_{h \in \mathbb{R}^n,\, h \ne 0} \frac{\|df_N(x_N)\circ d\Pi_S(z_N)(h) - df_0(x_0)\circ d\Pi_S(z_N)(h)\|}{\|h\|} \le \varepsilon \right) = 1, \]
where the final equality follows from the a.s. convergence of $f_N$ to $f_0$ in $C^1(X,\mathbb{R}^n)$. $\sqcap\!\sqcup$

Estimation of the cone $K_{i(z_N)}$ demands more work. Without knowing the rate at which $z_N$ converges to $z_0$, we do not have an estimate that equals the cone $K_{i(z_N)}$ with probability converging to one. We therefore relax the conditions required for an estimate of $K_{i(z_N)}$ in two ways. First, we allow $K_{i(z_N)}$ to be estimated with a set of cones instead of a single cone. Second, we only require that set to contain $K_{i(z_N)}$ with a limiting probability greater than a specified lower bound. Formally, our methods will make use of a family of polyhedral cones of dimension $n$, denoted by $\mathcal{K}^{\alpha_1}_N$, that satisfies
\[ \liminf_{N\to\infty} \Pr\bigl( K_{i(z_N)} \in \mathcal{K}^{\alpha_1}_N \bigr) \ge 1 - \alpha_1. \qquad (17) \]
We will provide a computationally efficient method in Section 3.3 for choosing $\mathcal{K}^{\alpha_1}_N$ based on an SAA solution $z_N$, and prove that it satisfies (17).


3.2 Limiting coverage probabilities of confidence intervals

We first give a generalization of the function $h^{\alpha}_j$ in (13), to be used to compute the width of a confidence interval for $(z_0)_j$. Let $M$ be an invertible $n \times n$ matrix and $\mathcal{K} = \{K_1,\ldots,K_m\}$ be a collection of polyhedral convex sets of dimension $n$. We define the function $h^{\alpha}_j(\mathcal{K},M)$ as
\[ h^{\alpha}_j(\mathcal{K},M) = \inf\left\{ \ell \ge 0 \,\middle|\, \frac{\Pr\bigl( |(M^{-1}Z)_j| \le \ell, \text{ and } M^{-1}Z \in K_i \bigr)}{\Pr\bigl( M^{-1}Z \in K_i \bigr)} \ge 1-\alpha \text{ for all } K_i \in \mathcal{K} \right\}. \]

To compute a confidence interval's width, the value of $M$ at which we evaluate $h^{\alpha}_j$ is chosen so that $M^{-1}Z$ provides an estimate for the random variable $(L^{\mathrm{nor}}_{K_0})^{-1}(Y_0)$, as defined in (8), when $(L^{\mathrm{nor}}_{K_0})^{-1}(Y_0) \in K_{i(z_N)}$. The set of cones $\mathcal{K}$ at which we evaluate $h^{\alpha}_j$ provides a set of estimates for $K_{i(z_N)}$. The ability to include multiple cones in $\mathcal{K}$ provides a natural means for addressing the potential sensitivity of the interval's width to misestimation of the piecewise structure in the distribution of $(L^{\mathrm{nor}}_{K_0})^{-1}(Y_0)$.

The following convergence result establishes the desired properties of intervals computed using $h^{\alpha}_j$. As before, $\Gamma'(z_0) = \{K_1,\ldots,K_l\}$ denotes the conical subdivision associated with $df^{\mathrm{nor}}_{0,S}(z_0)$, with $df^{\mathrm{nor}}_{0,S}(z_0)|_{K_i} = M_i$ and $K_i = \operatorname{cone}(P_i - z_0)$, where $P_1,\ldots,P_l$ are the $n$-cells in the normal manifold of $S$ that contain $z_0$.

Theorem 1 Suppose that Assumptions 1 and 2 hold, that $\mathcal{K}^{\alpha_1}_N$ satisfies (17) for $\alpha_1 \in (0,1]$, and that the determinant of $\Sigma_0$ is strictly positive. Then for every $j = 1,\ldots,n$ and $\alpha_2 \in (0,1)$,
\[ \liminf_{N\to\infty} \Pr\Bigl( \sqrt{N}\,|(z_N - z_0)_j| \le h^{\alpha_2}_j\bigl(\mathcal{K}^{\alpha_1}_N,\ \Sigma_N^{-1/2} df^{\mathrm{nor}}_{N,S}(z_N)\bigr) \Bigr) \ge 1 - (\alpha_1 + \alpha_2). \]

Proof For each $N$ define a function $\Phi_0(z_N): \mathbb{R}^n \to \mathbb{R}^n$ as
\[ \Phi_0(z_N)(h) = df_N(\Pi_S(z_N)) \circ d\Pi_S(z_0)(h) + h - d\Pi_S(z_0)(h) \quad \text{for each } h \in \mathbb{R}^n, \qquad (18) \]
and define the event $A_N = \bigl\{ K_{i(z_N)} \in \mathcal{K}^{\alpha_1}_N \bigr\}$. Let $B$ be a fixed neighborhood of $z_0$ such that $B \cap (z_0 + K_i) = B \cap P_i$ for $i = 1,\ldots,l$. Then
\[ \begin{aligned} &\liminf_{N\to\infty} \Pr\Bigl( \sqrt{N}\,|(z_N - z_0)_j| \le h^{\alpha_2}_j\bigl(\mathcal{K}^{\alpha_1}_N,\ \Sigma_N^{-1/2} df^{\mathrm{nor}}_{N,S}(z_N)\bigr) \Bigr) \\ &\quad \ge \liminf_{N\to\infty} \sum_{i=1}^{l} \Pr\Bigl( \sqrt{N}\,|(z_N - z_0)_j| \le h^{\alpha_2}_j\bigl(\{K_{i(z_N)}\},\ \Sigma_N^{-1/2} df^{\mathrm{nor}}_{N,S}(z_N)\bigr);\ A_N;\ z_N \in B \cap \operatorname{int} P_i \Bigr). \end{aligned} \]
When $A_N$ holds and $z_N \in B \cap \operatorname{int} P_i$, we have $\Phi_0(z_N)|_{K_{i(z_N)}} = df^{\mathrm{nor}}_{N,S}(z_N)$, and for any $x_i \in \operatorname{int}(K_{i(z_N)})$
\[ h^{\alpha_2}_j\bigl(\{K_{i(z_N)}\},\ \Sigma_N^{-1/2} df^{\mathrm{nor}}_{N,S}(z_N)\bigr) = h^{\alpha_2}_j\bigl(\Sigma_N^{-1/2}\Phi_0(z_N),\ x_i\bigr). \qquad (19) \]
Under Assumptions 1 and 2, for all $\varepsilon > 0$,
\[ \lim_{N\to\infty} \Pr\left( \sup_{h \in \mathbb{R}^n,\, h \ne 0} \frac{\|\Phi_0(z_N)(h) - df^{\mathrm{nor}}_{0,S}(z_0)(h)\|}{\|h\|} \le \varepsilon \right) = 1. \]
By applying [26, Lemma 3] and the same argument used to show [26, Theorem 6], we find
\[ h^{\alpha_2}_j\bigl(\Sigma_N^{-1/2}\Phi_0(z_N),\ x_i\bigr) \Rightarrow h^{\alpha_2}_j\bigl(\Sigma_0^{-1/2} df^{\mathrm{nor}}_{0,S}(z_0),\ x_i\bigr), \]
and
\[ \lim_{N\to\infty} \sum_{i=1}^{l} \Pr\Bigl( \sqrt{N}\,|(z_N - z_0)_j| \le h^{\alpha_2}_j\bigl(\Sigma_N^{-1/2}\Phi_0(z_N),\ x_i\bigr);\ z_N \in B \cap \operatorname{int} P_i \Bigr) = 1 - \alpha_2. \qquad (20) \]
Next we observe that
\[ \begin{aligned} &\liminf_{N\to\infty} \sum_{i=1}^{l} \Pr\Bigl( \sqrt{N}\,|(z_N - z_0)_j| \le h^{\alpha_2}_j\bigl(\{K_{i(z_N)}\},\ \Sigma_N^{-1/2} df^{\mathrm{nor}}_{N,S}(z_N)\bigr);\ A_N;\ z_N \in B \cap \operatorname{int} P_i \Bigr) \\ &\quad = \liminf_{N\to\infty} \sum_{i=1}^{l} \Bigl[ \Pr\Bigl( \sqrt{N}\,|(z_N - z_0)_j| \le h^{\alpha_2}_j\bigl(\Sigma_N^{-1/2}\Phi_0(z_N),\ x_i\bigr);\ A_N;\ z_N \in B \cap \operatorname{int} P_i \Bigr) \Bigr] \\ &\quad \ge \lim_{N\to\infty} \sum_{i=1}^{l} \Bigl[ \Pr\Bigl( \sqrt{N}\,|(z_N - z_0)_j| \le h^{\alpha_2}_j\bigl(\Sigma_N^{-1/2}\Phi_0(z_N),\ x_i\bigr);\ z_N \in B \cap \operatorname{int} P_i \Bigr) \Bigr] - \alpha_1, \end{aligned} \]
where the first equality follows from (19), and the inequality follows from the definition of $A_N$ and (17). Combining this with (20) proves the theorem. $\sqcap\!\sqcup$

In the proof of Theorem 1 the nonsingularity assumption about $\Sigma_0$ is used to ensure that the limit (20) is well defined. This assumption can be replaced with the condition described after [26, Theorem 6] without affecting the convergence result. That condition requires that for all cones $K_i \in \Gamma'(z_0)$ and matrices $M_i$ satisfying $df^{\mathrm{nor}}_{0,S}(z_0)|_{K_i} = M_i$, the polyhedra $K_i$ are continuity sets with respect to the random vector $(M_i)^{-1}Y_0$, and that $\Pr\bigl( (M_i^{-1}Y_0)_j = 0 \bigr)$ must equal zero.

Next, we extend the definition of $\bar h^{\alpha}_j$ in (14) to be used to directly compute confidence intervals for $(x_0)_j$. As above, let $M$ be an invertible $n \times n$ matrix and $\mathcal{K} = \{K_1,\ldots,K_m\}$ be a collection of polyhedral convex sets of dimension $n$. In addition, let $Q$ be an $n \times n$ matrix. We define the function $\bar h^{\alpha_2}_j(\mathcal{K},Q,M)$ as
\[ \bar h^{\alpha_2}_j(\mathcal{K},Q,M) = \inf\left\{ \ell \ge 0 \,\middle|\, \frac{\Pr\bigl( |(Q)_j M^{-1}Z| \le \ell, \text{ and } M^{-1}Z \in K_i \bigr)}{\Pr\bigl( M^{-1}Z \in K_i \bigr)} \ge 1-\alpha_2 \text{ for all } K_i \in \mathcal{K} \right\}. \]

Theorem 2 Suppose that Assumptions 1 and 2 hold, that $\mathcal{K}^{\alpha_1}_N$ satisfies (17) for $\alpha_1 \in (0,1]$, and that the determinant of $\Sigma_0$ is strictly positive. Then for every $j = 1,\ldots,n$ and $\alpha_2 \in (0,1)$,
\[ \liminf_{N\to\infty} \Pr\Bigl( \sqrt{N}\,|(x_N - x_0)_j| \le \bar h^{\alpha_2}_j\bigl(\mathcal{K}^{\alpha_1}_N,\ d\Pi_S(z_N),\ \Sigma_N^{-1/2} df^{\mathrm{nor}}_{N,S}(z_N)\bigr) \Bigr) \ge 1 - (\alpha_1 + \alpha_2). \]


Proof For each $N$ define $\Phi_0(z_N): \mathbb{R}^n \to \mathbb{R}^n$ as in (18), and define the event $A_N = \bigl\{ K_{i(z_N)} \in \mathcal{K}^{\alpha_1}_N \bigr\}$. As before, let $B$ be a fixed neighborhood of $z_0$ such that $B \cap (z_0 + K_i) = B \cap P_i$ for $i = 1,\ldots,l$, and let $u_i \in \operatorname{int} K_i$. Then using the same arguments in Theorem 1 and [26, Theorem 7] it follows that
\[ \begin{aligned} &\liminf_{N\to\infty} \Pr\Bigl( \sqrt{N}\,|(x_N - x_0)_j| \le \bar h^{\alpha_2}_j\bigl(\mathcal{K}^{\alpha_1}_N,\ d\Pi_S(z_N),\ \Sigma_N^{-1/2} df^{\mathrm{nor}}_{N,S}(z_N)\bigr) \Bigr) \\ &\quad \ge \liminf_{N\to\infty} \sum_{i=1}^{l} \Pr\Bigl( \sqrt{N}\,|(x_N - x_0)_j| \le \bar h^{\alpha_2}_j\bigl(\{K_{i(z_N)}\},\ d\Pi_S(z_N),\ \Sigma_N^{-1/2} df^{\mathrm{nor}}_{N,S}(z_N)\bigr);\ A_N;\ z_N \in B \cap \operatorname{int} P_i \Bigr) \\ &\quad = \liminf_{N\to\infty} \sum_{i=1}^{l} \Pr\Bigl( \sqrt{N}\,|(x_N - x_0)_j| \le \bar h^{\alpha_2}_j\bigl(d\Pi_S(z_0),\ \Sigma_N^{-1/2}\Phi_0(z_N),\ u_i\bigr);\ A_N;\ z_N \in B \cap \operatorname{int} P_i \Bigr) \\ &\quad \ge \lim_{N\to\infty} \sum_{i=1}^{l} \Bigl[ \Pr\Bigl( \sqrt{N}\,|(x_N - x_0)_j| \le \bar h^{\alpha_2}_j\bigl(d\Pi_S(z_0),\ \Sigma_N^{-1/2}\Phi_0(z_N),\ u_i\bigr);\ z_N \in B \cap \operatorname{int} P_i \Bigr) \Bigr] - \alpha_1 \\ &\quad \ge 1 - (\alpha_1 + \alpha_2). \end{aligned} \]
$\sqcap\!\sqcup$

As in the proof of Theorem 1, the assumption that $\Sigma_0$ is invertible is used to ensure that the final limit in the proof of Theorem 2 is well defined, and can similarly be relaxed to the conditions described after [26, Theorem 7]. By its definition, the function $\bar h^{\alpha}_j$ will return a value of zero when the $j$th component of $d\Pi_S(z_N)$ is zero. Under the conditions assumed by Theorem 2, it follows from the a.s. convergence of $z_N$ to $z_0$ and equations (5), (6), and (16) that
\[ \lim_{N\to\infty} \Pr\Bigl( (d\Pi_S(z_N))_j = 0 \Bigr) = \lim_{N\to\infty} \Pr\Bigl( (d\Pi_S(z_N))_j = 0 \text{ and } (x_N)_j = (x_0)_j \Bigr), \]
and the value of zero returned by $\bar h^{\alpha}_j$ is asymptotically correct. However, the value of $\bar h^{\alpha}_j$ may incorrectly equal zero when $z_N$ is contained in an $n$-cell $P_k$ for which the $j$th component of $\Pi_S|_{P_k}$ is zero but $z_0 \notin P_k$, an event which occurs with probability converging to zero by (16). In practice, to guard against such cases, when the $j$th component of an estimate $d\Pi_S(z_N)$ is equal to zero we replace it by the unit vector $e_j$ whose $j$th element is one. This replacement does not affect the results of Theorem 2 and would return a value of $\bar h^{\alpha}_j = h^{\alpha}_j$.

To evaluate $h^{\alpha}_j$ and $\bar h^{\alpha}_j$, the approach described in [26, Section 4] can be applied. This approach requires a search over values of $\ell > 0$ and evaluating probabilities of the form
\[ \Pr\bigl( |(Q)_j M^{-1}Z| \le \ell, \text{ and } M^{-1}Z \in K_i \bigr). \]
Monte Carlo or Quasi-Monte Carlo methods as discussed and implemented in [15,16] can be used to evaluate those probabilities. In practice we have found that this implementation provides sufficient accuracy for the evaluation of $h^{\alpha}_j$ and $\bar h^{\alpha}_j$ so long as the probability $\Pr\bigl( M^{-1}Z \in K_i \bigr)$ is greater than $10^{-6}$ in magnitude.
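For readers who want to experiment with the quantities above, the following is a plain Monte Carlo sketch of $h^{\alpha}_j(\mathcal{K},M)$: it draws standard normal vectors, maps them through $M^{-1}$, and takes the conditional $(1-\alpha)$ quantile of $|(M^{-1}Z)_j|$ given each cone, returning the largest such value. It is not the (Quasi-)Monte Carlo implementation of [15,16]; the function name, the cone-as-membership-test interface and the sample size are illustrative assumptions.

```python
import numpy as np

def h_alpha_j(cones, M, j, alpha, n_samples=100000, seed=0):
    """Monte Carlo approximation of h^alpha_j(K, M): the smallest l >= 0 with
    Pr(|(M^{-1}Z)_j| <= l, M^{-1}Z in K_i) / Pr(M^{-1}Z in K_i) >= 1 - alpha
    for every cone K_i.  Each cone is passed as a membership test on R^n."""
    rng = np.random.default_rng(seed)
    n = M.shape[0]
    W = rng.standard_normal((n_samples, n)) @ np.linalg.inv(M).T  # rows ~ M^{-1}Z
    widths = []
    for in_cone in cones:
        mask = np.array([in_cone(w) for w in W])
        if mask.mean() < 1e-6:
            raise ValueError("cone probability too small for a reliable estimate")
        # conditional (1 - alpha) quantile of |(M^{-1}Z)_j| given the cone
        widths.append(np.quantile(np.abs(W[mask, j]), 1.0 - alpha))
    return max(widths)

# Example with the single cone R^2_- and M = I_2 (hypothetical inputs).
ell = h_alpha_j([lambda w: np.all(w <= 0.0)], np.eye(2), j=0, alpha=0.05)
```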


3.3 Identification of $\mathcal{K}^{\alpha_1}_N$

For the methods of Theorems 1 and 2 to be computationally tractable, we would like to limit the number of cones $K_i \in \mathcal{K}^{\alpha_1}_N$. In Theorem 3 below we show that from the sample data we can identify a single cone to be included in $\mathcal{K}^{\alpha_1}_N$ to satisfy (17). To do so we use the asymptotically exact confidence regions in (11) to identify a subset of cells in the normal manifold of $S$, and select a cell with the lowest dimension from this subset. The proof of Theorem 3 does not require $\Sigma_N$ to be invertible and will use the following proposition, which provides a closed form expression for the simultaneous confidence intervals computed from $R_{N,\varepsilon}$. The simultaneous confidence intervals are computed by finding the minimal axis-aligned bounding box that contains $R_{N,\varepsilon}$. When $df^{\mathrm{nor}}_{N,S}(z_N)$ is a linear function, the resulting simultaneous confidence intervals are given by
\[ \bigl[(z_N)_1 - w^{\varepsilon}_{N,1},\ (z_N)_1 + w^{\varepsilon}_{N,1}\bigr] \times \cdots \times \bigl[(z_N)_n - w^{\varepsilon}_{N,n},\ (z_N)_n + w^{\varepsilon}_{N,n}\bigr], \qquad (21) \]
where $w^{\varepsilon}_{N,j}$ is the optimal value of the following problem, in which $(U_N)_1$, $(U_N)_2$ and $\bar D_N$ are defined in Proposition 1:
\[ \begin{aligned} \text{maximize} \quad & (w)_j \qquad (22) \\ \text{subject to:} \quad & N \bigl[df^{\mathrm{nor}}_{N,S}(z_N)(w)\bigr]^T (U_N)_1^T \bar D_N^{-1} (U_N)_1 \bigl[df^{\mathrm{nor}}_{N,S}(z_N)(w)\bigr] \le \chi^2_{l_N}(\alpha) \\ & \bigl\| \sqrt{N}\,(U_N)_2\, df^{\mathrm{nor}}_{N,S}(z_N)(w) \bigr\|_{\infty} \le \varepsilon. \end{aligned} \]

Proposition 1 Suppose that $df^{\mathrm{nor}}_{N,S}(z_N)$ is a linear homeomorphism and $\Sigma_N$ has decomposition $\Sigma_N = U_N^T D_N U_N$, where $U_N$ is an orthogonal matrix with rows $u_{N,1},\ldots,u_{N,n}$ and $D_N$ is a diagonal matrix with elements $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$. For a choice of $l_N$ with $\lambda_{l_N} > 0$, let $\bar D_N$ be a diagonal matrix with elements $\lambda_1,\ldots,\lambda_{l_N}$,
\[ (U_N)_1 = \begin{bmatrix} u_{N,1} \\ \vdots \\ u_{N,l_N} \end{bmatrix} \quad \text{and} \quad (U_N)_2 = \begin{bmatrix} u_{N,l_N+1} \\ \vdots \\ u_{N,n} \end{bmatrix}. \]
Then for each $j = 1,\ldots,n$, the optimal value of (22) is an affine function of $\varepsilon$ given by
\[ w^{\varepsilon}_{N,j} = \sqrt{ \frac{ \chi^2_{l_N}(\alpha) \sum_{i=1}^{l_N} (c_{N,j}\, u_{N,i}^T)^2\, \lambda_i }{N} } + \frac{\varepsilon}{\sqrt{N}} \sum_{i=l_N+1}^{n} \bigl| c_{N,j}\, u_{N,i}^T \bigr|, \qquad (23) \]
where $c_{N,j}$ is the $j$th row of $(df^{\mathrm{nor}}_{N,S}(z_N))^{-1}$.

where cN, j is the jth row of (d f norN,S (zN))�1.

Proof Let VN and TN be the subspaces spanned byn

uTN,1, . . . ,u

TN,lN

o

andn

uTN,lN+1, . . . ,u

TN,n

o

,respectively. Then VN is the orthogonal complement of TN and any vector d f nor

N,S (zN)(w)can be decomposed as

d f norN,S (zN)(w) = v+ t

with v 2VN and t 2 TN . The result (23) can be proved by formulating (22) in terms ofv and t and then separating it into two problems.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65

Page 16: Generalized conditioning based approaches to computing ...shulu.web.unc.edu/files/2018/06/18gcba_online.pdf · Several formulations of stochastic variational inequalities (SVI) have

16 Michael Lamm, Shu Lu

ut
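The closed form (23) is straightforward to evaluate numerically. The sketch below is one possible reading of it, with the eigenvalue cutoff standing in for the threshold $r/2$ used in the text; the function name and arguments are our own, not from the paper.

```python
import numpy as np
from scipy.stats import chi2

def simultaneous_halfwidths(Gamma_N, Sigma_N, N, alpha, eps, rank_tol=1e-8):
    """Half-widths w^eps_{N,j} of the simultaneous intervals (21) via the
    closed form (23).  Gamma_N is the matrix of df^nor_{N,S}(z_N), assumed to
    be a linear homeomorphism.  Illustrative sketch under those assumptions."""
    lam, U = np.linalg.eigh(Sigma_N)           # Sigma_N = U diag(lam) U^T
    order = np.argsort(lam)[::-1]              # eigenvalues in decreasing order
    lam, U = lam[order], U[:, order]
    rows = U.T                                  # rows u_{N,1}, ..., u_{N,n}
    l_N = int(np.sum(lam > rank_tol))
    C = np.linalg.inv(Gamma_N)                  # c_{N,j} is the j-th row of C
    q = chi2.ppf(1.0 - alpha, df=l_N)
    w = np.empty(Gamma_N.shape[0])
    for j in range(len(w)):
        proj = rows @ C[j]                      # entries c_{N,j} u_{N,i}^T
        w[j] = np.sqrt(q * np.sum(proj[:l_N] ** 2 * lam[:l_N]) / N) \
               + eps / np.sqrt(N) * np.sum(np.abs(proj[l_N:]))
    return w
```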

In the proof of Theorem 3 we will use the same notation as in Theorem 1. With this notation, the conical subdivision of $d\Pi_S(z_0)$ is comprised of sets $K_i = \operatorname{cone}(P_i - z_0)$, where $P_1,\ldots,P_l$ are all $n$-cells in the normal manifold of $S$ that contain $z_0$. The element of the conical subdivision of $d\Pi_S(z_0)$ that contains $z_N - z_0$ for a particular sample shall be denoted by $K_{i(z_N)}$.

Theorem 3 Suppose that Assumptions 1 and 2 hold. Let $\alpha_1 \in (0,1)$ and $\varepsilon > 0$ be fixed. Let $R_{N,\varepsilon}$ be a $(1-\alpha_1)*100\%$ confidence region for $z_0$ as given in equation (11), $P_N$ be an $n$-cell in the normal manifold of $S$ with $z_N \in P_N$, and $C_{i_N}$ be a cell that has the smallest dimension of all cells that intersect $R_{N,\varepsilon}$ and $P_N$. Then for $z_{i_N} \in \operatorname{ri} C_{i_N}$, $\mathcal{K}^{\alpha_1}_N = \{\operatorname{cone}(P_N - z_{i_N})\}$ satisfies (17).

Proof Let $C_1,\ldots,C_m$ denote all of the cells in the normal manifold of $S$, and $\mathcal{I} = \{i \mid z_0 \notin C_i\}$ be the collection of indices for the cells that do not contain $z_0$. For each cell $C_i$ define the function
\[ d_i(z) = d(z,C_i) = \min_{x \in C_i} \|x - z\|. \]
Then $\min_{i \in \mathcal{I}} d_i(z_0) = \delta > 0$, since $d_i(z_0) > 0$ for each $i \in \mathcal{I}$ and $\mathcal{I}$ is a finite set. Let $C_{i_0}$ denote the unique cell that contains $z_0$ in its relative interior. As shown in [31, Proposition 5.1], the cell $C_{i_0}$ is the cell of lowest dimension to contain $z_0$. Therefore for any cell $C_i$ with $C_i \ne C_{i_0}$ and dimension less than or equal to that of $C_{i_0}$, $z_0 \notin C_i$ and $i \in \mathcal{I}$.

For any $i \in \mathcal{I}$ and $z \in C_i$,
\[ \|z_N - z\| \ge \|z_0 - z\| - \|z_N - z_0\| \ge \delta - \|z_N - z_0\|. \qquad (24) \]
Let $G_N$ denote the event that $\delta/2 \le \min_{i \in \mathcal{I}} d_i(z_N)$. By (24) and the almost sure convergence of $z_N$ to $z_0$, the probability of $G_N$ occurring converges to one.

Next, consider the simultaneous confidence intervals (21) for $z_0$. As in Proposition 1, let $\Sigma_N = U_N^T D_N U_N$, where $U_N$ is an orthogonal matrix with rows $u_{N,1},\ldots,u_{N,n}$ and $D_N$ is a diagonal matrix with elements $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$. For each $j = 1,\ldots,n$, let $c_{N,j}$ be the $j$th row of $df^{\mathrm{nor}}_{N,S}(z_N)^{-1}$. Then the half-width $w^{\varepsilon}_{N,j}$ of the $j$th interval in (21) is given by (23).

Let $w_N = (w^{\varepsilon}_{N,1},\ldots,w^{\varepsilon}_{N,n})$. Define the event $A_N$ to be $\{\|w_N\| < \delta/2 \text{ and } z_0 \in R_{N,\varepsilon}\}$, and let $B$ be a fixed neighborhood of $z_0$ such that $B \cap (z_0 + K_i) = B \cap P_i$ for $i = 1,\ldots,l$. Then
\[ \begin{aligned} \liminf_{N\to\infty} \Pr\bigl( K_{i(z_N)} = \operatorname{cone}(P_N - z_{i_N}) \bigr) &= \liminf_{N\to\infty} \sum_{i=1}^{l} \Pr\bigl( K_i = \operatorname{cone}(P_i - z_{i_N});\ z_N \in B \cap \operatorname{int} P_i \bigr) \\ &\ge \liminf_{N\to\infty} \sum_{i=1}^{l} \Pr\bigl( K_i = \operatorname{cone}(P_i - z_{i_N});\ A_N;\ G_N;\ z_N \in B \cap \operatorname{int} P_i \bigr) \\ &= \liminf_{N\to\infty} \sum_{i=1}^{l} \Pr\bigl( A_N;\ G_N;\ z_N \in B \cap \operatorname{int} P_i \bigr). \end{aligned} \]
The final equality holds for the following reason. Suppose $A_N$ and $G_N$ both occur, with $z_N \in B \cap \operatorname{int} P_i$. Then $P_N = P_i$, and $C_{i_0}$ intersects with $P_N$ because $P_i$ contains $z_0$. When the event $A_N$ occurs, for any $z \in R_{N,\varepsilon}$,
\[ \|z_N - z\| \le \|w_N\| < \delta/2. \]
On the other hand, when $G_N$ occurs,
\[ \delta/2 \le \min_{i \in \mathcal{I}} d_i(z_N), \]
and thus no cell with index $i \in \mathcal{I}$ intersects with $R_{N,\varepsilon}$. Therefore each cell that intersects with $R_{N,\varepsilon}$ necessarily contains $z_0$ and $C_{i_0}$. It follows that $C_{i_0}$ is the cell of lowest dimension to intersect with $R_{N,\varepsilon}$ and $P_N$, i.e., $C_{i_0} = C_{i_N}$. Consequently $K_i = \operatorname{cone}(P_i - z_0) = \operatorname{cone}(P_i - z_{i_N})$.

It follows that
\[ \liminf_{N\to\infty} \Pr\bigl( K_{i(z_N)} = \operatorname{cone}(P_N - z_{i_N}) \bigr) \ge \liminf_{N\to\infty} \sum_{i=1}^{l} \Pr\bigl( A_N;\ G_N;\ z_N \in B \cap \operatorname{int} P_i \bigr) = \liminf_{N\to\infty} \Pr(A_N;\ G_N) \ge \lim_{N\to\infty} \Pr(z_0 \in R_{N,\varepsilon}) = 1 - \alpha_1, \]
and $\mathcal{K}^{\alpha_1}_N = \{\operatorname{cone}(P_N - z_{i_N})\}$ satisfies (17). $\sqcap\!\sqcup$

When the $k$-cell that contains $z_0$ in its relative interior is a singleton, so that $C_{i_0} = \{z_0\}$, we have
\[ \lim_{N\to\infty} \Pr\bigl( K_{i(z_N)} = \operatorname{cone}(P_N - z_{i_N}) \bigr) = 1 - \alpha_1 \]
and (17) is satisfied with equality. Alternatively, if $z_0$ is contained in the interior of an $n$-cell,
\[ \lim_{N\to\infty} \Pr\bigl( K_{i(z_N)} = \operatorname{cone}(P_N - z_{i_N}) \bigr) = 1, \]
so in general it is not possible to sharpen the results of Theorem 3. The proof of Theorem 3 can also be applied to justify the use of simultaneous confidence intervals computed from $R_{N,\varepsilon}$ to identify the set of $k$-cells from which $C_{i_N}$ is chosen. When the set $S$ is a box, working with the simultaneous confidence intervals has the computational benefit of allowing us to identify the cell $C_{i_N}$ by making $n$ componentwise comparisons.

In Theorem 3 we use $(1-\alpha_1)*100\%$ confidence regions to identify which cell in the normal manifold of $S$ contains $z_0$ in its relative interior and which cone in the conical subdivision of $df^{\mathrm{nor}}_{0,S}(z_0)$ contains $z_N - z_0$. The limiting probability of making an incorrect choice is bounded above by $\alpha_1$. This is different from the methods of [26], which estimate $df^{\mathrm{nor}}_{0,S}(z_0)$ based on the exponential rate of convergence of $z_N$. The new approach allows us to construct intervals to meet or exceed the $(1-\alpha)*100\%$ confidence level, with $\alpha = \alpha_1 + \alpha_2$, by balancing between the control on cone estimation error ($\alpha_1$) and the interval noncoverage rate given the correct cone ($\alpha_2$).
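The componentwise comparison for a box-shaped $S$ can be illustrated with a short sketch. For $S = \mathbb{R}^n_+$, a coordinate whose simultaneous interval straddles zero is pinned to the $0$-dimensional piece, and the cone $\operatorname{cone}(P_N - z_{i_N})$ then has that coordinate restricted to the half-line with the sign of $(z_N)_j$, while every other coordinate is free. This encoding of the rule, and the function name, are our own reading of the discussion above; the numeric values in the usage line are taken from the example at the end of this section.

```python
import numpy as np

def select_cone_box(z_N, w, tol=1e-12):
    """For S = R^n_+, pick the single cone K^{alpha_1}_N = cone(P_N - z_{i_N})
    of Theorem 3 from the simultaneous intervals [z_N_j - w_j, z_N_j + w_j]:
    a straddling coordinate gives a half-line with the sign of (z_N)_j,
    any other coordinate gives the whole line.  Illustrative sketch."""
    z_N, w = np.asarray(z_N, float), np.asarray(w, float)
    labels = []
    for j in range(len(z_N)):
        if z_N[j] - w[j] <= 0.0 <= z_N[j] + w[j]:
            labels.append('+' if z_N[j] >= 0.0 else '-')   # half-line coordinate
        else:
            labels.append('R')                              # free coordinate
    def in_cone(h):
        return all((lab == 'R')
                   or (lab == '+' and h[j] >= -tol)
                   or (lab == '-' and h[j] <= tol)
                   for j, lab in enumerate(labels))
    return labels, in_cone

# With z_N = (-0.0307, -0.0654) and both intervals containing zero, the
# selected cone is R_- x R_-, as in the two-dimensional example below.
labels, in_cone = select_cone_box([-0.0307, -0.0654], [0.1045, 0.1138])
```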


Theorems 1 and 2 provide a general framework for confidence interval computation. In addition to accommodating the balance between the two types of error $\alpha_1$ and $\alpha_2$, they also allow for the option of including multiple cones in the set $\mathcal{K}^{\alpha_1}_N$. This gives an opportunity to balance between increased empirical coverage and the amount of computation required, which may range from the choice of a single cone as in Theorem 3 to choosing all the cones generated from the faces of $P_N$. In the latter case $\alpha_1$ can be set to zero, but the required computations may be intractable for problems of even modest size due to the large number of cones involved.

To end this section we illustrate the estimation of $K_{i(z_N)}$ using a single cone with the following example for a single SAA solution. Let $S = \mathbb{R}^2_+$ and
\[ F(x,\xi) = \begin{pmatrix} \xi_1 & \xi_2 \\ \xi_3 & \xi_4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} \xi_5 \\ \xi_6 \end{pmatrix}, \]
where the random vector $\xi$ is uniformly distributed over the box $[0,2]\times[0,1]\times[0,2]\times[0,4]\times[-1,1]\times[-1,1]$. For this example $f_0$ has a closed form expression and the true SVI is given by
\[ 0 \in \begin{pmatrix} 1 & 1/2 \\ 1 & 2 \end{pmatrix} x + N_{\mathbb{R}^2_+}(x) \]
and has true solutions $z_0 = x_0 = 0$. The B-derivative $df^{\mathrm{nor}}_{0,S}(z_0)$ is a piecewise linear function represented by the matrices
\[ \begin{pmatrix} 1 & 1/2 \\ 1 & 2 \end{pmatrix}, \quad \begin{pmatrix} 1 & 1/2 \\ 0 & 2 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \text{and} \quad \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \]
in the orthants $\mathbb{R}^2_+$, $\mathbb{R}_-\times\mathbb{R}_+$, $\mathbb{R}^2_-$, and $\mathbb{R}_+\times\mathbb{R}_-$, respectively. An SAA problem with $N = 200$ is given by
\[ 0 \in \begin{pmatrix} 1.0887 & 0.4712 \\ 1.0357 & 1.9836 \end{pmatrix} x + \begin{pmatrix} 0.0307 \\ 0.0654 \end{pmatrix} + N_{\mathbb{R}^2_+}(x). \]
The SAA solution is $z_N = (-0.0307, -0.0654) \in \mathbb{R}_-\times\mathbb{R}_- = K_{i(z_N)}$, and the matrices $M_{i(z_N)}$ and $M_N$ are equal to the identity matrix $I_2$. The 97.5% confidence region for $z_0$ is equal to
\[ Q_N = \left\{ z \in \mathbb{R}^2 \,\middle|\, (z - z_N)^T \begin{pmatrix} 3.3798 & -0.0549 \\ -0.0549 & 2.8518 \end{pmatrix} (z - z_N) \le 0.03689 \right\}, \]
which is contained in the rectangle $[-0.1352, 0.0738]\times[-0.1792, 0.0483]$. Edges of the rectangle are conservative simultaneous confidence intervals for $(z_0)_1$ and $(z_0)_2$. As seen in Figure 1, both $Q_N$ and the rectangle intersect all of the faces of $\mathbb{R}_-\times\mathbb{R}_-$, and using either set leads to the selection of $\mathcal{K}^{0.025}_N = K_{i(z_N)} = \mathbb{R}_-\times\mathbb{R}_-$. Evaluating $h^{0.025}_1$ and $h^{0.025}_2$ at the estimates $\mathcal{K}^{0.025}_N$ and $M_N$, we obtain the 95% individual confidence intervals $[-0.1173, 0.0558]$ for $(z_0)_1$ and $[-0.1597, 0.0288]$ for $(z_0)_2$. Note that $[-0.1173, 0.0558]\times\{(z_N)_2\} \subset Q_N$ and $\{(z_N)_1\}\times[-0.1597, 0.0288] \subset Q_N$. For this example $d\Pi_S(z_N)$ is equal to the zero matrix, so using the conservative approach to evaluating $\bar h^{0.025}_j$ described after Theorem 2, $\bar h^{0.025}_j = h^{0.025}_j$, and the 95% individual confidence intervals for $(x_0)_1$ and $(x_0)_2$ are given by $[0, 0.0866]$ and $[0, 0.0942]$ respectively.
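A reader who wants to replay this example can assemble the pieces from the sketches given earlier; the fragment below shows only the bookkeeping for $S = \mathbb{R}^2_+$, taking the reported sampled entries as given. The helper names referred to in the comments (h_alpha_j, select_cone_box) are from the illustrative sketches above, not from the paper.

```python
import numpy as np

# SAA data reported in the example (N = 200): f_N(x) = A_N x + b_N on S = R^2_+.
A_N = np.array([[1.0887, 0.4712], [1.0357, 1.9836]])
b_N = np.array([0.0307, 0.0654])
N = 200

# Since -b_N lies in the nonpositive orthant, x_N = 0 solves the SAA problem,
# and the normal-map solution is z_N = x_N - f_N(x_N) = -b_N.
x_N = np.zeros(2)
z_N = -b_N

# z_N sits in the interior of R_- x R_-, on which dP_S(z_N) = 0 and hence
# M_N = df^nor_{N,S}(z_N) = I_2, matching the discussion in the text.
M_N = np.eye(2)

# Interval half-widths would then follow from h^{0.025}_j({R_- x R_-}, Sigma_N^{-1/2} M_N),
# e.g. via the Monte Carlo sketch h_alpha_j, once Sigma_N has been formed as the
# sample covariance of {F(x_N, xi^i)}.
```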


Fig. 1: Confidence region QN (dashed) and simultaneous confidence intervals
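To make the computations in this example concrete, the following minimal Python sketch (ours, not part of the paper) draws one sample of size $N = 200$, forms the SAA data $A_N$ and $b_N$, and solves the piecewise linear normal map equation $0 = A_N \Pi(z) + b_N + z - \Pi(z)$ by enumerating the four orthants of $\mathbb{R}^2$. Its output depends on the random draw and will not reproduce the particular $A_N$, $b_N$, and $z_N$ reported above; the covariance estimate and $\chi^2$ critical value needed to form $Q_N$ are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200

# xi is uniform over the box [0,2] x [0,1] x [0,2] x [0,4] x [-1,1] x [-1,1]
lo = np.array([0.0, 0.0, 0.0, 0.0, -1.0, -1.0])
hi = np.array([2.0, 1.0, 2.0, 4.0, 1.0, 1.0])
xi = lo + (hi - lo) * rng.random((N, 6))

# Sample averages of F(x, xi) = [[xi1, xi2], [xi3, xi4]] x + (xi5, xi6)
A_N = xi[:, :4].mean(axis=0).reshape(2, 2)
b_N = xi[:, 4:].mean(axis=0)

# On each orthant the normal map is linear: (A_N P + I - P) z + b_N,
# where P projects onto the coordinates that are nonnegative on that orthant.
z_N = None
for s1 in (1, -1):
    for s2 in (1, -1):
        P = np.diag([float(s1 > 0), float(s2 > 0)])
        z = np.linalg.solve(A_N @ P + np.eye(2) - P, -b_N)
        if s1 * z[0] >= -1e-12 and s2 * z[1] >= -1e-12:  # z lies in the orthant tried
            z_N = z
            break
    if z_N is not None:
        break

x_N = np.maximum(z_N, 0.0)   # SAA solution of the original SVI
print("z_N =", z_N, "x_N =", x_N)
```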

4 A numerical example

In this section we apply the methods proposed in §3 to a stochastic Cournot-Nash equilibrium problem. The example is adapted from a model for the European natural gas market previously considered in [20] and [22]. Cournot competition is used to describe the competitive behavior of firms making simultaneous decisions regarding their levels of production and distribution of a homogeneous product across various markets.

To formally state the problem, assume that the decision made by each firm $i$, $i = 1,\ldots,m$, is represented by a vector $x_i$ of dimension $d_i$ and that each firm's decision is constrained to be in a convex set $S_i \subset \mathbb{R}^{d_i}$. Let $x = (x_1,\ldots,x_m)$ denote the concatenation of all players' decisions and denote the random vector by $\xi$. With $f_{i0}(x) = E[F_i(x,\xi)]$ denoting the expected profit function for player $i$, $x^* = (x^*_1,\ldots,x^*_m)$ is a Cournot-Nash equilibrium if
\[
x^*_i \in \operatorname*{argmax}_{x_i \in S_i} f_{i0}(x^*_1,\ldots,x_i,\ldots,x^*_m) \quad \text{for each } i = 1,\ldots,m.
\]
The first order necessary condition for player $i$'s profit maximization problem is
\[
0 \in -\frac{\partial f_{i0}}{\partial x_i}(x) + N_{S_i}(x_i).
\]

Let $S = \prod_{i=1}^m S_i$ and
\[
f_0(x) = \Bigl( -\frac{\partial f_{10}}{\partial x_1}(x), \ldots, -\frac{\partial f_{m0}}{\partial x_m}(x) \Bigr).
\]
A necessary condition for $x^*$ to be a Cournot-Nash equilibrium is
\[
0 \in f_0(x) + N_S(x). \qquad (25)
\]
For (25) to fit the framework of an SVI we require a function $F(x,\xi)$ such that $f_0(x) = E[F(x,\xi)]$ and $E\|F(x,\xi)\| < \infty$ for all $x \in S$. The natural candidate
\[
F(x,\xi) = \Bigl( -\frac{\partial F_1}{\partial x_1}(x,\xi), \ldots, -\frac{\partial F_m}{\partial x_m}(x,\xi) \Bigr)
\]


will meet these criteria if the profit functions $F_i$ satisfy the conditions of Assumption 1. In this case the SVI (25) gives rise to the SAA problem
\[
0 \in f_N(x) + N_S(x), \qquad (26)
\]
where
\[
f_N(x) = N^{-1}\sum_{k=1}^N F(x,\xi^k).
\]

In the natural gas market example we consider in this section, the competing firms are four gas producers, indexed by $i$, with six markets indexed by $j$. Producers decide on the quantity of gas to ship each year during four time periods, indexed by $t$, to the six markets. There are 24 decision variables for each producer, denoted by $x^t_{i,j}$, corresponding to the amount of natural gas shipped by producer $i$ to market $j$ each year in time period $t$. The decision vectors $x_i$ are required to be nonnegative, so each player's set of feasible decisions is $S_i = \mathbb{R}^{24}_+$ and the concatenation of all players' decisions is given by $x = (x_1,\ldots,x_4) \in S_1 \times \cdots \times S_4 \subset \mathbb{R}^{96}$.
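For orientation, one natural way to organize these decision variables in code is a three-dimensional array indexed by producer, market, and time period; the layout below is our own convention for illustration, since the paper does not prescribe a particular ordering of the 96 components.

```python
import numpy as np

n_producers, n_markets, n_periods = 4, 6, 4

# x[i, j, t] = yearly shipment of producer i to market j in time period t
x = np.zeros((n_producers, n_markets, n_periods))

# Each producer's decision vector x_i lives in R^24; the concatenation lives in R^96.
x_i = x[0].reshape(-1)
x_flat = x.reshape(-1)
print(x_i.shape, x_flat.shape)   # (24,) (96,)
```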

In the model's formulation, the following parameters are used:

– $D^t_j$: the domestic gas production of market $j$ each year in time period $t$
– $c^t_i$: the constant marginal transportation cost for shipping for producer $i$ in time period $t$
– $e^t_j$: the price elasticity of demand for natural gas in market $j$ in time period $t$
– $y_t$: the number of years in time period $t$, taken to be 5 years for time periods 1, 2, 3, and 20 years for time period 4.

In time period $t$, the yearly production cost for producer $i$ is given by
\[
G_i(x) = a_i - b_i \ln\Bigl(X_i - \sum_{j=1}^{6} x^t_{i,j}\Bigr),
\]
where $a_i$, $b_i$ and $X_i$ are parameters. The parameter $X_i$ provides an upper bound on the yearly production of producer $i$. Values for the parameters indexed by player $i$ are given in Table 1 and values for the parameters indexed by market $j$ are given in Table 2.

Table 1: Producer parameter values

Producer       a       b    X     c1    c2    c3    c4
Russia        1.606   51    80   .58   .56   .55   .55
Netherlands   1.212   67    80   .14   .13   .13   .12
Norway        1.507   85    80   .35   .34   .34   .33
Algeria       2.102   96    80   .70   .69   .64   .62

The uncertainty in the problem is associated with the price of natural gas in the different markets.


Table 2: Values for price elasticity e and domestic gas production D

               Period 1         Period 2         Period 3         Period 4
Market          e      D         e      D         e      D         e      D
BelLux        -1.07   0.00     -1.26   0.00     -1.34   0.00     -1.42   0.00
FRGer         -1.46  13.70     -1.58  13.80     -1.68  13.80     -1.79  13.80
France         -.81   4.80     -1.19   2.90     -1.57   3.00     -2.01   3.00
Italy         -1.15  10.40     -1.36  10.00     -1.45  10.00     -1.54  10.40
Netherlands    -.94  22.93     -1.13  20.96     -1.29  24.11     -1.45  23.90
UK             -.61  33.70      -.87  35.00     -1.10  37.00     -1.30  38.00

The price of natural gas in market $j$ for time period $t$, denoted by $P^t_j$, is determined by the total amount of natural gas available annually, as well as $\xi^t$, the random price of oil in time period $t$. It is given by
\[
P^t_j(x,\xi^t) = p^t_j(\xi^t)\left(\frac{Q^t_j(x)}{q^t_j(\xi^t)}\right)^{1/e^t_j}.
\]

In the above equation, $Q^t_j(x) = D^t_j + \sum_{i=1}^{4} x^t_{i,j}$ is the total amount of natural gas available in market $j$ annually throughout time period $t$. The functions $p^t_j(\xi^t)$ and $q^t_j(\xi^t)$ provide the base price and the base demand for natural gas as a function of the price of oil, and are defined as
\[
p^t_j(\xi^t) = p0^t_j\left(\xi^t/or_t\right) \quad\text{and}\quad q^t_j(\xi^t) = q0^t_j\left(\xi^t/or_t\right)^{\eta_t}
\]
with the following parameters:

– $p0^t_j$: reference price of natural gas in market $j$ in time period $t$
– $q0^t_j$: reference demand for natural gas in market $j$ in time period $t$
– $or_t$: reference price for oil in time period $t$
– $\eta_t$: the elasticity relating the relative demand for natural gas to the relative price of oil.

We assume that the prices of oil in each time period are independent and uniformly distributed with lower and upper bounds $L_t$ and $U_t$. The values for the parameters in the base price and demand functions are given in Tables 3 and 4.

Table 3: Reference prices p0 and demands q0

               Period 1         Period 2         Period 3         Period 4
Market         p0     q0        p0     q0        p0     q0        p0     q0
BelLux        5.12    7.8      2.56    9.4      3.41    9.4      5.12    9.5
FRGer         5.27   40.7      2.64   46.2      3.52   46.5      5.27   44.6
France        5.25   23.6      2.62   28.3      3.50    9.8      5.25   28.5
Italy         5.15   25.3      2.57   34.9      3.43   37.5      5.15   37.2
Netherlands   5.16   28.9      2.58   29.9      3.44   32.2      5.16   29.7
UK            4.54   43.8      2.27   50.3      3.03   56.4      4.54   53.7

Assuming a fixed annual interest rate of $r = 0.1$, for each time period we use the factor $f_t$ to express the future value of money with
\[
f_t = \left(\frac{(1+r)^{y_t}-1}{r}\right)\left(\frac{1}{(1+r)^{\sum_{s=1}^{t} y_s}}\right).
\]
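As a quick check (added here for illustration; it uses only $r$ and the $y_t$ given above), the four factors can be computed directly:

```python
r = 0.1
y = [5, 5, 5, 20]   # number of years in time periods 1-4

f = []
for t in range(4):
    annuity = ((1 + r) ** y[t] - 1) / r          # future value at the end of period t of 1 per year
    discount = (1 + r) ** (-sum(y[: t + 1]))     # discount from the end of period t back to time 0
    f.append(annuity * discount)

print([round(ft, 4) for ft in f])
```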


Table 4: Time period parameters in the base price and demand functions

t    eta_t    or_t    L_t    U_t
1    -0.10     30      16     34
2    -0.12     15      12     18
3    -0.24     30      24     36
4    -0.36     35      28     42

The net present value profit function for producer $i$ is then defined to be
\[
F_i(x,\xi) = \sum_{t=1}^{4} f_t \left[ \sum_{j=1}^{6} \left( P^t_j(x,\xi^t) - c^t_i \right) x^t_{i,j} - G_i(x) \right]. \qquad (27)
\]
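To show how the pieces of (27) combine, the sketch below (ours) assembles the net present value profit of a single producer. Only the cost parameters for Russia (Table 1) and the discount factors are model data; the shipments and the prices $P^t_j$ are random placeholder values rather than outputs of the market model.

```python
import numpy as np

rng = np.random.default_rng(3)

# Producer parameters for Russia (Table 1)
a_i, b_i, X_i = 1.606, 51.0, 80.0
c_t = np.array([0.58, 0.56, 0.55, 0.55])

# Discount factors f_t as in the previous sketch
r, y = 0.1, [5, 5, 5, 20]
f_t = [((1 + r) ** y[t] - 1) / r * (1 + r) ** (-sum(y[: t + 1])) for t in range(4)]

ship = rng.uniform(0.0, 2.0, size=(6, 4))    # placeholder x^t_{i,j}: shipments to 6 markets in 4 periods
price = rng.uniform(3.0, 6.0, size=(6, 4))   # placeholder values standing in for P^t_j(x, xi^t)

npv = 0.0
for t in range(4):
    revenue = ((price[:, t] - c_t[t]) * ship[:, t]).sum()   # sum_j (P^t_j - c^t_i) x^t_{i,j}
    cost = a_i - b_i * np.log(X_i - ship[:, t].sum())       # yearly production cost G_i(x) in period t
    npv += f_t[t] * (revenue - cost)
print(npv)
```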

Taking the expectation of (27) reduces to calculating $E\bigl[P^t_j(x,\xi^t)\bigr]$ and provides us with an expression for the expected profit function for player $i$. Under the assumption that the oil prices are uniformly distributed we have
\[
E\bigl[P^t_j(x,\xi^t)\bigr] = p0^t_j \left(\frac{Q^t_j(x)}{q0^t_j}\right)^{1/e^t_j} or_t^{\,\eta_t/e^t_j - 1}\;\frac{U_t^{\,2-\eta_t/e^t_j} - L_t^{\,2-\eta_t/e^t_j}}{(U_t - L_t)\bigl(2 - \eta_t/e^t_j\bigr)}.
\]

Using knowledge of the true SVI we can observe that $d(f_0)_S(z_0)$ is a linear function and $\Sigma_0$ is degenerate but satisfies the alternative conditions required to prove Theorems 1 and 2. Since $z_0$ is in the interior of an $n$-cell for this example,
\[
\lim_{N\to\infty} \Pr\bigl\{ K_i(z_N) = \operatorname{cone}(P_N - z_{i_N}) \bigr\} = 1
\]
and the intervals computed using $\tilde{h}^{\alpha_2}_j$ and $h^{\alpha_2}_j$ will be conservative for large sample sizes.

The performance of the obtained 95% confidence intervals is examined by generating 2,000 replications at each of the sample sizes $N = 200$ and $N = 2{,}000$. To compute the width of an interval $\tilde{h}^{\alpha_2}_j$ or $h^{\alpha_2}_j$ we consider values of $\alpha_2 = 0.04$, $0.025$ and $\alpha_1 = 0.05 - \alpha_2$. For each replication $\mathcal{K}^{\alpha_1}_N$ is chosen to contain a single cone determined using the approach described in Theorem 3 but with the set $R_{N,\epsilon}$ replaced by the simultaneous confidence intervals (21). For all replications $\Sigma_N$ is degenerate and the expression (23) is evaluated with $\epsilon = 0$ and $l_N$ determined by the number of eigenvalues of $\Sigma_N$ greater than $N^{-1/3}$. In addition to the intervals computed using $\tilde{h}^{\alpha_2}_j$ and $h^{\alpha_2}_j$ we consider the intervals (12). Since $z_0$ is contained in the relative interior of an $n$-cell for this example, the intervals (12) are asymptotically exact. These intervals are computed for a value of $\alpha = 0.05$ and we denote their half-widths by $u^{\alpha}_j$. Finally, we also consider individual confidence intervals for $(x_0)_j$ obtained from the projection of the intervals (12) onto the set $S$.
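To convey the structure of such a replication study without reproducing the interval machinery of §3, the sketch below (ours, and deliberately simplified) runs a coverage experiment for the small two-dimensional example of the previous section rather than the gas-market model. It uses a naive normal-theory interval built from the identity piece of the normal map derivative and ignores the cone-selection step entirely; because $z_0 = 0$ lies on the boundary of the four orthants in that example, this naive interval need not attain the nominal level, which is precisely the situation the proposed intervals are designed to handle.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
lo = np.array([0.0, 0.0, 0.0, 0.0, -1.0, -1.0])
hi = np.array([2.0, 1.0, 2.0, 4.0, 1.0, 1.0])

def saa_solution(xi):
    """Solve the SAA normal map equation by enumerating the four orthants of R^2."""
    A = xi[:, :4].mean(axis=0).reshape(2, 2)
    b = xi[:, 4:].mean(axis=0)
    for s1 in (1, -1):
        for s2 in (1, -1):
            P = np.diag([float(s1 > 0), float(s2 > 0)])
            z = np.linalg.solve(A @ P + np.eye(2) - P, -b)
            if s1 * z[0] >= -1e-12 and s2 * z[1] >= -1e-12:
                return z
    raise RuntimeError("no piece yielded a solution")

N, reps, alpha = 200, 2000, 0.05
crit = stats.norm.ppf(1 - alpha / 2)
covered = np.zeros(2)
for _ in range(reps):
    xi = lo + (hi - lo) * rng.random((N, 6))
    z_N = saa_solution(xi)
    x_N = np.maximum(z_N, 0.0)
    F = xi[:, :4].reshape(N, 2, 2) @ x_N + xi[:, 4:]   # F(x_N, xi_k) for each observation
    se = F.std(axis=0, ddof=1) / np.sqrt(N)            # naive componentwise standard errors
    covered += (np.abs(z_N) <= crit * se)              # does the interval cover z_0 = 0?

print("naive empirical coverage:", covered / reps)
```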

The performance of all the methods was largely in line with expectations. In Tables 5 and 6 we provide the five-number summaries of the coverage rates across all components of $(z_0)_j$ and $(x_0)_j$ for each interval. For both values $\alpha_1 = 0.01$ and $0.025$, the whole space $\mathbb{R}^{96}$ was chosen for $\mathcal{K}^{\alpha_1}_N$ across all replications with $N = 2{,}000$.


Table 5: Summary of coverage rates for (z_0)_j, N = 200 and N = 2,000

                          N = 200                                      N = 2,000
Percentile   $u^{.05}_j$  $\tilde h^{.04}_j$  $\tilde h^{.025}_j$   $u^{.05}_j$  $\tilde h^{.04}_j$  $\tilde h^{.025}_j$
MIN           88.05 %      88.70 %      89.05 %       94.60 %      95.70 %      97.20 %
Q1            94.60 %      95.57 %      97.07 %       94.85 %      96.00 %      97.43 %
MEDIAN        94.85 %      95.78 %      97.45 %       95.18 %      96.15 %      97.78 %
Q3            94.95 %      95.88 %      97.55 %       95.36 %      96.41 %      98.15 %
MAX           96.85 %      97.60 %      98.30 %       95.50 %      96.60 %      98.40 %

For these replications the interval widths $\tilde{h}^{\alpha_2}_j$ reduced to the standard expressions used in (12). All differences in the interval widths for these samples were therefore due to the differences in the values of $\chi^2_1(0.05)$, $\chi^2_1(0.04)$, and $\chi^2_1(0.025)$, and the intervals computed using $\tilde{h}^{\alpha_2}_j$ were conservative in their coverage of $(z_0)_j$.

At the smaller sample size of $N = 200$, the estimates for $\mathcal{K}^{\alpha_1}_N$ were more conservative. For 689 replications with $\alpha_2 = 0.025$ and 941 replications with $\alpha_2 = 0.04$ the estimates of $\mathcal{K}^{\alpha_1}_N$ contained a cone generated from a face of $P_N$ with dimension of $n-2$ or below, and $\tilde{h}^{\alpha_2}_j$ lacked a closed form expression. For these replications the differences in $\tilde{h}^{\alpha_2}_j$ and $u^{\alpha}_j$ were still largely driven by the different choices of $\alpha_2$ and $\alpha$. The ratio of $\tilde{h}^{0.04}_j$ to $u^{0.05}_j$ differed from $\sqrt{\chi^2_1(0.04)/\chi^2_1(0.05)}$ by no more than ten percent for 99.59% of the intervals for which $\tilde{h}^{0.04}_j$ lacked a closed form expression. Similarly, the ratio of $\tilde{h}^{0.025}_j$ to $u^{0.05}_j$ was within ten percent of $\sqrt{\chi^2_1(0.025)/\chi^2_1(0.05)}$ for 99.84% of the replications for which $\tilde{h}^{0.025}_j$ lacked a closed form expression. One can also include other candidate cones in $\mathcal{K}^{\alpha_1}_N$ when computing $\tilde{h}^{\alpha_2}_j$. Because the simultaneous confidence intervals for $z_0$ tend to be conservative in their coverage rates, cones generated from faces of $P_N$ with dimension larger than that of the face identified using the approach of Theorem 3 are likely to be of the most interest. For this example that would suggest including the whole space $\mathbb{R}^{96}$, which in view of the above observations will not cause significant changes in the interval widths.

Overall, at the sample size of $N = 200$ the proposed methods performed well. As was the case for the replications at the larger sample size, while the proposed methods were conservative in their coverage of $(z_0)_j$, this was not due to excessively wide intervals. The proposed intervals therefore provide informative measures of the variability in each component of $(z_N)_j$, even with the conservative adjustments used to generalize the methods described in [26]. Among the 96 components of $z_0$, there were only two components, namely $(z_0)_{74}$ and $(z_0)_{83}$, for which the three computed intervals had coverage rates below 90% at the sample size of $N = 200$. There were 184 and 195 replications, respectively, for which the signs of $(z_N)_{74}$ and $(z_0)_{74}$, and of $(z_N)_{83}$ and $(z_0)_{83}$, differed. For these replications $z_N$ and $z_0$ were not contained in the same $n$-cell and the intervals for these two components were negatively affected. This is reflected in the "MIN" row in Table 5 under $N = 200$.

In computing confidence intervals for $(x_0)_j$, we used the more conservative implementation described after Theorem 2, which will not return a value of $h^{\alpha_2}_j = 0$, to evaluate $h^{\alpha_2}_j$.


Table 6: Summary of coverage rates for (x_0)_j, N = 200 and N = 2,000

                          N = 200                            N = 2,000
Percentile   $u^{.05}_j$  $h^{.04}_j$  $h^{.025}_j$   $u^{.05}_j$  $h^{.04}_j$  $h^{.025}_j$
MIN           88.20 %      88.70 %      89.05 %       94.60 %      95.70 %      97.20 %
Q1            94.75 %      95.70 %      97.08 %       94.85 %      96.00 %      97.50 %
MEDIAN        94.90 %      95.83 %      97.45 %       95.30 %      96.25 %      97.95 %
Q3            95.05 %      95.95 %      97.60 %       95.40 %      96.50 %      98.35 %
MAX          100.00 %     100.00 %     100.00 %      100.00 %     100.00 %     100.00 %

Since $S = \mathbb{R}^{96}_+$, for this example $h^{\alpha_2}_j = \tilde{h}^{\alpha_2}_j$ for all components. The coverage rates of $(x_0)_j$ and $(z_0)_j$ therefore only differ for replications where $(z_N)_j < 0 = (x_N)_j$. As can be seen from Table 6, the coverage rates for most components meet or exceed the specified level. A majority of the intervals are conservative due to the nature of the proposed methods and the relaxed conditions. However, the intervals are not overly wide and provide useful information about the uncertainty in each individual component of the SAA solutions.

Acknowledgements  Research of Michael Lamm and Shu Lu is supported by the National Science Foundation under grants DMS-1109099 and DMS-1407241. Research of Michael Lamm took place during his graduate study in the Department of Statistics and Operations Research at the University of North Carolina at Chapel Hill. We thank the three anonymous referees for comments and suggestions that have helped to improve the presentation of this paper.

References

1. Agdeppa, R.P., Yamashita, N., Fukushima, M.: Convex expected residual models for stochastic affine variational inequality problems and its application to the traffic equilibrium problem. Pacific Journal of Optimization 6(1), 3-19 (2010)
2. Anitescu, M., Petra, C.: Higher-order confidence intervals for stochastic programming using bootstrapping. Tech. Rep. ANL/MCS-P1964-1011, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL (2011)
3. Attouch, H., Cominetti, R., Teboulle, M. (eds.): Special Issue on Nonlinear Convex Optimization and Variational Inequalities. Springer (2009). Mathematical Programming 116, Numbers 1-2
4. Chen, X., Fukushima, M.: Expected residual minimization method for stochastic linear complementarity problems. Mathematics of Operations Research 30, 1022-1038 (2005)
5. Chen, X., Pong, T.K., Wets, R.J.B.: Two-stage stochastic variational inequalities: An ERM-solution procedure. Preprint (2015)
6. Chen, X., Wets, R.J.B., Zhang, Y.: Stochastic variational inequalities: Residual minimization smoothing sample average approximations. SIAM Journal on Optimization 22(2), 649-673 (2012)
7. Chen, X., Zhang, C., Fukushima, M.: Robust solution of monotone stochastic linear complementarity problems. Mathematical Programming 117, 51-80 (2009)
8. Dentcheva, D., Romisch, W.: Differential stability of two-stage stochastic programs. SIAM Journal on Optimization 11(1), 87-112 (2000)
9. Dontchev, A.L., Rockafellar, R.T.: Characterizations of strong regularity for variational inequalities over polyhedral convex sets. SIAM Journal on Optimization 6(4), 1087-1105 (1996)
10. Dupacova, J., Wets, R.: Asymptotic behavior of statistical estimators and of optimal solutions of stochastic optimization problems. Annals of Statistics 16(4), 1517-1549 (1988)
11. Facchinei, F., Pang, J.S.: Finite-Dimensional Variational Inequalities and Complementarity Problems, vol. I. Springer, New York (2003)
12. Fang, H., Chen, X., Fukushima, M.: Stochastic R0 matrix linear complementarity problems. SIAM Journal on Optimization 18, 482-506 (2007)


13. Ferris, M.C., Pang, J.S.: Complementarity and Variational Problems: State of the Art. SIAM (1997)
14. Ferris, M.C., Pang, J.S.: Engineering and economic applications of complementarity problems. SIAM Review 39, 669-713 (1997)
15. Genz, A., Bretz, F.: Computation of Multivariate Normal and t Probabilities. Lecture Notes in Statistics. Springer-Verlag, Heidelberg (2009)
16. Genz, A., Bretz, F., Miwa, T., Mi, X., Leisch, F., Scheipl, F., Hothorn, T.: mvtnorm: Multivariate Normal and t Distributions (2013). URL http://CRAN.R-project.org/package=mvtnorm. R package version 0.9-9996
17. Giannessi, F., Maugeri, A. (eds.): Variational Inequalities and Network Equilibrium Problems. Plenum Press, New York (1995)
18. Giannessi, F., Maugeri, A., Pardalos, P.M. (eds.): Equilibrium Problems: Nonsmooth Optimization and Variational Inequality Models, Nonconvex Optimization and Its Applications, vol. 58. Kluwer Academic Publishers (2001)
19. Gurkan, G., Pang, J.S.: Approximations of Nash equilibria. Mathematical Programming pp. 223-253 (2009)
20. Gurkan, G., Yonca Ozge, A., Robinson, S.M.: Sample-path solution of stochastic variational inequalities. Mathematical Programming 84, 313-333 (1999)
21. Harker, P.T., Pang, J.S.: Finite-dimensional variational inequality and nonlinear complementarity problems: A survey of theory, algorithms, and applications. Mathematical Programming 48, 161-220 (1990)
22. Haurie, A., Zaccour, G., Legrand, J., Smeers, Y.: A stochastic dynamic Nash-Cournot model for the European gas market. Tech. Rep. G-87-24, Ecole des hautes etudes commerciales, Montreal, Quebec, Canada (1987)
23. Huber, P.: The behavior of maximum likelihood estimates under nonstandard conditions. In: LeCam, L., Neyman, J. (eds.) Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics, pp. 221-233. University of California Press, Berkeley, CA (1967)
24. Jiang, H., Xu, H.: Stochastic approximation approaches to the stochastic variational inequality problem. IEEE Transactions on Automatic Control 53(6), 1462-1475 (2008)
25. King, A.J., Rockafellar, R.T.: Asymptotic theory for solutions in statistical estimation and stochastic programming. Mathematics of Operations Research 18, 148-162 (1993)
26. Lamm, M., Lu, S., Budhiraja, A.: Individual confidence intervals for true solutions to expected value formulations of stochastic variational inequalities. Mathematical Programming Ser. B 165(1), 151-196 (2017)
27. Lan, G., Nemirovski, A., Shapiro, A.: Validation analysis of mirror descent stochastic approximation method. Mathematical Programming 134(2), 425-458 (2012)
28. Linderoth, J., Shapiro, A., Wright, S.: The empirical behavior of sampling methods for stochastic programming. Annals of Operations Research 142, 215-241 (2006)
29. Lu, S.: A new method to build confidence regions for solutions of stochastic variational inequalities. Optimization 63(9), 1431-1443 (2014)
30. Lu, S.: Symmetric confidence regions and confidence intervals for normal map formulations of stochastic variational inequalities. SIAM Journal on Optimization 24(3), 1458-1484 (2014)
31. Lu, S., Budhiraja, A.: Confidence regions for stochastic variational inequalities. Mathematics of Operations Research 38, 545-568 (2013)
32. Lu, S., Liu, Y., Yin, L., Zhang, K.: Confidence intervals and regions for the lasso by using stochastic variational inequality techniques in optimization. Journal of the Royal Statistical Society: Series B 79(2), 589-611 (2017)
33. Luo, M., Lin, G.: Expected residual minimization method for stochastic variational inequality problems. Journal of Optimization Theory and Applications 140, 103-116 (2009)
34. Pang, J.S.: Newton's method for B-differentiable equations. Mathematics of Operations Research 15, 311-341 (1990)
35. Pang, J.S., Ralph, D. (eds.): Special Issue on Nonlinear Programming, Variational Inequalities, and Stochastic Programming. Springer (2009). Mathematical Programming 117, Numbers 1-2
36. Phelps, C., Royset, J.O., Gong, Q.: Optimal control of uncertain systems using sample average approximations. SIAM Journal on Control and Optimization 54(1), 1-29 (2016)
37. Ralph, D.: On branching numbers of normal manifolds. Nonlinear Analysis: Theory, Methods, and Applications 22, 1041-1050 (1994)
38. Robinson, S.M.: Strongly regular generalized equations. Mathematics of Operations Research 5(1), 43-62 (1980)


39. Robinson, S.M.: An implicit-function theorem for a class of nonsmooth functions. Mathematics of Operations Research 16(2), 292-309 (1991)
40. Robinson, S.M.: Normal maps induced by linear transformations. Mathematics of Operations Research 17(3), 691-714 (1992)
41. Robinson, S.M.: Sensitivity analysis of variational inequalities by normal-map techniques. In: Giannessi, F., Maugeri, A. (eds.) Variational Inequalities and Network Equilibrium Problems, pp. 257-269. Plenum Press, New York (1995)
42. Rockafellar, R.T., Wets, R.J.B.: Stochastic variational inequalities: Single-stage to multistage. Mathematical Programming Ser. B 165(1), 331-360 (2017)
43. Romisch, W.: Stability of stochastic programming problems. In: Ruszczynski, A., Shapiro, A. (eds.) Stochastic Programming, Handbooks in Operations Research and Management Science, vol. 10, pp. 483-554. Elsevier (2003)
44. Scholtes, S.: Introduction to Piecewise Differentiable Equations. Springer, New York (2012)
45. Shapiro, A.: Asymptotic behavior of optimal solutions in stochastic programming. Mathematics of Operations Research 18, 829-845 (1993)
46. Shapiro, A., Dentcheva, D., Ruszczynski, A.P.: Lectures on Stochastic Programming: Modeling and Theory. Society for Industrial and Applied Mathematics and Mathematical Programming Society, Philadelphia, PA (2009)
47. Shapiro, A., Homem-de-Mello, T.: On the rate of convergence of optimal solutions of Monte Carlo approximations of stochastic programs. SIAM Journal on Optimization 11(1), 70-86 (2000)
48. Shapiro, A., Xu, H.: Stochastic mathematical programs with equilibrium constraints, modeling and sample average approximation. Optimization 57, 395-418 (2008)
49. Stefanski, L.A., Boos, D.D.: The Calculus of M-Estimation. The American Statistician 56(1), 29-38 (2002)
50. Vogel, S.: Universal confidence sets for solutions of optimization problems. SIAM Journal on Optimization 19(3), 1467-1488 (2008)
51. Wald, A.: Note on the consistency of the maximum likelihood estimate. Annals of Mathematical Statistics 20, 595-601 (1949)
52. Xu, H.: Sample average approximation methods for a class of stochastic variational inequality problems. Asia-Pacific Journal of Operational Research 27(1), 103-119 (2010)
53. Yin, L., Lu, S., Liu, Y.: Confidence intervals for sparse penalized regression with random designs. Submitted for publication (2015)
54. Zhang, C., Chen, X., Sumlee, A.: Robust Wardrop's user equilibrium assignment under stochastic demand and supply. Transportation Research Part B 45(3), 534-552 (2011)
