Helton et.al. 2005


A comparison of uncertainty and sensitivity analysis results obtained with random and Latin hypercube sampling

J.C. Helton (a,*), F.J. Davis (b), J.D. Johnson (c)

(a) Department of Mathematics and Statistics, Arizona State University, Tempe, AZ 85287-1804, USA

(b) Sandia National Laboratories, Albuquerque, NM 87185-0779, USA

(c) ProStat, Mesa, AZ 85204-5326, USA

Received 16 October 2003; accepted 3 September 2004

Available online 21 November 2004

Abstract

Uncertainty and sensitivity analysis results obtained with random and Latin hypercube sampling are compared. The comparison uses

results from a model for two-phase fluid flow obtained with three independent random samples of size 100 each and three independent Latin

hypercube samples (LHSs) of size 100 each. Uncertainty and sensitivity analysis results with the two sampling procedures are similar and

stable across the three replicated samples. Poor performance of regression-based sensitivity analysis procedures for some analysis outcomes

results more from the inappropriateness of the procedure for the nonlinear relationships between model input and model results than from an

inadequate sample size. Kendall’s coefficient of concordance (KCC) and the top down coefficient of concordance (TDCC) are used to assess

the stability of sensitivity analysis results across replicated samples, with the TDCC providing a more informative measure of analysis

stability than KCC. A new sensitivity analysis procedure based on replicated samples and the TDCC is introduced.

© 2004 Elsevier Ltd. All rights reserved.

Keywords: Epistemic uncertainty; Kendall’s coefficient of concordance; Latin hypercube sampling; Monte Carlo analysis; Random sampling; Replicated

sampling; Sensitivity analysis; Stability; Subjective uncertainty; Top down coefficient of concordance; Two-phase fluid flow; Uncertainty analysis

1. Introduction

The identification and representation of the implications

of uncertainty is widely recognized as a fundamental

component of analyses of complex systems [1–10]. The

study of uncertainty is usually subdivided into two closely

related activities referred to as uncertainty analysis and

sensitivity analysis, where (i) uncertainty analysis involves

the determination of the uncertainty in analysis results that

derives from uncertainty in analysis inputs and (ii)

sensitivity analysis involves the determination of relationships

between the uncertainty in analysis results and the

uncertainty in individual analysis inputs.

At an abstract level, the analysis or model under consideration can be represented as a function of the form

$$\mathbf{y} = \mathbf{y}(\mathbf{x}) = \mathbf{f}(\mathbf{x}), \tag{1.1}$$

where

$$\mathbf{x} = [x_1, x_2, \ldots, x_{nX}] \tag{1.2}$$

is a vector of uncertain analysis inputs and

$$\mathbf{y} = [y_1, y_2, \ldots, y_{nY}] \tag{1.3}$$

is a vector of analysis results. Further, a sequence of distributions

$$D_1, D_2, \ldots, D_{nX} \tag{1.4}$$

is used to characterize the uncertainty associated with the elements of $\mathbf{x}$, where $D_i$ is the distribution associated with $x_i$ for $i = 1, 2, \ldots, nX$. Correlations and other restrictions involving the elements of $\mathbf{x}$ are also possible. The goal of uncertainty analysis is to determine the uncertainty in the elements of $\mathbf{y}$ that derives from the uncertainty in the elements of $\mathbf{x}$ characterized by the distributions $D_1, D_2, \ldots, D_{nX}$ and any associated restrictions. The goal of sensitivity analysis is to determine relationships between the uncertainty associated with individual elements of $\mathbf{x}$ and the uncertainty associated with individual elements of $\mathbf{y}$.

Reliability Engineering and System Safety 89 (2005) 305–330, www.elsevier.com/locate/ress

0951-8320/$ - see front matter © 2004 Elsevier Ltd. All rights reserved. doi:10.1016/j.ress.2004.09.006

* Corresponding author. Address: Department 6874, MS 0779, Sandia National Laboratories, Albuquerque, NM 87185-0779, USA. Tel.: +1-505-284-4808; fax: +1-505-844-2348. E-mail address: [email protected] (J.C. Helton).

A variety of approaches to uncertainty and sensitivity

analysis are in use, including (i) differential analysis, which

involves approximating a model with a Taylor series and then

using variance propagation formulas to obtain uncertainty

and sensitivity analysis results [11–24], (ii) response surface

methodology, which is based on using classical experimental

designs to select points for use in developing a response

surface replacement for a model and then using this

replacement model in subsequent uncertainty and sensitivity

analyses based on Monte Carlo simulation and variance

propagation [25–35], (iii) the Fourier amplitude sensitivity

test (FAST) and other variance decomposition procedures,

which involve the determination of uncertainty and sensitivity

analysis results on the basis of the variance of model

predictions and the contributions of individual variables to

this variance [36–55], (iv) fast probability integration, which

is primarily an uncertainty analysis procedure used to

estimate the tails of uncertainty distributions for model

predictions [56–62], and (v) sampling-based (i.e. Monte

Carlo) procedures, which involve the generation and

exploration of a probabilistically based mapping from

analysis inputs to analysis results [63–73]. Additional

information on uncertainty and sensitivity analysis is

available in a number of reviews [69,70,74–80]. The primary

focus of this presentation is on sampling-based methods for

uncertainty and sensitivity analysis.

Sampling-based approaches for uncertainty and sensitivity

analysis are very popular [81–96]. Desirable properties

of these approaches include conceptual simplicity, ease

of implementation, generation of uncertainty analysis

results without the use of intermediate models, and

availability of a variety of sensitivity analysis procedures

[67,69,76,97,98]. Despite these positive properties, concern

is often expressed about using these approaches because of

the computational cost involved. In particular, the concern

is that the sample sizes required to obtain meaningful results

will be so large that analyses will be computationally

impracticable for all but the most simple models. At times,

statements are made that 1000 to 10,000s of model

evaluations are required in a sampling-based uncertainty/

sensitivity analysis.

In this presentation, results obtained with a computationally

demanding model for two-phase fluid flow are used

to illustrate that robust uncertainty and sensitivity analysis

results can be obtained with relatively small sample sizes.

Further, results are obtained and compared for replicated

random and Latin hypercube samples (LHSs) [63,73]. For

the problem under consideration, random and LHSs of size

100 produce similar, stable results.

The presentation is organized as follows. The analysis

problem is described in Section 2. Then, the following

topics are considered: stability of uncertainty analysis

results (Section 3), stability of sensitivity analysis results

based on stepwise rank regression (Section 4), use of

coefficients of concordance in comparing replicated sensitivity

analyses (Section 5), sensitivity analysis based on

replicated samples and the top down coefficient of concordance

(Section 6), sensitivity analysis with reduced sample

sizes (Section 7), and sensitivity analysis without regression

analysis (Section 8). Finally, the presentation ends with a

concluding discussion (Section 9).

2. Analysis problem

The analysis problem under consideration comes from

the 1996 performance assessment (PA) for the Waste

Isolation Pilot Plant (WIPP) [99,100]. This PA was the core

analysis that supported the successful Compliance Certification

Application (CCA) by the US Department of Energy

(DOE) to the US Environmental Protection Agency (EPA)

for the operation of the WIPP [101]. With the certification of

the WIPP by the EPA for the disposal of transuranic waste

in May 1998 [102], the WIPP became the first operational

facility in the United States for the deep geologic disposal of

radioactive waste. Thus, the example used to illustrate

properties of sampling-based approaches to uncertainty and

sensitivity analysis in this presentation is part of a real

analysis rather than a hypothetical example constructed

solely for illustrative purposes.

The analysis problem involves the model for two-phase

fluid flow that is at the center of the 1996 WIPP PA. This

model is based on the following system of nonlinear partial

differential equations:

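Gas conservation

$$\nabla \cdot \left[ \frac{\alpha \rho_g K_g k_{rg}}{\mu_g} \left( \nabla p_g + \rho_g g \nabla h \right) \right] + \alpha q_{wg} + \alpha q_{rg} = \alpha \frac{\partial (\phi \rho_g S_g)}{\partial t} \tag{2.1}$$

Brine conservation

$$\nabla \cdot \left[ \frac{\alpha \rho_b K_b k_{rb}}{\mu_b} \left( \nabla p_b + \rho_b g \nabla h \right) \right] + \alpha q_{wb} + \alpha q_{rb} = \alpha \frac{\partial (\phi \rho_b S_b)}{\partial t} \tag{2.2}$$

Saturation constraint

$$S_g + S_b = 1 \tag{2.3}$$

Capillary pressure constraint

$$p_C = p_g - p_b = f(S_b) \tag{2.4}$$

Gas density

$\rho_g$ determined by Redlich–Kwong–Soave equation of state (see Eqs. (31) and (32), Ref. [103]).

Brine density

$$\rho_b = \rho_0 \exp[\beta_b (p_b - p_{b0})] \tag{2.5}$$

Formation porosity

$$\phi = \phi_0 \exp[\beta_f (p_b - p_{b0})] \tag{2.6}$$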

where $g$ = acceleration due to gravity (m/s$^2$), $h$ = vertical distance from a reference location (m), $K_l$ = permeability tensor (m$^2$) for fluid $l$ ($l = g \sim$ gas, $l = b \sim$ brine), $k_{rl}$ = relative permeability (dimensionless) to fluid $l$, $p_C$ = capillary pressure (Pa), $p_l$ = pressure of fluid $l$ (Pa), $q_{rl}$ = rate of production (or consumption, if negative) of fluid $l$ due to chemical reaction (kg/m$^3$/s), $q_{wl}$ = rate of injection (or removal, if negative) of fluid $l$ (kg/m$^3$/s), $S_l$ = saturation of fluid $l$ (dimensionless), $t$ = time (s), $\alpha$ = geometry factor (m in present analysis), $\rho_l$ = density of fluid $l$ (kg/m$^3$), $\mu_l$ = viscosity of fluid $l$ (Pa s), $\phi$ = porosity (dimensionless), $\phi_0$ = reference (i.e. initial) porosity (dimensionless), $p_{b0}$ = reference (i.e. initial) brine pressure (Pa) (constant in Eq. (2.5) and spatially variable in Eq. (2.6)), $\rho_0$ = reference (i.e. initial) brine density (kg/m$^3$), $\beta_f$ = pore compressibility (Pa$^{-1}$), $\beta_b$ = brine compressibility (Pa$^{-1}$), and $f$ is defined by the model for capillary pressure in use (see the right hand sides of Eqs. (10), (19) and (20) in Ref. [103]). The conservation equations are valid in one (i.e. $\nabla = [\partial/\partial x]$), two (i.e. $\nabla = [\partial/\partial x \;\; \partial/\partial y]$) and three (i.e. $\nabla = [\partial/\partial x \;\; \partial/\partial y \;\; \partial/\partial z]$) dimensions. In the present analysis, the preceding system of equations is used to model two-phase fluid flow in a two-dimensional region (Fig. 1), with the result that the spatial scale factor $\alpha$ in Eqs. (2.1) and (2.2) has units of meters (m).

In general, the individual terms in Eqs. (2.1)–(2.6) are functions of location and time (e.g. $p_g(x, y, t)$, $\rho_g(x, y, t)$, $k_{rg}(x, y, t)$, ...) and often other variables as well. A full description of how the individual terms in these equations are defined is beyond the scope of this presentation and is available elsewhere [103]. The system of partial differential equations in Eqs. (2.1)–(2.6) is too complex to permit a closed form solution. In the present analysis, these equations were solved with finite difference procedures implemented by the BRAGFLO program [104,105] on the computational grid in Fig. 1.

Fig. 1. Computational grid used in BRAGFLO to represent two-phase flow in 1996 WIPP PA subsequent to a drilling intrusion (Fig. 1, Ref. [70]). Same formulation is used in the absence of a drilling intrusion except that regions 1A–C have the same properties as the regions to either side.

The analysis problem under consideration involves a

single drilling intrusion (regions 1A–C in Fig. 1) that passes

through a waste disposal panel in the WIPP repository

(region 23 in Fig. 1) 1000 yr after closure of the repository

and also penetrates a region of pressurized brine beneath the

repository (region 30 in Fig. 1). In the terminology of the

1996 WIPP PA, this is designated an E1 intrusion. Due to

regulatory requirements placed on the WIPP [106–108], the

modeled period extends from slightly before closure of the

repository (t = -5 yr), through closure of the repository

(t = 0 yr), and out to t = 10,000 yr.

To assess the effects of uncertainty, the 1996 WIPP PA

identified 31 uncertain inputs to the BRAGFLO program

required in the formulation of the model in Eqs. (2.1)–(2.6)

for an E1 intrusion (Table 1). The exact manner in which

these inputs were used in the definition of the coefficients in

Eqs. (2.1)–(2.6) is described in Table 1 of Ref. [109].

The analysis was structured to require a single value for

each of the variables in Table 1. However, the exact values

to use for these variables were felt to be poorly known.

Therefore, ranges of possible values for these variables were

developed, and distributions were assigned to these ranges

to characterize a degree of belief with respect to the location

of the appropriate values to use in the 1996 WIPP PA. Thus,

the distributions indicated in Table 1 are characterizing

subjective (i.e. epistemic) uncertainty [110–112].

The 1996 WIPP PA used Latin hypercube sampling

[63,73] to investigate the effects of the uncertain variables in

Table 1 on predictions of two-phase flow in the vicinity of

the repository. In particular, three replicated LHSs of size

100 each were generated [109] with use of the Iman and

Conover restricted pairing technique to control correlations

[113] and then pooled to produce a single sample of size 300

that was used in the investigation of two-phase flow

[114,115]. Within the 1996 WIPP PA, these replicates are

denoted R1, R2 and R3, respectively. The reason for the

replication was to assess the stability of complementary

cumulative distribution functions (CCDFs) used in comparisons

with the EPA’s regulations for the WIPP [108,109].

The present investigation makes use of these three

replicated LHSs to investigate the stability of uncertainty

and sensitivity analysis results obtained with relatively

small sample sizes (i.e. samples of size 100 from 31

uncertain variables) for a complex and computationally

intensive model (i.e. 2–4 h of CPU time per model

evaluation on a VAX Alpha). Further, to provide perspective

on the use of random and Latin hypercube sampling,

the problem was also analyzed for three random samples of

size 100. As for the LHSs, the Iman and Conover restricted

pairing technique was also used to control correlations

within the random samples.
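
As a rough illustration of the two sampling procedures compared here, the following Python sketch generates three replicated random samples and three replicated LHSs of size 100 for 31 inputs on the unit hypercube. It is only a sketch under stated assumptions: the function names are hypothetical, the inputs are treated as independent, and the Iman and Conover restricted pairing technique used in the analysis to control correlations is not reproduced. The final line maps one column onto the uniform range given for BHPRM in Table 1 (-14 to -11).

```python
import random


def random_sample(n, k, rng):
    """Simple random sample: n sample elements, k inputs, uniform on [0, 1)."""
    return [[rng.random() for _ in range(k)] for _ in range(n)]


def latin_hypercube_sample(n, k, rng):
    """Latin hypercube sample on the unit hypercube.

    For each input, [0, 1) is divided into n equal-probability strata,
    one value is drawn inside each stratum, and the strata are shuffled
    so that values of different inputs are paired at random.
    """
    columns = []
    for _ in range(k):
        strata = list(range(n))
        rng.shuffle(strata)
        columns.append([(s + rng.random()) / n for s in strata])
    # Transpose so that each row is one sample element (one model run).
    return [list(row) for row in zip(*columns)]


def scale_uniform(u, low, high):
    """Map a probability level u in [0, 1) onto a uniform range."""
    return low + (high - low) * u


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
n_obs, n_inputs = 100, 31

lhs_replicates = [latin_hypercube_sample(n_obs, n_inputs, rng) for _ in range(3)]
rs_replicates = [random_sample(n_obs, n_inputs, rng) for _ in range(3)]

# Assumption for illustration: column 0 plays the role of BHPRM,
# which Table 1 lists as uniform on -14 to -11.
bhprm = [scale_uniform(row[0], -14.0, -11.0) for row in lhs_replicates[0]]
```

The stratification is the only difference between the two functions: a random sample may leave parts of an input's range unsampled, while an LHS places exactly one value in each of the 100 equal-probability intervals of every input.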

The total number of dependent variables that can be

generated in the solution of Eqs. (2.1)–(2.6) is quite large and

includes (i) time-dependent gas and brine flow across each

cell boundary in Fig. 1, (ii) time-dependent porosity, gas

pressure, gas saturation, brine pressure and brine saturation

in each cell in Fig. 1, (iii) time-dependent gas generation due

to corrosion and microbial degradation of cellulose in each

cell in Fig. 1 corresponding to a location at which waste

disposal takes place, and (iv) quantities obtained by summing

results over multiple cells in Fig. 1. Thus, the present analysis

cannot investigate the stability of sampling-based uncertainty

and sensitivity analysis results for all possible results

that arise from solution of Eqs. (2.1)–(2.6).

To keep the analysis at a reasonable scale, four

dependent variables were selected for consideration

(Table 2). These variables were selected because they

involved analysis outcomes potentially of interest in PA

for the WIPP and also because they displayed a spectrum

of behavior. For perspective, approximately 250

time-dependent results produced in the solution of Eqs. (2.1)–

(2.6) were typically examined as part of the analysis

process for these equations within the WIPP PA. For the

present analysis, the four results in Table 2 are examined

at three times: (i) 1000 yr, which is just before the

drilling intrusion occurs, (ii) 10,000–1000 yr, which

indicates the result at 10,000 yr minus the result at

1000 yr, and (iii) 10,000 yr, which is the end of the

simulation period.

The solution of Eqs. (2.1)–(2.6) yields time-dependent

results for each of the dependent variables in Table 2

(Fig. 2), and also for many additional dependent

variables as previously indicated. The results in Fig. 2

are for the first of the three replicates (i.e. R1) used in

the 1996 WIPP PA and illustrate both the spectrum of

behaviors and the complexity of behavior that solutions

to Eqs. (2.1)–(2.6) can display. In particular, a significant

change in behavior occurs subsequent to the drilling

intrusion at 1000 yr.

3. Uncertainty analysis results

The time-dependent results in Fig. 2 display the

uncertainty in solutions to Eqs. (2.1)–(2.6) that results

from uncertainty in the 31 variables in Table 1. The goal of

this presentation is to illustrate the robustness of such

uncertainty representations with respect to the type and size

of the sample in use. As previously indicated, results at

1000, 10,000–1000, and 10,000 yr will be used for

illustration.

Table 1

Uncertain variables used as input to BRAGFLO in the 1996 WIPP PA

ANHBCEXP: Brooks–Corey pore distribution parameter for anhydrite (dimensionless). Distribution: Student’s with 5 degrees of freedom. Range: 0.491–0.842. Mean, median: 0.644, 0.644

ANHBCVGP: Pointer variable for selection of relative permeability model for use in anhydrite. Distribution: discrete with 60% 0, 40% 1. Value of 0 implies Brooks–Corey model; value of 1 implies van Genuchten–Parker model

ANHCOMP: Bulk compressibility of anhydrite (Pa^-1). Distribution: Student’s with 3 degrees of freedom. Range: 1.09 × 10^-11 to 2.75 × 10^-10 Pa^-1. Mean, median: 8.26 × 10^-11, 8.26 × 10^-11 Pa^-1. Correlation: -0.99 rank correlation [113] with ANHPRM

ANHPRM: Logarithm of anhydrite permeability (m^2). Distribution: Student’s with 5 degrees of freedom. Range: -21.0 to -17.1 (i.e. permeability range is 1 × 10^-21 to 1 × 10^-17.1 m^2). Mean, median: -18.9, -18.9. Correlation: -0.99 rank correlation with ANHCOMP

ANRBRSAT: Residual brine saturation in anhydrite (dimensionless). Distribution: Student’s with 5 degrees of freedom. Range: 7.85 × 10^-3 to 1.74 × 10^-1. Mean, median: 8.36 × 10^-2, 8.36 × 10^-2

ANRGSSAT: Residual gas saturation in anhydrite (dimensionless). Distribution: Student’s with 5 degrees of freedom. Range: 1.39 × 10^-2 to 1.79 × 10^-1. Mean, median: 7.71 × 10^-2, 7.71 × 10^-2

BHPRM: Logarithm of borehole permeability (m^2). Distribution: uniform. Range: -14 to -11 (i.e. permeability range is 1 × 10^-14 to 1 × 10^-11 m^2). Mean, median: -12.5, -12.5

BPCOMP: Logarithm of bulk compressibility of brine pocket (Pa^-1). Distribution: triangular. Range: -11.3 to -8.00 (i.e. bulk compressibility range is 1 × 10^-11.3 to 1 × 10^-8 Pa^-1). Mean, mode: -9.80, -10.0. Correlation: -0.75 rank correlation with BPPRM

BPINTPRS: Initial pressure in brine pocket (Pa). Distribution: triangular. Range: 1.11 × 10^7 to 1.70 × 10^7 Pa. Mean, mode: 1.36 × 10^7, 1.27 × 10^7 Pa

BPPRM: Logarithm of intrinsic brine pocket permeability (m^2). Distribution: triangular. Range: -14.7 to -9.80 (i.e. permeability range is 1 × 10^-14.7 to 1 × 10^-9.80 m^2). Mean, mode: -12.1, -11.8. Correlation: -0.75 rank correlation with BPCOMP

BPVOL: Pointer variable for selection of brine pocket volume. Distribution: discrete, with integer values 1, 2, ..., 32 equally likely

HALCOMP: Bulk compressibility of halite (Pa^-1). Distribution: uniform. Range: 2.94 × 10^-12 to 1.92 × 10^-10 Pa^-1. Mean, median: 9.75 × 10^-11, 9.75 × 10^-11 Pa^-1. Correlation: -0.99 rank correlation with HALPRM

HALPOR: Halite porosity (dimensionless). Distribution: piecewise uniform. Range: 1.0 × 10^-3 to 3 × 10^-2. Mean, median: 1.28 × 10^-2, 1.00 × 10^-2

HALPRM: Logarithm of halite permeability (m^2). Distribution: uniform. Range: -24 to -21 (i.e. permeability range is 1 × 10^-24 to 1 × 10^-21 m^2). Mean, median: -22.5, -22.5. Correlation: -0.99 rank correlation with HALCOMP

SALPRES: Initial brine pressure, without the repository being present, at a reference point located in the center of the combined shafts at the elevation of the midpoint of Marker Bed (MB) 139 (Pa). Distribution: uniform. Range: 1.104 × 10^7 to 1.389 × 10^7 Pa. Mean, median: 1.247 × 10^7, 1.247 × 10^7 Pa

SHBCEXP: Brooks–Corey pore distribution parameter for shaft (dimensionless). Distribution: piecewise uniform. Range: 0.11–8.10. Mean, median: 2.52, 0.94

SHPRMASP: Logarithm of permeability (m^2) of asphalt component of shaft seal. Distribution: triangular. Range: -21 to -18 (i.e. permeability range is 1 × 10^-21 to 1 × 10^-18 m^2). Mean, mode: -19.7, -20.0

SHPRMCLY: Logarithm of permeability (m^2) for clay components of shaft seal. Distribution: triangular. Range: -21 to -17.3 (i.e. permeability range is 1 × 10^-21 to 1 × 10^-17.3 m^2). Mean, mode: -18.9, -18.3

SHPRMCON: Same as SHPRMASP, but for concrete component of shaft seal for 0–400 yr. Distribution: triangular. Range: -17.0 to -14.0 (i.e. permeability range is 1 × 10^-17 to 1 × 10^-14 m^2). Mean, mode: -15.3, -15.0

SHPRMDRZ: Logarithm of permeability (m^2) of DRZ surrounding shaft seal. Distribution: triangular. Range: -17.0 to -14.0 (i.e. permeability range is 1 × 10^-17 to 1 × 10^-14 m^2). Mean, mode: -15.3, -15.0

SHPRMHAL: Pointer variable (dimensionless) used to select permeability in crushed salt component of shaft seal at different times. Distribution: uniform. Range: 0–1. Mean, mode: 0.5, 0.5. A distribution of permeability (m^2) in the crushed salt component of the shaft seal is defined for each of the following time intervals: [0, 10 yr], [10, 25 yr], [25, 50 yr], [50, 100 yr], [100, 200 yr], [200, 10,000 yr]. SHPRMHAL is used to select a permeability value from the cumulative distribution function for permeability for each of the preceding time intervals, with the result that a rank correlation of 1 exists between the permeabilities used for the individual time intervals

SHRBRSAT: Residual brine saturation in shaft (dimensionless). Distribution: uniform. Range: 0–0.4. Mean, median: 0.2, 0.2

SHRGSSAT: Residual gas saturation in shaft (dimensionless). Distribution: uniform. Range: 0–0.4. Mean, median: 0.2, 0.2

WASTWICK: Increase in brine saturation of waste due to capillary forces (dimensionless). Distribution: uniform. Range: 0–1. Mean, median: 0.5, 0.5

WFBETCEL: Scale factor used in definition of stoichiometric coefficient for microbial gas generation (dimensionless). Distribution: uniform. Range: 0–1. Mean, median: 0.5, 0.5

WGRCOR: Corrosion rate for steel under inundated conditions in the absence of CO2 (m/s). Distribution: uniform. Range: 0 to 1.58 × 10^-14 m/s. Mean, median: 7.94 × 10^-15, 7.94 × 10^-15 m/s

WGRMICH: Microbial degradation rate for cellulose under humid conditions (mol/kg s). Distribution: uniform. Range: 0 to 1.27 × 10^-9 mol/kg s. Mean, median: 6.34 × 10^-10, 6.34 × 10^-10 mol/kg s

WGRMICI: Microbial degradation rate for cellulose under inundated conditions (mol/kg s). Distribution: uniform. Range: 3.17 × 10^-10 to 9.51 × 10^-9 mol/kg s. Mean, median: 4.92 × 10^-9, 4.92 × 10^-9 mol/kg s

WMICDFLG: Pointer variable for microbial degradation of cellulose. Distribution: discrete, with 50% 0, 25% 1, 25% 2. WMICDFLG = 0, 1, 2 implies no microbial degradation of cellulose, microbial degradation of only cellulose, microbial degradation of cellulose, plastic, and rubber

WRBRNSAT: Residual brine saturation in waste (dimensionless). Distribution: uniform. Range: 0–0.552. Mean, median: 0.276, 0.276

WRGSSAT: Residual gas saturation in waste (dimensionless). Distribution: uniform. Range: 0–0.15. Mean, median: 0.075, 0.075

See Table 1, Ref. [109] and App. PAR, Ref. [101], for additional information.

Table 2

Dependent variables arising from the solution of Eqs. (2.1)–(2.6) for an E1 intrusion at 1000 yr selected for consideration

BRNREPTC: total brine flow (m^3) into repository, which, in the context of Fig. 1, corresponds to regions 23 and 24, the part of region 1 (i.e. the borehole) between the two parts of region 23, and the part of region 25 (i.e. the panel closure) between regions 23 and 24

GAS_MOLE: total gas generation (moles) in repository due to corrosion and microbial degradation of cellulose

REP_SATB: brine saturation (dimensionless) in part of repository not penetrated by a drilling intrusion, which corresponds to region 24 in Fig. 1

WAS_PRES: pressure (Pa) in part of the repository penetrated by a drilling intrusion, which corresponds to region 23 in Fig. 1. A capillary pressure of zero is assumed within regions 23 and 24 in Fig. 1, with the result that gas and brine pressure are equal within these regions

One way to compare uncertainty analysis results is to present cumulative distribution functions (CDFs) constructed from the individual replicated random samples and LHSs. For each of the 12 analysis outcomes under consideration (i.e. three values for each of BRNREPTC, GAS_MOLE, REP_SATB, and WAS_PRES), three CDFs resulting from three random samples of size 100 and also three CDFs resulting from three LHSs of size 100 are available. These CDFs are generally very similar, both when compared within sampling procedures (i.e. random or Latin hypercube) and when compared across sampling procedures. The greatest variability occurred for WAS_PRES (Fig. 3), with Latin hypercube sampling producing noticeably more stable CDFs than random sampling. For the other results, visual inspection indicated little difference between the CDFs obtained with random and Latin hypercube sampling.
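
The comparison of replicated CDFs described above is visual. As a hedged illustration only, the Python sketch below builds empirical CDFs for replicated samples of one analysis outcome and reports the largest vertical gap between any two replicate CDFs. The gap statistic and the function names are assumptions introduced for this sketch; they are not a measure used in the analysis itself.

```python
from bisect import bisect_right


def empirical_cdf(values):
    """Return sorted values and the empirical CDF evaluated at each of them."""
    xs = sorted(values)
    n = len(xs)
    return xs, [(i + 1) / n for i in range(n)]


def cdf_at(sorted_values, x):
    """Empirical CDF (fraction of values <= x) for an already sorted sample."""
    return bisect_right(sorted_values, x) / len(sorted_values)


def max_cdf_spread(replicates):
    """Largest vertical distance between the replicate CDFs.

    The CDFs are step functions that change only at observed values,
    so it is enough to evaluate them at the pooled observations.
    """
    sorted_reps = [sorted(r) for r in replicates]
    grid = sorted({v for r in sorted_reps for v in r})
    spread = 0.0
    for x in grid:
        probs = [cdf_at(r, x) for r in sorted_reps]
        spread = max(spread, max(probs) - min(probs))
    return spread
```

Under these assumptions, `max_cdf_spread` could be called once with the three random-sample replicates and once with the three LHS replicates of an outcome such as WAS_PRES; a smaller value indicates CDFs that lie closer together.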

Fig. 2. Time-dependent solutions to Eqs. (2.1)–(2.6) obtained for the first replicated LHS (i.e. replicate R1) of size 100 used in the 1996 WIPP PA for BRNREPTC, GAS_MOLE, REP_SATB and WAS_PRES.

Fig. 3. Comparison of CDFs obtained with three replicated random and LHSs of size 100 for WAS_PRES at 1000 yr and 10,000 yr.

Plots of CDFs are too bulky to permit their presentation for all 12 analysis outcomes under consideration. In particular, 12 analysis outcomes, two sampling procedures, and three replicates result in 72 (i.e. 12 × 2 × 3) CDFs. However, box plots provide a compact representation of the information contained in the 72 CDFs under consideration that can be presented in a single figure (Fig. 4). Further, the flattened structure of box plots facilitates the comparison of CDFs both within and across sampling procedures.

As inspection of Fig. 4 shows, the distributions of results obtained with the two sampling techniques are quite stable, both within and across the two techniques. Visual inspection suggests that the results obtained with Latin hypercube sampling are slightly more stable than those obtained with random sampling, but the difference is not very large. If desired, the t-test can be used to determine confidence intervals for the estimated means for the two sampling procedures [109,116].

In typical uncertainty analyses dealing with subjective (i.e. state of knowledge or epistemic) uncertainty, the primary goal is to obtain a general assessment of the uncertainty in analysis outcomes of interest. In particular, there is neither need nor justification for the estimation of very small or very large quantiles of distributions characterizing subjective uncertainty. This is in contrast to risk studies where much emphasis is placed on the determination of the effects of stochastic (i.e. random or aleatory) uncertainty due to the need to determine the likelihood of rare, high consequence events [110–112,117].
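
The t-test confidence intervals mentioned in the discussion of Fig. 4 can be sketched as follows. The sketch assumes one common way of using replicated samples: each replicate supplies one estimate of the mean, and a Student's t interval is formed from the replicate means. With three replicates this gives 2 degrees of freedom. The function name and the hard-coded quantile are choices made for this illustration, and the details of the procedure in Refs. [109,116] may differ.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% Student's t quantile for 2 degrees of freedom
# (three replicates), hard-coded to avoid a dependency on SciPy.
T_975_DF2 = 4.303


def replicate_mean_interval(replicates, t_quantile=T_975_DF2):
    """Confidence interval for the mean based on replicate means.

    replicates: list of samples (one list of outcome values per replicate).
    t_quantile: must match len(replicates) - 1 degrees of freedom;
                the default is only valid for exactly three replicates.
    """
    rep_means = [mean(r) for r in replicates]
    center = mean(rep_means)
    std_err = stdev(rep_means) / sqrt(len(rep_means))
    half_width = t_quantile * std_err
    return center - half_width, center + half_width
```

Applying the function separately to the three random samples and the three LHSs for a given outcome would give two intervals whose widths can be compared; a narrower interval indicates a more stable estimate of the mean.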

The present analysis is concerned with the effects of

subjective uncertainty. In this context, the use of any of

the individual random or LHSs would have led to

operationally similar assessments of the uncertainty in

analysis outcomes. The word operational is used because

the individual assessments of uncertainty are sufficiently

similar that it is difficult to envision that the individual

assessments would have led to different courses of

action being chosen (e.g. whether or not to fund

additional research to reduce the indicated state of

uncertainty).

Fig. 4. Box plots for BRNREPTC, GAS_MOLE, REP_SATB and WAS_PRES at 1000, 10,000–1000, and 10,000 yr (key: RS1, RS2, RS3 and LS1, LS2, LS3 designate replicates 1–3 for random sampling and Latin hypercube sampling; 1, 9 and 10 K designate results at 1000 yr, difference in results at 10,000 yr and 1000 yr, and results at 10,000 yr).

4. Stepwise results

A sensitivity analysis based on stepwise regression

analysis with rank-transformed data [118] was carried

out for the replicated samples summarized in Fig. 4

(Tables 3–6). This analysis required α-values of 0.02 and

0.05 for variables to enter and to be retained in a given

analysis, respectively, and was carried out with the

STEPWISE program [119]. The summary tables (Tables 3–

6) present results for both the individual replicates and

for the three replicates of a given type (i.e. random or

Latin hypercube) pooled. The standardized rank

regression coefficient (SRRC) is used as a measure of

variable importance.
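The SRRC can be sketched in a few lines of Python: inputs and output are rank-transformed, the ranks are standardized, and ordinary least-squares coefficients are computed on the standardized ranks. This is a minimal sketch under stated assumptions, not the STEPWISE program: it fits one fixed set of variables without any stepwise entry or retention test, and the helper names (`average_ranks`, `standardize`, `solve`, `srrc`) and the sample data are hypothetical.

```python
def average_ranks(values):
    """Ranks with 1 for the smallest value; ties receive averaged ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for k in order[pos:end + 1]:
            ranks[k] = (pos + end) / 2.0 + 1.0
        pos = end + 1
    return ranks


def standardize(values):
    """Subtract the mean and divide by the sample standard deviation."""
    mean = sum(values) / len(values)
    sd = (sum((v - mean) ** 2 for v in values) / (len(values) - 1)) ** 0.5
    return [(v - mean) / sd for v in values]


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small linear system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def srrc(inputs, output):
    """Least-squares coefficients on standardized ranks (normal equations)."""
    zx = [standardize(average_ranks(col)) for col in inputs]
    zy = standardize(average_ranks(output))
    gram = [[sum(p * q for p, q in zip(ci, cj)) for cj in zx] for ci in zx]
    rhs = [sum(p * q for p, q in zip(ci, zy)) for ci in zx]
    return solve(gram, rhs)


# Hypothetical sample: y is a monotone function of x1 and ignores x2.
x1 = [0.1, 0.4, 0.2, 0.9, 0.5]
x2 = [3.0, 1.0, 4.0, 5.0, 2.0]
y = [v ** 3 for v in x1]
coeffs = srrc([x1, x2], y)
```

Because the rank transformation linearizes any monotone relationship, the coefficient for `x1` in this toy sample is 1 even though `y` is a cubic function of `x1`.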

Inspection of Tables 3–6 shows that the results

obtained with the individual replicates are very

consistent. In particular, the results obtained for a given

dependent variable for the three replicated random

samples are very similar to each other and also to the

results obtained for the three replicated LHSs. This

similarity includes the order in which variables are

selected in the stepwise process, the SRRCs associated

with individual variables, and the R² value of the final

regression model.

The results obtained with the pooled replicates tend

to include a few more variables than the results obtained

with the individual replicates. However, the effects

associated with the addition of these variables are


Table 3
Sensitivity analysis results based on stepwise rank regression for replicated random and Latin hypercube samples of size 100 for BRNREPTC at 1000, 10,000–1000, and 10,000 yr

Step^a | Replicate 1 | Replicate 2 | Replicate 3 | Replicates 1, 2, 3 pooled
Each group: Variable^b SRRC^c R²^d

Random: BRNREPTC, 1000 yr
1 HALPOR 0.99 0.97 HALPOR 0.97 0.94 HALPOR 0.97 0.95 HALPOR 0.98 0.96
2 WMICDFLG −0.10 0.98 ANHPRM 0.13 0.95 WMICDFLG −0.11 0.96 ANHPRM 0.09 0.97
3 ANHPRM 0.07 0.99 WMICDFLG −0.07 0.96 ANHPRM 0.10 0.97 WMICDFLG −0.09 0.98
4 SALPRES −0.04 0.99 HALPRM 0.07 0.96 BPPRM 0.05 0.97 HALPRM 0.04 0.98
5 WRBRNSAT −0.03 0.99 WGRCOR −0.06 0.97 HALPRM 0.04 0.98 WGRCOR −0.02 0.98
6 HALPRM 0.03 0.99 SHBCEXP 0.05 0.97 SALPRES −0.02 0.98
7 WASTWICK −0.03 0.99
8 SHPRMDRZ 0.02 0.99

LHS: BRNREPTC, 1000 yr
1 HALPOR 0.98 0.97 HALPOR 0.98 0.95 HALPOR 0.96 0.94 HALPOR 0.98 0.96
2 WMICDFLG −0.10 0.98 ANHPRM 0.10 0.96 WMICDFLG −0.10 0.96 WMICDFLG −0.10 0.97
3 ANHPRM 0.07 0.98 WMICDFLG −0.09 0.97 ANHPRM 0.07 0.96 ANHPRM 0.08 0.97
4 WASTWICK −0.04 0.98 WRBRNSAT −0.05 0.97 HALPRM 0.07 0.97 HALPRM 0.05 0.97
5 HALPRM 0.05 0.97 WASTWICK −0.05 0.97 WRBRNSAT −0.04 0.98
6 SALPRES −0.05 0.97 WASTWICK −0.04 0.98
7 WRBRNSAT −0.05 0.97 SALPRES −0.04 0.98
8 WGRCOR −0.03 0.98

Random: BRNREPTC, 10,000–1000 yr
1 BHPRM 0.70 0.51 BHPRM 0.68 0.45 BHPRM 0.59 0.34 BHPRM 0.66 0.44
2 BPCOMP 0.37 0.64 BPCOMP 0.40 0.63 BPCOMP 0.37 0.49 BPCOMP 0.40 0.60
3 WMICDFLG −0.26 0.70 ANHPRM 0.21 0.68 ANHPRM 0.29 0.57 WMICDFLG −0.22 0.65
4 BPINTPRS 0.14 0.72 WMICDFLG −0.21 0.72 WMICDFLG −0.23 0.63 ANHPRM 0.18 0.68
5 HALPOR 0.13 0.74 BPINTPRS 0.11 0.70
6 HALPOR 0.08 0.70

LHS: BRNREPTC, 10,000–1000 yr
1 BHPRM 0.67 0.43 BHPRM 0.66 0.43 BHPRM 0.64 0.44 BHPRM 0.66 0.43
2 BPCOMP 0.43 0.59 BPCOMP 0.46 0.66 BPCOMP 0.36 0.58 BPCOMP 0.42 0.60
3 WMICDFLG −0.25 0.64 WMICDFLG −0.24 0.71 WMICDFLG −0.29 0.67 WMICDFLG −0.27 0.67
4 ANHPRM 0.16 0.67 BPVOL 0.14 0.74 ANHPRM 0.17 0.70 BPVOL 0.16 0.70
5 BPVOL 0.16 0.69 BPVOL 0.17 0.73 ANHPRM 0.14 0.72
6 WGRCOR −0.14 0.71 BPINTPRS 0.09 0.72
7 WGRCOR −0.08 0.73

Random: BRNREPTC, 10,000 yr
1 BHPRM 0.63 0.43 BHPRM 0.63 0.39 BHPRM 0.51 0.29 BHPRM 0.60 0.37
2 HALPOR 0.42 0.59 BPCOMP 0.38 0.55 HALPOR 0.37 0.41 BPCOMP 0.36 0.50
3 BPCOMP 0.33 0.70 ANHPRM 0.23 0.61 BPCOMP 0.35 0.54 HALPOR 0.33 0.61
4 WMICDFLG −0.24 0.76 HALPOR 0.24 0.66 WMICDFLG −0.23 0.60 WMICDFLG −0.23 0.67
5 BPINTPRS 0.13 0.77 WMICDFLG −0.22 0.71 ANHPRM 0.23 0.65 ANHPRM 0.18 0.70
6 BPINTPRS 0.09 0.71

LHS: BRNREPTC, 10,000 yr
1 BHPRM 0.60 0.32 BHPRM 0.58 0.36 BHPRM 0.57 0.35 BHPRM 0.58 0.34
2 HALPOR 0.37 0.46 BPCOMP 0.41 0.55 HALPOR 0.32 0.47 BPCOMP 0.38 0.49
3 BPCOMP 0.38 0.59 HALPOR 0.34 0.65 BPCOMP 0.33 0.58 HALPOR 0.34 0.61
4 WMICDFLG −0.26 0.65 WMICDFLG −0.24 0.71 WMICDFLG −0.30 0.67 WMICDFLG −0.27 0.68
5 ANHPRM 0.14 0.67 BPVOL 0.17 0.74 BPVOL 0.17 0.70 BPVOL 0.17 0.70
6 ANHPRM 0.16 0.73 ANHPRM 0.12 0.72
7 BPINTPRS 0.09 0.72
8 HALPRM 0.08 0.73

a Steps in stepwise rank regression analysis with α-values of 0.02 and 0.05 required for a variable to enter and to be retained in an analysis, respectively.
b Variables listed in order of selection in regression analysis with ANHCOMP and HALCOMP excluded from entry into regression model because of −0.99 rank correlation within the pairs (ANHPRM, ANHCOMP) and (HALPRM, HALCOMP).
c Standardized rank regression coefficients (SRRCs) in final regression model.
d Cumulative R² value with entry of each variable into regression model.


Table 4
Sensitivity analysis results based on stepwise rank regression for replicated random and Latin hypercube samples of size 100 for GAS_MOLE at 1000, 10,000–1000, and 10,000 yr

Step^a | Replicate 1 | Replicate 2 | Replicate 3 | Replicates 1, 2, 3 pooled
Each group: Variable^b SRRC^c R²^d

Random: GAS_MOLE, 1000 yr
1 WMICDFLG 0.88 0.74 WMICDFLG 0.91 0.76 WMICDFLG 0.89 0.78 WMICDFLG 0.90 0.76
2 WGRCOR 0.37 0.88 WGRCOR 0.36 0.88 WGRCOR 0.33 0.89 WGRCOR 0.36 0.89
3 WASTWICK 0.21 0.92 WASTWICK 0.23 0.93 WASTWICK 0.14 0.91 WASTWICK 0.19 0.92
4 HALPOR 0.11 0.93 HALPOR 0.10 0.94 HALPOR 0.10 0.92 HALPOR 0.10 0.93
5 WGRMICI 0.09 0.93 WGRMICI 0.04 0.94
6 SHPRMDRZ 0.07 0.93

LHS: GAS_MOLE, 1000 yr
1 WMICDFLG 0.87 0.76 WMICDFLG 0.85 0.78 WMICDFLG 0.85 0.74 WMICDFLG 0.85 0.76
2 WGRCOR 0.37 0.89 WGRCOR 0.32 0.88 WGRCOR 0.34 0.85 WGRCOR 0.35 0.88
3 WASTWICK 0.21 0.93 WASTWICK 0.22 0.93 WASTWICK 0.25 0.91 WASTWICK 0.23 0.93
4 HALPOR 0.09 0.94 HALPOR 0.11 0.94 HALPOR 0.12 0.93 HALPOR 0.11 0.94
5 WGRMICI 0.06 0.93 WGRMICI 0.04 0.94
6 ANHPRM 0.04 0.94
7 ANHBCVGP −0.03 0.94

Random: GAS_MOLE, 10,000–1000 yr
1 HALPOR 0.53 0.31 HALPOR 0.52 0.26 HALPOR 0.43 0.21 HALPOR 0.49 0.26
2 WGRCOR 0.43 0.51 WGRCOR 0.43 0.45 WGRCOR 0.38 0.36 WGRCOR 0.41 0.44
3 WRBRNSAT 0.26 0.58 BPCOMP 0.32 0.55 WASTWICK −0.22 0.40 BHPRM 0.20 0.48
4 BHPRM 0.24 0.63 BHPRM 0.19 0.58 BHPRM 0.19 0.44 BPCOMP 0.17 0.51
5 BPVOL 0.15 0.65 ANHPRM 0.19 0.48 ANHPRM 0.12 0.52
6 WRBRNSAT 0.11 0.53
7 WASTWICK −0.10 0.54
8 BPVOL 0.10 0.55

LHS: GAS_MOLE, 10,000–1000 yr
1 HALPOR 0.52 0.27 HALPOR 0.50 0.28 BHPRM 0.39 0.16 HALPOR 0.47 0.23
2 BHPRM 0.33 0.40 BHPRM 0.38 0.42 HALPOR 0.35 0.31 BHPRM 0.38 0.37
3 WGRCOR 0.33 0.50 WGRCOR 0.32 0.51 WGRCOR 0.32 0.41 WGRCOR 0.32 0.47
4 SHRGSSAT 0.31 0.59 BPPRM −0.20 0.55 WMICDFLG −0.26 0.48 BPPRM −0.17 0.50
5 WMICDFLG −0.20 0.59 BPPRM −0.18 0.51 WMICDFLG −0.17 0.53
6 BPINTPRS 0.12 0.54

Random: GAS_MOLE, 10,000 yr
1 WMICDFLG 0.50 0.27 WMICDFLG 0.52 0.23 WMICDFLG 0.48 0.23 WMICDFLG 0.49 0.24
2 WGRCOR 0.49 0.54 WGRCOR 0.49 0.45 WGRCOR 0.43 0.41 WGRCOR 0.47 0.47
3 HALPOR 0.42 0.72 HALPOR 0.45 0.66 HALPOR 0.39 0.58 HALPOR 0.42 0.65
4 BHPRM 0.16 0.74 BPCOMP 0.25 0.73 ANHPRM 0.16 0.61 BHPRM 0.16 0.68
5 WRBRNSAT 0.16 0.77 BHPRM 0.14 0.75 BHPRM 0.15 0.63 BPCOMP 0.15 0.70
6 BPVOL 0.13 0.78 ANHPRM 0.08 0.71
7 SHPRMDRZ 0.08 0.71

LHS: GAS_MOLE, 10,000 yr
1 WMICDFLG 0.53 0.30 HALPOR 0.42 0.24 WGRCOR 0.44 0.21 WMICDFLG 0.44 0.22
2 HALPOR 0.43 0.48 WGRCOR 0.43 0.43 WMICDFLG 0.42 0.35 WGRCOR 0.41 0.40
3 WGRCOR 0.39 0.63 WMICDFLG 0.43 0.60 HALPOR 0.36 0.48 HALPOR 0.41 0.57
4 BHPRM 0.26 0.70 BHPRM 0.26 0.67 BHPRM 0.29 0.56 BHPRM 0.28 0.64
5 SHRGSSAT 0.21 0.74 BPPRM −0.15 0.69 BPPRM −0.17 0.59 BPCOMP 0.13 0.66
6 ANHPRM 0.16 0.62 BPINTPRS 0.09 0.67
7 ANHPRM 0.08 0.68

a Steps in stepwise rank regression analysis with α-values of 0.02 and 0.05 required for a variable to enter and to be retained in an analysis, respectively.
b Variables listed in order of selection in regression analysis with ANHCOMP and HALCOMP excluded from entry into regression model because of −0.99 rank correlation within the pairs (ANHPRM, ANHCOMP) and (HALPRM, HALCOMP).
c Standardized rank regression coefficients (SRRCs) in final regression model.
d Cumulative R² value with entry of each variable into regression model.


Table 5
Sensitivity analysis results based on stepwise rank regression for replicated random and Latin hypercube samples of size 100 for REP_SATB at 1000, 10,000–1000, and 10,000 yr

Step^a | Replicate 1 | Replicate 2 | Replicate 3 | Replicates 1, 2, 3 pooled
Each group: Variable^b SRRC^c R²^d

Random: REP_SATB, 1000 yr
1 HALPOR 0.81 0.60 HALPOR 0.77 0.60 HALPOR 0.78 0.61 HALPOR 0.80 0.62
2 WGRCOR −0.40 0.76 WGRCOR −0.44 0.77 WGRCOR −0.39 0.76 WGRCOR −0.41 0.78
3 WMICDFLG −0.31 0.84 WMICDFLG −0.30 0.85 WMICDFLG −0.34 0.86 WMICDFLG −0.31 0.86
4 WASTWICK −0.27 0.91 WASTWICK −0.28 0.93 WASTWICK −0.26 0.93 WASTWICK −0.26 0.93
5 WRBRNSAT 0.12 0.92 SHRGRSAT −0.07 0.94 WRBRNSAT 0.05 0.93
6 ANHPRM 0.06 0.94

LHS: REP_SATB, 1000 yr
1 HALPOR 0.78 0.63 HALPOR 0.80 0.59 HALPOR 0.77 0.62 HALPOR 0.79 0.61
2 WGRCOR −0.42 0.81 WGRCOR −0.38 0.74 WGRCOR −0.38 0.76 WGRCOR −0.39 0.77
3 WASTWICK −0.25 0.87 WASTWICK −0.30 0.84 WMICDFLG −0.30 0.85 WASTWICK −0.28 0.85
4 WMICDFLG −0.21 0.91 WMICDFLG −0.28 0.91 WASTWICK −0.28 0.93 WMICDFLG −0.26 0.92

Random: REP_SATB, 10,000–1000 yr
1 BHPRM 0.59 0.34 BHPRM 0.66 0.41 BHPRM 0.60 0.35 BHPRM 0.63 0.38
2 BPCOMP 0.29 0.43 HALPOR −0.29 0.49 HALPOR −0.29 0.44 HALPOR −0.27 0.46
3 WGRCOR −0.25 0.50 WGRCOR −0.28 0.57 WGRCOR −0.30 0.52 WGRCOR −0.26 0.53
4 HALPOR −0.23 0.55 BPCOMP 0.20 0.61 BPCOMP 0.22 0.57 BPCOMP 0.24 0.58
5 WRBRNSAT −0.16 0.58 ANHPRM 0.18 0.64 ANHPRM 0.21 0.61 ANHPRM 0.17 0.61

LHS: REP_SATB, 10,000–1000 yr
1 BHPRM 0.58 0.34 BHPRM 0.48 0.22 BHPRM 0.57 0.32 BHPRM 0.54 0.29
2 BPPRM −0.31 0.43 BPCOMP 0.39 0.36 WGRCOR −0.27 0.40 BPCOMP 0.30 0.38
3 HALPOR −0.28 0.51 HALPOR −0.31 0.46 BPCOMP 0.44 0.45 HALPOR −0.27 0.45
4 WGRCOR −0.22 0.56 WGRCOR −0.30 0.56 BPPRM 0.32 0.50 WGRCOR −0.27 0.52
5 ANHPRM 0.19 0.60 HALPOR −0.20 0.54 ANHPRM 0.14 0.54
6 SHRGSSAT −0.16 0.57 BPVOL 0.13 0.56
7 BPVOL 0.17 0.60 SHRGSSAT −0.12 0.57

Random: REP_SATB, 10,000 yr
1 BHPRM 0.57 0.35 BHPRM 0.60 0.34 BHPRM 0.49 0.28 BHPRM 0.56 0.33
2 WGRCOR −0.42 0.51 WGRCOR −0.50 0.59 WGRCOR −0.54 0.54 WGRCOR −0.48 0.54
3 HALPOR 0.30 0.59 ANHPRM 0.21 0.64 ANHPRM 0.21 0.59 HALPOR 0.25 0.60
4 BPCOMP 0.25 0.65 HALPOR 0.21 0.69 HALPOR 0.22 0.64 BPCOMP 0.21 0.65
5 WMICDFLG −0.17 0.68 WMICDFLG −0.20 0.73 BPCOMP 0.19 0.68 ANHPRM 0.18 0.69
6 ANHPRM 0.15 0.70 BPCOMP 0.16 0.75 WMICDFLG −0.15 0.71 WMICDFLG −0.18 0.72
7 WASTWICK −0.13 0.77 SHPRMDRZ −0.13 0.72 WASTWICK −0.10 0.73
8 WRBRNSAT −0.13 0.74 BPINTPRS 0.09 0.74
9 BPINTPRS 0.12 0.76

LHS: REP_SATB, 10,000 yr
1 BHPRM 0.55 0.28 BHPRM 0.49 0.26 BHPRM 0.50 0.26 BHPRM 0.52 0.26
2 WGRCOR −0.45 0.47 WGRCOR −0.50 0.48 WGRCOR −0.48 0.49 WGRCOR −0.48 0.48
3 HALPOR 0.28 0.55 BPCOMP 0.33 0.59 HALPOR 0.31 0.59 HALPOR 0.28 0.56
4 BPPRM −0.26 0.61 HALPOR 0.24 0.65 BPCOMP 0.19 0.63 BPCOMP 0.26 0.62
5 WASTWICK −0.20 0.66 WASTWICK −0.16 0.67 WMICDFLG −0.16 0.66 WASTWICK −0.16 0.65
6 ANHPRM 0.14 0.68 BPVOL 0.15 0.68 WMICDFLG −0.15 0.67
7 SHRGSSAT −0.14 0.70 BPVOL 0.11 0.68
8 ANHPRM 0.11 0.69
9 SHRGSSAT −0.10 0.70

a Steps in stepwise rank regression analysis with α-values of 0.02 and 0.05 required for a variable to enter and to be retained in an analysis, respectively.
b Variables listed in order of selection in regression analysis with ANHCOMP and HALCOMP excluded from entry into regression model because of −0.99 rank correlation within the pairs (ANHPRM, ANHCOMP) and (HALPRM, HALCOMP).
c Standardized rank regression coefficients (SRRCs) in final regression model.
d Cumulative R² value with entry of each variable into regression model.


small, and the R² values for the pooled analyses are not

much larger than the R² values obtained for the

individual replicates. Further, the results obtained with

the pooled random and LHSs are very similar.

The comparisons of random and Latin hypercube

sampling in this section are based on nonquantitative

impressions gained from inspecting the results in

Tables 3–6. Section 5 introduces quantitative procedures


Table 6
Sensitivity analysis results based on stepwise rank regression for replicated random and Latin hypercube samples of size 100 for WAS_PRES at 1000, 10,000–1000, and 10,000 yr

Step^a | Replicate 1 | Replicate 2 | Replicate 3 | Replicates 1, 2, 3 pooled
Each group: Variable^b SRRC^c R²^d

Random: WAS_PRES, 1000 yr
1 WMICDFLG 0.90 0.77 WMICDFLG 0.93 0.79 WMICDFLG 0.91 0.80 WMICDFLG 0.91 0.79
2 WGRCOR 0.35 0.90 WGRCOR 0.33 0.90 WGRCOR 0.31 0.90 WGRCOR 0.34 0.90
3 WASTWICK 0.20 0.94 WASTWICK 0.22 0.95 WASTWICK 0.14 0.91 WASTWICK 0.18 0.93
4 HALPOR 0.07 0.94 HALPOR 0.06 0.95 WGRMICI 0.09 0.92 HALPOR 0.06 0.94
5 BPVOL 0.06 0.95 WGRMICI 0.04 0.94

LHS: WAS_PRES, 1000 yr
1 WMICDFLG 0.88 0.78 WMICDFLG 0.86 0.80 WMICDFLG 0.87 0.77 WMICDFLG 0.87 0.78
2 WGRCOR 0.35 0.90 WGRCOR 0.31 0.89 WGRCOR 0.31 0.87 WGRCOR 0.33 0.89
3 WASTWICK 0.21 0.94 WASTWICK 0.21 0.94 WASTWICK 0.22 0.92 WASTWICK 0.21 0.94
4 HALPOR 0.06 0.95 HALPOR 0.08 0.94 HALPOR 0.10 0.93 HALPOR 0.08 0.94
5 ANHPRM 0.06 0.95 ANHPRM 0.05 0.95
6 WGRMICI 0.04 0.95

Random: WAS_PRES, 10,000–1000 yr
1 WMICDFLG −0.84 0.65 WMICDFLG −0.80 0.60 WMICDFLG −0.77 0.61 WMICDFLG −0.81 0.62
2 WGRCOR −0.32 0.75 WGRCOR −0.29 0.68 WGRCOR −0.32 0.70 WGRCOR −0.30 0.71
3 WASTWICK −0.19 0.79 WASTWICK −0.20 0.72 BPVOL 0.19 0.73 WASTWICK −0.17 0.74
4 BPVOL −0.17 0.81 ANHPRM 0.15 0.74 ANHPRM 0.12 0.75 ANHPRM 0.11 0.75
5 HALPRM 0.12 0.83 BPCOMP 0.09 0.76
6 HALPRM 0.07 0.76

LHS: WAS_PRES, 10,000–1000 yr
1 WMICDFLG −0.77 0.60 WMICDFLG −0.77 0.63 WMICDFLG −0.74 0.56 WMICDFLG −0.76 0.59
2 WGRCOR −0.26 0.67 WGRCOR −0.24 0.68 WGRCOR −0.31 0.65 WGRCOR −0.27 0.66
3 HALPRM 0.17 0.71 HALPRM 0.17 0.68 HALPRM 0.16 0.69
4 BPVOL 0.16 0.73 WASTWICK −0.14 0.70 WASTWICK −0.13 0.71
5 WASTWICK −0.15 0.75 BPCOMP 0.14 0.72 BPCOMP 0.11 0.72
6 BPVOL 0.09 0.73
7 BPINTPRS 0.07 0.73

Random: WAS_PRES, 10,000 yr
1 BPVOL −0.28 0.06 ANHPRM 0.31 0.10 BPCOMP 0.32 0.11 ANHPRM 0.25 0.06
2 HALPRM 0.28 0.14 BPVOL 0.28 0.18 BPCOMP 0.25 0.13
3 WRBRNSAT −0.22 0.19 WGRCOR −0.25 0.24 WGRCOR −0.18 0.16
4 ANHPRM 0.24 0.30 HALPRM 0.16 0.18
5 HALPOR 0.14 0.20

LHS: WAS_PRES, 10,000 yr
1 HALPRM 0.31 0.09 HALPRM 0.42 0.15 HALPRM 0.34 0.11 HALPRM 0.36 0.12
2 BPCOMP 0.25 0.16 BPVOL 0.25 0.21 BPCOMP 0.25 0.17 BPCOMP 0.22 0.17
3 ANHPRM 0.18 0.20
4 BPVOL 0.17 0.23
5 HALPOR 0.15 0.25

a Steps in stepwise rank regression analysis with α-values of 0.02 and 0.05 required for a variable to enter and to be retained in an analysis, respectively.
b Variables listed in order of selection in regression analysis with ANHCOMP and HALCOMP excluded from entry into regression model because of −0.99 rank correlation within the pairs (ANHPRM, ANHCOMP) and (HALPRM, HALCOMP).
c Standardized rank regression coefficients (SRRCs) in final regression model.
d Cumulative R² value with entry of each variable into regression model.


for comparing the results in Tables 3–6 obtained with

random and Latin hypercube sampling.

5. Coefficients of concordance

Inspection of the results in Tables 3–6 suggests that

the individual replicates are producing similar results.

Kendall’s coefficient of concordance (KCC) provides a way

to formally assess this similarity (p. 305, Ref. [120]). This

coefficient is based on the consideration of arrays of the form

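$$
\begin{array}{c|cccc}
 & R_1 & R_2 & \cdots & R_{nR} \\ \hline
x_1 & r(O_{11}) & r(O_{12}) & \cdots & r(O_{1,nR}) \\
x_2 & r(O_{21}) & r(O_{22}) & \cdots & r(O_{2,nR}) \\
\vdots & \vdots & \vdots & & \vdots \\
x_{nX} & r(O_{nX,1}) & r(O_{nX,2}) & \cdots & r(O_{nX,nR})
\end{array}
\tag{5.1}
$$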


where x1, x2, …, xnX are the variables under consideration (i.e.

nX = 29 with the exclusion of ANHCOMP and HALCOMP

from the analysis; see Footnote b, Table 3), R1, R2, …, RnR

designate the replicates (i.e. nR = 3), Oij is the outcome (i.e.

sensitivity measure) for variable xi and replicate Rj, and r(Oij),

i = 1, 2, …, nX, are the ranks assigned to the outcomes

associated with replicate Rj. In the assigning of ranks, (i) a

rank of 1 is assigned to the outcome Oij with the largest value

for |Oij|, (ii) a rank of 2 is assigned to the outcome Oij with the

second largest value for |Oij|, and so on, and (iii) averaged

ranks are assigned to equal values of Oij. This is the reverse of

the procedure used to assign ranks for use in rank regression.

Kendall’s coefficient of concordance (KCC) is defined by

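$$
W = \frac{12}{\mathit{nR}^2\, \mathit{nX}\,(\mathit{nX}+1)(\mathit{nX}-1)} \sum_{i=1}^{nX} \left[ \sum_{j=1}^{nR} r(O_{ij}) - \frac{\mathit{nR}\,(\mathit{nX}+1)}{2} \right]^2
\tag{5.2}
$$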

(see Eq. (23), p. 305, Ref. [120]). The coefficient W is

related to the average rar of the nR(nR − 1)/2 correlations (i.e.

rank or Spearman correlations due to the indicated rank

transformation) between the columns in Eq. (5.1) by

$$
W = \frac{(\mathit{nR}-1)\, r_{ar} + 1}{\mathit{nR}}.
\tag{5.3}
$$

The preceding equality follows from a rewriting of Eq. (29),

p. 307, of Ref. [120] in the form rar = (nR W − 1)/(nR − 1)

with rar corresponding to ra in the indicated equation from

Ref. [120]. Under repeated random assignment of the

integers in the columns of Eq. (5.1),

$$
T = \mathit{nR}\,(\mathit{nX}-1)\,W
\tag{5.4}
$$

approximately follows a χ²-distribution with nX − 1 degrees

of freedom (see Eq. (24), p. 304, Ref. [120]; Iman and

Davenport [121] recommend using an F-distribution with

k1 = nX − 1 and k2 = (nR − 1)(nX − 1) degrees of freedom

rather than the indicated χ²-distribution).
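The ranking convention and Eqs. (5.2) and (5.4) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names (`importance_ranks`, `kendall_w`, `kcc_statistic`) and the six-variable sensitivity measures are hypothetical, and ties are detected on the absolute values used for ranking. The p-value would then be obtained by referring T to a χ²-distribution with nX − 1 degrees of freedom.

```python
def importance_ranks(measures):
    """Rank 1 for the largest |measure|; tied magnitudes share an averaged rank."""
    mags = [abs(m) for m in measures]
    order = sorted(range(len(mags)), key=lambda k: -mags[k])
    ranks = [0.0] * len(mags)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and mags[order[end + 1]] == mags[order[pos]]:
            end += 1
        for k in order[pos:end + 1]:
            ranks[k] = (pos + end) / 2.0 + 1.0
        pos = end + 1
    return ranks


def kendall_w(rank_columns):
    """Kendall's coefficient of concordance, Eq. (5.2); one column per replicate."""
    n_r = len(rank_columns)
    n_x = len(rank_columns[0])
    expected = n_r * (n_x + 1) / 2.0  # expected rank sum for each variable
    dev = sum((sum(col[i] for col in rank_columns) - expected) ** 2
              for i in range(n_x))
    return 12.0 * dev / (n_r ** 2 * n_x * (n_x + 1) * (n_x - 1))


def kcc_statistic(w, n_r, n_x):
    """Test statistic T of Eq. (5.4)."""
    return n_r * (n_x - 1) * w


# Hypothetical sensitivity measures for 6 variables in 3 replicates: the three
# largest effects agree in every replicate, the three smallest are permuted.
replicates = [
    [0.90, -0.50, 0.30, 0.10, 0.05, 0.01],
    [0.80, -0.60, 0.40, 0.01, 0.10, 0.05],
    [0.95, -0.40, 0.20, 0.05, 0.01, 0.10],
]
columns = [importance_ranks(rep) for rep in replicates]
w = kendall_w(columns)
t_stat = kcc_statistic(w, len(columns), len(columns[0]))
```

For these illustrative numbers the rank sums are 3, 6, 9, 15, 15 and 15, which gives W = 31/35.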

Kendall’s coefficient of concordance (KCC) places equal

weight on agreement of rankings for both important variables

(i.e. variables with ranks close to 1) and unimportant

variables (i.e. variables with ranks close to nX). In practice,

only a few variables typically have significant effects on a

given model prediction, with the remaining variables having

no discernable effects and rankings that are either unassigned

or meaningless. The stopping of the regressions in Tables 3–6

at an α-value of 0.02 is an example of only the important

variables being assigned ranks, with the remaining variables

(i.e. the variables not selected in the stepwise regression)

assigned no rank. Alternatively, the regression could be

forced to include all variables, which would result in the

assignment of ranks to all variables, but with most of these

ranks having no meaning. As a result, KCC can be a poor

indicator of agreement when only a few variables have

significant effects.

As an alternative to KCC, Iman and Conover [122]

proposed the top down coefficient of concordance (TDCC)

as a measure of agreement between multiple rankings for

use when it is desired to emphasize agreement between

rankings assigned to important variables and to deemphasize

disagreement between rankings assigned to less

important/unimportant variables. For the TDCC, the ranks

r(Oij) in Eq. (5.1) are replaced by the corresponding Savage

scores ss(Oij), where

$$
ss(O_{ij}) = \sum_{i=r(O_{ij})}^{nX} 1/i
\tag{5.5}
$$

and average Savage scores are assigned in the event of ties.

The result is an array of the form

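$$
\begin{array}{c|cccc}
 & R_1 & R_2 & \cdots & R_{nR} \\ \hline
x_1 & ss(O_{11}) & ss(O_{12}) & \cdots & ss(O_{1,nR}) \\
x_2 & ss(O_{21}) & ss(O_{22}) & \cdots & ss(O_{2,nR}) \\
\vdots & \vdots & \vdots & & \vdots \\
x_{nX} & ss(O_{nX,1}) & ss(O_{nX,2}) & \cdots & ss(O_{nX,nR})
\end{array}
\tag{5.6}
$$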

which has the same form as the array in Eq. (5.1) except that

the ranks r(Oij) have been replaced by the corresponding

Savage scores ss(Oij).
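A short Python sketch of Eq. (5.5), including the averaging of Savage scores over ties, is given below. The function name `savage_scores` and the example measures are hypothetical, and ties are assumed to be detected on the absolute values used for ranking.

```python
def savage_scores(measures):
    """Savage scores for one replicate; rank 1 goes to the largest |measure|."""
    n = len(measures)
    # tail[r] = sum of 1/i for i = r, ..., n (the score for an untied rank r).
    tail = [0.0] * (n + 2)
    for r in range(n, 0, -1):
        tail[r] = tail[r + 1] + 1.0 / r
    order = sorted(range(n), key=lambda k: -abs(measures[k]))
    scores = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        while end + 1 < n and abs(measures[order[end + 1]]) == abs(measures[order[pos]]):
            end += 1
        # Tied outcomes share the average of the scores for ranks pos+1..end+1.
        avg = sum(tail[r] for r in range(pos + 1, end + 2)) / (end - pos + 1)
        for k in order[pos:end + 1]:
            scores[k] = avg
        pos = end + 1
    return scores


untied = savage_scores([0.9, -0.5, 0.1])  # ranks 1, 2, 3
tied = savage_scores([0.5, -0.5, 0.1])    # first two outcomes tie for ranks 1 and 2
```

With or without ties, the scores of one replicate sum to nX, and the gap between ranks 1 and 2 (here 1) is much larger than the gap between the last two ranks, which is what gives the top of the ranking its extra weight.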

The TDCC is defined by

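$$
C_T = \left\{ \sum_{i=1}^{nX} \left[ \sum_{j=1}^{nR} ss(O_{ij}) \right]^2 - \mathit{nR}^2\, \mathit{nX} \right\} \Bigg/ \left\{ \mathit{nR}^2 \left( \mathit{nX} - \sum_{i=1}^{nX} 1/i \right) \right\}
\tag{5.7}
$$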

and is equivalent to KCC calculated with Savage scores

rather than ranks. In particular,

$$
C_T = \frac{(\mathit{nR}-1)\, r_{as} + 1}{\mathit{nR}}
\tag{5.8}
$$

where ras is the average of the nR(nR − 1)/2 correlations

(i.e. ordinary or Pearson correlations involving Savage

scores) between the columns in Eq. (5.6). Under repeated

random assignment of the integers in the columns of Eq.

(5.1),

$$
T = \mathit{nR}\,(\mathit{nX}-1)\,C_T
\tag{5.9}
$$

approximately follows a χ²-distribution with nX − 1

degrees of freedom (see Sect. 4, Ref. [122]).
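The following Python sketch evaluates Eq. (5.7) for two hypothetical sets of rankings to show the top-down weighting. The function names and the rankings are assumptions made for illustration, and the sketch handles untied integer ranks only.

```python
def savage_from_ranks(ranks):
    """Savage scores for untied integer ranks 1..nX, Eq. (5.5)."""
    n_x = len(ranks)
    return [sum(1.0 / i for i in range(r, n_x + 1)) for r in ranks]


def tdcc(rank_columns):
    """Top down coefficient of concordance, Eq. (5.7); one column per replicate."""
    n_r = len(rank_columns)
    n_x = len(rank_columns[0])
    scores = [savage_from_ranks(col) for col in rank_columns]
    row_sums = [sum(col[i] for col in scores) for i in range(n_x)]
    harmonic = sum(1.0 / i for i in range(1, n_x + 1))
    numerator = sum(s * s for s in row_sums) - n_r ** 2 * n_x
    return numerator / (n_r ** 2 * (n_x - harmonic))


# Hypothetical rankings of 6 variables in 3 replicates.
top_agree = [[1, 2, 3, 4, 5, 6], [1, 2, 3, 6, 4, 5], [1, 2, 3, 5, 6, 4]]
bottom_agree = [[1, 2, 3, 4, 5, 6], [3, 1, 2, 4, 5, 6], [2, 3, 1, 4, 5, 6]]
c_top = tdcc(top_agree)        # replicates agree on the three most important variables
c_bottom = tdcc(bottom_agree)  # replicates agree only on the three least important
t_top = len(top_agree) * (len(top_agree[0]) - 1) * c_top  # Eq. (5.9)
```

Both sets of rankings have the same KCC, because Eq. (5.2) weights all positions equally, but the TDCC is close to 1 when the disagreement is confined to the bottom of the ranking and drops to roughly 0.67 when it occurs at the top.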

Sensitivity analysis results obtained with the random and

LHSs were compared with both KCC and the TDCC

(Table 7). For this comparison, the associated rank

regression models were forced to include all 29 variables

under consideration (i.e. all variables in Table 1 except

ANHCOMP and HALCOMP as indicated in Footnote b of

Table 3), and the ranking was done on the basis of the

absolute values of the SRRCs for the regression model

containing all variables. An alternative would be to rank the

variables included in the stepwise regressions in Tables 3–6

and then to assign tied ranks to the variables not selected in a

particular regression. This approach was not used.

Table 7
Consistency of variable rankings with stepwise rank regression for three replicated random samples of size 100 and three replicated Latin hypercube samples of size 100

Variable^a   Random sampling                         Latin hypercube sampling
             KCC^b  p-value^c  TDCC^d  p-value^e     KCC^b  p-value^c  TDCC^d  p-value^e
BRNREPTC1    0.58   8.2×10⁻³   0.80    5.2×10⁻⁵      0.69   7.3×10⁻⁴   0.88    5.9×10⁻⁶
BRNREPTC2    0.55   1.7×10⁻²   0.79    6.4×10⁻⁵      0.64   2.5×10⁻³   0.86    9.9×10⁻⁶
BRNREPTC3    0.59   7.4×10⁻³   0.83    2.0×10⁻⁵      0.69   8.0×10⁻⁴   0.88    5.9×10⁻⁶
GAS_MOLE1    0.55   1.6×10⁻²   0.81    3.4×10⁻⁵      0.61   4.3×10⁻³   0.85    1.2×10⁻⁵
GAS_MOLE2    0.53   2.5×10⁻²   0.76    1.4×10⁻⁴      0.57   1.1×10⁻²   0.78    7.2×10⁻⁵
GAS_MOLE3    0.58   8.6×10⁻³   0.84    1.5×10⁻⁵      0.55   1.5×10⁻²   0.77    1.0×10⁻⁴
REP_SATB1    0.61   5.2×10⁻³   0.83    2.0×10⁻⁵      0.53   2.6×10⁻²   0.80    4.3×10⁻⁵
REP_SATB2    0.61   4.4×10⁻³   0.85    1.1×10⁻⁵      0.66   1.7×10⁻³   0.82    2.4×10⁻⁵
REP_SATB3    0.73   2.5×10⁻⁴   0.88    4.6×10⁻⁶      0.77   1.1×10⁻⁴   0.88    5.0×10⁻⁶
WAS_PRES1    0.52   3.2×10⁻²   0.78    7.3×10⁻⁵      0.61   4.2×10⁻³   0.86    1.0×10⁻⁵
WAS_PRES2    0.55   1.8×10⁻²   0.80    5.0×10⁻⁵      0.62   3.5×10⁻³   0.83    2.0×10⁻⁵
WAS_PRES3    0.46   9.3×10⁻²   0.58    9.5×10⁻³      0.58   9.4×10⁻³   0.72    3.4×10⁻⁴

a Dependent variables (Table 2) with 1–3 designating results at 1000, 10,000–1000, and 10,000 yr, respectively.
b Kendall’s coefficient of concordance (KCC).
c p-value for KCC.
d Top down coefficient of concordance (TDCC).
e p-value for TDCC.

The TDCC values in Table 7 provide more insightful

indications of analysis consistency than the KCC values. In

particular, the numerical values for the TDCC are larger

than those for KCC, and more importantly, the corresponding

p-values are more significant (i.e. the TDCC is

producing smaller p-values than KCC). For example,

BRNREPTC1 for random sampling has a KCC of 0.58

with a p-value of 8.2×10⁻³ and a TDCC of 0.80 with a

p-value of 5.2×10⁻⁵; similar comparisons also exist for the

other analyses in Table 7. This behavior results because the

TDCC emphasizes agreement on important variables and

deemphasizes disagreement on unimportant variables. In

contrast, KCC tends to weight agreement/disagreement on

the rankings assigned to all variables equally.

As indicated by the TDCC, random and Latin hypercube

sampling show similar levels of consistency in rankings of

variable importance for the three replicated samples. In

particular, both approaches have similar TDCC values for a

given variable, and neither approach has TDCC values

across all variables that are consistently higher than the

values for the other approach. Thus, at least in this example,

neither sampling approach appears to have an advantage in

the consistent identification of important variables with a

sample size of 100.

6. Sensitivity analysis with the TDCC

Replicated samples and the TDCC provide the basis for a

sensitivity analysis procedure to identify important sets of

variables that does not depend on direct testing of

the statistical significance of sensitivity measures (e.g. the

significance of the coefficients in a stepwise regression

model as defined by an α-value for entry into the model).

Rather, important variables are identified by the similarity

of outcomes in analyses performed for the individual

replicated samples.

The procedure operates in the following manner: (i) The

sensitivity analysis technique in use (e.g. stepwise

regression analysis) is applied to each replicate to rank

variable importance. (ii) The TDCC is applied to the

variable rankings obtained with each replicate to determine

if there is a significant agreement between the replicates

(e.g. as defined by a specified p-value for the TDCC). (iii) If

there is significant agreement, the top ranked variable (i.e.

rank 1) for each replicate is removed from consideration for

all replicates; this results in the removal of one variable if all

replicates assign the same variable a rank of 1 and more than

one variable if different variables are assigned a rank of 1 in

different replicates. (iv) A new sensitivity analysis is then

performed for each replicate with the remaining variables,

the remaining variables are reranked for each replicate, and

Steps (ii) and (iii) are repeated with the reduced set of

variables. (v) The process is continued until the deletion of

variables brings the analysis to a point at which the

TDCC indicates that there is no significant agreement

between the variable rankings obtained with the individual

replicates. (vi) At this point, the analysis ends, and the

set of significant variables consists of those deleted before

the TDCC indicated no significant agreement between the

variable rankings obtained with the individual replicates.
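Steps (i)–(vi) can be sketched in Python as shown below. This is a minimal sketch under stated assumptions rather than the implementation used in the analysis: `sensitivity_fn` is a hypothetical callable standing in for whatever sensitivity analysis technique is in use (it returns one measure per remaining variable for a given replicate), `alpha` is the p-value above which agreement is treated as not significant, and the p-value is taken from the χ² approximation of Eq. (5.9) with a small pure-Python tail function.

```python
import math


def chi2_sf(t, df):
    """Upper tail probability of a chi-square variate (incomplete gamma function)."""
    a, x = df / 2.0, t / 2.0
    if x <= 0.0:
        return 1.0
    log_pref = -x + a * math.log(x) - math.lgamma(a)
    if x < a + 1.0:  # series expansion of the lower tail
        term = total = 1.0 / a
        n = a
        for _ in range(1000):
            n += 1.0
            term *= x / n
            total += term
            if term < total * 1e-15:
                break
        return 1.0 - total * math.exp(log_pref)
    tiny = 1e-300  # continued fraction (modified Lentz) for the upper tail
    b = x + 1.0 - a
    c, d = 1.0 / tiny, 1.0 / b
    h = d
    for i in range(1, 1000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return h * math.exp(log_pref)


def savage_scores(measures):
    """Savage scores; rank 1 goes to the largest |measure|, ties are averaged."""
    n = len(measures)
    tail = [0.0] * (n + 2)
    for r in range(n, 0, -1):
        tail[r] = tail[r + 1] + 1.0 / r
    order = sorted(range(n), key=lambda k: -abs(measures[k]))
    scores = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        while end + 1 < n and abs(measures[order[end + 1]]) == abs(measures[order[pos]]):
            end += 1
        avg = sum(tail[r] for r in range(pos + 1, end + 2)) / (end - pos + 1)
        for k in order[pos:end + 1]:
            scores[k] = avg
        pos = end + 1
    return scores


def tdcc(score_columns):
    """TDCC of Eq. (5.7) from one column of Savage scores per replicate."""
    n_r, n_x = len(score_columns), len(score_columns[0])
    row_sums = [sum(col[i] for col in score_columns) for i in range(n_x)]
    harmonic = sum(1.0 / i for i in range(1, n_x + 1))
    return (sum(s * s for s in row_sums) - n_r ** 2 * n_x) / (n_r ** 2 * (n_x - harmonic))


def tdcc_screening(variables, replicates, sensitivity_fn, alpha=0.05):
    """Remove top-ranked variables until the TDCC shows no significant agreement."""
    remaining, selected = list(variables), []
    while len(remaining) >= 2:
        # Steps (i)/(iv): rerun the sensitivity analysis on the remaining variables.
        columns = [savage_scores(sensitivity_fn(rep, remaining)) for rep in replicates]
        # Step (ii): test agreement between replicates, Eq. (5.9).
        t_stat = len(replicates) * (len(remaining) - 1) * tdcc(columns)
        if chi2_sf(t_stat, len(remaining) - 1) > alpha:
            break  # steps (v)/(vi): no significant agreement, so stop
        # Step (iii): drop the variable ranked first in each replicate.
        top = {remaining[max(range(len(remaining)), key=col.__getitem__)] for col in columns}
        selected.extend(v for v in remaining if v in top)
        remaining = [v for v in remaining if v not in top]
    return selected


# Hypothetical replicates stored as variable -> sensitivity measure.
lookup = lambda rep, names: [rep[v] for v in names]
same = [{"a": 6.0, "b": -5.0, "c": 4.0, "d": 3.0, "e": 2.0, "f": 1.0}] * 3
mixed = [{"A": 4, "B": 3, "C": 2, "D": 1}, {"A": 1, "B": 2, "C": 3, "D": 4},
         {"A": 2, "B": 4, "C": 1, "D": 3}]
found_same = tdcc_screening("abcdef", same, lookup, alpha=0.04)
found_mixed = tdcc_screening("ABCD", mixed, lookup, alpha=0.04)
```

In a real application `sensitivity_fn` would refit the rank regression on the reduced variable set at every pass; the lookup used here simply returns fixed measures so that the control flow is easy to follow.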

Table 8
Sensitivity analysis results based on SRRCs for three replicated random samples (RS1, RS2, RS3) and three replicated Latin hypercube samples (LS1, LS2, LS3) of size 100 for BRNREPTC at 1000 yr

Variable^a RS1^b,c RS2 RS3 LS1 LS2 LS3
HALPOR 9.93×10⁻¹ (1) 9.67×10⁻¹ (1) 9.73×10⁻¹ (1) 9.82×10⁻¹ (1) 9.79×10⁻¹ (1) 9.67×10⁻¹ (1)
WMICDFLG −9.72×10⁻² (2) −6.92×10⁻² (4) −1.13×10⁻¹ (2) −9.53×10⁻² (2) −8.37×10⁻² (3) −1.03×10⁻¹ (2)
ANHPRM 6.49×10⁻² (3) 1.33×10⁻¹ (2) 9.84×10⁻² (3) 7.62×10⁻² (3) 1.04×10⁻¹ (2) 7.20×10⁻² (3)
SALPRES −4.00×10⁻² (4) −2.70×10⁻³ (26) −1.41×10⁻² (13) −1.90×10⁻² (10) −3.17×10⁻² (8) −4.57×10⁻² (7)
HALPRM 3.53×10⁻² (5) 7.67×10⁻² (3) 4.05×10⁻² (5) 2.92×10⁻² (5) 4.99×10⁻² (5) 6.83×10⁻² (4)
WRBRNSAT −3.08×10⁻² (6) −1.79×10⁻² (14) 9.13×10⁻³ (17) −1.59×10⁻² (12) −5.15×10⁻² (4) −5.07×10⁻² (6)
WASTWICK −2.82×10⁻² (7) −2.27×10⁻² (10) −4.47×10⁻³ (21) −3.78×10⁻² (4) −2.74×10⁻² (10) −5.14×10⁻² (5)
BPCOMP −2.61×10⁻² (8) 2.36×10⁻² (9) −8.05×10⁻⁴ (29) −1.04×10⁻² (19) 9.96×10⁻³ (23) −1.29×10⁻² (19)
SHPRMDRZ 2.29×10⁻² (9) −1.37×10⁻² (17) 2.58×10⁻² (8) 1.05×10⁻² (18) −2.45×10⁻² (11) 9.97×10⁻³ (23)
BPPRM −1.85×10⁻² (10) 1.27×10⁻² (19) 5.08×10⁻² (4) −2.46×10⁻² (7) 2.51×10⁻³ (26) −2.12×10⁻³ (29)
WFBETCEL −1.60×10⁻² (11) 1.89×10⁻² (13) 1.46×10⁻² (11) −1.39×10⁻² (15) −5.76×10⁻³ (25) −2.70×10⁻² (9)
SHPRMASP −1.30×10⁻² (12) 1.02×10⁻² (20) −2.36×10⁻² (9) −5.85×10⁻³ (26) 1.83×10⁻² (15) −2.93×10⁻³ (26)
BPINTPRS 1.27×10⁻² (13) 2.07×10⁻² (12) −1.27×10⁻² (15) −7.75×10⁻³ (22) 1.24×10⁻² (21) 2.54×10⁻³ (27)
WGRMICH −1.14×10⁻² (14) 7.36×10⁻³ (22) 1.32×10⁻² (14) −1.34×10⁻² (17) −1.64×10⁻² (17) −1.84×10⁻² (12)
SHRGSSAT −1.13×10⁻² (15) −1.64×10⁻³ (27) 5.93×10⁻³ (18) −6.60×10⁻³ (24) 1.89×10⁻² (14) −1.31×10⁻² (18)
SHPRMHAL 1.11×10⁻² (16) −5.05×10⁻³ (24) −3.49×10⁻³ (22) −7.12×10⁻³ (23) 4.97×10⁻⁴ (29) 1.10×10⁻² (22)
WGRCOR −1.03×10⁻² (17) −6.13×10⁻² (5) −2.15×10⁻³ (25) −2.74×10⁻² (6) −3.95×10⁻² (7) −2.45×10⁻² (10)
SHBCEXP 9.95×10⁻³ (18) 4.62×10⁻² (6) −1.44×10⁻² (12) 1.35×10⁻² (16) 2.21×10⁻³ (27) −6.49×10⁻³ (24)
SHPRMCON −9.62×10⁻³ (19) 4.76×10⁻⁴ (29) −1.09×10⁻³ (28) 1.21×10⁻³ (29) −1.17×10⁻² (22) 1.45×10⁻² (17)
BHPRM 7.42×10⁻³ (20) 1.37×10⁻³ (28) 1.79×10⁻³ (26) 1.77×10⁻² (11) 1.67×10⁻² (16) 2.03×10⁻² (11)
SHPRMCLY −6.59×10⁻³ (21) 8.78×10⁻³ (21) 2.68×10⁻² (7) −1.44×10⁻² (13) −4.35×10⁻² (6) 1.19×10⁻² (21)
ANRGSSAT −6.11×10⁻³ (22) −1.75×10⁻² (15) 1.64×10⁻³ (27) 7.94×10⁻³ (21) 8.88×10⁻³ (24) 1.77×10⁻² (13)
ANHBCVGP −4.49×10⁻³ (23) −2.40×10⁻² (8) −1.26×10⁻² (16) −1.80×10⁻³ (28) 1.29×10⁻³ (28) 2.74×10⁻² (8)
WGRMICI −3.33×10⁻³ (24) −1.48×10⁻² (16) −3.35×10⁻³ (23) −8.81×10⁻³ (20) −1.44×10⁻² (19) −1.60×10⁻² (16)
SHRBRSAT 2.75×10⁻³ (25) −2.23×10⁻² (11) 2.87×10⁻³ (24) −4.44×10⁻³ (27) 3.09×10⁻² (9) 1.73×10⁻² (14)
ANRBRSAT 1.95×10⁻³ (26) −4.05×10⁻² (7) 1.64×10⁻² (10) 6.56×10⁻³ (25) −1.54×10⁻² (18) 4.31×10⁻³ (25)
BPVOL −1.58×10⁻³ (27) 6.54×10⁻³ (23) 4.64×10⁻³ (20) 2.12×10⁻² (9) 2.37×10⁻² (13) −2.18×10⁻³ (28)
ANHBCEXP −1.30×10⁻³ (28) 4.32×10⁻³ (25) 2.88×10⁻² (6) −2.15×10⁻² (8) −1.30×10⁻² (20) 1.69×10⁻² (15)
WRGSSAT −1.19×10⁻³ (29) 1.32×10⁻² (18) −5.33×10⁻³ (19) 1.43×10⁻² (14) 2.45×10⁻² (12) 1.29×10⁻² (20)

a Variables included in regression model (i.e. all variables in Table 1, except for ANHCOMP and HALCOMP which are not included because of −0.99 rank correlation within the pairs (ANHPRM, ANHCOMP) and (HALPRM, HALCOMP)).
b SRRC in model containing all variables for indicated sample.
c Variable rank based on absolute value of SRRC for indicated sample.

This procedure is illustrated for rank regression analysis

with the three random samples for BRNREPTC at 1000 yr

(i.e. BRNREPTC1). The individual regression analyses all

rank HALPOR as the most important variable (see left three

columns of results in Table 8) and have a TDCC of 0.80

with a p-value of 5.2×10⁻⁵ (Table 9). As a result,

HALPOR is removed from consideration, which reduces

the number of independent variables from 29 to 28. A new

rank regression is then performed for each replicate with the

remaining 28 variables, and the variables are reranked (i.e.

from 1 to 28) on the basis of their SRRCs, with ANHPRM

having a rank of 1 in one replicate and WMICDFLG having

a rank of 1 in two replicates. For this new ranking (i.e.

without HALPOR), the TDCC has a value of 0.71 with a

p-value of 5.0×10⁻⁴ (Table 9). As this is considered to be

significant agreement, ANHPRM and WMICDFLG are

dropped; the remaining 26 variables are reranked; new

regressions are performed for each replicate; and a resultant

TDCC of 0.46 with a p-value of 9.8×10⁻² is calculated

(Table 9). If a p-value of 9.8×10⁻² is considered to be

insignificant, then the analysis ends, and the set of

significant variables is taken to be {HALPOR, ANHPRM,

WMICDFLG}.

If a p-value of 9.8×10⁻² is considered to be significant

(e.g. if the analysis was using 0.1 as the p-value above which

the analysis stopped), then the analysis would continue with

the top ranked variables in the individual replicates being

dropped (i.e. SALPRES, HALPRM, BPPRM) and the TDCC

recalculated for the remaining 23 variables. This process

would continue until either an insignificant value for

the TDCC was obtained or all variables were dropped,

with the latter being an unlikely outcome.

For perspective, the process is also illustrated for

BRNREPTC, 10,000–1000 yr (i.e. BRNREPTC2), and

BRNREPTC at 10,000 yr (i.e. BRNREPTC3) in Table 9. If

a p-value of 0.02 was being used to determine significance

for the TDCC, then the analyses for BRNREPTC2 and

BRNREPTC3 would identify {BHPRM, BPCOMP,

WMICDFLG, ANHPRM} and {BHPRM, BPCOMP,

HALPOR, ANHPRM, WMICDFLG}, respectively, as the

important sets of variables.

Sensitivity analysis results based on the TDCC as

described in this section are presented in Table 10 for the

12 dependent variables considered in Tables 3–6. There is

little difference between the sets of important variables

identified with random and Latin hypercube sampling.

The sensitivity analysis procedure presented in this

section is analogous to forward stepwise regression analysis

in the sense that the procedure operates by finding the most


Table 9
Sensitivity analysis with the TDCC for three replicated random samples of size 100 for BRNREPTC at 1000, 10,000–1000, and 10,000 yr

Stepᵃ  TDCCᵇ  p-valueᶜ     Variable(s) removedᵈ

Random: BRNREPTC, 1000 yr
1      0.80   5.2 × 10⁻⁵   HALPOR
2      0.71   5.0 × 10⁻⁴   WMICDFLG, ANHPRM
3      0.46   9.8 × 10⁻²   SALPRES, HALPRM, BPPRM

Random: BRNREPTC, 10,000–1000 yr
1      0.79   6.4 × 10⁻⁵   BHPRM
2      0.72   4.3 × 10⁻⁴   BPCOMP
3      0.60   8.1 × 10⁻³   WMICDFLG, ANHPRM
4      0.30   5.9 × 10⁻¹   BPINTPRS, BPVOL, WGRCOR

Random: BRNREPTC, 10,000 yr
1      0.83   2.0 × 10⁻⁵   BHPRM
2      0.77   1.4 × 10⁻⁴   HALPOR, BPCOMP
3      0.64   3.9 × 10⁻³   WMICDFLG, ANHPRM
4      0.28   6.8 × 10⁻¹   BPINTPRS, BPVOL, BPPRM

ᵃ Steps in analysis.
ᵇ TDCC at beginning of step.
ᶜ p-value for TDCC at beginning of step.
ᵈ Variable(s) removed at end of step.

Table 10
Sensitivity analysis results with the TDCC for three replicated random samples of size 100 and three replicated Latin hypercube samples of size 100

Variableᵃ
  Random samplingᵇ
  Latin hypercube samplingᶜ

BRNREPTC1
  Random: HALPOR(1ᵈ), WMICDFLG(2), ANHPRM(2)
  LHS:    HALPOR(1), WMICDFLG(2), ANHPRM(2), WASTWICK(3), WRBRNSAT(3), HALPRM(3)
BRNREPTC2
  Random: BHPRM(1), BPCOMP(2), WMICDFLG(3), ANHPRM(3)
  LHS:    BHPRM(1), BPCOMP(2), WMICDFLG(3), ANHPRM(4), BPVOL(4)
BRNREPTC3
  Random: BHPRM(1), HALPOR(2), BPCOMP(2), WMICDFLG(3), ANHPRM(3)
  LHS:    BHPRM(1), HALPOR(2), BPCOMP(2), WMICDFLG(3), ANHPRM(4), BPVOL(4)
GAS_MOLE1
  Random: WMICDFLG(1), WGRCOR(2), WASTWICK(3), HALPOR(4)
  LHS:    WMICDFLG(1), WGRCOR(2), WASTWICK(3), HALPOR(4)
GAS_MOLE2
  Random: HALPOR(1), WGRCOR(2)
  LHS:    HALPOR(1), BHPRM(1), WGRCOR(2)
GAS_MOLE3
  Random: WMICDFLG(1), WGRCOR(2), HALPOR(3), BHPRM(4), BPCOMP(4)
  LHS:    WMICDFLG(1), WGRCOR(1), HALPOR(2), BHPRM(3)
REP_SATB1
  Random: HALPOR(1), WGRCOR(2), WMICDFLG(3), WASTWICK(4)
  LHS:    HALPOR(1), WGRCOR(2), WASTWICK(3), WMICDFLG(3)
REP_SATB2
  Random: BHPRM(1), WGRCOR(2), HALPOR(3), BPCOMP(4), ANHPRM(4)
  LHS:    BHPRM(1), BPCOMP(1), HALPOR(2), WGRCOR(2), BPPRM(2), ANHPRM(3), BPVOL(3)
REP_SATB3
  Random: BHPRM(1), WGRCOR(1), HALPOR(2), ANHPRM(2), BPCOMP(2), WMICDFLG(3)
  LHS:    BHPRM(1), WGRCOR(1), HALPOR(2), BPCOMP(2), WASTWICK(3), BPPRM(3), ANHPRM(4), WMICDFLG(4), SHPRMCON(5), BPVOL(5)
WAS_PRES1
  Random: WMICDFLG(1), WGRCOR(2), WASTWICK(3)
  LHS:    WMICDFLG(1), WGRCOR(2), WASTWICK(3), ANHPRM(4), HALPOR(4)
WAS_PRES2
  Random: WMICDFLG(1), WGRCOR(2), WASTWICK(3), BPCOMP(3)
  LHS:    WMICDFLG(1), WGRCOR(2), BPCOMP(3), HALPRM(3)
WAS_PRES3
  Random: HALPRM(1), BPCOMP(1), BPVOL(2), ANHPRM(2), WGRCOR(2)
  LHS:    BPCOMP(1), HALPRM(1), ANHPRM(2), BPVOL(2)

ᵃ Dependent variables (Table 2) with 1–3 designating results at 1000, 10,000–1000, and 10,000 yr, respectively.
ᵇ Significant variables identified with replicated random sampling with a p-value cutoff of 0.02 for the TDCC.
ᶜ Significant variables identified with replicated Latin hypercube sampling with a p-value cutoff of 0.02 for the TDCC.
ᵈ Step at which variable is identified as being significant.

important variable(s), then the next most important
variable(s), and so on until no more variables having
identifiable effects can be found. However, the procedure
differs from forward stepwise regression analysis in that a
variable is removed from further consideration once it is
identified as being important. In contrast, forward stepwise
regression analysis retains those variables identified as
being important at previous steps as it moves forward to
identify additional important variables. At a certain
operational level, the sensitivity analysis procedure
presented in this section is analogous to backward stepwise
regression analysis in which unimportant variables are
sequentially eliminated from inclusion in the regression
model. However, there is a very important difference.
Backward stepwise regression analysis eliminates variables
from further consideration on the basis of being
unimportant; in contrast, the presented sensitivity procedure
eliminates variables on the basis of being important.

7. Sensitivity analysis with small samples

The sensitivity analysis results obtained with random and
LHSs of size 100 are very similar and thus indicate that a
sample size of 100 is adequate for the problem under
consideration. The question naturally arises if smaller
sample sizes would also be adequate.

To partially address this question, the random samples
were pooled to produce 300 observations, and then three


Table 11
Sensitivity analysis results based on stepwise rank regression for replicated random samples of size 50 for BRNREPTC, GAS_MOLE, REP_SATB and WAS_PRES at 1000, 10,000–1000, and 10,000 yr

Stepᵃ  Replicate 1; Replicate 2; Replicate 3
       Variableᵇ SRRCᶜ R²ᵈ; Variable SRRC R²; Variable SRRC R²

Random 50: BRNREPTC, 1000 yr
1  HALPOR 1.00 0.96; HALPOR 0.97 0.94; HALPOR 0.95 0.90
2  WMICDFLG −0.09 0.97; ANHPRM 0.11 0.95; ANHPRM 0.13 0.92
3  ANHPRM 0.08 0.98; WMICDFLG −0.08 0.96; WMICDFLG −0.13 0.94
4  SHRGSSAT −0.05 0.98; HALPRM 0.10 0.96
5  WASTWICK −0.08 0.97
6  SALPRES −0.06 0.97

Random 50: BRNREPTC, 10,000–1000 yr
1  BHPRM 0.55 0.28; BHPRM 0.78 0.39; BHPRM 0.66 0.20
2  BPCOMP 0.50 0.56; BPCOMP 0.45 0.57; BPCOMP 0.53 0.36
3  WMICDFLG −0.24 0.64; BPINTPRS 0.31 0.62; WMICDFLG −0.34 0.47
4  ANRGSSAT 0.24 0.69; ANHPRM 0.30 0.71; ANHPRM 0.29 0.55
5  BPVOL 0.22 0.75

Random 50: BRNREPTC, 10,000 yr
1  BHPRM 0.53 0.23; BHPRM 0.67 0.31; BHPRM 0.60 0.16
2  BPCOMP 0.51 0.45; BPCOMP 0.42 0.45; BPCOMP 0.48 0.29
3  WMICDFLG −0.29 0.55; BPVOL 0.33 0.57; WMICDFLG −0.43 0.43
4  HALPOR 0.35 0.65; BPINTPRS 0.30 0.63; ANHPRM 0.30 0.51
5  ANRGSSAT 0.26 0.72; ANHPRM 0.25 0.68; ANHBCEXP 0.25 0.57
6  BPVOL 0.20 0.76

Random 50: GAS_MOLE, 1000 yr
1  WMICDFLG 0.81 0.77; WMICDFLG 0.93 0.77; WMICDFLG 0.92 0.72
2  WGRCOR 0.36 0.88; WGRCOR 0.34 0.92; WGRCOR 0.43 0.91
3  WFBETCEL −0.12 0.89; WASTWICK 0.19 0.95; HALPOR 0.13 0.94
4  SHBCEXP 0.11 0.95; WASTWICK 0.12 0.95
5  SHPRMDRZ 0.10 0.96
6  SHPRMCLY −0.08 0.97

Random 50: GAS_MOLE, 10,000–1000 yr
1  WGRCOR 0.45 0.25; HALPOR 0.63 0.15; WGRCOR 0.42 0.16
2  HALPOR 0.56 0.42; BHPRM 0.45 0.28; HALPOR 0.36 0.29
3  BPCOMP 0.37 0.54; BPCOMP 0.36 0.42
4  ANRGSSAT 0.26 0.61; WGRCOR 0.37 0.50
5  SHBCEXP 0.26 0.67; WASTWICK −0.34 0.60
6  BPVOL 0.22 0.71

Random 50: GAS_MOLE, 10,000 yr
1  WGRCOR 0.48 0.36; WMICDFLG 0.61 0.27; WMICDFLG 0.59 0.25
2  WMICDFLG 0.50 0.60; WGRCOR 0.40 0.48; WGRCOR 0.48 0.46
3  HALPOR 0.28 0.70; HALPOR 0.42 0.56; HALPOR 0.36 0.59
4  BPCOMP 0.26 0.76; BHPRM 0.34 0.65
5  SHRBRSAT 0.23 0.80; BPCOMP 0.27 0.71

Random 50: REP_SATB, 1000 yr
1  HALPOR 0.82 0.54; HALPOR 0.86 0.59; HALPOR 0.77 0.54
2  WGRCOR −0.37 0.75; WGRCOR −0.36 0.75; WGRCOR −0.52 0.81
3  WMICDFLG −0.41 0.89; WASTWICK −0.35 0.83; WMICDFLG −0.30 0.87
4  WASTWICK −0.21 0.94; WMICDFLG −0.29 0.92; WASTWICK −0.25 0.92
5  WRBRNSAT 0.12 0.93; SHRGSSAT 0.13 0.93
6  SHPRMCON 0.10 0.94

Random 50: REP_SATB, 10,000–1000 yr
1  BHPRM 0.62 0.38; BHPRM 0.49 0.33; BHPRM 0.58 0.15
2  WGRCOR −0.30 0.52; WGRCOR −0.33 0.45; WGRCOR −0.38 0.28
3  BPCOMP 0.27 0.61; HALPOR −0.33 0.56; HALPOR −0.36 0.41
4  HALPOR −0.24 0.66; ANHPRM 0.32 0.50
5  BPCOMP 0.28 0.57
6  WMICDFLG −0.28 0.63
7  BPVOL −0.25 0.69

Random 50: REP_SATB, 10,000 yr
1  BHPRM 0.77 0.30; WGRCOR −0.62 0.37; WGRCOR −0.66 0.39
2  WGRCOR −0.67 0.63; BHPRM 0.47 0.55; BHPRM 0.51 0.53
3  HALPOR 0.31 0.70; HALPOR 0.27 0.62; WMICDFLG −0.43 0.68
4  ANHPRM 0.26 0.77; BPCOMP 0.29 0.74
5  SHPRMHAL −0.25 0.82; ANHPRM 0.19 0.77
6  BPCOMP 0.17 0.84

Random 50: WAS_PRES, 1000 yr
1  WMICDFLG 0.82 0.78; WMICDFLG 0.94 0.78; WMICDFLG 0.92 0.75
2  WGRCOR 0.35 0.88; WGRCOR 0.33 0.93; WGRCOR 0.41 0.92
3  WFBETCEL −0.12 0.90; WASTWICK 0.19 0.95; WASTWICK 0.14 0.94
4  SHBCEXP 0.11 0.96
5  SHPRMCLY −0.09 0.96
6  SHPRMDRZ 0.08 0.97
7  ANHBCEXP 0.07 0.97

Random 50: WAS_PRES, 10,000–1000 yr
1  WMICDFLG −0.75 0.67; WMICDFLG −0.88 0.67; WMICDFLG −0.81 0.61
2  WGRCOR −0.20 0.73; WGRCOR −0.23 0.77; WGRCOR −0.39 0.76
3  ANRGSSAT 0.20 0.77; WASTWICK −0.22 0.80
4  SHBCEXP −0.21 0.85

Random 50: WAS_PRES, 10,000 yr
1  ANRGSSAT 0.39 0.15; BPCOMP 0.33 0.12; No variable selected
2  SHBCEXP −0.33 0.22

ᵃ Steps in stepwise rank regression analysis with α-values of 0.02 and 0.05 required for a variable to enter and to be retained in an analysis, respectively.
ᵇ Variables listed in order of selection in regression analysis with ANHCOMP and HALCOMP excluded from entry into regression model because of −0.99 rank correlations within the pairs (ANHPRM, ANHCOMP) and (HALPRM, HALCOMP).
ᶜ Standardized rank regression coefficients (SRRCs) in final regression model.
ᵈ Cumulative R² value with entry of each variable into regression model.

Table 12
Consistency of variable rankings with stepwise rank regression for three replicated random samples of size 50 and 100

Variableᵃ    Random 50               Random 100
             TDCCᵇ   p-valueᶜ        TDCCᵇ   p-valueᶜ
BRNREPTC1    0.63    3.3 × 10⁻³      0.80    5.2 × 10⁻⁵
BRNREPTC2    0.61    5.0 × 10⁻³      0.79    6.4 × 10⁻⁵
BRNREPTC3    0.65    1.8 × 10⁻³      0.83    2.0 × 10⁻⁵
GAS_MOLE1    0.75    1.7 × 10⁻⁴      0.81    3.4 × 10⁻⁵
GAS_MOLE2    0.57    1.0 × 10⁻²      0.76    1.4 × 10⁻⁴
GAS_MOLE3    0.79    6.7 × 10⁻⁵      0.84    1.5 × 10⁻⁵
REP_SATB1    0.79    5.5 × 10⁻⁵      0.83    2.0 × 10⁻⁵
REP_SATB2    0.77    1.0 × 10⁻⁴      0.85    1.1 × 10⁻⁵
REP_SATB3    0.81    3.2 × 10⁻⁵      0.88    4.6 × 10⁻⁶
WAS_PRES1    0.76    1.3 × 10⁻⁴      0.78    7.3 × 10⁻⁵
WAS_PRES2    0.70    6.2 × 10⁻⁴      0.80    5.0 × 10⁻⁵
WAS_PRES3    0.44    1.1 × 10⁻¹      0.58    9.5 × 10⁻³

ᵃ Dependent variables (Table 2) with 1–3 designating results at 1000, 10,000–1000, and 10,000 yr, respectively.
ᵇ Top down coefficient of concordance (TDCC).
ᶜ p-value for TDCC.


samples of size 50 were obtained by randomly sampling

from these 300 observations. Each new sample of size 50

was drawn without replacement from the 300 pooled

observations. This resampling process was used instead of

generating entirely new analysis results because BRAGFLO

is time consuming to run for the problem under

consideration (i.e. 2–4 h of CPU time on a VAX Alpha

per model evaluation). As a result, it is desirable to reuse

the available results rather than generate entirely new

results. This process was not performed for the LHSs

because the stratification associated with Latin hypercube

sampling would not be preserved in the resampling, with

the result that the new samples of reduced size would not

be LHSs.
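The resampling step, and the reason it is not applied to the LHSs, can be sketched as follows. This is an illustrative sketch only: the text does not say whether the three samples of size 50 were required to be disjoint, so each is assumed here to be drawn independently from the full pool; the observation labels and the function name are hypothetical.

```python
import random

def draw_subsamples(pooled, size=50, n_samples=3, seed=0):
    """Draw each subsample without replacement from the full pool.

    Assumption: subsamples are drawn independently of one another, so two
    subsamples may share observations.
    """
    rng = random.Random(seed)
    return [rng.sample(pooled, size) for _ in range(n_samples)]

# Pooled random-sampling results, labelled (replicate, element index).
pooled = [(rep, k) for rep in (1, 2, 3) for k in range(100)]
subsamples = draw_subsamples(pooled)

# Why the same shortcut fails for an LHS: in an LHS of size 100, each of the
# 100 equal-probability strata of a variable holds exactly one element.
# A random subset of 50 elements will, in general, not place exactly one
# element in each of the 50 coarser strata that an LHS of size 50 requires.
rng = random.Random(1)
lhs_strata = list(range(100))            # stratum index of each LHS element
kept = rng.sample(lhs_strata, 50)
occupied = len({s // 2 for s in kept})   # coarse strata hit, out of 50
print(occupied, "of 50 strata occupied")
```

A subset that happened to occupy all 50 coarse strata would still be an LHS for that one variable, but the chance of this occurring for even a single variable is negligible, and it would have to hold for every sampled variable simultaneously.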

The new random samples of size 50 were analyzed with

stepwise rank regression in the same manner as the random

samples of size 100 in Section 4 (Table 11). For a given

dependent variable, the results for the samples of size 50

were similar to each other and also similar to the

corresponding results for random samples of size 100 in

Tables 3–6.

Although the results with random samples of size 50 and

100 are generally similar, the impression emerges that the

results with samples of size 100 are somewhat better in the

sense of being more consistent and having more variables

identified as being significant. To test this, rank regression

models containing all independent variables were

constructed for the samples of size 50 and 100, and variable

importance was ranked on the basis of the resultant SRRCs.

The general impression that the analyses with samples of

size 100 are somewhat better than the analyses with samples

of size 50 is confirmed by the resultant TDCC values

(Table 12). In particular, the values obtained for the three


samples of size 100 are consistently larger and more

significant than those obtained for the three samples of

size 50.

The analysis was also tried for random samples of size

25. At this sample size, considerable deterioration in the

results was observed (i.e. few or no variables identified as

being significant, and considerable variation in identified

variables from replicate to replicate). However, even at this

small sample size, the sensitivity analysis was typically

successful in identifying the important independent

variables for the dependent variables that had high R² values in

the analyses for samples of size 50 and 100. The TDCC was

not calculated for the samples of size 25 because a

regression model containing all 29 variables cannot be

constructed when only 25 observations are available, and a

TDCC based on a regression model containing fewer

variables would not be directly comparable to a

TDCC based on regression models containing all variables

obtained from samples of size 50 or 100.
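The obstacle at a sample size of 25 is a matter of matrix rank. Assuming the rank regression model includes an intercept (an assumption; the count is equally decisive without one, since 25 < 29), the design matrix X has 25 rows and 30 columns, so that

```latex
X \in \mathbb{R}^{25 \times 30}, \qquad
\operatorname{rank}\!\left(X^{\mathsf T} X\right) = \operatorname{rank}(X) \le 25 < 30 .
```

The normal equations therefore have no unique solution, and SRRCs for all 29 variables, and hence a full ranking on which to base the TDCC, are not defined.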

8. Sensitivity analysis without regression

The regression analyses summarized in Tables 3–6

exhibit various levels of success. Some analyses are quite

good, with R² values above 0.9. Other analyses are not quite

so good, with R² values in the range from 0.6 to 0.8. The

analyses for WAS_PRES at 10,000 yr are effectively

failures, with R² values in the vicinity of 0.2.

An important aspect of the analyses in Tables 3–6 is that

the identification of dominant variables tends to remain the

same across replicates for both random and Latin hypercube

sampling. This consistency holds for regression models with

both high and low R² values. This implies that the failure to

account for uncertainty as measured by R² values probably

derives from the sensitivity analysis technique in use (i.e.

stepwise regression analysis with rank-transformed data)

rather than from an overly small sample size.

When regression-based approaches to sensitivity

analysis do not yield satisfactory insights, important variables

can be searched for by attempting to identify patterns in

scatterplots between sampled and predicted variables with

techniques that are not predicated on searches for linear or

monotonic relationships. For a sampled variable x (i.e. one

of the variables in Table 1) and a predicted variable y (i.e.

one of the variables in Table 2 at a specific point in time),

possibilities include use of (i) the F-statistic to identify

changes in the mean value of y across the range of x, (ii) the

χ²-statistic to identify changes in the median value of y

across the range of x, (iii) the Kruskal–Wallis statistic to

identify changes in the distribution of y across the range of

x, and (iv) the χ²-statistic to identify a nonrandom joint

distribution involving y and x [70]. For convenience, the

preceding will be referred to as tests for (i) common means

(CMNs), (ii) common medians (CMDs), (iii) common

locations (CLs), and (iv) statistical independence (SI),

respectively.

The indicated statistics are based on dividing the values

of x into intervals (Fig. 5). Typically, these intervals contain

equal numbers of values for x (i.e. the intervals are of equal

probability); however, this is not always the case (e.g. when

the sample space for x has a finite number of values of

unequal probability). The calculation of the F-statistic for

CMNs and the Kruskal–Wallis statistic for CLs involves

only the division of x into intervals. The F-statistic and the

Kruskal–Wallis statistic are then used to indicate if the y

values associated with these intervals appear to have

different means and distributions, respectively. The

χ²-statistic for CMDs involves a further division of the

predicted y values into values above and below their median

(i.e. the horizontal line in Fig. 5a), with the corresponding

significance test used to indicate if the y values associated

with the individual intervals defined for x appear to have

medians that are different from the median for all values of

y. The χ²-statistic for SI involves a division of the y values

into intervals of equal probability analogous to the division

of the values of x (i.e. the horizontal lines in Fig. 5b), with

the corresponding significance test used to indicate if the

observed distribution of the (x, y) pairs over the cells in

Fig. 5b appears to be different from what would be expected

if there was no relationship between x and y. For each

statistic, a p-value can be calculated which corresponds to

the probability of observing a stronger pattern than the one

actually observed if there is no relationship between x and y.

An ordering of p-values then provides a ranking of variable

importance (i.e. the smaller the p-value, the stronger the

effect of x on y appears to be).
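Three of the four grid-based tests can be sketched with the standard library alone. The sketch below assumes the textbook forms of the statistics: a Pearson chi-square statistic on a 2 × nX table for CMDs and on an nY × nX table for SI, and the Kruskal–Wallis H statistic (without tie correction) for CLs, each referred to a chi-square distribution. The F-test for CMNs is omitted because its p-value requires the F distribution. All function names and the toy data are illustrative and are not taken from the original analysis.

```python
import math

def chi2_sf(stat, dof):
    """Upper-tail chi-square probability for integer dof (stdlib only)."""
    if stat <= 0.0:
        return 1.0
    x = stat / 2.0
    if dof % 2 == 0:
        a, q = 1.0, math.exp(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))
    while a < dof / 2.0 - 1e-9:
        q += math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
        a += 1.0
    return q

def equal_count_bins(values, n_bins):
    """Bin index per value so that bins hold (nearly) equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for pos, i in enumerate(order):
        bins[i] = pos * n_bins // len(values)
    return bins

def grid_p_value(x, y, n_x, n_y):
    """Chi-square test on an n_y x n_x grid of equal-count intervals.

    n_y = 2 splits y at its median (CMD test); n_y = n_x gives an SI test.
    """
    bx, by = equal_count_bins(x, n_x), equal_count_bins(y, n_y)
    n = len(x)
    counts = [[0] * n_x for _ in range(n_y)]
    for c, r in zip(bx, by):
        counts[r][c] += 1
    row_tot = [sum(row) for row in counts]
    col_tot = [sum(row[c] for row in counts) for c in range(n_x)]
    stat = 0.0
    for r in range(n_y):
        for c in range(n_x):
            expected = row_tot[r] * col_tot[c] / n
            stat += (counts[r][c] - expected) ** 2 / expected
    return chi2_sf(stat, (n_x - 1) * (n_y - 1))

def kruskal_wallis_p(x, y, n_x):
    """Kruskal-Wallis test of y across equal-count intervals of x (CL test)."""
    bx, n = equal_count_bins(x, n_x), len(y)
    rank_sum, size = [0.0] * n_x, [0] * n_x
    for pos, i in enumerate(sorted(range(n), key=lambda i: y[i])):
        rank_sum[bx[i]] += pos + 1
        size[bx[i]] += 1
    h = 12.0 / (n * (n + 1)) * sum(rs * rs / m for rs, m in zip(rank_sum, size))
    return chi2_sf(h - 3.0 * (n + 1), n_x - 1)

# Toy data (hypothetical): a U-shaped, nonmonotonic dependence of y on x.
x = [(i + 0.3) / 100 for i in range(100)]
y = [(v - 0.5) ** 2 for v in x]
print("CMD p-value:", grid_p_value(x, y, 5, 2))
print("SI  p-value:", grid_p_value(x, y, 5, 5))
print("CL  p-value:", kruskal_wallis_p(x, y, 5))
```

For the toy data all three p-values are extremely small, so each test detects the pattern even though the relationship is neither linear nor monotonic. Repeating the calculation for every sampled variable and ordering the resulting p-values gives the importance ranking described above.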

Owing to the poor resolution of the regression analyses

in Table 6, WAS_PRES at 10,000 yr was analyzed with

the tests for CMNs, CMDs, CLs and SI (Table 13). For both

random and Latin hypercube sampling, BHPRM was

identified as the dominant variable by all four tests. In

contrast, BHPRM was not identified as being significant by

any of the corresponding regression analyses in Table 6.

Basically, although there is a strong relationship between

BHPRM and WAS_PRES, the nonmonotonic, nonlinear

nature of this relationship (Fig. 5) prevents it from being

identified in a regression analysis with rank-transformed

data.
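The effect can be reproduced with a toy example. The sketch below uses a hypothetical U-shaped response, not the BHPRM and WAS_PRES data, and compares the Spearman rank correlation with the mean of y over five equal-count intervals of x.

```python
def ranks(values):
    """Ranks 1..n by increasing value (assumes distinct values)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for pos, i in enumerate(order):
        r[i] = pos + 1
    return r

def spearman(x, y):
    """Spearman rank correlation without tie correction."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

def interval_means(x, y, n_bins):
    """Mean of y within equal-count intervals of x."""
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for r, v in zip(ranks(x), y):
        b = (r - 1) * n_bins // len(x)
        sums[b] += v
        counts[b] += 1
    return [s / c for s, c in zip(sums, counts)]

# Hypothetical nonmonotonic response: y is large at both ends of the x range.
x = [(i + 0.3) / 100 for i in range(100)]
y = [(v - 0.5) ** 2 for v in x]

rho = spearman(x, y)
means = interval_means(x, y, 5)
print("rank correlation:", round(rho, 3))
print("interval means:  ", [round(m, 4) for m in means])
```

The rank correlation is close to zero, so a rank regression would assign x essentially no importance, while the interval means differ by more than an order of magnitude between the middle and the ends of the range. This is the kind of pattern that the grid-based tests are designed to detect.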

After the identification of BHPRM as the most important

variable, the individual replicates and also the individual

analysis techniques show considerable variability in the

second and subsequent variables selected for both random

and Latin hypercube sampling. The small p-values for the

pooled replicates for random and Latin hypercube sampling

indicate more significant variables than is the case for

the individual replicates. This is in contrast to most of the

results in Tables 3–6, where the individual replicates and

associated pooled replicates produced similar results. This

may indicate that the significance tests for nonrandomness

require larger sample sizes to be effective than


Fig. 5. Scatterplots for HALPRM and BHPRM versus WAS_PRES at 10,000 yr generated with Latin hypercube sampling for replicate R1 (Frames 5a, b) and

replicates R1, R2, R3 pooled (Frames 5c, d).


the significance tests used in conjunction with the regression

analyses in Tables 3–6. However, more investigation is

needed before any conclusions can be safely drawn.

In addition to the four tests illustrated in this section,

many other procedures also exist that might be effective in

the identification of patterns in sampling-based sensitivity

analyses. For example, the two-dimensional Kolmogorov–

Smirnov test has the potential to be a useful technique for

the identification of nonrandom patterns [123–126]. As

another example, techniques developed to identify

randomness in spatial point patterns also have potential for use in

the identification of nonrandom patterns in sampling-based

sensitivity analyses [127–140]. Finally, nonparametric

regression procedures are likely to be effective sensitivity

analysis tools in the presence of significant nonlinear

relationships between analysis inputs and analysis results

[141–145].

A complete variance decomposition of model predictions

is a very appealing approach to sensitivity analysis in the

presence of nonmonotonic relationships between model

inputs and model predictions [36–55]. However, this

approach is complicated to implement and is likely to

require substantially more model evaluations than a

sampling-based approach. Fortunately, publicly available

procedures to implement variance decomposition

calculations are provided as part of the SIMLAB program [146].


Table 13
Sensitivity analysis results based on CMNs, CLs, CMDs and SI for y = WAS_PRES at 10,000 yr with nX = 5 for CMNs, CLs, CMDs and SI and nY = 5 for SI

Variable     CMNᵇ             CLᶜ              CMDᵈ             SIᵉ
nameᵃ        Rank   p-value   Rank   p-value   Rank   p-value   Rank   p-value

Random R1
BHPRM         1.0   0.0000     1.0   0.0000     1.0   0.0000     1.0   0.0000
BPCOMP        2.0   0.0114     2.0   0.0119     3.0   0.0123     2.0   0.0180
BPVOL         3.0   0.0157     4.0   0.0249     4.0   0.0331     8.0   0.3490
WGRCOR        4.0   0.0166     3.0   0.0231     5.5   0.0404     3.0   0.0615
HALPOR        5.0   0.0202     5.0   0.0315     2.0   0.0061     6.0   0.2202

Random R2
BHPRM         1.0   0.0000     1.0   0.0000     1.0   0.0000     1.0   0.0000
ANHBCEXP      2.0   0.0062     3.0   0.0127     3.0   0.0342     5.5   0.1601
WFBETCEL      3.0   0.0094     4.0   0.0173     2.0   0.0073     5.5   0.1601
ANHPRM        4.0   0.0143     2.0   0.0052     6.0   0.0563     2.0   0.0135
WGRCOR        5.0   0.0585     5.0   0.0190    23.0   0.6842     3.0   0.0698

Random R3
BHPRM         1.0   0.0000     1.0   0.0000     1.0   0.0000     1.0   0.0000
ANHPRM        2.0   0.0019     2.0   0.0051     2.0   0.0043     6.0   0.1601
BPCOMP        3.0   0.0042     3.0   0.0069     3.0   0.0111     2.5   0.0415
BPVOL         4.0   0.0107     4.0   0.0237     6.0   0.0566     7.0   0.1655

Random R1, R2, R3
BHPRM         1.0   0.0000     1.0   0.0000     1.0   0.0000     1.0   0.0000
ANHPRM        2.0   0.0001     3.0   0.0001     3.0   0.0049     4.0   0.0180
BPCOMP        3.0   0.0001     2.0   0.0001     2.0   0.0031     3.0   0.0022
WGRCOR        4.0   0.0006     4.0   0.0003     4.0   0.0054     2.0   0.0001

LHS R1
BHPRM         1.0   0.0000     1.0   0.0000     1.0   0.0000     1.0   0.0000
HALPRM        2.0   0.0088     2.0   0.0164     2.0   0.0289    16.0   0.3856
BPCOMP        3.0   0.0611     3.0   0.0571     6.0   0.1074     2.0   0.0100

LHS R2
BHPRM         1.0   0.0000     1.0   0.0000     1.0   0.0000     1.0   0.0000
HALPRM        2.0   0.0027     2.0   0.0056     2.0   0.0156     3.5   0.1010
HALPOR        3.0   0.0147     6.0   0.0643     3.5   0.0289     5.5   0.2202
BPCOMP        4.0   0.0209     3.0   0.0269     3.5   0.0289    18.0   0.5987
SHBCEXP       5.0   0.0243     4.0   0.0426     7.0   0.1074    12.0   0.4530
ANHPRM        6.0   0.0544     5.0   0.0465     5.0   0.0342     2.0   0.0208

LHS R3
BHPRM         1.0   0.0000     1.0   0.0000     1.0   0.0000     1.0   0.0000
WGRMICH       2.0   0.0012     2.0   0.0027     2.0   0.0206     2.0   0.0018
HALPRM        3.0   0.0217     3.0   0.0119    14.0   0.4481     4.0   0.0791

LHS R1, R2, R3
BHPRM         1.0   0.0000     1.0   0.0000     1.0   0.0000     1.0   0.0000
HALPRM        2.0   0.0000     2.0   0.0000     2.0   0.0002     2.0   0.0000
BPCOMP        3.0   0.0004     3.0   0.0005     4.0   0.0146     4.0   0.0029
ANHPRM        4.0   0.0019     4.0   0.0009     3.0   0.0022     5.0   0.0189
HALPOR        5.0   0.0089     5.0   0.0195     7.0   0.0780     9.0   0.1661
WGRCOR       16.0   0.4973    15.0   0.3462    20.5   0.7358     3.0   0.0026

ᵃ Variables for which at least one of the tests (i.e. CMN, CL, CMD, SI) has a p-value less than 0.02; variables ordered by p-values for CMNs.
ᵇ Ranks and p-values for CMNs test with 1 × 5 grid.
ᶜ Ranks and p-values for CLs (Kruskal–Wallis) test with 1 × 5 grid.
ᵈ Ranks and p-values for CMDs test with 2 × 5 grid.
ᵉ Ranks and p-values for SI test with 5 × 5 grid.


9. Discussion

Uncertainty and sensitivity analysis results obtained with

replicated random and LHSs are compared. In particular,

uncertainty and sensitivity analyses were performed for a

large model for two-phase fluid flow with three

independently generated random samples of size 100 each and also

three independently generated LHSs of size 100 each.

For the outcomes under consideration, analyses with

random and LHSs produced similar results. Specifically,

there is little difference in the uncertainty and sensitivity

analysis results obtained with random and LHSs of size 100.

Further, the results obtained with samples of size 100 are

similar to the results obtained for the samples of size 300

that result from pooling the three replicated samples for

each sampling procedure. The results obtained with random


and LHSs in this study are more similar than what has been

observed in several other comparisons [63,72–74].

An important implication of this study is that large

sample sizes are often unnecessary to develop an

understanding of a complex system. This has also been

demonstrated in several other studies with relatively

small, replicated samples [147,148]. A possible analysis

strategy is to initially carry out an analysis with a relatively

small sample size, and then add further sample elements

and associated model evaluations only if the initial analysis

is found to be inadequate.

In considering appropriate sample sizes, it is important to

recognize the distinction between analyses carried out to

assess the effects of subjective (i.e. epistemic) uncertainty,

and analyses carried out to assess the effects of stochastic

(i.e. aleatory) uncertainty. In assessing the effects of

stochastic uncertainty, it is often desired to determine the

likelihood of low probability, but high consequence, events.

The determination of such likelihoods in a naïve sampling-

based analysis requires a very large sample size. However,

in assessing the effects of subjective uncertainty, the goal is

usually to determine general patterns of behavior rather than

likelihoods for specific, low probability behaviors. As a

result, analyses to assess the effects of subjective

uncertainty can be carried out with much smaller sample sizes

than analyses carried out to assess the effects of stochastic

uncertainty. In practice, analyses that must estimate the

likelihood of low probability events typically use some

type of importance sampling procedure (e.g. an event tree)

rather than an unstructured random sampling procedure

[149–157].
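The sample size requirement for low probability events can be made concrete with a short calculation. The numbers below are illustrative and are not taken from the analysis. If an event has probability p for each element of a simple random sample, the sample size n needed to observe the event at least once with probability 0.95 satisfies

```latex
1 - (1 - p)^{n} \ge 0.95
\quad\Longleftrightarrow\quad
n \ge \frac{\ln 0.05}{\ln (1 - p)} \approx \frac{3}{p},
\qquad \text{e.g. } p = 10^{-4} \;\Rightarrow\; n \ge 29{,}956 .
```

This is only the requirement for observing the event once; estimating its likelihood with reasonable precision requires a considerably larger sample.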

Extensive regression-based sensitivity analyses were

carried out for the individual replicated samples. When

these analyses performed poorly, this performance was due

to the inappropriateness of the regression model for the

patterns in the mapping between model input and a model

output rather than to an inadequate sample size. In

particular, employing a more appropriate sensitivity

analysis procedure is more effective than simply increasing the

sample size. Fortunately, a number of procedures exist that

can be used to identify nonrandom patterns in the mapping

between model input and a model output.

The TDCC was found to be an effective procedure for

comparing variable rankings obtained with replicated

samples. Owing to its emphasis on agreement between the

rankings assigned to important variables and deemphasis on

disagreement between the rankings assigned to unimportant

variables, the TDCC was more effective in comparing

variable rankings than KCC. Further, when replicated

samples are available, the TDCC provides the basis for a

sensitivity analysis procedure predicated on the agreement

of importance measures obtained for the individual

replicates.

Although random and Latin hypercube sampling

performed similarly in this analysis, the authors’ preference

remains Latin hypercube sampling for use in analyses of

complex systems with small sample sizes. On the whole, the

enforced stratification over the range of each sampled

variable gives Latin hypercube sampling a desirable

property that should not be given up. In a large analysis

with many inputs and even more outputs, this stratification

should decrease the likelihood of being misled in assessing

the relationships between individual inputs and outputs.
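The stratification property referred to here can be illustrated with a minimal sketch. It assumes independent inputs sampled on the unit interval; in an actual analysis each coordinate would be mapped through the inverse distribution function of the corresponding input, and any correlation control would be applied separately. The function names are illustrative.

```python
import random

def latin_hypercube(n, n_var, rng):
    """n points in the unit cube with one point per equal-probability
    stratum of every variable."""
    columns = []
    for _ in range(n_var):
        strata = list(range(n))
        rng.shuffle(strata)                  # random pairing across variables
        columns.append([(s + rng.random()) / n for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n)]

def occupied_strata(sample, var, n):
    """Number of the n equal-probability strata of one variable that are hit."""
    return len({min(int(pt[var] * n), n - 1) for pt in sample})

rng = random.Random(1)
lhs = latin_hypercube(100, 3, rng)
srs = [tuple(rng.random() for _ in range(3)) for _ in range(100)]

for var in range(3):
    print(var, occupied_strata(lhs, var, 100), occupied_strata(srs, var, 100))
```

Every variable in the Latin hypercube sample covers all 100 strata by construction, whereas a random sample of the same size leaves a substantial fraction of the strata empty for each variable.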

Acknowledgements

Work performed for Sandia National Laboratories

(SNL), which is a multiprogram laboratory operated by

Sandia Corporation, a Lockheed Martin Company, for the

United States Department of Energy under contract

DE-AC04-94AL85000. Review provided at SNL by

M. Chavez, J. Garner, and S. Halliday. Editorial support

provided by F. Puffer, J. Ripple, and K. Best of Tech

Reps, Inc.

References

[1] Wagner RL. Science, uncertainty and risk: the problem of complex

phenomena. APS News 2003;12(1):8.

[2] Oberkampf WL, DeLand SM, Rutherford BM, Diegert KV,

Alvin KF. Error and uncertainty in modeling and simulation. Reliab

Eng Syst Saf 2002;75(3):333–57.

[3] Helton JC, Burmaster DE. Guest editorial: treatment of aleatory and

epistemic uncertainty in performance assessments for complex

systems. Reliab Eng Syst Saf 1996;54(2/3):91–4.

[4] NCRP (National Council on Radiation Protection and

Measurements). A guide for uncertainty analysis in dose and risk assessments

related to environmental contamination. NCRP commentary No. 14.

Bethesda, MD: National Council on Radiation Protection and

Measurements; 1996.

[5] NRC (National Research Council). Science and judgment in risk

assessment. Washington, DC: National Academy Press; 1994.

[6] NRC (National Research Council). Issues in risk assessment.

Washington, DC: National Academy Press; 1993.

[7] US EPA (US Environmental Protection Agency). An SAB report:

multi-media risk assessment for radon, review of uncertainty

analysis of risks associated with exposure to radon. EPA-SAB-

RAC-93-014. Washington, DC: US Environmental Protection

Agency; 1993.

[8] Øvreberg O, Damsleth E, Haldorsen HH. Putting error bars on

reservoir engineering forecasts. J Pet Technol 1992;44(6):732–8.

[9] IAEA (International Atomic Energy Agency). Evaluating the

reliability of predictions made using environmental transfer models.

Safety series No. 100. Vienna: International Atomic Energy Agency;

1989.

[10] Beck MB. Water-quality modeling: a review of the analysis of

uncertainty. Water Resour Res 1987;23(8):1393–442.

[11] Tomovic R, Vukobratovic M. General sensitivity theory. New York:

Elsevier; 1972.

[12] Frank PM. Introduction to system sensitivity theory. New York:

Academic Press; 1978.

[13] Hwang J-T, Dougherty EP, Rabitz S, Rabitz H. The Green’s function

method of sensitivity analysis in chemical kinetics. J Chem Phys

1978;69(11):5180–91.


[14] Dougherty EP, Rabitz H. A computational algorithm for the Green’s

function method of sensitivity analysis in chemical kinetics. Int

J Chem Kinet 1979;11(12):1237–48.

[15] Dougherty EP, Hwang JT, Rabitz H. Further developments and

applications of the Green’s function method of sensitivity analysis in

chemical kinetics. J Chem Phys 1979;71(4):1794–808.

[16] Cacuci DG, Weber CF, Oblow EM, Marable JH. Sensitivity theory

for general systems of nonlinear equations. Nucl Sci Eng 1980;75(1):

88–110.

[17] Cacuci DG. Sensitivity theory for nonlinear systems. I. Nonlinear

functional analysis approach. J Math Phys 1981;22(12):2794–802.

[18] Cacuci DG. Sensitivity theory for nonlinear systems. II. Extensions

to additional classes of responses. J Math Phys 1981;22(12):

2803–12.

[19] Cacuci DG, Schlesinger ME. On the application of the adjoint

method of sensitivity analysis to problems in the atmospheric

sciences. Atmosfera 1994;7(1):47–59.

[20] Lewins J, Becker M, editors. Sensitivity and uncertainty analysis of

reactor performance parameters. Advances in nuclear science and

technology, vol. 14. New York: Plenum Press; 1982.

[21] Rabitz H, Kramer M, Dacol D. Sensitivity analysis in chemical

kinetics. In: Rabinovitch BS, Schurr JM, Strauss HL, editors. Annual

review of physical chemistry, vol. 34. Palo Alto, CA: Annual

Reviews, Inc.; 1983. p. 419–61.

[22] Turanyi T. Sensitivity analysis of complex kinetic systems. Tools

and applications. J Math Chem 1990;5(3):203–48.

[23] Vuilleumier L, Harley RA, Brown NJ. First- and second-order

sensitivity analysis of a photochemically reactive system (a Green’s

function approach). Environ Sci Technol 1997;31(4):1206–17.

[24] Cacuci DG. Sensitivity and uncertainty analysis. Theory, vol. 1.

Boca Raton, FL: Chapman & Hall/CRC Press; 2003.

[25] Hill WJ, Hunter WG. A review of response surface methodology: a

literature survey. Technometrics 1966;8(4):571–90.

[26] Mead R, Pike DJ. A review of response surface methodology from a

biometric viewpoint. Biometrics 1975;31:803–51.

[27] Myers RH. Response surface methodology. Boston, MA: Allyn &

Bacon; 1971.

[28] Morton RH. Response surface methodology. Math Scientist 1983;8:

31–52.

[29] Morris MD, Mitchell TJ. Exploratory designs for computational

experiments. J Stat Plan Inf 1995;43(3):381–402.

[30] Myers RH, Khuri AI, Carter J, Walter H. Response surface

methodology: 1966–1988. Technometrics 1989;31(2):137–57.

[31] Sacks J, Welch WJ, Mitchell TJ, Wynn HP. Design and analysis of

computer experiments. Stat Sci 1989;4(4):409–35.

[32] Bates RA, Buck RJ, Riccomagno E, Wynn HP. Experimental design

and observation for large systems. J R Stat Soc Ser B-Methodol

1996;58(1):77–94.

[33] Andres TH. Sampling methods and sensitivity analysis for large

parameter sets. J Stat Comput Simul 1997;57(1–4):77–110.

[34] Kleijnen JPC. Sensitivity analysis and related analyses: a review of

some statistical techniques. J Stat Comput Simul 1997;57(1–4):

111–42.

[35] Myers RH. Response surface methodology—current status and

future directions. J Qual Technol 1999;31(1):30–44.

[36] Cukier RI, Fortuin CM, Shuler KE, Petschek AG, Schaibly JH. Study

of the sensitivity of coupled reaction systems to uncertainties in rate

coefficients, I. Theory. J Chem Phys 1973;59(8):3873–8.

[37] Schaibly JH, Shuler KE. Study of the sensitivity of coupled reaction

systems to uncertainties in rate coefficients, II. Applications. J Chem

Phys 1973;59(8):3879–88.

[38] Cukier RI, Levine HB, Shuler KE. Nonlinear sensitivity analysis of

multiparameter model systems. J Comput Phys 1978;26(1):1–42.

[39] McRae GJ, Tilden JW, Seinfeld JH. Global sensitivity analysis—a

computational implementation of the Fourier amplitude sensitivity

test (FAST). Comput Chem Eng 1981;6(1):15–25.

[40] Saltelli A, Bolado R. An alternative way to compute Fourier

amplitude sensitivity test (FAST). Comput Stat Data Anal 1998;

26(4):445–60.

[41] Cox DC. An analytic method for uncertainty analysis of nonlinear

output functions, with applications to fault-tree analysis. IEEE Trans

Reliab 1982;R-31(5):465–8.

[42] Sobol’ IM. Sensitivity analysis for nonlinear mathematical models.

Math Model Comput Exp 1993;1(4):407–14.

[43] Jansen MJW, Rossing WAH, Daamen RA. Monte Carlo estimation

of uncertainty contributions from several independent multivariate

sources. In: Grasman J, Van Straten G, editors. Predictability and

nonlinear modeling in natural sciences and economics. Boston:

Kluwer Academic Publishers; 1994. p. 334–43.

[44] McKay MD. Evaluating prediction uncertainty. LA-12915-MS;

NUREG/CR-6311. Los Alamos, NM: Los Alamos National

Laboratory; 1995.

[45] Saltelli A, Sobol’ IM. About the use of rank transformation in

sensitivity analysis of model output. Reliab Eng Syst Saf 1995;50(3):

225–39.

[46] Homma T, Saltelli A. Importance measures in global sensitivity

analysis of nonlinear models. Reliab Eng Syst Saf 1996;52(1):1–17.

[47] Archer GEB, Saltelli A, Sobol’ IM. Sensitivity measures, ANOVA-

like techniques and the use of bootstrap. J Stat Comput Simul 1997;

58(2):99–120.

[48] McKay MD. Nonparametric variance-based methods of assessing

uncertainty importance. Reliab Eng Syst Saf 1997;57(3):267–79.

[49] Jansen MJW. Analysis of variance designs for model output. Comput

Phys Commun 1999;117(1/2):35–43.

[50] McKay MD, Morrison JD, Upton SC. Evaluating prediction

uncertainty in simulation models. Comput Phys Commun 1999;

117(1/2):44–51.

[51] Rabitz H, Alis OF. General foundations of high-dimensional model

representations. J Math Chem 1999;25(2/3):197–233.

[52] Rabitz H, Alis OF, Shorter J, Shim K. Efficient input–output model

representations. Comput Phys Commun 1999;117(1/2):11–20.

[53] Saltelli A, Tarantola S, Chan KP-S. A quantitative model-

independent method for global sensitivity analysis of model output.

Technometrics 1999;41(1):39–56.

[54] Chan K, Saltelli A, Tarantola S. Winding stairs: a sampling tool to

compute sensitivity indices. Stat Comput 2000;10(3):187–96.

[55] Saltelli A, Tarantola S, Campolongo F. Sensitivity analysis as an

ingredient of modeling. Stat Sci 2000;15(4):377–95.

[56] Hasofer AM, Lind NC. Exact and invariant second-moment code

format. J Eng Mech Div, Proc Am Soc Civil Engrs 1974;100(EM1):

111–21.

[57] Rackwitz R, Fiessler B. Structural reliability under combined

random load sequences. Comput Struct 1978;9(5):489–94.

[58] Chen X, Lind NC. Fast probability integration by three-parameter

normal tail approximation. Struct Saf 1983;1(4):169–76.

[59] Wu Y-T, Wirsching PH. New algorithm for structural reliability.

J Eng Mech 1987;113(9):1319–36.

[60] Wu Y-T. Demonstration of a new, fast probability integration

method for reliability analysis. J Eng Ind, Trans ASME, Ser B 1987;

109(1):24–8.

[61] Wu Y-T, Millwater HR, Cruse TA. Advanced probabilistic structural

method for implicit performance functions. AIAA J 1990;28(9):

1663–9.

[62] Schanz RW, Salhotra A. Evaluation of the Rackwitz–Fiessler

uncertainty analysis method for environmental fate and transport

models. Water Resour Res 1992;28(4):1071–9.

[63] McKay MD, Beckman RJ, Conover WJ. A comparison of three

methods for selecting values of input variables in the analysis of

output from a computer code. Technometrics 1979;21(2):239–45.

[64] Iman RL, Conover WJ. Small sample sensitivity analysis techniques

for computer models, with an application to risk assessment.

Commun Stat: Theory Methods 1980;A9(17):1749–842.

[65] Iman RL, Helton JC, Campbell JE. An approach to sensitivity

analysis of computer models, part 1. Introduction, input variable

selection and preliminary variable assessment. J Qual Technol 1981;

13(3):174–83.

[66] Iman RL, Helton JC, Campbell JE. An approach to sensitivity

analysis of computer models, part 2. Ranking of input variables,

response surface validation, distribution effect and technique

synopsis. J Qual Technol 1981;13(4):232–40.

[67] Saltelli A, Marivoet J. Non-parametric statistics in sensitivity

analysis for model output. A comparison of selected techniques.

Reliab Eng Syst Saf 1990;28(2):229–53.

[68] Iman RL. Uncertainty and sensitivity analysis for computer

modeling applications. In: Cruse TA, editor. Reliability

technology—1992, the winter annual meeting of the American Society of

Mechanical Engineers, Anaheim, CA, November 8–13, 1992, vol.

28. New York: American Society of Mechanical Engineers,

Aerospace Division; 1992. p. 153–68.

[69] Helton JC. Uncertainty and sensitivity analysis techniques for use in

performance assessment for radioactive waste disposal. Reliab Eng

Syst Saf 1993;42(2/3):327–67.

[70] Kleijnen JPC, Helton JC. Statistical analyses of scatterplots to

identify important factors in large-scale simulations, 1: review and

comparison of techniques. Reliab Eng Syst Saf 1999;65(2):147–85.

[71] Helton JC, Davis FJ. Sampling-based methods. In: Saltelli A,

Chan K, Scott EM, editors. Sensitivity analysis. New York: Wiley;

2000. p. 101–53.

[72] Helton JC, Davis FJ. Illustration of sampling-based methods for

uncertainty and sensitivity analysis. Risk Anal 2002;22(3):591–622.

[73] Helton JC, Davis FJ. Latin hypercube sampling and the propagation

of uncertainty in analyses of complex systems. Reliab Eng Syst Saf

2003;81(1):23–69.

[74] Iman RL, Helton JC. An investigation of uncertainty and sensitivity

analysis techniques for computer models. Risk Anal 1988;8(1):

71–90.

[75] Ronen Y. Uncertainty analysis. Boca Raton, FL: CRC Press; 1988.

[76] Hamby DM. A review of techniques for parameter sensitivity

analysis of environmental models. Environ Monit Assess 1994;

32(2):135–54.

[77] Saltelli A, Chan K, Scott EM, editors. Sensitivity analysis. New

York: Wiley; 2000.

[78] Frey HC, Patil SR. Identification and review of sensitivity analysis

methods. Risk Anal 2002;22(3):553–78.

[79] Ionescu-Bujor M, Cacuci DG. A comparative review of sensitivity

and uncertainty analysis of large-scale systems—I: deterministic

methods. Nucl Sci Eng 2004;147(3):189–203.

[80] Cacuci DG, Ionescu-Bujor M. A comparative review of sensitivity

and uncertainty analysis of large-scale systems—II: statistical

methods. Nucl Sci Eng 2004;147(3):204–17.

[81] MacDonald RC, Campbell JE. Valuation of supplemental and

enhanced oil recovery projects with risk analysis. J Pet Technol

1986;38(1):57–69.

[82] Breshears DD, Kirchner TB, Whicker FW. Contaminant transport

through agroecosystems: assessing relative importance of

environmental, physiological, and management factors. Ecol Appl 1992;

2(3):285–97.

[83] Breeding RJ, Helton JC, Gorham ED, Harper FT. Summary

description of the methods used in the probabilistic risk assessments

for NUREG-1150. Nucl Eng Des 1992;135(1):1–27.

[84] Ma JZ, Ackerman E, Yang J-J. Parameter sensitivity of a model of

viral epidemics simulated with Monte Carlo techniques. I. Illness

attack rates. Int J Biomed Comput 1993;32(3/4):237–53.

[85] Ma JZ, Ackerman E. Parameter sensitivity of a model of viral

epidemics simulated with Monte Carlo techniques. II. Durations and

peaks. Int J Biomed Comput 1993;32(3/4):255–68.

[86] Whiting WB, Tong T-M, Reed ME. Effect of uncertainties in

thermodynamic data and model parameters on calculated process

performance. Ind Eng Chem Res 1993;32(7):1367–71.

[87] Blower SM, Dowlatabadi H. Sensitivity and uncertainty analysis of

complex models of disease transmission: an HIV model, as an

example. Int Stat Rev 1994;62(2):229–43.

[88] Gwo JP, Toran LE, Morris MD, Wilson GV. Subsurface stormflow

modeling with sensitivity analysis using a Latin-hypercube sampling

technique. Ground Water 1996;34(5):811–8.

[89] Helton JC, Anderson DR, Baker BL, Bean JE, Berglund JW,

Beyeler W, et al. Uncertainty and sensitivity analysis results

obtained in the 1992 performance assessment for the Waste Isolation

Pilot Plant. Reliab Eng Syst Saf 1996;51(1):53–100.

[90] Chan MS. The consequences of uncertainty for the prediction of the

effects of schistosomiasis control programmes. Epidemiol Infect

1996;117(3):537–50.

[91] Kolev NI, Hofer E. Uncertainty and sensitivity analysis of a

postexperiment simulation of nonexplosive melt–water interaction.

Exp Therm Fluid Sci 1996;13(2):98–116.

[92] Sanchez MA, Blower SM. Uncertainty and sensitivity analysis of the

basic reproductive rate: tuberculosis as an example. Am J Epidemiol

1997;145(12):1127–37.

[93] Caswell H, Brault S, Read AJ, Smith TD. Harbor porpoise and

fisheries: an uncertainty analysis of incidental mortality. Ecol Appl

1998;8(4):1226–38.

[94] Hofer E. Sensitivity analysis in the context of uncertainty analysis

for computationally intensive models. Comput Phys Commun 1999;

117(1/2):21–34.

[95] Blower SM, Gershengorn HB, Grant RM. A tale of two futures: HIV

and antiretroviral therapy in San Francisco. Science 2000;287(5453):

650–4.

[96] Cohen C, Artois M, Pontier D. A discrete-event computer model of

feline herpes virus within cat populations. Prev Vet Med 2000;

45(3/4):163–81.

[97] Saltelli A, Andres TH, Homma T. Sensitivity analysis of model

output. An investigation of new techniques. Comput Stat Data Anal

1993;15(2):211–38.

[98] Hamby DM. A comparison of sensitivity analysis techniques. Health

Phys 1995;68(2):195–204.

[99] Helton JC, Anderson DR, Jow H-N, Marietta MG, Basabilvazo G.

Performance assessment in support of the 1996 compliance

certification application for the Waste Isolation Pilot Plant. Risk

Anal 1999;19(5):959–86.

[100] Helton JC, Marietta MG. Special issue: the 1996 performance

assessment for the Waste Isolation Pilot Plant. Reliab Eng Syst Saf

2000;69(1–3):1–451.

[101] US DOE (US Department of Energy). Title 40 CFR part 191

compliance certification application for the Waste Isolation Pilot

Plant. DOE/CAO-1996-2184, vols. I–XXI. Carlsbad, NM: US

Department of Energy, Carlsbad Area Office, Waste Isolation Pilot

Plant; 1996.

[102] US EPA (US Environmental Protection Agency). Criteria for the

certification and re-certification of the Waste Isolation Pilot Plant’s

compliance with the disposal regulations: certification decision; final

rule. Fed Register 1998;63:27353–406.

[103] Vaughn P, Bean JE, Helton JC, Lord ME, MacKinnon RJ,

Schreiber JD. Representation of two-phase flow in the vicinity of

the repository in the 1996 performance assessment for the Waste

Isolation Pilot Plant. Reliab Eng Syst Saf 2000;69(1–3):205–26.

[104] WIPP PA (Performance Assessment) Department. BRAGFLO, version

4.00 & 4.01, user’s manual. Carlsbad, NM: Sandia National

Laboratories. Sandia WIPP Records Center, ERMS #230703; 1996.

[105] Bean JE, Lord ME, McArthur DA, MacKinnon RJ, Miller JD,

Schreiber JD. Analysis package for the Salado flow calculations

(task 1) of the performance assessment analysis supporting

the compliance certification application (CCA). Carlsbad, NM:

Sandia National Laboratories. Sandia WIPP Records Center.

ERMS #240514; 1996.

[106] US EPA (US Environmental Protection Agency). 40 CFR part 191:

environmental standards for the management and disposal of spent

nuclear fuel, high-level and transuranic radioactive wastes; final rule.

Fed Register 1985;50(182):38066–89.

[107] US EPA (US Environmental Protection Agency). 40 CFR part 191:

environmental radiation protection standards for the management

and disposal of spent nuclear fuel, high-level and transuranic

radioactive wastes; final rule. Fed Register 1993;58(242):

66398–416.

[108] US EPA (US Environmental Protection Agency). 40 CFR part 194:

criteria for the certification and re-certification of the Waste Isolation

Pilot Plant’s compliance with the 40 CFR part 191 disposal

regulations; final rule. Fed Register 1996;61(28):5224–45.

[109] Helton JC, Martell M-A, Tierney MS. Characterization of subjective

uncertainty in the 1996 performance assessment for the Waste

Isolation Pilot Plant. Reliab Eng Syst Saf 2000;69(1–3):191–204.

[110] Helton JC. Treatment of uncertainty in performance assessments for

complex systems. Risk Anal 1994;14(4):483–511.

[111] Pate-Cornell ME. Uncertainties in risk analysis: six levels of

treatment. Reliab Eng Syst Saf 1996;54(2/3):95–111.

[112] Helton JC. Uncertainty and sensitivity analysis in the presence of

stochastic and subjective uncertainty. J Stat Comput Simul 1997;

57(1–4):3–76.

[113] Iman RL, Conover WJ. A distribution-free approach to inducing

rank correlation among input variables. Commun Stat: Simul

Comput 1982;B11(3):311–34.

[114] Helton JC, Bean JE, Economy K, Garner JW, MacKinnon RJ,

Miller J, et al. Uncertainty and sensitivity analysis for two-phase

flow in the vicinity of the repository in the 1996 performance

assessment for the Waste Isolation Pilot Plant: undisturbed

conditions. Reliab Eng Syst Saf 2000;69(1–3):227–61.

[115] Helton JC, Bean JE, Economy K, Garner JW, MacKinnon RJ,

Miller J, et al. Uncertainty and sensitivity analysis for two-phase

flow in the vicinity of the repository in the 1996 performance

assessment for the Waste Isolation Pilot Plant: disturbed conditions.

Reliab Eng Syst Saf 2000;69(1–3):263–304.

[116] Iman RL. Statistical methods for including uncertainties associated

with the geologic isolation of radioactive waste which allow for a

comparison with licensing criteria. In: Kocher DC, editor.

Proceedings of the symposium on uncertainties associated with the

regulation of the geologic disposal of high-level radioactive waste.

NUREG/CP-0022; CONF-810372. Washington, DC: US Nuclear

Regulatory Commission, Directorate of Technical Information and

Document Control; 1981. p. 145–57.

[117] Kaplan S, Garrick BJ. On the quantitative definition of risk. Risk

Anal 1981;1(1):11–27.

[118] Iman RL, Conover WJ. The use of the rank transform in regression.

Technometrics 1979;21(4):499–509.

[119] Iman RL, Davenport JM, Frost EL, Shortencarier MJ. Stepwise

regression with press and rank regression (program user’s guide).

SAND79-1472. Albuquerque: Sandia National Laboratories; 1980.

[120] Conover WJ. Practical nonparametric statistics, 2nd ed. New York:

Wiley; 1980.

[121] Iman RL, Davenport JM. Approximations of the critical region of the

Friedman statistic. Commun Stat, Part A—Theory Methods 1980;

9(6):571–95.

[122] Iman RL, Conover WJ. A measure of top-down correlation.

Technometrics 1987;29(3):351–7.

[123] Garvey JE, Marschall EA, Wright RA. From star charts to stoneflies:

detecting relationships in continuous bivariate data. Ecology 1998;

79(2):442–7.

[124] Fasano G, Franceschini A. A multidimensional version of the

Kolmogorov–Smirnov test. Mon Not R Astron Soc 1987;225(1):

155–70.

[125] Gosset E. A 3-dimensional extended Kolmogorov–Smirnov test as a

useful tool in astronomy. Astron Astrophys 1987;188(1):258–64.

[126] Peacock JA. Two-dimensional goodness-of-fit testing in astronomy.

Mon Not R Astron Soc 1983;202(2):615–27.

[127] Assuncao R. Testing spatial randomness by means of angles.

Biometrics 1994;50:531–7.

[128] Ripley BD. Spatial point pattern analysis in ecology. In: Legendre P,

Legendre L, editors. Developments in numerical ecology. NATO

ASI series, series G: ecological sciences, vol. 14. Berlin: Springer;

1987. p. 407–30.

[129] Zeng G, Dubes RC. A comparison of tests for randomness. Pattern

Recognit 1985;18(2):191–8.

[130] Diggle PJ, Cox TF. Some distance-based tests of independence for

sparsely-sampled multivariate spatial point patterns. Int Stat Rev

1983;51(1):11–23.

[131] Byth K. On robust distance-based intensity estimators. Biometrics

1982;38(1):127–35.

[132] Byth K, Ripley BD. On sampling spatial patterns by distance

methods. Biometrics 1980;36(2):279–84.

[133] Diggle PJ. On parameter estimation and goodness-of-fit testing for

spatial point patterns. Biometrics 1979;35(1):87–101.

[134] Diggle PJ. Statistical methods for spatial point patterns in ecology.

In: Cormack RM, Ord JK, editors. Spatial and temporal analysis in

ecology. Fairfield, MD: International Co-operative Pub House; 1979.

p. 95–150.

[135] Besag J, Diggle PJ. Simple Monte Carlo tests for spatial pattern.

Appl Stat 1977;26(3):327–33.

[136] Diggle PJ, Besag J, Gleaves JT. Statistical analysis of spatial point

patterns by means of distance methods. Biometrics 1976;32:

659–67.

[137] Ripley BD. Tests of ‘randomness’ for spatial point patterns. J R Stat

Soc Ser B 1979;41(3):368–74.

[138] Cox TF, Lewis T. A conditioned distance ratio method for analyzing

spatial patterns. Biometrika 1976;63(3):483–91.

[139] Holgate P. The use of distance methods for the analysis of spatial

distribution of points. In: Lewis PAW, editor. Stochastic point

processes: statistical analysis, theory and applications. New York: Wiley; 1972.

p. 122–35.

[140] Holgate P. Tests of randomness based on distance methods.

Biometrika 1965;52(3/4):345–53.

[141] Bowman AW, Azzalini A. Applied smoothing techniques for data

analysis. Oxford: Clarendon; 1997.

[142] Simonoff JS. Smoothing methods in statistics. New York: Springer;

1996.

[143] Hastie TJ, Tibshirani RJ. Generalized additive models. London:

Chapman & Hall; 1990.

[144] Friedman JH, Stuetzle W. Projection pursuit regression. J Am Stat

Assoc 1981;76(376):817–23.

[145] Cleveland WS. Robust locally weighted regression and smoothing

scatterplots. J Am Stat Assoc 1979;74(368):829–36.

[146] Saltelli A, Tarantola S, Campolongo F, Ratto M. Sensitivity analysis

in practice. New York: Wiley; 2004.

[147] Iman RL, Helton JC. The repeatability of uncertainty and sensitivity

analyses for complex probabilistic risk assessments. Risk Anal 1991;

11(4):591–606.

[148] Helton JC, Johnson JD, McKay MD, Shiver AW, Sprung JL.

Robustness of an uncertainty and sensitivity analysis of early

exposure results with the MACCS reactor accident consequence

model. Reliab Eng Syst Saf 1995;48(2):129–48.

[149] Evans M, Swartz T. Approximating integrals via Monte Carlo and

deterministic methods. Oxford: Oxford University Press; 2000.

[150] Hurtado JE, Barbat AH. Monte Carlo techniques in computational

stochastic mechanics. Arch Computat Methods Eng 1998;5(1):3–330.

[151] Nicola VF, Shahabuddin P, Nakayama MK. Techniques for fast

simulation of models of highly dependable systems. IEEE Trans

Reliab 2001;50(3):246–64.

[152] Owen A, Zhou Y. Safe and effective importance sampling. J Am Stat

Assoc 2000;95(449):135–43.

[153] Heidelberger P. Fast simulation of rare events in queueing and

reliability models. ACM Trans Model Comput Simul 1995;5(1):

43–85.

[154] Shahabuddin P. Importance sampling for the simulation of highly

reliable Markovian systems. Manage Sci 1994;40(3):333–52.

[155] Goyal A, Shahabuddin P, Heidelberger P, Nicola VF,

Glynn PW. A unified framework for simulating Markovian

models of highly dependable systems. IEEE Trans Comput

1992;41(1):36–51.

[156] Melchers RE. Search-based importance sampling. Struct Saf 1990;

9(2):117–28.

[157] Glynn PW, Iglehart DL. Importance sampling for stochastic

simulations. Manage Sci 1989;35(11):1367–92.