
Review and Recommendation of Methods for Sensitivity and Uncertainty Analysis for the Stochastic Human Exposure and Dose Simulation (SHEDS) Models

Volume 1: Review of Available Methods for Conducting Sensitivity and Uncertainty Analysis in Probabilistic Models

2004-206-01

Prepared by:
Amirhossein Mokhtari
H. Christopher Frey
Department of Civil, Construction, and Environmental Engineering
North Carolina State University
Raleigh, NC

Prepared for:
Alion Science and Technology
1000 Park Forty Plaza
Durham, NC

June 27, 2005


Preface

This is one of a two-volume series of reports on the review and recommendation of methods for sensitivity and uncertainty analysis for the Stochastic Human Exposure and Dose Simulation (SHEDS) models.

The first volume provides a comprehensive review of methods for sensitivity and uncertainty analysis, with a focus on methods that are relevant to probabilistic models based upon Monte Carlo simulation or similar techniques for propagation of distributions for variability and uncertainty in model inputs in order to estimate variability and uncertainty in model outputs. The methods included in the review are those considered to be "available," interpreted as methods of practical significance rather than all possible methods that have been proposed but not tested in practice. Each method is described, followed by a discussion of its advantages and disadvantages. A framework for selection of sensitivity analysis methods is presented, leading to recommendations for a narrower set of such methods that merit more detailed evaluation.

The second volume proposes and applies a methodology for evaluating the selected sensitivity analysis methods by applying each method to a modeling testbed. The testbed is a simplified version of a typical SHEDS model. A case study scenario was defined that includes multiple time scales (e.g., daily, monthly). Seven sensitivity analysis methods were applied to the testbed: Pearson correlation, Spearman correlation, sample regression, rank regression, Analysis of Variance (ANOVA), the Fourier Amplitude Sensitivity Test (FAST), and Sobol's method. The sensitivity analysis results obtained from these seven methods were compared, and on the basis of these quantitative results, recommendations were made for methods that offer promise for application to SHEDS models. The statistically-based methods are often readily available features of commonly available software packages; however, FAST and Sobol's method are less readily available, so algorithms are presented for these two methods. Furthermore, recommendations are made for additional research and development of sensitivity analysis methods for application to the SHEDS models.
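As context for the statistically-based methods named above, the sketch below shows how sampling-based sensitivity indices such as Pearson and Spearman correlation coefficients can be computed from Monte Carlo samples. The toy model and input distributions are hypothetical assumptions for illustration, not taken from the SHEDS models.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical toy model: output depends linearly on x1 and
# monotonically but nonlinearly on x2; x3 is a weak noise input.
x1 = rng.normal(1.0, 0.3, n)
x2 = rng.uniform(0.0, 2.0, n)
x3 = rng.normal(0.0, 1.0, n)
y = 2.0 * x1 + x2**2 + 0.1 * x3

def pearson(a, b):
    """Sample Pearson correlation coefficient."""
    return np.corrcoef(a, b)[0, 1]

def spearman(a, b):
    """Spearman correlation = Pearson correlation of the ranks."""
    rank = lambda v: np.argsort(np.argsort(v))
    return pearson(rank(a), rank(b))

for name, x in [("x1", x1), ("x2", x2), ("x3", x3)]:
    print(f"{name}: Pearson={pearson(x, y):+.3f}  Spearman={spearman(x, y):+.3f}")
```

Because Spearman correlation operates on ranks, it handles monotonic nonlinearity better than Pearson correlation; neither index captures non-monotonic dependence, a limitation discussed in Section 5.3.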


TABLE OF CONTENTS

1. INTRODUCTION
2. THE SHEDS MODELS
   2.1 Overview of the SHEDS-Pesticides Model
   2.2 Main Characteristics of the SHEDS-Pesticides Model
3. OVERVIEW OF VARIABILITY AND UNCERTAINTY ANALYSIS METHODS
   3.1 Sources of Variability and Uncertainty
      3.1.1 Sources of Variability
      3.1.2 Sources of Uncertainty
   3.2 Methods for Propagation and Quantification of Variability and Uncertainty
      3.2.1 Analytical Propagation Techniques
      3.2.2 Approximation Methods Based Upon Taylor Series
      3.2.3 Numerical Propagation Techniques
   3.3 Comparison of Selected Methods for Propagation of Probability Distributions of Inputs
4. CATEGORIES OF SENSITIVITY ANALYSIS METHODS
   4.1 Screening versus Refined Sensitivity Analysis Methods
   4.2 Local versus Global Sensitivity Analysis Methods
   4.3 Mathematical, Statistical, and Graphical Sensitivity Analysis Methods
5. IDENTIFICATION AND EVALUATION OF SPECIFIC SENSITIVITY ANALYSIS METHODS
   5.1 Nominal Range Sensitivity Analysis
      5.1.1 Description
      5.1.2 Advantages
      5.1.3 Disadvantages
   5.2 Differential Sensitivity Analysis
      5.2.1 Description
      5.2.2 Advantages
      5.2.3 Disadvantages
   5.3 Correlation Analysis
      5.3.1 Description
      5.3.2 Advantages
      5.3.3 Disadvantages
   5.4 Regression Analysis
      5.4.1 Description
      5.4.2 Advantages
      5.4.3 Disadvantages
   5.5 Analysis of Variance
      5.5.1 Description
      5.5.2 Advantages
      5.5.3 Disadvantages
   5.6 Response Surface Method
      5.6.1 Description
      5.6.2 Advantages
      5.6.3 Disadvantages
   5.7 Classification and Regression Trees
      5.7.1 Description
      5.7.2 Advantages
      5.7.3 Disadvantages
   5.8 Fourier Amplitude Sensitivity Test
      5.8.1 Description
      5.8.2 Advantages
      5.8.3 Disadvantages
   5.9 Sobol's Method
      5.9.1 Description
      5.9.2 Advantages
      5.9.3 Disadvantages
   5.10 Mutual Information Index
      5.10.1 Description
      5.10.2 Advantages
      5.10.3 Disadvantages
6. COMPARISON OF SELECTED SENSITIVITY ANALYSIS METHODS
7. SELECTION OF SENSITIVITY ANALYSIS METHODS
   7.1 What Are the Objectives of Sensitivity Analysis?
   7.2 Based upon the Objectives, What Information is Needed from Sensitivity Analysis?
   7.3 What are the Characteristics of the Model that Constrain or Indicate Preference Regarding Method Selection?
   7.4 How Detailed is the Analysis?
   7.5 Is the Implementation of the Selected Sensitivity Analysis Method Post Hoc?
   7.6 Decision Framework to Assist in Selecting Sensitivity Analysis Methods
   7.7 Using the Decision Framework for Selection of a Preferred Set of Sensitivity Analysis Methods for Application to the SHEDS-Pesticides Model
8. SUMMARY
9. REFERENCES


LIST OF FIGURES

Figure 2-1. Schematic Diagram of the SHEDS-Pesticides Simulation Process.
Figure 5-1. An Example of a Non-monotonic Relationship between Output and Input Values with Zero Correlation Coefficient.
Figure 5-2. Schematic Diagram of a Classification and Regression Tree Illustrating Root Node, Internal Nodes, and Terminal Nodes (Leaves).
Figure 7-1. Decision Framework for Selecting an Appropriate Sensitivity Analysis Method.
Figure 7-2. Decision Framework for Selecting an Appropriate Sensitivity Analysis Method for Identifying Key Sources of Variability and Uncertainty and for Model Refinement.


LIST OF TABLES

Table 6-1. Summary of Key Characteristics of Selected Sensitivity Analysis Methods.


1. INTRODUCTION

The purpose of this project is to identify, evaluate, and recommend methods for sensitivity analysis applicable to the Stochastic Human Exposure and Dose Simulation (SHEDS) risk assessment models. The EPA SHEDS models are aggregate, probabilistic, and physically-based human exposure models that simulate variability and uncertainty in cumulative human exposure and dose (EPA, 2000; Price et al., 2003). Variability refers to the heterogeneity of values with respect to time, space, or a population; it cannot be reduced by further measurement. Uncertainty arises from lack of knowledge regarding the true value of a quantity; it can be reduced by further measurements or information (Murphy, 1998; Anderson and Hattis, 1999; Cullen and Frey, 1999).

There are a variety of ways to propagate information about variability or uncertainty through a model and thereby quantify the probability distribution of the model output. Although the focus in this report is on numerical methods based upon Monte Carlo simulation, some analytical and approximation methods will also be discussed.
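As a minimal illustration of the numerical approach, the sketch below propagates assumed input distributions through a hypothetical one-line exposure model by Monte Carlo simulation. The model form, distributions, and parameter values are assumptions for illustration only, not drawn from the SHEDS models.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50_000  # number of Monte Carlo samples

# Assumed (hypothetical) input distributions.
concentration = rng.lognormal(mean=0.0, sigma=0.5, size=n)  # ug/m3
intake_rate = rng.normal(loc=15.0, scale=2.0, size=n)       # m3/day
body_weight = rng.normal(loc=70.0, scale=10.0, size=n)      # kg

# Illustrative exposure model: dose per unit body weight.
dose = concentration * intake_rate / body_weight  # ug/kg-day

# Summarize the output distribution by selected percentiles.
p50, p95, p99 = np.percentile(dose, [50, 95, 99])
print(f"median={p50:.3f}, 95th={p95:.3f}, 99th={p99:.3f} ug/kg-day")
```

With samples of the output in hand, the full distribution, not just a point estimate, is available for summarizing variability in the result.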

Understanding variability can guide the identification of significant subpopulations that merit more focused study. In contrast, knowing the uncertainty in the measurement of characteristics of interest for the population can aid in determining whether additional research or alternative measurement techniques are needed to reduce uncertainty (Cullen and Frey, 1999). Insights from sensitivity analysis can be used for: (1) identification of key sources of uncertainty; (2) identification of key controllable sources of variability; and (3) model refinement, verification, and validation (Frey et al., 2004).

Sensitivity analyses of risk models are used to identify the inputs that matter most with respect to exposure or risk and to aid in developing priorities for risk mitigation and management (Baker et al., 1999; Jones, 2000). Sensitivity analysis can identify important uncertainties; additional data collection or research can then be prioritized based on the key uncertainties in order to reduce the model output uncertainty (Cullen and Frey, 1999). Knowledge of key controllable sources of variability is useful in identifying control measures to reduce exposure or risk.

Sensitivity analysis has been used for verification and validation purposes during the process of model development and refinement (e.g., Kleijnen, 1995; Kleijnen and Sargent, 2000; Fraedrich and Goldberg, 2000). Sensitivity analysis can be used for verification by assessing whether the model output responds appropriately to a change in model inputs, and can support validation by determining the degree to which a model is an accurate representation of the real world. Sensitivity analysis also can be used to evaluate the robustness of model results (e.g., Philips et al., 2000; Ward and Carpenter, 1996; Limat et al., 2000; Manheim, 1998; Saltelli et al., 2000).

Many sensitivity analysis methods are applied in various scientific fields, including engineering, economics, physics, the social sciences, medical decision making, and others (e.g., Baniotopoulos, 1991; Cheng, 1991; Merz et al., 1992; Helton and Breeding, 1993; Beck et al., 1997; Agro et al., 1997; Kewley et al., 2000; Oh and Yang, 2000). Given the myriad of sensitivity analysis methods, there is a need for insight regarding which methods to choose and how to apply the preferred methods.

The key questions addressed in this report include the following:

(1) What are the main characteristics of the SHEDS models that are relevant to the process of choosing appropriate sensitivity analysis methods?
(2) What are the available uncertainty analysis methods?
(3) What are the available sensitivity analysis methods?
(4) What are the key criteria for sensitivity analysis methods applied to the SHEDS models?
(5) How well do different sensitivity analysis methods address the selected criteria?
(6) How can one select among available sensitivity analysis methods?
(7) What are the recommended sensitivity analysis methods for application to the SHEDS models?

Chapter 2 briefly explains the SHEDS models, including their key characteristics. Chapter 3 provides a brief discussion of available uncertainty analysis methods. Chapter 4 describes the classification of sensitivity analysis methods based upon their scope, applicability, and characteristics. Chapter 5 identifies and describes 12 selected available sensitivity analysis methods that are of practical significance. Chapter 6 provides a comparison of the selected methods based on selected criteria. Chapter 7 provides guidance regarding a procedure for selection of a preferred (set of) method(s) for application to the SHEDS models, including a decision framework for method selection. Chapter 8 presents a summary and answers to the key questions.


2. THE SHEDS MODELS

SHEDS is a family of models developed to estimate multimedia, multi-pathway pollutant exposures of general as well as at-risk populations. The SHEDS models are being designed to predict and diagnose complex relationships between pollutant sources and the dose received by different subpopulations (e.g., children and the elderly) (Price et al., 2003). The SHEDS models provide estimates of variability and uncertainty in the predicted exposure distributions and characterize factors influencing high-end exposures. These models address both aggregate (all sources, routes, and pathways for a single chemical) and cumulative (aggregate for multiple chemicals) exposures. The SHEDS models include SHEDS-Pesticides, SHEDS-Wood, SHEDS-Air Toxics, and SHEDS-PM (EPA, 2000). The discussion presented here is based mainly upon SHEDS-Pesticides, because this model has characteristics typical of the SHEDS series of models.

Sections 2.1 and 2.2, respectively, provide an overview of the SHEDS-Pesticides model and briefly explain its main characteristics.

2.1 Overview of the SHEDS-Pesticides Model

The EPA SHEDS-Pesticides model is an aggregate, probabilistic, and physically-based human exposure model that simulates variability and uncertainty in cumulative human exposure and dose to pesticides. The key purposes of the model are: (1) to improve the risk assessment process by predicting both inter-individual variability and the uncertainties associated with population exposure and dose distributions; (2) to improve the risk management process by identifying critical exposure routes and pathways; and (3) to provide a framework for identifying and prioritizing measurement needs and for formulating the most appropriate hypotheses and designs for exposure studies. A schematic diagram of the simulation process for the SHEDS-Pesticides model is given in Figure 2-1.

The SHEDS-Pesticides model predicts, for user-specified cohorts, exposures and doses incurred via eating contaminated foods or drinking water, inhaling contaminated air, touching contaminated surface residues, and ingesting residues from hand-to-mouth activities. The model combines information on pesticide usage, human activity data, environmental residues and concentrations, and exposure and dose factors. The SHEDS-Pesticides model is limited to pesticides and other dislodgeable compounds present on surfaces in residential environments (Price et al., 2003).

For each individual, the SHEDS-Pesticides model can construct daily exposure and dose time profiles for the inhalation, dietary and non-dietary ingestion, and dermal contact exposure routes. The dose profiles are then aggregated across routes to construct an individual one-year profile. Exposure and dose metrics of interest (e.g., peak, time-averaged, time-integrated) are extracted from the individual's profiles, and the process is repeated thousands of times to obtain population distributions. This approach allows identification of the relative importance of routes, pathways, and model inputs. Two-stage Monte Carlo sampling is applied to allow explicit characterization of both variability and uncertainty in model inputs and outputs.

The outputs from the SHEDS-Pesticides model include: (1) the time profile of exposure and dose metrics for a specified post-application time period for an individual; (2) exposure and dose contributions by routes and pathways; and (3) cumulative distribution function (CDF) and box-whisker graphs for aggregate population estimates and uncertainty of percentiles.
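The two-stage Monte Carlo scheme mentioned above can be sketched as follows: an outer loop samples uncertain distribution parameters, and an inner loop samples inter-individual variability given those parameters. The stand-in model, with uncertainty represented by an uncertain population-mean concentration, and all numerical values are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n_uncertainty = 100   # outer loop: realizations of uncertain parameters
n_variability = 2000  # inner loop: simulated individuals per realization

percentile_95 = np.empty(n_uncertainty)
for j in range(n_uncertainty):
    # Outer (uncertainty) stage: sample an uncertain distribution
    # parameter, here the population-median exposure concentration.
    mu = rng.normal(loc=10.0, scale=1.5)

    # Inner (variability) stage: sample inter-individual variability
    # given that realization of the uncertain parameter.
    individual_exposure = rng.lognormal(mean=np.log(max(mu, 0.1)),
                                        sigma=0.4, size=n_variability)

    # Record a population statistic for this uncertainty realization.
    percentile_95[j] = np.percentile(individual_exposure, 95)

# Uncertainty about the variability statistic: an interval for the
# population 95th-percentile exposure across uncertainty realizations.
lo, hi = np.percentile(percentile_95, [2.5, 97.5])
print(f"95th-percentile exposure: {lo:.1f} to {hi:.1f} (95% uncertainty range)")
```

The output is a distribution over population statistics: variability is summarized within each inner loop, and the spread across outer-loop realizations expresses uncertainty about that summary.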

2.2 Main Characteristics of the SHEDS-Pesticides Model

The main characteristics of the SHEDS-Pesticides model include: (1) non-linearity and interaction between inputs; (2) saturation points; (3) different input types (e.g., continuous versus categorical); and (4) aggregation and carry-over effects.

Figure 2-1. Schematic Diagram of the SHEDS-Pesticides Simulation Process.

Non-linearity is a relationship between two variables in which the change in one variable is not simply proportional to the change in the other variable. An example in the SHEDS-Pesticides model is an exponential decay term in the airborne concentration of the pollutant:

    C_i = C_0 × exp(−k × i)    (1)

where:

    C_i = modeled or measured airborne concentration of pesticide in air (µg/m3) on the ith day
    C_0 = initial concentration of pesticide in the air (µg/m3), including background concentration
    k = decay rate (1/day)
    i = day; i = 1, 2, …
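Equation (1) can be evaluated directly; in the minimal sketch below, the values of C_0 and k are assumed solely for illustration.

```python
import math

C0 = 50.0  # initial air concentration (ug/m3), assumed for illustration
k = 0.3    # decay rate (1/day), assumed for illustration

def airborne_concentration(i, c0=C0, decay_rate=k):
    """Equation (1): C_i = C_0 * exp(-k * i) for day i."""
    return c0 * math.exp(-decay_rate * i)

for day in (1, 7, 14):
    print(f"day {day}: {airborne_concentration(day):.2f} ug/m3")
```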

Interaction is a case in which the effect of an input depends on the value of another input. Interaction terms can be introduced in a model in multiplicative forms. A saturation point is an input value above which the model output does not respond to changes in the input; for example, there is an upper limit for possible dermal exposure in the SHEDS-Pesticides model.

Inputs may be qualitative (categorical) (e.g., gender) or quantitative (e.g., pesticide concentration). Quantitative inputs can be continuous (e.g., body weight) or discrete (e.g., number of applications of a pesticide). Some quantitative inputs may be described by empirical distributions, while other inputs may be represented by parametric distributions.

Aggregation refers to situations in which multiple numerical values are combined into one numerical value, such as a sum or mean value. An example is the total exposure (E_Total) in the SHEDS-Pesticides model, which is an aggregate of exposure via the inhalation (E_Inhalation), dermal (E_Dermal), and ingestion (E_Ingestion) pathways:

    E_Total = E_Inhalation + E_Dermal + E_Ingestion    (2)

The carry-over effect is a situation in which the exposure for an individual on each day is a function of exposure on the same day and the prior day(s). For example, dermal exposure via the hands is corrected for the carry-over effect as:

    E_dHand,i = E_f × E_dHand,i−1 + E_dHand,i    (3)

where:

    E_f = fraction of exposure from the prior day considered in the carry-over effect
    E_dHand,i−1 = dermal exposure via hands at time step i−1
    E_dHand,i = dermal exposure via hands at time step i
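The carry-over correction of Equation (3) can be sketched over a sequence of days as follows, assuming (as an interpretation for illustration) that the prior-day term uses the already-corrected value; the daily exposure series and carry-over fraction are hypothetical values.

```python
# Hypothetical uncorrected daily dermal hand exposures (ug) and
# carry-over fraction; values are illustrative only.
raw_exposure = [12.0, 8.0, 0.0, 0.0, 5.0]
E_f = 0.4  # fraction of the prior day's exposure carried over

# Equation (3): E_dHand,i = E_f * E_dHand,i-1 + E_dHand,i,
# applied recursively with the corrected value as the prior-day term.
corrected = []
previous = 0.0  # no carry-over before the first day
for e in raw_exposure:
    value = E_f * previous + e
    corrected.append(value)
    previous = value

print([round(v, 3) for v in corrected])  # → [12.0, 12.8, 5.12, 2.048, 5.819]
```

Note how exposure persists (and decays geometrically) on days 3 and 4, when no new exposure occurs.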

The concepts discussed here regarding key characteristics of the SHEDS-Pesticides model are discussed further in Volume 2 in order to develop a simplified version of the model as a testbed for evaluation case studies. The simplified model has the key advantage of a shorter simulation time compared to the original model.


3. OVERVIEW OF VARIABILITY AND UNCERTAINTY ANALYSIS METHODS

Quantification of variability and uncertainty is playing an increasing role in regulatory decisions. Although the importance of uncertainty analysis in decisions has increased, such analyses often require significant resources of time, money, and expertise. Some regulatory decisions may require a full uncertainty analysis, while others may be based on a less detailed analysis of uncertainties, or on decision tools that do not involve quantitative uncertainty analysis. Uncertainty analysis should be tailored to the specific decision problem, with the required level of detail related to the kind of decision problem posed. To do so, a hierarchy of tools for uncertainty analysis should be considered. The hierarchy proceeds from methods that are relatively fast and require limited resources to methods that need significantly greater resources. The former provide the least detailed understanding of uncertainty, while the latter produce highly detailed quantitative analyses of uncertainty that can be used directly in decision analysis.

The use of probabilistic analysis methods for dealing with variability and uncertainty has been widely recognized for environmental modeling and assessment applications. For example, EPA has applied two-stage Monte Carlo simulation to characterize both variability and uncertainty in human cumulative exposure and dose to pollutants, such as pesticides and particulate matter (PM), in the development of the Stochastic Human Exposure and Dose Simulation (SHEDS) series of models (Zartarian, 2000). In the area of exposure and risk assessment, there have been a number of analyses in which uncertainty analysis was performed, including, for example, Bogen and Spear (1987; 1990), Frey (1992), Hoffman and Hammonds (1996), Cohen et al. (1996), Evans et al. (1994), Greenland et al. (1991), and others. For policy and decision making, there have been a number of analyses in which quantitative uncertainty analysis was used (e.g., Morgan et al., 1984; Morgan and Henrion, 1990; Harrison, 2002).

Sources of variability and uncertainty are briefly explained in the following sections, and typical methods for uncertainty analysis are then discussed.

3.1 Sources of Variability and Uncertainty

Variability and uncertainty are two distinct concepts within a decision problem, even

though they often have been lumped together in environmental analyses. Variability results from

natural stochastic behavior, periodicity, or variance in a trait across a population. In contrast,

uncertainty is related to lack of knowledge about the “true” value of a quantity, lack of

knowledge regarding which of several alternative models best describes a mechanism of interest,

or lack of knowledge about which of several alternative probability density functions should

represent a quantity of interest (Bogan and Spear, 1987; Morgan and Henrion, 1990; Frey, 1992;

NRC, 1994; Cullen and Frey, 1999).

If variability and uncertainty are unaccounted for, the quality of environmental

assessment will be potentially affected. For example, in a human health risk assessment context,

individuals in a population will have differing levels of exposure because of variability in

pgysiology, activity patterns, and temporal and spatial variability in pollutant concentrations.

Failure to account for this variability provides results that do not address the wide range of

exposure values possible, and may lead to overestimation or underestimation of risk. Similarly,

failure to account for uncertainty may lead to assumptions of precision that do not convey the

true state of knowledge. Thus, both variability and uncertainty may have remifications on policy

assessments and the ultimate success of the resulting policies.

Understanding which sources of variability and uncertainty are reducible provides insight into how to most appropriately allocate resources to improve the

certainty in the results of an analysis (Cullen and Frey, 1999). Understanding the sources of

variability and uncertainty is hence critical in identifying their existence and characterizing their

impact.

In the following, sources of variability and uncertainty are discussed.

3.1.1 Sources of Variability

Variability is present in any dataset that is not static in space, time, or across members of

a population. Common sources of variability are stochasticity, periodicity, population variance,

and existence of subpopulations. These sources are briefly discussed based on Cullen and Frey

(1999).

Stochasticity. Stochasticity is random, non-predictable behavior that is common in many

physical phenomena. Stochasticity is irreducible, although it can typically be represented over

time or space with a frequency distribution. Use of averaging periods can also reduce the effect

of stochasticity.

Periodicity. Periodicity implies cyclical behavior. For example, ambient temperature is

cyclical, tending to rise during the day and decrease at night. Periodicity is sometimes addressed

using averaging periods or assessing maximum values. Time series approaches can also be used

to represent periodicity more explicitly (Hamilton, 1994).

Population Variance. Many traits vary across a population of individuals. Examples include human physiology (e.g., height, weight), attitudes (e.g., risk-avoidance, activity pattern),

and activities (e.g., commute distance to work). These traits often can be represented with

frequency distributions.

Existence of Subpopulations. In some cases, a population of data may include distinct

subpopulations that have significantly different distributions for a given trait. If the

subpopulations are aggregated together, the variance of the trait in the total population may be greater than that in the individual subpopulations. If distinct subpopulations can be identified, it

can be advantageous to perform an assessment of each subpopulation separately. For example, in

a risk exposure assessment, an analysis directed toward at-risk subpopulations (e.g., elderly or

children) may provide more precise information than an analysis of the entire population.

3.1.2 Sources of Uncertainty

Uncertainty goes by several names, including epistemic uncertainty, lack-of-knowledge uncertainty, and subjective uncertainty. It is often stated that uncertainty is a

property of the analyst (Cullen and Frey, 1999). Different analysts, with different states of

knowledge or access to different datasets or measurement techniques, may have different levels

of uncertainty regarding the predictions that they make (e.g., NCRP, 1996). Sources of

uncertainty can include problem and scenario specification, model uncertainty, random error,

systematic error, lack of representativeness, lack of empirical basis, and disagreement of experts

(Cullen and Frey, 1999). The first two sources of uncertainty are typically related to structural

uncertainty, while the rest are related to uncertainty in model inputs. These sources are briefly

discussed in the following; however, this report is focused on the subset of sources of uncertainty

related to model inputs.

Problem and Scenario Specifications. Typically, a scenario includes specification

of goals and scope of an environmental assessment such as assessment boundaries, the transport

media, the exposure pathways, and various other decisions about which components to include in

the assessment. However, in some cases a scenario may fail to consider all of the factors and

conditions contributing to variation in the output, and hence, uncertainty can be introduced. This

source of uncertainty typically results in a bias in estimates. Important factors may be omitted

from assessment because of lack of available resources. For example, an air quality model may

limit the number of meteorological episodes due to computational intensity.

Model Uncertainty. Typically, the computer models used in assessment do not capture

all aspects of the problem. There is a trade-off between model formulation and scope of the

assessment. Some sources of uncertainty in modeling include conceptual model, model structure,

mathematical implementation, detail, resolution, and boundary conditions (Frey, 1992; Cullen

and Frey, 1999; Isukapalli, 1999).

The analyst’s conceptual model of the problem may be different from reality. For

example, the analyst may inadvertently omit important inputs, emphasize some inputs, or misrepresent dependence between some inputs.

There may also be competing model formulations that may have advantages or

disadvantages over each other. The alternative formulations may also represent different schools of thought or assumptions.

Model uncertainty due to mathematical implementation of the model may result from

numerical instability. There is also a possibility that coding errors introduce additional uncertainty.

Model detail refers to the degree to which physical and chemical processes are described

within the model. Typically, the assumption is that a model with a greater level of detail should produce results with less uncertainty. However, in some cases more detailed models

may require the introduction of new parameters, the values of which may be uncertain. Thus,

increasing detail may not reduce overall uncertainty or may lead to greater uncertainty.

Model resolution refers to factors such as the time steps and grid sizes used. Decreasing

these parameters can yield more detailed results. For example, a model with a grid size of 5 km

would be expected to replicate observed values more closely than a model with a 20 km grid

size. The trade-off is between the additional computational cost and the greater level of detail of the results, perhaps leading to more accuracy, which requires more detailed input information that may be subject to more uncertainty than more aggregated input information.

The selection of boundary conditions such as background concentrations and initial

conditions might be required for some types of simulation models (e.g., photochemical gridded

air quality models). Boundary conditions may be based on expert judgment, assumed behavior,

or the outputs from other models. Each source can potentially contribute to uncertainty.

Simulation models generally become less sensitive to initial conditions as the simulation time

frame is increased, but they can retain an important sensitivity to boundary conditions at all time

steps.

Random Error. Random error or stochasticity represents random, non-predictable

behavior. This source of uncertainty is associated with imperfections in measurement techniques

or with processes that are random or statistically independent of each other (Bevington and

Robinson, 1992). For example, in a time series analysis, the behavior of a time-dependent trait is

characterized with a function. However, this function does not capture all of the variability.

Typically, an error term with a normal distribution is used to represent the unexplained (random)

behavior (Hamilton, 1994).

Systematic Errors. The mean value of a measured quantity may not converge to the

“true” mean value because of biases in measurements and procedures. Systematic error can be

introduced into a dataset from sources such as the imperfect calibration of equipment,

simplifying or incorrect assumptions, or errors due to the selection and implementation of

methodologies for collecting and utilizing data. Application of surrogate data is another source of

systematic error (Bevington and Robinson, 1992; Mandel, 1969). This occurs when one assumes a simplified model for how a system behaves and then makes measurements

accordingly. Other sources of systematic error include, for example, self-selection biases in

voluntary responses to surveys.

Lack of Representativeness. Data used as input to the assessment may not be completely

representative of the study objectives. For example, emission factor data used to estimate

emissions may have been generated using different kinds of equipment or equipment that was

used under different operating or maintenance conditions. Surrogate data is another example of

non-representative data. When data describing an input to an assessment is unavailable, limited,

or not practical to collect, surrogate data is often used.

Lack of Empirical Basis. This type of uncertainty cannot be treated statistically,

because it requires prediction about something that has yet to be built or measured (Frey, 1992).

This type of uncertainty can be represented using technically-based judgments about the range

and likelihood of possible outcomes (Cullen and Frey, 1999). These judgments may be based on

theoretical foundation or experience with analogous systems. In cases where data exist for

analogous systems, it may be possible to fit probability distribution models to the existing

dataset.

Disagreement of Experts. Expert opinion is used to select values or distributions for

inputs into an assessment. For example, experts may suggest the most appropriate reaction rate

or supply a subjective prior distribution for an input. However, the opinions of experts may differ on these data and distributions. When there are limited data or alternative theoretical bases

for modeling a system, experts may disagree on interpretation of data or on their estimates

regarding the range and likelihood of outcomes for empirical quantities. This disagreement

introduces uncertainty regarding the most appropriate values or distributions to use (Cullen and

Frey, 1999).

3.2 Methods for Propagation and Quantification of Variability and Uncertainty

In order to quantify variability and uncertainty, the distribution of model inputs should be

propagated through the model to obtain distributions of model outputs. Propagation techniques may be analytical, approximation-based, or numerical.

3.2.1 Analytical Propagation Techniques

For simple models in which the output is a linear combination of independent model inputs, the propagation of probability distributions through the model is straightforward. For such cases the Central Limit Theorem (CLT) can be used. The CLT can be stated in a variety of ways; one is the CLT for the sum of independent random variables: the distribution of the sum of independent random variables approaches a normal distribution as the number of random variables becomes large (DeGroot, 1986). No specific shape need be assumed for the probability distribution of each of the variables in the sum. The CLT for the sum of independent variables can be summarized as:

\mu_s = \sum_{i=1}^{n} \mu_{x,i}     (3-1)

where \mu_s is the mean of the sum and the \mu_{x,i} are the means of each of the variables being added together. The variance of the sum is equal to the sum of the variances:

\sigma_s^2 = \sum_{i=1}^{n} \sigma_{x,i}^2     (3-2)

The same approach can be used for models in which the output is a product of independent inputs, by applying a logarithmic transformation.
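As an illustrative sketch, Equations (3-1) and (3-2) can be applied directly in code. The Python example below uses made-up input moments (the numbers are hypothetical, not drawn from the SHEDS models):

```python
def clt_sum_moments(means, variances):
    """Propagate moments of a sum of independent inputs per Equations
    (3-1) and (3-2): the mean of the sum is the sum of the means, and
    the variance of the sum is the sum of the variances."""
    return sum(means), sum(variances)

# Hypothetical example: three independent inputs with known moments.
means = [2.0, 5.0, 1.5]
variances = [0.4, 1.0, 0.25]
mu_s, var_s = clt_sum_moments(means, variances)
```

By the CLT, the output could then be approximated as normal with mean `mu_s` and variance `var_s`, provided no single input dominates the sum.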

Advantages. Analytical propagation techniques based on the CLT are straightforward and easy to apply to simple models in which the model output is a sum or a product of model inputs.

Disadvantages. Although the results of the CLT approach are useful in some cases

for propagating the mean and the variance through a simple linear model, they do not imply

anything about the shape of the model output distribution (Cullen and Frey, 1999). Moreover, the

implications of the CLT are relevant only if the conditions of the CLT exist for a particular

situation. Thus, for a model that contains both products and sums of inputs, in which some of the inputs are dominant over others, or in which some of the inputs are not statistically independent, the analytical propagation techniques based on the CLT cannot be used.

3.2.2 Approximation Methods Based Upon Taylor Series

There are a number of methods based upon the use of the Taylor series expansions for

propagating the mean and other central moments of random variables through a model. The basic

approach is to take a general function, such as:

y = h(x_1, x_2, \ldots, x_n)     (3-3)

and then expand the function about the point [E(x1), E(x2), …, E(xn)] using a multivariate Taylor

series expansion. The series is usually truncated at a specified set of higher order terms. For

example, the mean of the output distribution, E(y), can be approximated by the following Taylor

series expansion (Hahn and Shapiro, 1967):

E(y) = h[E(x_1), E(x_2), \ldots, E(x_n)] + \frac{1}{2} \sum_{i=1}^{n} \frac{\partial^2 h}{\partial x_i^2} \sigma_{x_i}^2 + \{\text{Higher Order Terms}\}     (3-4)

The function h and the partial derivatives on the right side are evaluated at the point [E(x_1), E(x_2), \ldots, E(x_n)]. The variance of the model output, \sigma_y^2, for statistically independent random variables is approximated by the following:

\sigma_y^2 = \sum_{i=1}^{n} \left(\frac{\partial h}{\partial x_i}\right)^2 \sigma_{x_i}^2 + \sum_{i=1}^{n} \frac{\partial h}{\partial x_i} \frac{\partial^2 h}{\partial x_i^2} \mu_3(x_i) + \{\text{Higher Order Terms}\}     (3-5)

where, \mu_3(x_i) is the third central moment of each input random variable.
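For the single-input case, the truncated Equations (3-4) and (3-5) reduce to moment arithmetic that can be sketched in a few lines of Python. The derivatives, the example function y = exp(x), and the input moments below are all illustrative choices, not part of the report:

```python
import math

def taylor_moments(h, dh, d2h, mu, var, mu3=0.0):
    """Truncated Taylor-series approximation of the output mean
    (Eq. 3-4) and variance (Eq. 3-5) for a single input, given the
    function h, its first and second derivatives, and input moments."""
    e_y = h(mu) + 0.5 * d2h(mu) * var
    var_y = dh(mu) ** 2 * var + dh(mu) * d2h(mu) * mu3
    return e_y, var_y

# Hypothetical input: y = exp(x), x with mean 0, variance 0.04, and a
# symmetric distribution (third central moment mu3 = 0).
e_y, var_y = taylor_moments(math.exp, math.exp, math.exp, 0.0, 0.04)
```

Note that only moments of the input are propagated; nothing is learned about the tails of the output distribution, which is the third disadvantage discussed below.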

Advantages. Based upon a sufficient number of central moments for a model output, it

may be possible to select a parametric probability distribution model that provides a good

representation of the output distribution (Cullen and Frey, 1999). Once a parametric distribution

of the output is specified, predictions can be made regarding any percentile of the model output.

Thus, as an advantage of approximation methods based upon Taylor series, it may only be

necessary to propagate the moments of each probability distribution of the model inputs instead

of the entire probability distributions.

Disadvantages. Approximation methods based on Taylor series typically have

three major limitations (Cullen and Frey, 1999). First, as a primary limitation in application of

these techniques, the model function should be differentiable. Therefore, these methods cannot

be applied to problems with discrete or discontinuous behaviors. Second, these methods are

computationally intensive as they typically require the evaluation of second order (and

potentially higher) derivatives of the model. Third, although these techniques are capable of propagating central moments of input distributions, information regarding the tails of the input

distributions cannot be propagated. In environmental exposure and risk assessment problems

where the shape of the tails is critical, this limitation can be problematic.

3.2.3 Numerical Propagation Techniques

The most common techniques for numerical propagation of uncertainty and variability

are sampling based methods. Some of the sampling based methods for propagating probability distributions are: (1) Monte Carlo and Latin Hypercube Sampling methods; (2) the Fourier Amplitude Sensitivity Test (FAST); and (3) reliability based methods.

Monte Carlo Simulation. In Monte Carlo simulation, a model is run repeatedly, using

different values for each of the uncertain input parameters each time (Hahn and Shapiro, 1967;

Ang and Tang, 1984; Morgan and Henrion, 1990). The values of each of the uncertain inputs are

generated based on the probability distribution for the input. With many input variables, one can

envision Monte Carlo simulation as providing a random sampling from a space of m dimensions,

where m is the number of inputs to a model.

As a general approach for applying Monte Carlo simulation to a model, for each input a

probability distribution should be specified. Random samples are generated from each of the

probability distributions. One sample from each input distribution is selected, and the set of

samples is fed into the model. The model is then executed as it would be for any deterministic

analysis. The process is repeated until the specified number of model iterations has been

completed. Thus, instead of obtaining a single number for model outputs as in a deterministic

simulation, a set of samples is obtained. These can be represented as CDFs and summarized

using typical statistics such as mean and variance.

Most numerical simulation methods, including random Monte Carlo, require the

generation of uniformly distributed random numbers between 0 and 1 (Cullen and Frey, 1999).

Given a uniformly distributed random variable, several methods exist from which to simulate

random variables that are described by other probability distributions (e.g., normal, lognormal,

and gamma). These methods include the inverse transform, composition, and function of random

variables (e.g., Ang and Tang, 1984). In addition, methods exist for simulation of jointly

distributed random variables, which enables one to represent correlations between two or more

simulated random variables.
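The procedure above, including the inverse transform method, can be sketched in Python. The two-input model, its input distributions, and the sample size below are hypothetical and purely illustrative:

```python
import math
import random
import statistics

def monte_carlo(model, inverse_cdfs, n_iter, seed=0):
    """Random Monte Carlo: each iteration draws one uniform(0, 1)
    percentile per input, maps it through that input's inverse CDF
    (inverse transform method), and evaluates the model once."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_iter):
        xs = [inv(rng.random()) for inv in inverse_cdfs]
        outputs.append(model(*xs))
    return outputs

# Hypothetical model: output = c * q, with a lognormal c (log-scale
# standard deviation 0.5) and a uniform(10, 25) q.
inv_c = lambda u: math.exp(0.5 * statistics.NormalDist().inv_cdf(u))
inv_q = lambda u: 10.0 + 15.0 * u
outputs = monte_carlo(lambda c, q: c * q, [inv_c, inv_q], 5000)
mean_out = statistics.fmean(outputs)
```

The list of output samples can then be summarized as a CDF or by its mean and variance, exactly as described above.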

• Advantages. Advantages of Monte Carlo methods in propagating probability

distributions of inputs flow from the fact that their output provides more

information compared to analytical and approximate methods. Moreover,

because Monte Carlo methods provide a probability distribution of the output,

they avoid the problem of compounding conservative values of input variables

(Burmaster and Harris, 1993). Additional advantages follow from the information

provided by Monte Carlo simulation. For example, results based on Monte Carlo

simulations are typically conducive to sensitivity analysis, permitting the risk

assessors to determine where additional data will be most useful in reducing

uncertainty (Finley and Paustenbach, 1994).

• Disadvantages. Because Monte Carlo simulation requires multiple iterations of a

model, such simulations can be computationally intensive if the model requires a

large run time per simulation. Furthermore, depending on the data quality

objectives of the analysis, it may be necessary to perform a large number of

simulations. Although not a limitation of the method itself, in practice results

based on Monte Carlo simulations are easy to misuse by stretching them beyond

the limits of credibility. For example, problems can arise when inexperienced

analysts use commercial simulation packages due to ease of application and lack

of familiarity with underlying assumptions and restrictions. Typical

misapplications of Monte Carlo simulation include failure to properly develop

input distributions and misinterpretation or over-interpretation of results. For

example, it is not possible to have a precise estimate of an upper percentile of an

output distribution without a large simulation sample size.

Latin Hypercube Sampling Methods. An alternative to random Monte Carlo

simulation is Latin Hypercube Sampling (LHS) (McKay et al., 1979). In LHS methods, the

percentiles that are used as inputs to the inverse CDF of each input are not randomly generated.

Instead, the probability distribution for the random variable of interest is first divided into ranges

of equal probability, and one sample is taken from each equal probability range. However, the

order of samples is typically random over the course of the simulation, and the pairing of

samples between two or more random input variables is usually treated as independent. In

median LHS, one sample is taken from the median of each equal-probability interval, while in

random LHS one sample is taken at random within each interval (Morgan and Henrion, 1990).
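The stratification step can be sketched as follows for a single input. The uniform(0, 10) input and the choice of ten strata are illustrative assumptions:

```python
import random

def latin_hypercube(inverse_cdf, n, median=True, seed=0):
    """Latin hypercube sample for one input: divide [0, 1] into n
    equal-probability strata, take one percentile from each stratum
    (the stratum midpoint for median LHS, a random point within the
    stratum for random LHS), then shuffle so the sample order is
    random over the course of the simulation."""
    rng = random.Random(seed)
    percentiles = [(i + (0.5 if median else rng.random())) / n
                   for i in range(n)]
    rng.shuffle(percentiles)
    return [inverse_cdf(p) for p in percentiles]

# Uniform(0, 10) input: median LHS with 10 strata hits each decile once.
sample = latin_hypercube(lambda u: 10.0 * u, 10)
```

For multiple inputs, one such column is generated per input and the columns are paired after independent shuffles, which is the random pairing described above.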

• Advantages. LHS typically has the same advantages as Monte Carlo simulation.

Furthermore, LHS ensures better coverage of the entire range of the distribution.

Because the distributions are more evenly sampled over the entire range of the

probable values in LHS, the number of samples required to adequately represent a distribution is less for LHS compared to random Monte Carlo simulation (McKay et

al., 1979; Iman and Helton, 1988; Morgan and Henrion, 1990). Moreover, compared

to Monte Carlo simulation, LHS reduces the statistical fluctuation in simulations of

random variables for a given sample size (Cullen and Frey, 1999).

• Disadvantages. Because LHS is only random in pairing of values of the input

random variables, it is not practical to use LHS to characterize the effect of statistical

sampling error using the bootstrap technique (Cullen and Frey, 1999). There are

specific situations in which median LHS cannot provide correct results, such as when

sampling from a periodic function if the sample intervals are spaced with the same

period (Morgan and Henrion, 1990), but this is a rare problem in practice.

Fourier Amplitude Sensitivity Test. FAST has been developed for uncertainty and sensitivity analysis (Cukier et al., 1973, 1975, 1978; Schaibly and Shuler, 1973). The FAST method applies a functional transformation to each input, assigns each input a distinct integer frequency, and introduces a common independent variable to all inputs (Cukier et al., 1978; McRae et al.,

1982). The inputs vary simultaneously with this independent variable in such a way that output

becomes a periodic function of the independent parameter. Fourier analysis is performed on the

output, which produces Fourier amplitudes for each frequency. FAST provides a way to estimate

the expected value and variance of the output variable and the contribution of individual factors

to this variance (Saltelli et al., 2000). Further explanation of FAST when applied as a sensitivity analysis method, including advantages and disadvantages, is given in Section 5.8.

Reliability Based Methods (FORM and SORM). First-order reliability methods, also known as FORM, estimate the probability of an event under consideration. FORM can provide the probability that an output exceeds a specific value, also known as the probability of failure (Karamchandani and Cornell, 1992; Lu et al., 1994; Hamed et al., 1995). The failure probability

can be expressed by:

P_f = P(Z(X) \le 0) = \int_{Z(X) \le 0} f_X(\xi) \, d\xi     (3-6)

where, Z(X) is the output function and X is a set of k random variables (x1, x2, …, xk). If Z(X) is

a deterministic function of the random variables, X, it can be linearized in the neighborhood of

some point x^* = (x_1^*, x_2^*, \ldots, x_k^*) as:

Z(X) = Z(x^*) + \sum_{i=1}^{k} \frac{\partial Z}{\partial x_i} (x_i - x_i^*) + \{\text{Higher Order Terms}\}     (3-7)

The point x* is chosen as the design point, which is defined as the point with greatest probability

density, satisfying Z(x*) = 0. This point is found by an optimization procedure. For the case of

independent inputs and by neglecting higher order terms, the output mean and variance can be

estimated as:

\mu_Z \approx Z(x^*) + \sum_{i=1}^{k} (\mu_i - x_i^*) \frac{\partial Z(x^*)}{\partial x_i}     (3-8)

\sigma_Z^2 \approx \sum_{i=1}^{k} \left(\frac{\partial Z}{\partial x_i}\right)^2 \sigma_i^2     (3-9)

where, \mu_i and \sigma_i^2 are the mean and variance of input x_i.
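Equations (3-8) and (3-9) can be sketched in Python for independent inputs. The use of central finite differences for the partial derivatives and the linear example function below are implementation assumptions for illustration only:

```python
def form_moments(Z, x_star, mus, sigmas, h=1e-6):
    """First-order (FORM-style) estimates of the output mean (Eq. 3-8)
    and variance (Eq. 3-9) for independent inputs, linearizing Z about
    the point x* using central finite differences for the partials."""
    k = len(x_star)
    grads = []
    for i in range(k):
        xp, xm = list(x_star), list(x_star)
        xp[i] += h
        xm[i] -= h
        grads.append((Z(xp) - Z(xm)) / (2.0 * h))
    mu_Z = Z(list(x_star)) + sum(
        g * (m - xs) for g, m, xs in zip(grads, mus, x_star))
    var_Z = sum(g ** 2 * s ** 2 for g, s in zip(grads, sigmas))
    return mu_Z, var_Z

# Hypothetical linear function Z = 3*x1 - 2*x2, for which the
# first-order expansion is exact (no higher-order error).
Z = lambda x: 3.0 * x[0] - 2.0 * x[1]
mu_Z, var_Z = form_moments(Z, [1.0, 1.0], mus=[2.0, 1.0],
                           sigmas=[0.5, 0.25])
```

For a nonlinear Z, the same call yields only a first-order approximation, which is the finite-error limitation noted under Disadvantages below.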

Various methods have been suggested to improve the accuracy of FORM calculations

and to give a rough estimate of the quality of the approximation. SORM uses second-order terms in estimating the mean and variance of the output, and hence provides more accuracy (Fiessler et al., 1979).

• Advantages. FORM and SORM typically require only knowledge of the moments of component reliabilities; that is, no distribution function must be specified. Moreover,

generation of random numbers is not required; therefore, there is no sampling error in

propagation. Finally, they can be applied to dependent as well as independent inputs,

although the equations for dependent variables would be more difficult to derive due to

their complexity.

• Disadvantages. FORM and SORM are approximate, and a finite error is associated with the use of only first-order and second-order terms, respectively. Furthermore, the accuracy

of these methods is not readily quantifiable.

3.3 Comparison of Selected Methods for Propagation of Probability Distributions of Inputs

The ideal method(s) for propagation of probability distributions of inputs should: (1) not depend on the functional form of the model; (2) provide insight regarding the entire range

of the output distribution; (3) require fewer iterations of the model; (4) not only propagate the

probability distributions, but also provide insights regarding sensitivity of the model output to

inputs; and (5) be typically available in commonly used software packages.

Based on the discussions provided for advantages and disadvantages of typical methods

for propagation of probability distributions of inputs, sampling based numerical methods

including Monte Carlo simulation and LHS are preferred. These methods can be used for a wide

variety of models and can accommodate a wide variety of assumptions regarding input

distributions. Furthermore, these methods enable characterization of the probability distribution

for the model output. The sample values generated as part of a Monte Carlo or Latin hypercube

simulation can be used as the basis for sensitivity analysis with a wide variety of sensitivity

analysis methods. Both of these methods are commonly available and are widely used. The

main potential disadvantage is the need for repeated iterations of a model. As computing power

increases, this limitation typically decreases in practice.

4. CATEGORIES OF SENSITIVITY ANALYSIS METHODS

Sensitivity analysis methods can be classified in alternative ways based upon their scope,

applicability, and characteristics. For example, methods are classified as: (1) screening versus

refined depending upon the level of detail or sophistication; (2) local or global depending upon

whether they measure sensitivity at a specific point in the input domain or over the entire input

domain when many inputs are varying simultaneously (Saltelli et al., 2000); and (3)

mathematical, statistical, or graphical based upon characteristics (Frey and Patil, 2002). These

classifications are briefly explained in Sections 4.1 to 4.3.

4.1 Screening versus Refined Sensitivity Analysis Methods

Screening methods are typically used to make a preliminary identification of sensitive

inputs. A sensitive input is any input that has a significant influence on the variation of the output

(Saltelli et al., 2000). An input is more sensitive than another if it produces a larger contribution

to the desired measure of sensitivity, such as contribution to variance in an output (Hofer, 1999;

Helton and Davis, 2002) or association with high-end values of particular concern (Mishra et al.,

2003). However, screening methods are often relatively simple and may not be robust to key

model characteristics such as nonlinearity, thresholds, interactions, and different types of inputs.

Refined methods that can adequately deal with complex model characteristics typically

require greater expertise and resources to implement and interpret. Refined methods are

preferred if the results will be used to make decisions regarding commitments of large amounts

of resources for further model development, data collection for model inputs, or risk

management strategies (Frey et al., 2004).

4.2 Local versus Global Sensitivity Analysis Methods

Local sensitivity analysis methods concentrate on the impact of changes in values of

inputs with respect to a specific point in the input domain. An example is the use of partial

derivatives, possibly normalized by the nominal value of the input or by its standard deviation.

Local sensitivity analysis methods have been used for sensitivity analysis of large systems of

differential equations (Cacuci, 1981 a,b; Oblow et al., 1986). Nominal range sensitivity analysis

(NRSA) and differential sensitivity analysis (DSA) methods are examples of local sensitivity

analysis methods.
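As a simple sketch of the NRSA idea, each input can be swung individually from its low to its high value while the other inputs are held at nominal values; the model, inputs, and ranges below are hypothetical:

```python
def nominal_range_sensitivity(model, nominal, ranges):
    """Nominal range sensitivity analysis: vary one input at a time
    from its low to its high value while holding the other inputs at
    their nominal values, and record the resulting output swing."""
    swings = {}
    for name, (lo, hi) in ranges.items():
        x_lo = dict(nominal)
        x_lo[name] = lo
        x_hi = dict(nominal)
        x_hi[name] = hi
        swings[name] = model(x_hi) - model(x_lo)
    return swings

# Hypothetical model with three inputs and illustrative nominal ranges.
model = lambda x: x["a"] * x["b"] + x["c"]
nominal = {"a": 2.0, "b": 3.0, "c": 1.0}
ranges = {"a": (1.0, 3.0), "b": (2.0, 4.0), "c": (0.0, 2.0)}
swings = nominal_range_sensitivity(model, nominal, ranges)
```

Because each swing is evaluated at a single point with the other inputs fixed, the result is local in exactly the sense described above and may miss interactions among inputs.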

Global sensitivity analysis apportions variation in the output to variation in the inputs,

described typically by probability distribution functions that cover domains of inputs (Saltelli et

al., 2000). Global methods are applicable to situations in which model inputs are varied

simultaneously over large ranges of values, typically based upon probability distributions

assigned to each input. Examples of commonly used global sensitivity analysis methods are

regression-based techniques (e.g., Helton, 1993), variance decomposition methods (e.g., Saltelli

et al., 1999), and scatter plots (e.g., Kleijnen and Helton, 1999; Saltelli et al., 2000; Frey and

Patil, 2002).

4.3 Mathematical, Statistical, and Graphical Sensitivity Analysis Methods

Mathematical methods typically address the local or linear sensitivity of the output to

perturbations or ranges of individually varied inputs and are helpful in eliminating unimportant

inputs. However, they may not be reliable as a method for ranking and discriminating among

important inputs. Furthermore, mathematical methods do not address the variance in the output

due to the variance in the inputs. The mathematical methods evaluated here include NRSA and

DSA.


Statistical methods involve running simulations in which inputs are assigned probability

distributions and depending upon the method, one or more inputs are varied at a time. Examples

of statistical sensitivity analysis methods include regression-based techniques that have been

widely used for sensitivity analysis in different scientific disciplines (Whiting et al., 1993; Gwo

et al., 1996; Kolev and Hofer, 1996; Hofer, 1999). Such methods involve generation and

exploration of a mapping from the model input domain to the model simulation results. Among

various available regression-based methods, Pearson (sample) correlation analysis (PCA),

Spearman (rank) correlation analysis (SCA), sample standardized regression analysis (SRA),

rank regression analysis (RRA), and analysis of variance (ANOVA) are selected for evaluation

with respect to their applicability to the SHEDS models.

Besides these common statistical methods, statistical methods evaluated here also

include: Fourier amplitude sensitivity test (FAST), Sobol’s method, mutual information index

(MII), response surface method (RSM), and classification and regression trees (CART). FAST

and Sobol’s method are variance-based methods and are able to provide a total sensitivity index

for each input. This is a key advantage in the case of non-additive models with several inputs

(Saltelli et al., 2000). CART provides a different measure of importance by identifying inputs

and the ranges of variation that correspond to low or high exposure and risk.

Graphical methods give a representation of sensitivity in the form of graphs, charts, or

surfaces and typically are used to give visual identification of how an output is affected by

variation in inputs (e.g., Geldermann and Rentz, 2001; Saltelli, 2000). Graphical methods can

reveal complex dependencies between inputs and output (e.g., McCamley and Rudel, 1995), and

hence, can be used as screening techniques. Graphical methods can be used to complement the


results of mathematical and statistical methods for better representation (e.g., Stiber et al., 1999;

Critchfield and Willard, 1986).

Selected sensitivity analysis methods are briefly explained in Chapter 5, and the strengths and limitations of these methods are discussed.


5. IDENTIFICATION AND EVALUATION OF SPECIFIC SENSITIVITY ANALYSIS METHODS

This chapter presents a brief discussion of selected sensitivity analysis methods that can be applied to the SHEDS models. Key advantages and disadvantages of each method are highlighted. These methods include nominal

range sensitivity analysis (NRSA), differential sensitivity analysis (DSA), Pearson and Spearman

correlation analyses, sample and rank regression analyses, analysis of variance (ANOVA),

response surface method, classification and regression trees (CART), Fourier amplitude

sensitivity test (FAST), Sobol’s method, and mutual information index (MII).

5.1 Nominal Range Sensitivity Analysis

NRSA is also known as local sensitivity analysis or threshold analysis (Cullen and Frey,

1999; Critchfield and Willard, 1986). This method is applicable to deterministic models. A

typical use of NRSA is as a screening analysis to identify the most important inputs to propagate

through a model in a probabilistic framework (Cullen and Frey, 1999). NRSA can be used to

prioritize data collection needs as demonstrated by Salehi et al. (2000).

5.1.1 Description

NRSA evaluates the effect on the model output of varying only one of the model inputs across its entire range of plausible values, while holding all other inputs at their nominal values (Cullen and Frey, 1999). The sensitivity can be represented as a positive or negative percentage

change compared to the nominal solution (Morgan and Henrion, 1990). The analysis can be

repeated for any number of individual model inputs. The sensitivity index is calculated as

follows:

Sensitivity = (Output_{input max} − Output_{input min}) / Output_{input nominal}     (5-1)


The results of NRSA are most valid when applied to a linear model. In such cases, it would be

possible to rank order the relative importance of each input based upon the magnitude of the

calculated sensitivity measure as long as the ranges assigned to each sensitive input are accurate.

However, for a non-linear model, the sensitivity of the output to a given input may depend on

interactions with other inputs, which are not considered. Thus, the results of NRSA are

potentially misleading for nonlinear models.
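As an illustration of Equation 5-1, the index can be sketched as below. The multiplicative exposure model, input names, and ranges here are hypothetical assumptions chosen only for illustration, not taken from SHEDS.

```python
def nrsa_index(model, nominal, name, low, high):
    """Nominal range sensitivity index (Eq. 5-1): vary one input across
    its plausible range while all other inputs stay at nominal values."""
    point = dict(nominal)
    point[name] = high
    out_max = model(**point)
    point[name] = low
    out_min = model(**point)
    return (out_max - out_min) / model(**nominal)

# Hypothetical illustrative model; names, nominal values, and ranges
# are assumptions for this sketch only.
def model(conc, intake, weight):
    return conc * intake / weight

nominal = {"conc": 2.0, "intake": 1.5, "weight": 70.0}
ranges = {"conc": (0.5, 5.0), "intake": (0.5, 3.0), "weight": (50.0, 90.0)}
for name, (low, high) in ranges.items():
    print(name, round(nrsa_index(model, nominal, name, low, high), 3))
```

Note that the index for weight is negative, reflecting the negative percentage change described above: the output decreases as that input increases.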

5.1.2 Advantages

NRSA is a relatively simple method that is easily applied. It works well with linear

models and when the analyst has a good idea of plausible ranges that can be assigned to each

selected input. The results of this approach can be used to rank order key inputs only if there are

no significant interactions among the inputs, and if ranges are properly specified for each input.

5.1.3 Disadvantages

NRSA addresses only a potentially small portion of the possible space of input values,

because interactions among inputs are difficult to capture (Cullen and Frey, 1999). Potentially

important combined effects on the decision (or output) due to simultaneous changes in a few or

all inputs together are not shown by NRSA for other than linear models; thus for nonlinear

models NRSA cannot provide a reliable rank ordering of key inputs (Frey et al., 2003).

5.2 Differential Sensitivity Analysis

Differential Sensitivity Analysis (DSA) is a local sensitivity analysis method. It is most

applicable for calculating the sensitivity of the output to small deviations in the point estimate of

an input.


5.2.1 Description

In DSA the local sensitivity is calculated at one or more points in the parameter space of

an input keeping other inputs fixed. The sensitivity index is calculated based on a finite

difference method. DSA is performed with respect to some point x in the domain of the model.

A small perturbation ∆x with respect to the point value of a model input, such as a change of plus

or minus one percent, can be used to evaluate the corresponding change in the model output.

Thus, the sensitivity index may be calculated as:

Sensitivity = (Output_{x+∆x} − Output_{x−∆x}) / Output_{x}     (5-2)

A more generalized form of DSA is automatic differentiation (AD). AD is an automated procedure for calculating local sensitivities for large models (Griewank, 2000). In

AD the local sensitivity is calculated at one or more points in the parameter space of the model.

At each point, the partial derivative of the model output with respect to a selected number of

inputs is evaluated. The values of partial derivatives are a measure of local sensitivity. Automatic

differentiation has been applied to models that involve complex numerical differentiation

calculations such as partial derivatives, integral equations, and mathematical series (Hwang et

al., 1997).
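A minimal sketch of the finite difference index in Equation 5-2, using a one percent perturbation as suggested above; the two-input model and its names are hypothetical assumptions.

```python
def dsa_index(model, point, name, rel_step=0.01):
    """Local sensitivity by central finite difference (Eq. 5-2):
    (Output(x+dx) - Output(x-dx)) / Output(x), with dx a 1% perturbation."""
    up, down = dict(point), dict(point)
    dx = rel_step * point[name]
    up[name] += dx
    down[name] -= dx
    return (model(**up) - model(**down)) / model(**point)

# Hypothetical model for illustration only.
def model(a, b):
    return a ** 2 + 3.0 * b

point = {"a": 2.0, "b": 1.0}
for name in point:
    print(name, dsa_index(model, point, name))
```

Because each input receives the same relative perturbation, the resulting indices can be compared across inputs, as noted in the advantages below.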

5.2.2 Advantages

DSA is conceptually easy to apply and needs only a small amount of computational time

compared to statistical methods if sensitivity at only a few points is calculated. It is especially

useful when a high degree of confidence is attributed to a point estimate and thus the variation in

the output need only be tested for perturbations around the point estimate. The sensitivity thus

obtained can aid in identifying the significant figures needed for the point estimates of an input.


DSA provides insight into the comparative change in the output associated with an equivalent

perturbation of each input.

5.2.3 Disadvantages

DSA does not consider the possible range of values that inputs can take in calculation of

sensitivity indices. Thus, no inference can be made regarding global sensitivity. DSA is based on

the finite difference method. AD is superior to finite difference approximations of the derivatives

because numerical values of the computed derivatives are more accurate and computational

effort is significantly lower (Bischof et al., 1992). For nonlinear models, DSA does not account

for interaction among inputs. Therefore, the significance of differences in sensitivity between

inputs is difficult to determine making the rank ordering of key inputs potentially difficult.

5.3 Correlation Analysis

Correlation analysis measures the strength of a linear or monotonic relationship between

an input and the output. Two types of correlation analyses are considered including Pearson

correlation analysis (PCA) and Spearman correlation analysis (SCA), which are based on

samples and ranks, respectively, of the inputs and outputs.

5.3.1 Description

PCA evaluates the strength of linear association between paired input and output values,

while SCA is a measure of the strength of the monotonic relationship between two variables and

can account for monotonic nonlinear relationships (Gibbons, 1985; Siegel and Castellan, 1988; Kendall, 1990). Inputs can be ranked based on the absolute values of correlation coefficients.

Both Pearson and Spearman correlation coefficients can range from -1 to +1. A value of zero

represents a lack of correlation (Edwards, 1976). Correlation analysis measures the effect of one

input (i.e., Xi) at a time on the response variable (Y) (Helton and Davis, 2002).
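Both coefficients can be sketched with numpy alone, computing the Spearman coefficient as the Pearson coefficient of the ranks (assuming no tied values); the monotonic nonlinear model below is a hypothetical example.

```python
import numpy as np

def pearson(x, y):
    """Pearson (sample) correlation coefficient."""
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    """Spearman (rank) correlation: Pearson on the ranks (no ties assumed)."""
    rank = lambda v: np.argsort(np.argsort(v))
    return pearson(rank(x), rank(y))

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, 500)
y = np.exp(3.0 * x)  # nonlinear but strictly monotonic response

print(round(pearson(x, y), 3))   # < 1: the linear measure understates the association
print(round(spearman(x, y), 3))  # 1.0: the monotonic relationship is fully captured
```

This contrast is exactly the distinction drawn above: SCA responds to any monotonic relationship, while PCA measures only the linear component.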


Figure 5-1. An Example of a Non-monotonic Relationship between Output and Input Values with Zero Correlation Coefficient.

5.3.2 Advantages

The Pearson correlation coefficients capture linear relationships in the model. Spearman

correlation coefficients can respond to nonlinear monotonic relationships. Both correlation

coefficients are relatively easy to compute, as they are readily available in many commercial

software packages such as @Risk and Crystal Ball™.

5.3.3 Disadvantages

Correlation does not imply causation. There can be cases in which a third variable drives both of two highly correlated variables. Pearson coefficients are inaccurate for

nonlinear models and Spearman coefficients are inaccurate for non-monotonic models. Neither

Pearson nor Spearman coefficients can directly deal with interactions. Figure 5-1 shows an

example of a non-monotonic relationship between output and input values for which both PCA

and SCA estimate a zero correlation coefficient.


5.4 Regression Analysis

Regression analysis can be employed as a probabilistic sensitivity analysis technique as

demonstrated by Iman et al. (1985). Regression analysis serves three major purposes (Neter et

al., 1996; Sen and Srivastava, 1990): (1) description of the relationship between input and

output variables; (2) control of an output for given values of the inputs; and (3) prediction of an

output based on inputs.

5.4.1 Description

There are many variations of regression analysis (Iman et al., 1985; Neter et al., 1996;

Sen and Srivastava, 1990). Regression analysis can be used to assess the combined effects of

multiple inputs on an output (Neter et al., 1996). Standardized least squares regression is

considered here as a commonly used method. A regression model is defined as:

ŷ = b_0 + Σ_{i=1}^{n} b_i T_i     (5-3)

where b0, b1,…, bn are regression coefficients. The term Ti can be an input term (Xi), an

interaction term (Xi × Xj), or any higher order term. Typically, the data for each input and output

are standardized to remove the effects of scale (Neter et al., 1996).

Regression analysis can be performed based on the sampled values of inputs or rank

transformed values of inputs. Sample regression analysis (SRA) involves fitting a regression

model to a dataset that includes input values and corresponding output values, while rank

regression analysis (RRA) is based upon ranks for the values of inputs and the output. RRA is

especially useful when there is a high amount of variance or noise in the data or if the model is

non-linear but monotonic (Saltelli et al., 2000).

For a linear regression model, a standardized regression coefficient provides a measure of

importance for each input (Devore and Peck, 1996; Neter et al., 1996). The typical


standardization approach is based on subtracting the mean from each data point and dividing by

the corresponding standard deviation. The following equation is used for normalization of input

and output values (Neter et al., 1996).

X′ = (X − X̄) / σ_X ;   Y′ = (Y − Ȳ) / σ_Y     (5-4)

where,

X' = Normalized input data point

Y' = Normalized output data point

X̄ = Mean of the input values

Ȳ = Mean of the output values

σ_X = Standard deviation of the input

σ_Y = Standard deviation of the output

Similarly, rank regression coefficients, which are independent of scale, can be used to rank the

inputs. However, rank regression coefficients cannot be transformed back to obtain sensitivities

in terms of the original values of an input (Saltelli et al., 2000).
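A sketch of standardized regression coefficients per Equations 5-3 and 5-4 follows, assuming a hypothetical linear model with three inputs of very different importance; the absolute coefficient magnitudes supply the importance ranking.

```python
import numpy as np

def src(X, y):
    """Standardized regression coefficients: standardize each input column
    and the output (Eq. 5-4), then fit the linear model of Eq. 5-3 by
    least squares. |coefficient| magnitudes give the importance ranking."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    ys = (y - y.mean()) / y.std(ddof=1)
    coef, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    return coef

# Hypothetical model and coefficients, for illustration only.
rng = np.random.default_rng(2)
X = rng.normal(size=(400, 3))
y = 5.0 * X[:, 0] + 1.0 * X[:, 1] + 0.1 * X[:, 2] + rng.normal(scale=0.5, size=400)

ranking = np.abs(src(X, y))
print(np.round(ranking, 2))  # input 1 ranks first, input 3 last
```

Because the data are standardized, no intercept term is needed and the coefficients are free of the effects of scale, as described above.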

An alternative but related definition of importance is based on the partitioning of the total

sum of squares (TSS) of the output to each of the regression terms to estimate the relative partial

sum of squares (RPSS) (Gardner and Trabalka, 1985; Rose et al., 1991). This approach is used as

a measure of importance when interaction terms are considered in the regression model. The

RPSS for term Ti is calculated as:

RPSS_{Ti} = 100 × (RSS − RSS_{-Ti}) / TSS     (5-5)

where,

RSS = Regression sum of squares for the full model

RSS_{-Ti} = Regression sum of squares for the model with term Ti missing

TSS = Total sum of squares


The adequacy of the regression model can be assessed using the coefficient of multiple

determination, R2, which is a measure of the amount of variance in the output explained by a

regression model (Draper and Smith, 1981). In rank regression analysis, a high R2 value

indicates a monotonic relationship.

Sometimes a generally smooth response of the model might contain an isolated

discontinuity or change-point in the curve or in a (possibly higher order) derivative. In many

cases interest focuses on the occurrence of such change-points (e.g., thresholds) (Muller, 1992).

In parametric approaches to regression change-point problems, simple linear regressions

before and after a possible change-point are assumed. The possibility of a discontinuity in the

form of a jump or a jump in the first derivative, or equivalently, a slope change, is incorporated

into the model (Hinkley, 1969; Durbin and Evans, 1975). A non-parametric regression approach

is also suggested for regression change-point problems (Muller, 1992; Loader, 1996). The non-

parametric regression approach needs assumptions that are weaker than those made in the

parametric approach.

5.4.2 Advantages

Regression techniques allow evaluation of sensitivity of individual model inputs, taking

into account the simultaneous impact of other model inputs on the result (Cullen and Frey,

1999). The rank regression approach can capture any monotonic relationship between an input and

the output, even if the relationship is nonlinear. Sample and rank regression methods are

discussed elsewhere, such as by Neter et al. (1996), Iman et al. (1985), Birkes and Dodge (1993),

and Kendall and Gibbons (1990).


5.4.3 Disadvantages

The key potential drawbacks of regression analysis include: possible lack of robustness

if key assumptions of regression are not met; the need to assume a functional form for the

relationship between an output and selected inputs; and potential ambiguities in interpretation.

Regression analysis works best if each input is statistically independent of every other input

(Devore and Peck, 1996). Furthermore, the residuals of a least squares regression analysis must

be normally distributed and independent. If these conditions are violated, the results of the

analysis may not have a strict quantitative interpretation, but instead should be treated as

providing conceptual or qualitative insights regarding possible relationships. The results of

regression analysis can be critically dependent upon the selection of a functional form for the

regression model. Thus, any results obtained are conditioned upon the actual model used.

Regression analysis can yield results that may be statistically insignificant or counter-intuitive

(Neter et al., 1996). The lack of a clear finding for an input may be because its range of variation was not wide enough to generate a significant response in the output. Thus, regression results can be sensitive to the range of variation in the data used to fit the model and may not

always clearly reveal a relationship that actually exists. The regression model may not be useful

when extrapolating beyond the range of values used for each input when fitting the model

(Devore and Peck, 1996).

5.5 Analysis of Variance

ANOVA is a model independent sensitivity analysis method that evaluates whether there

is any statistically significant association between an output and one or more inputs (Krishnaiah,

1981). ANOVA differs from regression analysis in that regression analysis is used to form a

predictive model whereas ANOVA is a general technique that can be used to test the hypothesis


that the means among two or more groups are equal, under the assumption that the outputs within each group are normally distributed with the same variance (Neter et al., 1996).

Also, ANOVA addresses both categorical inputs and groups of inputs (Steel et al., 1997).

5.5.1 Description

An input is referred to as a “factor” and specific ranges of values for each factor are

defined as factor “levels”. A “treatment” is a specific combination of levels for different factors.

An output is referred to as a “response variable,” and a “contrast” is a linear combination of two

or more factor level means (Neter et al., 1996). For example, a contrast can be used to evaluate

the change in the mean growth of pathogens when the storage temperature varies between high

and low levels for a specific storage time. Contrasts typically provide insight regarding specific

model characteristics such as threshold and saturation points.

Categorical factors are easily treated as levels. Continuous factors can be partitioned into

mutually exclusive and exhaustive subintervals in order to create levels (Kleijnen and Helton,

1999). The optimal definition of levels for a continuous factor is often a matter of judgment and

some experimentation may be required to determine an appropriate division (Frey et al., 2004;

Kleijnen and Helton, 1999).

ANOVA uses the F test to determine whether a significant difference exists among mean

responses for main effects or interactions between factors. The relative magnitude of F values

can be used to rank the factors in sensitivity analysis (Carlucci, 1999). The higher the F value,

the more sensitive the response variable is to the factor. Therefore, factors with higher F values

are given higher rankings. An alternative but related definition of sensitivity can be introduced

based on the partitioning of the total sum of squares (TSS) of the model response (Y) on all of the

main effects and interaction effects to estimate the relative partial sum of squares (RPSS)


(Gardner and Trabalka, 1985; Rose et al., 1991). However, the F value is preferred as it accounts

for not only the sum of squares but also the degrees of freedom associated with each factor.
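The F statistic above can be sketched directly: partition a continuous factor into three mutually exclusive levels and compare the between-level variance to the within-level variance. The factor, response, and level boundaries below are illustrative assumptions.

```python
import numpy as np

def one_way_f(groups):
    """One-way ANOVA F statistic: between-level mean square over
    within-level mean square, each divided by its degrees of freedom."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = np.mean(np.concatenate(groups))
    ssb = sum(len(g) * (np.mean(g) - grand) ** 2 for g in groups)
    ssw = sum(np.sum((g - np.mean(g)) ** 2) for g in groups)
    return (ssb / (k - 1)) / (ssw / (n - k))

# Hypothetical continuous factor and response, for illustration only.
rng = np.random.default_rng(3)
x = rng.uniform(0.0, 10.0, 600)                # continuous factor
y = 2.0 * x + rng.normal(scale=1.0, size=600)  # response variable

# Partition the factor into three mutually exclusive, exhaustive levels.
edges = np.quantile(x, [0.0, 1 / 3, 2 / 3, 1.0])
levels = np.digitize(x, edges[1:-1])
groups = [y[levels == i] for i in range(3)]
print(round(one_way_f(groups), 1))  # large F: the response is sensitive to x
```

In a multi-factor analysis the same F comparison would be repeated for each main effect and interaction, and the relative F magnitudes used for ranking as described above.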

5.5.2 Advantages

ANOVA can be used to analyze both continuous and discrete factors (Montgomery,

1997). The results of ANOVA can be robust to departures from key assumptions (Lindman,

1974; Conover, 1980). ANOVA allows evaluation of the “main effect” of each factor. The

main effect is the effect of the factor alone, averaged across the levels of other factors. ANOVA

can also be used to evaluate the “interaction effect” between factors. An interaction effect is the

variation among the differences between means for different levels of one factor over different

levels of another factor. If there is a significant interaction, detailed contrasts can be evaluated.

By comparing results for different levels of each factor, it might be possible to identify

thresholds in the model response.

5.5.3 Disadvantages

ANOVA can become computationally intensive if there are a large number of factors. If

this becomes a problem, a suggestion is to try to reduce the number of factors analyzed by using

some less computationally intensive method to screen out insensitive inputs (Winter et al., 1991).

If there is a significant departure of the response variable from the assumption of normality, then

the results may not be robust (Lindman, 1974). Errors in the response variables due to

measurement errors in the factors can result in biased estimates of the effects of factors. If the

factors are correlated, then the effect of each individual factor on the response variable can be

difficult to assess (Neter et al., 1996), unless methods such as principal component analysis are

used.


5.6 Response Surface Method

The RSM can be used to represent the relationship between an output and one or more

inputs (Myers and Montgomery, 1995; Neter et al., 1996; Khuri and Cornell, 1987). The

objective of the RSM is to develop a simplified version of the original model so that it is possible

to retain the key characteristics of the model and to shorten the amount of time required to

predict the output for a given set of inputs. RSM is typically applied to large models so that

statistical methods that require multiple model evaluations can be applied. RSM is often used as

a step prior to application of techniques that require many model evaluations, such as Monte

Carlo simulation.

5.6.1 Description

A response surface (RS) can be linear or nonlinear and is typically classified as first-order or second-order (Myers and Montgomery, 1995). Nonlinear response surfaces consider interaction terms between inputs. The amount of time and effort needed to develop an RSM depends on the number of inputs included and the type of RS structure. Screening

sensitivity analysis methods can be used prior to application of RSM in order to limit the inputs

that are included in the RS to those that are important.

When developing an RS, the least squares regression method is typically used to fit a standardized first- or second-order equation to a dataset comprising the output values from the model and values sampled from the probability distributions of the inputs. The precision and accuracy of the RS can be evaluated by comparing the predictions of the RS with those of the original model for the

same values of the model inputs. The normality of residuals, a key assumption of least-squares regression, should be satisfied. In cases where there is deviation from the normality


assumption, other techniques such as rank-based or nonparametric approaches should be

considered (Khuri and Cornell, 1987; Vidmar and McKean, 1996).

Once a response surface is developed, the sensitivity of the model output to one or more

of the selected inputs can be determined by: (1) inspection of the functional form of the response

surface; (2) statistical analysis if regression analysis was used to develop the response surface; or

(3) application of other sensitivity analysis methods to the response surface.
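The steps above can be sketched as follows: sample a hypothetical "expensive" model, fit a second-order RS by least squares, and check the surrogate against the original model. Because the hypothetical model here is itself quadratic, the surrogate reproduces it within the sampled range; for a real model the agreement would only be approximate.

```python
import numpy as np

def quadratic_design(X):
    """Second-order design matrix for two inputs: intercept, linear,
    interaction, and squared terms."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1 * x2, x1**2, x2**2])

# Hypothetical "expensive" original model, for illustration only.
def expensive_model(x1, x2):
    return 1.0 + 2.0 * x1 - 0.5 * x2 + 0.8 * x1 * x2

rng = np.random.default_rng(4)
X = rng.uniform(-1.0, 1.0, size=(200, 2))   # sampled input values
y = expensive_model(X[:, 0], X[:, 1])       # corresponding outputs

beta, *_ = np.linalg.lstsq(quadratic_design(X), y, rcond=None)

def surrogate(x1, x2):
    """Evaluate the fitted response surface at one point."""
    return (quadratic_design(np.array([[x1, x2]])) @ beta)[0]

# The surrogate should reproduce the model inside the sampled range.
print(surrogate(0.3, -0.2), expensive_model(0.3, -0.2))
```

The fitted coefficients in beta correspond to route (1) above: inspecting them directly indicates which linear, interaction, and squared terms drive the output.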

5.6.2 Advantages

The RSM can simplify potentially computationally intensive models, enabling much faster run times. As a result, iterative numerical approaches such as Monte Carlo simulation are easier to apply to the RS than to the original model. Moreover, the functional form of

the RS and corresponding coefficients for selected inputs provide insight with regard to

sensitivity.

5.6.3 Disadvantages

Because developing an RS requires a calibration dataset of input values sampled from their probability distributions and corresponding output values calculated with the original model, this process can be resource intensive for some models. The RS is calibrated to data generated from the original model; thus, the RSM is valid only within the range of values used to generate the calibration dataset. Typically, the sensitivity of all inputs cannot be evaluated in RSM, because most RS studies include only a subset of inputs screened from the original list.

5.7 Classification and Regression Trees

Tree-based models are statistical models designed for supervised prediction problems,

and can be used for partitioning data (Breiman et al., 1984). In a supervised problem a set of


Figure 5-2. Schematic Diagram of a Classification and Regression Tree Illustrating Root Node, Internal Nodes, and Terminal Nodes (Leaves).

inputs is used to predict the output values. When the output is categorical, the model is a

classification tree. For a continuous output, the tree-based model is a regression tree. The

combined methodology is referred to as classification and regression trees (CART).

5.7.1 Description

Figure 5-2 shows a schematic diagram of a tree-based model. Each tree is read from top

to bottom starting at the root node. Each internal node represents a split based on the values of an

input. Each input can be selected once or multiple times in a tree. Each split at an internal node

represents an inequality. A case moves left if the inequality is true and right otherwise. The

terminal nodes of the tree are called leaves. Each leaf represents a predicted mean output value.

For a categorical output, each leaf shows specific classes of the output, while for a continuous

output, leaves show statistically significantly different mean values for the output. Each path


from the root node to a leaf is a “classification rule” that specifies specific cut-off values of

selected inputs that lead to a leaf.

Each split in a tree is selected in order to reduce some measure of node impurity (Loh et

al., 1997). In classification trees, three common measures of node impurity are Gini Index

(Breiman et al., 1984), entropy (Quinlan, 1993), and the chi-squared test (Kass, 1980). For

regression trees, deviance is typically used as a measure of node impurity (Morgan and Sonquist, 1963; Breiman et al., 1984). A splitting criterion is typically defined based on the reduction in

the value of a measure of node impurity associated with a specific input and a cut-off value

selected for the split. An optimal split is the one that maximizes the reduction in the measure of

node impurity. Regression trees are selected for further discussion here since the outputs in the

SHEDS model are continuous.

The measure of impurity in the split based on the change in deviance (∆D) is defined as:

∆D = D(0) − {D(1) + D(2)}     (5-6)

D(t) represents the total deviance in the node t and is calculated as:

D(t) = Σ_{j=1}^{n_t} (y_{jt} − ȳ_t)²     (5-7)

where,

n_t = Number of data points at node t; t = 0, 1, 2.

y_{jt} = jth output value at node t; t = 0, 1, 2, and j = 1,…, n_t.

ȳ_t = Mean output value at node t; t = 0, 1, 2.

An appropriate split and corresponding cut-off value are those that maximize the ∆D value in

Equation 5-6. Termination of splitting typically is specified to occur when the number of data


points at a node drops below a selected minimum, or when the maximum possible reduction in

the dataset impurity for a particular node is less than a selected minimum (Breiman et al., 1984).

Breiman et al. (1984) devised a measure of input sensitivity for CART based upon the

contributions of inputs to the reduction in the dataset impurity. Based on this measure, the

amount of cumulative contribution of selected inputs in a tree to the reduction of the total dataset

impurity indicates input sensitivity. For regression trees, when an input is selected multiple times in a tree, a value of ∆D is calculated for each selection using

Equation (5-6). Inputs are prioritized with regard to sensitivity based on the relative magnitude

of their total contribution to the reduction of the dataset impurity (e.g., deviance). Hence, the

higher the contribution of the input to the reduction of the total impurity, the higher the rank.
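A single deviance-based split per Equations 5-6 and 5-7 can be sketched as below, scanning candidate cut-offs on one input; the input, the step change in the output, and the noise level are hypothetical assumptions.

```python
import numpy as np

def deviance(y):
    """D(t) in Eq. 5-7: sum of squared deviations from the node mean."""
    return float(np.sum((y - y.mean()) ** 2))

def best_split(x, y):
    """Scan candidate cut-offs on one input and return the cut-off that
    maximizes the reduction in deviance, dD in Eq. 5-6."""
    best_cut, best_gain = None, -np.inf
    for cut in np.unique(x)[1:]:
        left, right = y[x < cut], y[x >= cut]
        gain = deviance(y) - (deviance(left) + deviance(right))
        if gain > best_gain:
            best_cut, best_gain = float(cut), gain
    return best_cut, best_gain

# Hypothetical data with a step change in the output near x = 0.6.
rng = np.random.default_rng(5)
x = rng.uniform(0.0, 1.0, 300)
y = np.where(x < 0.6, 1.0, 4.0) + rng.normal(scale=0.2, size=300)

cut, gain = best_split(x, y)
print(round(cut, 2), round(gain, 1))  # cut-off recovered near 0.6
```

A full regression tree repeats this search over all inputs at every node, and summing each input's ∆D contributions over the tree yields the importance measure described above.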

5.7.2 Advantages

CART has several advantages as a sensitivity analysis method. CART is nonparametric

and does not require assumptions of a particular distribution for the error term. CART is resistant

to the effects of outliers since splits usually occur at non-outlier values (Roberts et al., 1999).

CART results are invariant with respect to monotonic transformations of the independent

variables. As a result, it is not necessary to test a number of transformations to find the “best” fit

(Hallmark et al., 2002). CART can capture non-additive behaviors (Mishra et al., 2003).

5.7.3 Disadvantages

There are some disadvantages to CART. Because CART is not a standard analysis

technique, it is not included in many major statistical software packages (Levis, 2000).

Moreover, there are alternative ways to prioritize inputs based on the results of CART, a choice that requires judgment on the part of the analyst.


5.8 Fourier Amplitude Sensitivity Test

FAST is a procedure that has been developed for uncertainty and sensitivity analysis

(Cukier et al. 1973; Schaibly and Shuler 1973; Cukier et al. 1975; Cukier et al., 1978). This

procedure provides a way to estimate the expected value and variance of the output variable and

the contribution of individual inputs to this variance. The evaluation of sensitivity estimates can

be carried out independently for each input using just one simulation because all terms in a

Fourier expansion are mutually orthogonal.

5.8.1 Description

The FAST method applies a functional transformation to each input, assigns each input a

distinct integer frequency, and introduces a common independent variable to all inputs (Cukier et

al. 1973; Schaibly and Shuler 1973; Cukier et al. 1978; McRae et al. 1982). Inputs vary

simultaneously with this independent variable in such a way that the output becomes a periodic

function of the independent parameter. Fourier analysis is performed on the output, which

produces Fourier amplitudes for each frequency. FAST computes the “main effect” and “total

effect” contributions of each input to the variance of the output. The total effect includes both the

main effect as well as interaction effects (Sobol 1990; Homma and Saltelli, 1996). For example,

if there are three inputs A, B and C, the total effect of A is given by S(A) + S(AB) + S(AC) + S(ABC),

where S(x) is the sensitivity index of x.

Consider a computer model whose output is a nonlinear function of its inputs:

y = f(x_1, x_2, …, x_n)    (5-8)

where n is the number of inputs. The inputs are assumed to be random variables with their

assigned probability distribution functions (PDFs). A summary statistic that will be useful is the

rth moment of y defined as:


⟨y^(r)⟩ = ∫_{x_1} … ∫_{x_n} f^r(x_1, …, x_n) P(x_1, …, x_n) dx_1 … dx_n    (5-9)

where P(x_1, …, x_n) is the joint probability density function of the inputs. A multi-dimensional Monte Carlo integration procedure can be used to calculate the integral in Equation (5-9). However, such an integration is cumbersome (Cukier et al. 1975). The key concept of the FAST method is to convert the n-dimensional integral in Equation (5-9) into an equivalent one-dimensional integral. The FAST method uses transformations of the form:

x_i = G_i(sin(ω_i s)),    i = 1, 2, …, n    (5-10)

where G_i is a set of known functions, ω_i is a set of frequencies, and s is a scalar variable. As s varies, all inputs vary simultaneously over their own ranges at rates according to the frequencies assigned to them (Koda et al. 1979). The frequencies should be distinct and

incommensurate. Mathematically, a frequency set, {ω1,ω2,...,ωn}, is said to be incommensurate if

the equation:

Σ_{i=1}^{n} a_i ω_i = 0    (5-11)

can be satisfied only if a_i = 0 for every i, where the a_i are integers (Lu and Mohanty, 2001). After substituting Equation (5-10) into Equation (5-9), the n-dimensional integral over the x_i becomes a one-dimensional integral over s:

⟨y^(r)⟩ = lim_{T→∞} (1/2T) ∫_{−T}^{T} f^r(x_1(s), x_2(s), …, x_n(s)) ds    (5-12)

The use of incommensurate frequencies, however, would require that the integral in Equation (5-12) be carried out over an infinitely long interval of s, which is not computationally feasible (Cukier et al. 1973; Schaibly and Shuler 1973; Cukier et al. 1978; McRae et al. 1982). Therefore, integer


frequencies are used in practice so that the integral in Equation (5-12) can be evaluated over the

finite interval from −π to π. Thus, Equation (5-12) becomes:

⟨y^(r)⟩ = (1/2π) ∫_{−π}^{π} f^r(x_1(s), x_2(s), …, x_n(s)) ds    (5-13)

The mean and variance of y can be estimated as:

ȳ = (1/2π) ∫_{−π}^{π} f(x_1(s), …, x_n(s)) ds    (5-14)

σ² = (1/2π) ∫_{−π}^{π} f²(x_1(s), …, x_n(s)) ds − ȳ²    (5-15)

The evaluation of σ2 can be carried out by using the s-space Fourier coefficients of y as:

σ² = 2 Σ_{j=1}^{∞} (A_j² + B_j²)    (5-16)

where, Aj and Bj are the Fourier coefficients defined as:

A_j = (1/2π) ∫_{−π}^{π} f(x_1(s), …, x_n(s)) cos(js) ds

B_j = (1/2π) ∫_{−π}^{π} f(x_1(s), …, x_n(s)) sin(js) ds    (5-17)

If the Fourier coefficients in Equation (5-17) are evaluated at the fundamental frequency of the transformation in Equation (5-10) and its higher harmonics, that is, j = p·ω_i for p = 1, 2, …, the resulting sum in the form of Equation (5-16) is the part of the total variance, σ², that corresponds to the variance of y arising from the uncertainty in the ith input:

σ²_{ω_i} = 2 Σ_{p=1}^{∞} (A²_{pω_i} + B²_{pω_i})    (5-18)


The ratio S_{ω_i} = σ²_{ω_i} / σ² is the so-called partial variance or main effect of the input, which serves as a basic measure of sensitivity for the FAST method. This ratio is a normalized sensitivity measure, so that inputs can be ordered based on the relative magnitude of S_{ω_i}.
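A minimal numerical sketch of the main-effect calculation, assuming G_i is taken as the identity in Equation (5-10); the linear test model and the integer frequencies are illustrative choices, not from the report:

```python
import numpy as np

# Test model y = x1 + 2*x2 on the search curve x_i = sin(w_i * s).
w = [11, 21]                      # distinct integer frequencies w_i
N = 4096                          # points along the scalar variable s
s = -np.pi + 2.0 * np.pi * np.arange(N) / N

x = [np.sin(wi * s) for wi in w]
y = x[0] + 2.0 * x[1]

total_var = np.var(y)             # sigma^2, cf. Equation (5-15)

def partial_variance(y, s, wi, harmonics=4):
    """Partial variance of one input: twice the sum of A_j^2 + B_j^2 over
    the harmonics j = p * w_i (Equations 5-17 and 5-18)."""
    var_i = 0.0
    for p in range(1, harmonics + 1):
        A = np.mean(y * np.cos(p * wi * s))   # A_j, Equation (5-17)
        B = np.mean(y * np.sin(p * wi * s))   # B_j, Equation (5-17)
        var_i += 2.0 * (A**2 + B**2)
    return var_i

S = [partial_variance(y, s, wi) / total_var for wi in w]
# For this additive model the main effects are exact: S = [0.2, 0.8].
```

Because the model is additive, the two main effects sum to one; for models with interactions, the residual variance would be picked up by the total-effect calculation described next.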

The computation of the total effect using FAST was proposed by Saltelli et al. (1999). In order to estimate the total effect for each input, frequencies that do not belong to the set {p_1ω_1, p_2ω_2, …, p_kω_k}, for p_i = 1, 2, …, ∞ and i = 1, 2, …, k, where k is the number of inputs, are considered.

These frequencies contain information about the residual variance that is not accounted for by

the main effect of the input (Saltelli et al., 2000). This residual variance includes the interactions

between the factors at any order.

A frequency ω_i is assigned to the input X_i, and a set of identical frequencies, different from ω_i and denoted by ω_{~i}, is assigned to all of the remaining inputs. The partial variance σ²_{ω~i} is computed as:

σ²_{ω~i} = 2 Σ_{p=1}^{∞} (A²_{pω~i} + B²_{pω~i})    (5-19)

σ²_{ω~i} is a measure including all the effects of any order that do not involve the input X_i. The

total effect for input Xi is computed as:

ST_i = 1 − σ²_{ω~i} / σ²    (5-20)

5.8.2 Advantages

The FAST method is superior to local sensitivity analysis methods because it can

apportion the output variance to the variance in the inputs. It also can be used for local sensitivity

analysis with little modification (Fontaine et al., 1992). It is model independent and works for

monotonic and non-monotonic models (Saltelli et al., 2000). Furthermore, it can allow arbitrarily


large variations in input parameters. Therefore, the effect of extreme events can be analyzed

(e.g., Lu and Mohanty, 2001; Helton, 1993). The evaluation of sensitivity estimates can be

carried out independently for each factor using just a single set of runs (Saltelli et al., 2000). The

FAST method can be used to determine the difference in sensitivities in terms of the differing

amount of variance in the output explained by each input and, thus, can be used to rank order key

inputs.

5.8.3 Disadvantages

The FAST method suffers from computational complexity for a large number of inputs

(Saltelli and Bolado, 1998). The classical FAST method is good only for models with no

important or significant interactions among inputs (Saltelli and Bolado, 1998). However, the

extended FAST method developed by Saltelli et al. (1999) can account for high-order

interactions. The reliability of the FAST method can be poor for discrete inputs (Saltelli et al.,

2000). Current software tools for FAST are not readily amenable to application to complex

risk assessment models.

5.9 Sobol’s Method

Sobol’s method (Sobol, 1990, 1993; Saltelli et al., 2000) is a variance-based “global sensitivity analysis” method based upon “Total Sensitivity Indices” (TSI) that take into account

interaction effects. The TSI of an input is defined as the sum of all the sensitivity indices

involving that input. The TSI includes both the main effect as well as interaction effects (Sobol

1990; Homma and Saltelli, 1996). For example, if there are three inputs A, B and C, the TSI of A

is given by S(A) + S(AB) + S(AC) + S(ABC), where S(x) is the sensitivity index of x.


5.9.1 Description

The underlying principle upon which Sobol’s approach calculates the sensitivity indices

is the decomposition of function f(x) into summands of increasing dimensionality (Chan et al.,

2000):

f(x_1, …, x_n) = f_0 + Σ_{i=1}^{n} f_i(x_i) + Σ_{i=1}^{n} Σ_{j=i+1}^{n} f_{ij}(x_i, x_j) + … + f_{1,2,…,n}(x_1, …, x_n)    (5-21)

The form presented in Equation (5-21) can only be arrived at when f0 is a constant, and the

integral of every summand over any of its own variables is always zero, i.e.

∫_0^1 f_{i_1,…,i_s}(x_{i_1}, …, x_{i_s}) dx_{i_k} = 0,    1 ≤ k ≤ s    (5-22)

where,

s = summand index

k = input variable index

A consequence of Equation (5-21) and (5-22) is that all the summands in Equation (5-21) are

orthogonal, i.e., if (i_1, …, i_s) ≠ (j_1, …, j_l), then

∫_{K^n} f_{i_1,…,i_s} f_{j_1,…,j_l} dx = 0    (5-23)

where K^n is the n-dimensional space of input parameters. The total variance D of f(x) is defined

to be:

D = ∫_{K^n} f²(x) dx − f_0²    (5-24)

and the partial variances are computed from each of the terms in Equation (5-21).

D_{i_1,…,i_s} = ∫_0^1 … ∫_0^1 f²_{i_1,…,i_s}(x_{i_1}, …, x_{i_s}) dx_{i_1} … dx_{i_s}    (5-25)

where 1 ≤ i_1 < … < i_s ≤ n and s = 1, …, n. By squaring and integrating Equation (5-21) over K^n, and using Equation (5-23), we have:


D = Σ_{i=1}^{n} D_i + Σ_{i=1}^{n} Σ_{j=i+1}^{n} D_{ij} + … + D_{1,2,…,n}    (5-26)

Thus, a sensitivity measure S(i_1, …, i_s) is defined as:

S(i_1, …, i_s) = D_{i_1,…,i_s} / D    (5-27)

The sum of all the sensitivity indices is always unity. The integrals in Equations (5-24) and (5-25) can be computed by the Monte Carlo (MC) integration method (Saltelli, 2002b).
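As a sketch of this Monte Carlo evaluation, the first-order indices S_i = D_i/D can be estimated with a “pick-freeze” estimator of the kind described by Saltelli (2002); the two-input test model, sample size, and seed are illustrative assumptions:

```python
import numpy as np

# Hypothetical model with an interaction term:
# y = x1 + x2 + x1*x2, with x1, x2 ~ Uniform(0, 1).
def model(x):
    return x[:, 0] + x[:, 1] + x[:, 0] * x[:, 1]

rng = np.random.default_rng(7)
n, k = 200_000, 2
A = rng.uniform(size=(n, k))      # two independent input sample matrices
B = rng.uniform(size=(n, k))

yA, yB = model(A), model(B)
D = yA.var()                      # total variance D, cf. Equation (5-24)

S = []
for i in range(k):
    ABi = B.copy()
    ABi[:, i] = A[:, i]           # "freeze" column i, resample the rest
    Di = np.mean(yA * (model(ABi) - yB))   # partial variance D_i
    S.append(Di / D)              # S_i = D_i / D, Equation (5-27)
# Analytically S_1 = S_2 = 0.1875 / 0.3819, roughly 0.49; the remainder
# is the interaction index S_12, so the first-order indices sum to < 1.
```

Each additional input only adds one more model evaluation per sample, which is why this sampling scheme is widely used despite the overall computational cost noted below.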

5.9.2 Advantages

Sobol’s method can cope with both nonlinear and non-monotonic models, and provide a

truly quantitative ranking of inputs and not just a relative qualitative measure (Chan et al., 2000).

The types of influence of an input that are captured by Sobol’s method include additive, nonlinear, and interaction effects. Furthermore, Sobol’s method can be smoothly applied to categorical

variables without re-scaling. Sobol (1993) and Saltelli (2002) describe such an implementation.

5.9.3 Disadvantages

Sobol’s method is based on the sampling of the distribution function of the input factors

and on the repeated execution of the model, in order to determine the distribution of the output;

therefore it is, in general, computationally intensive (Saltelli et al., 2000). Also, the ease of

application depends on the complexity of the model. Hence, it is difficult to apply Sobol’s method to models with a large number of inputs or a complex model structure, such as modularity.

Sobol’s method provides a factor-based decomposition of the output variance, and

implicitly assumes that the second central moment is sufficient to describe output variability.

However, when the region of interest is the tails of the output distribution, this assumption is not

valid (Saltelli, 2002b).


5.10 Mutual Information Index

MII produces a measure of the information about the output provided by a particular input.

Each sensitivity measure is calculated based upon conditional probabilistic analysis. Inputs are

ranked based on the relative magnitude of sensitivity measures. MII takes into account the joint

effects of variation in all inputs with respect to the output. MII is typically used for models with

dichotomous outputs; but it can also be used for outputs that are continuous (Critchfield and

Willard, 1986).

5.10.1 Description

The MII method involves three general steps (Critchfield and Willard, 1986): (1)

generating an overall confidence measure of the output value; (2) obtaining a conditional

confidence measure for a given value of an input; and (3) calculation of sensitivity indices.

Confidence is defined as the probability for the outcome of interest and the CDF of the output is

used to estimate the overall confidence measure of the output value. Conditional confidence is

calculated by holding an input at a given value and simultaneously varying all other inputs.

The mutual information between two random variables is the amount of information

about a variable that is provided by the other variable (Jelinek, 1970). The average MII for each

input (IXY) is calculated based on the PDF of the input and the overall and conditional confidence

in the output as:

I_XY = Σ_X Σ_Y P_X × P_{Y|X} × log(P_{Y|X} / P_Y)    (5-28)

where P_{Y|X} is the conditional confidence of the output; P_Y is the overall confidence of the output; and P_X is the probability distribution for the input. If I_XY is large, the corresponding input provides


substantial information about the output. For inputs that are statistically independent of the output, I_XY is zero.

The amount of information about a variable that is provided by the variable itself is

measured in terms of the “average self-information” (IYY) of that variable (Jelinek, 1970) and

defined as:

I_YY = Σ_Y P_Y × log(1 / P_Y)    (5-29)

For the purpose of sensitivity analysis, a normalized measure of the MII (S_XY) is used, which is the ratio of I_XY to I_YY.
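A rough sketch of Equations (5-28) and (5-29) estimated from binned Monte Carlo samples; the bin count, sample size, and two-input test model are illustrative assumptions, not part of the report:

```python
import numpy as np

def mutual_information(x, y, bins=20):
    """Estimate I_XY (Equation 5-28) by discretizing both variables;
    summing P(x, y) * log[P(x, y) / (P(x) P(y))] is equivalent to
    summing P_X * P(y|x) * log[P(y|x) / P(y)]."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)       # P_X
    py = pxy.sum(axis=0, keepdims=True)       # P_Y
    mask = pxy > 0
    return np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask]))

def self_information(y, bins=20):
    """Average self-information I_YY (Equation 5-29): the entropy of the
    binned output."""
    py, _ = np.histogram(y, bins=bins)
    py = py / py.sum()
    py = py[py > 0]
    return -np.sum(py * np.log(py))

rng = np.random.default_rng(3)
x1 = rng.normal(size=50_000)                  # influential input
x2 = rng.normal(size=50_000)                  # inert, independent input
y = x1 + 0.1 * rng.normal(size=50_000)

S1 = mutual_information(x1, y) / self_information(y)   # normalized S_XY
S2 = mutual_information(x2, y) / self_information(y)
# S1 is large; S2 is near zero because x2 is independent of y.
```

In a real application the samples would come from a probabilistic simulation of the model rather than from the synthetic inputs used here.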

5.10.2 Advantages

The mutual information is a more direct measure of the probabilistic relationship of two

random variables than other measures such as correlation coefficients (Jelinek, 1970). For

example, the correlation coefficient of two random variables examines the degree of linear

relationship between them. Two uncorrelated variables may not be independent; however, two

variables with zero mutual information are statistically independent. In addition, the results from

MII can be graphically presented.

5.10.3 Disadvantages

The MII method is computationally intensive and practically difficult to apply to complex

models (Merz et al., 1992). An approach using symbolic algebra has been suggested for reducing the computational intensity of the method (Critchfield and Willard, 1986). Because of the

simplifying assumptions used in MII, it is difficult to evaluate the robustness of rankings based

on the sensitivity measures.


6. COMPARISON OF SELECTED SENSITIVITY ANALYSIS METHODS

In this chapter, each of the 12 selected sensitivity analysis methods is compared based on selected key criteria. A key criterion for sensitivity analysis, and for risk modeling and analysis in general, is that it must be relevant to a decision. This means that the model output

of interest must be directly related to the decision. For example, if a decision is informed by

whether risk is above or below a threshold, then the model output should be a variable

indicating the probability that the estimated risk is above or below the threshold. The

sensitivity analysis should pertain to variation in inputs that cause a change in the value of the

output that would lead to a different decision.

Technical requirements of a sensitivity analysis method can be manifold and differ from one application and decision context to another. The ideal

sensitivity analysis method would be applicable to models that have characteristics such as

nonlinearity, interactions, thresholds, different input types, and temporal and spatial

dimensions (e.g., seasonality and inter-annual variability). A list of key criteria for evaluation of sensitivity analysis methods is summarized in Table 6-1. The evaluation and comparison

of these methods with respect to the selected criteria are based on extensive case studies by

Frey and Patil (2002) and Frey et al. (2003).

Ideally, a sensitivity analysis method should respond to the effects of simultaneous

variation in all inputs. An appropriate method should: address nonlinearities in response to an

input; identify the presence or absence of thresholds in the model response; identify and

provide detailed insights regarding the existence of interaction among inputs; and be able to

handle alternative input types. Because high exposure cases are often of special interest,

methods that can help identify and characterize conditions leading to high exposures may be


Table 6-1. Summary of Key Characteristics of Selected Sensitivity Analysis Methods

                                                       Correlation  Regression
Characteristic                           NRSA  DSA   Sample  Rank  Linear Rank  ANOVA  CART  FAST  Sobol  RSM  MII
Simultaneous Variation                    No    No    Yes    Yes    Yes   Yes    Yes    Yes   Yes   Yes   Yes  Yes
Non-linearity                             No    No    No     Yes    No    Yes    Yes    Yes   Yes   Yes   Yes  Yes
Threshold                                 No    No    No     No     No    No     Yes(a) Yes   No    No    No   No
Interaction                               No    No    No     No     Yes   Yes    Yes    Yes   Yes   Yes   Yes  Yes
Qualitative vs. Quantitative Inputs       No    No    No     No     Yes   No     Yes    Yes   Yes   Yes   Yes  No
High Exposure                             No    No    No     No     No    No     Yes    Yes   No    No    No   No
Two-Dimensional Analysis                  No    No    Yes    Yes    Yes   Yes    Yes    Yes   Yes   Yes   Yes  Yes
Ease of Implementation(b)                 No    No    Yes    Yes    Yes   Yes    No     No    No    No    No   No
Quantitative Ranking of Inputs            Yes   Yes   Yes    Yes    Yes   Yes    Yes    Yes   Yes   Yes   Yes  Yes
Measure of Statistical Significance       No    No    Yes    Yes    Yes   Yes    Yes    No    No    No    Yes  Yes
Discrimination of Important Inputs        Yes   Yes   Yes    Yes    Yes   Yes    Yes    Yes   Yes   Yes   Yes  Yes
Identify Contribution to Output Variance  No    No    No     No     No    No     No     No    Yes   Yes   No   No
Robust in Practice                        No    No    No     Yes    No    Yes    Yes    Yes   Yes   Yes   Yes  Yes

(a) Depends on proper definition of factor levels.
(b) Assumes availability of simulation results from a Monte Carlo or similar analysis.
(c) Can be based upon expert judgment.

NRSA: Nominal Range Sensitivity Analysis; DSA: Differential Sensitivity Analysis; ANOVA: Analysis of Variance; CART: Classification and Regression Tree; FAST: Fourier Amplitude Sensitivity Test; RSM: Response Surface Method; MII: Mutual Information Index


preferable. The ability to address uncertainty regarding the importance of inputs via two-

dimensional analysis can provide an indication of robustness and confidence in the efficiency

of an identified control measure.

Some methods are easier to apply in practice than others. The ease of application may

often constrain the feasibility of a method. A method is typically easier to implement when

software tools for implementing the method already exist, especially if they have user-friendly

interfaces. Of course, ease of implementation will be a function of software availability and programming skill level. For example, Pearson correlation coefficients are easy to implement for users of Crystal Ball™. However, methods such as correlation coefficients do not account for characteristics such as interactions and thresholds. Thus, even though

regression, ANOVA, CART, FAST and Sobol’s method are more difficult to apply than some

of the readily available methods such as correlation coefficients, their use may be necessary to

capture important characteristics of the model.

The ability to produce quantitative rankings and the ability to evaluate the statistical

significance of the rankings are useful to identify the relative importance of inputs and the

confidence that should be imputed to the rankings. Some methods produce more useful

measures by which to discriminate the importance among similarly ranked inputs. Finally,

although each method has a different theoretical basis, the bottom line for practical use of the

methods is whether they produce reasonable results even if there are departures from key

assumptions of the method.

Based upon these criteria and the judgments made regarding how well each method

addresses each of the criteria, there is no method that perfectly addresses all criteria, nor is


there any criterion that is addressed by every one of the methods. Thus, there are trade-offs

regarding the selection of sensitivity analysis methods.

The mathematical methods of NRSA and DSA offer few theoretical advantages, with

the exception of providing quantitative rankings and perhaps insight regarding distinctions

among the rankings. These methods can be easy to apply depending upon the structure of the

model itself. These methods were less reliable than others in providing reasonably correct

ranking of the inputs.

The statistical methods were comparable in performance in many respects. However,

ANOVA, FAST, Sobol’s method, and CART are better suited to dealing with nonlinearity,

thresholds, and interactions. Statistical methods provide a clear indication of the quantitative

rank order of importance of inputs and regarding the robustness of the rankings.

Of the statistical methods, the theoretical basis of Pearson (sample) correlation and

linear regression is the weakest with respect to application to nonlinear models. Furthermore,

correlation and linear regression analyses do not deal with thresholds. Thus, users should be

cautious about the application of Pearson (sample) correlation and linear regression.

Generally, a higher value of R² implies more confidence in the results. Rank regression and

rank correlation theoretically should perform well for monotonic models.

Statistical methods typically can be automated for two-dimensional analysis and hence can provide the probability of a given rank, estimated from a number of test samplings. However, the reliability of results based on ANOVA in the two-dimensional case

must be tested since either the definition of cut-off points for levels would vary or the sample

size in each level would vary. Correlation coefficients do not address interactions and

thresholds. Regression analysis can accommodate additional terms to account for interactions


and change-point regression can be used to handle thresholds. Thus, the reliability of ranking

obtained from correlation analysis is typically expected to be less than that of regression

analysis.


7. SELECTION OF SENSITIVITY ANALYSIS METHODS

The selection of a sensitivity analysis method appropriate for a particular model and

case study depends on the characteristics of the model and the case study. Frey et al. (2004)

suggested a series of five key questions and encouraged the analyst to find clear answers to

these questions prior to selecting a specific method. The analyst should determine whether

there are additional considerations in a particular case study beyond the issues addressed here.

The key questions are:

• What are the objectives of sensitivity analysis?

• Based upon the objectives, what information is needed from sensitivity analysis?

• What are the characteristics of the model that constrain or indicate preference

regarding method selection?

• How detailed is the analysis?

• Is the implementation of the selected sensitivity analysis method post hoc?

These questions are briefly discussed in Sections 7.1 to 7.5, respectively. Section 7.6 presents

a decision framework to assist in selecting sensitivity analysis methods based on the insights

from the key questions and comparison of the methods in Chapter 6. Section 7.7 applies the

decision framework to select a preferred (set of) sensitivity analysis method(s) for further

evaluation using the SHEDS-Pesticides model.

7.1 What Are the Objectives of Sensitivity Analysis?

Some common objectives of sensitivity analysis are to: (1) rank the importance of

model inputs (e.g., critical control points); (2) identify combinations of input values that

contribute to high exposure and/or risk scenarios; (3) identify and prioritize key sources of


variability and uncertainty; (4) identify locations of change-points in the model response; and

(5) evaluate the validity of the model.

Mathematical and statistical sensitivity analysis methods typically provide quantitative

rankings of model inputs (Frey et al., 2003). Graphical sensitivity analysis techniques

typically do not provide quantitative rankings of inputs. Some statistical methods, such as

CART and ANOVA, can be used to identify combinations of inputs that lead to high

exposures and risk (Mishra et al., 2003; Frey et al., 2003). Statistically based sensitivity

analysis methods typically provide quantitative ranking of inputs which can be used for

identification of key sources of variability and uncertainty in a probabilistic simulation. For

identification of change-points, methods that are not sensitive to the linearity assumption are

suggested as change-points are typically associated with saturation points or thresholds. Some

statistical (e.g., ANOVA, CART) and graphical methods (e.g., scatter plots) can provide

useful insights in these situations (Frey et al., 2003).

7.2 Based upon the Objectives, What Information is Needed from Sensitivity Analysis?

Examples of information that could be useful, depending upon the objectives, include:

(1) qualitative or quantitative ranking of inputs; (2) discrimination of the importance among

different inputs; (3) grouping of inputs that are of comparable importance; (4) identification of

inputs that are not important; (5) identification of change-points; and (6) identification of

inputs and ranges that produce high exposure or risk.

7.3 What are the Characteristics of the Model that Constrain or Indicate Preference Regarding Method Selection?

Specific characteristics that may constrain the application of sensitivity analysis

methods include: (1) nonlinearities; (2) interactions; (3) thresholds and saturation points; and

(4) categorical inputs (Frey, 2002).


An ideal sensitivity analysis method should be model independent (Saltelli et al.,

2000). Some methods are considered to be global and model-independent. Examples are

Sobol’s method, FAST, and CART. An analyst should select a sensitivity analysis method

that can deal with interactions. Some commonly used mathematical sensitivity analysis

methods (e.g., NRSA, DSA) are not able to capture interaction effects between model inputs.

However, many statistical methods can identify interactions between model inputs. Sensitivity

analysis methods such as ANOVA and CART that enable an analyst to compare variation of

the model output in different regions of an input domain are appropriate for identification of

thresholds and saturation points. Mathematical methods such as NRSA and DSA do not

accommodate categorical inputs. Among statistical sensitivity analysis methods, ANOVA and

CART can deal with categorical inputs. Regression analysis can deal with categorical inputs

by using dummy variables (Neter et al., 1996).
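A brief sketch of the dummy-variable device using ordinary least squares; the “season” input, its three levels, and the coefficient values are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
temp = rng.uniform(10, 30, n)           # continuous input
season = rng.integers(0, 3, n)          # categorical input with levels 0, 1, 2

# Encode the categorical input as dummy (indicator) variables so that a
# regression can rank it alongside the continuous input; level 0 serves
# as the baseline absorbed by the intercept.
d1 = (season == 1).astype(float)
d2 = (season == 2).astype(float)
y = 0.5 * temp + 4.0 * d1 + 8.0 * d2 + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), temp, d1, d2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
# beta recovers roughly [intercept, 0.5, 4.0, 8.0].
```

The fitted dummy coefficients measure the shift in the output associated with each category relative to the baseline level, which is how regression-based sensitivity rankings can accommodate categorical inputs.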

7.4 How Detailed is the Analysis?

Screening or refined sensitivity analyses can be conducted based upon the objectives

of an analysis. The choice between the two is typically governed by resource availability, the

importance of the analysis, and the stage of the assessment. Screening analyses require fewer

resources than refined analyses. A screening analysis might be used in the early stages of

model development to help refine the model and its inputs and to assess model validity.

Typical screening analysis methods include local mathematical methods such as DSA or

NRSA, as well as sample or rank correlation coefficients available in commercial statistical

software packages.

A refined analysis might typically be used in the later stages of analyses with a model

that has undergone previous screening analyses. A refined analysis is preferred if the results


will be used to make decisions regarding commitments of large amounts of resources for

further model development, data collection for model inputs, or risk management strategies.

However, the choice of a refined analysis should account for a trade-off, if any, between the

skills of the analyst, resources, anticipated needs for custom programming, and time required

or available to do the analysis.

Screening and refined analysis can be used together. The time and effort to execute a

refined analysis often depends on the number of inputs that are included in the analysis.

Therefore, it is often useful to use a screening method to identify model inputs that are not

important with regard to variation in the output of interest. The refined analysis can then be

applied to a smaller set of inputs for which there is reason to believe that the model output has

at least some sensitivity.

7.5 Is the Implementation of the Selected Sensitivity Analysis Method Post Hoc?

A sensitivity analysis method is referred to as post hoc if it is applied to previously

prepared results from probabilistic or deterministic simulation of a model but is not included

as a component of the model itself. Therefore, the application of a post hoc sensitivity

analysis method does not contribute to the process of model simulation, but it may impose

requirements regarding the type and format of data that should be stored from the simulation.

Examples of methods that can be applied post hoc include sample and rank correlations,

regression analysis, ANOVA, CART, and graphical techniques.
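The post hoc character can be made concrete with a small sketch (the file layout, variable names, and toy model are illustrative assumptions): the simulation writes its sampled inputs and outputs to CSV, and the sensitivity calculation later reads that file without re-running the model.

```python
import csv
import io
import random

# During the probabilistic simulation: persist sampled inputs and the
# output (here to an in-memory CSV; a file path would be used in practice).
random.seed(11)
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["x", "y"])
for _ in range(100):
    x = random.uniform(0.0, 1.0)
    writer.writerow([x, x ** 2])       # toy monotonic model y = x^2

# Later, post hoc: reload the stored results and rank-correlate, without
# touching or re-running the model itself.
buf.seek(0)
rows = [(float(r["x"]), float(r["y"])) for r in csv.DictReader(buf)]

def ranks(values):
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, idx in enumerate(order):
        out[idx] = rank
    return out

rx = ranks([x for x, _ in rows])
ry = ranks([y for _, y in rows])
n = len(rows)
d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
spearman = 1.0 - 6.0 * d2 / (n * (n * n - 1))
print(spearman)   # 1.0: the stored relation is strictly monotonic
```

The only obligation the method imposes on the simulation is that paired input and output samples be stored in a readable format.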

In contrast, some methods require a different simulation strategy and therefore may

have to be programmed or implemented manually or interactively with the model. Some of

this will depend upon how the model is structured. For instance, among statistical methods for

sensitivity analysis, Sobol’s method, FAST, and MII typically are not post hoc. The application of these kinds of methods requires more advance planning and coding than do the post hoc methods. Although these are powerful methods that offer advantages over commonly used methods, their widespread practical application is limited until software becomes available by which these methods can be easily incorporated into a risk model (Saltelli et al., 2004).

Figure 7-1. Decision Framework for Selecting an Appropriate Sensitivity Analysis Method.

7.6 Decision Framework to Assist in Selecting Sensitivity Analysis Methods

Figure 7-1 shows that the first step in selecting an appropriate sensitivity analysis

method is to decide on the level of detail expected from sensitivity analysis. This figure

presents two levels of sensitivity analysis: (1) screening analysis; and (2) refined analysis.

For screening analysis, the choice of sensitivity analysis method depends on the simulation


approach of a model. If a deterministic approach is selected for screening analysis,

particularly for local sensitivity analysis, then mathematical sensitivity analysis methods are

recommended. For probabilistic approaches, commonly used techniques including sample and

rank correlation coefficients are listed as appropriate methods. In contrast, if a practitioner

decides to perform a refined analysis, the choice of a method depends on the objective of

sensitivity analysis. Three objectives are listed in the decision framework for refined analysis:

(1) model refinement; (2) identifying key sources of variability and uncertainty; and (3)

identifying high exposure scenarios. For the latter objective, two classes of methods are introduced. Class A comprises methods that provide explicit measures for addressing high exposure scenarios; CART is the only method in this class that addresses high exposure scenarios directly (Frey et al., 2003; Mishra et al., 2003). Methods introduced

in Class B (including ANOVA, conditional sensitivity analysis, and scatter plots) require

judgment of an analyst for interpretation of the results and identification of inputs responsible

for high exposure scenarios. The decision framework for method selection considering the

first two objectives is illustrated in Figure 7-2.
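A single CART split illustrates why the method can flag high exposure scenarios directly; this is a hedged sketch with an assumed threshold model, not SHEDS code. The best split is the one that most reduces the output deviance (sum of squared deviations from the mean), and for a thresholded response it lands at the threshold:

```python
import random

# Toy exposure model (an assumption): high exposure occurs only when
# the first input exceeds 0.8; the second input is irrelevant.
random.seed(5)
data = []
for _ in range(400):
    x = (random.random(), random.random())
    y = 10.0 if x[0] > 0.8 else 1.0
    data.append((x, y))

def deviance(ys):
    m = sum(ys) / len(ys)
    return sum((v - m) ** 2 for v in ys)

total = deviance([y for _, y in data])
best = None   # (deviance reduction, input index, cut point)
for dim in (0, 1):
    for xc, _ in data:
        cut = xc[dim]
        left = [y for x, y in data if x[dim] <= cut]
        right = [y for x, y in data if x[dim] > cut]
        if not left or not right:
            continue
        drop = total - deviance(left) - deviance(right)
        if best is None or drop > best[0]:
            best = (drop, dim, cut)

drop, dim, cut = best
print(dim, round(cut, 3))   # splits on input 0 just below the 0.8 threshold
```

The recovered split (input index and cut point) is exactly the kind of explicit high-exposure rule that Class A methods report and Class B methods leave to analyst interpretation.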

When the objective of sensitivity analysis is either model refinement or identifying

key sources of variability and uncertainty, the choice of an appropriate method for sensitivity

analysis depends on the characteristics of the model under study. Figure 7-2 considers four

characteristics for a model: (1) non-linearity; (2) interaction; (3) categorical inputs; and (4)

threshold and saturation points.

Figure 7-2. Decision Framework for Selecting an Appropriate Sensitivity Analysis Method for Identifying Key Sources of Variability and Uncertainty and for Model Refinement.

For non-linear models, the choice of a method first depends on whether or not the model is monotonic. The decision framework is further classified based upon whether a sensitivity analysis method is post hoc or integrated with software. For models with

interactions, the selection of method depends on whether the implementation of the selected

method is post hoc. Post hoc methods are classified as Class C or D. Methods in Class C

directly take into account interactions between model inputs, while the methods in Class D

involve analyst judgment in order to address interaction between inputs. Methods appropriate

for models with categorical inputs are classified based upon post hoc or integrated

implementation. When the model under study has thresholds or saturation points, sensitivity

analysis methods are classified based on whether they explicitly address such characteristics.

Methods in Class C include ANOVA and scatter plots (SP), which explicitly handle possible

thresholds or saturation points in a model, while methods in Class D include CART and


conditional sensitivity analysis, which require analyst judgment for quantifying available

thresholds and saturation points.

7.7 Using the Decision Framework for Selection of a Preferred Set of Sensitivity Analysis Methods for Application to the SHEDS-Pesticides Model

With the help of the decision framework provided here, it is possible for an analyst to

select an appropriate sensitivity analysis method for an application to the SHEDS-Pesticides

model. Because the model has many inputs, an analyst may decide to narrow the scope of

sensitivity analysis by selecting a subset of inputs that control much of the output variation.

Figure 7-1 presents recommended methods for screening level analysis. If an analyst prefers a

deterministic approach to screening analysis and wants to account for the extreme values of

inputs, NRSA is a good method to choose. However, for the SHEDS-Pesticides model a

probabilistic approach using correlation analysis is a better choice. After selecting a subset of

inputs, an analyst may want to continue sensitivity analysis by identifying key sources of

variability and uncertainty.

The SHEDS-Pesticides model is a monotonic model that has non-linear and

interaction terms. Figure 7-2 recommends two sets of methods for sensitivity analysis,

including post hoc techniques and methods that should be implemented within the model (i.e.,

integrated with software). Five post hoc methods are recommended for sensitivity analysis: ANOVA, CART, sample regression, rank regression, and Spearman (rank) correlation

analysis. ANOVA, sample regression analysis, and rank regression analysis are appropriate

methods for application to the SHEDS-Pesticides model because they can directly respond to

the model characteristics. FAST and Sobol’s method are suggested under the category of methods that should be integrated into the model because they require a different simulation strategy.
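To show why these methods need a purpose-built sampling design, here is a hedged, self-contained sketch of a Saltelli-style estimator of first-order Sobol indices (the two-input linear test function is an assumption for illustration; this is not the SHEDS implementation). The model must be evaluated on specially paired sample matrices, which is why the calculation cannot simply be run post hoc on arbitrary stored results:

```python
import random

def sobol_first_order(model, k, n, seed=0):
    """Estimate first-order Sobol indices S_i for a model with k inputs
    on [0, 1), using the Saltelli/Jansen pairing of two sample matrices."""
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(k)] for _ in range(n)]
    B = [[rng.random() for _ in range(k)] for _ in range(n)]
    yA = [model(r) for r in A]
    yB = [model(r) for r in B]
    mean = sum(yA) / n
    var = sum((v - mean) ** 2 for v in yA) / n
    S = []
    for i in range(k):
        # AB_i: matrix A with column i replaced by column i of B
        yABi = [model(A[j][:i] + [B[j][i]] + A[j][i + 1:]) for j in range(n)]
        S.append(sum(yB[j] * (yABi[j] - yA[j]) for j in range(n)) / (n * var))
    return S

# Toy test function y = 3*x1 + x2: analytically S1 = 0.9, S2 = 0.1
S = sobol_first_order(lambda x: 3.0 * x[0] + x[1], k=2, n=20000)
print([round(s, 2) for s in S])
```

Each index requires a fresh set of model runs on the recombined matrix AB_i, so the total cost is n(k + 2) model evaluations laid out in advance.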


Based on the insight from the decision tree, and on additional considerations, seven

methods are selected for further evaluation in case studies that are documented in Volume 2.

These seven methods are FAST, Sobol’s method, Pearson correlation, Spearman correlation,

sample and rank regression, and ANOVA. FAST and Sobol’s method provide the most direct

measures of contribution to variance, whereas the R2 of regression-based methods gives a

measure of the amount of output variance explained with the regression model. This measure

can provide insight with respect to the contribution to variance as a key objective of this work.

ANOVA not only provides a similar diagnostic check with respect to the amount of output variance explained by the inputs and their interaction effects considered in the model, but also has the advantage of being model independent. CART is not included in the list of

methods for further evaluation because it does not provide a measure of contribution to

variance, but instead apportions the deviance. However, this was judged to be less relevant for

the specific objective of this work (i.e., contribution to variance). CART can be used if the

objective of sensitivity analysis changes to identifying combinations of inputs that lead to high

(or low) exposure or risk. The two correlation-based sensitivity analysis methods are widely

used in practice, and thus, are a useful benchmark for comparison.
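As a hedged illustration of the ANOVA diagnostic described above (the categorical input `surface`, its levels, and the toy response are assumptions, not SHEDS variables), the share of output variance explained by a categorical input is the between-group sum of squares divided by the total sum of squares:

```python
import random
import statistics

# Illustrative stored simulation results: output driven mainly by the
# categorical input `surface` plus a small noise term.
random.seed(3)
effects = {"carpet": 1.0, "vinyl": 4.0, "grass": 7.0}
records = []
for _ in range(300):
    level = random.choice(list(effects))
    records.append((level, effects[level] + random.gauss(0.0, 0.5)))

grand = statistics.fmean(y for _, y in records)
sst = sum((y - grand) ** 2 for _, y in records)          # total sum of squares
ssb = sum(
    sum(1 for s, _ in records if s == lvl)
    * (statistics.fmean(y for s, y in records if s == lvl) - grand) ** 2
    for lvl in effects
)                                                        # between-group SS
eta_sq = ssb / sst   # share of output variance explained by `surface`
print(round(eta_sq, 2))
```

This ratio plays the same diagnostic role for ANOVA that R² plays for the regression-based methods.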


8. SUMMARY

This chapter summarizes key insights and findings from the review of the SHEDS-Pesticides model, the review of sensitivity and uncertainty analysis methods, the discussion of criteria for selection of sensitivity analysis methods, and the identification of methods that offer the most promise for application to the SHEDS-Pesticides model.

• The main characteristics of the SHEDS-Pesticides model include: (1) non-linearity

and interaction between inputs; (2) saturation points; (3) different input types (e.g.,

continuous versus categorical); and (4) aggregation and carry-over effects.

• Typical methods for propagation of uncertainty were classified into: (1) analytical; (2) approximation; and (3) numerical techniques. The ideal method(s) for propagation of probability distributions of inputs should: (1) not depend on the functional form of the model; (2) provide insight regarding the entire range of the output distribution; (3) require relatively few iterations of the model; (4) not only propagate the probability distributions, but also provide insight regarding the sensitivity of the model output to its inputs; and (5) be available in commonly used software packages.

• Based on the evaluation of different uncertainty analysis methods, sampling-based numerical methods, including Monte Carlo simulation and LHS, are the preferred techniques for propagating probability distributions of model inputs through the model, and they satisfy most of the criteria for an ideal method.

• Sensitivity analysis methods are typically classified into categories based on their

scope, applicability, and characteristics. For example, Frey and Patil (2002) broadly

classify sensitivity analysis methods as mathematical, statistical, and graphical

methods. Alternatively, Saltelli et al. (2000) classify sensitivity analysis methods as


screening, local, and global. Examples of commonly used sensitivity analysis methods

are regression-based techniques (e.g., Helton, 1993), variance decomposition methods

(e.g., Saltelli et al., 1999), and scatter plots (e.g., Kleijnen and Helton, 1999; Saltelli et

al., 2000; Frey and Patil, 2002).

• A key criterion for sensitivity analysis, and for the risk model and analysis in general, is that it must be relevant to a decision. This means that the model output of interest must be directly related to the decision. The technical requirements of a sensitivity analysis method can be manifold and differ from one application, and from one decision context, to another. The ideal sensitivity analysis method would be

applicable to models that have characteristics such as nonlinearity, interactions,

thresholds, different input types, and temporal and spatial dimension (e.g., seasonality

or inter-annual variability). An ideal sensitivity analysis method would be model

independent. Specifically, the sensitivity analysis method should not require the

introduction of any assumptions regarding the functional form of the risk model and,

therefore, should be applicable to a wide range of different model formulations. The

method should provide not just a rank ordering of key inputs, but also some

quantitative measure of the sensitivity of each input so that it is possible to distinguish

the most strongly sensitive inputs from those with weaker influence on the selected

model output.

• Based upon the key criteria for sensitivity analysis methods applicable to the SHEDS

models and the judgments made regarding how well each method addresses each of

the criteria, there is no method that perfectly addresses all criteria, nor is there any


criterion that is addressed by every one of the methods. Thus, there are trade-offs

regarding the selection of sensitivity analysis methods.

• The selection of sensitivity analysis methods depends on factors such as objective of

the analysis, characteristics of the model under study, amount of detail expected from

sensitivity analysis, characteristics of the software used for sensitivity analysis, and

available computing resources. A series of key questions and brief discussions

regarding the insight that an analyst may gain by addressing those questions were

provided. Based on the key questions and discussions provided and insights from

comparison of selected methods, a decision framework summarizing the discussions

regarding selection of appropriate sensitivity analysis methods was introduced.

• Based upon a review of the methods with respect to criteria that pertain to the key assessment objectives and main model characteristics, seven sensitivity analysis

methods are selected for further evaluation. These methods include: Pearson and

Spearman correlation analyses, sample and rank regression analysis, ANOVA, FAST,

and Sobol’s method.
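The bullets above recommend Monte Carlo simulation and LHS for propagating input distributions; a minimal sketch of Latin hypercube sampling on the unit cube follows (the sample size and dimension are arbitrary illustrative choices). Each input range is divided into n equal-probability strata, one point is drawn per stratum, and the strata are shuffled so they pair randomly across inputs:

```python
import random

def latin_hypercube(n, k, rng):
    """n points in [0, 1)^k: each input axis is split into n equal strata,
    one jittered point is drawn per stratum, and the per-axis strata are
    shuffled so they pair randomly across inputs."""
    cols = []
    for _ in range(k):
        strata = [(i + rng.random()) / n for i in range(n)]
        rng.shuffle(strata)
        cols.append(strata)
    return [[cols[d][j] for d in range(k)] for j in range(n)]

rng = random.Random(42)
pts = latin_hypercube(10, 2, rng)
# Every interval [i/10, (i+1)/10) of every input holds exactly one point.
```

For non-uniform inputs, each coordinate would then be mapped through the inverse cumulative distribution function of that input before being run through the model.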

In Volume 2, the seven selected sensitivity analysis methods are applied to a modeling

testbed. The testbed is a simplified version of a typical SHEDS model. A case study scenario

was defined that includes multiple time scales (e.g., daily, monthly). The sensitivity analysis

results obtained from these seven methods are compared. On the basis of these quantitative

results, recommendations are made for methods that offer promise for application to SHEDS

models. The statistically-based methods are often readily available features of commonly

available software packages. However, FAST and Sobol’s method are less readily available.

Therefore, algorithms are presented for these two methods. Furthermore, recommendations


are made for additional research and development of sensitivity analysis methods for

application to the SHEDS models.


9. REFERENCES

Agro, K. E., C. A. Bradley, and N. Mittmann (1997), "Sensitivity Analysis in Health

Economic and Pharmacoeconomic Studies - an Appraisal of the Literature,"

Pharmacoeconomics, 11(1):75-88.

Anderson, E. L., and D. H. Hattis (1999), “When and How Can You Specify a Probability

Distribution When You Don’t Know Much?,” Risk Analysis, 19(1): 43-68.

Ang, A. H-S., and W. H. Tang (1975). Probability Concepts in Engineering Planning and Design. Volume 1, John Wiley and Sons, New York.

Baker, S., D. Ponniah, and B. Klubek (1999), “Survey of Risk Management in Major UK

Companies,” Journal of Food Protection, 62(9): 1050-1053.

Baniotopoulos, C. C. (1991), "A Contribution to the Sensitivity Analysis of the Sea-Bed

Structure Interaction Problem for Underwater Pipelines," Computers & Structures,

40(6):1421-1427.

Bevington, P. R., and D. K. Robinson (1992). Data Reduction and Error Analysis for the

Physical Sciences. McGraw-Hill, New York, NY.

Beck, M. B., J. R. Ravetz, and L. A. Mulkey (1997), "On the Problem of Model Validation

for Predictive Exposure Assessments," Stochastic Hydrology and Hydraulics,

11(3):229-254.


Bischof, C. H., A. Carle, G. Corliss, A. Griewank, and P. Hovland (1992). ADIFOR

Generating Derivative Codes from FORTRAN Programs. ANL-4501, Prepared by


Mathematics and Computer Science Division, Argonne National Laboratory, Argonne,

Illinois.

Bogen, K. T., and R. C. Spear (1987), “Integrating Uncertainty and Inter-individual

Variability in Environmental Risk Assessment,” Risk Analysis, 7(4): 427-436.

Breiman, L., J. H. Friedman, R. A. Olshen, and C. J. Stone (1984). Classification and

Regression Trees. Belmont, Calif.: Wadsworth International Group.

Birkes, D., and Y. Dodge (1993). Alternative Methods of Regression. John Wiley and Sons:

New York.

Burmaster, D. E., and R. H. Harris (1993), “The Magnitude of Compounding Conservatisms in Superfund Risk Assessment,” Risk Analysis, 13(2): 131-134.

Burmaster, D. E., and K. M. Thompson (1995), “Back-Calculating Cleanup Targets in

Probabilistic Risk Assessment When the Acceptability of Cancer Risk is Defined

Under Different Risk Management Policies,” Human and Ecological Risk Assessment,

1(1): 101-120.

Cacuci, D. G. (1981a), “Sensitivity Theory for Nonlinear Systems. I. Non-linear Functional

Approach,” Journal of Mathematical Physics, 22(7): 2794-2802.

Cacuci, D. G. (1981b), “Sensitivity Theory for Nonlinear Systems. II. Extensions to

Additional Classes of Response,” Journal of Mathematical Physics, 22(7): 2803-2812.

Carlucci, A., F. Napolitano, A. Girolami, and E. Monteleone (1999), "Methodological

Approach to Evaluate the Effects of Age at Slaughter and Storage Temperature and

Time on Sensory Profile of Lamb Meat," Meat Science, 52(4):391-395.

Chan, K., S. Tarantola, A. Saltelli, and I. M. Sobol (2000). Variance-Based Methods in Sensitivity Analysis. John Wiley and Sons: New York.


Cheng, T. C. E. (1991), "EPQ with Process Capability and Quality Assurance

Considerations," Journal of the Operational Research Society, 42(8):713-720.

Cohen, J. T., M. A. Lampson, and S. Bowers (1996), “The Use of Two-Stage Monte Carlo

Simulation Techniques to Characterize Variability and Uncertainty in Risk Analysis,”

Human and Ecological Risk Assessment, 2(4): 939-971.

Conover, W. J. (1980). Practical Non-Parametric Statistics. 2nd Edition. New York: John

Wiley and Sons Inc.

Critchfield, G. C., and K. E. Willard (1986), “Probabilistic Analysis of Decision Trees Using Monte Carlo Simulation,” Medical Decision Making, 6(1):85-92.

Cukier, R. I., C. M. Fortuin, K. E. Shuler, et al. (1973), “Study of the Sensitivity of Coupled Reaction Systems to Uncertainties in Rate Coefficients: Part I. Theory,” Journal of Chemical Physics, 59, 3873-3878.

Cukier, R. I., J. H. Schaibly, and K. E. Shuler (1975), “Study of the Sensitivity of Coupled

Reaction Systems to Uncertainties in Rate Coefficients: Part III Analysis of

Approximation,” Journal of Chemical Physics, 63(5), 1140-1149.

Cukier R. I. , H. B. Levine, and K. E. Shuler (1978), “Nonlinear Sensitivity Analysis of

Multi-Parameter Model Systems,” Journal of Computational Physics, 26(1): 1-42.

Cullen, A. C, and H. C. Frey (1999). Probabilistic Techniques in Exposure Assessment.

Plenum Press, New York, USA.

DeGroot, M. H. (1986). Probability and Statistics, 2nd Edition. Addison-Wesley, MA.

Devore, J. L., and R. Peck (1996). Statistics: The Exploration and Analysis of Data. 3rd

Edition. Brooks/Cole Publishing Company: London, England.


Draper, N. R. and H. Smith (1981). Applied Regression Analysis. Second Edition. John

Wiley and Sons: New York.

Durbin, J., and J. M. Evans (1975), “Techniques for Testing the Constancy of Regression

Relationships over Time,” Journal of the Royal Statistical Society, 37(1): 149-172.

Edwards, A. L. (1976). An Introduction to Linear Regression and Correlation. W. H.

Freeman, San Francisco, CA, USA.

Evans, J. S., G. M. Gray, R. L. Sielken, et al. (1994), “Use of Probabilistic Expert Judgment

in Uncertainty Analysis of Carcinogenic Potency,” Regulatory Toxicology and

Pharmacology, 20(1): 15-36.

Fiessler, B., H. J. Neumann, and R. Rackwitz (1979), “Quadratic Limit States in Structural

Reliability Theory,” Journal of Engineering Mechanics ASCE, 105: 661-676.

Finley, B., and D. Paustenbach (1994), “The Benefits of Probabilistic Exposure Assessment:

Three Case Studies Involving Contaminated Air, Water, and Soil,” Risk Analysis,

53(1): 54-57.

Fontaine, D.D., P.L. Havens, G.E. Blau, and P.M. Tillotson (1992), “The Role of Sensitivity

Analysis in Groundwater Risk Modeling for Pesticides,” Weed Technology, 6(3):716-

724.

Fraedrich, D., and A. Gildberg (2000), “A Methodological Framework for the Validation of

Predictive Simulations,” European Journal of Operational Research, 124(1): 55-62.

Frey, H. C. (1992), “Quantitative Analysis of Uncertainty and Variability in Environmental

Policy Making,” American Association for Advancement of Science, Washington DC.

Frey, H. C., A. Mokhtari, and J. Zheng (2004). Recommended practice regarding selection,

application, and interpretation of sensitivity analysis methods applied to food safety


process risk models. Prepared by North Carolina State University for Office of Risk

Assessment and Cost-Benefit Analysis, U.S. Department of Agriculture. Washington

DC. www.ce.ncsu.edu/risk.

Frey, H. C., and R. Patil (2002), “Identification and Review of Sensitivity Analysis Methods,”

Risk Analysis, 22(3): 553-577.

Gardner, R. H., and J. R. Trabalka (1985). Methods of Uncertainty Analysis for a Global

Carbon Dioxide Model. DOE/OR/21400-4, US Department of Energy, Washington,

DC.

Geldermann, J., and O. Rentz (2001), "Integrated Technique Assessment with Imprecise

Information as a Support for the Identification of Best Available Techniques (BAT),"

OR Spektrum, 23(1):137-157.

Gibbons, J. D. (1985). Nonparametric Statistical Inference. Marcel Dekker, Inc, New York

and Basel, 2nd edition.

Greenland, S., K. B. Michels, J. M. Robins, et al. (1999), “Statistical Uncertainty in Trends

and Dose-Response Relations,” American Journal of Epidemiology, 149(12): 1077-

1086.

Griewank, A. (2000). Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation. SIAM: Philadelphia.

Gwo, J. P., L. E. Toran, and M. D. Morris (1996), “Subsurface Stormflow Modeling with Sensitivity Analysis Using a Latin-Hypercube Sampling Technique,” Ground Water, 34(5): 811-818.

Hahn, G. J., and S. S. Shapiro (1967). Statistical Models in Engineering. Wiley Classics

Library, John Wiley and Sons: New York.


Harrison, K. W. (2002). Environmental and Water Resource Decision Making Under

Uncertainty. Ph.D. Dissertation, North Carolina State University.

Hallmark, S. L., R. Guensler, and I. Fomunung (2002), “Characterizing On-Road Variables that Affect Passenger Vehicle Modal Operation,” Transportation Research Part D: Transport and Environment, 7(2): 81-98.

Hamed, M. M., J. P. Conte, and P. B. Bedient (1995), “Probabilistic Screening Tool for

Ground-Water Contaminant Assessment,” Journal of Environmental Engineering,

121(11):767-775.

Hamilton, J. D. (1994). Time Series Analysis. Princeton University Press, Princeton, NJ.

Helton, J. C. (1993), "Uncertainty and Sensitivity Analysis Techniques for Use in

Performance Assessment for Radioactive-Waste Disposal," Reliability Engineering

and System Safety, 42(2-3): 327-367.

Helton, J. C., and R. J. Breeding (1993), "Evaluation of Reactor Accident Safety Goals,"

Reliability Engineering & System Safety, 39(2):129-158.

Helton, J. C., and F. J. Davis (2002), “Illustration of Sampling-Based Methods for Sensitivity

Analysis,” Reliability Engineering and System Safety, 81(1): 23-69.

Hinkley, D. V. (1969), “Inference about the Change-Point in a Sequence of Random

Variables,” Biometrika, 57(1): 1-17.

Hofer, E. (1999), “Sensitivity Analysis in the Context of Uncertainty Analysis for Computationally Intensive Models,” Computer Physics Communications, 117(1-2): 21-34.


Hoffman, F. O., and J. S. Hammonds (1994), “Propagation of Uncertainty in Risk

Assessments; The Need to Distinguish Between Uncertainty Due to Lack of

Knowledge and Uncertainty Due to Variability,” Risk Analysis, 14(5): 707-712.

Homma, T., and A. Saltelli (1996), “Importance Measures in Global Sensitivity Analysis of

Nonlinear Models,” Reliability Engineering and System Safety, 52(1): 1-17.

Hwang, D., D.W. Byun, and M. T. Odman (1997), “An Automatic Differentiation Technique

for Sensitivity Analysis of Numerical Advection Schemes in Air Quality Models,”

Atmospheric Environment, 31(6): 879-888.

Iman, R. L., and J. C. Helton (1988), “An Investigation of Sensitivity and Uncertainty Analysis Techniques for Computer Models,” Risk Analysis, 8(1): 71-90.

Iman, R. L., M. J. Shortencarier, and J. D. Johnson (1985). A FORTRAN 77 Program and

Users Guide for the Calculation of Partial Correlation and Standardized Regression

Coefficients. Report No. SAND85-0044, Sandia National Laboratories, Albuquerque,

NM.

Jelinek, F. (1970). Probabilistic Information Theory. McGraw-Hill Book Company: New

York.

Jones, R. N. (2000), “Analyzing the Risk of Climate Change Using an Irrigation Demand Model,” Climate Research, 14(2): 89-100.

Karamchandani, A., and C. A. Cornell (1992), “Sensitivity Estimation Within First and Second Order Reliability Methods,” Structural Safety, 11(2): 95-107.

Kass, G. V. (1980), “An Explanatory Technique for Investigating Large Quantities of

Categorical Data,” Applied statistics, 29(2): 119-127.


Kendall, M. G., and J. D. Gibbons (1990). Rank Correlation Methods. Fifth Edition. Charles

Griffin: London.

Kewley, R. H., M. J. Embrechts, and C. Breneman (2000), "Data Strip Mining for the Virtual

Design of Pharmaceuticals with Neural Networks," IEEE Transactions on Neural

Networks, 11(3):668-679.

Khuri, A. J., and J. A. Cornell (1987). Response Surfaces. Marcel Dekker Inc, New York,

USA.

Kleijnen, J. P. C. (1995), “Verification and Validation of Simulation-Models,” European

Journal of Operational Research, 82(1): 145-162.

Kleijnen, J. P. C., and J. C. Helton (1999), “Statistical Analyses of Scatterplots to Identify

Important Factors in Large-Scale Simulations, 1: Review and Comparison of

Techniques,” Reliability Engineering and System Safety, 65(2): 147-185.

Kleijnen, J. P. C, and R. G. Sargent (2000), “A Methodology for Fitting and Validating

Metamodels in Simulation,” European Journal of Operational Research, 120(1): 14-

29.

Koda, M., G. J. McRae, and J. H. Seinfeld (1979), “Automatic Sensitivity Analysis of Kinetic Mechanisms,” International Journal of Chemical Kinetics, 11, 427-444.

Kolev, N. I., and E. Hofer (1996), “Uncertainty and Sensitivity Analysis of a Post-Experiment Simulation of Non-Explosive Melt-Water Interaction,” Experimental Thermal and Fluid Science, 13(2): 98-116.

Krishnaiah, P. R. (1981). Analysis of Variance. Elsevier: New York.

Limat, S., M. C. Woronoff-Lemsi, E. Deconinck, et al. (2000), "Cost-Effectiveness of

Cd34(+) Dose in Peripheral Blood Progenitor Cell Transplantation for Non-Hodgkin's


Lymphoma Patients: a Single Center Study," Bone Marrow Transplantation,

25(9):997-1002.

Lindman, H. R (1974). Analysis of Variance in Complex Experimental Designs. W. H.

Freeman & Co.: San Francisco, CA.

Loader, C. R. (1996), “Change-Point Estimation Using Nonparametric Regression,” Annals of

Statistics, 24(6): 1667-1678.

Loh, W.-Y., and Y.-S. Shih (1997), “Split Selection Methods for Classification Trees,” Statistica Sinica, 7(4): 815-840.

Lu, R., Y. Luo, and J. P. Conte (1994), “Reliability Evaluation of Reinforced Concrete

Beam,” Structural Safety, 14:277-298.

Lu, Y., and S. Mohanty (2001), “Sensitivity Analysis of a Complex, Proposed Geologic

Waste Disposal System Using the Fourier Amplitude Sensitivity Test Method,”

Reliability Engineering and System Safety, 72(3), 275-291.

Mandel, J. (1969). The Statistical Analysis of Experimental Data. John Wiley and Sons, New

York.

Manheim, L. M. (1998), "Health Services Research Clinical Trials: Issues in the Evaluation

of Economic Costs and Benefits," Controlled Clinical Trials, 19(2):149-158.

McCamley, F., and R. K. Rudel (1995), "Graphical Sensitivity Analysis for Generalized

Stochastic Dominance," Journal of Agricultural and Resource Economics, 20(2):403-

403.

McKay, M. D., W. J. Conover, and R. J. Beckman (1979), “A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output from a Computer Code,” Technometrics, 21(2): 239-245.

McRae, G. J., J. W. Tilden, and J. H. Seinfeld (1982), “Global Sensitivity Analysis: A Computational Implementation of the Fourier Amplitude Sensitivity Test (FAST),” Computers and Chemical Engineering, 6(1): 15-25.

Merz, J. F., M. J. Small, and P. S. Fischbeck (1992), “Measuring Decision Sensitivity: A

Combined Monte Carlo – Logistic Regression Approach,” Medical Decision Making,

12(3):189-196.

Mishra, S., N. E. Deeds, and B. S. RamaRao (2003), “Application of Classification Trees in

the Sensitivity Analysis of Probabilistic Model Results,” Reliability Engineering and

System Safety, 79(2): 123-129.

Morgan, M. G., and M. Henrion (1990). Uncertainty: A Guide to Dealing with Uncertainty in

Quantitative Risk and Policy Analysis. New York: Cambridge University Press.

Morgan, M. G., S. C. Morris, M. Henrion, et al. (1984), “Technical Uncertainty in Quantitative Policy Analysis: A Sulfur Air Pollution Example,” Risk Analysis, 4(3): 201-213.

Montgomery, D.C. (1997). Design and Analysis of Experiments. Wiley and Sons Ltd.: New

York.

Morgan, J. N., and J. A. Sonquist (1963), “Problems in the Analysis of Survey Data, and a Proposal,” Journal of the American Statistical Association, 58(302): 415-434.

Muller, H. G. (1992), “Change-Points in Nonparametric Regression Analysis,” Annals of

Statistics, 20(4): 737-761.

Murphy, B. L. (1998), “Dealing with Uncertainty in Risk Assessment,” Human and

Ecological Risk Assessment, 4(3): 685-699.

Myers, R. H., and D. C. Montgomery (1995). Response Surface Methodology: Process and

Product Optimization Using Designed Experiments. Wiley and Sons Ltd, New York,

USA.

National Council on Radiation Protection and Measurements (1996). A Guide for Uncertainty Analysis in Dose and Risk Assessment Related to Environmental Contamination. NCRP Commentary No. 14, Bethesda, MD.

Neter, J., M. H. Kutner, C. J. Nachtsheim, et al. (1996). Applied Linear Statistical Models,

Fourth Edition. McGraw-Hill, Chicago, IL, USA.

Oblow, E. M., F. G. Pin, and R. Q. Wright (1986), “Sensitivity Analysis Using Computer

Calculus: A Nuclear Waste Isolation Application,” Nuclear Science and Engineering,

94(1): 46-56.

Oh, B. H., and I. H. Yang (2000), "Sensitivity Analysis of Time-Dependent Behavior in PSC

Box Girder Bridges," Journal of Structural Engineering-ASCE, 126(2):171-179.

Phillips, A., D. Janies, and W. Wheeler (2000), “Multiple Sequence Alignment in

Phylogenetic Analysis,” Molecular Phylogenetics and Evolution, 16(3):317-330.

Price, P. S., F. C. Chaisson, M. Koontz, et al. (2003). Construction of a Comprehensive Chemical Exposure Framework Using Person Oriented Modeling. Developed for the Exposure Technical Implementation Panel, American Chemistry Council, Contract #1388. Environmental and Occupational Health Science Institute, Rutgers University.

Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. San Mateo, CA: Morgan Kaufmann.

Roberts, C. A., S. Washington, and J. Leonard (1999), “Forecasting Dynamic Vehicular

Activity on Freeways: Bridging the Gap Between Travel Demand and Emerging

Emissions Models,” Presented at the 78th Annual Meeting of the Transportation

Research Board, Washington, D.C.

Rose, K. A., E. P. Smith, R. H. Gardner, et al. (1991), “Parameter Sensitivities, Monte Carlo

Filtering, and Model Forecasting under Uncertainty,” Journal of Forecasting, 10(1):

117-133.

Saltelli, A., and R. Bolado (1998), “An Alternative Way to Compute Fourier Amplitude

Sensitivity Test (FAST),” Computational Statistics and Data Analysis, 26(4):445-460.

Saltelli, A., S. Tarantola, and K. P. Chan (1999), “A Quantitative Model-Independent Method for Global Sensitivity Analysis of Model Output,” Technometrics, 41(1): 39-56.

Saltelli, A., K. Chan, and M. Scott (2000). Sensitivity Analysis. Wiley Series in Probability and Statistics. John Wiley & Sons, New York.

Saltelli, A. (2002), “Making Best Use of Model Evaluations to Compute Sensitivity Indices,”

Computer Physics Communication, 145(2): 280-297.

Saltelli, A., S. Tarantola, F. Campolongo, et al. (2004). Sensitivity Analysis in Practice. New

York, NY: Wiley.

Schaibly, J. H., and K. E. Shuler (1973), “Study of the Sensitivity of Coupled Reaction

Systems to Uncertainties in Rate Coefficients. Part II, Applications,” Journal of

Chemical Physics, 59(6): 3879-3888.

Sen, A., and M. Srivastava (1990). Regression Analysis: Theory, Methods, and Applications.

Springer-Verlag: New York.

Siegel, S., and N. J. Castellan (1988). Nonparametric Statistics for the Behavioural Sciences

(2nd edn), McGraw-Hill, New York.

Sobol, I. M. (1993), “Sensitivity Estimates for Nonlinear Mathematical Models,” Mathematical Modelling and Computational Experiments, 1(4): 407-414.

Steel, R. G., J. H. Torrie, and D. A. Dickey (1997). Principles and Procedures of Statistics: A Biometrical Approach. 3rd Edition. WCB McGraw-Hill, Boston, MA, USA.

Stiber, N. A., M. Pantazidou, and M. J. Small (1999), “Expert System Methodology for

Evaluating Reductive Dechlorination at TCE Sites,” Environmental Science and

Technology, 33(17):3012-3020.

U.S. EPA (1996), “Summary Report for the Workshop on Monte Carlo Analysis,” EPA/630/R-96/010, Risk Assessment Forum, Washington, DC.

U.S. EPA (1997), “Guiding Principles for Monte Carlo Analysis,” EPA/630/R-97/001, Risk Assessment Forum, Washington, DC.

U.S. EPA (2000), “First-Generation Multimedia, Multipathway Exposure Models,” NERL

Research Abstract, National Exposure Research Laboratory, Research Triangle Park,

NC, September 2000.

Vidmar, T. J., and J.W. McKean (1996), “A Monte Carlo Study of Robust and Least Squares

Response Surface Methods,” Journal of Statistical Computation and Simulation,

54(1):1-18.

Ward, M. P., and T.E. Carpenter (1996), "Simulation Modeling of the Effect of Climatic

Factors on Bluetongue Virus Infection in Australian Cattle Herds - 1. Model

Formulation, Verification and Validation," Preventive Veterinary Medicine, 27(1-2):1-

12.

Whiting, W. B., T. M. Tong, and M. E. Reed (1993), “Effect of Uncertainties in

Thermodynamic Data and Model Parameters on Calculated Process Performances,”

Industrial and Engineering Chemistry Research, 32(7): 1367-1371.

Wilson, R., and E. A. Crouch (1981), “Regulations and Carcinogens,” Risk Analysis, 1(1): 47-57.

Winer, B. J., D. R. Brown, and K. M. Michels (1991). Statistical Principles in Experimental

Design. Third Edition. McGraw-Hill Inc.: New York.

Zartarian, V. G., H. Ozkaynak, J. M. Burke, et al. (2000), “A Modeling Framework for

Estimating Children’s Residential Exposure and Dose to Chlorpyrifos via Dermal

Residue Contact and Non-Dietary Ingestion,” Environmental Health Perspectives, 108(4): 505-514.