Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep...

139
Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School of Civil and Environmental Engineering and School of Operations Research and Industrial Engineering Cornell University South Florida Water District Afternoon Meeting Sept. 24, 3003
  • date post

    19-Dec-2015
  • Category

    Documents

  • view

    219
  • download

    0

Transcript of Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep...

Page 1: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Calibration, Sensitivity Analysis and Uncertainty

Analysis

Prof. Christine Shoemaker

Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman

School of Civil and Environmental Engineering and School of Operations Research and Industrial Engineering

Cornell University

South Florida Water District Afternoon MeetingSept. 24, 3003

Page 2: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Models Help Extract Information

Point Data

from monitoring or experiments at limited number of points in space and time

Model

that describes temporal and spatial connections

Forecasts (with statistical representation)

Comparison of Alternative Management Options

Understanding Processes

from Point Data to Processes Continuous in Space and Time

Page 3: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Models Help Extract Information from

Data___________________

Point Data

from monitoring or experiments at limited number of points in space and time

Model

that describes temporal and spatial connections

Forecasts (with statistical representation)

Comparison of Alternative Management Options

Understanding Processes

Model Outputs

for Multiple Outputs

Page 4: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Steps in Modeling

• Calibration—selecting parameter values within acceptable limits to fit the data as well as possible

• Validation—applying the model and calibrated parameters to independent data set

• Sensitivity Analysis—assess the impact of changes in uncertain parameter values on model output

• Uncertainty Analysis-assessing the range of model outcomes likely given uncertainty in parameters, model error, and exogenous factors like weather.

Page 5: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Methods and Applications

• We will discuss a general methodology for calibration, sensitivity analysis and uncertainty analysis that can be applied to many types of computationally expensive models.

• We will present numerical examples for two “real life” examples: a watershed and a groundwater remediation.

Page 6: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Part I

• Sensitivity Analysis for models with many parameters

Page 7: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Cannonsville Watershed• Cannonsville Reservoir Basin – agricultural basin• Supply of New York City drinking water • To avoid $8 billion water filtration plant, need

model analysis to help manage phosphorous

1200 km2

Watershed subject to economic constraints if P violations of TMDL.

Page 8: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Monitoring Stations

#S

#S

#S

#S#S

$T

$T

$T$T

$T

%U

%U

Trout Creek

W. Br. Delaware R. @ Walton

Little Delaware R.

W. Br. Delaware @ Delhi

Beerston

Town Brook

Town Brook

Subwatersheds BoundariesRivers and Streams

#S USGS Flow Gauges$T Climate Stations%U Sediment Monitoring Stations

5 0 5 10 Kilometers

N

There are over 20,000 data for this watershed

Page 9: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Questions

• Using all this data, can we develop a model that is a useful forecasting tool to assess the impact of weather and phosphorous management actions on future loading the reservoir?

• What phosphorous management strategies should be undertaken if any?

Page 10: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Methodology for Sensitivity Analysis of a Model with Many Parameters: Application to Cannonsville Basin

• Joint work with Jennifer Benaman (Cornell Ph.D. in Civil and Environmental Engineering, 2003)

• Funded by EPA Star Fellowship

Page 11: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Sensitivity Analysis with Many Parameters

• Sensitivity Analysis measures the change in model output associated with the change (perturbation) in model input (e.g. in parameter values).

• Purposes include:– To help select which parameters should be adjusted

in a calibration and which can be left at default values

Page 12: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Sensitivity Analysis with Many Parameters

– To prioritize additional data collection, and

– To estimate potential errors in model forecasts that could be due to parameter

value errors.• Sensitivity Analysis and calibration are

difficult with a large number of parameters.

Page 13: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Questions

• Can we develop a sensitivity analysis method that is:– robust (doesn’t depend strongly on our

assumptions)?– computationally efficient for a large

number of parameters (hundreds)?– allows us to consider many different

model outputs simultaneously?– .

Page 14: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

• 160 parameters– 35 basinwide– 10 vary by land use (10 x 5 land uses)– 7 vary by soil (7 x 10 soil types)– 2 additional for corn and hay– 1 additional for pasture

• Ranges obtained from literature, databases, and SWAT User’s Manual

Choose Parameters Establish ParameterRanges

Choose OutputVariables of Concern

Application to Cannonsville Watershed

Page 15: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Monitoring Stations

#S

#S

#S

#S#S

$T

$T

$T$T

$T

%U

%U

Trout Creek

W. Br. Delaware R. @ Walton

Little Delaware R.

W. Br. Delaware @ Delhi

Beerston

Town Brook

Town Brook

Subwatersheds BoundariesRivers and Streams

#S USGS Flow Gauges$T Climate Stations%U Sediment Monitoring Stations

5 0 5 10 Kilometers

N

Page 16: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Output Variables of Concern• Basinwide (average annual from 1994-1998)

– Surface water runoff– Snowmelt– Groundwater flow– Evapotranspiration– Sediment yield

• Location in-stream (monthly average over entire simulation)– Flow @ Beerston– Flow @ Trout Creek– Flow @ Town Brook– Flow @ Little Delaware River– Sediment load @ Beerston– Sediment load @ Town Brook

Choose Parameters Establish ParameterRanges

Choose OutputVariables of Concern

Page 17: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Independent Sensitivity Index, SI (after McCuen, 1973)

o

io

o

io

P

PPO

OO

iSI

• The maximum in magnitude for each parameter for a given output variable was taken as the SI

• When the initial value of a parameter was zero, this SI could not be calculated so we have a second index.

PoMin MaxXPi

XPi

RunModel

RunModel

Oi OiOo

(Calibration)

Output

Parameter

Page 18: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Combined Sensitivity

Index, comSI• Summed for all output variables:

jallfor

kjjk SIcomSI ,

where,

comSIk = Combined Sensitivity Index for a parameter k over all output variables, j

j = Weighting given to output variable, j

Where j = 1.0

Sij,k = Sensitivity index for parameter k, output variable j (could be one of two forms)

The larger comSIk, the more sensitive the model is to the change of parameter k

Page 19: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

A Example of comSI DistributionC

omS

I k1.5

1.0

0.5

0.0

parameters ranked by sensitivity

Page 20: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Four Example Weightings for Combined Sensitivity-comSI

A. All equal j = 1/11 for all output variables

B. Focus on Beerston [main point for quality data]j = 0.125 for sediment load and flow @ Beerston

j = 0.125 for basinwide sediment yield and surface runoff

j = 0.0714 for all others

C. Focus on spatial distributionj = 0.0 for all basinwide output variables

j = subwatershed area of gauge/total area considered in sensitivity analysis

D. Focus on basinwide managementj = 0.0 for all gauge output variables

j = 0.5 for basinwide sediment yield

j = 0.5/4 for all other basinwide parameters

Page 21: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

What We Have….

For each weighting method (A, B, C, or D):

2 parameter change methods (midpoint and +/- 25%)2 comSI’s (SI and modSI)

4 evaluations of each of the 160 parameters analyzed

Parameters ranked from largest comSIk to smallest

x

Rank ParametersBy Index

We compute top 20 parameters for each ranking and display percentage (out of 4 evaluations)

Page 22: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Final Results

Weighting Method A Weighting Method B Weighting Method C Weighting Method D

All Equal Weights Focus on Beerston Focus on CalibrationFocus on Basinwide

ManagementAPMBASIN 100 100 100 100BIOMIXBASIN 100 100 100 100CN2CSIL 100 100 100 100CN2FRSD 100 100 100 100CN2PAST 100 100 100 100RSDCOPAST 100 100 100 100SLSUBBSNBASIN 100 100 100 100SMFMNBASIN 100 100 100 100T_BASEPAST 100 100 100 100T_OPTPAST 100 100 100 100USLEKNY129 100 100 100 100ESCONY129 100 75 75 100SMTMPBASIN 100 75 75 100LAT_SEDBASIN 100 50 100 100CN2HAY 75 75 75 75ESCONY132 75 75 75 50GWQMNBASIN 75 75 75 75TIMPBASIN 75 50 75 75BIO_MINPAST 75 50 50 75ROCKNY132 75 25 50 50REVAPMNBASIN 50 50 50 75ROCKNY129 50 25 50 25USLEPCSIL 25 25 50 25HVSTICSIL 25 25 25 50USLECPAST 25 25 25 25SMFMXBASIN 25 0 0 50GSIPAST 0 0 25 0ROCKNY026 0 0 25 0

Percentage of times in the 'Top 20'

These are in top 20 for ALL cases

These are in top 20 most of the time

Page 23: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.
Page 24: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Computational Issues

• We have a robust method for determining importance and sensitivity of parameters.

• An advantage is that the number of model simulations is independent of the number of output variables, sensitivity indices, or weighting factors considered in the combined sensitivity analysis. (Almost no extra computation is required to do many output variables, indicies or weightings.)

• The number of simulations is simply the number required to do a single (non robust) univariate sensitivity analysis multiplied by the number of perturbation methods (=2 in this example).

Page 25: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Next Steps• Once the most important parameters have

been identified we can extend the analysis to more detailed analyses including:– Multivariate sensitivity analysis (changes in more

than one parameter at a time)– Uncertainty Analysis (e.g. GLUE)

• Both of these analyses above are highly computationally demanding and can hence only be done with a small number of parameters.

• The (univariate) sensitivity analysis done here can identify the small number of parameters on which these analyses should be focused.

Page 26: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Questions

• Can we develop a sensitivity analysis method that is:

– robust (doesn’t depend strongly on our assumptions)?

– computationally efficient for a large number of parameters (hundreds)?

– allows us to consider many different model outputs simultaneously?

– Yes, the results for Cannonsville indicate this is possible with this methodology.

– Models with longer simulation times require more total simulation times or fewer parameters.

Page 27: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Part II: Use of Response Surface Methods in Non-Convex Optimization, Calibration and Uncertainty Analysis

• Joint work with– Pradeep Mugunthan (PhD Candidate in Civil and

Environmental Engineering)– Rommel Regis (Postdoctoral Fellow with PhD in

Operations Research)

– Funded by three National Science Foundaton (NSF) Projects

Page 28: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Automatic Calibration

• We would like to find the set of parameter values (decision variables) such that– the calibration error (objective function) is

minimized – subject to constraints on the allowable range

of the parameter values.

This is an Optimization Problem.

It can be a global optimization problem.

Page 29: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

NSF Project 1: Function Approximation Algorithms for Environment Analysis with

Application to Bioremediation of Chlorinated Ethenes

• Title: “Improving Calibration, Sensitivity and Uncertainty Analysis of Data-Based Models of the Environment”,

• The project is funded by the NSF Environmental Engineering Program.

• The following slides will discuss the application of these concepts to uncertainty analysis.

Page 30: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

“Real World Problem”:Engineered Dechlorination by Injection of Hydrogen

Donor and Extraction

We have developed a user friendly transport model of engineered anaerobic degradation of chlorinated ethenes that models chemical and biological species and utilizes MT3D and RT3D.

This model is the application for the function approximation research.

Page 31: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Computational Effort for Trial and Error (Manual) Calibration

• Assume that you have P parameters and you want to consider N levels of each.

• Then the total number of combinations of possible sets of parameter is NP.

• So with 10 parameters, considering only 2 values each (very crude evaluation), there are 21024 possible combinations, too many to evaluate all of them for computationally expensive function.

• With 8 parameters considering a more reasonable 10 values each gives 100 million possible combinations of parameters!

• With so many possibilities it is hard to find with trial and error good solutions with few (e.g. 100) function evaluations.

Page 32: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Optimization

• Because our model is computationally expensive, we need to find a better way than trial and error to get a good calibration set of parameters.

• Optimization can be used to efficiently search for a “best” solution.

• We have developed optimization methods that are designed for computationally expensive functions.

Page 33: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Optimization

• Our goal is to find the

minimum of f(x)

where x є D• We want to do very few evaluations of f(x)

because it is “costly to evaluate.

This can be a measure of error between model prediction and observations

X can be parameter values

Page 34: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Function ApproximationMethods

• A nonlinear function approximation R(x) is a continuous nonlinear multivariate approximation to f(x).

• R(x) has also been called a “response surface” or a “surrogate model”.

• We use radial basis functions for our response surface, but other methods for non-convex surfaces could also be used.

Page 35: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Why Use Function Approximation Methods?

• A function approximation R(x) can be used as part of an efficient parallel optimization algorithm in order to reduce the number of points at which we evaluate f(x), and thereby significantly reduce computational cost.

• Our function approximation algorithm searches for the global minimum.

Page 36: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Goal: Derivative-Free Optimization of Costly, Black Box, Nonconvex Functions

• For some complex functions derivatives are unavailable (because they are infeasible to compute or they cannot be accurately computed sufficiently quickly).

• Inaccurate derivatives lead to inefficient gradient-based optimization and some important functions are not differentiable.

• Our method does not require f’(x)

Page 37: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Goal: Derivative-Free Optimization of Costly, Black Box, Nonconvex Functions

• Costly functions f(x) require a substantial amount of computation (minutes, hours, days) to evaluate the function once.

• Examples of costly functions include simulation models or codes that solve systems of nonlinear partial differential equation.

• Our method seeks to minimize number of costly function evaluations.

Page 38: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Goal: Derivative-free Optimization of Costly, Black Box, Nonconvex Functions

• Gradient-based optimization methods stop at local minima instead of searching further for the global minimum.

• For black box functions, we don’t know if the function is nonconvex or convex.

• Our method is a good global optimization methods for black box and other nonconvex functions.

• The method can be easily connected to different simulation models (as are heuristic methods like GAs).

Page 39: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Global versus Local Minima

F(x)

X (parameter value)

Local minimum

Global minimum

Many optimization methods only find one local minimum.

We want a method that finds the global minimum.

Page 40: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Experimental Design with Symmetric Latin Hypercube (SLHD)

• To fit the first function approximation we need to have evaluated the function at several points.

• We use a symmetric Latin Hypercube (SLHD) to pick these initial points.

• The number of points we evaluate in the SLHD is (d+1)(d+2)/2, where d is the number of parameters (decision variables).

Page 41: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

x (parameter value-one dimensional example)

Objective

Function

f(x)

measure of error

One Dimensional Example of Experimental Design to Obtain Initial Function Approximation

Costly Function Evaluation (e.g. over .5 hour CPU time for one evaluation).

Page 42: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

x (parameters)

f(x)

Function Approximation with Initial Points from Experimental Design

In real applications x is multidimensional since there are many parameters (e.g. 10).

Page 43: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

x (parameter value)

f(x)

Update in Function Approximation with New Evaluation

Update done in each iteration for function approximation for each algorithm expert.

Function Approximation is a guess of the function value of f(x) for all x.

new

Page 44: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Outline of Function Approximation Optimization

• 1. Use a “space filling” experimental design (e.g. SLHD) to select a limited number of evaluation points.

• 2. Make an approximation of the function (with Radial Basis Function splines) based on experimental design points.

• 3. Use our algorithm to select next function point for evaluation based on FA and location of previously evaluated points

• 4. Construct a new function approximation that incorporates the newly evaluated point and all prior points.

• 5. Stop if reach maximum number of iterations. Otherwise go to Step 3

Page 45: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Use of Derivatives

• We use the gradient-based methods only on the function approximations R(x) (for which accurate derivatives are inexpensive to compute).

• We do not try to compute gradients/derivatives for the underlying costly function f(x).

Page 46: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Our RBF Algorithm

• Our paper on RBF optimization algorithm has will appear soon in Jn. of Global Optimization .

• The following graphs show a related RBF method called “Our RBF” as well as an earlier RBF optimization suggested by Gutmann (2000) in Jn. of Global Optimization called “Gutmann RBF”.

Page 47: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Comparison of RBF Methods on a 14-dimensional Schoen Test Function (Average of 10 trials)

120 140 160 180 200 220 240 260 280 30015

20

25

30

35

40

45

number of function evaluations

mean o

f th

e b

est

valu

e in 3

0 r

uns

Comparison of RBF Methods on a 14-dimensional Schoen Test Function

ExpRBF-LGutmannRBFGreedyRBF

Objective Function

Our RBF

Number of Function Evaluations

Page 48: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Comparison of RBF Methods on a 12-dimensional Groundwater Aerobic Bioremediation Problem ( a PDE system)

(Average of 10 trials)

80 100 120 140 160 180 200400

500

600

700

800

900

1000

1100

number of function evaluations

mean o

f th

e b

est

valu

e in 1

0 r

uns

Comparison of RBF Methods on a 12-dimensional Groundwater Bioremediation Problem

ExpRBF-LGutmannRBFGreedyRBF

Objective Function

Number of Function Evaluations

Our RBF

Page 49: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

The following results are from:

NSF Project 1: Function Approximation Algorithms for Environment Analysis with

Application to Bioremediation of Chlorinated Ethenes

• Title: “Improving Calibration, Sensitivity and Uncertainty Analysis of Data-Based Models of the Environment”,

• The project is funded by the NSF Environmental Engineering Program.

Page 50: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Now a real costly function: DECHLOR: Transport Model of

Anaerobic Bioremediation of Chlorinated Ethene

• This model was originally developed by Willis and Shoemaker based on kinetics equations by Fennell and Gossett.

• This model will be our “costly” function in the optimization.

• Model based on data from a field site in California.

Page 51: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Engineered Dechlorination by Injection of Hydrogen Donor and Extraction

Injected Donor promotes degradation by providing hydrogen. Forecasting effectiveness requires transport model that predicts movement of donor and species through space over time and effect of pumping and injection of donor.

Page 52: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Dechlorination: PCE to ETH

Tetrachloroethene (PCE)

Trichloroethene (TCE)

cis-1,2-Dichloroethene (cDCE)

Vinyl Chloride (VC)

Ethene (ETH)

Dechlorination 1

Dechlorination 2

Dechlorination 3

Dechlorination 4

Cl

ClClC C

H

Cl

ClClC C

Cl

ClClC C

H H

ClC C

H H

H

C CH H

HH

Complicated Chemistry so Difficult Model to Compute and Analyze

All compounds except Ethene are very toxic

Page 53: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Complex model: 18 species at each of thousands of nodes of finite difference model

Butyrate

Propionate

H2

Acetate

LactateLac2Ace

PCE DCETCE VC Ethene

Dechlorinator

Lac2Prop

But2AceProp2Ace

Methane

But2Ace

Hyd2Meth

Chlorinated.Ethenes

Donors

Page 54: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Example of Objective Function for Optimization of Chlorinated Ethene Model

2 T

1 t

I

1 i

s tij

o tij

J

1 j ) Y (Y SSE

where, SSE is the sum of squared errors between observed and simulated chlorinated ethenes

is the observed molar concentration of species j at time t, location i

is the simulated molar concentration of species j at time t, location i

t = 1 to T represent time points at which measured data is available

j = 1 to J represents PCE, TCE, DCE, VC and ethene in that order

i = 1 to I is a set monitoring locations

otijY

stijY

Observation Model

Page 55: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Algorithms Used for Comparison of Optimization Performance on Calibration

• Stochastic Greedy Algorithm– Neighborhood defined to make search global– Neighbors generated from triangular distribution around current

solution. Moves only to a better solution.• Evolutionary Algorithms

– Derandomized evolution strategy DES with lambda = 10 and b1 = 1/n and b2 = 1/n0.5 (Ostermeier et al. 1992)

– Binary or Real Genetic algorithm GA, population size 10, one point cross-over, mutation probability 0.1, crossover probability 1

• RBF Function Approximation Algorithms– RBF Gutmann- radial basis function approach, with cycle length

five, SLH space filling designRBF-Cornell radial basis function approach.

• FMINCON– derivative based optimizer in Matlab with numerical derivatives

• 10 trials of 100 function evaluations were performed for heuristic and function approximation algorithms for comparison

Page 56: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Comparison of algorithms for NS as objective function on a hypothetical problem

-1

4

9

14

19

30 50 70 90

Number of function evaluations

-(Ave

rage

NS)

FMINCON

RBF-CORNELL

RBF-GUT

FMINCON+RBF

DES

RealGA

BinaryGA

Average is based on 10 trials. The best possible value for –NS is –1. 28 Experimental design evaluations done.

ours

Lower curve is better

Page 57: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Boxplot comparing best objective value (CNS) produced by the algorithms in each trial over 10 trials

ours

average

outlier

Page 58: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Conclusions• Optimizing costly functions is typically done only

once.• The purpose for our examination of multiple

trials is to examine how well one is likely to do if you do solve the problem only once.

• Hence we want the method that has both the smallest Mean objective function value and the smallest Variance.

• Our RBF has both the smallest Mean and the smallest Variance.

• The second best method is Gutmann RBF, so RBF methods seem very good in general.

Page 59: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Empirical Cummulative Distribution Plot – CNS Objective

Curves to the left are best and stochastically dominate (higher probability of lower objective function value)’

ours

Page 60: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Conclusions• Optimizing costly functions is typically done only

once.• The purpose for our examination of multiple

trials is to examine how well one is likely to do if you do solve the problem only once.

• Hence we want the method that has both the smallest Mean objective function value and the smallest Variance.

• Our RBF has both the smallest Mean and the smallest Variance.

• The second best method is Gutmann RBF, so RBF methods seem very good in general.

Page 61: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Alameda Field Data

• The next step was to work with a real field site. • We obtained data from a DOD field site studied

by a group (including Alleman, Morse, Gossett, and Fennell).

• Running the simulation model takes about three hours for one run of the chlorinated ethene model at this site because of the nonlinearities in the kinetics equations.

Page 62: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Site Layout

Page 63: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Numerical Set-up for Simulation • Finite difference grid

with over 5300 nodes• 18 species and about

6000(?) time steps equals almost 8 billion variables

• Each simulation takes between 2-3 hours

• Optimization attempted for calibration of 8 parameters (6 biokinetic and 2 biological)

65 ft (22m)

No flowNo flow

h = 21’ (6.4 m)

flow

Numerical set-up for simulation

Page 64: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Range of objective values for SSE objective function at Alameda field site - Mean, min and max are shown for each

algorithm

150000

250000

350000

450000

550000

650000

DES FA-Gutmann FA-RS FMINCON

SS

E (m M

)2

ours

gradient

Page 65: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Conclusions on RBF Optimization of Calibration

• Radial Basis Function Approximation Methods can be used effectively to find optimal solutions of costly functions.

• “Our RBF” performed substantially better than the previous RBF method by Gutmann on the difficult chlorinated ethene remediation problem, especially because our RBF is robust (small variance).

• Both Genetic algorithms and derivative-based search did very poorly.

• The two RBF methods did much better on the Alameda field data problem than other methods.

Page 66: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

However,300 hours is a long time to wait!

Solution: Parallel Algorithms• We would like to be able to speed up

calculations for costly functions by using parallel computers.

• To get a good speed up on a parallel computer, you need an algorithm that parallelizes efficiently.

• We are developing such an algorithm through a second NSF grant (from Computer and Information Science Directorate).

Page 67: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Part II: Uncertainty Analysis

• Modellers have discovered that there is often more than one set of parameters that gives and “adequate” fit to the data.

• One approach to assessing uncertainty associated with a model output is to look at the weighted mean and the variability of the output associated all the sets of parameters that give an equally good fit.

Page 68: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

x (parameters)

f(x)

More than one parameter value might give acceptable goodness of fit

If we impose a “filter” and allow only the acceptable points, then only the black points are incorporated in the analysis.

acceptable

Page 69: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Uncertainty Analysis: GLUE Approach

• GLUE is a methodology (by Bevins and co-workers) used largely in watersheds (where computation times are not long).

Page 70: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Uncertainty Analysis via GLUE: Dots are Model Simulations of Parameter Combinations Chosen at

Random (Two Parameter Example)

parameter 1

parameter 2

parameter combination that gives R2 greater than .75parameter combination that gives R2 less than .75

Page 71: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Glue Methodology (used mostly in watershed modeling)

• Step 1: Select combinations of parameter values at random and simulate model for each combination.

• Step 2:compare goodness of fit (e.g. R 2) for each model simulation compared with data

• Step 3: Simulate model at acceptable points and weight output to determine variability characteristics of model output (e.g. mean and variance of amount of contamination remaining after N years)

Page 72: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Problems with GLUE Methodology

• We applied GLUE to the Cannonsville Watershed SWAT model predictions for sediment (a very hard quantity to model).

• We did 20,000 Monte Carlo runs (which took about three weeks of computer time).

• Of the 20,000 runs only two runs were within the allowable R2. (only two )

• This does not adequately characterize uncertainty, and it is not computationally feasible to make more runs.

Page 73: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

• For computationally expensive models like our groundwater problem or your Everglades problem, it is not feasible to run the model 20,000 times!

Page 74: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

• Hence GLUE has the problem that it finds very few samples within an acceptable level (filter) if the filter is fairly stringent.

Page 75: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Our Method 1(RS)

• Step 1: Use function approximation method to select parameter combinations and simulate model for each combination.

• Step 2:compare goodness of fit (e.g. R 2) for each model simulation compared with data

• Step 3: Determine statistical characteristics of model output (e.g. mean and variance of amount of contamination remaining after N years) at all acceptable points.

Page 76: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Uncertainty Analysis via our Function Approximation Approach: Search focuses on the acceptable region;

hence more acceptable solutions

parameter 1

parameter 2

parameter combination that gives R2 greater than .75parameter combination that gives R2 less than .75

Page 77: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Comparison to Expected distribution with GLUE (Two Parameter Example)

parameter 1

parameter 2

parameter combination that gives R2 greater than .75parameter combination that gives R2 less than .75

Page 78: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Groundwater Example Used for Numerical Comparsion with GLUE

• 2-D confined aquifer contaminated with chlorinated ethenes.

• Same PDE equations as earlier field case• 400m long, 100m wide• Modeled using a coarse 10mx10m finite difference grid

– Simulation time for 6 month calibration period was approximately ¾ minute in a Pentium4® 3GHz computer

– Typical simulation time for long-term forecast scenarios is of the order of several hours to days

Page 79: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Calibration Problem

• Calibration of 3 parameters were considered – 2 biological parameters and one biokinetic parameter

• Synthetic observations were generated for a period of 6 months using a known set of parameters

• Optimal calibration was attempted using a response surface (RS) optimization method (Regis and Shoemaker, 2004)

• GLUE based calibration/uncertainty assessment was also performed for comparison

Page 80: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Output Definition

• Output: The total moles of toxic compounds (chlorinated ethenes) remaining in aquifer at final time period. (This cannot be measured but can be estimated through model.)

• Uncertainty in the Output was analyzed using GLUE and RS based methods

Page 81: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Goodness-of-fit Measure

• Nash-Sutcliffe Efficiency Measure (Nash and Sutcliffe, 1970)

• Optimization algorithm was setup to minimize CNS = 1-NS, so that a CNS of zero is best

S

i

tj

avi

obstji

tj

obstji

simtji

CC

CC

SNS

12

,,,

2

,,,,,

11

1 NS

Page 82: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Uncertainty Estimates for Output Total Moles of Chlorinated Ethenes Remaining

Bounds obtained using a filter of 0.01 for CNS

G2000G1000G500RS200 RSG20k TRUE141.00

142.00

143.00

144.00

145.00

146.00

147.00

To

tal

mo

les

of

chlo

rin

ated

eth

enes

5

6 12126

35

Our Method 1 with 200 function evaluations

Page 83: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Uncertainty Estimates for Output Total Moles of Chlorinated Ethenes Remaining

Bounds obtained using a filter of 0.01 for CNS

G2000G1000G500RS200 RSG20k TRUE141.00

142.00

143.00

144.00

145.00

146.00

147.00

To

tal

mo

les

of

chlo

rin

ated

eth

enes

5

6 12126

35

GLUE 1 with 500 function evaluations

Page 84: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Uncertainty Estimates for Total Moles of Chlorinated Ethenes Insitu

Bounds obtained using a filter of 0.01 for CNS

G2000G1000G500RS200 RSG20k TRUE141.00

142.00

143.00

144.00

145.00

146.00

147.00

To

tal

mo

les

of

chlo

rin

ated

eth

enes

5

6 12126

35

Is the mean, range is 99% of data

This is the true answer

Page 85: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Uncertainty Estimates for Total Moles of Chlorinated Ethenes Insitu

Bounds obtained using a filter of 0.01 for CNS

G2000G1000G500RS200 RSG20k TRUE141.00

142.00

143.00

144.00

145.00

146.00

147.00

To

tal

mo

les

of

chlo

rin

ated

eth

enes

5

6 12126

35

Number of points after applying filter

RS200 uses 200 function evaluations. G200 found 0 solutions (none) for this filter. GS500 found only 5 solutions.

This is the true answer

Page 86: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Uncertainty Estimates for Total Moles of Chlorinated Ethenes Insitu

Bounds obtained using a filter of 0.01 for CNS

G2000G1000G500RS200 RSG20k TRUE141.00

142.00

143.00

144.00

145.00

146.00

147.00

To

tal

mo

les

of

chlo

rin

ated

eth

enes

5

6 12126

35

Number of points after applying filter

Is the mean, range is 99% of data

This is the true answer

The mean estimate is almost perfect for our RS method and is far off for GLUE method with 250% as many points evaluated !

Page 87: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Uncertainty Estimates for Total Moles of Chlorinated Ethenes Insitu

Bounds obtained using a filter of 0.01 for CNS

G2000G1000G500RS200 RSG20k TRUE141.00

142.00

143.00

144.00

145.00

146.00

147.00

To

tal

mo

les

of

chlo

rin

ated

eth

enes

5

6 12126

35

Number of points after applying filter

Even with 2000 function evaluations, GLUE has a much worse mean than our RS method with only 1/10 as many function evaluations.

Page 88: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Our Method 2(RSG)

• Step 1: Same as in Method 1• Step Construct a function approximation surface of the

output• Step 3: Make a large number of samples from function

approximation. Do further function evalutions if function approximation is negative and refit function approximation.

• Step 4: Filter out points that are not acceptable and compute statistics

• Determine statistical characteristics of model output (e.g. mean and variance of amount of contamination remaining after N years) at all acceptable points.

Page 89: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Uncertainty Estimates for Total Moles of Chlorinated Ethenes Insitu

Bounds obtained using a filter of 0.01 for CNS

G2000G1000G500RS200 RSG20k TRUE141.00

142.00

143.00

144.00

145.00

146.00

147.00

To

tal

mo

les

of

chlo

rin

ated

eth

enes

5

6 12126

35

Number of points after applying filter

Our Method 2 with 200 function evaluations and 20,000 samples from the response surface

Page 90: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Difference Between Method 1 and Method 2

The uncertainty analysis in Method 1 is based only on actual function evaluations.

The uncertainty analysis in Method 2 is based on a very large number of samples from the function approximation.

Page 91: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Comments on Results

• A strict filter produces very few points with GLUE – even after 2000 function evaluations, only 12 points

remain after filtering

• Our RS method produces the tightest bounds and also provides more points for uncertainty assessment with only 200 function evaluations– Limited with respect to sample independence

• The RSG provides an improvement over GLUE – Independent samples for uncertainty assessment– A larger sample size for a tight filter

Page 92: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Effect of Relaxing Filter – CNS of 0.1

Empirical 98% Bounds obtained using a filter of 0.1 for CNS

G2000RS200 G200 G500 G1000 RSG20k TRUE135.00

140.00

145.00

150.00

155.00

160.00

165.00

To

tal m

ole

s o

f ch

lori

nat

ed

eth

enes

90

12 44 84 167

1542

Page 93: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Empirical 98% Bounds obtained using a filter of 1 for CNS

G2000RS200 G200 G500 G1000 RSG20k TRUE120.00

125.00

130.00

135.00

140.00

145.00

150.00

155.00

160.00

165.00

To

tal m

ole

s o

f ch

lori

nat

ed

eth

enes

Effect of Relaxing Filter – CNS of 1

197 197 490 979 1961 19543

Page 94: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Comparison of percentage of points after filtering

0

20

40

60

80

100

120

0.01 0.1 0.3 1 inf

CNS Filter

Per

cen

tag

e o

f p

oin

ts

afte

r fi

lter

ing

RS200

G200

G500

G1000

G2000

RSG20k

Comparison of percentage of points after filtering

0

10

20

30

40

50

0.01 0.1 0.3

CNS Filter

Perc

en

tag

e o

f p

oin

ts a

fter

filt

eri

ng

Filter

RS200

G200

G500

G1000

G2000

RSG20k

Percentage of Points for Different Filters

Page 95: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Observations on the Effect of Filtering

• Relaxing the filter produces more points for uncertainty assessment– Bounds are wider– With very weak filters (e.g. CNS of 1) the

distribution of output is skewed to the upper bound

• an artifact introduced due to a relatively high “likelihood weight” assigned to poor points

• Unreliable bounds

• A strong filter produces very few points

Page 96: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Advantages of Method 2

• The samples are independent

• Reuse information from calibration

• Computationally cheap – – use only the same number of costly function

evaluations as in the regular RS optimization method (200 in these examples)

– Can obtain goodness-of-fit and output values for many thousands of points

Page 97: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Summary

• Models can help us use data take a small scale and at discrete time points to understand and manage environmental processes over large spatial areas and time frames.

• Development of computationally efficient methods for automatic calibration, sensitivity and uncertainty analysis are very important.

Page 98: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

New Project 2: Parallel Optimization Algorithms

• Funded by the Computer Science (CISE) Directorate at NSF

• The method is general and can be used for a wide range of problems including other engineering systems in addition to environmental systems.

• This research is underway.

Page 99: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

NSF Project 1: Optimization Algorithms in Parallel and Serial

• Funded by the Computer Science (CISE) Directorate at NSF

• This project is for developing efficient optimization algorithms for “costly functions”, which take a long time for one evaluation (e.g. like a groundwater simulation model).

• The method is general and can be used for a wide range of problems including other engineering systems in addition to environmental systems.

• The new algorithms are applied in 2nd project.

Page 100: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Development of Parallel Algorithms

• This proposal focuses on developing function approximation algorithms that will be very efficient when done in parallel.

• One can re-code any optimization algorithm to run in parallel, but it might not be very efficient (e.g. runs only twice as fast with 10 processors as with one processor).

Page 101: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

MAPO: Multi-Algorithm Parallel Optimization

• MAPO is an optimization algorithm we are developing that is designed to find with parallel computation good solutions with relatively few function evaluations.

• The basic idea is that in each iteration there is a “committee” of algorithm experts, each of whom recommend several candidate points.

• .

Page 102: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

MAPO: Multi-Algorithm Parallel Optimization

• MAPO is an optimization algorithm we are developing that is designed to find with parallel computation good solutions with relatively few function evaluations.

• The basic idea is that in each iteration there is a “committee” of algorithm experts, each of whom recommend several candidate points.

• All candidate points are compared and P are selected for evaluation (P=no. of processors)

• .

Page 103: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

MAPO: Multi-Algorithm Parallel Optimization

• MAPO is an optimization algorithm we are developing that is designed to find with parallel computation good solutions with relatively few function evaluations.

• The basic idea is that in each iteration there is a “committee” of algorithm experts, each of whom recommend several candidate points.

• All candidate points are compared and P are selected for evaluation (P=no. of processors)

• Results of evaluations for f(x) values from all P processors are available for all algorithms to build their response surfaces in the next generation.

Page 104: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Evaluation Points in MAPO

Previously evaluated points

new point for evaluationMinimum distance

Variable 1

Variable 2

For P parallel processors we want to evaluate P points at once. P=4 in picture

Candidate point not evaluated

Page 105: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

MAPO Algorithm is Designed to be Coarse-Grained

Serial: Use Multiple Algorithms with Function Approximation to Generate N candidate points for Costly Function Evaluation

Serial: Use multiple criteria and machine learning methods to select the P Points at which Costly Function Evaluation is actually done

Parallel: do the P costly function evaluations (usually one function evaluation per processor on M processors)

Serial: Use the values of the new Function Evaluations to update the Function approximations

QUICK

QUICK

SLOW

One iteration loops through all these steps.

Page 106: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Test Functions in Numerical Results

• The following results are based application of function approximation methods for global optimization to four test functions, which are designed to mimic a typical global optimization problem.

Page 107: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Algorithms Used in Numerical Examples

• In MAPO, we used four algorithm experts.

• Each algorithm expert used a greedy search on Radial Basis Functions (RBF) plus a linear polynomial for function approximation.

• The form of the RBF function approximation was different (cubic, thinplate, multquadric, Gaussian) for each expert.

Page 108: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Alternative Parallel Algorithms

• The 8 alternative parallel algorithms were parallel versions of each of the 4 individual algorithm experts, parallelized in one of two ways:– no communication (Run the algorithm separately

on p processors without communication and stop when one of the processors finds a solution within 1% of optimum.)

– with communication (Run the algorithm separately on p processors and communicate all function evaluation values after each iteration so that improved function approximations can be made. Same stopping criterion).

Page 109: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Estimating Wall Clock Time with p Processors for f(x) Evaluations Requiring τ CPU minutes

Wall Clock Time(p) = Ep*(τ+c + A(p))

Ep= number of iterations (=number of function evaluations/ p)

p= number of processors

τ= time required for one costly function evaluation

c=communication time per iteration (negligible for large τ)

A(p)=serial calculations that do not depend on the evaluation time τ (e.g. doing the optimization search on function approximations and selecting evaluation points from the candidate points)

Page 110: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Importance of Ep

Wall Clock Time(p) = Ep (τ +C+ A(p))

Ep= number of iterations (=number of function evaluations/ p). Best if value is low.

p= number of processors

τ= time required for one simulation

C= communication time

A(p) is on the order of a few minutes. C is very small. τ is assumed to be at least thirty minutes and Ep is much greater than 1.

Hence Ep is important!

. The following graphs give Ep for alternative algorithms.

Page 111: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Number of Iterations on 8 Processors Required to Reach Within 1% of Optimum. Results are averaged over 10 runs. Value of 80 indicates some runs did not reach within 1% of optimum

0

10

20

30

40

50

60

70

80

90

Schoen4-10 MAPO c-no-c tp-no-c mc-no-c g-no-c c-wc tp-wc mq-wc q-wc

MAPO and Alternative Parallel Algorithms

Eva

luat

ion

s/ 8

pro

cess

ors

Best results are low values

MAPO

Alternative parallel algorithms

No Communication With Communication

Ep

Test Function Schoen4-10

Page 112: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Number of Iterations on 8 Processors Required to Reach Within 1% of Optimum. Results are averaged over 10 runs. Value of 80 indicates some runs did not reach within 1% of optimum

0

10

20

30

40

50

60

70

80

90

Schoen4-10 MAPO c-no-c tp-no-c mc-no-c g-no-c c-wc tp-wc mq-wc q-wc

MAPO and Alternative Parallel Algorithms

Eva

luat

ion

s/ 8

pro

cess

ors

Best results are low values

No Convergence at 80

Test Function Schoen4-10-Average of 10 runs

Ep

MAPO

Page 113: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Number of Iterations for Hartman 6 Test Problem for 8 processors. Each result is average of 10 runs, except 80 indicates some runs did not reach within 1 percent of optimum.

0

10

20

30

40

50

60

70

80

90

MAPO c-no-c tp-no-c mc-no-c g-no-c c-wc tp-wc mq-wc q-wc

MAPO and Alternative Algorithms

Fu

nct

ion

Eva

luat

ion

s/ 8

pro

cess

ors

Series1

Series2

Series3

Test Function Hartman 6

no convergence

MAPO

Page 114: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Number of Iterations on 8 Processors on Four Test Problems. Results are average of 10 runs. 80 indicates some runs did not reach 1% of the optimum

0

10

20

30

40

50

60

70

80

90

MAPO c-no-c tp-no-c mc-no-c g-no-c c-wc tp-wc mq-wc q-wc

MAPO and Eight Alternative Parallel Algorithms

Fu

nct

ion

Eva

luat

ion

s/ 8

pro

cess

ors

Goldstein-Price

Schoen3-10

Schoen4-10

Hartman 6

Comparison on Four Test Problems

MAPO is only method that converges for all runs on all 4 problems.

NO CONVERGENCE

Test Problem

Page 115: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Discussion of Numerical Results: Robustness of MAPO

• MAPO is the only algorithm that always finds a solution within 1% of the optimum for all test problems.

• MAPO was designed to be robust because it uses multiple algorithms.

Page 116: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Discussion of Numerical Results:Computational Speed of MAPO

• MAPO gives the fastest results for large τ (e.g. lowest E p ) of all the methods tried here on 8 processors, because it requires fewer function evaluations to find a good solution.

Page 117: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

General Applications of MAPO

• In these numerical examples we used four experts that were all RBF greedy methods that differed only in the functional form of the RBF.

Page 118: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Examples of Algorithms That Can Be Used in MAPO

• FA-greedy search with Radial Basis Functions RBF (find the local minimums of the RBF function approximation FA surface), which uses gradients of the FA.

• Gutmann’s RBF (2001) algorithm, which seeks to minimize the “bumpiness” of an FA that passes through a target minimum point.

• Our own new FA algorithms, which combine a search for minimum with selecting new points that improve the FA.

• Kriging• Neural Nets• Other Splines for Function Approximation.

Page 119: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Combining Parallel Optimization and Parallel Simulation

• Especially when combined with parallelized costly function evaluation, parallel optimization can lead to efficient use of a large number of processors.

Page 120: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Multiplicative Benefits of Parallelizing Optimization

• Assume the simulation alone runs efficiently for S processors and above S, efficiency diminshes.

• Assume optimization can run efficiently alone with M processor.

• Combining both optimization and simulation permits efficient use of SM processors.

• For example, if S=20, M=8, then the problem can be efficiently solved on 160 processors.

Page 121: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Purpose of Generating More Candidate Points Than Processors

• The MAPO algorithm uses multiple algorithms to generate a large number of candidate points .

• Only a subset of these candidate points is selected for actual costly function evaluation.

• The evaluation points are chosen from the candidate points based on a variety of criteria.

• In this way, the points selected for costly function evaluation have “competed” to be most likely to be helpful in finding the minimum and/or improving the function approximation.

Page 122: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Future Work

• Extend the current results to more processors to examine speed-ups.

• Incorporate more types of algorithm experts. A greater diversity of experts will probably generate even better results for MAPO.

• Apply MAPO method to additional “real” problems that actually do require long computation times for each simulation.

Page 123: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Possible Additional Directions for Costly Function Optimization

• Objective functions that have a more complex statistical structure.

• For example, do calibration assuming parameters are described by probablility distributions with correlations between parameters.

• Goal is to quantify uncertainty associated with model predictions.

• (This is a topic of an expected new NSF project done in collaboration with a statistician.)

Page 124: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

General Summary

• Function Approximation Optimization including the new algorithms shows promise for optimization analysis of costly, nonlinear models with multiple local minima.

• MAPO shows promise as an efficient parallel algorithm for costly functions.

Page 125: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.
Page 126: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Criteria for Selecting Evaluation Points from Candidate Points

• Criteria include– the value of the current function

approximations at the candidate point, – the distance from other points evaluated, and– disagreement among algorithm experts.

Page 127: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Our Radial Basis Function Derivative-Free Optimization Algorithm

• Following graph illustrates the search procedure and the function approximation.

• We use a Radial Basis Functions RBF for the function approximation.

• The objective function in the following contour plots is a two dimensional global optimization test function.

Page 128: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Exploratory Radial Basis Function Method Applied on Goldstein-Price

Best Value = 109.6351 (Global Min Value = 3)Points on right graph show the experimental design (SLHD)

After 6 function evaluations (experimental design)

Evaluation point

x1

x2

X1 and X2 are two parameters, and contours indicate error function value

Global min

There are 3 other local min

TRUE Value Function Approximation

Page 129: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Exploratory Radial Basis Function Method Applied on Goldstein-Price

Best Value = 7.8632 (Global Min Value = 3)

After 20 function evaluations

TRUE RESPONSE SURFACE

Page 130: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Exploratory Radial Basis Function Method Applied on Goldstein-Price

Best Value = 3.4139 (Global Min Value = 3)

After 40 function evaluations

TRUE RESPONSE SURFACE

Page 131: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

done

Page 132: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

How does our algorithm work?

One approach is pick the next point to satisfy an optimization problem based on both the function approximation value (which we want to be small in a local search) and the distance to the nearest previously evaluated points (which we want to be far for a global search).

We cycle the allowable minimum distance so that some search is local and some is global.

A proof shows this approach will eventually generate a point that is arbitrarily close to the global optimum.

Page 133: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Previously evaluated points

Potential new point for evaluation

Minimum distance

Variable 1

Variable 2

We would like to make the minimum distance large to explore the space.

Page 134: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Outline of Function Approximation Optimization

• 1. Use a “space filling” experimental design (e.g. SLHD) to select a limited number of evaluation points.

• 2. Make an approximation of the function (with Radial Basis Function splines) based on experimental design points.

• 3. Use our algorithm to select next function point for evaluation based on FA and location of previously evaluated points. (We have developed different algorithms. This is a research topic)

• 4. Construct a new function approximation that incorporates the newly evaluated point and all prior points.

• 5. Stop if reach maximum number of iterations. Otherwise go to Step 3

Page 135: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Optimization of Calibration of Chlorinated Ethenes Model

• The model involves solution of a system of partial differential equations by finite difference methods.

• There are 18 species variables at each node in the finite difference grid

• The equations are highly nonlinear because of biological reactions.

• Optimization applied to two cases: – hypothetical-8 minutes/simulation – field data application- 3 hours/simulation

Page 136: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

GLUE procedure for Calibration and Uncertainty

• Generalized Likelihood Uncertainty Estimation (GLUE) is an importance sampling method (Beven and Binley, 1992)

i

Groundwater model simulation

y(i– performance measure for

Assumed Input Sampling distribution of parameters

L(i– Likelihood weight for i

Calibration

Groundwater model simulation

C(i– Output measure of interest Cav = L(iC(i

Uncertainty

Weighted moments for O/P

Page 137: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Limitations of GLUE

• Assumes an input sampling distribution• No feedback of performance• Higher dimensions requires very large number of

simulations• Very few “good” solutions when even for modest

goodness-of-fit measures – Bounds with a small number of statistical samples are unreliable– Goodness-of-fit measure should be relaxed to get a larger

statistical sample, in such case the likelihood weights will all be nearly equal since majority of the points are poor

• Will require a very large number of simulations to produce more points for reasonable goodness-of-fit measure (for eg. R2 of 0.7 or higher)

Page 138: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

• Propose to use solutions searched during automated calibration in assessing forecast uncertainty

• Automated calibration algorithms (e.g. RS algorithm) search the parameter space intelligently due to feedback on performance

– Likely to produce several “good” solutions

– Assign likelihood weights to solutions searched

• Estimate weighted moments as before

• Computationally cheaper alternative – better suited for higher dimensional problems than GLUE

• Limitations are that the samples searched are not independent and identically distributed to get “true” bounds

Improving Importance Sampling

Page 139: Calibration, Sensitivity Analysis and Uncertainty Analysis Prof. Christine Shoemaker Pradeep Mugunthan, Dr. Rommel Regis, and Dr. Jennifer Benaman School.

Uncertainty Estimates for Total Moles of Chlorinated Ethenes Insitu

Bounds obtained using a filter of 0.01 for CNS

G2000G1000G500RS200 RSG20k TRUE141.00

142.00

143.00

144.00

145.00

146.00

147.00

To

tal

mo

les

of

chlo

rin

ated

eth

enes

5

6 12126

35

Number of points after applying filter

Our Method 1 with 200 function evaluations