Bayesian and Geostatistical Approaches
to Inverse Problems
Peter K. Kitanidis
Civil and Environmental Engineering
Stanford University
2
Outline:
• Important points
• Current Work
3
Inverse Problem:
• Estimate functions from sparse and noisy observations;
• The unknowns are sensitive to data gaps or flaws (the problem is ill-posed in the sense of Hadamard);
• Data are insufficient to zero in on a unique solution;
• Usually, it is the small-scale variability that cannot be resolved.
4
5
Cheney, M. (1997), Inverse boundary-value problems, American Scientist, 85: 448-455.
6
Bayesian Inference Applied to Inverse Modeling
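$$
p''(\mathbf{s}) = \frac{p(\mathbf{y}\mid\mathbf{s})\,p'(\mathbf{s})}{\int p(\mathbf{y}\mid\mathbf{s})\,p'(\mathbf{s})\,d\mathbf{s}}
$$
• $p''(\mathbf{s})$: posterior distribution of unknown parameter
• $p'(\mathbf{s})$: prior distribution of unknown parameter
• $p(\mathbf{y}\mid\mathbf{s})$: likelihood of unknown parameter given data
y: measurements; s: “unknown”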
7
Bayesian Inference Applied to Inverse Modeling
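$$
p''(\mathbf{s}) = \frac{p(\mathbf{y}\mid\mathbf{s})\,p'(\mathbf{s})}{\int p(\mathbf{y}\mid\mathbf{s})\,p'(\mathbf{s})\,d\mathbf{s}}
$$
• $p''(\mathbf{s})$: combined information (data and structure)
• $p'(\mathbf{s})$: information about structure
• $p(\mathbf{y}\mid\mathbf{s})$: information from observations
y: measurements; s: “unknown”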
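As a minimal sketch of how Bayes' theorem combines structure and observations, the following evaluates the posterior on a grid for a single scalar unknown. The Gaussian prior, the Gaussian measurement error, and all numerical values are assumptions chosen for illustration; they are not taken from the slides.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def posterior_on_grid(s_grid, prior, likelihood):
    """Apply Bayes' theorem on a uniform grid: posterior = likelihood * prior / evidence."""
    ds = s_grid[1] - s_grid[0]
    unnormalized = [likelihood(s) * p for s, p in zip(s_grid, prior)]
    evidence = sum(unnormalized) * ds  # Riemann sum for the integral in the denominator
    return [u / evidence for u in unnormalized]

# Hypothetical setup: prior s ~ N(0, 1); one measurement y = s + error, error std 0.5
s_grid = [-5.0 + 0.01 * i for i in range(1001)]
ds = s_grid[1] - s_grid[0]
prior = [gaussian_pdf(s, 0.0, 1.0) for s in s_grid]
y_obs = 1.0
posterior = posterior_on_grid(s_grid, prior, lambda s: gaussian_pdf(y_obs, s, 0.5))

# Posterior mean; for this Gaussian case the analytical value is y / (1 + 0.25) = 0.8
posterior_mean = sum(s * p for s, p in zip(s_grid, posterior)) * ds
print(round(posterior_mean, 3))
```

The posterior mean falls between the prior mean (0) and the measurement (1), closer to the measurement because the assumed measurement error variance is smaller than the prior variance.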
8
How do you get the structure?
• We often use an “empirical Bayes” approach in which the structure pdf is parameterized and inferred from the data; the approach is rigorous and robust.
– Alternative interpretation: We use cross-validation.
• In specific applications, we may use “geological” or other information to describe structure.
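A toy illustration of the empirical Bayes idea, under strong simplifying assumptions that are not from the slides: each unknown has an independent zero-mean Gaussian prior with variance theta (the single structural parameter), each measurement is the unknown plus independent Gaussian error of known variance, and theta is chosen by maximizing the marginal likelihood of the data. Real geostatistical applications use correlated priors and more structural parameters.

```python
import math
import random

def neg_log_marginal(theta, y, noise_var):
    """Negative log marginal likelihood when y_i ~ N(0, theta + noise_var), independently."""
    total_var = theta + noise_var
    return 0.5 * sum(math.log(2.0 * math.pi * total_var) + yi * yi / total_var for yi in y)

# Synthetic data from hypothetical "true" values
random.seed(0)
true_theta, noise_var = 4.0, 1.0
y = [random.gauss(0.0, math.sqrt(true_theta)) + random.gauss(0.0, math.sqrt(noise_var))
     for _ in range(2000)]

# Grid search over the structural parameter (prior variance)
theta_grid = [0.01 * k for k in range(1, 1001)]
theta_hat = min(theta_grid, key=lambda th: neg_log_marginal(th, y, noise_var))

# For this simple model the maximizer is also available in closed form
theta_closed_form = sum(yi * yi for yi in y) / len(y) - noise_var
print(theta_hat, theta_closed_form)
```

The estimated theta would then be plugged into the prior before computing the posterior of the unknowns.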
9
Computational cost
• Reduce cost by dealing with special cases, or
• Bite the bullet and use computer-intensive numerical methods (MCMC, etc.)
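$$
p''(\mathbf{s}) = \frac{p(\mathbf{y}\mid\mathbf{s})\,p'(\mathbf{s})}{\int p(\mathbf{y}\mid\mathbf{s})\,p'(\mathbf{s})\,d\mathbf{s}}
$$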
10
A source identification problem
Identify the pumping rate at an extraction well from head observations in a neighboring monitoring well:
$$
S\,\frac{\partial \phi}{\partial t} = T\left(\frac{\partial^2 \phi}{\partial x_1^2} + \frac{\partial^2 \phi}{\partial x_2^2}\right) - s(t)\,\delta(x_1)\,\delta(x_2)
$$
$$
\phi(t) = \phi_0 - \int_0^t \frac{1}{4\pi T\,(t-\tau)} \exp\!\left(-\frac{x_1^2 + x_2^2}{4D\,(t-\tau)}\right) s(\tau)\,d\tau
$$
Here $\phi$ is the head ($\phi_0$ its initial value), $S$ the storage coefficient, $T$ the transmissivity, $D = T/S$ the diffusivity, and $s(t)$ the unknown pumping rate.
The importance of properly weighing observations
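The convolution integral relating the pumping rate to the observed head can be discretized into a linear forward model, which is the form most inverse methods work with. The sketch below uses a midpoint rule; the function names and all parameter values are hypothetical and chosen only so that the example runs.

```python
import math

def unit_response(elapsed, r2, T, D):
    """Head change per unit pumping rate after `elapsed` time; zero before pumping (causality)."""
    if elapsed <= 0.0:
        return 0.0
    return math.exp(-r2 / (4.0 * D * elapsed)) / (4.0 * math.pi * T * elapsed)

def build_forward_matrix(obs_times, n_src, t_end, r2, T, D):
    """Midpoint-rule discretization: drawdown(t_j) is approximated by sum_k H[j][k] * s_k."""
    dtau = t_end / n_src
    src_times = [(k + 0.5) * dtau for k in range(n_src)]
    return [[unit_response(t - tau, r2, T, D) * dtau for tau in src_times]
            for t in obs_times]

def predict_head(H, s, phi0):
    """Head at the monitoring well: initial head minus the drawdown H s."""
    return [phi0 - sum(h * sk for h, sk in zip(row, s)) for row in H]

# Hypothetical values for illustration only
T = 0.05          # transmissivity [m2/min]
D = 0.5           # diffusivity T/S [m2/min]
r2 = 10.0 ** 2    # squared distance between the wells, x1^2 + x2^2 [m2]
obs_times = [100.0 * j for j in range(1, 11)]  # observation times [min]

H = build_forward_matrix(obs_times, n_src=100, t_end=1000.0, r2=r2, T=T, D=D)
s_constant = [0.02] * 100  # constant extraction rate [m3/min]
heads = predict_head(H, s_constant, phi0=10.0)
print([round(h, 4) for h in heads])
```

Because the kernel smooths the pumping history, many different rate histories produce nearly the same heads, which is why the weighting of observations against prior structure matters so much in this problem.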
11
Best Estimate for Pumping Rate: Over-weighting Observations
[Figure: estimated and actual pumping (extraction) rate q [m3/min] versus time t [min], 0 to 1000 min.]
Five slides from:
Kitanidis, P. K. (2007), On stochastic inverse modeling, in Subsurface Hydrology: Data Integration for Properties and Processes, edited by D. W. Hyndman, F. D. Day-Lewis, and K. Singha, pp. 19-30, AGU, Washington, D. C.
12
Best Estimate for Pumping Rate: Under-weighting Observations
[Figure: estimated and actual pumping (extraction) rate q [m3/min] versus time t [min], 0 to 1000 min.]
13
[Figure: two panels plotted against the theta ratio (0 to 0.02): the cross-validation measure cR (top) and the actual MSE (bottom).]
$$
\mathrm{MSE} = \frac{1}{m}\sum_{i=1}^{m}\left(\hat{s}_i - s_i\right)^2
$$
cR: a cross-validation measure
14
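[Figure: two panels plotted against theta2 on logarithmic axes: the cross-validation measure Q2 (top) and the actual NMSE (bottom).]
$$
\mathrm{NMSE} = \frac{1}{m}\sum_{i=1}^{m}\frac{\left(\hat{s}_i - s_i\right)^2}{\sigma_i^2}
$$
Q2: a cross-validation measure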
“Optimal” Weighting
[Figure: estimate, upper and lower confidence limits, and actual pumping rate [m3/min] versus time [min], 0 to 1000 min.]
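The error measures used to judge the weighting can be computed as follows when the true values are known, as in a synthetic test. This is a sketch under the assumption that the NMSE normalizes each squared error by the corresponding estimation variance; the sample numbers are made up.

```python
def mse(estimates, truth):
    """Mean square error: average of squared estimation errors."""
    return sum((e - t) ** 2 for e, t in zip(estimates, truth)) / len(truth)

def nmse(estimates, truth, variances):
    """Normalized MSE: each squared error is divided by its stated estimation variance."""
    return sum((e - t) ** 2 / v for e, t, v in zip(estimates, truth, variances)) / len(truth)

# Made-up numbers for illustration
s_true = [1.0, 2.0, 3.0]
s_hat = [1.5, 2.0, 2.0]
est_var = [0.25, 1.0, 1.0]

print(mse(s_hat, s_true))            # (0.25 + 0 + 1) / 3
print(nmse(s_hat, s_true, est_var))  # (1 + 0 + 1) / 3
```

An NMSE near 1 indicates that the stated estimation variances are consistent with the actual errors; values well above 1 suggest overconfident error bars.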
16
The cost of computations…
• Moore’s law: the cost of computation is halved every 1.5 years. Between 1975 and 2006 (31 years), the cost therefore fell by a factor of 2^(31/1.5) ≈ 1.7E6.
• Consider $5,000 of computer usage for a project in 1975.
• Adjusted for inflation, $5,000 in 1975 dollars is about $20,000 in 2006 dollars.
• $20,000/1.7E6 ≈ $0.01: the same computation costs about 1 cent in 2006.
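The arithmetic on this slide can be checked in a few lines. The halving period and dollar amounts are the ones quoted on the slide; the variable names are only illustrative.

```python
years = 2006 - 1975                 # 31 years
halving_period = 1.5                # years per halving of cost
cost_reduction = 2 ** (years / halving_period)

cost_in_2006_dollars = 20_000       # the 1975 project cost, inflation-adjusted
cost_today = cost_in_2006_dollars / cost_reduction

print(f"cost reduction factor: {cost_reduction:.2e}")
print(f"equivalent 2006 cost: ${cost_today:.3f}")
```

The reduction factor comes out slightly below 1.7E6, so the equivalent cost is a little over one cent.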
17
From the BOISE HYDROGEOPHYSICAL RESEARCH SITE (BHRS)
18
METHOD: Markov Chain Monte Carlo
• Based on Michalak and Kitanidis (2003 and 2004).
• Use the EM (expectation-maximization) method on marginal distributions to find optimal parameters for structure and epistemic error.
• Employ a Gibbs sampler to build a set of conditional realizations of the posterior pdf. (A large enough set of conditional realizations has the same statistical properties as the actual posterior distribution.)
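To illustrate how a Gibbs sampler builds conditional realizations under a nonnegativity constraint, here is a toy sketch for two correlated Gaussian unknowns restricted to be nonnegative. It is not the implementation of Michalak and Kitanidis; the bivariate setup, the function names, and the numerical values are assumptions made for illustration.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def sample_truncated_normal(mean, std, rng, lower=0.0):
    """Draw from N(mean, std^2) restricted to [lower, inf) by inverting the CDF."""
    p_lower = STD_NORMAL.cdf((lower - mean) / std)
    u = p_lower + (1.0 - p_lower) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf away from 0 and 1
    return max(lower, mean + std * STD_NORMAL.inv_cdf(u))

def gibbs_nonnegative_bivariate(mu, sigma, rho, n_samples, burn_in=200, seed=0):
    """Gibbs sampler for a bivariate normal truncated to s1 >= 0 and s2 >= 0."""
    rng = random.Random(seed)
    s1, s2 = max(mu[0], 0.0), max(mu[1], 0.0)
    cond_std1 = sigma[0] * (1.0 - rho ** 2) ** 0.5
    cond_std2 = sigma[1] * (1.0 - rho ** 2) ** 0.5
    samples = []
    for it in range(burn_in + n_samples):
        # Each unknown is drawn from its conditional distribution given the other
        cond_mean1 = mu[0] + rho * sigma[0] / sigma[1] * (s2 - mu[1])
        s1 = sample_truncated_normal(cond_mean1, cond_std1, rng)
        cond_mean2 = mu[1] + rho * sigma[1] / sigma[0] * (s1 - mu[0])
        s2 = sample_truncated_normal(cond_mean2, cond_std2, rng)
        if it >= burn_in:
            samples.append((s1, s2))
    return samples

samples = gibbs_nonnegative_bivariate(mu=(0.5, 1.0), sigma=(1.0, 1.0), rho=0.8,
                                      n_samples=2000)
mean_s1 = sum(s[0] for s in samples) / len(samples)
print(round(mean_s1, 3))
```

Summary statistics such as means and confidence intervals are then computed directly from the set of realizations.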
19
A problem of forensic environmental engineering
PCE data at location PPC13. Measurement data and fitted concentrations resulting from the estimated boundary conditions.
Michalak, A.M., and P.K. Kitanidis (2003), “A Method for Enforcing Parameter Nonnegativity in Bayesian Inverse Problems with an Application to Contaminant Source Identification,” Water Resources Research, 39(2), 1033, doi:10.1029/2002WR001480.
20
Location PPC13. Estimated time variation of boundary concentration at the interface between the aquifer and aquitard. The end time represents the sampling date (June 6, 1996).
21
PCE data at location PPC13 with non-negativity constraint. Measurement data and fitted concentrations resulting from the estimated boundary conditions.
22
Location PPC13 with non-negativity constraint. Estimated time variation of boundary concentration at the interface between the aquifer and aquitard. The end time represents the sampling date (June 6, 1996).
23
TRACER RESPONSE—Synthetic Case
Output
Without Error
With Error
Tracer Input
True Transfer Function
Fienen, M. N., J. Luo, and P. Kitanidis (2006), A Bayesian Geostatistical Transfer Function Approach to Tracer Test Analysis, Water Resour. Res., 42, W07426, doi:10.1029/2005WR004576.
24
Current Work
• Large variance and highly nonlinear problems (Convergence of Gauss-Newton, usefulness of Fisher matrix, etc.)
• Tomographic inverse problems (development of protocols, processing of large data sets.)
25
Current Work (cont.)
• Identification of zone boundaries.
• Solution methods for very large data sets.
• Making tools available to users.
26
Identification of zone boundaries: Example
• Linear tomography
• Zones + small-scale variability
• Measurement error (2%)
Four slides from the work of Michael Cardiff
27
Example Problem Performance
28
We are developing…
• Stochastic analysis of zone uncertainty
• Merging of structural (level set) and geostatistical inverse problem concepts
• Use of level sets for joint inversion
29
Toolbox for
COMSOL Multiphysics is a commercial general-purpose PDE solver.
We are adding inverse-model capabilities, including adjoint-state sensitivity analysis and stochastics.
See: Cardiff, M, and P. K. Kitanidis, “Efficient solution of nonlinear underdetermined inverse problems with a generalized PDE solver”, Computers and Geosciences, in review, 2007.
30
Our approach is:
• Stochastic (aka probabilistic or statistical): We assign a probability to every possible solution.
• Bayesian: Because the Bayesian approach provides a general framework.
• Practical: Our methods are evolving, with particular emphasis on practicality, robustness, and computational efficiency.
• Geostatistical: We have adopted the best ideas from the geostatistical school.
31
For More Info
See publications on the WWW: http://www.stanford.edu/group/peterk/publications.htm