
Mathematical Geology, Vol. 30, No. 7, 1998

Calculation of the Inverse of the Covariance¹

Dean S. Oliver²

In reservoir characterization, the covariance is often used to describe the spatial correlation and variation in rock properties or the uncertainty in rock properties. The inverse of the covariance, on the other hand, is seldom discussed in geostatistics. In this paper, I show that the inverse is required for simulation and estimation of Gaussian random fields, and that it can be identified with the differential operator in regularized inverse theory. Unfortunately, because the covariance matrix for parameters in reservoir models can be extremely large, calculation of the inverse can be a problem. In this paper, I discuss four methods of calculating the inverse of the covariance, two of which are analytical, and two of which are purely numerical. By taking advantage of the assumed stationarity of the covariance, none of the methods require inversion of the full covariance matrix.

INTRODUCTION

The probability density for a multinormal random variable $m = (m_1, \ldots, m_M)$ is proportional to $\exp[-\tfrac{1}{2}(m - \bar{m})^T C^{-1} (m - \bar{m})]$, where $C$ is the covariance matrix and the $\bar{m}_i$ are the means of the $m_i$, so calculation of the probability of a random variable requires evaluation of a product of the form

In many geostatistical or reservoir characterization applications, $m$ represents a two- or three-dimensional grid of values of porosity or permeability within a reservoir. In such cases, it is not unusual for the dimension of $m$ to exceed $10^6$. It would then be extremely difficult, if not impossible, to directly invert the covariance matrix. Two questions arise: (1) Do we need to calculate products of this form? (2) If so, is there a better approach than matrix inversion? To answer these questions, consider briefly two applications from geostatistics (estimation and simulation) and an application from inverse theory (regularization).

KEY WORDS: operator inverse, matrix inversion, Gaussian simulation, regularization, differential inversion.

¹Received 30 June 1997; accepted 11 March 1998.
²Department of Petroleum Engineering, The University of Tulsa, 600 South College Avenue, Tulsa, Oklahoma 74104. e-mail: [email protected]

0882-8121/98/1000-0911S15.00/1 © 1998 International Association for Mathematical Geology

ESTIMATION

If, prior to measurements, a set of model parameters $m = \{m_1, \ldots, m_M\}$ is known to be multivariate Gaussian with mean $\bar{m} = \{\bar{m}_1, \ldots, \bar{m}_M\}$ and covariance $C_M$, then the best (maximum probability density) model after incorporating data, $d_{obs}$, with normally distributed errors, $\epsilon$, related to the parameters as $d_{obs} = g(m) + \epsilon$, is found by minimizing the function

with respect to $m$. The standard Gauss-Newton approach to minimization of $S(m)$ is to iteratively evaluate

where $G_l$ is the matrix of sensitivities of data to small changes in model parameters, i.e., $G_l = [\nabla_m g(m^l)]^T$, and $\alpha_l$ is an arbitrary small positive real number, chosen to ensure that $S(m^{l+1}) < S(m^l)$. But, if the number of gridblocks is large, the matrix $C_M$ may be too large to invert and the Gauss-Newton method is rewritten as follows:

This formulation seems to avoid the need to evaluate $C_M^{-1}$, but in fact the parameter $\alpha_l$, which controls the rate of convergence, is unknown. Equation (4) tells us what direction to go to improve the solution but it does not specify how far to proceed in that direction. The parameter controlling the size of the step toward the minimum, $\alpha_l$, can only be estimated by evaluating $S(m)$, which requires the calculation of $C_M^{-1}(m - \bar{m})$.

SIMULATION

Most of the common methods of geostatistical simulation for Gaussian random fields (e.g., sequential simulation, turning bands, matrix factorization, etc.) do not require calculation of $C_M^{-1}$. These methods, however, are not capable of simulating Gaussian random fields that have been conditioned to production data. The conditional probability density function for the random variable $m$ described in the previous section is

where $S(m)$ was defined in Equation (2). The goal of any simulation method, whether it be the genetic algorithm, simulated annealing, or a stationary Markov chain Monte Carlo method, should be to generate realizations from the density function. Regardless of method, it is necessary to evaluate $S(m)$, and hence $C_M^{-1}$, repeatedly.

A practical method for conditioning large grids of permeability and porosity values to nonlinear data has been proposed by Oliver, He, and Reynolds (1996). In that method, one begins by generating realizations of the permeability and porosity fields, $m_{uc}$, that are conditional to all data except production data. Next, one generates a realization of the production data, $d_{uc}$. The conditional realization is the sum of an unconditional realization and a smooth correction to the permeability and porosity fields that minimizes the following objective function:

Note that this approach to conditional simulation is computationally equivalent to the problem of estimation in that evaluation of $[m - m_{uc}]^T C_M^{-1} [m - m_{uc}]$ is still required.
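For orientation, the objective function and Gauss-Newton step usually associated with the Gaussian setting of the estimation and simulation sections are sketched below. This is the standard Bayesian least-squares form (compare Tarantola, 1987) and is not a transcription of Equations (2) through (4); the data-error covariance $C_D$ is a symbol assumed here.

```latex
S(m) = \tfrac{1}{2}(m - \bar{m})^T C_M^{-1} (m - \bar{m})
     + \tfrac{1}{2}\bigl(g(m) - d_{obs}\bigr)^T C_D^{-1} \bigl(g(m) - d_{obs}\bigr)

% Gauss-Newton update m^{l+1} = m^l + \alpha_l\,\delta m^l, in two algebraically equivalent forms
\delta m^l = -\bigl(C_M^{-1} + G_l^T C_D^{-1} G_l\bigr)^{-1}
             \bigl[C_M^{-1}(m^l - \bar{m}) + G_l^T C_D^{-1}\bigl(g(m^l) - d_{obs}\bigr)\bigr]
           = -(m^l - \bar{m}) - C_M G_l^T \bigl(C_D + G_l C_M G_l^T\bigr)^{-1}
             \bigl[g(m^l) - d_{obs} - G_l (m^l - \bar{m})\bigr]
```

The second form needs only products with $C_M$, but comparing values of $S(m)$ to choose the step length still involves $C_M^{-1}(m - \bar{m})$.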

REGULARIZATION

In some forms of inverse theory, the ill-posedness of the estimation problem is avoided by adding a regularization or smoothing term to the data misfit function. The result is an objective function which, for a twice differentiable continuous parameter field, $m(x)$, might take the form (Lukas, 1980; Chung and Kravaris, 1990):

For discrete values on a grid, the objective function might be written as (Oldenburg and others, 1993):


The matrix $A$ in Equation (8) is a finite difference approximation to the derivatives in Equation (7) and $\alpha$ is a weighting parameter that is determined by the expected magnitude of the errors in the data and the supposed variance of the model parameters.

A comparison of the objective functions from geostatistics with typical objective functions from inverse theory suggests that the matrix $A$ in Equation (8) might be identified with an inverse covariance matrix and the differential operator in Equation (7) might be identified with the inverse of a covariance function.

DIFFERENTIAL INVERSION

The type of inverse that we are concerned with is the convolution inverse or the matrix inverse, depending on whether the random variable $m$ is a random function or a random finite-dimensional vector. In either case, most geophysical covariance operators can be thought of as smoothing operators and their inverses can be thought of as "roughening" operators. (The principal exception is the nugget variogram, which provides no smoothing.) This idea is consistent with the previous identification of $C_M^{-1}$ with a differential operator in the regularization approach.

The function g is said to be a one-dimensional convolution inverse of K if

One fairly straightforward way to calculate the convolution inverse is to use the fact that the (properly defined) Fourier transform of a convolution product is the product of the Fourier transforms, i.e.,

where "hats" denote Fourier transformed variables. Thus, formally,

Unfortunately, most covariance functions do not have inverses that can be written in terms of simple functions. Instead, we must make use of generalized functions, which include the Dirac delta function and its derivatives.

As a simple example of a convolution inverse, suppose $K$ is the exponential covariance in one dimension, that is, $K = \sigma^2 \exp(-|x|/a)$. Then

and, because $\widehat{\delta^{(n)}}(\xi) = (i\xi)^n$, where $\delta^{(n)}(x)$ is the $n$th derivative of the Dirac delta function,

We can easily confirm that this is the convolution inverse by computing theconvolution of K with g.
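That confirmation can be written out in a few lines. The sketch assumes the Fourier convention $\hat{f}(\xi) = \int f(x)\,e^{-i\xi x}\,dx$ and writes the variance as $\sigma^2$; with a different convention the constants change but the structure does not.

```latex
\hat{K}(\xi) = \int_{-\infty}^{\infty} \sigma^2 e^{-|x|/a}\, e^{-i\xi x}\,dx
             = \frac{2a\sigma^2}{1 + a^2\xi^2}

\hat{g}(\xi) = \frac{1}{\hat{K}(\xi)} = \frac{1 + a^2\xi^2}{2a\sigma^2}
\quad\Longrightarrow\quad
g(x) = \frac{1}{2a\sigma^2}\bigl[\delta(x) - a^2\,\delta^{(2)}(x)\bigr]

% check: differentiate K twice in the sense of generalized functions
K''(x) = \frac{1}{a^2}K(x) - \frac{2\sigma^2}{a}\,\delta(x)

(g * K)(x) = \frac{1}{2a\sigma^2}\bigl[K(x) - a^2 K''(x)\bigr] = \delta(x)
```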

Computation of terms of the form $[m - \bar{m}]^T C_M^{-1} [m - \bar{m}]$ becomes quite simple for this choice of covariance, as the following equation illustrates.

The last line of Equation (15) is identical to the regularization term in Equation (7), thus showing that regularization of a one-dimensional random field with a smoothing term of the form $\int f(\eta)^2\,d\eta + a^2 \int f'(\eta)^2\,d\eta$ is equivalent to assuming an exponential model covariance function in Bayesian inversion. For this choice of covariance, there would be no good reason to calculate $C_M^{-1}$ by inverting the covariance matrix numerically because the inverse is simply a low order differential operator. If values of the parameters are only available on a grid with spacing $h$, then from Equation (13), we might obtain the following approximation to the inverse as a 1-D stencil:
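A minimal sketch of such a stencil, under stated assumptions: it takes the operator $[\delta(x) - a^2\delta^{(2)}(x)]/(2a\sigma^2)$ implied by the exponential covariance, replaces the second derivative with the usual three-point central difference, and multiplies by $h$ so that a discrete sum stands in for the convolution integral. The function names and the test values ($a = 10$, $h = 1$, $\sigma^2 = 1$) are illustrative, and the weights are not claimed to reproduce Equation (16) exactly.

```python
import math

def exp_cov(lag, a, sigma2=1.0):
    """Exponential covariance sigma2 * exp(-|lag| / a)."""
    return sigma2 * math.exp(-abs(lag) / a)

def exp_inverse_stencil(a, h, sigma2=1.0):
    """Three-point approximation to the inverse of the 1-D exponential covariance.

    Discretizes (1 - a^2 d^2/dx^2) / (2 a sigma2) with a central difference and
    scales by h so that the stencil acts as a discrete (matrix) inverse.
    """
    scale = h / (2.0 * a * sigma2)
    off = -scale * a**2 / h**2
    center = scale * (1.0 + 2.0 * a**2 / h**2)
    return [off, center, off]

def convolve_with_cov(stencil, a, h, sigma2, lags):
    """Convolve the stencil with covariance samples; ideally 1 at lag 0, else 0."""
    half = len(stencil) // 2
    out = []
    for j in lags:
        total = 0.0
        for k, w in enumerate(stencil):
            total += w * exp_cov((j - (k - half)) * h, a, sigma2)
        out.append(total)
    return out

a, h, sigma2 = 10.0, 1.0, 1.0
stencil = exp_inverse_stencil(a, h, sigma2)
product = convolve_with_cov(stencil, a, h, sigma2, lags=range(-3, 4))
print(stencil)
print([round(p, 4) for p in product])
```

The second printed list is the convolution of the stencil with the sampled covariance. It should be close to a unit spike at lag zero when $h$ is small compared with $a$, and the approximation degrades as $h/a$ grows.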


Formal methods for differential inversion of convolution operators in multiple dimensions have been derived by Eddington (1913), and Hohlfeld and others (1993). In addition, King and Smith (1988) used differential inversion to formally derive the square roots of several inverse covariance operators. In this paper, I follow the approach of Murthy (1995) but extend his results to two and three dimensions. Consider two-dimensional convolution equations in polar coordinates (Bracewell, 1978) of the form

where $r = \sqrt{x^2 + y^2}$, $R^2 = r^2 + r'^2 - 2rr'\cos\theta$, $\delta(r)$ is the one-dimensional delta function ($\int f(r)\,\delta(r - a)\,dr = f(a)$), and $\delta(r)/(\pi r) = \delta(x, y)$ is the two-dimensional delta function. Except for the restriction that the kernel $K$ depends only on the radial coordinate $r$, this is the standard formula for the convolution product in two dimensions. The corresponding relationship between the two-dimensional Fourier transforms of $K$ and $g$ is

If $1/\hat{K}$ is analytic at the origin and a function only of $\xi^2 = u^2 + v^2$, then

Because

we can write

After inversion, we obtain formally

As a simple example, consider Whittle's covariance (Whittle, 1954), $K_w(r) = \sigma^2 r K_1(r/a)/a$, where $K_1$ is the modified Bessel function of the second kind, of order one. The zero-order Hankel transform (corresponding to the two-dimensional Fourier transform) of $K_w$ is

so

Term-by-term inversion, using Equation (22), gives

so

Note that $K_w$ is a solution of the following partial differential equation,

which confirms that $K_w^{-1} * K_w = \delta(x, y)$.
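The steps of this example can be sketched as follows, assuming the unnormalized transform $\hat{f}(u, v) = \iint f\,e^{-i(ux + vy)}\,dx\,dy$ and a variance written as $\sigma^2$. The constants are a reconstruction under that convention rather than a transcription of the numbered equations.

```latex
\hat{K}_w(\xi) = \frac{4\pi a^2\sigma^2}{(1 + a^2\xi^2)^2}
\quad\Longrightarrow\quad
\frac{1}{\hat{K}_w(\xi)} = \frac{1 + 2a^2\xi^2 + a^4\xi^4}{4\pi a^2\sigma^2}

% term-by-term inversion, with \xi^2 \leftrightarrow -\nabla^2
K_w^{-1}(x, y) = \frac{1}{4\pi a^2\sigma^2}
                 \bigl(1 - 2a^2\nabla^2 + a^4\nabla^4\bigr)\,\delta(x, y)

% equivalent statement as a partial differential equation for K_w
\bigl(1 - a^2\nabla^2\bigr)^2 K_w(r) = 4\pi a^2\sigma^2\,\delta(x, y)
```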

A second simple example is provided by the three-dimensional exponential covariance, $K_e(r) = \sigma^2 \exp(-r/a)$. The order-1/2 Hankel transform (corresponding to the three-dimensional Fourier transform) of $K_e$ is

so

where, in this example, $\xi^2 = u^2 + v^2 + w^2$, corresponding to the real-space variable $r^2 = x^2 + y^2 + z^2$. Term-by-term inversion gives

so the convolution product of the inverse of the covariance with a function $f$ is

In most cases, however, derivation of an inverse is not so simple. Consider the two-dimensional exponential covariance, $K_e(r) = \sigma^2 \exp(-r/a)$, for which the Hankel transform is

Formal term-by-term Fourier inversion of $1/\hat{K}_e$ results in an infinite series whose terms involve high order derivatives:

so

Although the coefficients of high order derivatives are small for $a < 1$, it is unclear that a useful approximation to the inverse operator can be obtained by neglecting the high order terms.

Whittle's covariance in two dimensions and the exponential covariance in one and three dimensions are fairly unusual in that they can be easily inverted analytically. In his monograph, Tarantola (1987, p. 427-429) derived inverse operators for several covariance models including the one- and three-dimensional exponential covariance functions. Also, King and Smith (1988) derived the square root of the inverse covariance operators for nearest neighbor models in two and three dimensions. Other inverses could be derived for special cases, but differential inversion is probably of limited usefulness in practice.

FINITE DIFFERENCE INVERSION

One technique that seems to be intermediate between the formal solution in terms of derivatives of delta functions and a purely numerical approach to inversion is the direct expansion of the inverse in terms of powers of the central difference operator. A more detailed description of the following method can be found in Froberg (1965, p. 151-154).

Suppose that we wish to calculate the one-dimensional convolution inverse of the Gaussian correlation function

on a discrete grid with spacing $h$. The continuous Fourier transform of $C$ is

so, formally using the method of the previous section, we calculate the "inverse covariance function" as


where $D_x$ is the differentiation operator. At this point, it would be straightforward to write $C^{-1}$ as an infinite series in derivatives of the delta function. It would be more useful, however, to have an expression for $C^{-1}(x)$ in terms of the central difference operator $\delta_x$, which is defined by

and

Define $U = hD_x$ and accept (see Froberg, 1965, p. 150) the fact that the central difference operator is related to the differentiation operator by the following formula.

Consider an expansion of a function $Y = \exp(-\lambda z^2)$ in terms of powers of $u = 2\sinh(z/2)$. If we identify $z$ with $U$, $Y$ with $C^{-1}$, and $u$ with $\delta_x$, we obtain an expansion of $C^{-1}$ in terms of $\delta_x$. To complete the equivalence, I have defined $\lambda = a^2/(4h^2)$. The Taylor series expansion is used to develop the terms, i.e.,

so we need to evaluate derivatives of $Y$ with respect to $u$ at $u = 0$. Begin by differentiating $Y$ with respect to $u$.

Obviously, $Y^{(1)}(0) = 0$. The condition for the second derivative is obtained by differentiation of Equation (42).

Thus, a differential equation involving $Y^{(2)}$ and lower order derivatives is

At $u = 0$ we obtain

To obtain the coefficients of higher order terms in the expansion, we differentiate Equation (44) repeatedly and evaluate each ODE at $u = 0$ to obtain a relation for $Y^{(n)}(0)$. The following results were obtained with the help of Mathematica (Wolfram, 1996).

Now we see that we can write the inverse of the Gaussian covariance operator as

In order to evaluate the accuracy of this approach to generating a small stencil for the covariance inverse, I calculated finite stencils for the Gaussian covariance with $a = h = 1$, then numerically inverted the stencils to determine if the original covariance was recovered. Figure 1 shows a series of stencils consisting of 11, 9, 7, and 5 terms. The 11-term approximation returns a covariance that is indistinguishable from the exact result. The smaller stencils return progressively worse approximations, but even the five-term stencil might be adequate for smoothing. Note that it corresponds to a covariance that is more highly peaked than the true covariance.

Figure 1. The one-dimensional inverse Gaussian covariance estimated from the first few terms of the expansion of the series of finite differences (left) and the inverse of the estimated inverse compared to the actual Gaussian (right). Top row used central differences through $\delta_x^{10}$. Bottom row used central differences through $\delta_x^{4}$. Other two rows are intermediate. For this example, $a = 1.0$ and $h = 1.0$.

I repeated the previous exercise with a broader Gaussian kernel, expecting that, as the kernel becomes smoother, the inverse should get rougher. The range of the covariance in this example is 3 times as large as the range in the previous example and the grid spacing is unchanged, i.e., $a = 3$ and $h = 1$. Figure 2 shows a series of stencils consisting of 11, 9, 7, and 5 terms. Although the number of significant nonzero terms in the stencil is slightly larger than in the previous example, the conclusions are largely similar. The 11- and nine-term approximations are very close to the true result, and the seven-term approximation might be acceptable, but the five-term stencil is probably inadequate.
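The expansion used in these two examples can be sketched with exact rational arithmetic. The code below is an illustration, not the Mathematica computation referred to above: it composes the Maclaurin series of $z = 2\,\mathrm{asinh}(u/2)$ with $\exp(-\lambda z^2)$ and then converts powers of $\delta_x^2$ into stencil weights. The function names are hypothetical, and the constant prefactor of the inverse operator is omitted, so only the shape of the stencil is produced.

```python
from fractions import Fraction
from math import factorial

def series_mul(p, q, order):
    """Product of two power series (coefficient lists), truncated at u**order."""
    out = [Fraction(0)] * (order + 1)
    for i, a in enumerate(p):
        if a == 0:
            continue
        for j, b in enumerate(q):
            if i + j > order:
                break
            out[i + j] += a * b
    return out

def gaussian_inverse_series(lam, order):
    """Coefficients of exp(-lam * z**2) in powers of u = 2*sinh(z/2)."""
    lam = Fraction(lam)
    # z = 2*asinh(u/2), built from the Maclaurin series of asinh.
    z = [Fraction(0)] * (order + 1)
    n = 0
    while 2 * n + 1 <= order:
        asinh_coef = Fraction((-1) ** n * factorial(2 * n),
                              4 ** n * factorial(n) ** 2 * (2 * n + 1))
        z[2 * n + 1] = asinh_coef / 4 ** n
        n += 1
    w = series_mul(z, z, order)                      # z**2 as a series in u
    y = [Fraction(0)] * (order + 1)
    w_power = [Fraction(1)] + [Fraction(0)] * order  # w**0
    k = 0
    while 2 * k <= order:                            # w**k starts at u**(2k)
        coef = (-lam) ** k / factorial(k)
        for i in range(order + 1):
            y[i] += coef * w_power[i]
        w_power = series_mul(w_power, w, order)
        k += 1
    return y

def stencil_from_series(coefs):
    """Turn sum_k c_{2k} * delta**(2k) into centered stencil weights."""
    half = (len(coefs) - 1) // 2
    stencil = [Fraction(0)] * (2 * half + 1)
    power = [Fraction(1)]                            # weights of delta**0
    for k in range(half + 1):
        offset = half - k
        for i, wgt in enumerate(power):
            stencil[offset + i] += coefs[2 * k] * wgt
        # delta**2 acts on grid values with the weights [1, -2, 1]
        power = series_mul(power, [Fraction(1), Fraction(-2), Fraction(1)], 2 * k + 2)
    return stencil

lam = Fraction(1, 4)     # a = h = 1 gives lam = a**2 / (4 * h**2) = 1/4
coefs = gaussian_inverse_series(lam, order=4)
stencil = stencil_from_series(coefs)
print(coefs)
print(stencil)
```

Raising `order` to 6, 8, or 10 yields the 7-, 9-, and 11-term stencils. The first nontrivial coefficients are $-\lambda$ for $\delta_x^2$ and $\lambda/12 + \lambda^2/2$ for $\delta_x^4$, which agrees with $Y^{(1)}(0) = 0$ stated in the text.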

We saw previously that the 2-D exponential covariance is another operator whose inverse is not easily derived analytically although the operator should be fairly well conditioned. I will use the expansion in central differences to obtain an approximate inverse.

Let the covariance kernel be given by

The 2-D Fourier transform of the covariance is

so, the kernel of the inverse covariance can be formally written as

which, if expanded in a Taylor series in $D_x$ and $D_y$, results in an infinite series of high order derivatives.

Let $\delta_x$ be the central difference operator in the $x$ direction, i.e.,

and

Cross products of $\delta_x$ and $\delta_y$ are similarly defined.

Figure 2. The one-dimensional inverse Gaussian covariance estimated from the first few terms of the expansion of the series of finite differences (left) and the inverse of the estimated inverse compared to the actual Gaussian (right). Top row used central differences through $\delta_x^{10}$. Bottom row used central differences through $\delta_x^{4}$. Other two rows are intermediate. For this example, $a = 3.0$ and $h = 1.0$.

The arrangement of terms in Equation (53) is not meant to imply that this is a matrix quantity. The spatial arrangement of terms does make the sum easier to visualize, and corresponds to the arrangement of the stencil. In this case, I assumed that the grid spacing $h$ was the same in the $x$ and $y$ directions although that is clearly not a necessary assumption.

Define $U_x = h_x D_x$ where $h_x$ is the grid spacing in the x-direction. As before, the central difference operators are related to the differentiation operators as follows,

Our objective is to evaluate $(1 - a^2 D_x^2 - b^2 D_y^2)^{3/2}$ from Equation (50) in terms of $\delta_x$ and $\delta_y$. We begin by writing

where $\lambda = a^2/h^2$ is the square of the ratio of correlation length to grid spacing in the x-direction. The first few terms of the Taylor series expansion of $Y$ in terms of $\delta_x$ and $\delta_y$ will look like this

where I have used $u$ to represent $\delta_x$ and $v$ to represent $\delta_y$.

The coefficients in the series can be determined by deriving a series of partial differential equations that can be used to evaluate derivatives of $Y$ at the origin. Two possible differential equations to start with are

Expressions for derivatives of $Y$, for use in the Taylor series expansion, can be obtained by repeated differentiation of Equations (57) and (58) followed by evaluation at the origin.

As an example, if $a$ and $b$, the scaling variables in the 2-D exponential covariance, are both equal to the grid spacing, and if we only use the constant, $\delta_x^2$, $\delta_y^2$, $\delta_x^4$, $\delta_x^2\delta_y^2$, and $\delta_y^4$ terms in the Taylor series expansion of the inverse covariance kernel, we obtain the following finite difference stencil.

One way to check the goodness of this solution is to compare its numerically calculated convolution inverse using digital Fourier transforms to the exponential covariance kernel that we started with. Figure 3 shows that the inverse is almost a perfect replica of the original kernel except at the central value. This does not seem to be a result of using a small stencil because as more terms are added (see the bottom row of Fig. 3) the inverse does not seem to improve.

In this case, a digital Fourier transform routine was used to calculate the inverse of the 2-D exponential covariance on a grid, and to calculate the inverse of the inverse. (Note that it is not always possible to use the DFT for this.) In Figure 4, the inverse and the inverse of the inverse are compared with the exponential covariance. The agreement in this case is nearly perfect.

Figure 3. Comparisons of a slice through the two-dimensional "inverse of the inverse" calculated using a digital Fourier transform with the true exponential covariance (solid) for fourth-order terms (top row), sixth-order terms (lower left), and eighth-order terms (lower right).

Figure 4. Purely numerical calculation using a 2-D DFT. On the right the inverse of the inverse is seen to be indistinguishable from the original covariance function.

NUMERICAL INVERSION

Numerical calculation of the kernel function of the inverse of the covariance operator (e.g., using a numerical inverse Hankel transform following Chave, 1983) is generally impossible because the kernel functions involve derivatives of delta functions. Fortunately, for practical applications, we only require values of the covariance and the inverse covariance at grid locations. A seemingly general and efficient method of inversion for the discrete problem is to use the discrete Fourier transform (DFT) in multiple dimensions. Applying the DFT to the problem of calculating the inverse of a 2-D isotropic exponential covariance with range parameter $a$ that is 4 times larger than the grid spacing results in a discrete inverse that appears to be well approximated by a 5 × 5 stencil.

The discrete Fourier transform method is, however, not a foolproof method of determining the convolution inverse for typical covariance functions. The biggest problem is that the magnitude of the Fourier transform is typically very small in some regions of the Fourier domain. Any error in those small values is magnified when the reciprocal is calculated. As would be expected, the Gaussian covariance, because it is smooth at the origin, causes greater problems than the exponential covariance.
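The DFT route can be illustrated with a small one-dimensional analogue (the examples in the text are two-dimensional). The sketch assumes a periodic grid, an exponential covariance with $a = 4h$, and a deliberately naive transform written with the standard library; in practice a library FFT would replace it. All function names are hypothetical.

```python
import cmath
import math

def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (enough for a small demo)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        total = 0j
        for t, v in enumerate(values):
            total += v * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(total / n if inverse else total)
    return out

def periodic_exp_cov(n, a, h=1.0):
    """Exponential covariance sampled on a periodic grid of n points."""
    return [math.exp(-min(t, n - t) * h / a) for t in range(n)]

def dft_inverse_kernel(cov):
    """Convolution inverse on the periodic grid: reciprocal of the spectrum."""
    spectrum = dft(cov)
    reciprocal = [1.0 / s for s in spectrum]
    return [v.real for v in dft(reciprocal, inverse=True)]

def circular_convolve(p, q):
    """Circular convolution of two equal-length sequences."""
    n = len(p)
    return [sum(p[t] * q[(j - t) % n] for t in range(n)) for j in range(n)]

n, a = 32, 4.0
cov = periodic_exp_cov(n, a)
inv = dft_inverse_kernel(cov)
check = circular_convolve(inv, cov)
print([round(v, 4) for v in inv[:4]])
print(round(check[0], 6), max(abs(v) for v in check[1:]))
```

For this well-conditioned kernel the recovered inverse is concentrated in a few terms around the origin and its convolution with the covariance is a unit spike to rounding error. The reciprocal step is where the difficulty described above enters: a spectrum with very small values, as for a Gaussian covariance, would amplify rounding error at exactly that line.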

Another purely numerical approach is to choose, a priori, the size of the stencil, then solve a system of linear equations that must be satisfied by the elements of the stencil in order for the stencil to approximate the convolution inverse. This is approximately the approach that Oldenburg, McGillivray, and Ellis (1993) used except that they started with a 2-D regularization stencil, then numerically calculated its inverse. Two problems can occur in this approach. The first is that it can be very difficult to guess a priori the number of terms that might be needed for the stencil. The second is that determination of the smooth covariance stencil from the regularization stencil is easier than the reverse problem. Despite the potential problems, this approach is certainly a viable and general option.

As an example, consider the 2-D isotropic Gaussian covariance with range parameter $a = h = 1$. The stencil for this covariance, determined from $\exp(-(i^2 + j^2))$, is approximately given by

The goal is to calculate an inverse stencil for this covariance. It is not obvious how big the inverse stencil must be in order to obtain reasonable accuracy, but a 5 × 5 stencil might be a practical first guess. The convolution of the 3 × 3 covariance with the 5 × 5 covariance inverse gives ten independent equations that must be satisfied by the six distinct elements of the inverse stencil. Assuming that I wrote the ten equations correctly, a least-squares solution is given by

We can check this solution by convolving the covariance stencil in Equation (61) with the inverse stencil in Equation (62). The product should have a 1 in the center and 0s everywhere else. Except for a few terms on the edges, the actual product,

is close to the desired result.
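The bookkeeping behind the ten equations and six unknowns can be sketched as follows. This is an illustration under assumptions, not the author's calculation: the 3 × 3 covariance stencil is taken directly from $\exp(-(i^2 + j^2))$, the unknowns are the symmetry classes of a 5 × 5 isotropic stencil, each distinct equation is weighted equally, and the least-squares problem is solved through the normal equations with a small hand-written elimination. All function names are hypothetical.

```python
import math

def canon(i, j):
    """Map an offset to its symmetry class (8-fold symmetry of an isotropic stencil)."""
    i, j = abs(i), abs(j)
    return (max(i, j), min(i, j))

def build_system(cov, cov_half, inv_half):
    """Equations: (cov * inv)(p, q) = 1 at the origin and 0 elsewhere."""
    unknowns = [(i, j) for i in range(inv_half + 1) for j in range(i + 1)]
    index = {key: n for n, key in enumerate(unknowns)}
    out_half = cov_half + inv_half
    rows, rhs = [], []
    for p in range(out_half + 1):
        for q in range(p + 1):
            row = [0.0] * len(unknowns)
            for di in range(-cov_half, cov_half + 1):
                for dj in range(-cov_half, cov_half + 1):
                    si, sj = p - di, q - dj
                    if abs(si) <= inv_half and abs(sj) <= inv_half:
                        row[index[canon(si, sj)]] += cov(di, dj)
            rows.append(row)
            rhs.append(1.0 if (p, q) == (0, 0) else 0.0)
    return unknowns, rows, rhs

def solve_least_squares(rows, rhs):
    """Solve the normal equations by Gaussian elimination with partial pivoting."""
    n = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * b for r, b in zip(rows, rhs)) for i in range(n)]
    aug = [ata[i] + [atb[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def residual_norm2(rows, rhs, x):
    """Sum of squared equation residuals for a candidate stencil."""
    return sum((sum(a * v for a, v in zip(row, x)) - b) ** 2
               for row, b in zip(rows, rhs))

def gauss_cov(i, j):
    """Gaussian covariance at integer offsets, a = h = 1."""
    return math.exp(-(i * i + j * j))

unknowns, rows, rhs = build_system(gauss_cov, cov_half=1, inv_half=2)
weights = solve_least_squares(rows, rhs)
print(len(rows), len(unknowns))          # number of equations, number of unknowns
print(dict(zip(unknowns, [round(w, 4) for w in weights])))
print(residual_norm2(rows, rhs, weights))
```

The keys of the printed dictionary are offsets $(i, j)$ with $0 \le j \le i \le 2$, one per distinct element of the 5 × 5 stencil. Changing `inv_half` shows how the residual responds to the a priori choice of stencil size.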


2-D APPLICATION

After an inverse stencil has been calculated by one of the methods described in the previous sections, the convolution product of the covariance inverse with a two- or three-dimensional property field can be performed quite efficiently. Here I show a simple two-dimensional example of a kriged surface to which the inverse stencil is applied. I began by distributing 40 data locations randomly in a 128 × 128 grid. A property value at each of the chosen locations (Fig. 5) was assigned from the normal distribution, then the property field was kriged using an exponential variogram with a range parameter $a$ four times as large as the grid spacing. The result of kriging is the relatively smooth surface shown on the left side in Figure 6.

Figure 5. Locations of data observations for a test problem.

The inverse stencil for this problem was calculated using a DFT algorithm. Although it qualitatively appeared that a 5 × 5 stencil would be sufficient, I used a 7 × 7 stencil to represent the inverse. A convolution of this stencil with the kriged surface results in the "spiky" surface on the right side of Figure 6. Note that the surface is approximately zero except at data locations. The mean absolute deviation of the inverted field from the true field is approximately 0.2 in this example.
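The same behavior can be seen in a one-dimensional toy version, sketched here under assumptions that differ from the experiment above: the smooth field is a plain superposition of exponential kernels centered at three hypothetical data locations (the dual form of simple kriging with zero mean), and the stencil is the three-point inverse of the sampled exponential covariance, which is exact away from the ends of the grid.

```python
import math

def exact_exp_inverse_stencil(a, h, sigma2=1.0):
    """Three-point inverse of sigma2*exp(-|x|/a) sampled at spacing h."""
    rho = math.exp(-h / a)
    scale = 1.0 / (sigma2 * (1.0 - rho * rho))
    return [-rho * scale, (1.0 + rho * rho) * scale, -rho * scale]

def smooth_field(n, spikes, a, h, sigma2=1.0):
    """Superpose one covariance kernel per data location."""
    return [sum(w * sigma2 * math.exp(-abs(i - loc) * h / a)
                for loc, w in spikes.items())
            for i in range(n)]

def apply_stencil(field, stencil):
    """Convolve the interior of the field with a centered stencil."""
    half = len(stencil) // 2
    out = [0.0] * len(field)
    for i in range(half, len(field) - half):
        out[i] = sum(s * field[i + k - half] for k, s in enumerate(stencil))
    return out

n, a, h = 64, 4.0, 1.0
spikes = {10: 1.5, 30: -0.7, 47: 2.0}    # hypothetical data locations and weights
field = smooth_field(n, spikes, a, h)
spiky = apply_stencil(field, exact_exp_inverse_stencil(a, h))
print({i: round(v, 6) for i, v in enumerate(spiky) if abs(v) > 1e-9})
```

Applying the stencil to the smooth field leaves values only at the data locations, which is the one-dimensional counterpart of the "spiky" surface. The cost is one pass over the grid with a three-term sum per node.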

DISCUSSION

In this paper, I described a number of different ways to calculate the covariance inverse without inverting the full covariance matrix for a parameter field. In some cases, it was possible to invert the covariance function analytically to obtain an inverse covariance operator that involves only low order derivatives. In other cases, it was possible to use the discrete Fourier transform to numerically calculate the inverse. I also demonstrated a fairly general but complex technique that generates the terms in a stencil as a series expansion of the inverse operator in terms of powers of central difference operators. Stencils of different sizes, corresponding to different approximations, sometimes resulted in quite different stencils but gave almost identical results when the stencil was inverted.

There are several reasons that it might be desirable to be able to evaluate the inverse of the model covariance matrix, $C_M^{-1}$. One is to be able to evaluate the "objective function" in reservoir characterization. This is necessary both for implementation of the restricted step method with Gauss-Newton, and for conditioning of Gaussian random fields to production data using simulated annealing or stationary Markov chain Monte Carlo methods.

Note that we will often want to calculate $C_{M'}^{-1}$ where $C_{M'}$ is not the stationary a priori covariance from variogram analysis but rather the conditional (a posteriori) covariance matrix that accounts for the incorporation of static data. In this case we have

and we would typically want to calculate products like the following:

The last term is clearly easy to evaluate if one proceeds by first calculating the product $G(m - m')$. Thus we need only be concerned with the problem of determining $C_M^{-1}$, the inverse of the stationary covariance.

Values of elements in the inverse stencil are functions of grid spacing. This dependence is explicitly shown in Equations (16), (47), and (59). The greatest efficiency is gained if the entire inverse covariance matrix can be replaced by a single, relatively small, stencil. If only one stencil is used but the grids are not uniform, then the effect is to use a different variogram in different parts of the field (actually the same variogram model but a different range). Recall, however, that the purpose of using the stencil is as an approximation to the inverse of the covariance so that a smooth (or small) correction to an unconditional realization can be calculated. The sensitivity functions for most dynamic data are already somewhat smooth and the corrections to the property fields in an iterative updating procedure involve the convolutions of $C^{-1}$ with the sensitivity functions (which are smoother still). For incorporation of many types of dynamic data, doubling the range of the variogram might not make a great deal of difference in the correction surface, although this should probably be investigated.

REFERENCES

Ababou, R., Bagtzoglou, A. C., and Wood, E. F., 1994, On the condition number of covariance matrices in kriging, estimation, and simulation of random fields: Math. Geology, v. 26, no. 1, p. 99-133.

Bracewell, R. N., 1978, The Fourier transform and its applications, 2nd edn.: McGraw-Hill, New York, 444 p.

Chave, A. D., 1983, Numerical integration of related Hankel transforms by quadrature and continued fraction expansion: Geophysics, v. 48, no. 12, p. 1671-1686.

Chung, C. B., and Kravaris, C., 1990, Incorporation of a Priori Information in Reservoir History Matching by Regularization: unpublished manuscript, SPE-21615, available from the Society of Petroleum Engineers, Richardson, Texas, 42 p.

Eddington, A. S., 1913, On a formula for correcting statistics for the effects of a known probable error of observation: Monthly Notices Roy. Astronom. Soc., v. 73, p. 359-360.

Froberg, C.-E., 1965, Introduction to numerical analysis, 2nd edn.: Addison-Wesley, Reading, MA, 433 p.

Hohlfeld, R. G., King, J. I. F., Drueding, T. W., and Sandri, G. v. H., 1993, Solution of convolution integral equations by the method of differential inversion: SIAM Jour. Appl. Math., v. 53, no. 1, p. 154-167.

King, P. R., and Smith, P. J., 1988, Generation of correlated properties in heterogeneous porous media: Math. Geology, v. 20, no. 7, p. 863-877.

Lukas, M. A., 1980, Regularization, in Anderssen, R. S., de Hoog, F. R., and Lukas, M. A., eds., The application and numerical solution of integral equations: Sijthoff & Noordhoff International, Alphen aan den Rijn, The Netherlands, p. 151-182.

Murthy, A. S. V., 1995, A note on the differential inversion method of Hohlfeld et al.: SIAM Jour. Appl. Math., v. 55, no. 3, p. 719-722.

Oldenburg, D. W., McGillivray, P. R., and Ellis, R. G., 1993, Generalized subspace methods for large-scale inverse problems: Geophys. J. Int., v. 114, p. 12-20.

Oliver, D. S., He, N., and Reynolds, A. C., 1996, Conditioning permeability fields to pressure data, in Heinemann, Z. E., and Kriebernegg, M., eds., Proceedings of the 5th European Conference on the Mathematics of Oil Recovery: Mining University Leoben, Austria, p. 259-269.

Tarantola, A., 1987, Inverse problem theory: Methods for data fitting and model parameter estimation: Elsevier, Amsterdam, The Netherlands, 613 p.

Whittle, P., 1954, On stationary processes in the plane: Biometrika, v. 41, p. 434-449.

Wolfram, S., 1996, Mathematica, 3rd ed.: Wolfram Media, Champaign, IL, 1395 p.



APPENDIX. ILL-CONDITIONED MATRIX

One of the problems with calculating the inverse is that different inverses work approximately the same but have quite different magnitudes. This can be seen most clearly in Figure 2. One explanation for the multiplicity of inverses is that the covariance matrix is poorly conditioned (Ababou, Bagtzoglou, and Wood, 1994). If some eigenvalues are very small, the covariance matrix can be decomposed approximately as

where $\Lambda_p$ is a diagonal matrix of those eigenvalues of $C$ whose magnitudes exceed a cutoff value $\epsilon$, that is, $\lambda_n < \epsilon$ for $n > p$. The columns of $U_p$ are the eigenvectors corresponding to the first $p$ eigenvalues of $C$. Because the covariance matrix $C$ is symmetric positive definite, this is equivalent to the singular value decomposition.

If we could show that $C^{-1} \approx U_p \Lambda_p^{-1} U_p^T + \alpha A$ for certain matrices $A$ and arbitrary $\alpha < 1$, then we should be reassured that, although the inverses we calculate may not be unique, they do satisfy the required relation, $CC^{-1} = I$. Let

where $u_n$ is the $n$th eigenvector of $C$ and $n > p$. Clearly, $A$ is symmetric and

because $\lambda_n$ is assumed to be very small.
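One way to fill in this argument is sketched below. It assumes that the perturbation is the rank-one matrix $A = u_n u_n^T$ built from a unit-length discarded eigenvector; that choice is made here for illustration and may differ from the definition in the missing display.

```latex
A = u_n u_n^T, \qquad A^T = A

C A = C\,u_n u_n^T = \lambda_n\, u_n u_n^T, \qquad \|CA\| = \lambda_n < \epsilon

C\bigl(U_p \Lambda_p^{-1} U_p^T + \alpha A\bigr)
  = U_p U_p^T + \alpha\lambda_n\, u_n u_n^T \approx U_p U_p^T
```

Under this assumption, adding $\alpha A$ changes the candidate inverse by an amount of order $\alpha$ while changing its product with $C$ only by $\alpha\lambda_n$, which is consistent with inverses of quite different magnitudes working approximately the same.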
