4. Preprocessing 4.6 Two-dimensional continuous interpolation 3: Kriging...


Page 1: 4. Preprocessing 4.6 Two-dimensional continuous interpolation 3: Kriging ...ua.t.u-tokyo.ac.jp/okabelab/sada/docs/pdf_class/Ch04_… ·  · 2002-05-124.6 Two-dimensional continuous


4. Preprocessing

4.6 Two-dimensional continuous interpolation 3: Kriging - introduction to geostatistics

Spline interpolation was originally developed for image processing. In GIS, it is mainly used in visualization of spatial data, where the appearance of interpolated surface is important.

In geology and geomorphology, on the other hand, a different interpolation method called ‘kriging’ is widely used. It originates in the work of the South African mining engineer D. G. Krige.

4. Preprocessing

Though originally developed for geological analysis, kriging is now widely used not only in geology and geomorphology but also in other fields related to GIS, say, human geography, epidemiology, biostatistics, and archaeology.

4. Preprocessing

Kriging is a part of geostatistics, a branch of statistics that treats spatially distributed (and usually continuous) stochastic phenomena.

Geostatistics includes kriging, spatial modeling of continuous surface, spatial continuous processes, spatiotemporal statistics, and so forth.

4. Preprocessing

• References - geostatistics

1. Isaaks, E. H. and Srivastava, R. M. (1989): An Introduction to Applied Geostatistics, Oxford University Press.

2. Cressie, N. (1993): Statistics for Spatial Data, 2nd Edition, John Wiley.

3. Wackernagel, H. (1995): Multivariate Geostatistics, Springer.

4. Christakos, G. and Hristopulos, D. T. (1998): Spatiotemporal Environmental Health Modelling, Kluwer.

5. Chiles, J.-P. and Delfiner, P. (1999): Geostatistics: Modeling Spatial Uncertainty, John Wiley.

6. Christakos, G. (2000): Modern Spatiotemporal Geostatistics, Oxford University Press.

7. Webster, R. and Oliver, M. A. (2001): Geostatistics for Environmental Scientists, John Wiley.

8. Mallet, J.-C. (2002): Geomodeling, Oxford University Press.

4. Preprocessing

4.6.1 Variogram, covariogram, correlogram

S: Study region
A: Area of S
f(x): Surface function defined in S

Our objective is to estimate f(x) in S from observed data at sample points.

Figure: Study region S with observed values at sample points (83, 66, 74, 76, 80, 61)


4. Preprocessing

In kriging we use one of three functions defined from the surface function to be estimated.

1. variogram
2. covariogram
3. correlogram

To explain those functions, for the present, we assume that the surface function f(x) is known.

4. Preprocessing

The variogram function indicates (half) the average squared difference of the surface values between two points a distance h apart. Therefore, from the variogram function, we can see how f(x) fluctuates in S.

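Definition of variogram

$$
\gamma(h) = \frac{\displaystyle\int_{\mathbf{x}\in S}\int_{\mathbf{t}\in S,\ |\mathbf{x}-\mathbf{t}|=h} \{f(\mathbf{x})-f(\mathbf{t})\}^{2}\, d\mathbf{t}\, d\mathbf{x}}{\displaystyle 2\int_{\mathbf{x}\in S}\int_{\mathbf{t}\in S,\ |\mathbf{x}-\mathbf{t}|=h} d\mathbf{t}\, d\mathbf{x}}
$$

Figure: Example of variogram (γ(h) plotted against h)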

4. Preprocessing

Usually, the variogram function increases monotonically with h, because the surface values at two nearby locations are more similar than those at two distant locations. This gives one theoretical basis for various spatial interpolation methods.

However, we often find variogram functions that do not increase monotonically with h.
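In practice f(x) is known only at sample points, so the integrals in the definition are replaced by averages over pairs of sample points. The following is a minimal sketch of such an empirical variogram, assuming that pairs are grouped into distance bins of equal width; the function name `empirical_variogram`, the binning rule, and the sample data are illustrative assumptions, not part of the original slides.

```python
import math
from itertools import combinations

def empirical_variogram(points, values, bin_width, n_bins):
    """Average half squared difference of values for point pairs, grouped by distance."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i, j in combinations(range(len(points)), 2):
        h = math.dist(points[i], points[j])
        k = int(h // bin_width)  # bin k covers distances [k * bin_width, (k + 1) * bin_width)
        if k < n_bins:
            sums[k] += 0.5 * (values[i] - values[j]) ** 2
            counts[k] += 1
    # None marks distance bins that contain no pair
    return [s / c if c else None for s, c in zip(sums, counts)]

# Hypothetical sample: values increase linearly along a line
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [0.0, 1.0, 2.0, 3.0]
gamma = empirical_variogram(points, values, bin_width=1.0, n_bins=4)
print(gamma)
```

For this linear sample the values grow with distance, which matches the usual monotonically increasing shape of a variogram.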

Figure: Typical variograms (panels: Linear, Spherical, Exponential, Quadratic, Wave, Power)

4. Preprocessing

Anisotropic variogram

The variogram defined above is a function of h only, the distance between two points in S. This implies that it does not consider anisotropy in the fluctuation of the surface function. This type of variogram is thus often called an ‘isotropic variogram’.

On the other hand, an ‘anisotropic variogram’ explicitly considers anisotropy of the surface function. It is a function of both the distance h and the direction θ.


4. Preprocessing

u: Unit vector parallel with the X-axis.

The function γ(h, θ) indicates the fluctuation of the surface function in the direction of angle θ measured counterclockwise from the X-axis.

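Definition of anisotropic variogram

$$
\gamma(h,\theta) = \frac{\displaystyle\int_{\mathbf{x}\in S}\int_{\mathbf{t}\in S,\ |\mathbf{x}-\mathbf{t}|=h,\ \frac{(\mathbf{x}-\mathbf{t})\cdot\mathbf{u}}{|\mathbf{x}-\mathbf{t}|\,|\mathbf{u}|}=\cos\theta} \{f(\mathbf{x})-f(\mathbf{t})\}^{2}\, d\mathbf{t}\, d\mathbf{x}}{\displaystyle 2\int_{\mathbf{x}\in S}\int_{\mathbf{t}\in S,\ |\mathbf{x}-\mathbf{t}|=h,\ \frac{(\mathbf{x}-\mathbf{t})\cdot\mathbf{u}}{|\mathbf{x}-\mathbf{t}|\,|\mathbf{u}|}=\cos\theta} d\mathbf{t}\, d\mathbf{x}}
$$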

4. Preprocessing

Covariogram and correlogram

Covariogram and correlogram are also functions of h, the distance between two points in S, and they also represent the fluctuation of the surface function.

4. Preprocessing

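Covariogram

µ: Mean of the function f(x) in S

$$
C(h) = \frac{\displaystyle\int_{\mathbf{x}\in S}\int_{\mathbf{t}\in S,\ |\mathbf{x}-\mathbf{t}|=h} \{f(\mathbf{x})-\mu\}\{f(\mathbf{t})-\mu\}\, d\mathbf{t}\, d\mathbf{x}}{\displaystyle \int_{\mathbf{x}\in S}\int_{\mathbf{t}\in S,\ |\mathbf{x}-\mathbf{t}|=h} d\mathbf{t}\, d\mathbf{x}}
$$

$$
\mu = \frac{\displaystyle\int_{\mathbf{x}\in S} f(\mathbf{x})\, d\mathbf{x}}{A}
$$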

4. Preprocessing

The covariogram corresponds to the covariance used in general statistics.

If we assume that the surface function f(x) follows a stochastic process, the covariance between its values at two points a distance h apart is given by the covariogram.

4. Preprocessing

Relationship between variogram and covariogram

$$
C(h) = \sigma^{2} - \gamma(h)
$$

σ2: Variance of the function f(x)

$$
\sigma^{2} = \frac{\displaystyle\int_{\mathbf{x}\in S} \{f(\mathbf{x})-\mu\}^{2}\, d\mathbf{x}}{A}
$$

4. Preprocessing

This equation indicates that the variogram and the covariogram are equivalent in the sense that one completely determines the other.

It also shows that, if a variogram is an increasing function of h, the covariogram is a decreasing function of h.
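The relationship can be checked by expanding the squared difference in the variogram. The sketch below writes ⟨·⟩_h for the average over all pairs of points a distance h apart (a shorthand introduced here, not used in the slides) and assumes that the average of {f(x) − µ}² over such pairs equals σ², that is, boundary effects are ignored.

$$
\begin{aligned}
\gamma(h) &= \tfrac{1}{2}\left\langle \{f(\mathbf{x})-f(\mathbf{t})\}^{2} \right\rangle_{h}
 = \tfrac{1}{2}\left\langle \left[\{f(\mathbf{x})-\mu\}-\{f(\mathbf{t})-\mu\}\right]^{2} \right\rangle_{h} \\
&= \tfrac{1}{2}\left\langle \{f(\mathbf{x})-\mu\}^{2} \right\rangle_{h}
 + \tfrac{1}{2}\left\langle \{f(\mathbf{t})-\mu\}^{2} \right\rangle_{h}
 - \left\langle \{f(\mathbf{x})-\mu\}\{f(\mathbf{t})-\mu\} \right\rangle_{h} \\
&= \sigma^{2} - C(h)
\end{aligned}
$$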


4. Preprocessing

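Correlogram

$$
\rho(h) = \frac{\displaystyle\int_{\mathbf{x}\in S}\int_{\mathbf{t}\in S,\ |\mathbf{x}-\mathbf{t}|=h} \{f(\mathbf{x})-\mu\}\{f(\mathbf{t})-\mu\}\, d\mathbf{t}\, d\mathbf{x}}{\displaystyle \sigma^{2}\int_{\mathbf{x}\in S}\int_{\mathbf{t}\in S,\ |\mathbf{x}-\mathbf{t}|=h} d\mathbf{t}\, d\mathbf{x}} = \frac{C(h)}{\sigma^{2}}
$$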

4. Preprocessing

Relationship between variogram and correlogram

$$
\rho(h) = 1 - \frac{\gamma(h)}{\sigma^{2}}
$$

Like the covariogram, the correlogram is usually a monotonically decreasing function of h.

The three functions, variogram, covariogram, and correlogram, are equivalent and interchangeable; given σ2, we can calculate any of the three functions from either of the others.

4. Preprocessing

4.6.2 Outline of kriging

Pi: The ith sample point in S (i=1, 2, ..., n)
zi: The locational vector of Pi
f(x): Surface function to be interpolated in S

Figure: Sample points in S with their observed values

4. Preprocessing

Kriging interpolates the value at a certain location by a weighted summation of the values at surrounding sample points. The estimator function of f(x) is thus given by

$$
\hat{f}(\mathbf{x}) = \sum_{i=1}^{n} w_{i}(\mathbf{x})\, f(\mathbf{z}_{i})
$$

wi(x): Weight function for Pi at x

4. Preprocessing

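To simplify the explanation, vector and matrix notation is introduced.

$$
\mathbf{f} = \begin{pmatrix} f(\mathbf{z}_{1}) \\ f(\mathbf{z}_{2}) \\ \vdots \\ f(\mathbf{z}_{n}) \end{pmatrix},
\qquad
\mathbf{w}(\mathbf{x}) = \begin{pmatrix} w_{1}(\mathbf{x}) \\ w_{2}(\mathbf{x}) \\ \vdots \\ w_{n}(\mathbf{x}) \end{pmatrix}
$$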

4. Preprocessing

Then the estimator function is written as

$$
\hat{f}(\mathbf{x}) = \mathbf{w}(\mathbf{x})^{\mathrm{T}}\,\mathbf{f}
$$

The problem is how we determine the weight function, which is the main issue in kriging.

There are various kriging methods that use different methods of determining the weight function.


4. Preprocessing

In kriging, to specify the weight function w(x), we assume that the surface function f(x) follows a stochastic process, and we treat the surface values at sample points as observed data obtained from that process. The surface function f(x) is not a deterministic function.

This means that we cannot specify the surface function f(x) exactly by observation, for example because of measurement error. Kriging uses the framework of statistics, and this is why the field is often called ‘geostatistics’.

4. Preprocessing

Besides this assumption, every kriging method imposes its own conditions (assumptions) that the weight function w(x) has to satisfy.

We then calculate w(x) based on the conditions, and finally estimate the surface function for the whole region by using

$$
\hat{f}(\mathbf{x}) = \sum_{i=1}^{n} w_{i}(\mathbf{x})\, f(\mathbf{z}_{i})
$$

4. Preprocessing

4.6.3 Simple kriging

Simple kriging puts two assumptions on the behavior of f(x).

Assumption A1: The expectation of the surface function, E[f(x)], is constant in S.

This assumption is not realistic, and because of this disadvantage simple kriging is not used in GIS. It does, however, make the methodology of kriging easier to understand.

4. Preprocessing

In practice, instead of Assumption A1, simple kriging uses the assumption below.

Assumption A1’: The expectation of the surface function is zero in S.

4. Preprocessing

Assumptions A1 and A1’ are equivalent for the following reason. Let µ be the expectation of f(x), that is, E[f(x)]. We then define g(x) by

$$
g(\mathbf{x}) = f(\mathbf{x}) - \mu
$$

Instead of estimating f(x) directly, we obtain the same result by estimating g(x) and adding µ to the estimated g(x). This greatly reduces the amount of calculation.

4. Preprocessing

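Assumption A2: The covariance of the surface function at two locations is given by a function of only the distance between the locations.

The covariance of f(x) at two locations xi and xj is defined by

$$
\begin{aligned}
C\big(f(\mathbf{x}_{i}), f(\mathbf{x}_{j})\big)
&= \mathrm{E}\Big[\big\{f(\mathbf{x}_{i}) - \mathrm{E}[f(\mathbf{x}_{i})]\big\}\big\{f(\mathbf{x}_{j}) - \mathrm{E}[f(\mathbf{x}_{j})]\big\}\Big] \\
&= \mathrm{E}\big[f(\mathbf{x}_{i})\, f(\mathbf{x}_{j})\big]
\end{aligned}
$$

where the second equality uses Assumption A1’.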


4. Preprocessing

By Assumption A2, the covariance of f(x) at two locations xi and xj depends only on the distance between them, so the covariance function becomes

$$
\begin{aligned}
C\big(f(\mathbf{x}_{i}), f(\mathbf{x}_{j})\big)
&= \mathrm{E}\big[f(\mathbf{x}_{i})\, f(\mathbf{x}_{j})\big] \\
&= C\big(|\mathbf{x}_{i} - \mathbf{x}_{j}|\big)
\end{aligned}
$$

4. Preprocessing

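Covariances are usually represented as a matrix called the covariance matrix:

$$
\mathbf{C} = \begin{pmatrix}
C\big(f(\mathbf{z}_{1}), f(\mathbf{z}_{1})\big) & C\big(f(\mathbf{z}_{1}), f(\mathbf{z}_{2})\big) & \cdots & C\big(f(\mathbf{z}_{1}), f(\mathbf{z}_{n})\big) \\
C\big(f(\mathbf{z}_{2}), f(\mathbf{z}_{1})\big) & C\big(f(\mathbf{z}_{2}), f(\mathbf{z}_{2})\big) & \cdots & C\big(f(\mathbf{z}_{2}), f(\mathbf{z}_{n})\big) \\
\vdots & \vdots & \ddots & \vdots \\
C\big(f(\mathbf{z}_{n}), f(\mathbf{z}_{1})\big) & C\big(f(\mathbf{z}_{n}), f(\mathbf{z}_{2})\big) & \cdots & C\big(f(\mathbf{z}_{n}), f(\mathbf{z}_{n})\big)
\end{pmatrix}
$$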

4. Preprocessing

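In simple kriging, the covariance matrix is then written as

$$
\mathbf{C} = \begin{pmatrix}
\sigma^{2} & C(|\mathbf{z}_{1}-\mathbf{z}_{2}|) & \cdots & C(|\mathbf{z}_{1}-\mathbf{z}_{n}|) \\
C(|\mathbf{z}_{2}-\mathbf{z}_{1}|) & \sigma^{2} & \cdots & C(|\mathbf{z}_{2}-\mathbf{z}_{n}|) \\
\vdots & \vdots & \ddots & \vdots \\
C(|\mathbf{z}_{n}-\mathbf{z}_{1}|) & C(|\mathbf{z}_{n}-\mathbf{z}_{2}|) & \cdots & \sigma^{2}
\end{pmatrix},
\qquad \sigma^{2} = C(0)
$$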

4. Preprocessing

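Similarly, to simplify the explanation, the covariance vector is introduced:

$$
\mathbf{c}(\mathbf{x}) = \begin{pmatrix}
C\big(f(\mathbf{x}), f(\mathbf{z}_{1})\big) \\
C\big(f(\mathbf{x}), f(\mathbf{z}_{2})\big) \\
\vdots \\
C\big(f(\mathbf{x}), f(\mathbf{z}_{n})\big)
\end{pmatrix}
= \begin{pmatrix}
C(|\mathbf{x}-\mathbf{z}_{1}|) \\
C(|\mathbf{x}-\mathbf{z}_{2}|) \\
\vdots \\
C(|\mathbf{x}-\mathbf{z}_{n}|)
\end{pmatrix}
$$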

4. Preprocessing

Desirable properties of estimator functions

In kriging, not only in simple kriging, it is desirable that estimator functions have the following properties.

Property P1: unbiasedness
The expectation of the estimator function is equal to the expectation of the original surface function.

Property P2: efficiency
The variance of the estimator function is smaller than that of any other estimator function.

4. Preprocessing

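Expectation of the estimator function:

$$
\begin{aligned}
\mathrm{E}\big[\hat{f}(\mathbf{x})\big]
&= \mathrm{E}\Big[\sum_{i=1}^{n} w_{i}(\mathbf{x})\, f(\mathbf{z}_{i})\Big] \\
&= \mathrm{E}\big[f(\mathbf{x})\big]
\end{aligned}
$$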


4. Preprocessing

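Under Assumption A1’ the expectation of f(x) is zero everywhere, so any linear combination of the observed data also has zero expectation. As long as we use a linear combination of observed data, the estimator function is therefore unbiased independently of the weight function w(x).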

We do not have to take Property P1 into account in estimation of the weight function w(x). We can focus only on property P2 to specify w(x).

4. Preprocessing

Variance of estimator function

Instead of the variance of estimator function, we usually discuss the mean square error (MSE) of estimator function, because it is equivalent to the variance but easier to derive analytically.

4. Preprocessing

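$$
\begin{aligned}
\mathrm{MSE}\big(\hat{f}(\mathbf{x})\big)
&= \mathrm{E}\Big[\big\{\hat{f}(\mathbf{x}) - f(\mathbf{x})\big\}^{2}\Big] \\
&= \mathrm{E}\big[\hat{f}(\mathbf{x})^{2}\big] + \mathrm{E}\big[f(\mathbf{x})^{2}\big] - 2\,\mathrm{E}\big[\hat{f}(\mathbf{x})\, f(\mathbf{x})\big] \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} w_{i}(\mathbf{x})\, w_{j}(\mathbf{x})\, C(|\mathbf{z}_{i}-\mathbf{z}_{j}|) + \sigma^{2} - 2\sum_{i=1}^{n} w_{i}(\mathbf{x})\, C(|\mathbf{x}-\mathbf{z}_{i}|) \\
&= \mathbf{w}(\mathbf{x})^{\mathrm{T}}\mathbf{C}\,\mathbf{w}(\mathbf{x}) + \sigma^{2} - 2\,\mathbf{w}(\mathbf{x})^{\mathrm{T}}\mathbf{c}(\mathbf{x})
\end{aligned}
$$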

4. Preprocessing

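We choose the weight function w(x) that minimizes the variance of the estimator function, represented by the mean square error. Mathematically, we solve

$$
\min_{\mathbf{w}(\mathbf{x})} \mathrm{MSE}\big(\hat{f}(\mathbf{x})\big)
\;\Longleftrightarrow\;
\min_{\mathbf{w}(\mathbf{x})} \Big\{ \mathbf{w}(\mathbf{x})^{\mathrm{T}}\mathbf{C}\,\mathbf{w}(\mathbf{x}) + \sigma^{2} - 2\,\mathbf{w}(\mathbf{x})^{\mathrm{T}}\mathbf{c}(\mathbf{x}) \Big\}
$$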

4. Preprocessing

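We can solve this minimization problem by solving

$$
\frac{\partial\, \mathrm{MSE}\big(\hat{f}(\mathbf{x})\big)}{\partial\, \mathbf{w}(\mathbf{x})} = \mathbf{0}
$$

The result is

$$
\mathbf{w}(\mathbf{x}) = \mathbf{C}^{-1}\mathbf{c}(\mathbf{x})
$$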
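The intermediate step can be written out as follows. This is a short worked sketch that uses only the matrix form of the mean square error and the symmetry of C; the closed form of the minimum in the last line follows by substituting the optimal weights back.

$$
\begin{aligned}
\frac{\partial}{\partial \mathbf{w}}\Big\{ \mathbf{w}^{\mathrm{T}}\mathbf{C}\,\mathbf{w} + \sigma^{2} - 2\,\mathbf{w}^{\mathrm{T}}\mathbf{c}(\mathbf{x}) \Big\}
&= 2\,\mathbf{C}\,\mathbf{w} - 2\,\mathbf{c}(\mathbf{x}) = \mathbf{0}
\;\Longrightarrow\; \mathbf{w}(\mathbf{x}) = \mathbf{C}^{-1}\mathbf{c}(\mathbf{x}) \\
\min \mathrm{MSE}\big(\hat{f}(\mathbf{x})\big)
&= \sigma^{2} - \mathbf{c}(\mathbf{x})^{\mathrm{T}}\mathbf{C}^{-1}\mathbf{c}(\mathbf{x})
\end{aligned}
$$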

4. Preprocessing

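Consequently, the estimator function of the surface is given by

$$
\begin{aligned}
\hat{f}(\mathbf{x})
&= \sum_{i=1}^{n} w_{i}(\mathbf{x})\, f(\mathbf{z}_{i}) \\
&= \big(\mathbf{C}^{-1}\mathbf{c}(\mathbf{x})\big)^{\mathrm{T}}\mathbf{f} \\
&= \mathbf{c}(\mathbf{x})^{\mathrm{T}}\mathbf{C}^{-1}\mathbf{f}
\end{aligned}
$$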
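The whole procedure can be sketched in a few lines of code. The example below is a minimal illustration, not the implementation behind the slides: the helper names `solve`, `covariogram`, and `simple_kriging` are hypothetical, the sample coordinates are invented (the slides do not give them), and the exponential covariogram with parameters 320 and 100 is assumed.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def covariogram(h, sill=320.0, scale=100.0):
    # Assumed exponential covariogram C(h) = sill * exp(-h / scale)
    return sill * math.exp(-h / scale)

def simple_kriging(x, sites, residuals):
    """Return (weights, estimate) at x for zero-mean data (Assumption A1')."""
    cov_matrix = [[covariogram(math.dist(zi, zj)) for zj in sites] for zi in sites]
    cov_vector = [covariogram(math.dist(x, zi)) for zi in sites]
    weights = solve(cov_matrix, cov_vector)  # w(x) = C^{-1} c(x)
    estimate = sum(w * g for w, g in zip(weights, residuals))
    return weights, estimate

# Hypothetical coordinates; residuals are the observed values minus their mean
sites = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0), (50.0, 150.0)]
observed = [176.0, 183.0, 160.0, 122.0, 148.0]
mu = sum(observed) / len(observed)
residuals = [v - mu for v in observed]
weights, g_hat = simple_kriging((50.0, 50.0), sites, residuals)
f_hat = g_hat + mu  # add the mean back to obtain the estimate of f(x)
print(weights, f_hat)
```

Because the coordinates are invented, the weights and the estimate will differ from the figures quoted in the slides. At a sample point the covariance vector equals a column of C, so the estimator reproduces the observed value there.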


4. Preprocessing

But how do we calculate covariance functions?

In simple kriging we arbitrarily choose a covariogram function from typical ones, such as a function that decays exponentially with h or a function that decreases as the reciprocal of a linear function of h, and calculate the covariances.

4. Preprocessing

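An example

Figure: Observed values at five sample points (176.0, 183.0, 160.0, 122.0, 148.0) and the location ‘?’ where the surface value is to be estimated; µ=157.8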

4. Preprocessing

To use Assumption A1’ (E[f(x)]=0 in S), we subtract µ from the observed data of f(x) at sample points.

4. Preprocessing

Figure: Values at the sample points after subtraction (15.63, 22.63, -0.37, -38.37, -12.37) and the location ‘?’ to be estimated; µ=0

4. Preprocessing

We then assume a covariogram function given by

$$
C(h) = 320 \exp\left(-\frac{h}{100}\right)
$$

4. Preprocessing

Figure: Weights for the sample points (0.107, 0.066, 0.128, 0.225, 0.351)

$$
\hat{f}(\mathbf{x}) = 150.5
$$


4. Preprocessing

Limitations of simple kriging

1. The choice of covariogram function is arbitrary.

2. The sum of weight w(x) is not equal to one.

3. It is not realistic to assume that the expectation of f(x) is constant in S. There is usually at least a slight variation in E[f(x)] among locations.
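The second limitation can be checked directly on the simple kriging example: summing the five weights reported there gives a total that is clearly different from one.

$$
0.107 + 0.066 + 0.128 + 0.225 + 0.351 = 0.877 \neq 1
$$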

4. Preprocessing

The first shortcoming can be partly corrected as follows.

We first choose a theoretical (typical) covariogram function that contains several free parameters. We then fit the function to the observed data at sample points and estimate the parameter values by a statistical procedure, for instance, the least squares method.

4. Preprocessing

The second and third problems, on the other hand, cannot be resolved if we continue to use simple kriging.

4. Preprocessing

4.6.4 Ordinary kriging

Ordinary kriging overcomes the second limitation of simple kriging, that is, that the sum of the weights w(x) is not equal to one.

In ordinary kriging the sum of the weights w(x) is equal to one at any location in S.

4. Preprocessing

The weight w(x) is estimated in the same way as in simple kriging:

$$
\min_{\mathbf{w}(\mathbf{x})} \mathrm{MSE}\big(\hat{f}(\mathbf{x})\big)
$$

Ordinary kriging adds one constraint to the minimization problem:

$$
\mathbf{w}(\mathbf{x})^{\mathrm{T}}\mathbf{1} = 1
$$

4. Preprocessing

Since the new minimization problem is an optimization problem with a constraint, it cannot be solved by only differentiating the mean square error of the estimator function of f(x) by w(x).

To solve optimization problems with constraints, we use the method of Lagrange multipliers.


4. Preprocessing

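We introduce a new function λ(x), which is called a Lagrange multiplier. The original problem with a constraint

$$
\min_{\mathbf{w}(\mathbf{x})} \mathrm{MSE}\big(\hat{f}(\mathbf{x})\big)
\quad \text{s.t.} \quad \mathbf{w}(\mathbf{x})^{\mathrm{T}}\mathbf{1} = 1
$$

then becomes a problem without constraints:

$$
\min_{\mathbf{w}(\mathbf{x}),\, \lambda(\mathbf{x})} \Big[ \mathrm{MSE}\big(\hat{f}(\mathbf{x})\big) + \lambda(\mathbf{x})\big\{\mathbf{w}(\mathbf{x})^{\mathrm{T}}\mathbf{1} - 1\big\} \Big]
$$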
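The stationarity conditions of the unconstrained problem can be sketched as follows, writing the objective as L = MSE + λ(x){w(x)ᵀ1 − 1} and using the matrix form of the mean square error derived for simple kriging. The symbol L is introduced here for illustration only.

$$
\begin{aligned}
\frac{\partial L}{\partial \mathbf{w}} &= 2\,\mathbf{C}\,\mathbf{w}(\mathbf{x}) - 2\,\mathbf{c}(\mathbf{x}) + \lambda(\mathbf{x})\,\mathbf{1} = \mathbf{0} \\
\frac{\partial L}{\partial \lambda} &= \mathbf{w}(\mathbf{x})^{\mathrm{T}}\mathbf{1} - 1 = 0
\end{aligned}
\quad\Longrightarrow\quad
\begin{pmatrix} \mathbf{C} & \mathbf{1} \\ \mathbf{1}^{\mathrm{T}} & 0 \end{pmatrix}
\begin{pmatrix} \mathbf{w}(\mathbf{x}) \\ \lambda(\mathbf{x})/2 \end{pmatrix}
=
\begin{pmatrix} \mathbf{c}(\mathbf{x}) \\ 1 \end{pmatrix}
$$

A constant factor in the multiplier does not affect the weights, so λ(x)/2 can simply be renamed as the multiplier.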

4. Preprocessing

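The result is

$$
\mathbf{w}_{+}(\mathbf{x}) = \mathbf{C}_{+}^{-1}\,\mathbf{c}_{+}(\mathbf{x})
$$

where

$$
\mathbf{w}_{+}(\mathbf{x}) = \begin{pmatrix} w_{1}(\mathbf{x}) \\ \vdots \\ w_{n}(\mathbf{x}) \\ \lambda(\mathbf{x}) \end{pmatrix},
\qquad
\mathbf{C}_{+} = \begin{pmatrix}
C\big(f(\mathbf{z}_{1}), f(\mathbf{z}_{1})\big) & \cdots & C\big(f(\mathbf{z}_{1}), f(\mathbf{z}_{n})\big) & 1 \\
\vdots & \ddots & \vdots & \vdots \\
C\big(f(\mathbf{z}_{n}), f(\mathbf{z}_{1})\big) & \cdots & C\big(f(\mathbf{z}_{n}), f(\mathbf{z}_{n})\big) & 1 \\
1 & \cdots & 1 & 0
\end{pmatrix}
$$

$$
\mathbf{c}_{+}(\mathbf{x}) = \begin{pmatrix}
C\big(f(\mathbf{x}), f(\mathbf{z}_{1})\big) \\ \vdots \\ C\big(f(\mathbf{x}), f(\mathbf{z}_{n})\big) \\ 1
\end{pmatrix}
= \begin{pmatrix}
C(|\mathbf{x}-\mathbf{z}_{1}|) \\ \vdots \\ C(|\mathbf{x}-\mathbf{z}_{n}|) \\ 1
\end{pmatrix}
$$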
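The augmented system can be solved numerically in the same way as in simple kriging. The sketch below is illustrative only: the names `solve`, `covariogram`, and `ordinary_kriging` are hypothetical, the coordinates are invented, and the exponential covariogram with parameters 320 and 100 is assumed.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def covariogram(h, sill=320.0, scale=100.0):
    # Assumed exponential covariogram C(h) = sill * exp(-h / scale)
    return sill * math.exp(-h / scale)

def ordinary_kriging(x, sites, observed):
    """Return (weights, multiplier, estimate) at x with the weights summing to one."""
    n = len(sites)
    # Covariance matrix bordered by a column and a row of ones (C+)
    c_plus = [[covariogram(math.dist(zi, zj)) for zj in sites] + [1.0] for zi in sites]
    c_plus.append([1.0] * n + [0.0])
    # Covariance vector extended by the constant 1 (c+)
    rhs = [covariogram(math.dist(x, zi)) for zi in sites] + [1.0]
    solution = solve(c_plus, rhs)
    weights, multiplier = solution[:n], solution[n]
    estimate = sum(w * v for w, v in zip(weights, observed))
    return weights, multiplier, estimate

# Hypothetical coordinates for the five observed values
sites = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0), (50.0, 150.0)]
observed = [176.0, 183.0, 160.0, 122.0, 148.0]
weights, multiplier, f_hat = ordinary_kriging((50.0, 50.0), sites, observed)
print(weights, sum(weights), f_hat)
```

Because the weights sum to one, a constant mean cancels out, so the sketch works on the observed values directly without subtracting µ first.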

4. Preprocessing

In ordinary kriging the covariogram function is arbitrarily chosen from typical theoretical functions, or estimated from the observed data at sample points by the least squares method.

4. Preprocessing

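An example

Figure: Observed values at five sample points (176.0, 183.0, 160.0, 122.0, 148.0) and the location ‘?’ where the surface value is to be estimated; µ=157.8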

4. Preprocessing

To estimate the surface function, we again arbitrarily specify a covariogram function, which is given by

$$
C(h) = 320 \exp\left(-\frac{h}{100}\right)
$$

4. Preprocessing

Figure: Weights of the sample points (0.141, 0.094, 0.151, 0.251, 0.363)

$$
\hat{f}(\mathbf{x}) = 150.5
$$


4. Preprocessing

The surface value estimated by ordinary kriging is identical to that by simple kriging. This, however, happened only by chance. These two methods usually yield different results.

4. Preprocessing

Limitations of ordinary kriging

Ordinary kriging inherits one disadvantage from simple kriging:

It is not realistic to assume uniformity in the expectation of f(x). There is usually at least a slight variation in E[f(x)] among locations.

4. Preprocessing

4.6.5 Universal kriging

Universal kriging overcomes the limitation of ordinary kriging.

Universal kriging assumes heterogeneity in the expectation of f(x) in S.

4. Preprocessing

Universal kriging is based on the following four assumptions.

Assumption A1: The surface function f(x) is not deterministic but follows a stochastic process.

Assumption A2: The expectation of the surface function, E[f(x)], is given by a function of location x indicated by µ(x).

Assumption A3: The covariance of f(x) between two locations is given by a function of only the distance between the locations.

Assumption A4: The sum of the weight function w(x) is equal to one at any location.

4. Preprocessing

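The mean square error is then given by

$$
\begin{aligned}
\mathrm{MSE}\big(\hat{f}(\mathbf{x})\big)
&= \mathrm{E}\Big[\big\{\hat{f}(\mathbf{x}) - f(\mathbf{x})\big\}^{2}\Big] \\
&= \mathbf{w}(\mathbf{x})^{\mathrm{T}}\mathbf{C}\,\mathbf{w}(\mathbf{x}) + \sigma^{2} - 2\,\mathbf{w}(\mathbf{x})^{\mathrm{T}}\mathbf{c}(\mathbf{x}) + \big\{\mathbf{w}(\mathbf{x})^{\mathrm{T}}\boldsymbol{\mu} - \mu(\mathbf{x})\big\}^{2}
\end{aligned}
$$

where µ collects the expectations at the sample points:

$$
\boldsymbol{\mu} = \begin{pmatrix} \mu(\mathbf{z}_{1}) \\ \mu(\mathbf{z}_{2}) \\ \vdots \\ \mu(\mathbf{z}_{n}) \end{pmatrix}
$$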


4. Preprocessing

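We then solve

$$
\min_{\mathbf{w}(\mathbf{x}),\, \mu(\mathbf{x})} \mathrm{MSE}\big(\hat{f}(\mathbf{x})\big)
\quad \text{s.t.} \quad \mathbf{w}(\mathbf{x})^{\mathrm{T}}\mathbf{1} = 1
$$

by the method of Lagrange multipliers to estimate both w(x) and µ(x) simultaneously.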

4. Preprocessing

4.6.6 Advanced kriging

1) Block kriging

It often happens in spatial analysis that we need only the mean value of f(x) in a subregion in S, instead of f(x) for the whole region S.

Block kriging estimates the average value of f(x) in a region in S without estimating the whole surface. Block kriging is thus more efficient than universal kriging.

4. Preprocessing

2) Cokriging
Cokriging interpolates two surfaces simultaneously, say, f(x) and g(x), by using two sets of measured values at sample points. In short, to interpolate f(x) or g(x), cokriging uses twice as much information as universal kriging does.

Though we might expect the result to be twice as accurate as that of universal kriging, whether or not cokriging works successfully depends on the correlation between the two functions.

4. Preprocessing

3) Disjunctive kriging
Instead of a linear combination of observed data,

$$
\hat{f}(\mathbf{x}) = \sum_{i=1}^{n} w_{i}(\mathbf{x})\, f(\mathbf{z}_{i})
$$

disjunctive kriging uses a nonlinear function of the data to estimate the surface value at a location. This increases the flexibility of the estimated surface function, but decreases the efficiency of computation.

4. Preprocessing

4.6.7 Comparison of spline and kriging

Kriging is the method of spatial interpolation most frequently used in GIS and spatial analysis. The GIS community prefers kriging to spline.

Kriging may seem rather complicated and its mathematical background difficult to understand. However, today’s GIS and its add-in extensions can estimate the weight function and interpolate the surface function automatically. You can do kriging without complete understanding of its theoretical background.

4. Preprocessing

Item                          Spline                            Kriging
Original application fields   Image processing                  Geology, Geomorphology
Today’s application fields    Image processing, Visualization   Spatial preprocessing, Spatial analysis
Evaluation of result          Appearance                        Goodness of fit as a model
Theoretical background        Not clear                         Statistics
Software packages             Available                         Available


4. Preprocessing

4.6.8 Cross validation

One of the difficulties in spatial interpolation is that we cannot tell to what degree an interpolated function is accurate.

To evaluate its accuracy, we estimate the function value at a sample point from observed data at other sample points, and calculate the estimation error at the sample point. We repeat this process for all the sample points, and call this process ‘cross validation’.
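The procedure described above is often called leave-one-out cross validation. The following is a minimal sketch of it; the function names `idw` and `leave_one_out_errors` and the sample data are hypothetical, and inverse distance weighting is used only as a simple stand-in for whatever interpolator is being evaluated.

```python
import math

def idw(x, sites, values, power=2.0):
    """Inverse-distance-weighted estimate at x (a stand-in for any interpolator)."""
    num = den = 0.0
    for z, v in zip(sites, values):
        d = math.dist(x, z)
        if d == 0.0:
            return v  # x coincides with a sample point
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den

def leave_one_out_errors(sites, values, interpolate):
    """Estimate each sample value from all other samples and return the errors."""
    errors = []
    for i in range(len(sites)):
        other_sites = sites[:i] + sites[i + 1:]
        other_values = values[:i] + values[i + 1:]
        estimate = interpolate(sites[i], other_sites, other_values)
        errors.append(estimate - values[i])
    return errors

# Hypothetical sample points and observed values
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
values = [10.0, 12.0, 14.0, 16.0]
errors = leave_one_out_errors(sites, values, idw)
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
print(errors, rmse)
```

The root mean square of the errors gives one summary figure that can be compared across interpolation methods or parameter settings.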

4. Preprocessing

4.6.9 Subjects for further research

1. Evaluation of interpolation accuracy
2. Comparison of interpolation methods
3. Spatiotemporal interpolation
4. Three or higher dimensional spatial interpolation

4. Preprocessing

4.7 Spatial smoothing

Spatial smoothing is another method of generating a continuous surface from the data measured at sample points.

However, spatial smoothing is different from spatial interpolation.

4. Preprocessing

Spatial interpolation and spatial smoothing

Spatial interpolation:
The estimated function always passes through the observed data points exactly. The function values measured at sample points are assumed to be accurate.

Spatial smoothing:
The estimated function does not necessarily pass through the observed data points. The function values measured at sample points are considered to be inaccurate to some extent because of measurement error.

4. Preprocessing

Therefore, if observed data are reliable, we can choose spatial interpolation. Spatial interpolation is also used when the quality of observed data is not known.

On the other hand, if observed data are inaccurate to some extent, we should use spatial smoothing. Spatial smoothing is also useful if we want smooth surfaces. Spatial smoothing generally yields smoother surfaces because it permits the surface not to pass through all the observed data points. Spatial smoothing is more flexible.

4. Preprocessing

Figure: Spatial interpolation and spatial smoothing of the same sample points


4. Preprocessing

Two types of spatial smoothing

Global smoothing:
Global smoothing fits one function to the observed data at sample points.

Local smoothing:
Local smoothing divides the data region into subregions and fits different functions in individual subregions.

4. Preprocessing

4.7.1 Global smoothing: Trend surface analysis

The trend surface analysis fits a polynomial function to the observed data at sample points.

Polynomial function of degree two (in each of x and y):

$$
\begin{aligned}
f(\mathbf{x}) = {} & a_{00} + a_{10}x + a_{01}y + a_{11}xy + a_{20}x^{2} + a_{02}y^{2} \\
& + a_{21}x^{2}y + a_{12}xy^{2} + a_{22}x^{2}y^{2}
\end{aligned}
$$

Parameters are estimated so as to minimize the mean square error at the sample points.

Figure: Observed data at sample points

Figure: Trend surface fitted to the data
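The least squares fit of the polynomial can be sketched as follows. This is a minimal illustration under stated assumptions: the names `solve`, `design_row`, `fit_trend_surface`, and `trend_surface` are hypothetical, the data are synthetic, and the parameters are obtained from the normal equations, which is adequate for small well-conditioned examples only.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def design_row(x, y):
    # Terms x^p * y^q for p, q in {0, 1, 2}; the coefficient of x^p * y^q is a_pq
    return [x ** p * y ** q for p in range(3) for q in range(3)]

def fit_trend_surface(sites, values):
    """Least squares estimate of the nine coefficients (needs at least nine points)."""
    rows = [design_row(x, y) for x, y in sites]
    k = len(rows[0])
    # Normal equations (X^T X) a = X^T h
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xth = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(k)]
    return solve(xtx, xth)

def trend_surface(coeffs, x, y):
    return sum(a * t for a, t in zip(coeffs, design_row(x, y)))

# Synthetic data on a 4 x 4 lattice generated from a known polynomial
sites = [(float(i), float(j)) for i in range(4) for j in range(4)]
values = [1.0 + 0.5 * x - 0.2 * y + 0.1 * x * y for x, y in sites]
coeffs = fit_trend_surface(sites, values)
print(trend_surface(coeffs, 1.5, 2.5))
```

Since the synthetic data lie exactly on a polynomial of the fitted form, the fitted surface reproduces it up to rounding error.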

4. Preprocessing

Note

A polynomial function fits the observed data better as its degree increases.

However, polynomial functions of higher degrees tend to fluctuate greatly. They are less stable than those of lower degrees, as also seen in spatial interpolation, regression models, etc.

It is not always better to use polynomial functions of higher degrees. The balance between the fitness of function to observed data and the stability of function is important.

4. Preprocessing

4.7.2 Local smoothing 1: moving average

There are several methods of local smoothing.

In the following, for simple explanation, we consider one-dimensional local smoothing instead of two-dimensional local smoothing.

Extension of one-dimensional local smoothing to the two-dimensional case is quite natural and thus easy to perform.


4. Preprocessing

In moving average method, the surface value at x, denoted by f(x), is given by the average of values measured at sample points located within a certain distance w from x.

The distance w is called the ‘window width’.

This method is called ‘moving average’ because we move x in S, calculating the average of observed data at sample points, in order to generate a surface for the whole region in S.

4. Preprocessing

The result depends on the window width.

If we choose a narrow window, only a few sample points are used to calculate the surface. As a result, the surface obtained fluctuates strongly.

If we choose a wide window, many sample points are taken into account in calculation. This yields a smooth surface.
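The effect of the window width can be seen in a small one-dimensional sketch. The function name `moving_average` and the sample data are hypothetical; returning `None` for an empty window is a choice made here, since the slides do not say what happens when no sample point falls inside the window.

```python
def moving_average(x, sites, values, width):
    """Mean of the values observed within distance `width` of x (one-dimensional)."""
    inside = [v for z, v in zip(sites, values) if abs(z - x) <= width]
    # None signals that no sample point falls inside the window
    return sum(inside) / len(inside) if inside else None

# Hypothetical sample points on a line and their observed values
sites = [0.0, 1.0, 2.0, 3.0, 4.0]
values = [2.0, 4.0, 3.0, 7.0, 5.0]

narrow = [moving_average(x, sites, values, 0.5) for x in sites]
wide = [moving_average(x, sites, values, 1.5) for x in sites]
print(narrow)  # each window holds one sample point, so the data are reproduced
print(wide)    # neighbouring values are averaged, so the result is smoother
```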

Figure: Original data and linear interpolation

Figure: Moving average (narrow window)

Figure: Moving average (wide window)

4. Preprocessing

Properties of moving average

Wider windows give smoother functions.

However, the function obtained is usually discontinuous and stepwise, though it is smoother than the original data. Its appearance is poor, so the moving average is not used in visualization of spatial data.


4. Preprocessing

This is because the window function is discrete.

Compare two close sample points, one located inside a window and the other outside the window. The former is taken into account in smoothing calculation while the latter is completely neglected. Only a slight difference in the location of x yields a big difference in the surface function f(x).

4. Preprocessing

4.7.3 Local smoothing 2: weighted moving average

The surface function f(x) is given by the sum of observed data at sample points weighted according to the distance from x.

zi: Locational vector of the ith sample point
hi: Observed data at the ith sample point
w(d): Weight function of distance d

$$
f(\mathbf{x}) = \frac{\displaystyle\sum_{i} w(|\mathbf{x}-\mathbf{z}_{i}|)\, h_{i}}{\displaystyle\sum_{i} w(|\mathbf{x}-\mathbf{z}_{i}|)}
$$

4. Preprocessing

Property of weight functions

A wide variety of functions can be used as the weight function.

However, to obtain natural surfaces, the weight should at least decrease monotonically with an increase of the distance from a sample point, which is denoted by d. In short, w(d) should be a decreasing function of d. This type of function is called ‘distance-decay function’.

4. Preprocessing

Examples of the weight function

The following are typical weight functions used in weighted moving average.

$$
w(d) = \frac{1}{d^{n}} \quad (n = 1, 2, \ldots)
$$

$$
w(d) = \exp(-\alpha d)
$$

Figure: Weighted moving average with w(d) = 1/d

Figure: Weighted moving average with w(d) = 1/d^2
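The weighted moving average with these weight functions can be sketched in one dimension as follows. The names `inverse_power`, `exponential`, and `weighted_moving_average` are hypothetical, and returning the observed value when x coincides with a sample point (where 1/d^n is undefined) is an assumption made for this sketch.

```python
import math

def inverse_power(n):
    """Weight function w(d) = 1 / d^n."""
    return lambda d: 1.0 / d ** n

def exponential(alpha):
    """Weight function w(d) = exp(-alpha * d)."""
    return lambda d: math.exp(-alpha * d)

def weighted_moving_average(x, sites, values, weight):
    """Distance-weighted mean of the observed values at x (one-dimensional)."""
    num = den = 0.0
    for z, h in zip(sites, values):
        d = abs(x - z)
        try:
            w = weight(d)
        except ZeroDivisionError:
            return h  # x lies on a sample point and the weight is unbounded
        num += w * h
        den += w
    return num / den

# Hypothetical data: two sample points with observed values 2 and 4
sites = [0.0, 1.0]
values = [2.0, 4.0]
print(weighted_moving_average(0.25, sites, values, inverse_power(1)))
print(weighted_moving_average(0.25, sites, values, inverse_power(2)))
print(weighted_moving_average(0.25, sites, values, exponential(1.0)))
```

With a larger power n the nearer sample point receives relatively more weight, so the estimate at x = 0.25 moves closer to the value observed at 0.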


4. Preprocessing

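If we use a negative power function of d as the weight function,

$$
w(d) = \frac{1}{d^{n}}
$$

then, as n increases, the nearest sample point increasingly dominates the weighted sum, and the surface function gets closer to the stepwise surface that gives each location the value of its nearest sample point.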

4. Preprocessing

4.7.4 Local smoothing 3: kernel smoothing

Kernel smoothing was originally developed for estimating the probability density function of a point distribution from observed data.

Kernel smoothing is substantially the same operation as the weighted moving average. Kernel smoothing sums up the observed data measured at sample points weighted by a distance-decay function.

4. Preprocessing

In kernel smoothing the weight function is called ‘kernel function’. The normal (Gaussian) distribution and the quadratic function are generally used as the kernel function. Kernel smoothing will be discussed in more detail in Chapter 6 with its application to visual analysis.

In GIS, ‘weighted moving average’ and ‘kernel smoothing’ often refer to the same operation.

4. Preprocessing

4.7.5 Local smoothing 4: spline smoothing

Spline interpolation can also be applied to spatial smoothing.

In spatial interpolation, we estimate spline functions each of which is defined in a subregion bounded by knots. The number and location of knots are given arbitrarily in advance; sample points are often used as knots.

4. Preprocessing

In spatial smoothing, on the other hand, we determine the subregions (knots) and the surface function simultaneously from observed data. The number and location of knots are not determined in advance; they are also determined from observed data.

Spline smoothing is more complicated than spline interpolation because the former has more free parameters to be estimated than the latter.

Figure: Observed data at sample points


Figure: Spline smoothing by the quadratic function (knots marked)

Figure: Spline smoothing by the cubic function (knots marked)


4.7.6 Applications of spatial smoothing

Spatial smoothing is useful for

1. estimating the surface function for the whole region from the sample data, and

2. visualizing the global structure of the surface function.

Figure: Density distribution of NO2 and OX

Figure: Distribution of members of athletic clubs


4.8 Raster-vector conversion

Raster-vector conversion consists of two different conversions:

1. raster-to-vector conversion,
2. vector-to-raster conversion.


4.8.1 Vector-to-raster conversion

Vector-to-raster conversion is straightforward.

1. A lattice is overlaid on the vector data.
2. The cells that the lines pass through are extracted.
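The two steps can be sketched as follows. This is a simple approximation, assuming a square lattice with its origin at (0, 0) and finding the cells by sampling points densely along each line segment; an exact grid-traversal algorithm would be needed to guarantee that no touched cell is missed. The function names and the `samples_per_cell` parameter are hypothetical.

```python
import math

def rasterize_segment(p, q, cell=1.0, samples_per_cell=8):
    """Approximate the set of lattice cells crossed by the segment p-q."""
    (x0, y0), (x1, y1) = p, q
    length = math.hypot(x1 - x0, y1 - y0)
    steps = max(1, math.ceil(length / cell * samples_per_cell))
    cells = set()
    for i in range(steps + 1):
        t = i / steps
        x = x0 + t * (x1 - x0)
        y = y0 + t * (y1 - y0)
        # Cell index = (column, row) of the lattice cell containing the point.
        cells.add((math.floor(x / cell), math.floor(y / cell)))
    return cells

def rasterize_polyline(points, cell=1.0):
    """Union of the cells crossed by each segment of a polyline."""
    cells = set()
    for p, q in zip(points, points[1:]):
        cells |= rasterize_segment(p, q, cell)
    return cells

line = [(0.5, 0.5), (3.5, 0.5), (3.5, 2.5)]
print(sorted(rasterize_polyline(line)))
```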

Figure: Original vector data

Figure: Overlay of a lattice

Figure: Raster data


4.8.2 Raster-to-vector conversion

In contrast, raster-to-vector conversion, the reverse of vector-to-raster conversion, is very difficult.

Raster-to-vector conversion is indispensable if we want to create vector data from scanned images of paper maps automatically.

Many algorithms for raster-to-vector conversion have been proposed in the literature. In the following, a typical algorithm is explained.


A raster-to-vector conversion consists of three steps.

1. Line thinning
2. Cell connection
3. Elimination of redundant lines


Line thinning

Line thinning is a process of thinning lines represented by a set of cells in raster data. It peels off the ‘skin’ of the lines layer by layer, starting from the outer boundary.
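The slides do not say which thinning algorithm is used. As one illustration of the peeling idea, the sketch below uses the well-known Zhang-Suen thinning algorithm, which repeatedly removes boundary cells in two sub-steps until nothing changes. The representation of the raster as a set of `(row, column)` cells and the function names are assumptions of this sketch.

```python
def ring(r, c):
    """The eight neighbours P2..P9, clockwise from the cell above."""
    return [(r - 1, c), (r - 1, c + 1), (r, c + 1), (r + 1, c + 1),
            (r + 1, c), (r + 1, c - 1), (r, c - 1), (r - 1, c - 1)]

def zhang_suen_thin(cells):
    """Thin a set of (row, col) line cells with Zhang-Suen thinning."""
    cells = set(cells)
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_remove = set()
            for (r, c) in cells:
                p = [1 if n in cells else 0 for n in ring(r, c)]
                b = sum(p)  # number of occupied neighbours
                # number of 0 -> 1 transitions around the ring
                a = sum(1 for i in range(8)
                        if p[i] == 0 and p[(i + 1) % 8] == 1)
                if not (2 <= b <= 6 and a == 1):
                    continue
                p2, p4, p6, p8 = p[0], p[2], p[4], p[6]
                if step == 0:
                    ok = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                else:
                    ok = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                if ok:
                    to_remove.add((r, c))
            if to_remove:
                # Remove all marked cells at once: one layer of 'skin'.
                cells -= to_remove
                changed = True
    return cells

# A horizontal bar three cells thick and seven cells long.
bar = {(r, c) for r in range(1, 4) for c in range(1, 8)}
print(sorted(zhang_suen_thin(bar)))
```

Each pass of the outer loop corresponds to one peeling of the outer skin; the loop stops when a pass removes no cell.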

Figure: Original raster data

Figure: Raster data obtained by peeling the outer skin

Figure: Raster data obtained after the second peeling


Cell connection

After thinning lines, we connect neighboring cells by vector line segments.

There are two definitions of ‘neighborhood’ of a cell: 4-connected neighborhood and 8-connected neighborhood.


4-connected neighborhood: the upper, lower, right, and left neighboring cells

8-connected neighborhood: the four neighboring cells above plus the four diagonal neighboring cells
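Cell connection under the two neighborhood definitions can be sketched as follows. This is a minimal illustration, assuming thinned cells are stored as `(row, column)` pairs and each vector line segment is represented as an unordered pair of cells; the names `N4`, `N8`, and `connect_cells` are hypothetical.

```python
# Offsets (d_row, d_col) of the neighbouring cells.
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]

def connect_cells(cells, offsets):
    """Return the set of line segments joining neighbouring cells.

    Each segment is a frozenset of two cells, so (a, b) and (b, a)
    count as the same segment.
    """
    cells = set(cells)
    segments = set()
    for (r, c) in cells:
        for dr, dc in offsets:
            neighbour = (r + dr, c + dc)
            if neighbour in cells:
                segments.add(frozenset(((r, c), neighbour)))
    return segments

diagonal = {(0, 0), (1, 1)}
print(len(connect_cells(diagonal, N4)))  # diagonal cells are not joined
print(len(connect_cells(diagonal, N8)))  # diagonal cells are joined

corner = {(0, 0), (0, 1), (1, 1)}
print(len(connect_cells(corner, N4)))
print(len(connect_cells(corner, N8)))
```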


Figure: Raster data after line thinning

Figure: Vector data obtained by 4-connected algorithm

Figure: Vector data obtained by 8-connected algorithm


Elimination of redundant lines

The 4-connected algorithm often fails to connect cells located diagonally. The 8-connected algorithm, on the other hand, generates redundant lines.

We thus apply the 8-connected algorithm to the raster data and then eliminate the redundant lines.


Elimination rule

If two diagonally neighboring cells are connected by both orthogonal lines and a diagonal line, eliminate the diagonal line.
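The elimination rule can be sketched as follows. This sketch reads the rule as: a diagonal segment is redundant when its two end cells are also joined by two orthogonal segments through a shared neighboring cell. That reading, the segment representation (an unordered pair of `(row, column)` cells), and the function names are assumptions.

```python
def segment(a, b):
    """An undirected line segment between two cells."""
    return frozenset((a, b))

def eliminate_redundant_diagonals(segments):
    """Drop diagonal segments that duplicate an orthogonal two-step path."""
    segments = set(segments)
    kept = set(segments)
    for s in segments:
        (r1, c1), (r2, c2) = tuple(s)
        if abs(r1 - r2) != 1 or abs(c1 - c2) != 1:
            continue  # orthogonal segment: always kept
        # The two cells that share an edge with both end cells.
        for corner in ((r1, c2), (r2, c1)):
            if (segment((r1, c1), corner) in segments
                    and segment(corner, (r2, c2)) in segments):
                kept.discard(s)
                break
    return kept

segments = {
    segment((0, 0), (0, 1)),  # orthogonal
    segment((0, 1), (1, 1)),  # orthogonal
    segment((0, 0), (1, 1)),  # diagonal, redundant
    segment((1, 1), (2, 2)),  # diagonal, the only link to (2, 2)
}
print(len(eliminate_redundant_diagonals(segments)))
```

Diagonal segments with no orthogonal alternative are kept, so cells that only touch diagonally stay connected.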

Figure: Vector data obtained after elimination of redundant lines


Comparison of the original and retrieved data

Green lines: original vector data
Yellow lines: vector data retrieved from the raster data generated by the vector-to-raster conversion.


Once vector data are rasterized, their retrieval is possible to some extent, but the product is not identical to the original vector data. Raster-to-vector conversion is not yet practical in spatial preprocessing.

We should remember this when we create spatial data, especially when we decide on the data format (raster or vector), because we cannot fully retrieve vector data from raster data once the original vector data are lost.


4.9 Areal interpolation

Areal interpolation is a process of transferring attribute data aggregated by one zonal system to another zonal system.

It is necessary when, for example, we use two spatial datasets simultaneously, one reported by census districts and the other by a square lattice. We have to transfer the attribute data of the former dataset into the latter one.

Figure: Data reported by census tracts

Figure: How many people are there in the circle?


Terminology

Source zones: Source zones are spatial units used for aggregating spatial data and reporting the aggregated attribute data.

Target zones: Target zones are spatial units in which we want to know the attribute data.

Figure: Source zones and a target zone


4.9.1 Areal weighting method

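The areal weighting method assumes that spatial objects are uniformly distributed within each source zone.

This method thus allocates the attribute data of each source zone in proportion to the area of its subregions, that is, the parts of the source zone that fall inside and outside the target zone.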
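The allocation can be sketched as follows. This is a minimal illustration, assuming the area of each source zone and the area of its overlap with the target zone are already known (for example from a polygon overlay); the function name, the tuple layout, and the numbers are hypothetical.

```python
def areal_weighting(source_zones):
    """Estimate the attribute value of a target zone.

    source_zones: list of (value, zone_area, overlap_area) tuples, where
    overlap_area is the part of the source zone inside the target zone.
    """
    total = 0.0
    for value, zone_area, overlap_area in source_zones:
        if zone_area <= 0 or not (0 <= overlap_area <= zone_area):
            raise ValueError("inconsistent areas")
        # Under the uniformity assumption, the share of the attribute
        # inside the target zone equals the share of the area.
        total += value * overlap_area / zone_area
    return total

# Hypothetical example: three source zones, two of which overlap the target.
zones = [(12, 4.0, 1.0), (7, 2.0, 1.0), (9, 3.0, 0.0)]
print(areal_weighting(zones))  # 12 * 1/4 + 7 * 1/2 + 9 * 0 = 6.5
```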


Figure: Estimation of attribute data in a target zone


Because of the uniformity assumption, the areal weighting method does not work well when the distribution of spatial objects is not uniform, for example when they are strongly clustered.

In urban areas, population distribution is globally smooth and often uniform, but it varies greatly at a local scale. Therefore, in estimating population distribution, the areal weighting method is appropriate for global interpolation but not for local interpolation.


4.9.2 Point-in-polygon method


The point-in-polygon method can be used when every source zone has its own representative point, which is usually located at the centroid (center of gravity) of the zone.


The point-in-polygon method assumes that in each source zone all the spatial objects are located exactly at the representative point, and assigns the attribute value of the zone to that point.

The attribute value of the target zone is then calculated by summing up the attribute values of all the representative points contained in the target zone.
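The summation can be sketched as follows. This is a minimal illustration, assuming the target zone is a simple polygon given as a list of vertices and using the standard ray-casting test to decide whether a representative point lies inside it; the function names, coordinates, and attribute values are hypothetical.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if the point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def point_in_polygon_estimate(representative_points, target_polygon):
    """Sum the values of the representative points inside the target zone.

    representative_points: list of ((x, y), value) pairs.
    """
    return sum(value for point, value in representative_points
               if point_in_polygon(point, target_polygon))

target = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
points = [((1.0, 1.0), 7), ((3.0, 3.0), 12), ((5.0, 5.0), 6), ((2.0, 6.0), 9)]
print(point_in_polygon_estimate(points, target))  # 7 + 12 = 19
```

Points that fall exactly on the boundary of the target zone are not treated specially in this sketch.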


Figure: Estimation of attribute data in a target zone


In contrast to the areal weighting method, the point-in-polygon method works well when spatial objects are clustered around the representative points.

In Japan, the point-in-polygon method is used to create census data in raster format (mesh data) from the original census data aggregated by census tracts.


4.9.3 Advanced methods of areal interpolation

• Kernel method
Kernel method assumes that in each source zone spatial objects are distributed around the representative point following a kernel function, a decreasing function of the distance from the representative point.

Kernel method puts small bumps called ‘kernels’ centered at representative points, whose summation represents density distribution of spatial objects. We calculate attribute data in target zones by integrating the density function.
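The kernel method can be sketched as follows. This is a minimal illustration, assuming a two-dimensional Gaussian kernel with bandwidth `h`, a rectangular target zone, and numerical integration by the midpoint rule on a regular grid; all of these choices and all names are assumptions of this sketch, not prescribed by the slides.

```python
import math

def density(x, y, points, h):
    """Sum of Gaussian kernels centred at the representative points.

    points: list of (px, py, value); each kernel integrates to its value.
    """
    total = 0.0
    for px, py, value in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += value * math.exp(-d2 / (2 * h * h)) / (2 * math.pi * h * h)
    return total

def integrate_over_rectangle(points, h, xmin, ymin, xmax, ymax, n=100):
    """Midpoint-rule integral of the density over a rectangular target zone."""
    dx = (xmax - xmin) / n
    dy = (ymax - ymin) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += density(xmin + (i + 0.5) * dx,
                             ymin + (j + 0.5) * dy, points, h)
    return total * dx * dy

points = [(0.0, 0.0, 100.0)]  # one source zone with 100 objects
# Target zone covering the right half of the plane around the point:
print(round(integrate_over_rectangle(points, 1.0, 0.0, -8.0, 8.0, 8.0), 2))
```

A target zone that covers essentially the whole kernel recovers the full attribute value of the source zone, while a zone that cuts the kernel in half receives about half of it.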


• Intelligent methods
Intelligent methods use additional information about the distribution of spatial objects. In estimating population distribution, land use and land cover data are often used. This drastically improves the accuracy of the attribute data estimated in target zones.


Homework Q.4.3 (15 pts)

1) Discuss the similarities and differences between spatial interpolation and spatial smoothing.

2) Discuss how to choose between the two spatial operations, spatial interpolation and spatial smoothing.

3) Indicate two concrete applications of those operations: in one case spatial interpolation is more appropriate than spatial smoothing, while in the other case spatial smoothing is suitable.


Homework Q.4.4 (20 pts)

Universal kriging is defined as a mathematical optimization problem with constraints:

$$\min_{\mathbf{w}(\mathbf{x})} \ \mathrm{MSE}\left(\hat{f}(\mathbf{x})\right) \quad \text{s.t.} \quad \mathbf{w}(\mathbf{x})^{T}\mathbf{1} = 1$$

Show the process of solving this problem and give explicit forms of w(x) and µ(x).