Environmental Data Analysis with MatLab Lecture 21: Interpolation.

Post on 13-Dec-2015


Environmental Data Analysis with MatLab

Lecture 21:

Interpolation

SYLLABUS

Lecture 01 Using MatLab
Lecture 02 Looking At Data
Lecture 03 Probability and Measurement Error
Lecture 04 Multivariate Distributions
Lecture 05 Linear Models
Lecture 06 The Principle of Least Squares
Lecture 07 Prior Information
Lecture 08 Solving Generalized Least Squares Problems
Lecture 09 Fourier Series
Lecture 10 Complex Fourier Series
Lecture 11 Lessons Learned from the Fourier Transform
Lecture 12 Power Spectral Density
Lecture 13 Filter Theory
Lecture 14 Applications of Filters
Lecture 15 Factor Analysis
Lecture 16 Orthogonal functions
Lecture 17 Covariance and Autocorrelation
Lecture 18 Cross-correlation
Lecture 19 Smoothing, Correlation and Spectra
Lecture 20 Coherence; Tapering and Spectral Analysis
Lecture 21 Interpolation
Lecture 22 Hypothesis testing
Lecture 23 Hypothesis Testing continued; F-Tests
Lecture 24 Confidence Limits of Spectra, Bootstraps

purpose of the lecture

to introduce

Interpolation

the process of filling in missing data points

Scenario 1: data are collected at irregular time intervals, but you want to compute power spectral density, which requires evenly sampled data.

[Figure: irregularly sampled A(t) versus time; its psd versus frequency is marked “?”]

Scenario 2: two datasets are collected with different sampling intervals, but you want to combine them into a scatter plot.

[Figure: A(t) and B(t) versus time, sampled at different intervals; the scatter plot of A against B is marked “?”]

in both scenarios

the times that the data are collected at are

inconvenient

we encountered a problem similar to this one back in Lecture 8,

where we used

prior information

to fill in data gaps

[Figure: observed data with missing points, $d^{obs}(t)$ versus time]

[Figure: estimated data with missing points filled in, $d^{est}(t)$ versus time]

find $d_i^{est}$ so that

$d_i^{est} \approx d_i^{obs}$ at the observation points

and

roughness of $d_i^{est} \approx 0$ everywhere

the solution is inexact

$d_i^{est} \neq d_i^{obs}$ everywhere

and

roughness of $d_i^{est} \neq 0$ everywhere

but the inexactness isn’t a problem

because

both observations

and prior information

have error

now we examine an alternative approach

traditional interpolation

similar, but subtly different

find d(t) so that

$d(t_i) = d_i^{obs}$ at the observation points (exact)

and

roughness of $d(t) = 0$ in between the observation points (exact)

d(t) is called the “interpolant”

advantage: the interpolant d(t) is an analytic function that is known everywhere; one can evaluate d(t) at any time, t, and can differentiate d(t), integrate it, etc.

disadvantage: the observation points are singled out as special; d(t) behaves differently at the observation points than between them

the interpolation problem

find an interpolant d(t) that goes through all the data points

and

“does something sensible”

or

“satisfies some prior information”

between them

some obvious ideas don’t work at all

an (N-1) order polynomial can easily be constructed so that it passes through N points

so use a polynomial for d(t)

[Figure: example of the polynomial interpolant, d(t) versus time, t, annotated “what happened here? and here?”]
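A minimal sketch of how such a polynomial can misbehave, assuming MatLab's built-in `polyfit` and `polyval`. The test function, the number of points and all variable names are hypothetical choices made for illustration; they are not taken from the lecture's example.

```matlab
% Fit an (N-1) order polynomial exactly through N evenly spaced points
% (hypothetical test data: samples of a smooth, bell-shaped function).
N = 11;
t = linspace(-1, 1, N)';          % observation times
d = 1 ./ (1 + 25*t.^2);           % observations, all between 0 and 1

p = polyfit(t, d, N-1);           % coefficients of the (N-1) order polynomial

te = linspace(-1, 1, 201)';       % densely spaced evaluation times
de = polyval(p, te);              % polynomial evaluated between the data

misfit = max(abs(polyval(p, t) - d));   % error at the observation points
swing  = max(abs(de));                  % largest excursion between them
fprintf('misfit at data: %g, largest |d(t)|: %g\n', misfit, swing);
```

The polynomial reproduces the observations, but between the outermost data points it swings well outside the range of the data, which is the behavior questioned in the example.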

solution

a low-order polynomial

has less potential for wild swings

so use many low-order polynomials

each valid in a small time interval

such a function is called a “spline”

simplest case

set of linear polynomials

each valid between two data points

“connect the data points with straight lines”

[Figure: d(t) versus t, a straight line between the observations at $t_i$ and $t_{i+1}$]

advantages

conceptually very simple

always get what you expect

zero roughness between observations

disadvantage

d(t) has kinks at observation points

[Figure: example of linear interpolation, d(t) versus time, t, with a kink marked at an observation point]

in MatLab

observations

times of interpolation

interpolated observations
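A minimal sketch of linear interpolation in MatLab, assuming the built-in `interp1` function (whose default method is linear). The variable names `t`, `d`, `tp`, `dp` and the data values are hypothetical.

```matlab
% observations (hypothetical values at irregular times)
t = [0; 0.2; 0.5; 0.6; 1.0];
d = [1; -2; 3; 0; 4];

% times of interpolation (evenly spaced)
tp = (0:0.05:1)';

% interpolated observations: straight lines between neighboring data
dp = interp1(t, d, tp);

plot(t, d, 'o', tp, dp, '-');
xlabel('time, t'); ylabel('d(t)');
```

Because each segment is a straight line, the value halfway between the first two observations is simply their average.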

getting rid of the kinks

use cubic polynomials

$S_i(t) = c_0 + c_1 t + c_2 t^2 + c_3 t^3$

each valid between two data points

cubic polynomial has 4 coefficients

two constrained by the need to pass through two data points

two to implement prior information

no kinks in d(t) or its first derivative

the trick

second derivative of a cubic is linear

so use the linear interpolation formula for the second derivative

[Figure: second derivative versus t, varying linearly between the values $y_{i-1}$, $y_i$ and $y_{i+1}$ at the observation times $t_{i-1}$, $t_i$ and $t_{i+1}$]

the second derivatives at the observation points, denoted $y_i$,

become unknowns in the problem

the second derivative is now integrated twice to give the spline function

here $a_i$ and $b_i$ are two more unknowns that arise from the integration constants
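One common way to write this step out is sketched below, under the assumption that the interval is $t_i \le t \le t_{i+1}$ with length $h_i = t_{i+1} - t_i$ (the symbol $h_i$ is introduced here for convenience). The lecture's own equations may group the integration constants differently.

```latex
% linear interpolation of the second derivative between t_i and t_{i+1}
S_i''(t) = y_i \frac{t_{i+1} - t}{h_i} + y_{i+1} \frac{t - t_i}{h_i}

% integrating twice; the two integration constants form a linear
% function of t, written here in terms of a_i and b_i
S_i(t) = y_i \frac{(t_{i+1} - t)^3}{6 h_i}
       + y_{i+1} \frac{(t - t_i)^3}{6 h_i}
       + a_i (t_{i+1} - t) + b_i (t - t_i)
```

Differentiating the second expression twice recovers the first, since the terms in $a_i$ and $b_i$ are linear in $t$ and drop out.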

finally one finds the y’s, a’s and b’s

so that the spline

1. goes through the observations

and

2. has a first derivative that is continuous across the observation points

the solution involves solving a matrix equation for the unknowns

(see text for details)

in MatLab

observations

times of interpolation

interpolated observations
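A minimal sketch of cubic spline interpolation in MatLab, assuming the built-in `spline` function. The variable names and data values are hypothetical, and `spline` applies its own end conditions (not-a-knot), which may differ from those used in the text's derivation.

```matlab
% observations (hypothetical values at irregular times)
t = [0; 0.2; 0.5; 0.6; 1.0];
d = [1; -2; 3; 0; 4];

% times of interpolation (evenly spaced)
tp = (0:0.01:1)';

% interpolated observations: piecewise cubic, smooth across the data
dp = spline(t, d, tp);

% the spline passes exactly through the observations
dcheck = spline(t, d, t);

plot(t, d, 'o', tp, dp, '-');
xlabel('time, t'); ylabel('d(t)');
```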

[Figure: example of cubic spline interpolation, d(t) versus time, t; no kinks]

interpolation involves

prior information of smoothness

in generalized least squares the prior information of smoothness is quantified by a roughness matrix, $\mathbf{H}$, acting on the model as $\mathbf{Hm}$

then we minimize the overall roughness, which is to say the overall error in the prior information

$(\mathbf{Hm})^T (\mathbf{Hm})$

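note that

$(\mathbf{Hm})^T (\mathbf{Hm}) = \mathbf{m}^T (\mathbf{H}^T \mathbf{H}) \mathbf{m}$

but in generalized least squares the error in the prior information also has the form

$\mathbf{m}^T \mathbf{C}_m^{-1} \mathbf{m}$

where $\mathbf{C}_m$ is a covariance matrix

so in this case

$\mathbf{C}_m = (\mathbf{H}^T \mathbf{H})^{-1}$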
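The identity between the two forms of the roughness can be checked numerically. The sketch below assumes that the roughness matrix is a second-difference operator on evenly spaced data with unit spacing; the matrix size, the test vector and the variable names are hypothetical.

```matlab
% second-difference roughness matrix H for N evenly spaced points
N = 6;
H = zeros(N-2, N);
for i = 1:N-2
    H(i, i:i+2) = [1, -2, 1];
end

% test model: samples of t^2, whose second difference is 2 everywhere
m = [0; 1; 4; 9; 16; 25];

r1 = (H*m)' * (H*m);        % overall roughness, (Hm)'(Hm)
r2 = m' * (H'*H) * m;       % same quantity written as m'(H'H)m

fprintf('(Hm)''(Hm) = %g, m''(H''H)m = %g\n', r1, r2);
```

Both expressions give the same number, so the matrix $\mathbf{H}^T \mathbf{H}$ plays the role of the inverse covariance in the prior-information error.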

so the prior information that the data are smooth

is equivalent to the requirement that they have a specific covariance matrix

which for stationary time series is equivalent to saying that they have a specific autocorrelation function

so an alternative, more flexible way of interpolating data

is by specifying the autocorrelation function that we want the results to have

this is called Kriging (after Danie G. Krige, its inventor)

Kriging

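estimate data at arbitrary time $t_0$ as a weighted average of the observations

$d_0^{est} = \sum_i w_i d_i^{obs}$

determine weights $w_i$ by minimizing the variance of

$d_0^{est} - d_0^{true}$

with respect to $w_i$

we’ll find that we don’t need to know $d_0^{true}$, only its autocorrelation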

assuming [two conditions stated as equations on the slide, not recoverable here]

1. means approximately cancel

2. expand square

3. insert weighted average formula

4. identify terms proportional to autocorrelation

now differentiate with respect to the weight, $w_k$

which yields the matrix equation

$\mathbf{Mw} = \mathbf{v}$

note that the autocorrelation appears on both sides of the equation, so that its overall normalization cancels out
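The steps leading to the matrix equation can be sketched as follows. This is a reconstruction under stated assumptions rather than the slide's own algebra: the data are taken to be stationary with the mean already removed, the estimate is the weighted average $d_0^{est} = \sum_i w_i d_i^{obs}$, and $a(\cdot)$ denotes the autocorrelation as a function of time lag.

```latex
\begin{align}
\sigma^2 &= E\left[ \left( d_0^{est} - d_0^{true} \right)^2 \right] \\
 &= \sum_i \sum_j w_i w_j E\left[ d_i d_j \right]
  - 2 \sum_i w_i E\left[ d_i d_0^{true} \right]
  + E\left[ (d_0^{true})^2 \right] \\
 &\propto \sum_i \sum_j w_i w_j \, a(t_i - t_j)
  - 2 \sum_i w_i \, a(t_i - t_0) + a(0) \\
\frac{\partial \sigma^2}{\partial w_k} &=
  2 \sum_j w_j \, a(t_k - t_j) - 2 \, a(t_k - t_0) = 0 \\
\text{so that} \quad M_{kj} &= a(t_k - t_j), \qquad v_k = a(t_k - t_0)
\end{align}
```

Multiplying $a$ by any constant scales both $\mathbf{M}$ and $\mathbf{v}$ equally, which is why the normalization of the autocorrelation does not affect the weights.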

all we need now do is specify an autocorrelation function

for example, we could use the Normal function

$a(t_i - t_j) \propto \exp\{ -(t_i - t_j)^2 / (2 L^2) \}$

the variance, $L^2$, controls the width of the autocorrelation and hence the smoothness of the interpolation

In MatLab

observations: tobs, dobs

interpolated values: test, dest

Normal autocorrelation function with variance L2
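A minimal sketch of Kriging with a Normal autocorrelation function, using the variable names listed above (`tobs`, `dobs`, `test`, `dest`, `L2`). The data values, the name `Mmat` and the way the matrices are built are assumptions made for illustration; the data are treated as having zero mean.

```matlab
% observations (hypothetical values)
tobs = [0; 1; 2.5; 4; 6];
dobs = [0.5; 1.2; -0.3; 0.8; 0.1];

% times at which interpolated values are wanted
test = (0:0.5:6)';

% variance of the Normal autocorrelation function
L2 = 1;

N  = length(tobs);
Ne = length(test);

% Mmat(i,j) = a(tobs(i)-tobs(j)),  V(i,k) = a(tobs(i)-test(k))
Mmat = exp(-(tobs*ones(1,N)  - ones(N,1)*tobs').^2 / (2*L2));
V    = exp(-(tobs*ones(1,Ne) - ones(N,1)*test').^2 / (2*L2));

% solve M w = v for every interpolation time at once;
% column k of W holds the weights for test(k)
W = Mmat \ V;

% interpolated values: weighted averages of the observations
dest = W' * dobs;
```

When an interpolation time coincides with an observation time, the right-hand side equals one column of the matrix, so the weights pick out that observation and the estimate reproduces it. Larger values of `L2` give a wider autocorrelation and a smoother `dest`.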

[Figure: Example. d(t) versus time, t, for A) Kriging and B) Generalized Least Squares]

Interpolation in two-dimensions

construct an interpolant d(x,y) that goes through the observations

and does something sensible in between

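1 dimension: the time of interest $t_0$ is bracketed by the observation times $t_i$ and $t_{i+1}$, a segment of the t-axis

2 dimensions: the notion of bracketing observations is more complicated; the point of interest $(x_0, y_0)$ is enclosed by a triangular tile

[Figure: d versus t with $t_0$ between $t_i$ and $t_{i+1}$; the (x, y) plane with $(x_0, y_0)$ inside a triangular tile]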

Delaunay triangles

set of most equilateral triangles connecting data points

[Figure: data on the (x, y) plane. A) Observations, B) Delaunay triangles, with the triangle enclosing a point of interest highlighted]
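A minimal sketch of finding the enclosing triangle and interpolating linearly inside it, assuming MatLab's built-in `delaunay` and `tsearchn` functions. The observation locations, the planar test data and the variable names are hypothetical.

```matlab
% observation locations and values (hypothetical; d is a plane in x and y)
x = [0; 4; 0; 4; 2];
y = [0; 0; 4; 4; 1.5];
d = 2*x + 3*y + 1;

% Delaunay triangles: each row of tri lists the three vertices of one tile
tri = delaunay(x, y);

% point of interest
x0 = 1; y0 = 1;

% k is the row of tri enclosing (x0,y0); P holds the barycentric
% coordinates of the point with respect to that triangle's vertices
[k, P] = tsearchn([x, y], tri, [x0, y0]);

% linear interpolation inside the triangle: weighted sum of vertex values
d0 = P * d(tri(k, :));
```

Because the test data lie on a plane, linear interpolation inside the triangle reproduces the plane exactly, giving `d0` equal to 2*1 + 3*1 + 1 = 6 up to rounding.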

[Figure: interpolated data on the (x, y) plane. C) Linear Splines (linear interpolation), D) Cubic Splines (cubic interpolation)]

In MatLab

linear splines

cubic splines
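A minimal sketch of both two-dimensional interpolations, assuming MatLab's built-in `griddata` function with its triangulation-based `'linear'` and `'cubic'` methods. The lecture's own code may use a different function; the data values and variable names here are hypothetical.

```matlab
% scattered observations (hypothetical; d is a plane in x and y)
x = [0; 40; 0; 40; 20; 10; 30];
y = [0; 0; 40; 40; 15; 30; 25];
d = 1 + 0.1*x - 0.05*y;

% regular grid of points at which to interpolate
[X, Y] = meshgrid(2:2:38, 2:2:38);

% linear splines: planar interpolation within each Delaunay triangle
dlin = griddata(x, y, d, X, Y, 'linear');

% cubic splines: smooth triangulation-based interpolation
dcub = griddata(x, y, d, X, Y, 'cubic');

% for planar data the linear result should match the plane itself
dplane = 1 + 0.1*X - 0.05*Y;
```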