Lecture 9 Interpolation and Splines. Lingo Interpolation – filling in gaps in data Find a function...

Post on 15-Jan-2016


Lecture 9

Interpolation and Splines

Lingo

Interpolation – filling in gaps in data

Find a function f(x) that

1) goes through all your data points

2) does something sensible in between

Lingo

Splines – a broad class of ways of performing interpolation

(we’ll get to the details, eventually)

Find a function f(x) that

1) goes through all your data points

(observations)

2) does something sensible in between

(prior information)

Why not just use least-squares?

Remember this?

$m^{est} = m_A + M\,[d^{obs} - G m_A]$

where $M = [G^T C_d^{-1} G + C_m^{-1}]^{-1} G^T C_d^{-1}$

m – a vector of all the points at which you want to estimate the function, including the points for which you have observations

d – a vector of just those points where you have observations

So the equation Gm = d is very simple: a model parameter equals a datum when the corresponding observation is available. Each row reads

$[\,\cdots\ 0\ \cdots\ 0\ \ 1\ \ 0\ \cdots\ 0\ \cdots\,]\,[\,\cdots\ m_i\ \cdots\,]^T = d_j$

Just a single “1” per row

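You then implement a smoothness constraint by minimizing $|Dm|^2$, where D is some measure of the non-smoothness of m

Thus $m_A = 0$, $C_m^{-1} = \epsilon^2 D^T D$ and $C_d = I$, where $\epsilon$ sets the weight of the smoothness constraint

One possibility is to use the finite-difference approximation of the second derivative, so that each row of D is

$D = [\,\cdots\ 0\ \cdots\ 1\ \ {-2}\ \ 1\ \cdots\ 0\ \cdots\,]$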

First derivative

$[dm/dx]_i \approx (1/\Delta x)(m_i - m_{i-1}) \propto m_i - m_{i-1}$

Second derivative

$[d^2m/dx^2]_i \propto [dm/dx]_{i+1} - [dm/dx]_i$

$= m_{i+1} - m_i - m_i + m_{i-1}$

$= m_{i+1} - 2m_i + m_{i-1}$

example

101 points equally spaced along the x-axis

So 101 values of the function f(x)

40 of these values measured (the data, d); the rest are unknown

Two kinds of prior information:
minimize 2nd derivative for the 99 interior x’s
minimize 1st derivative at the left and right x’s

(nice to have the same number of priors as unknowns, but not required)

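[Figure: data and interpolated result, f(x) versus x, for $\epsilon = 10^{-6}$]

$\epsilon$ can be chosen by trial and error

but usually the result is fairly insensitive to $\epsilon$, as long as it’s small

[Figure: varying $\epsilon$ over six orders of magnitude; $\log_{10}$(Total Error) versus $\log_{10}(\epsilon)$, and f(x) versus x]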

A purist might say that this is not really interpolation, because the curve goes through the data

only in the limit $\epsilon \to 0$

but for small $\epsilon$’s the error is extremely small
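The smoothness-constrained estimate can be tried on a toy problem. The sketch below is only an illustration under stated assumptions: it uses 11 grid points rather than 101, made-up observations, a weight eps = 1e-2, and only the second-difference prior (the first-derivative prior at the two ends is left out). The names solve_linear and smooth_interpolate are hypothetical helpers, not part of the lecture.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def smooth_interpolate(n, obs_index, obs_value, eps=1e-2):
    """Solve [G^T G + eps^2 D^T D] m = G^T d for the n grid values m.

    G picks out the observed grid points; D is the second difference
    (1, -2, 1) applied at every interior grid point (an assumption:
    no first-derivative prior at the ends).
    """
    A = [[0.0] * n for _ in range(n)]
    rhs = [0.0] * n
    # G has a single 1 per row, so G^T G is diagonal and G^T d scatters d.
    for j, i in enumerate(obs_index):
        A[i][i] += 1.0
        rhs[i] += obs_value[j]
    # Accumulate eps^2 D^T D one row of D at a time.
    for k in range(1, n - 1):
        stencil = {k - 1: 1.0, k: -2.0, k + 1: 1.0}
        for p, vp in stencil.items():
            for q, vq in stencil.items():
                A[p][q] += eps * eps * vp * vq
    return solve_linear(A, rhs)


# Made-up observations at 5 of the 11 grid points.
obs_index = [0, 3, 5, 8, 10]
obs_value = [0.0, 0.9, 1.0, 0.2, -0.5]
m = smooth_interpolate(11, obs_index, obs_value)
for i, value in enumerate(m):
    print(i, round(value, 4))
```

A dense solver is used here only to keep the sketch self-contained; for the 101-point or 2D cases a sparse solver would be the natural choice.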

an aside

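Construct an equation F m = h as follows:

$F m = h$ with $F = \begin{bmatrix} G \\ \epsilon D \end{bmatrix}$ and $h = \begin{bmatrix} d \\ 0 \end{bmatrix}$

then note $[F^T F]^{-1} F^T h = [G^T G + \epsilon^2 D^T D]^{-1} G^T d$

so if you want, you can just append $\epsilon D$ to the bottom of G and solve by simple least-squares

solving via $[F^T F]^{-1} F^T h$ and solving via $[G^T G + \epsilon^2 D^T D]^{-1} G^T d$ are

exactly the same!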
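The equivalence can be checked by multiplying out the blocks. A short sketch, assuming F stacks G on top of $\epsilon D$ and h stacks d on top of a zero vector, with $\epsilon$ the weight of the smoothness constraint:

```latex
\begin{aligned}
F^T F &= \begin{bmatrix} G^T & \epsilon D^T \end{bmatrix}
         \begin{bmatrix} G \\ \epsilon D \end{bmatrix}
       = G^T G + \epsilon^2 D^T D, \\
F^T h &= \begin{bmatrix} G^T & \epsilon D^T \end{bmatrix}
         \begin{bmatrix} d \\ 0 \end{bmatrix}
       = G^T d .
\end{aligned}
```

Substituting both products into $[F^T F]^{-1} F^T h$ gives the smoothness-constrained solution.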

another reason to work with

$F m = h$, where $F = \begin{bmatrix} G \\ \epsilon D \end{bmatrix}$ and $h = \begin{bmatrix} d \\ 0 \end{bmatrix}$

both G and D, and therefore F, too, are mostly zero

(that is, they’re sparse matrices)

very efficient algorithms are available for solving Fm=h

in the least-squares sense when F is a sparse matrix

(note $G^T G$ and $D^T D$ are not as sparse as G or D

and $[G^T G + \epsilon^2 D^T D]^{-1}$ is not sparse at all)

2D Example

(here a sparse solver would really be useful, for the number of unknowns is very large)

[Figure: grid of 21 unknowns by 21 unknowns]

21 × 21 = 441 unknowns

44 observed data

Prior information:

$\nabla^2 f = d^2f/dx^2 + d^2f/dy^2 = 0$ in interior of the box

$n \cdot \nabla f = 0$ on edges of box

… a generalization of the 1D case

results

comparison

one limitation of this method is that it is discrete

it only gives the unknown function at specific, prescribed values of xi

one might prefer to have an analytic formula for the value of the function at

any x

LINGO

an analytic formula that gives the value of the function at any x is

called an interpolant

high order polynomial

something that sounds like a good idea but isn’t

even though a polynomial of order N-1 can be computed to pass through any N points

example: 10th order polynomial fit to 11 points

Big swings, not what we hoped for
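The swings are easy to reproduce. The sketch below is an illustration with an assumed test function, Runge's classic example 1/(1+25x^2) sampled at 11 equally spaced points on [-1, 1], since the data behind the slide are not given. The helper names are hypothetical.

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the polynomial of order len(xs)-1 through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def runge(x):
    """Assumed test function (not from the lecture)."""
    return 1.0 / (1.0 + 25.0 * x * x)


# 11 equally spaced points, so the polynomial is 10th order.
xs = [-1.0 + 0.2 * k for k in range(11)]
ys = [runge(x) for x in xs]

# Compare polynomial and true function on a fine grid.
fine = [-1.0 + 0.01 * k for k in range(201)]
max_err = max(abs(lagrange_eval(xs, ys, x) - runge(x)) for x in fine)
print("largest error between the data points:", max_err)
```

The polynomial reproduces the 11 points, yet between them, near the ends of the interval, it departs from a function that never exceeds 1 by more than 1.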

solution

a simple function, e.g. a low order polynomial

that is valid in some interval near $x_i$

obviously, we need many such polynomials to cover the whole x-axis

This approach is called a spline

we’ve all used one already - linear splines

[Figure: straight line joining $(x_i, y_i)$ and $(x_{i+1}, y_{i+1})$]

in this interval $y(x) = y_i + (y_{i+1}-y_i)(x-x_i)/(x_{i+1}-x_i)$
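A minimal sketch of the linear spline formula, assuming the x values are sorted in increasing order. The function name linear_spline is hypothetical.

```python
from bisect import bisect_right


def linear_spline(xs, ys, x):
    """Evaluate the piecewise-linear interpolant through (xs, ys) at x.

    xs must be strictly increasing and x must lie in [xs[0], xs[-1]].
    """
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x is outside the data range")
    # Index i of the interval [xs[i], xs[i+1]] that contains x.
    i = min(bisect_right(xs, x) - 1, len(xs) - 2)
    return ys[i] + (ys[i + 1] - ys[i]) * (x - xs[i]) / (xs[i + 1] - xs[i])


print(linear_spline([0.0, 1.0, 3.0], [0.0, 2.0, 0.0], 2.0))  # prints 1.0
```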

cubic splines – somewhat more complicated but a lot nicer …

[Figure: smooth curve through $(x_i, y_i)$ and $(x_{i+1}, y_{i+1})$, with two adjacent intervals marked]

cubic $a+bx+cx^2+dx^3$ in this interval

a different cubic in this interval

counting up unknowns …

four coefficients a, b, c, d in every interval

unknowns
N data
N-1 intervals
4 coefficients per interval
4(N-1) = 4N-4 coefficients
total = 4N-4 unknowns

constraints
curve goes thru the points at the ends of its interval: 2(N-1) = 2N-2 constraints
dy/dx matches at interior points: N-2 constraints
$d^2y/dx^2$ matches at interior points: N-2 constraints
$d^2y/dx^2 = 0$ at end points: 2 constraints
total: 4N-4 constraints

formulating the cubic spline problem in an efficient manner

f(x) with N observations (xi, fi)

let $h_i = \Delta x_i = x_{i+1} - x_i$

and

$\Delta f_i = f_{i+1} - f_i$

Si(x) are cubic polynomials, one for each interval

Let the 2nd derivatives have values

$d^2S_i/dx^2 = y''_i$ at the left hand end of its interval

But since the second derivative is presumed continuous across intervals, $d^2S_i/dx^2 = y''_{i+1}$ at the right hand end of its interval too.

since $S_i(x)$ is a cubic, its second derivative is a linear function

So within an interval $d^2S_i/dx^2$ varies linearly

$d^2S_i/dx^2 = y''_i (x_{i+1}-x)/h_i + y''_{i+1} (x-x_i)/h_i$

we’ll wind up solving for these $y''_i$’s

now integrate twice to get $S_i(x)$

$d^2S_i/dx^2 = y''_i (x_{i+1}-x)/h_i + y''_{i+1} (x-x_i)/h_i$

$S_i(x) = y''_i (x_{i+1}-x)^3/(6h_i) + y''_{i+1} (x-x_i)^3/(6h_i) + c_i(x-x_i) + d_i(x_{i+1}-x)$

where $c_i$ and $d_i$ are integration constants

now choose the integration constants $c_i$ and $d_i$ such that the cubic goes through the data points. That is, $S_i(x_i) = f_i$ and $S_i(x_{i+1}) = f_{i+1}$

this leads to

$c_i = f_{i+1}/h_i - y''_{i+1} h_i/6$

$d_i = f_i/h_i - y''_i h_i/6$
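The step can be spelled out by evaluating the twice-integrated cubic at the two ends of its interval, where one of the factors $(x-x_i)$ or $(x_{i+1}-x)$ vanishes and the other equals $h_i$:

```latex
\begin{aligned}
S_i(x_i)     &= y''_i\,\frac{h_i^2}{6} + d_i h_i = f_i
  &&\Rightarrow\quad d_i = \frac{f_i}{h_i} - \frac{y''_i h_i}{6}, \\
S_i(x_{i+1}) &= y''_{i+1}\,\frac{h_i^2}{6} + c_i h_i = f_{i+1}
  &&\Rightarrow\quad c_i = \frac{f_{i+1}}{h_i} - \frac{y''_{i+1} h_i}{6}.
\end{aligned}
```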

so

$S_i(x) = y''_i (x_{i+1}-x)^3/(6h_i)$

$+\ y''_{i+1} (x-x_i)^3/(6h_i)$

$+\ \{f_{i+1}/h_i - y''_{i+1} h_i/6\}(x-x_i)$

$+\ \{f_i/h_i - y''_i h_i/6\}(x_{i+1}-x)$

but we still haven’t implemented the continuity of dS/dx condition …

so we compute the derivative

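$S'_i(x) = dS_i/dx =$

$-\,y''_i (x_{i+1}-x)^2/(2h_i)$

$+\ y''_{i+1} (x-x_i)^2/(2h_i)$

$+\ \{f_{i+1}/h_i - y''_{i+1} h_i/6\}$

$-\ \{f_i/h_i - y''_i h_i/6\}$

$= -\tfrac{1}{2} y''_i (x_{i+1}-x)^2/h_i + \tfrac{1}{2} y''_{i+1} (x-x_i)^2/h_i$

$+\ \Delta f_i/h_i - (y''_{i+1}-y''_i) h_i/6$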

now require the first derivative match across intervals: $S'_{i-1}(x_i) = S'_i(x_i)$

this leads to an equation for the unknown $y''_i$

$h_{i-1} y''_{i-1} + 2(h_i+h_{i-1}) y''_i + h_i y''_{i+1} = b_i$

with $b_i = 6\Delta f_i/h_i - 6\Delta f_{i-1}/h_{i-1}$
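The intermediate step, sketched with $\Delta f_i = f_{i+1} - f_i$ and $h_i = x_{i+1} - x_i$: evaluate the derivative of the left-hand cubic at its right end and the derivative of the right-hand cubic at its left end, then equate them.

```latex
\begin{aligned}
S'_{i-1}(x_i) &= \tfrac{1}{2}\, y''_i h_{i-1} + \frac{\Delta f_{i-1}}{h_{i-1}}
                 - \frac{(y''_i - y''_{i-1})\, h_{i-1}}{6}, \\
S'_i(x_i)     &= -\tfrac{1}{2}\, y''_i h_i + \frac{\Delta f_i}{h_i}
                 - \frac{(y''_{i+1} - y''_i)\, h_i}{6}, \\
S'_{i-1}(x_i) = S'_i(x_i)
  \;&\Rightarrow\;
  \frac{h_{i-1}}{6}\, y''_{i-1} + \frac{h_i + h_{i-1}}{3}\, y''_i + \frac{h_i}{6}\, y''_{i+1}
  = \frac{\Delta f_i}{h_i} - \frac{\Delta f_{i-1}}{h_{i-1}} .
\end{aligned}
```

Multiplying the last line by 6 gives the equation for the unknown second derivatives.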

the equation for the unknown $y''_i$

$h_{i-1} y''_{i-1} + 2(h_i+h_{i-1}) y''_i + h_i y''_{i+1} = b_i$

is just a matrix equation

i=1: $h_0 y''_0 + 2(h_1+h_0) y''_1 + h_1 y''_2 = b_1$

i=2: $h_1 y''_1 + 2(h_2+h_1) y''_2 + h_2 y''_3 = b_2$

i=3: $h_2 y''_2 + 2(h_3+h_2) y''_3 + h_3 y''_4 = b_3$

…

i=N-1: $h_{N-2} y''_{N-2} + 2(h_{N-1}+h_{N-2}) y''_{N-1} + h_{N-1} y''_N = b_{N-1}$

i=N: $h_{N-1} y''_{N-1} + 2(h_N+h_{N-1}) y''_N + h_N y''_{N+1} = b_N$

we’ll discuss the issue raised by $y''_0$ and $y''_{N+1}$ in a moment

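the matrix equation, with $g_i = 2(h_i+h_{i-1})$, is

$$
\begin{bmatrix}
h_0 & g_1 & h_1 & & & & & \\
 & h_1 & g_2 & h_2 & & & & \\
 & & h_2 & g_3 & h_3 & & & \\
 & & & \ddots & \ddots & \ddots & & \\
 & & & & h_{N-2} & g_{N-1} & h_{N-1} & \\
 & & & & & h_{N-1} & g_N & h_N
\end{bmatrix}
\begin{bmatrix} y''_0 \\ y''_1 \\ y''_2 \\ y''_3 \\ \vdots \\ y''_N \\ y''_{N+1} \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ \vdots \\ b_{N-1} \\ b_N \end{bmatrix}
$$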

A Tridiagonal Matrix, by the way. Very fast solvers are available …


I’ve written this as if there were two extra points, one to the left of the first point and one to the right of the last point. Of course, there aren’t. The way to handle this is to prescribe $y''_0$ and $y''_{N+1}$ and move them to the r.h.s. of the equation.

As written, the matrix has N rows and N+2 columns.

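moving over these two now-specified unknowns

$$
\begin{bmatrix}
g_1 & h_1 & & & & \\
h_1 & g_2 & h_2 & & & \\
 & h_2 & g_3 & h_3 & & \\
 & & \ddots & \ddots & \ddots & \\
 & & & h_{N-2} & g_{N-1} & h_{N-1} \\
 & & & & h_{N-1} & g_N
\end{bmatrix}
\begin{bmatrix} y''_1 \\ y''_2 \\ y''_3 \\ \vdots \\ y''_{N-1} \\ y''_N \end{bmatrix}
=
\begin{bmatrix} b_1 - h_0 y''_0 \\ b_2 \\ b_3 \\ \vdots \\ b_{N-1} \\ b_N - h_N y''_{N+1} \end{bmatrix}
$$

we can set $y''_0$ and $y''_{N+1}$ to whatever we want. A simple choice is zero, in which case the splines are called natural cubic splines

The matrix is now square, with N rows and N columns.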
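The whole recipe fits in a short program. The sketch below is one possible implementation, not the lecture's own code, and it makes an indexing assumption: the n data points are numbered 0 to n-1, the second derivatives at the first and last data points are the ones set to zero, and the tridiagonal system is solved for the n-2 interior points with the standard Thomas algorithm. The function names are hypothetical.

```python
from bisect import bisect_right


def natural_spline_second_derivs(x, f):
    """Return y'' at every knot of the natural cubic spline through (x, f)."""
    n = len(x)
    h = [x[i + 1] - x[i] for i in range(n - 1)]
    ypp = [0.0] * n              # natural end conditions: y'' = 0 at both ends
    m = n - 2                    # number of interior unknowns
    if m < 1:
        return ypp
    # Row for interior knot i: h[i-1] y''[i-1] + 2(h[i]+h[i-1]) y''[i] + h[i] y''[i+1] = b[i]
    sub = [h[i - 1] for i in range(1, n - 1)]
    diag = [2.0 * (h[i] + h[i - 1]) for i in range(1, n - 1)]
    sup = [h[i] for i in range(1, n - 1)]
    rhs = [6.0 * (f[i + 1] - f[i]) / h[i] - 6.0 * (f[i] - f[i - 1]) / h[i - 1]
           for i in range(1, n - 1)]
    # Thomas algorithm: forward elimination ...
    for k in range(1, m):
        w = sub[k] / diag[k - 1]
        diag[k] -= w * sup[k - 1]
        rhs[k] -= w * rhs[k - 1]
    # ... then back substitution.
    sol = [0.0] * m
    sol[-1] = rhs[-1] / diag[-1]
    for k in range(m - 2, -1, -1):
        sol[k] = (rhs[k] - sup[k] * sol[k + 1]) / diag[k]
    ypp[1:n - 1] = sol
    return ypp


def spline_eval(x, f, ypp, t):
    """Evaluate the cubic S_i(t) for the interval [x[i], x[i+1]] containing t."""
    i = min(max(bisect_right(x, t) - 1, 0), len(x) - 2)
    h = x[i + 1] - x[i]
    a, b = x[i + 1] - t, t - x[i]
    return (ypp[i] * a ** 3 / (6.0 * h) + ypp[i + 1] * b ** 3 / (6.0 * h)
            + (f[i + 1] / h - ypp[i + 1] * h / 6.0) * b
            + (f[i] / h - ypp[i] * h / 6.0) * a)


# Made-up data: four points, three intervals.
x = [0.0, 1.0, 2.0, 3.0]
f = [0.0, 1.0, 0.0, 1.0]
ypp = natural_spline_second_derivs(x, f)
print(ypp)
print(spline_eval(x, f, ypp, 1.5))
```

Because the system is tridiagonal, the solve costs a number of operations proportional to the number of data points rather than to its cube.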

example of cubic spline interpolation

very easy in MatLab

new_y = spline(x,y,new_x);