Regression / Calibration


Page 1: Regression / Calibration

Regression / Calibration

MLR, RR, PCR, PLS

Page 2: Regression / Calibration

Paul Geladi

Head of Research, NIRCE
Unit of Biomass Technology and Chemistry
Swedish University of Agricultural Sciences, Umeå
Technobothnia, Vasa
[email protected] [email protected]

Page 3: Regression / Calibration

Univariate regression

Page 4: Regression / Calibration

[Figure: y against x, with offset and slope indicated]

Page 5: Regression / Calibration

[Figure: y against x, with offset a and slope b]

y = a + bx + f
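The straight-line model above can be illustrated with a least-squares fit in NumPy; a minimal sketch on synthetic data (the offset 2.0 and slope 0.5 are hypothetical values, not from the slides):

```python
import numpy as np

# Synthetic, noise-free data with a known offset and slope (hypothetical values)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 + 0.5 * x          # a = 2.0 (offset), b = 0.5 (slope)

# Degree-1 least-squares fit; np.polyfit returns [slope, offset]
b, a = np.polyfit(x, y, 1)
```

On noise-free data the fit recovers a and b exactly; with measurement noise the estimates scatter around the true values.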

Page 6: Regression / Calibration

[Figure: scatter of y against x]

Page 7: Regression / Calibration

[Figure: linear fit of y against x (an underfit)]

Page 8: Regression / Calibration

[Figure: overfit of y against x]

Page 9: Regression / Calibration

[Figure: quadratic fit of y against x]

Page 10: Regression / Calibration

Multivariate linear regression

Page 11: Regression / Calibration

y = f(x) works sometimes

y = f(x1, x2, ..., xK) works only for a few variables

Measurement noise!

∞ possible functions

Page 12: Regression / Calibration

[Diagram: data matrix X (I × K) and response vector y (I × 1)]

Page 13: Regression / Calibration

y = f(x)

Simplified by a linear approximation:

y = b0 + b1x1 + b2x2 + ... + bKxK + f

Page 14: Regression / Calibration

y = b0 + b1x1 + b2x2 + ... + bKxK + f

y : response
xk : predictors
bk : regression coefficients
b0 : offset, constant
f : residual

Nomenclature

Page 15: Regression / Calibration

[Diagram: X (I × K) and y (I × 1)]

X, y mean-centered, so b0 drops out

Page 16: Regression / Calibration

y = b1x1 + b2x2 + ... + bKxK + f

(one such equation for each of the I samples)

Page 17: Regression / Calibration

y = b1x1 + b2x2 + ... + bKxK + f

Page 18: Regression / Calibration

[Diagram: y (I × 1) = X (I × K) b (K × 1) + f (I × 1)]

y = Xb + f

Page 19: Regression / Calibration

X, y: known, measurable
b, f: unknown

No solution

f must be constrained

Page 20: Regression / Calibration

The MLR solution

Multiple Linear Regression

Ordinary Least Squares (OLS)

Page 21: Regression / Calibration

b = (X'X)^-1 X'y (least squares)

Problems?
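As a sketch, the normal-equations solution b = (X'X)^-1 X'y can be computed with NumPy on synthetic, noise-free data (all sizes and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
I, K = 20, 3                       # I samples, K predictors (I > K)
X = rng.standard_normal((I, K))
b_true = np.array([1.0, -2.0, 0.5])
y = X @ b_true                     # noise-free response for illustration

# b = (X'X)^-1 X'y, computed via a linear solve instead of an explicit inverse
b = np.linalg.solve(X.T @ X, X.T @ y)
```

Solving the linear system is numerically safer than forming the inverse explicitly.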

Page 22: Regression / Calibration

3b1 + 4b2 = 1
4b1 + 5b2 = 0

One solution

Page 23: Regression / Calibration

3b1 + 4b2 = 1
4b1 + 5b2 = 0
b1 + b2 = 4

No solution

Page 24: Regression / Calibration

3b1 + 4b2 + b3 = 1
4b1 + 5b2 + b3 = 0

∞ solutions

Page 25: Regression / Calibration

b = (X'X)^-1 X'y

- K > I: ∞ solutions
- I > K: no solution
- error in X
- error in y
- inverse may not exist
- inverse may be unstable

Page 26: Regression / Calibration

3b1 + 4b2 + e1 = 1
4b1 + 5b2 + e2 = 0
b1 + b2 + e3 = 4

Solution
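The three inconsistent equations above become solvable once error terms are allowed; a least-squares sketch with NumPy (the use of lstsq here is an illustration, not from the slides):

```python
import numpy as np

# The three equations from the slide, as an overdetermined system A b ≈ y
A = np.array([[3.0, 4.0],
              [4.0, 5.0],
              [1.0, 1.0]])
y = np.array([1.0, 0.0, 4.0])

# Least squares minimizes e'e, where e = y - A b
b, residual, rank, sv = np.linalg.lstsq(A, y, rcond=None)
e = y - A @ b
```

A known property to check: the least-squares residual e is orthogonal to the columns of A.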

Page 27: Regression / Calibration

Wanted solution

- I ≥ K
- No unstable inverse
- No noise in X

Page 28: Regression / Calibration

Diagnostics

y = Xb + f

SStot = SSmod + SSres

R2 = SSmod / SStot = 1 - SSres / SStot

Coefficient of determination

Page 29: Regression / Calibration

Diagnostics

y = Xb + f

SSres = f’f

RMSEC = [ SSres / (I - A) ]^1/2

Root Mean Squared Error of Calibration
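A sketch of these calibration diagnostics on mean-centered synthetic data (the noise level and the choice A = K fitted parameters are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
I, K = 30, 4
A = K                              # A = number of fitted parameters here
X = rng.standard_normal((I, K))
y = X @ np.array([1.0, 0.5, -1.0, 2.0]) + 0.1 * rng.standard_normal(I)

# Mean-center X and y, as the slides assume
Xc = X - X.mean(axis=0)
yc = y - y.mean()

b = np.linalg.solve(Xc.T @ Xc, Xc.T @ yc)
f = yc - Xc @ b                    # residual
SSres = f @ f
SStot = yc @ yc
R2 = 1 - SSres / SStot             # coefficient of determination
RMSEC = np.sqrt(SSres / (I - A))   # root mean squared error of calibration
```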

Page 30: Regression / Calibration

Alternatives to MLR/OLS

Page 31: Regression / Calibration

Ridge Regression (RR)

b = (X'X)^-1 X'y

The identity matrix I is the easiest matrix to invert

b = (X'X + kI)^-1 X'y

k (ridge constant) as small as possible
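A sketch of the ridge solution in a case where X'X cannot be inverted because K > I (the ridge constant k = 0.1 and all data are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
I, K = 10, 15                      # more variables than samples: X'X is singular
X = rng.standard_normal((I, K))
y = rng.standard_normal(I)

k = 0.1                            # ridge constant (hypothetical choice)
# b = (X'X + kI)^-1 X'y: adding kI makes the matrix invertible
b_rr = np.linalg.solve(X.T @ X + k * np.eye(K), X.T @ y)
```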

Page 32: Regression / Calibration

Problems

- Choice of ridge constant

- No diagnostics

Page 33: Regression / Calibration

Principal Component Regression (PCR)

- I ≥ K

-Easy inversion

Page 34: Regression / Calibration

Principal Component Regression (PCR)

[Diagram: PCA decomposes X (I × K) into scores T (I × A)]

- A ≤ I
- T orthogonal
- Noise in X removed

Page 35: Regression / Calibration

Principal Component Regression (PCR)

y = Td + f

d = (T'T)^-1 T'y
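A PCR sketch using the SVD for the PCA step (the sizes and the choice A = 3 components are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
I, K, A = 25, 8, 3                 # keep A components, A <= I
X = rng.standard_normal((I, K))
y = X @ rng.standard_normal(K)

Xc = X - X.mean(axis=0)
yc = y - y.mean()

# PCA via SVD: scores T = U_A * S_A have orthogonal columns
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
T = U[:, :A] * S[:A]

# d = (T'T)^-1 T'y; T'T is diagonal, so the inversion is trivial
d = np.linalg.solve(T.T @ T, T.T @ yc)
```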

Page 36: Regression / Calibration

Problem

How many components should be used?

Page 37: Regression / Calibration

Advantage

- PCA done on data
- Outliers
- Classes
- Noise in X removed

Page 38: Regression / Calibration

Partial Least Squares Regression

Page 39: Regression / Calibration

[Diagram: X with scores t, Y with scores u]

Page 40: Regression / Calibration

[Diagram: X gives scores t and weights w'; Y gives scores u and weights q']

Outer relationship

Page 41: Regression / Calibration

[Diagram: the X scores t predict the Y scores u]

Inner relationship

Page 42: Regression / Calibration

[Diagram: PLS decomposition with A components: X scores t, loadings p', weights w'; Y scores u, weights q']

Page 43: Regression / Calibration

Advantages

- X decomposed
- Y decomposed
- Noise in X left out
- Noise in Y left out

Page 44: Regression / Calibration

PCR and PLS are one-component-at-a-time methods

After each component, a residual is calculated

The next component is calculated on the residual
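The one-component-at-a-time idea with deflation can be sketched as a minimal NIPALS-style PLS1 (the function name pls1 and all data are hypothetical; with A = K components and noise-free data it reproduces the least-squares solution):

```python
import numpy as np

def pls1(X, y, A):
    """Minimal PLS1 sketch: extract one component at a time,
    deflating X and y before computing the next component."""
    X = X - X.mean(axis=0)
    y = y - y.mean()
    W, P, Q = [], [], []
    for _ in range(A):
        w = X.T @ y
        w = w / np.linalg.norm(w)   # X weights
        t = X @ w                   # X scores
        p = X.T @ t / (t @ t)       # X loadings
        q = (y @ t) / (t @ t)       # inner-relation coefficient
        X = X - np.outer(t, p)      # residual of X: next component uses this
        y = y - q * t               # residual of y
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q)
    return W @ np.linalg.solve(P.T @ W, Q)   # regression vector (centered data)

rng = np.random.default_rng(4)
X = rng.standard_normal((30, 5))
b_true = np.array([1.0, 0.0, -1.0, 2.0, 0.5])
y = X @ b_true                      # noise-free for illustration
b = pls1(X, y, A=5)
```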

Page 45: Regression / Calibration

Another view

y = Xb + f

y = XbRR + fRR

y = XbPCR + fPCR

y = XbPLS + fPLS

Page 46: Regression / Calibration

[Diagram: regression vectors b1 (OLS), b2 (shrunk and rotated), b3 (too much shrinkage), within the subspace of useful regression vectors]

Page 47: Regression / Calibration

Prediction

Page 48: Regression / Calibration

[Diagram: calibration set Xcal (I × K) with ycal; test set Xtest (J × K) with ytest, predicted as yhat]

Page 49: Regression / Calibration

Prediction diagnostics

yhat = Xtest b

ftest = ytest - yhat

PRESS = ftest'ftest

RMSEP = [ PRESS / J ]^1/2

Root Mean Squared Error of Prediction

Page 50: Regression / Calibration

Prediction diagnostics

yhat = Xtest b

ftest = ytest - yhat

R2test = Q2 = 1 - ftest'ftest / ytest'ytest
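A sketch of these prediction diagnostics on a synthetic calibration/test split (the sizes, noise level, and seed are assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
I, J, K = 40, 10, 3                 # I calibration samples, J test samples
b_true = np.array([2.0, -1.0, 0.5])

Xcal = rng.standard_normal((I, K))
ycal = Xcal @ b_true + 0.05 * rng.standard_normal(I)
b = np.linalg.solve(Xcal.T @ Xcal, Xcal.T @ ycal)   # calibrate by OLS

Xtest = rng.standard_normal((J, K))
ytest = Xtest @ b_true + 0.05 * rng.standard_normal(J)

yhat = Xtest @ b
ftest = ytest - yhat
PRESS = ftest @ ftest
RMSEP = np.sqrt(PRESS / J)          # root mean squared error of prediction
Q2 = 1 - PRESS / (ytest @ ytest)
```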

Page 51: Regression / Calibration

Some rules of thumb

R2 > 0.65 at 5 PLS comp.

R2test > 0.5

R2 - R2test < 0.2

Page 52: Regression / Calibration

Bias

f = y - Xb

The calibration residual f always has mean zero: 0 bias by construction

ftest = ytest - yhat

bias = (1/J) Σ ftest

Page 53: Regression / Calibration

Leverage - influence

b = (X'X)^-1 X'y

yhat = Xb = X(X'X)^-1 X'y = Hy

H: the Hat matrix

diagonal elements of H: leverage
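The leverage values can be computed directly from the Hat matrix; a sketch on synthetic data (a known check: the leverages sum to K):

```python
import numpy as np

rng = np.random.default_rng(6)
I, K = 12, 3
X = rng.standard_normal((I, K))

# H = X (X'X)^-1 X'; computed via a solve to avoid an explicit inverse
H = X @ np.linalg.solve(X.T @ X, X.T)
leverage = np.diag(H)               # one leverage value per sample
```

Each leverage lies between 0 and 1 and their sum equals K, so samples with leverage well above K/I are influential.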

Page 54: Regression / Calibration


Page 55: Regression / Calibration

Leverage - influence

Page 56: Regression / Calibration

Leverage - influence

Page 57: Regression / Calibration

Leverage - influence

Page 58: Regression / Calibration

[Figure: residual plots of ftest against ypred: unbiased; biased; small variance; large variance; heteroscedastic; outlier]

Residual plot

Page 59: Regression / Calibration

Residuals

-Check the histogram of f

-Check E variable-wise

-Check E object-wise

Page 60: Regression / Calibration

Page 61: Regression / Calibration

[Figure, panels E, F, G: predicted against measured response: heteroscedastic; outlier by extrapolation; bad outlier]

Page 62: Regression / Calibration

[Diagram (repeated): PLS decomposition with A components: X scores t, loadings p', weights w'; Y scores u, weights q']

Page 63: Regression / Calibration

Plotting: line plots

Scree plot RMSEC, RMSECV, RMSEP

Loading plot against wavelength

Score plot against time

Residual against sample

Residual against yhat

T2 against sample

H against sample

Page 64: Regression / Calibration

Plotting: scatter plots 2D, 3D

Score plot

Loading plot

Biplot

H against residual

Inner relation t - u

Weights w and q

Page 65: Regression / Calibration

Nonlinearities

Page 66: Regression / Calibration

[Figure, panels A to E: y against x: linear; weak nonlinear; strong nonlinear; non-monotonic; linear approximations]

Page 67: Regression / Calibration

Remedies for nonlinearities: making nonlinear data fit a linear model, or making the model nonlinear.

-Fundamental theory (e.g. going from transmittance to absorbance)

-Use extra latent variables in PCR or PLSR

-Use transformations of latent variables

-Remove disturbing variables

-Find subsets that behave linearly

Page 68: Regression / Calibration

Remedies for nonlinearities: making nonlinear data fit a linear model, or making the model nonlinear.

-Use intrinsically nonlinear methods

-Locally transform variables X, y, or both nonlinearly (powers, logarithms, adding powers)

-Transformation in a neighbourhood (window methods)

-Use global transformations (Fourier, Wavelet)

-GIFI-type discretization
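As a sketch of the logarithm remedy: data following a hypothetical exponential law y = c * exp(k x) becomes linear after taking logs, so an ordinary linear fit recovers the parameters (c = 3.0 and k = 1.5 are invented values):

```python
import numpy as np

# Hypothetical exponential data: y = c * exp(k * x) with c = 3.0, k = 1.5
x = np.linspace(0.0, 2.0, 20)
y = 3.0 * np.exp(1.5 * x)

# log y = log c + k x is linear in x, so an ordinary linear fit applies
k, logc = np.polyfit(x, np.log(y), 1)
c = np.exp(logc)
```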