Regression / Calibration
Transcript of Regression / Calibration
MLR, RR, PCR, PLS

Paul Geladi
Head of Research, NIRCE
Unit of Biomass Technology and Chemistry
Swedish University of Agricultural Sciences, Umeå
Technobothnia, Vasa
[email protected] [email protected]
Univariate regression
[Figure: straight-line fit of y against x, with offset a and slope b]

y = a + bx + f
![Page 7: Regression / Calibration](https://reader035.fdocuments.in/reader035/viewer/2022081507/56815a94550346895dc80e18/html5/thumbnails/7.jpg)
x
y Linear fit
Underfit
![Page 8: Regression / Calibration](https://reader035.fdocuments.in/reader035/viewer/2022081507/56815a94550346895dc80e18/html5/thumbnails/8.jpg)
x
y Overfit
![Page 9: Regression / Calibration](https://reader035.fdocuments.in/reader035/viewer/2022081507/56815a94550346895dc80e18/html5/thumbnails/9.jpg)
x
y Quadratic fit
![Page 10: Regression / Calibration](https://reader035.fdocuments.in/reader035/viewer/2022081507/56815a94550346895dc80e18/html5/thumbnails/10.jpg)
Multivariate linear regression
y = f(x) works sometimes, but:

- only for a few variables
- measurement noise!
- ∞ possible functions

[Figure: data matrix X (I × K) and response vector y (I × 1)]
y = f(x) is simplified by a linear approximation:

y = b0 + b1x1 + b2x2 + ... + bKxK + f
Nomenclature:

y  : response
xk : predictors
bk : regression coefficients
b0 : offset, constant
f  : residual
[Figure: X (I × K) and y, both mean-centred]

With X and y mean-centred, b0 drops out.
yi = b1xi1 + b2xi2 + ... + bKxiK + fi      (one equation per sample, i = 1 ... I)
[Figure: y (I × 1) = X (I × K) b (K × 1) + f (I × 1)]

y = Xb + f
X, y : known, measurable
b, f : unknown

No unique solution; f must be constrained.
The MLR solution
Multiple Linear Regression
Ordinary Least Squares (OLS)
Least squares: b = (X'X)^-1 X'y

Problems?
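The least-squares solution can be sketched in NumPy; the data below are made-up toy numbers, and the mean-centring removes b0 as on the earlier slide:

```python
import numpy as np

# Toy calibration data: I = 6 samples, K = 2 predictors (hypothetical numbers).
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 6.0],
              [6.0, 5.0]])
b_true = np.array([2.0, -1.0])
y = X @ b_true                       # noise-free response for illustration

# Mean-centre X and y so the offset b0 drops out.
Xc = X - X.mean(axis=0)
yc = y - y.mean()

# MLR/OLS solution: b = (X'X)^-1 X'y
b = np.linalg.inv(Xc.T @ Xc) @ Xc.T @ yc
print(b)  # recovers [2, -1] on noise-free data
```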
3b1 + 4b2 = 1
4b1 + 5b2 = 0

One solution
3b1 + 4b2 = 1
4b1 + 5b2 = 0
 b1 +  b2 = 4

No solution
3b1 + 4b2 + b3 = 1
4b1 + 5b2 + b3 = 0

∞ solutions
b = (X'X)^-1 X'y

- K > I : ∞ solutions
- I > K : no exact solution
- error in X
- error in y
- inverse may not exist
- inverse may be unstable
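The last two problems on the list can be shown numerically; a small sketch with made-up collinear data:

```python
import numpy as np

# Two perfectly collinear predictors: X'X is singular, so (X'X)^-1 does not exist.
X = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
XtX = X.T @ X
print(np.linalg.matrix_rank(XtX))   # rank 1, not 2: no unique OLS solution

# Nearly collinear predictors: the inverse exists but is unstable.
Xn = X + np.array([[0.0, 1e-8],
                   [0.0, 0.0],
                   [0.0, 0.0]])
print(np.linalg.cond(Xn.T @ Xn))    # huge condition number
```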
3b1 + 4b2 + e1 = 1
4b1 + 5b2 + e2 = 0
 b1 +  b2 + e3 = 4

With residuals e, a least-squares solution exists
Wanted solution:

- I ≥ K
- no inversion problems
- no noise in X
Diagnostics
y = Xb + f
SStot = SSmod + SSres

R2 = SSmod / SStot = 1 - SSres / SStot

Coefficient of determination
SSres = f'f

RMSEC = [ SSres / (I - A) ]^(1/2)

Root Mean Squared Error of Calibration (A = number of fitted components/parameters)
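Both calibration diagnostics can be sketched together; the fitted values below are hypothetical numbers, with A = 1 fitted parameter:

```python
import numpy as np

# Illustrative calibration diagnostics for a fitted model y = Xb + f.
y    = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # measured (hypothetical)
yfit = np.array([1.1, 1.9, 3.2, 3.8, 5.0])   # fitted   (hypothetical)
I, A = len(y), 1

f = y - yfit                                 # residuals
SSres = f @ f                                # f'f
SStot = (y - y.mean()) @ (y - y.mean())
R2 = 1 - SSres / SStot                       # coefficient of determination
RMSEC = np.sqrt(SSres / (I - A))             # root mean squared error of calibration
print(R2, RMSEC)
```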
Alternatives to MLR/OLS
Ridge Regression (RR)

b = (X'X)^-1 X'y

The identity matrix I is the easiest matrix to invert, so add a small multiple of it:

b = (X'X + kI)^-1 X'y

k (ridge constant) as small as possible
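A minimal ridge sketch with made-up near-collinear data; the ridge constant k = 0.1 is an arbitrary illustration, not a recommendation:

```python
import numpy as np

# Ridge regression: b = (X'X + kI)^-1 X'y
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 3))
X[:, 2] = X[:, 1] + 1e-6 * rng.standard_normal(10)   # nearly collinear columns
y = X @ np.array([1.0, 2.0, 0.0]) + 0.01 * rng.standard_normal(10)

k = 0.1                                   # ridge constant (illustrative value)
K = X.shape[1]
b_ridge = np.linalg.solve(X.T @ X + k * np.eye(K), X.T @ y)
print(b_ridge)   # stabilised coefficients; plain OLS would be wildly unstable here
```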
Problems:

- choice of ridge constant
- no diagnostics
Principal Component Regression (PCR)

- I ≥ K
- easy inversion
[Figure: PCA decomposes X (I × K) into scores T (I × A)]

- A ≤ I
- T orthogonal
- noise in X removed
y = Td + f

d = (T'T)^-1 T'y
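A minimal PCR sketch, assuming PCA is done by SVD on mean-centred data; the random data and the choice A = 2 are illustrative:

```python
import numpy as np

# PCR: PCA on mean-centred X, then regress y on the first A score vectors T.
rng = np.random.default_rng(1)
X = rng.standard_normal((20, 6))
y = X @ rng.standard_normal(6) + 0.1 * rng.standard_normal(20)

Xc = X - X.mean(axis=0)
yc = y - y.mean()

# PCA via SVD: scores T = U * s (columns are orthogonal)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
A = 2
T = U[:, :A] * s[:A]                     # I x A score matrix

# y = Td + f, with d = (T'T)^-1 T'y; T'T is diagonal, so inversion is easy
d = np.linalg.solve(T.T @ T, T.T @ yc)
yfit = T @ d + y.mean()
print(d)
```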
Problem: how many components should be used?
Advantages:

- PCA done on the data
- outliers
- classes
- noise in X removed
Partial Least Squares Regression (PLS)
[Figure: PLS decomposes X into scores t and weights w', and Y into scores u and loadings q' (the outer relationships); the score vectors t and u are connected by the inner relationship. With A components, X additionally gets loadings p'.]
Advantages:

- X decomposed
- Y decomposed
- noise in X left out
- noise in Y left out
PCR and PLS are one-component-at-a-time methods: after each component a residual is calculated, and the next component is calculated on that residual.
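This deflation scheme can be sketched as a NIPALS-style PLS1 for a single response y; the function name `pls1` and the random data are illustrative, not from the slides:

```python
import numpy as np

# One-component-at-a-time PLS1 (NIPALS-style) sketch for a single y.
# After each component the residuals of X and y are formed, and the next
# component is computed on those residuals.
def pls1(X, y, n_components):
    X = X - X.mean(axis=0)
    y = y - y.mean()
    W, P, T, q = [], [], [], []
    for _ in range(n_components):
        w = X.T @ y
        w /= np.linalg.norm(w)          # weight vector w'
        t = X @ w                       # score vector t
        p = X.T @ t / (t @ t)           # X loading p'
        qa = y @ t / (t @ t)            # y loading q (inner relationship)
        X = X - np.outer(t, p)          # deflate: residual of X
        y = y - qa * t                  # deflate: residual of y
        W.append(w); P.append(p); T.append(t); q.append(qa)
    return np.array(W).T, np.array(P).T, np.array(T).T, np.array(q)

rng = np.random.default_rng(2)
X = rng.standard_normal((15, 4))
y = X @ np.array([1.0, -1.0, 0.5, 0.0]) + 0.05 * rng.standard_normal(15)
W, P, T, q = pls1(X, y, n_components=2)
print(T.shape)  # (15, 2)
```

Deflation makes successive score vectors orthogonal, which is what keeps each new component focused on what the previous ones left unexplained.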
Another view
y = Xb + f
y = XbRR + fRR
y = XbPCR + fPCR
y = XbPLS + fPLS
[Figure: regression vectors b — the OLS vector; a shrunk and rotated vector; a vector with too much shrinkage — all within the subspace of useful regression vectors]
Prediction
[Figure: calibration set Xcal (I × K) with ycal; test set Xtest (J × K) with ytest; the model predicts yhat]
Prediction diagnostics
yhat = Xtest b

ftest = ytest - yhat

PRESS = ftest' ftest

RMSEP = [ PRESS / J ]^(1/2)
Root Mean Squared Error of Prediction
R2test = Q2 = 1 - ftest' ftest / ytest' ytest
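The prediction diagnostics can be sketched together; the test-set values below are hypothetical numbers:

```python
import numpy as np

# Prediction diagnostics on a test set of J samples (hypothetical numbers).
ytest = np.array([2.0, 4.0, 6.0, 8.0])   # measured
yhat  = np.array([2.1, 3.8, 6.3, 7.9])   # predicted

J = len(ytest)
ftest = ytest - yhat
PRESS = ftest @ ftest                 # prediction error sum of squares
RMSEP = np.sqrt(PRESS / J)            # root mean squared error of prediction
Q2 = 1 - PRESS / (ytest @ ytest)      # R2test as defined on the slide
print(RMSEP, Q2)
```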
Some rules of thumb:

- R2 > 0.65 (5 PLS comp.)
- R2test > 0.5
- R2 - R2test < 0.2
Bias

f = y - Xb          (calibration residuals: bias is always 0)

ftest = ytest - yhat

bias = (1/J) Σ ftest
Leverage - influence
b = (X'X)^-1 X'y

yhat = Xb = X(X'X)^-1 X'y = Hy

H is the hat matrix; its diagonal elements are the leverages.
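A short sketch of the hat matrix and leverages, with made-up data where one sample sits far from the rest:

```python
import numpy as np

# Hat matrix H = X (X'X)^-1 X'; its diagonal gives each sample's leverage.
X = np.array([[1.0,  1.0],
              [1.0,  2.0],
              [1.0,  3.0],
              [1.0, 10.0]])   # last sample is far from the others

H = X @ np.linalg.inv(X.T @ X) @ X.T
leverage = np.diag(H)
print(leverage)                # the extreme sample has by far the largest leverage

# Leverages always sum to the number of fitted parameters K.
print(leverage.sum())          # 2.0
```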
[Figures: leverage and influence illustrated]
Residual plot

[Figure: ftest against ypred — unbiased; biased; small variance; large variance; heteroscedastic; outlier]
Residuals:

- check histogram of f
- check the residual matrix E variable-wise
- check the residual matrix E object-wise
[Figure panels E-G: predicted against measured response — heteroscedastic; outlier by extrapolation; bad outlier]
[Figure: the PLS decomposition diagram, repeated]
Plotting: line plots

- scree plot: RMSEC, RMSECV, RMSEP
- loading plot against wavelength
- score plot against time
- residual against sample
- residual against yhat
- T2 against sample
- H against sample
Plotting: scatter plots (2D, 3D)

- score plot
- loading plot
- biplot
- H against residual
- inner relation t - u
- weight wq
Nonlinearities
[Figure panels A-E: y against x — linear; weak nonlinear; strong nonlinear; non-monotonic; linear approximations]
Remedies for nonlinearities: making nonlinear data fit a linear model, or making the model nonlinear.

- fundamental theory (e.g. going from transmittance to absorbance)
- use extra latent variables in PCR or PLSR
- use transformations of latent variables
- remove disturbing variables
- find subsets that behave linearly
- use intrinsically nonlinear methods
- locally transform variables X, y, or both nonlinearly (powers, logarithms, adding powers)
- transformation in a neighbourhood (window methods)
- use global transformations (Fourier, Wavelet)
- GIFI type discretization