8/15/2019 Part 9 - Model Validation and Structure Selection
System Identification
Control Engineering B.Sc., 3rd year
Technical University of Cluj-Napoca, Romania
Lecturer: Lucian Buşoniu
Model validation Structure selection Overparametrization
Recall: Importance of validation
Model validation is a crucial step: the model must be good enough (for our purposes).
If validation is unsuccessful, previous steps in the workflow must be redone – e.g., the model structure can be changed.
Motivation
So far, we validated and selected models mostly in a heuristic way, by examining plots – using common sense.
Next, some mathematically well-founded tests will be given.
However, common sense remains indispensable – mathematical tests work under assumptions that may not always be satisfied.
Focus: Prediction error methods
This lecture focuses on single-output models obtained by prediction error methods.
Some of the tests can be extended to other settings.
Table of contents
1 Model validation with correlation tests
Basic correlation tests
Matlab example
Global correlation tests
2 Structure selection
3 Testing for overparametrization
Whiteness: Intuition
Recall general PEM model structure:
y(k) = G(q^-1) u(k) + H(q^-1) e(k)
where e(k) is assumed to be zero-mean white noise.
PEM are derived so that the prediction error ε(k) = y(k) − ŷ(k) = e(k). If the system satisfies the model structure (so the assumption holds), and moreover the model is accurate, then ε(k) is also zero-mean white noise.
Whiteness hypothesis
(W) The prediction errors ε(k) are zero-mean white noise.
Independence of past inputs: Intuition
y(k) = G(q^-1) u(k) + v(k)
If the model is accurate, it entirely explains the influence of inputs u(k) on current and future outputs y(k + τ). Therefore, the errors ε(k + τ) = y(k + τ) − ŷ(k + τ) are only influenced by the disturbances v, and are independent of inputs u(k).
Independence hypothesis 1
(I1) The prediction errors ε(k + τ) are independent of inputs u(k) for τ ≥ 0 (i.e., of past and current inputs).
Independence of all inputs: Intuition
y(k) = G(q^-1) u(k) + v(k)
If the experiment is closed-loop, u(k) depends on past outputs y(k − j), and this will lead to a correlation of ε(k − j) with u(k) (note the independence of ε(k + τ) and u(k) for τ ≥ 0 is unaffected). If the experiment is open-loop, then ε(k − j) is also independent of u(k).
Independence hypothesis 2
(I2) The prediction errors ε(k + τ) are independent of u(k) for any τ (i.e., of all inputs).
All hypotheses
(W) The prediction errors ε(k) are zero-mean white noise.
(I1) The prediction errors ε(k + τ) are independent of u(k) for τ ≥ 0 (i.e., of past and current inputs).
(I2) The prediction errors ε(k + τ) are independent of u(k) for any τ (i.e., of all inputs).
A good model should satisfy (W) and (I1), and if there is no feedback, also (I2).
We will develop tests that allow us to either accept or reject these hypotheses for a given model, and therefore validate or reject the model.
Whiteness: Correlations
Recall the correlation function (equal to the covariance in the zero-mean case):
r_ε(τ) = E{ε(k + τ) ε(k)}
If ε(k) is zero-mean white noise:
The correlation function is zero, r_ε(τ) = 0 for any nonzero τ.
At zero, r_ε(0) is the variance σ² of the white noise.
Whiteness: Correlations from data
Estimating correlations from data:
r̂_ε(τ) = (1/N) Σ_{k=1}^{N−τ} ε(k + τ) ε(k)
Normalization by the (estimated) variance:
x(τ) = r̂_ε(τ) / r̂_ε(0)
Then, x(τ) should be small for nonzero τ.
Whiteness test at τ
For any Gaussian distribution N(µ, σ²), the probability of drawing a sample in the interval [µ − 1.96σ, µ + 1.96σ] is found (from the Gaussian formula) to be 0.95.
It can be shown that for large N, x(τ) is distributed according to the Gaussian N(0, 1/N), and therefore that:
P(|x(τ)| ≤ 1.96/√N) = 0.95
Whiteness test at τ (continued)
Whiteness test at τ
If |x(τ)| ≤ 1.96/√N, then the whiteness hypothesis r_ε(τ) = 0 is accepted.
Otherwise, the whiteness hypothesis is rejected.
Intuition: If the test succeeds, then the hypothesis is likely to be true.
(Mathematically, the probability of rejecting the hypothesis when it is in fact true is 0.05.)
In practice, we require the hypothesis to hold at all τ ≠ 0 supported by the data.
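As an illustration, the per-lag whiteness test can be sketched in a few lines. The lecture's tools are in Matlab (resid); the standalone sketch below instead uses Python/NumPy on synthetic residuals, and the function name and data are ours, not from the course.

```python
import numpy as np

def whiteness_test(eps, tau_max):
    """Return the lags tau in 1..tau_max at which r_eps(tau) = 0 is rejected."""
    N = len(eps)
    r0 = np.sum(eps * eps) / N               # estimate of r_eps(0), the variance
    bound = 1.96 / np.sqrt(N)                # 95% confidence bound on x(tau)
    rejected = []
    for tau in range(1, tau_max + 1):
        r_tau = np.sum(eps[tau:] * eps[:N - tau]) / N   # estimate of r_eps(tau)
        x_tau = r_tau / r0                              # normalized correlation
        if abs(x_tau) > bound:
            rejected.append(tau)
    return rejected

rng = np.random.default_rng(0)
white = rng.normal(size=1000)        # synthetic white residuals: should mostly pass
colored = white[1:] + white[:-1]     # MA(1)-filtered residuals: should fail at tau = 1
print(whiteness_test(white, 10))
print(whiteness_test(colored, 10))
```

The colored sequence has a theoretical lag-1 normalized correlation of 0.5, far above the 1.96/√N bound, so the test rejects it at τ = 1.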
Independence: Correlations
To verify the independence of ε from u, use the cross-correlation function:
r_εu(τ) = E{ε(k + τ) u(k)}
1 If (I1) is true, then r_εu(τ) = 0 for τ ≥ 0.
2 If (I2) is true, then r_εu(τ) = 0 for any τ.
Estimation from data:
r̂_εu(τ) = (1/N) Σ_{k=1}^{N−τ} ε(k + τ) u(k)   if τ ≥ 0
r̂_εu(τ) = (1/N) Σ_{k=1−τ}^{N} ε(k + τ) u(k)   if τ < 0
Normalization: x(τ) = r̂_εu(τ) / √(r̂_ε(0) r̂_u(0))
Independence test at τ
Like before, for large N, x(τ) is distributed according to the Gaussian N(0, 1/N), and therefore:
P(|x(τ)| ≤ 1.96/√N) = 0.95
Independence test at τ
If |x(τ)| ≤ 1.96/√N, then the independence hypothesis r_εu(τ) = 0 is accepted.
Otherwise, the independence hypothesis is rejected.
In practice, we require the hypothesis to hold at all τ ≥ 0 supported by the data. If there is no feedback, then checking it at τ < 0 as well tests hypothesis (I2).
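The cross-correlation test follows the same pattern as the whiteness test. Again a Python/NumPy sketch on synthetic signals (the course uses Matlab); all names here are ours.

```python
import numpy as np

def independence_test(eps, u, tau_max):
    """Return the lags tau in 0..tau_max at which r_eps_u(tau) = 0 is rejected."""
    N = len(eps)
    scale = np.sqrt(np.mean(eps**2) * np.mean(u**2))   # sqrt(r_eps(0) * r_u(0))
    bound = 1.96 / np.sqrt(N)
    rejected = []
    for tau in range(0, tau_max + 1):
        r_tau = np.sum(eps[tau:] * u[:N - tau]) / N    # estimate of r_eps_u(tau)
        if abs(r_tau / scale) > bound:
            rejected.append(tau)
    return rejected

rng = np.random.default_rng(1)
u = rng.normal(size=2000)
eps_indep = rng.normal(size=2000)   # independent of u: should mostly pass
eps_dep = np.roll(u, 2)             # eps(k) = u(k-2): strongly dependent on u
print(independence_test(eps_indep, u, 5))
print(independence_test(eps_dep, u, 5))
```

For eps_dep, the correlation at τ = 2 is essentially 1, so the hypothesis is rejected there, as a model whose errors still contain delayed input influence would be.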
Table of contents
1 Model validation with correlation tests
Basic correlation tests
Matlab example
Global correlation tests
2 Structure selection
3 Testing for overparametrization
Matlab example: Experimental data
The real system is in output-error form:
y(k) = (B(q^-1) / F(q^-1)) u(k) + e(k)
and has order n = 3.
plot(id); and plot(val);
Matlab: ARX model
First, we try an ARX model:
mARX = arx(id, [3, 3, 1]); resid(mARX, id);
The model fails the whiteness test (W). This is because the system is not within the model class, leading to an inaccurate model. The model is rejected.
Matlab: ARX model (continued)
Simulating on the validation data confirms the fact that the model is poor.
Matlab: OE model
mOE = oe(id, [3, 3, 1]); resid(mOE, id);
The OE model passes all the tests – as expected, because the OE model class contains the real system. The model is therefore validated.
Important note: The Matlab functions use 0.99 probability instead of 0.95.
Matlab: OE model (continued)
Simulating the model on the validation data confirms the model has good quality.
Table of contents
1 Model validation with correlation tests
Basic correlation tests
Matlab example
Global correlation tests
2 Structure selection
3 Testing for overparametrization
Chi-squared distribution
Let x be a vector of m independent random variables x_i, so that each x_i has Gaussian distribution N(0, 1).
Define χ²(m) as the distribution of the sum Σ_{i=1}^m x_i² = xᵀx. This is called the chi-squared distribution (Matlab: chi2pdf).
(Figure: probability density of the chi-squared distribution. Image credit: Wikipedia.)
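A quick Monte Carlo sanity check of this definition (an illustrative Python/NumPy sketch, not part of the lecture): sums of m squared independent N(0,1) samples should show the chi-squared mean m and variance 2m.

```python
import numpy as np

rng = np.random.default_rng(2)
m, trials = 4, 200000
x = rng.normal(size=(trials, m))   # each row: m independent N(0,1) samples
z = np.sum(x**2, axis=1)           # each row sum is one chi2(m) sample
print(z.mean(), z.var())           # chi2(m) has mean m and variance 2m
```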
Whiteness test
Define r̂ = [r̂_ε(1), . . . , r̂_ε(m)]ᵀ, and choose a small α, e.g. 0.05. Define β(α) to be the value β so that a random variable z ≥ 0 distributed with χ²(m) has probability 1 − α of being at most β:
P(z ≤ β) = 1 − α
(Compare: P(|x(τ)| ≤ 1.96/√N) = 0.95 for the single-τ case.)
Whiteness test
If N r̂ᵀ r̂ / r̂_ε²(0) ≤ β(α), then the hypothesis that the sequence ε(k) is white noise is accepted.
Otherwise, the hypothesis is rejected.
(Compare: |x(τ)| ≤ 1.96/√N)
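The global statistic can be sketched as follows (illustrative Python/NumPy; the threshold β is taken from a standard chi-squared table rather than computed, and the data are synthetic).

```python
import numpy as np

def global_whiteness_stat(eps, m):
    """Compute the statistic N * rhat' rhat / rhat_eps(0)^2 over lags 1..m."""
    N = len(eps)
    r0 = np.sum(eps * eps) / N
    r = np.array([np.sum(eps[tau:] * eps[:N - tau]) / N for tau in range(1, m + 1)])
    return N * np.dot(r, r) / r0**2

rng = np.random.default_rng(3)
eps = rng.normal(size=2000)            # synthetic white residuals
beta = 11.07                           # table value: P(z <= 11.07) = 0.95 for chi2(5)
stat = global_whiteness_stat(eps, 5)
print(stat, stat <= beta)              # accept whiteness when stat is below beta
```

For white residuals the statistic is approximately χ²(5)-distributed (mean 5), while for correlated residuals it grows roughly linearly with N and far exceeds β.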
Independence test
The independence test can be extended in a similar way; the formalism is more complicated and will not be given here.
Table of contents
1 Model validation with correlation tests
2 Structure selection
3 Testing for overparametrization
Consider we are given several candidate model structures M1, M2, . . .. Example: ARX structures of increasing order.
How to choose among them?
First idea: choose the Mi leading to the smallest mean squared error:
V(θ̂) = (1/N) Σ_{k=1}^N ε(k)²
This does not take into account the complexity of the model, so it may lead to overfitting. We explore other options (without going into their derivation).
Akaike’s information criterion
W_AIC = N log V(θ̂) + 2p,  or equivalently:  log V(θ̂) + 2p/N
where N is the number of data points and p the number of parameters (e.g., na + nb in ARX).
Choice: Model with smallest W_AIC.
Intuition: The term 2p penalizes the complexity of the model (number of parameters). Taking the logarithm of the MSE allows small values of the MSE to be better taken into account.
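A small numerical illustration of the AIC trade-off (Python, with made-up MSE values V for ARX models of increasing order; none of these numbers come from the lecture):

```python
import numpy as np

def aic(V, p, N):
    """Akaike's criterion W_AIC = N log V + 2 p."""
    return N * np.log(V) + 2 * p

N = 1000
# hypothetical MSEs for orders n = 1..5 (p = 2n parameters for an ARX(n, n) model);
# beyond n = 3 the MSE barely improves, so the 2p penalty should dominate:
candidates = {1: 0.90, 2: 0.25, 3: 0.10, 4: 0.0999, 5: 0.0998}
scores = {n: aic(V, 2 * n, N) for n, V in candidates.items()}
best = min(scores, key=scores.get)
print(best)   # order with the smallest W_AIC
```

Even though orders 4 and 5 have (slightly) smaller MSEs, the complexity penalty makes order 3 the AIC choice.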
Final prediction error
W_FPE = V(θ̂) (1 + p/N) / (1 − p/N)
Choice: Model with smallest W_FPE.
Intuition: When N is large:
V(θ̂) (1 + p/N) / (1 − p/N) = V(θ̂) (1 + (2p/N) / (1 − p/N)) ≈ V(θ̂) (1 + 2p/N)
The term 2p/N penalizes model complexity.
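The large-N approximation above is easy to check numerically (Python sketch with hypothetical values of V, p and N):

```python
def fpe(V, p, N):
    """Final prediction error W_FPE = V * (1 + p/N) / (1 - p/N)."""
    return V * (1 + p / N) / (1 - p / N)

V, p, N = 0.10, 6, 1000          # hypothetical MSE, parameter count, data length
exact = fpe(V, p, N)
approx = V * (1 + 2 * p / N)     # the large-N approximation V * (1 + 2p/N)
print(exact, approx)
```

For p/N this small, the exact criterion and the approximation agree to several digits, confirming that FPE effectively inflates the MSE by the complexity factor 1 + 2p/N.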
F-test
Take just two model classes M1, M2 so that M1 ⊂ M2 (e.g., they are both ARX models and M2 has larger na and nb).
Choice: M1 is chosen over M2 (M2 does not offer a significant improvement) if:
N (V1(θ̂1) − V2(θ̂2)) / V2(θ̂2) ≤ β(α)
β(α) is defined like before using the chi-squared distribution, as the value β so that a random variable z ≥ 0 distributed with χ²(p2 − p1) has probability 1 − α of being at most β:
P(z ≤ β) = 1 − α
Subscripts 1 and 2 refer to the respective model classes. E.g., p2 is the number of parameters of M2, and is larger than p1.
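A worked example of the F-test decision (Python, with hypothetical MSE and parameter values; β is read from a standard chi-squared table rather than computed):

```python
N = 1000
V1, p1 = 0.1004, 6   # smaller structure M1: hypothetical MSE and parameter count
V2, p2 = 0.1000, 8   # larger structure M2
beta = 5.99          # table value: 95% quantile of chi2(p2 - p1) = chi2(2)

stat = N * (V1 - V2) / V2      # test statistic
choose_M1 = stat <= beta       # True: M2's improvement is not significant
print(stat, choose_M1)
```

Here the statistic is about 4, below β, so the extra two parameters of M2 do not buy a statistically significant improvement and M1 is kept.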
Matlab example
An OE system with n = 2.
Matlab: selstruc with AIC
Recall arxstruc:
Na = 1:15; Nb = 1:15; Nk = 1:5;
NN = struc(Na, Nb, Nk); V = arxstruc(id, val, NN);
struc generates all combinations of orders in Na, Nb, Nk.
arxstruc identifies for each combination an ARX model on the data id, simulates it on the data val, and returns information about the MSEs, model orders etc. in V.
Matlab: selstruc with AIC (continued)
To choose the structure with Akaike's information criterion:
N = selstruc(V, 'aic');
For our data, N = [8, 8, 1].
Alternatively, graphical selection also allows using AIC:
N = selstruc(V, 'plot');
Matlab: Results
Remarks
AIC, FPE also work if the system is not in the model class.
Matlab offers functions aic, fpe that compute these criteria for a list of models.
Table of contents
1 Model validation with correlation tests
2 Structure selection
3 Testing for overparametrization
Motivation
Consider that the real system obeys the ARMAX structure:
A0(q^-1) y(k) = B0(q^-1) u(k) + C0(q^-1) e(k)
where subscript 0 indicates quantities related to the real system.
This is equivalent to the model:
M(q^-1) A0(q^-1) y(k) = M(q^-1) B0(q^-1) u(k) + M(q^-1) C0(q^-1) e(k)
with M(q^-1) some polynomial of order nm.
So, using ARMAX identification with na = na0 + nm, nb = nb0 + nm, nc = nc0 + nm will produce an accurate model. This model is however too complicated (overparametrized), and will have some (nearly) common factors M(q^-1) in all polynomials.
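The common-factor effect can be demonstrated by multiplying hypothetical true polynomials by the same M(q^-1) (a Python/NumPy sketch; coefficient vectors are in increasing powers of q^-1 and all polynomials are made up for illustration):

```python
import numpy as np

A0 = np.array([1.0, -1.5, 0.7])   # true A0(q^-1), hypothetical, na0 = 2
B0 = np.array([1.0, 0.5])         # true B0(q^-1), hypothetical, nb0 = 1
M  = np.array([1.0, -0.4])        # arbitrary common polynomial M(q^-1), nm = 1

A = np.convolve(M, A0)            # inflated A(q^-1), order na0 + nm
B = np.convolve(M, B0)            # inflated B(q^-1), order nb0 + nm

print(np.roots(A))                # contains the root 0.4 contributed by M
print(np.roots(B))                # also contains 0.4: a common, cancelable factor
```

Both inflated polynomials share the root of M (here 0.4), which is exactly the pole-zero (near-)cancellation the overparametrization test looks for.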
Matlab: overparameterized OE model
On the same data as for the correlation tests (recall the system has order n = 3):
mOE = oe(id, [5, 5, 1]);
Looking at the validation data, the model is accurate:
Matlab: testing for pole-zero cancellations
pzmap(mOE, ’sd’, 1.96);
Arguments ’sd’, nsd specify the confidence region as a number of standard deviations (nsd is taken here as 1.96 to get 0.95 confidence).
Two pairs of poles and zeros have overlapping confidence regions ⇒ it is likely they are canceling each other. This indicates the order should be 3 (the true system order).