Lecture Notes - MIT- System Identification

System Identification

6.435

SET 12 – Identification in Practice

– Error Filtering

– Order Estimation

– Model Structure Validation

– Examples

Munther A. Dahleh

Lecture 12 6.435, System Identification Prof. Munther A. Dahleh

“Practical” Identification

• Given: input-output data $\{u(t), y(t)\},\ t = 1, \dots, N$

• Want:

1) a model for the plant

2) a model for the noise

3) an estimate of the accuracy

• Choice of the model structure: flexibility vs. parsimony

• What do we know?

We know methods for identifying models inside a priori given model structures.

• How can we use this knowledge to provide a model for the plant and the process noise, with reasonable accuracy?

Considerations

• Pre-treatment of data

– Remove the bias (it may not be due to the inputs)

– Filter the high frequency noise

– Outliers, etc.

• Introduce filtered errors. Emphasize certain frequency range. (The filter depends on the criterion).

• Pick a model structure (or model structures)

– Which one is better?

– How can you decide which one reflects the real system?

– Is there any advantage from picking a model with a large number of parameters, if the input is “exciting” only a smaller number of frequency points?

• What are the important quantities that can be computed directly from the data (inputs & outputs), that are important to identification?

Pre-treatment of Data

• Removing the bias

– If $A(q)\,y(t) = B(q)\,u(t) + v(t)$, then the relation between the static input $\bar u$ and the static output $\bar y$ is given by $A(1)\,\bar y = B(1)\,\bar u$.

– The static component of $y$ may not be entirely due to $u$, i.e. the noise might be biased.

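– Method I: Subtract the means.

Define $\bar y = \frac{1}{N}\sum_{t=1}^{N} y(t)$ and $\bar u = \frac{1}{N}\sum_{t=1}^{N} u(t)$.

New data: $y(t) - \bar y$, $u(t) - \bar u$.

– Method II: Model the offset by an unknown constant $\alpha$, and estimate it.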
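The two offset-handling methods can be compared on simulated data. The sketch below is only an illustration under stated assumptions: the first-order plant, its coefficients, the offset value and the noise level are all hypothetical, and the model is written as $y(t) = a\,y(t-1) + b\,u(t-1) + \text{offset} + e(t)$ rather than in the $A(q)$, $B(q)$ sign convention.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 2000
a_true, b_true, offset = 0.8, 0.5, 2.0   # hypothetical plant with a constant offset

u = rng.standard_normal(N)
e = 0.05 * rng.standard_normal(N)
y = np.zeros(N)
y[0] = offset / (1.0 - a_true)           # start at the steady-state level (no transient)
for t in range(1, N):
    y[t] = a_true * y[t - 1] + b_true * u[t - 1] + offset + e[t]

def least_squares(regressors, target):
    """Ordinary least-squares solution of regressors @ theta = target."""
    theta, *_ = np.linalg.lstsq(regressors, target, rcond=None)
    return theta

# Method I: subtract the sample means, then fit y(t) = a y(t-1) + b u(t-1).
yd, ud = y - y.mean(), u - u.mean()
theta_means = least_squares(np.column_stack([yd[:-1], ud[:-1]]), yd[1:])

# Method II: keep the raw data and estimate the offset as an extra parameter.
theta_offset = least_squares(
    np.column_stack([y[:-1], u[:-1], np.ones(N - 1)]), y[1:]
)

# For comparison: ignoring the offset pushes the estimate of a towards 1.
theta_naive = least_squares(np.column_stack([y[:-1], u[:-1]]), y[1:])

print("Method I  [a, b]        :", theta_means)
print("Method II [a, b, offset]:", theta_offset)
print("No offset handling [a, b]:", theta_naive)
```

In this setup both methods should recover the plant parameters closely, while the fit that ignores the offset tries to explain the constant level through the dynamics.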

• High Frequency disturbances in the data record.

– “High” means above the frequency of interest.

– Related to the choice of the sampling period.

[Block diagram: sampler]

– Without an anti-aliasing filter, high frequency noise is folded to low frequency.

[Block diagram: anti-alias filter followed by the sampler]

– high frequency noise depends on:

a) high frequency noise due to

b) aliasing.

– Problem occurs at both the inputs and outputs.

$L$: LTI low-pass filter.
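The folding effect can be seen numerically. In the sketch below, the sampling rate and the disturbance frequency are assumed values chosen for illustration: a 9 Hz sinusoid sampled at 10 Hz without an anti-aliasing filter is indistinguishable from a 1 Hz sinusoid.

```python
import numpy as np

fs = 10.0        # sampling frequency in Hz (assumed for illustration)
f_noise = 9.0    # disturbance frequency, above the Nyquist frequency fs / 2
N = 100

n = np.arange(N)
samples = np.cos(2 * np.pi * f_noise * n / fs)   # sampled without anti-alias filtering

spectrum = np.abs(np.fft.rfft(samples))
freqs = np.fft.rfftfreq(N, d=1.0 / fs)           # frequency grid from 0 to fs / 2
alias_freq = freqs[np.argmax(spectrum)]

print("Disturbance at", f_noise, "Hz appears at", alias_freq, "Hz after sampling")
```

The peak of the sampled spectrum sits at $f_s - f_{\text{noise}}$, inside the band of interest, which is why the low-pass filter has to act before the sampler.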

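equivalently, in terms of the filtered prediction error: $\varepsilon_F(t,\theta) = L(q)\,H^{-1}(q,\theta)\,[\,y(t) - G(q,\theta)\,u(t)\,]$

i.e. multiply the noise filter $H$ by $L^{-1}$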

• Outliers, Bursts

– Either erroneous or highly disturbed data points.

– Could have a very bad effect on the estimate.

– Solution:

a) Good choice of a criterion (robust to such changes)

b) Observe the residual spectrum. Sometimes it is possible to determine bad data.

c) Remove by hand (at the risk of tampering with real data).

d) Failure-detection using hypothesis testing or statistical methods. (Need to define a threshold).
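Option d) can be sketched as a simple threshold test on the residuals of a first fit. Everything specific here is an assumption made for illustration: the first-order plant, the two injected measurement errors, the robust scale estimate (median absolute deviation) and the threshold of 4 standard deviations.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 500
u = rng.standard_normal(N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = 0.8 * y[t - 1] + 0.5 * u[t - 1] + 0.05 * rng.standard_normal()

outlier_times = [100, 300]
y_meas = y.copy()
y_meas[outlier_times] += 20.0        # two gross measurement errors

# First-pass least-squares fit of y(t) = a y(t-1) + b u(t-1) on the corrupted data.
Phi = np.column_stack([y_meas[:-1], u[:-1]])
target = y_meas[1:]
theta, *_ = np.linalg.lstsq(Phi, target, rcond=None)
residuals = target - Phi @ theta

# Robust scale: median absolute deviation, scaled to match a Gaussian std.
center = np.median(residuals)
sigma = 1.4826 * np.median(np.abs(residuals - center))
threshold = 4.0                      # user-chosen detection threshold

# Residual k belongs to time k + 1, hence the shift.
flagged = np.flatnonzero(np.abs(residuals - center) > threshold * sigma) + 1

print("Flagged time indices:", flagged)
```

A corrupted output sample can also disturb the following residual, because it enters the next regressor; flagged neighbours of a true outlier are therefore expected.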

Role of Filters: Affecting the Bias Distribution

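• Frequency domain interpretation of parameter estimation (with $G_0$ the true plant, $\Phi_u$ the input spectrum and $\Phi_v$ the noise spectrum):

$$\theta^* = \arg\min_\theta \int_{-\pi}^{\pi} \frac{|G_0(e^{i\omega}) - G(e^{i\omega},\theta)|^2\,\Phi_u(\omega) + \Phi_v(\omega)}{|H(e^{i\omega},\theta)|^2}\,d\omega$$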

If $\theta = (\rho, \eta)$ with $G(q,\rho)$ and $H(q,\eta)$: independently parametrized model structure

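• Heuristically, $\theta^*$ is chosen as a compromise between minimizing the integral of $|G_0 - G_\theta|^2$ (weighted as below) and matching $|H_\theta|^2$ to the error spectrum.

• Weighting function: $Q(\omega,\theta) = \Phi_u(\omega)\,/\,|H(e^{i\omega},\theta)|^2$

– Input spectrum

– Noise spectrum

• With a pre-filter $L$: $Q(\omega,\theta) = \Phi_u(\omega)\,|L(e^{i\omega})|^2\,/\,|H(e^{i\omega},\theta)|^2$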

• Can view the pre-filters as weighting functions to emphasize certain frequency ranges. This interpretation may not coincide with “getting rid of high frequency components of the data”.

• Depending on the criterion, the choice of L can be different.
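To make the weighting interpretation concrete, the sketch below evaluates the standard frequency weighting $Q(\omega) = \Phi_u |L|^2 / |H|^2$ for an output-error noise model ($H = 1$) and an ARX noise model ($H = 1/A$). The denominator $A(q)$, the white input and the first-order low-pass pre-filter are assumptions chosen for illustration.

```python
import numpy as np

omega = np.linspace(0.01, np.pi, 400)
z_inv = np.exp(-1j * omega)              # q^{-1} evaluated on the unit circle

# Hypothetical denominator A(q) = 1 - 1.4 q^-1 + 0.49 q^-2 (double pole at 0.7).
A = 1.0 - 1.4 * z_inv + 0.49 * z_inv**2
phi_u = np.ones_like(omega)              # white input: flat spectrum

def weighting(phi_u, H, L=1.0):
    """Frequency weighting Q(omega) = Phi_u |L|^2 / |H|^2."""
    return phi_u * np.abs(L) ** 2 / np.abs(H) ** 2

Q_oe = weighting(phi_u, H=np.ones_like(z_inv))    # OE: H = 1
Q_arx = weighting(phi_u, H=1.0 / A)               # ARX: H = 1/A, so Q = Phi_u |A|^2

# Illustrative first-order low-pass pre-filter L(q) = (1 - c) / (1 - c q^-1).
c = 0.9
L = (1.0 - c) / (1.0 - c * z_inv)
Q_arx_filtered = weighting(phi_u, H=1.0 / A, L=L)

def high_to_low_ratio(Q):
    """Weight at the highest grid frequency relative to the lowest."""
    return Q[-1] / Q[0]

for name, Q in [("OE", Q_oe), ("ARX", Q_arx), ("ARX + low-pass L", Q_arx_filtered)]:
    print(name, "high/low weighting ratio:", high_to_low_ratio(Q))
```

With a white input the OE weighting is flat, the ARX weighting grows strongly with frequency, and the low-pass pre-filter pulls the ARX weighting back towards low frequencies.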

OE Model Structures

• If the true plant $G_0$ rolls off, then as long as $G(e^{i\omega},\theta)$ is small at high frequency, the contribution of that range to the criterion will be very small.

• If $\Phi_u(\omega) \equiv 1$, we expect to match much better at low frequency.

• Example (book)

No noise.

Input: PRBS (pseudo-random binary sequence).

OE:

– Good match at low frequency. Not as good at high frequency.

– Introduce a high-pass filter (5th order Butterworth filter, cut-off frequency = 0.5 rad/sec).

ARX Model Structure

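• $G = B/A$ and $H = 1/A$ are not independently parametrized.

• The weighting function is $\Phi_u(\omega)\,|A(e^{i\omega},\theta)|^2$. If the true plant rolls off, then $|A(e^{i\omega},\theta)|$ is large at high frequency.

• If $\Phi_u(\omega) \equiv 1$, then it will emphasize the high frequency part of the criterion.

• Conclusions are not as transparent in the noisy case. However, they hold in general for large signal-to-noise ratio (SNR).

• Same example, now fitted with an ARX model. The frequency response of the estimate is: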

– Not a very good match at low frequency.

– Better than OE at high frequency.

– Can change this through a pre-filter. (5th order Butterworth, lowpass with cut-off frequency = 0.5)
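Mechanically, pre-filtering means passing both the input and the output through the same filter $L$ before the least-squares fit. The sketch below assumes a noise-free second-order ARX plant and a first-order low-pass filter, both hypothetical stand-ins for the book example and its Butterworth filter. In this correctly parametrized, noise-free case the pre-filter leaves the estimates unchanged; it only reshapes the fit when the model is too simple or noise is present.

```python
import numpy as np

def lowpass(x, c=0.7):
    """First-order low-pass L(q) = (1 - c) / (1 - c q^-1), zero initial state."""
    out = np.zeros(len(x))
    state = 0.0
    for t, value in enumerate(x):
        state = c * state + (1.0 - c) * value
        out[t] = state
    return out

def fit_arx2(y, u):
    """Least squares for y(t) + a1 y(t-1) + a2 y(t-2) = b1 u(t-1) + b2 u(t-2)."""
    Phi = np.column_stack([-y[1:-1], -y[:-2], u[1:-1], u[:-2]])
    theta, *_ = np.linalg.lstsq(Phi, y[2:], rcond=None)
    return theta                      # [a1, a2, b1, b2]

rng = np.random.default_rng(2)
N = 400
u = rng.standard_normal(N)
y = np.zeros(N)
for t in range(N):                    # hypothetical plant, zero initial conditions
    if t >= 1:
        y[t] += 1.4 * y[t - 1] + 1.0 * u[t - 1]
    if t >= 2:
        y[t] += -0.49 * y[t - 2] + 0.5 * u[t - 2]

theta_raw = fit_arx2(y, u)
theta_filtered = fit_arx2(lowpass(y), lowpass(u))   # same L on both signals

print("Raw data      :", theta_raw)
print("Filtered data :", theta_filtered)
```

Filtering only one of the two signals would change the plant seen by the estimator, which is why the same $L$ is applied to $u$ and $y$.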

– Better low frequency fit.

– Another interpretation

small at high frequency.

low frequency

Filters:

high frequency if L – low pass.

Conclusions

• Pre-filters can be viewed as design parameters, in addition to their standard interpretation as a means of noise reduction.

• Pre-treatment of the data is quite valuable, but it should be done with care.

• Sampling can be quite tricky. Need to estimate the bandwidth of the system.

Model Structure Determination

• Flexibility vs. parsimony

• Lots of trial and error. Usually more than one “experiment” is available.

[Diagram: choice of model structure, estimated parameters, validation on new data]

• Theme: Fit the data with the least “complex” model structure. Avoid over-fitting which amounts to fitting noise.

• Better to compare similar model structures, although it is necessary to compare different structures at the end.

• Available data dictates the possible model structures and their dimensions.

• Can noise help identify the parameters?

• How “bad” is the effect of noise? Consistency in general is guaranteed for any SNR. What does that mean?

• Is there a rigorous way of comparing different model structures?

[Figure: minimum prediction error versus model structure]

• Akaike’s Information Theoretic Criterion (AIC).

• Akaike’s Final Prediction Error criterion (FPE).

Order Estimation

• Qualitative

– Spectral estimate

– Step response if available. Otherwise, step response of spectral estimate.

• Quantitative

– Covariance matrices

– Information matrix

– Residual-input correlation

– Residual whiteness

• All methods are limited by the input used.

Covariance Matrix

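• Two basic results

– $u$ is p.e. of order $2n$ and $A$, $B$ are coprime

– $u$ is p.e. of order $n$ and the noise is white or persistent

• To determine $n$, obtain estimates of the regressor covariance matrix for increasing candidate orders $s$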

Case 1: u is WN, v is WN

Increase $s$ until the estimated covariance matrix is “singular”.

Use the SVD or other robust rank tests, and observe a sudden drop in the rank.
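A minimal sketch of the rank test in Case 1 follows. The second-order plant, the noise level and the regressor layout $\varphi_s(t) = [-y(t-1), \dots, -y(t-s),\ u(t-1), \dots, u(t-s)]$ are assumptions for illustration. The ratio of the smallest to the largest singular value of the stacked regressors is used as the robust indicator of near-singularity.

```python
import numpy as np

def regressor_matrix(y, u, s):
    """Rows are [-y(t-1..t-s), u(t-1..t-s)] for t = s, ..., N-1."""
    N = len(y)
    cols = [-y[s - k:N - k] for k in range(1, s + 1)]
    cols += [u[s - k:N - k] for k in range(1, s + 1)]
    return np.column_stack(cols)

def singular_value_ratio(y, u, s):
    """Smallest over largest singular value; near zero means nearly rank deficient."""
    sv = np.linalg.svd(regressor_matrix(y, u, s), compute_uv=False)
    return sv[-1] / sv[0]

rng = np.random.default_rng(3)
N = 2000
u = rng.standard_normal(N)               # white-noise input
v = 0.01 * rng.standard_normal(N)        # white equation noise, high SNR
y = np.zeros(N)
for t in range(N):                       # hypothetical plant of true order n = 2
    y[t] = v[t]
    if t >= 1:
        y[t] += 1.4 * y[t - 1] + 1.0 * u[t - 1]
    if t >= 2:
        y[t] += -0.49 * y[t - 2] + 0.5 * u[t - 2]

ratios = {s: singular_value_ratio(y, u, s) for s in range(1, 5)}
for s, r in ratios.items():
    print("s =", s, " sigma_min / sigma_max =", r)
```

The ratio should collapse when $s$ first exceeds the true order, here between $s = 2$ and $s = 3$; with more noise the drop becomes less pronounced.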

Case 2: u is p.e of order , Noise is white.

If is singular.

you really cannot estimate the order of the system if it is larger than .

Case 3: u is p.e of order , Noise free case.

cannot be determined. Not a likely hypothesized model structure.

is a bad estimate of . Of course N is fixed

(data length).

“Enhanced criterion”

estimated noise contribution.

• If noise level is high, use an instrumental variable

test: rank

• If u is p.e , then generically

• Other tests:

Estimates of

non singular

Examples

• System unknown. Study possible conclusions for different experiments and different SNR.

• Model structure

• Inputs

– WN

Can determine the order of persistent excitation from the spectrum of $u$ (or simply its FFT).

• SNR:

• In all examples, both the inputs and the outputs are accessible.

First Experiment

Test for model order:

From data is singular for

(noticed a sudden drop)

→ Estimated system

small (ARX)

Comment: If $m$ is the correct structure, the above are “good” estimates of the parameters.

Plot shows different SNR.

[Figure: random input]

[Figure: ARX model of an ARX plant]

Second Experiment

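$u$ is p.e. of order 2.

| | 1 | 0.1 | 0.01 |
|---|---|---|---|
| $\hat a_1$ | -0.0014 | -1.38 | -1.398 |
| $\hat a_2$ | 0.0005 | 0.479 | 0.4885 |
| $\hat b_1$ | -0.0520 | 1.255 | 0.9631 |
| $\hat b_2$ | 0.054 | 0.356 | 0.5437 |
| | high | high | low |
| | $2.4\times10^{-6}$ | $3.03\times10^{-7}$ | $2.67\times10^{-7}$ |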

Theoretical Analysis:

Data is informative (although det is small)

regardless of

the estimates were quite bad for in

comparison to WN inputs.

Even though noise “helps” in obtaining asymptotic convergence (through providing excitation), it is not helpful for finite-data records, since its effect cannot be averaged. Accuracy depends on

[Figure: ARX model (with ARX plant), 2nd input; panels for λ = 1 and λ = 0]

Third Experiment

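$u$ is p.e. of order 4.

| | $\hat a_1$ | $\hat a_2$ | $\hat b_1$ | $\hat b_2$ | | |
|---|---|---|---|---|---|---|
| 1 | -1.475 | 0.5317 | 2.38 | -1.134 | high | 0.25 |
| 0.1 | -1.39 | 0.478 | 1.186 | 0.333 | smaller | 0.07 |
| 0.01 | -1.395 | 0.4862 | 1.022 | 0.487 | low | 0.069 |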

Theoretical Analysis:

Data is informative w. r. to

Results for are better in this case than

[Figure: ARX model of an ARX plant]

[Figure: ARX model of an ARX plant, SNR = 1]

[Figure: ARX model of an ARX plant, SNR = 0.01]

Conclusions

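• Accuracy of estimates depends on the noise level and on how rich the input is.

• Estimate of accuracy

If ARX, $\mathrm{Cov}\,\hat\theta_N \approx \frac{\lambda}{N}\,\bar R^{-1}$, with $\lambda$ the noise variance and $\bar R = E\,\varphi(t)\varphi^T(t)$ the regressor covariance.

Large $\lambda$ ⇒ the covariance is large.

$u$ is not rich ⇒ $\bar R$ is singular ⇒ the covariance is large.

$u$ is rich (but close to not rich) ⇒ $\bar R^{-1}$ is large.

$\lambda$ small ⇒ better accuracy.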

• Explanation (Proof of HW#3 problem 2)

is near singular

⇒ Low accuracy for estimates. This can be countered by a small noise variance $\lambda$.

• How about other structures (OE, ARMAX, ……)? The same conclusions hold as long as $v$ is persistent.

Comparisons of Different Model Structures

• There is a trade-off between the model structure complexity and the min. error.

[Figure: min $V$ versus dimension of the model structure]

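• Akaike’s Final Prediction Error criterion (FPE):

$$\min_{\text{model structure}}\ \frac{1 + d_m/N}{1 - d_m/N}\; V_N(\hat\theta_N)$$

• Based on maximum likelihood. Trade-off: $d_m$ (dimension of the model structure) vs. $V_N(\hat\theta_N)$ (minimum error).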

• A natural way to evaluate a model structure is by

is a random variable

• Obtain estimates of both and J

• Result

Proof: expand

also

around

Notice:

and

Akaike’s Information Theoretic Criterion

• Let (Log likelihood function)

• Assume true system

• The matrix is invertible (identifiability).

• Model structure determination problem

• For every fixed does not affect the min.

Example

• Assume that the innovations are Gaussian with unknown variance.

• Approximately minimize
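For reference, the Gaussian case can be worked out in a few lines. This is a sketch of the standard derivation, not a transcription of the missing slide equations; the symbols $V_N(\theta) = \frac{1}{N}\sum_t \varepsilon^2(t,\theta)$, $\lambda$ (innovation variance) and $d$ (number of parameters in $\theta$) are introduced here as assumptions.

```latex
\begin{align*}
% negative log likelihood for Gaussian innovations with variance \lambda
-\log L(\theta,\lambda)
  &= \frac{N}{2}\log 2\pi + \frac{N}{2}\log\lambda + \frac{N}{2\lambda}\,V_N(\theta) \\
% minimize over \lambda for fixed \theta
\hat\lambda(\theta) &= V_N(\theta)
  \quad\Longrightarrow\quad
  -\log L\big(\theta,\hat\lambda(\theta)\big)
  = \frac{N}{2}\log V_N(\theta) + \frac{N}{2}\left(1 + \log 2\pi\right) \\
% add the penalty d, drop constants, scale by 2/N
\text{AIC:}\quad & \min_{d,\,\theta}\ \left[\log V_N(\theta) + \frac{2d}{N}\right] \\
% for d \ll N, \log(1 + 2d/N) \approx 2d/N
\approx\quad & \min_{d,\,\theta}\ V_N(\theta)\left(1 + \frac{2d}{N}\right)
\end{align*}
```

The last step uses $\log(1 + 2d/N) \approx 2d/N$ for $d \ll N$, so the approximate criterion is the loss function inflated by a factor that grows with the number of parameters.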

Akaike’s Final Prediction Error Criterion

• Let

estimate

of

(FPE)
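The criterion can be tried on simulated data. The sketch below assumes the commonly quoted form $\mathrm{FPE} = V_N(\hat\theta_N)\,\frac{1 + d/N}{1 - d/N}$ and the Gaussian AIC form $\log V_N + 2d/N$; the second-order plant, the noise level and the helper names are hypothetical.

```python
import numpy as np

def fit_arx(y, u, n):
    """Least-squares ARX(n, n) fit; returns the loss V_N and the sample count."""
    N = len(y)
    cols = [-y[n - k:N - k] for k in range(1, n + 1)]
    cols += [u[n - k:N - k] for k in range(1, n + 1)]
    Phi = np.column_stack(cols)
    target = y[n:]
    theta, *_ = np.linalg.lstsq(Phi, target, rcond=None)
    residuals = target - Phi @ theta
    return np.mean(residuals ** 2), len(target)

def fpe(V, d, N):
    """Final prediction error: loss inflated by (1 + d/N) / (1 - d/N)."""
    return V * (1.0 + d / N) / (1.0 - d / N)

def aic(V, d, N):
    """AIC for Gaussian innovations, up to constants: log V + 2 d / N."""
    return np.log(V) + 2.0 * d / N

rng = np.random.default_rng(4)
N = 1000
u = rng.standard_normal(N)
e = 0.1 * rng.standard_normal(N)
y = np.zeros(N)
for t in range(N):                       # hypothetical second-order ARX plant
    y[t] = e[t]
    if t >= 1:
        y[t] += 1.4 * y[t - 1] + 1.0 * u[t - 1]
    if t >= 2:
        y[t] += -0.49 * y[t - 2] + 0.5 * u[t - 2]

losses, fpe_values, aic_values = {}, {}, {}
for n in range(1, 5):
    V, n_samples = fit_arx(y, u, n)
    d = 2 * n                            # n a-parameters plus n b-parameters
    losses[n] = V
    fpe_values[n] = fpe(V, d, n_samples)
    aic_values[n] = aic(V, d, n_samples)
    print("order", n, " V =", V, " FPE =", fpe_values[n], " AIC =", aic_values[n])
```

The loss $V_N$ can only decrease as the order grows, whereas FPE and AIC add a penalty for the extra parameters; orders above the true one typically differ from it only marginally in either criterion.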

Example 2

• The system is unknown.

• 3 experiments

• Consider two model structures

Experiment 1

• u = rand sequence

⇒ system has dim = 2

• Estimated parameters

| Structure | Parameters | |
|---|---|---|
| ARX | -1.4, 0.49, 1, 0.5 | 0 |
| OE | -1.4, 0.49, 1, 0.5 | 0 |

• Try different structures

AIC or FPE is the best choice.

Experiment 2

• u is p.e of order 2

ARX:

OE: ” ” ” ”

| Structure | ARX Parameters | | OE Parameters | |
|---|---|---|---|---|
| (1, 1, 1) | (-0.8627, 2.3182) | 0.034 | (-0.85, 2.51) | 0.214 |
| (2, 2, 1) | (-1.4, 0.49, 1, 0.5) | 0 | (-1.4, 0.49, 1, 0.5) | 0 |
| (3, 3, 1) | ( * * * * …… ) | 170 | ( * * * * …… ) | 0.3317 |

(3, 3, 1): loss of identifiability.

Clearly (2, 2, 1) is the preferred model structure.

Remark: Both experiments were generated from the model

Experiment 3

| Structure | ARX Parameters | | OE Parameters | |
|---|---|---|---|---|
| (2 2 1) | (-1.4004, 0.4903, 0.98, 0.51) | $3.4\times10^{-5}$ | (-1.40, 0.49, 0.95, 0.54) | $9.2\times10^{-6}$ |
| (3 3 1) | singular | | singular | |

OE model is preferred.

| Structure | ARX Parameters | | OE Parameters | |
|---|---|---|---|---|
| (2 2 1) | (-1.4, 0.49, 1.0031, 0.4944) | $1.6\times10^{-5}$ | (1.399, 0.489, 0.99, 0.50) | $1.3\times10^{-5}$ |
| (3 3 1) | ( * * * …… ) | $1.3\times10^{-5}$ | ( * * * …… ) | $4.2\times10^{-5}$ |

Num. errors

AIC for ARX {In fact does better}.

AIC for ARX and OE ⇒ OE structure.

Remark: Data generated by

Validation

• Use different sets of data to validate the model structure and the estimated model.

• You can obtain different estimates using the data and then average them. Alternatively, you can construct new input-output pairs and re-estimate.
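Validation on a separate data set can be sketched as follows: each candidate structure is estimated on one record and scored by its one-step prediction error on a second, independent record. The plant, noise level, seeds and helper names below are assumptions for illustration.

```python
import numpy as np

def simulate(N, noise_std, seed):
    """Hypothetical second-order ARX plant driven by a white-noise input."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(N)
    e = noise_std * rng.standard_normal(N)
    y = np.zeros(N)
    for t in range(N):
        y[t] = e[t]
        if t >= 1:
            y[t] += 1.4 * y[t - 1] + 1.0 * u[t - 1]
        if t >= 2:
            y[t] += -0.49 * y[t - 2] + 0.5 * u[t - 2]
    return y, u

def regression(y, u, n):
    """Regressor matrix and target vector for an ARX(n, n) structure."""
    N = len(y)
    cols = [-y[n - k:N - k] for k in range(1, n + 1)]
    cols += [u[n - k:N - k] for k in range(1, n + 1)]
    return np.column_stack(cols), y[n:]

noise_std = 0.1
y_est, u_est = simulate(1000, noise_std, seed=10)   # estimation data
y_val, u_val = simulate(1000, noise_std, seed=11)   # fresh validation data

validation_loss = {}
for n in range(1, 5):
    Phi_est, target_est = regression(y_est, u_est, n)
    theta, *_ = np.linalg.lstsq(Phi_est, target_est, rcond=None)
    Phi_val, target_val = regression(y_val, u_val, n)
    validation_loss[n] = np.mean((target_val - Phi_val @ theta) ** 2)
    print("order", n, " validation loss =", validation_loss[n])
```

Unlike the estimation loss, the validation loss does not keep decreasing automatically with the order, so it can be compared across structures without an explicit complexity penalty.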

Conclusions

• Criterion contains a penalty function for the dimension of the system.

• “AIC” is one way of doing that. (FPE) is an estimate of the AIC with a quadratic objective.

• “AIC” has connections with information theory. Given an observation, an assumed PDF and the true PDF, the entropy of the true PDF with respect to the assumed PDF, taken over the observation, gives the information distance.

• “AIC” is the average information distance after some simplification.

• Be careful about numerical errors!
