CHEE825/436 - Module 4
J. McLellan - Fall 2005

Process and Disturbance Models

Outline
• Types of Models
• Model Estimation Methods
• Identifying Model Structure
• Model Diagnostics

The Task of Dynamic Model Building

Partitioning process data into a deterministic component (the process) and a stochastic component (the disturbance):
• process --> transfer function model
• disturbance --> time series model

Process Model Types

• non-parametric
– impulse response
– step response
– spectrum
• parametric
– transfer function models
» numerator
» denominator
– difference equation models - equivalent to transfer function models with the backshift operator

Note - impulse and step models are technically "parametric" when in finite form (e.g., FIR).
Impulse and Step Process Models

Described as a set of weights:

impulse model:  y(t) = Σ_{i=0}^{N} h(i) u(t-i)

step model:  y(t) = Σ_{i=0}^{N} s(i) Δu(t-i)

Note - typically treat Δu(t-N) as a step from 0, i.e., Δu(t-N) = u(t-N).
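A finite impulse response model is just a convolution of the input with the weights h(i); a minimal sketch with illustrative (hypothetical) weights, not values from the course data:

```python
import numpy as np

# Illustrative FIR weights h(i), i = 0..N-1 (hypothetical first-order-like decay)
h = 0.5 * 0.8 ** np.arange(10)

# Unit step input
u = np.ones(30)

# y(t) = sum_i h(i) u(t-i): truncated convolution of the weights with the input
y = np.convolve(u, h)[: len(u)]

# The step response weights s(i) are the cumulative sum of the impulse weights
s = np.cumsum(h)
print(y[:3])
```

Once t exceeds the number of weights, the step response settles at the sum of the impulse weights, i.e. the process gain.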
Process Spectrum Model

Represented as a set of frequency response values, or graphically as a plot of amplitude ratio versus frequency (rad/s).

Process Transfer Function Models

Numerator and denominator dynamics plus time delay:

Gp(q^-1) = B(q^-1) q^-(f+1) / F(q^-1)

• zeros - roots of the numerator B(q^-1)
• poles - roots of the denominator F(q^-1)
• time delay - f is the pure time delay; an extra 1-step delay is introduced by the zero-order hold and sampling
• q^-1 is the backwards shift operator: q^-1 y(t) = y(t-1)
Model Types for Disturbances

• non-parametric
– "impulse response" - infinite moving average
– spectrum
• parametric
– "transfer function" form
» autoregressive (denominator)
» moving average (numerator)

ARIMA Models for Disturbances

d(t) = [C(q^-1) / (D(q^-1) (1-q^-1)^d)] a(t)

• C(q^-1) - moving average component
• D(q^-1) - autoregressive component
• a(t) - random shock

AutoRegressive Integrated Moving Average model. Time series notation - an ARIMA(p,d,q) model has:
• pth-order denominator - AR
• qth-order numerator - MA
• d integrating poles (on the unit circle)
ARMA Models for Disturbances

d(t) = [C(q^-1) / D(q^-1)] a(t)

• C(q^-1) - moving average component
• D(q^-1) - autoregressive component
• a(t) - random shock

Simply an ARIMA model with no integrating component.

Typical Model Combinations

• model predictive control
– impulse/step process model + ARMA disturbance model
» typically a step disturbance model, which can be considered as a pure integrator driven by a single pulse
• single-loop control
– transfer function process model + ARMA disturbance model

Classification of Models in Identification

• AutoRegressive with eXogenous inputs (ARX)
• Output Error (OE)
• AutoRegressive Moving Average with eXogenous inputs (ARMAX)
• Box-Jenkins (BJ)
• per Ljung's terminology
ARX Models

A(q^-1) y(t) = B(q^-1) q^-(f+1) u(t) + a(t)

– u(t) is the exogenous input
– same autoregressive component for process and disturbance
– numerator term for process, no moving average in disturbance
– physical interpretation - disturbance passes through the entire process dynamics
» e.g., feed disturbance

Output Error Models

y(t) = [B(q^-1) / A(q^-1)] q^-(f+1) u(t) + a(t)

– no disturbance dynamics
– numerator and denominator process dynamics
– physical interpretation - process subject to a white noise disturbance (is this ever true?)

ARMAX Models

A(q^-1) y(t) = B(q^-1) q^-(f+1) u(t) + C(q^-1) a(t)

– process and disturbance have the same denominator dynamics
– disturbance has moving average dynamics
– physical interpretation - disturbance passes through the process, entering at a point away from the input
» except if C(q^-1) = B(q^-1)
Box-Jenkins Model

A(q^-1) y(t) = [B(q^-1) / F(q^-1)] q^-(f+1) u(t) + [C(q^-1) / D(q^-1)] a(t)

– autoregressive component plus input; process and disturbance can have different dynamics
– the AR component A(q^-1) represents dynamic elements common to both process and disturbance
– physical interpretation - disturbance passes through other dynamic elements before entering the process

Range of Model Types

Output Error --> ARX --> ARMAX --> Box-Jenkins
(least general)                    (most general)
Outline
• Types of Models
• Model Estimation Methods
• Identifying Model Structure
• Model Diagnostics

Model Estimation - General Philosophy

Form a "loss function" which is minimized to obtain the "best" parameter estimates.

Loss function
» "loss" can be considered as missed trend or information
» e.g., linear regression:
• loss represents left-over trends in the residuals which could be explained by a model
• if we picked up all of the trend, only the random noise e(t) would be left
• additional trends drive up the variation of the residuals
• the loss function is the sum of squares of the residuals (related to the variance of the residuals)
Linear Regression - Types of Loss Functions

First, consider the linear regression model:

Y = β0 + β1 x1 + β2 x2 + ... + βp xp + e,   e ~ N(0, σ^2)

Least Squares estimation criterion -

min_{β0,...,βp} Σ_{i=1}^n (yi - ŷi)^2
  = min_{β0,...,βp} Σ_{i=1}^n (yi - {β0 + β1 x1i + β2 x2i + ... + βp xpi})^2
  = min_{β0,...,βp} Σ_{i=1}^n ei^2

where (yi - ŷi)^2 is the squared prediction error at point "i".
Linear Regression - Types of Loss Functions

The model describes how the mean of Y varies:

E{Y} = β0 + β1 x1 + β2 x2 + ... + βp xp

and the variance of Y is σ^2, because the random component in Y comes from the additive noise "e". The probability density function at point "i" is

f_Yi = [1/sqrt(2πσ^2)] exp{-(yi - (β0 + β1 x1i + ... + βp xpi))^2 / (2σ^2)}
     = [1/sqrt(2πσ^2)] exp{-ei^2 / (2σ^2)}

where ei is the noise at point "i".
Linear Regression - Types of Loss Functions

We can write the joint probability density function for all observations in the data set:

f_{Y1...Yn} = [1/(2πσ^2)^(n/2)] exp{-Σ_{i=1}^n (yi - (β0 + β1 x1i + ... + βp xpi))^2 / (2σ^2)}
            = [1/(2πσ^2)^(n/2)] exp{-Σ_{i=1}^n ei^2 / (2σ^2)}
Linear Regression - Types of Loss Functions

Given the parameters, we can use f_{Y1...Yn} to determine the probability that a given range of observations will occur.

What if we have observations but don't know the parameters?
» assume that we have the most common, or "likely", observations - i.e., observations that have the greatest probability of occurrence
» find the parameter values that maximize the probability of the observed values occurring
» the joint density function becomes a "likelihood function"
» the parameter estimates are "maximum likelihood estimates"
Linear Regression - Types of Loss Functions

Maximum Likelihood parameter estimation criterion -

max_{βi, i=0,...,p} L(β | yi)
  = max_{βi, i=0,...,p} [1/(2πσ^2)^(n/2)] exp{-Σ_{i=1}^n (yi - (β0 + β1 x1i + ... + βp xpi))^2 / (2σ^2)}
Linear Regression - Types of Loss Functions

Given the form of the likelihood function, maximizing it is equivalent to minimizing the argument of the exponential, i.e.,

min_{β0,...,βp} Σ_{i=1}^n (yi - {β0 + β1 x1i + β2 x2i + ... + βp xpi})^2 = min_{β0,...,βp} Σ_{i=1}^n ei^2

For the linear regression case, the maximum likelihood parameter estimates are equivalent to the least squares parameter estimates.
Linear Regression - Types of Loss Functions

Least Squares Estimation
» loss function is the sum of squared residuals = sum of squared prediction errors

Maximum Likelihood
» loss function is the likelihood function, which in the linear regression case is equivalent to the sum of squared prediction errors

Prediction error = observation - predicted value:

yi - ŷi = yi - {β0 + β1 x1i + β2 x2i + ... + βp xpi}
Loss Functions for Identification

Least Squares - "minimize the sum of squared prediction errors"

The loss function is

Σ_{t=1}^N (y(t) - ŷ(t))^2

where N is the number of points in the data record.
Least Squares Identification Example

Given an ARX(1) process+disturbance model:

y(t) = a1 y(t-1) + b1 u(t-1) + e(t)

the loss function can be written as

Σ_{t=2}^N (y(t) - ŷ(t))^2 = Σ_{t=2}^N (y(t) - {a1 y(t-1) + b1 u(t-1)})^2
Least Squares Identification Example

In matrix form, e = y - Φθ with

y = [y(2); y(3); ...; y(N)],   Φ = [y(1) u(1); y(2) u(2); ...; y(N-1) u(N-1)],   θ = [a1; b1]

and the sum of squared prediction errors is e^T e.
Least Squares Identification Example

The least squares parameter estimates are:

[â1; b̂1] = (Φ^T Φ)^-1 Φ^T [y(2); y(3); ...; y(N)]

Note that the disturbance structure in the ARX model is such that the disturbance contribution appears in the formulation as a white noise additive error --> satisfies the assumptions for this formulation.
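The normal-equations solution can be sketched in a few lines; here simulated ARX(1) data (hypothetical parameters a1 = 0.7, b1 = 0.5, chosen for illustration) are fit with numpy's least squares solver:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 2000
a1, b1 = 0.7, 0.5          # "true" parameters of the simulated plant (illustrative)

u = rng.choice([-1.0, 1.0], size=N)   # PRBS-like input
e = 0.1 * rng.standard_normal(N)      # white-noise disturbance
y = np.zeros(N)
for t in range(1, N):
    y[t] = a1 * y[t-1] + b1 * u[t-1] + e[t]

# Regressor matrix Phi with rows [y(t-1), u(t-1)], target y(t), t = 2..N
Phi = np.column_stack([y[:-1], u[:-1]])
theta, *_ = np.linalg.lstsq(Phi, y[1:], rcond=None)
print(theta)   # estimates of [a1, b1]
```

Because the ARX error is white, these estimates converge to the true values as N grows, which is exactly the consistency property discussed on the following slides.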
Least Squares Identification

• ARX models fit into this framework
• Output Error models -

y(t) = [B(q^-1) / A(q^-1)] q^-(f+1) u(t) + e(t)

or in difference equation form:

A(q^-1) y(t) = B(q^-1) q^-(f+1) u(t) + A(q^-1) e(t)

i.e.,

y(t) = a1 y(t-1) + ... + ap y(t-p) + B(q^-1) q^-(f+1) u(t) + A(q^-1) e(t)

The error term A(q^-1) e(t) violates the least squares assumption of independent errors.

Least Squares Identification

Any process+disturbance model other than the ARX model will not satisfy the structural requirements.

Implications?
» estimators are not consistent - they don't asymptotically tend to the true values of the parameters
» potential for bias
Prediction Error Methods

Choose the parameter estimates to minimize some function of the prediction errors.

For example, for the Output Error model, the prediction error is

ε(t) = y(t) - [B(q^-1) / A(q^-1)] q^-(f+1) u(t)

where the second term is the prediction. Use a numerical optimization routine to obtain the "best" estimates.

Prediction Error Methods

ARX(1) Example -

y(t) = a1 y(t-1) + b1 u(t-1) + e(t)

Use the model to predict one step ahead given past values:

ŷ(t) = a1 y(t-1) + b1 u(t-1)

This "one step ahead predictor" is optimal when e(t) is normally distributed, and can be obtained by taking the conditional expectation of y(t) given information up to and including time t-1. e(t) disappears because it has zero mean and adds no information on average.
Prediction Error Methods

Prediction error for the one step ahead predictor:

ε(t) = y(t) - ŷ(t) = y(t) - {a1 y(t-1) + b1 u(t-1)}

We could obtain parameter estimates to minimize the sum of squared prediction errors:

Σ_{t=2}^N ε(t)^2 = Σ_{t=2}^N (y(t) - ŷ(t))^2

- the same as the Least Squares estimates for this ARX example.

Prediction Error Methods

What happens if we have an ARMAX(1,1) model?

y(t) = a1 y(t-1) + b1 u(t-1) + e(t) + c1 e(t-1)

The one step ahead predictor is:

ŷ(t) = a1 y(t-1) + b1 u(t-1) + c1 e(t-1)

But what is e(t-1)?
» estimate it using the measured y(t-1) and the prediction of y(t-1):

ê(t-1) = y(t-1) - ŷ(t-1) = y(t-1) - {a1 y(t-2) + b1 u(t-2) + c1 ê(t-2)}

Prediction Error Methods

Note that the estimate of e(t-1) depends on e(t-2), which depends on e(t-3), and so forth
» eventually we end up with a dependence on e(0), which is typically assumed to be zero
» "conditional" estimates - conditional on assumed initial values
» can also formulate in a way that avoids conditional estimates
» the impact is typically negligible for large data sets
• during computation, it isn't necessary to solve recursively all the way back to the initial condition
» use the previous prediction to estimate the previous prediction error: ê(t-1) = y(t-1) - ŷ(t-1)
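The conditional recursion above, with ê(0) = 0, can be sketched directly (parameter values are hypothetical, for illustration):

```python
import numpy as np

def armax11_prediction_errors(y, u, a1, b1, c1):
    """One-step prediction errors for an ARMAX(1,1) model,
    conditional on the initial shock estimate e(0) = 0."""
    N = len(y)
    eps = np.zeros(N)      # eps[t] = y(t) - yhat(t); eps[0] stays 0 (initial condition)
    for t in range(1, N):
        # previous prediction error serves as the estimate of e(t-1)
        yhat = a1 * y[t-1] + b1 * u[t-1] + c1 * eps[t-1]
        eps[t] = y[t] - yhat
    return eps

# Simulate data from the same hypothetical model and evaluate the errors
# at the true parameters: eps(t) should then look like the white shocks e(t).
rng = np.random.default_rng(1)
N = 500
a1, b1, c1 = 0.6, 1.0, 0.4
e = 0.1 * rng.standard_normal(N)
u = rng.choice([-1.0, 1.0], size=N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = a1 * y[t-1] + b1 * u[t-1] + e[t] + c1 * e[t-1]

eps = armax11_prediction_errors(y, u, a1, b1, c1)
print(np.mean(eps**2))   # close to the shock variance 0.01
```

The effect of the assumed initial condition decays geometrically (here at rate c1 = 0.4), which is why it is negligible for large data sets.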
Prediction Error Methods

Formulation for the general case - given a process plus disturbance model:

y(t) = G(q^-1) u(t) + H(q^-1) e(t)

we can write

y(t) = G(q^-1) u(t) + (H(q^-1) - 1) e(t) + e(t)

so that the prediction is:

ŷ(t) = G(q^-1) u(t) + (H(q^-1) - 1) e(t)

The random shocks are estimated as

e(t) = H(q^-1)^-1 {y(t) - G(q^-1) u(t)}

Prediction Error Methods

Putting these expressions together yields

ŷ(t) = H(q^-1)^-1 {(H(q^-1) - 1) y(t) + G(q^-1) u(t)}

which is of the form

ŷ(t) = L1(q^-1, θ) y(t) + L2(q^-1, θ) u(t)

The prediction error for use in the estimation loss function is

ε(t) = y(t) - ŷ(t) = y(t) - {L1(q^-1, θ) y(t) + L2(q^-1, θ) u(t)}

Prediction Error Methods

How does this look for a general ARMAX model?

A(q^-1) y(t) = B(q^-1) u(t) + C(q^-1) e(t)

Getting ready for the prediction,

y(t) = (1 - A(q^-1)) y(t) + B(q^-1) u(t) + (C(q^-1) - 1) e(t) + e(t)

we obtain

ŷ(t) = (1 - A(q^-1)) y(t) + B(q^-1) u(t) + (C(q^-1) - 1) e(t)

Prediction Error Methods

Note that the ability to estimate the random shocks depends on the ability to invert C(q^-1)
» invertibility was discussed for moving average disturbances
» ability to express the shocks in terms of present and past outputs - convert to an infinite autoregressive sum

Note that the moving average parameters appear in the denominator of the prediction
» the model is nonlinear in the moving average parameters, and conditionally linear in the others
Likelihood Function Methods

Conditional Likelihood Function
» assume initial conditions for outputs and random shocks
» e.g., for ARX(1), a value for y(0)
» e.g., for ARMAX(1,1), values for y(0), e(0)

General argument - the quantity

y(t) - G(q^-1) u(t) - (H(q^-1) - 1) e(t) = e(t)

is normally distributed with zero mean and known variance, so:
• form the joint distribution for this expression over all times
• find the parameter values that maximize the likelihood

Likelihood Function Methods

Exact Likelihood Function

Note that we can also form an exact likelihood function which includes the initial conditions
» the maximum likelihood estimation procedure estimates the parameters AND the initial conditions
» the exact likelihood function is more complex

In either case, we use a numerical optimization procedure to solve for the maximum likelihood estimates.

Likelihood Function Methods

Final comment -
» derivation of the likelihood function requires convergence of the moving average and autoregressive elements
» moving average --> invertibility
» autoregressive --> stability

Example - the Box-Jenkins model

A(q^-1) y(t) = [B(q^-1) / F(q^-1)] u(t) + [C(q^-1) / D(q^-1)] e(t)

can be re-arranged to yield the random shock:

e(t) = [D(q^-1) / C(q^-1)] {A(q^-1) y(t) - [B(q^-1) / F(q^-1)] u(t)}

- note the inverted MA component C(q^-1) and the inverted AR component F(q^-1).
Outline
• Types of Models
• Model Estimation Methods
• Identifying Model Structure
• Model Diagnostics

Model-Building Strategy

• graphical pre-screening
• select initial model structure
• estimate parameters
• examine model diagnostics
• examine structural diagnostics
• validate model using an additional data set
(modify the model and re-estimate as required)

Example - Debutanizer

Objective - fit a transfer function + disturbance model describing changes in bottoms RVP in response to changes in internal reflux.

Data
– step data
– slow PRBS (switch down, switch up, switch down)
Graphical Pre-Screening

• examine time traces of outputs, inputs, secondary variables
– are there any outliers or major shifts in operation?
• could there be a model in this data?
• engineering assessment
– should there be a model in this data?

Selecting Initial Model Structure

• examine auto- and cross-correlations of output, input
– look for autoregressive, moving average components
• examine spectrum of output
– indication of the order of the process
» first-order
» second-order underdamped - resonance
» second or higher order overdamped

Selecting Initial Model Structure...

• examine correlation estimate of impulse or step response
– available if the input is not a step
– what order is the process?
» 1st order, 2nd order over/underdamped
– size of the time delay

Selecting Initial Model Structure

Time Delays

For a low frequency input signal (e.g., a few steps or a filtered PRBS), examine the transient response for the delay.

For pre-filtered data, examine the cross-correlation plots - where is the first non-zero cross-correlation?

Debutanizer Example

• step response
– indicates settling time ~100 min
– potentially some time delay
– positive gain
– 1st order or overdamped higher-order
• correlation estimate of step response
– indicates time delay of ~4-5 min
– overdamped higher-order
Debutanizer Example - PRBS Test
[Figure: input and output signals over 0-150 min; output #1 spans roughly -0.2 to 0.2, input #1 spans roughly -50 to 50.]

Debutanizer Example - Step Response Test
[Figure: input and output signals over 0-150 min; output #1 rises from 0 to ~0.2 as input #1 steps from ~49 to ~51.]

Debutanizer Example - Correlation Step Response Estimate
[Figure: correlation estimate of the step response over 0-40 min, rising to ~2e-3.]
Debutanizer Example

• process spectrum
– suggests higher-order
• disturbance spectrum
– cut-off behaviour suggests an AR type of disturbance
• initial model
– ARX with delay of 4 or 5
– ARMAX
– Box-Jenkins
– NOT output error - the disturbance isn't white

Debutanizer Example - Process Spectrum Plot
[Figure: frequency response - amplitude and phase (deg) versus frequency (rad/s), 10^-2 to 10^1 rad/s.]

Debutanizer Example - Disturbance Spectrum
[Figure: power spectrum versus frequency (rad/s), 10^-2 to 10^1 rad/s.]
Additional Initial Selection Tests

Singularity Test

Form the data vector

φ(t) = [y(t-1) ... y(t-s) u(t-1) ... u(t-s)]

and its covariance matrix

Cov(φ) = (1/N) Σ_{t=1}^N φ(t) φ(t)^T

The covariance matrix for this vector will be singular if s > model order, and non-singular if s ≤ model order.

Notes:
1. The test was developed for the deterministic model - the results are exact for this case.
2. The test is approximate when random shocks enter the process - the results will depend on the signal-to-noise ratio.
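A sketch of the test in its exact (noise-free) setting: for a hypothetical first-order process, Cov(φ) built with s = 1 lags has full rank, while with s = 2 it becomes rank-deficient.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 400
u = rng.standard_normal(N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = 0.8 * y[t-1] + 0.5 * u[t-1]    # noise-free first-order process (illustrative)

def phi_matrix(y, u, s):
    # rows phi(t) = [y(t-1),...,y(t-s), u(t-1),...,u(t-s)]
    rows = []
    for t in range(s, len(y)):
        rows.append(np.concatenate([y[t-s:t][::-1], u[t-s:t][::-1]]))
    return np.array(rows)

ranks = {}
for s in (1, 2):
    Phi = phi_matrix(y, u, s)
    C = Phi.T @ Phi / len(Phi)           # sample covariance of phi(t)
    ranks[s] = np.linalg.matrix_rank(C, tol=1e-8)
    print(s, ranks[s])
```

With s = 2 the relation y(t-1) = 0.8 y(t-2) + 0.5 u(t-2) makes one column an exact linear combination of the others, so the 4x4 covariance has rank 3; with noise present the smallest eigenvalue is only "small", not zero, hence the signal-to-noise caveat in note 2.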
Pre-Filtering

If the input is not white noise, the cross-correlation does not show the process structure clearly
» autocorrelation in u(t) complicates the structure

Solution - estimate a time series model for the input, and pre-filter using the inverse of this model
– prefilter both the input and the output to ensure consistency

Now estimate the cross-correlations between the filtered input and filtered output
– sharp cut-off --> negligible denominator
– gradual decline --> denominator dynamics

Pre-Filtering

• can also examine cross-correlation plots for an indication of the time delay
– first non-zero lag in the cross-correlation function

Note that differencing, which is used to treat non-stationary disturbances, is a form of pre-filtering
– more on this later...
Outline
• Types of Models
• Model Estimation Methods
• Identifying Model Structure
• Model Diagnostics

Model Diagnostics

Analyze the residuals:
– look for unmodelled trends
» auto-correlation
» cross-correlation with inputs
» spectrum - should be flat
– assess the size of the residual standard error

Wet towel analogy - wring out all the moisture (information) until there is nothing left.

Unmodelled Trends in Residuals

• autocorrelations
– should be statistically zero
• cross-correlations
– between the residuals and the inputs should be zero for lags greater than the numerator order
» i.e., at long lags
– if the cross-correlation between the inputs and past residuals is non-zero, this indicates feedback present in the data (inputs depend on past errors)
» i.e., at negative lags
Debutanizer Example

Consider an ARX(2,2,5) model
– 2 poles, 1 zero, delay of 5

Autocorrelation plots
– no systematic trend in the residuals

Cross-correlation plots
– no systematic relationship between the residuals and the input

Debutanizer Example - Residual Correlation Plots
[Figure: autocorrelation of residuals for output 1, and cross-correlation between input 1 and output 1 residuals, lags -20 to 20.]

Debutanizer Example - Predicted vs. Response
[Figure: measured and simulated model output versus time, 0-150 min.]
Detecting Incorrect Time Delays

If the cross-correlation between the residuals and the input is non-zero for small lags, the time delay is possibly too large
– the additional early transients aren't being modeled because the model assumes nothing is happening yet

Debutanizer Example

Let's choose a delay of 7.

Cross-correlation plot
– indicates significant cross-correlation between the input and the residuals at small positive lags
– the estimate of the time delay is too large

Model Diagnostics

Quantitative Tests
– significance of parameter estimates
– ratio tests - of explained variation

Debutanizer Example
– parameters are all significant
Debutanizer Example - Parameter Estimates

This matrix was created by the command ARX on 11/16 1996 at 11:36
Loss fcn: 5.805e-006   Akaike's FPE: 6.123e-006   Sampling interval: 1

The polynomial coefficients and their standard deviations are:

B (numerator parameters) = 1.0e-003 * [0 0 0 0 0 0.1428 -0.0605]
  standard errors:         1.0e-003 * [0 0 0 0 0 0.0243  0.0272]

A (AR parameters) = [1.0000 -1.3924 0.4303]
  standard errors:  [0      0.0747  0.0697]
Model Diagnostics

Cross-Validation

Use the model to predict the behaviour of a new data set collected under similar circumstances.

Reject the model if the prediction error is large.

Debutanizer Example

Use the initial step test data as a cross-validation data set.

The prediction errors are small, and the trend is predicted quite well.

Conclusion - acceptable model.

Debutanizer Example - Prediction for Validation Data
[Figure: measured and simulated model output versus time for the validation data, 0-150 min, output spanning 0 to ~0.14.]
Debutanizer Example - Residual Correlation Plots for Validation Data
[Figure: autocorrelation of residuals for output 1, and cross-correlation between input 1 and output 1 residuals, lags -20 to 20.]

Outline
• Types of Models
• Model Estimation Methods
• Identifying Model Structure
• Model Diagnostics

Initially...

Use the structure selection methods described earlier.

Once you have estimated several candidate models...
Model Structure Diagnostics

Akaike's Information Criterion (AIC)
– weighted estimation error
» unexplained variation with a term penalizing excess parameters
» analogous to adjusted R^2 for regression
– find the model structure that minimizes the AIC

Akaike's Information Criterion

Definition:

AIC = N log(V(θ̂)) + 2p

where N is the number of data points in the sample, V(θ̂) is the loss function (related to the prediction error / residual sum of squares), and p is the number of parameters.
Akaike's Information Criterion
[Figure: AIC versus number of parameters - the best model is at the minimum.]

Akaike's Final Prediction Error

An attempt to estimate the prediction error when the model is used to predict new outputs:

FPE = [(1 + p/N) / (1 - p/N)] × (residual sum of squares) / N

Goal - choose the model that minimizes the FPE (a balance between the number of parameters and the explained variation).
Minimum Description Length (MDL)

• Another approach - find the "minimum description length" of the data; the measure is based on the loss function plus a penalty for the number of terms:

MDL = V_N (1 + dim(θ) log(N) / N)

• find the description that minimizes this criterion

Cross-Validation

Collect additional data, or partition your data set, and predict the output(s) for the additional input sequence
– poor predictions - modify the model accordingly, re-estimate with the old data and re-validate
– good predictions - use your model!

Note - the cross-validation set should be collected under similar conditions
– operating point, no known disturbances (e.g., feed changes)
Debutanizer Example

Search over a range of ARX model orders and time delays:
– poles: 1-4
– zeros: 1-4
– time delay: 1-6

Examine the mean square error, MDL, AIC and/or FPE
– Matlab generated --> the ARX(2,2,5) model is best

Debutanizer Example
[Figure: % unexplained output variance versus number of parameters; AIC optimal = ARX(3,2,5), MDL optimal = ARX(2,2,5).]
Other methods...

Look for singularity of the "Information Matrix".

Outline
• The Modeling Task
• Types of Models
• Model Building Strategy
• Model Diagnostics
• Identifying Model Structure
• Modeling Non-Stationary Data
• MISO vs. SISO Model Fitting
• Closed-Loop Identification

What is Non-Stationary Data?

Non-stationary disturbances
– exhibit meandering or wandering behaviour
– the mean may appear to be non-zero for periods of time
– the stochastic analogue of an integrating disturbance

Non-stationarity is associated with poles on the unit circle in the disturbance transfer function
» the AR component has one or more roots at 1
Non-Stationary Data
[Figure: four simulated disturbance series, 0-300 samples, with AR parameters 0.3, 0.6, 0.9, and a non-stationary series; the output range grows as the AR parameter approaches 1.]

How can you detect non-stationary data?

Visual
– meandering behaviour

Quantitative
– slowly decaying autocorrelation behaviour
– difference the data
– examine the autocorrelation and partial autocorrelation functions for the differenced data
– evidence of MA or AR indicates a non-stationary, or integrated, MA or AR disturbance
Differencing Data

... is the procedure of putting the data in "delta form".

Start with y(t) and convert to

Δy(t) = y(t) - y(t-1)

– explicitly accounting for the pole on the unit circle
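Differencing and the slow-decay diagnostic can be sketched together: a random walk (one pole on the unit circle, simulated here as a hypothetical example) has a very slowly decaying autocorrelation, while its first difference recovers the white shocks.

```python
import numpy as np

rng = np.random.default_rng(5)
a = rng.standard_normal(1000)
y = np.cumsum(a)           # random walk: integrated white noise
dy = np.diff(y)            # delta form: dy(t) = y(t) - y(t-1)

def acf(x, k):
    """Sample autocorrelation of x at lag k."""
    x = x - x.mean()
    return np.dot(x[:-k], x[k:]) / np.dot(x, x)

print(acf(y, 10))    # near 1: slowly decaying -> non-stationary
print(acf(dy, 10))   # near 0: the differenced series is white
```

Differencing once more (over-differencing) would introduce an artificial unit root, which is the pitfall discussed on the next slides.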
Detecting Non-Stationarity
[Figure: autocorrelation for a non-stationary disturbance (decaying very slowly) and for the differenced disturbance (cutting off quickly), lags 0-12.]

Impact of Over-Differencing

Over-differencing can introduce extra meandering and local trends into the data.

Differencing - "cancels" a pole on the unit circle.

Over-differencing - introduces an artificial unit pole into the data.

Recognizing Over-Differencing

Visual
– more local trends, meandering in the data

Quantitative
– autocorrelation behaviour decays more slowly than for the initial undifferenced data

Estimating Models for Non-Stationary Data

Approaches:

1. Estimate the model using the differenced data.
2. Explicitly incorporate the pole on the unit circle in the disturbance transfer function specification.

Estimating Models from Differenced Data

• prepare the data by differencing BOTH the input and the output
• specify the initial model structure after using graphical, quantitative tools
• estimate and diagnose the model for the differenced data
• convert the model to undifferenced form by multiplying through by (1-q^-1)
• assess predictions on the undifferenced data for the fitting and validation data sets
Differenced Form of Box-Jenkins Model

A(q^-1) Δy(t) = [B(q^-1) / F(q^-1)] q^-(f+1) Δu(t) + [C(q^-1) / D(q^-1)] a(t)

Note - in the time series literature,

∇ = (1 - q^-1) = Δ

is used to denote differencing.
Outline
• Types of Models
• Model Estimation Methods
• Identifying Model Structure
• Model Diagnostics
• Estimating MIMO models

SISO Approach

Estimate the models individually.

Advantage
– simplicity

Disadvantage
– need to reconcile the disturbance models for each input-output channel in order to obtain one disturbance model for the output
– can't assess directionality with respect to the inputs

MISO Approach

Estimate the transfer function models + disturbance model for a single output and all inputs simultaneously.

Advantage
– consistency - obtain one disturbance model directly
– potential to assess directionality

Disadvantage
– complexity - recognizing model structures is more difficult

A Hybrid Approach

• conduct preliminary analysis using the SISO approach
– model structures
– apparent disturbance structure
• estimate the final model using the MISO approach
– must decide on a common disturbance structure
• feasible if the input sequences are independent
Outline
• Types of Models
• Model Estimation Methods
• Identifying Model Structure
• Model Diagnostics
• Closed-loop vs. open-loop estimation

The Closed-Loop Identification Problem
[Block diagram: setpoint SPt enters a comparator (+/-) with output Yt fed back; controller Gc produces Ut; a dither signal Wt is added at the controller output; process Gp produces Yt.]

Where should the input signal be introduced?

Options:

Dither at the controller output
– clearer indication of the process dynamics
– preferred approach

Perturbations in the setpoint
– additional controller dynamics will be included in the estimated model
What do the closed-loop data represent?

Dither signal case, without disturbances:

• open-loop - the input-output data represent

Yt = Gp Wt

• closed-loop - the input-output data represent

Yt = [Gp / (1 + Gp Gc)] Wt

Estimating Models from Closed-Loop Data

Approach #1:

Working with W-Y data, estimate Gp / (1 + Gp Gc) and back out the controller to obtain the process transfer function.
– we already know the controller transfer function

Estimating Models from Closed-Loop Data

Approach #2:

Estimate transfer functions for the process (U -> Y) and for the controller (Y -> U) simultaneously.

Estimating Models from Closed-Loop Data

Approach #3:

Fit the model as in the open-loop case (U -> Y).

Note that

U = [1 / (1 + Gp Gc)] W

so that we are effectively using a filtered input signal.
Some Useful References

Identification case study - paper by Shirt, Harris and Bacon (1994).

Closed-loop identification issues - paper by MacGregor and Fogal.

System identification workshop - paper edited by Barry Cott.