CHEE825/436 - Module 4
J. McLellan - Fall 2005

Process and Disturbance Models


Outline

• Types of Models
• Model Estimation Methods
• Identifying Model Structure
• Model Diagnostics


The Task of Dynamic Model Building

partitioning process data into a deterministic component (the process) and a stochastic component (the disturbance)

[Diagram: process data is partitioned into a transfer function model (the process) and a time series model (the disturbance)]


Process Model Types

• non-parametric
  – impulse response
  – step response
  – spectrum
  } technically "parametric" when in finite form (e.g., FIR)

• parametric
  – transfer function models
    » numerator
    » denominator
  – difference equation models
    » equivalent to transfer function models with the backshift operator


Impulse and Step Process Models

described as a set of weights:

impulse model:   y(t) = \sum_{i=0}^{N} h(i)\, u(t-i)

step model:      y(t) = \sum_{i=0}^{N} s(i)\, \Delta u(t-i)

Note - typically treat Δu(t-N) as a step from 0 - i.e., Δu(t-N) = u(t-N)
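As an aside not in the original notes, here is a minimal numerical sketch of predicting an output from a finite set of impulse-response weights; the weights h and the input record u are made-up example values.

    import numpy as np

    # hypothetical impulse-response weights h(0..N) and an input record u(t)
    h = np.array([0.0, 0.2, 0.35, 0.25, 0.1])          # example weights, N = 4
    u = np.random.default_rng(0).normal(size=50)        # example input sequence

    def fir_predict(h, u, t):
        """Predict y(t) = sum_{i=0}^{N} h(i) * u(t-i), assuming u = 0 before t = 0."""
        N = len(h) - 1
        past = [u[t - i] if t - i >= 0 else 0.0 for i in range(N + 1)]
        return float(np.dot(h, past))

    y_hat = [fir_predict(h, u, t) for t in range(len(u))]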


Process Spectrum Model

represented as a set of frequency response values, or graphically

[Plot: amplitude ratio vs. frequency (rad/s)]


Process Transfer Function Models

numerator, denominator dynamics and time delay

G_p(q^{-1}) = \frac{B(q^{-1})\, q^{-(f+1)}}{F(q^{-1})}

• B(q^{-1}) - numerator - zeros
• F(q^{-1}) - denominator - poles
• q^{-(f+1)} - time delay; the extra 1-step delay is introduced by the zero-order hold and sampling, and f is the pure time delay

q^{-1} is the backward shift operator:  q^{-1} y(t) = y(t-1)
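To make the backshift notation concrete, here is a small illustrative simulation (not from the slides) of a first-order process with delay written as a difference equation; the coefficient values and delay are assumed for the example.

    import numpy as np

    # assumed example: y(t) = 0.8 y(t-1) + 0.5 u(t-f-1), i.e. F(q^-1) = 1 - 0.8 q^-1, B(q^-1) = 0.5
    f = 2                                   # pure time delay (samples)
    u = np.zeros(50); u[5:] = 1.0           # step input applied at t = 5
    y = np.zeros(50)
    for t in range(1, 50):
        u_delayed = u[t - f - 1] if t - f - 1 >= 0 else 0.0
        y[t] = 0.8 * y[t - 1] + 0.5 * u_delayed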


Model Types for Disturbances

• non-parametric
  – "impulse response" - infinite moving average
  – spectrum

• parametric
  – "transfer function" form
    » autoregressive (denominator)
    » moving average (numerator)


ARIMA Models for Disturbances

d(t) = \frac{C(q^{-1})}{D(q^{-1})\,(1 - q^{-1})^{d}}\, a(t)

• C(q^{-1}) - moving average component
• D(q^{-1}) - autoregressive component
• a(t) - random shock

AutoRegressive Integrated Moving Average model. Time series notation - an ARIMA(p,d,q) model has
• pth-order denominator - AR
• qth-order numerator - MA
• d integrating poles (on the unit circle)
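A small illustrative sketch (not part of the original notes) of simulating an ARIMA(1,1,1) disturbance by filtering white noise; the polynomial values are assumed for the example.

    import numpy as np

    rng = np.random.default_rng(1)
    a = rng.normal(size=500)                 # random shocks a(t)

    # assumed example polynomials: C(q^-1) = 1 + 0.4 q^-1, D(q^-1) = 1 - 0.7 q^-1, d = 1
    w = np.zeros_like(a)                     # stationary ARMA(1,1) part: D(q^-1) w(t) = C(q^-1) a(t)
    for t in range(1, len(a)):
        w[t] = 0.7 * w[t - 1] + a[t] + 0.4 * a[t - 1]

    d_t = np.cumsum(w)                       # one integrating pole: d(t) = d(t-1) + w(t)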


ARMA Models for Disturbances

d(t) = \frac{C(q^{-1})}{D(q^{-1})}\, a(t)

• C(q^{-1}) - moving average component
• D(q^{-1}) - autoregressive component
• a(t) - random shock

Simply have no integrating component


Typical Model Combinations

• model predictive control
  – impulse/step process model + ARMA disturbance model
    » typically a step disturbance model, which can be considered as a pure integrator driven by a single pulse

• single-loop control
  – transfer function process model + ARMA disturbance model


Classification of Models in Identification

• AutoRegressive with eXogenous inputs (ARX)
• Output Error (OE)
• AutoRegressive Moving Average with eXogenous inputs (ARMAX)
• Box-Jenkins (BJ)
• (per Ljung's terminology)


ARX Models

A(q^{-1})\, y(t) = B(q^{-1})\, q^{-(f+1)} u(t) + a(t)

– u(t) is the exogenous input
– same autoregressive component for process and disturbance
– numerator term for the process, no moving average in the disturbance
– physical interpretation - the disturbance passes through the entire process dynamics
  » e.g., feed disturbance


Output Error Models

y(t) = \frac{B(q^{-1})}{A(q^{-1})}\, q^{-(f+1)} u(t) + a(t)

– no disturbance dynamics
– numerator and denominator process dynamics
– physical interpretation - process subject to a white noise disturbance (is this ever true?)


ARMAX Models

A(q^{-1})\, y(t) = B(q^{-1})\, q^{-(f+1)} u(t) + C(q^{-1})\, a(t)

– process and disturbance have the same denominator dynamics
– disturbance has moving average dynamics
– physical interpretation - disturbance passing through the process, entering at a point away from the input
  » except if C(q^{-1}) = B(q^{-1})


Box-Jenkins Model

– autoregressive component plus input, disturbance can have different dynamics

– AR component A(q-1) represents dynamic elements common to both process and disturbance

– physical interpretation - disturbance passes through other dynamic elements before entering process

A(q^{-1})\, y(t) = \frac{B(q^{-1})}{F(q^{-1})}\, q^{-(f+1)} u(t) + \frac{C(q^{-1})}{D(q^{-1})}\, a(t)


Range of Model Types

least general  →  most general:

Output Error  →  ARX  →  ARMAX  →  Box-Jenkins


Outline

• Types of Models
• Model Estimation Methods
• Identifying Model Structure
• Model Diagnostics


Model Estimation - General Philosophy

Form a “loss function” which is to be minimized to obtain the “best” parameter estimates

Loss function
» "loss" can be considered as missed trend or information
» e.g. - linear regression
  • loss would represent left-over trends in the residuals which could be explained by a model
  • if we picked up all of the trend, only the random noise e(t) would be left
  • additional trends drive up the variation of the residuals
  • the loss function is the sum of squares of the residuals (related to the variance of the residuals)


Linear Regression - Types of Loss Functions

First, consider the linear regression model:

Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + e, \qquad e \sim N(0, \sigma^2)

Least Squares estimation criterion -

\min_{\{\beta_0,\ldots,\beta_p\}} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
  = \min_{\{\beta_0,\ldots,\beta_p\}} \sum_{i=1}^{n} \big(y_i - \{\beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_p x_{pi}\}\big)^2
  = \min_{\{\beta_0,\ldots,\beta_p\}} \sum_{i=1}^{n} e_i^2

where each term is the squared prediction error at point "i".
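As an illustration only (not from the original slides), a minimal numerical sketch of this least squares criterion on synthetic data; the regressors, coefficients, and noise level are assumed for the example.

    import numpy as np

    rng = np.random.default_rng(2)
    n = 100
    x1, x2 = rng.normal(size=n), rng.normal(size=n)
    y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.3, size=n)   # assumed true model plus noise

    X = np.column_stack([np.ones(n), x1, x2])          # regressor matrix with an intercept column
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)   # minimizes the sum of squared prediction errors
    residuals = y - X @ beta_hat
    loss = float(residuals @ residuals)                # sum of squared residuals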


Linear Regression - Types of Loss Functions

The model describes how the mean of Y varies:

E\{Y\} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p

and the variance of Y is \sigma^2, because the random component in Y comes from the additive noise "e". The probability density function at point "i" is

f_{Y_i} = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left\{ -\frac{\big(y_i - (\beta_0 + \beta_1 x_{1i} + \cdots + \beta_p x_{pi})\big)^2}{2\sigma^2} \right\}
        = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left\{ -\frac{e_i^2}{2\sigma^2} \right\}

where e_i is the noise at point "i".


Linear Regression - Types of Loss Functions

We can write the joint probability density function for all observations in the data set:

f_{Y_1 \cdots Y_n} = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\!\left\{ -\frac{\sum_{i=1}^{n} \big(y_i - (\beta_0 + \beta_1 x_{1i} + \cdots + \beta_p x_{pi})\big)^2}{2\sigma^2} \right\}
                   = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\!\left\{ -\frac{\sum_{i=1}^{n} e_i^2}{2\sigma^2} \right\}


Linear Regression - Types of Loss Functions

Given the parameters, we can use f_{Y_1 \cdots Y_n} to determine the probability that a given range of observations will occur.

What if we have observations but don't know the parameters?
» assume that we have the most common, or "likely", observations - i.e., observations that have the greatest probability of occurrence
» find the parameter values that maximize the probability of the observed values occurring
» the joint density function becomes a "likelihood function"
» the parameter estimates are "maximum likelihood estimates"


Linear Regression - Types of Loss Functions

Maximum Likelihood Parameter Estimation Criterion -

\max_{\beta_i,\; i=1,\ldots,p} L(\beta)
  = \max_{\beta_i,\; i=1,\ldots,p} \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\!\left\{ -\frac{\sum_{i=1}^{n} \big(y_i - (\beta_0 + \beta_1 x_{1i} + \cdots + \beta_p x_{pi})\big)^2}{2\sigma^2} \right\}


Linear Regression - Types of Loss Functions

Given the form of the likelihood function, maximizing is equivalent to minimizing the argument of the exponential, i.e.,

For the linear regression case, the maximum likelihood parameter estimates are equivalent to the least squares parameter estimates.

\min_{\{\beta_0,\ldots,\beta_p\}} \sum_{i=1}^{n} \big(y_i - \{\beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_p x_{pi}\}\big)^2
  = \min_{\{\beta_0,\ldots,\beta_p\}} \sum_{i=1}^{n} e_i^2


Linear Regression - Types of Loss Functions

Least Squares Estimation
» loss function is the sum of squared residuals = sum of squared prediction errors

Maximum Likelihood
» loss function is the likelihood function, which in the linear regression case is equivalent to the sum of squared prediction errors

Prediction Error = observation - predicted value

y_i - \hat{y}_i = y_i - \{\beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_p x_{pi}\}


Loss Functions for Identification

Least Squares

“minimize the sum of squared prediction errors”

The loss function is

\sum_{t=1}^{N} \big(y(t) - \hat{y}(t)\big)^2

where N is the number of points in the data record.


Least Squares Identification Example

Given an ARX(1) process + disturbance model:

y(t) = a_1\, y(t-1) + b_1\, u(t-1) + e(t)

the loss function can be written as

\sum_{t=2}^{N} \big(y(t) - \hat{y}(t)\big)^2 = \sum_{t=2}^{N} \big(y(t) - \{a_1\, y(t-1) + b_1\, u(t-1)\}\big)^2


Least Squares Identification Example

In matrix form,

e = \begin{bmatrix} y(2) \\ y(3) \\ \vdots \\ y(N) \end{bmatrix}
  - \begin{bmatrix} y(1) & u(1) \\ y(2) & u(2) \\ \vdots & \vdots \\ y(N-1) & u(N-1) \end{bmatrix}
    \begin{bmatrix} a_1 \\ b_1 \end{bmatrix}

and the sum of squared prediction errors is e^T e.


Least Squares Identification Example

The least squares parameter estimates are:

Note that the disturbance structure in the ARX model is such that the disturbance contribution appears in the formulation as a white noise additive error --> satisfies assumptions for this formulation.

\begin{bmatrix} \hat{a}_1 \\ \hat{b}_1 \end{bmatrix}
  = (\Phi^T \Phi)^{-1} \Phi^T \begin{bmatrix} y(2) \\ y(3) \\ \vdots \\ y(N) \end{bmatrix}

where \Phi is the regressor matrix defined on the previous slide.
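For illustration only (not from the original notes), a sketch of computing these ARX(1) least squares estimates with numpy; the data are simulated from assumed "true" parameters a1 = 0.7, b1 = 0.25.

    import numpy as np

    rng = np.random.default_rng(3)
    N = 200
    u = rng.normal(size=N)
    y = np.zeros(N)
    for t in range(1, N):                        # simulate assumed ARX(1): y(t) = 0.7 y(t-1) + 0.25 u(t-1) + e(t)
        y[t] = 0.7 * y[t - 1] + 0.25 * u[t - 1] + 0.1 * rng.normal()

    Phi = np.column_stack([y[:-1], u[:-1]])      # rows [y(t-1), u(t-1)] for t = 2..N
    Y = y[1:]
    theta_hat = np.linalg.solve(Phi.T @ Phi, Phi.T @ Y)   # [a1_hat, b1_hat] = (Phi' Phi)^-1 Phi' Y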


Least Squares Identification

• ARX models fit into this framework
• Output Error models -

y(t) = \frac{B(q^{-1})}{A(q^{-1})}\, q^{-(f+1)} u(t) + e(t)

or in difference equation form:

A(q^{-1})\, y(t) = B(q^{-1})\, q^{-(f+1)} u(t) + A(q^{-1})\, e(t)

or

y(t) = a_1\, y(t-1) + \cdots + a_p\, y(t-p) + B(q^{-1})\, q^{-(f+1)} u(t) + A(q^{-1})\, e(t)

The error term A(q^{-1}) e(t) violates the least squares assumption of independent errors.


Least Squares Identification

Any process+disturbance model other than the ARX model will not satisfy the structural requirements.

Implications?
» estimators are not consistent - they don't asymptotically tend to the true values of the parameters
» potential for bias


Prediction Error Methods

Choose parameter estimates to minimize some function of the prediction errors.

For example, for the Output Error Model, we have

Use a numerical optimization routine to obtain “best” estimates.

\varepsilon(t) = y(t) - \frac{B(q^{-1})}{A(q^{-1})}\, q^{-(f+1)} u(t)

where \varepsilon(t) is the prediction error and the second term on the right-hand side is the prediction.


Prediction Error Methods

ARX(1) Example -

y(t) = a_1\, y(t-1) + b_1\, u(t-1) + e(t)

Use the model to predict one step ahead given past values:

\hat{y}(t) = a_1\, y(t-1) + b_1\, u(t-1) \qquad \text{"one step ahead predictor"}

This is an optimal predictor when e(t) is normally distributed, and can be obtained by taking the "conditional expectation" of y(t) given information up to and including time t-1. e(t) disappears because it has zero mean and adds no information on average.


Prediction Error Methods

Prediction Error for the one step ahead predictor:

\varepsilon(t) = y(t) - \hat{y}(t) = y(t) - \{a_1\, y(t-1) + b_1\, u(t-1)\}

We could obtain parameter estimates to minimize the sum of squared prediction errors:

\sum_{t=2}^{N} \varepsilon(t)^2 = \sum_{t=2}^{N} \big(y(t) - \hat{y}(t)\big)^2

- same as the Least Squares estimates for this ARX example


Prediction Error Methods

What happens if we have an ARMAX(1,1) model?

y(t) = a_1\, y(t-1) + b_1\, u(t-1) + e(t) + c_1\, e(t-1)

The one step ahead predictor is:

\hat{y}(t) = a_1\, y(t-1) + b_1\, u(t-1) + c_1\, e(t-1)

But what is e(t-1)?
» estimate it using the measured y(t-1) and the prediction of y(t-1):

\hat{e}(t-1) = y(t-1) - \hat{y}(t-1)
            = y(t-1) - \{a_1\, y(t-2) + b_1\, u(t-2) + c_1\, \hat{e}(t-2)\}
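Not part of the original notes: a minimal sketch of computing the one-step-ahead prediction errors recursively for an ARMAX(1,1) model at a given parameter guess, with the initial prediction error set to zero (the "conditional" assumption discussed on the next slide); the data and parameter values are made-up examples.

    import numpy as np

    def armax11_prediction_errors(y, u, a1, b1, c1):
        """Recursively compute eps(t) = y(t) - yhat(t) for an ARMAX(1,1) model,
        assuming the initial prediction error is zero."""
        eps = np.zeros(len(y))
        for t in range(1, len(y)):
            y_hat = a1 * y[t - 1] + b1 * u[t - 1] + c1 * eps[t - 1]
            eps[t] = y[t] - y_hat
        return eps

    # example use with made-up data and a parameter guess
    rng = np.random.default_rng(4)
    u = rng.normal(size=300)
    y = rng.normal(size=300)                  # placeholder output record
    loss = np.sum(armax11_prediction_errors(y, u, 0.6, 0.2, 0.3)[1:] ** 2)

A numerical optimizer would then adjust (a1, b1, c1) to minimize this loss.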


Prediction Error Methods

Note that the estimate of e(t-1) depends on e(t-2), which depends on e(t-3), and so forth
» eventually we end up with a dependence on e(0), which is typically assumed to be zero
» "conditional" estimates - conditional on the assumed initial values
» can also formulate in a way that avoids conditional estimates
» the impact is typically negligible for large data sets
  • during computation, it isn't necessary to solve recursively all the way back to the original condition
» use the previous prediction to estimate the previous prediction error:

\hat{e}(t-1) = y(t-1) - \hat{y}(t-1)


Prediction Error Methods

Formulation for General Case - given a process plus disturbance model:

y(t) = G(q^{-1})\, u(t) + H(q^{-1})\, e(t)

we can write

y(t) = G(q^{-1})\, u(t) + (H(q^{-1}) - 1)\, e(t) + e(t)

so that the prediction is:

\hat{y}(t) = G(q^{-1})\, u(t) + (H(q^{-1}) - 1)\, e(t)

The random shocks are estimated as

e(t) = H(q^{-1})^{-1} \{ y(t) - G(q^{-1})\, u(t) \}


Prediction Error Methods

Putting these expressions together yields

\hat{y}(t) = H(q^{-1})^{-1} \{ (H(q^{-1}) - 1)\, y(t) + G(q^{-1})\, u(t) \}

which is of the form

\hat{y}(t) = L_1(q^{-1}, \theta)\, y(t) + L_2(q^{-1}, \theta)\, u(t)

The prediction error for use in the estimation loss function is

\varepsilon(t, \theta) = y(t) - \hat{y}(t) = y(t) - \{ L_1(q^{-1}, \theta)\, y(t) + L_2(q^{-1}, \theta)\, u(t) \}


Prediction Error Methods

How does this look for a general ARMAX model?

A(q^{-1})\, y(t) = B(q^{-1})\, u(t) + C(q^{-1})\, e(t)

Getting ready for the prediction,

y(t) = (1 - A(q^{-1}))\, y(t) + B(q^{-1})\, u(t) + (C(q^{-1}) - 1)\, e(t) + e(t)

we obtain

\hat{y}(t) = (1 - A(q^{-1}))\, y(t) + B(q^{-1})\, u(t) + (C(q^{-1}) - 1)\, e(t)


Prediction Error Methods

Note that the ability to estimate the random shocks depends on the ability to invert C(q^{-1})
» invertibility was discussed for moving average disturbances
» ability to express the shocks in terms of present and past outputs - convert to an infinite autoregressive sum

Note that the moving average parameters appear in the denominator of the prediction
» the model is nonlinear in the moving average parameters, and conditionally linear in the others


Likelihood Function Methods

Conditional Likelihood Function
» assume initial conditions for outputs and random shocks
» e.g., for ARX(1), a value for y(0)
» e.g., for ARMAX(1,1), values for y(0), e(0)

General argument -

y(t) - G(q^{-1})\, u(t) - (H(q^{-1}) - 1)\, e(t) = e(t)  \quad \text{(normally distributed, zero mean, known variance)}

• form the joint distribution for this expression over all times
• find the optimal parameter values to maximize the likelihood


Likelihood Function Methods

Exact Likelihood Function

Note that we can also form an exact likelihood function which includes the initial conditions

» maximum likelihood estimation procedure estimates parameters AND initial conditions

» exact likelihood function is more complex

In either case, we use a numerical optimization procedure to solve for the maximum likelihood estimates.


Likelihood Function Methods

Final Comment -
» derivation of the likelihood function requires convergence of the moving average and autoregressive elements
» moving average --> invertibility
» autoregressive --> stability

Example - Box-Jenkins model:

A(q^{-1})\, y(t) = \frac{B(q^{-1})}{F(q^{-1})}\, u(t) + \frac{C(q^{-1})}{D(q^{-1})}\, e(t)

can be re-arranged to yield the random shock

e(t) = \frac{D(q^{-1})}{C(q^{-1})} \left\{ A(q^{-1})\, y(t) - \frac{B(q^{-1})}{F(q^{-1})}\, u(t) \right\}

(1/C(q^{-1}) is the inverted MA component; 1/F(q^{-1}) is the inverted AR component)


Outline

• Types of Models
• Model Estimation Methods
• Identifying Model Structure
• Model Diagnostics


Model-Building Strategy

• graphical pre-screening
• select initial model structure
• estimate parameters
• examine model diagnostics
• examine structural diagnostics
• validate model using additional data set
  } modify model and re-estimate as required


Example - Debutanizer

Objective - fit a transfer function + disturbance model describing changes in bottoms RVP in response to changes in internal reflux

Data
– step data
– slow PRBS (switch down, switch up, switch down)


Graphical Pre-Screening

• examine time traces of outputs, inputs, secondary variables
  – are there any outliers or major shifts in operation?
• could there be a model in this data?
• engineering assessment
  – should there be a model in this data?


Selecting Initial Model Structure

• examine auto- and cross-correlations of output, input
  – look for autoregressive, moving average components
• examine spectrum of output
  – indication of the order of the process
    » first-order
    » second-order underdamped - resonance
    » second or higher order overdamped


Selecting Initial Model Structure...

• examine correlation estimate of the impulse or step response
  – available if the input is not a step
  – what order is the process?
    » 1st order, 2nd order over/underdamped
  – size of the time delay


Selecting Initial Model Structure

Time Delays

For low frequency input signal (e.g., few steps or filtered PRBS), examine transient response for delay

For pre-filtered data, examine cross-correlation plots - where is first non-zero cross-correlation?


Debutanizer Example

• step response
  – indicates settling time ~100 min
  – potentially some time delay
  – positive gain
  – 1st order or overdamped higher-order
• correlation estimate of step response
  – indicates time delay of ~4-5 min
  – overdamped higher-order


Debutanizer Example - PRBS Test

[Figure: PRBS test - input and output signals; Output #1 (top panel) and Input #1 (bottom panel) vs. time]


Debutanizer Example - Step Response Test

[Figure: step response test - input and output signals; Output #1 (top panel) and Input #1 (bottom panel) vs. time]


Debutanizer Example - Correlation Step Response Estimate

[Figure: correlation estimate of the step response (×10^-3) vs. time]


Debutanizer Example

• process spectrum
  – suggests higher-order
• disturbance spectrum
  – cut-off behaviour suggests an AR type of disturbance
• initial model
  – ARX with delay of 4 or 5
  – ARMAX
  – Box-Jenkins
  – NOT output error - the disturbance isn't white


Debutanizer Example - Process Spectrum Plot

[Figure: process frequency response - amplitude and phase (deg) vs. frequency (rad/s)]


Debutanizer Example - Disturbance Spectrum

[Figure: disturbance power spectrum vs. frequency (rad/s)]


Additional Initial Selection Tests


Singularity Test

Form the data vector

\varphi(t) = [\, y(t-1)\ \cdots\ y(t-s)\ \ u(t-1)\ \cdots\ u(t-s) \,]

The covariance matrix for this vector,

Cov(\varphi) = \frac{1}{N} \sum_{t=1}^{N} \varphi(t)\, \varphi(t)^T

will be singular if s > model order, and non-singular if s ≤ model order.

Notes:

1. The test was developed for a deterministic model – results are exact for that case

2. The test is approximate when random shocks enter the process – results will depend on the signal-to-noise ratio
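Illustrative only (not in the original notes): a sketch of the singularity test that builds φ(t) from lagged data and checks the conditioning of its covariance matrix; the data y, u and the lags s are assumed example values.

    import numpy as np

    def phi_covariance(y, u, s):
        """Covariance of phi(t) = [y(t-1)..y(t-s), u(t-1)..u(t-s)], averaged over the record."""
        rows = []
        for t in range(s, len(y)):
            rows.append(np.concatenate([y[t - s:t][::-1], u[t - s:t][::-1]]))
        Phi = np.array(rows)
        return Phi.T @ Phi / len(rows)

    rng = np.random.default_rng(5)
    u = rng.normal(size=500)
    y = np.zeros(500)
    for t in range(1, 500):                       # assumed noise-free first-order example process
        y[t] = 0.8 * y[t - 1] + 0.5 * u[t - 1]

    for s in (1, 2, 3):
        cond = np.linalg.cond(phi_covariance(y, u, s))
        print(s, cond)                            # condition number jumps once s exceeds the model order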


Pre-Filtering

If input is not white noise, cross-correlation does not show process structure clearly

» autocorrelation in u(t) complicates structure

Solution - estimate a time series model for the input, and pre-filter using the inverse of this model
– prefilter both the input and the output to ensure consistency

Now estimate cross-correlations between the filtered input and filtered output (a sketch follows below)
– look for a sharp cut-off - negligible denominator dynamics
– gradual decline - denominator dynamics
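A minimal prewhitening sketch (not from the slides), assuming an AR model fitted to the input by least squares and scipy's lfilter for the filtering; the data, model orders, and coefficients are made up for the example.

    import numpy as np
    from scipy.signal import lfilter

    def fit_ar(x, p):
        """Least squares fit of an AR(p) model x(t) = a1 x(t-1) + ... + ap x(t-p) + e(t)."""
        X = np.column_stack([x[p - i - 1:len(x) - i - 1] for i in range(p)])
        a, *_ = np.linalg.lstsq(X, x[p:], rcond=None)
        return a

    rng = np.random.default_rng(6)
    u = lfilter([1.0], [1.0, -0.8], rng.normal(size=1000))   # example coloured input
    y = lfilter([0.0, 0.5], [1.0, -0.7], u)                  # example process output

    a = fit_ar(u, 2)
    inv_filter = np.concatenate([[1.0], -a])                 # 1 - a1 q^-1 - a2 q^-2
    u_f = lfilter(inv_filter, [1.0], u)                      # prewhitened input
    y_f = lfilter(inv_filter, [1.0], y)                      # output filtered with the same inverse model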


Pre-Filtering

• can also examine cross-correlation plots for an indication of the time delay
  – first non-zero lag in the cross-correlation function

Note that differencing, which is used to treat non-stationary disturbances, is a form of pre-filtering
– more on this later...


Outline

• Types of Models
• Model Estimation Methods
• Identifying Model Structure
• Model Diagnostics


Model Diagnostics

Analyze residuals:

– look for unmodelled trends
  » auto-correlation
  » cross-correlation with inputs
  » spectrum - should be flat
– assess size of the residual standard error

Wet towel analogy - wring out all moisture (information) until there is nothing left


Unmodelled Trends in Residuals

• autocorrelations
  – should be statistically zero
• cross-correlations
  – between the residuals and the inputs, should be zero for lags greater than the numerator order
    » i.e., at long lags
  – if the cross-correlation between the inputs and past residuals is non-zero, this indicates feedback is present in the data (inputs depend on past errors)
    » i.e., at negative lags

(a computational sketch of these checks follows below)
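Illustrative only (not in the original notes): a sketch of estimating the residual autocorrelation and the residual-input cross-correlation, with the usual ±2/√N significance bounds; the residual and input series are made-up examples.

    import numpy as np

    def xcorr(a, b, max_lag):
        """Sample cross-correlation between a and b for lags -max_lag..max_lag (lag k: a(t) vs b(t-k))."""
        a = (a - a.mean()) / a.std()
        b = (b - b.mean()) / b.std()
        n = len(a)
        return {k: float(np.mean(a[max(k, 0):n + min(k, 0)] * b[max(-k, 0):n - max(k, 0)]))
                for k in range(-max_lag, max_lag + 1)}

    rng = np.random.default_rng(7)
    resid = rng.normal(size=400)           # example residual series
    u = rng.normal(size=400)               # example input series

    bound = 2.0 / np.sqrt(len(resid))      # approximate 95% significance bound
    r_auto = xcorr(resid, resid, 20)       # should sit inside the bound at all non-zero lags
    r_cross = xcorr(resid, u, 20)          # non-zero at positive lags suggests unmodelled process dynamics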


Debutanizer Example

Consider an ARX(2,2,5) model
– 2 poles, 1 zero, delay of 5

Autocorrelation plots
– no systematic trend in the residuals

Cross-correlation plots
– no systematic relationship between the residuals and the input


Debutanizer Example - Residual Correlation Plots

[Figure: autocorrelation of residuals for output 1 (top) and cross-correlation between input 1 and output 1 residuals (bottom), lags -20 to 20]


Debutanizer Example - Predicted vs. Response

[Figure: measured and simulated model output vs. time]


Detecting Incorrect Time Delays

If cross-correlation between residual and input is non-zero for small lags, the time delay is possibly too large

– additional early transients aren’t being modeled because model assumes nothing is happening


Debutanizer Example

Let’s choose a delay of 7

Cross-correlation plot
– indicates significant cross-correlation between input and output at positive lags
– the estimate of the time delay is too large


Model Diagnostics

Quantitative Tests

– significance of parameter estimates
– ratio tests - of explained variation

Debutanizer Example
– parameters are all significant


Debutanizer Example - Parameter Estimates

This matrix was created by the command ARX on 11/16 1996 at 11:36
Loss fcn: 5.805e-006   Akaike's FPE: 6.123e-006   Sampling interval 1

The polynomial coefficients and their standard deviations are

B (numerator parameters; second row gives the standard error of each parameter) =
  1.0e-003 *
    0   0   0   0   0   0.1428   -0.0605
    0   0   0   0   0   0.0243    0.0272

A (AR parameters; second row gives the standard error of each parameter) =
    1.0000   -1.3924   0.4303
    0         0.0747   0.0697


Model Diagnostics

Cross-Validation

Use model to predict behaviour of a new data set collected under similar circumstances

Reject model if prediction error is large


Debutanizer Example

Use initial step test data as a cross-validation data set.

Prediction errors are small, and trend is predicted quite well

Conclusion - acceptable model


Debutanizer Example - Prediction for Validation Data

[Figure: measured and simulated model output vs. time for the validation (step test) data]


Debutanizer Example - Residual Correlation Plots for Validation Data

[Figure: autocorrelation of residuals for output 1 (top) and cross-correlation between input 1 and output 1 residuals (bottom) for the validation data, lags -20 to 20]


Outline

• Types of Models
• Model Estimation Methods
• Identifying Model Structure
• Model Diagnostics


Initially...

Use the structure selection methods described earlier.

Once you have estimated several candidate models...


Model Structure Diagnostics

Akaike’s Information Criterion (AIC)

– weighted estimation error
  » unexplained variation with a term penalizing excess parameters
  » analogous to adjusted R² for regression
– find the model structure that minimizes the AIC


Akaike’s Information Criterion

Definition

AIC = N \log\big( V_N(\hat{\theta}) \big) + 2p

where N is the number of data points in the sample, V_N(\hat{\theta}) is related to the prediction error (the residual sum of squares), and p is the number of parameters.


Akaike’s Information Criterion

[Plot: AIC vs. number of parameters - the best model is at the minimum]


Akaike’s Final Prediction Error

An attempt to estimate prediction error when model is used to predict new outputs

Goal - choose model that minimizes FPE (balance between number of parameters and explained variation)

FPE = \frac{1 + p/N}{1 - p/N} \cdot \frac{1}{N}\,(\text{residual sum of squares})


Minimum Description Length (MDL)

• Another approach - find “minimum length description” of data - measure is based on loss function + penalty for terms

• find description that minimizes this criterion

V_N(\hat{\theta}) + \dim(\theta)\,\frac{\log(N)}{N}
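For illustration only (not from the notes), a sketch of computing AIC-, FPE-, and MDL-type structure selection measures from a residual sum of squares; the numbers are assumed, and the exact constant factors vary between references, so treat this as the general pattern rather than the definitive formulas.

    import numpy as np

    def structure_criteria(rss, N, p):
        """AIC-, FPE-, and MDL-type measures for a model with p parameters fit to N points (illustrative forms)."""
        V = rss / N                                  # loss function (mean squared prediction error)
        aic = N * np.log(V) + 2 * p
        fpe = V * (1 + p / N) / (1 - p / N)
        mdl = V + p * np.log(N) / N                  # penalty grows with log(N); this particular form is assumed
        return aic, fpe, mdl

    # assumed example: compare two candidate structures on the same data record
    print(structure_criteria(rss=5.8e-6 * 150, N=150, p=4))
    print(structure_criteria(rss=5.5e-6 * 150, N=150, p=6))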


Cross-Validation

Collect additional data, or partition your data set, and predict the output(s) for the additional input sequence
– poor predictions - modify the model accordingly, re-estimate with the old data and re-validate
– good predictions - use your model!

Note - the cross-validation set should be collected under similar conditions
– operating point, no known disturbances (e.g., feed changes)


Debutanizer Example

Search over a range of ARX model orders and time delay:

poles: 1-4

zeros: 1-4

time delay: 1-6

Examine mean square error, MDL, AIC and/or FPE

- Matlab generated -> ARX(2,2,5) model is best


Debutanizer Example

[Figure: Model Fit vs. # of parameters - % unexplained output variance vs. number of parameters; AIC optimal: ARX(3,2,5), MDL optimal: ARX(2,2,5)]


Other methods...

Look for Singularity of the “Information Matrix”


Outline

• The Modeling Task
• Types of Models
• Model Building Strategy
• Model Diagnostics
• Identifying Model Structure
• Modeling Non-Stationary Data
• MISO vs. SISO Model Fitting
• Closed-Loop Identification


What is Non-Stationary Data?

Non-stationary disturbances
– exhibit meandering or wandering behaviour
– mean may appear to be non-zero for periods of time
– stochastic analogue of an integrating disturbance

Non-stationarity is associated with poles on the unit circle in the disturbance transfer function
» the AR component has one or more roots at 1


Non-Stationary Data

[Figure: simulated disturbance outputs vs. time for AR parameters of 0.3, 0.6, and 0.9, and for a non-stationary case]


How can you detect non-stationary data?

Visual
– meandering behaviour

Quantitative
– slowly decaying autocorrelation behaviour
– difference the data
– examine the autocorrelation and partial autocorrelation functions for the differenced data
– evidence of MA or AR indicates a non-stationary, or integrated, MA or AR disturbance


Differencing Data

… is the procedure of putting the data in “delta form”

Start with y(t) and convert to

\Delta y(t) = y(t) - y(t-1)

– explicitly accounting for the pole on the unit circle
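A short illustrative sketch (not in the notes) of differencing a data record before identification; the measurements are made-up values.

    import numpy as np

    y = np.array([10.0, 10.2, 10.7, 11.1, 10.9])   # example measurements
    dy = np.diff(y)                                # delta form: dy[t] = y(t) - y(t-1)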


Detecting Non-Stationarity

[Figure: autocorrelation for the non-stationary disturbance (top) and for the differenced disturbance (bottom)]


Impact of Over-Differencing

Over-differencing can introduce extra meandering and local trends into data

Differencing - “cancels” pole on unit circle

Over-differencing - introduces artificial unit pole into data


Recognizing Over-Differencing

Visual
– more local trends, meandering in the data

Quantitative
– autocorrelation behaviour decays more slowly than for the initial undifferenced data


Estimating Models for Non-Stationary Data

Approaches

Estimate the model using the differenced data

Explicitly incorporate the pole on the unit circle in the disturbance transfer function specification


Estimating Models from Differenced Data

• Prepare the data by differencing BOTH the input and the output

• Specify initial model structure after using graphical, quantitative tools

• Estimate, diagnose model for the differenced data
• Convert the model to undifferenced form by multiplying through by (1 - q^{-1})
• Assess predictions on the undifferenced data for the fitting and validation data sets


Differenced Form of Box-Jenkins Model

A(q^{-1})\, \Delta y(t) = \frac{B(q^{-1})}{F(q^{-1})}\, q^{-(f+1)} \Delta u(t) + \frac{C(q^{-1})}{D(q^{-1})}\, a(t)

Note - in the time series literature, \nabla = (1 - q^{-1}) = \Delta is used to denote differencing.


Outline

• Types of Models
• Model Estimation Methods
• Identifying Model Structure
• Model Diagnostics
• Estimating MIMO models


SISO Approach

Estimate models individually

Advantage
– simplicity

Disadvantage
– need to reconcile the disturbance models for each input-output channel in order to obtain one disturbance model for the output
– can't assess directionality with respect to the inputs


MISO Approach

Estimate the transfer function models + disturbance model for a single output and all inputs simultaneously

Advantage
– consistency - obtain one disturbance model directly
– potential to assess directionality

Disadvantage
– complexity - recognizing model structures is more difficult


A Hybrid Approach

• conduct preliminary analysis using the SISO approach
  – model structures
  – apparent disturbance structure
• estimate the final model using the MISO approach
  – must decide on a common disturbance structure
• feasible if the input sequences are independent


Outline

• Types of Models
• Model Estimation Methods
• Identifying Model Structure
• Model Diagnostics
• Closed-loop vs. open-loop estimation


The Closed-Loop Identification Problem

[Block diagram: setpoint SP_t and the fed-back output Y_t enter a comparator (+/-), the controller G_c produces U_t, a dither signal W_t is added at the controller output, and the process G_p produces the output Y_t]


Where should the input signal be introduced?

Options:

Dither at the controller output
– clearer indication of the process dynamics
– preferred approach

Perturbations in the setpoint
– additional controller dynamics will be included in the estimated model


What do the closed-loop data represent?

• dither signal case, without disturbances

• open-loop
  – the input-output data represent  Y_t = G_p W_t

• closed-loop
  – the input-output data represent  Y_t = \frac{G_p}{1 + G_p G_c}\, W_t


Estimating Models from Closed-Loop Data

Approach #1:

Working with W-Y data, estimate

\frac{G_p}{1 + G_p G_c}

and back out the controller to obtain the process transfer function.
– we already know the controller transfer function


Estimating Models from Closed-Loop Data

Approach #2:

Estimate transfer functions for the process

(U ->Y), and for the controller (Y->U) simultaneously.


Estimating Models from Closed-Loop Data

Approach #3:

Fit the model as in the open-loop case (U->Y).

Note that

U = \frac{1}{1 + G_p G_c}\, W

so that we are effectively using a filtered input signal.


Some Useful References

Identification Case Study - paper by Shirt, Harris and Bacon (1994).

Closed-Loop Identification - issues - paper by MacGregor and Fogal.

System Identification Workshop - paper edited by Barry Cott.