Artificial Neural Networks - LASAR

Artificial Neural Networks

Francesco DI MAIO, Ph.D.
Politecnico di Milano, Department of Energy - Nuclear Division
IEEE - Italian Reliability Chapter (Chair)
STF - China Fellow
francesco.dimaio@polimi.it

Fault Detection, Diagnostics and Prognostics

[Figure: a plant in normal operation receives forcing functions f1, f2 and produces measured signals; the measured signals feed a diagnostic system that performs FAULT DETECTION (early recognition) and FAULT CLASSIFICATION (correct assignment), and a prognostic system that performs PROGNOSIS (lifetime estimation).]

Diagnostic System: Elements

[Figure: processing chain Plant → Sensors (signals s1, s2, …, sn sampled over time T1, T2, …, TM) → Data Pre-Processing → Feature Selection → Classification. From a sliding window of the signals, features such as the mean residual and the maximum and minimum wavelet coefficients are extracted; h < n features f1, …, fh are selected from the n pre-processed signals and fed to the classifier, which outputs the fault type.]

Problem Statement: Challenges

• Complex plants and systems
• Large number of monitored signals with non-linear dependences

→ Difficulties in detecting and classifying faults

We resort to empirical models, rather than to analytical models.

Soft computing techniques:
• Artificial Neural Networks
• Support/Relevance Vector Machines
• Fuzzy Classifiers

What Are (Artificial) Neural Networks?

Artificial Neural Networks are empirical models built by training on input/output data (modeling advantage).

They are multivariate, non-linear interpolators: techniques capable of reconstructing the underlying complex I/O non-linear relations by combining multiple simple functions (modeling & computational advantage).

In an artificial neural network, variables are associated with nodes in a graph.

What Do Artificial Neural Networks Do?

Regression

The observed output is y = ȳ(x) + ε, the sum of a deterministic component ȳ(x) and a stochastic (noise) component ε. The network provides a regression estimate ŷ = f(x, w), used for the prediction ỹ of the output at new inputs x.

Classification

Given labeled patterns (x, c), a discriminant function maps the input x to one of the classes c1, c2, …

The output u of a node is the result of a nonlinear transformation f applied to the input variables i1, i2, …, in transmitted along the links of the graph: u = f(i1, i2, …, in).

What is the mathematical basis behind Artificial Neural Networks?

Neural networks are universal approximators of multivariate non-linear functions.

KOLMOGOROV (1957):
For any real function f continuous on [0,1]^n, n ≥ 2, there exist n(2n+1) functions ψ_{ml} continuous on [0,1] such that

f(x_1, x_2, \ldots, x_n) = \sum_{l=1}^{2n+1} \phi_l \left( \sum_{m=1}^{n} \psi_{ml}(x_m) \right)

where the 2n+1 functions φ_l are real and continuous. Thus, a total of (2n+1) functions φ_l(·) and n(2n+1) functions ψ_{ml}(·) of one variable represent a function of n variables.

CYBENKO (1989):
Let σ(·) be a continuous sigmoidal function. The linear combinations

g(x) = \sum_{j=1}^{N} \alpha_j \, \sigma\left( \sum_{i=1}^{n} w_{ji} x_i + \theta_j \right)

are dense in C([0,1]^n). In other words, any continuous function f on [0,1]^n can be approximated arbitrarily well by a linear combination of sigmoidal functions. Note that N is not specified.
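Cybenko's density result can be illustrated numerically. The sketch below (an illustrative example, not from the slides: the target f(x) = x², the grid size and the slope 200 are arbitrary choices) hand-builds a staircase of steep sigmoids g(x) = Σ_j α_j σ(w_j x + θ_j) that approximates f on [0, 1]:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Cybenko's form: g(x) = sum_j alpha_j * sigmoid(w_j * x + theta_j).
# With steep slopes each sigmoid is nearly a step function, so their
# weighted sum forms a staircase approximation of a continuous f.
def sigmoid_combination(x, alphas, ws, thetas):
    return sum(a * sigmoid(w * x + th) for a, w, th in zip(alphas, ws, thetas))

f = lambda x: x ** 2
grid = np.linspace(0.0, 1.0, 21)
steps = np.diff(f(grid))            # step height f(c_{k+1}) - f(c_k) at each grid point
ws = np.full(20, 200.0)             # steep slope -> near step functions
thetas = -200.0 * grid[:-1]         # centre the k-th sigmoid at grid point c_k

x = np.linspace(0.0, 1.0, 201)
g = sigmoid_combination(x, steps, ws, thetas)
max_err = np.max(np.abs(g - f(x)))
print(f"max |g - f| on [0,1]: {max_err:.3f}")
```

Refining the grid (larger N) shrinks the error, which is exactly the sense in which the combinations are dense; the theorem, however, does not say how large N must be.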

ARTIFICIAL NEURAL NETWORKS: Multilayered Feedforward NN

[Figure: a multilayered feedforward network with INPUT, hidden and OUTPUT layers; the total number of connections between layers is indicated.]

Forward Calculation (input-hidden)

INPUT LAYER:
each k-th node (k = 1, 2, …, n_i) receives the (normalized) value of the k-th component x_k of the input vector x and delivers the same value.

HIDDEN LAYER:
each j-th node (j = 1, 2, …, n_h) receives

\sum_{k=1}^{n_i} w_{jk} x_k + w_{j0}

and delivers

z_j = f\left( \sum_{k=1}^{n_i} w_{jk} x_k + w_{j0} \right)

with f typically sigmoidal.
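The input layer receives *normalized* signal values. A common normalization choice (assumed here; the slides do not specify one) is min-max scaling to [0, 1] using the per-signal ranges observed in the training set:

```python
import numpy as np

# Min-max scaling of each signal to [0, 1] using training-set ranges.
# The two example signals (a valve position in % and a temperature in C)
# are placeholder values, not actual plant data.
def minmax_normalize(X, lo, hi):
    """X: (patterns, signals); lo, hi: per-signal training minima/maxima."""
    return (X - lo) / (hi - lo)

X_train = np.array([[0.2, 150.0],
                    [0.8, 190.0],
                    [0.5, 170.0]])
lo, hi = X_train.min(axis=0), X_train.max(axis=0)
Xn = minmax_normalize(X_train, lo, hi)
print(Xn)
```

The same (lo, hi) pair must be reused at test time; inputs that fall outside it signal an unseen operating condition, a point that matters for the new-transient results at the end of these slides.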

Forward Calculation (hidden-output)

OUTPUT LAYER:
each l-th node (l = 1, 2, …, n_o) receives

\sum_{j=1}^{n_h} w_{lj} z_j + w_{l0}

and delivers

u_l = f\left( \sum_{j=1}^{n_h} w_{lj} z_j + w_{l0} \right)

with f typically linear or sigmoidal.
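The two forward-calculation steps above can be sketched in a few lines of NumPy (an illustrative sketch with a linear output layer; the layer sizes are borrowed from the application later in these slides, and the random weight initialization is an arbitrary choice):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def forward(x, W_hidden, W_output):
    """One forward pass of a multilayered feedforward NN.

    W_hidden : (n_h, n_i + 1) weights w_jk, column 0 holding the biases w_j0
    W_output : (n_o, n_h + 1) weights w_lj, column 0 holding the biases w_l0
    """
    x = np.concatenate(([1.0], x))   # prepend 1 so w_j0 acts as the bias term
    z = sigmoid(W_hidden @ x)        # hidden: z_j = f(sum_k w_jk x_k + w_j0)
    z = np.concatenate(([1.0], z))
    u = W_output @ z                 # output: u_l (linear output layer here)
    return u

rng = np.random.default_rng(0)
n_i, n_h, n_o = 10, 17, 1            # 10 signals, 17 hidden nodes, 1 class output
W_h = rng.normal(scale=0.1, size=(n_h, n_i + 1))
W_o = rng.normal(scale=0.1, size=(n_o, n_h + 1))
u = forward(rng.uniform(size=n_i), W_h, W_o)
print(u.shape)  # (1,)
```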

Setting the NN Parameters: Training Phase

Available input/output patterns:

x1(1), x2(1) | t(1)
x1(2), x2(2) | t(2)
…
x1(p), x2(p) | t(p)
…
x1(np), x2(np) | t(np)

Synaptic weights W

Setting the NN Parameters: Error Backpropagation

• Initialize the weights to random values
• Update the weights so as to minimize the average squared output deviation error (also called Energy Function):

E = \frac{1}{2 n_p n_o} \sum_{p=1}^{n_p} \sum_{l=1}^{n_o} \left( u_{pl} - t_{pl} \right)^2

where t_{pl} is the TRUE l-th output of the p-th pattern and u_{pl} is the computed l-th output of the p-th pattern.

• Minimization by gradient descent algorithms

Error Backpropagation (output-hidden)

Updating weight w_{lj} (output-hidden connections):

\Delta w_{lj}(n) = \frac{\eta}{n_o} \, \delta_l \, u_j^{(h)} + \alpha \, \Delta w_{lj}(n-1)

where η is the learning coefficient, α the momentum, u_j^{(h)} the output of the j-th hidden node, and δ_l the error term of the l-th output node, computed from the deviation between the true and the computed l-th output. Without loss of generality, set n_p = 1 and weight each output term by a user-defined output importance (usually equal to 1).

Error Backpropagation (hidden-input)

Updating weight w_{jk} (hidden-input connections), similarly to the updating of the output-hidden weights:

\Delta w_{jk}(n) = \frac{\eta}{n_o} \, \delta_j \, u_k^{(i)} + \alpha \, \Delta w_{jk}(n-1)

where η is the learning coefficient, α the momentum, u_k^{(i)} the k-th input value and δ_j the error term backpropagated to the j-th hidden node.
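A minimal sketch of online gradient-descent training with momentum, following the update rule Δw(n) = η·δ·input + α·Δw(n−1) above. This is illustrative only: the 1-D regression target sin(x), the layer sizes, η and α are arbitrary choices, and with a single linear output node (n_o = 1) the 1/n_o factor is 1:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Fit y = sin(x) on [0, 3] with 4 sigmoidal hidden nodes, 1 linear output.
rng = np.random.default_rng(1)
xs = np.linspace(0.0, 3.0, 20)
ts = np.sin(xs)
n_h, eta, alpha = 4, 0.05, 0.5
W_h = rng.normal(scale=0.5, size=(n_h, 2))       # column 0 holds the biases w_j0
W_o = rng.normal(scale=0.5, size=(1, n_h + 1))   # column 0 holds the bias w_l0
dW_h, dW_o = np.zeros_like(W_h), np.zeros_like(W_o)

def forward(x):
    z = sigmoid(W_h @ np.array([1.0, x]))
    return float(W_o @ np.concatenate(([1.0], z))), z

mse_before = np.mean([(forward(x)[0] - t) ** 2 for x, t in zip(xs, ts)])

for epoch in range(2000):                        # online (pattern-by-pattern) updates
    for x, t in zip(xs, ts):
        u, z = forward(x)
        delta_o = t - u                                   # output error term
        delta_h = z * (1.0 - z) * (W_o[0, 1:] * delta_o)  # error backpropagated to hidden nodes
        dW_o = eta * delta_o * np.concatenate(([1.0], z))[None, :] + alpha * dW_o
        dW_h = eta * delta_h[:, None] * np.array([1.0, x])[None, :] + alpha * dW_h
        W_o += dW_o
        W_h += dW_h

mse_after = np.mean([(forward(x)[0] - t) ** 2 for x, t in zip(xs, ts)])
print(f"MSE before: {mse_before:.3f}, after: {mse_after:.4f}")
```

The momentum term α·Δw(n−1) smooths the per-pattern updates, which helps gradient descent cross flat regions of the energy surface without oscillating.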

Utilization of the Neural Network

After training:
• Synaptic weights fixed
• New input → retrieval of the information stored in the weights → output

Capabilities:
• Nonlinearity of the sigmoids → the NN can learn nonlinear mappings
• Each node is independent and relies only on local information (its synapses) → parallel processing and fault tolerance

CONCLUSIONS

Advantages:
• No physical/mathematical modelling effort.
• Automatic parameter adjustment through a training phase based on the available input/output data; the adjustment aims at the best interpolation of the functional relation between input and output.

Disadvantages:
• "Black box": difficulties in interpreting the underlying physical model.

Application:

Detection of malfunctions in the secondary system of a nuclear power plant by neural networks

Objective

Build and train a neural network to classify different malfunctions in the secondary system of a boiling water reactor

Boiling Water Reactor: Secondary System

[Figure: schematic of the secondary system showing the vessel, turbine, condenser and pumps.]

Outputs

Class 1: Leakage through the second high-pressure preheater

Class 2: Leakage in the first high-pressure preheater to the drain tank

Class 3: Leakage through the first high-pressure preheater drain back-up valve to the condenser

Class 4: Leakage through high-pressure preheaters bypass valve

Class 5: Leakage through the second high-pressure preheater drain back-up valve to the feedwater tank

Transients

Class 1: Valve EA1
Class 2: Valve EA2
Class 3: Valve VB20
Class 4: Valve VB7
Class 5: Valve VA25

Input variables

Variable | Signal | Unit
1 | Position level for control valve EA1 | %
2 | Position level for control valve EB1 | %
3 | Temperature drain before VB3 | ºC
4 | Temperature feedwater after EA2 train A | ºC
5 | Temperature feedwater after EB2 train B | ºC
6 | Temperature drain 6 after VB1 | ºC
7 | Temperature drain 5 after VB2 | ºC
8 | Position level control valve before EA2 | %
9 | Position level control valve before EB2 | %
10 | Temperature feedwater before EB2 train B | ºC

Sensor positions

Measurement: 36 sampling instants in [80, 290] s, one every 6 s.

[Figure: locations of the ten measured variables (Var 1, 2, 8, 9 in %; Var 3–7, 10 in ºC) on the secondary-system schematic.]

Input/Output patterns

Input: the class assignment is performed dynamically, as a two-step time window of the measured signals shifts to the successive time t+1.

Output: the number of the class to which the signals belong.

Training set

A training set was constructed containing 8 transients for each class. For a transient of a given class, the 35 patterns used to train the network take the following form:

( x1(2), x1(1), x2(2), x2(1), …, x10(2), x10(1) ) | class
…
( x1(t+1), x1(t), x2(t+1), x2(t), …, x10(t+1), x10(t) ) | class
…
( x1(36), x1(35), x2(36), x2(35), …, x10(36), x10(35) ) | class
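The two-step window construction above can be sketched as follows (the transient data here are random placeholders for the real plant measurements):

```python
import numpy as np

# Build the 35 training patterns of one transient: a two-step window over
# the 10 signals at 36 sampling instants, each pattern labelled with the class.
def windowed_patterns(signals, label):
    """signals: (36, 10) array, one row per sampling instant."""
    n_t = signals.shape[0]
    patterns = []
    for t in range(n_t - 1):
        # interleave the values at t+1 and t per signal:
        # ( x1(t+1), x1(t), x2(t+1), x2(t), ..., x10(t+1), x10(t) )
        pair = np.column_stack((signals[t + 1], signals[t])).ravel()
        patterns.append(pair)
    X = np.array(patterns)
    y = np.full(len(patterns), label)
    return X, y

rng = np.random.default_rng(0)
X, y = windowed_patterns(rng.normal(size=(36, 10)), label=1)
print(X.shape, y.shape)  # (35, 20) (35,)
```

With 8 transients per class and 5 classes, stacking the per-transient outputs gives the full training set of 8 × 5 × 35 patterns.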

Network architecture

N_h, η, α optimized by grid optimization.

Optimal values: N_h = 17, η = 0.55, α = 0.6
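The grid optimization above amounts to training one network per (N_h, η, α) combination and keeping the combination with the lowest validation error. In the sketch below, `train_and_validate` is a hypothetical stand-in for the actual train-and-score run, and the candidate grids are arbitrary illustrative choices:

```python
import itertools

# Placeholder scoring function: in practice this would train the NN with the
# given hyperparameters and return its validation error. The toy score below
# is minimized at (17, 0.55, 0.6) purely for illustration.
def train_and_validate(n_h, eta, alpha):
    return abs(n_h - 17) * 0.01 + abs(eta - 0.55) + abs(alpha - 0.6)

grid = itertools.product([5, 11, 17, 23],      # N_h candidates
                         [0.25, 0.55, 0.85],   # eta candidates
                         [0.2, 0.6, 0.9])      # alpha candidates
best = min(grid, key=lambda p: train_and_validate(*p))
print(best)  # (17, 0.55, 0.6)
```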

Test results

[Figure: "Neural Network and True values. TEST DATA" — network output class vs. input pattern (5–35) for test transients of the five classes (panels 3a–3e); the outputs cluster around the true class integers 1 to 5, within roughly ±0.1.]

Comments

The real-valued outputs provided by the neural network closely approximate the true class integer values from early times on. As time progresses, the network output class assignment approaches the true class integer more and more closely.

New transients

Desirable responses of the neural network when a new (unseen) transient is encountered:

• Identification that something anomalous is happening
• Classification as "don't know"

New class

Class 6: closing of the steam line valve (VA6) to the second high-pressure preheater

[Figure: "Neural Network and True values. Class 6" — network output class vs. input pattern (5–35) for two Class 6 transients (panels 4a, 4b); the outputs span classes 1–6 without settling on any of the trained classes.]

New transients

For one transient the output is close to Class 2, but with a larger deviation than for the trained classes; for the other, the value provided by the neural network stays close to 0. The large output error is due to the fact that some of the inputs fed into the network fall outside the ranges of the input values of the training transients.

[Figure: "Temperature drain after VB2" — signal value vs. input pattern (5–35) for the second transient, compared with the upper and lower training levels; the signal falls well outside the training range.]

Conclusions

A neural network has been constructed and trained to classify different malfunctions in the secondary system of a boiling water reactor.

The network uses as input patterns the information provided by signals that are monitored in the plant, and provides as output an estimate of the class assignment.

When a transient belonging to a different class is fed to the network, it is capable of detecting that the pattern does not belong to any of the classes for which it was trained.