Artificial Neural Networks - LASAR
Francesco DI MAIO, Ph.D.,
Politecnico di Milano
Department of Energy - Nuclear Division
IEEE - Italian Reliability Chapter (Chair)
STF - China Fellow
Fault Detection, Diagnostics and Prognostics
[Figure sequence: a plant under forcing functions f1, f2 produces measured signals; against the normal operation baseline, these feed a diagnostic system performing FAULT DETECTION (early recognition) and FAULT CLASSIFICATION (correct assignment), and a prognostic system performing PROGNOSIS (lifetime estimation).]
Diagnostic System: Elements
[Figure: processing pipeline Plant → Sensors (signals s1, s2, …, s_ns) → Data Pre-Processing → Feature Selection (e.g., mean residual, maximum and minimum wavelet coefficients computed over a sliding time window; h < n features retained, f1, f2, …) → Classification (fault type).]
Problem Statement: Challenges
• Complex Plants and Systems
• Large number of monitored signals with non-linear dependences
Difficulties in Detecting and Classifying Faults
We resort to empirical models, rather than to analytical models.
Soft Computing Techniques:
• Artificial Neural Networks
• Support/Relevance Vector Machines
• Fuzzy Classifiers
Artificial Neural Networks
• Empirical model built by training on input/output data (modeling advantage)
• Multivariate, non-linear interpolators
• Techniques capable of reconstructing the underlying complex non-linear I/O relations by combining multiple simple functions (modeling & computational advantage)
What Are (Artificial) Neural Networks?
In an artificial neural network, variables are associated with
nodes in a graph.
What Are (Artificial) Neural Networks?
Regression
Given input/output data (x, y), the observed output has a deterministic part and a stochastic part:

$$ y = f(x) + \varepsilon $$

The regression model $\hat{f}(x, w)$ provides the prediction $\hat{y}$.
What Do Artificial Neural Networks Do?
Classification
Given labeled patterns (x, c), a discriminant function assigns the input x to class c1 or class c2.
What Do Artificial Neural Networks Do?
The output (u) of a node is the result of nonlinear
transformations (f) on the input variables (i) transmitted
along the links of the graph.
f
i1 i2 … in
u
What Are (Artificial) Neural Networks?
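The node computation just described can be sketched as follows (a minimal illustration; the choice of a sigmoid for f and the specific numbers are assumptions, not part of the slides):

```python
import math

def sigmoid(a):
    # One common choice for the nonlinear transformation f (an assumption;
    # the slides only require f to be nonlinear)
    return 1.0 / (1.0 + math.exp(-a))

def node_output(inputs, weights, bias=0.0):
    # u = f(weighted sum of the inputs i1..in transmitted along the links)
    a = sum(w * i for w, i in zip(weights, inputs)) + bias
    return sigmoid(a)

# Example node with three incoming links (illustrative values)
u = node_output([0.5, -1.0, 2.0], [0.1, 0.4, 0.3])
```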
Neural networks are universal approximators of multivariate
non-linear functions.
KOLMOGOROV (1957):
For any real function $f$ continuous on $[0,1]^n$, $n \ge 2$, there exist $n(2n+1)$ functions $\psi_{ml}(\cdot)$ continuous on $[0,1]$ such that

$$ f(x_1, x_2, \ldots, x_n) = \sum_{l=1}^{2n+1} \Phi_l\left( \sum_{m=1}^{n} \psi_{ml}(x_m) \right) $$

where the $2n+1$ functions $\Phi_l$ are real and continuous.
Thus, a total of $(2n+1)$ functions $\Phi_l(\cdot)$ and $n(2n+1)$ functions $\psi_{ml}(\cdot)$ of one variable represent a function of $n$ variables.
What is the mathematical basis behind Artificial Neural Networks?
Neural networks are universal approximators of multivariate
non-linear functions.
CYBENKO (1989):
Let $\sigma(\cdot)$ be a sigmoidal continuous function. The linear combinations

$$ g(x) = \sum_{j=0}^{N} \alpha_j\, \sigma\left( \sum_{i=1}^{n} w_{ji}\, x_i + \theta_j \right) $$

are dense in $C([0,1]^n)$.
In other words, any continuous function $f$ on $[0,1]^n$ can be approximated by a linear combination of sigmoidal functions.
Note that N is not specified.
What is the mathematical basis behind Artificial Neural Networks?
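A minimal sketch of Cybenko's idea, with hand-picked (assumed) coefficients: two steep sigmoids of opposite sign combine into a localized "bump", a building block from which continuous functions can be assembled:

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def g(x, alphas, ws, thetas):
    # Linear combination of sigmoids, as in Cybenko's theorem
    return sum(al * sigmoid(w * x + th)
               for al, w, th in zip(alphas, ws, thetas))

# Two steep sigmoids of opposite sign approximate a "bump" on [0.4, 0.6]
# (coefficients chosen by hand for illustration)
alphas = [1.0, -1.0]
ws = [100.0, 100.0]
thetas = [-40.0, -60.0]
```

Evaluating g at 0.5 gives a value close to 1, while at 0.1 or 0.9 it is close to 0, i.e. an approximate indicator of the interval [0.4, 0.6].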
ARTIFICIAL NEURAL NETWORKS
Multilayered Feedforward NN
[Figure: fully connected multilayered feedforward network, from the INPUT layer to the OUTPUT layer, annotated with its number of connections.]
INPUT LAYER:
each k-th node (k = 1, 2, …, n_i) receives the (normalized) value x_k of the k-th component of the input vector x and delivers the same value.
HIDDEN LAYER:
each j-th node (j = 1, 2, …, n_h) receives

$$ \sum_{k=1}^{n_i} w_{jk}\, x_k + w_{j0} $$

and delivers

$$ z_j = f\left( \sum_{k=1}^{n_i} w_{jk}\, x_k + w_{j0} \right) $$

with f typically sigmoidal.
Forward Calculation (input-hidden)
OUTPUT LAYER:
each l-th node (l = 1, 2, …, n_o) receives

$$ \sum_{j=1}^{n_h} w_{lj}\, z_j + w_{l0} $$

and delivers

$$ u_l = f\left( \sum_{j=1}^{n_h} w_{lj}\, z_j + w_{l0} \right) $$

with f typically linear or sigmoidal.
Forward Calculation (hidden-output)
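The forward calculations above can be sketched as follows (a minimal illustration with a sigmoidal hidden layer and a linear output layer; the weight values are assumptions):

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(x, W_hidden, W_output):
    # Each row of W_hidden / W_output is [w_0 (bias), w_1, ..., w_n]
    # Hidden layer: z_j = f( sum_k w_jk x_k + w_j0 ), f sigmoidal
    z = [sigmoid(w[0] + sum(wk * xk for wk, xk in zip(w[1:], x)))
         for w in W_hidden]
    # Output layer: u_l = sum_j w_lj z_j + w_l0, here with f linear
    u = [w[0] + sum(wj * zj for wj, zj in zip(w[1:], z))
         for w in W_output]
    return z, u

# Tiny 2-input, 2-hidden, 1-output network with hand-picked weights
z, u = forward([0.0, 0.0],
               [[0.0, 1.0, 1.0], [0.0, 1.0, 1.0]],
               [[0.0, 1.0, 1.0]])
```

With zero inputs, both hidden nodes deliver sigmoid(0) = 0.5 and the single linear output delivers 0.5 + 0.5 = 1.0.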
Available input/output patterns:
(x1^(1), x2^(1) | t^(1))
(x1^(2), x2^(2) | t^(2))
…
(x1^(p), x2^(p) | t^(p))
…
(x1^(np), x2^(np) | t^(np))
Synaptic Weights W
Setting the NN Parameters: Training Phase
Initialize weights to random values.
Update weights so as to minimize the average squared output deviation error (also called Energy Function):

$$ E = \frac{1}{2\, n_p\, n_o} \sum_{p=1}^{n_p} \sum_{l=1}^{n_o} \left( u_{pl} - t_{pl} \right)^2 $$

where t_pl is the TRUE l-th output of the p-th pattern and u_pl is the computed l-th output of the p-th pattern.
Minimization by gradient descent algorithms.
Setting The NN Parameters: Error Backpropagation
Error Backpropagation (output-hidden)
Updating weight w_lj (output-hidden connections):

$$ \Delta w_{lj}(n) = \eta\, \delta_l\, z_j + \alpha\, \Delta w_{lj}(n-1) $$

with η the learning coefficient and α the momentum. Without loss of generality, set n_p = 1 and define a user-defined output importance (usually equal to 1), being

$$ \delta_l = (t_l - u_l)\, f' $$
Error Backpropagation (hidden-input)
Updating weight w_jk (hidden-input connections):
Similarly to the updating of the output-hidden weights,

$$ \Delta w_{jk}(n) = \eta\, \delta_j\, x_k + \alpha\, \Delta w_{jk}(n-1) $$

being

$$ \delta_j = f' \sum_{l=1}^{n_o} \delta_l\, w_{lj} $$

with η the learning coefficient and α the momentum.
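A minimal sketch of one error-backpropagation step under the stated conventions (sigmoidal hidden layer, linear output, a single pattern, n_p = 1; the momentum term is omitted for brevity, and all numeric values are assumptions):

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def train_step(x, t, Wh, Wo, eta=0.1):
    # Forward pass: sigmoidal hidden layer, linear output layer
    z = [sigmoid(w[0] + sum(wk * xk for wk, xk in zip(w[1:], x)))
         for w in Wh]
    u = [w[0] + sum(wj * zj for wj, zj in zip(w[1:], z)) for w in Wo]
    # Output deltas (linear output => f' = 1): delta_l = t_l - u_l
    delta_o = [tl - ul for tl, ul in zip(t, u)]
    # Hidden deltas: delta_j = z_j (1 - z_j) * sum_l delta_l w_lj
    delta_h = [z[j] * (1.0 - z[j]) *
               sum(d * Wo[l][j + 1] for l, d in enumerate(delta_o))
               for j in range(len(Wh))]
    # Gradient-descent updates, Delta w = eta * delta * input
    # (the momentum term alpha * Delta w(n-1) is omitted here)
    for l, d in enumerate(delta_o):
        Wo[l][0] += eta * d
        for j, zj in enumerate(z):
            Wo[l][j + 1] += eta * d * zj
    for j, d in enumerate(delta_h):
        Wh[j][0] += eta * d
        for k, xk in enumerate(x):
            Wh[j][k + 1] += eta * d * xk
    return sum((tl - ul) ** 2 for tl, ul in zip(t, u))

# Toy 1-2-1 network trained on a single pattern (illustrative values)
Wh = [[0.1, 0.2], [-0.1, 0.3]]
Wo = [[0.0, 0.1, -0.2]]
errors = [train_step([1.0], [0.8], Wh, Wo) for _ in range(50)]
```

Repeating the step drives the squared output error toward zero, which is the gradient-descent minimization of the Energy Function described above.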
After training:
• Synaptic weights are fixed
• New input → retrieval of the information stored in the weights → output
Capabilities:
• Nonlinearity of the sigmoids → the NN can learn nonlinear mappings
• Each node is independent and relies only on local information (synapses) → parallel processing and fault-tolerance
Utilization of the Neural Network
CONCLUSIONS
Advantages:
• No physical/mathematical modelling effort.
• Automatic parameter adjustment through a training phase based on available input/output data, so as to obtain the best interpolation of the functional relation between input and output.
Disadvantages:
• “Black box”: difficulty in interpreting the underlying physical model.
Application:
Detection of malfunctions in the secondary system of a nuclear power plant by neural networks
Objective
Build and train a neural network to classify different malfunctions in the secondary system of a boiling water reactor
Boiling Water Reactor
[Figure: secondary system of a boiling water reactor, showing the vessel, turbine, condenser and pumps.]
Outputs
Class 1: Leakage through the second high-pressure preheater
Class 2: Leakage in the first high-pressure preheater to the drain tank
Class 3: Leakage through the first high-pressure preheater drain back-up valve to the condenser
Class 4: Leakage through high-pressure preheaters bypass valve
Class 5: Leakage through the second high-pressure preheater drain back-up valve to the feedwater tank
Transients
Class 1: Valve EA1
Class 2: Valve EA2
Class 3: Valve VB20
Class 4: Valve VB7
Class 5: Valve VA25
Input variables

Variable  Signal                                      Unit
1         Position level for control valve EA1        %
2         Position level for control valve EB1        %
3         Temperature drain before VB3                ºC
4         Temperature feedwater after EA2 train A     ºC
5         Temperature feedwater after EB2 train B     ºC
6         Temperature drain 6 after VB1               ºC
7         Temperature drain 5 after VB2               ºC
8         Position level control valve before EA2     %
9         Position level control valve before EB2     %
10        Temperature feedwater before EB2 train B    ºC
Sensors position
Measurement: 36 sampling instants in [80, 290] s, one every 6 s.
[Figure: time traces of the ten monitored variables; Var 1, 2, 8, 9 in %, Var 3, 4, 5, 6, 7, 10 in ºC.]
Input/Output patterns
Input: the class assignment is performed dynamically, as a two-step time window over the measured signals shifts from time t to the successive time t+1.
Output: number of the class to which the variables belong
Training set
A training set was constructed containing 8 transients for each class.
For a transient of a given class, the 35 patterns used to train the network take the following form:

x1(1) x1(2)  x2(1) x2(2)  …  x10(1) x10(2)  | class
…
x1(t) x1(t+1)  x2(t) x2(t+1)  …  x10(t) x10(t+1)  | class
…
x1(35) x1(36)  x2(35) x2(36)  …  x10(35) x10(36)  | class
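The construction of these sliding-window patterns can be sketched as follows (an illustration; function and variable names are assumptions):

```python
def make_patterns(signals, class_label, window=2):
    # signals: one time series per monitored variable (10 in this application)
    # Each pattern collects 'window' consecutive samples of every variable,
    # mirroring the two-step time window shifted over the transient.
    n_t = len(signals[0])
    patterns = []
    for t in range(n_t - window + 1):
        features = []
        for s in signals:
            features.extend(s[t:t + window])
        patterns.append((features, class_label))
    return patterns

# 10 variables sampled at 36 instants -> 35 patterns of 20 features each
sigs = [list(range(36)) for _ in range(10)]
pats = make_patterns(sigs, 1)
```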
Network architecture
N_h, η, α optimized with a grid search.
Optimal values: N_h = 17, η = 0.55, α = 0.6.
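The grid optimization of N_h, η, α can be sketched as follows (a generic illustration; the `evaluate` callable standing in for a full train-and-validate run, and the toy error surface used below, are assumptions):

```python
from itertools import product

def grid_search(evaluate, nh_grid, eta_grid, alpha_grid):
    # evaluate(nh, eta, alpha) -> validation error of a trained network
    best, best_err = None, float("inf")
    for nh, eta, alpha in product(nh_grid, eta_grid, alpha_grid):
        err = evaluate(nh, eta, alpha)
        if err < best_err:
            best, best_err = (nh, eta, alpha), err
    return best, best_err

# Toy stand-in for the (expensive) train-and-validate step
toy = lambda nh, eta, alpha: ((nh - 17) ** 2 +
                              (eta - 0.55) ** 2 +
                              (alpha - 0.6) ** 2)
best, err = grid_search(toy, [15, 17, 19], [0.45, 0.55, 0.65], [0.4, 0.6, 0.8])
```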
Test results
[Figures 3a–3e: neural network outputs vs. true class values for the test data of classes 1 to 5, plotted against the input pattern index (5 to 35); each output remains within about ±0.1 of its true class integer.]
Comments
The real-valued outputs provided by the neural network closely approximate the class integer values from early times in the transient.
As time progresses, the network output class assignment approaches the true class integer more and more closely.
New transients
Desirable neural network response when a new (untrained) transient is encountered:
• Identification that something has happened
• Classification as “don’t know”
New transients
New class
Class 6: Steam line valve to the second high-pressure preheater closing (Valve VA6)
[Figures 4a, 4b: neural network output vs. true value for the Class 6 transient, plotted against the input pattern index (5 to 35); the class axis spans 0 to 6.]
New transients
Close to Class 2 but with a larger deviation
Value provided by the neural network close to 0
New transient
The large output error is due to the fact that some of the inputs which are fed into the network fall outside the ranges of the input values of the training transients.
[Figure: signal value of the temperature drain after VB2 for the second transient, plotted against the input pattern index together with the upper and lower training levels; the signal leaves the training range.]
Conclusions
A neural network has been constructed and trained to classify different malfunctions in the secondary system of a boiling water reactor.
The network uses as input patterns the information provided by signals that are monitored in the plant and provides as output an estimate of the class assignment.
When a transient belonging to a different class is fed to the network, it is capable of detecting that the pattern does not belong to any of the classes for which it was trained.