Transcript of Tutorial An Introduction to the Use of Artificial Neural Networks....
(c) INAOE 2015
Tutorial
An Introduction to the Use of Artificial Neural Networks
Part 3: Solutions using ANN
Dra. Ma. del Pilar Gómez Gil, INAOE
[email protected] [email protected]
This version: October 13, 2015
Outline

1 hour   1. Artificial Neural Networks. What is that?
         1.1 Some definitions
         1.2 Advantages and drawbacks of ANN
         1.3 Characteristics of solutions using ANN
         1.4 The fundamental neuron
         1.5 The concept of "learning" by examples

1 hour   2. Basic architectures
         2.1 Types of ANN
         2.2 Single-layer perceptron network
         2.3 Multi-layer perceptrons

1 hour   3. Solutions using ANN
         3.1 ANN as classifiers
         3.2 ANN as function approximators
         3.3 ANN as predictors

1 hour   4. Examples using the Matlab ANN toolbox
         4.1 A very simple classifier
         4.2 A very simple function approximator
Creating ANN models
Currently, there are countless ANN models and architectures. Designing an ANN implies taking advantage of the characteristics of the problem domain and of the capabilities of ANN to solve the problem. Next we see how to use ANN as classifiers, function approximators and predictors.
3.1 ANN as classifiers
An Adaptive Classifier

[Diagram: objects → sensing → measurements → preprocessing and feature selection → feature vector → context analysis → decision → classes, with a learning stage feeding back into the classifier. Adapted from Tou & Gonzalez 74]
Using an MLP as a classifier

1. Carefully design the feature vector, choosing the appropriate measurements and preprocessing them, if required.
2. Consider normalizing the features if the variance of their magnitudes is high.
3. Collect as much data as possible. It is better to have a similar number of examples for each possible class.
4. Divide the data into 2 or 3 sets: training set, validation set and testing set.
Using an MLP as a classifier (cont.)

5. Define the network as follows:
   a) The number of input nodes equals the size of the feature vector.
   b) The number of output nodes equals the number of classes in the solution.
   c) Decide an initial number of hidden nodes in the MLP. The simplest rule of thumb says: try not to have more weights to train than examples in the training set.
6. Start training your network. If the total error decreases slowly or does not decrease, try with more or fewer hidden nodes. The Matlab ANN toolbox decides other required parameters automatically, unless you specify otherwise.
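The rule of thumb in step 5(c) can be checked by counting trainable parameters. A minimal sketch in Python (the tutorial itself uses Matlab; the layer sizes below are hypothetical, chosen only for illustration):

```python
def mlp_weight_count(n_inputs, n_hidden, n_outputs):
    """Trainable parameters of a one-hidden-layer MLP, biases included."""
    hidden = n_hidden * (n_inputs + 1)   # weights + bias per hidden node
    output = n_outputs * (n_hidden + 1)  # weights + bias per output node
    return hidden + output

# Hypothetical design: 6 features, 19 hidden nodes, 3 classes
n_weights = mlp_weight_count(6, 19, 3)
print(n_weights)  # 193 parameters, so the rule asks for at least ~193 examples
```

If the count exceeds the number of training examples, the rule suggests reducing the hidden layer.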
Using an MLP as a classifier (cont.)

7. After getting a net that fits your training data, test it using the validation data. If you are not satisfied with the results, go back to step 1.
8. Now use your network with the data in the testing set. The performance obtained with this evaluation is the one that characterizes your experiment.
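Step 4's division of the data can be sketched as a simple hold-out split; the 60/20/20 proportions below are a common choice, not prescribed by the tutorial:

```python
import random

def split_data(samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle and divide data into training, validation and testing sets."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

train, val, test = split_data(list(range(100)))
print(len(train), len(val), len(test))  # 60 20 20
```

The validation set drives the redesign loop; the testing set is touched only once, for the final reported performance.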
Values of output neurons

Example: for 3 classes, the target output neurons are:

Class   Values of neurons
1       1 0 0
2       0 1 0
3       0 0 1
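The target table above is a one-hot encoding; a minimal sketch in Python:

```python
def one_hot(class_index, n_classes):
    """Target vector with a 1 at the neuron for the given class (1-based)."""
    return [1 if j == class_index - 1 else 0 for j in range(n_classes)]

for c in (1, 2, 3):
    print(c, one_hot(c, 3))
# 1 [1, 0, 0]
# 2 [0, 1, 0]
# 3 [0, 0, 1]
```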
Selecting a class

An MLP has several outputs, one for each class; therefore a process to decide the final answer is required.

[Diagram: MLP outputs → class decision]
Selecting a class (cont.)

The best way to choose a class is to calculate the Euclidean distance from the output of the network to each possible class. The sample belongs to the class with the smallest distance.
Selecting a class (cont.)

Let the network have n neurons in the output layer, and let there be m possible classes. The net output is given by:

Y = (y_1, y_2, ..., y_n)

Each class is represented as:

C_1 = (c_11, c_12, ..., c_1n)
C_2 = (c_21, c_22, ..., c_2n)
...
C_m = (c_m1, c_m2, ..., c_mn)
Selecting a class (cont.)

For i = 1..m, the distance from Y to C_i is given by:

d_i = sqrt( (c_i1 - y_1)^2 + (c_i2 - y_2)^2 + ... + (c_in - y_n)^2 )

The assigned class is:

chosen_class = argmin_i d_i,  i = 1..m
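The distance-and-argmin rule can be sketched directly in plain Python (the one-hot targets follow the earlier 3-class example):

```python
import math

def assign_class(y, class_targets):
    """Return the 1-based index of the class target nearest to output y."""
    def dist(c):
        return math.sqrt(sum((ci - yi) ** 2 for ci, yi in zip(c, y)))
    distances = [dist(c) for c in class_targets]
    return distances.index(min(distances)) + 1

targets = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # one-hot targets, classes 1..3
print(assign_class([0.8, 0.3, 0.1], targets))  # 1
```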
Classifying leukocytes [Gómez-Gil et al. 2008]

Figure 1. Maturity stages of white blood cells. (a) Myeloblast. (b) Promyelocyte. (c) Myelocyte. (d) Metamyelocyte. (e) Band. (f) Polymorphonuclear leukocytes (PMN).
Feature extraction for leukocytes [Gómez-Gil et al. 2008]

Feature vector:

x = (A_L, R_nc, P(n, B))

where:
A_L = leukocyte normalized area
R_nc = nucleus-cytoplasm ratio
P(n, B) = components of the pecstrum of the nucleus
Feature extraction (cont.)

An example of a feature vector. (a) Original leukocyte image. (b) Matlab screen with the obtained composed feature vector. [Gómez-Gil et al. 2008]
Results [Gómez-Gil et al. 2008]

Classifier                     Classification rate
Euclidean distance             77.7%
K-NN with K = 1                70.4%
K-NN with K = 3                72.2%
K-NN with K = 5                70.4%
FFNN with 19 hidden nodes      87.6%
FFNN with 14 hidden nodes      84.9%
DAGSVM                         71.6%
Old Documents Recognition [Gómez-Gil et al. 2007]

Some problems with manuscript word recognition

[Images of handwritten words: "haciend", "algunos", "haciendo", "alguns"]
Results using a SOM network [Gómez-Gil et al. 2007]

Number of   Number of          Type of            Recognition rate
classes     training patterns  recognizer         on training set
3           13                 Nearest neighbor   84%
                               SOM (3x3)          92%
5           56                 Nearest neighbor   58%
                               SOM (5x1)          58%
                               SOM (5x2)          71%
                               SOM (5x5)          73%
21          86                 Nearest neighbor   6%
                               SOM (5x12)         63%
                               SOM (2x30)         70%
A topological map for isolated handwritten character recognition
3.2 ANN as a Function Approximator
A multi-layer perceptron (MLP)...

[Diagram: MLP with inputs x_1, ..., x_i, ..., x_m, hidden neurons 1, ..., h connected through weights w_ji, and a single output F]

... with m inputs and one hidden layer with h neurons is able to approximate any continuous function.
Mathematical definition

F(x_1, x_2, ..., x_m) = Σ_{j=1..h} α_j φ( Σ_{i=1..m} w_ji x_i + b_j )

w_ji, b_j, j = 1,...,h; i = 1,...,m are weights connecting neurons in the hidden layer to the external inputs;

α_j, j = 1,...,h are weights connecting the single neuron in the output layer with the neurons in the hidden layer;

φ(u) = 1 / (1 + e^(-λu)) is the activation function used for the neurons in the hidden layer;

λ is a scaling coefficient controlling the behavior of the activation function in the range where φ(u) ≈ 0;

• the activation function of the output layer is linear.
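The definition can be sketched as a direct forward pass. A minimal sketch in plain Python; the weights, biases and λ below are hypothetical values chosen only for illustration:

```python
import math

def mlp_forward(x, w, b, alpha, lam=1.0):
    """F(x) = sum_j alpha[j] * phi(sum_i w[j][i] * x[i] + b[j]),
    with phi(u) = 1 / (1 + exp(-lam * u)) and a linear output layer."""
    def phi(u):
        return 1.0 / (1.0 + math.exp(-lam * u))
    return sum(
        a_j * phi(sum(w_ji * x_i for w_ji, x_i in zip(w_j, x)) + b_j)
        for w_j, b_j, a_j in zip(w, b, alpha)
    )

# Hypothetical network: 2 inputs, 2 hidden nodes, 1 linear output
w = [[0.5, -0.3], [0.2, 0.8]]  # w[j][i]: hidden node j <- input i
b = [0.0, 0.1]
alpha = [1.0, -1.0]
print(mlp_forward([1.0, 2.0], w, b, alpha))
```

Training would then adjust w, b and α so that F matches the target function on the training examples.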
Confusing Training Data

x     F(x)
3     9
-2    4
5     25
7     49
2     4 (oops!)
One-step Forecasting of Seismograms Using MLP [Bernardo-Torres & Gómez-Gil 2009]

Fig. 4. An example of forecasting for station number 3 (file CAYA8509.191)
Results [Bernardo-Torres & Gómez-Gil 2009]

Station   Over training set              Over testing set (generalization)
number    RPROP           LV-MQ          RPROP           LV-MQ
1         2.193 ± 0.025   1.918 ± 0.039  1.082 ± 0.086   0.806 ± 0.058
2         0.904 ± 0.029   0.862 ± 0.007  0.707 ± 0.023   0.696 ± 0.015
3         0.608 ± 0.002   0.591 ± 0.013  0.526 ± 0.001   0.543 ± 0.008
4         1.620 ± 0.018   1.553 ± 0.036  1.027 ± 0.016   0.931 ± 0.097
5         0.803 ± 0.008   1.089 ± 0.812  0.640 ± 0.017   0.847 ± 0.473
3.3 ANN as Predictors
Prediction types

One-step prediction: the predictor receives the real past values s(t-5), s(t-4), s(t-3), s(t-2), s(t-1) and outputs v(t), an estimate of s(t).

Point-to-point (long-term) prediction: the predictor receives its own past outputs v(t-5), v(t-4), v(t-3), v(t-2), v(t-1) and outputs v(t).

s(t): original signal; v(t): predicted signal
The problem

One-step prediction uses past values to calculate the next value in a time series. Long-term prediction eventually requires using values already calculated by the predictor in order to calculate new values. Therefore the prediction error propagates very quickly. This has a high impact in highly non-linear systems (such as chaotic time series).
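The error propagation can be illustrated with a toy one-step predictor whose model is slightly wrong. Everything below is hypothetical, chosen only to show the effect of feeding predictions back as inputs:

```python
def one_step(history):
    """Hypothetical imperfect one-step predictor for s(t) = 1.05 * s(t-1);
    the coefficient is slightly wrong to mimic model error."""
    return history[-1] * 1.06

# True series: s(t) = 1.05 * s(t-1), s(0) = 1
true_rate, s = 1.05, [1.0]
for _ in range(20):
    s.append(s[-1] * true_rate)

# Long-term prediction: feed predictions back as inputs
v = [s[0]]
for t in range(1, 21):
    v.append(one_step(v))

errors = [abs(vt - st) for vt, st in zip(v, s)]
print(errors[1], errors[20])  # the error grows with the horizon
```

With real data the error is applied at every step, so the longer the horizon, the further the predicted trajectory drifts from the true one.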
A linear predictor using "real data" as input

[Plot: learned output for file ext1_07 vs. expected signal during learning]
A linear predictor using "predicted data" as input

[Plot: long-term prediction of an ECG using a feed-forward network; predicted data vs. original data, magnitude vs. time]

The original ECG is not visible!
Recurrent neural networks...

They are dynamical systems
They learn from data
If correctly trained, they are able to oscillate in a stable way
The training algorithms of RNN are difficult to implement and to control
They are very powerful!
The Hybrid-connected Complex Neural Network [Gómez et al. 2011]

[Diagram: delayed samples s(t-5), s(t-4), s(t-2), s(t-1) feed a 3-node fully connected NN, with a sine function providing the initial condition, producing the prediction v(t)]
Dynamics

dy_i/dt = -y_i + φ(x_i) + I_i

x_i = Σ_j w_ji y_j
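These dynamics (dy_i/dt = -y_i + φ(x_i) + I_i, with x_i = Σ_j w_ji y_j) can be simulated with an explicit Euler step. The weights, external inputs and step size below are hypothetical, and φ is assumed to be the logistic function used earlier:

```python
import math

def euler_step(y, w, I, dt=0.01):
    """One explicit Euler step of dy_i/dt = -y_i + phi(x_i) + I_i,
    where x_i = sum_j w[j][i] * y[j]."""
    def phi(u):
        return 1.0 / (1.0 + math.exp(-u))
    n = len(y)
    x = [sum(w[j][i] * y[j] for j in range(n)) for i in range(n)]
    return [y[i] + dt * (-y[i] + phi(x[i]) + I[i]) for i in range(n)]

# Hypothetical 3-node fully connected network
w = [[0.0, 1.0, -1.0], [-1.0, 0.0, 1.0], [1.0, -1.0, 0.0]]
I = [0.1, 0.0, -0.1]
y = [0.0, 0.0, 0.0]
for _ in range(100):
    y = euler_step(y, w, I)
print(y)  # state after 100 integration steps
```

Training such a network means adjusting w (and possibly I) so that the trajectory of one designated neuron follows the target signal.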
A prediction using HCNN, Case K.2

[Plot: expected signal vs. HCNN prediction over about 2000 samples n]
References

Bernardo-Torres, A., Gómez-Gil, P. "One-step Forecasting of Seismograms Using Multi-Layer Perceptrons." Proc. of the 2009 6th International Conference on Electrical Engineering, Computing Science and Automatic Control (CCE 2009), formerly known as ICEEE. Nov. 2009.

Gómez-Gil, P., Ramírez-Cortés, J.M., Pomares Hernández, S.E., Alarcón-Aquino, V. "A Neural Network Scheme for Long-term Forecasting of Chaotic Time Series." Neural Processing Letters, Vol. 33, No. 3, June 2011, pp. 215-233. Published online: March 8, 2011. DOI: 10.1007/s11063-011-9174-0.

Gómez-Gil, P., De-Los-Santos Torres, G., Navarrete-García, J., Ramírez-Cortés, M. "The Role of Neural Networks in the Interpretation of Antique Handwritten Documents." Hybrid Intelligent Systems: Analysis and Design. Series: Studies in Fuzziness and Soft Computing, Vol. 208. Editors: Castillo, O., Melin, P., Kacprzyk, W. Springer, 2007. ISBN-10: 3-540-37419-1. pp. 269-281.

Gómez-Gil, P., Ramírez-Cortés, M., González-Bernal, J., García-Pedrero, A., Prieto-Castro, C.I., Valencia, D., Lobato, R., Alonso, J.E. "A Feature Extraction Method Based on Morphological Operators for Automatic Classification of Leukocytes." Proceedings of the 2008 Seventh Mexican International Conference on Artificial Intelligence (MICAI). IEEE Computer Society, October 2008, pp. 227-232. ISBN: 978-0-7695-3441-1.

Tou, J.T. and Gonzalez, R.C. Pattern Recognition Principles. Addison-Wesley, 1974.