Transcript of Tutorial An Introduction to the Use of Artificial Neural Networks....
(c) INAOE 2015
Tutorial
An Introduction to the Use of Artificial Neural Networks
Part 3: Solutions using ANN
Dra. Ma. del Pilar Gómez Gil, INAOE
[email protected] [email protected]
This version: October 13, 2015
Outline

1 hour   1. Artificial Neural Networks. What is that?
         1.1 Some definitions
         1.2 Advantages and drawbacks of ANN
         1.3 Characteristics of solutions using ANN
         1.4 The fundamental neuron
         1.5 The concept of "learning" by examples

1 hour   2. Basic architectures
         2.1 Types of ANN
         2.2 Single-layer perceptron network
         2.3 Multi-layer perceptrons

1 hour   3. Solutions using ANN
         3.1 ANN as classifiers
         3.2 ANN as function approximators
         3.3 ANN as predictors

1 hour   4. Examples using the Matlab ANN toolbox
         4.1 A very simple classifier
         4.2 A very simple function approximator
Creating ANN models
Currently, there are countless ANN models and architectures. Designing an ANN implies taking advantage of the characteristics of the problem domain and of the capabilities of ANN to solve the problem. Next we see how to use ANN as classifiers, function approximators and predictors.
3.1 ANN as classifiers
An Adaptive Classifier

[Diagram: objects → sensing → measurements → preprocessing and feature selection → feature vector → context analysis → decision → classes, with a learning stage feeding back into the classifier. Adapted from Tou & Gonzalez 74]
Using an MLP as a classifier

1. Carefully design the feature vector, choosing the appropriate measurements and preprocessing them, if required.
2. Consider normalizing the features if the variance of their magnitudes is high.
3. Collect as much data as possible. It is better to have a similar number of examples for each possible class.
4. Divide the data into 2 or 3 sets: training set, validation set and testing set.
Using an MLP as a classifier (cont.)

5. Define the network as follows:
   a) The number of input nodes equals the size of the feature vector.
   b) The number of output nodes equals the number of classes in the solution.
   c) Decide an initial number of hidden nodes in the MLP. The simplest rule of thumb says: try not to have more weights to train than examples in the training set.
6. Start training your network. If the total error decreases slowly or does not decrease, try with more or fewer hidden nodes. The Matlab ANN toolbox decides other required parameters automatically, unless you specify otherwise.
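The rule of thumb in step 5(c) can be checked by counting trainable parameters. A minimal sketch in Python (the tutorial itself uses Matlab; the layer sizes below are hypothetical, chosen only for illustration):

```python
def mlp_weight_count(n_inputs, n_hidden, n_outputs):
    """Trainable parameters of a one-hidden-layer MLP, biases included."""
    hidden = n_hidden * (n_inputs + 1)   # weights + bias per hidden node
    output = n_outputs * (n_hidden + 1)  # weights + bias per output node
    return hidden + output

# Hypothetical design: 6 features, 19 hidden nodes, 3 classes
n_weights = mlp_weight_count(6, 19, 3)
print(n_weights)  # 193 parameters, so the rule asks for at least ~193 examples
```

If the count exceeds the number of training examples, the rule suggests reducing the hidden layer.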
Using an MLP as a classifier (cont.)

7. After getting a net that fits your training data, test it using the validation data. If you are not satisfied with the results, go back to step 1.
8. Now use your network with the data in the testing set. The performance obtained with this evaluation is the one that characterizes your experiment.
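Step 4's division of the data can be sketched as a simple hold-out split; the 60/20/20 proportions below are a common choice, not prescribed by the tutorial:

```python
import random

def split_data(samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle and divide data into training, validation and testing sets."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

train, val, test = split_data(list(range(100)))
print(len(train), len(val), len(test))  # 60 20 20
```

The validation set drives the redesign loop; the testing set is touched only once, for the final reported performance.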
Values of output neurons

Example: for 3 classes, the target output neurons are:

Class   Values of neurons
1       1 0 0
2       0 1 0
3       0 0 1
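The target table above is a one-hot encoding; a minimal sketch in Python:

```python
def one_hot(class_index, n_classes):
    """Target vector with a 1 at the neuron for the given class (1-based)."""
    return [1 if j == class_index - 1 else 0 for j in range(n_classes)]

for c in (1, 2, 3):
    print(c, one_hot(c, 3))
# 1 [1, 0, 0]
# 2 [0, 1, 0]
# 3 [0, 0, 1]
```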
Selecting a class

An MLP has several outputs, one for each class; therefore a process to decide the final answer is required.

[Diagram: MLP outputs → class decision]
Selecting a class (cont.)

The best way to choose a class is to calculate the Euclidean distance from the output of the network to each possible class. The sample belongs to the class with the smallest distance.
Selecting a class (cont.)

Let the network have n neurons in the output layer, and let there be m possible classes. The net output is given by:

Y = (y_1, y_2, ..., y_n)

Each class is represented as:

C_1 = (c_11, c_12, ..., c_1n)
C_2 = (c_21, c_22, ..., c_2n)
...
C_m = (c_m1, c_m2, ..., c_mn)
Selecting a class (cont.)

For i = 1..m, the distance from Y to C_i is given by:

d_i = sqrt( (c_i1 - y_1)^2 + (c_i2 - y_2)^2 + ... + (c_in - y_n)^2 )

The assigned class is:

chosen_class = argmin_i d_i,  i = 1..m
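The distance-and-argmin rule can be sketched directly in plain Python (the one-hot targets follow the earlier 3-class example):

```python
import math

def assign_class(y, class_targets):
    """Return the 1-based index of the class target nearest to output y."""
    def dist(c):
        return math.sqrt(sum((ci - yi) ** 2 for ci, yi in zip(c, y)))
    distances = [dist(c) for c in class_targets]
    return distances.index(min(distances)) + 1

targets = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # one-hot targets, classes 1..3
print(assign_class([0.8, 0.3, 0.1], targets))  # 1
```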
Classifying leukocytes [Gómez-Gil et al. 2008]

Figure 1. Maturity stages of white blood cells. (a) Myeloblast. (b) Promyelocyte. (c) Myelocyte. (d) Metamyelocyte. (e) Band. (f) Polymorphonuclear leukocytes (PMN).
Feature extraction for leukocytes [Gómez-Gil et al. 2008]

Feature vector:

x = (A_L, R_nc, P(n, B))

where:
A_L = leukocyte normalized area
R_nc = nucleus-cytoplasm ratio
P(n, B) = components of the pecstrum of the nucleus
Feature extraction (cont.)

An example of a feature vector. (a) Original leukocyte image. (b) Matlab screen with the obtained composed feature vector. [Gómez-Gil et al. 2008]
Results [Gómez-Gil et al. 2008]

Classifier                     Classification rate
Euclidean distance             77.7%
K-NN with K = 1                70.4%
K-NN with K = 3                72.2%
K-NN with K = 5                70.4%
FFNN with 19 hidden nodes      87.6%
FFNN with 14 hidden nodes      84.9%
DAGSVM                         71.6%
Old Documents Recognition [Gómez-Gil et al. 2007]

Some problems with manuscript word recognition

[Images of handwritten words: "haciend", "algunos", "haciendo", "alguns"]
Results using a SOM network [Gómez-Gil et al. 2007]

Number of   Number of          Type of            Recognition rate
classes     training patterns  recognizer         on training set
3           13                 Nearest neighbor   84%
                               SOM (3x3)          92%
5           56                 Nearest neighbor   58%
                               SOM (5x1)          58%
                               SOM (5x2)          71%
                               SOM (5x5)          73%
21          86                 Nearest neighbor   6%
                               SOM (5x12)         63%
                               SOM (2x30)         70%
A topological map for isolated handwritten character recognition
3.2 ANN as a Function Approximator
A multi-layer perceptron (MLP)...

[Diagram: MLP with inputs x_1, ..., x_i, ..., x_m, hidden neurons 1, ..., h connected through weights w_ji, and a single output F]

... with m inputs and one hidden layer with h neurons is able to approximate any continuous function.
Mathematical definition

F(x_1, x_2, ..., x_m) = Σ_{j=1..h} α_j φ( Σ_{i=1..m} w_ji x_i + b_j )

w_ji, b_j, j = 1,...,h; i = 1,...,m are weights connecting neurons in the hidden layer to the external inputs;

α_j, j = 1,...,h are weights connecting the single neuron in the output layer with the neurons in the hidden layer;

φ(u) = 1 / (1 + e^(-λu)) is the activation function used for the neurons in the hidden layer;

λ is a scaling coefficient controlling the behavior of the activation function in the range where φ(u) ≈ 0;

• the activation function of the output layer is linear.
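The definition can be sketched as a direct forward pass. A minimal sketch in plain Python; the weights, biases and λ below are hypothetical values chosen only for illustration:

```python
import math

def mlp_forward(x, w, b, alpha, lam=1.0):
    """F(x) = sum_j alpha[j] * phi(sum_i w[j][i] * x[i] + b[j]),
    with phi(u) = 1 / (1 + exp(-lam * u)) and a linear output layer."""
    def phi(u):
        return 1.0 / (1.0 + math.exp(-lam * u))
    return sum(
        a_j * phi(sum(w_ji * x_i for w_ji, x_i in zip(w_j, x)) + b_j)
        for w_j, b_j, a_j in zip(w, b, alpha)
    )

# Hypothetical network: 2 inputs, 2 hidden nodes, 1 linear output
w = [[0.5, -0.3], [0.2, 0.8]]  # w[j][i]: hidden node j <- input i
b = [0.0, 0.1]
alpha = [1.0, -1.0]
print(mlp_forward([1.0, 2.0], w, b, alpha))
```

Training would then adjust w, b and α so that F matches the target function on the training examples.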
Confusing Training Data

x     F(x)
3     9
-2    4
5     25
7     49
2     4 (oops!)
One-step Forecasting of Seismograms Using MLP [Bernardo-Torres & Gómez-Gil 2009]

Fig. 4. An example of forecasting for station number 3 (file CAYA8509.191)
Results [Bernardo-Torres & Gómez-Gil 2009]

Station   Over training set              Over testing set (generalization)
number    RPROP           LV-MQ          RPROP           LV-MQ
1         2.193 ± 0.025   1.918 ± 0.039  1.082 ± 0.086   0.806 ± 0.058
2         0.904 ± 0.029   0.862 ± 0.007  0.707 ± 0.023   0.696 ± 0.015
3         0.608 ± 0.002   0.591 ± 0.013  0.526 ± 0.001   0.543 ± 0.008
4         1.620 ± 0.018   1.553 ± 0.036  1.027 ± 0.016   0.931 ± 0.097
5         0.803 ± 0.008   1.089 ± 0.812  0.640 ± 0.017   0.847 ± 0.473
3.3 ANN as Predictors
Prediction types

One-step prediction: the predictor receives the real past values s(t-5), s(t-4), s(t-3), s(t-2), s(t-1) and outputs v(t), an estimate of s(t).

Point-to-point (long-term) prediction: the predictor receives its own past outputs v(t-5), v(t-4), v(t-3), v(t-2), v(t-1) and outputs v(t).

s(t): original signal; v(t): predicted signal
The problem

One-step prediction uses past values to calculate the next value in a time series. Long-term prediction eventually requires using values already calculated by the predictor in order to calculate new values. Therefore the prediction error propagates very quickly. This has a high impact in highly non-linear systems (such as chaotic time series).
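The error propagation can be illustrated with a toy one-step predictor whose model is slightly wrong. Everything below is hypothetical, chosen only to show the effect of feeding predictions back as inputs:

```python
def one_step(history):
    """Hypothetical imperfect one-step predictor for s(t) = 1.05 * s(t-1);
    the coefficient is slightly wrong to mimic model error."""
    return history[-1] * 1.06

# True series: s(t) = 1.05 * s(t-1), s(0) = 1
true_rate, s = 1.05, [1.0]
for _ in range(20):
    s.append(s[-1] * true_rate)

# Long-term prediction: feed predictions back as inputs
v = [s[0]]
for t in range(1, 21):
    v.append(one_step(v))

errors = [abs(vt - st) for vt, st in zip(v, s)]
print(errors[1], errors[20])  # the error grows with the horizon
```

With real data the error is applied at every step, so the longer the horizon, the further the predicted trajectory drifts from the true one.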
A linear predictor using "real data" as input

[Plot: learned output for file ext1_07 vs. expected signal during learning]
A linear predictor using "predicted data" as input

[Plot: long-term prediction of an ECG using a feed-forward network; predicted data vs. original data, magnitude vs. time]

The original ECG is not visible!
Recurrent neural networks...

They are dynamical systems
They learn from data
If correctly trained, they are able to oscillate in a stable way
The training algorithms of RNN are difficult to implement and to control
They are very powerful!
The Hybrid-connected Complex Neural Network [Gómez et al. 2011]

[Diagram: delayed samples s(t-5), s(t-4), s(t-2), s(t-1) feed a 3-node fully connected NN, with a sine function providing the initial condition, producing the prediction v(t)]
Dynamics

dy_i/dt = -y_i + φ(x_i) + I_i

x_i = Σ_j w_ji y_j
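These dynamics (dy_i/dt = -y_i + φ(x_i) + I_i, with x_i = Σ_j w_ji y_j) can be simulated with an explicit Euler step. The weights, external inputs and step size below are hypothetical, and φ is assumed to be the logistic function used earlier:

```python
import math

def euler_step(y, w, I, dt=0.01):
    """One explicit Euler step of dy_i/dt = -y_i + phi(x_i) + I_i,
    where x_i = sum_j w[j][i] * y[j]."""
    def phi(u):
        return 1.0 / (1.0 + math.exp(-u))
    n = len(y)
    x = [sum(w[j][i] * y[j] for j in range(n)) for i in range(n)]
    return [y[i] + dt * (-y[i] + phi(x[i]) + I[i]) for i in range(n)]

# Hypothetical 3-node fully connected network
w = [[0.0, 1.0, -1.0], [-1.0, 0.0, 1.0], [1.0, -1.0, 0.0]]
I = [0.1, 0.0, -0.1]
y = [0.0, 0.0, 0.0]
for _ in range(100):
    y = euler_step(y, w, I)
print(y)  # state after 100 integration steps
```

Training such a network means adjusting w (and possibly I) so that the trajectory of one designated neuron follows the target signal.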
A prediction using HCNN, Case K.2

[Plot: expected signal vs. HCNN prediction over about 2000 samples n]
References

Bernardo-Torres, A., Gómez-Gil, P. "One-step Forecasting of Seismograms Using Multi-Layer Perceptrons." Proc. of the 2009 6th International Conference on Electrical Engineering, Computing Science and Automatic Control (CCE 2009), formerly known as ICEEE. Nov. 2009.

Gómez-Gil, P., Ramírez-Cortés, J.M., Pomares Hernández, S.E., Alarcón-Aquino, V. "A Neural Network Scheme for Long-term Forecasting of Chaotic Time Series." Neural Processing Letters, Vol. 33, No. 3, June 2011, pp. 215-233. Published online: March 8, 2011. DOI: 10.1007/s11063-011-9174-0.

Gómez-Gil, P., De-Los-Santos Torres, G., Navarrete-García, J., Ramírez-Cortés, M. "The Role of Neural Networks in the Interpretation of Antique Handwritten Documents." Hybrid Intelligent Systems: Analysis and Design. Series: Studies in Fuzziness and Soft Computing, Vol. 208. Editors: Castillo, O., Melin, P., Kacprzyk, W. Springer, 2007. ISBN-10: 3-540-37419-1. pp. 269-281.

Gómez-Gil, P., Ramírez-Cortés, M., González-Bernal, J., García-Pedrero, A., Prieto-Castro, C.I., Valencia, D., Lobato, R., Alonso, J.E. "A Feature Extraction Method Based on Morphological Operators for Automatic Classification of Leukocytes." Proceedings of the 2008 Seventh Mexican International Conference on Artificial Intelligence (MICAI). IEEE Computer Society, October 2008, pp. 227-232. ISBN: 978-0-7695-3441-1.

Tou, J.T. and Gonzalez, R.C. Pattern Recognition Principles. Addison-Wesley, 1974.