Artificial Neural Networks and applications

Dr. L. Iliadis, Assist. Professor, Democritus University of Thrace, Greece

[email protected]

Overview

1. Definition of a Neural Network

2. The Human Brain

3. Neuron Models

4. Artificial Neural Networks (ANN)

5. Historical Notes

6. ANN Architecture

7. Learning processes – Training and Testing ANN

7.1. Backpropagation Learning

8. Well Known Applications of ANN

What is a Neural Network?

A Neural Network is a collection of units connected in some pattern that allows communication between them, and it acts as a massively distributed processor. These units are also referred to as neurons or nodes. The Neural Network has a natural propensity for storing experiential knowledge and making it available for use.

It has two main characteristics:

• Knowledge is acquired by the network from its environment through a learning process.

• Interneuron connection strengths, known as synaptic weights, are used to store the acquired knowledge.

Animals are able to react adaptively to changes in their external and internal environment, and they use their nervous system to perform these behaviours. This adaptive ability is called Plasticity.

Zoom on Human Brain Neurons

[Figure: a biological neuron, with the Axon, Dendrites, and Soma (Cell Body) labelled]

Biological Analogy of a Neuron

[Figure: a biological neuron with Dendrites, Soma (cell body), and Axon labelled]

The Role of the Synapses

[Figure: the synapses connecting the axon of one neuron to the dendrites of another]

The synapses are responsible for the transmission of information between two connected neurons.

Structural Organization of Levels in the Brain

Functional Areas of the Brain

• Primary motor: voluntary movement
• Primary somatosensory: tactile, pain, pressure, position, temperature, movement
• Motor association: coordination of complex movements
• Sensory association: processing of multisensorial information
• Prefrontal: planning, emotion, judgement
• Speech center (Broca's area): speech production and articulation
• Wernicke's area: comprehension of speech
• Auditory: hearing
• Auditory association: complex auditory processing
• Visual: low-level vision
• Visual association: higher-level vision

HUMAN BRAIN VERSUS SILICON

The human cortex has approximately 10 billion neurons and 60 trillion synapses. The net result is that the brain is an enormously efficient structure. The energetic efficiency of the brain is approximately $10^{-16}$ joules per operation per second, whereas the corresponding value for the best computers in use today is about $10^{-6}$ joules per operation per second. Human brain neurons (where events happen in the millisecond range, $10^{-3}$ s) are 5 or 6 orders of magnitude slower than silicon logic gates (where events happen in the nanosecond range, $10^{-9}$ s).

A signal $x_i$ at the input of synapse $i$, connected to neuron $k$, is multiplied by the synaptic weight $w_i$.

An adder sums the input signals, weighted by the respective synapses.

An activation function limits the amplitude of the output of a neuron. It squashes the permissible amplitude range of the output signal to a finite value in the interval [0,1] or, alternatively, in [-1,1].

ARTIFICIAL NEURON MODEL

[Figure: inputs $x_1, x_2, x_3, \dots, x_{n-1}, x_n$ are multiplied by the synaptic weights $w_1, w_2, w_3, \dots, w_{n-1}, w_n$ and fed to a summing function; the result passes through an activation function with bias $b_k$ to give the output $y$]

Summing function: $z = \sum_{i=1}^{n} w_i x_i$

Activation function: $y = H(z + b_k)$
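A minimal sketch of this neuron model in Python (not from the slides; the hard-limiter activation and the example values are assumptions chosen for illustration):

```python
import numpy as np

def hard_limiter(v):
    """Activation H: +1 if the input is positive, -1 otherwise."""
    return 1.0 if v > 0 else -1.0

def neuron_output(x, w, b):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    z = np.dot(w, x)            # z = sum_i w_i * x_i
    return hard_limiter(z + b)  # y = H(z + b_k)

# Assumed example values: three inputs, three weights, one bias.
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.4, 0.1, 0.7])
b = -0.5
print(neuron_output(x, w, b))  # z = 1.5, so y = H(1.0) = +1
```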

Artificial Neural Networks

Artificial Neural Networks consist of interconnected elements which were inspired by studies of biological nervous systems. They are an attempt to create machines that work in a similar way to the human brain, using components that behave like biological neurons.

• Synaptic strengths are translated into synaptic weights;

• Excitation means a positive product between the incoming spike rate and the corresponding synaptic weight;

• Inhibition means a negative product between the incoming spike rate and the corresponding synaptic weight.

Neuron's Output

Nonlinear generalization of the neuron: $y = H(x, w)$, where $y$ is the neuron's output, $x$ is the vector of inputs, and $w$ is the vector of synaptic weights. Sigmoidal or Gaussian functions may be used:

Sigmoidal neuron: $y = \dfrac{1}{1 + e^{-w^T x / a}}$

Gaussian neuron: $y = e^{-\|x - w\|^2 / (2a^2)}$
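A short sketch of these two neuron types in Python (the parameter $a$ and the example vectors are assumed values):

```python
import numpy as np

def sigmoidal_neuron(x, w, a=1.0):
    """y = 1 / (1 + exp(-w^T x / a))"""
    return 1.0 / (1.0 + np.exp(-np.dot(w, x) / a))

def gaussian_neuron(x, w, a=1.0):
    """y = exp(-||x - w||^2 / (2 a^2))"""
    return np.exp(-np.linalg.norm(x - w) ** 2 / (2.0 * a ** 2))

x = np.array([0.2, 0.8])
w = np.array([0.5, -0.3])
print(sigmoidal_neuron(x, w))  # ~0.47
print(gaussian_neuron(x, w))   # decays as x moves away from w
```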

Software or Hardware?

Although ANN can be implemented as fast hardware devices, much research has been performed on conventional computers running software simulations. Software simulations provide a comparatively cheap and flexible environment in which to research ideas, and their performance is adequate for many real-world applications. For example, an ANN software package might be used to develop a system for credit scoring of individuals who apply for a bank loan.

Historical Notes

Ramon y Cajal in 1911 introduced the idea of neurons as structural constituents of the brain. The origins of ANN go back to the 1940s, when McCulloch and Pitts published the first mathematical model of a biological neuron. Research on ANN then stalled for more than 20 years. In the mid-1980s a huge interest in ANN emerged, due to the publication of the book "Parallel Distributed Processing" by Rumelhart and McClelland. ANN made a great comeback in the 1990s, and they are now widely accepted as a tool in the development of Intelligent Systems.

Architecture of Artificial Neural Networks

[Figure: architectural graph of a multilayer feedforward ANN with an Input Layer, a single Hidden Layer, and an Output Layer]

The way that the artificial neurons are linked together to compose an ANN may vary according to its architecture. The above architectural graph illustrates the layout of a Multilayer Feedforward ANN (data flows in one direction only) in the case of a single Hidden Layer. The Hidden Layer is where the processing takes place.

Supervised Learning

Learning = learning by adaptation.

For example: animals learn that green fruits are sour and the yellowish/reddish ones are sweet. The learning happens by adapting the fruit-picking behaviour.

Learning can be perceived as an optimisation process. When an ANN is in its SUPERVISED training or learning phase, there are three factors to be considered:

• The inputs applied are chosen from a training set, where the desired response of the system to these inputs is known.

• The actual output produced when an input pattern is applied is compared to the desired output, and an error is estimated.

• In ANN the learning occurs by changing the synaptic strengths (changing the weights), eliminating some synapses, and building new ones.

PERCEPTRON

The Perceptron is one of the earliest ANN; it is built around a nonlinear neuron, namely the McCulloch-Pitts neuron model. It produces an output equal to +1 if the hard limiter input is positive, and -1 if it is negative.

Learning with a Perceptron

Perceptron output: $y_{out} = w^T x$

Input data: $(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)$

Error: $E(t) = (y_t - y_{out})^2 = (y_t - w(t)^T x_t)^2$

Learning (weight adjustment) by gradient descent:

$w_i(t+1) = w_i(t) - c \cdot \dfrac{\partial E(t)}{\partial w_i}$

$\dfrac{\partial E(t)}{\partial w_i} = \dfrac{\partial}{\partial w_i}\left(y_t - w(t)^T x_t\right)^2 = -2\left(y_t - w(t)^T x_t\right) x_{t,i}$

so, absorbing the constant factor 2 into $c$,

$w_i(t+1) = w_i(t) + c\left(y_t - w(t)^T x_t\right) x_{t,i}$

where $w(t)^T x_t = \sum_{j=1}^{m} w_j(t)\, x_{t,j}$

The synapse strength modification rules for artificial neural networks can be derived by applying mathematical optimisation methods.
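A minimal sketch of this weight-adjustment rule in Python; the toy training data (an AND-like problem with a constant bias input), the learning rate c, and the epoch count are assumptions for illustration:

```python
import numpy as np

def train_perceptron(X, y, c=0.1, epochs=20):
    """Apply w_i(t+1) = w_i(t) + c * (y_t - w(t)^T x_t) * x_t,i repeatedly."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_t, y_t in zip(X, y):
            error = y_t - np.dot(w, x_t)  # y_t - w(t)^T x_t
            w += c * error * x_t          # gradient-descent update
    return w

# Assumed toy data: first column is a constant bias input, targets are +/-1.
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
y = np.array([-1.0, -1.0, -1.0, 1.0])
w = train_perceptron(X, y)
print(np.sign(X @ w))  # hard-limited outputs: [-1. -1. -1.  1.]
```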

Learning with MLP ANN

MLP (Multi-Layer Perceptron) ANN with $p$ layers:

$x \;\rightarrow\; 1 \;\rightarrow\; 2 \;\rightarrow\; \dots \;\rightarrow\; p-1 \;\rightarrow\; p \;\rightarrow\; y_{out}$

$y^1 = (y_1^1, \dots, y_{M_1}^1), \quad y_k^1 = \dfrac{1}{1 + e^{-w_k^{1T} x - a_k^1}}, \quad k = 1, \dots, M_1$

$y^2 = (y_1^2, \dots, y_{M_2}^2), \quad y_k^2 = \dfrac{1}{1 + e^{-w_k^{2T} y^1 - a_k^2}}, \quad k = 1, \dots, M_2$

$\vdots$

$y_{out} = F(x; W) = w^{pT} y^{p-1}$

Data: $(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)$

Error: $E(t) = (y_t - y_{out})^2 = (y_t - F(x_t; W))^2$

Direct calculation of the weight changes is too complicated; this is what the backpropagation algorithm, presented next, addresses.
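A sketch of the forward pass of such an MLP in Python, for the simplest case of one sigmoidal hidden layer and a linear output neuron (the sizes and weights below are assumed toy values):

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def mlp_forward(x, W1, a1, w2):
    """One hidden sigmoidal layer followed by a linear output neuron."""
    y1 = sigmoid(W1 @ x + a1)  # y^1_k = 1 / (1 + exp(-w^1T_k x - a^1_k))
    return np.dot(w2, y1)      # y_out = w^pT y^(p-1)

rng = np.random.default_rng(0)
x = np.array([0.5, -0.2])
W1 = rng.normal(size=(3, 2))   # 3 hidden neurons, 2 inputs
a1 = rng.normal(size=3)        # hidden biases
w2 = rng.normal(size=3)        # output weights
print(mlp_forward(x, W1, a1, w2))
```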

Backpropagation Learning

It was developed by Werbos, but Rumelhart et al. in 1986 gave a new lease of life to ANN. The weight adaptation rule is known as Backpropagation.

• It defines two sweeps of the ANN. First it performs a forward sweep from the input layer to the output layer, after which the changes for the synaptic weights of the output neurons are calculated first;

• Then it performs a backward sweep from the output layer to the input layer. In this way it calculates the changes backwards, starting from layer p-1, and propagates the local error terms backwards;

• The backward sweep is similar to the forward one, except that error values are propagated back through the ANN to determine how the weights are to be changed during training. A sketch of both sweeps follows.
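The following is a minimal Python sketch of the two sweeps for a single-hidden-layer MLP with squared error $E = (y_t - y_{out})^2$ (an assumption-level illustration, not the exact formulation from the slides):

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def backprop_step(x, y_target, W1, a1, w2, c=0.1):
    """One forward sweep and one backward sweep for a 1-hidden-layer MLP."""
    # Forward sweep: compute every neuron's output.
    y1 = sigmoid(W1 @ x + a1)   # hidden-layer outputs
    y_out = np.dot(w2, y1)      # linear output neuron
    err = y_target - y_out      # output error

    # Backward sweep: output-layer gradient first, then the hidden layer.
    grad_w2 = -2.0 * err * y1                   # dE/dw2
    delta1 = -2.0 * err * w2 * y1 * (1.0 - y1)  # local error terms
    grad_W1 = np.outer(delta1, x)               # dE/dW1
    grad_a1 = delta1                            # dE/da1

    # Gradient-descent weight changes.
    w2 -= c * grad_w2
    W1 -= c * grad_W1
    a1 -= c * grad_a1
    return W1, a1, w2
```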

EXTDBD Learning Rule

The Extended Delta Bar Delta is a heuristic technique that has been used successfully in a wide range of applications; its main characteristic is that it uses a momentum term. More specifically, a term is added to the standard weight change which is proportional to the previous weight change. In this way good general trends are reinforced and oscillations are damped.
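A sketch of the momentum idea only (the full Extended Delta Bar Delta rule also adapts the learning rates heuristically; the coefficient mu below is an assumed value):

```python
def momentum_update(w, grad, prev_dw, c=0.1, mu=0.9):
    """Standard weight change plus a term proportional to the previous change."""
    dw = -c * grad + mu * prev_dw  # momentum reinforces consistent trends
    return w + dw, dw              # keep dw around for the next step
```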

EVALUATION INSTRUMENTS

The RMS Error: it adds up the squares of the errors for each PE (processing element) in the output layer, divides by the number of PEs in the output layer to obtain an average, and then takes the square root of that average, hence the name "root mean square".

Another instrument is the Common Mean Correlation (CMC) coefficient of the desired (d) and the actual (predicted) output (y) across the epoch. The CMC is calculated by

$\mathrm{CMC} = \dfrac{\sum_i (d_i - \bar{d})(y_i - \bar{y})}{\sqrt{\sum_i (d_i - \bar{d})^2 \sum_i (y_i - \bar{y})^2}}$

where $\bar{d} = \dfrac{1}{E}\sum_i d_i$ and $\bar{y} = \dfrac{1}{E}\sum_i y_i$.

It should be clarified that d stands for the desired values, y for the predicted values, i ranges from 1 to n (the number of cases in the training data set), and E is the epoch size, which is the number of sets of training data presented to the ANN between weight updates.
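A sketch of both instruments in Python (the sample arrays are illustrative):

```python
import numpy as np

def rms_error(d, y):
    """Square the error of each output PE, average, then take the root."""
    return np.sqrt(np.mean((d - y) ** 2))

def cmc(d, y):
    """Correlation of desired (d) and predicted (y) outputs across the epoch."""
    dc, yc = d - d.mean(), y - y.mean()
    return np.sum(dc * yc) / np.sqrt(np.sum(dc ** 2) * np.sum(yc ** 2))

d = np.array([0.1, 0.4, 0.8, 0.9])   # desired values
y = np.array([0.2, 0.35, 0.7, 1.0])  # predicted values
print(rms_error(d, y), cmc(d, y))
```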

TESTING AND OVERTRAINING

Over-training is a very serious problem!!!

Testing is the process that actually determines the strength of the ANN and its ability to generalize. The performance of an ANN is critically dependent on the training data, which must be representative of the task to learn (Callan, 1999). For this purpose, in the testing phase we choose randomly a long set of actual cases (records) that were not applied in the training phase.

New methods for learning with neural networks

Bayesian learning:

the distribution of the neural network parameters is learnt

Support vector learning:

the minimal representative subset of the available data is used to calculate the synaptic weights of the neurons

Tasks Performed by Artificial Neural Networks

The following tasks are usually performed by ANN:

• Controlling the movements of a robot based on self-perception and other information (e.g., visual information)

• Decision making

• Pattern Recognition (e.g. recognizing a visual object, a familiar face)

ANN tasks

• Control

• Classification

• Prediction

• Approximation

These can be reformulated, in general, as FUNCTION APPROXIMATION tasks.

With the term Approximation we mean: given a set of values of a function g(x), build a neural network that approximates the g(x) values for any input x.

Learning to approximate

Error measure: $E = \dfrac{1}{N}\sum_{t=1}^{N}\left(F(x_t; W) - y_t\right)^2$

Rule for changing the synaptic weights:

$\Delta w_i^j = -c \cdot \dfrac{\partial E}{\partial w_i^j}, \qquad w_i^j \in W$

$w_i^{j,\,new} = w_i^j + \Delta w_i^j$

where $c$ is the learning parameter (usually a constant).
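A sketch of this rule in Python using a numerical estimate of $\partial E / \partial w$ (the target function g and the linear form of the network are toy assumptions):

```python
import numpy as np

def approx_error(W, X, Y, F):
    """E = (1/N) * sum_t (F(x_t; W) - y_t)^2"""
    return np.mean([(F(x, W) - y) ** 2 for x, y in zip(X, Y)])

def update_weights(W, X, Y, F, c=0.05, h=1e-5):
    """Delta w = -c * dE/dw, with dE/dw estimated by finite differences."""
    grad = np.zeros_like(W)
    base = approx_error(W, X, Y, F)
    for i in range(len(W)):
        Wp = W.copy()
        Wp[i] += h
        grad[i] = (approx_error(Wp, X, Y, F) - base) / h
    return W - c * grad  # w_new = w + Delta w

# Toy example: approximate g(x) = 2x + 1 with F(x; W) = W[0]*x + W[1].
F = lambda x, W: W[0] * x + W[1]
X = np.linspace(-1, 1, 20)
Y = 2 * X + 1
W = np.zeros(2)
for _ in range(200):
    W = update_weights(W, X, Y, F)
print(W)  # approaches [2, 1]
```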

Summary

• Artificial neural networks are inspired by the learning processes that take place in biological systems.

• Artificial neurons and neural networks try to imitate the working mechanisms of their biological counterparts.

• Learning can be perceived as an optimisation process.

• Biological neural learning happens by the modification of the synaptic strength. Artificial neural networks learn in the same way.

• The synapse strength modification rules for artificial neural networks can be derived by applying mathematical optimisation methods.

Summary

• Learning tasks of artificial neural networks can be reformulated as function approximation tasks.

• Neural networks can be considered as nonlinear function approximating tools (i.e., linear combinations of nonlinear basis functions), where the parameters of the networks should be found by applying optimisation methods.

• The optimisation is done with respect to the approximation error measure.

• In general it is enough to have a single hidden-layer neural network (MLP, RBF or other) to learn the approximation of a nonlinear function. In such cases general optimisation can be applied to find the change rules for the synaptic weights.

DEVELOPING ANN

DETERMINING ANN’S TOPOLOGY