Biological and Artificial Neuron
Dr.-Ing. Erwin Sitompul, President University
Lecture 2
Introduction to Neural Networks and Fuzzy Logic
http://zitompul.wordpress.com
Neural Networks: Learning Processes
Biological and Artificial Neuron

[Figure: a biological neuron (soma, dendrites, axon, synapse) shown next to an artificial neuron: inputs y1, y2, y3 are multiplied by weights w1, w2, w3 (which need to be determined), a bias b (which also needs to be determined) enters through a constant input 1, the weighted sum forms net, and the activation function f gives the output y = f(net).]
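As a minimal sketch of the artificial neuron in the figure (the slide does not specify the activation function f, so a logistic sigmoid is assumed here, and the numeric values are arbitrary placeholders):

```python
import math

def neuron(inputs, weights, bias):
    """Artificial neuron from the figure: net = weighted sum of the
    inputs plus the bias, output y = f(net)."""
    net = sum(w * y for w, y in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-net))  # assumed activation: logistic sigmoid

# The weights w1..w3 and the bias b still need to be determined by learning;
# the values below are arbitrary placeholders, not from the slides.
print(neuron([0.5, -1.0, 0.25], weights=[0.8, 0.2, -0.5], bias=0.1))
```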
Neural Networks: Learning Processes
Application of Neural Networks
• Function approximation and prediction
• Pattern recognition
• Signal processing
• Modeling and control
• Machine learning
Neural Networks: Learning Processes
Building a Neural Network
• Select structure: design the way that the neurons are interconnected.
• Select weights: decide the strengths with which the neurons are interconnected. Weights are selected to get a "good match" of network output to the output of a training set. A training set is a set of inputs and desired outputs.
• The weight selection is conducted by the use of a learning algorithm.
Neural Networks: Learning Processes
Learning Processes

[Diagram: Stage 1, network training: training data (input and output sets with adequate coverage) are presented to the artificial neural network; the learning process yields knowledge in the form of a set of optimized synaptic weights and biases. Stage 2, network validation: unseen data from the same range as the training data are fed to the trained artificial neural network, which produces output predictions during the implementation phase.]
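A minimal sketch of the two stages, assuming a single linear neuron as a stand-in for a full network (the data, learning rate, and epoch count below are made-up examples):

```python
# Stage 1: training adapts the weight and bias on the training data;
# Stage 2: the fixed network is checked on unseen data from the same range.
train_x, train_d = [0.0, 0.5, 1.0], [0.0, 1.0, 2.0]   # inputs and desired outputs
unseen_x = [0.25, 0.75]                                # same range as training data

w, b = 0.0, 0.0
for _ in range(100):                  # Stage 1: network training
    for x, d in zip(train_x, train_d):
        y = w * x + b                 # network output
        w += 0.1 * (d - y) * x        # knowledge ends up in the optimized
        b += 0.1 * (d - y)            # weight w and bias b

for x in unseen_x:                    # Stage 2: network validation
    print(x, w * x + b)               # output prediction on unseen data
```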
Neural Networks: Learning Processes
Learning Process
Learning is a process by which the free parameters of a neural network are adapted through a process of stimulation by the environment in which the network is embedded.
In most cases, due to the complex shape of the optimization surface, the optimized weights and biases are obtained as the result of a number of learning iterations.

[Diagram: the ANN maps the input x through the parameters [w,b] to the output y. Initialization, iteration (0): x → [w,b]_0 → y(0); iteration (1): x → [w,b]_1 → y(1); ...; iteration (n): x → [w,b]_n → y(n) ≈ d, where d is the desired output.]
Neural Networks: Learning Processes
Learning Rules
• Error-correction learning: delta rule or Widrow-Hoff rule
• Memory-based learning: nearest neighbor rule
• Hebbian learning: synchronous activation increases the synaptic strength, while asynchronous activation decreases the synaptic strength (see the sketch below)
• Competitive learning
• Boltzmann learning
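The slide only names the rules; as an illustration, here is a minimal sketch of one common formulation of Hebbian learning, Δw = η·x·y (the formula itself is an assumption, since the slide gives none):

```python
def hebbian_update(w, x, y, eta=0.1):
    """One common formulation of Hebb's rule: a weight grows when
    pre-synaptic input x and post-synaptic output y are active together
    (synchronous, same sign) and shrinks when their signs differ."""
    return [wj + eta * xj * y for wj, xj in zip(w, x)]

w = [0.0, 0.0]
w = hebbian_update(w, x=[1.0, -1.0], y=1.0)   # first weight grows, second shrinks
print(w)  # [0.1, -0.1]
```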
Neural Networks: Learning Processes
Error-Correction Learning

[Diagram: inputs x_1, x_2, ..., x_m enter through the synaptic weights w_{k1}(n), w_{k2}(n), ..., w_{km}(n), together with a bias b_k(n) applied to a constant input 1. The activation function f(.) produces the output y_k(n), which is subtracted from the desired output d_k(n) to form the error signal e_k(n). The error signal drives the learning rule that adjusts the weights.]
Neural Networks: Learning Processes
Delta Rule (Widrow-Hoff Rule)
Minimization of a cost function (or performance index):

$E(n) = \frac{1}{2} e_k^2(n)$
Neural Networks: Learning Processes
Delta Rule (Widrow-Hoff Rule)
Least Mean Squares rule:
1. Initialize $n = 0$ and $w_{kj}(0) = 0$.
2. Compute the output: $y_k(n) = \sum_j w_{kj}(n)\, x_j(n)$.
3. Update the weights: $w_{kj}(n+1) = w_{kj}(n) + \eta\, [d_k(n) - y_k(n)]\, x_j(n)$.
4. Set $n = n + 1$ and repeat from step 2.
$\eta$: learning rate, in the range $[0, 1]$.
The update is a steepest-descent step on the cost function, since it can be written as $w_{kj}(n+1) = w_{kj}(n) - \eta\, \frac{\partial E(n)}{\partial w_{kj}}$.
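A minimal sketch of this algorithm in Python (the training data, learning rate, and number of epochs are made-up examples, not from the slides):

```python
import numpy as np

def delta_rule(X, d, eta=0.1, epochs=50):
    """Delta (Widrow-Hoff) rule for a single linear output unit:
    w(0) = 0;  y(n) = w(n).x(n);  w(n+1) = w(n) + eta*(d(n) - y(n))*x(n)."""
    w = np.zeros(X.shape[1])              # w_kj(0) = 0
    for _ in range(epochs):
        for x, target in zip(X, d):
            y = w @ x                     # y_k(n) = sum_j w_kj(n) x_j(n)
            w += eta * (target - y) * x   # error-correction update
    return w

# Tiny example: learn d = 2*x1 - x2 from four samples.
X = np.array([[1., 0.], [0., 1.], [1., 1.], [2., 1.]])
d = np.array([2., -1., 1., 3.])
print(delta_rule(X, d))   # approaches [2, -1]
```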
Neural Networks: Learning Processes
Learning Paradigm

[Diagram: two paradigms. Supervised learning: a teacher (expert) observes the environment (data) and supplies the desired output, from which the ANN's actual output is subtracted to form the error that drives adaptation. Unsupervised learning: the ANN adapts from the environment (data) alone, guided by a cost function; in delayed-reinforcement learning, the feedback from the environment arrives only through a delay.]
Neural Networks: Single Layer Perceptrons
Single Layer Perceptrons
A single-layer perceptron network is a network with all the inputs connected directly to the output(s).
• Each output unit is independent of the others.
• The analysis can therefore be limited to a single-output perceptron.
Neural Networks: Single Layer Perceptrons
Derivation of a Learning Rule for Perceptrons

[Figure: the error surface E(w) plotted over the weight space (w1, w2).]

Key idea: learning is performed by adjusting the weights in order to minimize the sum of squared errors on a training set.
• Weights are updated repeatedly (in each epoch/iteration).
• The sum of squared errors is a classical error measure (e.g., commonly used in linear regression).
• Learning can be viewed as an optimization search problem in weight space.
Neural Networks: Single Layer Perceptrons
Derivation of a Learning Rule for Perceptrons
• The learning rule performs a search within the solution's vector space towards a global minimum.
• The error surface itself is a hyper-paraboloid, but it is seldom as smooth as depicted in the figure above.
• In most problems, the solution space is quite irregular, with numerous pits and hills that may cause the network to settle down in a local minimum (not the best overall solution).
• Epochs are repeated until a stopping criterion is reached (error magnitude, number of iterations, change of weights, etc.).
Neural Networks: Single Layer Perceptrons
Derivation of a Learning Rule for Perceptrons
Adaline (Adaptive Linear Element), Widrow [1962]:

[Diagram: inputs $x_1, x_2, \ldots, x_m$ are weighted by $w_{k1}, w_{k2}, \ldots, w_{km}$ and summed to give the linear output $y_k = \mathbf{w}_k^T \mathbf{x}$.]

Goal: $y_k = \mathbf{w}_k^T \mathbf{x} = d_k$
Neural Networks: Single Layer Perceptrons
Least Mean Squares (LMS)
The following cost function (error function), summed over the $p$ training patterns, should be minimized:

$E(\mathbf{w}_k) = \frac{1}{2} \sum_{i=1}^{p} \big( d_k(i) - y_k(i) \big)^2 = \frac{1}{2} \sum_{i=1}^{p} \big( d_k(i) - \mathbf{w}_k^T \mathbf{x}(i) \big)^2 = \frac{1}{2} \sum_{i=1}^{p} \Big( d_k(i) - \sum_{j=1}^{m} w_{kj}\, x_j(i) \Big)^2$
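A minimal sketch of this cost function in Python (the data and weight vectors are made-up examples):

```python
import numpy as np

def lms_cost(w, X, d):
    """Sum-of-squared-errors cost E(w) = 1/2 * sum_i (d(i) - w.x(i))^2,
    the quantity the LMS rule seeks to minimize."""
    errors = d - X @ w
    return 0.5 * np.sum(errors ** 2)

X = np.array([[1., 0.], [0., 1.], [1., 1.]])
d = np.array([2., -1., 1.])
print(lms_cost(np.array([2., -1.]), X, d))  # 0.0 at the exact solution
print(lms_cost(np.zeros(2), X, d))          # 3.0 at the zero initialization
```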
Neural Networks: Single Layer Perceptrons
Least Mean Squares (LMS)
Letting $f(\mathbf{w}_k) = f(w_{k1}, w_{k2}, \ldots, w_{km})$ be a function over $\mathbb{R}^m$, then

$df = \frac{\partial f}{\partial w_{k1}}\, dw_{k1} + \frac{\partial f}{\partial w_{k2}}\, dw_{k2} + \cdots + \frac{\partial f}{\partial w_{km}}\, dw_{km}$

Defining

$\nabla f = \Big[ \frac{\partial f}{\partial w_{k1}}, \frac{\partial f}{\partial w_{k2}}, \ldots, \frac{\partial f}{\partial w_{km}} \Big]^T$ and $d\mathbf{w}_k = [dw_{k1}, dw_{k2}, \ldots, dw_{km}]^T$,

we obtain $df = (\nabla f)^T d\mathbf{w}_k$.
Neural Networks: Single Layer Perceptrons
Gradient Operator
With $df = (\nabla f)^T \Delta\mathbf{w}$:
• $df$ positive: going uphill
• $df$ zero: staying on a plain
• $df$ negative: going downhill
To minimize $f$, we choose $\Delta\mathbf{w} = -\eta\, \nabla f$, so that $df = (\nabla f)^T \Delta\mathbf{w} = -\eta\, \|\nabla f\|^2 \le 0$ and every step goes downhill.
Neural Networks: Single Layer Perceptrons
Adaline Learning Rule
With

$E(\mathbf{w}_k) = \frac{1}{2} \sum_{i=1}^{p} \Big( d_k(i) - \sum_{j=1}^{m} w_{kj}\, x_j(i) \Big)^2,$

then

$\nabla E(\mathbf{w}_k) = \Big[ \frac{\partial E(\mathbf{w}_k)}{\partial w_{k1}}, \frac{\partial E(\mathbf{w}_k)}{\partial w_{k2}}, \ldots, \frac{\partial E(\mathbf{w}_k)}{\partial w_{km}} \Big]^T$

As already obtained before, the weight modification rule is $\Delta\mathbf{w}_k = -\eta\, \nabla E(\mathbf{w}_k)$.

The partial derivatives are

$\frac{\partial E(\mathbf{w}_k)}{\partial w_{kj}} = -\sum_{i=1}^{p} e_k(i)\, x_j(i)$

Defining $e_k(i) = d_k(i) - y_k(i)$, we can write the resulting weight update in one of two modes.
Neural Networks: Single Layer Perceptrons
Adaline Learning Modes
Batch learning mode (one update after all $p$ patterns have been presented):

$\Delta w_{kj} = \eta \sum_{i=1}^{p} e_k(i)\, x_j(i)$

Incremental learning mode (an update after each pattern), as sketched below:

$\Delta w_{kj} = \eta\, e_k\, x_j$
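A minimal sketch of both modes in Python (the data and learning rate are made-up examples):

```python
import numpy as np

def batch_update(w, X, d, eta):
    """Batch mode: accumulate the error over all p patterns first,
    then apply one update  dw_j = eta * sum_i e(i) * x_j(i)."""
    e = d - X @ w
    return w + eta * (X.T @ e)

def incremental_update(w, X, d, eta):
    """Incremental mode: update immediately after each pattern,
    dw_j = eta * e * x_j."""
    for x, target in zip(X, d):
        w = w + eta * (target - w @ x) * x
    return w

X = np.array([[1., 0.], [0., 1.], [1., 1.]])
d = np.array([2., -1., 1.])
w = np.zeros(2)
print(batch_update(w.copy(), X, d, eta=0.1))        # one accumulated step
print(incremental_update(w.copy(), X, d, eta=0.1))  # three immediate steps
```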
Neural Networks: Single Layer Perceptrons
Adaline Learning Rule
The incremental rule goes by several names: the δ-learning rule, the LMS algorithm, and the Widrow-Hoff learning rule. In vector form:

$\hat{\mathbf{w}}(n+1) = \hat{\mathbf{w}}(n) + \eta\, \mathbf{x}(n) \big[ d(n) - \mathbf{x}^T(n)\, \hat{\mathbf{w}}(n) \big]$
$\hat{\mathbf{w}}(n+1) = \hat{\mathbf{w}}(n) + \eta\, \mathbf{x}(n) \big( d(n) - y(n) \big)$
$\hat{\mathbf{w}}(n+1) = \hat{\mathbf{w}}(n) + \eta\, \mathbf{x}(n)\, e(n)$
Neural Networks: Single Layer Perceptrons
Generalization and Early Stopping
• With proper training, a neural network may produce reasonable outputs for inputs not seen during training; this ability is called generalization.
• Generalization is particularly useful for the analysis of noisy data (e.g., time series).
• "Overtraining" will not improve the ability of a neural network to produce good output. On the contrary, the network will start to take the noise for real data and lose its generality.
Neural Networks: Single Layer Perceptrons
Generalization and Early Stopping

[Figure: overfitting vs. generalization. Modeling error is plotted against the number of iterations in the optimization. The error on the learning data set keeps decreasing, while the error on the validation data set reaches a minimum and then rises again; the early stopping area lies around that minimum.]
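A minimal sketch of the early-stopping bookkeeping in Python, assuming an incremental LMS learner on made-up noisy data (the single-weight model here is too simple to truly overfit, so this only illustrates tracking the validation minimum):

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up example: noisy samples of a line, split into learning and validation sets.
x = rng.uniform(-1, 1, 40)
d = 2 * x + 0.3 * rng.normal(size=40)           # desired outputs with noise
xl, dl, xv, dv = x[:20], d[:20], x[20:], d[20:]

w, eta = 0.0, 0.05
best_err, best_w, best_epoch = np.inf, w, 0
for epoch in range(200):
    for xi, di in zip(xl, dl):                   # one epoch on the learning data set
        w += eta * (di - w * xi) * xi            # incremental LMS update
    val_err = 0.5 * np.sum((dv - w * xv) ** 2)   # error on the validation data set
    if val_err < best_err:                       # still inside the early stopping area
        best_err, best_w, best_epoch = val_err, w, epoch

print(best_epoch, best_w)   # keep the weights from the validation minimum
```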
Neural Networks: Single Layer Perceptrons
Homework 1
Given the function $y = 4x^2$, you are required to find the value of $x$ that will result in $y = 2$ by using the Least Mean Squares method.
• Use the initial estimate $x_0 = 1$ and the learning rate $\eta = 0.01$.
• Write down the results of the first 10 epochs/iterations.
• Give a conclusion about your result.
Note: Calculation can be done manually or using Matlab.
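If done in code rather than by hand, here is a minimal Python sketch of one possible setup, assuming the steepest-descent update $x \leftarrow x - \eta\, \partial E/\partial x$ on the cost $E = \frac{1}{2}(2 - 4x^2)^2$:

```python
# LMS-style update on x, with cost E = 1/2 * (d - y)^2 and y = 4x^2.
eta, d = 0.01, 2.0
x = 1.0                          # initial estimate x0 = 1
for n in range(1, 11):           # first 10 epochs/iterations
    y = 4 * x**2                 # current model output
    e = d - y                    # error d - y
    x = x + eta * e * 8 * x      # dE/dx = -e * dy/dx = -e * 8x
    print(n, x, y)
```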