Neural Networks
Biological Neuron
[Figure: a biological neuron, showing the dendrites, cell body, axon, collaterals, and synapses, with the direction of signal flow from dendrites toward the synapses.]
Biological Neural Networks
[Figure: signal transmission at a synapse. An electrical signal travels along the axon to the synaptic gap, where vesicles at the presynaptic membrane release neurotransmitters; these cross the gap to the postsynaptic membrane of a dendrite, where the signal continues as an electrical signal.]
http://pharmacyebooks.com/2010/10/artifitial-neural-networks-hot-topic-pharmaceutical-research.html
[Figure: a biological network viewed as a black box, mapping input data in $(-\infty, +\infty)$ to output in $(-1, +1)$.]
The Perceptron
The perceptron was developed by Frank Rosenblatt in 1957. It is a simple feed-forward network that can solve (create a decision function for) linearly separable problems.
Inside the Perceptron
[Figure: inside the perceptron. Inputs $X_0, X_1, \ldots, X_{N-1}$ are multiplied by weights $w_0, w_1, \ldots, w_{N-1}$ and summed by the sigma-pi unit; a step function applied to the sum $S_j$ gives the perceptron output $O_j$.]

$$S_j = \sum_{i=0}^{N-1} w_i X_i \qquad O_j = \operatorname{step}(S_j)$$
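As a concrete illustration, here is a minimal C# sketch of this computation (the class and member names are illustrative, not part of the network code later in this deck):

// A minimal perceptron: weighted sum followed by a step function.
// Illustrative sketch only; names are hypothetical.
public class Perceptron
{
    public double[] w;                    // weights w_0 .. w_{N-1}

    public Perceptron(double[] weights) { w = weights; }

    // O_j = step(S_j), where S_j = sum_i w_i * X_i
    public int Output(double[] x)
    {
        double s = 0.0;
        for (int i = 0; i < w.Length; i++)
            s += w[i] * x[i];
        return s >= 0.0 ? 1 : 0;          // step function
    }
}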
When is a Problem Linearly Separable? RED vs BLUE

[Figure: two scatter plots of red and blue points. In the left panel a single straight line divides the two classes (linearly separable); in the right panel no straight line can divide them (not linearly separable).]

http://dynamicnotions.blogspot.com/2008/09/single-layer-perceptron.html
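A classic example with binary inputs: AND is linearly separable, since the line $x_1 + x_2 = 1.5$ puts the single true point $(1,1)$ on one side and the other three points on the other, so a perceptron with weights $(1,1)$ and threshold $1.5$ computes it. XOR is not: its true points $(0,1)$ and $(1,0)$ and its false points $(0,0)$ and $(1,1)$ cannot be divided by any single line, so no single perceptron can compute XOR.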
A Practical Application: Classification
Data from: UCI Machine Learning Repository - http://archive.ics.uci.edu/ml/
The Iris Data - This is one of the most famous datasets used to illustrate the classification problem. From four measurements of each flower (sepal length, sepal width, petal length, and petal width), the objective is to classify a sample of 150 irises into three species: versicolor, virginica, and setosa.
Source: R.A. Fisher, "The use of multiple measurements in taxonomic problems," Annals of Eugenics, 7(2), 179-188 (1936)
Iris Data - 3 classes, 50 samples each

Iris characteristics: sepal length, sepal width, petal length, petal width
Class encodings: Iris-setosa = 0.0, Iris-versicolor = 0.5, Iris-virginica = 1.0

5.1 3.5 1.4 0.2 Iris-setosa
4.9 3.0 1.4 0.2 Iris-setosa
4.7 3.2 1.3 0.2 Iris-setosa
4.6 3.1 1.5 0.2 Iris-setosa
5.0 3.6 1.4 0.2 Iris-setosa
 :
7.0 3.2 4.7 1.4 Iris-versicolor
6.4 3.2 4.5 1.5 Iris-versicolor
6.9 3.1 4.9 1.5 Iris-versicolor
5.5 2.3 4.0 1.3 Iris-versicolor
6.5 2.8 4.6 1.5 Iris-versicolor
 :
6.3 3.3 6.0 2.5 Iris-virginica
5.8 2.7 5.1 1.9 Iris-virginica
7.1 3.0 5.9 2.1 Iris-virginica
6.3 2.9 5.6 1.8 Iris-virginica
6.5 3.0 5.8 2.2 Iris-virginica
 :

Trained 4-2-1 network specification (input, hidden, and output layer sizes; learning rate, error limit, max runs; # training sets; input-to-hidden weights; hidden-to-output weights):

4 2 1
0.28 0.01 10000
30
0.1835137273718 -1.52185484488147 1.06085392071769 -10.1057086709985 -1.53328697751333 4.0131689222145 -1.63759087701708 10.741961194748
-6.01331593454728 6.66056158141261
Training a 4-2-1 Network for the Iris Data
One fifth of the Iris data was selected uniformly, 10 samples per class, for a total of 30 training set pairs. The 4-2-1 network comprises a total of 10 weights: 8 between the input and hidden layers, and 2 between the hidden layer and the output.
The outputs for the three classes were set to 0, 0.5 and 1.0
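Since the network has a single output, a test sample can be assigned to whichever class target is nearest. A hedged C# sketch of this decoding (not part of the deck's code):

// Map the network's single output back to a class by nearest target.
// Targets follow the encoding above: 0.0, 0.5, 1.0. Illustrative sketch.
public static string DecodeIris(double netOutput)
{
    double[] targets = { 0.0, 0.5, 1.0 };
    string[] names = { "Iris-setosa", "Iris-versicolor", "Iris-virginica" };
    int best = 0;
    for (int c = 1; c < targets.Length; c++)
        if (Math.Abs(netOutput - targets[c]) < Math.Abs(netOutput - targets[best]))
            best = c;
    return names[best];
}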
Classifier Performance (columns are the true class, rows the assigned class; classes 1-3 correspond to setosa, versicolor, virginica)

Sample Count:
      1    2    3
1    50    0    0
2     0   46    1
3     0    4   49

Performance Fraction:
      1     2     3
1   1.00  0.00  0.00
2   0.00  0.92  0.02
3   0.00  0.08  0.98
A Demonstration
Typical Feed-Forward Neural Network

[Figure: input data in $(-\infty, +\infty)$ is presented to the input layer, passes through the hidden layer, and emerges from the output layer as output data in $(-1, +1)$.]
Inside an Artificial Neuron
[Figure: inside an artificial neuron. Outputs from the previous layer $O_0, O_1, \ldots, O_{N-1}$ are multiplied by weights $w_0, w_1, \ldots, w_{N-1}$ and summed by the sigma-pi unit; a sigmoid function applied to the sum $S_j$ gives the neuron output $O_j$, which is distributed to the next layer.]

$$S_j = \sum_{i=0}^{N-1} w_i O_i \qquad O_j = f(S_j)$$
1. Initialize the network with small random weights.
2. Present an input pattern to the input layer of the network.
3. Feed the input pattern forward through the network to calculate its activation value.
4. Take the difference between desired output and the activation value to calculate the network's activation error.
5. Adjust the weights feeding the output neurons to reduce the activation error for this input pattern.
6. Propagate an error value back to each hidden neuron that is proportional to its contribution to the network activation error.
7. Adjust the weights feeding each hidden neuron to reduce its contribution to the error for this input pattern.
8. Repeat steps 2 to 7 for each input pattern in the training set ensemble.
9. Repeat step 8 until the network is suitably trained (a code sketch of this loop follows below).
Backward Error Propagation
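Mapped onto the method names used in the implementation later in this deck, one training epoch might look like the following sketch (the loop structure and the stopping test, including the TotalEnsembleError helper, are assumptions):

// One pass over the training set (steps 2-7), repeated until trained (steps 8-9).
// Sketch only; the calc* methods are defined later in this deck.
public void Train()
{
    for (int run = 0; run < maxnumruns; run++)           // step 9, bounded
    {
        for (int p = 0; p < npairs; p++)                 // step 8
        {
            calcInputLayer(p);        // step 2: present an input pattern
            calcHiddenLayer();        // step 3: feed forward...
            calcOutputLayer();        //         ...to the output layer
            calcOutputError(p, run);  // steps 4-5: output error, adjust h-o weights
            calcHiddenError(p, run);  // steps 6-7: hidden error, adjust i-h weights
        }
        if (TotalEnsembleError() < error) break;         // suitably trained?
    }
}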
Implementing a Neural Network
[Figure: neural network dimensions - m input layer nodes, n hidden layer nodes, p output layer nodes; t input training sets, each with m values; t output training sets, each with p values; m x n weights from the input to the hidden layer; n x p weights from the hidden to the output layer.]
public static double learn = 0.28;       // learning rate
public static double error = 0.01;       // error limit (stopping criterion)
public static int npairs = 0;            // # training pairs (t)
public static int maxnumruns = 10000;    // max training runs
public static int numinput = 1;          // input layer nodes (m)
public static int numhidden = 1;         // hidden layer nodes (n)
public static int numoutput = 1;         // output layer nodes (p)
public static double[,] inTrain;         // input training sets, indexed [node, pair]
public static double[,] outTrain;        // output training sets, indexed [node, pair]
public static neuron[] iLayer;           // input layer
public static neuron[] hLayer;           // hidden layer
public static neuron[] oLayer;           // output layer
public static weight[,] ihWeight;        // m x n input-to-hidden weights
public static weight[,] hoWeight;        // n x p hidden-to-output weights
public static int pxerr;
public static double Scalerr;
public static bool showtoterr = true;
public class neuron
{
    public double input;    // weighted sum of incoming signals (activation)
    public double output;   // f(input)
    public double error;    // back-propagated error term (delta)

    public neuron()
    {
        input = 0.0;
        output = 0.0;
        error = 0.0;
    }
}

public class weight
{
    public double wt;       // current weight value
    public double delta;    // weight change; not used in the code shown

    public weight(double wght)
    {
        wt = wght;
        delta = 0.0;
    }
}
Neural Network Data Structure & Components
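Step 1 of the training algorithm calls for small random initial weights. A hedged initialization sketch consistent with these data structures (the (-0.5, 0.5) range is an assumption):

// Build the layers and set every weight to a small random value.
// Sketch only; sizes come from the static fields above.
public void InitNetwork()
{
    Random rnd = new Random();
    iLayer = new neuron[numinput];
    hLayer = new neuron[numhidden];
    oLayer = new neuron[numoutput];
    for (int i = 0; i < numinput; i++) iLayer[i] = new neuron();
    for (int h = 0; h < numhidden; h++) hLayer[h] = new neuron();
    for (int o = 0; o < numoutput; o++) oLayer[o] = new neuron();

    ihWeight = new weight[numinput, numhidden];
    for (int i = 0; i < numinput; i++)
        for (int h = 0; h < numhidden; h++)
            ihWeight[i, h] = new weight(rnd.NextDouble() - 0.5);

    hoWeight = new weight[numhidden, numoutput];
    for (int h = 0; h < numhidden; h++)
        for (int o = 0; o < numoutput; o++)
            hoWeight[h, o] = new weight(rnd.NextDouble() - 0.5);
}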
Generalized Delta Rule

$$\Delta_p w_{ji} = \eta\, \delta_{pj}\, i_{pi}$$

$\Delta_p w_{ji}$ - correction to weight value
$\eta$ - learning rate
$\delta_{pj}$ - error in $j$th unit
$i_{pi}$ - $p$th training set input
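A quick numerical check, using the deck's learning rate (the other values are made up for illustration): with $\eta = 0.28$, an error term $\delta_{pj} = 0.1$, and an input $i_{pi} = 0.5$,

$$\Delta_p w_{ji} = 0.28 \times 0.1 \times 0.5 = 0.014,$$

so that weight is nudged by 0.014 in the direction that reduces the error.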
Quantifying Error for Back Propagation
Error for the $j$th unit in the output layer:

$$\delta_{pj} = f'(S_{pj})\,(t_{pj} - o_{pj})$$

where $t_{pj}$ is the $p$th training set output and $o_{pj} = f(S_{pj})$ is the neuron output function for the $p$th presentation.

Error for the $j$th unit in the hidden layer:

$$\delta_{pj} = f'(S_{pj}) \sum_k \delta_{pk}\, w_{kj}$$

[Figure: the output-layer error terms $\delta_{pk}$ flow back through the weights $w_{kj}$ to hidden unit $j$.]
The Sigmoid Function

$$f(x) = \frac{2}{1 + e^{-x}} - 1 \qquad f'(x) = \frac{1 - f(x)^2}{2}$$

[Figure: plots of the sigmoid and the derivative of the sigmoid.]
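The implementation later in this deck uses only the second (logistic) sigmoid shown next; if the $(-1, +1)$ variant above were wanted instead, it might look like this sketch (the names f2/df2 are hypothetical):

// Sigmoid with range (-1, +1) and its derivative.
// Hypothetical variant, not part of the deck's implementation.
public double f2(double x)
{
    return 2.0 / (1.0 + Math.Exp(-x)) - 1.0;
}

public double df2(double x)
{
    double y = f2(x);
    return (1.0 - y * y) / 2.0;
}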
Another Sigmoid Function

$$f(x) = \frac{1}{1 + e^{-x}} \qquad f'(x) = f(x)\,\bigl(1 - f(x)\bigr)$$

[Figure: plots of this sigmoid and the derivative of the sigmoid.]
// Copy the pth training input into the input layer.
public void calcInputLayer(int p)
{
    for (int i = 0; i < iLayer.Length; i++)
    {
        iLayer[i].output = inTrain[i, p];
    }
}

// Weighted sum of input-layer outputs, squashed by the sigmoid.
public void calcHiddenLayer()
{
    for (int h = 0; h < hLayer.Length; h++)
    {
        hLayer[h].input = 0.0;
        for (int i = 0; i < iLayer.Length; i++)
            hLayer[h].input += ihWeight[i, h].wt * iLayer[i].output;
        hLayer[h].output = f(hLayer[h].input);
    }
}

// Weighted sum of hidden-layer outputs, squashed by the sigmoid.
public void calcOutputLayer()
{
    for (int o = 0; o < oLayer.Length; o++)
    {
        oLayer[o].input = 0.0;
        for (int h = 0; h < hLayer.Length; h++)
            oLayer[o].input += hoWeight[h, o].wt * hLayer[h].output;
        oLayer[o].output = f(oLayer[o].input);
    }
}
Running the Neural Network
// The logistic sigmoid and its derivative.
public double f(double x)
{
    return 1.0 / (1.0 + Math.Exp(-x));
}

public double df(double x)
{
    return f(x) * (1.0 - f(x));
}
Running the network is a feed-forward process. Input data is presented to the input layer. The activation (input) is computed for each node of the hidden layer and then used to compute the output of the hidden layer nodes. Finally, the activation of each output-layer node is computed and used to compute the output of the network.
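To classify a fresh input (rather than a training pattern), the input values can be loaded directly into the input layer before the same two feed-forward calls; a hedged sketch:

// Feed one input vector forward and return the network outputs.
// Sketch only; bypasses inTrain by writing the input layer directly.
public double[] Run(double[] x)
{
    for (int i = 0; i < iLayer.Length; i++)
        iLayer[i].output = x[i];
    calcHiddenLayer();
    calcOutputLayer();

    double[] y = new double[oLayer.Length];
    for (int o = 0; o < oLayer.Length; o++)
        y[o] = oLayer[o].output;
    return y;
}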
// Compute output-layer errors and adjust the hidden-to-output weights.
public void calcOutputError(int p, int r)
{
    for (int o = 0; o < oLayer.Length; o++)
        oLayer[o].error = df(oLayer[o].input) * (outTrain[o, p] - oLayer[o].output);

    for (int h = 0; h < hLayer.Length; h++)
        for (int o = 0; o < oLayer.Length; o++)
            hoWeight[h, o].wt += learn * oLayer[o].error * hLayer[h].output;
}

// Distribute the output error back to the hidden layer and adjust
// the input-to-hidden weights.
public void calcHiddenError(int p, int r)
{
    for (int h = 0; h < hLayer.Length; h++)
    {
        double err = 0.0;     // accumulate over all output nodes
        for (int o = 0; o < oLayer.Length; o++)
            err += oLayer[o].error * hoWeight[h, o].wt;
        hLayer[h].error = df(hLayer[h].input) * err;
    }

    for (int i = 0; i < iLayer.Length; i++)
        for (int h = 0; h < hLayer.Length; h++)
            ihWeight[i, h].wt += learn * hLayer[h].error * iLayer[i].output;
}
Training the Network
In backward error propagation, the difference between the actual output and the goal (or target) output provided in the training set is used to compute the error in the network. This error is then used to compute the delta (change) in weight values for the weights between the hidden layer and the output layer.
These new weight values are then used to distribute the output error to the hidden layer nodes. These node errors are, in turn, used to compute the changes in value for the weights between the input layer and the hidden layer of the network.
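The stopping test in step 9 of the algorithm needs a scalar measure of how well trained the network is; one common choice is the total squared error over the whole training set ensemble (this helper is an assumption, not shown in the deck, though the deck's application displays this quantity during training):

// Total squared error summed over every training pair and output node.
// Assumed helper; sketch only.
public double TotalEnsembleError()
{
    double total = 0.0;
    for (int p = 0; p < npairs; p++)
    {
        calcInputLayer(p);
        calcHiddenLayer();
        calcOutputLayer();
        for (int o = 0; o < oLayer.Length; o++)
        {
            double e = outTrain[o, p] - oLayer[o].output;
            total += e * e;
        }
    }
    return total;
}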
1. Set the number of neurons in each layer.
2. Select the learning rate, error limit and max training runs.
3. Give the number of training pairs and include them in the left-hand text window, with input/output pairs listed sequentially:

input 1
output 1
input 2
output 2
 :
input n
output n
The total training set ensemble error is displayed during the training process. The training rate depends on the initial values of the random weights.

The user can monitor the rate of error correction in each weight during training via the weight's color (large delta vs. small delta). Small or zero changes in each weight do not necessarily mean that the network is trained; training could be hung in a local minimum.

When running the network, place input values in the text window and click run; the answer(s) appear on the next line(s).
How Many Nodes?
The number of input layer nodes matches the number of input values. The number of output layer nodes matches the number of output values.
But what about the hidden Layer?
Too few hidden layer nodes and the NN can't learn the patterns.
Too many hidden layer nodes and the NN doesn't generalize.
When Should We Use Neural Networks?
Neural Networks need lots of data (example solutions) for training.
The functional relationships of the problem/solution are not well understood.
The problem/solution is not applicable to a rule-based solution.
"Similar input data sets generate "similar" outputs.
Neural Networks perform general Pattern Recognition.
Neural Networks are particularly good as Decision Support tools.
Also good for modeling behavior of living systems.
Can a Neural Network do More than a Digital Computer?
Clearly a simulation of a Neural Network running on a digital computer cannot be more powerful than the computer on which it is being executed.
The question is, "Can a computational system such as a Neural Network be built that can do something that a digital computer cannot?"
A digital computer is the physical embodiment of a Turing Machine, which is defined as a universal computer of all computable functions.
An artificial Neural Network is loosely modeled on the human brain.
Rather than using a software simulation of neurons, we can build electronic circuits that closely mimic the activities of human brain cells.
Can we build a physical system of any kind (based on electronics, chemistry, etc...) that does everything a human brain can do?
Can you think of something human brains do that, so far, has not been accomplished or, at least, approximated by a computer or any other physical (man-made) system?
Consciousness
What is the Computational Power of Consciousness?
Since we can't quantify consciousness, it is not likely that we can determine the level of computational power necessary to manifest it.
However, we can establish a relative measure of computational power for systems that do and (so far) do not exhibit consciousness.
Human Mind/Brain
Turing Machine
Digital Computer
Neural Network
Physical System/Model
Relative Computational Power

[Figure: a diagram relating the Mind/Brain, Turing Machine, Physical Model, Digital Computer, and Neural Network, with the relationships between them labeled: the Revised Turing Test, Dualism vs. Materialism, Finite Storage and Finite Precision, Engineering and Technology, and Symbolism vs. Connectionism.]
Due to limitations of finite storage and the related issue of finite precision arithmetic, a Turing Machine can exhibit greater computational power than a digital computer.
Relative Computational Power

[Figure: the same diagram, now with a ">" marking the Turing Machine as computationally more powerful than the digital computer, per the argument above.]