
Introduction to Artificial Intelligence (G51IAI)

Dr Matthew Hyde

Neural Networks

More precisely: “Artificial Neural Networks”

Simulating, on a computer, what we understand about neural networks in the brain


Lecture Outline

Recap on perceptrons
Linear Separability
Learning / Training
The Neuron’s Activation Function


Recap from last lecture

A ‘Perceptron’ is a single-layer NN (one neuron)
Inputs can be any number
Weights on the edges
Output can only be 0 or 1

[Diagram: a perceptron with weighted inputs, threshold θ = 6, and an output Z of 0 or 1]

Truth Tables and Linear Separability


AND function and XOR function

AND:
X1 X2  Z
 1  1  1
 1  0  0
 0  1  0
 0  0  0

XOR:
X1 X2  Z
 1  1  0
 1  0  1
 0  1  1
 0  0  0

These are called “truth tables”


AND function and XOR function

The same truth tables, written with T (true) and F (false):

AND:
X1 X2  Z
 T  T  T
 T  F  F
 F  T  F
 F  F  F

XOR:
X1 X2  Z
 T  T  F
 T  F  T
 F  T  T
 F  F  F

These are called “truth tables”

Important!!!

You can represent any truth table graphically, as a diagram

The diagram is 2-dimensional if there are two inputs, and 3-dimensional if there are three inputs

Examples on the board in the lecture, and in the handouts
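For example (an illustration of mine, not from the handouts), each row of a truth table becomes a point whose coordinates are the input values, coloured by the output, following the black/white dot convention used in Handout 3:

```python
# A minimal sketch: turn the AND truth table into labelled 2-D points.
AND = {(1, 1): 1, (1, 0): 0, (0, 1): 0, (0, 0): 0}

for (x1, x2), z in AND.items():
    colour = "black" if z == 1 else "white"
    print(f"point ({x1}, {x2}) -> output {z} -> {colour} dot")
```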


[Diagram: the eight corners of a unit cube, from (0,0,0) to (1,1,1), plotted on X, Y and Z axes, one corner per row of the table]

X Y Z  Output
0 0 0  1
0 0 1  1
0 1 0  0
0 1 1  0
1 0 0  1
1 0 1  0
1 1 0  1
1 1 1  0

3 inputs means 3 dimensions


Linear Separability in 3-dimensions

Instead of a line, the dots are separated by a plane


[Diagram: the AND function’s points separated by a straight line]

• Functions which can be separated in this way are called linearly separable
• Only linearly separable functions can be represented by a perceptron

Minsky & Papert

[Diagrams: the four corners (0,0), (1,0), (0,1), (1,1) plotted for XOR and for AND. A single straight line separates AND’s 1s from its 0s; no line can do this for XOR.]

XOR:
X1 X2  Z
 1  1  0
 1  0  1
 0  1  1
 0  0  0

AND:
X1 X2  Z
 1  1  1
 1  0  0
 0  1  0
 0  0  0

Examples – Handout 3

Linear Separability: fill in the diagrams with the correct dots, black or white, for an output of 1 or 0


How to Train your Perceptron


Simple Networks

AND

[Diagram 1: inputs X and Y, each with weight 1, threshold θ = 1.5]

X Y  Z
1 1  1
1 0  0
0 1  0
0 0  0

[Diagram 2: inputs X and Y, each with weight 1, plus a constant input of -1 with weight 1.5, threshold θ = 0]

Both of these represent the AND function.

It is sometimes convenient to set the threshold to zero, and add a constant negative input
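As a quick check (my code, not the slides’), the two networks above agree on all four inputs:

```python
# Step activation: fire (output 1) when the weighted sum reaches the threshold.
def step(total, theta):
    return 1 if total >= theta else 0

def and_with_threshold(x, y):
    return step(1 * x + 1 * y, theta=1.5)             # weights 1 and 1, theta = 1.5

def and_with_bias_input(x, y):
    return step(1 * x + 1 * y + (-1) * 1.5, theta=0)  # constant -1 input, weight 1.5

for x, y in [(1, 1), (1, 0), (0, 1), (0, 0)]:
    assert and_with_threshold(x, y) == and_with_bias_input(x, y)
    print(x, y, "->", and_with_threshold(x, y))
```

The two are equivalent because x + y >= 1.5 holds exactly when x + y - 1.5 >= 0.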


Training a NN

[Diagram: the AND function’s points on the (0,0)–(1,1) square]

AND:
X1 X2  Z
 1  1  1
 1  0  0
 0  1  0
 0  0  0

Randomly Initialise the Network

We set the weights randomly, because we do not know what we want it to learn.

The weights can change to whatever value is necessary

It is normal to initialise them in the range [-1,1]

Randomly Initialise the Network

[Diagram: input X with weight 0.5, input Y with weight -0.4, constant input -1 with weight 0.3, threshold θ = 0]
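In code, this initialisation might look like the following (a sketch; the slide’s values 0.5, -0.4 and 0.3 are one possible draw):

```python
import random

# Randomly initialise the three weights (w_x, w_y, and the weight on the
# constant -1 input) in the range [-1, 1].
weights = [random.uniform(-1.0, 1.0) for _ in range(3)]
print(weights)   # e.g. [0.5, -0.4, 0.3] as in the slides
```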


Learning

While epoch produces an error
    Present network with next inputs (pattern) from epoch
    Err = T – O
    If Err <> 0 then
        Wj = Wj + LR * Ij * Err
    End If
End While

Get used to this notation!! Make sure that you can reproduce this pseudocode AND understand what all of the terms mean.
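For reference, here is the same rule as a runnable sketch in Python (my transcription, assuming the step activation with threshold 0 and the AND training set shown below):

```python
LR = 0.1                                  # learning rate
weights = [0.5, -0.4, 0.3]                # w_x, w_y, weight on the constant -1 input
# The epoch: four (inputs, target) pairs; the third input is the constant -1.
epoch = [((1, 1, -1), 1), ((1, 0, -1), 0), ((0, 1, -1), 0), ((0, 0, -1), 0)]

def output(inputs):
    weighted_sum = sum(w * i for w, i in zip(weights, inputs))
    return 1 if weighted_sum >= 0 else 0  # threshold theta = 0

error_in_epoch = True
while error_in_epoch:                     # While epoch produces an error
    error_in_epoch = False
    for inputs, target in epoch:          # Present network with next inputs
        err = target - output(inputs)     # Err = T - O
        if err != 0:                      # If Err <> 0 then
            error_in_epoch = True
            for j in range(len(weights)):
                weights[j] += LR * inputs[j] * err   # Wj = Wj + LR * Ij * Err

print("Trained weights:", weights)
```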

Epoch

The ‘epoch’ is the entire training set

The training set is the set of four input and output pairs

INPUT (X Y)   DESIRED OUTPUT (Z)
1 1           1
1 0           0
0 1           0
0 0           0

The learning algorithm

INPUT (X Y)   DESIRED OUTPUT (Z)
1 1           1
1 0           0
0 1           0
0 0           0

Input the first inputs from the training set into the Neural Network. What does the neural network output? Is it what we want it to output? If not, then we work out the error and change some weights.

First training step

Input 1, 1. Desired output is 1. Actual output is 0.

X Y  Z
1 1  1
1 0  0
0 1  0
0 0  0

[Diagram: inputs 1 and 1 with weights 0.5 and -0.4, constant input -1 with weight 0.3, threshold θ = 0]

(1 × 0.5) + (1 × -0.4) + (-1 × 0.3) = 0.5 - 0.4 - 0.3 = -0.2
-0.2 is below the threshold of 0, so the output is 0.
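The same forward pass in code (a sketch):

```python
x, y, bias = 1, 1, -1
weights = (0.5, -0.4, 0.3)
weighted_sum = weights[0] * x + weights[1] * y + weights[2] * bias
print(weighted_sum)                    # -0.2 (up to floating-point rounding)
print(1 if weighted_sum >= 0 else 0)   # 0: below the threshold of 0
```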

First training step

We wanted 1. We got 0. Error = 1 – 0 = 1.

X Y  Z
1 1  1
1 0  0
0 1  0
0 0  0

While epoch produces an error
    Present network with next inputs (pattern) from epoch
    Err = T – O
    If Err <> 0 then
        Wj = Wj + LR * Ij * Err
    End If
End While

If there IS an error, then we change ALL the weights in the network

If there is an error, change ALL the weights

Wj = Wj + ( LR * Ij * Err )

New Weight = Old Weight + (Learning Rate * Input Value * Error)

For the weight on the constant -1 input:
New Weight = 0.3 + (0.1 * -1 * 1) = 0.2

[Diagram: the weight on the constant -1 input changes from 0.3 to 0.2]

If there is an error, change ALL the weights

Wj = Wj + ( LR * Ij * Err )

For the weight on input X (value 1):
New Weight = 0.5 + (0.1 * 1 * 1) = 0.6

[Diagram: the weight on X changes from 0.5 to 0.6; the Y weight is still -0.4, and the weight on the constant -1 input is now 0.2]
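All three weight updates for this step in one place (my sketch; the update for the Y weight, -0.4 → -0.3, follows from the same rule):

```python
LR, err = 0.1, 1                       # learning rate and Err = T - O
old_weights = {"x": 0.5, "y": -0.4, "bias": 0.3}
inputs = {"x": 1, "y": 1, "bias": -1}  # the pattern (1, 1) plus the constant -1

new_weights = {name: old_weights[name] + LR * inputs[name] * err
               for name in old_weights}
print(new_weights)   # approximately {'x': 0.6, 'y': -0.3, 'bias': 0.2}
```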

Effects of the first change

The output was too low (it was 0, but we wanted 1). Every change pushes the weighted sum upwards: the weights on the positive inputs have increased (0.5 → 0.6 and -0.4 → -0.3), and the weight on the constant -1 input has decreased (0.3 → 0.2). The network is trying to ‘correct’ the output gradually.

[Diagram: the network before (weights 0.5, -0.4, 0.3) and after (weights 0.6, -0.3, 0.2) the first training step, threshold θ = 0]

Epoch not finished yet

The ‘epoch’ is the entire training set

We do the same for the other 3 input-output pairs

INPUT (X Y)   DESIRED OUTPUT (Z)
1 1           1
1 0           0
0 1           0
0 0           0

The epoch is now finished

Was there an error for any of the inputs?

If yes, then the network is not trained yet

We do the same for another epoch, from the first inputs again

The epoch is now finished

If there were no errors, then we have the network that we want

It has been trained.

While epoch produces an error
    Present network with next inputs (pattern) from epoch
    Err = T – O
    If Err <> 0 then
        Wj = Wj + LR * Ij * Err
    End If
End While

Effect of the learning rate

Set too high:
The network quickly gets near to what you want.
But, right at the end, it may ‘bounce around’ the correct weights.
It may go too far one way, and then when it tries to compensate it will go too far back the other way.

Wj = Wj + ( LR * Ij * Err )

Effect of the learning rate

Set too high: it may ‘bounce around’ the correct weights.

[Diagram: the AND points on the (0,0)–(1,1) square, with the separating line landing either side of its correct position]

Effect of the learning rate

Set too low:
The network slowly gets near to what you want.
It will eventually converge (for a linearly separable function), but that could take a long time.
When setting the learning rate, you have to strike a balance between speed and effectiveness.

Wj = Wj + LR * Ij * Err
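One way to see the trade-off empirically (a sketch under the same assumptions as the training loop above; exact epoch counts depend on the starting weights):

```python
# Count the epochs needed to train AND for several learning rates,
# assuming the step activation with threshold 0 and a constant -1 input.
def epochs_to_converge(lr, max_epochs=10000):
    w = [0.5, -0.4, 0.3]
    data = [((1, 1, -1), 1), ((1, 0, -1), 0), ((0, 1, -1), 0), ((0, 0, -1), 0)]
    for epoch in range(1, max_epochs + 1):
        had_error = False
        for inputs, target in data:
            out = 1 if sum(wj * ij for wj, ij in zip(w, inputs)) >= 0 else 0
            err = target - out
            if err != 0:
                had_error = True
                w = [wj + lr * ij * err for wj, ij in zip(w, inputs)]
        if not had_error:
            return epoch
    return None   # did not converge within max_epochs

for lr in (0.01, 0.1, 0.5):
    print(lr, epochs_to_converge(lr))   # smaller rates typically need more epochs here
```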

The Neuron’s Activation Function


Expanding the Model of the Neuron: Outputs other than ‘1’

[Diagram, the example from last lecture: inputs X1, X2, X3 feed neurons Y1 and Y2, which feed an output neuron Z; each neuron has its own weights and threshold θ]

Output is 1 or 0. It doesn’t matter how far over the threshold we are.


[Diagram: a neural network controlling a robot’s left wheel speed and right wheel speed]

The speed of the wheels is not just 0 or 1.


Expanding the Model of the Neuron: Outputs other than ‘1’

So far, the neurons have only output a value of 1 when they fire.

If the input sum is greater than the threshold the neuron outputs 1.

In fact, the neurons can output any value that you want.


Modelling a Neuron

aj : Input value (output from unit j)
wj,i : Weight on the link from unit j to unit i
ini : Weighted sum of inputs to unit i
ai : Activation value of unit i
g : Activation function

ini = Σj wj,i × aj
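As a sketch in code, with names mirroring the symbols above:

```python
# a_i = g(in_i), where in_i is the weighted sum of the incoming activations.
def unit_activation(a, w, g):
    in_i = sum(w_ji * a_j for w_ji, a_j in zip(w, a))   # in_i = sum_j w_ji * a_j
    return g(in_i)                                      # a_i = g(in_i)
```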


Activation Functions

Stept(x) = 1 if x >= t, else 0
Sign(x) = +1 if x >= 0, else -1
Sigmoid(x) = 1 / (1 + e^(-x))

aj : Input value (output from unit j)
ini : Weighted sum of inputs to unit i
ai : Activation value of unit i
g : Activation function
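These definitions transcribe directly into code (a sketch):

```python
import math

def step(x, t=0.0):
    return 1 if x >= t else 0          # Step_t(x)

def sign(x):
    return 1 if x >= 0 else -1         # Sign(x)

def sigmoid(x):
    return 1 / (1 + math.exp(-x))      # Sigmoid(x) = 1 / (1 + e^(-x))

# Unlike step and sign, sigmoid gives a smooth output between 0 and 1,
# so a neuron can output values other than just 0 and 1.
print(step(2, 1.5), sign(-0.2), round(sigmoid(0.5), 3))   # 1 -1 0.622
```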


Summary

Linear Separability
Learning Algorithm Pseudocode
Activation functions (threshold, sigmoid, etc.)