MSEP2010 L10(Xor Problem)


Transcript of MSEP2010 L10(Xor Problem)

Page 1: MSEP2010 L10(Xor Problem)


Page 2: MSEP2010 L10(Xor Problem)


Page 3: MSEP2010 L10(Xor Problem)


Problems with Backpropagation

1. The presence of local minima in the error surface. Solutions get trapped in local peaks, and may never reach the lowest point if the multidimensional function has many local minima.

2. Backpropagation is an extremely slow process. Tens of thousands of learning trials are common even for simple problems.

3. The assumption behind backpropagation is that the network knows what it is supposed to do. In fact, this depends very much on the definition of error, on the choice of training set, and on the function of the network.


Page 4: MSEP2010 L10(Xor Problem)


4. If learning goes on too long, generalization often suffers.

5. Backpropagation is unbiological: there is no evidence of weight-error information running backwards in the brain.

6. Backpropagation suffers from what is known as catastrophic unlearning. After the network has learned a set of patterns, if a new pattern has to be learned, it may be necessary to undo a lot of connections and to change everything in order to accommodate the new information (which, of course, takes a very long time).

7. The biggest problem is neural network mysticism. A neural network may solve a practical problem, but it can be difficult to understand how it solved it. For many problems the hidden layer is not doing an obvious analysis. If you don't know what was done, it can be hard to improve it. There is a strong tendency to say, "Who cares? The network works." This approach is rarely the road to either progress or wisdom.

Page 5: MSEP2010 L10(Xor Problem)


XOR Problem

Haykin, page 176; Touretzky and Pomerleau, 1989

[Figure: a two-layer network solving XOR. Each input feeds both hidden neurons with weight 1; the top hidden neuron has bias -1.5 and the bottom hidden neuron has bias -0.5. The output neuron receives weight -2 from the top hidden neuron and weight 1 from the bottom hidden neuron, with bias -0.5.]

When the top hidden neuron is off and the bottom hidden neuron is on, which occurs when the input pattern is (0,1) or (1,0), the output neuron is switched on due to the excitatory effect of the positive weight connected to the bottom hidden neuron.
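As a quick check (a minimal sketch in plain MATLAB, not from the original slides; it assumes hard-threshold units with the weights and biases read off the figure):

step = @(v) double(v >= 0);          % hard-threshold activation
X = [0 0; 0 1; 1 0; 1 1];            % the four input patterns
for k = 1:size(X,1)
    x  = X(k,:)';
    h1 = step([1 1]*x - 1.5);        % top hidden neuron: on only for (1,1)
    h2 = step([1 1]*x - 0.5);        % bottom hidden neuron: on unless (0,0)
    y  = step(-2*h1 + h2 - 0.5);     % output: on when h2 is on and h1 is off
    fprintf('%d XOR %d = %d\n', x(1), x(2), y);
end

The loop prints the XOR truth table: 0, 1, 1, 0.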

Page 6: MSEP2010 L10(Xor Problem)


Feed-forward network mappings

Feed-forward neural networks provide a general framework for representing non-linear functional mappings between a set of input variables and a set of output variables. This is achieved by representing the non-linear function of many variables in terms of compositions of non-linear functions of a single variable, called activation functions.

Page 7: MSEP2010 L10(Xor Problem)


The units which are not treated as output units are called hidden units. In this network there are d inputs, M hidden units and c output units.

The output of the j-th hidden unit is obtained by first forming a weighted linear combination of the d input values, and adding a bias, to give

a_j = \sum_{i=1}^{d} w_{ji} x_i + w_{j0}

Here w_{ji} denotes a weight in the first layer, going from input i to hidden unit j, and w_{j0} denotes the bias for hidden unit j. Treating the bias as a weight from an extra input x_0 = 1, this becomes

a_j = \sum_{i=0}^{d} w_{ji} x_i

The activation of hidden unit j is then obtained by transforming the linear sum using an activation function g(.) to give

z_j = g(a_j)

Page 8: MSEP2010 L10(Xor Problem)


The outputs of the network are obtained by transforming the activations of the hidden units using a second layer of processing elements. Thus, for each output unit k, we construct a linear combination of the outputs of the hidden units of the form

a_k = \sum_{j=1}^{M} w_{kj} z_j + w_{k0}

Treating the bias as a weight from an extra hidden unit z_0 = 1, and transforming this linear combination using a non-linear activation function \tilde{g}(.), we obtain

y_k = \tilde{g}\left( \sum_{j=0}^{M} w_{kj} z_j \right)

An explicit expression for the complete function represented by the network is then

y_k = \tilde{g}\left( \sum_{j=0}^{M} w_{kj} g\left( \sum_{i=0}^{d} w_{ji} x_i \right) \right)
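A minimal sketch of this complete mapping in MATLAB (the sizes, random weights, and the choice of logistic g and identity g~ are illustrative assumptions, not values from the slides):

d = 2; M = 3; c = 1;                 % illustrative network sizes
W1 = randn(M, d+1);                  % first-layer weights; column 1 holds the biases w_j0
W2 = randn(c, M+1);                  % second-layer weights; column 1 holds the biases w_k0
g  = @(a) 1./(1 + exp(-a));          % hidden activation g(.), here logistic
gt = @(a) a;                         % output activation g~(.), here identity
x  = randn(d, 1);                    % an arbitrary input vector
z  = g(W1 * [1; x]);                 % z_j = g(sum_i w_ji x_i), with x_0 = 1
y  = gt(W2 * [1; z])                 % y_k = g~(sum_j w_kj z_j), with z_0 = 1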

Page 9: MSEP2010 L10(Xor Problem)


When the inputs are binary: we can easily show that a two-layer network of the form shown in the earlier figure can generate any Boolean function, provided the number M of hidden units is sufficiently large (McCulloch and Pitts, 1943). Intuitively, one hidden threshold unit can be dedicated to each input pattern that should output 1, with the output unit acting as an OR over the hidden units, as sketched below.

When the inputs are continuous: see Lippmann (1987; in the file section of the group, Reading Assignment).
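A minimal sketch of that construction in MATLAB (the parity target and every name here are illustrative, not from the slides):

d  = 3;                                    % number of binary inputs
X  = double(dec2bin(0:2^d-1) - '0')';      % all 2^d input patterns, one per column
t  = mod(sum(X,1), 2);                     % target Boolean function: 3-input parity
on = find(t == 1);                         % input patterns that must output 1
W  = 2*X(:,on)' - 1;                       % one hidden unit per such pattern (+1/-1 weights)
b  = -(sum(X(:,on),1)' - 0.5);             % biases so each unit fires only on its own pattern
H  = double(W*X + b*ones(1,2^d) > 0);      % hidden layer: one active unit per positive pattern
y  = double(sum(H,1) > 0.5);               % output unit ORs the hidden units
isequal(y, t)                              % prints 1: the Boolean function is reproduced

Parity on d inputs needs 2^(d-1) such hidden units, which illustrates why M must be allowed to grow with the function being represented.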

Page 10: MSEP2010 L10(Xor Problem)


Design a single-output two-layer network which classifies the shaded region in the figure from the other region.

[Figure: a shaded region whose boundary passes through the points (1,1), (1,3) and (3,2).]

Page 11: MSEP2010 L10(Xor Problem)


The equations of the decision boundaries are

h1: x1 - 1 = 0
h2: 0.5 x1 - x2 + 0.5 = 0
h3: -0.5 x1 - x2 + 3.5 = 0

So the hidden layer weights are:
For the first neuron, the weights are 1, 0 and the bias is -1.
For the second neuron, the weights are 0.5, -1 and the bias is 0.5.
For the third neuron, the weights are -0.5, -1 and the bias is 3.5.

Denote the 'on' state of hidden neuron 1, h1 = 1, by h1 > 0, and the 'off' state, h1 = 0, by h1 < 0. Now take the part numbered 1: here h1 = 0, h2 = 0 and h3 = 0. Similarly we can write the states for the other parts (see the table). We are interested in the parts where the output should be 1; for the remaining parts the output should be 0.

Page 12: MSEP2010 L10(Xor Problem)


[Figure: the three decision boundaries divide the plane into seven numbered parts (1-7). The sample points (0,4), (2,3), (2,2), (4,2), (2,0) and (0,0) are marked, and the half-planes h1 > 0 and h2 > 0 are indicated.]
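To read off each part's on/off pattern (an illustrative sketch, not in the original slides), the three boundary functions can be evaluated at the sample points marked in the figure:

P  = [0 4; 2 3; 2 2; 4 2; 2 0; 0 0]';     % the marked sample points, one per column
h1 =      P(1,:)          - 1;            % x1 - 1
h2 =  .5*P(1,:) - P(2,:) + .5;            % 0.5*x1 - x2 + 0.5
h3 = -.5*P(1,:) - P(2,:) + 3.5;           % -0.5*x1 - x2 + 3.5
H  = double([h1; h2; h3] > 0)             % on/off pattern for each point's part

Each column of H reproduces one of the rows of the table on the next page.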

Page 13: MSEP2010 L10(Xor Problem)


Writing out the on/off states of the hidden neurons for each part, together with the desired output o:

part   h1  h2  h3   o
 1      0   0   0   1
 2      0   0   1   0
 3      0   1   1   1
 4      1   1   1   0
 5      1   1   0   1
 6      1   0   0   0
 7      1   0   1   0

With output threshold θ (the output fires when w1 h1 + w2 h2 + w3 h3 - θ > 0), each row gives one constraint on the output weights:

1: -θ > 0
2: w3 - θ < 0
3: w2 + w3 - θ > 0
4: w1 + w2 + w3 - θ < 0
5: w1 + w2 - θ > 0
6: w1 - θ < 0
7: w1 + w3 - θ < 0

From the first constraint we know that θ is negative. From the second, w3 is more negative than θ, and so on… We took θ = -1, w3 = -2, w2 = 3 and w1 = -3.
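These values can be checked against all seven constraints at once (an illustrative verification, not in the original):

H = [0 0 0; 0 0 1; 0 1 1; 1 1 1; 1 1 0; 1 0 0; 1 0 1];  % hidden patterns, table rows 1..7
o = [1; 0; 1; 0; 1; 0; 0];                               % desired outputs
w = [-3 3 -2]; theta = -1;                               % the chosen values
all(double(H*w' - theta > 0) == o)                       % prints 1: every constraint holds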

In MATLAB (Neural Network Toolbox, legacy newff syntax):

net = newff([0 10;0 10],[3 1],{'hardlim' 'hardlim'});   % 3 hard-limit hidden units, 1 output
net.iw{1,1} = [1 0; .5 -1; -.5 -1];   % hidden-layer weights, one row per neuron
net.b{1} = [-1; .5; 3.5];             % hidden-layer biases
net.lw{2,1} = [-3 3 -2];              % output-layer weights [w1 w2 w3]
net.b{2} = [1];                       % output bias (= -theta)
p = [2; 0];                           % a test point (2,0)
y = sim(net,p)
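To exercise the design (an illustrative check, not in the original), several of the sample points can be simulated at once; with these weights, (0,4), (0,0) and (4,2) map to 1 and (2,2) to 0:

p = [0 0 4 2; 4 0 2 2];   % one test point per column
y = sim(net,p)            % expected output: 1 1 1 0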


Page 14: MSEP2010 L10(Xor Problem)


About the lab class


Page 15: MSEP2010 L10(Xor Problem)


Find a neural network to fit the data generated by the humps function on the interval [0, 2].

x = 0:.05:2;
y = humps(x);
plot(x,y)

[Figure: plot of humps(x) for x in [0, 2]; the y-axis runs from -20 to 100.]

Page 16: MSEP2010 L10(Xor Problem)


p = x; t = y;
net = newff(p,t,[2,1],{'logsig','purelin'});
net = train(net,p,t);
y = sim(net,p);

Plot the output.
Change the number of hidden neurons.
Change the learning rate and the number of epochs:

net.trainParam.lr = 0.05;
net.trainParam.epochs = 100;
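To compare the fit with the data, the targets and the network output can be plotted together (an illustrative addition; x, t and y are the variables defined above):

plot(x, t, '-', x, y, '--')        % targets vs. network output
legend('humps data', 'network fit')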

Page 17: MSEP2010 L10(Xor Problem)


Classification: load iris
Regression: load housing

Read about the housing problem in the following link:
http://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html

UC Irvine Machine Learning Repository:
http://archive.ics.uci.edu/ml/