Neural Networks: Rosenblatt's Perceptron


Transcript of Neural Networks: Rosenblatt's Perceptron

Page 1: Neural Networks: Rosenblatt's Perceptron

CHAPTER 01

ROSENBLATT’S PERCEPTRON

CSC445: Neural Networks

Prof. Dr. Mostafa Gadal-Haqq M. Mostafa

Computer Science Department

Faculty of Computer & Information Sciences

AIN SHAMS UNIVERSITY

(Most of the figures in this presentation are copyrighted to Pearson Education, Inc.)

Page 2: Neural Networks: Rosenblatt's Perceptron


Rosenblatt's Perceptron

Introduction

The Perceptron

The Perceptron Convergence Theorem

Computer Experiment

The Batch Perceptron Algorithm

Page 3: Neural Networks: Rosenblatt's Perceptron


Introduction

The Perceptron:

is the simplest form of a neural network.

consists of a single neuron with adjustable synaptic weights and bias.

can be used to classify linearly separable patterns, i.e., patterns that lie on opposite sides of a hyperplane.

is limited to pattern classification with only two classes.

Page 4: Neural Networks: Rosenblatt's Perceptron


The Perceptron

Linearly and nonlinearly separable classes.

Figure 1.4 (a) A pair of linearly separable patterns. (b) A pair of nonlinearly separable patterns.

Page 5: Neural Networks: Rosenblatt's Perceptron


The Perceptron

A nonlinear neuron consisting of a linear combiner followed by a hard limiter (e.g., the signum activation function).

Weights are adapted using an error-correction rule.

Figure 1.3 Signal-flow graph of the perceptron.

The hard limiter produces the output

$$y = \varphi(v) = \begin{cases} +1 & \text{if } v > 0 \\ -1 & \text{if } v \le 0 \end{cases}$$

where the induced local field is

$$v = \sum_{i=0}^{m} w_i x_i = \sum_{i=1}^{m} w_i x_i + b \qquad (x_0 = +1,\; w_0 = b)$$
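As a concrete companion to these formulas, a minimal Python sketch of the linear combiner plus hard limiter (the weights, bias, and inputs are illustrative):

```python
import numpy as np

def perceptron_output(w, b, x):
    """Linear combiner followed by a signum hard limiter."""
    v = np.dot(w, x) + b          # induced local field: v = sum_i w_i * x_i + b
    return 1 if v > 0 else -1     # hard limiter: y = +1 if v > 0, else -1

# Illustrative neuron: weights [2, -1], bias -0.5
w, b = np.array([2.0, -1.0]), -0.5
print(perceptron_output(w, b, np.array([1.0, 0.5])))   # v = 1.0  -> +1
print(perceptron_output(w, b, np.array([0.0, 1.0])))   # v = -1.5 -> -1
```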

Page 6: Neural Networks: Rosenblatt's Perceptron


The Perceptron

The decision boundary

A hyperplane defined by

$$\sum_{i=1}^{m} w_i x_i + b = 0$$

For the perceptron to function properly, the two classes C1 and C2 must be linearly separable.

Figure 1.2 Illustration of the hyperplane (in this example, a straight line) as decision boundary for a two-dimensional, two-class pattern-classification problem.
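To make the geometry concrete, a small sketch with illustrative weights that tests on which side of such a hyperplane a point falls:

```python
import numpy as np

# Illustrative decision boundary in 2D: x1 + x2 - 1 = 0 (w = [1, 1], b = -1).
w, b = np.array([1.0, 1.0]), -1.0

for x in (np.array([2.0, 2.0]), np.array([0.0, 0.0])):
    v = np.dot(w, x) + b
    # Points with v > 0 are assigned to C1, points with v <= 0 to C2.
    print(x, "-> C1" if v > 0 else "-> C2")
```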

Page 7: Neural Networks: Rosenblatt's Perceptron


The Perceptron Convergence Algorithm

The fixed-increment convergence theorem for the perceptron (Rosenblatt, 1962):

Let the subsets of training vectors X1 and X2 be linearly separable. Let the inputs presented to the perceptron originate from these two subsets. The perceptron converges after some $n_o$ iterations, in the sense that

$$\mathbf{w}(n_o) = \mathbf{w}(n_o + 1) = \mathbf{w}(n_o + 2) = \cdots$$

is a solution vector for $n_o \le n_{\max}$.

Reading: the proof is given on pages 82-83 of Chapter 1 in Haykin.

Page 8: Neural Networks: Rosenblatt's Perceptron


The Perceptron Convergence Algorithm

To derive the error-correction learning algorithm, we write the input vector (with the bias absorbed as a fixed input $x_0 = +1$) and the weight vector as

$$\mathbf{x}(n) = [+1,\, x_1(n),\, x_2(n),\, \ldots,\, x_m(n)]^T$$

$$\mathbf{w}(n) = [b,\, w_1(n),\, w_2(n),\, \ldots,\, w_m(n)]^T$$

Then

$$v(n) = \sum_{i=0}^{m} w_i(n)\, x_i(n) = \mathbf{w}^T(n)\, \mathbf{x}(n)$$

The learning algorithm finds a weight vector $\mathbf{w}$ such that

$$\mathbf{w}^T \mathbf{x} > 0 \quad \text{for every input vector } \mathbf{x} \text{ belonging to class } C_1$$

$$\mathbf{w}^T \mathbf{x} \le 0 \quad \text{for every input vector } \mathbf{x} \text{ belonging to class } C_2$$
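A short sketch of this bias-absorbing convention (names and numbers are illustrative):

```python
import numpy as np

# Augment the input with a fixed x0 = +1 so the bias becomes weight w0 = b,
# and the induced local field reduces to a single inner product w^T x.
def augment(x):
    return np.concatenate(([1.0], x))

w = np.array([-0.5, 2.0, -1.0])        # [b, w1, w2]
x = augment(np.array([1.0, 0.5]))      # [+1, x1, x2]
print(np.dot(w, x))                    # 1.0, same as w1*x1 + w2*x2 + b
```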

Page 9: Neural Networks: Rosenblatt's Perceptron


The Perceptron Convergence Algorithm

The learning algorithm finds a weight vector $\mathbf{w}$ such that

$$\mathbf{w}^T \mathbf{x} > 0 \quad \text{for every input vector } \mathbf{x} \text{ belonging to class } C_1$$

$$\mathbf{w}^T \mathbf{x} \le 0 \quad \text{for every input vector } \mathbf{x} \text{ belonging to class } C_2$$

Page 10: Neural Networks: Rosenblatt's Perceptron


The Perceptron Convergence Algorithm

Given the subsets of training vectors X1 and X2, the training problem is to find a weight vector w such that the previous two inequalities are satisfied. This is achieved by updating the weights as follows:

$$\mathbf{w}(n+1) = \mathbf{w}(n) - \eta(n)\,\mathbf{x}(n) \quad \text{if } \mathbf{w}^T(n)\,\mathbf{x}(n) > 0 \text{ and } \mathbf{x}(n) \text{ belongs to } C_2$$

$$\mathbf{w}(n+1) = \mathbf{w}(n) + \eta(n)\,\mathbf{x}(n) \quad \text{if } \mathbf{w}^T(n)\,\mathbf{x}(n) \le 0 \text{ and } \mathbf{x}(n) \text{ belongs to } C_1$$

(If x(n) is correctly classified, the weight vector is left unchanged.)

The learning-rate parameter η(n) is a positive number that may vary from step to step. For fixed η, we have the fixed-increment learning rule.

The algorithm converges as long as η(n) is positive (given that the classes are linearly separable).
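The two rules can be folded into one signed correction, w <- w + η d(n) x(n) on misclassified samples, with d(n) = +1 for C1 and -1 for C2. A minimal sketch of the resulting fixed-increment algorithm on toy data (the helper name and the data are illustrative):

```python
import numpy as np

def train_perceptron(X, d, eta=1.0, max_epochs=100):
    """Fixed-increment perceptron learning on bias-augmented inputs.

    X: (N, m) input patterns; d: (N,) desired responses, +1 for C1, -1 for C2.
    """
    Xa = np.hstack([np.ones((len(X), 1)), X])   # prepend x0 = +1 (bias input)
    w = np.zeros(Xa.shape[1])                   # w0 plays the role of the bias b
    for _ in range(max_epochs):
        errors = 0
        for x, target in zip(Xa, d):
            y = 1 if np.dot(w, x) > 0 else -1   # hard-limiter output
            if y != target:                     # error-correction step:
                w += eta * target * x           #   +eta*x for C1, -eta*x for C2
                errors += 1
        if errors == 0:                         # an error-free epoch: converged
            break
    return w

# Toy linearly separable example (AND-like labels)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([-1, -1, -1, +1])
print(train_perceptron(X, d))
```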

Page 11: Neural Networks: Rosenblatt's Perceptron


The Perceptron Convergence Algorithm


Page 12: Neural Networks: Rosenblatt's Perceptron


The Perceptron Convergence Algorithm


Page 13: Neural Networks: Rosenblatt's Perceptron


Perceptron and Bayes Classifier

Bayes Classifier

Figure 1.6 Signal-flow graph of the Gaussian classifier.

Page 14: Neural Networks: Rosenblatt's Perceptron


Perceptron and Bayes Classifier

Bayes Classifier

Figure 1.7 Two overlapping, one-dimensional Gaussian distributions.


Page 15: Neural Networks: Rosenblatt's Perceptron


The Batch Perceptron Algorithm

We define the perceptron cost function as

$$J(\mathbf{w}) = \sum_{\mathbf{x}(n) \in H} \left( -\mathbf{w}^T \mathbf{x}(n)\, d(n) \right)$$

where H is the set of samples x misclassified by a perceptron using w as its weight vector, and d(n) is the desired response (+1 for class C1, -1 for class C2).

The cost function J(w) is differentiable with respect to the weight vector w. Differentiating J(w) with respect to w yields the gradient vector

$$\nabla J(\mathbf{w}) = \sum_{\mathbf{x}(n) \in H} \left( -\mathbf{x}(n)\, d(n) \right)$$

In the method of steepest descent, the adjustment to the weight vector w at each time step of the algorithm is applied in a direction opposite to the gradient vector ∇J(w).

Page 16: Neural Networks: Rosenblatt's Perceptron


The Batch Perceptron Algorithm

Accordingly, the algorithm takes the form

$$\mathbf{w}(n+1) = \mathbf{w}(n) + \eta(n) \sum_{\mathbf{x}(n) \in H} \mathbf{x}(n)\, d(n)$$

which embodies the batch perceptron algorithm for computing the weight vector w.

The algorithm is said to be of the "batch" kind because, at each time step, a batch of misclassified samples is used to compute the adjustment.
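A sketch of one such steepest-descent step, assuming bias-augmented inputs and labels d(n) in {+1, -1} as before; points that land exactly on the boundary are counted as misclassified here, a minor simplification:

```python
import numpy as np

def batch_perceptron_step(w, X, d, eta):
    """One batch update: w(n+1) = w(n) + eta * sum over H of d(n) * x(n)."""
    v = X @ w                                   # induced local fields for all samples
    H = np.sign(v) != d                         # misclassified set H (boundary counts as wrong)
    grad = -(d[H][:, None] * X[H]).sum(axis=0)  # gradient of J(w), summed over H
    return w - eta * grad                       # move opposite to the gradient
```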

Page 17: Neural Networks: Rosenblatt's Perceptron


Batch Learning

Presentation of all N examples in the training sample constitutes one epoch.

The cost function of learning is defined by the average error energy E_av (a sketch of the formula is given after the advantages below).

The weights are updated on an epoch-by-epoch basis.

Advantages:

Accurate estimation of the gradient vector.

Parallelization of the learning process.
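As an assumption about the intended formula (Haykin's average error energy, reconstructed from memory rather than from the slide), a LaTeX sketch:

```latex
% Average error energy over one epoch of N examples (assumed form):
% E(n) is the instantaneous error energy of example n, built from the
% error signals e_j(n) at the output neurons j.
\mathcal{E}_{\mathrm{av}}(N) = \frac{1}{N} \sum_{n=1}^{N} \mathcal{E}(n),
\qquad
\mathcal{E}(n) = \frac{1}{2} \sum_{j} e_j^{2}(n)
```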

Page 18: Neural Networks: Rosenblatt's Perceptron


Computer Experiment: Pattern Classification


Figure 1.8 The double-moon classification problem.
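A sketch of a generator for this data set, under the usual parameterization of the experiment (ring radius r = 10 and width w = 6 as in Haykin; the function and its defaults are assumptions, not the lecture's code):

```python
import numpy as np

def double_moon(n, r=10.0, w=6.0, d=1.0, seed=None):
    """Sample n points from each half-moon of the double-moon problem.

    Region A is the upper half-ring centered at the origin; region B is the
    lower half-ring, shifted right by r and down by d. Negative d pushes the
    two moons into each other, so they are no longer linearly separable.
    """
    rng = np.random.default_rng(seed)
    radius = rng.uniform(r - w / 2, r + w / 2, size=2 * n)
    theta = rng.uniform(0.0, np.pi, size=2 * n)
    x, y = radius * np.cos(theta), radius * np.sin(theta)
    region_a = np.column_stack([x[:n], y[:n]])            # class C1
    region_b = np.column_stack([x[n:] + r, -y[n:] - d])   # class C2
    return region_a, region_b

a, b = double_moon(1000, d=1.0, seed=0)   # the separable case of Figure 1.9
```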

Page 19: Neural Networks: Rosenblatt's Perceptron


Computer Experiment: Pattern Classification


Figure 1.9 Perceptron with the double-moon set at distance d = 1.

Page 20: Neural Networks: Rosenblatt's Perceptron


Computer Experiment: Pattern Classification


Figure 1.10 Perceptron with the double-moon set at distance d = -4.

Page 21: Neural Networks: Rosenblatt's Perceptron

Homework 1

• Problems: 1.1, 1.4, and 1.5

• Computer Experiment: 1.6

Page 22: Neural Networks: Rosenblatt's Perceptron

Next Time: Model Building Through Regression