CS 621 Artificial Intelligence, Lecture 23 - 10/10/05
Prof. Pushpak Bhattacharyya, IIT Bombay
Linear Separability, Introduction of Feedforward Network




Test for Linear Separability (LS)

• Theorem: A function is linearly separable iff the vectors corresponding to the function do not have a Positive Linear Combination (PLC).

• The absence of a PLC is both a necessary and a sufficient condition for linear separability.

• X1, X2, … , Xm - Vectors of the function

• Y1, Y2, … , Ym - Augmented negated set

• Prepending -1 to each vector Xi augments it; additionally negating the augmented vector when Xi is in the 0-class gives Yi.
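The construction of the augmented negated set can be sketched in a few lines of Python. This is a minimal illustration, assuming that every vector gets -1 prepended and that only 0-class vectors are negated afterwards (the reading that matches the 2-bit XNOR example in this lecture). The function name `augment_negate` and the 0/1 label encoding are hypothetical choices made for this sketch.

```python
def augment_negate(vectors, labels):
    """Build the augmented negated set {Yi} from vectors {Xi}.

    Assumption for this sketch: labels are 1 (1-class) or 0 (0-class).
    Every vector gets -1 prepended; 0-class vectors are then negated.
    """
    result = []
    for x, label in zip(vectors, labels):
        y = [-1] + list(x)           # augment with the threshold input -1
        if label == 0:
            y = [-v for v in y]      # negate the 0-class vectors
        result.append(y)
    return result


# 2-bit XNOR (even parity): 1-class for <0,0> and <1,1>
xnor_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
xnor_labels = [1, 0, 0, 1]
print(augment_negate(xnor_inputs, xnor_labels))
# [[-1, 0, 0], [1, 0, -1], [1, -1, 0], [-1, 1, 1]]
```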


Example (1) - XNOR

• The set {Yi} has a PLC if ∑ Pi Yi = 0 , 1 ≤ i ≤ m

– where each Pi is a non-negative scalar and

– at least one Pi > 0

• Example : 2 bit even-parity (X-NOR function)

Vector      Class   Augmented negated vector
X1 <0,0>    +       Y1 <-1,0,0>
X2 <0,1>    -       Y2 <1,0,-1>
X3 <1,0>    -       Y3 <1,-1,0>
X4 <1,1>    +       Y4 <-1,1,1>


Example (1) - XNOR (contd)

• P1 [ -1 0 0 ]T + P2 [ 1 0 -1 ]T + P3 [ 1 -1 0 ]T + P4 [ -1 1 1 ]T = [ 0 0 0 ]T

• All Pi = 1 gives the result.

• For the parity function, a PLC exists => not linearly separable.
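The claim that all Pi = 1 gives a PLC for XNOR is easy to check numerically. The sketch below only verifies a given set of coefficients; it does not search for one. The helper names `combine` and `is_plc` are hypothetical.

```python
def combine(coeffs, vectors):
    """Return sum_i coeffs[i] * vectors[i], computed component-wise."""
    dim = len(vectors[0])
    return [sum(p * v[k] for p, v in zip(coeffs, vectors)) for k in range(dim)]


def is_plc(coeffs, vectors):
    """Check the PLC conditions: all Pi >= 0, at least one Pi > 0, sum = 0."""
    non_negative = all(p >= 0 for p in coeffs)
    some_positive = any(p > 0 for p in coeffs)
    sums_to_zero = all(c == 0 for c in combine(coeffs, vectors))
    return non_negative and some_positive and sums_to_zero


# Augmented negated vectors Y1..Y4 of 2-bit XNOR, as listed in the example
Y = [[-1, 0, 0], [1, 0, -1], [1, -1, 0], [-1, 1, 1]]
print(combine([1, 1, 1, 1], Y))  # [0, 0, 0]
print(is_plc([1, 1, 1, 1], Y))   # True
```

The all-zero coefficient vector also sums to zero but is rejected by `is_plc`, because a PLC requires at least one strictly positive Pi.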


Example - AND

AND does not have a PLC. Suppose it does: then there exist non-negative P1, P2, P3, P4, at least one of them > 0, s.t.

∑ Pi XiT = 0 , 1 ≤ i ≤ 4


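AND (contd)

Here X1, … , X4 denote the augmented negated vectors of AND:

X1 = [1,0,0]    (from <0,0>, 0-class)
X2 = [1,0,-1]   (from <0,1>, 0-class)
X3 = [1,-1,0]   (from <1,0>, 0-class)
X4 = [-1,1,1]   (from <1,1>, 1-class)

P1[1,0,0]T + P2[1,0,-1]T + P3[1,-1,0]T + P4[-1,1,1]T = [0,0,0]T

P1 + P2 + P3 - P4 = 0    (1)
- P3 + P4 = 0            (2)
- P2 + P4 = 0            (3)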


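AND (contd)

From (2) and (3), P2 = P3 = P4. Substituting in (1) gives P1 + P4 = 0, which for non-negative Pi can be satisfied only if

P1 = P2 = P3 = P4 = 0

So, a PLC does not exist.

So, AND is computable by a perceptron.

However, finding a PLC is not efficient.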
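Since AND has no PLC, some weight vector w must satisfy w . Xi > 0 for every augmented negated vector. One way to find such a w is the classical perceptron training rule, which is not covered on these slides and is used here only as an illustration. The function name `train_perceptron` and the epoch cap are assumptions of this sketch.

```python
def train_perceptron(vectors, max_epochs=100):
    """Search for w with w . v > 0 for every augmented negated vector v.

    Perceptron rule: whenever w . v <= 0, add v to w.
    Returns w on success, or None if max_epochs is exhausted
    (an arbitrary cap chosen for this sketch).
    """
    w = [0] * len(vectors[0])
    for _ in range(max_epochs):
        updated = False
        for v in vectors:
            if sum(wi * vi for wi, vi in zip(w, v)) <= 0:
                w = [wi + vi for wi, vi in zip(w, v)]
                updated = True
        if not updated:
            return w
    return None


and_vectors = [[1, 0, 0], [1, 0, -1], [1, -1, 0], [-1, 1, 1]]
w = train_perceptron(and_vectors)
# w[0] plays the role of the threshold, w[1:] are the input weights
print(w)

xnor_vectors = [[-1, 0, 0], [1, 0, -1], [1, -1, 0], [-1, 1, 1]]
print(train_perceptron(xnor_vectors))  # None: XNOR has a PLC
```

A `None` result only means that no solution was found within the cap. For XNOR it agrees with the PLC argument, which shows that no such w exists at all.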


Exercise

Try to learn the SNNS package (available on CS621 homepage). Try PLC test for

1. Different boolean functions.
2. Majority function (1 if #1s > #0s)
3. Comparator function (1 if decimal(Y) > decimal(X))
4. Odd parity
5. IRIS data
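As a cross-check for the 2-input boolean functions, one can enumerate small weight and threshold values directly instead of running the PLC test. The sketch below is a brute-force search, not the PLC test itself. The weight range {-1, 0, 1} and the half-integer thresholds are assumptions chosen for this illustration.

```python
from itertools import product

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def threshold_table(w1, w2, theta):
    """Truth table of the unit that outputs 1 iff w1*x1 + w2*x2 > theta."""
    return tuple(1 if w1 * x1 + w2 * x2 > theta else 0 for x1, x2 in INPUTS)


def small_weight_threshold_functions():
    """Collect every truth table reachable with the small weights tried."""
    found = set()
    for w1, w2 in product((-1, 0, 1), repeat=2):
        for theta in (-1.5, -0.5, 0.5, 1.5):
            found.add(threshold_table(w1, w2, theta))
    return found


found = small_weight_threshold_functions()
XOR = (0, 1, 1, 0)
XNOR = (1, 0, 0, 1)
print(len(found), XOR in found, XNOR in found)  # 14 False False
```

The search reaches 14 of the 16 two-input functions. The two it misses are XOR and XNOR, which agrees with the PLC result for parity.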


Study of Linear Separability

• W. Xj = 0 defines a hyperplane in (n+1)-dimensional space.

=> The W vector and the Xj vectors on the hyperplane are perpendicular to each other.

[Figure: perceptron with threshold θ and weights w1, w2, w3, … , wn on its inputs]


Linear Separability

[Figure: points X1, X2, … , Xk (+) on one side of the separating hyperplane and points Xk+1, Xk+2, … , Xm (-) on the other]

Positive set : w. Xj > 0 , j ≤ k

Negative set : w. Xj < 0 , j > k


Linear Separability

• w. Xj = 0 => w is normal to the hyperplane which separates the +ve points from the –ve points.

• In this computing paradigm, computation means “placing hyperplanes”.

• Functions computable by the perceptron are called

– “threshold functions” because of comparing ∑ wiXi with θ (the threshold)

– “linearly separable” because of setting up linear surfaces to separate +ve and –ve points
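The two names describe the same test. Comparing ∑ wiXi with θ is equivalent to checking the sign of a dot product once -1 is prepended to the input and θ to the weights. A minimal sketch follows, with hypothetical function names and hand-picked AND weights (1, 1) with θ = 1.5 as an assumed example.

```python
def threshold_form(weights, theta, x):
    """Output 1 iff sum(wi * xi) > theta."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > theta else 0


def augmented_form(weights, theta, x):
    """Same decision written as w_aug . x_aug > 0.

    x_aug = (-1, x1, ..., xn) and w_aug = (theta, w1, ..., wn), so that
    w_aug . x_aug = sum(wi * xi) - theta.
    """
    w_aug = [theta] + list(weights)
    x_aug = [-1] + list(x)
    return 1 if sum(w * xi for w, xi in zip(w_aug, x_aug)) > 0 else 0


# Hand-picked weights for AND (an assumption of this sketch)
weights, theta = (1, 1), 1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, threshold_form(weights, theta, x), augmented_form(weights, theta, x))
```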


Concept of Separability

All these need the concept of separability

• Perceptrons
• Support Vector Machines
• Feed-forward networks


SVMs

Variations of the idea of linear separability:

[Figure: a group of - points and a group of + points with a separating plane between them]

Reference: Vapnik, Statistical Learning Theory


Feed-Forward Networks

Motivation:

• Most real life data is not linearly separable.
• If you can’t separate by a single plane, use more planes.


Feed-Forward Networks (contd)

[Figure: the points (0,0), (0,1), (1,0), (1,1), each labelled 0-class or 1-class (two of each); alongside, a network with inputs x1, x2, hidden-layer units h1, h2 and output y]
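The idea of using more planes can be sketched with a two-hidden-unit network of threshold units. The example assumes the figure shows an XOR-style labelling, with (0,1) and (1,0) in the 1-class, which the surviving text does not confirm. All weights and thresholds are hand-picked for illustration; the names h1, h2 and y follow the figure.

```python
def step(total, theta):
    """Threshold unit: 1 iff total > theta."""
    return 1 if total > theta else 0


def xor_network(x1, x2):
    """Two hidden threshold units (two planes) feeding one output unit.

    Weights and thresholds are illustrative choices, not taken from the slides.
    """
    h1 = step(1 * x1 - 1 * x2, 0.5)    # fires only for (1,0)
    h2 = step(-1 * x1 + 1 * x2, 0.5)   # fires only for (0,1)
    y = step(1 * h1 + 1 * h2, 0.5)     # OR of the two hidden units
    return y


for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((x1, x2), xor_network(x1, x2))
# (0, 0) 0
# (0, 1) 1
# (1, 0) 1
# (1, 1) 0
```

Each hidden unit places one plane, and the output unit combines the two half-spaces. A single perceptron cannot do this for parity.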