8/10/2015 1 RBF NetworksM.W. Mak Radial Basis Function Networks 1. Introduction 2. Finding RBF...


Radial Basis Function Networks

1. Introduction

2. Finding RBF Parameters

3. Decision Surface of RBF Networks

4. Comparison between RBF and BP


1. Introduction

MLPs are highly non-linear in the parameter space, so gradient descent can get trapped in local minima.

RBF networks solve this problem by dividing the learning into two independent processes.


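RBF networks implement the function

$$s(x) = w_0 + \sum_{i=1}^{M} w_i \, \phi_i(\lVert x - c_i \rVert)$$

$w_i$, $\sigma_i$ and $c_i$ can be determined separately.

Fast learning algorithm.

Basis function types:

$$\phi(r) = r^2 \log(r)$$

$$\phi(r) = \exp\!\left(-\frac{r^2}{2\sigma^2}\right)$$

$$\phi(r) = \sqrt{r^2 + \sigma^2}$$

$$\phi(r) = \frac{1}{\sqrt{r^2 + \sigma^2}}$$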
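The four basis function types can be written down directly. The following is a minimal sketch, assuming NumPy and the usual names for these functions (thin-plate spline, Gaussian, multiquadric, inverse multiquadric); the function names and the default `sigma=1.0` are illustrative choices, not part of the slides.

```python
import numpy as np

def thin_plate_spline(r):
    """phi(r) = r^2 log(r), taken as 0 at r = 0 (its limiting value)."""
    r = np.asarray(r, dtype=float)
    safe_r = np.where(r > 0, r, 1.0)  # avoid log(0); log(1) = 0
    return np.where(r > 0, r ** 2 * np.log(safe_r), 0.0)

def gaussian(r, sigma=1.0):
    """phi(r) = exp(-r^2 / (2 sigma^2))."""
    r = np.asarray(r, dtype=float)
    return np.exp(-r ** 2 / (2.0 * sigma ** 2))

def multiquadric(r, sigma=1.0):
    """phi(r) = sqrt(r^2 + sigma^2)."""
    r = np.asarray(r, dtype=float)
    return np.sqrt(r ** 2 + sigma ** 2)

def inverse_multiquadric(r, sigma=1.0):
    """phi(r) = 1 / sqrt(r^2 + sigma^2)."""
    r = np.asarray(r, dtype=float)
    return 1.0 / np.sqrt(r ** 2 + sigma ** 2)
```

Only the Gaussian and the inverse multiquadric decay as r grows; the other two increase with distance from the center.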


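For Gaussian basis functions

$$s(x_p) = w_0 + \sum_{i=1}^{M} w_i \, \phi_i(\lVert x_p - c_i \rVert) = w_0 + \sum_{i=1}^{M} w_i \exp\!\left(-\sum_{j=1}^{n} \frac{(x_{pj} - c_{ij})^2}{2\sigma_{ij}^2}\right)$$

Assume the variances across all dimensions are equal:

$$s(x_p) = w_0 + \sum_{i=1}^{M} w_i \exp\!\left(-\frac{1}{2\sigma_i^2} \sum_{j=1}^{n} (x_{pj} - c_{ij})^2\right)$$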


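To write in matrix form, let

$$a_{pi} = \phi_i(\lVert x_p - c_i \rVert)$$

$$s(x_p) = \sum_{i=0}^{M} w_i \, a_{pi}, \quad \text{where } a_{p0} = 1$$

$$\begin{bmatrix} s(x_1) \\ s(x_2) \\ \vdots \\ s(x_N) \end{bmatrix} = \begin{bmatrix} 1 & a_{11} & a_{12} & \cdots & a_{1M} \\ 1 & a_{21} & a_{22} & \cdots & a_{2M} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & a_{N1} & a_{N2} & \cdots & a_{NM} \end{bmatrix} \begin{bmatrix} w_0 \\ w_1 \\ \vdots \\ w_M \end{bmatrix}$$

$$s = A w \;\Rightarrow\; w = A^{-1} s$$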
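The matrix A can be built in a few lines. This is a sketch only, assuming Gaussian basis functions with one width per center (the equal-variance case); the name `design_matrix` and the argument layout are hypothetical.

```python
import numpy as np

def design_matrix(X, centers, sigmas):
    """Return the N x (M+1) matrix A with A[p, 0] = 1 and
    A[p, i] = exp(-||x_p - c_i||^2 / (2 sigma_i^2)) for i = 1..M.

    X: (N, n) inputs, centers: (M, n), sigmas: (M,) widths (assumed > 0).
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    # Squared Euclidean distance between every input and every center: (N, M)
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    phi = np.exp(-sq_dist / (2.0 * sigmas[None, :] ** 2))
    # Prepend the column of ones that multiplies the bias w_0
    return np.hstack([np.ones((X.shape[0], 1)), phi])
```

In general A is not square (N patterns, M+1 weights), so it has no ordinary inverse and the weights have to be obtained in the least-squares sense.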


2. Finding the RBF Parameters

Use the K-means algorithm to find $c_i$.


K-means Algorithm

Step 1: K initial cluster means are chosen randomly from the samples to form K groups.

Step 2: Each new sample is added to the group whose mean is closest to this sample.

Step 3: Adjust the mean of each group to take account of the new points.

Step 4: Repeat from Step 2 until the distance between the old means and the new means of all clusters is smaller than a predefined tolerance.
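The four steps can be sketched in batch form as follows. This is an illustrative implementation under stated assumptions, not the code used for the slides: the function name `kmeans`, the `max_iter` safeguard, the fixed random seed and the handling of empty clusters (the old mean is kept) are all choices made here.

```python
import numpy as np

def kmeans(X, K, tol=1e-6, max_iter=100, seed=0):
    """Batch K-means; returns (means, labels)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: pick K distinct samples as the initial means
    means = X[rng.choice(len(X), size=K, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iter):
        # Step 2: assign every sample to the group with the closest mean
        dist = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Step 3: recompute each mean from its current members
        new_means = means.copy()
        for k in range(K):
            members = X[labels == k]
            if len(members) > 0:  # assumption: keep the old mean if empty
                new_means[k] = members.mean(axis=0)
        # Step 4: stop once no mean has moved by more than tol
        shift = np.linalg.norm(new_means - means, axis=1).max()
        means = new_means
        if shift < tol:
            break
    return means, labels
```

The returned `means` play the role of the centers $c_i$.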


Outcome: There are K clusters, with the means representing the centroid of each cluster.

Advantages: (1) A fast and simple algorithm.

(2) Reduces the effects of noisy samples.


Use the K-nearest-neighbor rule to find the function width

$$\sigma_i = \sqrt{\frac{1}{K} \sum_{k=1}^{K} \lVert c_k - c_i \rVert^2}$$

where $c_k$ is the k-th nearest neighbor of $c_i$.

The objective is to cover the training points so that a smooth fit of the training samples can be achieved.
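As a sketch of the width rule, assuming the width of each center is the root-mean-square distance to its K nearest neighboring centers (the function name `knn_widths` and the default `K=2` are hypothetical, and K must be smaller than the number of centers):

```python
import numpy as np

def knn_widths(centers, K=2):
    """sigma_i = sqrt(mean of squared distances from c_i to its K nearest centers)."""
    centers = np.asarray(centers, dtype=float)
    # Pairwise squared distances between centers: (M, M)
    sq = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # After sorting each row, column 0 is the center itself (distance 0)
    nearest_sq = np.sort(sq, axis=1)[:, 1:K + 1]
    return np.sqrt(nearest_sq.mean(axis=1))
```

Centers in crowded regions get small widths and isolated centers get large ones, which is what lets the basis functions cover the training points.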


Centers and widths found by K-means and K-NN


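Determining weights w using the least squares method

$$E = \sum_{p=1}^{N} \left[ d_p - \sum_{j=0}^{M} w_j \, \phi_j(\lVert x_p - c_j \rVert) \right]^2$$

where $d_p$ is the desired output for pattern p, and $\phi_0 \equiv 1$ so that the j = 0 term is the bias $w_0$.

$$E = (d - Aw)^T (d - Aw)$$

Set $\partial E / \partial w = 0$:

$$w = (A^T A)^{-1} A^T d$$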


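Let E be the total squared error between the actual output and the target output $d = [d_1 \; d_2 \; \cdots \; d_N]^T$:

$$E = (d - Aw)^T (d - Aw) = (d^T - w^T A^T)(d - Aw) = d^T d - d^T A w - w^T A^T d + w^T A^T A w$$

$$\frac{\partial E}{\partial w} = -\frac{\partial}{\partial w}(d^T A w) - \frac{\partial}{\partial w}(w^T A^T d) + \frac{\partial}{\partial w}(w^T A^T A w) = -2 A^T d + 2 A^T A w = 0$$

$$A^T A w = A^T d \;\Rightarrow\; w = (A^T A)^{-1} A^T d$$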
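A quick numerical check of the normal equations, as a sketch on synthetic data: the matrix `A`, the weights `w_true` and the helper name `solve_weights_normal_equations` are made up for illustration. The system $A^T A w = A^T d$ is solved directly rather than by forming the inverse explicitly.

```python
import numpy as np

def solve_weights_normal_equations(A, d):
    """Solve A^T A w = A^T d, i.e. w = (A^T A)^{-1} A^T d."""
    return np.linalg.solve(A.T @ A, A.T @ d)

# Synthetic example: bias column plus three random "basis function" columns
rng = np.random.default_rng(0)
A = np.hstack([np.ones((20, 1)), rng.normal(size=(20, 3))])
w_true = np.array([0.5, -1.0, 2.0, 0.3])
d = A @ w_true  # noise-free targets, so the fit should be exact

w = solve_weights_normal_equations(A, d)
# Gradient of E at the solution: -2 A^T d + 2 A^T A w, expected to be ~0
grad = -2.0 * A.T @ d + 2.0 * A.T @ A @ w
```

This works well when $A^T A$ is well conditioned, which is the case for this random example.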


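Note that

$$\frac{\partial}{\partial x}(y^T x) = y, \qquad \frac{\partial}{\partial x}(x^T A y) = A y, \qquad \frac{\partial}{\partial x}(x^T A x) = A x + A^T x$$

Problems

(1) Susceptible to round-off error.

(2) No solution if $A^T A$ is singular.

(3) If $A^T A$ is close to singular, we get very large components in w.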


Reasons

(1) Inaccuracy in forming $A^T A$.

(2) If A is ill-conditioned, a small change in A introduces a large change in $(A^T A)^{-1}$.

(3) If $A^T A$ is close to singular, dependent columns in $A^T A$ exist, e.g. two parallel straight lines.


Singular matrix:

$$\begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

If the lines are nearly parallel, they intersect each other at a point whose coordinates x and y tend to $\pm\infty$.

So, the magnitude of the solution becomes very large; hence overflow will occur.

The effect of the large components can be cancelled out if the machine precision is infinite.


If the machine precision is finite, we get a large error.

[Equation: numerical example of a nearly singular 2×2 system solved with finite machine precision]

Solution: Singular Value Decomposition
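To make the SVD remedy concrete, here is a minimal sketch that solves the least-squares problem from the SVD of A itself, discarding singular values below a relative threshold. The helper name `svd_least_squares`, the threshold `rel_tol` and the toy matrix are assumptions chosen for illustration; they are not the example from the slides.

```python
import numpy as np

def svd_least_squares(A, d, rel_tol=1e-10):
    """Minimum-norm least-squares solution via a truncated SVD of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    keep = s > rel_tol * s.max()   # drop near-zero singular values
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    # w = V diag(1/s) U^T d, with the discarded directions set to zero
    return Vt.T @ (s_inv * (U.T @ d))

# Two almost identical columns make A^T A singular in double precision
A = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-12],
              [1.0, 1.0 - 1e-12]])
d = np.array([2.0, 2.0, 2.0])

w = svd_least_squares(A, d)
```

Instead of very large weights of opposite sign, the truncated solution spreads the weight evenly over the two nearly dependent columns. NumPy's `np.linalg.lstsq` is also SVD-based and accepts a comparable `rcond` cutoff.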


RBF learning process

[Figure: block diagram. $x_p$ → K-means → $c_i$ → K-Nearest Neighbor → $\sigma_i$; $c_i$ and $\sigma_i$ → Basis Functions → A → Linear Regression → w]


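RBF learning by gradient descent

Let

$$\phi_i(x_p) = \exp\!\left(-\frac{1}{2} \sum_{j=1}^{n} \frac{(x_{pj} - c_{ij})^2}{\sigma_{ij}^2}\right) \quad \text{and} \quad e(x_p) = d(x_p) - s(x_p),$$

we have

$$E = \frac{1}{2} \sum_{p=1}^{N} e(x_p)^2 .$$

Applying gradient descent with

$$\frac{\partial E}{\partial w_i}, \quad \frac{\partial E}{\partial \sigma_{ij}}, \quad \text{and} \quad \frac{\partial E}{\partial c_{ij}},$$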


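we have the following update equations

$$w_i(t+1) = w_i(t) + \mu_w \sum_{p=1}^{N} e(x_p)\,\phi_i(x_p), \quad \text{when } i = 1, 2, \ldots, M$$

$$w_i(t+1) = w_i(t) + \mu_w \sum_{p=1}^{N} e(x_p), \quad \text{when } i = 0$$

$$\sigma_{ij}(t+1) = \sigma_{ij}(t) + \mu_\sigma \sum_{p=1}^{N} e(x_p)\, w_i\, \phi_i(x_p)\, \frac{(x_{pj} - c_{ij})^2}{\sigma_{ij}^3(t)}$$

$$c_{ij}(t+1) = c_{ij}(t) + \mu_c \sum_{p=1}^{N} e(x_p)\, w_i\, \phi_i(x_p)\, \frac{x_{pj} - c_{ij}}{\sigma_{ij}^2(t)}$$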
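One batch update of all three parameter sets can be sketched as below, assuming per-dimension widths $\sigma_{ij}$ and separate step sizes for weights, widths and centers. The function names (`forward`, `gradient_step`, `total_error`), the array layout and the default step sizes are hypothetical.

```python
import numpy as np

def forward(X, w, C, S):
    """X: (N, n) inputs, w: (M+1,) weights with w[0] the bias,
    C: (M, n) centers, S: (M, n) widths. Returns (s, phi, diff)."""
    diff = X[:, None, :] - C[None, :, :]                       # (N, M, n)
    phi = np.exp(-0.5 * ((diff / S[None, :, :]) ** 2).sum(axis=2))  # (N, M)
    s = w[0] + phi @ w[1:]                                     # (N,)
    return s, phi, diff

def total_error(X, d, w, C, S):
    """E = 1/2 * sum_p e(x_p)^2 with e = d - s."""
    s, _, _ = forward(X, w, C, S)
    return 0.5 * np.sum((d - s) ** 2)

def gradient_step(X, d, w, C, S, mu_w=0.01, mu_s=0.01, mu_c=0.01):
    """Apply one batch gradient-descent update to w, C and S."""
    s, phi, diff = forward(X, w, C, S)
    e = d - s                                                  # (N,)
    ewphi = e[:, None] * phi * w[1:][None, :]                  # e * w_i * phi_i
    new_w = w.copy()
    new_w[0] += mu_w * e.sum()                                 # bias, i = 0
    new_w[1:] += mu_w * (e @ phi)                              # i = 1..M
    new_S = S + mu_s * np.einsum('pi,pij->ij', ewphi, diff ** 2) / S ** 3
    new_C = C + mu_c * np.einsum('pi,pij->ij', ewphi, diff) / S ** 2
    return new_w, new_C, new_S
```

Calling `gradient_step` repeatedly with small step sizes should reduce `total_error`; in practice the centers and widths are often initialised from K-means and the nearest-neighbor rule before any such fine-tuning.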


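Elliptical Basis Function networks

$$\phi_j(x_p) = \exp\!\left\{ -\frac{1}{2} (x_p - \mu_j)^T \Sigma_j^{-1} (x_p - \mu_j) \right\}$$

$\mu_j$: function centers

$\Sigma_j$: covariance matrix

$$y_k(x_p) = \sum_{j=0}^{J} w_{kj}\, \phi_j(x_p)$$

[Figure: EBF network with inputs $x_1, x_2, \ldots, x_n$, basis functions $\phi_1, \phi_2, \ldots, \phi_M$ and outputs $y_1(x), \ldots, y_K(x)$]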


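K-means and Sample covariance

K-means:

$$\mu_j = \frac{1}{N_j} \sum_{x \in X_j} x$$

where $x \in X_j$ if $\lVert x - \mu_j \rVert \le \lVert x - \mu_k \rVert$ for all $k \ne j$.

Sample covariance:

$$\Sigma_j = \frac{1}{N_j} \sum_{x \in X_j} (x - \mu_j)(x - \mu_j)^T$$

The EM algorithm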
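Leaving EM aside, the K-means mean, the sample covariance and the elliptical basis function can be sketched together as follows. This assumes cluster labels are already available, that every cluster is non-empty and that each covariance matrix is non-singular; the names `cluster_stats` and `ebf_activations` are hypothetical.

```python
import numpy as np

def cluster_stats(X, labels, K):
    """Per-cluster mean mu_j and sample covariance Sigma_j (1/N_j normalisation)."""
    mus, covs = [], []
    for j in range(K):
        Xj = X[labels == j]            # members of cluster j (assumed non-empty)
        mu = Xj.mean(axis=0)
        diff = Xj - mu
        mus.append(mu)
        covs.append(diff.T @ diff / len(Xj))
    return np.array(mus), np.array(covs)

def ebf_activations(X, mus, covs):
    """phi_j(x_p) = exp(-1/2 (x_p - mu_j)^T Sigma_j^{-1} (x_p - mu_j)); shape (N, J)."""
    out = np.empty((len(X), len(mus)))
    for j, (mu, cov) in enumerate(zip(mus, covs)):
        diff = X - mu                                   # (N, n)
        # Solve Sigma_j z = diff^T instead of forming the inverse explicitly
        z = np.linalg.solve(cov, diff.T).T              # (N, n)
        mahalanobis_sq = np.einsum('pi,pi->p', diff, z)
        out[:, j] = np.exp(-0.5 * mahalanobis_sq)
    return out
```

With a diagonal covariance this reduces to the per-dimension Gaussian basis function, and with a single shared variance it reduces to the radially symmetric case.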


EBF vs. RBF networks

[Figure: two scatter plots of Class 1 and Class 2 samples, both axes from -3 to 3. Panels: RBFN with 4 centers; EBFN with 4 centers.]


Elliptical Basis Function Networks

[Figure: EBF network's output. Surface plot titled "Output 1 of an EBF network (bias, no rescale, gamma=1)"; horizontal axes from -3 to 3, vertical axis from -1 to 2.]


RBFN for Pattern Classification

MLP: Hyperplane

RBF: Kernel function

The probability density function (also called conditional density function or likelihood) of the k-th class is defined as $p(x \mid C_k)$.


• According to Bayes' theorem, the posterior prob. is

$$P(C_k \mid x) = \frac{p(x \mid C_k)\, P(C_k)}{p(x)}$$

where $P(C_k)$ is the prior prob. and

$$p(x) = \sum_{r} p(x \mid C_r)\, P(C_r)$$

• It is possible to use a common pool of M basis functions, labeled by an index j, to represent all of the class-conditional densities, i.e.

$$p(x \mid C_k) = \sum_{j=1}^{M} p(x \mid j)\, P(j \mid C_k)$$


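[Figure: the basis-function densities $p(x \mid 1), p(x \mid 2), \ldots, p(x \mid M)$ are weighted by $P(1 \mid C_k), \ldots, P(M \mid C_k)$ and summed to give $p(x \mid C_k)$]

$$p(x \mid C_k) = \sum_{j=1}^{M} p(x \mid j)\, P(j \mid C_k)$$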


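$$p(x) = \sum_{k} \sum_{j=1}^{M} p(x \mid j)\, P(j \mid C_k)\, P(C_k) = \sum_{j=1}^{M} p(x \mid j)\, P(j)$$

$$P(C_k \mid x) = \frac{\sum_{j=1}^{M} p(x \mid j)\, P(j \mid C_k)\, P(C_k)}{\sum_{j'=1}^{M} p(x \mid j')\, P(j')} = \sum_{j=1}^{M} \frac{p(x \mid j)\, P(j \mid C_k)\, P(C_k)}{\sum_{j'=1}^{M} p(x \mid j')\, P(j')} \cdot \frac{P(j)}{P(j)}$$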


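$$P(C_k \mid x) = \sum_{j=1}^{M} \frac{P(j \mid C_k)\, P(C_k)}{P(j)} \cdot \frac{p(x \mid j)\, P(j)}{\sum_{j'=1}^{M} p(x \mid j')\, P(j')} = \sum_{j=1}^{M} P(C_k \mid j)\, P(j \mid x) = \sum_{j=1}^{M} w_{kj}\, \phi_j(x)$$

Hidden node's output $\phi_j(x) = P(j \mid x)$: posterior prob. of the j-th set of features in the input.

Weight $w_{kj} = P(C_k \mid j)$: posterior prob. of class membership, given the presence of the j-th set of features.

No bias term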
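The identity between the Bayes posterior and the weighted sum of hidden-node outputs can be checked numerically. The sketch below uses made-up probabilities for two classes and three basis functions at a single input x; all numbers and variable names are illustrative assumptions.

```python
import numpy as np

prior = np.array([0.6, 0.4])                    # P(C_k), k = 1, 2
P_j_given_C = np.array([[0.7, 0.2, 0.1],        # P(j | C_k): row k, column j
                        [0.1, 0.3, 0.6]])       # each row sums to 1
p_x_given_j = np.array([0.05, 0.20, 0.10])      # p(x | j) evaluated at one x

# Direct route: Bayes' theorem with p(x | C_k) = sum_j p(x | j) P(j | C_k)
p_x_given_C = P_j_given_C @ p_x_given_j
posterior_direct = p_x_given_C * prior / np.sum(p_x_given_C * prior)

# RBF route: hidden outputs phi_j(x) = P(j | x) and weights w_kj = P(C_k | j)
P_j = prior @ P_j_given_C                       # P(j) = sum_k P(j | C_k) P(C_k)
phi = p_x_given_j * P_j / np.sum(p_x_given_j * P_j)
W = P_j_given_C * prior[:, None] / P_j[None, :]
posterior_rbf = W @ phi                         # sum_j w_kj phi_j(x), no bias
```

Both routes give the same posterior, and it sums to one over the classes without any bias weight.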


Comparison of RBF and MLP

                          RBF networks                          MLP
Learning speed            Very fast                             Very slow
Convergence               Almost guaranteed                     Not guaranteed
Response time             Slow                                  Fast
Memory requirement        Very large                            Small
Hardware implementation   IBM ZISC036, Nestor Ni1000            Voice Direct 364
                          (www-5.ibm.com/fr/cdlab/zisc.html)    (www.sensoryinc.com)
Generalization            Usually better                        Usually poorer

To learn more about NN hardware, see http://www.particle.kth.se/~lindsey/HardwareNNWCourse/home.html