Radial Basis Function NetworksRadial Basis Function...

41
Radial Basis Function Networks Radial Basis Function Networks Radial Basis Function Networks Radial Basis Function Networks

Transcript of Radial Basis Function NetworksRadial Basis Function...

Page 1: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Radial Basis Function NetworksRadial Basis Function NetworksRadial Basis Function NetworksRadial Basis Function Networks

Page 2: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Radial Basis Function Networks A special types of ANN that have three

layerslayers Input layerHidd lHidden layerOutput layer

i f i hidd l i Mapping from input to hidden layer is nonlinear

Mapping from hidden to output layer is linear

PR , ANN, & ML 2

Page 3: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

ComparisonMulti-layer perceptron Multiple hidden layers

RBF Networks Single hidden layer Multiple hidden layers

Nonlinear mapping W: inner product

Single hidden layer Nonlinear + linear W: distance W: inner product

Global mapping W l ifi

W: distance Local mapping W d t Warp classifiers

Stochastic approximation

Warp data Curve fitting

approximation

PR , ANN, & ML 3

Page 4: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Another View: Curve Fitting We try to estimate a mapping from patterns

into classes f(patterns)->classes, f(X)->df(p ) , f( ) Patterns are represented as feature vector X Classes are decisions d Classes are decisions d Training samples: f(Xi)->di, i=1,..., n Interpolation of the f based on samples

d

x2

d

PR , ANN, & ML 4x1

Page 5: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Yet Another View: Warping Data If the problem is not linearly separable,

MLP will use multiple neurons to defineMLP will use multiple neurons to define complicated decision boundaries (warp classifiers)

Another alternative is to warp data into higher dimensional space that they are much more likely to be linearly separable (single perceptron will do)

This is very similar to the idea of Support Vector Machine

PR , ANN, & ML 5

Page 6: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Example XOR Warpped XOR

y|]0,0[|

2 )(t

e xx

(1 1)

)(xx

(1,1)

)()(

)(2

1

xx

x

yx

(0,0)

x |]11[| t

(0,1)(1,0)

( , )

PR , ANN, & ML 6

|]1,1[|1 )(

t

e xx

Page 7: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

More Example

PR , ANN, & ML 7

Page 8: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

A Pure Interpolation Approach Given: (Xi, di), i=1, …, n Desired: f(Xi)= di( i) i

Solution: f(X), with f(Xi)= di

Radial basis function solution iiwf )()( XXX X,Xi) – general form is shift and rotation invariant Shift invariant requires X-Xi

i

iiwf )()( XXX

q i

Rotation invariant requires || X-Xi ||

Example22)( Multiquadrics

Inserve Multiquadrics Gaussan

22)( crr

22

1)(cr

r

PR , ANN, & ML 8

2

2

2)( r

er

Page 9: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Graphical Interpretation Each neuron responds based on the distance to the center

of its receptive field The bottom level is a nonlinear mappingpp g The top level is a linear weighted sum

iiwf )()( XXX i

w1 wn

)( 1XX )( nXX )( 2XX

PR , ANN, & ML 9x1 x2 xm

Page 10: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Other Alternatives: Global Lagrange polynomials

)(n

knk Lyxfy

)())(())(()())(())((

)(

111

111,

0,

nkkkkkkok

nkkokn

kknk

xxxxxxxxxxxxxxxxxxxxL

yfy

PR , ANN, & ML 10

Page 11: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Other Alternatives: Local Bezier Basis B-spline basis

PR , ANN, & ML 11

Page 12: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

B-Spline Interpolation A big subject in mathematics

U d i di i li Used in many disciplinesApproximation Pattern recognitionComputer graphics

As far as pattern recognition is concernedDetermine order of spline (DOFs)

Knot vectors (partition into intervals) Fitting in each interval

PR , ANN, & ML 12

Page 13: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Interpolation SolutionXXX ),()(

i

ii

d

wf

XX2

1

2

1

22221

11211

),(

jiijn

n

dd

ww

21

),(

jiij

nnnnnn dw

DΦWDΦW

1

is symmetrical is invertable (if all X ’s are distinct)

PR , ANN, & ML 13

is invertable (if all Xi s are distinct)

Page 14: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Practical Issue: Accuracy (cont.) The function represents the Green’s

function for a certain differential operatorfunction for a certain differential operator When it is shift and rotational invariant, we

it (X X ) G(||X X ||) ican write (X, Xi) as G(||X-Xi||), again, Gaussian Kernel is a popular choice here

PR , ANN, & ML 14

Page 15: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Practical Issues Accuracy

How about data are noisy?How about data are noisy? Speed

How about there are many sample points? Training

What is the training procedure?

PR , ANN, & ML 15

Page 16: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Practical Issue: Accuracy When data are noisy, pure interpolation

represents a form of “overfitting”represents a form of overfitting Need a stabilizing (or smoothing,

l i ti ) tregularization) term The solution should achieve two things

Good fitting Smoothness

PR , ANN, & ML 16

Page 17: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Practical Issue: Accuracy (cont.)2

1

2 ||21))((

21 Dfdf

n

iii X

1 22 i

TXn

i

m

ijiji DfGwd 2*

1

2

1

||21))((

21

DG)GG(GW

DG)WGG(GT

oT

To

Ti i

1

1 1 22

The solution is rooted in the regularization theory, which is way beyond the scope of this course (readwhich is way beyond the scope of this course (read the papers on the class Web sites for more details)

Try to minimize error as a weighted sum of two

PR , ANN, & ML 17

y gterms which impose the fitting and the smoothness constraints

Page 18: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Sidebar I It can be proven that MAP estimator (Baysian rule)

gives the same results as regularized RBF solution Un-regularized (fitting) solution assumes the same

prior

Dfdfn

iii ||

21))((

21 2

1

2X

PfPfPfP

i

)()()|()|(

22 1

XXX

cfPfPfPP

))(log()|(log)|(log)(

XXX

PR , ANN, & ML 18

Page 19: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Sidebar II Regularization is also similar to (or call)

ridge regression in statisticsridge regression in statistics The problem here is to fit a model to data

ith t fittiwithout overfitting In linear case, we have

wwxwyi j j

jjijoiridge

22)(minarg w

w

wxwyi j

jijoiridge

2)(minargw

w

PR , ANN, & ML 19

swsubjecttoj

j

2

Page 20: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Intuition When variables xi are highly correlated,

their coefficients become poorly determinedtheir coefficients become poorly determined with high varianceE g widely large positive coefficient on oneE.g. widely large positive coefficient on one

can be canceled by a similarly large negative coefficient on its correlated cousincoefficient on its correlated cousin

Size constraint is helpfulCaveat: constraint is problem dependentCaveat: constraint is problem dependent

PR , ANN, & ML 20

Page 21: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Solution to Ridge Regression Similar to regularization

ww i j j

jjijoiridge wwxwy 22)(minarg

WWYXWYXW

WWYXWYXWww

TT

TTridge

d )()(

)()(minarg

WYXWXw

WWYXWYXW

T

TT

dd

0)(

0)()(

YXIXXYXWIXX

WYXWX

TTridge

TT

λ 1)()(

0)(

PR , ANN, & ML 21

YXIXXw TTridge λ 1)(

Page 22: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Ugly Math

YXIXXw TTridge λ 1)(

YUVΣIUΣUVΣUΣYXIXXXXwY

TTTTTT

TTridge

λVVλ

1

1

)()(

YUΣVIUΣUVΣUΣYUVΣIUΣUVΣUΣ

TTTTTT λVVλVV

1111 )()()(

)(

YUΣIΣΣUΣYUΣIVUΣUVΣVUΣ

TTT

TTTTTTT

λVλVV

1

111

)()(

Yuu Ti

ii λ

2

2

)(

PR , ANN, & ML 22

ii λ

Page 23: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Physical Interpretation Singular values of X represents the spread

of data along different body-fittingof data along different body fitting dimensions

To estimate Y(=Xwridge) regularization To estimate Y( Xw ) regularization minimizes the contribution from less spread-out dimensionsLess spread-out dimensions usually have much

larger variance (high dimension eigen modes)1Trace X(XTX+I)-1XT is called effective

degrees of freedom

PR , ANN, & ML 23

Page 24: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

More Details Trace X(XTX+I)-1XT is called effective

degrees of freedomdegrees of freedomControls how many eigen modes are actually

used or activeused or active Different methods are possible

Sh i ki th t ib ti l d Shrinking smoother: contributions are scaled Projection smoother: contributions are used (1)

or not sed (0)or not used (0)

PR , ANN, & ML 24

Page 25: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Practical Issue: Speed When there are many training samples, G

and matrices are of size n by nand matrices are of size n by n Inverting such a matrix is of O(n3) Reducing the number of bases used

PR , ANN, & ML 25

Page 26: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Practical Issue: Speed (cont.)

m

*

m<n

n m

jiji

jjj

DfGwd

Gwf

||21))((

21

)()(

2*2

1

*

TX

TXX

m

To

Ti i

GGG

),(),(),(

22

12111

1 1

TXTXTXDG)WGG(G

mnmnnn

m

GGG

GGG

),(),(),(

),(),(),(

21

22212

TXTXTX

TXTXTXG

m

m

o

mn

GGGGGG

),(),(),(),(),(),(

22212

12111

TTTTTTTTTTTT

G

PR , ANN, & ML 26

mmmnnn GGG

),(),(),( 21 TTTTTT

Page 27: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Practical Issue: Training How can the center of radial basis functions

for the reduced basis set be determined?for the reduced basis set be determined? Chosen randomly Training involves finding wi, using SVD

PR , ANN, & ML 27

Page 28: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Training with K-mean Using unsupervised clustering

Fi d h d t l t d th t i Find where data are clustered – that is where the radial basis functions should be

l dplaced With k-mean

PR , ANN, & ML 28

Page 29: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

K-Means Algorithm(fixed # of clusters)(fixed # of clusters)

Arbitrarily pick N cluster centers, assign samples to nearest center

Compute sample mean of each clusterp p Reassign samples to clusters with the

nearest mean (for all samples)nearest mean (for all samples) Repeat if there are changes, otherwise stop

PR , ANN, & ML 29

Page 30: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

PR , ANN, & ML 30

Page 31: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

PR , ANN, & ML 31

Page 32: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

PR , ANN, & ML 32

Page 33: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Training with Gradient Decent Error Expression

21 n m

1 1))((

21 )()()(

n

i

m

jjiji

nnn

Gwd TX

Free variables in the error expression areWeightgCenter locationBasis spreadp

PR , ANN, & ML 33

Page 34: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Effect of Weights)()(

2)( ))((

21 nnn m

jijin Gwd TX

)()()(

1 1

)(

))((2

nnmn

i jjiji

Gwd

TX

)()()(

1

)(

)(

nnn

n

jjijii

G

Gwd

TX

TX

)(

)(

1

)( )(n

iji

ni

j

Gw

TX

)(

)()()1(

n

j

n

wn

jn

jw

ww

PR , ANN, & ML 34

j

Page 35: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Effect of Center Positions)()(

2)( ))((

21 nnn m

jijin Gwd TX

)()()(

1 1

)(

2nnm

jijin

i

i j

Gwd TX

)()(

)(1)()(

)(

1

)()('2nn

n ji

n

jin

in

j

n

j

Gw TXΣTX

)(

)()()1(

1

)()(

nnn

jii

jiij

j

TT

T

)( n

j

jjT

TT T

PR , ANN, & ML 35

Page 36: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Effect of Basis Spread)()(

2)( ))((

21 nnn m

jijin Gwd TX

)()()(

1 1

)(

2nnm

jijin

i

i jjiji

Gwd

TX

)(

)()()()(

)(

1

)('n

nn

ij

n

jin

in

j

n

jjj

Gw

QTX

)()(

)(

)(

11

))((nn T

jijin

ij

iji

jiij

j

TXTXQ

Σ

)(11

)()(1)1(1

nj

nn

jn

j

jijiij

ΣΣΣ

Σ

PR , ANN, & ML 36

Page 37: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Details A lot of theoretical development results are

omitted hereomitted hereE.g., relation to kernel regression and SVM

A l t f t i id ti t A lot of tuning considerations are not covered hereE.g., how to determine ?

This is an active research area

PR , ANN, & ML 37

Page 38: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Examples

PR , ANN, & ML 38544,000 data points w. 80,000 centersAccuracy of 1.4x10-6 for all data points

Page 39: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Problem Definitioni i l d f d Given a point cloud of data From laser range scanner, orCT MR etcCT, MR, etc.

Find a single analytical surface approximation Or an inside outside function Or an inside-outside function

Range data are s(X)=0Outside is s(X)>0Outside is s(X)>0 Inside is s(X)<0

Just sample data s(X)=0 is not enoughp ( ) g s can be a trivial zero functionNeed off-surface data generation

PR , ANN, & ML 39

Page 40: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

Procedures1. Off surface data generation2. Choose a subset from the interpolation node xi and p i

fit an RBF only to these3. Evaluate the resideual ei = fi – s(xi)4. If max(ei)<accuracy, then stop5. Else append new centers where ei is largepp i g6. Re-fit RBF and go back to step 2

PR , ANN, & ML 40

Page 41: Radial Basis Function NetworksRadial Basis Function Networksyfwang/courses/cs290i_prann/pdf/rbf.pdf · Radial Basis Function Networks A special types of ANN that have three layers

More Results

PR , ANN, & ML 41Less smoothing More smoothing