6. Radial-basis function (RBF) networks RBF = radial-basis function: a function which depends only...


Page 1:

6. Radial-basis function (RBF) networks

RBF = radial-basis function: a function which depends only on the radial distance from a point

XOR problem

quadratically separable

Page 2:

So RBFs are functions taking the form

$$\phi(\|x - x_i\|)$$

where $\phi$ is a nonlinear activation function, x is the input and x_i is the i'th position, prototype, basis or centre vector.

The idea is that points near the centres will have similar outputs, i.e. if x ~ x_i then f(x) ~ f(x_i), since they should have similar properties.

Therefore, instead of looking at the data points themselves, characterise the data by their distances from the prototype vectors (similar to kernel density estimation).

Page 3:

Prototype vectors: x1 = (0,1), x2 = (1,0.5); d1 and d2 are the distances from x to x1 and x2.

x       d1    d2
(0,0)   1     1.1
(1,1)   1     0.5
(0,1)   0     1.1
(1,0)   √2    0.5

For example, the simplest form of $\phi$ is the identity function $\phi(x) = x$

Now use the distances as the inputs to a network and form a weighted sum of these
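The distance table can be reproduced with a few lines of Python. This is only a sketch: the helper name `distances` and the `table` dictionary are illustrative and not part of the lecture.

```python
import math

# Prototype (centre) vectors and input patterns from the table.
centres = [(0.0, 1.0), (1.0, 0.5)]
patterns = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]

def distances(x, centres):
    """Return the Euclidean distance from x to each centre."""
    return [math.dist(x, c) for c in centres]

# Each pattern is now described by (d1, d2) instead of its raw coordinates.
table = {x: distances(x, centres) for x in patterns}
for x, (d1, d2) in table.items():
    print(x, round(d1, 1), round(d2, 1))
```

With the identity function as $\phi$, these distances are exactly the hidden-unit outputs that the weighted sum operates on.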

Page 4:

Can be viewed as a two-layer network

[Figure: network diagram. Inputs y1, y2, ..., yM feed a hidden layer of basis functions $\phi_1(y), \ldots, \phi_N(y) = \phi(\|y - x_N\|)$, which are combined through weights w_j to give the output d.]

• output = $\sum_i w_i \phi_i(y)$
• adjustable parameters are the weights w_j
• number of hidden units = number of prototype vectors
• form of the basis functions decided in advance

Page 5:

• use a weighted sum of the outputs from the basis functions for e.g. classification, density estimation etc.
• Theory can be motivated by many things (regularisation, Bayesian classification, kernel density estimation, noisy interpolation etc.), but all suggest that basis functions are set so as to represent the data.
• Thus centres can be thought of as prototypes of input data.

[Figure: MLP vs RBF: distributed vs local representation]

Page 6:

E.g. Bayesian interpretation: if we choose the basis functions to model the class-conditional probabilities, $\phi_k(x) = p(x|C_k)$, and we choose appropriate weights, $P(C_k)$, then we can interpret the outputs as the posterior probabilities:

$$O_k = P(C_k|x) \propto p(x|C_k) P(C_k)$$

[Figure: network with inputs x, y, basis functions $\phi_1(x) = p(x|C_1)$, ..., $\phi_3(x) = p(x|C_3)$, weights P(C1), ..., P(C3) (0 elsewhere) and outputs O1, O2, O3]
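A minimal sketch of this interpretation in one dimension. The three class-conditional Gaussians and their priors are made-up values, and the normalisation by the sum is added here so that the outputs sum to one.

```python
import math

def gaussian_pdf(x, mean, sigma):
    """Class-conditional density p(x|C_k), playing the role of phi_k(x)."""
    return math.exp(-(x - mean) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical classes: (mean, sigma, prior P(C_k)); the priors act as the weights.
classes = [(0.0, 1.0, 0.5), (2.0, 1.0, 0.3), (4.0, 1.0, 0.2)]

def posteriors(x):
    """O_k = p(x|C_k) P(C_k) / sum_j p(x|C_j) P(C_j)."""
    unnormalised = [gaussian_pdf(x, m, s) * prior for m, s, prior in classes]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

print(posteriors(0.0))
```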

Page 7:

Starting point: exact interpolation

Each input pattern x must be mapped onto a target value d

[Figure: target value d plotted against input x]

Page 8:

That is, given a set of N vectors x_i and a corresponding set of N real numbers, d_i (the targets), find a function F that satisfies the interpolation condition:

F(x_i) = d_i for i = 1,...,N

or more exactly find:

$$F(x) = \sum_{j=1}^{N} w_j \phi(\|x - x_j\|)$$

satisfying:

$$F(x_i) = \sum_{j=1}^{N} w_j \phi(\|x_i - x_j\|) = d_i$$

Page 9:

Example: XOR problem

x       d
(0,0)   0
(1,1)   0
(0,1)   1
(1,0)   1

Exact interpolation: RBF placed at position of each pattern vector, using 1) linear RBF

Page 10:

i.e. 4 hidden units in network

1) $\phi_i(x) = \|x - x_i\|$

[Figure: network structure, with weights w]

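Page 11:

Results

With the patterns taken in the order (0,0), (0,1), (1,0), (1,1), so that the targets read (0, 1, 1, 0):

$$\begin{pmatrix} 0 & 1 & 1 & \sqrt{2} \\ 1 & 0 & \sqrt{2} & 1 \\ 1 & \sqrt{2} & 0 & 1 \\ \sqrt{2} & 1 & 1 & 0 \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \\ w_3 \\ w_4 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 1 \\ 0 \end{pmatrix}$$

$$\begin{pmatrix} w_1 \\ w_2 \\ w_3 \\ w_4 \end{pmatrix} = \begin{pmatrix} 1 \\ -\sqrt{2}/2 \\ -\sqrt{2}/2 \\ 1 \end{pmatrix}$$

Check:

$$d_1 = \sum_j w_j \phi(\|x_1 - x_j\|) = 0 - \frac{\sqrt{2}}{2} - \frac{\sqrt{2}}{2} + \sqrt{2} = 0$$

$$d_2 = \sum_j w_j \phi(\|x_2 - x_j\|) = 1 + 0 - \frac{\sqrt{2}}{2}\sqrt{2} + 1 = 1$$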

Page 12:

I.e.

$$F(x_1, x_2) = \sqrt{x_1^2 + x_2^2} - \frac{\sqrt{2}}{2}\sqrt{(x_1 - 1)^2 + x_2^2} - \frac{\sqrt{2}}{2}\sqrt{x_1^2 + (x_2 - 1)^2} + \sqrt{(x_1 - 1)^2 + (x_2 - 1)^2}$$

And general solution is:

$$F(x) = \sum_{j=1}^{N} w_j \phi(\|x - x_j\|)$$
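The linear-RBF solution of the XOR problem can be checked numerically. The sketch below assumes NumPy and orders the patterns as (0,0), (0,1), (1,0), (1,1), so that the targets are (0, 1, 1, 0); the names `X`, `Phi` and `F` are illustrative.

```python
import numpy as np

# XOR patterns and targets, one linear RBF centred on every pattern.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([0, 1, 1, 0], dtype=float)

# Interpolation matrix for the linear RBF: Phi[i, j] = ||x_i - x_j||.
Phi = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
w = np.linalg.solve(Phi, d)

def F(x):
    """Evaluate the interpolant at a single 2-D point x."""
    return w @ np.linalg.norm(np.asarray(x, dtype=float) - X, axis=-1)

print(np.round(w, 4))
print([float(round(F(x), 6)) for x in X])
```

The weights on the two patterns with target 0 come out as 1, and those on the two patterns with target 1 as $-\sqrt{2}/2$.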

Page 13:

For N vectors we get:

$$\begin{pmatrix} \phi(\|x_1 - x_1\|) & \cdots & \phi(\|x_1 - x_N\|) \\ \vdots & & \vdots \\ \phi(\|x_N - x_1\|) & \cdots & \phi(\|x_N - x_N\|) \end{pmatrix} \begin{pmatrix} w_1 \\ \vdots \\ w_N \end{pmatrix} = \begin{pmatrix} d_1 \\ \vdots \\ d_N \end{pmatrix}$$

(interpolation matrix times weight vector equals target vector)

Equivalently

$$\Phi W = D$$

$\phi(\|x_i - x_j\|)$: scalar function of the distance between vectors x_i and x_j

Page 14:

If $\Phi$ is invertible we have a unique solution of the above equation

Micchelli's Theorem

Let x_i, i = 1, ..., N be a set of distinct points in R^d. Then the N-by-N interpolation matrix $\Phi$, whose ji-th element is $\phi(\|x_i - x_j\|)$, is nonsingular. (The result holds for a large class of radial-basis functions, including the multiquadric, inverse multiquadric and Gaussian functions.)

So provided $\Phi$ is nonsingular, the interpolation matrix will have an inverse and the weights that achieve exact interpolation are

$$W = \Phi^{-1} D$$

Page 15:

Easy to see that there is always a solution.

For instance, if we take $\phi(x - y) = 1$ if x = y, and 0 otherwise (e.g. a Gaussian with very small $\sigma$), setting w_i = d_i solves the interpolation problem.

However, this is a bit trivial as the only general conclusion about the input space is that the training data points are different.

Page 16:

To summarize:

For a given data set containing N points (x_i, d_i), i = 1,…,N
• Choose an RBF function $\phi$
• Calculate $\phi(\|x_j - x_i\|)$
• Obtain the matrix $\Phi$
• Solve the linear equation $\Phi W = D$
• Get the unique solution
• Done!
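The summary steps can be sketched in NumPy as follows. The function names, the Gaussian choice of $\phi$ and the four-point data set are assumptions made for illustration only.

```python
import numpy as np

def interpolation_matrix(X, centres, phi):
    """Phi[i, j] = phi(||X[i] - centres[j]||)."""
    r = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=-1)
    return phi(r)

def fit_exact(X, d, phi):
    """One basis function per data point; solve Phi W = D for W."""
    return np.linalg.solve(interpolation_matrix(X, X, phi), d)

def gaussian(r, sigma=1.0):
    return np.exp(-r ** 2 / (2 * sigma ** 2))

# Hypothetical 1-D data set, used only to exercise the steps.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
d = np.array([0.5, 0.9, 0.5, 0.1])
w = fit_exact(X, d, gaussian)

def F(x):
    """Evaluate the interpolant at one or more points."""
    return interpolation_matrix(np.atleast_2d(x), X, gaussian) @ w

print(F(X))    # reproduces the targets d
print(F(1.5))  # value between two data points
```

Solving the linear system directly, rather than forming the inverse explicitly, is the usual numerical choice; both give the same unique solution when $\Phi$ is nonsingular.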

Like MLPs, RBFNs can be shown to be able to approximate any function to arbitrary accuracy (using an arbitrarily large number of basis functions).

Unlike MLPs, however, they have the property of ‘best approximation’, i.e. that there exists an RBFN with minimum approximation error.

Page 17:

Other types of RBFs include

(a) Multiquadrics

$$\phi(r) = (r^2 + c^2)^{1/2}$$ for some c > 0

(b) Inverse multiquadrics

$$\phi(r) = (r^2 + c^2)^{-1/2}$$ for some c > 0

(c) Gaussian

$$\phi(r) = \exp\left(-\frac{r^2}{2\sigma^2}\right)$$ for some $\sigma$ > 0

Page 18:

• Inverse multiquadrics and Gaussian RBFs are both examples of ‘localized’ functions

• Multiquadrics RBFs are ‘nonlocalized’ functions

Linear activation function has some undesirable properties e.g. $\phi(x_i) = 0$. (NB $\phi$ is still a non-linear function as it is only piecewise linear in x).

Page 19:

‘Localized’: as distance from the centre increases the output of the RBF decreases

Page 20:

• ‘Nonlocalized’: as distance from the centre increases the output of the RBF increases
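The localized/nonlocalized distinction can be seen by evaluating the three families at increasing distances. A small sketch, with c = 1 and $\sigma$ = 1 chosen arbitrarily:

```python
import math

def multiquadric(r, c=1.0):
    return math.sqrt(r ** 2 + c ** 2)

def inverse_multiquadric(r, c=1.0):
    return 1.0 / math.sqrt(r ** 2 + c ** 2)

def gaussian(r, sigma=1.0):
    return math.exp(-r ** 2 / (2 * sigma ** 2))

radii = [0.0, 1.0, 2.0, 4.0]
for r in radii:
    # Multiquadric grows with r; the other two decay towards zero.
    print(r, round(multiquadric(r), 3), round(inverse_multiquadric(r), 3), round(gaussian(r), 3))
```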

Page 21:

Example: XOR problem

x       d
(0,0)   0
(1,1)   0
(0,1)   1
(1,0)   1

Exact interpolation: RBF placed at position of each pattern vector, using 2) Gaussian RBF with $\sigma$ = 1

Page 22:

i.e. 4 hidden units in network

2) $\phi_i(x) = \exp\left(-\frac{\|x - x_i\|^2}{2\sigma^2}\right)$

[Figure: network structure, with weights w]

Page 23:

Results

$$\begin{pmatrix} \exp(0) & \exp(-0.5) & \exp(-0.5) & \exp(-1) \\ \exp(-0.5) & \exp(0) & \exp(-1) & \exp(-0.5) \\ \exp(-0.5) & \exp(-1) & \exp(0) & \exp(-0.5) \\ \exp(-1) & \exp(-0.5) & \exp(-0.5) & \exp(0) \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \\ w_3 \\ w_4 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 1 \\ 0 \end{pmatrix}$$

$$\begin{pmatrix} w_1 \\ w_2 \\ w_3 \\ w_4 \end{pmatrix} = \begin{pmatrix} -3.0359 \\ 3.4233 \\ 3.4233 \\ -3.0359 \end{pmatrix}$$

Page 24:

2) Gaussian RBF:

$$f(x_1, x_2) = -3.0359 \exp\left(-\frac{x_1^2 + x_2^2}{2}\right) + 3.4233 \exp\left(-\frac{(x_1 - 1)^2 + x_2^2}{2}\right) + 3.4233 \exp\left(-\frac{x_1^2 + (x_2 - 1)^2}{2}\right) - 3.0359 \exp\left(-\frac{(x_1 - 1)^2 + (x_2 - 1)^2}{2}\right)$$

1) Linear RBF:

$$f(x_1, x_2) = \sqrt{x_1^2 + x_2^2} - \frac{\sqrt{2}}{2}\sqrt{(x_1 - 1)^2 + x_2^2} - \frac{\sqrt{2}}{2}\sqrt{x_1^2 + (x_2 - 1)^2} + \sqrt{(x_1 - 1)^2 + (x_2 - 1)^2}$$
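The Gaussian weights quoted for the XOR problem can be reproduced by solving the 4-by-4 system directly. This sketch assumes NumPy, $\sigma$ = 1, and the pattern order (0,0), (0,1), (1,0), (1,1).

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([0, 1, 1, 0], dtype=float)
sigma = 1.0

# Phi[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2)); squared distances are 0, 1 or 2.
sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
Phi = np.exp(-sq_dist / (2 * sigma ** 2))

w = np.linalg.solve(Phi, d)
print(np.round(w, 4))
```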

Page 25:

[Figure: Large $\sigma$ = 1]

Page 26:

[Figure: Small $\sigma$ = 0.2]

Page 27:

Problems with exact interpolation

can produce poor generalisation performance as only data points constrain mapping

overfitting problem

Bishop (1995) example

• Underlying function $f(x) = 0.5 + 0.4\sin(2\pi x)$ sampled randomly for 30 points
• added Gaussian noise to each data point
• 30 data points, 30 hidden RBF units
• fits all data points but creates oscillations due to the added noise and to being unconstrained between data points

Page 28:

[Figure: two panels, "All Data Points" (one basis function per data point) and "5 Basis functions"]
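A rough numerical sketch in the spirit of this example, not a reproduction of Bishop's experiment. It assumes NumPy, evenly spaced inputs (chosen here instead of random ones to keep the interpolation matrix well conditioned), a noise level of 0.05, and basis widths of 0.05 and 0.25; all of these are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 30
x = np.linspace(0.0, 1.0, N)   # assumption: evenly spaced inputs
t = 0.5 + 0.4 * np.sin(2 * np.pi * x) + rng.normal(scale=0.05, size=N)

def design(x, centres, sigma):
    """Gaussian basis function outputs, one column per centre."""
    return np.exp(-(x[:, None] - centres[None, :]) ** 2 / (2 * sigma ** 2))

# Exact interpolation: 30 data points, 30 basis functions.
sigma_exact = 0.05
w_exact = np.linalg.solve(design(x, x, sigma_exact), t)

# 5 basis functions plus a bias, fitted by least squares.
centres5, sigma5 = np.linspace(0.0, 1.0, 5), 0.25
Phi5 = np.column_stack([np.ones(N), design(x, centres5, sigma5)])
w5, *_ = np.linalg.lstsq(Phi5, t, rcond=None)

# Compare both fits with the noise-free underlying function on a dense grid.
grid = np.linspace(0.0, 1.0, 200)
truth = 0.5 + 0.4 * np.sin(2 * np.pi * grid)
f_exact = design(grid, x, sigma_exact) @ w_exact
f_five = np.column_stack([np.ones(200), design(grid, centres5, sigma5)]) @ w5
print("rms error, exact interpolation:", np.sqrt(np.mean((f_exact - truth) ** 2)))
print("rms error, 5 basis functions:  ", np.sqrt(np.mean((f_five - truth) ** 2)))
```

The exact interpolant passes through every noisy target, so its error against the underlying function reflects the noise it has fitted; the 5-basis fit cannot follow individual points.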

Page 29:

To fit an RBF to every data point is very inefficient due to the computational cost of matrix inversion and is very bad for generalisation so:

• Use fewer RBFs than data points, i.e. M < N
• Therefore don't necessarily have RBFs centred at data points
• Can include bias terms
• Can have Gaussians with general covariance matrices but there is a trade-off between complexity and the number of parameters to be found

Page 30:

For Gaussian RBFs in d dimensions, the width of each basis function has:

• 1 parameter (spherical, a single width)
• d parameters (diagonal covariance matrix)
• d(d+1)/2 parameters (general covariance matrix)

Page 31:

6. Radial-basis function (RBF) networks II

Generalised radial basis function networks

Exact interpolation expensive due to cost of matrix inversion

• prefer fewer centres (hidden RBF units)
• centres not necessarily at data points
• can include biases
• can have general covariance matrices

now no longer exact interpolation, so

$$y(x) = \sum_{i=0}^{M} w_i \phi_i(x)$$

where M (number of hidden units) < N (number of training data)

Page 32:

Three-layer networks

[Figure: network diagram. Input: nD vector (x1, x2, ..., xN); hidden layer of basis functions $\phi_1(x), \ldots, \phi_M(x) = \phi(\|x - x_M\|)$; weights w_1, ..., w_M to the output y; w0 = bias]

1. output = $\sum_i w_i \phi_i(x)$

2. adjustable parameters are weights w_j, number of hidden units M (< N)

3. Form of the basis functions decided in advance

Page 33:

[Figure: RBF unit vs MLP unit. RBF: an input x at distances r1, r2 from the centres (*) gives $\phi(r_1)$, $\phi(r_2)$, combined with weights w1, w2 into F(x). MLP: hidden units sig(w1^T x), sig(w2^T x), combined with weights w31, w32 into F(x); a hidden unit is constant along the line w1^T x = k]

Page 34:

Comparison of MLP to RBFN

MLP

hidden unit outputs are monotonic functions of a weighted linear sum of the inputs => constant on (d-1)D hyperplanes

distributed representation as many hidden units contribute to network output => interference between units => non-linear training => slow convergence

RBF

hidden unit outputs are functions of distance from prototype vector (centre) => constant on concentric (d-1)D hyperellipsoids

localised hidden units mean that few contribute to output => lack of interference => faster convergence

Page 35:

Comparison of MLP to RBFN

MLP

more than one hidden layer

global supervised learning of all weights

global approximations to nonlinear mappings

RBF

one hidden layer

hybrid learning with supervised learning in one set of weights

localised approximations to nonlinear mappings


Page 37:

Hybrid training of RBFN

Two stage ‘hybrid’ learning process

stage 1: parameterise hidden layer of RBFs
- hidden unit number (M)
- centre/position (t_i)
- width ($\sigma$)
use unsupervised methods (see below) as they are quick and unlabelled data is plentiful. Idea is to estimate the density of the data

stage 2: find weight values between hidden and output units
minimize sum-of-squares error between actual output and desired responses
-- invert matrix $\Phi$ if M = N
-- pseudoinverse of $\Phi$ if M < N

Stage 2 later, now concentrate on stage 1.

Page 38:

Random subset approach

Randomly select centres of M RBF hidden units from N data points

widths of RBFs usually common and fixed to ensure a degree of overlap but based on an average or maximum distance between RBFs e.g. $\sigma = d_{max} / \sqrt{2M}$

where d_max is the maximum distance between the set of M RBF units

The method is efficient and fast, but suboptimal and it's important to get $\sigma$ correct …
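A minimal sketch of the random subset approach, assuming NumPy. The function name `random_subset_rbf` and the uniform test data are hypothetical; the width uses the d_max / sqrt(2M) heuristic quoted above.

```python
import numpy as np

def random_subset_rbf(X, M, rng):
    """Pick M data points as centres and set one common width d_max / sqrt(2M)."""
    centres = X[rng.choice(len(X), size=M, replace=False)]
    # d_max: largest distance between any two of the chosen centres.
    pairwise = np.linalg.norm(centres[:, None, :] - centres[None, :, :], axis=-1)
    sigma = pairwise.max() / np.sqrt(2 * M)
    return centres, sigma

rng = np.random.default_rng(0)
X = rng.uniform(size=(100, 2))   # hypothetical data set
centres, sigma = random_subset_rbf(X, M=10, rng=rng)
print(centres.shape, sigma)
```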

Page 39:

[Figure: three panels labelled 10, 0.08 and 0.4]

Page 40:

Clustering Methods: K-means algorithm
-- divides data points into K subgroups based on similarity

Batch version
1. Randomly assign each pattern vector x to one of K subsets
2. Compute mean vector of each subset
3. Reassign each point to subset with closest mean vector
4. Until no further reassignments, loop back to 2

On-line version
1. Randomly choose K data points to be basis centres $\mu_i$
2. As each vector $x^n$ is presented, update the nearest $\mu_i$ using: $\Delta\mu_i = \eta(x^n - \mu_i)$
3. Repeat until no further changes
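Both versions can be sketched in NumPy as below. The function names, the learning rate `eta`, the fixed number of passes in the on-line version, and the re-seeding of empty subsets are assumptions added to make the sketch runnable; they are not prescribed by the slides.

```python
import numpy as np

def kmeans_batch(X, K, rng, max_iter=100):
    labels = rng.integers(K, size=len(X))              # 1. random assignment
    for _ in range(max_iter):
        # 2. mean of each subset (an empty subset is re-seeded with a random point)
        means = np.array([X[labels == k].mean(axis=0) if np.any(labels == k)
                          else X[rng.integers(len(X))] for k in range(K)])
        # 3. reassign each point to the subset with the closest mean
        dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=-1)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):         # 4. stop when nothing changes
            break
        labels = new_labels
    return means, labels

def kmeans_online(X, K, rng, eta=0.1, epochs=20):
    centres = X[rng.choice(len(X), size=K, replace=False)].copy()   # 1.
    for _ in range(epochs):                            # 3. fixed number of passes
        for x in X[rng.permutation(len(X))]:
            j = np.argmin(np.linalg.norm(centres - x, axis=1))
            centres[j] += eta * (x - centres[j])       # 2. move nearest centre
    return centres

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, size=(20, 2)),     # hypothetical two-blob data
               rng.normal(5.0, 0.1, size=(20, 2))])
means, labels = kmeans_batch(X, K=2, rng=rng)
centres = kmeans_online(X, K=2, rng=rng)
print(means, centres, sep="\n")
```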

Page 41:

The covariance matrices ($\Sigma_i$) can now be set to the covariance of the data points of each subset

-- However, note that K must be decided at the start
-- Also, the algorithm can be sensitive to initial conditions
-- Can get problems of no/few points being in a set: see competitive learning lecture
-- Might not cover the space accurately

Other unsupervised techniques such as self organising maps and Gaussian mixture models can also be used

Another approach is to use supervised techniques where the parameters of the basis functions are adaptive and can be optimised. However, this negates the speed and simplicity advantages of the 1st stage of training.

Page 42:

Relationship with probability density function estimation

Radial basis functions can be related to kernel density functions (Parzen windows) used to estimate probability density functions

E.g. in 2 dimensions the pdf at a point x can be estimated from the fraction of training points which fall within a square of side h centred on x

[Figure: six training points (*) in the x-y plane, with a square of side h centred on the point X]

Here $p(x) = \frac{1}{6} \times \frac{1}{h \times h} \times \sum_n H(x - x_n, h)$

where H = 1 if x_n lies inside the square of side h centred on x (each component of x_n - x smaller than h/2 in magnitude) and 0 otherwise, i.e. estimate density by fraction of points within each square

Alternatively, H(|x_n - x|) could be Gaussian, giving a smoother estimate for the pdf
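A small sketch of the two estimators in plain Python. The six data points are invented, the square window counts a point when every coordinate is within h/2 of x, and the Gaussian version uses h as its standard deviation, which is one common convention rather than anything fixed by the slides.

```python
import math

def parzen_square(x, data, h):
    """Fraction of points inside the side-h square centred on x, per unit area."""
    inside = sum(all(abs(xi - pi) < h / 2 for xi, pi in zip(x, p)) for p in data)
    return inside / (len(data) * h ** len(x))

def parzen_gaussian(x, data, h):
    """Smoother estimate: average of Gaussian kernels of width h centred on the data."""
    norm = (2 * math.pi * h ** 2) ** (len(x) / 2)
    return sum(math.exp(-math.dist(x, p) ** 2 / (2 * h ** 2)) for p in data) / (len(data) * norm)

# Hypothetical training points (N = 6, as in the figure).
data = [(0.1, 0.1), (0.2, 0.3), (0.9, 0.9), (1.5, 1.5), (2.0, 2.0), (3.0, 1.0)]
print(parzen_square((0.0, 0.0), data, h=1.0))
print(parzen_gaussian((0.0, 0.0), data, h=1.0))
```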

Page 43:

In Radial basis networks the first stage of training is an attempt to model the density of the data in an unsupervised way

As in kernel density estimation, we try to get an idea of the underlying density by picking some prototypical points

Then use distribution of the data to approximate a prior distribution

Page 44:

Back to Stage 2 for a network with M < N basis vectors

Now for each training data vector t_i and corresponding target d_i we want F(t_i) = d_i, that is, we must find a function F that satisfies the interpolation condition: F(t_i) = d_i for i = 1,...,N

Or more exactly find:

$$F(x) = \sum_{j=0}^{M} w_j \phi(\|x - x_j\|)$$

satisfying:

$$F(t_i) = \sum_{j=0}^{M} w_j \phi(\|t_i - x_j\|) = d_i$$

Page 45:

So the interpolation matrix becomes:

$$\begin{pmatrix} 1 & \phi(\|t_1 - x_1\|) & \cdots & \phi(\|t_1 - x_M\|) \\ \vdots & \vdots & & \vdots \\ 1 & \phi(\|t_N - x_1\|) & \cdots & \phi(\|t_N - x_M\|) \end{pmatrix} \begin{pmatrix} w_0 \\ w_1 \\ \vdots \\ w_M \end{pmatrix} = \begin{pmatrix} d_1 \\ \vdots \\ d_N \end{pmatrix}$$

(the leading column of ones multiplies the bias w_0)

Which can be written as:

$$\Phi W = D$$

where $\Phi$ is an N×(M+1) matrix (not square).

Page 46:

To solve this we need to generate an error function such as the least squares error:

$$E = \frac{1}{2} \sum_n \left( y(t_n) - d_n \right)^2$$

and minimise it.

As the derivative of the least squares error is a linear function of the weights it can be solved using linear matrix inversion techniques (usually singular value decomposition (Press et al., Numerical Recipes)).

Other error functions can be used but minimising the error then becomes a non-linear optimisation problem.

Page 47:

However, note that the problem is overdetermined.

That is, by using N training vectors and only M centres we have M unknowns (the weights) and N bits of information.

E.g. training vectors (-2, 0), (1, 0), targets 1, 2, centre (0, 0), linear RBF:

$$\Phi W = D \Rightarrow \begin{pmatrix} 2 \\ 1 \end{pmatrix} w = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$$

w = 0.5 or w = 2 ???

Unless N = M and there are no degeneracies (parallel or nearly parallel data vectors), we cannot simply invert the matrix and must use the pseudoinverse (using Singular Value Decomposition).
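For this two-point example the pseudoinverse gives the least-squares compromise between the two conflicting equations. A sketch assuming NumPy (whose `pinv` is computed via the singular value decomposition):

```python
import numpy as np

T = np.array([[-2.0, 0.0], [1.0, 0.0]])   # training vectors
centre = np.array([0.0, 0.0])             # single centre, linear RBF
D = np.array([1.0, 2.0])                  # targets

# Phi has one column (M = 1) and two rows (N = 2): distances 2 and 1.
Phi = np.linalg.norm(T - centre, axis=1)[:, None]

w = np.linalg.pinv(Phi) @ D
print(w)          # least-squares weight, between 0.5 and 2
print(Phi @ w)    # network outputs; neither target is matched exactly
```

Here the weight works out to (2·1 + 1·2) / (2² + 1²) = 0.8.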

Page 48:

Alternatively, can view this as an ill-posed problem

Ill-posed problems (Tikhonov)

How do we infer function F which maps X onto y from a finite data set?

This can be done if problem is well-posed
- existence = each input pattern has an output
- uniqueness = each input pattern maps onto only one output
- continuity = small changes in input pattern space imply small changes in y

In RBFs however:
- noise can violate continuity condition
- different output values for same input patterns violates uniqueness
- insufficient information in training data may violate existence condition

Page 49:

Ill-posed problem: the finite data set does not yield a unique solution

Page 50:

Regularization theory (Tikhonov, 1963)

To solve ill-posed problems need to supplement finite data set with prior knowledge about nature of mapping
-- regularization theory

• common to place constraint that mapping is smooth (since smoothness implies continuity)

• add penalty term to standard sum-of-squares error for non-smooth mappings

$$E(F) = E_S(F) + \lambda E_C(F)$$

where e.g.:

$$E_S(F) = \frac{1}{2} \sum_i \left( d_i - F(x_i) \right)^2 \quad \text{and} \quad E_C(F) = \frac{1}{2} \|DF\|^2$$

and DF could be, say, the first or second order derivative of F etc.

Page 51:

$\lambda$ is called the regularization parameter:

• $\lambda = 0$: unconstrained (smoothness not enforced)
• $\lambda = \infty$: smoothness constraint dominates and less account is taken of training data error

$\lambda$ controls balance (trade-off) between a smooth mapping and fitting the data points exactly

Page 52:

[Figure: fits with E_C = curvature, for $\lambda$ = 0 and $\lambda$ = 40]

Page 53:

Regularization networks

-- Poggio & Girosi (1990) applied regularization theory to RBF networks
-- By minimizing the new error function E(F) we obtain (using results from functional analysis)

$$W = (\Phi + \lambda I)^{-1} D$$

where I is the unit matrix. Provided E_C is chosen to be quadratic in y, this equation can be solved using the same techniques as the non-regularised network.
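A short sketch of the regularised solution, assuming NumPy, Gaussian basis functions centred on ten evenly spaced 1-D points, and arbitrary values for the width (0.1), the noise level and $\lambda$.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 10)
d = 0.5 + 0.4 * np.sin(2 * np.pi * x) + rng.normal(scale=0.05, size=10)

# Square interpolation matrix: one Gaussian per data point.
Phi = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2 * 0.1 ** 2))

def regularised_weights(Phi, d, lam):
    """W = (Phi + lam * I)^{-1} D, solved without forming the inverse."""
    return np.linalg.solve(Phi + lam * np.eye(len(d)), d)

w_exact = regularised_weights(Phi, d, 0.0)    # lambda = 0: exact interpolation
w_smooth = regularised_weights(Phi, d, 0.1)   # lambda > 0: data no longer fitted exactly
print(np.linalg.norm(w_exact), np.linalg.norm(w_smooth))
print(np.max(np.abs(Phi @ w_smooth - d)))
```

With $\lambda$ = 0 the targets are reproduced exactly; increasing $\lambda$ shrinks the weights and trades training error for smoothness.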

Page 54:

Problems of RBFs

1. Need to choose number of basis functions

2. Due to local nature of basis functions has problems in ignoring ‘noisy’ input dimensions unlike MLPs (helps to use dimensionality reduction such as PCA)

[Figure: left, 1D data, M rbfs; right, same data with uncorrelated noise, M^2 rbfs]

Page 55:

Problems of RBFs 2

3. Optimal choice of basis function parameters may not be optimal for the output task

Data from h => rbf at a, but gives a bad representation of h. In contrast, one centred at b would be perfect

Page 56:

Problems of RBFs 3

4. Because of dependence on distance, if variation in one parameter is small with respect to the others it will contribute very little to the outcome: $(l + \delta)^2 \approx l^2$. Therefore, preprocess data to give zero mean and unit variance via simple transformation:

$$x^* = \frac{x - \mu}{\sigma}$$

(Could achieve the same using general covariance matrices but this is simpler)

Page 57:

However, this does not take into account correlations in the data.

Better to use whitening (Bishop, 1995, pp 299-300)

Page 58:

$$x^* = \Lambda^{-1/2} U^T (x - \mu)$$

where U is a matrix whose columns are the eigenvectors u_i of $\Sigma$, the covariance matrix of the data, and $\Lambda$ a matrix with the corresponding eigenvalues $\lambda_i$ on the diagonal, i.e.:

$$U = (u_1, \ldots, u_n)$$

And:

$$\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$$

[Figure: data distribution with principal axes $\lambda_1 u_1$ and $\lambda_2 u_2$]
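The two preprocessing options, per-dimension rescaling and whitening, can be compared on correlated data. A sketch assuming NumPy; the function names and the synthetic 2-D data are illustrative.

```python
import numpy as np

def standardise(X):
    """Zero mean and unit variance in each input dimension separately."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def whiten(X):
    """x* = Lambda^{-1/2} U^T (x - mu), applied to every row of X."""
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)    # covariance matrix of the data
    lam, U = np.linalg.eigh(Sigma)     # eigenvalues and eigenvectors (columns of U)
    return (X - mu) @ U / np.sqrt(lam)

rng = np.random.default_rng(0)
# Hypothetical correlated 2-D data.
X = rng.normal(size=(500, 2)) @ np.array([[2.0, 0.0], [1.5, 0.5]])

print(np.round(np.cov(standardise(X), rowvar=False), 2))  # correlation remains
print(np.round(np.cov(whiten(X), rowvar=False), 2))       # identity matrix
```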

Page 59:

Using RBF Nets in practice

• Choose a functional form (Gaussian generally, but prior knowledge/experience may suggest others)

• Select the type of pre-processing

--Reduce dimensionality (techniques to follow in next few lectures) ?

--Normalise (whiten) data?

(no way of knowing if these will be helpful: may need to try a few combinations)

• Select clustering method (k-means)

• Select number of basis functions, cluster and find basis centres

• Find weights (via matrix inversion)

• Calculate performance measure.

Page 60:

If only life were so simple…

• How do we choose k? Similar to problem of selecting number of hidden nodes for MLP

• What type of pre-processing is best?

• Does the clustering method work for the data? E.g might be better to fix and try again.

There is NO general answer: each choice will be problem-specific. The only info you have is your performance measure.

Page 61:

Note the dependence on the performance measure (make sure it’s a good one).

Good thing about RBF Nets is that the training procedure is relatively quick and so lots of combinations can be used.

Idea: try e.g. increasing k until performance measure decreases (or gets to a minimum, or something more adventurous).

[Figure: performance measure plotted against k, with the optimal k marked]
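One way to carry out such a search is sketched below, assuming NumPy. The noisy sine data, the random-subset centres with the d_max / sqrt(2k) width, and the use of mean squared error on a held-out validation set as the performance measure are all choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Hypothetical 1-D regression data: noisy samples of a sine curve."""
    x = rng.uniform(0.0, 1.0, size=n)
    return x, 0.5 + 0.4 * np.sin(2 * np.pi * x) + rng.normal(scale=0.05, size=n)

def design(x, centres, sigma):
    """Gaussian basis outputs with a leading column of ones for the bias w0."""
    G = np.exp(-(x[:, None] - centres[None, :]) ** 2 / (2 * sigma ** 2))
    return np.column_stack([np.ones(len(x)), G])

x_train, d_train = make_data(60)
x_val, d_val = make_data(60)

val_error = {}
for k in range(2, 16):
    centres = rng.choice(x_train, size=k, replace=False)     # stage 1: centres
    sigma = (centres.max() - centres.min()) / np.sqrt(2 * k)  # common width
    w, *_ = np.linalg.lstsq(design(x_train, centres, sigma), d_train, rcond=None)  # stage 2
    val_error[k] = np.mean((design(x_val, centres, sigma) @ w - d_val) ** 2)

best_k = min(val_error, key=val_error.get)
print(best_k, val_error[best_k])
```

Because each fit is only a linear least-squares problem, trying many values of k in this way is cheap.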