Support Vector Machines
RBF networks, Support Vector Machines, good decision boundary, optimization problem, soft-margin hyperplane, non-linear decision boundary, kernel trick, approximation accuracy, overtraining
Gaussian response function

Each hidden-layer unit computes

$$h_i = e^{-D_i^2 / (2\sigma^2)}, \qquad D_i^2 = (\vec{x} - \vec{u}_i)^T (\vec{x} - \vec{u}_i)$$

where $\vec{x}$ is the input vector and $\vec{u}_i$ is the weight vector (center) of hidden-layer neuron $i$.
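As a minimal numerical sketch of the activation formula (function and variable names are mine, not from the slides):

```python
import numpy as np

def rbf_activations(x, centers, sigma):
    """Compute h_i = exp(-D_i^2 / (2 sigma^2)) for every hidden unit,
    where D_i^2 = (x - u_i)^T (x - u_i)."""
    d2 = np.sum((centers - x) ** 2, axis=1)   # squared distances D_i^2
    return np.exp(-d2 / (2.0 * sigma ** 2))

# Example: two hidden units, one centered exactly at the input.
centers = np.array([[0.0, 0.0], [3.0, 4.0]])
h = rbf_activations(np.array([0.0, 0.0]), centers, sigma=1.0)
# h[0] is exactly 1 (zero distance); h[1] = exp(-25/2) is essentially 0
```

The response is maximal when the input coincides with a unit's center and falls off with a Gaussian profile of width $\sigma$.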
Location of the centers u

The location of each receptive field is critical. Apply clustering to the training set; each resulting cluster center becomes the center u of the receptive field of one hidden neuron.
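The clustering step can be sketched with a plain k-means pass over the training inputs (a minimal sketch, assuming nothing beyond NumPy; the function name is mine):

```python
import numpy as np

def kmeans_centers(X, k, iters=20, seed=0):
    """Plain k-means; each resulting cluster center is a candidate
    receptive-field center u_i for one hidden neuron."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each point to its nearest current center
        labels = np.argmin(((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

# Two well-separated blobs -> centers land on (0,0) and (10,10).
X = np.vstack([np.zeros((5, 2)), np.full((5, 2), 10.0)])
centers = kmeans_centers(X, k=2)
```

Any other clustering method would serve; the point is only that each discovered cluster center seeds one hidden neuron.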
Determining σ

The following heuristic performs well in practice: for each hidden-layer neuron $i$, find the RMS distance between $\vec{u}_i$ and the centers $\vec{c}_l$ of its $N$ nearest neighbors, and assign this value to $\sigma_i$:

$$\sigma_i = \mathrm{RMS} = \sqrt{\frac{1}{n}\sum_{k=1}^{n}\left(\frac{1}{N}\sum_{l=1}^{N}\left(u_k - c_{lk}\right)\right)^{2}}$$
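The heuristic can be sketched as follows. This implements one plausible reading of the slide's formula (the RMS, over the n vector components, of the mean offset between u_i and its N nearest neighbor centers); all names are mine:

```python
import numpy as np

def sigma_rms(u, neighbors):
    """sigma_i for one hidden neuron with center u, given the centers
    of its N nearest neighbors as rows of `neighbors`."""
    # (1/N) * sum_l (u_k - c_{lk}), per component k
    mean_offset = np.mean(u - neighbors, axis=0)
    # RMS over the n components
    return np.sqrt(np.mean(mean_offset ** 2))

u = np.array([2.0, 2.0])
neighbors = np.array([[0.0, 0.0], [2.0, 0.0]])
s = sigma_rms(u, neighbors)
# mean offset = (1, 2), so s = sqrt((1 + 4) / 2) = sqrt(2.5)
```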
The output neuron produces the linear weighted sum

$$o = \sum_{i=0}^{n} w_i h_i$$

The weights are adapted with the LMS rule:

$$\Delta w_i = \eta \,(t - o)\, h_i$$
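A minimal sketch of the LMS adaptation (the learning rate and the hidden activations are made-up values):

```python
import numpy as np

def lms_step(w, h, t, eta=0.1):
    """One LMS update for the linear output neuron:
    o = sum_i w_i h_i, then delta w_i = eta * (t - o) * h_i."""
    o = w @ h
    return w + eta * (t - o) * h

# Repeated updates on a fixed pattern drive the output toward t.
w = np.zeros(3)
h = np.array([1.0, 0.5, 0.2])   # hidden-layer activations (illustrative)
for _ in range(200):
    w = lms_step(w, h, t=1.0)
```

Each step shrinks the error by the factor $(1 - \eta \|h\|^2)$, so for a small enough learning rate the output converges to the target.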
Why does an RBF network work?

The hidden layer applies a nonlinear transformation from the input space to the hidden space. In the hidden space, a linear discrimination can be performed.
Support Vector Machines

A linear machine that constructs a hyperplane as the decision surface in such a way that the margin of separation between positive and negative examples is maximized, giving good generalization performance. The support vector learning algorithm can construct three kinds of learning machines: polynomial learning machines, radial-basis function networks, and two-layer perceptrons.
Two-Class Problem: Linearly Separable Case

Many decision boundaries can separate the two classes (Class 1 and Class 2). Which one should we choose?
Example of Bad Decision Boundaries

(Two figures: decision boundaries between Class 1 and Class 2 that lie too close to one of the classes.)
Good Decision Boundary: the Margin Should Be Large

The decision boundary should be as far away from the data of both classes as possible; we should maximize the margin $m$. For points $\vec{x}_1$, $\vec{x}_2$ on the two margin planes,

$$m = \frac{\vec{w}}{\|\vec{w}\|} \cdot (\vec{x}_1 - \vec{x}_2) = \frac{2}{\|\vec{w}\|}$$
$$g(\vec{x}) = \vec{w}^T \vec{x} + b$$

Writing $\vec{x} = \vec{x}_P + r\,\dfrac{\vec{w}}{\|\vec{w}\|}$, with $\vec{x}_P$ the projection of $\vec{x}$ onto the hyperplane:

$$g(\vec{x}) = \vec{w}^T \vec{x}_P + b + r\,\frac{\vec{w}^T \vec{w}}{\|\vec{w}\|} = r\,\|\vec{w}\|$$
$$g(\vec{x}) = \vec{w}^T \vec{x} + b = \pm 1 \quad \text{for } d = \pm 1$$

$$r = \frac{g(\vec{x})}{\|\vec{w}\|} =
\begin{cases}
\dfrac{1}{\|\vec{w}\|} & \text{if } d = 1 \\[6pt]
\dfrac{-1}{\|\vec{w}\|} & \text{if } d = -1
\end{cases}$$

$$m = 2r = \frac{2}{\|\vec{w}\|}$$
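A quick numerical check of $m = 2/\|\vec{w}\|$ (the vectors here are my own example):

```python
import numpy as np

# For the hyperplane w^T x + b = 0 with w = (3, 4) and b = 0,
# the margin between the planes w^T x + b = +1 and -1 is 2/||w||.
w = np.array([3.0, 4.0])
m = 2.0 / np.linalg.norm(w)   # = 2/5 = 0.4

# Check directly: x_plus and x_minus sit on the two margin planes,
# separated along the unit normal w/||w|| by exactly m.
x_plus = w / np.dot(w, w)     # satisfies w^T x = +1
x_minus = -w / np.dot(w, w)   # satisfies w^T x = -1
gap = np.dot(w / np.linalg.norm(w), x_plus - x_minus)
```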
The Optimization Problem

Let $\{\vec{x}_1, \ldots, \vec{x}_n\}$ be our data set and let $y_i \in \{1, -1\}$ be the class label of $\vec{x}_i$. The decision boundary should classify all points correctly, i.e. $y_i(\vec{w}^T\vec{x}_i + b) \ge 1$ for all $i$. This gives a constrained optimization problem: minimize $\frac{1}{2}\|\vec{w}\|^2$ subject to these constraints.
The Optimization Problem

Introduce Lagrange multipliers $\alpha_i \ge 0$; the Lagrange function, to be minimized with respect to $\vec{w}$ and $b$, is

$$L(\vec{w}, b, \alpha) = \frac{1}{2}\|\vec{w}\|^2 - \sum_{i=1}^{N} \alpha_i \left[ y_i \left( \vec{w}^T \vec{x}_i + b \right) - 1 \right]$$
The Optimization Problem

We can transform the problem to its dual:

$$\max_{\alpha}\; \sum_{i=1}^{N} \alpha_i - \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j\, y_i y_j\, \vec{x}_i^T \vec{x}_j
\quad \text{s.t. } \alpha_i \ge 0,\; \sum_{i=1}^{N} \alpha_i y_i = 0$$

This is a quadratic programming (QP) problem, so the global maximum over the $\alpha_i$ can always be found. $\vec{w}$ can then be recovered by

$$\vec{w} = \sum_{i=1}^{N} \alpha_i y_i \vec{x}_i$$
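As a tiny worked example (my own, not from the slides), the recovery $\vec{w} = \sum_i \alpha_i y_i \vec{x}_i$ can be checked on a two-point problem whose dual solution is known by hand:

```python
import numpy as np

# Toy problem: x1 = (1, 0) with y1 = +1, x2 = (-1, 0) with y2 = -1.
# Solving the dual by hand gives alpha_1 = alpha_2 = 0.5
# (both points are support vectors).
X = np.array([[1.0, 0.0], [-1.0, 0.0]])
y = np.array([1.0, -1.0])
alpha = np.array([0.5, 0.5])

# Recover the primal weight vector: w = sum_i alpha_i y_i x_i
w = (alpha * y) @ X    # -> (1, 0)
# With b = 0, the margin constraints y_i (w^T x_i + b) >= 1 hold
# with equality, as expected for support vectors.
```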
A Geometrical Interpretation

(Figure: Class 1 and Class 2 with the separating hyperplane. The support vectors carry $\alpha_1 = 0.8$, $\alpha_6 = 1.4$, $\alpha_8 = 0.6$; all other $\alpha_i = 0$.)
What If the Data Are Not Linearly Separable?

We allow an "error" $\xi_i$ in classification (the points of Class 1 and Class 2 overlap).
Soft Margin Hyperplane

Define $\xi_i = 0$ if there is no error for $\vec{x}_i$; the $\xi_i$ are just "slack variables" in optimization theory. We want to minimize

$$\frac{1}{2}\|\vec{w}\|^2 + C \sum_{i=1}^{N} \xi_i$$

where $C$ is a tradeoff parameter between error and margin. The optimization problem becomes: minimize the above subject to $y_i(\vec{w}^T\vec{x}_i + b) \ge 1 - \xi_i$ and $\xi_i \ge 0$ for all $i$.
The Optimization Problem

The dual of the problem is the same as in the linearly separable case, except that there is now an upper bound $C$ on the $\alpha_i$:

$$\max_{\alpha}\; \sum_{i=1}^{N} \alpha_i - \frac{1}{2}\sum_{i,j} \alpha_i \alpha_j\, y_i y_j\, \vec{x}_i^T \vec{x}_j
\quad \text{s.t. } 0 \le \alpha_i \le C,\; \sum_i \alpha_i y_i = 0$$

$\vec{w}$ is recovered as before, and once again a QP solver can be used to find the $\alpha_i$.
Extension to Non-linear Decision Boundary

Key idea: transform the $\vec{x}_i$ to a higher-dimensional space to "make life easier". The input space is the space the $\vec{x}_i$ live in; the feature space is the space of the $\varphi(\vec{x}_i)$ after the transformation. Why transform? A linear operation in the feature space is equivalent to a non-linear operation in the input space, and the classification task can be "easier" with a proper transformation. Example: XOR.
Extension to Non-linear Decision Boundary

Possible problems with the transformation: a high computational burden, and it is hard to get a good estimate. SVM solves these two issues simultaneously: kernel tricks give efficient computation, and minimizing $\|\vec{w}\|^2$ leads to a "good" classifier.
(Figure: the mapping $\varphi(\cdot)$ carries points from the input space to the feature space.)
Example Transformation

Define a kernel function $K(\vec{x}, \vec{y})$ and consider a corresponding transformation $\varphi$. The inner product in the feature space can then be computed by $K$ without going through the map $\varphi(\cdot)$.
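One standard kernel/transformation pair (my example, not necessarily the slide's) is the degree-2 polynomial kernel in two dimensions; the identity K(x, y) = φ(x)ᵀφ(y) can be checked numerically:

```python
import numpy as np

# K(x, y) = (1 + x.y)^2 in 2-D, with the explicit feature map
# phi(x) = (1, sqrt(2)x1, sqrt(2)x2, x1^2, x2^2, sqrt(2)x1x2).
def K(x, y):
    return (1.0 + np.dot(x, y)) ** 2

def phi(x):
    x1, x2 = x
    r2 = np.sqrt(2.0)
    return np.array([1.0, r2 * x1, r2 * x2, x1 ** 2, x2 ** 2, r2 * x1 * x2])

x = np.array([1.0, 2.0])
yv = np.array([3.0, -1.0])
lhs = K(x, yv)                  # kernel value, no feature map needed
rhs = np.dot(phi(x), phi(yv))   # same value via the 6-D feature space
```

The kernel evaluates a 6-dimensional inner product at the cost of a 2-dimensional one, which is the point of the trick.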
Kernel Trick

The relationship between the kernel function $K$ and the mapping $\varphi(\cdot)$ is

$$K(\vec{x}, \vec{y}) = \varphi(\vec{x})^T \varphi(\vec{y})$$

This is known as the kernel trick. In practice we specify $K$, thereby specifying $\varphi(\cdot)$ indirectly, instead of choosing $\varphi(\cdot)$ directly. Intuitively, $K(\vec{x}, \vec{y})$ represents our desired notion of similarity between the data points $\vec{x}$ and $\vec{y}$, and this comes from our prior knowledge. $K(\vec{x}, \vec{y})$ needs to satisfy a technical condition (the Mercer condition) in order for $\varphi(\cdot)$ to exist.
Examples of Kernel Functions

Polynomial kernel with degree $d$: $K(\vec{x}, \vec{y}) = (\vec{x}^T\vec{y} + 1)^d$

Radial basis function kernel with width $\sigma$: $K(\vec{x}, \vec{y}) = e^{-\|\vec{x}-\vec{y}\|^2/(2\sigma^2)}$, closely related to radial basis function neural networks.

Sigmoid with parameters $\kappa$ and $\theta$: $K(\vec{x}, \vec{y}) = \tanh(\kappa\,\vec{x}^T\vec{y} + \theta)$; it does not satisfy the Mercer condition for all $\kappa$ and $\theta$.

Research on different kernel functions for different applications is very active.
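The three kernels can be evaluated directly; the closed forms used here are the conventional ones, and the parameter values are arbitrary:

```python
import numpy as np

def poly_kernel(x, y, d=2):
    """Polynomial kernel of degree d."""
    return (np.dot(x, y) + 1.0) ** d

def rbf_kernel(x, y, sigma=1.0):
    """Radial basis function kernel of width sigma."""
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

def sigmoid_kernel(x, y, kappa=1.0, theta=0.0):
    """Sigmoid kernel; not a Mercer kernel for all (kappa, theta)."""
    return np.tanh(kappa * np.dot(x, y) + theta)

x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])
vals = (poly_kernel(x, y), rbf_kernel(x, y), sigmoid_kernel(x, y))
# poly: (0 + 1)^2 = 1;  rbf: exp(-2/2) = exp(-1);  sigmoid: tanh(0) = 0
```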
Multi-class Classification

SVM is basically a two-class classifier. One can change the QP formulation to allow multi-class classification. More commonly, the data set is divided into two parts "intelligently" in different ways, and a separate SVM is trained for each way of division. Multi-class classification is then done by combining the outputs of all the SVM classifiers, e.g. by majority rule, error-correcting codes, or a directed acyclic graph.
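The majority-rule combination can be sketched as follows (the pairwise classifier outputs here are hypothetical stand-ins, not a trained model):

```python
from collections import Counter

def majority_vote(pairwise_predictions):
    """Given one class-label vote per pairwise SVM (one-vs-one scheme),
    return the class with the most votes."""
    return Counter(pairwise_predictions).most_common(1)[0][0]

# Three classes A, B, C -> three pairwise SVMs (A/B, A/C, B/C).
votes = ["A", "A", "C"]        # hypothetical outputs for one test point
winner = majority_vote(votes)  # "A" wins with 2 of 3 votes
```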
Conclusion

SVM is a useful alternative to neural networks. Two key concepts of SVM are maximizing the margin and the kernel trick. Much active research is taking place in areas related to SVM, and many SVM implementations are available on the web for you to try on your data set!
Measuring Approximation Accuracy

Approximation accuracy is measured by comparing the network's output with the correct values over the training set $D = \{(\vec{x}_1, \vec{t}_1), (\vec{x}_2, \vec{t}_2), \ldots, (\vec{x}_d, \vec{t}_d), \ldots, (\vec{x}_m, \vec{t}_m)\}$, via the mean squared error $F(\vec{w})$ of the network:

$$F(\vec{w}) = \frac{1}{m}\sum_{d=1}^{m} \left| \vec{t}_d - \vec{o}_d \right|^2$$
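The error measure can be computed directly (the target and output values below are made up):

```python
import numpy as np

def mse(targets, outputs):
    """F(w) = (1/m) * sum_d |t_d - o_d|^2, the network's mean squared
    error over the m training pairs."""
    t, o = np.asarray(targets), np.asarray(outputs)
    return np.mean(np.sum((t - o) ** 2, axis=-1))

t = np.array([[1.0], [0.0], [1.0]])   # desired outputs t_d
o = np.array([[0.5], [0.0], [1.0]])   # network outputs o_d
err = mse(t, o)   # (0.25 + 0 + 0) / 3
```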
Bibliography

Simon Haykin, Neural Networks, Second edition, Prentice Hall, 1999.