Machine Learning using Matlab - Uni Konstanz · Machine Learning using Matlab Lecture 7 Support...
Machine Learning using Matlab
Lecture 7 Support Vector Machine (SVM)
Note
● The deadline for presentation applications is 11.06.2017. If you have not sent your application yet, please send it as soon as possible.
● The presentation schedule will be released on our course website next week.
● In Thursday's lab session I will give a quiz; if you finish it in time, you will receive a bonus on your final score.
Outline
● Primal and dual forms
● Feature map
● Kernel trick
● Regression
● SVM toolbox
Intuition
[Figure: separating hyperplane with the support vectors labeled]
SVM is also called “maximum margin classifier”
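“Hard” margin
● Given training examples $(x_i, y_i)$, $i = 1, \dots, m$, with labels $y_i \in \{-1, +1\}$, SVM aims to find an optimal hyperplane $w^\top x + b = 0$ whose margin $2 / \|w\|$ is as large as possible, so that:
$$w^\top x_i + b \ge +1 \ \text{ if } y_i = +1, \qquad w^\top x_i + b \le -1 \ \text{ if } y_i = -1$$
● It is equivalent to minimizing the following function:
$$\min_{w, b} \ \frac{1}{2} \|w\|^2 \quad \text{subject to} \quad y_i (w^\top x_i + b) \ge 1, \quad i = 1, \dots, m$$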
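A minimal sketch of what the hard-margin conditions mean in practice, assuming the usual form $y_i (w^\top x_i + b) \ge 1$ and a margin width of $2 / \|w\|$. The toy data and the two candidate hyperplanes are hypothetical and hand-picked, not the output of a solver.

```python
import math

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def satisfies_hard_margin(w, b, X, y):
    """True if every example meets y_i * (w . x_i + b) >= 1."""
    return all(yi * (dot(w, xi) + b) >= 1.0 for xi, yi in zip(X, y))

def margin_width(w):
    """Distance between the two margin hyperplanes, 2 / ||w||."""
    return 2.0 / math.sqrt(dot(w, w))

# Hypothetical toy data: two examples per class, separable along x1.
X = [(2.0, 0.0), (3.0, 1.0), (-2.0, 0.0), (-3.0, -1.0)]
y = [+1, +1, -1, -1]

# Two hand-picked candidates; both are feasible, one has the wider margin.
w_wide, b_wide = (0.5, 0.0), 0.0
w_narrow, b_narrow = (1.0, 0.0), 0.0

for name, w, b in [("wide", w_wide, b_wide), ("narrow", w_narrow, b_narrow)]:
    print(name, satisfies_hard_margin(w, b, X, y), margin_width(w))
```

Both candidates satisfy the constraints, but the one with the smaller $\|w\|$ has the larger margin, which is why minimizing $\|w\|^2$ is the objective.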
Which classifier is better?
Tradeoff between the margin and the number of mistakes in the training data
Introduce “slack” variables $\xi_i \ge 0$
[Figure: margin with the support vectors labeled]
● For $0 < \xi_i \le 1$, the point is between the margin and the correct side of the hyperplane. This is a margin violation.
● For $\xi_i > 1$, the point is misclassified.
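“Soft” margin solution
The optimization problem becomes:
$$\min_{w, b, \xi} \ \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{m} \xi_i \quad \text{subject to} \quad y_i (w^\top x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, m$$
● Every constraint can be satisfied if $\xi_i$ is sufficiently large.
● C is a regularization parameter:
○ Small C ⇒ large margin
○ Large C ⇒ narrow margin
○ C = ∞ ⇒ hard margin
● This is called the primal form of SVM.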
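To make the role of the slack variables and of C concrete, here is a small sketch that evaluates the soft-margin objective for a fixed hyperplane. It assumes the standard form $\frac{1}{2}\|w\|^2 + C \sum_i \xi_i$ with $\xi_i = \max(0, 1 - y_i (w^\top x_i + b))$; the data and the hyperplane are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def slack(w, b, xi, yi):
    """Smallest xi >= 0 that satisfies y_i * (w . x_i + b) >= 1 - xi."""
    return max(0.0, 1.0 - yi * (dot(w, xi) + b))

def primal_objective(w, b, X, y, C):
    """Soft-margin objective: 0.5 * ||w||^2 + C * sum of slacks."""
    total_slack = sum(slack(w, b, xi, yi) for xi, yi in zip(X, y))
    return 0.5 * dot(w, w) + C * total_slack

# Hypothetical toy data: the second point violates the margin,
# the fourth point lies on the wrong side of the hyperplane.
X = [(2.0, 0.0), (0.5, 0.0), (-2.0, 0.0), (1.5, 0.0)]
y = [+1, +1, -1, -1]
w, b = (1.0, 0.0), 0.0  # hand-picked hyperplane, not a solver result

slacks = [slack(w, b, xi, yi) for xi, yi in zip(X, y)]
print(slacks)                                 # one slack value per example
print(primal_objective(w, b, X, y, C=1.0))    # mild penalty on violations
print(primal_objective(w, b, X, y, C=10.0))   # heavy penalty on violations
```

A slack between 0 and 1 marks a margin violation and a slack above 1 marks a misclassified point; raising C makes the same violations cost more, which pushes the optimizer toward a narrower margin.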
Different regularization
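Dual form
With Lagrange multipliers $\alpha_i$, we have the dual form of SVM:
$$\max_{\alpha} \ \sum_{i=1}^{m} \alpha_i - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_i \alpha_j y_i y_j \, x_i^\top x_j \quad \text{subject to} \quad 0 \le \alpha_i \le C, \quad \sum_{i=1}^{m} \alpha_i y_i = 0$$
The decision function can be rewritten as:
$$f(x) = \sum_{i=1}^{m} \alpha_i y_i \, x_i^\top x + b$$
Prediction is fast because most $\alpha_i$ are zero: only the support vectors contribute to the sum.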
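The sketch below illustrates why sparse multipliers make prediction cheap, assuming the usual dual decision function $f(x) = \sum_i \alpha_i y_i x_i^\top x + b$ and the relation $w = \sum_i \alpha_i y_i x_i$. The multipliers are hand-set for a hypothetical four-point data set, not computed by a solver.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def decision_function(x, alphas, X, y, b):
    """f(x) = sum_i alpha_i * y_i * <x_i, x> + b, skipping zero multipliers."""
    return b + sum(
        a * yi * dot(xi, x)
        for a, xi, yi in zip(alphas, X, y)
        if a != 0.0
    )

def primal_weights(alphas, X, y):
    """Recover w = sum_i alpha_i * y_i * x_i from the dual variables."""
    dim = len(X[0])
    return [
        sum(a * yi * xi[k] for a, xi, yi in zip(alphas, X, y))
        for k in range(dim)
    ]

# Hypothetical toy data; only the two points closest to the boundary
# carry a nonzero multiplier (they act as the support vectors).
X = [(1.0, 0.0), (3.0, 0.0), (-1.0, 0.0), (-3.0, 0.0)]
y = [+1, +1, -1, -1]
alphas = [0.5, 0.0, 0.5, 0.0]  # hand-set, assumed values
b = 0.0

print(primal_weights(alphas, X, y))
print(decision_function((2.0, 0.0), alphas, X, y, b))
print(decision_function((-0.5, 3.0), alphas, X, y, b))
```

Only two of the four training points enter the sum, so the cost of a prediction scales with the number of support vectors rather than with m.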
What if the data is not linearly separable?
● In logistic regression, we add more parameters to make the decision boundary nonlinear.
● However, we can’t do the same in SVM because we still want to learn a linear classifier.
Map data into higher dimension
Data is linearly separable in 3D space
Feature map
● By mapping data from a d-dimensional to a D-dimensional space (d < D), we can still learn a linear classifier.
● $x \mapsto \phi(x)$, where $\phi: \mathbb{R}^d \to \mathbb{R}^D$ is called the feature map.
What changes in classifier learning after mapping the features?
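As an illustration of a feature map, the sketch below lifts 2D points into 3D with the hypothetical map $\phi(x_1, x_2) = (x_1, x_2, x_1^2 + x_2^2)$. This particular map, the data, and the separating plane are assumptions chosen for the example; the lecture does not specify which map it uses.

```python
def phi(x):
    """Hypothetical feature map from 2D to 3D: append the squared radius."""
    x1, x2 = x
    return (x1, x2, x1 * x1 + x2 * x2)

# Toy data: one class near the origin, the other on a ring around it.
# No straight line in 2D separates them.
inner = [(0.5, 0.0), (0.0, -0.5), (-0.3, 0.4)]   # label -1
outer = [(2.0, 0.0), (0.0, -2.0), (-1.5, 1.5)]   # label +1

# Hand-picked linear classifier in the 3D feature space: the plane z = 1.
w, b = (0.0, 0.0, 1.0), -1.0

def classify(x):
    """Linear decision rule applied to the mapped point phi(x)."""
    z = phi(x)
    score = sum(wk * zk for wk, zk in zip(w, z)) + b
    return 1 if score >= 0 else -1

print([classify(x) for x in inner])
print([classify(x) for x in outer])
```

The classifier is still linear in $w$; all of the nonlinearity sits in $\phi$.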
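Transformed feature in primal form
Classifier:
$$f(x) = w^\top \phi(x) + b, \quad w \in \mathbb{R}^D$$
Optimization:
$$\min_{w, b, \xi} \ \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{m} \xi_i \quad \text{subject to} \quad y_i (w^\top \phi(x_i) + b) \ge 1 - \xi_i, \quad \xi_i \ge 0$$
● Simply map $x$ to $\phi(x)$, where the data is separable
● Solve for $w$ in the high-dimensional space
● There are many more parameters to learn for $w$ if D >> d. Can we avoid this?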
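Transformed feature in dual form
Classifier:
$$f(x) = \sum_{i=1}^{m} \alpha_i y_i \, \phi(x_i)^\top \phi(x) + b$$
Optimization:
$$\max_{\alpha} \ \sum_{i=1}^{m} \alpha_i - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_i \alpha_j y_i y_j \, \phi(x_i)^\top \phi(x_j) \quad \text{subject to} \quad 0 \le \alpha_i \le C, \quad \sum_{i=1}^{m} \alpha_i y_i = 0$$
● In dual form, $\phi(x)$ only occurs in pairs $\phi(x_i)^\top \phi(x_j)$
● Only the m-dimensional vector $\alpha$ needs to be learnt
● Kernel: $k(x_i, x_j) = \phi(x_i)^\top \phi(x_j)$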
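Kernel in dual form
Classifier:
$$f(x) = \sum_{i=1}^{m} \alpha_i y_i \, k(x_i, x) + b$$
Optimization:
$$\max_{\alpha} \ \sum_{i=1}^{m} \alpha_i - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_i \alpha_j y_i y_j \, k(x_i, x_j) \quad \text{subject to} \quad 0 \le \alpha_i \le C, \quad \sum_{i=1}^{m} \alpha_i y_i = 0$$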
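A rough sketch of a kernelized classifier, assuming the decision function $f(x) = \sum_i \alpha_i y_i k(x_i, x) + b$ with a Gaussian kernel. The XOR-style data is hypothetical, and the multipliers are simply set equal by hand (which respects $\sum_i \alpha_i y_i = 0$); they are not the output of a dual solver.

```python
import math

def gaussian_kernel(x, z, sigma=1.0):
    """k(x, z) = exp(-||x - z||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def kernel_decision(x, alphas, X, y, b, kernel):
    """f(x) = sum_i alpha_i * y_i * k(x_i, x) + b over nonzero multipliers."""
    return b + sum(
        a * yi * kernel(xi, x)
        for a, xi, yi in zip(alphas, X, y)
        if a != 0.0
    )

# XOR-style toy data: not linearly separable in the input space.
X = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
y = [-1, -1, +1, +1]
alphas = [1.0, 1.0, 1.0, 1.0]  # hand-set, assumed values
b = 0.0

predictions = [
    1 if kernel_decision(x, alphas, X, y, b, gaussian_kernel) > 0 else -1
    for x in X
]
print(predictions)
```

The feature map never appears in the code: the classifier touches the data only through kernel evaluations.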
Kernel trick
1. The classifier can be learnt and applied without explicitly computing $\phi(x)$
2. All that is required is to compute the kernel function $k(x_i, x_j)$
3. The complexity of learning depends on the number of training examples m rather than on the dimension D of the feature space.
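The identity behind the kernel trick can be checked numerically. The sketch below assumes a degree-2 homogeneous polynomial kernel $k(x, z) = (x^\top z)^2$ on 2D inputs, whose explicit feature map is $\phi(x) = (x_1^2, \sqrt{2}\, x_1 x_2, x_2^2)$. This kernel is chosen for illustration because its feature map is short to write down.

```python
import math

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def phi(x):
    """Explicit 3D feature map for the kernel k(x, z) = (x . z)^2 on 2D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def kernel(x, z):
    """Same inner product, computed directly in the 2D input space."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(z))  # maps to 3D first, then takes the inner product
implicit = kernel(x, z)         # never leaves 2D
print(explicit, implicit)
```

Both routes give the same number up to floating-point rounding, but the kernel route never builds the D-dimensional vectors.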
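Common kernel functions
● Linear kernel: $k(x, x') = x^\top x'$
● Polynomial kernel: $k(x, x') = (1 + x^\top x')^d$
○ Contains all polynomial terms up to degree d
● Radial Basis Function (Gaussian kernel): $k(x, x') = \exp\left(-\|x - x'\|^2 / (2\sigma^2)\right)$
○ Infinite-dimensional feature space
How many parameters do you need to tune?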
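The three kernels can be written in a few lines each. This sketch assumes the common definitions $x^\top x'$, $(1 + x^\top x')^d$, and $\exp(-\|x - x'\|^2 / (2\sigma^2))$; the function and parameter names (`degree`, `sigma`, `gram_matrix`) are illustrative and do not come from any particular toolbox.

```python
import math

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def linear_kernel(x, z):
    """k(x, z) = x . z (no kernel parameter)."""
    return dot(x, z)

def polynomial_kernel(x, z, degree=2):
    """k(x, z) = (1 + x . z)^degree (one kernel parameter: degree)."""
    return (1.0 + dot(x, z)) ** degree

def gaussian_kernel(x, z, sigma=1.0):
    """k(x, z) = exp(-||x - z||^2 / (2 sigma^2)) (one kernel parameter: sigma)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def gram_matrix(kernel, X):
    """m x m matrix of pairwise kernel values, as used by the dual problem."""
    return [[kernel(a, b) for b in X] for a in X]

X = [(1.0, 2.0), (3.0, 4.0), (0.0, 0.0)]
K = gram_matrix(gaussian_kernel, X)
print(linear_kernel(X[0], X[1]))
print(polynomial_kernel(X[0], X[1], degree=2))
print(K[0][0], K[0][2])
```

On top of the kernel parameter, the SVM itself still has C, so a Gaussian-kernel classifier has two values to tune and a linear one has only C.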
Kernel trick - summary
● A classifier can be learnt in a high-dimensional feature space without explicitly knowing the feature map
● Kernels can also be used elsewhere, for example in kernel PCA and kernel k-means
● Different kernel functions suit different scenarios
● However, the optimal parameters have to be chosen empirically
Support Vector Regression (SVR)
$\epsilon$-insensitive loss
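A short sketch of the loss named above, assuming the usual definition $L_\epsilon(y, f(x)) = \max(0, |y - f(x)| - \epsilon)$: errors inside a tube of half-width $\epsilon$ around the target cost nothing, and larger errors are penalized linearly. The sample values are made up.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """max(0, |y_true - y_pred| - epsilon): zero inside the epsilon tube."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

# Made-up (target, prediction) pairs with epsilon = 0.1.
pairs = [(1.0, 1.05), (1.0, 1.5), (1.0, 0.0)]
losses = [epsilon_insensitive_loss(t, p, epsilon=0.1) for t, p in pairs]
print(losses)
```

Points with zero loss do not influence the fitted function, which is what makes the SVR solution sparse.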
SVR primal form
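For reference, a commonly used way to write the linear $\epsilon$-SVR primal is given below. The slack symbols $\xi_i$ and $\xi_i^*$ (for errors above and below the $\epsilon$ tube) are notation assumed here; $w$, $b$, $C$ and $m$ follow the classification case.

```latex
\begin{aligned}
\min_{w,\, b,\, \xi,\, \xi^*} \quad & \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{m} \left( \xi_i + \xi_i^* \right) \\
\text{subject to} \quad & y_i - w^\top x_i - b \le \epsilon + \xi_i, \\
& w^\top x_i + b - y_i \le \epsilon + \xi_i^*, \\
& \xi_i \ge 0, \quad \xi_i^* \ge 0, \quad i = 1, \dots, m
\end{aligned}
```

A point inside the tube has $\xi_i = \xi_i^* = 0$, so it adds nothing to the objective.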
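SVR dual form
Introducing Lagrange multipliers $\alpha_i$ and $\alpha_i^*$, we have:
$$\max_{\alpha, \alpha^*} \ -\frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) \, x_i^\top x_j - \epsilon \sum_{i=1}^{m} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{m} y_i (\alpha_i - \alpha_i^*)$$
$$\text{subject to} \quad \sum_{i=1}^{m} (\alpha_i - \alpha_i^*) = 0, \quad 0 \le \alpha_i, \alpha_i^* \le C$$
The regression function is then $f(x) = \sum_{i=1}^{m} (\alpha_i - \alpha_i^*) \, x_i^\top x + b$.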
SVR - summary
● SVR is an extension of SVM, so the optimization algorithms for SVM can be applied to SVR directly [Smola ’04].
● Likewise, the “kernel trick” can also be applied to SVR.
● Q: How many parameters should I tune if I use a Gaussian kernel?
Three parameters, namely C, σ, and ε.
SVM toolbox
1. Libsvm: https://www.csie.ntu.edu.tw/~cjlin/libsvm/
2. SVMlight: http://svmlight.joachims.org/
SVM - summary
● SVM was originally proposed by Boser, Guyon and Vapnik in 1992 and gained increasing popularity in the late 1990s.
● SVM can be applied to complex data types beyond feature vectors (e.g. graphs, sequences, relational data) by designing kernel functions for such data.
● For multiclass problems, you can use either a one-vs-rest scheme or a direct multiclass SVM formulation, e.g. [Weston ’99] and [Crammer ’01].
● SVM training is a convex problem, so it has a globally optimal solution. However, the computational cost grows with the number of training examples. Therefore, more efficient optimization algorithms have been proposed, e.g. SMO [Platt ’99] and [Joachims ’99].
● Tuning SVMs remains a black art: selecting a specific kernel and its parameters is usually done in a try-and-see manner.