Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis...

25
Slide credits to Andrew Moore: http://www.cs.cmu.edu/~awm/tutorials Support Vector Machine Foundations of Data Analysis 04/14/2020

Transcript of Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis...

Page 1: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

Slide credits to Andrew Moore: http://www.cs.cmu.edu/~awm/tutorials

Support Vector Machine

Foundations of Data Analysis 04/14/2020

Page 2: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

2

Linear Classifiers

denotes +1

denotes -1

How would you classify this data?

f(x,w) = sign(w · x)

Page 3: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

3

f(x,w) = sign(w · x)

How would you classify this data?

denotes +1

denotes -1

Linear Classifiers

Page 4: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

4

f(x,w) = sign(w · x)

How would you classify this data?

denotes +1

denotes -1

Linear Classifiers

Page 5: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

5

f(x,w) = sign(w · x)

How would you classify this data?

denotes +1

denotes -1

Linear Classifiers

Page 6: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

6

f(x,w) = sign(w · x)

How would you classify this data?

Linear Classifiers

denotes +1

denotes -1

Page 7: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

7

Which is the best?

f(x,w) = sign(w · x)

denotes +1

denotes -1

Linear Classifiers

Page 8: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

8

Define the margin of a linear classifier as the width that the boundary could be increased before hitting a datapoint.

f(x,w) = sign(w · x) Linear Classifiers

denotes +1

denotes -1

Page 9: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

9

Maximum margin: the widest margin that maximally separates two data groups.

f(x,w) = sign(w · x)Maximum Margin

denotes +1

denotes -1

Page 10: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

10

Maximum margin linear classifier is the simplest kind of SVM (LinearSVM)

f(x,w) = sign(w · x)

Support Vectors : data points that the margin pushes up against.

denotes +1

denotes -1

Support Vector Machines

Page 11: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

11

Specifying a line and margin

• How do we represent this mathematically? • …in m input dimensions?

Plus-Plane

Minus-PlaneClassifier Boundary

“Predict Class = +1” zone

“Predict Class = -1” zone

Page 12: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

12

Plus-Plane

Minus-PlaneClassifier Boundary

“Predict Class = +1” zone

Class +1 if

-1 if

embarrassing points

if

w · x >= 1

w · x <= �1

�1 < w · x < 1

“Predict Class = -1” zone

Specifying a line and margin

Page 13: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

13

“Predict Class = +1” zone

M = Margin Width

wx=1

wx=0

wx=-1

“Predict Class = -1” zone

How to compute M?

Computing the Margin Width

Page 14: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

14

“Predict Class = +1” zone

M = Margin Width

wx=1

wx=0

wx=-1

“Predict Class = -1” zone

x+

x-

• Let x- be any point on the minus plane • Let x+ be the closest plus-plane-point to x-

• Claim: x+ = x- + λ w for some value of λ. Why?

Computing the Margin Width

Page 15: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

15

“Predict Class = +1” zone

M = Margin Width

wx=1

wx=0

wx=-1

“Predict Class = -1” zone

x+

x-

Computing the Margin Width

The line from x- to x+ is perpendicular to the planes. So to get from x- to x+ travel some distance in direction w.

• Let x- be any point on the minus plane • Let x+ be the closest plus-plane-point to x-

• Claim: x+ = x- + λ w for some value of λ. Why?

Page 16: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

16

“Predict Class = +1” zone

M = Margin Width

wx=1

wx=0

wx=-1

“Predict Class = -1” zone

x+

x-

What we know: • w . x+ = +1 • w . x- = -1 • x+ = x- + λ w • |x+ - x- | = M

Computing the Margin Width

Page 17: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

17

“Predict Class = +1” zone

M = Margin Width

wx=1

wx=0

wx=-1

“Predict Class = -1” zone

x+

x-

What we know: • w . x+ = +1 • w . x- = -1 • x+ = x- + λ w • |x+ - x- | = M

w . (x - + λ w) + b = 1

=>

w . x - + b + λ w .w = 1

=>

-1 + λ w .w = 1

=>w.w2

Computing the Margin Width

Page 18: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

18

“Predict Class = +1” zone

M = Margin Width

wx=1

wx=0

wx=-1

“Predict Class = -1” zone

x+

x-

What we know: • w . x+ = +1 • w . x- = -1 • x+ = x- + λ w • |x+ - x- | = M

w.w2

M = |x+ - x- | =| λ w |=

wwwwww

.2

..2

==

www .|| λλ ==

Computing the Margin Width

Page 19: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

19

“Predict Class = +1” zone

wx=1

wx=0

wx=-1

“Predict Class = -1” zone

Learning the Maximum Margin Classifier

x+

x-

M = Margin Width =ww.2

Use optimization to search the space of W to find the widest margin that matches all the data points.

Page 20: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

20

What would SVMs do?

x=0

Simple 1-D Example

Page 21: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

21

Not a big surprise

Positive “plane” Negative “plane”x=0

Simple 1-D Example

Page 22: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

22

What can SVM do about this?

Harder 1-D Example

x=0

Page 23: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

23

),( 2kkk xx=z

x=0

Harder 1-D Example

• Permit non-linear basis functions made linear regression

• Let’s permit them here too

Page 24: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

■ General idea: the original input space can always be mapped to some higher-dimensional feature space where the training set is separable:

Φ: x → φ(x)

Nonlinear SVMs: Feature Space

24

Page 25: Support Vector Machine - GitHub Pages · Support Vector Machine Foundations of Data Analysis 04/14/2020. 2 Linear Classifiers denotes +1 ... Linear Classifiers denotes +1 denotes

Nonlinear SVM Basis Functions

Φ(xk) = ( polynomial terms of xk of degree 1 to q )

Φ(xk) = ( Gaussian radial basis functions of xk )

Φ(xk) = ( sigmoid functions of xk )

25