SUPPORT VECTOR MACHINE
2009/3/24
Support Vector Machine
A supervised learning method
Known as the maximum margin classifier
Finds the max-margin separating hyperplane
SVM – hard margin

[Figure: two classes in the (x1, x2) plane; separating hyperplane <w, x> - θ = 0, margin boundaries <w, x> - θ = +1 and <w, x> - θ = -1, margin width 2/∥w∥]

max_{w, θ} 2/∥w∥   s.t. yn(<w, xn> - θ) ≧ 1

equivalently

argmin_{w, θ} (1/2)<w, w>   s.t. yn(<w, xn> - θ) ≧ 1
Quadratic programming

argmin_v (1/2) Σi Σj aij vi vj + Σi bi vi
s.t. Σi rki vi ≧ qk   for each constraint k

V* = quadprog(A, b, R, q)
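The generic solver call above can be sketched in Python. Here `quadprog` is a hypothetical function following the slides' signature, implemented with `scipy.optimize.minimize` rather than a dedicated QP library:

```python
import numpy as np
from scipy.optimize import minimize

def quadprog(A, b, R, q):
    """Minimize (1/2) v'Av + b'v subject to R v >= q.

    A minimal sketch: the name and argument order follow the slides'
    quadprog(A, b, R, q), not any real library's API.
    """
    n = A.shape[0]
    obj = lambda v: 0.5 * v @ A @ v + b @ v
    cons = {"type": "ineq", "fun": lambda v: R @ v - q}
    return minimize(obj, np.zeros(n), constraints=cons).x

# Tiny example: minimize (1/2)(v1^2 + v2^2) + v1  s.t.  v1 + v2 >= 1.
A = np.eye(2)
b = np.array([1.0, 0.0])
R = np.array([[1.0, 1.0]])
q = np.array([1.0])
v = quadprog(A, b, R, q)
print(v)  # optimum is v = (0, 1)
```

With constraints present, `minimize` defaults to the SLSQP method, which handles linear inequality constraints directly.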
argmin_{w, θ} (1/2)<w, w>   s.t. yn(<w, xn> - θ) ≧ 1

Let V = [θ, w1, w2, …, wD]. Then the objective is
(1/2) Σ_{d=1}^{D} wd²
and each constraint becomes
(-yn) θ + Σ_{d=1}^{D} yn (xn)d wd ≧ 1

Adapt the problem for quadratic programming: find A, b, R, q and put them into the QP solver.
Adaptation

V = [θ, w1, w2, …, wD]  →  v0, v1, v2, …, vD

Objective: (1/2) Σ_{d=1}^{D} wd²
Constraints: (-yn) θ + Σ_{d=1}^{D} yn (xn)d wd ≧ 1   (n = 1, …, N)

Matching terms, with v0 = θ and vd = wd:
a00 = 0, a0j = 0, ai0 = 0
aij = 1 if i = j, else 0   (i ≠ 0, j ≠ 0)
bi = 0 for all i
rn0 = -yn
rnd = yn (xn)d   (d > 0)
qn = 1

Sizes: A is (1+D)×(1+D), b is (1+D)×1, R is N×(1+D), q is N×1 (one constraint per training example)
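The adaptation above can be assembled and solved numerically. This is a sketch on a tiny made-up 2-D dataset, using scipy's SLSQP optimizer in place of the slides' quadprog solver:

```python
import numpy as np
from scipy.optimize import minimize

# Hard-margin SVM as a QP in V = [theta, w1, ..., wD], following the
# adaptation above. Dataset is illustrative, not from the slides.
X = np.array([[2.0, 2.0], [3.0, 3.0],    # class +1
              [0.0, 0.0], [1.0, 0.0]])   # class -1
y = np.array([1.0, 1.0, -1.0, -1.0])
N, D = X.shape

A = np.zeros((1 + D, 1 + D))
A[1:, 1:] = np.eye(D)                         # a_ij = 1 for i = j != 0
b = np.zeros(1 + D)                           # b_i = 0
R = np.hstack([-y[:, None], y[:, None] * X])  # r_n0 = -y_n, r_nd = y_n (x_n)_d
q = np.ones(N)                                # q_n = 1

obj = lambda v: 0.5 * v @ A @ v + b @ v
cons = {"type": "ineq", "fun": lambda v: R @ v - q}
v = minimize(obj, np.zeros(1 + D), constraints=cons).x
theta, w = v[0], v[1:]
print(w, theta)   # separating hyperplane <w, x> - theta = 0
```

Every training point should end up satisfying yn(<w, xn> - θ) ≧ 1, with equality for the support vectors.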
SVM – soft margin

Allow possible training errors via slack variables ξn, with a tradeoff parameter C:
Large C: thinner margin, cares more about errors
Small C: thicker margin, cares less about errors

argmin_{w, θ, ξ} (1/2)<w, w> + C Σn ξn
s.t. yn(<w, xn> - θ) ≧ 1 - ξn,  ξn ≧ 0
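The effect of C can be seen with a quick scikit-learn sketch (synthetic data, chosen for illustration; not from the slides). For a linear kernel the margin width is 2/∥w∥:

```python
import numpy as np
from sklearn.svm import SVC

# Two well-separated Gaussian blobs; fit a linear SVM at two C values.
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(20, 2) + [2, 2], rng.randn(20, 2) - [2, 2]])
y = np.array([1] * 20 + [-1] * 20)

margins = {}
for C in (0.01, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    margins[C] = 2.0 / np.linalg.norm(clf.coef_)  # margin width 2/||w||
    print(f"C={C}: margin width = {margins[C]:.3f}, "
          f"support vectors = {len(clf.support_)}")
```

Small C tolerates violations and yields the thicker margin; large C approaches the hard-margin solution.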
Adaptation

argmin_v (1/2) Σi Σj aij vi vj + Σi bi vi
s.t. Σi rki vi ≧ qk

V = [θ, w1, w2, …, wD, ξ1, ξ2, …, ξN]

Sizes: A is (1+D+N)×(1+D+N), b is (1+D+N)×1, R is (2N)×(1+D+N), q is (2N)×1
Primal form and Dual form

Primal form:
argmin_{w, θ, ξ} (1/2)<w, w> + C Σn ξn
s.t. yn(<w, xn> - θ) ≧ 1 - ξn,  ξn ≧ 0
Variables: 1+D+N   Constraints: 2N

Dual form:
argmin_α (1/2) Σn Σm αn αm yn ym <xn, xm> - Σn αn
s.t. Σn yn αn = 0,  0 ≦ αn ≦ C
Variables: N   Constraints: 2N+1
Dual form SVM

Find the optimal α*, then use α* to solve for w* and θ*.

αn = 0: correctly classified or on the margin (not a support vector)
0 < αn < C: exactly on the margin (free support vector)
αn = C: misclassified or inside the margin (bounded support vector)

[Figure: data points marked by αn = 0, free SVs (0 < αn < C), and bounded SVs (αn = C)]
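scikit-learn exposes the quantities yn·αn for the support vectors as `dual_coef_`, so the categories above can be inspected after fitting. A sketch on synthetic data (the attribute names are scikit-learn's, not the slides'):

```python
import numpy as np
from sklearn.svm import SVC

# Overlapping blobs so that some alpha_n hit the box bound C.
rng = np.random.RandomState(1)
X = np.vstack([rng.randn(20, 2) + [1, 1], rng.randn(20, 2) - [1, 1]])
y = np.array([1] * 20 + [-1] * 20)

C = 1.0
clf = SVC(kernel="linear", C=C).fit(X, y)
alpha = np.abs(clf.dual_coef_).ravel()   # |y_n alpha_n| = alpha_n

free = np.sum(alpha < C - 1e-8)          # 0 < alpha_n < C: on the margin
bounded = np.sum(alpha >= C - 1e-8)      # alpha_n = C: wrong side or inside
print(f"free SVs: {free}, bounded SVs: {bounded}")
```

Points with αn = 0 are not stored at all; only the support vectors (αn > 0) appear in `clf.support_`.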
Nonlinear SVM

Nonlinear mapping X → Φ(X), e.g.
{(x)1, (x)2} ∈ R²  →  {1, (x)1, (x)2, (x)1², (x)2², (x)1(x)2} ∈ R⁶

Computing Φ explicitly is expensive in high dimensions; need the kernel trick.

argmin_α (1/2) Σn Σm αn αm yn ym <Φ(xn), Φ(xm)> - Σn αn
s.t. Σn yn αn = 0,  0 ≦ αn ≦ C

For this degree-2 mapping, <Φ(xn), Φ(xm)> can be computed directly in the original space as (1 + <xn, xm>)², up to constant scaling of the features.
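The kernel identity can be checked numerically. Note the √2 factors in the feature map below, which the slides' unscaled map omits: they make the inner product match (1 + <x, z>)² exactly.

```python
import numpy as np

def phi(x):
    # Degree-2 polynomial feature map for 2-D input. The sqrt(2) scaling
    # (an addition to the slides' map) makes <phi(x), phi(z)> equal the
    # kernel value exactly rather than up to constant factors.
    x1, x2 = x
    return np.array([1.0,
                     np.sqrt(2) * x1,
                     np.sqrt(2) * x2,
                     x1 ** 2,
                     x2 ** 2,
                     np.sqrt(2) * x1 * x2])

def poly_kernel(x, z):
    # Kernel trick: (1 + <x, z>)^2, computed in the original 2-D space.
    return (1.0 + np.dot(x, z)) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])
print(np.dot(phi(x), phi(z)))  # 4.0
print(poly_kernel(x, z))       # 4.0
```

The 6-dimensional inner product and the 2-dimensional kernel evaluation agree, which is exactly what lets the dual problem above avoid forming Φ at all.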