Jacob Gardner · 2015. 3. 20. · Quan Zhou, Wenlin Chen, Shiji Song, Jacob R. Gardner, Kilian Q....
Quan Zhou, Wenlin Chen, Shiji Song, Jacob R. Gardner, Kilian Q. Weinberger, Yixin Chen
Support Vector Elastic Network
“Sven the Terrible”
Traditional Computer Science and Machine Learning
Traditional CS: Data + Program → Computer → Output
Machine Learning: Data + Output → Computer → Program
Support Vector Machines
Linear classifier: $w^\top x$
$$\min_{w} \; \underbrace{\tfrac{1}{2}\|w\|_2^2}_{\text{L2 regularization}} + C \underbrace{\sum_{i=1}^{n} \max\left(0,\, 1 - y_i (w^\top x_i)\right)^2}_{\text{squared hinge loss}}$$
14644 Citations
Published in ML journals
Usable means MATLAB
Fast means parallel
Many GPU Implementations
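As a minimal sketch of the objective on this slide, the following Python function evaluates the L2-regularized squared hinge loss for a given weight vector. The name `svm_objective` and the toy data are illustrative assumptions and do not come from the talk.

```python
def svm_objective(w, X, y, C):
    """Evaluate 0.5 * ||w||_2^2 + C * sum_i max(0, 1 - y_i * w.x_i)^2."""
    # L2 regularization term.
    reg = 0.5 * sum(wj * wj for wj in w)
    # Squared hinge loss, summed over all training points.
    loss = 0.0
    for xi, yi in zip(X, y):
        margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
        loss += max(0.0, 1.0 - margin) ** 2
    return reg + C * loss


# Toy data (hypothetical): two points in two dimensions.
X_toy = [[1.0, 0.0], [0.0, -1.0]]
y_toy = [1, -1]
print(svm_objective([0.5, 0.5], X_toy, y_toy, C=1.0))  # 0.75
```

Both points have margin 0.5 here, so each contributes 0.25 to the loss, and the regularizer adds another 0.25.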
Elastic Net/Lasso
$$\min_{\beta} \; \|X\beta - y\|_2^2 + \lambda_2 \|\beta\|_2^2 \quad \text{such that } |\beta|_1 \le t$$
13856 Citations
Published in stats journals
Usable means R
Fast means Fortran
Zero GPU Implementations
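A minimal sketch of the Elastic Net formulation on this slide, assuming the constrained form with an L1 budget: one function evaluates the objective and another checks the constraint. The names `elastic_net_objective` and `within_l1_budget` and the toy data are illustrative assumptions.

```python
def elastic_net_objective(beta, X, y, lam2):
    """Evaluate ||X beta - y||_2^2 + lam2 * ||beta||_2^2."""
    residual_sq = 0.0
    for row, yi in zip(X, y):
        pred = sum(b * x for b, x in zip(beta, row))
        residual_sq += (pred - yi) ** 2
    return residual_sq + lam2 * sum(b * b for b in beta)


def within_l1_budget(beta, t):
    """Check the constraint |beta|_1 <= t."""
    return sum(abs(b) for b in beta) <= t


# Toy data (hypothetical): three samples, two features.
X_en = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y_en = [1.0, 0.0, 1.0]
beta_en = [1.0, 0.0]
print(elastic_net_objective(beta_en, X_en, y_en, lam2=0.5))  # 0.5
print(within_l1_budget(beta_en, t=1.0))  # True
```

Setting `lam2` to zero leaves the Lasso objective, which is why the slide groups the two under one heading.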
Elastic Net/Lasso: the L1 budget $t$
[Figure: coefficients $\beta_i$ versus L1 budget $t$, in two panels: Glmnet and SVEN (GPU). Caption: Equivalence of regularization path.]
Elastic Net
+ interpretable
- slow
- does not scale
SVM
+ parallel
+ scales to large data
+ multi-platform
- not interpretable
Reductions
Problem A → Problem B
Solution A ← Solution B
Elastic Net → SVM
Input X,Y → Input Xnew,Ynew
Output β ← Output α

    function beta = SVEN(X,Y,t,lambda)
    % Build the SVM problem: 2p points in n dimensions.
    [n,p] = size(X);
    Xnew = [bsxfun(@minus,X,Y./t) bsxfun(@plus,X,Y./t)]';
    Ynew = [ones(p,1); -ones(p,1)];
    C = 1/(2*lambda);
    % Train a linear SVM on the transformed data.
    model = trainsvmGPU(Ynew,sparse(Xnew),['-q -s 1 -c ' num2str(C)]);
    % Recover the SVM dual variables, then the Elastic Net coefficients.
    alpha = C * max(1 - Ynew.*(Xnew*model.w),0);
    beta = t*(alpha(1:p) - alpha(p+1:2*p)) / sum(alpha);
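The two ends of the reduction can be sketched in Python without a solver: the input transformation and the mapping from SVM dual variables back to coefficients. This is a rough port of the MATLAB lines on the slide, not the authors' implementation. The names `sven_transform` and `sven_recover_beta` are assumptions, and the `alpha` values below are made up. In practice they would come from an SVM solver such as the `trainsvmGPU` call on the slide.

```python
def sven_transform(X, y, t):
    """Build the SVM inputs: 2p points in n dimensions, labeled +1 / -1."""
    n, p = len(X), len(X[0])
    # Each feature column of X becomes a data point, shifted by -y/t
    # (first p rows, label +1) or by +y/t (last p rows, label -1).
    minus = [[X[i][j] - y[i] / t for i in range(n)] for j in range(p)]
    plus = [[X[i][j] + y[i] / t for i in range(n)] for j in range(p)]
    return minus + plus, [1] * p + [-1] * p


def sven_recover_beta(alpha, t):
    """Map SVM dual variables alpha (length 2p) to Elastic Net coefficients."""
    p = len(alpha) // 2
    total = sum(alpha)  # assumed positive
    return [t * (alpha[j] - alpha[p + j]) / total for j in range(p)]


# Toy data (hypothetical): n = 3 samples, p = 2 features.
X_small = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
y_small = [1.0, 2.0, 3.0]
Xnew, Ynew = sven_transform(X_small, y_small, t=2.0)
print(len(Xnew), len(Xnew[0]))  # 4 3

# Made-up dual variables, only to show the recovery step.
beta = sven_recover_beta([3.0, 0.0, 0.0, 1.0], t=2.0)
print(beta)  # [1.5, -0.5]
```

Because `alpha` is non-negative and the result is divided by its sum, the recovered coefficients always satisfy the L1 budget: here the absolute values sum to exactly 2.0, which equals `t`.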
Results
[Figure: coefficients $\beta_i$ versus L1 budget $t$, in two panels: Glmnet and SVEN (GPU). Caption: Equivalence of regularization path.]
Results: n>>d datasets
[Figure: four panels with log-scaled axes, one per dataset: MITFaces [n=489410, p=361], Yahoo [n=141397, p=519], YMSD [n=463715, p=90], FD [n=400000, p=900]. x-axis: SVEN (GPU) runtime (sec). y-axis: other algorithms' runtime (sec). Legend: glmnet, SVEN (CPU), Shotgun, L1_Ls. Each panel marks a "SVEN (GPU) faster" region and a "SVEN (GPU) slower" region.]
Running time: $O(d^2)$
Or…
Results: d>>n datasets
[Figure: eight panels with log-scaled axes, one per dataset: GLI85 [n=85, p=22283], arcene [n=900, p=10000], SMKCAN187 [n=187, p=19993], GLABRA180 [n=180, p=49151], PEMS [n=440, p=138672], scene15 [n=544, p=71963], dorothea [n=800, p=88119], E2006 [n=3308, p=72812]. x-axis: SVEN (GPU) runtime (sec). y-axis: other algorithms' runtime (sec). Legend: glmnet, SVEN (CPU), Shotgun, L1_Ls. Each panel marks a "SVEN (GPU) faster" region and a "SVEN (GPU) slower" region.]
Running time: $O(n^2)$
Conclusion
Elastic Net and SVM are equivalent problems.
Many optimizations only for SVM now apply to Elastic Net.
This leads to the fastest Elastic Net solver we are aware of.
Questions?
“Sven the Nice?”