Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins...
-
Upload
reynard-washington -
Category
Documents
-
view
214 -
download
0
Transcript of Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins...
![Page 1: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/1.jpg)
Ariadna QuattoniXavier Carreras
An Efficient Projection for l1,∞ Regularization
Michael Collins Trevor Darrell
MIT CSAIL
![Page 2: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/2.jpg)
Goal : Efficient training of jointly sparse models in high dimensional spaces.
Joint Sparsity
Why? : Learn from fewer
examples. Build more efficient
classifiers. Interpretability.
![Page 3: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/3.jpg)
3
Church Airport Grocery Store
Flower-Shop
2,1w
2,5w
11w 11w
2,4w
2,3w
2,2w
3,1w
3,5w
3,4w
3,3w
3,2w
4,1w
4,5w
4,4w
4,3w
4,2w
1,1w
1,5w
1,4w
1,3w
1,2w
![Page 4: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/4.jpg)
4
Church Airport Grocery Store
Flower-Shop
2,1w
2,5w
11w 11w
2,4w
2,3w
2,2w
3,5w
3,4w
3,2w
4,1w
4,5w
4,4w
4,3w
4,2w
1,1w
1,5w
1,4w
1,3w
1,2w
![Page 5: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/5.jpg)
5
Church Airport Grocery Store
Flower-Shop
11w 11w
2,4w
2,3w
2,2w
3,5w
3,4w
3,2w
4,4w
4,2w
1,4w
1,2w
![Page 6: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/6.jpg)
l1,∞ Regularization How do we promote joint (i.e. row) sparsity ?
Coefficients forfeature 2
Coefficients for task 2
An l1 norm on the maximum absolute values of the coefficients across tasks promotes sparsity.
Use few features
The l∞ norm on each row promotes non-sparsity on each row.
Share parameters
![Page 7: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/7.jpg)
Contributions An efficient projected gradient method for l1,∞ regularization
Our projection works on O(n log n) time, same cost as l1 projection
Experiments in Multitask image classification problems
We can discover jointly sparse solutions
l1,∞ regularization leads to better performance than l2 and l1 regularization
![Page 8: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/8.jpg)
Multitask Application
},...,,{ 21 mDDDD
Joint SparseApproximation
1D2D mD
Collection of Tasks
![Page 9: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/9.jpg)
l1,∞ Regularization: Constrained Convex
Optimization Formulation
We use a Projected SubGradient method. Main advantages: simple, scalable, guaranteed convergence rates.
A convex function
Convex constraints
Projected SubGradient methods have been recently proposed:
l2 regularization, i.e. SVM [Shalev-Shwartz et al. 2007]
l1 regularization [Duchi et al. 2008]
![Page 10: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/10.jpg)
10
Euclidean Projection into the l1-∞ ball
![Page 11: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/11.jpg)
11
Characterization of the solution
![Page 12: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/12.jpg)
Series1
02468
10
Series1
02468
10
Series1
02468
10
Series1
02468
10
Characterization of the solution
Feature I Feature II Feature III Feature IV
![Page 13: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/13.jpg)
Mapping to a simpler problem We can map the projection problem to the following problem which finds the optimal maximums μ:
![Page 14: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/14.jpg)
Efficient Algorithm
14
Se-ries
1
02468
10
Se-ries
1
02468
10
Se-ries
1
02468
10
Se-ries
1
02468
10
![Page 15: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/15.jpg)
Efficient Algorithm
15
Se-ries
1
02468
10
Se-ries
1
02468
10
Se-ries
1
02468
10
Se-ries
1
02468
10
![Page 16: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/16.jpg)
16
Complexity
The total cost of the algorithm is dominated by sorting the entries of A.
The total cost is in the order of:
![Page 17: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/17.jpg)
17
Synthetic Experiments
Generate a jointly sparse parameter matrix W:
For every task we generate pairs:where:
We compared three different types of regularization :
l1,∞ projection l1 projection l2 projection
![Page 18: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/18.jpg)
18
Synthetic Experiments
![Page 19: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/19.jpg)
19
Dataset: Image Annotation
40 top content words Raw image representation: Vocabulary Tree(Nister and Stewenius 2006)
11000 dimensions
president actress team
![Page 20: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/20.jpg)
20
Results
Most of the differences are statistically significant
![Page 21: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/21.jpg)
21
Results
![Page 22: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/22.jpg)
22
Dataset: Indoor Scene Recognition
67 indoor scenes. Raw image representation: similarities to a set of unlabeled images. 2000 dimensions.
bakery bar Train station
![Page 23: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/23.jpg)
Results
![Page 24: Ariadna Quattoni Xavier Carreras An Efficient Projection for l 1,∞ Regularization Michael Collins Trevor Darrell MIT CSAIL.](https://reader035.fdocuments.in/reader035/viewer/2022062519/5697c01a1a28abf838ccf1c3/html5/thumbnails/24.jpg)
Conclusions
We proposed an efficient global optimization algorithm for l1,∞ regularization.
We presented experiments on image classification tasks and shown that our method can recover jointly sparse solutions.
A simple an efficient tool to implement an l1,∞ penalty, similar to standard l1 and l2 penalties.