Approximate Leave-One-Out Error Estimation for Learning with Smooth, Strictly Convex Margin Loss Functions
Approximate Leave-One-Out Error Estimation for Learning with Smooth, Strictly Convex Margin Loss Functions
Christopher P. Diehl
Research and Technology Development Center, Johns Hopkins Applied Physics Laboratory
MLSP 2004, September 29, 2004
Overview of the Talk
General Learning Framework
Model Selection
LOO Error Estimation
Unlearning and LOO Margin Estimation
Comments on the Method
Evaluating the Approximation
Results
Conclusion and Future Work
General Large Margin Learning Framework
Class of objective functions includes:
Smoothed Support Vector Machines
Kernel Logistic Regression
$$
\min_{\alpha, b} L(\alpha, b) = \underbrace{\Omega(\alpha, b)}_{\text{Structural Risk}} + \underbrace{\sum_{i=1}^{N} C_i\, g(z_i)}_{\text{Empirical Risk}}
$$

where $z_i = y_i f(x_i)$, $f(x_i) = \sum_{j \in I_S} y_j \alpha_j K(x_j, x_i) + b$, and $\Omega$ and $g$ are strictly convex.
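As a concrete, illustrative instance of this framework, the sketch below evaluates the objective for kernel logistic regression. The loss $g(z) = \ln(1 + e^{-z})$ and the regularizer $\Omega = \tfrac{1}{2}\alpha^T K \alpha$ (which ignores $b$) are assumed choices, not fixed by the slide, and the function name is hypothetical:

```python
import numpy as np

def klr_objective(alpha, b, K, y, C):
    """Regularized objective L(alpha, b) = Omega + sum_i C_i g(z_i).

    Illustrative choices: kernel logistic loss g(z) = log(1 + exp(-z))
    and regularizer Omega = 0.5 * alpha^T K alpha.
    """
    f = K @ (y * alpha) + b            # f(x_i) = sum_j y_j alpha_j K(x_j, x_i) + b
    z = y * f                          # margins z_i = y_i f(x_i)
    g = np.log1p(np.exp(-z))           # smooth, strictly convex margin loss
    omega = 0.5 * alpha @ (K @ alpha)  # structural risk term
    return omega + np.sum(C * g)       # empirical risk weighted by C_i
```

With all $\alpha_i = 0$ and $b = 0$, every margin is zero, so the objective reduces to $\sum_i C_i \ln 2$.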
Model Selection
The Goal: Identify the model that yields the best generalization performance
The Need: An efficient means of estimating the generalization performance (error rate) using the available data
Common Approaches:
Validation Set
K-fold Cross-Validation
Leave-One-Out (N-fold Cross-Validation)
Almost unbiased estimator of the generalization error
Valuable tool when data is scarce for one or more classes
LOO Error Estimation
Definition
$$
e_{LOO} = \frac{1}{N} \sum_{k=1}^{N} I\!\left(z_k^{(k)} < 0\right)
$$

where $z_k^{(k)} = y_k f^{(k)}(x_k)$ and $f^{(k)}$ is the classifier trained on all data except $(x_k, y_k)$.

Approximation Strategy
Estimate $f^{(k)}$ based on $f$ to derive an estimate of the LOO margin $z_k^{(k)}$
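The exact LOO estimate above can be sketched as a retraining loop. The `train` callable and its interface (returning a decision function) are hypothetical placeholders:

```python
import numpy as np

def loo_error(X, y, train):
    """Exact LOO: retrain N times, count negative LOO margins.

    `train(X, y)` is any fitting routine returning a decision function f;
    its interface is assumed for illustration.
    """
    N = len(y)
    errors = 0
    for k in range(N):
        mask = np.arange(N) != k
        f_k = train(X[mask], y[mask])   # classifier without (x_k, y_k)
        z_kk = y[k] * f_k(X[k])         # LOO margin z_k^(k)
        errors += z_kk < 0              # indicator I(z_k^(k) < 0)
    return errors / N
```

This is the expensive baseline the talk aims to approximate: N full retrainings.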
Unlearning and LOO Margin Estimation
Repeated Steps in LOO:
Unlearn the example $(x_k, y_k)$
Compute the LOO margin $z_k^{(k)}$

Exact Unlearning:
Set regularization parameter $C_k = 0$ to remove the example's influence
Update classifier parameters to minimize

$$
\min_{\alpha, b} L^{(k)}(\alpha, b) = \Omega(\alpha, b) + \sum_{i=1,\, i \neq k}^{N} C_i\, g(z_i)
$$

via multiple Newton steps
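The "multiple Newton steps" minimization can be sketched as a generic Newton iteration on a strictly convex objective. The `grad` and `hess` callables are illustrative stand-ins for the derivatives of $L^{(k)}$:

```python
import numpy as np

def newton_minimize(grad, hess, w0, tol=1e-8, max_iter=50):
    """Generic Newton iteration for minimizing L^(k) after setting C_k = 0.

    grad/hess return the gradient and Hessian of the (strictly convex)
    objective at w = (alpha, b); both are placeholders for illustration.
    """
    w = w0.copy()
    for _ in range(max_iter):
        g = grad(w)
        if np.linalg.norm(g) < tol:          # converged to the minimizer
            break
        w -= np.linalg.solve(hess(w), g)     # full Newton step
    return w
```

For a strictly convex quadratic this converges in a single step, which is what motivates the one-step approximation on the next slide.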
Unlearning and LOO Margin Estimation
Approximate Unlearning:
Compute only one Newton step to approximate the change in the classifier parameters

Estimate of LOO margin is then

$$
z_k^{(k)} \approx z_k + y_k \left( \sum_{j \in I_S} y_j\, \Delta\alpha_j\, K(x_j, x_k) + \Delta b \right)
$$

$$
(\Delta\alpha, \Delta b) = -\left[\nabla^2_{\alpha, b} L^{(k)}(\alpha^*, b^*)\right]^{-1} \nabla_{\alpha, b} L^{(k)}(\alpha^*, b^*), \quad \text{where } (\alpha^*, b^*) = \arg\min_{\alpha, b} L(\alpha, b)
$$

which yields

$$
z_k^{(k)} - z_k \approx \frac{C_k\, g'(z_k)\, \hat{\mathbf{q}}_k^T \hat{\mathbf{H}}^{-1} \hat{\mathbf{q}}_k}{1 - C_k\, g''(z_k)\, \hat{\mathbf{q}}_k^T \hat{\mathbf{H}}^{-1} \hat{\mathbf{q}}_k}
$$

where $\hat{\mathbf{H}} = \nabla^2_{\alpha, b} L(\alpha^*, b^*)$ and $\hat{q}_{kj} = y_k y_j K(x_j, x_k)\ \forall j \in I_S$, with a final entry $y_k$ corresponding to $b$.
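The closed-form margin estimate can be sketched directly; the argument names are illustrative, and `g1`, `g2` are assumed to be the first and second derivatives of the loss already evaluated at $z_k$:

```python
import numpy as np

def approx_loo_margin(z_k, C_k, q_k, H_inv, g1, g2):
    """One-Newton-step estimate of the LOO margin z_k^(k).

    q_k and H_inv come from the trained model; g1 = g'(z_k), g2 = g''(z_k).
    Implements z_k + C_k g' s / (1 - C_k g'' s) with s = q_k^T H^{-1} q_k.
    """
    s = q_k @ H_inv @ q_k                       # the quadratic form q_k^T H^{-1} q_k
    return z_k + C_k * g1 * s / (1.0 - C_k * g2 * s)
```

Since $g' \le 0$ for a margin loss and the quadratic form is nonnegative, the estimated change in the margin is nonpositive, matching the sensitivity remark on the next slide.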
Comments about the Method
Main Computational Expense:
Computing $\hat{\mathbf{q}}_k^T \hat{\mathbf{H}}^{-1} \hat{\mathbf{q}}_k$ for each example
Inverse Hessian available from training phase

Margin Sensitivity:
Limiting form of the approximation yields

$$
\frac{\partial z_k}{\partial C_k} = -g'(z_k)\, \hat{\mathbf{q}}_k^T \hat{\mathbf{H}}^{-1} \hat{\mathbf{q}}_k \ge 0
$$

Margin increases monotonically with decreasing regularization (increasing $C_k$)
Implies $\Delta z_k \le 0$ when unlearning the example: examples classified incorrectly will remain in error after unlearning

Generalizations:
Approximations can be derived for other strictly convex objective functions and cross-validation methods (e.g. K-fold cross-validation)
Evaluating the Approximation
Two Questions:
1. How much approximation error are we incurring?
2. How much computational savings are we gaining?

Evaluation Approach:
1. Compute the fraction of margin sign errors: the fraction of examples that yield exact and approximate LOO margins with different signs
2. Compute the ratio of CPU time used to compute the exact and approximate LOO error estimates
Used the cputime command in MATLAB to collect measurements
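The two evaluation metrics can be sketched as follows; `time.process_time` plays the role of MATLAB's cputime, and the function names are assumptions:

```python
import time
import numpy as np

def sign_error_fraction(z_exact, z_approx):
    """Fraction of examples whose exact and approximate LOO margins
    disagree in sign."""
    return float(np.mean(np.sign(z_exact) != np.sign(z_approx)))

def cpu_ratio(exact_fn, approx_fn):
    """Ratio of CPU time for the exact vs. approximate LOO estimate
    (process_time is the Python analogue of MATLAB's cputime)."""
    t0 = time.process_time()
    exact_fn()
    t1 = time.process_time()
    approx_fn()
    t2 = time.process_time()
    return (t1 - t0) / max(t2 - t1, 1e-12)  # guard against a zero denominator
```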
Fraction of Margin Sign Errors
Reduction in CPU Time
LOO for Model Selection
Conclusion and Future Work
Recap:
Introduced an approximate LOO error estimation method for strictly convex objective functions
The method estimates the change in the margin by computing a single Newton step to approximately unlearn a given example

Future Work:
Investigate causes of the LOO error failing to track the test error
Define and evaluate a method for model selection based on the LOO margin distribution