Rates of convergence of nearest neighbor estimation under arbitrary ...
Noisy Optimization Convergence Rates (GECCO2013)
-
Upload
jialin-liu -
Category
Presentations & Public Speaking
-
view
15 -
download
0
Transcript of Noisy Optimization Convergence Rates (GECCO2013)
Log-logConvergence forNoisyOptimizationS. Astete-Morales, J. Liu, O. Teytaud
[email protected], INRIA-CNRS-LRI, Univ. Paris-Sud, 91190 Gif-sur-Yvette,France
Abstract
We consider: Noisy optimization problems, without the assumption of variance vanishing in theneighborhood of the optimum.We show mathematically: Exponential number of resamplings and number of resamplings poly-nomial in the inverse step-size lead to a log-log convergence rate.We show empirically: Convergence rate is obtained also with polynomial resampling setting.
Compared to the state of the art: Our results provide
i) Proofs of log-log convergence for evolution strategies (which were not covered byexisting results) in the case of objective functions with quadratic expectations andnoise with constant variance.
ii) Log-log rates also for objective functions with expectation Ef(x) = ||x− x∗||p.
iii) Experiments with different parametrizations.
Notationd: dimensionn: # iterationxn: parent at nrn: # eval. at n for each
en: # eval. at np: power in fitσn: step-sizeN : Gaussian
Algorithm: (µ, σ)-ESInitialize ParametersInput: initial individual and initial step-sizen← 1while (true) doGenerate λ individuals independently, each:ij = xn + σnNEvaluate each of them and average their fit-ness values.Select the µ best individualsUpdaten← n+ 1
end while
Theoretical Analysis
Focus: Simple revaluation rules, choosing thenumber of resamplings.
• Preliminary: Noise-free [Auger,A.]Some ES verify:
log(||xn||)n
< C < 0
• Non Adaptive, Scale InvarianceWe prove: If rn = dKζne then
log(||xn||)n
< C ′ < 0
• Adaptive, no Scale InvarianceWe prove: If rn = dY σ−ηn e then
log(||xn||)n
< C ′′ < 0
Different settingsand ⇒ Same property!
ad hoc resampling
Experimental Results• Polynomial number of resamplings:
Setting:No Scale Invariancefitness(x) = ||x||p +N , p = 2We experiment: rn = dLnζe
Results:ζ = 0⇒ poor resultsζ = 1, 2 or 3⇒ slopes close to − 1
2pBetter for ζ = 2 or ζ = 3 than ζ = 1
Conjecture:Asymptotic regime is − 1
2p , reached later for ζ large
0 5 10 15−5
−4
−3
−2
−1
0
1
log(Evaluations)
log(||y||)
K=2ζ=2p=2d=2
slope=−0.3267
0 5 10 15−5
−4
−3
−2
−1
0
1
log(Evaluations)
log(||y||)
slope=−0.2829 K=2ζ=2p=2d=4
Conclusion• Log-log Convergence:
Proof of convergence, no scale invariance, real algorithms in noisy caselog(||xn||)log(en)
→ C
• Room for Improvement:ConstantsLess noise ⇒ better ratesAlgorithms with surrogate models
References[1] Mohamed Jebalia and Anne Auger and Nikolaus Hansen: Log linear convergence and divergence of the scale-invariant (1+1)-ES in noisy environments, Springer (2010)[2] Rémi Coulom and Philippe Rolet and Nataliya Sokolovska and Olivier Teytaud: Handling Expensive Optimization with Large Noise, 19th International Symposium on
Foundations of Genetic Algorithms (2011)[3] Olivier Teytaud and Jérémie Decock: Noisy Optimization Complexity, FOGA - Foundations of Genetic Algorithms XII (2013)