ERM: $\hat f_n = \arg\min_{f \in \mathcal{F}} \frac{1}{n}\sum_{i=1}^n f(Z_i)$ - maxim.ece.illinois.edu
Learning Algorithms in Depth: Stability (Ch. 13; read Ch. 3 on optimization)
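Recap: data $Z_1, \dots, Z_n$ i.i.d. $\sim P$, hypotheses $f \in \mathcal{F}$
Learning algorithm returns $\hat f_n \in \mathcal{F}$
e.g. ERM: $\hat f_n = \arg\min_{f \in \mathcal{F}} \frac{1}{n}\sum_{i=1}^n f(Z_i)$
Consistency: $P(\hat f_n) - \inf_{f \in \mathcal{F}} P(f) \to 0$ as $n \to \infty$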
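Sufficient condition for consistency: UCEM (uniform convergence of empirical means)
$P_n(f) = \frac{1}{n}\sum_{i=1}^n f(Z_i) = \mathbb{E}_{P_n} f$ (empirical risk)
$P(f) = \mathbb{E}_P f$
$\sup_P \mathbb{E}_P \sup_{f \in \mathcal{F}} |P_n(f) - P(f)| \to 0$ as $n \to \infty$
UCEM property of $\mathcal{F}$ $\Rightarrow$ ERM is consistent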
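But learnability is possible w/o UCEM:
$\mathcal{F}$ given, no assumptions
assume $\exists \tilde f \notin \mathcal{F}$ s.t. $\tilde f(z) < \inf_{f \in \mathcal{F}} f(z)$ $\forall z$
$\tilde{\mathcal{F}} = \mathcal{F} \cup \{\tilde f\}$: ERM on $\tilde{\mathcal{F}}$ will always return $\tilde f$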
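The point that ERM on an augmented class always returns the added function can be checked on a toy case. The sketch below is only an illustration: the three base functions, the constant function `f_tilde`, and the instance space [0, 1] are hypothetical choices, not taken from the lecture.

```python
import random

def erm(hypotheses, sample):
    """Return the index of the hypothesis with the smallest empirical mean."""
    def empirical_mean(f):
        return sum(f(z) for z in sample) / len(sample)
    return min(range(len(hypotheses)), key=lambda k: empirical_mean(hypotheses[k]))

# Hypothetical base class on Z = [0, 1]; every member takes values >= 0.
base_class = [lambda z: z, lambda z: 1.0 - z, lambda z: (z - 0.5) ** 2]
# Added function lying strictly below every member of the base class at every z.
f_tilde = lambda z: -1.0
augmented = base_class + [f_tilde]

rng = random.Random(0)
winners = set()
for _ in range(100):
    n = rng.randint(1, 20)
    sample = [rng.random() for _ in range(n)]
    winners.add(erm(augmented, sample))

print(winners)  # the only index ever selected is that of f_tilde
```

Whatever the sample, the empirical mean of `f_tilde` is strictly smallest, so no uniform convergence over the base class is needed for ERM to succeed here.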
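Closer look at learning algos
Framework (Vapnik 1995):
$\mathsf{Z}$: instance space
$\mathcal{P}$: class of prob. dist. on $\mathsf{Z}$
$\mathcal{F}$: closed convex subset of a Hilbert space $\mathcal{H}$
$\ell : \mathcal{F} \times \mathsf{Z} \to \mathbb{R}$: loss fcn
$L_P(f) = \mathbb{E}_P[\ell(f, Z)] = \int_{\mathsf{Z}} \ell(f, z)\, P(dz)$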
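Examples:
1. $\mathcal{F} \subseteq \mathcal{H}$, a Hilbert space of fcns $\mathsf{Z} \to \mathbb{R}$; $\ell(f, z) = f(z)$
2. $\mathcal{F} \subseteq \mathcal{H}_K$, the RKHS for some kernel $K$; $z = (x, y) \in \mathsf{X} \times \{\pm 1\}$; $\ell(f, z) = \ell(f, (x, y)) = \varphi(y f(x))$, $\varphi$ a penalty fcn
3. $\ell(f, (x, y)) = (y - f(x))^2$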
Main idea: want to bring out the dependence of $\ell(f, z)$ on both $f$ and $z$
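Learning algos: $A_n : \mathsf{Z}^n \to \mathcal{F}$, one for each $n$
$\mathsf{Z}^* = \bigcup_{n \ge 1} \mathsf{Z}^n$: all tuples over $\mathsf{Z}$
$A : \mathsf{Z}^* \to \mathcal{F}$
$A(Z^n)$: random element of $\mathcal{F}$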
Risks:
$L_P(A(Z^n)) = \int_{\mathsf{Z}} \ell(A(Z^n), z)\, P(dz)$
$L_n(A(Z^n)) = \frac{1}{n}\sum_{i=1}^n \ell(A(Z^n), Z_i)$
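To make the two risks concrete, here is a minimal sketch that treats a learning algorithm as a map from tuples of any length to a hypothesis and evaluates both its empirical risk and its expected risk. The squared loss, the sample-mean algorithm, and the four-point distribution are assumptions chosen so that both quantities can be computed exactly.

```python
from fractions import Fraction

def squared_loss(f, z):
    """Hypothetical loss l(f, z) = (f - z)^2 with F = R and Z = R."""
    return (f - z) ** 2

def sample_mean(sample):
    """A toy learning algorithm A: accepts a tuple of any length, returns an element of F."""
    return sum(sample, Fraction(0)) / len(sample)

def empirical_risk(loss, f, sample):
    """Average loss of f over the sample that was used for training."""
    return sum(loss(f, z) for z in sample) / len(sample)

def expected_risk(loss, f, support, probs):
    """Exact expected loss of f when P has finite support (the integral is a finite sum)."""
    return sum(p * loss(f, z) for z, p in zip(support, probs))

sample = (Fraction(0), Fraction(1), Fraction(2))
f_hat = sample_mean(sample)                       # A(Z^n)
emp = empirical_risk(squared_loss, f_hat, sample)

support = [Fraction(k) for k in range(4)]         # P uniform on {0, 1, 2, 3}
probs = [Fraction(1, 4)] * 4
exp_risk = expected_risk(squared_loss, f_hat, support, probs)

print(f_hat, emp, exp_risk)  # 1 2/3 3/2
```

The empirical risk is computed on the same points the algorithm saw, while the expected risk averages over a fresh draw from P; the gap between the two is what the stability argument controls.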
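Motivating example: ERM w/ strongly convex losses (Ch. 3)
$(f, z) \mapsto \ell(f, z)$
$f \in \mathcal{F}$: closed convex subset of $(\mathcal{H}, \|\cdot\|)$
Assume:
1. $\ell(f, z)$ is $L$-Lipschitz in $f \in \mathcal{F}$, unif. in $z$:
$\sup_{z \in \mathsf{Z}} |\ell(f, z) - \ell(f', z)| \le L \|f - f'\|$
2. $\ell(f, z)$ is $m$-strongly convex in $f \in \mathcal{F}$, unif. in $z$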
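Review: $\psi : \mathcal{F} \to \mathbb{R}$
$\mathcal{F}$ convex: $f, f' \in \mathcal{F} \Rightarrow \lambda f + (1-\lambda) f' \in \mathcal{F}$ $\forall \lambda \in [0, 1]$
$\mathcal{F}$ closed: $f_k \in \mathcal{F}$, $f_k \to f$ $\Rightarrow$ $f \in \mathcal{F}$
$\psi : \mathcal{F} \to \mathbb{R}$ is convex if
$\psi(\lambda f + (1-\lambda) f') \le \lambda \psi(f) + (1-\lambda)\psi(f')$ for all $f, f' \in \mathcal{F}$, $\lambda \in [0, 1]$
$\psi : \mathcal{F} \to \mathbb{R}$ is $m$-strongly convex ($m > 0$)
if $f \mapsto \psi(f) - \frac{m}{2}\|f\|^2$ is convex
Note: not requiring $\psi$ to be differentiable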
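Consequences of strong convexity:
(A) $\forall f, f' \in \mathcal{F}$, $\lambda \in [0, 1]$: $\psi(\lambda f + (1-\lambda) f') \le \lambda\psi(f) + (1-\lambda)\psi(f') - \frac{m}{2}\lambda(1-\lambda)\|f - f'\|^2$
Unique minimizer:
if $\psi(f^*) = \min_{f \in \mathcal{F}} \psi(f)$, then $f^*$ is unique
(B) $\psi(f) \ge \psi(f^*) + \frac{m}{2}\|f - f^*\|^2$
(B) can be derived from (A)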
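The two consequences of strong convexity, the sharpened convexity inequality and the quadratic growth around the minimizer, can be sanity-checked numerically. The sketch below assumes the hypothetical one-dimensional function psi(f) = f^2 + |f|, which is 2-strongly convex, not differentiable at 0, and minimized at f* = 0; it is a spot check on random points, not a proof.

```python
import random

M = 2.0  # strong-convexity parameter of psi below (assumed example)

def psi(f):
    # f^2 is 2-strongly convex; |f| is convex and not differentiable at 0
    return f * f + abs(f)

def gap_sharpened_convexity(f, g, lam, m=M):
    """RHS minus LHS of: psi(lam f + (1-lam) g) <= lam psi(f) + (1-lam) psi(g)
    - (m/2) lam (1-lam) |f - g|^2. Nonnegative when the inequality holds."""
    mix = lam * f + (1 - lam) * g
    rhs = lam * psi(f) + (1 - lam) * psi(g) - 0.5 * m * lam * (1 - lam) * (f - g) ** 2
    return rhs - psi(mix)

def gap_quadratic_growth(f, f_star=0.0, m=M):
    """psi(f) - psi(f*) - (m/2) |f - f*|^2. Nonnegative when the inequality holds."""
    return psi(f) - psi(f_star) - 0.5 * m * (f - f_star) ** 2

rng = random.Random(0)
points = [(rng.uniform(-3, 3), rng.uniform(-3, 3), rng.random()) for _ in range(1000)]
min_gap_a = min(gap_sharpened_convexity(f, g, lam) for f, g, lam in points)
min_gap_b = min(gap_quadratic_growth(f) for f, _, _ in points)
```

Both minima should be nonnegative up to floating-point rounding, even though psi has a kink at its minimizer.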
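Let $\psi$ be $m$-strongly convex, and let $B : \mathcal{F} \to \mathbb{R}$ be an $L$-Lip. fcn; let
$f^* = \arg\min_{f \in \mathcal{F}} \psi(f)$
$\tilde f = \arg\min_{f \in \mathcal{F}} \{\psi(f) + B(f)\}$ ($L$-Lip. perturbation)
Then $\|\tilde f - f^*\| \le L/m$ (stability of minimizers under Lip. perturbations)
If $\psi$ is convex, then $\psi + \frac{m}{2}\|\cdot\|^2$ is $m$-strongly convex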
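The bound on how far a Lipschitz perturbation can move the minimizer of an m-strongly convex function (at most the Lipschitz constant divided by m) is attained in a simple case. The sketch below assumes F = R^2, the quadratic psi(f) = (m/2)||f - a||^2, and the linear perturbation B(f) = lip * <u, f> with a unit vector u; these choices, and the names `a`, `u`, `lip`, are illustrative only.

```python
import math
import random

def objective(f, a, u, m, lip):
    """psi(f) + B(f) with psi(f) = (m/2)||f - a||^2 and B(f) = lip * <u, f>."""
    quad = 0.5 * m * sum((fi - ai) ** 2 for fi, ai in zip(f, a))
    lin = lip * sum(ui * fi for ui, fi in zip(u, f))
    return quad + lin

def perturbed_minimizer(a, u, m, lip):
    """Solve the first-order condition m (f - a) + lip * u = 0."""
    return [ai - (lip / m) * ui for ai, ui in zip(a, u)]

a = [1.0, 2.0]
u = [0.6, 0.8]            # unit vector, so B is lip-Lipschitz
m, lip = 2.0, 5.0

f_star = a                 # minimizer of psi alone
f_tilde = perturbed_minimizer(a, u, m, lip)
shift = math.dist(f_star, f_tilde)

# Spot check that f_tilde really minimizes the perturbed objective.
rng = random.Random(0)
best_nearby = min(
    objective([f_tilde[0] + rng.uniform(-1, 1), f_tilde[1] + rng.uniform(-1, 1)], a, u, m, lip)
    for _ in range(1000)
)
```

Here `shift` equals `lip / m` up to rounding, so the bound cannot be improved in general.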
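Back to learning:
$\ell(f, z)$: $L$-Lip., $m$-str. cvx in $f \in \mathcal{F}$, uniformly in $z$
ERM: $\hat f_n = \arg\min_{f \in \mathcal{F}} \frac{1}{n}\sum_{i=1}^n \ell(f, Z_i)$
Thm: With prob. $\ge 1 - \delta$,
$L(\hat f_n) - \inf_{f \in \mathcal{F}} L(f) \le \frac{2L^2}{\delta m n}$
Note: $1/\delta$ dependence instead of $\log(1/\delta)$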
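Proof: $Z^n = (Z_1, \dots, Z_n)$, $A(Z^n) = \hat f_n$
$Z'^n = (Z'_1, \dots, Z'_n) \sim P^{\otimes n}$, indep. of $Z^n$; $A(Z^n_{(i)}) = \hat f_n^{(i)}$, where
$Z^n = (Z_1, \dots, Z_{i-1}, Z_i, Z_{i+1}, \dots, Z_n)$
$Z^n_{(i)} = (Z_1, \dots, Z_{i-1}, Z'_i, Z_{i+1}, \dots, Z_n)$
Fix $f \in \mathcal{F}$:
$L_n(f) = \frac{1}{n}\sum_{j=1}^n \ell(f, Z_j)$ (loss on $Z^n$)
$L_n^{(i)}(f) = \frac{1}{n}\ell(f, Z'_i) + \frac{1}{n}\sum_{j \ne i}\ell(f, Z_j)$ (loss on $Z^n_{(i)}$)
$= \frac{1}{n}\ell(f, Z'_i) + L_n(f) - \frac{1}{n}\ell(f, Z_i)$
$= L_n(f) + \frac{1}{n}\big(\ell(f, Z'_i) - \ell(f, Z_i)\big)$ for each $i \in [n]$
$\hat f_n = \arg\min_{f \in \mathcal{F}} L_n(f)$
$\hat f_n^{(i)} = \arg\min_{f \in \mathcal{F}} L_n^{(i)}(f)$
Claim: $\|\hat f_n - \hat f_n^{(i)}\| \le \frac{2L}{mn}$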
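Proof of claim:
(1) $f \mapsto L_n(f)$ is $m$-str. cvx b/c $\ell$ is
(2) $f \mapsto \frac{1}{n}\big(\ell(f, Z'_i) - \ell(f, Z_i)\big)$ is $\frac{2L}{n}$-Lip.:
$B(f) = \frac{1}{n}\big(\ell(f, Z'_i) - \ell(f, Z_i)\big)$
$|B(f) - B(f')| \le \frac{1}{n}|\ell(f, Z'_i) - \ell(f', Z'_i)| + \frac{1}{n}|\ell(f, Z_i) - \ell(f', Z_i)|$
$\le \frac{2L}{n}\|f - f'\|$, since $\ell$ is $L$-Lip. in $f$
$L_n^{(i)}(f) = L_n(f) + B(f)$ ($m$-str. cvx + $\frac{2L}{n}$-Lip.)
By stab. of minimizers, $\|\hat f_n - \hat f_n^{(i)}\| \le \frac{2L/n}{m} = \frac{2L}{mn}$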
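The replace-one bound of 2L/(mn) on the distance between the two ERM solutions can be checked on a toy problem. The sketch assumes F = [-1, 1], Z = [-1, 1], and the loss (f - z)^2, which on this domain is 4-Lipschitz and 2-strongly convex in f; ERM is then the sample mean. These choices are illustrative and not taken from the lecture.

```python
import random

L_LIP, M_SC = 4.0, 2.0   # Lipschitz and strong-convexity constants of (f - z)^2 on [-1, 1]^2

def erm(sample):
    # The minimizer of the average of (f - z_i)^2 over f in [-1, 1] is the sample mean,
    # which already lies in [-1, 1] because every z_i does.
    return sum(sample) / len(sample)

def max_replace_one_shift(sample, replacements):
    """Largest |ERM(sample) - ERM(sample with point i replaced)| over all i."""
    f_hat = erm(sample)
    worst = 0.0
    for i, z_new in enumerate(replacements):
        perturbed = list(sample)
        perturbed[i] = z_new
        worst = max(worst, abs(f_hat - erm(perturbed)))
    return worst

rng = random.Random(1)
n = 50
sample = [rng.uniform(-1, 1) for _ in range(n)]
ghost = [rng.uniform(-1, 1) for _ in range(n)]   # independent copy used for replacements

shift = max_replace_one_shift(sample, ghost)
bound = 2 * L_LIP / (M_SC * n)
```

In this example replacing one point moves the mean by at most 2/n, comfortably inside the general bound of 4/n given by these constants.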
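Claim: $\mathbb{E}[L(\hat f_n) - L_n(\hat f_n)] \le \frac{2L^2}{mn}$
Proof of claim: $\hat f_n = A(Z^n)$
$\mathbb{E}[L(\hat f_n)] = \mathbb{E}[L(A(Z^n))]$
$= \mathbb{E}[\ell(A(Z^n), Z'_i)]$ $\forall i$ (since $Z'_i$ is indep. of $Z^n$)
$= \frac{1}{n}\sum_{i=1}^n \mathbb{E}[\ell(A(Z^n), Z'_i)]$
$\mathbb{E}[L_n(\hat f_n)] = \frac{1}{n}\sum_{i=1}^n \mathbb{E}[\ell(A(Z^n), Z_i)]$
$\forall i$: $(A(Z^n), Z_i) \overset{d}{=} (A(Z^n_{(i)}), Z'_i)$
$\Rightarrow \mathbb{E}[L_n(\hat f_n)] = \frac{1}{n}\sum_{i=1}^n \mathbb{E}[\ell(A(Z^n_{(i)}), Z'_i)]$
$\therefore \mathbb{E}[L(\hat f_n) - L_n(\hat f_n)]$
$= \frac{1}{n}\sum_{i=1}^n \mathbb{E}[\ell(\hat f_n, Z'_i) - \ell(\hat f_n^{(i)}, Z'_i)]$
$\le \frac{L}{n}\sum_{i=1}^n \mathbb{E}\|\hat f_n - \hat f_n^{(i)}\|$
$\le \frac{2L^2}{mn}$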
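A Monte Carlo sketch can illustrate the bound of 2L^2/(mn) on the expected gap between the true risk and the empirical risk of ERM. It assumes the toy setting F = [-1, 1], Z uniform on [-1, 1], and loss (f - z)^2 (4-Lipschitz, 2-strongly convex in f on this domain), for which ERM is the sample mean and the true risk of f is f^2 + 1/3. The trial count and seed are arbitrary.

```python
import random

def generalization_gap(n, trials, seed=0):
    """Monte Carlo estimate of E[L(f_hat) - L_n(f_hat)] for the sample-mean ERM."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample = [rng.uniform(-1, 1) for _ in range(n)]
        f_hat = sum(sample) / n
        # Exact risk: E (f_hat - Z)^2 = f_hat^2 + Var(Z) = f_hat^2 + 1/3 for Z ~ Unif[-1, 1].
        risk = f_hat ** 2 + 1.0 / 3.0
        emp = sum((f_hat - z) ** 2 for z in sample) / n
        total += risk - emp
    return total / trials

n = 10
gap = generalization_gap(n, trials=2000)
bound = 2 * 4.0 ** 2 / (2.0 * n)   # 2 L^2 / (m n) with L = 4, m = 2
```

In this particular example the exact gap works out to 2/(3n), so the general bound is loose by a constant factor but has the same 1/n rate.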
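Let $f^* \in \mathcal{F}$ with $L(f^*) = \inf_{f \in \mathcal{F}} L(f)$. Then
$\mathbb{E}[L(\hat f_n) - L(f^*)] = \mathbb{E}[L(\hat f_n) - L_n(\hat f_n)] + \mathbb{E}[L_n(\hat f_n) - L_n(f^*)] + \mathbb{E}[L_n(f^*) - L(f^*)]$
(second term $\le 0$ since $\hat f_n$ minimizes $L_n$; third term $= 0$)
$\le \frac{2L^2}{mn}$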
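By Markov's inequality,
$\mathbb{P}\big(L(\hat f_n) - L(f^*) \ge t\big) \le \frac{\mathbb{E}[L(\hat f_n) - L(f^*)]}{t} \le \frac{2L^2}{mnt}$
Let $\frac{2L^2}{mnt} = \delta$ $\Rightarrow$ $t = \frac{2L^2}{\delta m n}$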
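The high-probability bound 2L^2/(delta m n) obtained from Markov's inequality is easy to turn into a small calculator. The function names and the numeric constants below are hypothetical; the sketch only evaluates the formula and inverts it for the sample size.

```python
import math

def excess_risk_bound(lip, m, n, delta):
    """Bound t on the excess risk that holds with probability >= 1 - delta: 2 lip^2 / (delta m n)."""
    return 2.0 * lip ** 2 / (delta * m * n)

def sample_size(lip, m, eps, delta):
    """Smallest integer n (up to floating-point rounding) making the bound at most eps."""
    return math.ceil(2.0 * lip ** 2 / (delta * m * eps))

t = excess_risk_bound(lip=4.0, m=2.0, n=1000, delta=0.05)
n_needed = sample_size(lip=4.0, m=2.0, eps=0.1, delta=0.05)

# Shrinking delta by a factor of 10 inflates the bound by the same factor,
# which is the 1/delta (rather than log(1/delta)) dependence.
ratio = excess_risk_bound(4.0, 2.0, 1000, 0.005) / t
```

The linear growth in 1/delta is the price of using only a first-moment (Markov) argument.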
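Key points:
(1) $\|A(Z^n) - A(Z^n_{(i)})\| \le \frac{2L}{mn}$ $\forall i$
(hypothesis stability)
(2) $\frac{1}{n}\sum_{i=1}^n \mathbb{E}\big[\ell(A(Z^n), Z'_i) - \ell(A(Z^n_{(i)}), Z'_i)\big] \le \frac{2L^2}{mn}$
(replace-one stability)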