
On the Convergence Speed of MDL Predictions for Bernoulli Sequences

or: Is MDL Really So Bad?

Jan Poland and Marcus Hutter

IDSIA, Lugano, Switzerland

Big Picture

• MDL
• Bayes
• Other methods, e.g. PAC-Bayes

Bernoulli Classes

121410

38164

111001

18164

111000

58164

111010

341161101

78164

111011

01400

11401

#

wCode

141161100

Code = 111|{z}1+#bi ts

0|{z}stop

10|{z}data

² Set of parameters £ = f#1;#2; : : :g½[0;1]² Weights w# for each#2 £² Weights correspond to codes: w#=2¡ (Code#)

4
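The table's code can be reproduced mechanically. A minimal Python sketch (the function names are mine, and the unary-length/stop/data decomposition is inferred from the table above):

```python
from fractions import Fraction

def code(theta: Fraction) -> str:
    """Prefix code for a dyadic theta in [0, 1], inferred from the table:
    0 -> 00, 1 -> 01, and theta = (2j+1)/2^k -> '1'*k + '0' + (j in k-1 bits)."""
    if theta == 0:
        return "00"
    if theta == 1:
        return "01"
    k = theta.denominator.bit_length() - 1   # theta = odd numerator / 2^k
    j = (theta.numerator - 1) // 2           # numerator = 2j + 1
    data = "" if k == 1 else format(j, f"0{k - 1}b")
    return "1" * k + "0" + data              # unary level, stop bit, data

def weight(theta: Fraction) -> Fraction:
    # w_theta = 2^(-length(Code_theta))
    return Fraction(1, 2 ** len(code(theta)))
```

For example `code(Fraction(5, 8))` yields `"111010"` and `weight(Fraction(5, 8))` yields 1/64, matching the table row for θ = 5/8.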

Estimators

• Given observed sequence x = x1 x2 ... xn
• Probability of x given θ: p_θ(x) = θ^#ones(x) · (1 − θ)^(n − #ones(x))
• Posterior weights: w_θ(x) = w_θ p_θ(x) / Σ_θ' w_θ' p_θ'(x)
• Bayes mixture prediction: ξ(x) = Σ_θ w_θ(x) · θ
• MDL/MAP: θ*(x) = argmax_θ w_θ(x); the MDL prediction is θ*(x)
• Maximum Likelihood (ML): same as MAP, but with prior weights set to 1
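The four estimators above are a few lines each. A sketch for a finite parameter class, where `prior` maps each θ to its weight w_θ (names are mine):

```python
def posterior(prior, x):
    """Posterior weights w_theta(x), proportional to w_theta * theta^ones * (1-theta)^zeros."""
    ones = x.count("1")
    zeros = len(x) - ones
    unnorm = {t: w * t**ones * (1 - t)**zeros for t, w in prior.items()}
    z = sum(unnorm.values())
    return {t: u / z for t, u in unnorm.items()}

def bayes_prediction(prior, x):
    """Bayes mixture prediction xi(x) = sum over theta of w_theta(x) * theta."""
    return sum(p * t for t, p in posterior(prior, x).items())

def mdl_prediction(prior, x):
    """MDL/MAP estimate theta*(x) = argmax over theta of w_theta(x)."""
    post = posterior(prior, x)
    return max(post, key=post.get)

def ml_prediction(prior, x):
    """ML: same as MAP, but with all prior weights set to 1."""
    return mdl_prediction({t: 1 for t in prior}, x)
```

ML reuses the MAP rule with all prior weights replaced by 1, exactly as stated on the slide.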

An Example Process

True parameter θ0 = 5/16 = 0.3125. Predictions as the sequence x is revealed (table recovered from the slide; the sequence continues with 32 and then 640 further bits):

    observed x         Bayes mixture ξ   ML estimate   MAP (MDL) θ*
    (empty)            0.5               0             0
    0                  0.21              0             0
    01                 0.5               0.5           0.5
    010                0.45              0.34          0.5
    0100000011         0.4               5/16          0.5
    ...(32 more)...    0.27              0.25          0.25
    ...(640 more)...   0.3               5/16          5/16
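The early columns of this table can be checked numerically with the nine explicitly listed parameters from the "Bernoulli Classes" slide (the slide presumably uses the full countable class, so values agree only up to small differences and rounding):

```python
# Prior over the nine parameters of the "Bernoulli Classes" table: w = 2^(-|Code|)
prior = {0: 1/4, 1: 1/4, 1/2: 1/4, 1/4: 1/16, 3/4: 1/16,
         1/8: 1/64, 3/8: 1/64, 5/8: 1/64, 7/8: 1/64}

def predictions(x):
    ones = x.count("1")
    post = {t: w * t**ones * (1 - t)**(len(x) - ones) for t, w in prior.items()}
    z = sum(post.values())
    xi = sum(p * t for t, p in post.items()) / z   # Bayes mixture prediction
    theta_star = max(post, key=post.get)           # MDL/MAP estimate
    return xi, theta_star

xi, theta_star = predictions("0")
print(xi, theta_star)   # xi ≈ 0.206 (the slide shows 0.21); theta* = 0
```

After a single 0, the posterior mass concentrates on θ = 0 (the cheapest parameter consistent with the data), so the MDL estimate jumps to 0 while the Bayes mixture only drops to about 0.21, as in the table.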

What We Know

• Let θ0 ∈ Θ be the true parameter, with weight w0
• ξ converges to θ0 almost surely and fast; precisely, Σ_{t=0}^∞ E(ξ − θ0)² ≤ ln(w0^−1)
• θ* converges to θ0 almost surely but in general slowly; precisely, Σ_{t=0}^∞ E(θ* − θ0)² ≤ O(w0^−1)
• This even holds for arbitrary non-i.i.d. (semi-)measures!
• The ML estimates converge to θ0 almost surely; no such assertion about convergence speed is possible
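The first bound can be probed numerically for the Bernoulli class. Since the number of ones is a sufficient statistic, Σ_{t<T} E(ξ_t − θ0)² can be computed exactly by summing over binomial outcomes; the finite parameter class (all dyadic θ with code length ≤ 8) and the horizon T = 40 are my illustrative choices:

```python
from math import comb, log

# Dyadic parameters up to level 4 with weights 2^(-code length):
# {0, 1, 1/2} cost 2 bits; a level-k point odd/2^k costs 2k bits.
prior = {0.0: 0.25, 1.0: 0.25, 0.5: 0.25}
for level in range(2, 5):
    for m in range(1, 2 ** level, 2):
        prior[m / 2 ** level] = 2.0 ** (-2 * level)

theta0 = 5 / 16                      # true parameter, contained in the class
w0 = prior[theta0]                   # = 2^-8

def xi(n, k):
    """Bayes mixture prediction after n bits containing k ones."""
    post = {t: w * t**k * (1 - t)**(n - k) for t, w in prior.items()}
    z = sum(post.values())
    return sum(p * t for t, p in post.items()) / z

T = 40
total = 0.0
for n in range(T):
    # Expectation over x_{1:n}: the count of ones k is Binomial(n, theta0)
    total += sum(comb(n, k) * theta0**k * (1 - theta0)**(n - k)
                 * (xi(n, k) - theta0) ** 2 for k in range(n + 1))

print(total, "<=", log(1 / w0))      # the slide's bound: total <= ln(w0^-1)
```

The partial sum stays well below ln(w0^−1) = ln 256 ≈ 5.55, consistent with the stated bound for the Bayes mixture.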

Is MDL Really So Bad?

• Bayes mixture bound is description length(θ0)
• MDL bound is exp(description length(θ0))
• ⇒ MDL is exponentially worse in general
• This is also a loss bound!
• How about simple classes?
• Deterministic classes: one can show the bound huge constant × (description length(θ0))³
• Simple stochastic classes, e.g. Bernoulli?

MDL Is Really So Bad!

In the following example, Σ_t E(θ* − θ0)² = O(w0^−1): take N parameters with w_θ = 1/N for all θ, and θ0 = 1/2.

[Figure: the parameters 1/2, 1/2 + 1/16, 1/2 + 1/8, 1/2 + 1/4, ... on the number line, accumulating at θ0 = 1/2; braces mark the intervals between successive parameters.]

• Σ_t E(θ* − θ0)² · 1{θ* ∈ [1/2 + 1/8, 1/2 + 1/4]} = O(1)
• Σ_t E(θ* − θ0)² · 1{θ* ∈ [1/2 + 1/16, 1/2 + 1/8]} = O(1)
• ... and so on for each finer interval: each of the ~N intervals contributes on the order of a constant, so the total is of order N = w0^−1.

MDL Is Not That Bad!

• The instantaneous loss bound is good; precisely, E(θ* − θ0)² ≤ (1/n) · O(ln(w0^−1))
• This does not imply a finitely bounded cumulative loss!
• The cumulative loss bound is good for certain nice classes (parameters + weights)
• Intuitively: the bound is good if parameters of equal weight are uniformly distributed
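Why the instantaneous bound alone gives no finite cumulative bound: summing (1/n) · O(ln w0^−1) over time yields a harmonic series,

```latex
\sum_{n=1}^{T} \frac{1}{n}\, O\bigl(\ln w_0^{-1}\bigr)
  \;=\; O\bigl(\ln w_0^{-1}\bigr) \sum_{n=1}^{T} \frac{1}{n}
  \;=\; O\bigl(\ln T \cdot \ln w_0^{-1}\bigr)
  \;\xrightarrow{\,T \to \infty\,}\; \infty ,
```

so a finite cumulative bound must come from a different argument, as in the next slides.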

Prepare Sharper Upper Bound

• Define an interval construction (I_k, J_k) which exponentially contracts to θ0
• Let K(I_k) be the shortest description length of some θ ∈ I_k

[Figure: the unit interval with ticks at 0, 1/8, ..., 7/8, 1 and true parameter θ0 = 1/4; J_0 = [0, 1/2) contains θ0, I_0 = [1/2, 1] is the remaining half; braces mark I_1 and J_1 at the next level.]

Sharper Upper Bound

• Let K(J_k) be the shortest description length of some θ ∈ J_k
• Let Δ(k) = max{K(I_k) − K(J_k), 0}
• Theorem: Σ_t E(θ* − θ0)² ≤ O(ln(w0^−1) + Σ_{k=1}^∞ 2^−Δ(k) · √Δ(k))
• Corollary: "uniformly distributed weights ⇒ good bounds"
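Under one natural reading of the construction (at each step the current interval is halved, J_k is the half containing θ0 and I_k the other half, matching J_0 = [0, 1/2) for θ0 = 1/4 in the figure), Δ(k) can be computed for the dyadic code of the "Bernoulli Classes" slide. The halving convention, the half-open intervals, and the depth cutoff are my assumptions:

```python
from fractions import Fraction
from math import ceil, sqrt

def K(lo, hi):
    """Shortest code length of a dyadic point in [lo, hi), using the
    Bernoulli-class code: 0, 1, 1/2 cost 2 bits, level-k points cost 2k bits."""
    for d in range(60):
        step = Fraction(1, 2 ** d)
        q = ceil(lo / step) * step          # smallest depth-d grid point >= lo
        if q < hi:
            return 2 * max(d, 1)
    raise ValueError("interval too small for the search depth")

def deltas(theta0, n):
    """Delta(k) = max(K(I_k) - K(J_k), 0) for the halving construction."""
    lo, hi = Fraction(0), Fraction(1)
    out = []
    for _ in range(n):
        mid = (lo + hi) / 2
        if theta0 < mid:                    # J_k: the half containing theta0
            J, I, hi = (lo, mid), (mid, hi), mid
        else:
            J, I, lo = (mid, hi), (lo, mid), mid
        out.append(max(K(*I) - K(*J), 0))
    return out

ds = deltas(Fraction(1, 4), 8)
print(ds)                                   # [0, 0, 2, 4, 6, 8, 10, 12]
print(sum(2 ** -d * sqrt(d) for d in ds))   # the series term in the Theorem
```

For θ0 = 1/4 the gaps Δ(k) grow linearly, so the series Σ_k 2^−Δ(k) √Δ(k) converges, illustrating why simple (dyadic) true parameters enjoy good bounds under the Theorem.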

The Universal Case

• Θ = {all computable θ ∈ [0, 1]}
• w_θ = 2^−K(θ), where K denotes the prefix Kolmogorov complexity
• Σ_k 2^−Δ(k) · √Δ(k) = ∞ ⇒ the Theorem is not applicable
• Conjecture: Σ_t E(θ* − θ0)² ≤ O(ln(w0^−1) + Σ_{k=1}^∞ 2^−Δ(k))
• ⇒ the bound huge constant × polynomial holds for incompressible θ0
• Compare to the deterministic case

Conclusions

• Cumulative and instantaneous bounds are incompatible
• The main positive result generalizes to arbitrary i.i.d. classes
• Open problem: good bounds for more general classes?
• Thank you!