An introduction to Kolmogorov complexity

(and its applications)

Laurent Bienvenu ( LIAFA, CNRS & Université de Paris 7 )

CIRM, Marseille, February 9, 2010

1. Kolmogorov complexity

Some truly random sequences?

Let us imagine a company/website selling DVDs, each containing a sequence of $10^9$ bits and advertised as “truly random”.

We decide to order 4 such DVDs.

1st DVD:

000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.....

2nd DVD:

01010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010...

3rd DVD:

001001000011111101101010110010001000010110100011000010001101001100010011000110011000101000101110000000110111000001110011010001001010010100001001001110000010001000101001100111110011000111010000000010000010111010111010100110....

It’s almost π written in binary!

4th DVD:

111000000111000000000111000111000111000111000000111111000000000111111000000000000111000000111000000111000111111111111111000111000111000111000000000111000000111000000000111111000111000000000000111111111000000111000111111111000......

None of these sequences look “random”

A priori, all sequences of length N have the same probability of occurring.

A posteriori, some of them look non-random.

How to formalize this intuition?

Leibniz’s philosophy.

G.W. Leibniz (∼ 1686).

“[...] But it is good to consider that God does nothing out of order [...] for, as regards the universal order, everything conforms to it. This is so true that not only does nothing happen in the world that is absolutely irregular, but one could not even feign such a thing.”

“[...] But when a rule is very complex, that which conforms to it passes for irregular.”

The shortest description.

Idea: a non-random sequence will be regular, i.e., it can be described easily.

We have to be careful with the term “description”. To wit, the so-called Berry paradox (discussed by Russell, 1927); consider:

“The smallest integer which cannot be described in fewer than one hundred words”

But this very phrase describes that integer in fewer than one hundred words.

Kolmogorov complexity.

Chaitin, Kolmogorov (1966).

Definition. Let x be a finite binary string. The Kolmogorov complexity of x is the quantity K(x) defined by

K(x) = the length of the shortest computer program (in binary) that generates x

Making things fully formal (1).

One may argue that K(x) depends on what we mean by “program” and also on the “operating system” on which our “programs” run.

Choose a model of computation for functions {0, 1}∗ → {0, 1}∗ (Turing machines, RAM machines, etc.) such that there exists a universal machine U, defined by:

$U(0^n 1 p) = M_n(p)$

where $M_n$ is the n-th machine.
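The encoding $0^n 1 p$ can be illustrated with a toy sketch. The list `MACHINES` below is a hypothetical stand-in for an enumeration of machines (three total Python functions instead of an infinite list of possibly partial machines), and `universal` is an illustrative name, not part of the talk.

```python
from typing import Callable, List

# Hypothetical stand-ins for M_0, M_1, M_2 (a real enumeration is infinite).
MACHINES: List[Callable[[str], str]] = [
    lambda p: p,        # M_0: identity
    lambda p: p + p,    # M_1: writes its input twice
    lambda p: p[::-1],  # M_2: reverses its input
]

def universal(q: str) -> str:
    """Toy U: read q as 0^n 1 p and return M_n(p)."""
    n = q.index("1")    # number of leading zeros (ValueError if q has no '1')
    p = q[n + 1:]       # everything after the first '1' is the program p
    return MACHINES[n](p)  # IndexError if n is outside the toy enumeration

print(universal("1" + "011"))    # runs M_0 on "011"
print(universal("01" + "011"))   # runs M_1 on "011"
print(universal("001" + "011"))  # runs M_2 on "011"
```

Simulating $M_n$ through this toy U costs exactly n + 1 extra bits of input, which is the kind of machine-dependent constant overhead that the optimality statement refers to.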

Making things fully formal (2).

The machine U is additively optimal, i.e., it is at least as good at describing strings as any other machine, up to an additive constant. Formally:

Proposition. For any given machine M, there exists a constant $c_M$ such that for all p, x, if M(p) = x is defined, then there exists $p'$ such that $|p'| \le |p| + c_M$ and $U(p') = x$.

Then, set

$K(x) = \min\{\, |p| : U(p) = x \,\}$

We can view a shortest such p as the shortest description, or ideal compression, of x.
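The "ideal compression" view suggests a rough experiment: any real compressor gives an upper-bound proxy for description length, never K itself. The sketch below uses Python's standard `zlib` module; the helper name `compressed_bits` and the sample sizes are assumptions made for illustration.

```python
import os
import zlib

def compressed_bits(data: bytes) -> int:
    """Length in bits of the zlib encoding of data (a proxy upper bound, not K)."""
    return 8 * len(zlib.compress(data, 9))

n_bytes = 100_000
zeros = bytes(n_bytes)                 # all-zero bits, like the 1st DVD
alternating = bytes([0x55]) * n_bytes  # 01010101..., like the 2nd DVD
noise = os.urandom(n_bytes)            # bytes from the operating system's RNG

for name, data in [("zeros", zeros), ("alternating", alternating), ("noise", noise)]:
    print(f"{name:12s} raw bits = {8 * len(data):7d}  zlib bits = {compressed_bits(data)}")
```

The two regular inputs shrink to a small fraction of their length, while the noise does not shrink at all. A sequence such as the binary digits of π would also look incompressible to zlib even though a short program generates it, which is why a practical compressor only bounds the complexity from above.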

Kolmogorov complexity is well-defined, up to an additive constant.

Typically, we prove results of the form

K(x) ≤ |x|/2 − O(1),

K(x) ≥ n − O(1),

etc.

Basic properties (1).

The complexity of a string x is at most its length.

Proposition. For any string x, K(x) ≤ |x| + O(1).

A very intuitive result, as one can always describe a string by giving it explicitly.
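The "give it explicitly" argument can be sketched with Python source code playing the role of programs. Lengths here are counted in characters rather than bits, and `literal_program` is a hypothetical helper name; the point is only that the overhead does not depend on x.

```python
def literal_program(x: str) -> str:
    """Return Python source that prints the bit string x verbatim."""
    return f'print("{x}")'

for x in ["0", "0101", "001001000011111101101010"]:
    prog = literal_program(x)
    # The difference len(prog) - len(x) is the same for every x.
    print(len(x), len(prog), len(prog) - len(x))
```

The constant difference between the two lengths plays the role of the O(1) term in the proposition.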

Basic properties (2).

Proposition. For all k:

$\#\{\, x : K(x) < k \,\} < 2^k$

Indeed, there are $2^0 + 2^1 + \dots + 2^{k-1} < 2^k$ programs of size < k.

Corollary. For given n and c, at most a proportion $2^{-c}$ of the strings of length n have complexity less than n − c.

Intuitive again: a string chosen at random should be close to incompressible with high probability. From this, it makes sense to call algorithmically random any string x whose complexity is close to |x|.
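The counting argument can be checked by brute force for small k. This is a minimal sketch; the function names are illustrative, and the second function only evaluates the bound from the corollary (it assumes c ≤ n), it does not compute any complexity.

```python
from itertools import product

def programs_shorter_than(k: int) -> int:
    """Count binary strings of length < k by explicit enumeration."""
    return sum(1 for length in range(k) for _ in product("01", repeat=length))

for k in range(1, 11):
    # 2^0 + 2^1 + ... + 2^(k-1) = 2^k - 1, which is strictly below 2^k.
    assert programs_shorter_than(k) == 2**k - 1 < 2**k

def compressible_fraction_bound(n: int, c: int) -> float:
    """Bound on the fraction of length-n strings with complexity < n - c."""
    return (2 ** (n - c) - 1) / 2 ** n

print(compressible_fraction_bound(20, 8), "<", 2 ** -8)
```

For n = 20 and c = 8, fewer than one string in 256 can be compressed by 8 bits or more, whatever the universal machine is.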

Basic properties (3).

Another fundamental property is that it is not possible to increase Kolmogorov complexity by algorithmic means.

Proposition. For any computable function f, there exists a constant $c_f$ such that for all x

$K(f(x)) \le K(x) + c_f$

This also shows that we can extend Kolmogorov complexity to any type of object which can be encoded as a binary string (integers, finite graphs, pairs of strings). The choice of the encoding will only affect the complexity by an additive constant.

Kolmogorov complexity for other objects.

For an integer m, $K(m) \le \log(m) + O(1)$

For a finite graph G with n vertices, $K(G) \le n^2 + O(1)$

For a pair of objects x, y, $K(x, y) \le K(x) + K(y) + O(\log |x| + \log |y|)$ (the log term disappears if x and y are of about the same length)
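These three bounds each come from a concrete encoding. The sketch below shows one possible choice per object type; the function names and the particular pair encoding (length of x written with doubled bits, then the marker "01") are assumptions made for illustration, not the encodings used in the talk.

```python
from typing import Set, Tuple

def encode_int(m: int) -> str:
    """Binary expansion of m >= 0: about log2(m) bits."""
    return format(m, "b")

def encode_graph(n: int, edges: Set[Tuple[int, int]]) -> str:
    """Adjacency matrix of an undirected graph on vertices 0..n-1: n*n bits."""
    return "".join(
        "1" if (i, j) in edges or (j, i) in edges else "0"
        for i in range(n) for j in range(n)
    )

def encode_pair(x: str, y: str) -> str:
    """Self-delimiting pair: |x| in binary with doubled bits, '01', then x, then y."""
    header = "".join(b + b for b in format(len(x), "b")) + "01"
    return header + x + y

def decode_pair(code: str) -> Tuple[str, str]:
    """Invert encode_pair by reading doubled bits until the '01' marker."""
    i, length_bits = 0, ""
    while code[i:i + 2] != "01":
        length_bits += code[i]
        i += 2
    i += 2  # skip the marker
    n = int(length_bits, 2)
    return code[i:i + n], code[i + n:]

print(encode_int(10), len(encode_graph(4, {(0, 1), (2, 3)})))
print(decode_pair(encode_pair("0110", "111")))
```

The header of `encode_pair` has length about 2 log |x|, which is where a logarithmic overhead for pairs comes from: without it, the decoder could not tell where x ends and y begins.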

Basic properties (4).

The last important property is bad news: Kolmogorov complexity is not a computable function :-(

The proof is essentially Berry’s paradox! Suppose K is computable. We can then design a computable function f : N → N by

$f(n) = \min\{\, m : K(m) \ge n \,\}$

By definition, for all n, $K(f(n)) \ge n$. But also $K(f(n)) \le K(n) + c_f$ (non-creation of complexity), and $K(n) \le \log(n) + O(1)$.

So we would have $\log(n) + O(1) \ge n$ for all n. An obvious contradiction!
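The function f from the proof is just an unbounded search, as the sketch below shows. Since K is not computable, the argument `complexity` is a hypothetical parameter; the demonstration passes a computable stand-in (the bit length of m), which is explicitly not K.

```python
from typing import Callable

def berry(n: int, complexity: Callable[[int], int]) -> int:
    """Return min{m : complexity(m) >= n} by searching m = 0, 1, 2, ..."""
    m = 0
    while complexity(m) < n:
        m += 1
    return m

def bit_length(m: int) -> int:
    """Computable stand-in for a complexity measure (NOT Kolmogorov complexity)."""
    return m.bit_length()

print([berry(n, bit_length) for n in range(6)])
```

With the stand-in there is no contradiction, because the bit length is not a minimum over all programs. If a computable K could be passed in instead, `berry` would be a short program producing, from n, an integer of complexity at least n, which is exactly what the proof rules out.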

Conditional complexity.

Kolmogorov complexity can be seen as an algorithmic version of entropy. As for entropy, we can define a conditional version:

K(x | y) = the length of the shortest computer program (in binary) that transforms y into x

(the formalization is done as before).

A fundamental result is the symmetry of information (Levin and Kolmogorov, ∼ 1970):

K(x, y) = K(x) + K(y | x) (up to a logarithmic term)
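A rough feel for conditional complexity can be had by giving a compressor access to y while it encodes x. The sketch below uses the preset-dictionary option (`zdict`) of `zlib.compressobj` from the Python standard library; the helper names are illustrative, and the numbers are only heuristic upper-bound proxies, so nothing here verifies the symmetry of information itself.

```python
import random
import zlib

def c_bits(x: bytes) -> int:
    """Proxy for K(x): bits used by zlib on x alone."""
    return 8 * len(zlib.compress(x, 9))

def c_bits_given(x: bytes, y: bytes) -> int:
    """Proxy for K(x | y): bits used by zlib on x with y as a preset dictionary."""
    comp = zlib.compressobj(level=9, zdict=y)
    return 8 * len(comp.compress(x) + comp.flush())

rng = random.Random(0)  # fixed seed so the run is reproducible
x = bytes(rng.getrandbits(8) for _ in range(1000))
y = bytes(rng.getrandbits(8) for _ in range(1000))

print("C(x)     =", c_bits(x))
print("C(x | x) =", c_bits_given(x, x))  # x is fully known: very short
print("C(x | y) =", c_bits_given(x, y))  # unrelated y: no real help
```

Knowing x makes x almost free to describe, while an unrelated y gives essentially no saving, which matches the intuition behind K(x | y).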

2. Randomness for infinite sequences