
Transcript of Nonstochastic Information Theory for Feedback Control (disc-cps15.imtlucca.it/pdf/Nair.pdf)

Page 1:

Nonstochastic Information Theory for Feedback Control

Girish Nair

Department of Electrical and Electronic Engineering, University of Melbourne

Australia

Dutch Institute of Systems and Control Summer School, Zandvoort aan Zee, The Netherlands

June 2015

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 1 / 68

Page 2:

Outline

1 Basics of Shannon Information Theory

2 Overview of Capacity-Limited State Estimation/Control

3 Motivation for Nonstochastic Control

4 Uncertain Variables, Unrelatedness and Markovness

5 Nonstochastic Information

6 Channels and Coding Theorems

7 State Estimation and Control via Noisy Channels

Page 9:

What is Information?

Wikipedia: Information is that which informs, i.e. that from which data and knowledge can be derived (as data represents values attributed to parameters, while knowledge is acquired through understanding of real things or abstract concepts, in any particular field of study).

In [Shannon BSTJ 1948], information was somewhat more concretely defined, within a probability space.


Page 13:

Shannon Entropy

Prior uncertainty or entropy of a discrete random variable (rv) X ~ p_X:

H[X] := E[ log2( 1 / p_X(X) ) ] = -∑_x p_X(x) log2 p_X(x) ≥ 0.

This is the minimum expected number of yes/no questions sufficient to determine X. The joint (discrete) entropy H[X,Y] is defined by replacing p_X with p_{X,Y}. The conditional entropy of X given Y is the average uncertainty in X given Y:

H[X|Y] := E[ log2( 1 / p_{X|Y}(X|Y) ) ] ≡ H[X,Y] - H[Y]  (≥ 0).
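These definitions are easy to check numerically. A minimal sketch (the joint pmf values below are my own illustration, not from the slides), computing H[X], H[X,Y] and H[X|Y] via the chain rule H[X|Y] = H[X,Y] - H[Y]:

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a pmf given as an array of probabilities."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]                      # convention: 0 * log 0 = 0
    return float(-np.sum(p * np.log2(p)))

# Hypothetical joint pmf p_xy[x, y], for illustration only
p_xy = np.array([[0.25, 0.25],
                 [0.00, 0.50]])

H_XY = entropy(p_xy)                  # joint entropy H[X,Y]
H_X  = entropy(p_xy.sum(axis=1))      # marginal entropy H[X]
H_Y  = entropy(p_xy.sum(axis=0))      # marginal entropy H[Y]
H_X_given_Y = H_XY - H_Y              # conditional entropy H[X|Y]
```

Note that conditioning cannot increase uncertainty: H[X|Y] ≤ H[X], with equality iff X and Y are independent.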


Page 16:

Shannon Information

Information gained about X from Y := reduction in uncertainty:

I[X;Y] := H[X] - H[X|Y].

Called mutual information since it is symmetric:

I[X;Y] = -∑_{x,y} p_{X,Y}(x,y) log2( p_X(x) p_Y(y) / p_{X,Y}(x,y) ) ≡ H[X] + H[Y] - H[X,Y].

For continuous X, Y, replace the pmfs p with the corresponding pdfs f.
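The symmetric form above can be computed directly from a joint pmf matrix; a small sketch (the pmf is a hypothetical example of mine) that also exhibits the symmetry I[X;Y] = I[Y;X] and that independence gives zero information:

```python
import numpy as np

def mutual_information(p_xy):
    """I[X;Y] in bits from a joint pmf matrix p_xy[x, y]."""
    p_xy = np.asarray(p_xy, dtype=float)
    px = p_xy.sum(axis=1, keepdims=True)    # marginal of X, column vector
    py = p_xy.sum(axis=0, keepdims=True)    # marginal of Y, row vector
    mask = p_xy > 0
    # sum p(x,y) log2( p(x,y) / (p(x) p(y)) ) over the support
    return float(np.sum(p_xy[mask] * np.log2((p_xy / (px @ py))[mask])))

# Hypothetical joint pmf, for illustration only
p_xy = np.array([[0.25, 0.25],
                 [0.00, 0.50]])
I = mutual_information(p_xy)   # equals H[X] + H[Y] - H[X,Y]
```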


Page 19:

Codes for Stationary Memoryless Random Channels

[Figure: Message M → Coder → X_0:n → Channel → Y_0:n → Decoder → M̂, where the stationary memoryless channel satisfies p_{Y_0:n | X_0:n}(y_0:n | x_0:n) = ∏_{k=0}^{n} q(y_k | x_k).]

A block code is defined by:
- an error tolerance ε > 0, block length n+1 ∈ ℕ and message-set cardinality µ ≥ 1;
- an encoder mapping γ such that, for any independent, uniformly distributed message M ∈ {m_1, ..., m_µ}, X_0:n = γ(i) if M = m_i;
- and a decoder M̂ = δ(Y_0:n) such that Pr[M̂ ≠ M] ≤ ε.
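A toy instance of this definition (my own illustration; all parameter values are assumptions): a µ = 2 repetition code of block length n+1 over a binary symmetric channel with crossover probability p, decoded by majority vote, which drives Pr[M̂ ≠ M] below any ε as n grows:

```python
import random

random.seed(0)

def encode(i, n):
    """Encoder gamma: message index i in {0, 1} -> codeword X_0:n (repetition)."""
    return [i] * (n + 1)

def bsc(x, p):
    """Stationary memoryless binary symmetric channel: flip each bit w.p. p."""
    return [b ^ (random.random() < p) for b in x]

def decode(y):
    """Decoder delta: majority vote over Y_0:n."""
    return int(sum(y) > len(y) / 2)

n, p, trials = 10, 0.1, 2000          # block length n+1 = 11
errors = 0
for _ in range(trials):
    m = random.randint(0, 1)          # uniformly distributed message
    errors += decode(bsc(encode(m, n), p)) != m
error_rate = errors / trials          # empirical Pr[decoded message != M]
```

The rate here is log2(2)/11 ≈ 0.09 bits/channel use, far below the BSC capacity 1 - H(0.1) ≈ 0.53; Shannon's theorem (next slide) says rates up to capacity are achievable with more sophisticated codes.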


Page 23:

Capacity and Information

Define (ordinary) capacity C operationally as the highest block-coding rate that yields vanishing error probability:

C := lim_{ε→0} sup_{n,µ ∈ ℕ, γ, δ} log2(µ) / (n+1) = lim_{ε→0} lim_{n→∞} sup_{µ ∈ ℕ, γ, δ} log2(µ) / (n+1).

Shannon showed that capacity can also be thought of intrinsically, as the maximum information rate across the channel:

Theorem (Shannon BSTJ 1948)

C = sup_{n ≥ 0, p_{X_0:n}} I[X_0:n ; Y_0:n] / (n+1)  ( = sup_{p_X} I[X;Y] for a memoryless channel ).
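The intrinsic characterisation sup_{p_X} I[X;Y] can be evaluated numerically for a simple memoryless channel. A sketch for a binary symmetric channel with crossover probability p (an example of mine, not from the slides): maximising I[X;Y] over the input distribution by grid search recovers the known closed form C = 1 - H(p), attained by the uniform input:

```python
import numpy as np

def h2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def bsc_mutual_info(q, p):
    """I[X;Y] for BSC(p) with input P(X=1) = q, via I = H[Y] - H[Y|X]."""
    py1 = q * (1 - p) + (1 - q) * p   # output marginal P(Y = 1)
    return h2(py1) - h2(p)            # H[Y|X] = h2(p) for every input

p = 0.1
qs = np.linspace(0.0, 1.0, 1001)      # grid over input distributions
C = max(bsc_mutual_info(q, p) for q in qs)   # approx. 1 - h2(0.1) ≈ 0.531
```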


Page 25:

Networked State Estimation/Control

Classical assumption: controllers and estimators know plant outputs perfectly. Since the 60s this assumption has been challenged by:
- delays, due to latency and intermittent channel access, in control area networks;
- quantisation errors in digital control;
- finite communication capacity per sensor in long-range radar surveillance networks.
The focus here is on limited quantiser resolution and capacity.


Page 26:

Estimation/Control over Communication Channels

[Figure: two feedback diagrams sharing the plant dynamics X_{k+1} = A X_k + B U_k + V_k, Y_k = G X_k + W_k, with noise V_k, W_k. Estimation: Y_k → Quantiser/Coder → S_k → Channel → Q_k → Decoder/Estimator → X̂_k. Control: Y_k → Quantiser/Coder → S_k → Channel → Q_k → Decoder/Controller → U_k, fed back to the plant.]


Page 27:

Additive Noise Model

Early work considered errorless digital channels and static quantisers, with uniform quantiser errors modelled as additive, uncorrelated noise [e.g. Curry 1970] with variance ∝ 2^{-2R} (R = bit rate). This is a good approximation for stable plants and high R, and allows linear stochastic estimation/control theory to be applied. However, for unstable plants it leads to conclusions that are wrong, e.g.
- if the plant is noiseless and unstable, then states/estimation errors cannot converge to 0;
- and if the plant is unstable, then mean-square-bounded states/estimation errors can always be achieved.
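The additive-noise approximation itself is easy to reproduce in simulation (parameter values are my own, illustrative): a uniform R-bit quantiser on [-1, 1] has step Δ = 2 · 2^{-R}, and for a well-exercised input its error is nearly uniform on [-Δ/2, Δ/2] with variance Δ²/12 ∝ 2^{-2R}:

```python
import numpy as np

rng = np.random.default_rng(1)
R = 8                                  # bit rate (assumed value)
lo, hi = -1.0, 1.0
delta = (hi - lo) / 2**R               # quantiser step

x = rng.uniform(lo, hi, 100_000)
# Uniform quantiser: map each sample to the midpoint of its cell
xq = lo + delta * (np.floor((x - lo) / delta) + 0.5)
err = xq - x

model_var = delta**2 / 12              # classical additive-noise model
emp_var = err.var()                    # empirical quantiser-error variance
```

The match is excellent here (a bounded, persistently exciting input), but, as the slide notes, for unstable plants this model yields qualitatively wrong conclusions.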


Page 31:

Errorless Channels - Data Rate Theorem

In fact, coding-based analyses reveal that stable state estimation/control is possible iff

R > ∑_{|λ_i| ≥ 1} log2 |λ_i|,

where λ_1, ..., λ_n are the eigenvalues of the plant matrix A. This holds under various assumptions and stability notions:
- random initial state, noiseless plant; mean r-th power convergence to 0 [N.-Evans, Auto. 03];
- bounded initial state, noiseless plant; uniform convergence to 0 [Tatikonda-Mitter, TAC 04];
- random plant noise; mean-square boundedness [N.-Evans, SICON 04];
- bounded plant noise; uniform boundedness [Tatikonda-Mitter, TAC 04].
Additive uncorrelated noise models of quantisation fail to capture the existence of such a threshold. Necessity is typically proved using differential entropy power, quantisation theory or volume-partitioning bounds; sufficiency via explicit construction.
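The threshold is straightforward to compute for a given plant. A small sketch with a hypothetical A matrix of my choosing (eigenvalues 2 and 0.5, so only the unstable mode contributes):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 0.5]])            # hypothetical plant matrix

eigs = np.linalg.eigvals(A)
# Data rate theorem: R must exceed the sum of log2|lambda_i|
# over the unstable eigenvalues (|lambda_i| >= 1).
R_min = sum(np.log2(abs(l)) for l in eigs if abs(l) >= 1)
# Here R_min = log2(2) = 1 bit/sample: only the eigenvalue at 2 counts.
```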


Page 38:

Noisy Channels

'Stable' states/estimation errors are possible iff a suitable channel figure-of-merit (FoM) satisfies

FoM > ∑_{|λ_i| ≥ 1} log2 |λ_i|,

where λ_1, ..., λ_n are the eigenvalues of the plant matrix A. Unlike the noiseless-channel case, the FoM depends critically on the stability notion and noise model:
- FoM = C: states/est. errors → 0 almost surely (a.s.) [Matveev-Savkin SIAM 07], or mean-square-bounded (MSB) states over an AWGN channel [Braslavsky et al. TAC 07];
- FoM = C_any: MSB states over random discrete memoryless channels [Sahai-Mitter TIT 06];
- FoM = C_0f for control, or C_0 for state estimation, with a.s. bounded states/est. errors [Matveev-Savkin IJC 07].
As C ≥ C_any ≥ C_0f ≥ C_0, these criteria generally do not coincide.



Missing Information

If the goal is MSB or a.s. convergence → 0 of states/estimation errors, then information theory is crucial for finding lower bounds. However, when the goal is a.s. bounded states/errors, classical information theory has played no role so far in networked estimation/control. Yet information in some sense must be flowing across the channel, even without a probabilistic model/objective.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 13 / 68


Questions

Is there a meaningful theory of information for nonrandom variables?
Can we construct an information-theoretic basis for networked estimation/control with nonrandom noise?
Are there intrinsic, information-theoretic interpretations of C_0 and C_0f?

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 14 / 68


Why Nonstochastic Anyway?

Long tradition in control of treating noise as a nonrandom perturbation with bounded magnitude, energy or power:

Control systems usually have mechanical/chemical components, as well as electrical.

▶ Dominant disturbances may not be governed by known probability distributions.

▶ E.g. in mechanical systems, the main disturbance may be vibrations at resonant frequencies determined by machine dimensions and material properties.

In contrast, communication systems are mainly electrical/electromagnetic/optical.

▶ Dominant disturbances - thermal noise, shot noise, fading etc. - are well-modelled by probability distributions derived from statistical/quantum physics.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 15 / 68


Why Nonstochastic Anyway? (cont.)

Related to the previous points:

In most digital comm. systems, bit periods T_b ≈ 2×10⁻⁵ s or shorter.
⇒ Thermal and shot noise (σ ∝ √T_b) noticeable compared to detected signal amplitudes (∝ T_b).

Control systems typically operate with longer sample or bit periods, 10⁻² or 10⁻³ s.
⇒ Thermal/shot noise negligible compared to signal amplitudes.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 16 / 68


Why Nonstochastic Anyway? (cont.)

For safety or mission-critical reasons, stability and performance guarantees are often required every time a control system is used, if disturbances are within rated bounds.
Especially if the plant is unstable or marginally stable.
Or if we wish to interconnect several control systems and still be sure of performance.
In contrast, most consumer-oriented communications requires good performance only on average, or with high probability.
Occasional violations of specifications are permitted, and cannot be prevented within a probabilistic framework.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 17 / 68


Probability in Practice

'If there's a fifty-fifty chance that something can go wrong, nine out of ten times, it will.'

(attrib. L. ‘Yogi’ Berra, former US baseball player)

(Photo from Wikipedia)

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 18 / 68


Uncertain Variable Formalism

Define an uncertain variable (uv) X to be a mapping from an underlying sample space Ω to a space 𝕏.
Each ω ∈ Ω may represent a specific combination of noise/input signals into a system, and X may represent a state/output variable.
For a given ω, x = X(ω) is the realisation of X.

Unlike probability theory, no σ-algebra ⊂ 2^Ω or measure on Ω is imposed.
Assume Ω is uncountable, to accommodate continuous 𝕏.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 19 / 68


UV Formalism - Ranges and Conditioning

Marginal range ⟦X⟧ := {X(ω) : ω ∈ Ω} ⊆ 𝕏.
Joint range ⟦X,Y⟧ := {(X(ω), Y(ω)) : ω ∈ Ω} ⊆ 𝕏 × 𝕐.
Conditional range ⟦X|y⟧ := {X(ω) : Y(ω) = y, ω ∈ Ω}.

In the absence of statistical structure, the joint range fully characterises the relationship between X and Y. Note that

⟦X,Y⟧ = ⋃_{y ∈ ⟦Y⟧} ⟦X|y⟧ × {y},

i.e. the joint range is given by the conditional and marginal, analogously to probability theory.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 20 / 68
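For a finite Ω these ranges reduce to set comprehensions. A small sketch (the particular Ω, X and Y below are invented for illustration); it also checks the union identity above:

```python
def marginal(X, Omega):
    # [[X]] = {X(w) : w in Omega}
    return {X(w) for w in Omega}

def joint(X, Y, Omega):
    # [[X,Y]] = {(X(w), Y(w)) : w in Omega}
    return {(X(w), Y(w)) for w in Omega}

def conditional(X, Y, y, Omega):
    # [[X|y]] = {X(w) : Y(w) = y}
    return {X(w) for w in Omega if Y(w) == y}

Omega = range(4)
X = lambda w: w % 2   # low bit
Y = lambda w: w // 2  # high bit

# The joint range is the union over y of [[X|y]] x {y}:
union = {(x, y) for y in marginal(Y, Omega)
                for x in conditional(X, Y, y, Omega)}
assert union == joint(X, Y, Omega)
```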


Independence Without Probability

Definition
The uv's X, Y are called (mutually) unrelated, denoted X ⊥ Y, if

⟦X,Y⟧ = ⟦X⟧ × ⟦Y⟧. (1)

Otherwise they are called related.

Equivalent characterisation:

Proposition
The uv's X, Y are unrelated iff

⟦X|y⟧ = ⟦X⟧, ∀y ∈ ⟦Y⟧. (2)

Unrelatedness is equivalent to X and Y inducing qualitatively independent [Rényi'70] partitions of Ω when Ω is finite.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 21 / 68
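Definition (1) can be checked mechanically on a finite sample space. A sketch, with invented toy uv's:

```python
def unrelated(X, Y, Omega):
    """X and Y are unrelated iff [[X,Y]] = [[X]] x [[Y]]."""
    jx = {X(w) for w in Omega}
    jy = {Y(w) for w in Omega}
    jxy = {(X(w), Y(w)) for w in Omega}
    return jxy == {(x, y) for x in jx for y in jy}

Omega = range(4)
# Y reads the high bit of w, X the low bit: every (x, y) pair occurs.
assert unrelated(lambda w: w % 2, lambda w: w // 2, Omega)
# X = Y: the joint range is the diagonal, a strict subset of the product.
assert not unrelated(lambda w: w % 2, lambda w: w % 2, Omega)
```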


Examples of Relatedness and Unrelatedness

[Figure: two joint-range diagrams. (a) X, Y related: the conditional ranges vary with the conditioning value, e.g. ⟦X|y⟧ ⊂ ⟦X|y′⟧ and ⟦Y|x⟧ ⊂ ⟦Y|x′⟧. (b) X, Y unrelated: ⟦X,Y⟧ is a Cartesian product, so ⟦X|y⟧ = ⟦X|y′⟧ = ⟦X⟧ for all y, y′.]

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 22 / 68


Markovness without Probability

Definition
X, Y, Z are said to form a Markov uncertainty chain X - Y - Z if

⟦X|y,z⟧ = ⟦X|y⟧, ∀(y,z) ∈ ⟦Y,Z⟧. (3)

This is equivalent to

⟦X,Z|y⟧ = ⟦X|y⟧ × ⟦Z|y⟧, ∀y ∈ ⟦Y⟧,

i.e. X, Z are conditionally unrelated given Y; in other words, X ⊥ Z | Y.
X, Y, Z are said to form a conditional Markov uncertainty chain given W if X - (Y,W) - Z.
This is also written X - Y - Z | W, or X ⊥ Z | (Y,W).

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 23 / 68
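Condition (3) is likewise checkable by enumeration on a finite Ω. A sketch with invented toy uv's:

```python
def is_markov(X, Y, Z, Omega):
    """Check X - Y - Z: [[X|y,z]] = [[X|y]] for every (y,z) in [[Y,Z]]."""
    for w0 in Omega:
        y, z = Y(w0), Z(w0)
        x_given_yz = {X(w) for w in Omega if Y(w) == y and Z(w) == z}
        x_given_y = {X(w) for w in Omega if Y(w) == y}
        if x_given_yz != x_given_y:
            return False
    return True

Omega = [(a, b) for a in (0, 1) for b in (0, 1)]
X = lambda w: w[0]
# Y = X: then [[X|y,z]] = {y} = [[X|y]], so X - Y - Z holds for any Z.
assert is_markov(X, lambda w: w[0], lambda w: w[1], Omega)
# Y constant but Z = X: Z still narrows down X, so the chain fails.
assert not is_markov(X, lambda w: 0, lambda w: w[0], Omega)
```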


Information without Probability

Definition
Two points (x,y), (x′,y′) ∈ ⟦X,Y⟧ are called taxicab connected, written (x,y) ↔ (x′,y′), if ∃ a sequence

(x,y) = (x₁,y₁), (x₂,y₂), …, (xₙ₋₁,yₙ₋₁), (xₙ,yₙ) = (x′,y′)

of points in ⟦X,Y⟧ such that each point differs in only one coordinate from its predecessor.

It is not hard to see that ↔ is an equivalence relation on ⟦X,Y⟧.
Call its equivalence classes a taxicab partition T[X;Y] of ⟦X,Y⟧.
Define a nonstochastic information index

I*[X;Y] := log₂ |T[X;Y]| ∈ [0,∞]. (4)

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 24 / 68
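On a finite joint range, T[X;Y] can be computed by union-find, linking any two points that share a coordinate (such points differ in only the other coordinate). A sketch:

```python
import math

def taxicab_partition_size(joint_range):
    """Number of taxicab-connected classes of a finite joint range of pairs."""
    pts = list(joint_range)
    parent = list(range(len(pts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            # One shared coordinate => the points differ in only the other one.
            if pts[i][0] == pts[j][0] or pts[i][1] == pts[j][1]:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(pts))})

def I_star(joint_range):
    return math.log2(taxicab_partition_size(joint_range))

# Diagonal range: no two points share a coordinate, so each is its own class.
assert I_star({(0, 0), (1, 1)}) == 1.0
# Adding (0,1) taxicab-connects everything: I* drops to 0.
assert I_star({(0, 0), (0, 1), (1, 1)}) == 0.0
```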


Connection to Common Random Variables

T[X;Y] is also called the ergodic decomposition [Gács-Körner PCIT72].
For discrete X, Y, the elements of T[X;Y] are the connected components of [Wolf-Wullschleger ITW04], which were shown there to give the maximal common rv Z*, i.e.

▶ Z* = f*(X) = g*(Y) under suitable mappings f*, g*
(since points in distinct sets of T[X;Y] are not taxicab-connected)

▶ If another rv Z ≡ f(X) ≡ g(Y), then Z ≡ k(Z*)
(since all points in the same set of T[X;Y] are taxicab-connected)

It is not hard to see that Z* also has the largest number of distinct values of any common rv Z ≡ f(X) ≡ g(Y).
I*[X;Y] = the Hartley entropy of Z*.
Maximal common rv's were first described in the brief paper 'The lattice theory of information' [Shannon TIT53].

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 25 / 68


Examples

[Figure: two joint-range examples. Left: |T| = 2 (classes labelled z = 0 and z = 1) = max. # distinct values that can always be agreed on from separate observations of X & Y. Right: |T| = 1 (the single class z = 0) = max. # distinct values that can always be agreed on from separate observations of X & Y.]

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 26 / 68


Equivalent View via Overlap Partitions

As in probability, it is often easier to work with conditional rather than joint ranges.
Let ⟦X|Y⟧ := {⟦X|y⟧ : y ∈ ⟦Y⟧} be the conditional range family.

Definition
Two points x, x′ are called ⟦X|Y⟧-overlap-connected if ∃ a sequence of sets B₁, …, Bₙ ∈ ⟦X|Y⟧ s.t.

x ∈ B₁ and x′ ∈ Bₙ,
Bᵢ ∩ Bᵢ₊₁ ≠ ∅, ∀i ∈ [1 : n−1].

Overlap-connectedness is an equivalence relation on ⟦X⟧, induced by ⟦X|Y⟧.
Let the overlap partition ⟦X|Y⟧* of ⟦X⟧ denote the equivalence classes.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 27 / 68


Equivalent View via Overlap Partitions (cont.)

Proposition
For any uv's X, Y,

I*[X;Y] = log₂ |⟦X|Y⟧*|. (5)

Proof sketch:
For any two points (x,y), (x′,y′) ∈ ⟦X,Y⟧, (x,y) ↔ (x′,y′) iff x and x′ are ⟦X|Y⟧-overlap-connected.
This allows us to set up a bijection between the partitions T[X;Y] and ⟦X|Y⟧*.
⇒ T[X;Y] and ⟦X|Y⟧* must have the same cardinality.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 28 / 68
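The proposition can be sanity-checked numerically by counting overlap-connected classes of a conditional range family. A self-contained sketch (the example family is invented; it reproduces the count that the taxicab partition of the corresponding joint range would give):

```python
import math

def overlap_partition_size(cond_ranges):
    """Number of overlap-connected classes induced by a family of sets [[X|y]]."""
    sets = [frozenset(s) for s in cond_ranges]
    parent = list(range(len(sets)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(sets)):
        for j in range(i + 1, len(sets)):
            if sets[i] & sets[j]:  # B_i and B_j overlap
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(sets))})

# Joint range {(0,0), (0,1), (1,1), (2,2)}:
# [[X|0]] = {0}, [[X|1]] = {0,1}, [[X|2]] = {2}.
# {0} and {0,1} overlap, {2} stands alone -> 2 classes, so I* = 1 bit,
# matching the taxicab partition of the joint range.
assert math.log2(overlap_partition_size([{0}, {0, 1}, {2}])) == 1.0
```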


Properties of I*

(Nonnegativity) I*[X;Y] ≥ 0 (obvious).
(Symmetry) I*[X;Y] = I*[Y;X]. This follows from the fact that

(x,y) ↔ (x′,y′) in ⟦X,Y⟧ ⟺ (y,x) ↔ (y′,x′) in ⟦Y,X⟧. (6)

From this property and (5), knowing just one of the conditional range families ⟦X|Y⟧ or ⟦Y|X⟧ is enough to determine I*[X;Y].
Unlike ordinary mutual information.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 29 / 68


Properties of I* (cont.)

Proposition (Monotonicity)
For any uv's X, Y, Z,

I*[X;Y,Z] ≥ I*[X;Y]. (7)

Proof: The idea is to find a surjection from ⟦X|Y,Z⟧* onto ⟦X|Y⟧*. This would automatically imply that the latter cannot have greater cardinality.

Pick any set B ∈ ⟦X|Y,Z⟧* and choose a B′ ∈ ⟦X|Y⟧* s.t. B ∩ B′ ≠ ∅.
At least one such B′ exists, since ⟦X|Y⟧* covers ⟦X⟧ ⊇ B.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 30 / 68


Proof of Monotonic Property (cont.)

Furthermore, exactly one such intersecting B′ ∈ ⟦X|Y⟧* exists for each B ∈ ⟦X|Y,Z⟧*, since B ⊆ B′:

▶ By definition, any x ∈ B and x′ ∈ B ∩ B′ are connected by a sequence of successively overlapping sets in ⟦X|Y,Z⟧.

▶ As ⟦X|y,z⟧ ⊆ ⟦X|y⟧, x, x′ are also connected by a sequence of successively overlapping sets in ⟦X|Y⟧.

▶ But B′ = all points that are ⟦X|Y⟧-overlap-connected with the representative point x′ ∈ B′, so x ∈ B′.

▶ As x was arbitrary, B ⊆ B′.

Thus B ↦ B′ is a well-defined map from ⟦X|Y,Z⟧* to ⟦X|Y⟧*.
It is also onto since, as noted before, every set B′ ∈ ⟦X|Y⟧* intersects some B in ⟦X|Y,Z⟧*, which covers ⟦X⟧.
So B ↦ B′ is the required surjection from ⟦X|Y,Z⟧* onto ⟦X|Y⟧*. ∎

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 31 / 68
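Monotonicity can also be observed on a toy example, reusing a taxicab-class count over joint ranges (the Ω, X, Y, Z below are invented: Y only coarsens X, while (Y,Z) determines it exactly):

```python
import math

def n_classes(pairs):
    """Taxicab-connected classes of a finite joint range of pairs."""
    pts = list(pairs)
    parent = list(range(len(pts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            if pts[i][0] == pts[j][0] or pts[i][1] == pts[j][1]:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(pts))})

Omega = range(4)
X = lambda w: w
Y = lambda w: w // 2   # Y only reveals the high bit of X
Z = lambda w: w % 2    # Z reveals the low bit

jr_xy = {(X(w), Y(w)) for w in Omega}           # 2 classes: I*[X;Y]  = 1 bit
jr_xyz = {(X(w), (Y(w), Z(w))) for w in Omega}  # 4 classes: I*[X;Y,Z] = 2 bits
assert math.log2(n_classes(jr_xyz)) >= math.log2(n_classes(jr_xy))
```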

Page 79: Nonstochastic Information Theory for Feedback Controldisc-cps15.imtlucca.it/pdf/Nair.pdfk k k k k k k k X AX BU V Y GX W 1, Noise V k,W k kkkk kkk XAXBUV YGXW 1 Decoder/ U k,Y k Controller

Proof of Monotonic Property (cont.)

Furthermore, exactly one such intersecting B0 2 JX |Y K⇤ exists foreach B 2 JX |Y ,Z K⇤, since B✓ B0:

I By definition, any x 2 B and x 0 2 B\B0 are connected by asequence of successively overlapping sets in JX |Y ,Z K.

I As JX |y ,zK ✓ JX |yK, x ,x 0 are also connected by a sequence ofsuccessively overlapping sets in JX |Y K.

I But B0 = all pts. that are JX |Y K-overlap connected with therepresentative pt. x 0 2 B0, so x 2 B0.

I As x was arbitrary, B✓ B0.

Thus B 7! B0 is a well-defined map from JX |Y ,Z K⇤ ! JX |Y K⇤.It is also onto since, as noted before, every set B0 2 JX |Y K⇤intersects some B in JX |Y ,Z K⇤, which covers JX K.So B 7! B0 is the required surjection from JX |Y ,Z K⇤ ! JX |Y K⇤. ⇤

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 31 / 68

Page 80: Nonstochastic Information Theory for Feedback Controldisc-cps15.imtlucca.it/pdf/Nair.pdfk k k k k k k k X AX BU V Y GX W 1, Noise V k,W k kkkk kkk XAXBUV YGXW 1 Decoder/ U k,Y k Controller

Proof of Monotonic Property (cont.)

Furthermore, exactly one such intersecting B0 2 JX |Y K⇤ exists foreach B 2 JX |Y ,Z K⇤, since B✓ B0:

I By definition, any x 2 B and x 0 2 B\B0 are connected by asequence of successively overlapping sets in JX |Y ,Z K.

I As JX |y ,zK ✓ JX |yK, x ,x 0 are also connected by a sequence ofsuccessively overlapping sets in JX |Y K.

I But B0 = all pts. that are JX |Y K-overlap connected with therepresentative pt. x 0 2 B0, so x 2 B0.

I As x was arbitrary, B✓ B0.

Thus B 7! B0 is a well-defined map from JX |Y ,Z K⇤ ! JX |Y K⇤.It is also onto since, as noted before, every set B0 2 JX |Y K⇤intersects some B in JX |Y ,Z K⇤, which covers JX K.So B 7! B0 is the required surjection from JX |Y ,Z K⇤ ! JX |Y K⇤. ⇤

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 31 / 68

Properties of I* (cont.)

Proposition (Data Processing)
For Markov uncertainty chains X − Y − Z (3),

  I*[X;Z] ≤ I*[X;Y].

Proof:
By monotonicity and the overlap partition characterisation of I*,

  I*[X;Z] ≤(7) I*[X;Y,Z] =(5) log |⟦X|Y,Z⟧*|.   (8)

By Markovness (3), ⟦X|y,z⟧ = ⟦X|y⟧, ∀ y ∈ ⟦Y⟧ and z ∈ ⟦Z|y⟧.
⇒ ⟦X|Y,Z⟧ = ⟦X|Y⟧.
⇒ ⟦X|Y,Z⟧* = ⟦X|Y⟧*.
Substituting into the RHS of (8) completes the proof. □


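The overlap partition ⟦X|Y⟧* used in the proofs above can be computed mechanically for finite ranges. The following is a minimal Python sketch (the sets and values are hypothetical examples, not taken from the slides): it merges the conditional ranges ⟦X|y⟧ into overlap-connected components and evaluates I*[X;Y] = log₂|⟦X|Y⟧*|.

```python
from math import log2

def overlap_partition(family):
    """Group the sets in `family` into overlap-connected components:
    two sets are linked if they intersect, and components are the
    transitive closure of that relation (the overlap partition)."""
    comps = []
    for s in map(set, family):
        hits = [c for c in comps if c & s]   # components that s overlaps
        merged = set(s)
        for c in hits:
            merged |= c
            comps.remove(c)
        comps.append(merged)
    return comps

# Hypothetical conditional ranges [[X|y1]], [[X|y2]], [[X|y3]]:
cond_ranges = [{0, 1}, {1, 2}, {4, 5}]
parts = overlap_partition(cond_ranges)   # {0,1,2} and {4,5}
I_star = log2(len(parts))                # I*[X;Y] = log2 |[[X|Y]]*|
print(sorted(map(sorted, parts)), I_star)
```

Since ⟦X|y1⟧ and ⟦X|y2⟧ share the point 1 they fall in one component, while {4, 5} is isolated, so two agents observing X and Y separately can agree on exactly one bit.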
Stationary Memoryless Uncertain Channels - Take 1

An uncertain signal X is a mapping from Ω to the space X^∞ of discrete-time sequences x = (x_i)_{i=0}^∞ in X.

A stationary memoryless uncertain channel may be defined in terms of
- input and output spaces X, Y;
- a set-valued transition function T: X → 2^Y;
- and the family of all uncertain input-output signal pairs (X,Y) s.t.

  ⟦Y_k | x_{0:k}, y_{0:k−1}⟧ = ⟦Y_k | x_k⟧ = T(x_k), k ∈ Z_{≥0}.   (9)

If the channel is 'used without feedback', then impose the extra constraint

  ⟦X_k | x_{0:k−1}, y_{0:k−1}⟧ = ⟦X_k | x_{0:k−1}⟧, k ∈ Z_{≥0},   (10)

on (X,Y).


Channel Noise?

The previous formulation parallels [Massey isit90] for stationary memoryless stochastic channels:

  f_{Y_k | X_{0:k}, Y_{0:k−1}}(y_k | x_{0:k}, y_{0:k−1}) = f_{Y_k | X_k}(y_k | x_k) ≡ q(y_k, x_k).

In many cases it is enough to think in terms of these conditional ranges, with channel noise left implicit.
However, it is often convenient to model channel noise explicitly, e.g.
- when the transmitter has access to some function of past channel noise, not just past channel outputs;
- or when the channel is part of a larger system, with other input and noise signals.
In such cases, the previous formulation would have to be changed to include the other terms in the conditioning arguments.


Channel as Noisy Function

Definition
A stationary memoryless uncertain channel (SMUC) consists of
- an unrelated, identically spread (uis) noise signal V = (V_k)_{k=0}^∞ taking values in a space V, i.e.

  ⟦V_k | v_{0:k−1}⟧ = ⟦V_k⟧ = V, ∀ v_{0:k−1} ∈ V^k, k ∈ Z_{≥0};   (11)

- input and output spaces X, Y, and a transition function t: X × V → Y;
- and the family G of all uncertain input-output signal pairs (X,Y) s.t. ∀ k ∈ Z_{≥0},
  - Y_k = t(X_k, V_k),
  - and X_{0:k} ⊥ V_k.

If the channel is used w/o feedback, then tighten the last condition so that X ⊥ V. This yields a smaller family G_nf ⊂ G.
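To connect the two formulations, the Take-1 set-valued map T(x) can be recovered from the noisy-function form as T(x) = { t(x,v) : v ∈ V }. A tiny Python sketch, with a hypothetical binary channel (the alphabets and the flip rule are illustrative assumptions, not from the slides):

```python
# Hypothetical finite alphabets: a binary channel whose noise symbol
# v = 1 flips the input bit, and v = 0 leaves it alone.
X_ALPHABET = (0, 1)
V_SET = (0, 1)            # uis noise range [[V_k]] = V

def t(x, v):
    """Transition function t: X x V -> Y of the SMUC."""
    return x ^ v          # noise v = 1 flips the bit

def transition_set(x):
    """Recover the Take-1 set-valued map T(x) = { t(x,v) : v in V }."""
    return {t(x, v) for v in V_SET}

for x in X_ALPHABET:
    print(x, transition_set(x))
```

Here every input can produce every output, so the two inputs are fully confusable and this channel supports no zero-error communication at all.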

Zero Error Coding in UV Framework (No Feedback)

[Block diagram: Message M → Coder → X_k → Channel (noise V_k) → Y_k → Decoder → M]

A zero-error code w/o feedback is defined by
- a block length n+1 ∈ ℕ;
- a message cardinality μ ≥ 1;
- and an encoder mapping g: [1:μ] → X^{n+1}, s.t. for any M ⊥ V taking μ distinct values m_1, …, m_μ,
  - X_{0:n} = g(i) if M = m_i;
  - |⟦M | y_{0:n}⟧| = 1, ∀ y_{0:n} ∈ ⟦Y_{0:n}⟧.

The last condition is equivalent to the existence of a decoder that always maps Y_{0:n} ↦ M, despite channel noise.

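The decodability condition |⟦M|y_{0:n}⟧| = 1 is equivalent to the output ranges ⟦Y_{0:n} | X_{0:n} = g(i)⟧ being pairwise disjoint, which is directly checkable for finite alphabets. A minimal sketch, assuming a hypothetical 3-input channel given by its per-symbol transition sets T(x):

```python
from itertools import product

# Hypothetical per-symbol transition sets T(x) of a 3-input channel:
T = {0: {0, 1}, 1: {1, 2}, 2: {3}}

def output_range(codeword):
    """[[Y_0:n | X_0:n = codeword]]: all output sequences that the
    channel noise can produce from this input sequence."""
    return set(product(*(T[x] for x in codeword)))

def is_zero_error(encoder):
    """A code is zero-error iff the output ranges of distinct
    messages are pairwise disjoint, so Y_0:n pins down M."""
    ranges = [output_range(cw) for cw in encoder]
    return all(ranges[i].isdisjoint(ranges[j])
               for i in range(len(ranges))
               for j in range(i + 1, len(ranges)))

print(is_zero_error([(0,), (1,)]))   # inputs 0 and 1 share output 1
print(is_zero_error([(0,), (2,)]))   # {0,1} vs {3}: decodable
```

With block length 1, inputs 0 and 1 are confusable through the shared output 1, but {0, 2} gives a zero-error code of rate 1 bit per use.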

Zero Error Capacity and I*

Zero-error capacity C_0 is defined operationally, as the highest block-coding rate that yields zero errors:

  C_0 := sup_{n,μ∈ℕ, g} (log_2 μ)/(n+1) = lim_{n→∞} sup_{μ∈ℕ, g} (log_2 μ)/(n+1).   (12)

Theorem (after N. TAC13)

  C_0 = sup_{n≥0, (X,Y)∈G_nf} I*[X_{0:n}; Y_{0:n}]/(n+1) = lim_{n→∞} ( sup_{(X,Y)∈G_nf} I*[X_{0:n}; Y_{0:n}]/(n+1) ).   (13)

In [Wolf-Wullschleger itw04], C_0 was characterised as the largest Shannon entropy rate of the maximal rv Z_n common to discrete X_{0:n}, Y_{0:n}.
The key idea here is similar, but nonstochastic and applicable to continuous-valued X, Y.

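For finite alphabets the supremum in (12) can be brute-forced at small block lengths: the largest zero-error message count μ is the maximum number of pairwise non-confusable input blocks. The sketch below uses Shannon's pentagon channel C₅ as the worked example (a standard illustration, not from the slides), where input i can yield output i or i+1 (mod 5):

```python
from itertools import combinations, product
from math import log2

# Shannon's pentagon channel: per-symbol transition sets T(x).
T = {i: {i, (i + 1) % 5} for i in range(5)}

def confusable(a, b):
    # Two input blocks share a possible output sequence iff their
    # per-symbol output sets all intersect.
    return all(T[x] & T[y] for x, y in zip(a, b))

def alpha(n):
    """Max number of pairwise non-confusable length-n blocks, i.e.
    the largest zero-error message count mu at block length n."""
    blocks = list(product(range(5), repeat=n))
    best = 1
    for size in range(2, len(blocks) + 1):
        if any(all(not confusable(a, b) for a, b in combinations(c, 2))
               for c in combinations(blocks, size)):
            best = size
        else:
            break
    return best

print(alpha(1), log2(alpha(1)) / 1)   # 2 codewords: rate 1 bit/use
print(alpha(2), log2(alpha(2)) / 2)   # 5 codewords: rate log2(5)/2
```

Block length 2 already beats single-letter coding (log₂5/2 ≈ 1.161 > 1), which is why (12) and (13) need the supremum over n.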

Proof: ≥ (Construct a Code)

Pick any (X,Y) ∈ G_nf, n ∈ ℕ. Let

  μ = |⟦X_{0:n}; Y_{0:n}⟧*| ≡ |⟦Y_{0:n}; X_{0:n}⟧*|,

and index the overlap partition sets:

  ⟦X_{0:n} | Y_{0:n}⟧* ≡ {P_X(z) : z ∈ [1:μ]},   (14)
  ⟦Y_{0:n} | X_{0:n}⟧* ≡ {P_Y(z) : z ∈ [1:μ]}.   (15)

Define the uv Z as the unique index s.t. P_X(Z) ∋ X_{0:n}.
This is also the unique index s.t. P_Y(Z) ∋ Y_{0:n}.
For each z ∈ [1:μ], pick an input sequence x(z) ∈ P_X(z) ⊆ ⟦X_{0:n}⟧ and define the coder map

  g(z) = x(z) ∈ ⟦X_{0:n}⟧, ∀ z ∈ [1:μ].


Proof: ≥ (cont.)

Now consider any message M ⊥ V that can take μ distinct values m_1, …, m_μ. Encode this message to give an input uv sequence

  X′_{0:n} = x(i) if M = m_i.

This yields an output sequence Y′_{0:n}, where

  Y′_k = t(X′_k, V_k), k ∈ [0:n].

As M and X_{0:n} are each ⊥ V, it follows that if M = m_i then

  ⟦Y′_{0:n} | X′_{0:n} = x(i)⟧ = ⟦Y_{0:n} | X_{0:n} = x(i)⟧ ⊆ P_Y(i).

The sets P_Y(1), …, P_Y(μ) are disjoint since they form a partition.
⇒ The message M can be recovered from Y′_{0:n} with this code.


Proof: ≥ (cont.)

Thus

  C_0 ≥ (log_2 μ)/(n+1) = (log_2 |⟦X_{0:n}|Y_{0:n}⟧*|)/(n+1) = I*[X_{0:n}; Y_{0:n}]/(n+1).

As (X,Y) ∈ G_nf and n ∈ ℕ were arbitrary,

  C_0 ≥ sup_{n≥0, (X,Y)∈G_nf} I*[X_{0:n}; Y_{0:n}]/(n+1).


Proof: ≤ (Construct (X,Y) ∈ G_nf)

Select an arbitrary zero-error code (n, μ, g).
Pick a message uv M ⊥ V taking distinct values m_1, …, m_μ.
Set

  X_{0:n} = g(i) if M = m_i,
  X_k = X_n, k > n,
  Y_k = t(X_k, V_k), k ∈ Z_{≥0}.

As X_{0:n} is a function of M ⊥ V, it follows that X ⊥ V.
Thus (X,Y) ∈ G_nf.

Proof: ≤ (cont.)

By the zero-error property, the sets ⟦Y_{0:n} | X_{0:n} = g(i)⟧, i = 1, …, μ, are disjoint, and therefore distinct.
Thus each partition set in ⟦Y_{0:n} | X_{0:n}⟧* contains exactly one of these sets:
- It includes at least one set ⟦Y_{0:n} | x_{0:n}⟧.
- If it included more than one such set then, by definition of the overlap partition, they would have overlaps, which is impossible.

⇒ μ = |⟦Y_{0:n} | X_{0:n}⟧*|.

Proof: ≤ (cont.)

Thus

  (log_2 μ)/(n+1) = (log_2 |⟦Y_{0:n}|X_{0:n}⟧*|)/(n+1) ≤ sup_{n≥0, (X,Y)∈G_nf} I*[X_{0:n}; Y_{0:n}]/(n+1).

As the zero-error code (n, μ, g) was arbitrary, we can take a supremum on the LHS to get

  C_0 ≤ sup_{n≥0, (X,Y)∈G_nf} I*[X_{0:n}; Y_{0:n}]/(n+1).

Conditional Maximin Information

Let T[X;Y|w] := the taxicab partition of the conditional joint range ⟦X,Y|w⟧, given W = w.
Then define conditional nonstochastic information

  I*[X;Y|W] := min_{w∈⟦W⟧} log_2 |T[X;Y|w]|

= the log-cardinality of the most refined variable common to (X,W) and (Y,W) but unrelated to W.
I.e. if two agents each observe X, Y separately but also share W, then I*[X;Y|W] captures the most refined variable that is 'new' with respect to W and on which they can both agree.

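For finite ranges the taxicab partition is computable the same way as the overlap partition, except that two points of the joint range are linked when they share an x- or a y-coordinate. A small sketch (the conditional joint ranges below are hypothetical examples, not from the slides):

```python
from math import log2

def taxicab_partition(joint_range):
    """Split a finite joint range of (x, y) pairs into taxicab-connected
    components: points are linked if they share an x- or y-value."""
    comps = []
    for p in joint_range:
        hits = [c for c in comps
                if any(p[0] == q[0] or p[1] == q[1] for q in c)]
        merged = {p}
        for c in hits:
            merged |= c
            comps.remove(c)
        comps.append(merged)
    return comps

def conditional_I_star(ranges_by_w):
    """I*[X;Y|W] = min over w of log2 |T[X;Y|w]|."""
    return min(log2(len(taxicab_partition(r)))
               for r in ranges_by_w.values())

# Hypothetical conditional joint ranges [[X,Y|w]]:
ranges = {
    "w1": {(0, 0), (1, 1), (2, 2), (3, 3)},   # 4 isolated points
    "w2": {(0, 0), (0, 1), (1, 1), (2, 2)},   # chained via x=0, y=1
}
print(conditional_I_star(ranges))   # min(log2 4, log2 2) = 1.0
```

The minimum over w reflects the worst case: the agents can only agree on a variable that is resolvable for every realisation of the shared W.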

Zero Error Coding with Feedback

[Block diagram: Message M → Coder → X_k → Channel (noise V_k) → Y_k → Decoder → M, with Y_{k−1} fed back to the coder through a unit delay]

A zero-error code with feedback is defined by
- a block length n+1 ∈ ℕ;
- a message cardinality μ ≥ 1;
- and a sequence g_{0:n} of encoder mappings s.t. for any message M ⊥ V taking values m_1, …, m_μ,
  - X_k = g_k(i, Y_{0:k−1}) if M = m_i;
  - |⟦M | y_{0:n}⟧| = 1, ∀ y_{0:n} ∈ ⟦Y_{0:n}⟧.

The last condition is equivalent to the existence of a decoder that can reconstruct M from Y_{0:n} without error.

C_0f and Directed Nonstochastic Information

Zero-error feedback capacity C_0f is defined operationally, as the highest feedback coding rate that yields zero errors:

  C_0f := sup_{n,μ∈ℕ, g_{0:n}} (log_2 μ)/(n+1) = lim_{n→∞} sup_{μ∈ℕ, g_{0:n}} (log_2 μ)/(n+1).   (16)

= the growth rate of the maximum cardinality of sets of feedback coding functions that can be unambiguously determined from channel outputs.
Define directed nonstochastic information

  I*[X_{0:n} → Y_{0:n}] := Σ_{k=0}^n I*[X_{0:k}; Y_k | Y_{0:k−1}].

C_0f in terms of Directed Nonstochastic Information

Theorem (N. cdc12)
For a stationary memoryless uncertain channel,

  C_0f = sup_{n≥0, (X,Y)∈G} I*[X_{0:n} → Y_{0:n}]/(n+1).

This parallels the characterisation in [Kim TIT08, Tatikonda-Mitter TIT09] of the ordinary feedback capacity C_f of stochastic channels:

  C_f = sup_{n≥0, p_{X_k | X_{0:k−1}, Y_{0:k−1}}, 0≤k≤n} I[X_{0:n} → Y_{0:n}]/(n+1),

where the Marko-Massey directed information I[X_{0:n} → Y_{0:n}] := Σ_{k=0}^n I[X_{0:k}; Y_k | Y_{0:k−1}], and the conditional information I[X;Y|Z] := H[X|Z] − H[X|Y,Z].

LTI State Estimation over Noisy Channels

[Block diagram: plant X_{k+1} = A X_k + B U_k + V_k, Y_k = G X_k + W_k, with plant noises V_k, W_k; the measurement Y_k enters a Quantiser/Coder producing S_k, which passes through the Channel to give Q_k, which a Decoder/Estimator maps to the estimate X̂_k.]

LTI State Estimation - Disturbance-Free

Plant: LTI, noiseless, zero input:

  X_{k+1} = A X_k, Y_k = G X_k, X_0 a uv.

Coder: Y_{0:k} ↦ S_k.
Channel: stationary and memoryless, Q_k = t(S_k, Z_k), where Z = channel noise.
Estimator: Q_{0:k} ↦ X̂_{k+1}.
Objective: uniform ρ-exponential convergence from an ℓ-ball. I.e. given ρ, ℓ > 0, construct a coder-estimator s.t. for any uv X_0 with ⟦X_0⟧ ⊆ B_ℓ(0),

  lim_{k→∞} sup_{ω∈Ω} ρ^{−k} ‖X_k − X̂_k‖ = 0.


Disturbance-Free State Estimation and C_0

Assumptions:
DF1: A has one or more eigenvalues with magnitude > ρ.
DF2: (G, A_ρ) is observable, where A_ρ := A restricted to the eigenspace governed by eigenvalues of magnitude ≥ ρ.
DF3: X_0 ⊥ Z.

Theorem (N. TAC13)
If uniform ρ-exponential convergence is achieved from some ℓ-ball, then

  C_0 ≥ Σ_{|λ_i|≥ρ} log_2 (|λ_i|/ρ).   (17)

Conversely, if (17) holds strictly, then for any ℓ > 0, a coder-estimator that achieves uniform ρ-exponential convergence from B_ℓ(0) can be constructed.

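The right-hand side of (17) is a purely numerical quantity once the plant eigenvalues are known. A minimal sketch computing it (the eigenvalues below are hypothetical; for a general matrix A one would first compute its spectrum):

```python
from math import log2

def intrinsic_rate(eigenvalues, rho):
    """RHS of (17): sum of log2(|lambda_i|/rho) over the eigenvalues
    of A with |lambda_i| >= rho. The channel zero-error capacity C_0
    must be at least this for uniform rho-exponential estimation."""
    return sum(log2(abs(lam) / rho)
               for lam in eigenvalues if abs(lam) >= rho)

# Hypothetical plant with modes 2, 4 and 0.5; target decay rho = 1:
print(intrinsic_rate([2.0, 4.0, 0.5], 1.0))   # log2 2 + log2 4 = 3.0
```

The stable mode 0.5 contributes nothing: only modes expanding faster than the target rate ρ generate information that must cross the channel.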

Necessity Argument - Scalar Case

Pick an arbitrarily large τ ∈ ℕ and a small ε ∈ (0, 1 − ρ/|λ|).
Divide [−ℓ, ℓ] into

  κ := ⌊ |(1−ε)λ/ρ|^τ ⌋ − 1

equal intervals of length 2ℓ/κ.
Inside each interval construct a centred subinterval I(s) of shorter length ℓ/κ. Define the subinterval family

  H := {I(s) : s = 1, …, κ},   (18)

noting that the subintervals in H are separated by gaps ≥ ℓ/κ.
Set the initial state range ⟦X_0⟧ = ∪_{H∈H} H ⊂ [−ℓ, ℓ].
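The packing in (18) is easy to make concrete. A sketch with hypothetical parameter values (λ = 2, ρ = 1, ℓ = 1, τ = 3, ε = 0.1, chosen only for illustration) that builds the κ centred subintervals and confirms the ℓ/κ gap:

```python
from math import floor

def subinterval_family(lam, rho, ell, tau, eps):
    """Construct the family H of (18): kappa centred subintervals of
    length ell/kappa inside equal cells of [-ell, ell], pairwise
    separated by gaps >= ell/kappa."""
    kappa = floor(abs((1 - eps) * lam / rho) ** tau) - 1
    cell = 2 * ell / kappa     # width of each equal interval
    half = ell / (2 * kappa)   # half-width of each centred subinterval
    return kappa, [(-ell + (s + 0.5) * cell - half,
                    -ell + (s + 0.5) * cell + half)
                   for s in range(kappa)]

kappa, H = subinterval_family(lam=2.0, rho=1.0, ell=1.0, tau=3, eps=0.1)
gaps = [H[s + 1][0] - H[s][1] for s in range(kappa - 1)]
print(kappa, min(gaps) >= 1.0 / kappa - 1e-12)
```

Here κ = ⌊1.8³⌋ − 1 = 4, and each neighbouring pair of subintervals is indeed separated by exactly ℓ/κ, which is what later forces the conditional ranges to disconnect the family.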

Necessity Argument - Scalar Case (cont.)

Let E_k := X_k − X̂_k. By hypothesis, ∃ φ > 0 s.t.

  φρ^k ≥ sup ⟦|E_k|⟧ ≥ 0.5 diam ⟦E_k⟧   (19)
        ≥ 0.5 diam ⟦E_k | q_{0:k−1}⟧   (20)
        = 0.5 diam ⟦λ^k X_0 − h_k(q_{0:k−1}) | q_{0:k−1}⟧
        = 0.5 diam ⟦λ^k X_0 | q_{0:k−1}⟧   (21)
        = 0.5 |λ|^k diam ⟦X_0 | q_{0:k−1}⟧.   (22)


Necessity Argument - Scalar Case (cont.)

Next show that for large t, no two sets in ℋ (18) can be ⟦X_0|Q_{0:t−1}⟧-overlap-connected.
Suppose in contradiction that ∃ H ∈ ℋ that is ⟦X_0|Q_{0:t−1}⟧-overlap-connected with another set in ℋ.
⇒ ∃ a set ⟦X_0|q_{0:t−1}⟧ containing both a point u ∈ H and a point v in some H′ ∈ ℋ \ {H}
⇒

    |u − v| ≤ diam⟦X_0|q_{0:t−1}⟧ ≤ 2φρ^t / |λ|^t,   by (22).          (23)

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 53 / 68


Necessity Argument - Scalar Case (cont.)

However, any two sets in ℋ are separated by a distance of at least ℓ/κ. So

    |u − v| ≥ ℓ/κ = ℓ / ⌊((1−ε)|λ|/ρ)^t⌋ ≥ ℓ / ((1−ε)|λ|/ρ)^t = ℓρ^t / ((1−ε)|λ|)^t.

The RHS of this would exceed the RHS of (23) when t is sufficiently large that (1/(1−ε))^t > 2φ/ℓ, yielding a contradiction.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 54 / 68
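The contradiction can be checked with concrete numbers. A small Python sketch, where λ, ρ, ε, φ and ℓ are all assumed values chosen only to illustrate the threshold (1/(1−ε))^t > 2φ/ℓ:

```python
import math

# Sanity check of the contradiction step: the separation lower bound
# l*rho^t / ((1-eps)*|lam|)^t must eventually exceed the upper bound
# 2*phi*rho^t / |lam|^t from (23).  All constants are illustrative.
lam, rho, eps, phi, l = 2.0, 1.2, 0.1, 5.0, 1.0

def lower(t):   # separation between distinct sets in the family H
    return l * rho**t / ((1 - eps) * abs(lam))**t

def upper(t):   # bound (23) on |u - v| within one conditional range
    return 2 * phi * rho**t / abs(lam)**t

# The threshold from the slide: (1/(1-eps))^t > 2*phi/l.
t_star = math.ceil(math.log(2 * phi / l) / math.log(1 / (1 - eps)))
assert lower(t_star) > upper(t_star)           # contradiction reached
assert lower(t_star - 1) <= upper(t_star - 1)  # ...and not just before
```

Both bounds share the factor ρ^t/|λ|^t, so their ratio is (ℓ/2φ)(1−ε)^{−t}, which grows without bound in t; that is the whole content of the threshold condition.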

Necessity Argument - Scalar Case (cont.)

So for large enough t, no two sets of ℋ are ⟦X_0|Q_{0:t−1}⟧-overlap-connected. So

    2^{I∗[X_0;Q_{0:t−1}]} ≡ |⟦X_0|Q_{0:t−1}⟧∗| ≥ |ℋ| = ⌊((1−ε)|λ|/ρ)^t⌋ ≥ 0.5 ((1−ε)|λ|/ρ)^t,   (24)

since ⌊x⌋ > x/2 for every x ≥ 1.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 55 / 68


Necessity Argument - Scalar Case (cont.)

But X_0 → S_{0:t−1} → Q_{0:t−1} is a Markov uncertainty chain, so

    I∗[X_0;Q_{0:t−1}] ≤ I∗[S_{0:t−1};Q_{0:t−1}] ≤ tC_0.

Substitute into the LHS of (24), take logarithms and divide by t to get

    C_0 ≥ log₂(1−ε) + log₂|λ/ρ| − 1/t.

Letting t → ∞ yields

    C_0 ≥ log₂(1−ε) + log₂|λ/ρ|.

As ε can be made arbitrarily small, we are done. □

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 56 / 68
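The two limiting steps are easy to verify numerically. A sketch with assumed values λ = 2, ρ = 1.2:

```python
import math

# The finite-t bound: C0 >= log2(1-eps) + log2|lam/rho| - 1/t.
# Letting t -> inf and then eps -> 0 recovers C0 >= log2|lam/rho|.
# The constants lam and rho are illustrative.
lam, rho = 2.0, 1.2

def bound(eps, t):
    return math.log2(1 - eps) + math.log2(abs(lam / rho)) - 1 / t

target = math.log2(abs(lam / rho))   # the final capacity bound, in bits

# The bound increases in t towards its eps-limit from below...
assert bound(0.01, 10) < bound(0.01, 1000) < math.log2(0.99) + target
# ...and shrinking eps pushes it arbitrarily close to log2|lam/rho|.
assert abs(bound(1e-9, 10**9) - target) < 1e-6
```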


State Estimation with Plant Disturbances

Plant: LTI, X_{k+1} = A X_k + V_k, Y_k = G X_k + W_k.
Coder: Y_{0:k} ↦ S_k.
Channel: stationary and memoryless, Q_k = τ(S_k, Z_k), where Z = channel noise.
Estimator: Q_{0:k} ↦ X̂_{k+1}.
Objective: uniformly bounded estimation errors beginning from an ℓ-ball, i.e. given ℓ > 0, construct a coder-estimator s.t. for any initial state X_0 with ⟦X_0⟧ ⊆ B_ℓ(0),

    sup_{k ∈ Z≥0, ω ∈ Ω} ‖X_k − X̂_k‖ < ∞.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 57 / 68
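The coder-estimator loop above can be illustrated for a scalar plant over a noiseless R-bit channel. This is a toy sketch under assumed constants (A = 2, R = 2, ℓ = 1, disturbance bound 0.05), not the construction used in the paper; it only shows why the rate must dominate log₂|A| for the error to stay bounded:

```python
# A minimal scalar coder-estimator over a NOISELESS R-bit channel.
# Both coder and estimator track a box of radius `delta` guaranteed to
# contain the innovation; each sample the box shrinks by A/2**R and
# grows by the disturbance bound, so it stays bounded iff 2**R > |A|.
import random

A, R = 2.0, 2          # plant pole and bits per sample; here 2**R > |A|
l, vbar = 1.0, 0.05    # initial-ball radius and disturbance bound

random.seed(0)
x, xhat, delta = random.uniform(-l, l), 0.0, l
errs = []
for k in range(200):
    # Coder: quantise the innovation x - xhat into one of 2**R cells.
    cell = max(0, min(int((x - xhat + delta) / (2 * delta) * 2**R), 2**R - 1))
    # Estimator: decode the cell centre (the channel is noiseless here).
    xhat += -delta + (2 * cell + 1) * delta / 2**R
    errs.append(abs(x - xhat))
    # Plant propagates with a bounded disturbance; both sides update delta.
    v = random.uniform(-vbar, vbar)
    x, xhat = A * x + v, A * xhat
    delta = A * delta / 2**R + vbar

assert max(errs) <= l        # errors stay uniformly bounded
assert errs[-1] <= 0.05      # ...and settle near vbar / (1 - A/2**R) / 2**R
```

With R = 1 the box radius recursion delta ← A·delta/2 + vbar no longer contracts for A = 2, and the error bound diverges, matching the necessity direction of the theorems that follow.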


Estimation with Disturbances and C_0

Assumptions:
D1: A has one or more eigenvalues with magnitude ≥ 1.
D2: (G, A_1) is observable, where A_1 := A restricted to the eigenspace governed by eigenvalues of magnitude ≥ 1.
D3: ⟦V_k⟧ and ⟦W_k⟧ are uniformly bounded over k.
D4: X_0, V, W and Z are mutually unrelated.
D5: The zero-noise sequence pair (v, w) = (0, 0) is valid, i.e. (0, 0) ∈ ⟦V, W⟧.

Theorem (N. TAC'13)
If uniformly bounded estimation errors are achieved from some ℓ-ball, then

    C_0 ≥ Σ_{|λ_i| ≥ 1} log₂ |λ_i|.                                    (25)

Conversely, if (25) holds strictly, then for any ℓ > 0, a coder-estimator that achieves uniformly bounded estimation errors from B_ℓ(0) can be constructed.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 58 / 68
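Evaluating the right side of (25) is a direct computation on the spectrum of A. A sketch for an illustrative 2×2 plant (the matrix, and the helper names `eig2` and `min_capacity`, are this sketch's own, not from the slides):

```python
import cmath
import math

def eig2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] via the characteristic polynomial."""
    tr, det = a + d, a * d - b * c
    s = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + s) / 2, (tr - s) / 2)

def min_capacity(eigs):
    """Right side of (25): bits/sample of zero-error capacity required."""
    return sum(math.log2(abs(lam)) for lam in eigs if abs(lam) >= 1)

# A = [[2, 1], [0, 0.5]] has eigenvalues 2 (unstable) and 0.5 (stable);
# only the unstable mode contributes log2(2) = 1 bit per sample.
eigs = eig2(2.0, 1.0, 0.0, 0.5)
assert math.isclose(min_capacity(eigs), 1.0)
```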


Control over Noisy Channels

[Block diagram: plant X_{k+1} = A X_k + B U_k + V_k, Y_k = G X_k + W_k, driven by noise (V_k, W_k); a quantiser/coder maps Y_k to channel symbols S_k; the channel delivers Q_k to a decoder/controller, which closes the loop with U_k.]

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 59 / 68

LTI Control - Disturbance-Free

Plant: LTI, noiseless:

    X_{k+1} = A X_k + B U_k, Y_k = G X_k, X_0 a uv.

Coder: Y_{0:k} ↦ S_k.
Channel: stationary and memoryless, Q_k = τ(S_k, Z_k), where Z = channel noise.
Controller: Q_{0:k} ↦ U_k.
Objective: uniform ρ-exponential stability on an ℓ-ball, i.e. given ρ, ℓ > 0, construct a coder-controller s.t. for any uv X_0 with ⟦X_0⟧ ⊆ B_ℓ(0),

    lim_{k→∞} sup_{ω ∈ Ω} ρ^{−k} ‖X_k‖ = 0.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 60 / 68


Disturbance-Free Control and C_0f

Assumptions:
DF1: A has one or more eigenvalues with magnitude > ρ.
DF2: (G, A_ρ) is observable and (A_ρ, B) is controllable, where A_ρ := A restricted to the eigenspace governed by eigenvalues of magnitude ≥ ρ.
DF3: X_0 ⊥ Z.

Proposition
If uniform ρ-exponential stability is achieved on some ℓ-ball, then

    C_0f ≥ Σ_{|λ_i| ≥ ρ} log₂(|λ_i| / ρ).                              (26)

Conversely, if (26) holds strictly, then for any ℓ > 0, a coder-controller that achieves uniform ρ-exponential stability on B_ℓ(0) can be constructed.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 61 / 68
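Bound (26) discounts each unstable mode by the target decay rate ρ, so demanding faster decay costs rate. A sketch with illustrative eigenvalues (the helper name `min_feedback_capacity` is this sketch's own):

```python
import math

def min_feedback_capacity(eigs, rho):
    """Right side of (26): bits/sample of zero-error feedback capacity."""
    return sum(math.log2(abs(lam) / rho) for lam in eigs if abs(lam) >= rho)

eigs = (2.0, 0.5)   # illustrative spectrum: one unstable, one stable mode

# Mere boundedness (rho = 1) needs 1 bit/sample; demanding decay at
# rho = 0.5 makes both modes count and doubles the requirement.
assert math.isclose(min_feedback_capacity(eigs, 1.0), 1.0)
assert math.isclose(min_feedback_capacity(eigs, 0.5), 2.0)
```

At ρ = 1 the formula collapses to the bound (25) for uniformly bounded behaviour, which is consistent with the theorems on the neighbouring slides.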


Control with Plant Disturbances

Plant: LTI, X_{k+1} = A X_k + B U_k + V_k, Y_k = G X_k + W_k.
Coder: Y_{0:k} ↦ S_k.
Channel: stationary and memoryless, Q_k = τ(S_k, Z_k), where Z = channel noise.
Controller: Q_{0:k} ↦ U_k.
Objective: uniformly bounded states beginning from an ℓ-ball, i.e. given ℓ > 0, construct a coder-controller s.t. for any initial state X_0 with ⟦X_0⟧ ⊆ B_ℓ(0),

    sup_{k ∈ Z≥0, ω ∈ Ω} ‖X_k‖ < ∞.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 62 / 68


Control with Disturbances and C_0f

Assumptions:
D1: A has one or more eigenvalues with magnitude ≥ 1.
D2: (G, A_1) is observable and (A_1, B) is controllable, where A_1 := A restricted to the eigenspace governed by eigenvalues of magnitude ≥ 1.
D3: ⟦V_k⟧ and ⟦W_k⟧ are uniformly bounded over k.
D4: X_0, V, W and Z are mutually unrelated.
D5: The zero-noise sequence pair (v, w) = (0, 0) is valid, i.e. (0, 0) ∈ ⟦V, W⟧.

Theorem (N. CDC'12)
If uniformly bounded states are achieved from some ℓ-ball, then

    C_0f ≥ Σ_{|λ_i| ≥ 1} log₂ |λ_i|.                                   (27)

Conversely, if (27) holds strictly, then for any ℓ > 0, a coder-controller that achieves uniformly bounded states from B_ℓ(0) can be constructed.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 63 / 68


What Does That Add to [Matveev-Savkin IJC07]?

Matveev and Savkin considered similar estimation and control problems.
Mixed formulation: plant noise was a nonstochastic, bounded disturbance, while the initial state and channel were stochastic and independent.
The aim was to achieve a.s. boundedness for any plant noise.
Their proof of necessity used the randomness of the initial state and channel to apply a law of large numbers. No information theory.
Here, necessity is proved using data processing on Markov uncertainty chains, and by analysing I∗ and directed I∗. No statistical assumptions.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 64 / 68


Summary

This talk described:
A nonstochastic theory of uncertainty and information, without assuming a probability space.
Intrinsic characterisations of the operational zero-error capacity and zero-error feedback capacity for stationary memoryless channels.
An information-theoretic basis for analysing worst-case networked estimation/control with bounded noise.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 65 / 68

Outlook

The theory is still far from mature!
Tractable algorithms to estimate C_0 (perhaps Monte Carlo)?
Disturbances with bounded energy or time-averages?
C_0f for channels with memory?
Zero-error feedback capacity with imperfect channel feedback?
Multi-agent systems...?

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 66 / 68

References

C.E. Shannon, "A mathematical theory of communication", Bell Syst. Tech. Jour., vol. 27, pp. 379–423, 623–56, 1948.

G.N. Nair, "A nonstochastic information theory for communication and state estimation", IEEE Trans. Automatic Control, vol. 58, no. 6, pp. 1497–510, 2013.

G.N. Nair, "A nonstochastic information theory for feedback", Proc. 51st IEEE Conf. Decision and Control, Maui, USA, pp. 1343–8, 2012.

J. Baillieul, "Feedback designs in information-based control", Stochastic Theory and Control: Proceedings of a Workshop held in Lawrence, Kansas, pp. 35–57, Springer, 2002.

S. Tatikonda and S. Mitter, "Control under communication constraints", IEEE Trans. Automatic Control, vol. 49, no. 7, pp. 1056–68, July 2004.

G.N. Nair and R.J. Evans, "Stabilizability of stochastic linear systems with finite feedback data rates", SIAM Jour. Control and Optimization, vol. 43, no. 2, pp. 413–36, July 2004.

A.S. Matveev and A.V. Savkin, "An analogue of Shannon information theory for detection and stabilization via noisy discrete communication channels", SIAM Jour. Control and Optimization, vol. 46, no. 4, pp. 1323–67, 2007.

J.H. Braslavsky, R.H. Middleton and J.S. Freudenberg, "Feedback stabilization over signal-to-noise ratio constrained channels", IEEE Trans. Automatic Control, vol. 52, no. 8, pp. 1391–403, 2007.

A. Sahai and S. Mitter, "The necessity and sufficiency of anytime capacity for stabilization of a linear system over a noisy communication link — part 1: scalar systems", IEEE Trans. Info. Theory, vol. 52, no. 8, pp. 3369–95, 2006.

A.S. Matveev and A.V. Savkin, "Shannon zero error capacity in the problems of state estimation and stabilization via noisy communication channels", Int. Jour. Control, vol. 80, pp. 241–55, 2007.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 67 / 68

References (cont.)

C.E. Shannon, "The lattice theory of information", Trans. IRE Professional Group on Info. Theory, vol. 1, no. 1, pp. 105–8, Feb. 1953.

C.E. Shannon, "The zero-error capacity of a noisy channel", IRE Trans. Info. Theory, vol. 2, pp. 8–19, 1956.

P. Gacs and J. Korner, "Common information is far less than mutual information", Problems of Control and Information Theory, vol. 2, no. 2, pp. 149–62, 1973.

S. Wolf and J. Wullschleger, "Zero-error information and applications in cryptography", in Proc. Info. Theory Workshop, San Antonio, USA, pp. 1–6, 2004.

J. Massey, "Causality, feedback and directed information", in Proc. Int. Symp. Info. Theory App., pp. 1–6, Nov. 1990.

Y.H. Kim, "A coding theorem for a class of stationary channels with feedback", IEEE Trans. Info. Theory, pp. 1488–99, 2008.

S. Tatikonda and S. Mitter, "The capacity of channels with feedback", IEEE Trans. Info. Theory, pp. 323–49, 2009.

G. Klir, Uncertainty and Information: Foundations of Generalized Information Theory, Wiley, 2006, ch. 2.

H. Shingin and Y. Ohta, "Disturbance rejection with information constraints: performance limitations of a scalar system for bounded and Gaussian disturbances", Automatica, vol. 48, no. 6, pp. 1111–6, 2012.

W.S. Wong and R.W. Brockett, "Systems with finite communication bandwidth constraints I", IEEE Trans. Automatic Control, vol. 42, pp. 1294–9, 1997.

W.S. Wong and R.W. Brockett, "Systems with finite communication bandwidth constraints II: stabilization with limited information feedback", IEEE Trans. Automatic Control, vol. 44, pp. 1049–53, 1999.

Nair (Uni. Melbourne) Nonstochastic Information DISC School 2015 68 / 68