Extremely non-normal numbers in dynamical systems


Anastasios Stylianou

School of Mathematics and Statistics

University of Saint Andrews

A thesis submitted for the degree of

Master of Mathematics

April 2018

Declaration

I certify that this project report has been written by me, is a record of work carried out by me, and is essentially different from work undertaken for any other purpose or assessment.

Anastasios Stylianou

Acknowledgements

Firstly, I would like to express my great appreciation to my supervisor, Professor Lars Olsen. His unfailing support and assistance throughout this work were invaluable. Working under his supervision helped me become a better mathematician.

Further, I am particularly grateful for the constant encouragement and support of my family: my parents, Andreas and Emily, who were always eager to learn about the progress of my thesis, and all my siblings, Matthew, Vicky, Iro and Andreas, whose useful comments and insights helped me throughout this project.

I would like to acknowledge the technical support from Melissa Iacovidou, and thank her for introducing me to LyX and assisting me with any LaTeX challenge I presented to her.

Last but not least, I would like to say a big thank you to my flatmate Kypros Papadopoulos, both for his direct help on this project through inspection and constructive conversations and, mainly, for his indirect support by motivating me with his hard-working presence.

Abstract

It was proved by Borel, more than a hundred years ago, that Lebesgue almost all real numbers are normal. Here, we present the topological viewpoint from which a typical number is not only non-normal, but fails to be normal in a spectacular way. We consider numbers from symbolic representations of dynamical systems and examine the sequences of their digit frequency vectors. Using regular averaging methods we try to smooth out any divergence from these sequences. However, we observe that even the averaged versions of these sequences of vectors still exhibit an extreme non-normal behaviour. In particular, we show that every shift invariant probability vector is an accumulation point of these sequences.

Contents

0 Introduction
    0.1 Thesis outline
    0.2 Notation index

1 Measure and Category
    1.1 Existence theorems
        1.1.1 Cantor's theorem
        1.1.2 Borel's theorem
        1.1.3 Baire category theorem
    1.2 Generic property
        1.2.1 Measure theoretic viewpoint
        1.2.2 Topological viewpoint
        1.2.3 A Paradox?

2 Symbolic Representation of Dynamical Systems
    2.1 Shifts
    2.2 Dynamical systems
    2.3 Markov Partitions and symbolic representations
        2.3.1 Uniqueness of symbolic representations

3 Summability Theory
    3.1 Divergent series
    3.2 Averaging methods
        3.2.1 History and definition
        3.2.2 Examples of averaging methods
        3.2.3 Main result for regular linear transformations

4 Normal Numbers
    4.1 Historical overview
    4.2 Normal numbers in dynamical systems
    4.3 Extremely non-normal numbers
    4.4 Statement of results

5 Typical numbers are extremely non-normal
    5.1 On infinite Markov Partitions
    5.2 On finite Markov Partitions
    5.3 Proof of the main result

References

A Regular linear transformations

B Accumulation points of block frequencies

Chapter 0

Introduction

0.1 Thesis outline

We start by stating the main result that we aim to reach by the end of this thesis.

Theorem. Let $k \geq 1$ be an integer and let $T \in \mathcal{T}$ be a positive and regular linear transformation. Furthermore, let $\mathcal{P} = \{P_i : i \in I\}$ be a Markov Partition for a dynamical system $(M, \varphi)$. Suppose that the generated shift space $X^+_{\mathcal{P},\varphi}$ is the one-sided full shift. Then the set $E^T_k$ is residual.

After reading this project, the reader should be comfortable with the statement of the theorem above. Each of the next four chapters focuses on a specific area of the background knowledge needed to comprehend the main result. In these four chapters we present a brief introduction to the following notions:

• Chapter 1: Generic property in measure theoretic and topological viewpoint

• Chapter 2: Shifts, dynamical systems and symbolic representations

• Chapter 3: Averaging methods

• Chapter 4: Normal numbers in symbolic representations of dynamical systems

The final and most important chapter contains the proof of the main result. Combining the general theory of Chapters 2 and 4, we will define normal and non-normal numbers in a symbolic representation of a dynamical system. Then, making use of the regular averaging methods that we will encounter in Chapter 3, we will define the set of extremely non-normal numbers. Finally, we will show that this set of numbers satisfies the topological version of the generic property discussed in Chapter 1.


0.2 Notation index

Notation                        Meaning

$\mathbb{N}$                    Set of natural numbers including zero
$\mathcal{E}$                   σ-algebra
$\mathcal{B}(X)$                Borel σ-algebra on $X$
$\lambda$                       Lebesgue measure
$\mathcal{A}$                   Alphabet, set of symbols
$\mathcal{A}^{\mathbb{N}}$      Set of sequences with terms in $\mathcal{A}$
$\mathcal{B}_n(X)$              Set of $n$-blocks occurring in words from $X$
$\mathcal{B}(X)$                Set of blocks occurring in words from $X$
$\mathcal{F}$                   Set of forbidden blocks
$X_{\mathcal{F}}$               Shift with set of forbidden blocks $\mathcal{F}$
$\mathcal{P}$                   Topological partition
$\lfloor x \rfloor$             Integral part of $x$
$\mathcal{T}$                   Set of "nice" linear transformations
$\mathcal{T}_{\mathbb{Q}}$      Set of "nice" linear transformations with rational entries
$N(x, b, n)$                    Number of occurrences of digit $b$ in the first $n$ digits of $x$
$N(x, b, n)$                    Number of occurrences of block $b$ in the first $n$ digits of $x$
$S$                             Alphabet, set of symbols
$P(w, b, n)$                    Frequency of block $b$ in the first $n$ digits of word $w$
$P_k(w, n)$                     Vector of frequencies of $k$-blocks in the first $n$ digits of word $w$
$[b]$                           Cylinder centered at the block $b$
$P^T_k(w, m)$                   $m$th averaged vector of frequencies of $k$-blocks in $w$
$A^T_k(w)$                      Set of accumulation points of $(P^T_k(w, m))_{m \in \mathbb{N}}$
$S_k$                           Simplex of shift invariant probability vectors
$E$                             Set of extremely non-normal numbers
$P(w, b)$                       Frequency of block $b$ in string $w$
$P_k(w)$                        Vector of frequencies of $k$-blocks in string $w$
$\gamma^*$                      $\gamma^* = \gamma\gamma\gamma\gamma\ldots$
$\operatorname{sgn}(x)$         Sign of $x$: $\operatorname{sgn}(x) = x/|x|$ if $x \neq 0$, and $\operatorname{sgn}(x) = 0$ if $x = 0$


Chapter 1

Measure and Category

1.1 Existence theorems

The aim of this chapter is to rigorously define "generic properties" of a set. Following the example of Oxtoby [10], we start with a brief introduction to the theorems of Cantor, Baire and Borel. These three theorems are called existence theorems. The motivation for this characterisation will be discussed later in this section.

1.1.1 Cantor's theorem

Both the concepts of measure and category are based on that of countability. Therefore, let us start by recalling the definition of a countable set.

Definition 1.1. A set $A$ is called countable if there exists a bijective function from $\mathbb{N}$ to $A$. Moreover, a set is called at most countable if it is either countable or finite.

Theorem 1.1 (Cantor). For any sequence $(a_n)_{n \in \mathbb{N}}$ of real numbers and for any non-empty interval $I \subseteq \mathbb{R}$ there exists a point $p$ in $I$ such that $p \neq a_n$ for every $n \in \mathbb{N}$.

Proof. See Figure 1.1 for the idea of the proof.

Cantor's theorem can be characterised as an existence theorem. The reason is that if the set of points not satisfying a specific property is a countable subset of an interval in the real line, then we can conclude immediately not only that there exist points in the interval satisfying the property in question but, in fact, that most of the interval's points, in the sense of cardinality, satisfy this property.


Figure 1.1: Diagonal argument

1.1.2 Borel's theorem

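Previously, we discussed the existence theorem of Cantor. In this section, we continue with Borel's existence theorem, which is connected with the notion of measure. As with the previous theorem, we start with a definition, that of a nullset.

Definition 1.2. A set $A \subseteq \mathbb{R}$ is said to be a nullset if for any $\varepsilon > 0$ there exists a countable family of intervals $I_n$ such that $A \subseteq \bigcup_n I_n$ and $\sum_n |I_n| < \varepsilon$. More generally, if $(X, \mathcal{E}, \mu)$ is a measure space then a subset $A \subseteq X$ is said to be a nullset if $\mu(A) = 0$.

Theorem 1.2 (Borel). If a finite or infinite sequence of intervals $I_n$ in $\mathbb{R}$ covers a non-empty interval $I$, then $|I| \leq \sum_n |I_n|$. More generally, if $(X, \mathcal{E}, \mu)$ is a measure space and $A \subseteq \bigcup_{i \in \mathbb{N}} B_i$ for measurable sets $A$ and $B_i$, $i \in \mathbb{N}$, then

\[ \mu(A) \leq \sum_{i \in \mathbb{N}} \mu(B_i). \]

Proof. Let $A \in \mathcal{E}$ and suppose that $\{B_i\}_{i \in \mathbb{N}} \subseteq \mathcal{E}$ is such that $A \subseteq \bigcup_{i \in \mathbb{N}} B_i$. Then, by countable subadditivity and monotonicity of $\mu$,

\[ \mu(A) = \mu\Bigl(A \cap \bigcup_{i \in \mathbb{N}} B_i\Bigr) = \mu\Bigl(\bigcup_{i \in \mathbb{N}} (B_i \cap A)\Bigr) \leq \sum_{i \in \mathbb{N}} \mu(B_i \cap A) \leq \sum_{i \in \mathbb{N}} \mu(B_i). \]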


Remark. Consider the measure space $(\mathbb{R}, \mathcal{B}(\mathbb{R}), \lambda)$. In this space, every countable set has measure zero, whereas any non-empty interval has strictly positive Lebesgue measure. This implies that for any sequence of reals and any non-empty interval in $\mathbb{R}$, there exists a point in the interval which does not occur in the sequence, which proves Cantor's theorem. Further, if the set of reals not satisfying a certain property is a nullset and contained in an interval, then we can conclude that there exist points in the interval satisfying this property. In fact, most of them, in the sense of measure, have this property.

1.1.3 Baire category theorem

Finally, we present Baire's theorem. Again, this theorem can be characterised as an existence theorem, this time linked with the notion of category.

Definition 1.3. Let $(X, d)$ be a metric space. A set $A$ is a dense subset of $X$ if any non-empty, open subset of $X$ has a non-empty intersection with $A$.

Definition 1.4. Let $(X, d)$ be a metric space. A set $A$ is called nowhere dense if it is not dense in any non-empty, open subset of $X$; equivalently, if for all $x \in X$ and all $r > 0$ there are $y \in X$ and $\rho > 0$ such that

\[ B(y, \rho) \subseteq B(x, r) \setminus A. \]

We can characterise a nowhere dense set as one that is "full of holes".

Definition 1.5. Let $(X, d)$ be a metric space. A set $A$ is said to be of first category or meagre if it can be represented as a countable union of nowhere dense sets. A subset of $X$ that cannot be represented in this way is said to be of second category.

These definitions were formulated in 1899 by Baire [2] in his PhD thesis; the following theorem is also due to him.

Theorem 1.3 (Baire's Category Theorem). In a complete metric space $(X, d)$, the intersection of any countable family of open and dense sets is dense in $X$. In particular, every non-empty complete metric space is of second category.

Proof. See Theorem 1.3 in Oxtoby [10].


Remark. Baire's Theorem provides another proof of Cantor's Theorem. This is because any closed interval equipped with the Euclidean metric is a complete metric space and hence of second category, whereas any countable set is of first category. Moreover, if the set of points not satisfying a certain property is of first category and contained in an interval, we can conclude not only that there exist points in the interval satisfying this property but that most of them, in the sense of category, do.

Definition 1.6. A subset $A$ of a metric space $(X, d)$ is called residual or co-meagre if its complement is meagre.

Many results in this thesis involve residual sets. Consequently, in the following proposition we present the main properties of such sets that we will use in our proofs in the next chapters.

Proposition 1.1 (Properties of residual sets). Let $(X, d)$ be a metric space. Then,

1. if $R$ is residual and $R \subseteq S \subseteq X$ then $S$ is residual,

2. if $\{R_n\}_{n=0}^{\infty}$ is a countable family of residual sets, then $\bigcap_{n \in \mathbb{N}} R_n$ is residual,

3. if $\{G_n\}_{n=0}^{\infty}$ is a countable family of open and dense sets, then $\bigcap_{n \in \mathbb{N}} G_n$ is residual.

Proof. (1): Let $R$ be a residual set, i.e. $X \setminus R$ is meagre, and suppose $R \subseteq S \subseteq X$. Then, looking at the complements, we have $X \setminus S \subseteq X \setminus R$, which implies that $X \setminus S$ is also meagre.

(2): Let $\{R_n\}_{n=0}^{\infty}$ be a countable family of residual sets. Then, for each $n \in \mathbb{N}$, $X \setminus R_n$ is meagre. Therefore, $X \setminus \bigcap_{n \in \mathbb{N}} R_n = \bigcup_{n \in \mathbb{N}} (X \setminus R_n)$ is a countable union of meagre sets and hence meagre.

(3): Assume $\{G_n\}_{n=0}^{\infty}$ is a countable family of open and dense sets. Notice that since

\[ X \setminus \bigcap_{n \in \mathbb{N}} G_n = \bigcup_{n \in \mathbb{N}} (X \setminus G_n) \]

by De Morgan's law, it suffices to show the following claim.

Claim. For $n \in \mathbb{N}$, $X \setminus G_n$ is nowhere dense.

Proof. Let $x \in X$ and $r > 0$. Since $G_n$ is dense we can find $y \in G_n \cap B(x, r)$. Since $G_n$ is also open, we can find $\rho > 0$ small enough so that $B(y, \rho) \subseteq B(x, r) \cap G_n = B(x, r) \setminus (X \setminus G_n)$.


1.2 Generic property

The three theorems mentioned above provide different notions of a "small" set: countable, nullset and meagre respectively, with the last two classes both containing the countable subsets of $\mathbb{R}$. A nowhere dense set is small in the intuitive geometric sense of being perforated with "holes", and a meagre set can be "approximated" by such a set. On the other hand, a nullset is small in the metric sense that it can be covered by a sequence of open sets of arbitrarily small total length. This justifies the next definitions, where the complements of these "small" sets define the generic property from a measure theoretic and a topological perspective, respectively.

1.2.1 Measure theoretic viewpoint

Definition 1.7. Let $(X, \mathcal{E}, \mu)$ be a measure space. We say that "a property $P$ holds $\mu$-almost everywhere in $X$" if the set $\{x \in X : x \text{ does not have } P\}$ is a nullset with respect to $\mu$.

Example. Countable sets, the Cantor set and lines in the plane are null subsets with respect to the Lebesgue measure $\lambda$. So, for example, $\lambda$-almost all points in the plane do not have rational coordinates and, for any $q \in \mathbb{Q}$, they do not lie on the line $x = \pi + q$.

1.2.2 Topological viewpoint

Definition 1.8. Let $(X, d)$ be a metric space. We say that "a typical element of $X$ has property $P$" if the set $\{x \in X : x \text{ does not satisfy } P\}$ is meagre.

Example. All countable subsets of $\mathbb{R}$ are meagre, in particular the rational numbers. The Cantor set is an example of an uncountable meagre subset of $\mathbb{R}$. Of course, as a space in its own right, the Cantor set equipped with the Euclidean metric is complete, hence a Baire space, and so not meagre in itself by Baire's Category Theorem. Hence, a typical real number is both irrational and not in the Cantor set.

It is natural to ask whether and how these classes of "small" sets are related. This is the subject of the next subsection.


1.2.3 A Paradox?

Theorem 1.4. The real numbers can be decomposed into two complementary sets such that one is of first category and the other is of measure zero.

Proof. Any of the examples given below produces such a decomposition.

A complete metric space should be thought of as a vast set: it is so big that any Cauchy sequence converges to a limit in the space. The theorem above shows that such a huge set can be decomposed into two disjoint "tiny" sets, each small in its own way. In other words, a set that is small in one sense can be big in another. There is no solid argument suggesting that these two notions of smallness should agree everywhere, since they were introduced in different areas of mathematics, and so there is nothing paradoxical about the result above. However, as we noted before, countable sets are "small" with respect to both viewpoints. The following examples from Oxtoby [10] present some cases where a set is big in one sense and small in the other.

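Example 1.1. Let $q_1, q_2, q_3, \ldots$ be an enumeration of $\mathbb{Q}$. Then, the set

\[ G = \bigcap_{j=1}^{\infty} \bigcup_{i=1}^{\infty} \left( q_i - \frac{1}{2^{i+j+1}},\; q_i + \frac{1}{2^{i+j+1}} \right) \]

is a nullset with respect to $(\mathbb{R}, \mathcal{B}(\mathbb{R}), \lambda)$ but also residual in $(\mathbb{R}, |\cdot|)$.

Observe that, for each $j \geq 1$, the set

\[ G_j = \bigcup_{i=1}^{\infty} \left( q_i - \frac{1}{2^{i+j+1}},\; q_i + \frac{1}{2^{i+j+1}} \right) \]

is open and dense in $\mathbb{R}$, since it is a union of open intervals and contains the rationals. Hence, $\{G_j\}_{j=1}^{\infty}$ is a countable family of open and dense sets, and so by Proposition 1.1(3) we get that $G = \bigcap_{j=1}^{\infty} G_j$ is residual.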

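On the other hand, it follows from Borel's theorem that each $G_j$ has finite Lebesgue measure:

\[ \lambda(G_j) = \lambda\left( \bigcup_{i=1}^{\infty} \left( q_i - \frac{1}{2^{i+j+1}},\; q_i + \frac{1}{2^{i+j+1}} \right) \right) \leq \sum_{i=1}^{\infty} \lambda\left( q_i - \frac{1}{2^{i+j+1}},\; q_i + \frac{1}{2^{i+j+1}} \right) = \sum_{i=1}^{\infty} \frac{1}{2^{i+j}} = \frac{1}{2^{j}} < \infty. \]

Then, since $G_1 \supseteq G_2 \supseteq G_3 \supseteq \ldots$, it follows by the continuity of measures that

\[ \lambda(G) = \lambda\left( \bigcap_{j=1}^{\infty} G_j \right) = \lim_{j \to \infty} \lambda(G_j) \leq \lim_{j \to \infty} \frac{1}{2^{j}} = 0, \]

proving that $G$ has Lebesgue measure zero.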
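The length estimate for the cover can be checked with exact rational arithmetic. The sketch below is only an illustration: the helper name `cover_length` and the truncation to finitely many intervals are assumptions made here, and the centres $q_i$ play no role because only the lengths $2^{-(i+j)}$ enter the bound.

```python
from fractions import Fraction


def cover_length(j, n_terms):
    """Total length of the first n_terms intervals used to build G_j."""
    # The i-th interval has radius 2^-(i+j+1), hence length 2^-(i+j).
    return sum(Fraction(1, 2 ** (i + j)) for i in range(1, n_terms + 1))


for j in (1, 2, 5):
    partial = cover_length(j, 50)
    # Every partial sum stays strictly below the bound 2^-j.
    assert partial < Fraction(1, 2 ** j)
    print(j, float(partial))
```

The partial sums equal $2^{-j}(1 - 2^{-n})$ for $n$ intervals, so they increase to $2^{-j}$, in line with the geometric series used in the estimate.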

A real number $z$ is called a Liouville number if $z$ is irrational and for each positive integer $n$ there exist integers $p$ and $q$ such that

\[ \left| z - \frac{p}{q} \right| < \frac{1}{q^{n}} \quad \text{and} \quad q > 1. \]
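To make the definition concrete, the following sketch checks the defining inequality for partial sums of Liouville's constant $\sum_{k \geq 1} 10^{-k!}$, the classical example of a Liouville number (a standard fact that is not taken from the text above). The function name `liouville_partial` and the truncation level `K` are illustrative assumptions. A truncated sum is rational and therefore not itself a Liouville number; the code only illustrates how quickly the rational approximations $p/q$ with $q = 10^{n!}$ converge.

```python
from fractions import Fraction
from math import factorial


def liouville_partial(n_terms):
    """Exact partial sum of 10^(-k!) for k = 1, ..., n_terms."""
    return sum(Fraction(1, 10 ** factorial(k)) for k in range(1, n_terms + 1))


K = 6  # truncation level, chosen only to keep the integers manageable
z = liouville_partial(K)
for n in range(1, K):
    approx = liouville_partial(n)  # equals p/q with q = 10^(n!)
    q = 10 ** factorial(n)
    assert approx.denominator == q
    # The approximation error is positive and smaller than 1/q^n.
    assert 0 < z - approx < Fraction(1, q ** n)
```

The inequality holds because the first omitted term is $10^{-(n+1)!} = q^{-(n+1)}$, which is much smaller than $q^{-n}$.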

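Example 1.2. A typical number in $(\mathbb{R}, |\cdot|)$ is a Liouville number, while $\mathscr{L}$, the set of Liouville numbers, is a nullset with respect to $(\mathbb{R}, \mathcal{B}(\mathbb{R}), \lambda)$.

For $n \geq 1$ write

\[ U_n = \bigcup_{q=2}^{\infty} \bigcup_{p=-\infty}^{\infty} \left\{ x \in \mathbb{R} : 0 < \left| x - \frac{p}{q} \right| < \frac{1}{q^{n}} \right\} = \bigcup_{q=2}^{\infty} \bigcup_{p=-\infty}^{\infty} \left( \frac{p}{q} - \frac{1}{q^{n}},\; \frac{p}{q} + \frac{1}{q^{n}} \right) \setminus \left\{ \frac{p}{q} \right\} \]

and observe firstly that the set $U_n$ is open and secondly that every rational number can be written as $p/q$ with $q \geq 2$ and is therefore a limit of points of $U_n$, so that $\mathbb{R} = \overline{\mathbb{Q}} \subseteq \overline{U_n}$. In particular, $U_n$ is dense in $\mathbb{R}$. Then $\{U_n\}_{n=1}^{\infty}$ is a countable family of open and dense sets, and Proposition 1.1(3) implies that $\bigcap_{n \geq 1} U_n$ is residual in $\mathbb{R}$. Set $\mathbb{I} := \mathbb{R} \setminus \mathbb{Q}$, which is residual because $\mathbb{Q}$ is countable and hence meagre, and notice that an element $z$ of the residual set $\mathbb{I} \cap \bigcap_{n \geq 1} U_n$ is irrational and satisfies $z \in U_n$ for each $n \geq 1$, which implies that there exist $p, q \in \mathbb{Z}$ such that $\left| z - \frac{p}{q} \right| < \frac{1}{q^{n}}$ and $q \geq 2$. This implies that $z$ is a Liouville number, i.e. $z \in \mathscr{L}$. This shows that $\mathbb{I} \cap \bigcap_{n \geq 1} U_n \subseteq \mathscr{L}$, and so by Proposition 1.1(1) $\mathscr{L}$ is residual.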

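On the other hand, reversing the argument above we can show the weaker statement that for $n \geq 3$ we have

\[ \mathscr{L} \subseteq \bigcup_{q=2}^{\infty} \bigcup_{p=-\infty}^{\infty} \left( \frac{p}{q} - \frac{1}{q^{n}},\; \frac{p}{q} + \frac{1}{q^{n}} \right). \]

Therefore, fixing $m \in \mathbb{N}$ we have

\[ \mathscr{L} \cap (-m, m) \subseteq \bigcup_{q=2}^{\infty} \bigcup_{p=-qm}^{qm} \left( \frac{p}{q} - \frac{1}{q^{n}},\; \frac{p}{q} + \frac{1}{q^{n}} \right), \]

and so, considering the Lebesgue measure of this set,

\[ \begin{aligned} \lambda(\mathscr{L} \cap (-m, m)) &\leq \sum_{q=2}^{\infty} \sum_{p=-mq}^{mq} \lambda\left( \left( \frac{p}{q} - \frac{1}{q^{n}},\; \frac{p}{q} + \frac{1}{q^{n}} \right) \right) = \sum_{q=2}^{\infty} \sum_{p=-mq}^{mq} \frac{2}{q^{n}} \\ &= \sum_{q=2}^{\infty} \frac{2(2mq + 1)}{q^{n}} \leq \sum_{q=2}^{\infty} \frac{4mq + q}{q^{n}} \leq (4m + 1) \sum_{q=2}^{\infty} \frac{1}{q^{n-1}} \\ &\leq (4m + 1) \int_{1}^{\infty} \frac{dq}{q^{n-1}} \leq \frac{4m + 1}{n - 2}. \end{aligned} \]

This holds for any $n \geq 3$ and therefore, since for fixed $m \in \mathbb{N}$ we have $\lim_{n \to \infty} \frac{4m+1}{n-2} = 0$, we conclude that $\lambda(\mathscr{L} \cap (-m, m)) = 0$. Since $\mathscr{L}$ is the countable union of the sets $\mathscr{L} \cap (-m, m)$, this implies that $\mathscr{L}$ is a nullset with respect to $(\mathbb{R}, \mathcal{B}(\mathbb{R}), \lambda)$.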

Another example is the set of normal numbers in $\mathbb{R}$: Lebesgue almost all numbers are normal, but a typical number is non-normal. In the next chapters, we will focus on this example and prove some extensions and generalisations of this result.


Chapter 2

Symbolic Representation of

Dynamical Systems

In this chapter we present a brief introduction to symbolic dynamics using the same

notation as Chapters 1 and 6 of Lind and Marcus [6]. Symbolic representations of

dynamical systems will provide a more general setup of a number system than the

decimal expansion of the real numbers. Informally, a partition of a compact metric

space will induce, under some assumptions, a numbering representation of the points

in our space. This symbolic representation will enable us to study frequencies of digits

in a much more general layout.

2.1 Shifts

Definition 2.1. If $\mathcal{A}$ is an alphabet then the one-sided full $\mathcal{A}$-shift is the collection of all countably infinite sequences of symbols from $\mathcal{A}$, denoted by

\[ \mathcal{A}^{\mathbb{N}} = \left\{ x = (x_i)_{i \in \mathbb{N}} : x_i \in \mathcal{A} \text{ for all } i \in \mathbb{N} \right\}. \]

The one-sided full $r$-shift is the one-sided full shift over the alphabet $\{0, \ldots, r-1\}$.

When $|\mathcal{A}| = r$ there is a natural conjugacy between the one-sided full $\mathcal{A}$-shift and the one-sided full $r$-shift, and so sometimes we choose not to differentiate between them. Similarly, if $\mathcal{A}$ is countably infinite we can assume that $\mathbb{N}$ is our alphabet. Also, note that throughout this thesis we will be working with one-sided shifts $\mathcal{A}^{\mathbb{N}}$ instead of the two-sided $\mathcal{A}^{\mathbb{Z}}$. Very similar definitions and results could be stated for two-sided shifts.


Blocks of consecutive symbols will play a central role in the study of frequencies of occurrence in numbers in the subsequent chapters.

Definition 2.2. A block (or string) over $\mathcal{A}$ is a finite sequence of symbols from $\mathcal{A}$. For example, $\alpha\beta\alpha\beta\alpha\alpha\alpha\beta\beta$ is a block over the alphabet $\mathcal{A} := \{\alpha, \beta\}$. The length of a block $x$ is the number of symbols it contains. Therefore, if $x = x_1 \ldots x_n$ then it has length $n$, denoted by $|x| = n$. The empty block (or empty string), denoted by $\varepsilon$, is the sequence with no symbols and so $|\varepsilon| = 0$. An $m$-block is a block of length $m$. Finally, let $\mathcal{A}^m$ denote the set of all $m$-blocks with symbols from $\mathcal{A}$.

If $x \in \mathcal{A}^{\mathbb{N}}$ we say that a block $w = w_1 \ldots w_n$ over $\mathcal{A}$ occurs in $x = x_0 x_1 \ldots$ if there exists $j \in \mathbb{N}$ so that $x_j = w_1, \ldots, x_{j+n-1} = w_n$. Notice that the empty block $\varepsilon$ occurs in any element $x$. Now let $\mathcal{F}$ be a set of blocks over $\mathcal{A}$, which we will regard as the family of forbidden blocks. For any such $\mathcal{F}$, define $X_{\mathcal{F}}$ to be the set of points in $\mathcal{A}^{\mathbb{N}}$ in which no block from $\mathcal{F}$ occurs:

\[ X_{\mathcal{F}} = \left\{ x = x_0 x_1 \ldots \in \mathcal{A}^{\mathbb{N}} : x_i x_{i+1} \ldots x_{i+j} \notin \mathcal{F} \text{ for all } i, j \in \mathbb{N} \right\}. \]

Definition 2.3. A one-sided shift space over $\mathcal{A}$ is a subset $X$ of the one-sided full $\mathcal{A}$-shift such that $X = X_{\mathcal{F}}$ for a collection $\mathcal{F}$ of forbidden blocks with symbols from $\mathcal{A}$. The shift space $X_{\mathcal{F}}$ is said to be of finite type if $\mathcal{F}$ is a finite set.

Conversely, it is sometimes convenient to describe a shift space by specifying which blocks are allowed, rather than which are forbidden. This leads naturally to the notion of the language of a shift space, i.e. the collection of "allowed" blocks in $X$.

Definition 2.4. Let $X$ be a subset of a one-sided shift space, and let $\mathcal{B}_n(X)$ denote the set of all $n$-blocks that occur in points of $X$. The language of $X$ is the collection

\[ \mathcal{B}(X) = \bigcup_{n=0}^{\infty} \mathcal{B}_n(X). \]
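The notions of occurring blocks and forbidden blocks can be explored on finite words. The sketch below is a hedged illustration and not part of the thesis: the helper names `blocks`, `avoids` and `allowed_words` are hypothetical, words are modelled as Python strings, and only finite prefixes are handled. As an example it uses the alphabet $\{0, 1\}$ with the single forbidden block $11$ (often called the golden mean shift); for this particular choice every finite word avoiding $11$ can be extended by zeros to a point of $X_{\mathcal{F}}$, so the words listed are exactly the elements of $\mathcal{B}_n(X_{\mathcal{F}})$.

```python
def blocks(word, n):
    """Set of n-blocks occurring in a finite word (a string over the alphabet)."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def avoids(word, forbidden):
    """True if no block from `forbidden` occurs in `word`."""
    return not any(f in word for f in forbidden)


def allowed_words(alphabet, forbidden, n):
    """All words of length n over `alphabet` in which no forbidden block occurs."""
    words = [""]
    for _ in range(n):
        words = [w + a for w in words for a in alphabet if avoids(w + a, forbidden)]
    return words


# The 2-blocks occurring in a sample word, and the number of allowed n-blocks
# when the block "11" is forbidden.
print(sorted(blocks("0100101", 2)))
print([len(allowed_words("01", {"11"}, n)) for n in range(1, 6)])
```

Note that `blocks(word, 0)` returns the set containing only the empty string, matching the convention that the empty block occurs in every point.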


Next, we present necessary and sufficient conditions for a set of blocks to be the language of a shift space and show that such a set uniquely determines the shift space. The following two propositions will turn out to be very useful in the final section of this chapter, where we will use them to define Markov Partitions.

Proposition 2.1. Let $\mathcal{L}$ be a collection of blocks over $\mathcal{A}$. Then $\mathcal{L}$ is the language of some shift space $X$, i.e. $\mathcal{L} = \mathcal{B}(X)$, if and only if for all $w \in \mathcal{L}$ the following two conditions hold:

1. every subblock of $w$ is in $\mathcal{L}$, and

2. there exist blocks $u, v \in \mathcal{L}$ so that $|uv| > 0$ and $uwv \in \mathcal{L}$.

Proof. ($\Rightarrow$): If $w \in \mathcal{L}$ then there exists $x \in X$ so that $w$ occurs in $x$. But then every subblock of $w$ also occurs in $x$, proving the first condition. For the second condition, the existence of blocks $u, v$ with $|uv| > 0$ and such that $uwv \in \mathcal{L}$ follows by "extending" the block $w$ in $x$. Since $x$ has infinite length, we can do this by choosing the blocks adjacent to $w$ on either side in the representation of $x$.

($\Leftarrow$): Assume $\mathcal{L}$ is a set of blocks over $\mathcal{A}$ such that the two conditions hold. Let $X$ denote the shift space $X_{\mathcal{L}^c}$. We will show that $\mathcal{L} = \mathcal{B}(X)$. Let $w \in \mathcal{B}(X)$, i.e. there exists $x \in X$ so that $w$ occurs in $x$. Hence $w \notin \mathcal{L}^c$, which implies that $w \in \mathcal{L}$, establishing that $\mathcal{B}(X) \subseteq \mathcal{L}$. On the other hand, for $w \in \mathcal{L}$, by repeatedly applying condition 1 followed by condition 2 we can create a sequence $x = (x_i)_{i \in \mathbb{N}}$ so that each subblock of $x$ is in $\mathcal{L}$ and $w$ occurs in $x$. Therefore $x \in X_{\mathcal{L}^c}$, implying that $w \in \mathcal{B}(X)$ and so $\mathcal{L} \subseteq \mathcal{B}(X)$.

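Proposition 2.2. The language of a shift space determines the shift space. In fact, for a shift space $X$, it follows that $X = X_{\mathcal{B}(X)^c}$.

Proof. Let $x \in X$. Any block occurring in $x$ is an element of $\mathcal{B}(X)$ and so $x \in X_{\mathcal{B}(X)^c}$, implying that $X \subseteq X_{\mathcal{B}(X)^c}$. Conversely, let $x \in X_{\mathcal{B}(X)^c}$. Then any block $w$ occurring in $x$ is an element of $\mathcal{B}(X)$. However, since $X$ is a shift space there exists a set of forbidden blocks $\mathcal{F}$ so that $X = X_{\mathcal{F}}$, and so $w \in \mathcal{B}(X_{\mathcal{F}})$ for any such block. Consequently $w \notin \mathcal{F}$, so no block from $\mathcal{F}$ occurs in $x$, finally implying that $X_{\mathcal{B}(X)^c} \subseteq X_{\mathcal{F}} = X$.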


2.2 Dynamical systems

Definition 2.5. A dynamical system $(M, \varphi)$ consists of a compact metric space $(M, d)$ and a continuous map $\varphi : M \to M$.

One of the main sources of interest in symbolic dynamics is its use in representing other dynamical systems. Suppose we want to study a dynamical system $(M, \varphi)$. To describe the orbit $\{\varphi^n(x) : n \in \mathbb{N}\}$ of a given $x \in M$ we can construct an "approximate" description in the following way. Given a partition of $M$ into sets $P_0, P_1, P_2, \ldots$, we can track the orbit of $x$ by recording the set $P_i$ in which $x$ lands at each iteration of $\varphi$. This is the main idea behind Markov Partitions, studied in the next section.

Definition 2.6. A topological partition of a metric space $(M, d)$ is an at most countable collection $\mathcal{P} = \{P_i : i \in I\}$ of non-empty, disjoint, open sets whose closures together cover $M$, meaning that $M = \bigcup_{i \in I} \overline{P_i}$.

Note. A topological partition is not necessarily a partition in the usual sense, since the union of its elements need not be the whole space. Also, recalling the definition of an at most countable set from Chapter 1, $\mathcal{P}$ could be a finite set.

Suppose that $(M, \varphi)$ is a dynamical system and that $\mathcal{P} = \{P_i : i \in I\}$ is a topological partition of $M$. Consider the index set $I$ as an alphabet. Then we say that a block $w = \alpha_1 \ldots \alpha_n$, where $\alpha_1, \ldots, \alpha_n \in I$, is allowed for $\mathcal{P}, \varphi$ if

\[ \bigcap_{j=1}^{n} \varphi^{-j}(P_{\alpha_j}) \neq \emptyset, \]

and set $\mathcal{L}_{\mathcal{P},\varphi}$ to be the collection of all allowed blocks for $\mathcal{P}, \varphi$.

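Proposition 2.3. $\mathcal{L}_{\mathcal{P},\varphi}$ is the language of a unique shift space, called the one-sided symbolic dynamical system corresponding to $\mathcal{P}, \varphi$ and denoted by $X^+_{\mathcal{P},\varphi}$.

Proof. By Proposition 2.1 it suffices to check that for $w \in \mathcal{L}_{\mathcal{P},\varphi}$ all subblocks of $w$ are in $\mathcal{L}_{\mathcal{P},\varphi}$ and also that there exist $u, v \in \mathcal{L}_{\mathcal{P},\varphi}$ with $|uv| > 0$ such that $uwv \in \mathcal{L}_{\mathcal{P},\varphi}$. For $w = a_1 \ldots a_n \in \mathcal{L}_{\mathcal{P},\varphi}$ the set $\bigcap_{l=1}^{n} \varphi^{-l}(P_{a_l})$ is non-empty by assumption. Any subblock $w'$ of $w$ is of the form $w' = a_i \ldots a_{i+j}$ with $1 \leq i \leq n$, $0 \leq j \leq n - i$. Notice that

\[ \bigcap_{l=1}^{n} \varphi^{-l}(P_{a_l}) \subseteq \bigcap_{k=i}^{i+j} \varphi^{-k}(P_{a_k}), \]

and so by assumption $w' \in \mathcal{L}_{\mathcal{P},\varphi}$. To prove the second condition take $u = v = a_1$ and notice that since $\bigcap_{l=1}^{n} \varphi^{-l}(P_{a_l}) \subseteq \varphi^{-1}(P_{a_1})$ we have $u, v \in \mathcal{L}_{\mathcal{P},\varphi}$ and $|uv| > 0$. Then $uwv = a_1 a_1 \ldots a_n a_1$, and again using the fact that $\bigcap_{l=1}^{n} \varphi^{-l}(P_{a_l})$ is non-empty by assumption we conclude that $uwv \in \mathcal{L}_{\mathcal{P},\varphi}$. Therefore, $\mathcal{L}_{\mathcal{P},\varphi}$ is the language of a shift space $X$. Uniqueness follows from Proposition 2.2.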


Now, given a dynamical system $(M, \varphi)$ and a topological partition $\mathcal{P} = \{P_i : i \in I\}$ of $M$, in order to ensure that $X^+_{\mathcal{P},\varphi}$ gives a realistic representation of points in $M$ we need to impose some extra conditions. For $n \geq 0$, consider the $n$-cylinder around $x = x_0 x_1 \ldots \in X^+_{\mathcal{P},\varphi}$ and denote it by

\[ D_n(x) = \bigcap_{k=0}^{n} \varphi^{-k}(P_{x_k}) \subseteq M. \]

The closures of the $n$-cylinders, $\{\overline{D_n(x)}\}_{n=0}^{\infty}$, are closed subsets of a compact space, hence compact. Additionally, they decrease with $n$, meaning that $\overline{D_0(x)} \supseteq \overline{D_1(x)} \supseteq \overline{D_2(x)} \supseteq \ldots$.

Claim. $\bigcap_{k=0}^{\infty} \overline{D_k(x)}$ is a non-empty subset of $M$.

Proof. Assume for the sake of contradiction that $\bigcap_{k=0}^{\infty} \overline{D_k(x)} = \emptyset$. Then De Morgan's law gives that the complements of the closed cylinders, $\bigl(\overline{D_k(x)}\bigr)^c$ for $k \geq 0$, form an open cover of $M$. By assumption, $M$ is a compact metric space, hence there exists a finite subcover $\bigcup_{k=0}^{N} \bigl(\overline{D_{n_k}(x)}\bigr)^c = M$. But then, taking complements again, we have $\bigcap_{k=0}^{N} \overline{D_{n_k}(x)} = \emptyset$. Set $n_0 = \max_{0 \leq k \leq N} n_k$. Then the decreasing property of the cylinders implies that $\overline{D_{n_0}(x)} = \bigcap_{k=0}^{N} \overline{D_{n_k}(x)} = \emptyset$, which is a contradiction, since the cylinders around a point of $X^+_{\mathcal{P},\varphi}$ are non-empty.

We have established that $\bigcap_{k=0}^{\infty} \overline{D_k(x)}$ is non-empty. Moreover, we would like this countable intersection of cylinders around $x$ to contain a single element of $M$, so that each word in $X^+_{\mathcal{P},\varphi}$ corresponds to exactly one point of $M$. This leads us to the definition of a Markov Partition.

2.3 Markov Partitions and symbolic representations

Definition 2.7. Let $(M, \varphi)$ be a dynamical system and $\mathcal{P} = \{P_i : i \in I\}$ be a topological partition of $M$. Then we call $\mathcal{P}$ a Markov Partition for $(M, \varphi)$ if the following two conditions hold:

1. for all $x \in X^+_{\mathcal{P},\varphi}$ the intersection $\bigcap_{k=0}^{\infty} \overline{D_k(x)}$ is a singleton,

2. $X^+_{\mathcal{P},\varphi}$ is a shift space of finite type.

We say that a Markov Partition produces a symbolic representation of the dynamical system.


We proceed by presenting some examples of symbolic representations of dynamical systems that should seem familiar. In all four of the following examples, for simplicity we use $([0, 1), |\cdot|)$ as the underlying metric space of the dynamical system. Although this space is not compact, there exists a natural bijection between $[0, 1)$ and $\mathbb{R}/\mathbb{Z}$, where $(\mathbb{R}/\mathbb{Z}, |\cdot|)$ is a compact metric space. In the first two examples, we consider finite Markov Partitions, which produce the well-known $N$-ary and $\beta$-expansions.

Example 2.1. Let $N \geq 2$ be an integer and consider the dynamical system $([0, 1), S_N)$ where $S_N : [0, 1) \to [0, 1)$ is given by

\[ S_N(x) = Nx \pmod 1 \]

together with the finite topological partition

\[ \mathcal{P} = \left\{ \left( \frac{i}{N},\; \frac{i+1}{N} \right) : 0 \leq i \leq N - 1 \right\}. \]

Then it is easy to see that $\mathcal{P}$ is a Markov Partition for $([0, 1), S_N)$ whose induced shift space is the one-sided full $N$-shift, and the symbolic representation of the system is exactly the $N$-ary expansion of points in $[0, 1)$.
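The correspondence between itineraries under $S_N$ and $N$-ary digits can be sketched in a few lines of Python. The function name `nary_digits` is a hypothetical choice, exact rational arithmetic is used to avoid rounding, and, as a simplifying assumption, the half-open intervals $[i/N, (i+1)/N)$ are used so that boundary points also receive a digit, whereas the partition in the example consists of open intervals.

```python
from fractions import Fraction


def nary_digits(x, base, count):
    """First `count` symbols of the itinerary of x in [0, 1) under S_N(x) = N x (mod 1)."""
    x = Fraction(x)
    digits = []
    for _ in range(count):
        scaled = base * x
        # Index i of the interval [i/N, (i+1)/N) that contains the current point.
        digit = scaled.numerator // scaled.denominator
        digits.append(digit)
        x = scaled - digit  # apply S_N
    return digits


print(nary_digits(Fraction(1, 3), 10, 5))  # decimal digits of 1/3
print(nary_digits(Fraction(5, 8), 2, 4))   # binary digits of 5/8
```

Each step records which partition element the current point lies in and then applies the map, which is exactly the orbit-tracking idea described in Section 2.2.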

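Example 2.2. Similarly to the previous example, let $\beta > 1$ be a real number and consider the dynamical system $([0, 1), \varphi)$ where $\varphi : [0, 1) \to [0, 1)$ is given by

\[ \varphi(x) = \beta x \pmod 1 \]

together with the finite topological partition

\[ \mathcal{P} = \left\{ \left( \frac{i}{\beta},\; \frac{i+1}{\beta} \right) : 0 \leq i \leq \lfloor \beta \rfloor - 1 \right\} \cup \left\{ \left( \frac{\lfloor \beta \rfloor}{\beta},\; 1 \right) \right\}. \]

Then again, we can show that $\mathcal{P}$ is a Markov Partition for $([0, 1), \varphi)$ and the symbolic representation that it produces corresponds to the $\beta$-expansion of points in the unit interval.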


In the next two examples we consider two other types of expansions, with the main difference being that their corresponding topological partitions are countably infinite rather than finite as for the $N$-ary and $\beta$-expansions.

Example 2.3. Consider the dynamical system $([0, 1), G)$ where $G : [0, 1) \to [0, 1)$ is the Gauss map given by

\[ G(x) = \begin{cases} \dfrac{1}{x} - \left\lfloor \dfrac{1}{x} \right\rfloor & x \neq 0 \\ 0 & x = 0 \end{cases} \]

together with the infinite topological partition

\[ \mathcal{P} = \left\{ \left( \frac{1}{i+1},\; \frac{1}{i} \right) : i \in \mathbb{N},\ i \geq 1 \right\}. \]

Claim. For $x \in [0, 1)$ the symbolic representation of its orbit under iteration of $G$ corresponds to its continued fraction expansion.

Proof. Set $x_1 = x$ and, for $k \geq 1$, $x_{k+1} = G(x_k)$. Then the sequence $(x_k)_{k \geq 1}$ is the orbit of $x$ under iteration of $G$. Notice that applying $G$ corresponds to taking the fractional part of the reciprocal of the input. Associate each interval $\left( \frac{1}{i+1}, \frac{1}{i} \right) \in \mathcal{P}$, $i \geq 1$, with the digit $i$. Then the $k$th digit of $x$ in its symbolic representation is the unique natural number $a_k$ such that $x_k \in \left( \frac{1}{a_k+1}, \frac{1}{a_k} \right)$. This implies that $\frac{1}{x_k} \in (a_k, a_k + 1)$ and so $\left\lfloor \frac{1}{x_k} \right\rfloor = a_k$. But this is exactly how the $k$th digit in the continued fraction expansion is found: as the integer part of the reciprocal of $x_k$, which is itself the fractional part produced in the previous step.
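The digit rule $a_k = \lfloor 1/x_k \rfloor$ from the proof translates directly into code. The following sketch is illustrative only: the name `gauss_digits` is an assumption, rational inputs are used so that the arithmetic is exact, and the iteration stops when the orbit reaches $0$, which happens for every rational starting point. Rational points land on endpoints of the partition intervals, so strictly speaking they are the exceptional points that the open partition ignores.

```python
from fractions import Fraction


def gauss_digits(x, count):
    """Continued fraction digits a_1, a_2, ... of x in (0, 1) via the Gauss map."""
    x = Fraction(x)
    digits = []
    while x != 0 and len(digits) < count:
        inverse = 1 / x
        a = inverse.numerator // inverse.denominator  # a_k = floor(1 / x_k)
        digits.append(a)
        x = inverse - a  # x_{k+1} = G(x_k)
    return digits


print(gauss_digits(Fraction(3, 8), 10))
print(gauss_digits(Fraction(7, 10), 10))
```

As a sanity check, the digits $2, 1, 2$ obtained for $3/8$ satisfy $1/(2 + 1/(1 + 1/2)) = 3/8$.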

Remark. The partition should be modified so that it contains the lower endpoint of each interval in order to handle the problem that occurs when we consider rational points in $[0, 1)$. However, we choose to ignore this problem in order to have open sets in the partition. The set of rationals is countable and thus negligible from the perspective of the later chapters.

Example 2.4. Consider the dynamical system ([0, 1) , L) where L : [0, 1)→ [0, 1) is

de�ned by

L(x) =

n(n+ 1)x− n x ∈[

1n+1

, 1n

)0 x = 0

together with the in�nite topological partition

P =

{(1

i+ 1,1

i

): i ∈ N, i ≥ 1

}

17

Then, similarly to the example above, P is a Markov Partition, and by associating each interval $\left( \frac{1}{i+1}, \frac{1}{i} \right) \in P$ with the digit i we can show that for x ∈ [0, 1) the symbolic representation of its orbit under iteration with L corresponds to its Lüroth series expansion.
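As with the Gauss map, the digits can be generated mechanically. The sketch below is illustrative only: the names `luroth_digit`, `luroth_map` and `luroth_digits` are hypothetical, and the digit convention (digit n for the interval [1/(n+1), 1/n)) is the one used in this example, which may differ by a shift from conventions found in other texts.

```python
from fractions import Fraction
from math import ceil


def luroth_digit(x: Fraction) -> int:
    """Unique n >= 1 with 1/(n+1) <= x < 1/n, i.e. n < 1/x <= n + 1."""
    return ceil(1 / x) - 1


def luroth_map(x: Fraction) -> Fraction:
    """L(x) = n(n+1)x - n on [1/(n+1), 1/n), and L(0) = 0."""
    if x == 0:
        return Fraction(0)
    n = luroth_digit(x)
    return n * (n + 1) * x - n


def luroth_digits(x: Fraction, max_digits: int = 20) -> list[int]:
    """Digits of the intervals visited by the orbit of x under L."""
    digits = []
    while x != 0 and len(digits) < max_digits:
        digits.append(luroth_digit(x))
        x = luroth_map(x)
    return digits


# 5/12 lies in [1/3, 1/2), so its first digit is 2; L(5/12) = 1/2 has digit 1.
digits_of_five_twelfths = luroth_digits(Fraction(5, 12))
```

Each branch of L stretches its interval linearly onto [0, 1), which is why the partition is Markov.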

2.3.1 Uniqueness of symbolic representations

Let (M, ϕ) be a dynamical system and suppose that $P = \{P_i : i \in I\}$ is a Markov Partition of M. Using the same notation as above, denote by $X^+_{P,\varphi}$ the one-sided shift of finite type corresponding to P and ϕ.

In the definition of a Markov Partition we required that each element of the partition is an open set. This will allow us to escape from any ambiguity concerning the uniqueness of symbolic representations and get a one-to-one correspondence between infinite words in our shift $X^+_{P,\varphi}$ and "typical" points in the dynamical system (M, ϕ).

Markov Partitions require that any $w \in X^+_{P,\varphi}$ maps to a unique element in M, but the converse need not be true. For example, consider the decimal expansion as seen in example 2.1. Then 0.2000... and 0.1999... represent the same point in [0, 1]. This complication appears because the boundaries of different elements of the partition can have non-empty intersection. In this case, we have that $\frac{2}{10} \in \overline{\left( \frac{1}{10}, \frac{2}{10} \right)} \cap \overline{\left( \frac{2}{10}, \frac{3}{10} \right)}$, which leads to two distinct representations of the number $\frac{2}{10}$.

In this thesis, we aim to show that a certain subset of points in M is residual. To make things easier, we would like to concentrate on the set of inner points of elements in a topological partition of M. As we show below, the set of inner points $U_\infty$ is residual in M, and so by working in $U_\infty$ instead of M we only discard a meagre set. This does not affect our aim of showing that another subset of M, seen now as a subset of $U_\infty$, is residual.

For the reason mentioned above, set

\[ U = \bigcup_{i \in I} P_i \]

and notice that U is an open and dense set in M. This follows from the facts that each $P_i$ is open and that $\overline{U} = \bigcup_{i \in I} \overline{P_i} = M$. For n ≥ 1 set $U_n = \bigcap_{i=0}^{n-1} \varphi^{-i}(U)$. Using the facts that U is an open and dense set while ϕ is a continuous map we conclude that for each n ≥ 1, $U_n$ is an open and dense subset of M. Hence, $\{U_n\}_{n \in \mathbb{N}}$ is a family


of open and dense sets inside the compact space M. Then, by applying the Baire Category Theorem we get that $U_\infty := \bigcap_{i=0}^{\infty} U_i$ is a dense subset of M and moreover, as shown in proposition 1.1, $U_\infty$ is a residual subset of M.

In the next chapters we are going to consider symbolic expansions of points in M and study their digit frequencies. Hence, we need the map $\pi_{P,\varphi} : X^+_{P,\varphi} \to M$ given by

\[ \pi_{P,\varphi}(w) \in \bigcap_{n=0}^{\infty} D_n(w) \]

This map is well defined since by assumption the set $\bigcap_{n=0}^{\infty} D_n(w)$ is a singleton for each $w \in X^+_{P,\varphi}$. Moreover, $\pi_{P,\varphi}$ is bijective on $U_\infty$ and we can call $w \in X^+_{P,\varphi}$ the symbolic expansion of $x \in U_\infty$ if $x = \pi_{P,\varphi}(w)$. As mentioned before, in the following chapters for convenience we will be using $U_\infty$ as our ambient space.

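Note. The shift map $\sigma : X^+_{P,\varphi} \to X^+_{P,\varphi}$ given by

\[ \sigma(x_0 x_1 x_2 \ldots) = x_1 x_2 x_3 \ldots \]

is such that the following diagram commutes:

\[ \begin{array}{ccc} X^+_{P,\varphi} & \xrightarrow{\ \sigma\ } & X^+_{P,\varphi} \\ \downarrow \pi_{P,\varphi} & & \downarrow \pi_{P,\varphi} \\ M & \xrightarrow{\ \varphi\ } & M \end{array} \]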


Chapter 3

Summability Theory

The theory of summability of divergent series is a major branch of mathematical

analysis that has found important applications in applied mathematics, physics and

engineering. It deals with methods of assigning natural values to divergent series,

whose prototypical examples include the Abel summation method, the Cesàro means

and Borel summability method (cf. Alabdulmohsin [1]).

3.1 Divergent series

Any series that does not converge to a real number is called divergent. In 1828, Niels Abel described divergent series as the "work of the devil" and declared it "shameful" for any mathematician to base any argument on them (cf. Hardy [4]). However, some divergent series are "nicer" than others, in the sense that they may share properties with convergent series, or even become convergent after alternating the signs of their terms. Some of the most famous examples of divergent series are given below.

Example 3.1. The series $\sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \ldots$, known as the harmonic series, satisfies a necessary condition for a series to converge, namely that its terms tend to 0, but at the same time the series slowly diverges to infinity.

Example 3.2. The divergent series $\sum_{n=0}^{\infty} (-1)^n = 1 - 1 + 1 - 1 + \ldots$, known today as Grandi's divergent series, sparked a big debate in the community of mathematicians of the 18th century. Looking at the value(s) of this series, Guido Grandi was allured to advocate creationism. He argued that grouping terms into (1 − 1) + (1 − 1) + ...


suggested the value of 0 for the infinite sum. On the other hand, the series itself arises from the well-known Maclaurin expansion of the function $f(x) = \frac{1}{1+x}$ evaluated at x = 1, which suggested a value of $\frac{1}{2}$ for the infinite sum. Since the same mathematical object could be equally assigned a zero value (nothing) and a non-zero value (something), Grandi argued that creation out of nothing was mathematically justifiable (cf. Alabdulmohsin [1]).

Example 3.3. $\sum_{n=0}^{\infty} n = 0 + 1 + 2 + 3 + 4 + \ldots$. In this example, not only does the sum of the natural numbers diverge to infinity, but its terms diverge to infinity as well. Note that the common misconception that this sum is equal to $-\frac{1}{12}$ arises in at least two different ways. However, neither method, Zeta function regularization or Ramanujan summation, rigorously proves this result; rather, each suggests that from a certain perspective it is reasonable to associate the value $-\frac{1}{12}$ with this divergent series.

3.2 Averaging methods

3.2.1 History and definition

Divergent series arise quite frequently in many branches of mathematics and sciences, such as asymptotic analysis, analytic number theory, Fourier analysis, quantum theory and dynamical systems, but their divergence makes them difficult to work with. Consequently, they provoked a profound debate in the mathematical community for a long time. Throughout the 17th and 18th centuries, for example, mathematicians used divergent series regularly, rather naively. They supported the view that a series should have the value of the algebraic expression from which it was derived. In particular, Euler believed that every series had a unique value, and that divergence was nothing more than an artificial limitation [4].

In the 19th century, however, the commitment to mathematical rigor led the most prominent mathematicians of the time, such as Cauchy and Weierstrass, to forbid the use of divergent series entirely. Consequently, little work was published on divergent series from 1830 to 1880 (cf. Hardy [4]). Nevertheless, around the turn of the 20th century, a Hegelian synthesis between the two opposing views was initiated. Ernesto Cesàro placed the study of divergent series on a rigorous footing by providing the first modern definition. Cesàro's definition was an averaging method, which


was later generalized independently by Nörlund and Voronoi. Later on, Ramanujan would share a similar view by describing the value of an infinite sum as a "center of gravity". The study of divergent series included contributions from famous mathematicians such as Frobenius, Borel, Hardy, Ramanujan and Littlewood, and the name summability theory was established to denote this newly surfaced area of mathematical analysis. Naturally, the key question in summability theory is how to interpret divergent series, such as the Grandi series mentioned earlier (cf. Alabdulmohsin [1]).

For a series $\sum_{i=0}^{\infty} a_i$ we denote the nth partial sum by $s_n = a_0 + a_1 + \ldots + a_n$ for n ∈ N. When the sequence of partial sums converges to a real number, we say that the series converges to that limit. The theory of divergent series uses averaging methods to generalise the notion of the limit of $(s_n)_{n \in \mathbb{N}}$ and hence give a value to a series that would otherwise diverge in the traditional sense.

Definition 3.1. An $\mathbb{N} \times \mathbb{N}$ real valued matrix $T = [c_{m,n}]$, where $c_{m,n}$ is the (m, n)th entry of T, is called a linear transformation if for any real sequence $(s_n)_{n \in \mathbb{N}}$ and any natural number m, $t_m = \sum_{n=0}^{\infty} c_{m,n} s_n$ converges to a real number. The sequence $(t_m)_{m \in \mathbb{N}}$ is called the T-averaged version of $(s_n)_{n \in \mathbb{N}}$ and if $\lim_{m \to \infty} t_m$ exists we call it the T-value of $(s_n)_{n \in \mathbb{N}}$.

Linear transformations are mainly used to smooth out divergence from a sequence.

However, if a sequence is already convergent we would like the averaged version of

this sequence to also converge to the same limit. Those linear transformations which "preserve" limits are called regular. In this thesis, we will be primarily concerned with regular and positive linear transformations.

Definition 3.2. A linear transformation T, with (m, n)th entry $c_{m,n} \in \mathbb{R}$, is called

1. Regular, if T maps all convergent sequences to their original limit, i.e. T is regular if $t_m \to l$ as $m \to \infty$ whenever $s_k \to l$ as $k \to \infty$.

2. Positive, if $c_{m,n} \geq 0$ for all m, n ∈ N.


3.2.2 Examples of averaging methods

Hölder means: Given a series $A := \sum_{n=0}^{\infty} a_n$ with sequence of partial sums $(s_n)_{n \in \mathbb{N}}$, set $H^0_n = s_n$ and define recursively

\[ H^{k+1}_n = \frac{H^k_0 + \ldots + H^k_n}{n+1}. \]

If the limit $\lim_{n \to \infty} H^k_n$ exists for some k ∈ N we say that A is Hölder summable with (h, k) sum equal to the limit above.

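Cesàro means: The Cesàro (c, 1) averaging method is a linear transformation given by the lower triangular matrix

\[ T = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & \cdots \\
\frac{1}{2} & \frac{1}{2} & 0 & \cdots & 0 & \cdots \\
\frac{1}{3} & \frac{1}{3} & \frac{1}{3} & \ddots & 0 & \cdots \\
\vdots & \vdots & \vdots & \ddots & 0 & \cdots \\
\frac{1}{m} & \frac{1}{m} & \frac{1}{m} & \cdots & \frac{1}{m} & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix} \]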

More generally, as in the previous example, we can define the (c, k) Cesàro averaging method in the following way:

Given a series $A := \sum_{n=0}^{\infty} a_n$ with sequence of partial sums $(s_n)_{n \in \mathbb{N}}$, set $A^0_n = s_n$ and define recursively $A^{k+1}_n = A^k_0 + \ldots + A^k_n$. Define also $E^k_n$ to be the value of $A^k_n$ when $a_0 = 1$ and $a_n = 0$ for n ≥ 1. If the limit $\lim_{n \to \infty} \frac{A^k_n}{E^k_n}$ exists for some k ∈ N we say that A is Cesàro summable with (c, k) sum equal to the limit above.

The two examples above are very similar. Hölder means (h, k) are defined recursively in k steps, each consisting of a summation followed by a division, whereas the Cesàro means (c, k) are defined in the same way except that the only division occurs at the last step. It is shown in Hardy [4] that the two methods are actually equivalent, meaning that a series A is (h, k) summable with sum equal to S if and only if A is (c, k) summable with the same sum.
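To see the two recursions side by side, here is a minimal sketch applied to Grandi's series. It is an illustration under stated assumptions rather than anything from the original text: the function names `holder_means` and `cesaro_means` are hypothetical, the input is a finite list of partial sums, and exact fractions are used so the comparison is not affected by rounding.

```python
from fractions import Fraction
from itertools import accumulate


def holder_means(partial_sums, k):
    """(h, k) means: k rounds of 'running sum, then divide by n + 1'."""
    h = [Fraction(s) for s in partial_sums]
    for _ in range(k):
        h = [total / (n + 1) for n, total in enumerate(accumulate(h))]
    return h


def cesaro_means(partial_sums, k):
    """(c, k) means: k rounds of running sums, one division by E^k_n at the end."""
    a = [Fraction(s) for s in partial_sums]
    e = [Fraction(1)] * len(a)  # E^0_n = 1: the partial sums of 1 + 0 + 0 + ...
    for _ in range(k):
        a = list(accumulate(a))
        e = list(accumulate(e))
    return [x / y for x, y in zip(a, e)]


# Partial sums of 1 - 1 + 1 - 1 + ... are 1, 0, 1, 0, ...
grandi = [1 if n % 2 == 0 else 0 for n in range(1000)]
h1 = holder_means(grandi, 1)
c1 = cesaro_means(grandi, 1)
h2 = holder_means(grandi, 2)
c2 = cesaro_means(grandi, 2)
```

For k = 1 the two sequences coincide term by term; for k = 2 they differ at finite n but both approach 1/2, in line with the equivalence quoted from Hardy [4].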

Abelian means: Let $(\lambda_n)_{n \in \mathbb{N}}$ be a strictly increasing and unbounded sequence of non-negative reals. Suppose

\[ f(x) = \sum_{n=0}^{\infty} s_n e^{-\lambda_n x} \]


converges for all real numbers x > 0. Then the Abelian mean $A_\lambda$ is given by

\[ A_\lambda((s_n)_n) = \lim_{x \searrow 0} f(x) \]

If $\lambda_n = n$ for each n ∈ N, then we obtain the method of Abel summation. Here

\[ f(x) = \sum_{n=0}^{\infty} s_n e^{-nx} = \sum_{n=0}^{\infty} s_n z^n \]

putting $z = e^{-x}$. Then the limit of f(x) as x approaches 0 from above is the same as the limit of the power series $\sum_{n=0}^{\infty} s_n z^n$ as z approaches 1 from below through positive reals, and the Abel sum A(s) is defined to be

\[ A((s_n)_n) = \lim_{z \nearrow 1} \sum_{n=0}^{\infty} s_n z^n \]

It is interesting to note that Abel summation is consistent with Cesàro summation, i.e. $A((s_n)_n) = (c, k)((s_n)_n)$ whenever the latter is defined. The Abel sum is therefore regular, linear and consistent with Cesàro summation, but also more powerful than the latter.
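As a worked illustration (not taken from the original text), the formula above can be applied to Grandi's series from Example 3.2 by feeding in the sequence $s_n = (-1)^n$. The only ingredient is the geometric series formula, valid for 0 ≤ z < 1:

```latex
\[
  A\big(((-1)^n)_n\big)
  = \lim_{z \nearrow 1} \sum_{n=0}^{\infty} (-1)^n z^n
  = \lim_{z \nearrow 1} \frac{1}{1+z}
  = \frac{1}{2}
\]
```

This agrees with the (c, 1) value of the same series, since the averages of its partial sums 1, 0, 1, 0, ... also tend to 1/2.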

3.2.3 Main result for regular linear transformations

Toeplitz and Schur proved the following result around 1911 in order to give a characterisation of regular linear transformations. Toeplitz considered only lower triangular matrices. However, his result was then generalised by Steinhaus (see Hardy [4]).

Theorem 3.1. A linear transformation $T = [c_{m,n}]_{\mathbb{N} \times \mathbb{N}}$ is regular if and only if the following conditions are satisfied:

1. there exists M > 0 such that for all m ∈ N we have that $\gamma_m = \sum_{n=0}^{\infty} |c_{m,n}| < M$,

2. for all n ∈ N, $\lim_{m \to \infty} c_{m,n} = 0$ and

3. setting $c_m = \sum_{n=0}^{\infty} c_{m,n}$ we have that $c_m \to 1$ as $m \to \infty$.

Proof. Given in the Appendix.
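As a quick sanity check of the three conditions (an illustration, not part of the original text), consider the Cesàro (c, 1) matrix. Indexing rows and columns from 0 is an assumption made here for convenience; with it, the entries and the three quantities in the theorem are:

```latex
\[
  c_{m,n} = \begin{cases} \dfrac{1}{m+1} & 0 \le n \le m \\[4pt] 0 & n > m \end{cases}
\]
\[
  \gamma_m = \sum_{n=0}^{m} \frac{1}{m+1} = 1 < 2,
  \qquad
  \lim_{m \to \infty} c_{m,n} = \lim_{m \to \infty} \frac{1}{m+1} = 0,
  \qquad
  c_m = \sum_{n=0}^{m} \frac{1}{m+1} = 1 \to 1 .
\]
```

So the bound M = 2 works in condition 1, every column tends to 0, and the row sums are identically 1, which confirms that the (c, 1) method is regular.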


In the next chapter, we will be using linear transformations $T = [c_{m,n}]_{\mathbb{N} \times \mathbb{N}}$ that are positive and regular. Using the theorem above, we get that $c_{m,n} \geq 0$ for all m, n ∈ N and that the sequence of row sums of T, $(\gamma_m)_{m \in \mathbb{N}}$, is bounded and tends to 1, whereas the elements of each column of T tend to 0. Finally, following Toeplitz's example we will work with lower triangular linear transformations.

Definition 3.3. Let $\mathcal{T}$ be the set of all "nice" linear transformations $T = [c_{m,n}]_{\mathbb{N} \times \mathbb{N}}$ such that the following two conditions hold:

• T is a lower triangular matrix and

• there exists G > 0 such that $\sup_{n \in \mathbb{N}} |c_{m,n}| < \frac{G}{m}$ for all m ∈ N.

Also, denote by $\mathcal{T}_{\mathbb{Q}}$ the countable subset of linear transformations in $\mathcal{T}$ with entries in Q.

Example. The linear transformations corresponding to the (h, k) Hölder means and

the (c, k) Cesàro means are elements of T .


Chapter 4

Normal Numbers

4.1 Historical overview

Turning our attention to occurrences of distinct digits in a number, we may ask ourselves: what properties should a "normal" number satisfy? In order to get acquainted with this question consider the following example. Assume we choose a real number at random, written in decimal expansion. Then there is nothing suggesting that the digit "5" will appear more times than the digit "7" in this number. This is the intuition for Borel's normal numbers theorem, but before we introduce this, let us first rigorously define the property of normality.

Definition 4.1. Let x be a real number with fractional part $.x_1 x_2 \ldots$ when written in its unique non-terminating expansion base r. Let N(x, b, n) denote the number of occurrences of the digit b in the first n places in the fractional part of x. The number x is called simply normal to base r if

\[ \lim_{n \to \infty} \frac{N(x, b, n)}{n} = \frac{1}{r} \]

for each b ∈ {0, ..., r − 1}; x is said to be normal base r if all numbers $x, xr, xr^2, \ldots$ are simply normal in bases $r, r^2, r^3, \ldots$.

Remark. This definition for r = 10 was introduced by Émile Borel [3] who stated that "la propriété caractéristique" of a normal number is the following.


Definition 4.2. A real number x with fractional part $.x_1 x_2 \ldots$ when written in its unique non-terminating expansion base r is called normal if for any block of m specified digits $b = b_1 \ldots b_m$ we have

\[ \lim_{n \to \infty} \frac{N(x, b, n)}{n} = \frac{1}{r^m} \]

where N(x, b, n) stands for the number of occurrences of the block b in the first n digits of the fractional part of x.

Remark. In Niven and Zuckerman [8], it is shown that the two definitions of normal numbers given above are equivalent.

Theorem 4.1. (Borel,1909) Lebesgue almost all numbers in [0, 1] are normal base

10.

Remark. This result extends to any base r, for r ≥ 2. Sometimes when the base we

are working in is clear from the context, instead of calling a number normal base r,

we will just call it normal.

Corollary 4.1. A real number is called non-normal base r if it is not normal in base

r. The set of non-normal numbers in any base has Lebesgue measure zero.

Proof. It follows from the generalised version of Borel's result.

4.2 Normal numbers in dynamical systems

We would like to define the property of normality in a more general setup. Consider numbers in a symbolic representation of a dynamical system, using the notation and notions introduced in Chapter 2. Given (M, ϕ) a dynamical system with a Markov Partition $P = \{P_i : i \in I\}$ set

\[ S = \begin{cases} \{0, \ldots, N-1\} & \text{if } I \text{ is a finite set of size } N \\ \mathbb{N} & \text{if } I \text{ is a countably infinite set} \end{cases} \]

to be the alphabet of our shift so that each digit i in the alphabet S corresponds, possibly after relabeling, to the set $P_i$ in the partition.


Similarly to the previous section, we define the frequency of a block of length k, say $b = b_1 \ldots b_k \in S^k$, in the first n digits of a word $w = a_0 a_1 a_2 \ldots \in X^+_{P,\varphi}$ by

\[ P(w, b, n) = \begin{cases} \dfrac{|\{0 \le i \le n-k : a_i = b_1, \ldots, a_{i+k-1} = b_k\}|}{n-k+1} & \text{if } n \ge k \\ 0 & \text{otherwise} \end{cases} \]

and also write

\[ P_k(w, n) = (P(w, b, n))_{b \in S^k} \]

for the vector of frequencies of all k-blocks in the first n digits of w. Notice that, for n ≥ k, we immediately get that $\|P_k(w, n)\|_1 = \sum_{b \in S^k} |P(w, b, n)| = 1$.
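The definition translates directly into a sliding-window count. The following is a minimal sketch under stated assumptions: the name `block_frequencies` is illustrative, the word is given as a finite list that is at least n digits long, and blocks that never occur are simply left out of the returned dictionary (their frequency is 0).

```python
from collections import Counter
from fractions import Fraction


def block_frequencies(word, k, n):
    """Return {block: P(w, b, n)} for the k-blocks seen in the first n digits of word."""
    if n < k:
        return {}  # by definition every frequency is 0 when n < k
    prefix = tuple(word[:n])
    windows = n - k + 1  # number of positions i with 0 <= i <= n - k
    counts = Counter(prefix[i:i + k] for i in range(windows))
    return {block: Fraction(count, windows) for block, count in counts.items()}


# First five digits 0 1 1 0 1 contain the 2-blocks 01, 11, 10, 01.
frequencies = block_frequencies([0, 1, 1, 0, 1], k=2, n=5)
total_mass = sum(frequencies.values())
```

The total mass is 1 whenever n ≥ k, which is the normalisation noted above.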

Now let µ be a probability measure on the induced shift space $X^+_{P,\varphi}$.

Definition 4.3. We call $w \in X^+_{P,\varphi}$ µ-normal if for any block $b \in S^k$ we have

\[ \lim_{n \to \infty} P(w, b, n) = \mu([b]) \]

where [b] is the cylinder around b, i.e. the set $\{(a_i)_{i \in \mathbb{N}} \in X^+_{P,\varphi} : a_0 = b_1, \ldots, a_{k-1} = b_k\}$. In particular, we call w normal if µ is the normalised Hausdorff measure on $X^+_{P,\varphi}$.

4.3 Extremely non-normal numbers

In the previous section we defined a normal number in a symbolic dynamical system. Now we focus on non-normal numbers. Non-normal numbers do not satisfy the property above, i.e. that for any block of digits the sequence of its frequencies converges to the measure of the cylinder centred around that block. More crucially, for many numbers that sequence not only fails to converge to that specific point, but diverges dramatically. The divergence is so chaotic that almost every possible convergence point is an accumulation point of the sequence. We will study this in more detail later. Now, we use the previous chapter on regular linear transformations to define an averaged version of the sequence of block frequencies, in order to check whether even this smoothed version of the sequence still exhibits this very abnormal behaviour.


Recall from the previous section the definition of the vector of block frequencies in the first n digits:

\[ P_k(w, n) = (P(w, b, n))_{b \in S^k} \]

We use this to define the sequence $(P_k(w, n))_{n \in \mathbb{N}}$. Given a positive and regular linear transformation $T = [c_{m,n}]_{\mathbb{N} \times \mathbb{N}}$, we proceed to define the T-averaged version of the latter sequence, denoted by $(P^T_k(w, m))_{m \in \mathbb{N}}$, where for m ∈ N we set

\[ P^T_k(w, m) = \sum_{n \in \mathbb{N}} c_{m,n} P_k(w, n) \]

Note. $P^T_k(w, m)$ is a vector with a coordinate for each $b \in S^k$. Keep in mind that in the term $P_k(w, n)$, n corresponds to the number of digits of w over which we consider frequencies of blocks. On the other hand, in the term $P^T_k(w, m)$, m corresponds to the number of averaging steps already taken, just as for the sequence $(t_m)_{m \in \mathbb{N}}$ considered in the previous chapter.

For a normal number w, we have that $(P_k(w, m))_{m \in \mathbb{N}}$ converges pointwise to $(\mu([b]))_{b \in S^k}$ with respect to $\|\cdot\|_1$, and so by regularity of T, $(P^T_k(w, m))_{m \in \mathbb{N}}$ also converges pointwise to the same limit.
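The effect of regularity can be seen in a small experiment. The sketch below is illustrative only: it takes k = 1, uses the Cesàro (c, 1) matrix with the assumed indexing $c_{m,n} = 1/(m+1)$ for 0 ≤ n ≤ m, and works with a finite stretch of the periodic word 0101..., whose letter frequencies converge to (1/2, 1/2). The names `letter_frequencies` and `cesaro_averaged` are hypothetical.

```python
from fractions import Fraction


def letter_frequencies(word, n, alphabet):
    """P_1(w, n): frequency of each letter in the first n digits (zero vector if n < 1)."""
    if n < 1:
        return [Fraction(0) for _ in alphabet]
    prefix = word[:n]
    return [Fraction(prefix.count(a), n) for a in alphabet]


def cesaro_averaged(word, m, alphabet):
    """P_1^T(w, m) for the (c, 1) matrix: the mean of P_1(w, 0), ..., P_1(w, m)."""
    total = [Fraction(0) for _ in alphabet]
    for n in range(m + 1):
        freq = letter_frequencies(word, n, alphabet)
        total = [t + f for t, f in zip(total, freq)]
    return [t / (m + 1) for t in total]


word = [0, 1] * 150                              # finite stretch of 0101...
plain = letter_frequencies(word, 201, [0, 1])    # P_1(w, 201)
averaged = cesaro_averaged(word, 200, [0, 1])    # P_1^T(w, 200)
```

Both vectors are close to (1/2, 1/2). The averaged vector has total mass slightly below 1 at finite m, because the n = 0 term of the sum is the zero vector; this deficit vanishes as m grows.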

Turning our attention to non-normal numbers now, consider for $w \in X^+_{P,\varphi}$ the set of accumulation points of the sequence $(P^T_k(w, m))_{m \in \mathbb{N}}$, denoted by $A^T_k(w)$.

Theorem 4.2. Let $w \in X^+_{P,\varphi}$. Then $A^T_k(w) \subseteq F_k$, where $F_k$ is the following simplex of shift invariant vectors:

\[ F_k = \left\{ (p_i)_{i \in S^k} : p_i \ge 0,\ \sum_{i \in S^k} p_i \le 1,\ \sum_{i \in S} p_{ij} = \sum_{i \in S} p_{ji} \text{ for all } j \in S^{k-1} \right\}. \]

Proof. Given in the Appendix
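To make the shift invariance condition concrete, here is a small worked case (an illustration, not from the original text) with the binary alphabet S = {0, 1} and k = 2, so that j ranges over single digits:

```latex
\[
  j = 0:\quad p_{00} + p_{10} = p_{00} + p_{01}
  \qquad\qquad
  j = 1:\quad p_{01} + p_{11} = p_{10} + p_{11}
\]
\[
  \text{both reduce to}\quad p_{01} = p_{10},
  \quad\text{so}\quad
  F_2 = \left\{ (p_{00}, p_{01}, p_{10}, p_{11}) : p_i \ge 0,\ \sum_i p_i \le 1,\ p_{01} = p_{10} \right\}.
\]
```

Intuitively, in a long word every passage from 0 to 1 must be matched by a passage from 1 to 0 (up to one occurrence), so the blocks 01 and 10 have the same limiting frequency.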


Now we concentrate on the elements of $F_k$ that are probability vectors. Hence, consider the set

\[ S_k = \left\{ p \in [0, 1]^{S^k} : \|p\|_1 = 1 \right\} \cap F_k \]

It is important to note that the proof of the theorem above shows that $S_k = F_k$ in the case that S is finite.

Definition 4.4. Let $x \in U_\infty$ with $\pi^{-1}_{P,\varphi}(x) = w \in X^+_{P,\varphi}$. For $T \in \mathcal{T}$ we call x T-extremely non-k-normal if $A^T_k(w) = S_k$. Also, denote the set of T-extremely non-k-normal numbers by $E^T_k$. Finally, define the set of extremely non-normal numbers by

\[ E = \bigcap_{T \in \mathcal{T}} \bigcap_{k \ge 1} E^T_k \]

where T is positive and regular.

In order to make it easier to work with the set E, we present the following proposition to express it as a countable intersection.

Proposition 4.1. $E = \bigcap_{T \in \mathcal{T}_{\mathbb{Q}}} \bigcap_{k \ge 1} E^T_k$.

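Proof. Clearly, $E \subseteq \bigcap_{T \in \mathcal{T}_{\mathbb{Q}}} \bigcap_{k \ge 1} E^T_k$. We are now left to prove the reverse containment. Let $x \in \bigcap_{T \in \mathcal{T}_{\mathbb{Q}}} \bigcap_{k \ge 1} E^T_k$ and fix $T \in \mathcal{T}$ and k ∈ N. It suffices to show that $x \in E^T_k$. In other words, considering $\mathbf{q} \in S_k$, it suffices to show that it is an accumulation point of $(P^T_k(w, m))_{m \in \mathbb{N}}$, where w is the symbolic expansion of x. Let ε > 0. Since $\overline{\mathcal{T}_{\mathbb{Q}}} = \mathcal{T}$, we can find $T' = [c'_{m,n}]_{\mathbb{N} \times \mathbb{N}}$ in $\mathcal{T}_{\mathbb{Q}}$ such that $|c_{m,n} - c'_{m,n}| < \frac{\varepsilon}{2^{n+2}}$ for all m, n ∈ N. Since $x \in E^{T'}_k$ there exists a strictly increasing sequence of natural numbers $(r_l)_{l \in \mathbb{N}}$ so that $P^{T'}_k(w, r_l) \to \mathbf{q}$ as $l \to \infty$. Then there exists L ∈ N so that for l ≥ L we have

\[ \left\| P^{T'}_k(w, r_l) - \mathbf{q} \right\|_1 < \frac{\varepsilon}{2} \]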


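Therefore, for l ≥ L we have that

\begin{align*}
\left\| P^T_k(w, r_l) - \mathbf{q} \right\|_1
&\le \left\| P^T_k(w, r_l) - P^{T'}_k(w, r_l) \right\|_1 + \left\| P^{T'}_k(w, r_l) - \mathbf{q} \right\|_1 \\
&\le \left\| \sum_{n=0}^{\infty} c_{r_l,n} P_k(w, n) - \sum_{n=0}^{\infty} c'_{r_l,n} P_k(w, n) \right\|_1 + \frac{\varepsilon}{2} \\
&\le \left\| \sum_{n=0}^{\infty} (c_{r_l,n} - c'_{r_l,n}) P_k(w, n) \right\|_1 + \frac{\varepsilon}{2} \\
&\le \sum_{n=0}^{\infty} \left| c_{r_l,n} - c'_{r_l,n} \right| \left\| P_k(w, n) \right\|_1 + \frac{\varepsilon}{2} \\
&\le \sum_{n=0}^{\infty} \left| c_{r_l,n} - c'_{r_l,n} \right| + \frac{\varepsilon}{2} \\
&\le \sum_{n=0}^{\infty} \frac{\varepsilon}{2^{n+2}} + \frac{\varepsilon}{2} = \frac{\varepsilon}{2} \sum_{n=0}^{\infty} \frac{1}{2^n} = \varepsilon
\end{align*}

which shows that $\mathbf{q} \in S_k$ is an accumulation point of $(P^T_k(w, m))_{m \in \mathbb{N}}$. This holds for any $\mathbf{q} \in S_k$, implying that $x \in E^T_k$.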

4.4 Statement of results

Theorem 4.3 (Main Result). Let k ≥ 1 be an integer and $T \in \mathcal{T}$ be a positive and regular linear transformation. Furthermore, let $P = \{P_i : i \in I\}$ be a Markov Partition for a dynamical system (M, ϕ). Suppose that the generated shift space $X^+_{P,\varphi}$ is the one-sided full shift. Then the set $E^T_k$ is residual.

Proof. See Chapter 5.

Remark 4.1. For the case where the Markov Partition is a finite set and hence produces a shift space over a finite alphabet, we could get a more general result, as shown in Madritsch and Petrykiewicz [7], by requiring $X^+_{P,\varphi}$ to satisfy the specification property instead of being the full one-sided shift. The specification property provides the required framework to show that our shift is "almost" closed under concatenation of words, a property we want to use in our proofs which is clearly true for the full shift.


Remark 4.2. Recall the definition of N-ary expansions from example 2.1. Also, remember the (c, k) Cesàro means for k ∈ N seen in Chapter 3. The main result of Hyde et al. [5] says that for a typical point in [0, 1] the Cesàro averaged version of the sequence of block frequencies in its N-ary expansion has the whole simplex of N-dimensional probability vectors as its set of accumulation points. This result is a special case of Theorem 4.3, where the symbolic representation of the system is the one discussed in example 2.1 and extremely non-normal numbers are defined only by considering Cesàro means instead of any positive and regular averaging method. Therefore, denoting by $T_k \in \mathcal{T}$ the linear transformation corresponding to the (c, k) Cesàro means and assuming our dynamical system is $([0, 1], S_N)$ together with the Markov Partition $\left\{ \left( \frac{i}{N}, \frac{i+1}{N} \right) : 0 \le i \le N-1 \right\}$, we get the required result from Theorem 4.3. In particular, this result was the motivation for this more general version.

Corollary 4.2. The set E is residual in M, implying that "a typical element of M is extremely non-normal".

Proof. Theorem 4.3 implies that for $T \in \mathcal{T}_{\mathbb{Q}}$ and k ∈ N the set $E^T_k$ is residual in M. By Proposition 4.1, E is a countable intersection of residual sets, hence residual in $U_\infty$. Finally, $U_\infty$ was shown to be a residual subset of M, implying that E is also residual in M and hence completing the proof.


Chapter 5

Typical numbers are extremely

non-normal

This Chapter will focus on the proof of Theorem 4.3 which is the main result of this

thesis. We begin by introducing some more notation and present some lemmata that

will play an important role in our aim to prove the theorem.

Definition 5.1. For a finite word $w = a_0 \ldots a_{n-1}$ of length n and a k-block $b = b_1 \ldots b_k$, both over an alphabet S, define

\[ P(w, b) = \begin{cases} \dfrac{|\{0 \le i \le n-k : a_i = b_1, \ldots, a_{i+k-1} = b_k\}|}{n-k+1} & \text{if } n \ge k \\ 0 & \text{otherwise} \end{cases} \]

to be the frequency of appearance of b in w, and set

\[ P_k(w) = (P(w, b))_{b \in S^k} \]

to be the vector of frequencies of k-blocks in w.

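Also, recall from the previous chapter the set of shift invariant probability vectors associated with frequencies of k-blocks from an alphabet S:

\[ S_k = \left\{ (p_i)_{i \in S^k} : p_i \ge 0,\ \sum_{i \in S^k} p_i = 1, \text{ and } \sum_{i \in S} p_{ij} = \sum_{i \in S} p_{ji} \text{ for all } j \in S^{k-1} \right\}. \]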


Subsequently, we have to treat the two cases of finite and infinite Markov Partitions separately. After we establish the necessary lemmata for each case, we present a joint proof in the final section of this chapter.

5.1 On infinite Markov Partitions

Suppose that we have a dynamical system (M, ϕ) together with a countable Markov Partition $P = \{P_0, P_1, \ldots\}$. In this case the induced symbolic representation of M has S = N as an alphabet. This is an infinite alphabet, but for elements of $S_k$ we would like to be able to "weight" only the coordinates of blocks that use symbols from a finite subset of our alphabet. For this reason we define for k, N ≥ 1 the set

\[ S_{k,N} = \left\{ (p_i)_{i \in \mathbb{N}^k} : p_i \ge 0,\ \sum_{i \in \mathbb{N}^k} p_i = 1,\ \sum_{i \in \mathbb{N}} p_{ij} = \sum_{i \in \mathbb{N}} p_{ji} \text{ for all } j \in \mathbb{N}^{k-1}, \text{ and } p_i = 0 \text{ for } i \in \mathbb{N}^k \setminus \{0, \ldots, N-1\}^k \right\} \]

of shift invariant probability vectors where only the first N digits are weighted.

Now define the union of probability vectors over finite alphabets

\[ S^*_k = \bigcup_{N \ge 1} S_{k,N} \]

We proceed with the following observations. The set $S^*_k$ is clearly a dense subset of the space $(S_k, \|\cdot\|_1)$. Moreover, $S^*_k$ is separable (a countable dense subset being its subset of elements with rational coordinates). So let us name a sequence $(\mathbf{q}_{k,m})_m$ in $S^*_k$ that is dense in $(S_k, \|\cdot\|_1)$. Throughout this section we fix integers m ≥ 0 and k ≥ 1 and set $\mathbf{q} = \mathbf{q}_{k,m}$. Then, since $\mathbf{q} \in S^*_k = \bigcup_{N \ge 1} S_{k,N}$, there exists N ≥ 1 so that $\mathbf{q} \in S_{k,N}$ and consequently $q_i = 0$ for $i \in \mathbb{N}^k \setminus \{0, \ldots, N-1\}^k$.

We would like to find a finite word with k-block frequencies really close to the coordinates of $\mathbf{q}$ in order to use it later in the main part of the proof. Therefore, we define for n ≥ 1 the set

\[ Z_n := Z_n(\mathbf{q}, N, k) = \left\{ w \in \bigcup_{l \ge knN^k} \{0, \ldots, N-1\}^l : \|P_k(w) - \mathbf{q}\|_1 \le \frac{1}{n} \right\} \]

and crucially prove that it is non-empty.


Lemma 5.1. For all n ≥ 1, k, N ∈ N and $\mathbf{q} \in S^*_k$ we have $Z_n(\mathbf{q}, N, k) \ne \emptyset$.

Proof. This is lemma 2.4 in Olsen [9].

We will make use of this first lemma to create an infinite word with specific block frequencies by concatenating arbitrarily many copies of a finite word from $Z_n(\mathbf{q}, N, k)$.

Lemma 5.2. Let k, n, N, t be positive integers and $\mathbf{q} \in S_{k,N}$. In addition, consider the word $w = w_1 \ldots w_t \in \mathbb{N}^t$ and let $M := \max_{1 \le i \le t} \{w_i\}$ be the maximal digit of w. Then, for any finite word $\gamma \in Z_n(\mathbf{q}, N, k)$ and any

\[ l \ge L := t + |\gamma| \max\left\{ nN^k,\ \frac{t}{k}\left[ 1 + \frac{(M+1)^k}{N^k} \right] \right\} \]

we have that

\[ \|P_k(w\gamma^*, l) - \mathbf{q}\|_1 \le \frac{6}{n} \tag{5.1} \]

Proof. Firstly, set s := |γ| to be the length of γ and $\sigma := w\gamma^*|_l$ the starting l-block of $w\gamma^* = w\gamma\gamma\gamma\ldots$. By the definition of the block σ,

\[ \sigma = \overbrace{\underbrace{w}_{t}\ \underbrace{\underbrace{\gamma}_{s}\ \underbrace{\gamma}_{s}\ \ldots\ \underbrace{\gamma}_{s}}_{q \text{ times}}\ \underbrace{\gamma_1 \ldots \gamma_r}_{r}}^{l} \]

it follows that there exist natural numbers q and r so that l = t + qs + r with 0 ≤ r < s. Now a block $i \in \mathbb{N}^k$ that occurs in σ satisfies the following property:

\[ \frac{qs}{l} P(\gamma, i) \le P(\sigma, i) \le \frac{qs}{l} P(\gamma, i) + \frac{t + q(k-1) + r}{l} \tag{5.2} \]

This can be easily proved by considering where the block i could occur in σ. It is either inside γ, or in w, or in between them, or at the end.

In our effort to prove the inequality (5.1) we will focus our concern on the occurrences of k-blocks inside γ and use the triangle inequality as follows:

\[ \|P_k(w\gamma^*, l) - \mathbf{q}\|_1 = \|P_k(\sigma) - \mathbf{q}\|_1 \le \left\| P_k(\sigma) - \frac{qs}{l} P_k(\gamma) \right\|_1 + \left\| \frac{qs}{l} P_k(\gamma) - \mathbf{q} \right\|_1 \]


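For the first summand we have that

\begin{align*}
\left\| P_k(\sigma) - \frac{qs}{l} P_k(\gamma) \right\|_1
&= \sum_{i \in \mathbb{N}^k} \left| P(\sigma, i) - \frac{qs}{l} P(\gamma, i) \right| \\
&= \sum_{i \in \{0, \ldots, N-1\}^k} \left| P(\sigma, i) - \frac{qs}{l} P(\gamma, i) \right| + \sum_{i \in \mathbb{N}^k \setminus \{0, \ldots, N-1\}^k} \left| P(\sigma, i) - \frac{qs}{l} P(\gamma, i) \right| \\
&\le \sum_{i \in \{0, \ldots, N-1\}^k} \left( \frac{t + q(k-1) + r}{l} \right) + \sum_{\substack{i \in \mathbb{N}^k \setminus \{0, \ldots, N-1\}^k \\ P(w, i) \ne 0}} P(\sigma, i) \tag{5.3} \\
&\le \sum_{i \in \{0, \ldots, N-1\}^k} \left( \frac{t + qk + s}{l} \right) + \sum_{i \in \{0, \ldots, M\}^k} \frac{t}{l} \tag{5.4} \\
&= N^k \left( \frac{t + qk + s}{l} \right) + (M+1)^k \frac{t}{l} \\
&= \left[ N^k + (M+1)^k \right] \frac{t}{l} + N^k \left( \frac{qk}{l} \right) + N^k \left( \frac{s}{l} \right) \\
&\le \frac{t N^k \left[ 1 + (M+1)^k / N^k \right]}{l} + N^k \left( \frac{qk}{qknN^k} \right) + N^k \left( \frac{s}{snN^k} \right) \tag{5.5} \\
&= \frac{t N^k \left[ 1 + (M+1)^k / N^k \right]}{l} + \frac{2}{n} \\
&\le \frac{1}{n} + \frac{2}{n} = \frac{3}{n} \tag{5.6}
\end{align*}

(5.3): Using property (5.2) and the fact that γ is a word over $\{0, \ldots, N-1\}$.

(5.4): M is the maximal digit of w, so for $i \in \mathbb{N}^k \setminus \{0, \ldots, M\}^k$, P(w, i) = 0. Also, for $i \in \mathbb{N}^k \setminus \{0, \ldots, N-1\}^k$, $P(\sigma, i) \le \frac{t}{l}$.

(5.5): By the assumptions for l and γ we have that $l \ge qs \ge qknN^k$ and $l \ge snN^k$.

(5.6): By assumption $l \ge |\gamma| \frac{t}{k} \left[ 1 + (M+1)^k / N^k \right] \ge (nkN^k) \frac{t}{k} \left[ 1 + (M+1)^k / N^k \right]$.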


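Looking at the second summand now we have:

\begin{align*}
\left\| \frac{qs}{l} P_k(\gamma) - \mathbf{q} \right\|_1
&\le \left\| \frac{qs}{l} P_k(\gamma) - P_k(\gamma) \right\|_1 + \|P_k(\gamma) - \mathbf{q}\|_1 \\
&\le \left| \frac{qs}{l} - 1 \right| \|P_k(\gamma)\|_1 + \frac{1}{n} \tag{5.7} \\
&\le \frac{1}{n} + \left| \frac{qs - l}{l} \right| \tag{5.8} \\
&= \frac{1}{n} + \frac{t + r}{l} \le \frac{1}{n} + \frac{t}{l} + \frac{s}{l} \tag{5.9} \\
&\le \frac{1}{n} + \frac{2}{nN^k} \le \frac{3}{n} \tag{5.10}
\end{align*}

(5.7): Since $\gamma \in Z_n(\mathbf{q}, N, k)$, $\|P_k(\gamma) - \mathbf{q}\|_1 \le \frac{1}{n}$.

(5.8): By definition we have that $\|P_k(\gamma)\|_1 = 1$.

(5.9): By construction of σ.

(5.10): Since $l \ge snN^k$ and $l \ge s \frac{t}{k} \ge nkN^k \frac{t}{k} \ge tnN^k$ by assumption.

Combining the two established inequalities we get the result.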

5.2 On finite Markov Partitions

In this section, we focus on the simpler case of finite Markov Partitions. Suppose that we have a dynamical system (M, ϕ) together with a finite Markov Partition $P = \{P_0, \ldots, P_{N-1}\}$. This induces a symbolic representation of M over a finite alphabet S = {0, ..., N − 1}. In this case, denote the set of shift invariant probability vectors associated with frequencies of k-blocks from the alphabet {0, ..., N − 1} as

\[ S_k = \left\{ (p_i)_{i \in S^k} : p_i \ge 0,\ \sum_{i \in S^k} p_i = 1, \text{ and } \sum_{i \in S} p_{ij} = \sum_{i \in S} p_{ji} \text{ for all } j \in S^{k-1} \right\} \]

Then, for natural numbers k, n ≥ 1 and $\mathbf{q} \in S_k$, define the set of finite blocks with block frequencies "relatively" close to the coordinates of $\mathbf{q}$ by

\[ Z_n := Z_n(\mathbf{q}, k) = \left\{ w \in \bigcup_{l \ge knN^k} \{0, \ldots, N-1\}^l : \|P_k(w) - \mathbf{q}\|_1 \le \frac{1}{n} \right\} \]

and crucially prove that it is non-empty.


Lemma 5.3. For all n ≥ 1, k ∈ N and $\mathbf{q} \in S_k$ we have $Z_n(\mathbf{q}, k) \ne \emptyset$.

Proof. See Lemma 3.2 in Madritsch and Petrykiewicz [7].

We will make use of this lemma to create an infinite word with specific block frequencies by concatenating arbitrarily many copies of a finite word in $Z_n(\mathbf{q}, k)$.

Lemma 5.4. Let k, n, t be positive integers and $\mathbf{q} \in S_k$. In addition, consider the word $w = w_1 \ldots w_t \in \{0, \ldots, N-1\}^t$ and let $M := \max_{1 \le i \le t} \{w_i\}$. Then, for any $\gamma \in Z_n(\mathbf{q}, k)$ and any

\[ l \ge L := nN^k \max\{t, |\gamma|\} \]

we have that

\[ \|P_k(w\gamma^*, l) - \mathbf{q}\|_1 \le \frac{6}{n} \]

Proof. Firstly, set s := |γ| and $\sigma := w\gamma^*|_l$. By construction of the word σ it follows that there exist natural numbers q and r so that l = t + qs + r with 0 ≤ r < s.

\[ \sigma = \overbrace{\underbrace{w}_{t}\ \underbrace{\underbrace{\gamma}_{s}\ \underbrace{\gamma}_{s}\ \ldots\ \underbrace{\gamma}_{s}}_{q \text{ times}}\ \underbrace{\gamma_1 \ldots \gamma_r}_{r}}^{l} \]

Now a block $i \in \{0, \ldots, N-1\}^k$ that occurs in σ satisfies the following property:

\[ \frac{qs}{l} P(\gamma, i) \le P(\sigma, i) \le \frac{qs}{l} P(\gamma, i) + \frac{t + q(k-1) + r}{l} \tag{5.11} \]

This can be easily proved by considering where the block i could occur in σ. It is either inside γ, or in w, or in between them, or at the end.

In our effort to prove the inequality we will focus our concern on the occurrences of k-blocks inside γ.

\[ \|P_k(w\gamma^*, l) - \mathbf{q}\|_1 = \|P_k(\sigma) - \mathbf{q}\|_1 \le \left\| P_k(\sigma) - \frac{qs}{l} P_k(\gamma) \right\|_1 + \left\| \frac{qs}{l} P_k(\gamma) - \mathbf{q} \right\|_1 . \]


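For the first summand we have that

\begin{align*}
\left\| P_k(\sigma) - \frac{qs}{l} P_k(\gamma) \right\|_1
&= \sum_{i \in \{0, \ldots, N-1\}^k} \left| P(\sigma, i) - \frac{qs}{l} P(\gamma, i) \right| \\
&\le \sum_{i \in \{0, \ldots, N-1\}^k} \frac{t + q(k-1) + r}{l} \tag{5.12} \\
&\le \sum_{i \in \{0, \ldots, N-1\}^k} \left( \frac{t + qk + s}{l} \right) \\
&= N^k \left( \frac{t + qk + s}{l} \right) \le N^k \frac{t}{l} + \frac{qkN^k}{l} + N^k \frac{s}{l} \\
&\le N^k \left( \frac{t}{nN^k t} \right) + \frac{qkN^k}{qnkN^k} + N^k \left( \frac{s}{nN^k s} \right) \tag{5.13} \\
&= \frac{3}{n}
\end{align*}

(5.12): By property (5.11).

(5.13): Since $l \ge nN^k \max\{t, |\gamma|\}$ and also $l \ge qs \ge qnkN^k$.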

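Looking at the second summand now we have:

\begin{align*}
\left\| \frac{qs}{l} P_k(\gamma) - \mathbf{q} \right\|_1
&\le \left\| \frac{qs}{l} P_k(\gamma) - P_k(\gamma) \right\|_1 + \|P_k(\gamma) - \mathbf{q}\|_1 \\
&\le \left| \frac{qs}{l} - 1 \right| \|P_k(\gamma)\|_1 + \frac{1}{n} \tag{5.14} \\
&\le \frac{1}{n} + \left| \frac{qs - l}{l} \right| \tag{5.15} \\
&= \frac{1}{n} + \frac{t + r}{l} \le \frac{1}{n} + \frac{t}{l} + \frac{s}{l} \tag{5.16} \\
&\le \frac{1}{n} + \frac{2}{nN^k} \le \frac{3}{n} \tag{5.17}
\end{align*}

(5.14): Since $\gamma \in Z_n(\mathbf{q}, k)$, $\|P_k(\gamma) - \mathbf{q}\|_1 \le \frac{1}{n}$.

(5.15): By definition we have that $\|P_k(\gamma)\|_1 = 1$.

(5.16): By construction of σ.

(5.17): Since $l \ge nN^k \max\{t, |\gamma|\}$.

Combining the two established inequalities we get the result.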
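The lemma can be sanity-checked on a small instance. The sketch below is purely illustrative and every concrete choice in it is an assumption made for the example: the binary alphabet (N = 2), blocks of length k = 2, the uniform vector q, the word γ = (0011)^12 and the prefix w = 1111. It verifies that γ meets the membership conditions of $Z_n$ for n = 6 and then measures the distance at the threshold length L.

```python
from collections import Counter
from itertools import product


def block_vector(word, k, alphabet_size):
    """P_k(word): frequencies of all k-blocks over {0, ..., N-1} in a finite word."""
    windows = len(word) - k + 1
    counts = Counter(tuple(word[i:i + k]) for i in range(windows))
    return {b: counts[b] / windows for b in product(range(alphabet_size), repeat=k)}


def l1_distance(p, q):
    """The 1-norm distance between two vectors indexed by the same blocks."""
    return sum(abs(p[b] - q[b]) for b in q)


N, k, n = 2, 2, 6
q = {b: 0.25 for b in product(range(N), repeat=k)}  # uniform, hence shift invariant
gamma = [0, 0, 1, 1] * 12                            # |gamma| = 48 = k * n * N**k
w = [1, 1, 1, 1]                                     # arbitrary prefix, t = 4

# gamma satisfies both requirements for membership in Z_n(q, k).
gamma_is_admissible = (
    len(gamma) >= k * n * N**k
    and l1_distance(block_vector(gamma, k, N), q) <= 1 / n
)

L = n * N**k * max(len(w), len(gamma))               # threshold from Lemma 5.4
copies = L // len(gamma) + 1
sigma = (w + gamma * copies)[:L]                     # first L digits of w gamma gamma ...
distance = l1_distance(block_vector(sigma, k, N), q)
```

In this instance the measured distance is far below the bound 6/n, which is expected: the lemma is a worst-case estimate, while here the prefix contributes only a handful of the roughly L blocks.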


5.3 Proof of the main result

Theorem (Main Result). Let k ≥ 1 be an integer and $T \in \mathcal{T}$ be a positive and regular linear transformation. Furthermore, let $P = \{P_i : i \in I\}$ be a Markov Partition for a dynamical system (M, ϕ). Suppose that the generated shift space $X^+_{P,\varphi}$ is the one-sided full shift. Then the set $E^T_k$ is residual.

Our strategy is to construct a set E which is easier to work with, and to show that it is both residual and a subset of $E^T_k$. Our construction follows ideas from the proof of theorem 1.1 in Hyde et al. [5].

We begin by recursively defining the functions $\psi_m$ for m ≥ 1 by $\psi_1(x) = 2^x$ and $\psi_m(x) = \psi_1(\psi_{m-1}(x))$ for m ≥ 2. Next we choose a countable, dense subset of $S^*_k$ (in the finite Markov Partition case we ignore the asterisk ∗ and work with $S_k$). Set $D = S^*_k \cap \mathbb{Q}^{(S^k)}$ to be that set. Now we may concentrate on the probability vectors inside D.

We say that a sequence $(x_n)_{n \in \mathbb{N}}$ with terms in $\mathbb{R}^{(S^k)}$ has property P if for all $\mathbf{q} \in D$, m, i ∈ N and ε > 0 there exists a natural number j such that:

1. $j \ge i$

2. $\frac{j}{2^j} < \varepsilon$

3. if $j < n < \psi_m(2^j)$ then $\|x_n - \mathbf{q}\|_1 < \varepsilon$

We now define the set E, which consists of all points in $U_\infty$ whose symbolic expansion has a sequence of frequency vectors satisfying property P, i.e.

\[ E = \{\pi_{P,\varphi}(w) : (P_k(w, n))_{n \in \mathbb{N}} \text{ has property P}\} \cap U_\infty \]

Now our objective is to show the following statements. Firstly, we show that our set E is residual. Then, we proceed to show that if $(P_k(w, n))_{n \in \mathbb{N}}$ has property P, then $(P^T_k(w, n))_{n \in \mathbb{N}}$ also has property P. Finally, we show that E is a subset of $E^T_k$ for k ∈ N and $T \in \mathcal{T}$ a positive and regular linear transformation.


Lemma 5.5. The set $E$ is a residual subset of $U_\infty$.

Proof. To prove the lemma we again make use of the properties of residual sets. We construct a countable family of open and dense sets in $U_\infty$ and show that $E$ can be expressed as the intersection of this family, hence showing that $E$ is residual. To that end, fix $\alpha, m, i \in \mathbb{N}$ and $\mathbf{q} \in D$ and define property $P_{\alpha,m,\mathbf{q},i}$ for a sequence $(x_n)_{n\in\mathbb{N}}$ with terms in $\mathbb{R}^{S^k}$ in the following way.

The sequence satisfies property $P_{\alpha,m,\mathbf{q},i}$ if for all $\varepsilon > \frac{1}{\alpha}$ there exists a natural number $j$ such that:

1. $j \ge i$

2. $j/2^j < \varepsilon$

3. if $j < n < \psi_m(2^j)$ then $\|x_n - \mathbf{q}\|_1 < \varepsilon$

Analogously to the discussion above, we define the set $E_{\alpha,m,\mathbf{q},i}$, which consists of all points in $U_\infty$ whose symbolic expansions have frequency vectors satisfying property $P_{\alpha,m,\mathbf{q},i}$, i.e.

\[
E_{\alpha,m,\mathbf{q},i} = \{\pi_{\mathcal{P},\varphi}(w) : (P_k(w,n))_{n\in\mathbb{N}} \text{ has property } P_{\alpha,m,\mathbf{q},i}\} \cap U_\infty .
\]

It easily follows from the definitions that

\[
E = \bigcap_{\alpha\in\mathbb{N}}\ \bigcap_{m\in\mathbb{N}}\ \bigcap_{\mathbf{q}\in D}\ \bigcap_{i\in\mathbb{N}} E_{\alpha,m,\mathbf{q},i}
\]

and so it suffices to show that each set $E_{\alpha,m,\mathbf{q},i}$ is open and dense. Fix $\alpha, m, i \in \mathbb{N}$ and $\mathbf{q} \in D$ and proceed with the following claims.

Claim. $E_{\alpha,m,\mathbf{q},i}$ is open.

Proof. Let $x \in E_{\alpha,m,\mathbf{q},i}$ and let $w = w_0w_1\dots$ be the symbolic expansion of $x$, so that $\pi_{\mathcal{P},\varphi}(w) = x$. We want to find a ball of positive radius in $M$ which is a subset of $E_{\alpha,m,\mathbf{q},i}$. By construction, there exists a natural number $j$ satisfying the three conditions defining property $P_{\alpha,m,\mathbf{q},i}$, i.e. $j \ge i$, $j/2^j \le \frac{1}{\alpha}$, and if $j < n < \psi_m(2^j)$ then $\|P_k(w,n)-\mathbf{q}\|_1 \le \frac{1}{\alpha}$. To simplify notation set $t := \psi_m(2^j)$ and consider $D_t(w) = \bigcap_{k=0}^{t}\varphi^{-k}(P_{w_k})$. Recalling that $\varphi$ is a continuous function and each set $P \in \mathcal{P}$ is open by assumption, we get that the cylinder $D_t(w)$ is open as a finite intersection of open sets, and of course it contains $x$. Therefore, we can find a positive distance $\delta$ so that the ball in $U_\infty$ centred at $x$ of radius $\delta$ is a subset of $D_t(w)$.

We now show that this ball $B_M(x,\delta)$ is a subset of $E_{\alpha,m,\mathbf{q},i}$. For a point $y \in B_M(x,\delta) \subseteq D_t(w)$, the first $t$ digits of the symbolic expansion of $y$ are the same as those of $x$. As a result, the same $j$ as above shows that the sequence $\big(P_k(\pi^{-1}_{\mathcal{P},\varphi}(y),n)\big)_{n\in\mathbb{N}}$ has property $P_{\alpha,m,\mathbf{q},i}$, which implies that $y \in E_{\alpha,m,\mathbf{q},i}$. The choice of $y$ was arbitrary and so $B_M(x,\delta) \subseteq E_{\alpha,m,\mathbf{q},i}$.
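As an illustration of why all points of a cylinder share their leading digits, consider the doubling map $\varphi(x) = 2x \bmod 1$ on the unit interval with the partition into $(0,\tfrac12)$ and $(\tfrac12,1)$. This concrete system is an assumption chosen for the sketch, not one fixed by the text, and the function names `expansion` and `cylinder` are hypothetical. In this setting a cylinder is an open dyadic interval, and exact rational arithmetic shows that two points of the same interval have the same initial symbols.

```python
from fractions import Fraction


def expansion(x, length):
    """First `length` symbols of x under the doubling map phi(x) = 2x mod 1.

    Symbol 0 means the orbit point lies in (0, 1/2), symbol 1 means (1/2, 1).
    Points whose orbit hits 0 or 1/2 lie on the partition boundary and are rejected.
    """
    symbols = []
    for _ in range(length):
        if x == 0 or x == Fraction(1, 2):
            raise ValueError("orbit meets the boundary of the partition")
        symbols.append(0 if x < Fraction(1, 2) else 1)
        x = (2 * x) % 1
    return symbols


def cylinder(word):
    """Endpoints (a, b) of the open interval of points whose expansion starts with `word`."""
    left = sum(Fraction(d, 2 ** (k + 1)) for k, d in enumerate(word))
    return left, left + Fraction(1, 2 ** len(word))


x = Fraction(1, 3)
word = expansion(x, 4)          # leading symbols of x
a, b = cylinder(word)           # open dyadic interval containing x
y = Fraction(11, 32)            # another point of the same interval
same_digits = expansion(y, 4) == word
```

Any ball around `x` that fits inside the interval `(a, b)` therefore consists of points with the same first four symbols, which is the mechanism used in the proof above.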

Claim. $E_{\alpha,m,\mathbf{q},i}$ is dense.

Proof. Let $x \in U_\infty$ and $\delta > 0$. It suffices to find an element in the intersection $B_M(x,\delta) \cap E_{\alpha,m,\mathbf{q},i}$. Again, denote by $w \in X^+_{\mathcal{P},\varphi}$ the symbolic expansion of $x$ and notice that $x \in D_t(w)$ for $t \ge 1$ and also $\operatorname{diam} D_t(w) \to 0$ as $t \to \infty$. Hence, there exists a positive natural number $t'$ so that $D_{t'}(w) \subset B_M(x,\delta)$.

Now set $\sigma := w|_{t'}$ to be the block of the first $t'$ digits of the symbolic expansion of $x$. Using Lemma 5.1, we may choose a finite word $\gamma \in Z_{6\alpha}(\mathbf{q},N,k)$, so that $\|P_k(\gamma)-\mathbf{q}\|_1 \le \frac{1}{6\alpha}$. With the block $\sigma$ of length $t'$ and the finite word $\gamma \in Z_{6\alpha}(\mathbf{q},N,k)$ we can immediately use Lemma 5.2.

Let $\varepsilon \ge \frac{1}{\alpha}$ and let $L$ be as in Lemma 5.2. Then we can choose a positive natural number $j$ big enough so that $j \ge \max\{i,L\}$ and $j/2^j < \varepsilon$. We will show that any point in the non-empty open cylinder $D_{\psi_m(2^j)}(\sigma\gamma^*)$ lies in the intersection $B_M(x,\delta) \cap E_{\alpha,m,\mathbf{q},i}$. Firstly, since $\sigma$ agrees with $w$ in the first $t'$ digits and by assumption $\psi_m(2^j) \ge j \ge L \ge t'$, we have that $D_{\psi_m(2^j)}(\sigma\gamma^*) \subseteq D_{t'}(\sigma\gamma^*) = D_{t'}(w) \subset B_M(x,\delta)$. So we are left to prove that $D_{\psi_m(2^j)}(\sigma\gamma^*) \subset E_{\alpha,m,\mathbf{q},i}$.

Let $y \in D_{\psi_m(2^j)}(\sigma\gamma^*)$ and recall that we chose $j$ so that $j \ge i$ and $j/2^j < \varepsilon$. We only need to show that for $j < n < \psi_m(2^j)$ we have $\|P_k(\pi^{-1}_{\mathcal{P},\varphi}(y),n)-\mathbf{q}\|_1 < \varepsilon$. To do that, we use the fact that the block frequencies of the symbolic representation of $y$ in the first $n$ digits, for $n < \psi_m(2^j)$, are the same as the block frequencies of $\sigma\gamma^*$, since $y \in D_{\psi_m(2^j)}(\sigma\gamma^*)$. Therefore,

\[
\|P_k(\pi^{-1}_{\mathcal{P},\varphi}(y),n)-\mathbf{q}\|_1 = \|P_k(\sigma\gamma^*,n)-\mathbf{q}\|_1
\]

and since $n > j \ge L$, Lemma 5.2 implies that

\[
\|P_k(\pi^{-1}_{\mathcal{P},\varphi}(y),n)-\mathbf{q}\|_1 = \|P_k(\sigma\gamma^*,n)-\mathbf{q}\|_1 \le \frac{6}{6\alpha} = \frac{1}{\alpha} \le \varepsilon .
\]


Lemma 5.6. Let $w \in X_{\mathcal{P},\varphi}$ and let $T \in \mathcal{T}$ be a positive and regular linear transformation. If $(P_k(w,r))_{r\in\mathbb{N}}$ has property P, then $(P^T_k(w,r))_{r\in\mathbb{N}}$ also has property P.

Proof. Let $w \in X_{\mathcal{P},\varphi}$ be such that $(P_k(w,r))_{r\in\mathbb{N}}$ has property P and fix $\varepsilon > 0$, $\mathbf{q} \in D$, $m, i \in \mathbb{N}$ and $T = [c_{m,n}]_{\mathbb{N}\times\mathbb{N}}$ a positive and regular linear transformation in $\mathcal{T}$. Let $G > 0$ be as in condition 2 of the definition of $\mathcal{T}$. Further, since $\gamma_r$, the $r$-th row sum of $T$, tends to 1 as $r \to \infty$, we can find $R \in \mathbb{N}$ such that $|\gamma_r - 1| < \varepsilon/3$ for $r \ge R$. Finally, the sequence $(P_k(w,r))_{r\in\mathbb{N}}$ has property P, which means that we can find a $j \in \mathbb{N}$ satisfying the following three properties:

1. $j \ge \max\{i, R, 2\}$

2. $j/2^j \le \frac{\varepsilon}{6G}$

3. for $n \in \mathbb{N}$ with $j < n < \psi_{m+1}(2^j)$ we have $\|P_k(w,n)-\mathbf{q}\|_1 \le \frac{\varepsilon}{3(1+\varepsilon)}$

We set $j' = 2^j$ and check that the sequence $(P^T_k(w,r))_{r\in\mathbb{N}}$ satisfies the conditions of property P.

1. $j' = 2^j > j \ge i$

2. $j' > j \ge 2$ and so $j'/2^{j'} \le j/2^j \le \varepsilon/3$

3. Finally, for $r \in \mathbb{N}$ with $j' < r < \psi_m(2^{j'})$, or equivalently $2^j < r < \psi_{m+1}(2^j)$, we have that:

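\[
\begin{aligned}
\|P^T_k(w,r)-\mathbf{q}\|_1
 &= \Big\|\sum_{n\in\mathbb{N}} c_{r,n}P_k(w,n)-\mathbf{q}\Big\|_1\\
 &= \Big\|\sum_{n\in\mathbb{N}} c_{r,n}\big(P_k(w,n)-\mathbf{q}\big)+\mathbf{q}\Big(\sum_{n\in\mathbb{N}} c_{r,n}-1\Big)\Big\|_1\\
 &\le \sum_{n\in\mathbb{N}} |c_{r,n}|\,\|P_k(w,n)-\mathbf{q}\|_1+\|\mathbf{q}\|_1\,|1-\gamma_r| .
\end{aligned}
\]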


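Recalling that $\|\mathbf{q}\|_1 = 1$ and $r > 2^j > j \ge R$, we get:

\[
\begin{aligned}
\|P^T_k(w,r)-\mathbf{q}\|_1
 \le{}& \sum_{n\le j} |c_{r,n}|\,\|P_k(w,n)-\mathbf{q}\|_1\\
 &+ \sum_{j<n<\psi_{m+1}(2^j)} |c_{r,n}|\,\|P_k(w,n)-\mathbf{q}\|_1\\
 &+ \sum_{n\ge\psi_{m+1}(2^j)} |c_{r,n}|\,\|P_k(w,n)-\mathbf{q}\|_1
 + \frac{\varepsilon}{3} .
\end{aligned}
\]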

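Using the fact that $T$ is a lower triangular matrix, so that $c_{r,n} = 0$ for $n \ge \psi_{m+1}(2^j) > r$, we get:

\[
\begin{aligned}
\|P^T_k(w,r)-\mathbf{q}\|_1
 \le{}& \sum_{n\le j} \sup_{n\in\mathbb{N}} c_{r,n}\,\big(\|P_k(w,n)\|_1+\|\mathbf{q}\|_1\big)
 + \sum_{j<n<\psi_{m+1}(2^j)} |c_{r,n}|\cdot\frac{\varepsilon}{3(1+\varepsilon)}\\
 &+ \sum_{n\ge\psi_{m+1}(2^j)} 0\cdot\|P_k(w,n)-\mathbf{q}\|_1
 + \frac{\varepsilon}{3}\\
 \le{}& 2j\Big(\sup_{n\in\mathbb{N}} c_{r,n}\Big)
 + \frac{\varepsilon}{3(1+\varepsilon)}\sum_{j<n<\psi_{m+1}(2^j)} |c_{r,n}|
 + \frac{\varepsilon}{3}\\
 \le{}& 2j\Big(\frac{r}{2^j}\Big)\sup_{n\in\mathbb{N}} c_{r,n}
 + \frac{\varepsilon}{3(1+\varepsilon)}\cdot\gamma_r
 + \frac{\varepsilon}{3}\\
 \le{}& 2\Big(\frac{j}{2^j}\Big)\, r\sup_{n\in\mathbb{N}} c_{r,n}
 + \frac{\varepsilon}{3(1+\varepsilon)}(1+\varepsilon)
 + \frac{\varepsilon}{3}\\
 \le{}& 2\Big(\frac{\varepsilon}{6G}\Big)G+\frac{\varepsilon}{3}+\frac{\varepsilon}{3} = \varepsilon .
\end{aligned}
\]

The choices of $m$, $i$, $\mathbf{q}$ and $\varepsilon$ were arbitrary, so we conclude that the sequence $(P^T_k(w,r))_{r\in\mathbb{N}}$ satisfies property P.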
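The equivalence used in condition 3 of the proof above, between the ranges $j' < r < \psi_m(2^{j'})$ and $2^j < r < \psi_{m+1}(2^j)$, is stated without derivation. A short sketch of why it holds, reading $\psi_1(x)$ as $2^x$ and $j'$ as $2^j$ and using only the recursive definition of $\psi_m$:

```latex
% Sketch: \psi_m \circ \psi_1 = \psi_{m+1} for all m >= 1, by induction on m.
\begin{align*}
\psi_1(\psi_1(x)) &= \psi_2(x), \\
\psi_m(\psi_1(x)) &= \psi_1\bigl(\psi_{m-1}(\psi_1(x))\bigr)
                   = \psi_1(\psi_m(x)) = \psi_{m+1}(x) \qquad (m \ge 2), \\
\psi_m\bigl(2^{j'}\bigr) &= \psi_m\bigl(2^{2^j}\bigr)
                   = \psi_m\bigl(\psi_1(2^j)\bigr) = \psi_{m+1}(2^j).
\end{align*}
```

The second line uses the induction hypothesis $\psi_{m-1}\circ\psi_1 = \psi_m$, and the third applies the resulting identity at $x = 2^j$.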


For our final step we show that the set

\[
E = \{\pi_{\mathcal{P},\varphi}(w) : (P_k(w,n))_{n\in\mathbb{N}} \text{ has property P}\} \cap U_\infty
\]

is a subset of

\[
E^T_k = \{\pi_{\mathcal{P},\varphi}(w) : A^T_k(w) = S_k\} \cap U_\infty ,
\]

the set of points in $U_\infty$ whose symbolic expansion has an averaged sequence of frequency vectors with the full simplex of shift invariant probability vectors as its set of accumulation points.

Lemma 5.7. For all positive and regular linear transformations $T \in \mathcal{T}$ and $k \in \mathbb{N}$ we have $E \subseteq E^T_k$.

Proof. Let $T \in \mathcal{T}$, $x \in E$ and suppose $w$ is the symbolic expansion of $x$, so that $x = \pi_{\mathcal{P},\varphi}(w)$. By assumption, $(P_k(w,n))_{n\in\mathbb{N}}$ has property P and so, by the previous lemma, the averaged version of this sequence, $(P^T_k(w,n))_{n\in\mathbb{N}}$, has property P. To show the containment it suffices to show that each $\mathbf{p} \in S_k$ is an accumulation point of the averaged sequence. So fix $\mathbf{p} \in S_k$ and $\eta \in \mathbb{N}$. Since $\overline{D} = S_k$, there exists a probability vector $\mathbf{q} \in D$ such that $\|\mathbf{p}-\mathbf{q}\|_1 < \frac{1}{\eta}$. Now we make use of the fact that $(P^T_k(w,n))_{n\in\mathbb{N}}$ has property P, so that for each $m \in \mathbb{N}$ we may choose a $j \in \mathbb{N}$ with:

1. $j \ge \eta$

2. $j/2^j < \frac{1}{\eta}$

3. for $n \in \mathbb{N}$ with $j < n < \psi_m(2^j)$ we have $\|P^T_k(w,n)-\mathbf{q}\|_1 \le \frac{1}{\eta}$

Fix a natural number $n_\eta$ in the interval $j < n_\eta < \psi_m(2^j)$. Then

\[
\|P^T_k(w,n_\eta)-\mathbf{p}\|_1 \le \|P^T_k(w,n_\eta)-\mathbf{q}\|_1+\|\mathbf{q}-\mathbf{p}\|_1 \le \frac{1}{\eta}+\frac{1}{\eta} = \frac{2}{\eta} .
\]

Since $n_\eta \ge \eta$, we may extract a strictly increasing subsequence $(n_{\eta_u})_{u\in\mathbb{N}}$ such that $P^T_k(w,n_{\eta_u}) \to \mathbf{p}$ as $u \to \infty$. Therefore, $\mathbf{p}$ is an accumulation point of $(P^T_k(w,n))_{n\in\mathbb{N}}$. This holds for all $\mathbf{p} \in S_k$, so $x \in E^T_k$.

Proof of main result. Lemma 5.5 shows that the set $E$ is residual in $U_\infty$. By Lemma 5.7, $E$ is a subset of $E^T_k$, so $E^T_k$ is also residual in $U_\infty$ and thus residual in $M$, since $M \setminus U_\infty$ is meagre. This completes the proof.


References

[1] Alabdulmohsin, I. M.: 2016, `A new summability method for divergent series'. arXiv:1604.07015 [math].

[2] Baire, R.: 1899, Sur les fonctions de variables réelles. Bernardoni de C. Rebeschini. Google-Books-ID: cS4LAAAAYAAJ.

[3] Borel, É.: 1909, `Les probabilités dénombrables et leurs applications arithmétiques'. Rendiconti del Circolo Matematico di Palermo (1884-1940) 27(1), 247–271.

[4] Hardy, G. H.: 2000, Divergent Series. American Mathematical Society.

[5] Hyde, J. T., V. Laschos, L. O. R. Olsen, I. Petrykiewicz, and A. Shaw: 2010, `Iterated Cesàro averages, frequencies of digits, and Baire category'. Acta Arithmetica 144, 287–293.

[6] Lind, D. and B. Marcus: 1995, An Introduction to Symbolic Dynamics and Coding. Cambridge University Press.

[7] Madritsch, M. G. and I. Petrykiewicz: 2014, `Non-normal numbers in dynamical systems fulfilling the specification property'. arXiv:1402.1506 [math].

[8] Niven, I. and H. S. Zuckerman: 1951, `On the definition of normal numbers'. Pacific Journal of Mathematics 1(1), 103–109.

[9] Olsen, L. O. R.: 2003, `Extremely non-normal continued fractions'. Acta Arithmetica 108, 191–202.

[10] Oxtoby, J. C.: 2013, Measure and Category: A Survey of the Analogies between Topological and Measure Spaces. Springer New York.


Appendix A

Regular linear transformations

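Theorem. A linear transformation $T = [c_{m,n}]_{\mathbb{N}\times\mathbb{N}}$ is regular if and only if:

1. there exists $M > 0$ such that for all $m \in \mathbb{N}$ we have $\gamma_m = \sum_{n=0}^{\infty} |c_{m,n}| < M$,

2. for all $n \in \mathbb{N}$, $c_{m,n} \to 0$ as $m \to \infty$, and

3. setting $c_m = \sum_{n=0}^{\infty} c_{m,n}$, we have $c_m \to 1$ as $m \to \infty$.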
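The three conditions can be checked numerically on a familiar example. The sketch below uses the Cesàro (arithmetic mean) matrix, with $c_{m,n} = 1/(m+1)$ for $n \le m$ and $0$ otherwise. This matrix is chosen here purely as an illustration and is not singled out in this appendix. The helper names `cesaro_row` and `transform`, the truncation width and the test sequence $s_n = 1 + 1/(n+1)$ are likewise assumptions of the sketch.

```python
def cesaro_row(m, width):
    """Row m of the Cesaro matrix, truncated to `width` columns.

    c_{m,n} = 1/(m+1) for n <= m and 0 otherwise (requires m < width).
    """
    return [1.0 / (m + 1) if n <= m else 0.0 for n in range(width)]


def transform(row, s):
    """t_m = sum_n c_{m,n} s_n for one row and a finite prefix s of the sequence."""
    return sum(c * x for c, x in zip(row, s))


width = 2000
s = [1.0 + 1.0 / (n + 1) for n in range(width)]   # a sequence with s_n -> 1

rows = [cesaro_row(m, width) for m in (9, 99, 1999)]

# Conditions 1 and 3: absolute row sums (and row sums) stay equal to 1.
abs_row_sums = [sum(abs(c) for c in row) for row in rows]

# Condition 2: a fixed column, here n = 0, tends to 0 as m grows.
first_column = [row[0] for row in rows]

# Regularity in action: the averaged values t_m approach the limit 1.
averages = [transform(row, s) for row in rows]
```

Because every row is finitely supported, the truncation to `width` columns loses nothing for the rows shown, so the sums above are the actual row sums of the infinite matrix.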

Proof. The proof follows ideas very similar to those in the proofs of Theorems 1 and 2 in Chapter 3 of Hardy [4].

($\Leftarrow$): Assume $T = [c_{m,n}]_{\mathbb{N}\times\mathbb{N}}$ is a linear transformation such that the three conditions of the theorem are satisfied. We want to show that $T$ is regular. Let $(s_n)_{n\in\mathbb{N}}$ be a real sequence with $s_n \to s$ as $n \to \infty$. We wish to show that $t_m = \sum_{n=0}^{\infty} c_{m,n}s_n \to s$ as $m \to \infty$.

Claim. It suffices to prove the result in the case that $s = 0$.

Proof. Assume that whenever a sequence converges to 0, the corresponding sequence $(t_m)_{m\in\mathbb{N}}$ also converges to 0. Consider any sequence $(s_n)_{n\in\mathbb{N}}$ for which there exists $s \in \mathbb{R}$ with $s_n \to s$ as $n \to \infty$, and write $s'_n = s_n - s$ and $t'_m = \sum_{n=0}^{\infty} c_{m,n}s'_n$. Then it is clear that the sequence $(s'_n)_{n\in\mathbb{N}}$ tends to 0 and, by assumption, $(t'_m)_{m\in\mathbb{N}}$ also tends to 0. Condition 3 then implies that

\[
t_m = \sum_{n=0}^{\infty} c_{m,n}s_n = \sum_{n=0}^{\infty} c_{m,n}(s'_n+s) = t'_m+s\,c_m \longrightarrow 0+s\cdot 1 = s \qquad (m \to \infty).
\]

So now we can suppose $s = 0$ and let $\varepsilon > 0$.


Claim. For all $m \in \mathbb{N}$, the series $t_m = \sum_{n=0}^{\infty} c_{m,n}s_n$ and $c_m = \sum_{n=0}^{\infty} c_{m,n}$ are absolutely convergent.

Proof. Let $m \in \mathbb{N}$ and notice that the result for $c_m$ is condition 1, which holds by assumption. Now consider $t_m$. Firstly, observe that since the sequence $(s_n)_{n\in\mathbb{N}}$ is convergent, it is also bounded. Therefore, there exists $K > 0$ such that $|s_n| \le K$ for all $n \in \mathbb{N}$. Using this fact together with condition 1, we can conclude the result for $t_m$ in the following way:

\[
\sum_{n=0}^{\infty} |c_{m,n}s_n| \le K\sum_{n=0}^{\infty} |c_{m,n}| < KM ,
\]

and hence the partial sums of the series $\sum_{n=0}^{\infty} |c_{m,n}s_n|$ form an increasing sequence which is bounded above. An application of the monotone convergence theorem gives the result.

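In order to show that $t_m$ tends to 0 we need one final tool. The convergence of the sequence $(s_n)_{n\in\mathbb{N}}$ implies that there exists $N(\varepsilon) \in \mathbb{N}$ so that $n \ge N(\varepsilon)$ implies that $|s_n| < \frac{\varepsilon}{2M}$. Then,

\[
|t_m| = \Big|\sum_{n=0}^{\infty} c_{m,n}s_n\Big|
 = \Big|\sum_{n=0}^{N(\varepsilon)-1} c_{m,n}s_n+\sum_{n=N(\varepsilon)}^{\infty} c_{m,n}s_n\Big|
 \le \Big|\sum_{n=0}^{N(\varepsilon)-1} c_{m,n}s_n\Big|+\Big|\sum_{n=N(\varepsilon)}^{\infty} c_{m,n}s_n\Big| ,
\]

and so, looking at the two summands separately, we get:

• For the first summand, notice that for fixed $N(\varepsilon)$ condition 2 implies that this summand tends to 0 as $m \to \infty$, i.e. we can find $M(\varepsilon) = M(\varepsilon,N(\varepsilon)) \in \mathbb{N}$ so that $m \ge M(\varepsilon)$ implies that $\Big|\sum_{n=0}^{N(\varepsilon)-1} c_{m,n}s_n\Big| \le \frac{\varepsilon}{2}$.

• For the second summand, using condition 1 this time, we get

\[
\Big|\sum_{n=N(\varepsilon)}^{\infty} c_{m,n}s_n\Big|
 \le \sum_{n=N(\varepsilon)}^{\infty} |c_{m,n}s_n|
 \le \frac{\varepsilon}{2M}\sum_{n=N(\varepsilon)}^{\infty} |c_{m,n}|
 \le \frac{\varepsilon}{2M}\sum_{n=0}^{\infty} |c_{m,n}|
 \le \frac{\varepsilon}{2M}\cdot M = \frac{\varepsilon}{2} .
\]

Finally, combining the findings for the two summands, we get that for $m \ge M(\varepsilon)$ we have $|t_m| \le \varepsilon$, proving that $t_m \to 0$ as $m \to \infty$.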


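($\Rightarrow$): Conversely, suppose $T = [c_{m,n}]_{\mathbb{N}\times\mathbb{N}}$ is a regular linear transformation. We want to show that the three conditions of the theorem are necessary.

Claim. $c_m = \sum_{n=0}^{\infty} c_{m,n} \to 1$ as $m \to \infty$.

Proof. Consider the constant sequence $(s_n)_{n\in\mathbb{N}}$ with $s_n = 1$ for all $n \in \mathbb{N}$. Then clearly $s_n \to 1$ as $n \to \infty$ and so $t_m = \sum_{n=0}^{\infty} c_{m,n}s_n \to 1$ as $m \to \infty$. Therefore,

\[
c_m = \sum_{n=0}^{\infty} c_{m,n} = \sum_{n=0}^{\infty} c_{m,n}\cdot 1 = \sum_{n=0}^{\infty} c_{m,n}s_n \longrightarrow 1 \qquad (m \to \infty).
\]

Similarly, we get that the second condition is necessary: for each $n \in \mathbb{N}$, consider the sequence $(s_k)_{k\in\mathbb{N}}$ with $s_n = 1$ and $s_k = 0$ for all $k \in \mathbb{N}\setminus\{n\}$. This sequence converges to 0 and its transform is $t_m = c_{m,n}$, so regularity gives $c_{m,n} \to 0$ as $m \to \infty$. Finally, we show in the next claim that the first condition is necessary, which completes the proof of the theorem.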

Claim. There exists $M > 0$ such that for all $m \in \mathbb{N}$ we have $\gamma_m = \sum_{n=0}^{\infty} |c_{m,n}| < M$.

Proof. As a first step we need to show that each $\gamma_m$ is finite. Assume not, i.e. there exists $m \in \mathbb{N}$ so that $\gamma_m = \sum_{n=0}^{\infty} |c_{m,n}| = \infty$. Then we can construct a sequence $(\varepsilon_n)_{n\in\mathbb{N}}$ with $\varepsilon_n > 0$ for all $n \in \mathbb{N}$ and $\varepsilon_n \to 0$ as $n \to \infty$, so that $\sum_{n=0}^{\infty} \varepsilon_n|c_{m,n}| = \infty$. (E.g. take $\varepsilon_n = \big(\sum_{\nu=N}^{n} |c_{m,\nu}|\big)^{-1}$ for $n \ge N$, where $c_{m,N}$ is the first nonzero element of the sequence $(c_{m,n})_{n\in\mathbb{N}}$.) But then, considering the sequence $(s_n)_n$ with $s_n = \varepsilon_n\operatorname{sgn}(c_{m,n})$, we have that $s_n \to 0$ but $t_m = \sum_{n=0}^{\infty} c_{m,n}s_n = \sum_{n=0}^{\infty} \varepsilon_n|c_{m,n}| = \infty$, contradicting the fact that $t_m$ is a real number. Hence, $(\gamma_m)_m$ is a sequence of non-negative real numbers.

Ultimately, we want to show that $(\gamma_m)_m$ is a bounded sequence. Assume not, for the sake of contradiction. Since $\gamma_m \ge 0$ for all $m \in \mathbb{N}$ and the sequence is unbounded, for any $G > 0$ there exists $m_0 \in \mathbb{N}$ so that $\gamma_{m_0} > G$. Now for $n \in \mathbb{N}$ define $\gamma_{m,n} = \sum_{\nu=0}^{n} |c_{m,\nu}|$. Then, by construction, $\gamma_{m,n} \to \gamma_m$ as $n \to \infty$. In order to reach the desired contradiction we are going to construct a convergent sequence whose averaged version does not converge to the same limit. To that end, let $n_1 \in \mathbb{N}$ and define inductively the sequences of integers $(n_r)_r$ and $(m_r)_r$ in the following way.


Suppose $m_1,\dots,m_{r-1}$ and $n_1,\dots,n_r$ are determined. Then choose $m_r$ and $n_{r+1}$ as follows. Take $m_r$ big enough so that

(i): $m_r > m_{r-1}$,

(ii): $\gamma_{m_r,n_r} = \sum_{n=0}^{n_r} |c_{m_r,n}| < 1$ (we can do this since, by condition 2 which we proved above, $\lim_{m\to\infty} c_{m,n} = 0$ for all $n \in \mathbb{N}$),

(iii): $\gamma_{m_r} = \sum_{n=0}^{\infty} |c_{m_r,n}| > r^2+2r+2$ (we can do this since $(\gamma_m)_m$ is unbounded and each $\gamma_m$ is finite).

Now, using the fact that $\gamma_{m,n} \to \gamma_m$ as $n \to \infty$, we choose $n_{r+1} > n_r$ big enough so that $\gamma_{m_r}-\gamma_{m_r,n_{r+1}} = \sum_{n=n_{r+1}+1}^{\infty} |c_{m_r,n}| < 1$. By construction of $m_r$ and $n_{r+1}$ it easily follows that $\sum_{n=n_r+1}^{n_{r+1}} |c_{m_r,n}| > r^2+2r$. We now define the crucial sequence $(s_n)_n$ by:

\[
s_n =
\begin{cases}
0 & n \le n_1,\\[2pt]
\frac{1}{r}\operatorname{sgn}(c_{m_r,n}) & n_r < n \le n_{r+1} \text{ for } r = 1,2,\dots
\end{cases}
\]

Then it is clear that the sequence $(s_n)_n$ is bounded by 1 and converges to 0. Consider now its averaged version. Splitting the series at $n_r$ and $n_{r+1}$, and using $|s_n| \le 1$ outside the middle block, we get

\[
\begin{aligned}
|t_{m_r}| = \Big|\sum_{n=0}^{\infty} c_{m_r,n}s_n\Big|
 &\ge \frac{1}{r}\sum_{n=n_r+1}^{n_{r+1}} |c_{m_r,n}|-\sum_{n=0}^{n_r} |c_{m_r,n}|-\sum_{n=n_{r+1}+1}^{\infty} |c_{m_r,n}|\\
 &> \frac{1}{r}(r^2+2r)-1-1 = r ,
\end{aligned}
\]

implying that $(t_m)_m$ has a subsequence that diverges to infinity, and so the sequence $(t_m)_m$ does not converge to 0, a contradiction to $T$ being regular.
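The first step of the proof above asserts that $\sum_n \varepsilon_n|c_{m,n}| = \infty$ for the suggested choice of $\varepsilon_n$, without justification. One possible argument is sketched below. The shorthand $a_n$ and $S_n$ is introduced only for this sketch and does not appear in the text.

```latex
% Shorthand (not from the text): a_n = |c_{m,n}| and S_n = \sum_{\nu=N}^{n} a_\nu,
% so that \varepsilon_n = 1/S_n for n >= N.
% Assumption: \sum_n a_n = \infty, hence S_n increases to \infty and \varepsilon_n -> 0.
\[
\sum_{n=p+1}^{q} \varepsilon_n a_n
  = \sum_{n=p+1}^{q} \frac{a_n}{S_n}
  \ge \frac{1}{S_q}\sum_{n=p+1}^{q} a_n
  = 1-\frac{S_p}{S_q}
  \xrightarrow[q\to\infty]{} 1
  \qquad (N \le p < q).
\]
% The blocks of the series do not become small, so the Cauchy criterion fails
% and the series of non-negative terms \sum_n \varepsilon_n a_n diverges to infinity.
```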


Appendix B

Accumulation points of block frequencies

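Theorem. Let $w \in X^+_{\mathcal{P},\varphi}$. Then $A^T_k(w) \subseteq F_k$, where $F_k$ is the following simplex of shift invariant vectors:

\[
F_k = \Big\{ (p_i)_{i\in S^k} \ :\ p_i \ge 0,\quad \sum_{i\in S^k} p_i \le 1,\quad \sum_{i\in S} p_{ij} = \sum_{i\in S} p_{ji} \ \text{ for all } j \in S^{k-1} \Big\}.
\]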

Proof. Let $\mathbf{p} = (p_i)_{i\in S^k}$ be an accumulation point of the sequence $(P^T_k(w,m))_{m\in\mathbb{N}}$ with respect to $\|\cdot\|_1$. Clearly $\mathbf{p}$ satisfies the condition $p_i \ge 0$ since, by assumption, $T$ is positive and the sequence $(P_k(w,n))_{n\in\mathbb{N}}$ consists of frequencies, i.e. non-negative numbers. More crucially, there exists a strictly increasing sequence $(n_m)_m$ of positive integers such that

\[
\|P^T_k(w,n_m)-\mathbf{p}\|_1 \longrightarrow 0 \qquad (m \to \infty). \qquad\qquad \text{(B.1)}
\]

It follows that for $i \in S^k$,

\[
\sum_{n\in\mathbb{N}} c_{n_m,n}\,P(w,i,n) \longrightarrow p_i \qquad (m \to \infty).
\]


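But then we get the second condition in the following way:

\[
\begin{aligned}
1 &= \lim_{m\to\infty}\sum_{n\in\mathbb{N}} c_{n_m,n}
   = \lim_{m\to\infty}\sum_{n\in\mathbb{N}} c_{n_m,n}\,\|P_k(w,n)\|_1
   = \lim_{m\to\infty}\sum_{n\in\mathbb{N}} c_{n_m,n}\sum_{i\in S^k} P(w,i,n)\\
  &\begin{cases}
   = \displaystyle\sum_{i\in S^k}\lim_{m\to\infty}\sum_{n\in\mathbb{N}} c_{n_m,n}\,P(w,i,n) = \sum_{i\in S^k} p_i = \|\mathbf{p}\|_1 & \text{if } S \text{ is finite},\\[8pt]
   \ge \displaystyle\sum_{i\in S^k}\lim_{m\to\infty}\sum_{n\in\mathbb{N}} c_{n_m,n}\,P(w,i,n) = \sum_{i\in S^k} p_i = \|\mathbf{p}\|_1 & \text{if } S \text{ is infinite},
  \end{cases}
\end{aligned}
\]

where the inequality comes from the use of Fatou's Lemma.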

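For $j \in S^{k-1}$, considering all possible ways this block of length $k-1$ can occur, it follows that

\[
\Big|\sum_{i\in S} P(w,ij,n)-\sum_{i\in S} P(w,ji,n)\Big| \le \frac{1}{n}. \qquad\qquad \text{(B.2)}
\]

Now it follows from (B.1) and (B.2) that if $j \in S^{k-1}$, then

\[
\begin{aligned}
\Big|\sum_{i\in S} p_{ij}-\sum_{i\in S} p_{ji}\Big|
 \le{}& \Big|\sum_{i\in S} p_{ij}-\sum_{i\in S} P(w,ij,n_m)\Big|
 + \Big|\sum_{i\in S} P(w,ij,n_m)-\sum_{i\in S} P(w,ji,n_m)\Big|\\
 &+ \Big|\sum_{i\in S} P(w,ji,n_m)-\sum_{i\in S} p_{ji}\Big|\\
 \le{}& \sum_{i\in S} |p_{ij}-P(w,ij,n_m)|+\frac{1}{n_m}+\sum_{i\in S} |P(w,ji,n_m)-p_{ji}|\\
 \le{}& \|P_k(w,n_m)-\mathbf{p}\|_1+\frac{1}{n_m}+\|P_k(w,n_m)-\mathbf{p}\|_1 \longrightarrow 0 \qquad (m \to \infty),
\end{aligned}
\]

which proves the final condition for $\mathbf{p}$ to be an element of $F_k$, namely that for all $j \in S^{k-1}$ we have $\sum_{i\in S} p_{ij} = \sum_{i\in S} p_{ji}$.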
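Inequality (B.2) can be checked on a concrete word. The sketch below assumes the convention that $P(w,i,n)$ is the number of positions $0 \le m < n$ at which the block $i$ starts in $w$, divided by $n$. This convention is an assumption of the sketch and is not quoted from the text, and the names `block_frequencies` and `invariance_defect` are hypothetical. Under this convention, prepending and appending a symbol to $j$ count the same occurrences of $j$ except possibly one at each end of the window, which is where the bound $1/n$ comes from.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def block_frequencies(w, k, n):
    """Frequencies of the k-blocks starting at positions 0, ..., n-1 of the word w.

    Assumed convention (not quoted from the thesis): each count is divided by n.
    Requires len(w) >= n + k - 1 so that every window is complete.
    """
    if len(w) < n + k - 1:
        raise ValueError("word too short for n windows of length k")
    counts = Counter(tuple(w[m:m + k]) for m in range(n))
    return {block: Fraction(c, n) for block, c in counts.items()}


def invariance_defect(w, k, n, alphabet):
    """Largest value of |sum_i P(w, ij, n) - sum_i P(w, ji, n)| over blocks j of length k-1."""
    freq = block_frequencies(w, k, n)
    worst = Fraction(0)
    for j in product(alphabet, repeat=k - 1):
        left = sum(freq.get((i,) + j, Fraction(0)) for i in alphabet)
        right = sum(freq.get(j + (i,), Fraction(0)) for i in alphabet)
        worst = max(worst, abs(left - right))
    return worst


w = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0]
freq = block_frequencies(w, k=3, n=10)
defect = invariance_defect(w, k=3, n=10, alphabet=(0, 1))
```

Exact rational arithmetic is used so that the comparison with $1/n$ is not affected by rounding.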
