Extremely non-normal numbers in dynamical systems
Anastasios Stylianou
School of Mathematics and Statistics
University of Saint Andrews
A thesis submitted for the degree of
Master of Mathematics
April 2018
Declaration
I certify that this project report has been written by me, is a record of work carried out by me, and is essentially different from work undertaken for any other purpose or assessment.
Anastasios Stylianou
Acknowledgements
Firstly, I would like to express my great appreciation to my supervisor, Professor Lars Olsen. His unfailing support and assistance throughout this work were invaluable. Working under his supervision helped me become a better mathematician.
Further, I am particularly grateful for the constant encouragement and support of my family. I thank my parents, Andreas and Emily, who were always eager to learn about the progress of my thesis, and all my siblings, Matthew, Vicky, Iro and Andreas, whose useful comments and insights helped me throughout this project.
I would like to acknowledge the technical support from Melissa Iacovidou, and thank her for introducing me to LyX and assisting me with any LaTeX challenge I presented to her.
Last but not least, I would like to say a big thank you to my flatmate Kypros Papadopoulos, both for his direct help on this project through inspection and constructive conversations and, mainly, for his indirect support in motivating me with his hard-working presence.
Abstract
It was proved by Borel, more than a hundred years ago, that Lebesgue almost all real numbers are normal. Here, we present the topological viewpoint, from which a typical number is not only non-normal but fails to be normal in a spectacular way. We consider numbers from symbolic representations of dynamical systems and examine the sequences of their digit frequency vectors. Using regular averaging methods we try to smooth out any divergence from these sequences. However, we observe that even the averaged versions of these sequences of vectors still exhibit an extreme non-normal behaviour. In particular, we show that every shift invariant probability vector is an accumulation point of these sequences.
Contents
0 Introduction
  0.1 Thesis outline
  0.2 Notation index
1 Measure and Category
  1.1 Existence theorems
    1.1.1 Cantor's theorem
    1.1.2 Borel's theorem
    1.1.3 Baire category theorem
  1.2 Generic property
    1.2.1 Measure theoretic viewpoint
    1.2.2 Topological viewpoint
    1.2.3 A Paradox?
2 Symbolic Representation of Dynamical Systems
  2.1 Shifts
  2.2 Dynamical systems
  2.3 Markov Partitions and symbolic representations
    2.3.1 Uniqueness of symbolic representations
3 Summability Theory
  3.1 Divergent series
  3.2 Averaging methods
    3.2.1 History and definition
    3.2.2 Examples of averaging methods
    3.2.3 Main result for regular linear transformations
4 Normal Numbers
  4.1 Historical overview
  4.2 Normal numbers in dynamical systems
  4.3 Extremely non-normal numbers
  4.4 Statement of results
5 Typical numbers are extremely non-normal
  5.1 On infinite Markov Partitions
  5.2 On finite Markov Partitions
  5.3 Proof of the main result
References
A Regular linear transformations
B Accumulation points of block frequencies
Chapter 0
Introduction
0.1 Thesis outline
We begin by stating the main result that we aim to reach by the end of this thesis.
Theorem. Let k ≥ 1 be an integer and let $T \in \mathcal{T}$ be a positive and regular linear transformation. Furthermore, let $\mathcal{P} = \{P_i : i \in I\}$ be a Markov Partition for a dynamical system (M, ϕ). Suppose that the generated shift space $X^+_{\mathcal{P},\phi}$ is the one-sided full shift. Then the set $E^T_k$ is residual.
After reading this project, the reader should be comfortable with the statement of the theorem above. Each of the next four chapters will focus on a specific area of background knowledge needed to comprehend the main result. In these four chapters we present a brief introduction to the following notions:
• Chapter 1: Generic property in measure theoretic and topological viewpoint
• Chapter 2: Shifts, dynamical systems and symbolic representations
• Chapter 3: Averaging methods
• Chapter 4: Normal numbers in symbolic representations of dynamical systems
The final and most important chapter contains the proof of the main result. Combining the general theory of chapters 2 and 4, we will define normal and non-normal numbers in a symbolic representation of a dynamical system. Then, making use of the regular averaging methods which we will encounter in chapter 3, we will define the set of extremely non-normal numbers. Finally, we will show that this set of numbers satisfies the topological version of the generic property discussed in chapter 1.
0.2 Notation index
Notation            Meaning
N                   Set of natural numbers including zero
E                   σ-algebra
B(X)                Borel sigma-algebra on X
λ                   Lebesgue measure
A                   Alphabet, set of symbols
$A^{\mathbb{N}}$    Set of sequences with terms in A
$B_n(X)$            Set of n-blocks occurring in words from X
B(X)                Set of blocks occurring in words from X
F                   Set of forbidden blocks
$X_F$               Shift with set of forbidden blocks F
P                   Topological partition
⌊x⌋                 Integral part of x
T                   Set of "nice" linear transformations
$T_Q$               Set of "nice" linear transformations with rational entries
N(x, b, n)          Number of occurrences of digit b in the first n digits of x
N(x, b, n)          Number of occurrences of block b in the first n digits of x
S                   Alphabet, set of symbols
P(w, b, n)          Frequency of block b in the first n digits of word w
$P_k(w, n)$         Vector of frequencies of k-blocks in the first n digits of word w
[b]                 Cylinder centered at the block b
$P^T_k(w, m)$       mth averaged vector of frequencies of k-blocks in w
$A^T_k(w)$          Set of accumulation points of $(P^T_k(w, m))_{m \in \mathbb{N}}$
$S_k$               Simplex of shift invariant probability vectors
E                   Set of extremely non-normal numbers
P(w, b)             Frequency of block b in string w
$P_k(w)$            Vector of frequencies of k-blocks in string w
$\gamma^*$          $\gamma^* = \gamma\gamma\gamma\gamma\ldots$
sgn(x)              Sign of x: sgn(x) = x/|x| if x ≠ 0, and sgn(x) = 0 if x = 0
Chapter 1
Measure and Category
1.1 Existence theorems
The aim of this chapter is to rigorously define "generic properties" of a set. Following the example of Oxtoby [10], we start with a brief introduction to the theorems of Cantor, Baire and Borel. These three theorems are called existence theorems. The motivation for this name will be discussed later in this section.
1.1.1 Cantor's theorem
Both the concepts of measure and category are based on that of countability. Therefore, let us start by recalling the definition of a countable set.
Definition 1.1. A set A is called countable if there exists a bijective function from N to A. Moreover, a set is called at most countable if it is either countable or finite.
Theorem 1.1 (Cantor). For any sequence $(a_n)_{n \in \mathbb{N}}$ of real numbers and for any non-empty interval I ⊆ R there exists a point p in I such that $p \neq a_n$ for every n ∈ N.
Proof. See Figure 1.1 for the idea of the proof.
Cantor's theorem can be characterised as an existence theorem. The reason is that if the set of points failing a specific property is a countable subset of an interval of the real line, then we can conclude immediately not only that there exist points in the interval satisfying the property in question, but also that most of the interval's points, in the sense of cardinality, satisfy it.
Figure 1.1: Diagonal argument
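The idea behind the diagonal argument can be sketched on a finite listing. This is only an illustration under stated assumptions: the function name `diagonal_digits` and the sample `listing` are hypothetical, and restricting the new digits to 4 and 5 is one common way to avoid the ambiguity between expansions such as 0.1999... and 0.2000....

```python
def diagonal_digits(rows):
    """Return digits d with d[n] != rows[n][n] for every n.

    rows[n] holds (at least) the first len(rows) decimal digits of the
    n-th number in the listing. Only the digits 4 and 5 are used, so the
    constructed expansion never ends in repeating 0s or 9s.
    """
    return [5 if row[n] != 5 else 4 for n, row in enumerate(rows)]


# Hypothetical listing of four numbers, each given by its first four digits.
listing = [
    [1, 4, 1, 5],
    [5, 5, 5, 5],
    [0, 0, 0, 0],
    [7, 1, 8, 5],
]
d = diagonal_digits(listing)
print(d)  # [5, 4, 5, 4]: differs from row n in digit n
```

Applied to an infinite listing, the same rule produces a number that differs from the n-th listed number in its n-th digit, so it cannot occur anywhere in the sequence.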
1.1.2 Borel's theorem
Previously, we discussed the existence theorem of Cantor. In this section, we continue with Borel's existence theorem, which is connected with the notion of measure. As with the previous theorem, we start with a definition, that of a nullset.
Definition 1.2. A set A ⊆ R is said to be a nullset if for any ε > 0 there exists a countable family of intervals $I_n$ such that $A \subseteq \bigcup_n I_n$ and $\sum_n |I_n| < \varepsilon$. More generally, if (X, E, µ) is a measure space then a subset A ⊆ X is said to be a nullset if µ(A) = 0.
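Theorem 1.2 (Borel). If a finite or infinite sequence of intervals $I_n$ in R covers a non-empty interval I, then $|I| \le \sum_n |I_n|$. More generally, if (X, E, µ) is a measure space and $A \subseteq \bigcup_{i \in \mathbb{N}} B_i$ for measurable sets A and $B_i$, i ∈ N, we have that
\[
\mu(A) \le \sum_{i \in \mathbb{N}} \mu(B_i).
\]
Proof. Let A ∈ E and suppose that $\{B_i\}_{i \in \mathbb{N}} \subseteq E$ is such that $A \subseteq \bigcup_{i \in \mathbb{N}} B_i$. Then,
\[
\mu(A) = \mu\Bigl(A \cap \bigcup_{i \in \mathbb{N}} B_i\Bigr) = \mu\Bigl(\bigcup_{i \in \mathbb{N}} (B_i \cap A)\Bigr) \le \sum_{i \in \mathbb{N}} \mu(B_i \cap A) \le \sum_{i \in \mathbb{N}} \mu(B_i).
\]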
Remark. Consider the measure space (R, B(R), λ). In this space, every countable set has measure zero, whereas any non-empty interval has strictly positive Lebesgue measure. This implies that for any sequence of reals and any non-empty interval in R there exists a point in the interval which does not occur in the sequence, which proves Cantor's theorem. Further, if the set of reals not satisfying a certain property is a nullset contained in an interval, then we can conclude that there exist points in the interval satisfying this property. In fact, most of them, in the sense of measure, have this property.
1.1.3 Baire category theorem
Finally, we present Baire's theorem. Again, this theorem can be characterised as an existence theorem, this time linked with the notion of category.
Definition 1.3. Let (X, d) be a metric space. A set A is a dense subset of X if every non-empty, open subset of X has a non-empty intersection with A.
Definition 1.4. Let (X, d) be a metric space. A set A is called nowhere dense if it is not dense in any non-empty, open subset of X; equivalently, if for all x ∈ X and all r > 0 there are y ∈ X and ρ > 0 such that
B(y, ρ) ⊆ B(x, r) \ A.
We can characterise a nowhere dense set as one that is "full of holes".
Definition 1.5. Let (X, d) be a metric space. A set A is said to be of first category or meagre if it can be represented as a countable union of nowhere dense sets. A subset of X that cannot be represented in this way is said to be of second category.
These definitions were formulated by Baire [2] in his PhD thesis in 1899; the following theorem is also due to him.
Theorem 1.3 (Baire's Category Theorem). In a complete metric space (X, d), the intersection of any countable family of open and dense sets is dense in X. In particular, every non-empty complete metric space is of second category.
Proof. See Theorem 1.3 in Oxtoby [10].
Remark. Baire's Theorem provides another proof of Cantor's Theorem. This is because any closed interval equipped with the euclidean metric is a complete metric space and hence of second category, whereas any countable set is of first category. Moreover, if the set of points not satisfying a certain property is of first category and contained in an interval, we can conclude not only that there exist points in the interval satisfying this property, but also that most of the points in the interval, in the sense of category, satisfy it.
Definition 1.6. A subset A of a metric space (X, d) is called residual or co-meagre if its complement is meagre.
Many results in this thesis will involve residual sets. Consequently, in the following proposition we present the main properties of such sets, which we will use in our proofs in the next chapters.
Proposition 1.1 (Properties of Residual sets). Let (X, d) be a metric space. Then,
1. if R is residual and R ⊆ S ⊆ X then S is residual,
2. if $\{R_n\}_{n=0}^{\infty}$ is a countable family of residual sets, then $\bigcap_{n \in \mathbb{N}} R_n$ is residual,
3. if $\{G_n\}_{n=0}^{\infty}$ is a countable family of open and dense sets, then $\bigcap_{n \in \mathbb{N}} G_n$ is residual.
Proof. (1): Let R be a residual set, i.e. X\R is meagre, and suppose R ⊆ S ⊆ X. Then, looking at the complements we have that X\S ⊆ X\R, which implies that X\S is also meagre.
(2): Let $\{R_n\}_{n=0}^{\infty}$ be a countable family of residual sets. Then, for each n ∈ N, $X \setminus R_n$ is meagre. Therefore, $X \setminus \bigcap_{n \in \mathbb{N}} R_n = \bigcup_{n \in \mathbb{N}} (X \setminus R_n)$ is a countable union of meagre sets and hence meagre.
(3): Assume $\{G_n\}_{n=0}^{\infty}$ is a countable family of open and dense sets. Notice that since
\[
X \setminus \bigcap_{n \in \mathbb{N}} G_n = \bigcup_{n \in \mathbb{N}} (X \setminus G_n)
\]
by De Morgan's law, it suffices to show the following claim.
Claim. For n ∈ N, $X \setminus G_n$ is nowhere dense.
Proof. Let x ∈ X and r > 0. Since $G_n$ is dense we can find $y \in G_n \cap B(x, r)$. $G_n$ is also open, so we can find ρ > 0 small enough so that $B(y, \rho) \subseteq B(x, r) \cap G_n = B(x, r) \setminus (X \setminus G_n)$.
1.2 Generic property
The three theorems above provide different notions of a "small" set: countable, null and meagre, respectively, with the last two classes both containing the countable sets. A nowhere dense set is small in the intuitive geometric sense of being perforated with "holes", and a meagre set can be "approximated" by such sets. On the other hand, a nullset is small in the metric sense that it can be covered by a sequence of intervals of arbitrarily small total length. This justifies the next definitions, where the complements of these "small" sets define the generic property from a measure theoretic and a topological perspective, respectively.
1.2.1 Measure theoretic viewpoint
Definition 1.7. Let (X, E, µ) be a measure space. We say that "a property P holds µ-almost everywhere in X" if the set {x ∈ X : x doesn't have P} is a nullset with respect to µ.
Example. Countable sets, the Cantor set and lines in the plane are null sets with respect to the Lebesgue measure λ. So, for example, λ-almost all points in the plane do not have rational coordinates and, for any q ∈ Q, do not lie on the line x = π + q.
1.2.2 Topological viewpoint
Definition 1.8. Let (X, d) be a metric space. We say that "a typical element of X has property P" if the set {x ∈ X : x doesn't satisfy P} is meagre.
Example. All countable subsets of R are meagre, in particular the rational numbers. The Cantor set is an example of an uncountable meagre set. Of course, as a space the Cantor set equipped with the euclidean metric is complete and hence a Baire space, and so it is not meagre in itself by Baire's Category Theorem. Hence, a typical real number is both irrational and not in the Cantor set.
It is natural to ask whether and how these classes of "small" sets are related. This is the content of the next subsection.
1.2.3 A Paradox?
Theorem 1.4. The real numbers can be decomposed into two complementary sets such that one is of first category and the other is of measure zero.
Proof. Any of the examples given below produces such a decomposition.
A complete metric space should be thought of as a vast set: it is so big that every Cauchy sequence converges to a limit in the space. The theorem above shows that such a huge set can be decomposed into two disjoint "tiny" sets, each small in its own way. Thus a set that is small in one sense can be big in another. There is no solid argument suggesting that these two notions of smallness should agree everywhere, since they were introduced in different areas of mathematics, and so there is nothing paradoxical about the result above. However, as we noted before, countable sets are "small" from both viewpoints. The following examples from Oxtoby [10] present some cases where a set is big with respect to one notion and small with respect to the other.
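Example 1.1. Let $q_1, q_2, q_3, \ldots$ be an enumeration of Q. Then, the set
\[
G = \bigcap_{j=1}^{\infty} \bigcup_{i=1}^{\infty} \left( q_i - \frac{1}{2^{i+j+1}},\; q_i + \frac{1}{2^{i+j+1}} \right)
\]
is a nullset with respect to (R, B(R), λ) but also residual in (R, |.|).
Observe that, for each j ≥ 1, the set
\[
G_j = \bigcup_{i=1}^{\infty} \left( q_i - \frac{1}{2^{i+j+1}},\; q_i + \frac{1}{2^{i+j+1}} \right)
\]
is open, being a union of open intervals, and dense in R, since it contains the rationals. Hence, $\{G_j\}_{j=1}^{\infty}$ is a countable family of open and dense sets and so by Proposition 1.1 (3) we get that $G = \bigcap_{j=1}^{\infty} G_j$ is residual.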
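On the other hand, as shown below, it follows from Borel's theorem that each $G_j$ has finite Lebesgue measure:
\[
\lambda(G_j) = \lambda\left( \bigcup_{i=1}^{\infty} \left( q_i - \frac{1}{2^{i+j+1}},\; q_i + \frac{1}{2^{i+j+1}} \right) \right) \le \sum_{i=1}^{\infty} \lambda\left( q_i - \frac{1}{2^{i+j+1}},\; q_i + \frac{1}{2^{i+j+1}} \right) = \sum_{i=1}^{\infty} \frac{1}{2^{i+j}} = \frac{1}{2^{j}} < \infty.
\]
Then, since $G_1 \supseteq G_2 \supseteq G_3 \supseteq \ldots$, it follows by the continuity of measures that
\[
\lambda(G) = \lambda\left( \bigcap_{j=1}^{\infty} G_j \right) = \lim_{j \to \infty} \lambda(G_j) \le \lim_{j \to \infty} \frac{1}{2^{j}} = 0,
\]
proving that G has Lebesgue measure zero.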
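The length estimate used for the cover $G_j$ can be checked with exact rational arithmetic. The following is a minimal sketch; the helper name `cover_length` and the choice of 40 terms are assumptions made for illustration, and the enumeration of the rationals itself plays no role, since only the interval lengths enter the bound.

```python
from fractions import Fraction


def cover_length(j, terms):
    """Total length of the first `terms` intervals making up G_j."""
    # Interval i has radius 2^-(i+j+1), hence length 2^-(i+j).
    return sum(Fraction(1, 2 ** (i + j)) for i in range(1, terms + 1))


for j in (1, 2, 5):
    partial = cover_length(j, 40)
    bound = Fraction(1, 2 ** j)
    # Every partial sum stays below 2^-j, the value of the full series.
    print(j, float(partial), float(bound), partial < bound)
```

The gap between the partial sum and the bound is exactly $2^{-(j+40)}$, which reflects the geometric series behind the equality $\sum_{i \ge 1} 2^{-(i+j)} = 2^{-j}$.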
A real number z is called a Liouville number if z is irrational and also satisfies the property that for each positive integer n there exist integers p and q such that
\[
\left| z - \frac{p}{q} \right| < \frac{1}{q^{n}} \quad \text{and} \quad q > 1.
\]
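As an illustration of this definition, one can test the inequality on Liouville's constant $\sum_{k \ge 1} 10^{-k!}$, a standard example that is not taken from this thesis. The sketch below is hedged accordingly: it works with a six-term partial sum as a stand-in for the full series (the partial sum itself is rational, so it is not a Liouville number), and the name `liouville_partial` is hypothetical.

```python
from fractions import Fraction
from math import factorial


def liouville_partial(terms):
    """Exact partial sum of sum_{k>=1} 10^(-k!)."""
    return sum(Fraction(1, 10 ** factorial(k)) for k in range(1, terms + 1))


z = liouville_partial(6)  # stand-in for the full series
for n in range(1, 5):
    q = 10 ** factorial(n)            # denominator q = 10^(n!) > 1
    approx = liouville_partial(n)     # equals p/q for an integer p
    gap = abs(z - approx)
    print(n, 0 < gap < Fraction(1, q ** n))
```

The omitted tail of the series is positive and far smaller than the margin in the inequality, which is why the same approximations p/q are the usual witnesses for the full constant.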
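Example 1.2. A typical number in (R, |.|) is a Liouville number, while L, the set of Liouville numbers, is a nullset with respect to (R, B(R), λ).
For n ≥ 1 write
\[
U_n = \bigcup_{q=2}^{\infty} \bigcup_{p=-\infty}^{\infty} \left\{ x \in \mathbb{R} : 0 < \left| x - \frac{p}{q} \right| < \frac{1}{q^{n}} \right\} = \bigcup_{q=2}^{\infty} \bigcup_{p=-\infty}^{\infty} \left( \frac{p}{q} - \frac{1}{q^{n}},\; \frac{p}{q} + \frac{1}{q^{n}} \right) \setminus \left\{ \frac{p}{q} \right\}
\]
and observe firstly that the set $U_n$ is open and secondly that $\mathbb{Q} \subseteq \overline{U_n}$, implying that $\mathbb{R} = \overline{\mathbb{Q}} \subseteq \overline{U_n}$. In particular, we get that $U_n$ is dense in R. Then, $\{U_n\}_{n=1}^{\infty}$ is a countable family of open and dense sets, and Proposition 1.1 (3) implies that $\bigcap_{n \ge 1} U_n$ is residual in R. Set I := R\Q, which is residual because Q is countable and hence meagre, and notice that an element z of the residual set $I \cap \bigcap_{n \ge 1} U_n$ is irrational and satisfies $z \in U_n$ for each n ≥ 1, which implies that there exist p, q ∈ Z such that $\left| z - \frac{p}{q} \right| < \frac{1}{q^{n}}$ and q ≥ 2. This implies that z is a Liouville number, i.e. z ∈ L. This shows that $I \cap \bigcap_{n \ge 1} U_n \subseteq L$ and so by Proposition 1.1 (1) L is residual.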
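On the other hand, reversing the argument above we can show the weaker statement that for n ≥ 3 we have
\[
L \subseteq \bigcup_{q=2}^{\infty} \bigcup_{p=-\infty}^{\infty} \left( \frac{p}{q} - \frac{1}{q^{n}},\; \frac{p}{q} + \frac{1}{q^{n}} \right).
\]
Therefore, fixing m ∈ N we have
\[
L \cap (-m, m) \subseteq \bigcup_{q=2}^{\infty} \bigcup_{p=-qm}^{qm} \left( \frac{p}{q} - \frac{1}{q^{n}},\; \frac{p}{q} + \frac{1}{q^{n}} \right)
\]
and so, considering the Lebesgue measure of this set,
\[
\begin{aligned}
\lambda\bigl(L \cap (-m, m)\bigr)
&\le \sum_{q=2}^{\infty} \sum_{p=-mq}^{mq} \lambda\left( \left( \frac{p}{q} - \frac{1}{q^{n}},\; \frac{p}{q} + \frac{1}{q^{n}} \right) \right)
= \sum_{q=2}^{\infty} \sum_{p=-mq}^{mq} \frac{2}{q^{n}} \\
&= \sum_{q=2}^{\infty} \frac{2(2mq + 1)}{q^{n}}
\le \sum_{q=2}^{\infty} \frac{4mq + q}{q^{n}}
\le (4m + 1) \sum_{q=2}^{\infty} \frac{1}{q^{n-1}} \\
&\le (4m + 1) \int_{1}^{\infty} \frac{dq}{q^{n-1}}
\le \frac{4m + 1}{n - 2}.
\end{aligned}
\]
This holds for any n ≥ 3 and therefore, since for fixed m ∈ N we have that $\lim_{n \to \infty} \frac{4m+1}{n-2} = 0$, we conclude that λ(L ∩ (−m, m)) = 0. Since L is the countable union of the sets L ∩ (−m, m) over m ≥ 1, this implies that L is a nullset with respect to (R, B(R), λ).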
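The chain of inequalities above can be sanity-checked numerically. The sketch below is only an illustration: the function name `cover_measure`, the truncation of the sum at q = 500 and the choice m = 3 are assumptions, and a truncated sum can of course only confirm that the partial sums respect the bound (4m + 1)/(n − 2).

```python
from fractions import Fraction


def cover_measure(m, n, q_max):
    """Partial sum, over 2 <= q <= q_max, of the lengths covering L ∩ (-m, m)."""
    # For each q there are 2mq + 1 numerators p, and each interval has length 2/q^n.
    return sum(Fraction(2 * (2 * m * q + 1), q ** n) for q in range(2, q_max + 1))


m = 3
for n in (3, 5, 10):
    partial = cover_measure(m, n, 500)
    bound = Fraction(4 * m + 1, n - 2)
    print(n, float(partial), float(bound), partial <= bound)
```

As n grows the bound shrinks to zero while m stays fixed, which is the step that forces λ(L ∩ (−m, m)) = 0.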
Another example is the set of normal numbers in R. Lebesgue almost all numbers are normal, but a typical number is non-normal. In the next chapters, we will focus on this example and prove some extensions and generalisations of this result.
Chapter 2
Symbolic Representation of Dynamical Systems
In this chapter we present a brief introduction to symbolic dynamics using the same
notation as Chapters 1 and 6 of Lind and Marcus [6]. Symbolic representations of
dynamical systems will provide a more general setup of a number system than the
decimal expansion of the real numbers. Informally, a partition of a compact metric
space will induce, under some assumptions, a numbering representation of the points
in our space. This symbolic representation will enable us to study frequencies of digits
in a much more general layout.
2.1 Shifts
Definition 2.1. If A is an alphabet then the one-sided full A-shift is the collection of all countably infinite sequences of symbols from A, denoted by
\[
A^{\mathbb{N}} = \left\{ x = (x_i)_{i \in \mathbb{N}} : x_i \in A \text{ for all } i \in \mathbb{N} \right\}.
\]
The one-sided full r-shift is the one-sided full shift over the alphabet {0, . . . , r − 1}.
When |A| = r there is a natural conjugacy between the one-sided full A-shift and the one-sided full r-shift, and so sometimes we choose not to differentiate between them. Similarly, if |A| = ω, that is, if A is countably infinite, we can assume that N is our alphabet. Also, note that throughout this thesis we will be working with one-sided shifts $A^{\mathbb{N}}$ instead of the two-sided $A^{\mathbb{Z}}$. Very similar definitions and results could be stated for two-sided shifts.
Blocks of consecutive symbols will play a central role in the study of frequencies
of occurrence in numbers in the subsequent chapters.
Definition 2.2. A block (or string) over A is a finite sequence of symbols from A. For example, αβαβαααββ is a block over the alphabet A := {α, β}. The length of a block x is the number of symbols it contains. Therefore, if $x = x_1 \ldots x_n$ then it has length n, denoted by |x| = n. The empty block (or empty string), denoted by ε, is the sequence with no symbols, and so |ε| = 0. An m-block is a block of length m. Finally, let $A^m$ denote the set of all m-blocks with symbols from A.
If $x \in A^{\mathbb{N}}$ we say that a block $w = w_1 \ldots w_n$ over A occurs in $x = x_0 x_1 \ldots$ if there exists j ∈ N so that $x_j = w_1, \ldots, x_{j+n-1} = w_n$. Notice that the empty block ε occurs in any element x. Now let F be a set of blocks over A, which we will regard as the family of forbidden blocks. For any such F, define $X_F$ to be the set of points in $A^{\mathbb{N}}$ in which no block from F occurs:
\[
X_F = \left\{ x = x_0 x_1 \ldots \in A^{\mathbb{N}} : x_i x_{i+1} \ldots x_{i+j} \notin F \text{ for all } i, j \in \mathbb{N} \right\}.
\]
Definition 2.3. A one-sided shift space over A is a subset X of the one-sided full A-shift such that $X = X_F$ for a collection F of forbidden blocks with symbols from A. The shift space $X_F$ is said to be of finite type if F is a finite set.
Alternatively, it is sometimes convenient to describe a shift space by specifying which blocks are allowed, rather than which are forbidden. This leads naturally to the notion of the language of a shift space, i.e. the collection of "allowed" blocks in X.
Definition 2.4. Let X be a subset of a one-sided shift space, and let $B_n(X)$ denote the set of all n-blocks that occur in points of X. The language of X is the collection
\[
B(X) = \bigcup_{n=0}^{\infty} B_n(X).
\]
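The notions of occurrence, forbidden blocks and n-blocks can be sketched on finite words. The example below is an illustration under stated assumptions: the helper names are hypothetical, and the forbidden set F = {11} over the alphabet {0, 1} is chosen for illustration and does not come from this thesis. For this particular F every finite word avoiding 11 can be continued with zeros to an infinite point of $X_F$, so the listed words are exactly the n-blocks of the language; for a general F this step would need a separate argument.

```python
from itertools import product


def occurs(block, word):
    """True if `block` occurs as a run of consecutive symbols in `word`."""
    n = len(block)
    return any(word[j:j + n] == block for j in range(len(word) - n + 1))


def avoids(word, forbidden):
    """True if no block of the forbidden set occurs in the finite word."""
    return not any(occurs(f, word) for f in forbidden)


def n_blocks(alphabet, forbidden, n):
    """All n-blocks over `alphabet` in which no forbidden block occurs."""
    words = ("".join(w) for w in product(alphabet, repeat=n))
    return [w for w in words if avoids(w, forbidden)]


F = {"11"}  # hypothetical finite set of forbidden blocks
for n in range(1, 5):
    print(n, n_blocks("01", F, n))
```

Note that `occurs("", word)` is always true, matching the remark that the empty block occurs in every element.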
Next, we present necessary and sufficient conditions for a set of blocks to be the language of a shift space, and show that such a set uniquely determines the shift space. The following two propositions will turn out to be very useful in the final section of this chapter, where we will use them to define Markov Partitions.
Proposition 2.1. Let L be a collection of blocks over A. Then, L is the language of some shift space X, i.e. L = B(X), if and only if for all w ∈ L the following two conditions hold:
1. every subblock of w is in L, and
2. there exist blocks u, v ∈ L so that |uv| > 0 and uwv ∈ L.
Proof. (⇒): If w ∈ L then there exists x ∈ X so that w occurs in x. But then every subblock of w also occurs in x, proving the first condition. For the second condition, the existence of blocks u, v with |uv| > 0 and such that uwv ∈ L follows by "extending" the block w in x. Since x has infinite length, we can do this by choosing the blocks adjacent to w on either side in the representation of x.
(⇐): Assume L is a set of blocks over A such that the two conditions hold. Let X denote the shift space $X_{L^c}$. We will show that L = B(X). Let w ∈ B(X), i.e. there exists x ∈ X so that w occurs in x. Hence $w \notin L^c$, which implies that w ∈ L, establishing that B(X) ⊆ L. On the other hand, for w ∈ L, by repeatedly applying condition 1 followed by condition 2 we can create a sequence $x = (x_i)_{i \in \mathbb{N}}$ so that each subblock of x is in L and w occurs in x. Therefore $x \in X_{L^c}$, implying that w ∈ B(X), and so L ⊆ B(X).
Proposition 2.2. The language of a shift space determines the shift space. In fact, for a shift space X, it follows that $X = X_{B(X)^c}$.
Proof. Let x ∈ X. Any block occurring in x is an element of B(X), and so $x \in X_{B(X)^c}$, implying that $X \subseteq X_{B(X)^c}$. Conversely, let $x \in X_{B(X)^c}$. Then, any block w occurring in x is an element of B(X). However, since X is a shift space there exists a set of forbidden blocks F so that $X = X_F$, and so $w \in B(X_F)$ for any such block. Consequently, w ∉ F, finally implying that $X_{B(X)^c} \subseteq X_F = X$.
2.2 Dynamical systems
Definition 2.5. A dynamical system (M, ϕ) consists of a compact metric space (M, d) and a continuous map ϕ : M → M.
One of the main sources of interest in symbolic dynamics is its use in representing other dynamical systems. Suppose we want to study a dynamical system (M, ϕ). To describe the orbit $\{\phi^n(x) : n \in \mathbb{N}\}$ of a given x ∈ M we can construct an "approximate" description in the following way. Given a partition of M into sets $P_0, P_1, P_2, \ldots$, we can track the orbit of x by recording, for each n, the set $P_i$ in which $\phi^n(x)$ lands. This is the main idea behind Markov Partitions, studied in the next section.
Definition 2.6. A topological partition of a metric space (M, d) is an at most countable collection $\mathcal{P} = \{P_i : i \in I\}$ of non-empty, disjoint, open sets whose closures together cover M, meaning that $M = \bigcup_{i \in I} \overline{P_i}$.
Note. A topological partition is not necessarily a partition in the usual sense, since the union of its elements need not be the whole space. Also, recalling the definition of an at most countable set from chapter 1, $\mathcal{P}$ could be a finite set.
Suppose that (M, ϕ) is a dynamical system and that $\mathcal{P} = \{P_i : i \in I\}$ is a topological partition of M. Consider the index set I as an alphabet. Then, we say that a block $w = \alpha_1 \ldots \alpha_n$, where $\alpha_1, \ldots, \alpha_n \in I$, is allowed for $\mathcal{P}$, ϕ if
\[
\bigcap_{j=1}^{n} \phi^{-j}(P_{\alpha_j}) \neq \emptyset,
\]
and we set $L_{\mathcal{P},\phi}$ to be the collection of all allowed blocks for $\mathcal{P}$, ϕ.
Proposition 2.3. $L_{\mathcal{P},\phi}$ is the language of a unique shift space, called the one-sided symbolic dynamical system corresponding to $\mathcal{P}$, ϕ and denoted by $X^+_{\mathcal{P},\phi}$.
Proof. Using Proposition 2.1 it suffices to check that for $w \in L_{\mathcal{P},\phi}$ all subblocks of w are in $L_{\mathcal{P},\phi}$ and also that there exist $u, v \in L_{\mathcal{P},\phi}$ with |uv| > 0 such that $uwv \in L_{\mathcal{P},\phi}$. For $w = a_1 \ldots a_n \in L_{\mathcal{P},\phi}$ we have that $\bigcap_{j=1}^{n} \phi^{-j}(P_{a_j})$ is non-empty by assumption. Any subblock w′ of w is of the form $w' = a_i \ldots a_{i+j}$ with 1 ≤ i ≤ n, 0 ≤ j ≤ n − i. Notice that
\[
\bigcap_{j=1}^{n} \phi^{-j}(P_{a_j}) \subseteq \bigcap_{k=i}^{i+j} \phi^{-k}(P_{a_k})
\]
and so by assumption $w' \in L_{\mathcal{P},\phi}$. To prove the second condition take $u = v = a_1$ and notice that since $\bigcap_{j=1}^{n} \phi^{-j}(P_{a_j}) \subseteq \phi^{-1}(P_{a_1})$ we have that $u, v \in L_{\mathcal{P},\phi}$ and |uv| > 0. Then, $uwv = a_1 a_1 \ldots a_n a_1$ and, again using the fact that $\bigcap_{j=1}^{n} \phi^{-j}(P_{a_j})$ is non-empty by assumption, we conclude that $uwv \in L_{\mathcal{P},\phi}$. Therefore, $L_{\mathcal{P},\phi}$ is the language of a shift space X. Uniqueness follows from Proposition 2.2.
Now, given a dynamical system (M, ϕ) and a topological partition $\mathcal{P} = \{P_i : i \in I\}$ of M, in order to ensure that $X^+_{\mathcal{P},\phi}$ gives a realistic representation of points in M we need to impose some extra conditions. For n ≥ 0, consider the n-cylinder around $x = x_0 x_1 \ldots \in X^+_{\mathcal{P},\phi}$ and denote it by
\[
D_n(x) = \bigcap_{k=0}^{n} \phi^{-k}(P_{x_k}) \subseteq M.
\]
The closures of the n-cylinders, $\{\overline{D_n(x)}\}_{n=0}^{\infty}$, are closed subsets of a compact space, hence compact. Additionally, they decrease with n, meaning that $\overline{D_0(x)} \supseteq \overline{D_1(x)} \supseteq \overline{D_2(x)} \supseteq \ldots$.
Claim. $\bigcap_{k=0}^{\infty} \overline{D_k(x)}$ is a non-empty subset of M.
Proof. Assume for the sake of contradiction that $\bigcap_{k=0}^{\infty} \overline{D_k(x)} = \emptyset$. Then, De Morgan's law gives that the complements of the closed n-cylinders, $\bigl(\overline{D_k(x)}\bigr)^c$ for k ≥ 0, form an open cover of M. By assumption, M is a compact metric space, hence there exists a finite subcover $\bigcup_{k=0}^{N} \bigl(\overline{D_{n_k}(x)}\bigr)^c = M$. But then, taking complements again, we have that $\bigcap_{k=0}^{N} \overline{D_{n_k}(x)} = \emptyset$. Set $n^* = \max_{0 \le k \le N} n_k$. Then, the decreasing property of the cylinders implies that $\overline{D_{n^*}(x)} = \bigcap_{k=0}^{N} \overline{D_{n_k}(x)} = \emptyset$, which is a contradiction, since $D_{n^*}(x)$ is non-empty because the block $x_0 \ldots x_{n^*}$ is allowed.
We have established that $\bigcap_{k=0}^{\infty} \overline{D_k(x)}$ is non-empty. Moreover, we would like this countable intersection of closed cylinders around x to contain a single element of M in order for $X^+_{\mathcal{P},\phi}$ to be in an injective correspondence with M. This leads us to the definition of a Markov Partition.
2.3 Markov Partitions and symbolic representations
Definition 2.7. Let (M, ϕ) be a dynamical system and $\mathcal{P} = \{P_i : i \in I\}$ be a topological partition of M. Then, we call $\mathcal{P}$ a Markov Partition for (M, ϕ) if the following two conditions hold:
1. for all $x \in X^+_{\mathcal{P},\phi}$ the intersection $\bigcap_{k=0}^{\infty} \overline{D_k(x)}$ is a singleton,
2. $X^+_{\mathcal{P},\phi}$ is a shift space of finite type.
We say that a Markov Partition produces a symbolic representation of the dynamical system.
We proceed by presenting some examples of symbolic representations of dynamical systems that should seem familiar. In all four of the following examples, for simplicity we use ([0, 1), |.|) as the underlying metric space of the dynamical system. Although this space is not compact, there exists a natural bijection between [0, 1) and R/Z, where (R/Z, |.|) is a compact metric space. In the first two examples, we consider finite Markov Partitions which produce the well-known N-ary and β-expansions.
Example 2.1. Let N ≥ 2 be an integer and consider the dynamical system $([0, 1), S_N)$ where $S_N : [0, 1) \to [0, 1)$ is given by
\[
S_N(x) = Nx \pmod{1}
\]
together with the finite topological partition
\[
\mathcal{P} = \left\{ \left( \frac{i}{N},\; \frac{i+1}{N} \right) : 0 \le i \le N - 1 \right\}.
\]
Then it is easy to see that $\mathcal{P}$ is a Markov Partition for $([0, 1), S_N)$ with induced shift space the one-sided full N-shift, and the symbolic representation of the system is exactly the N-ary expansion of [0, 1).
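The link between itineraries under $S_N$ and N-ary digits can be sketched with exact rational arithmetic. This is a hedged illustration: the function names `itinerary` and `cylinder` are hypothetical, boundary points i/N are assigned to the interval on their right by taking the floor (the open partition leaves them out), and the closed formula used for the cylinder is the elementary description of the points sharing the first n + 1 digits.

```python
from fractions import Fraction
from math import floor


def itinerary(x, N, length):
    """Indices i of the sets (i/N, (i+1)/N) visited by x, S_N(x), S_N^2(x), ..."""
    digits = []
    for _ in range(length):
        i = floor(N * x)   # index of the partition element containing x
        digits.append(i)
        x = N * x - i      # S_N(x) = N x (mod 1)
    return digits


def cylinder(word, N):
    """End points of the open interval D_n(x) for the word x_0 ... x_n."""
    lo = sum(Fraction(d, N ** (k + 1)) for k, d in enumerate(word))
    return lo, lo + Fraction(1, N ** len(word))


x = Fraction(1, 7)
word = itinerary(x, 10, 6)
print(word)  # [1, 4, 2, 8, 5, 7], the decimal digits of 1/7
for n in range(len(word)):
    lo, hi = cylinder(word[: n + 1], 10)
    print(n, float(lo), float(hi), lo < x < hi)
```

The cylinders have length $N^{-(n+1)}$, so their closures shrink to a single point, in line with the first condition in the definition of a Markov Partition.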
Example 2.2. Similarly to the previous example, let β > 1 be a real number and consider the dynamical system ([0, 1), ϕ) where ϕ : [0, 1) → [0, 1) is given by
\[
\phi(x) = \beta x \pmod{1}
\]
together with the finite topological partition
\[
\mathcal{P} = \left\{ \left( \frac{i}{\beta},\; \frac{i+1}{\beta} \right) : 0 \le i \le \lfloor \beta \rfloor - 1 \right\} \cup \left\{ \left( \frac{\lfloor \beta \rfloor}{\beta},\; 1 \right) \right\}.
\]
Then again, we can show that $\mathcal{P}$ is a Markov Partition of ([0, 1), ϕ) and the symbolic representation that it produces corresponds to the β-expansion of points in the unit interval.
In the next two examples we consider two other types of expansions, with the main difference being that their corresponding topological partition will be countably infinite rather than finite as in N-ary and β-expansions.
Example 2.3. Consider the dynamical system ([0, 1), G) where G : [0, 1) → [0, 1) is the Gauss map given by
\[
G(x) = \begin{cases} \dfrac{1}{x} - \left\lfloor \dfrac{1}{x} \right\rfloor & x \neq 0 \\ 0 & x = 0 \end{cases}
\]
together with the infinite topological partition $\mathcal{P} = \left\{ \left( \frac{1}{i+1}, \frac{1}{i} \right) : i \in \mathbb{N},\ i \ge 1 \right\}$.
Claim. For x ∈ [0, 1) the symbolic representation of its orbit under iteration with G corresponds to its Continued Fraction Expansion.
Proof. Set $x_1 = x$ and, for k ≥ 1, $x_{k+1} = G(x_k)$. Then, the sequence $(x_k)_{k \ge 1}$ is the orbit of x under iteration with G. Notice that applying G corresponds to taking the fractional part of the reciprocal of the input. Associate each interval $\left( \frac{1}{i+1}, \frac{1}{i} \right) \in \mathcal{P}$, for i ≥ 1, with the digit i. Then, the kth digit of x in its symbolic representation is the unique natural number $a_k$ such that $x_k \in \left( \frac{1}{a_k + 1}, \frac{1}{a_k} \right)$. This implies that $\frac{1}{x_k} \in (a_k, a_k + 1)$ and so $\left\lfloor \frac{1}{x_k} \right\rfloor = a_k$. But this is exactly how the kth digit in the continued fraction expansion is found: as the integer part of the reciprocal of $x_k$, the fractional part obtained in the previous step.
Remark. The partition should be modified so that it contains the lower endpoint of each interval in order to deal with the problem that occurs when we consider rational points in [0, 1). However, we choose to ignore this problem in order to have open sets in the partition. The set of rationals is countable and thus negligible from our perspective in the later chapters.
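The digit rule in the proof above can be sketched as follows. The example is hedged in two ways: the function names are hypothetical, and it is run on a rational point, whose orbit reaches an endpoint of the partition and then 0, so it relies on the floor convention rather than on the open intervals (this is the boundary issue mentioned in the remark). Exact fractions are used to avoid rounding errors.

```python
from fractions import Fraction
from math import floor


def gauss_map(x):
    """G(x) = 1/x - floor(1/x) for x != 0, and G(0) = 0."""
    return Fraction(0) if x == 0 else 1 / x - floor(1 / x)


def gauss_digits(x, max_digits):
    """Digits a_k = floor(1/x_k) along the orbit x_1 = x, x_{k+1} = G(x_k)."""
    digits = []
    while x != 0 and len(digits) < max_digits:
        digits.append(floor(1 / x))
        x = gauss_map(x)
    return digits


def from_digits(digits):
    """Evaluate the finite continued fraction 1/(a_1 + 1/(a_2 + ...))."""
    x = Fraction(0)
    for a in reversed(digits):
        x = 1 / (a + x)
    return x


digits = gauss_digits(Fraction(7, 30), 10)
print(digits)               # [4, 3, 2]
print(from_digits(digits))  # 7/30
```

Evaluating the finite continued fraction built from the recorded digits recovers the starting point, which is the sense in which the itinerary under G encodes the continued fraction expansion.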
Example 2.4. Consider the dynamical system ([0, 1), L) where L : [0, 1) → [0, 1) is defined by
\[
L(x) = \begin{cases} n(n+1)x - n & x \in \left[ \dfrac{1}{n+1}, \dfrac{1}{n} \right) \\ 0 & x = 0 \end{cases}
\]
together with the infinite topological partition
\[
\mathcal{P} = \left\{ \left( \frac{1}{i+1},\; \frac{1}{i} \right) : i \in \mathbb{N},\ i \ge 1 \right\}.
\]
Then, similarly to the example above, $\mathcal{P}$ is a Markov Partition and, by associating each interval $\left( \frac{1}{i+1}, \frac{1}{i} \right) \in \mathcal{P}$ with the digit i, we can show that for x ∈ [0, 1) the symbolic representation of its orbit under iteration with L corresponds to its Lüroth series expansion.
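The itinerary under L can be computed in the same spirit. This is a minimal sketch with hypothetical function names; it only records the digits determined by the map as defined above, using the half-open intervals [1/(n+1), 1/n) from the definition of L, and it does not attempt to reconstruct the series itself.

```python
from fractions import Fraction
from math import ceil


def luroth_digit(x):
    """The unique n >= 1 with 1/(n+1) <= x < 1/n, for 0 < x < 1."""
    return ceil(1 / x) - 1


def luroth_map(x):
    """L(x) = n(n+1)x - n for x in [1/(n+1), 1/n), and L(0) = 0."""
    if x == 0:
        return Fraction(0)
    n = luroth_digit(x)
    return n * (n + 1) * x - n


def luroth_digits(x, max_digits):
    """Digits recorded along the orbit x, L(x), L(L(x)), ... until it hits 0."""
    digits = []
    while x != 0 and len(digits) < max_digits:
        digits.append(luroth_digit(x))
        x = luroth_map(x)
    return digits


print(luroth_digits(Fraction(3, 7), 10))  # [2, 1, 6]
print(luroth_digits(Fraction(2, 5), 4))   # [2, 2, 2, 2], since L(2/5) = 2/5
```

The point 2/5 is fixed by L, so its digit sequence is constant, while the orbit of 3/7 reaches the endpoint 1/7 and then 0, which is again a boundary case that the open partition leaves out.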
2.3.1 Uniqueness of symbolic representations
Let (M, ϕ) be a dynamical system and suppose that $P = \{P_i : i \in I\}$ is a Markov Partition of M. Using the same notation as above, denote by $X^+_{P,\varphi}$ the one-sided shift of finite type corresponding to P and ϕ.
In the definition of a Markov Partition we required that each element of the partition is an open set. This allows us to avoid any ambiguity concerning the uniqueness of symbolic representations and to get a one-to-one correspondence between infinite words in our shift $X^+_{P,\varphi}$ and “typical” points in the dynamical system (M, ϕ).
Markov Partitions require that any $w \in X^+_{P,\varphi}$ maps to a unique element in M, but the converse need not be true. For example, consider the decimal expansion as seen in example 2.1. Then 0.2000... and 0.1999... represent the same point in [0, 1]. This complication appears because the boundaries of different elements of the partition have non-empty intersection. In this case, we have that $\frac{2}{10} \in \overline{\left(\frac{1}{10}, \frac{2}{10}\right)} \cap \overline{\left(\frac{2}{10}, \frac{3}{10}\right)}$, which leads to two distinct representations of the number $\frac{2}{10}$.
In this thesis, we aim to show that a certain subset of points in M is residual. To make things easier, we would like to concentrate on the set of inner points of elements in a topological partition of M. As we show below, the set of inner points $U_\infty$ is residual in M, so by working in $U_\infty$ instead of M we only discard a meagre set. This does not affect our aim of showing that another subset of M, seen now as a subset of $U_\infty$, is residual.
For the reason mentioned above, set
$$U = \bigcup_{i \in I} P_i$$
and notice that U is an open and dense set in M. This follows from the facts that each $P_i$ is open and that $\overline{U} = \bigcup_{i \in I} \overline{P_i} = M$. For n ≥ 1 set $U_n = \bigcap_{i=0}^{n-1} \varphi^{-i}(U)$. Using the facts that U is an open and dense set while ϕ is a continuous map we conclude that for each n ≥ 1, $U_n$ is an open and dense subset of M. Hence, $\{U_n\}_{n \in \mathbb{N}}$ is a family of open and dense sets inside the compact space M. Then, by applying the Baire Category Theorem we get that $U_\infty := \bigcap_{i=0}^{\infty} U_i$ is a dense subset of M and moreover, as shown in proposition 1.1, $U_\infty$ is a residual subset of M.
In the next chapters we are going to consider symbolic expansions of points in M and study their digit frequencies. Therefore, we need the map $\pi_{P,\varphi} : X^+_{P,\varphi} \to M$ given by
$$\pi_{P,\varphi}(w) \in \bigcap_{n=0}^{\infty} D_n(w).$$
This map is well defined since by assumption the set $\bigcap_{n=0}^{\infty} D_n(w)$ is a singleton for each $w \in X^+_{P,\varphi}$. Moreover, $\pi_{P,\varphi}$ is a bijection between $\pi_{P,\varphi}^{-1}(U_\infty)$ and $U_\infty$, and we call $w \in X^+_{P,\varphi}$ the symbolic expansion of $x \in U_\infty$ if $x = \pi_{P,\varphi}(w)$. As mentioned before, in the following chapters we will for convenience use $U_\infty$ as our ambient space.
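Note. The shift map $\sigma : X^+_{P,\varphi} \to X^+_{P,\varphi}$ given by
$$\sigma(x_0 x_1 x_2 \ldots) = x_1 x_2 x_3 \ldots$$
is such that the following diagram commutes, that is, $\pi_{P,\varphi} \circ \sigma = \varphi \circ \pi_{P,\varphi}$:
$$\begin{array}{ccc} X^+_{P,\varphi} & \xrightarrow{\ \sigma\ } & X^+_{P,\varphi} \\ \downarrow{\scriptstyle \pi_{P,\varphi}} & & \downarrow{\scriptstyle \pi_{P,\varphi}} \\ M & \xrightarrow{\ \varphi\ } & M \end{array}$$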
Chapter 3
Summability Theory
The theory of summability of divergent series is a major branch of mathematical
analysis that has found important applications in applied mathematics, physics and
engineering. It deals with methods of assigning natural values to divergent series; prototypical examples include the Abel summation method, the Cesàro means and the Borel summability method (cf. Alabdulmohsin [1]).
3.1 Divergent series
Any series that does not converge to a real number is called divergent. In 1828, Niels Abel described divergent series as the “work of the devil” and declared it “shameful” for any mathematician to base any argument on them (cf. Hardy [4]). However, some divergent series are “nicer” than others, in the sense that they may share properties with convergent series or even become convergent after alternating their terms.
Some of the most famous examples of divergent series are given below.
Example 3.1. The series $\sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \ldots$, known as the harmonic series, satisfies a necessary condition for a series to converge, namely that its terms tend to 0, but at the same time the series slowly diverges to infinity.
Example 3.2. The divergent series $\sum_{n=0}^{\infty} (-1)^n = 1 - 1 + 1 - 1 + \ldots$, known today as Grandi's series, sparked a big debate among the mathematicians of the 18th century. Looking at the value(s) of this series, Guido Grandi was led to advocate creationism. He argued that grouping terms into (1 − 1) + (1 − 1) + ... suggested the value 0 for the infinite sum. On the other hand, the series also arises from the well-known Maclaurin expansion of the function $f(x) = \frac{1}{1+x}$ evaluated at $x = 1$, which suggested the value $\frac{1}{2}$ for the infinite sum. Since the same mathematical object could be equally assigned a zero value (nothing) and a non-zero value (something), Grandi argued that creation out of nothing was mathematically justifiable (cf. Alabdulmohsin [1]).
Example 3.3. $\sum_{n=0}^{\infty} n = 0 + 1 + 2 + 3 + 4 + \ldots$. In this example, not only does the sum of the natural numbers diverge to infinity, but its terms diverge to infinity as well. Note that the usual misconception that this sum is equal to $-\frac{1}{12}$ arises in at least two different ways. However, neither method, Zeta function regularization or Ramanujan summation, rigorously proves this result; rather, each suggests that from a certain perspective it is logical to associate the value $-\frac{1}{12}$ with this divergent series.
3.2 Averaging methods
3.2.1 History and definition
Divergent series arise quite frequently in many branches of mathematics and sciences, such as asymptotic analysis, analytic number theory, Fourier analysis, quantum theory and dynamical systems, but their divergence makes them difficult to work with. Consequently, they provoked a profound debate in the mathematical community for a long time. Throughout the 17th and 18th centuries, for example, mathematicians used divergent series regularly, rather naively. They supported the view that a series should have the value of the algebraic expression from which it was derived. In particular, Euler believed that every series had a unique value, and that divergence was nothing more than an artificial limitation [4].
In the 19th century, however, the commitment to mathematical rigor led the most prominent mathematicians of the time, such as Cauchy and Weierstrass, to forbid the use of divergent series entirely. Consequently, little work was published on divergent series from 1830 to 1880 (cf. Hardy [4]). Nevertheless, around the turn of the 20th century, a Hegelian synthesis between the two opposing views was initiated. Ernesto Cesàro placed the study of divergent series on a rigorous footing by providing the first modern definition. Cesàro's definition was an averaging method, which was later generalized independently by Nörlund and Voronoi. Later on, Ramanujan would share a similar view by describing the value of an infinite sum as a “center of gravity”. The study of divergent series included contributions from famous mathematicians such as Frobenius, Borel, Hardy, Ramanujan and Littlewood, and the name summability theory was established to denote this newly surfaced area of mathematical analysis. Reasonably, the key question in summability theory is how to interpret divergent series, such as the Grandi series mentioned earlier (cf. Alabdulmohsin [1]).
For a series $\sum_{i=0}^{\infty} a_i$ we denote the nth partial sum by $s_n = a_0 + a_1 + \ldots + a_n$ for n ∈ N. When the sequence of partial sums converges to a real number, we say that the series converges to that limit. The theory of divergent series uses averaging methods to generalise the notion of the limit of $(s_n)_{n \in \mathbb{N}}$ and hence give a value to a series that would otherwise diverge in the traditional sense.
Definition 3.1. An $\mathbb{N} \times \mathbb{N}$ real valued matrix $T = [c_{m,n}]$, where $c_{m,n}$ is the (m, n)th entry of T, is called a linear transformation if for any real sequence $(s_n)_{n \in \mathbb{N}}$ and any natural number m, $t_m = \sum_{n=0}^{\infty} c_{m,n} s_n$ converges to a real number. The sequence $(t_m)_{m \in \mathbb{N}}$ is called the T-averaged version of $(s_n)_{n \in \mathbb{N}}$ and if $\lim_{m \to \infty} t_m$ exists we call it the T-value of $(s_n)_{n \in \mathbb{N}}$.
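To make the definition concrete, the sketch below computes the T-averaged version of a sequence for a lower triangular matrix stored row by row. It is an illustration only: the helper names are ours, rows are indexed from 0, and we use the arithmetic-mean matrix (row m holds m + 1 copies of 1/(m + 1)) applied to the partial sums 1, 0, 1, 0, ... of Grandi's series.

```python
from fractions import Fraction


def cesaro_rows(size):
    """Rows 0..size-1 of the arithmetic-mean matrix; row m holds m + 1 copies of 1/(m + 1)."""
    return [[Fraction(1, m + 1)] * (m + 1) for m in range(size)]


def averaged(rows, s):
    """T-averaged version: t_m = sum_n c[m][n] * s[n] over the stored entries."""
    return [sum(c * s[n] for n, c in enumerate(row)) for row in rows]


# Partial sums of Grandi's series 1 - 1 + 1 - ... are 1, 0, 1, 0, ...
s = [1 if n % 2 == 0 else 0 for n in range(200)]
t = averaged(cesaro_rows(200), s)
print(float(t[198]), float(t[199]))
```

The sequence s itself does not converge, while the printed averaged values are already close to 1/2.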
Linear transformations are mainly used to smooth out divergence from a sequence. However, if a sequence is already convergent we would like the averaged version of this sequence to also converge to the same limit. Those linear transformations which “preserve” limits are called regular. In this thesis, we will be primarily concerned with regular and positive linear transformations.
Definition 3.2. A linear transformation T, with (m, n)th entry $c_{m,n} \in \mathbb{R}$, is called
1. Regular, if T maps all convergent sequences to their original limit, i.e. T is regular if $t_m \to l$ as $m \to \infty$ whenever $s_k \to l$ as $k \to \infty$.
2. Positive, if $c_{m,n} \ge 0$ for all m, n ∈ N.
3.2.2 Examples of averaging methods
Hölder means: Given a series $A := \sum_{n=0}^{\infty} a_n$ with partial sums sequence $(s_n)_{n \in \mathbb{N}}$, set $H^0_n = s_n$ and define recursively $H^{k+1}_n = \frac{H^k_0 + \ldots + H^k_n}{n+1}$. If the limit $\lim_{n \to \infty} H^k_n$ exists for some k ∈ N we say that A is Hölder summable with (h, k) sum equal to the limit above.
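Cesàro means: The Cesàro (c, 1) averaging method is a linear transformation given by the lower triangular matrix
$$T = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 & \cdots \\ \frac{1}{2} & \frac{1}{2} & 0 & \cdots & 0 & \cdots \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} & \ddots & 0 & \cdots \\ \vdots & \vdots & \vdots & \ddots & 0 & \cdots \\ \frac{1}{m} & \frac{1}{m} & \frac{1}{m} & \cdots & \frac{1}{m} & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$
whose mth row consists of m entries equal to $\frac{1}{m}$ followed by zeros.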
More generally, as in the previous example, we can define the (c, k) Cesàro averaging method in the following way:
Given a series $A := \sum_{n=0}^{\infty} a_n$ with partial sums sequence $(s_n)_{n \in \mathbb{N}}$, set $A^0_n = s_n$ and define recursively $A^{k+1}_n = A^k_0 + \ldots + A^k_n$. Define also $E^k_n$ to be the value of $A^k_n$ when $a_0 = 1$ and $a_n = 0$ for n ≥ 1. If the limit $\lim_{n \to \infty} \frac{A^k_n}{E^k_n}$ exists for some k ∈ N we say that A is Cesàro summable with (c, k) sum equal to the limit above.
The two examples above are very similar. Hölder means (h, k) are defined recursively by n summations and one division each time in k steps, whereas the Cesàro means (c, k) are defined in the same way except that the only division occurs at the last step. It is shown in Hardy [4] that the two methods are actually equivalent, meaning that a series A is (h, k) summable with sum equal to S if and only if A is (c, k) summable with the same sum.
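As an illustration of this equivalence, the following sketch computes both kinds of means directly from the recursive definitions. The function names are ours, and the test series 1 − 2 + 3 − 4 + ..., a standard example whose (c, 2) sum is 1/4, is chosen by us and does not appear in the text above.

```python
from fractions import Fraction
from itertools import accumulate


def holder_means(terms, k):
    """Return the sequence (H^k_n)_n for the series with the given terms."""
    h = list(accumulate(terms))                    # H^0_n = s_n
    for _ in range(k):
        running = accumulate(h)                    # H^k_0 + ... + H^k_n
        h = [Fraction(total) / (n + 1) for n, total in enumerate(running)]
    return h


def cesaro_means(terms, k):
    """Return the sequence (A^k_n / E^k_n)_n for the series with the given terms."""
    a = list(accumulate(terms))                    # A^0_n = s_n
    e = [1] * len(terms)                           # E^0_n = 1 (series 1 + 0 + 0 + ...)
    for _ in range(k):
        a = list(accumulate(a))
        e = list(accumulate(e))
    return [Fraction(x, y) for x, y in zip(a, e)]


# First 400 terms of 1 - 2 + 3 - 4 + ...
terms = [(-1) ** n * (n + 1) for n in range(400)]
h2 = holder_means(terms, 2)
c1 = cesaro_means(terms, 1)
c2 = cesaro_means(terms, 2)
print(float(c1[-2]), float(c1[-1]))   # the (c, 1) means keep oscillating
print(float(h2[-1]), float(c2[-1]))   # the (h, 2) and (c, 2) means are both near 1/4
```

A finite computation of course proves nothing about limits; it only shows the two second-order means settling near the same value while the first-order means do not settle.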
Abelian means: Let $(\lambda_n)_{n \in \mathbb{N}}$ be a strictly increasing and unbounded sequence of non-negative reals. Suppose
$$f(x) = \sum_{n=0}^{\infty} s_n e^{-\lambda_n x}$$
converges for all real numbers x > 0. Then the Abelian mean $A_\lambda$ is given by
$$A_\lambda((s_n)_n) = \lim_{x \searrow 0} f(x).$$
If $\lambda_n = n$ for each n ∈ N, then we obtain the method of Abel summation. Here
$$f(x) = \sum_{n=0}^{\infty} s_n e^{-nx} = \sum_{n=0}^{\infty} s_n z^n,$$
putting $z = e^{-x}$. Then the limit of f(x) as x approaches 0 from above is the same as the limit of the power series for f(z) as z approaches 1 from below through positive reals, and the Abel sum A(s) is defined to be
$$A((s_n)_n) = \lim_{z \nearrow 1} \sum_{n=0}^{\infty} s_n z^n.$$
It is interesting to note that Abel summation is consistent with Cesàro summation, i.e. $A((s_n)_n) = (c, k)((s_n)_n)$ whenever the latter is defined. The Abel sum is therefore regular, linear and consistent with Cesàro summation but also more powerful than the latter.
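The limit defining the Abel sum can be approximated numerically. The sketch below is an assumption-laden illustration: the helper name is ours, the power series is truncated after finitely many coefficients, and the coefficients are taken to be $s_n = (-1)^n$, i.e. those of Grandi's series, for which the power series equals $\frac{1}{1+z}$ when $|z| < 1$.

```python
def abel_partial(coeffs, z):
    """Evaluate sum_n coeffs[n] * z**n by Horner's rule (a finite truncation)."""
    total = 0.0
    for c in reversed(coeffs):
        total = total * z + c
    return total


grandi = [(-1) ** n for n in range(20000)]
for z in (0.9, 0.99, 0.999):
    # Compare the truncated power series with the closed form 1 / (1 + z).
    print(z, abel_partial(grandi, z), 1 / (1 + z))
```

As z increases towards 1 the printed values approach 1/2, in agreement with the value suggested for Grandi's series in Example 3.2.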
3.2.3 Main result for regular linear transformations
Toeplitz and Schur proved the following result around 1911 in order to give a characterisation of regular linear transformations. Toeplitz considered only lower triangular matrices. However, his result was then generalised by Steinhaus (see Hardy [4]).
Theorem 3.1. A linear transformation $T = [c_{m,n}]_{\mathbb{N} \times \mathbb{N}}$ is regular if and only if the following conditions are satisfied:
1. there exists M > 0 such that for all m ∈ N we have that $\gamma_m = \sum_{n=0}^{\infty} |c_{m,n}| < M$,
2. for all n ∈ N, $\lim_{m \to \infty} c_{m,n} = 0$ and
3. setting $c_m = \sum_{n=0}^{\infty} c_{m,n}$ we have that $c_m \to 1$ as $m \to \infty$.
Proof. Given in the Appendix.
In the next chapter, we will be using linear transformations $T = [c_{m,n}]_{\mathbb{N} \times \mathbb{N}}$ that are positive and regular. For such T we have $c_{m,n} \ge 0$ for all m, n ∈ N, and the theorem above gives that the sequence of row sums of T, $(\gamma_m)_{m \in \mathbb{N}}$, is bounded and tends to 1, whereas the elements of each column of T tend to 0. Finally, following Toeplitz's example we will work with lower triangular linear transformations.
Definition 3.3. Let $\mathcal{T}$ be the set of all “nice” linear transformations $T = [c_{m,n}]_{\mathbb{N} \times \mathbb{N}}$ such that the following two conditions hold:
• T is a lower triangular matrix and
• there exists G > 0 such that $\sup_{n \in \mathbb{N}} |c_{m,n}| < \frac{G}{m}$ for all m ∈ N.
Also, denote by $\mathcal{T}_{\mathbb{Q}}$ the countable subset of linear transformations in $\mathcal{T}$ with entries in $\mathbb{Q}$.
Example. The linear transformations corresponding to the (h, k) Hölder means and the (c, k) Cesàro means are elements of $\mathcal{T}$.
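For the (c, 1) matrix, the three conditions of Theorem 3.1 and the entry bound of Definition 3.3 can be inspected on a few sample rows. This is only a sanity check on finitely many rows under our own naming, with rows indexed from 1 as in the matrix displayed earlier; it is not a proof of regularity.

```python
from fractions import Fraction


def cesaro_row(m):
    """Row m (counting from 1) of the (c, 1) matrix: m entries equal to 1/m."""
    return [Fraction(1, m)] * m


rows = {m: cesaro_row(m) for m in (1, 10, 100, 1000)}

# Condition 1 and 3: absolute row sums are bounded and the row sums tend to 1.
abs_row_sums = {m: sum(abs(c) for c in row) for m, row in rows.items()}
# Condition 2: the entries of a fixed column (here the first) tend to 0.
first_column = {m: row[0] for m, row in rows.items()}
# Definition 3.3: m * sup_n |c_{m,n}| stays below any constant G > 1.
scaled_sup = {m: m * max(row) for m, row in rows.items()}

print(abs_row_sums, first_column, scaled_sup)
```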
Chapter 4
Normal Numbers
4.1 Historical overview
Turning our attention to occurrences of distinct digits in a number, we may ask ourselves, what properties should a “normal” number satisfy? In order to get acquainted with this question consider the following example. Assume we choose a real number at random, written in decimal expansion. Then, there is nothing suggesting that the digit “5” will appear more times than the digit “7” in this number. This is the intuition for Borel's normal numbers theorem but before we introduce this, let's first rigorously define the property of normality.
Definition 4.1. Let x be a real number with fractional part $.x_1 x_2 \ldots$ when written in its unique non-terminating expansion base r. Let N(x, b, n) denote the number of occurrences of the digit b in the first n places in the fractional part of x. The number x is called simply normal to base r if
$$\lim_{n \to \infty} \frac{N(x, b, n)}{n} = \frac{1}{r}$$
for each b ∈ {0, ..., r − 1}; x is said to be normal base r if all numbers $x, xr, xr^2, \ldots$ are simply normal in bases $r, r^2, r^3, \ldots$.
Remark. This definition for r = 10 was introduced by Émile Borel [3] who stated that “la propriété caractéristique” (the characteristic property) of a normal number is the following.
Definition 4.2. A real number x with fractional part $.x_1 x_2 \ldots$ when written in its unique non-terminating expansion base r is called normal if for any block of m specified digits $b = b_1 \ldots b_m$ we have
$$\lim_{n \to \infty} \frac{N(x, b, n)}{n} = \frac{1}{r^m}$$
where N(x, b, n) stands for the number of occurrences of the block b in the first n digits of the fractional part of x.
Remark. In Niven and Zuckerman [8], it is shown that the two definitions of normal numbers given above are equivalent.
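The quantity N(x, b, n)/n is easy to compute for a number whose digits can be generated. As a hedged illustration we use Champernowne's constant 0.123456789101112..., which is known to be normal in base 10 but is not discussed in the text above; the function names are ours.

```python
from collections import Counter
from itertools import count, islice


def champernowne_digits():
    """Yield the decimal digits of 0.123456789101112..."""
    for k in count(1):
        for ch in str(k):
            yield int(ch)


def digit_frequencies(digits, n, base=10):
    """Return [N(x, b, n) / n for b in 0..base-1] from the first n digits."""
    counts = Counter(islice(digits, n))
    return [counts[b] / n for b in range(base)]


# 38889 digits are exactly the concatenation of 1, 2, ..., 9999.
freqs = digit_frequencies(champernowne_digits(), 38889)
print(freqs)
```

At this cut-off every non-zero digit has the same frequency while the digit 0 is still under-represented, which shows how slowly the frequencies can approach the limiting value 1/10.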
Theorem 4.1 (Borel, 1909). Lebesgue almost all numbers in [0, 1] are normal base 10.
Remark. This result extends to any base r, for r ≥ 2. Sometimes when the base we
are working in is clear from the context, instead of calling a number normal base r,
we will just call it normal.
Corollary 4.1. A real number is called non-normal base r if it is not normal in base
r. The set of non-normal numbers in any base has Lebesgue measure zero.
Proof. It follows from the generalised version of Borel's result.
4.2 Normal numbers in dynamical systems
We would like to define the property of normality in a more general setup. Consider numbers in a symbolic representation of a dynamical system, using the notation and notions introduced in Chapter 2. Given (M, ϕ) a dynamical system with a Markov Partition $P = \{P_i : i \in I\}$ set
$$S = \begin{cases} \{0, \ldots, N-1\} & \text{if } I \text{ is a finite set of size } N, \\ \mathbb{N} & \text{if } I \text{ is a countably infinite set,} \end{cases}$$
to be the alphabet of our shift so that each digit i in the alphabet S corresponds, maybe after relabeling, to the set $P_i$ in the partition.
Similarly to the previous section, we define the frequency of a block of length k, say $b = b_1 \ldots b_k \in S^k$, in the first n digits of a word $w = a_0 a_1 a_2 \ldots \in X^+_{P,\varphi}$ by
$$P(w, b, n) = \begin{cases} \dfrac{|\{0 \le i \le n-k : a_i = b_1, \ldots, a_{i+k-1} = b_k\}|}{n-k+1} & \text{if } n \ge k, \\ 0 & \text{otherwise,} \end{cases}$$
and also write
$$P_k(w, n) = (P(w, b, n))_{b \in S^k}$$
for the vector of frequencies of all k-blocks in the first n digits of w. Notice that for n ≥ k we immediately get that $\|P_k(w, n)\|_1 = \sum_{b \in S^k} |P(w, b, n)| = 1$.
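The definition of P(w, b, n) translates directly into code. The following minimal sketch uses names of our own choosing, represents a word by a finite list holding at least its first n digits, and works over a finite alphabet so that the whole vector $P_k(w, n)$ can be listed.

```python
from fractions import Fraction
from itertools import product


def block_frequency(word, block, n):
    """P(w, b, n): frequency of `block` among the k-blocks in the first n digits."""
    k = len(block)
    if n < k:
        return Fraction(0)
    hits = sum(1 for i in range(n - k + 1) if tuple(word[i:i + k]) == tuple(block))
    return Fraction(hits, n - k + 1)


def frequency_vector(word, k, n, alphabet):
    """P_k(w, n) as a dict indexed by all k-blocks over a finite alphabet."""
    return {b: block_frequency(word, b, n) for b in product(alphabet, repeat=k)}


word = [0, 1, 1, 0, 1, 0, 0, 1]
vector = frequency_vector(word, 2, 8, (0, 1))
print(vector)
```

For n ≥ k the coordinates of the returned vector add up to exactly 1, as noted above.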
Now let µ be a probability measure on the induced shift space $X^+_{P,\varphi}$.
Definition 4.3. We call $w \in X^+_{P,\varphi}$ µ-normal if for any block $b \in S^k$ we have
$$\lim_{n \to \infty} P(w, b, n) = \mu([b]),$$
where [b] is the cylinder around b, i.e. the set $\{(a_i)_{i \in \mathbb{N}} \in X^+_{P,\varphi} : a_0 = b_1, \ldots, a_{k-1} = b_k\}$. In particular, we call w normal if µ is the normalised Hausdorff measure on $X^+_{P,\varphi}$.
4.3 Extremely non-normal numbers
In the previous section we defined a normal number in a symbolic dynamical system. Now we focus on non-normal numbers. Non-normal numbers do not satisfy the property above, i.e. that for any block of digits the sequence of its frequencies converges to the measure of the cylinder centred around that block. More crucially, for many numbers the sequence not only fails to converge to that specific point but diverges dramatically. The divergence is so chaotic that almost every possible convergence point is an accumulation point of the sequence. We will study this in more detail later. Now, we use the previous chapter on regular linear transformations to define an averaged version of the sequence of block frequencies, in order to check whether even this smoothed version of the sequence still exhibits this very abnormal behaviour.
Recall from the previous section the definition of the vector of block frequencies in the first n digits,
$$P_k(w, n) = (P(w, b, n))_{b \in S^k}.$$
We use this to define the sequence $(P_k(w, n))_{n \in \mathbb{N}}$. Given a positive and regular linear transformation $T = [c_{m,n}]_{\mathbb{N} \times \mathbb{N}}$ we proceed to define the T-averaged version of the latter sequence, denoted by $(P^T_k(w, m))_{m \in \mathbb{N}}$, where for m ∈ N we set
$$P^T_k(w, m) = \sum_{n \in \mathbb{N}} c_{m,n} P_k(w, n).$$
Note. $P^T_k(w, m)$ is a vector with a coordinate for each $b \in S^k$. Keep in mind that in the term $P_k(w, n)$, n is the number of digits of w over which we consider frequencies of blocks. On the other hand, in the term $P^T_k(w, m)$, m is the number of averaging steps already taken, just as for the sequence $(t_m)_{m \in \mathbb{N}}$ considered in the previous chapter.
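The T-averaged frequency vector can be computed for one row of a lower triangular T at a time. The sketch below is restricted to k = 1 and to a single row consisting of 100 entries equal to 1/100 (a (c, 1)-type row); both restrictions, the example word 0101... and the function names are our own choices.

```python
from fractions import Fraction


def digit_frequencies(word, n, alphabet_size):
    """P_1(w, n): digit frequencies in the first n digits (zero vector if n < 1)."""
    if n < 1:
        return [Fraction(0)] * alphabet_size
    counts = [0] * alphabet_size
    for digit in word[:n]:
        counts[digit] += 1
    return [Fraction(c, n) for c in counts]


def averaged_frequencies(word, row, alphabet_size):
    """P^T_1(w, m) = sum_n row[n] * P_1(w, n), where `row` is row m of T."""
    total = [Fraction(0)] * alphabet_size
    for n, c in enumerate(row):
        p = digit_frequencies(word, n, alphabet_size)
        total = [acc + c * x for acc, x in zip(total, p)]
    return total


word = [i % 2 for i in range(200)]      # w = 010101...
row = [Fraction(1, 100)] * 100          # 100 entries equal to 1/100
averaged = averaged_frequencies(word, row, 2)
print([float(x) for x in averaged])
```

Since $P_1(w, 0)$ is the zero vector, the coordinates of this particular averaged vector add up to 99/100 rather than 1; the deficit disappears in the limit because the row sums of a regular T tend to 1.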
For a normal number w, we have that $(P_k(w, m))_{m \in \mathbb{N}}$ converges pointwise to $(\mu([b]))_{b \in S^k}$ with respect to $\|\cdot\|_1$ and so, by regularity of T, $(P^T_k(w, m))_{m \in \mathbb{N}}$ also converges pointwise to the same limit.
Turning our attention to non-normal numbers now, consider for $w \in X^+_{P,\varphi}$ the set of accumulation points of the sequence $(P^T_k(w, m))_{m \in \mathbb{N}}$, denoted by $A^T_k(w)$.
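Theorem 4.2. Let $w \in X^+_{P,\varphi}$. Then $A^T_k(w) \subseteq F_k$, where $F_k$ is the following simplex of shift invariant vectors:
$$F_k = \left\{ (p_i)_{i \in S^k} : p_i \ge 0,\ \sum_{i \in S^k} p_i \le 1,\ \sum_{i \in S} p_{ij} = \sum_{i \in S} p_{ji} \text{ for all } j \in S^{k-1} \right\}.$$
Proof. Given in the Appendix.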
Now we concentrate on the elements of $F_k$ that are probability vectors. Hence, consider the set
$$S_k = \left\{ p \in [0, 1]^{S^k} : \|p\|_1 = 1 \right\} \cap F_k.$$
It is important to note that it is shown in the proof of the theorem above that $S_k = F_k$ in the case that S is finite.
Definition 4.4. Let $x \in U_\infty$ with $\pi^{-1}_{P,\varphi}(x) = w \in X^+_{P,\varphi}$. For $T \in \mathcal{T}$ we call x T-extremely non-k-normal if $A^T_k(w) = S_k$. Also, denote the set of T-extremely non-k-normal numbers by $E^T_k$. Finally, define the set of extremely non-normal numbers by
$$E = \bigcap_{T \in \mathcal{T}} \bigcap_{k \ge 1} E^T_k,$$
where the intersection is taken over those T that are positive and regular.
In order to make it easier to work with the set E, we present the following propo-
sition to express it as a countable intersection.
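Proposition 4.1. $E = \bigcap_{T \in \mathcal{T}_{\mathbb{Q}}} \bigcap_{k \ge 1} E^T_k$.
Proof. Clearly, $E \subseteq \bigcap_{T \in \mathcal{T}_{\mathbb{Q}}} \bigcap_{k \ge 1} E^T_k$. We are now left to prove the reverse containment. Let $x \in \bigcap_{T \in \mathcal{T}_{\mathbb{Q}}} \bigcap_{k \ge 1} E^T_k$ and fix $T \in \mathcal{T}$ and k ∈ N. It suffices to show that $x \in E^T_k$. In other words, considering $q \in S_k$, it suffices to show that q is an accumulation point of $(P^T_k(w, m))_{m \in \mathbb{N}}$, where w is the symbolic expansion of x. Let ε > 0. Since $\overline{\mathcal{T}_{\mathbb{Q}}} = \mathcal{T}$, we can find $T' = [c'_{m,n}]_{\mathbb{N} \times \mathbb{N}}$ in $\mathcal{T}_{\mathbb{Q}}$ such that $|c_{m,n} - c'_{m,n}| < \frac{\varepsilon}{2^{n+2}}$ for all m, n ∈ N. Since $x \in E^{T'}_k$ there exists a strictly increasing sequence of natural numbers $(r_l)_{l \in \mathbb{N}}$ so that $P^{T'}_k(w, r_l) \to q$ as $l \to \infty$. Then, there exists L ∈ N so that for l ≥ L we have
$$\left\| P^{T'}_k(w, r_l) - q \right\|_1 < \frac{\varepsilon}{2}.$$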
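Therefore, for l ≥ L we have that
\begin{align*}
\left\| P^T_k(w, r_l) - q \right\|_1 &\le \left\| P^T_k(w, r_l) - P^{T'}_k(w, r_l) \right\|_1 + \left\| P^{T'}_k(w, r_l) - q \right\|_1 \\
&\le \left\| \sum_{n=0}^{\infty} c_{r_l,n} P_k(w, n) - \sum_{n=0}^{\infty} c'_{r_l,n} P_k(w, n) \right\|_1 + \frac{\varepsilon}{2} \\
&\le \left\| \sum_{n=0}^{\infty} (c_{r_l,n} - c'_{r_l,n}) P_k(w, n) \right\|_1 + \frac{\varepsilon}{2} \\
&\le \sum_{n=0}^{\infty} \left| c_{r_l,n} - c'_{r_l,n} \right| \left\| P_k(w, n) \right\|_1 + \frac{\varepsilon}{2} \\
&\le \sum_{n=0}^{\infty} \left| c_{r_l,n} - c'_{r_l,n} \right| + \frac{\varepsilon}{2} \\
&\le \sum_{n=0}^{\infty} \frac{\varepsilon}{2^{n+2}} + \frac{\varepsilon}{2} = \frac{\varepsilon}{2} \sum_{n=0}^{\infty} \frac{1}{2^n} = \varepsilon
\end{align*}
which shows that $q \in S_k$ is an accumulation point of $(P^T_k(w, m))_{m \in \mathbb{N}}$. This holds for any $q \in S_k$, implying that $x \in E^T_k$.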
4.4 Statement of results
Theorem 4.3 (Main Result). Let k ≥ 1 be an integer and $T \in \mathcal{T}$ be a positive and regular linear transformation. Furthermore, let $P = \{P_i : i \in I\}$ be a Markov Partition for a dynamical system (M, ϕ). Suppose that the generated shift space $X^+_{P,\varphi}$ is the one-sided full shift. Then, the set $E^T_k$ is residual.
Proof. See Chapter 5.
Remark 4.1. For the case where the Markov Partition is a finite set and hence produces a shift space over a finite alphabet, we could get a more general result, as shown in Madritsch and Petrykiewicz [7], by requiring $X^+_{P,\varphi}$ to satisfy the specification property instead of being the full one-sided shift. The specification property provides the required framework to show that our shift is “almost” closed under concatenation of words, a property we want to use in our proofs and which is clearly true for the full shift.
Remark 4.2. Recall the definition of N-ary expansions from example 2.1. Also, remember the (c, k) Cesàro means for k ∈ N seen in Chapter 3. The main result of Hyde et al. [5] says that for a typical point in [0, 1] the Cesàro averaged version of the sequence of block frequencies in its N-ary expansion has the whole simplex of N-dimensional probability vectors as its set of accumulation points. This result is a special case of theorem 4.3, where the symbolic representation of the system is the one discussed in example 2.1 and extremely non-normal numbers are defined only by considering Cesàro means instead of any positive and regular averaging method. Therefore, denoting by $T_k \in \mathcal{T}$ the linear transformation corresponding to the (c, k) Cesàro means and assuming our dynamical system is $([0, 1], S_N)$ together with the Markov Partition $\left\{ \left(\frac{i}{N}, \frac{i+1}{N}\right) : 0 \le i \le N-1 \right\}$, we get the required result from theorem 4.3. In particular, this result was the motivation for this more general version.
Corollary 4.2. The set E is residual in M, implying that “a typical element of M is extremely non-normal”.
Proof. Theorem 4.3 implies that for $T \in \mathcal{T}_{\mathbb{Q}}$ and k ∈ N the set $E^T_k$ is residual in M. By proposition 4.1, E is a countable intersection of residual sets, hence residual in $U_\infty$. Finally, $U_\infty$ was shown to be a residual subset of M, implying that E is also residual in M and hence completing the proof.
Chapter 5
Typical numbers are extremely
non-normal
This Chapter will focus on the proof of Theorem 4.3 which is the main result of this
thesis. We begin by introducing some more notation and present some lemmata that
will play an important role in our aim to prove the theorem.
Definition 5.1. For a finite word $w = a_0 \ldots a_{n-1}$ of length n and a k-block $b = b_1 \ldots b_k$, both over an alphabet S, define
$$P(w, b) = \begin{cases} \dfrac{|\{0 \le i \le n-k : a_i = b_1, \ldots, a_{i+k-1} = b_k\}|}{n-k+1} & \text{if } n \ge k, \\ 0 & \text{otherwise,} \end{cases}$$
to be the frequency of appearance of b in w and set
$$P_k(w) = (P(w, b))_{b \in S^k}$$
to be the vector of frequencies of k-blocks in w.
Also, recall from the previous chapter the set of shift invariant probability vectors associated with frequencies of k-blocks from an alphabet S:
$$S_k = \left\{ (p_i)_{i \in S^k} : p_i \ge 0,\ \sum_{i \in S^k} p_i = 1, \text{ and } \sum_{i \in S} p_{ij} = \sum_{i \in S} p_{ji} \text{ for all } j \in S^{k-1} \right\}.$$
Subsequently, we have to treat the two cases of finite and infinite Markov Partitions separately. After we establish the necessary lemmata for each case, we present a joint proof in the final section of this chapter.
5.1 On infinite Markov Partitions
Suppose that we have a dynamical system (M, ϕ) together with a countable Markov Partition $P = \{P_0, P_1, \ldots\}$. In this case the induced symbolic representation of M has $S = \mathbb{N}$ as an alphabet. This is an infinite alphabet, but for elements of $S_k$ we would like to be able to “weight” only the coordinates of blocks that use symbols from a finite subset of our alphabet. For this reason we define for k, N ≥ 1 the set
$$S_{k,N} = \left\{ (p_i)_{i \in \mathbb{N}^k} : \begin{array}{l} p_i \ge 0,\ \sum_{i \in \mathbb{N}^k} p_i = 1,\ \sum_{i \in \mathbb{N}} p_{ij} = \sum_{i \in \mathbb{N}} p_{ji} \text{ for all } j \in \mathbb{N}^{k-1} \\ \text{and } p_i = 0 \text{ for } i \in \mathbb{N}^k \setminus \{0, \ldots, N-1\}^k \end{array} \right\}$$
of shift invariant probability vectors where only the first N digits are weighted.
Now define the union of probability vectors over finite alphabets
$$S^*_k = \bigcup_{N \ge 1} S_{k,N}.$$
We proceed with the following observations. The set $S^*_k$ is clearly a dense subset of the space $(S_k, \|\cdot\|_1)$. Moreover, $S^*_k$ is separable (a countable dense subset is given by its elements with rational coordinates). So let us name a sequence $(\mathbf{q}_{k,m})_m$ in $S^*_k$ that is dense in $(S_k, \|\cdot\|_1)$. Throughout this section we fix integers m ≥ 0 and k ≥ 1 and set $\mathbf{q} = \mathbf{q}_{k,m}$. Then, since $\mathbf{q} \in S^*_k = \bigcup_{N \ge 1} S_{k,N}$ there exists N ≥ 1 so that $\mathbf{q} \in S_{k,N}$ and consequently $q_i = 0$ for $i \in \mathbb{N}^k \setminus \{0, \ldots, N-1\}^k$.
We would like to find a finite word with k-block frequencies really close to the coordinates of $\mathbf{q}$ in order to use it later in the main part of the proof. Therefore, we define for n ≥ 1 the set
$$Z_n := Z_n(\mathbf{q}, N, k) = \left\{ w \in \bigcup_{l \ge knN^k} \{0, \ldots, N-1\}^l : \|P_k(w) - \mathbf{q}\|_1 \le \frac{1}{n} \right\}$$
and crucially prove that it is non-empty.
Lemma 5.1. For all n ≥ 1, k, N ∈ N and $\mathbf{q} \in S^*_k$ we have $Z_n(\mathbf{q}, N, k) \ne \emptyset$.
Proof. This is lemma 2.4 in Olsen [9].
We will make use of this first lemma to create an infinite word with specific block frequencies by concatenating arbitrarily many copies of a finite word from $Z_n(\mathbf{q}, N, k)$.
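Lemma 5.2. Let k, n, N, t be positive integers and $\mathbf{q} \in S_{k,N}$. In addition, consider the word $w = w_1 \ldots w_t \in \mathbb{N}^t$ and let $M := \max_{1 \le i \le t} \{w_i\}$ be the maximal digit of w. Then, for any finite word $\gamma \in Z_n(\mathbf{q}, N, k)$ and any
$$l \ge L := t + |\gamma| \max\left\{ nN^k,\ \frac{t}{k}\left[1 + \frac{(M+1)^k}{N^k}\right] \right\}$$
we have that
\begin{equation}
\|P_k(w\gamma^*, l) - \mathbf{q}\|_1 \le \frac{6}{n}. \tag{5.1}
\end{equation}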
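Proof. Firstly, set s := |γ| to be the length of γ and $\sigma := w\gamma^*|l$ to be the starting l-block of $w\gamma^* = w\gamma\gamma\gamma\ldots$. By the definition of the block σ,
$$\sigma = \overbrace{\underbrace{w}_{t}\ \underbrace{\underbrace{\gamma}_{s}\,\underbrace{\gamma}_{s} \cdots \underbrace{\gamma}_{s}}_{q \text{ times}}\ \underbrace{\gamma_1 \ldots \gamma_r}_{r}}^{l}$$
it follows that there exist natural numbers q and r so that l = t + qs + r with 0 ≤ r < s. Now a block $i \in \mathbb{N}^k$ that occurs in σ satisfies the following property:
\begin{equation}
\frac{qs}{l} P(\gamma, i) \le P(\sigma, i) \le \frac{qs}{l} P(\gamma, i) + \frac{t + q(k-1) + r}{l}. \tag{5.2}
\end{equation}
This can be easily proved by considering where the block i can occur in σ: it lies either inside a copy of γ, or inside w, or across the boundary between two consecutive pieces, or in the final partial copy of γ.
In our effort to prove the inequality (5.1) we focus on the occurrences of k-blocks inside γ and use the triangle inequality as follows:
$$\|P_k(w\gamma^*, l) - \mathbf{q}\|_1 = \|P_k(\sigma) - \mathbf{q}\|_1 \le \left\| P_k(\sigma) - \frac{qs}{l} P_k(\gamma) \right\|_1 + \left\| \frac{qs}{l} P_k(\gamma) - \mathbf{q} \right\|_1.$$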
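For the first summand we have that
\begin{align*}
\left\| P_k(\sigma) - \frac{qs}{l} P_k(\gamma) \right\|_1 &= \sum_{i \in \mathbb{N}^k} \left| P(\sigma, i) - \frac{qs}{l} P(\gamma, i) \right| \\
&= \sum_{i \in \{0, \ldots, N-1\}^k} \left| P(\sigma, i) - \frac{qs}{l} P(\gamma, i) \right| + \sum_{i \in \mathbb{N}^k \setminus \{0, \ldots, N-1\}^k} \left| P(\sigma, i) - \frac{qs}{l} P(\gamma, i) \right| \\
&\le \sum_{i \in \{0, \ldots, N-1\}^k} \left( \frac{t + q(k-1) + r}{l} \right) + \sum_{\substack{i \in \mathbb{N}^k \setminus \{0, \ldots, N-1\}^k \\ P(w, i) \ne 0}} P(\sigma, i) \tag{5.3} \\
&\le \sum_{i \in \{0, \ldots, N-1\}^k} \left( \frac{t + qk + s}{l} \right) + \sum_{i \in \{0, \ldots, M\}^k} \frac{t}{l} \tag{5.4} \\
&= N^k \left( \frac{t + qk + s}{l} \right) + (M+1)^k \frac{t}{l} \\
&= \left[ N^k + (M+1)^k \right] \frac{t}{l} + N^k \left( \frac{qk}{l} \right) + N^k \left( \frac{s}{l} \right) \\
&\le \frac{t N^k \left[ 1 + (M+1)^k / N^k \right]}{l} + N^k \left( \frac{qk}{qknN^k} \right) + N^k \left( \frac{s}{snN^k} \right) \tag{5.5} \\
&= \frac{t N^k \left[ 1 + (M+1)^k / N^k \right]}{l} + \frac{2}{n} \\
&\le \frac{1}{n} + \frac{2}{n} = \frac{3}{n} \tag{5.6}
\end{align*}
(5.3): Using property (5.2) and the fact that γ is a word over the alphabet {0, ..., N − 1}.
(5.4): M is the maximal digit of w, so for $i \in \mathbb{N}^k \setminus \{0, \ldots, M\}^k$ we have P(w, i) = 0. Also, for $i \in \mathbb{N}^k \setminus \{0, \ldots, N-1\}^k$ we have $P(\sigma, i) \le \frac{t}{l}$.
(5.5): By the assumptions for l and γ we have that $l \ge qs \ge qknN^k$ and $l \ge snN^k$.
(5.6): By assumption $l \ge |\gamma| \frac{t}{k} \left[ 1 + (M+1)^k / N^k \right] \ge (nkN^k) \frac{t}{k} \left[ 1 + (M+1)^k / N^k \right]$.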
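Looking at the second summand now we have:
\begin{align*}
\left\| \frac{qs}{l} P_k(\gamma) - \mathbf{q} \right\|_1 &\le \left\| \frac{qs}{l} P_k(\gamma) - P_k(\gamma) \right\|_1 + \|P_k(\gamma) - \mathbf{q}\|_1 \\
&\le \left| \frac{qs}{l} - 1 \right| \|P_k(\gamma)\|_1 + \frac{1}{n} \tag{5.7} \\
&\le \frac{1}{n} + \left| \frac{qs - l}{l} \right| \tag{5.8} \\
&= \frac{1}{n} + \frac{t + r}{l} \le \frac{1}{n} + \frac{t}{l} + \frac{s}{l} \tag{5.9} \\
&\le \frac{1}{n} + \frac{2}{nN^k} \le \frac{3}{n} \tag{5.10}
\end{align*}
(5.7): Since $\gamma \in Z_n(\mathbf{q}, N, k)$, $\|P_k(\gamma) - \mathbf{q}\|_1 \le \frac{1}{n}$.
(5.8): By definition we have that $\|P_k(\gamma)\|_1 = 1$.
(5.9): By construction of σ.
(5.10): Since $l \ge snN^k$ and $l \ge s \frac{t}{k} \ge nkN^k \frac{t}{k} \ge tnN^k$ by assumption.
Combining the two established inequalities we get the result.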
5.2 On finite Markov Partitions
In this section, we focus on the simpler case of finite Markov Partitions. Suppose that we have a dynamical system (M, ϕ) together with a finite Markov Partition $P = \{P_0, \ldots, P_{N-1}\}$. This induces a symbolic representation of M over a finite alphabet S = {0, ..., N − 1}. In this case, denote the set of shift invariant probability vectors associated with frequencies of k-blocks from the alphabet {0, ..., N − 1} as
$$S_k = \left\{ (p_i)_{i \in S^k} : p_i \ge 0,\ \sum_{i \in S^k} p_i = 1, \text{ and } \sum_{i \in S} p_{ij} = \sum_{i \in S} p_{ji} \text{ for all } j \in S^{k-1} \right\}.$$
Then, for natural numbers k, n ≥ 1 and $\mathbf{q} \in S_k$ define the set of finite blocks with block frequencies “relatively” close to the coordinates of $\mathbf{q}$ by
$$Z_n := Z_n(\mathbf{q}, k) = \left\{ w \in \bigcup_{l \ge knN^k} \{0, \ldots, N-1\}^l : \|P_k(w) - \mathbf{q}\|_1 \le \frac{1}{n} \right\}$$
and crucially prove that it is non-empty.
Lemma 5.3. For all n ≥ 1, k ∈ N and $\mathbf{q} \in S_k$ we have $Z_n(\mathbf{q}, k) \ne \emptyset$.
Proof. See Lemma 3.2 in Madritsch and Petrykiewicz [7].
We will make use of this lemma to create an infinite word with specific block frequencies by concatenating arbitrarily many copies of a finite word in $Z_n(\mathbf{q}, k)$.
Lemma 5.4. Let k, n, t be positive integers and $\mathbf{q} \in S_k$. In addition, consider the word $w = w_1 \ldots w_t \in \{0, \ldots, N-1\}^t$ and let $M := \max_{1 \le i \le t} \{w_i\}$. Then, for any $\gamma \in Z_n(\mathbf{q}, k)$ and any
$$l \ge L := nN^k \max\{t, |\gamma|\}$$
we have that
$$\|P_k(w\gamma^*, l) - \mathbf{q}\|_1 \le \frac{6}{n}.$$
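Proof. Firstly, set s := |γ| and $\sigma := w\gamma^*|l$. By construction of the word σ it follows that there exist natural numbers q and r so that l = t + qs + r with 0 ≤ r < s:
$$\sigma = \overbrace{\underbrace{w}_{t}\ \underbrace{\underbrace{\gamma}_{s}\,\underbrace{\gamma}_{s} \cdots \underbrace{\gamma}_{s}}_{q \text{ times}}\ \underbrace{\gamma_1 \ldots \gamma_r}_{r}}^{l}$$
Now a block $i \in \{0, \ldots, N-1\}^k$ that occurs in σ satisfies the following property:
\begin{equation}
\frac{qs}{l} P(\gamma, i) \le P(\sigma, i) \le \frac{qs}{l} P(\gamma, i) + \frac{t + q(k-1) + r}{l}. \tag{5.11}
\end{equation}
This can be easily proved by considering where the block i can occur in σ: it lies either inside a copy of γ, or inside w, or across the boundary between two consecutive pieces, or in the final partial copy of γ.
In our effort to prove the inequality we focus on the occurrences of k-blocks inside γ:
$$\|P_k(w\gamma^*, l) - \mathbf{q}\|_1 = \|P_k(\sigma) - \mathbf{q}\|_1 \le \left\| P_k(\sigma) - \frac{qs}{l} P_k(\gamma) \right\|_1 + \left\| \frac{qs}{l} P_k(\gamma) - \mathbf{q} \right\|_1.$$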
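For the first summand we have that
\begin{align*}
\left\| P_k(\sigma) - \frac{qs}{l} P_k(\gamma) \right\|_1 &= \sum_{i \in \{0, \ldots, N-1\}^k} \left| P(\sigma, i) - \frac{qs}{l} P(\gamma, i) \right| \\
&\le \sum_{i \in \{0, \ldots, N-1\}^k} \frac{t + q(k-1) + r}{l} \tag{5.12} \\
&\le \sum_{i \in \{0, \ldots, N-1\}^k} \left( \frac{t + qk + s}{l} \right) \\
&= N^k \left( \frac{t + qk + s}{l} \right) \le N^k \frac{t}{l} + \frac{qkN^k}{l} + N^k \frac{s}{l} \\
&\le N^k \left( \frac{t}{nN^k t} \right) + \frac{qkN^k}{qnkN^k} + N^k \left( \frac{s}{nN^k s} \right) \tag{5.13} \\
&= \frac{3}{n}
\end{align*}
(5.12): By property (5.11).
(5.13): Since $l \ge nN^k \max\{t, |\gamma|\}$ and also $l \ge qs \ge qnkN^k$.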
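Looking at the second summand now we have:
\begin{align*}
\left\| \frac{qs}{l} P_k(\gamma) - \mathbf{q} \right\|_1 &\le \left\| \frac{qs}{l} P_k(\gamma) - P_k(\gamma) \right\|_1 + \|P_k(\gamma) - \mathbf{q}\|_1 \\
&\le \left| \frac{qs}{l} - 1 \right| \|P_k(\gamma)\|_1 + \frac{1}{n} \tag{5.14} \\
&\le \frac{1}{n} + \left| \frac{qs - l}{l} \right| \tag{5.15} \\
&= \frac{1}{n} + \frac{t + r}{l} \le \frac{1}{n} + \frac{t}{l} + \frac{s}{l} \tag{5.16} \\
&\le \frac{1}{n} + \frac{2}{nN^k} \le \frac{3}{n} \tag{5.17}
\end{align*}
(5.14): Since $\gamma \in Z_n(\mathbf{q}, k)$, $\|P_k(\gamma) - \mathbf{q}\|_1 \le \frac{1}{n}$.
(5.15): By definition we have that $\|P_k(\gamma)\|_1 = 1$.
(5.16): By construction of σ.
(5.17): Since $l \ge nN^k \max\{t, |\gamma|\}$.
Combining the two established inequalities we get the result.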
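The bound of Lemma 5.4 can be checked numerically on a small instance. The sketch below is illustrative only and every concrete choice in it is ours: the alphabet {0, 1}, k = 2, n = 20, the uniform vector on 2-blocks (which is shift invariant), the word γ made of 40 copies of 0011, and the prefix w made of seven 1s.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def block_frequencies(word, k, alphabet):
    """P_k(word): frequency of every k-block over `alphabet` in a finite word."""
    windows = len(word) - k + 1
    counts = Counter(tuple(word[i:i + k]) for i in range(windows))
    return {b: Fraction(counts[b], windows) for b in product(alphabet, repeat=k)}


def l1_distance(p, q):
    """The 1-norm distance between two vectors indexed by the same blocks."""
    return sum(abs(p[b] - q[b]) for b in q)


def prefix_of_w_gamma_star(w, gamma, l):
    """First l digits of the infinite word w gamma gamma gamma ..."""
    repeats = max(0, l - len(w)) // len(gamma) + 1
    return (w + gamma * repeats)[:l]


N, k, n = 2, 2, 20
alphabet = range(N)
q = {b: Fraction(1, N ** k) for b in product(alphabet, repeat=k)}
gamma = [0, 0, 1, 1] * 40          # length 160 = k * n * N**k
w = [1] * 7

gamma_error = l1_distance(block_frequencies(gamma, k, alphabet), q)
L = n * N ** k * max(len(w), len(gamma))
worst = max(
    l1_distance(block_frequencies(prefix_of_w_gamma_star(w, gamma, l), k, alphabet), q)
    for l in range(L, L + 25)
)
print(float(gamma_error), L, float(worst))
```

In this instance the observed distance is far below 6/n, which is consistent with the lemma: the estimate is deliberately generous, since all that is needed later is that the distance tends to 0 as n grows.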
5.3 Proof of the main result
Theorem (Main Result). Let k ≥ 1 be an integer and $T \in \mathcal{T}$ be a positive and regular linear transformation. Furthermore, let $P = \{P_i : i \in I\}$ be a Markov Partition for a dynamical system (M, ϕ). Suppose that the generated shift space $X^+_{P,\varphi}$ is the one-sided full shift. Then, the set $E^T_k$ is residual.
Our strategy is to form a set E which is easier to work with, and to show that it is residual and that it is a subset of $E^T_k$. Our construction follows ideas from the proof of theorem 1.1 in Hyde et al. [5].
We begin by recursively defining the functions $\psi_m$ for m ≥ 1 by $\psi_1(x) = 2^x$ and $\psi_m(x) = \psi_1(\psi_{m-1}(x))$ for m ≥ 2. Next we choose a countable, dense subset of $S^*_k$ (in the finite Markov Partition case we ignore the asterisk ∗ and work with $S_k$). Set $D = S^*_k \cap \mathbb{Q}^{S^k}$ to be that set. Now we may concentrate on the probability vectors inside D.
We say that a sequence $(x_n)_{n \in \mathbb{N}}$ with terms in $\mathbb{R}^{S^k}$ has property P if for all $\mathbf{q} \in D$, m, i ∈ N and ε > 0 there exists a natural number j such that:
1. $j \ge i$
2. $\frac{j}{2^j} < \varepsilon$
3. if $j < n < \psi_m(2^j)$ then $\|x_n - \mathbf{q}\|_1 < \varepsilon$
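To get a feeling for the three conditions, the sketch below evaluates them for a given candidate j on a finite prefix of a sequence of vectors, taking $\psi_1(x) = 2^x$. The function names and the toy data are ours. Since $\psi_m(2^j)$ grows as a tower of exponentials, only very small values of m and j can be tested in practice.

```python
def psi(m, x):
    """psi_1(x) = 2**x and psi_m(x) = psi_1(psi_{m-1}(x)), i.e. an m-fold tower."""
    for _ in range(m):
        x = 2 ** x
    return x


def satisfies_conditions(xs, q, m, i, eps, j):
    """Check conditions 1-3 for the candidate j on the finite prefix xs."""
    upper = psi(m, 2 ** j)
    if upper > len(xs):
        raise ValueError("prefix too short to check condition 3")
    close = all(
        sum(abs(a - b) for a, b in zip(xs[n], q)) < eps
        for n in range(j + 1, upper)           # all n with j < n < psi_m(2**j)
    )
    return j >= i and j / 2 ** j < eps and close


# Toy data: four "bad" vectors followed by vectors equal to q.
q = (0.5, 0.5)
xs = [(0.9, 0.1)] * 4 + [q] * 296
print(psi(1, 2 ** 3), satisfies_conditions(xs, q, m=1, i=2, eps=0.5, j=3))
```

The window $j < n < \psi_m(2^j)$ is what makes the property strong: the sequence has to stay ε-close to the target vector over a range of indices that is enormous compared with j.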
We now define the set E, which consists of all points in $U_\infty$ whose symbolic expansion has a sequence of frequency vectors satisfying property P, i.e.
$$E = \left\{ \pi_{P,\varphi}(w) : (P_k(w, n))_{n \in \mathbb{N}} \text{ has property P} \right\} \cap U_\infty.$$
Our objective is now to show the following statements. Firstly, we show that our set E is residual. Then, we show that if $(P_k(w, n))_{n \in \mathbb{N}}$ has property P, then $(P^T_k(w, n))_{n \in \mathbb{N}}$ also has property P. Finally, we show that E is a subset of $E^T_k$ for k ∈ N and $T \in \mathcal{T}$ a positive and regular linear transformation.
Lemma 5.5. The set E is a residual subset of $U_\infty$.
Proof. To prove the lemma we will again make use of the properties of residual sets. We construct a countable family of open and dense sets in $U_\infty$ and show that our set E can be expressed as the intersection of this family, hence showing that E is residual. For that reason, fix α, m, i ∈ N and $\mathbf{q} \in D$ and define property $P_{\alpha,m,\mathbf{q},i}$ for a sequence $(x_n)_{n \in \mathbb{N}}$ with terms in $\mathbb{R}^{S^k}$ in the following way:
Our sequence satisfies property $P_{\alpha,m,\mathbf{q},i}$ if for all $\varepsilon > \frac{1}{\alpha}$ there exists a natural number j such that:
1. $j \ge i$
2. $\frac{j}{2^j} < \varepsilon$
3. if $j < n < \psi_m(2^j)$ then $\|x_n - \mathbf{q}\|_1 < \varepsilon$
Analogously to the discussion above, we define the set $E_{\alpha,m,\mathbf{q},i}$ which consists of all points in $U_\infty$ whose symbolic expansions have frequency vectors satisfying property $P_{\alpha,m,\mathbf{q},i}$, i.e.
$$E_{\alpha,m,\mathbf{q},i} = \left\{ \pi_{P,\varphi}(w) : (P_k(w, n))_{n \in \mathbb{N}} \text{ has property } P_{\alpha,m,\mathbf{q},i} \right\} \cap U_\infty.$$
It easily follows from the definitions that
$$E = \bigcap_{\alpha \in \mathbb{N}} \bigcap_{m \in \mathbb{N}} \bigcap_{\mathbf{q} \in D} \bigcap_{i \in \mathbb{N}} E_{\alpha,m,\mathbf{q},i}$$
and so it suffices to show that each set $E_{\alpha,m,\mathbf{q},i}$ is open and dense. Fix α, m, i ∈ N and $\mathbf{q} \in D$ and proceed with the following claims.
Claim. $E_{\alpha,m,q,i}$ is open.
Proof. Let $x \in E_{\alpha,m,q,i}$ and let $w = w_0 w_1 \ldots$ be the symbolic expansion of $x$, so that $\pi_{\mathcal{P},\varphi}(w) = x$. We want to find a ball of positive radius in $M$ which is a subset of $E_{\alpha,m,q,i}$. By construction, there exists a natural number $j$ satisfying the three conditions defining property $P_{\alpha,m,q,i}$, i.e. $j \geq i$, $j/2^j \leq \frac{1}{\alpha}$, and if $j < n < \psi_m(2^j)$ then $\|P_k(w,n) - q\|_1 \leq \frac{1}{\alpha}$. To simplify notation set $t := \psi_m(2^j)$ and consider
\[ D_t(w) = \bigcap_{k=0}^{t} \varphi^{-k}(P_{w_k}). \]
Recalling that $\varphi$ is a continuous function and each set $P \in \mathcal{P}$ is open by assumption, we get that the cylinder $D_t(w)$ is open as a finite intersection of open sets, and of course it contains $x$. Therefore, we can find a positive distance $\delta$ so that the ball in $U_\infty$ centred at $x$ of radius $\delta$ is a subset of $D_t(w)$. We now show that this ball $B_M(x,\delta)$ is a subset of $E_{\alpha,m,q,i}$. For a point $y \in B_M(x,\delta) \subseteq D_t(w)$, the first $t$ digits of the symbolic expansion of $y$ are the same as those of $x$. As a result, the same $j$ as above shows that the sequence $(P_k(\pi_{\mathcal{P},\varphi}^{-1}(y), n))_{n\in\mathbb{N}}$ has property $P_{\alpha,m,q,i}$, which implies that $y \in E_{\alpha,m,q,i}$. The choice of $y$ was arbitrary and so $B_M(x,\delta) \subseteq E_{\alpha,m,q,i}$.
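The key step, that points of a cylinder share their initial frequency vectors, can be illustrated numerically. The Python sketch below assumes that $P_k(w,n)$ records, for each block of length $k$, the proportion of the starting positions $0,\dots,n-1$ at which that block occurs in $w$. This convention and the helper name `block_freq` are assumptions made for illustration.

```python
from collections import Counter


def block_freq(w, k, n):
    """Frequencies of length-k blocks starting at positions 0..n-1 of w.

    Assumed convention for P_k(w, n); needs len(w) >= n + k - 1.
    """
    if len(w) < n + k - 1:
        raise ValueError("word too short for this n and k")
    counts = Counter(w[p:p + k] for p in range(n))
    return {block: c / n for block, c in counts.items()}


# Two words that agree on their first t = 8 digits and differ afterwards.
x = "01101001" + "0000"
y = "01101001" + "1111"
k, t = 2, 8

# Every P_k(., n) that only reads the first t digits is identical.
same = all(block_freq(x, k, n) == block_freq(y, k, n)
           for n in range(1, t - k + 2))
print(same)
```

Under this convention the vectors agree exactly for those $n$ with $n + k - 1 \leq t$, which is the sense in which the same $j$ works for every point of the cylinder.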
Claim. $E_{\alpha,m,q,i}$ is dense.
Proof. Let $x \in U_\infty$ and $\delta > 0$. It suffices to find an element in the intersection $B_M(x,\delta) \cap E_{\alpha,m,q,i}$. Again, denote by $w \in X^+_{\mathcal{P},\varphi}$ the symbolic expansion of $x$ and notice that $x \in D_t(w)$ for $t \geq 1$ and also $\operatorname{diam} D_t(w) \to 0$ as $t \to \infty$. Hence, there exists a positive natural number $t'$ so that $D_{t'}(w) \subset B(x,\delta)$.
Now set $\sigma := w|t'$ to be the block of the first $t'$ digits of the symbolic expansion of $x$. Using Lemma 5.1, we may choose a finite word $\gamma \in Z_{6\alpha}(q,N,k)$ so that $\|P_k(\gamma) - q\|_1 \leq \frac{1}{6\alpha}$. With the block $\sigma$ of length $t'$ and the finite word $\gamma \in Z_{6\alpha}(q,N,k)$ we can immediately use Lemma 5.2.
Let $\varepsilon > \frac{1}{\alpha}$ and $L$ be as in Lemma 5.2. Then we can choose a positive natural number $j$ big enough so that $j \geq \max\{i, L\}$ and $j/2^j < \varepsilon$. We will show that any point in the non-empty open cylinder $D_{\psi_m(2^j)}(\sigma\gamma^*)$ lies in the intersection $B_M(x,\delta) \cap E_{\alpha,m,q,i}$. Firstly, since $\sigma$ agrees with $w$ in the first $t'$ digits and by assumption $\psi_m(2^j) \geq j \geq L \geq t'$, we have that
\[ D_{\psi_m(2^j)}(\sigma\gamma^*) \subseteq D_{t'}(\sigma\gamma^*) = D_{t'}(w) \subset B(x,\delta). \]
So we are left to prove that $D_{\psi_m(2^j)}(\sigma\gamma^*) \subset E_{\alpha,m,q,i}$.
Let $y \in D_{\psi_m(2^j)}(\sigma\gamma^*)$ and remember that we chose $j$ so that $j \geq i$ and $j/2^j < \varepsilon$. We only need to show that for $j < n < \psi_m(2^j)$ we have $\|P_k(\pi_{\mathcal{P},\varphi}^{-1}(y), n) - q\|_1 < \varepsilon$. To do that we use the fact that the block frequencies of the symbolic representation of $y$ in the first $n$ digits, for $n < \psi_m(2^j)$, are the same as the block frequencies of $\sigma\gamma^*$, since $y \in D_{\psi_m(2^j)}(\sigma\gamma^*)$. Therefore,
\[ \|P_k(\pi_{\mathcal{P},\varphi}^{-1}(y), n) - q\|_1 = \|P_k(\sigma\gamma^*, n) - q\|_1, \]
and since $n > j \geq L$, Lemma 5.2 implies that
\[ \|P_k(\pi_{\mathcal{P},\varphi}^{-1}(y), n) - q\|_1 = \|P_k(\sigma\gamma^*, n) - q\|_1 \leq \frac{6}{6\alpha} = \frac{1}{\alpha} < \varepsilon. \]
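Lemma 5.6. Let $w \in X_{\mathcal{P},\varphi}$ and let $T \in \mathcal{T}$ be a positive and regular linear transformation. If $(P_k(w,r))_{r\in\mathbb{N}}$ has property P, then so does $(P_k^T(w,r))_{r\in\mathbb{N}}$.
Proof. Let $w \in X_{\mathcal{P},\varphi}$ be such that $(P_k(w,r))_{r\in\mathbb{N}}$ has property P and fix $\varepsilon > 0$, $q \in D$, $m, i \in \mathbb{N}$ and $T = [c_{m,n}]_{\mathbb{N}\times\mathbb{N}}$ a positive and regular linear transformation in $\mathcal{T}$. Let $G > 0$ be as in condition 2 of the definition of $\mathcal{T}$. Further, since $\gamma_r$, the $r$th row sum of $T$, tends to 1 as $r \to \infty$, we can find $R \in \mathbb{N}$ such that for $r \geq R$ we have $|\gamma_r - 1| < \varepsilon/3$. Finally, the sequence $(P_k(w,r))_{r\in\mathbb{N}}$ has property P, which means that we can find a $j \in \mathbb{N}$ satisfying the following three properties:
1. $j \geq \max\{i, R, 2\}$
2. $j/2^j \leq \frac{\varepsilon}{6G}$
3. for $n \in \mathbb{N}$ with $j < n < \psi_{m+1}(2^j)$ we have $\|P_k(w,n) - q\|_1 \leq \frac{\varepsilon}{3(1+\varepsilon)}$
We set $j' = 2^j$ and check that the sequence $(P_k^T(w,r))_{r\in\mathbb{N}}$ satisfies the conditions of property P.
1. $j' = 2^j > j \geq i$
2. $j' > j \geq 2$ and so $j'/2^{j'} \leq j/2^j \leq \varepsilon/3$
3. Finally, for $r \in \mathbb{N}$ with $j' < r < \psi_m(2^{j'})$, or equivalently $2^j < r < \psi_{m+1}(2^j)$ (since $\psi_m(2^{j'}) = \psi_m(\psi_1(2^j)) = \psi_{m+1}(2^j)$), we have that:
\begin{align*}
\|P_k^T(w,r) - q\|_1 &= \Big\| \sum_{n\in\mathbb{N}} c_{r,n} P_k(w,n) - q \Big\|_1 \\
&= \Big\| \sum_{n\in\mathbb{N}} c_{r,n} (P_k(w,n) - q) + q \Big( \sum_{n\in\mathbb{N}} c_{r,n} - 1 \Big) \Big\|_1 \\
&\leq \sum_{n\in\mathbb{N}} |c_{r,n}| \, \|P_k(w,n) - q\|_1 + \|q\|_1 \, |1 - \gamma_r|.
\end{align*}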
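Recalling that $\|q\|_1 = 1$ and $r > 2^j > j \geq R$ we get:
\begin{align*}
\|P_k^T(w,r) - q\|_1 &\leq \sum_{n \leq j} |c_{r,n}| \, \|P_k(w,n) - q\|_1 \\
&\quad + \sum_{j < n < \psi_{m+1}(2^j)} |c_{r,n}| \, \|P_k(w,n) - q\|_1 \\
&\quad + \sum_{n \geq \psi_{m+1}(2^j)} |c_{r,n}| \, \|P_k(w,n) - q\|_1 + \frac{\varepsilon}{3}.
\end{align*}
Using the fact that $T$ is a lower triangular matrix, so that $c_{r,n} = 0$ for $n \geq \psi_{m+1}(2^j) > r$, we get:
\begin{align*}
\|P_k^T(w,r) - q\|_1 &\leq \sum_{n \leq j} \sup_{n\in\mathbb{N}} c_{r,n} \, \big( \|P_k(w,n)\|_1 + \|q\|_1 \big) \\
&\quad + \sum_{j < n < \psi_{m+1}(2^j)} |c_{r,n}| \cdot \frac{\varepsilon}{3(1+\varepsilon)} + \sum_{n \geq \psi_{m+1}(2^j)} 0 \cdot \|P_k(w,n) - q\|_1 + \frac{\varepsilon}{3} \\
&\leq 2j \Big( \sup_{n\in\mathbb{N}} c_{r,n} \Big) + \frac{\varepsilon}{3(1+\varepsilon)} \sum_{j < n < \psi_{m+1}(2^j)} |c_{r,n}| + \frac{\varepsilon}{3} \\
&\leq 2j \Big( \frac{r}{2^j} \Big) \sup_{n\in\mathbb{N}} c_{r,n} + \frac{\varepsilon}{3(1+\varepsilon)} \cdot \gamma_r + \frac{\varepsilon}{3} \\
&\leq 2 \Big( \frac{j}{2^j} \Big) \, r \sup_{n\in\mathbb{N}} c_{r,n} + \frac{\varepsilon}{3(1+\varepsilon)} (1+\varepsilon) + \frac{\varepsilon}{3} \\
&\leq 2 \Big( \frac{\varepsilon}{6G} \Big) G + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.
\end{align*}
The choices of $m, i, q, \varepsilon$ were arbitrary, so we conclude that the sequence $(P_k^T(w,r))_{r\in\mathbb{N}}$ satisfies property P.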
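As a sanity check of the mechanism behind this estimate, consider the Cesàro mean, a positive, regular, lower triangular transformation with $c_{r,n} = 1/r$ for $1 \leq n \leq r$, so that $r \sup_n c_{r,n} = 1$. The Python sketch below uses scalar values in $[0,1]$ instead of frequency vectors, and all specific numbers (including the window end `N`) are made up for illustration. It only shows that the first $j$ terms contribute at most $j/r < j/2^j$ to the averaged sequence once $r > 2^j$.

```python
import random


def cesaro(xs, r):
    """Cesaro mean of x_1..x_r, i.e. row r of the matrix c_{r,n} = 1/r."""
    return sum(xs[:r]) / r


random.seed(0)
j, q = 6, 0.25
N = 2 ** (2 ** 3)          # a stand-in for the upper end of the window

# Arbitrary values for n <= j, then exactly q on the window j < n <= N.
xs = [random.random() for _ in range(j)] + [q] * (N - j)

# For 2**j < r <= N the mean deviates from q by at most j / r < j / 2**j.
worst = max(abs(cesaro(xs, r) - q) for r in range(2 ** j + 1, N + 1))
print(worst, j / 2 ** j)
```

In the lemma the sequence is only $\varepsilon/(3(1+\varepsilon))$-close to $q$ on the window rather than equal to it, which accounts for the additional terms in the estimate.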
For our final step we show that the set
\[ E = \{\pi_{\mathcal{P},\varphi}(w) : (P_k(w,n))_{n\in\mathbb{N}} \text{ has property P}\} \cap U_\infty \]
is a subset of
\[ E_k^T = \{\pi_{\mathcal{P},\varphi}(w) : A_k^T(w) = S_k\} \cap U_\infty, \]
the set of points in $U_\infty$ whose symbolic expansion has an averaged sequence of frequency vectors with the full simplex of shift invariant probability vectors as accumulation points.
Lemma 5.7. For all positive and regular linear transformations $T \in \mathcal{T}$ and $k \in \mathbb{N}$ we have $E \subseteq E_k^T$.
Proof. Let $T \in \mathcal{T}$, $x \in E$ and suppose $w$ is the symbolic expansion of $x$, so that $x = \pi_{\mathcal{P},\varphi}(w)$. By assumption, $(P_k(w,n))_{n\in\mathbb{N}}$ has property P, and so by the previous lemma the averaged version of this sequence, $(P_k^T(w,n))_{n\in\mathbb{N}}$, has property P. To show the containment it suffices to show that each $p \in S_k$ is an accumulation point of the averaged sequence. So fix $p \in S_k$ and $\eta \in \mathbb{N}$.
Since $D$ is dense in $S_k$, i.e. $\overline{D} = S_k$, there exists a probability vector $q \in D$ such that $\|p - q\|_1 < \frac{1}{\eta}$. Now we make use of the fact that $(P_k^T(w,n))_{n\in\mathbb{N}}$ has property P, so that for each $m \in \mathbb{N}$ we may choose a $j \in \mathbb{N}$ with:
1. $j \geq \eta$
2. $j/2^j < \frac{1}{\eta}$
3. for $n \in \mathbb{N}$ with $j < n < \psi_m(2^j)$ we have $\|P_k^T(w,n) - q\|_1 \leq \frac{1}{\eta}$
Fix a natural number $n_\eta$ in the interval $j < n_\eta < \psi_m(2^j)$. Then
\[ \|P_k^T(w,n_\eta) - p\|_1 \leq \|P_k^T(w,n_\eta) - q\|_1 + \|q - p\|_1 \leq \frac{1}{\eta} + \frac{1}{\eta} = \frac{2}{\eta}. \]
Since $n_\eta > j \geq \eta$, we may extract a strictly increasing subsequence $(n_{\eta_u})_{u\in\mathbb{N}}$ such that $P_k^T(w,n_{\eta_u}) \to p$ as $u \to \infty$. Therefore, $p$ is an accumulation point of $(P_k^T(w,n))_{n\in\mathbb{N}}$. This holds for all $p \in S_k$, so $x \in E_k^T$.
Proof of main result. Lemma 5.5 shows that the set $E$ is residual in $U_\infty$. Then, by Lemma 5.7, $E$ is a subset of $E_k^T$, and so $E_k^T$ is also residual in $U_\infty$ and thus residual in $M$, since $M \setminus U_\infty$ is meagre, completing the proof.
References
[1] Alabdulmohsin, I. M.: 2016, `A new summability method for divergent series'. arXiv:1604.07015 [math].
[2] Baire, R.: 1899, Sur les fonctions de variables réelles. Bernardoni de C. Rebeschini.
[3] Borel, É.: 1909, `Les probabilités dénombrables et leurs applications arithmétiques'. Rendiconti del Circolo Matematico di Palermo (1884-1940) 27(1), 247–271.
[4] Hardy, G. H.: 2000, Divergent Series. American Mathematical Society.
[5] Hyde, J. T., V. Laschos, L. O. R. Olsen, I. Petrykiewicz, and A. Shaw: 2010, `Iterated Cesàro averages, frequencies of digits, and Baire category'. Acta Arithmetica 144, 287–293.
[6] Lind, D. and B. Marcus: 1995, An Introduction to Symbolic Dynamics and Coding. Cambridge University Press.
[7] Madritsch, M. G. and I. Petrykiewicz: 2014, `Non-normal numbers in dynamical systems fulfilling the specification property'. arXiv:1402.1506 [math].
[8] Niven, I. and H. S. Zuckerman: 1951, `On the definition of normal numbers'. Pacific Journal of Mathematics 1(1), 103–109.
[9] Olsen, L. O. R.: 2003, `Extremely non-normal continued fractions'. Acta Arithmetica 108, 191–202.
[10] Oxtoby, J. C.: 2013, Measure and Category: A Survey of the Analogies between Topological and Measure Spaces. Springer New York.
Appendix A
Regular linear transformations
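Theorem. A linear transformation $T = [c_{m,n}]_{\mathbb{N}\times\mathbb{N}}$ is regular if and only if:
1. there exists $M > 0$ such that for all $m \in \mathbb{N}$ we have $\gamma_m = \sum_{n=0}^{\infty} |c_{m,n}| < M$,
2. for all $n \in \mathbb{N}$, $c_{m,n} \to 0$ as $m \to \infty$, and
3. setting $c_m = \sum_{n=0}^{\infty} c_{m,n}$, we have $c_m \to 1$ as $m \to \infty$.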
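For orientation, the three conditions can be checked by hand for the Cesàro matrix, $c_{m,n} = 1/(m+1)$ for $0 \leq n \leq m$ and $c_{m,n} = 0$ otherwise. The following Python sketch verifies them on a few rows and applies the transformation to a convergent sequence. The function names are illustrative, and since only finitely many rows are inspected this is a numerical illustration rather than a proof.

```python
def cesaro_row(m):
    """Row m of the Cesaro matrix: c_{m,n} = 1/(m+1) for 0 <= n <= m."""
    return [1.0 / (m + 1)] * (m + 1)


def transform(row, s):
    """t_m = sum_n c_{m,n} s_n for a finitely supported row."""
    return sum(c * s[n] for n, c in enumerate(row))


rows = {m: cesaro_row(m) for m in (10, 100, 1000, 10000)}

# Condition 1: absolute row sums are bounded (here they all equal 1).
abs_sums = [sum(abs(c) for c in row) for row in rows.values()]
# Condition 2: a fixed column (n = 0) tends to 0 as m grows.
col0 = [row[0] for row in rows.values()]
# Condition 3: row sums tend to 1.
row_sums = [sum(row) for row in rows.values()]

# A sequence with s_n -> 2; its transform should approach 2 as well.
s = [2 + 1 / (n + 1) for n in range(10001)]
t = [transform(row, s) for row in rows.values()]
print(abs_sums, col0, row_sums, t)
```

The transformed values approach 2 slowly, at the rate of the averaged harmonic sum, which is consistent with regularity preserving limits but not speeds of convergence.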
Proof. The proof follows very similar ideas to the proofs of Theorems 1 and 2 from Chapter 3 of Hardy [4].
($\Leftarrow$): Assume $T = [c_{m,n}]_{\mathbb{N}\times\mathbb{N}}$ is a linear transformation such that the three conditions of the theorem are satisfied. We want to show that $T$ is regular. Let $(s_n)_{n\in\mathbb{N}}$ be a real sequence with $s_n \to s$ as $n \to \infty$. We wish to show that $t_m = \sum_{n=0}^{\infty} c_{m,n} s_n \to s$ as $m \to \infty$.
Claim. It suffices to prove the result in the case that $s = 0$.
Proof. Assume that whenever a sequence converges to 0, the corresponding sequence $(t_m)_{m\in\mathbb{N}}$ also converges to 0. Then consider any sequence $(s_n)_{n\in\mathbb{N}}$ for which there exists $s \in \mathbb{R}$ with $s_n \to s$, and write $s'_n = s_n - s$ and $t'_m = \sum_{n=0}^{\infty} c_{m,n} s'_n$. It is clear that the sequence $(s'_n)_{n\in\mathbb{N}}$ tends to 0, and by assumption $(t'_m)_{m\in\mathbb{N}}$ also tends to 0. Then condition 3 implies that
\[ t_m = \sum_{n=0}^{\infty} c_{m,n} s_n = \sum_{n=0}^{\infty} c_{m,n} (s'_n + s) = t'_m + s\,c_m \to 0 + s \cdot 1 = s \quad \text{as } m \to \infty. \]
So now we can suppose $s = 0$ and let $\varepsilon > 0$.
Claim. For all $m \in \mathbb{N}$, the series $t_m = \sum_{n=0}^{\infty} c_{m,n} s_n$ and $c_m = \sum_{n=0}^{\infty} c_{m,n}$ are absolutely convergent.
Proof. Let $m \in \mathbb{N}$ and notice that the result for $c_m$ is condition 1, which holds by assumption. Now consider $t_m$. Firstly, observe that since the sequence $(s_n)_{n\in\mathbb{N}}$ is convergent, it is also bounded. Therefore, there exists $K > 0$ such that $|s_n| \leq K$ for all $n \in \mathbb{N}$. Using this fact together with condition 1 we can conclude the result for $t_m$ in the following way:
\[ \sum_{n=0}^{\infty} |c_{m,n} s_n| \leq K \sum_{n=0}^{\infty} |c_{m,n}| < KM, \]
and hence the partial sums of the series $\sum_{n=0}^{\infty} |c_{m,n} s_n|$ form an increasing sequence which is bounded above. An application of the monotone convergence theorem gives the result.
In order to show that $t_m$ tends to 0 we need one final tool. The convergence of the sequence $(s_n)_{n\in\mathbb{N}}$ to 0 implies that there exists $N(\varepsilon) \in \mathbb{N}$ so that $n \geq N(\varepsilon)$ implies $|s_n| < \frac{\varepsilon}{2M}$. Then,
\[ |t_m| = \Big| \sum_{n=0}^{\infty} c_{m,n} s_n \Big| = \Big| \sum_{n=0}^{N(\varepsilon)-1} c_{m,n} s_n + \sum_{n=N(\varepsilon)}^{\infty} c_{m,n} s_n \Big| \leq \Big| \sum_{n=0}^{N(\varepsilon)-1} c_{m,n} s_n \Big| + \Big| \sum_{n=N(\varepsilon)}^{\infty} c_{m,n} s_n \Big|, \]
and so, looking at the two summands separately, we get:
• For the first summand, notice that for fixed $N(\varepsilon)$ condition 2 implies that this summand tends to 0 as $m \to \infty$, i.e. we can find $M(\varepsilon) = M(\varepsilon, N(\varepsilon)) \in \mathbb{N}$ so that $m \geq M(\varepsilon)$ implies
\[ \Big| \sum_{n=0}^{N(\varepsilon)-1} c_{m,n} s_n \Big| \leq \frac{\varepsilon}{2}. \]
• For the second summand, using condition 1 this time, we get
\[ \Big| \sum_{n=N(\varepsilon)}^{\infty} c_{m,n} s_n \Big| \leq \sum_{n=N(\varepsilon)}^{\infty} |c_{m,n} s_n| \leq \frac{\varepsilon}{2M} \sum_{n=N(\varepsilon)}^{\infty} |c_{m,n}| \leq \frac{\varepsilon}{2M} \sum_{n=0}^{\infty} |c_{m,n}| \leq \frac{\varepsilon}{2M} \cdot M = \frac{\varepsilon}{2}. \]
Finally, combining the findings for the two summands, we get that for $m \geq M(\varepsilon)$ we have $|t_m| \leq \varepsilon$, proving that $t_m \to 0$ as $m \to \infty$.
($\Rightarrow$): Conversely, suppose $T = [c_{m,n}]_{\mathbb{N}\times\mathbb{N}}$ is a regular linear transformation. We want to show that the three conditions of the theorem are necessary.
Claim. $c_m = \sum_{n=0}^{\infty} c_{m,n} \to 1$ as $m \to \infty$.
Proof. Consider the constant sequence $(s_n)_{n\in\mathbb{N}}$ with $s_n = 1$ for all $n \in \mathbb{N}$. Then clearly $s_n \to 1$ and so $t_m = \sum_{n=0}^{\infty} c_{m,n} s_n \to 1$ as $m \to \infty$. Therefore,
\[ c_m = \sum_{n=0}^{\infty} c_{m,n} = \sum_{n=0}^{\infty} c_{m,n} \cdot 1 = \sum_{n=0}^{\infty} c_{m,n} s_n \to 1 \quad \text{as } m \to \infty. \]
Similarly, the second condition is necessary: for each $n \in \mathbb{N}$ consider the sequence $(s_k)_{k\in\mathbb{N}}$ with $s_n = 1$ and $s_k = 0$ for all $k \in \mathbb{N} \setminus \{n\}$. This sequence converges to 0 and its transform is $t_m = c_{m,n}$, so regularity gives $c_{m,n} \to 0$ as $m \to \infty$. Finally, we show in the next claim that the first condition is necessary, which completes the proof of the theorem.
Claim. There exists $M > 0$ such that for all $m \in \mathbb{N}$ we have $\gamma_m = \sum_{n=0}^{\infty} |c_{m,n}| < M$.
Proof. As a first step we need to show that each $\gamma_m$ has a finite value. Assume not, i.e. there exists $m \in \mathbb{N}$ so that $\gamma_m = \sum_{n=0}^{\infty} |c_{m,n}| = \infty$. Then we can construct a sequence $(\varepsilon_n)_{n\in\mathbb{N}}$ with $\varepsilon_n > 0$ for all $n \in \mathbb{N}$ and $\varepsilon_n \to 0$ so that $\sum_{n=0}^{\infty} \varepsilon_n |c_{m,n}| = \infty$. (E.g. take $\varepsilon_n = \big( \sum_{\nu=N}^{n} |c_{m,\nu}| \big)^{-1}$, where $c_{m,N}$ is the first nonzero element of the sequence $(c_{m,n})_{n\in\mathbb{N}}$.) But then, by considering the sequence $(s_n)_n$ with $s_n = \varepsilon_n \operatorname{sgn}(c_{m,n})$, we have that $s_n \to 0$ but
\[ t_m = \sum_{n=0}^{\infty} c_{m,n} s_n = \sum_{n=0}^{\infty} \varepsilon_n |c_{m,n}| = \infty, \]
contradicting the fact that $t_m$ is a real number. Hence $(\gamma_m)_m$ is a sequence of non-negative real numbers, and it remains to show that $(\gamma_m)_m$ is a bounded sequence. Assume not, for the sake of contradiction. Since $\gamma_m \geq 0$ for all $m \in \mathbb{N}$ and the sequence is unbounded, for any $G > 0$ there exists $m_0 \in \mathbb{N}$ so that $\gamma_{m_0} > G$. Now for $n \in \mathbb{N}$ define $\gamma_{m,n} = \sum_{\nu=0}^{n} |c_{m,\nu}|$. Then, by construction, $\gamma_{m,n} \to \gamma_m$ as $n \to \infty$. In order to reach the desired contradiction we are going to construct a convergent sequence whose averaged version does not converge to the same limit. To this end, let $n_1 \in \mathbb{N}$ and define inductively the sequences of integers $(n_r)_r$ and $(m_r)_r$ in the following way.
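Suppose $m_1, \ldots, m_{r-1}$ and $n_1, \ldots, n_r$ are determined. Then choose $m_r$ and $n_{r+1}$ as follows. Take $m_r$ big enough so that
(i): $m_r > m_{r-1}$
(ii): $\gamma_{m_r,n_r} = \sum_{n=0}^{n_r} |c_{m_r,n}| < 1$ (we can do this since, by condition 2 which we proved above, $\lim_{m\to\infty} c_{m,n} = 0$ for all $n \in \mathbb{N}$)
(iii): $\gamma_{m_r} = \sum_{n=0}^{\infty} |c_{m_r,n}| > r^2 + 2r + 2$ (we can do this since $(\gamma_m)_m$ is unbounded and each $\gamma_m$ has a finite value)
Now, using the fact that $\gamma_{m,n} \to \gamma_m$ as $n \to \infty$, we choose $n_{r+1} > n_r$ big enough so that
\[ \gamma_{m_r} - \gamma_{m_r,n_{r+1}} = \sum_{n=n_{r+1}+1}^{\infty} |c_{m_r,n}| < 1. \]
By construction of $m_r$ and $n_{r+1}$ it easily follows that $\sum_{n=n_r+1}^{n_{r+1}} |c_{m_r,n}| > r^2 + 2r$. We now define the crucial sequence $(s_n)_n$ by:
\[ s_n = \begin{cases} 0 & n \leq n_1, \\ \frac{1}{r} \operatorname{sgn}(c_{m_r,n}) & n_r < n \leq n_{r+1} \text{ for } r = 1, 2, \ldots \end{cases} \]
Then it is clear that the sequence $(s_n)_n$ is bounded by 1 and converges to 0. Now consider its averaged version. Isolating the block $n_r < n \leq n_{r+1}$, on which $c_{m_r,n} s_n = \frac{1}{r} |c_{m_r,n}|$, and using $|s_n| \leq 1$ on the remaining indices, we get
\begin{align*}
|t_{m_r}| = \Big| \sum_{n=0}^{\infty} c_{m_r,n} s_n \Big| &\geq \frac{1}{r} \sum_{n=n_r+1}^{n_{r+1}} |c_{m_r,n}| - \sum_{n=0}^{n_r} |c_{m_r,n}| - \sum_{n=n_{r+1}+1}^{\infty} |c_{m_r,n}| \\
&> \frac{1}{r} (r^2 + 2r) - 1 - 1 = r,
\end{align*}
implying that $(t_m)_m$ has a subsequence that diverges to infinity, and so the sequence $(t_m)_m$ does not converge to 0, a contradiction to $T$ being regular.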
Appendix B
Accumulation points of block frequencies
Theorem. Let $w \in X^+_{\mathcal{P},\varphi}$. Then $A_k^T(w) \subseteq F_k$, where $F_k$ is the following simplex of shift invariant vectors:
\[ F_k = \Big\{ (p_i)_{i \in S^k} : p_i \geq 0, \ \sum_{i \in S^k} p_i \leq 1, \ \sum_{i \in S} p_{ij} = \sum_{i \in S} p_{ji} \text{ for all } j \in S^{k-1} \Big\}. \]
Proof. Let $p = (p_i)_{i \in S^k}$ be an accumulation point of the sequence $(P_k^T(w,m))_{m\in\mathbb{N}}$ with respect to $\|\cdot\|_1$. Clearly $p$ satisfies the condition $p_i \geq 0$, since by assumption $T$ is positive and the sequence $(P_k(w,n))_{n\in\mathbb{N}}$ consists of frequencies, i.e. non-negative numbers. More crucially, there exists a strictly increasing sequence $(n_m)_m$ of positive integers such that
\[ \|P_k^T(w,n_m) - p\|_1 \to 0 \quad \text{as } m \to \infty. \tag{B.1} \]
It follows that for $i \in S^k$,
\[ \sum_{n\in\mathbb{N}} c_{n_m,n} P_k(w,i,n) \to p_i \quad \text{as } m \to \infty. \]
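But then we get the second condition in the following way:
\begin{align*}
1 &= \lim_{m\to\infty} \sum_{n\in\mathbb{N}} c_{n_m,n} = \lim_{m\to\infty} \sum_{n\in\mathbb{N}} c_{n_m,n} \|P_k(w,n)\|_1 = \lim_{m\to\infty} \sum_{n\in\mathbb{N}} c_{n_m,n} \sum_{i \in S^k} P(w,i,n) \\
&\begin{cases} = \sum_{i \in S^k} \lim_{m\to\infty} \sum_{n\in\mathbb{N}} c_{n_m,n} P(w,i,n) = \sum_{i \in S^k} p_i = \|p\|_1 & \text{if } S \text{ is finite,} \\ \geq \sum_{i \in S^k} \lim_{m\to\infty} \sum_{n\in\mathbb{N}} c_{n_m,n} P(w,i,n) = \sum_{i \in S^k} p_i = \|p\|_1 & \text{if } S \text{ is infinite,} \end{cases}
\end{align*}
where the inequality comes from the use of Fatou's Lemma.
For $j \in S^{k-1}$, considering all possible ways this block of length $k-1$ can occur, it follows that
\[ \Big| \sum_{i \in S} P(w,ij,n) - \sum_{i \in S} P(w,ji,n) \Big| \leq \frac{1}{n}. \tag{B.2} \]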
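Inequality (B.2) can be checked numerically. The Python sketch below assumes the convention that $P(w,b,n)$ is the proportion of starting positions $0,\dots,n-1$ at which the block $b$ occurs in $w$. The random word and the names `block_freq` and `shift_defect` are illustrative only.

```python
import random
from collections import Counter


def block_freq(w, k, n):
    """Assumed convention for P_k(w, n): blocks starting at 0..n-1."""
    counts = Counter(w[p:p + k] for p in range(n))
    return {b: c / n for b, c in counts.items()}


def shift_defect(w, k, n, j, alphabet):
    """|sum_i P(w, ij, n) - sum_i P(w, ji, n)| for a block j of length k-1."""
    freq = block_freq(w, k, n)
    left = sum(freq.get(i + j, 0.0) for i in alphabet)
    right = sum(freq.get(j + i, 0.0) for i in alphabet)
    return abs(left - right)


random.seed(1)
alphabet = "012"
k, n = 3, 500
w = "".join(random.choice(alphabet) for _ in range(n + k - 1))

blocks = [a + b for a in alphabet for b in alphabet]   # all j of length k-1
worst = max(shift_defect(w, k, n, j, alphabet) for j in blocks)
print(worst <= 1 / n + 1e-12)
```

The two sums count occurrences of $j$ at starting positions $1,\dots,n$ and $0,\dots,n-1$ respectively, so the counts differ by at most one, which is where the bound $1/n$ comes from. The small tolerance only absorbs floating point rounding.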
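Now it follows from (B.1) and (B.2) that if $j \in S^{k-1}$ then, writing $P^T(w,b,n_m) = \sum_{n\in\mathbb{N}} c_{n_m,n} P(w,b,n)$ for the averaged frequency of a block $b \in S^k$,
\begin{align*}
\Big| \sum_{i \in S} p_{ij} - \sum_{i \in S} p_{ji} \Big| &\leq \Big| \sum_{i \in S} p_{ij} - \sum_{i \in S} P^T(w,ij,n_m) \Big| + \Big| \sum_{i \in S} P^T(w,ij,n_m) - \sum_{i \in S} P^T(w,ji,n_m) \Big| \\
&\quad + \Big| \sum_{i \in S} P^T(w,ji,n_m) - \sum_{i \in S} p_{ji} \Big| \\
&\leq \sum_{i \in S} |p_{ij} - P^T(w,ij,n_m)| + \sum_{n\in\mathbb{N}} \frac{c_{n_m,n}}{n} + \sum_{i \in S} |P^T(w,ji,n_m) - p_{ji}| \\
&\leq \|P_k^T(w,n_m) - p\|_1 + \sum_{n\in\mathbb{N}} \frac{c_{n_m,n}}{n} + \|P_k^T(w,n_m) - p\|_1 \to 0 \quad \text{as } m \to \infty.
\end{align*}
Here the middle term is obtained by averaging (B.2) with the non-negative weights $c_{n_m,n}$, and it tends to 0 because $T$ is regular and $\frac{1}{n} \to 0$. The two outer terms tend to 0 by (B.1). This proves the final condition for $p$ to be an element of $F_k$, namely that for all $j \in S^{k-1}$ we have $\sum_{i \in S} p_{ij} = \sum_{i \in S} p_{ji}$.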