Notes on Linear Recurrence Sequences - Georgia...

21
Notes on Linear Recurrence Sequences April 8, 2005 As far as preparing for the final exam, I only hold you responsible for knowing sections 1, 2.1, 2.2, 2.6 and 2.7. 1 Definitions and Basic Examples An example of a linear recurrence sequence is the Fibonacci numbers F 0 ,F 1 ,F 2 , ..., where F 0 = 0, F 1 = 1, and F n+1 = F n + F n-1 for n 2; so, the sequence begins 0, 1, 1, 2, 3, 5, 8, ... In general, a linear recurrence sequence X 0 ,X 1 , ... obeys the rule X n = a 0 X n-1 + a 1 X n-2 + ··· + a k-1 X n-k (1) for n k, where a 0 , ..., a k-1 are constants. The values X 0 , ..., X k-1 are initial conditions. An example is, say k = 3, a 0 = 1, a 1 = 2, a 2 = 3, X 0 = 0, X 1 = 1, X 2 = 2. This produces the recurrence relation X n = X n-1 +2X n-2 +3X n-3 . So, the sequence is X 0 ,X 1 ,X 2 , ... =0, 1, 2, 4, 11, 25, ... 2 A Formula for X n 2.1 The Characteristic Polynomial The simplest of all linear recurrence sequences are geometric progressions, which are defined by the rule X 0 =1,X n+1 = aX n , 1

Transcript of Notes on Linear Recurrence Sequences - Georgia...

Notes on Linear Recurrence Sequences

April 8, 2005

As far as preparing for the final exam, I only hold you responsible forknowing sections 1, 2.1, 2.2, 2.6 and 2.7.

1 Definitions and Basic Examples

An example of a linear recurrence sequence is the Fibonacci numbers F0, F1, F2, ...,where F0 = 0, F1 = 1, and Fn+1 = Fn + Fn−1 for n ≥ 2; so, the sequencebegins

0, 1, 1, 2, 3, 5, 8, ...

In general, a linear recurrence sequence X0, X1, ... obeys the rule

Xn = a0Xn−1 + a1Xn−2 + · · · + ak−1Xn−k (1)

for n ≥ k, where a0, ..., ak−1 are constants. The values X0, ..., Xk−1 areinitial conditions.

An example is, say k = 3, a0 = 1, a1 = 2, a2 = 3, X0 = 0, X1 = 1,X2 = 2. This produces the recurrence relation

Xn = Xn−1 + 2Xn−2 + 3Xn−3.

So, the sequence is

X0, X1, X2, ... = 0, 1, 2, 4, 11, 25, ...

2 A Formula for Xn

2.1 The Characteristic Polynomial

The simplest of all linear recurrence sequences are geometric progressions,which are defined by the rule

X0 = 1, Xn+1 = aXn,

1

in other wordsX0, X1, X2, ... = 1, a, a2, a3, ...

Such a sequence has the property that

Xn+1

Xn= a,

that is, the ratio of successive terms is a.This property (successive term ratios are constant) is not shared by the

Fibonacci numbers; however, one can speculate that the ratio of successiveFibonacci numbers tends to a limit. That is, does there exist a number ϕsuch that

limn→∞

Fn+1

Fn= ϕ ?

It turns out that the answer is YES; and, remarkably, just knowing thatthe limit exists is enough to find it: Indeed, if n is a very big integer, then

Fn+1

Fn≈ ϕ, and

Fn

Fn−1≈ ϕ.

Now,

Fn+1 = Fn + Fn−1 =⇒ Fn+1

Fn= 1 +

Fn−1

Fn.

This last equation tells us that

ϕ ≈ 1 +1

ϕ.

In fact, we get equality here by letting n tend to infinity; so,

ϕ2 − ϕ − 1 = 0.

The limit ϕ is thus either

1 +√

5

2, or

1 −√

5

2.

The fact that this second ratio is negative means that it cannot be our ϕ;and so,

ϕ =1 +

√5

2.

2

The polynomial f(x) = x2 −x−1 is called the characteristic polynomialfor Fn. More generally, if we have a sequence defined as in (1), then thecharacteristic polynomial is defined to be

xk − a0xk−1 − a1x

k−2 − · · · − ak−1. (2)

If

limn→∞

Xn+1

Xnexists,

then this limit will be a root of the polynomial (2). However, there areexamples of sequences where this limit fails to exist; for example,

Xn+1 = Xn − Xn−1, with initial conditions X0 = 0, X1 = 1.

The sequence here is

X0, X1, X2, ... = 0, 1, 1, 0,−1,−1, 0, 1, 1, 0,−1,−1, 0, ...

2.2 Using Matrices to Determine Fn

We begin with a really interesting relation:

[

1 11 0

] [

Fm+1

Fm

]

=

[

Fm+2

Fm+1

]

.

It is not immediately obvious what this gives us, but notice what happensif we multiply both sides by that 0 − 1 matrix:

[

1 11 0

]2 [Fm+1

Fm

]

=

[

1 11 0

] [

Fm+2

Fm+1

]

=

[

Fm+3

Fm+2

]

.

In fact, if we repeat this a couple of times we get

[

1 11 0

]n [Fm+1

Fm

]

=

[

Fm+n+1

Fm+n

]

.

So, for m = 0 we get

[

1 11 0

]n [10

]

=

[

Fn+1

Fn

]

.

On the other hand, we know from linear algebra that to compute a highpower of a matrix (such as our 0− 1 matrix), the task is fairly easy once we

3

have diagonalized. First, we must find the eigenvalues, which are determinedby the characteristic polynomial: We begin by letting

A =

[

1 11 0

]

.

Then, the characteristic polynomial is

1 − λ 11 −λ

= (1 − λ)(−λ) − 1 = λ2 − λ − 1.

Notice that this polynomial is the characteristic polynomial we defined inthe previous section!

Now, we know that

A = S

[

ϕ 00 ϕ′

]

S−1,

where ϕ and ϕ′ are roots of the characteristic polynomial, and where Sis the matrix whose columns are eigenvectors of A corresponding to theseeigenvalues ϕ and ϕ′; in fact,

S =

[

ϕ ϕ′

1 1

]

, and S−1 =1√5

[

1 −ϕ′

−1 ϕ

]

.

Call this diagonal matrix Λ. Then,

An = (SΛS−1)(SΛS−1) · · · (SΛS−1) = SΛnS−1 = S

[

ϕn 00 (ϕ′)n

]

S−1.

So, with a little work, we find that the second entry of the column vector

An

[

10

]

is1√5

(

ϕn − (ϕ′)n)

.

Which is a very nice formula, I hope you will agree!

4

2.3 A Comment about this Formula

Since|ϕ′| < 1,

we know that as n tends to infinity, the term (ϕ′)n tends to 0. So, for n ≥ 1,we will have Fn is the nearest integer to

ϕn

√5.

In fact, the larger n is, the closer that this number is to an integer, whichis rather strange: How many irrational numbers ϕ do you know of with theproperty that ϕn/

√5 is always near to an integer? For example,

ϕ10

√5

= 55.003635..., andϕ13

√5

= 232.9991401....

It also turns out to be the case that

ϕn + (ϕ′)n is an integer,

So, powers of ϕ should also be near to an integer. For example,

ϕ13 = 521.0019162..., ϕ16 = 2206.999531...

There is a general class of irrational numbers ϕ > 1 with the remarkableproperty that the powers ϕn all get closer and closer to an integer. Theyare called Pisot numbers.

2.4 A Formula for Special Sequences Xn

As with the Fibonacci numbers, we have that the system

Xn+1 = aXn + bXn−1

satisfies[

a b1 0

]n [X1

X0

]

=

[

Xn+1

Xn

]

.

The characteristic polynomial of this matrix is the same as the charac-teristic polynomial for the sequence Xn we defined in a previous section.

Now, if the roots of this polynomial are distinct, then we know that Acan be diagonalized, and then we get

[

Xn+1

Xn

]

= S

[

λn1 00 λn

2

]

S−1.

5

So, it is easy to see that Xn is some linear combination of λn1 and λn

2 ;that is,

Xn = Aλn1 + Bλn

2 .

If the matrix cannot be diagonalized, things are more subtle.

Example. Suppose that Xn is defined by the rule

Xn+1 = 3Xn − 2Xn−1, X0 = 0, X1 = 1.

Then, the characteristic polynomial is

x2 − 3x + 2,

which has the roots x = 1 and x = 2. So,

Xn = A + B2n.

Setting n = 0, we find0 = X0 = A + B,

so A = −B; and, setting n = 1, we find

1 = X1 = A + 2B.

So, B = 1 and A = −1. So,

Xn = 2n − 1.

More generally, we may use the matrix form of an arbitrary sequence

Xn = a0Xn−1 + · · · + ak−1Xn−k.

We get

a0 a1 a2 · · · ak−1

1 0 0 · · · 00 1 0 · · · 0...

......

. . ....

0 0 0 · · · 1

n

Fk−1

Fk−2...

F0

=

Fk−1+n

Fk−2+n

...Fn

.

It turns out that the characteristic polynomial of this matrix equalsthe characteristic polynomial of the sequence Xn. Now, if the matrix canbe diagonalized, as with the case of Fibonacci numbers, then the sequencemust have the form

Xn = A1λn1 + A2λ

n2 + · · · + Atλ

nt , (3)

where λ1, ..., λt are the eigenvalues of A (Note: We may have t < k, becausesome of the eigenvalues could be repeated.)

6

2.5 Exceptional Sequences

There are some sequences which do not have the form (3). For example,consider the sequence

Xn = 4Xn−1 − 4Xn−2.

The corresponding matrix here is

A =

[

4 −41 0

]

,

which cannot be diagonalized.

2.6 Generating Functions

As before, we suppose that

Xn = a0Xn−1 + · · · + ak−1Xn−k.

Then, consider the sum

f(x) =

∞∑

n=0

Xnxn.

Suppose that m ≥ k. Then, we observe that

f(x) = · · · + Xmxm + · · ·xf(x) = · · · + Xm−1x

m + · · ·x2f(x) = · · · + Xm−2x

m + · · ·... · · · + Xm−kx

m−k + · · ·

So, then, for m ≥ k we find that

(1 − a0x − a1x2 − · · · − ak−1x

k)f(x) = · · · + (Xm − a0Xm−1 − · · · − ak−1Xm−k)xm + · · ·= · · · + 0xm + · · · .

So, the coefficient is 0. So, we must have that

(1 − a0x − a1x2 − · · · − ak−1x

k)f(x) = g(x),

where g(x) is some polynomial of degree at most k − 1. It follows that

f(x) =g(x)

1 − a0x − · · · − ak−1xk.

7

The polynomial on the denominator is the characteristic polynomial ofthe sequence Xn, written backwards (that is, the coefficients are written inreverse order). Another way of expressing this polynomial is as follows: Let

h(x) = 1 − a0x − · · · − ak−1xk,

and letp(x) = xk − a0x

k−1 − · · · − ak−1.

Then,h(x) = xkp(1/x).

It follows that the roots of h(x) are the reciprocals of the roots of p(x) (notethat 0 is never a root of the polynomial).

It turns out that the generating function of a sequence Xn is a rationalfunction A(x)/B(x) if and only if Xn is a linear recurrence sequence.

2.7 The General Case

Now suppose that the characteristic polynomial p(x) factors as follows

p(x) = (x − λ1)α1(x − λ2)

α2 · · · (x − λt)αt ,

where the λi’s are all distinct, and the αi ≥ 1. Note that

α1 + · · · + αt = k.

Then,h(x) = (1 − λ1x)α1 · · · (1 − λtx)αt .

So, by the theory of partial fractions, we know that there exist constants

A1,1, ..., A1,α1; A2,1, ..., A2,α2

; ...; At,1, ..., At,αt

such that

f(x) =g(x)

h(x)=

t∑

i=1

(

Ai,1

1 − λix+

Ai,2

(1 − λix)2+ · · · + Ai,αi

(1 − λix)αi

)

. (4)

Now, the termAi,j

(1 − λix)j

8

is the (j − 1)st derivative of

Ai,j

(j − 1)!λj−1i (1 − λix)

=Ai,j

(j − 1)!λj−1i

∞∑

m=0

λmi xm.

So, the coefficient of xm inAi,j

(1 − λix)j

is justAi,jm(m − 1) · · · (m − j + 2)λm

i

(j − 1)!,

(when j = 1 this is to be Ai,j) which can be expressed as

q(m)λmi ,

where q(m) is some polynomial of degree j − 1 in m. So, the coefficient ofxm in

Ai,1

1 − λix+

Ai,2

(1 − λix)2+ · · · + Ai,αi

(1 − λix)αi

is of the formqi(m)λm

i ,

where qi(m) is some polynomial of degree at most αi − 1 in m. Combiningthis with (4), we deduce that

Xm = q1(m)λm1 + q2(m)λm

2 + · · · + qt(m)λmt , (5)

where qi(m) is a polynomial in m of degree αi − 1.

2.8 The Non-Homogeneous Case

Before we give a non-trivial application of the formula (5) for the mth termin a general recurrence sequence, we work out the non-homogeneous case:Up until now we have been dealing with the “homogeneous case” of lin-ear recurrence sequences, which can be described as follows: A recurrencerelation of the form

Xn = a0Xn−1 + · · · + ak−1Xn−k

can be rewritten as

Xn − a0Xn−1 − · · · − ak−1Xn−k = 0.

9

The left-hand-side is linear in Xn, ..., Xn−k, with coefficients 1,−a0, ...,−ak−1,while the right hand side is 0. This is what we mean by “homogeneous”.

But now we can ask about sequences Yn which satisfy

c0Yn + c1Yn−1 + · · · + ckYn−k = Zn, (6)

where Zn is some sequence.In the case where Zn = c is a constant we can reduce the problem of

finding a formula for Yn to the homogeneous case as follows: We observethat

(c0Yn+1 + · · · + ckYn−k+1) − (c0Yn + · · · + ckYn−k) = 0.

The left hand side is then a linear combination of Yn−k, ..., Yn+1; and so, weare back to the homogeneous case.

The question now becomes: Which sequences Zn can you always reduceto the homogeneous case? And it turns out that if Zn is itself the nthterm of a homogeneous linear recurrence sequence, then the reduction goesthrough. Perhaps the easiest way to see this, and to deduce a formula forsuch sequences Yn, is to work with generating functions.

Suppose that f(x) is the generating function for Xn; that is,

f(x) =

∞∑

n=0

Xnxn.

Then, (6) is telling us that the coefficient of xn in the power series expansionof

(c0 + c1x + · · · + ckxk)f(x)

isc0Xn + · · · + ckXn−k = Yn.

However, this only holds for n ≥ k, because Xm is only defined for whenm ≥ 0.

So, in general, what we get is that

(c0 + c1x + · · · + ckxk)f(x) = r(x) +

∞∑

n=0

Ynxn,

where r(x) is some polynomial of degree at most k − 1.Now, the power series with terms Ynxn is just the generating function for

Yn, which we are assuming is a homogeneous linear recurrence sequuence;

10

and so, the generating function for Yn is a rational function A(x)/B(x)(where A and B are polynomials). It follows that

f(x) =r(x)

c0 + · · · + ckxk+

A(x)

(c0 + · · · + ckxk)B(x),

which is a rational function. It follows that Xn is a homogeneous linearrecurrence sequence.

2.9 An Application

Here we give a rather simpleminded application to illustrate the principlesin the previous section. This application amounts to deriving a formula for

Sn = 1 + 2 + · · · + n.

This sequence satisfies the non-homogeneous recurrence

Sn − Sn−1 = n (7)

for n ≥ 1 (we define S0 = 0).Now, if we let

f(x) =∞∑

n=0

Snxn

be the generating function for Sn, then we observe that the relation (7)implies that

(1 − x)f(x) = r(x) +

∞∑

n=0

nxn,

where r(x) is some polynomial of degree at most 0, and so is a constant.Clearly, r(x) = 0. So, we have

f(x) =1

1 − x

∞∑

n=0

nxn.

Now, we know that

1

1 − x=

∞∑

n=0

xn;

and so, differentiating term-by-term we have that

1

(1 − x)2=

∞∑

n=1

nxn−1.

11

So,

x

(1 − x)2=

∞∑

n=0

nxn

It follows thatf(x) =

x

(1 − x)3.

Now, to get a formula for the coefficient of xn in this sequence, we observethat by differentiating 1/(1 − x)2 term-by-term, we get

2

(1 − x)3=

∞∑

n=0

n(n − 1)xn−2.

So,

x

(1 − x)3=

∞∑

n=0

n(n + 1)

2xn,

and it follows that

Sn =n(n + 1)

2,

as is well known.

3 Linear Recurrence Sequences and Finite State

Machines

It turns out that these recurrence relations are intimately related to regulargrammars and finite state machines; however, there is a fair amount ofbackground which is necessary in order to say much about this. Here, wewill be content just to describe this connection in a very special case, namelywhere Xn is the Fibonacci sequence.

We begin by reminding ourselves of the following basic fact about Fi-bonacci numbers:

Theorem 1 The number of length n strings of 0’s and 1’s containing no

consecutive 1’s is Fn+2. For example, in the case n = 3, the strings are

000,100,010,001, and 101, of which there are 5; and, indeed, Fn+2 = F5 = 5

For completeness, we give here the induction proof of this result:

Proof. The claim is clearly true for n = 0 and n = 1. For n = 0, there isonly one string, namely the empty string, and, indeed, F0+2 = F2 = 1. Forn = 1 there are two strings, namely 0 and 1, and F1+2 = F3 = 2.

12

Suppose, for proof by mathematical induction that the claim holds forall 0 ≤ n ≤ k, where k ≥ 1. Now consider the collection of all strings oflength k + 1 of 0’s and 1’s with no consecutive 1’s. We can divide this setof strings into two groups, according to whether the (k + 1)st character is a0 or a 1: If the (k + 1)st character is a 0, then the first k characters can beany string with no consecutive 1’s, and there are Fk+2 such strings. If the(k + 1)st character is a 1, then the kth character must be a 0 in order toavoid consecutive 1’s, and then the first k− 1 characters can be anything solong as there are no consecutive 1’s; so, there are Fk−1+2 = Fk+1 possibilitiesfor these first k − 1 characters. In total, the number of length k + 1 stringsis Fk+1 + Fk+2 = Fk+3 = F(k+1)+2; and so the induction step is proved, andthe claim holds by mathematical induction. �

Now we give a proof based on finite state machines: The set of strings of0’s and 1’s without consecutive 1’s is an example of what is called a language

in theoretical computer science, where Σ = {0, 1} is the alphabet. Moreover,this language is special in that it can be recognized by a finite state machine.Such languages are said to be regular.

Basically, a finite state machine is a graph, together with a pointer point-ing to a certain location in the string, and a state variable which indicateswhich state the machine is in. The vertices in the graph represent states andat a given instant in time the machine is said to have state variable equalto one of these vertices. The edges in the graph are directed, and each edgecorresponds to a character in the alphabet; thus, leading out of each vertexin the graph there can be at most |Σ| edges (assuming that the machineis what is called deterministic, which we will assume). The states of thegraph are designated one of three types: a start state, some halt states, andnormal states. The machine’s state variable will change as the charactersin the string are read, and each time that a character is “read”, the pointeradvances one position to the right in the string. The pointer never goes tothe left. By the time the pointer reaches the last character in the string, ifthe state variable equals one of these halt vertices, then the machine haltsand says “This string is in the language”; and if, by the time the pointerreaches the end of the string the machine’s state variable is not equal to oneof these halt vertices, then the machine reports “This string is not in thelanguage”.

Technically, each vertex should have exactly |Σ| edges leading out ofit, one for each possible character in the alphabet. For our definition ofa finite state machine, we allow vertices that do not have a full set of |Σ|edges leading away from them. If the machine happens to be in one of these

13

underfull states, and if the next character that the pointer points to does notcorrespond to one of these < |Σ| edges, then our machine halts, and reportsthat the string is not in the language, no matter if the state the machine isin a halt state.

Now, a machine (in our sense) for generating strings of 0’s and 1’s withoutconsecutive 1’s can be described by two states, both of which are halt states,and one is (of course) a start state. The start vertex we label A, and theother vertex we label B. The edges for this graph are as follows: There isan edge leading from vertex A to itself, which corresponds to the character0; there is an edge leading from vertex A to vertex B, which corresponds tothe character 1; and, there is an edge leading from vertex B back to vertexA, which corresponds to the character 0.

Let us now see what happens if the machine is fed a string with consec-utive 1’s: Say the string is 11. The machine starts in state A, and when itreads that first one, it transitions to state B, and the pointer advances sothat it is over that second 1. Then, when the machine reads that second1, there is nowhere that it can go, because there is only one edge leadingaway from vertex B, and that edge corresponds to the character 0. So, themachine halts, and reports that the string is not in the language.

Consider now what happens if the string is 011. In this case, when themachine reads that first 0, it transitions from state A back to state A; andthen, when it reads that 11, it will end up halting in state B and reportingthat the string is not in the language.

We now count the number of strings of length n that are in the language:This is the number of paths of length n from vertex A to itself plus thenumber of paths of length n from vertex A to vertex B. Here a path meansa sequence like ABAAB, which means that we transition from vertex Ato vertex B, and then from B to A, and from vertex A to vertex A, andfinally from vertex A to vertex B. The length of this path is 4, because wetransition along 4 edges.

Now, as we know, the number of paths of length n from one vertex toanother is some entry in the power of an adjacency matrix. In our case, theadjacency matrix is

A =

[

1 11 0

]

,

where the entry in the ith row and jth column is 1 if and only if there is anedge leading from vertex i to vertex j. Then, the Ak

i,j equals the number ofpaths of length k from vertex i to vertex j. We are interested in

Ak1,1 + Ak

1,2,

14

which is the sum of the entries in the first row of Ak.As we know,

[

1 11 0

]k [10

]

=

[

Fk+1

Fk

]

,

which is telling us that the first column of Ak has entries Fk+1 and Fk. Sincethe matrix A is symmetric (that is, A equals its transpose), we know thenthat the first row equals [Fk+1 Fk]. So, the number of strings of length n inour language, which is the sum of the entries in the first row of An, is

Fn+1 + Fn = Fn+2.

We state here a general result without proof:

Theorem 2 Suppose that L is a regular language with some finite alphabet

Σ. Then, there exist numbers λ1, ..., λk and polynomials p1(x), ..., pk(x) such

that the number of strings of length n lying in L is given by

p1(n)λn1 + p2(n)λn

2 + · · · + pk(n)λnk .

This puts heavy restrictions on the structure of regular languages, andin the next section we will use it to give an alternative (sketch of a) proofof a classic result on balanced parentheses.

3.1 An Application to Automata Theory

One of the classic results from theoretical computer science concerning reg-ular grammars (rules which generate regular languages) is that the languageof balanced parentheses is not regular; that is, there is no finite state ma-chine which can decide whether or not a string of (’s and )’s is balanced. By“balanced” here we mean, for example ((()())()). An example of a stringwhich is not balanced is (()(.

Now, as we know, the number of balanced parentheses of length 2k isthe Catalan number

Ck =1

k + 1

(

2k

k

)

.

Using Stirling’s formula (which gives an asymptotic formula for n!), one canprove that

(

2k

k

)

∼ 22k

√πk

.

15

This means that

limk→∞

(2kk

)

22k/√

πk= 1.

So,

Ck ∼ 22k

k√

πk.

It turns out that this cannot have the form given by Theorem 2, because1/k

√kπ does not grow like a polynomial. 1

Thus, the language of balanced parentheses is not regular.

4 An Algebraic Combinatorial Interpretation of

Second Order Linear Recurrence Sequences

We give here a way to interpret general second order linear recurrence se-quences in terms of strings. Basically, we generalize the result connectingFn+2 to n length strings. But how?

The idea is to look at formal sums of strings of α’s and β’s, containingno consecutive α’s, where we do not apply commutativity of multiplication.For example, consider the formal “sum” of strings of length 3 of such strings:We get

βββ + αββ + βαβ + ββα + αβα.

Note that there are five terms in this formal sum. Now suppose that Xn+2

is the formal sum of all such strings of length n, where we define X0 = 0,X1 = 1, and X2 = β. Then, we see that

X3 = α + β, X4 = ββ + αβ + βα,

and so on.

Now we address the following question: If we are given the formal sumsX0, ..., Xn, how can be construct the formal sum Xn+1?

The idea is as follows: A string of α’s and β’s of length n has nthcharacter either equal to β or equal to α. If the nth character is β, thenthe first n − 1 characters can be any combination of α’s and β’s withoutconsecutive α’s; so, the formal sum of all these strings ending in β is Xn−3β.

1Actually, things are a little more complicated, because even though 22k/k√

πk cannotbe a single term p(2k)λ2k, where p(x) is a polynomial, it still could be the sum of severalterms of this form; however, with more work one can show that Ck cannot be a sum ofsuch terms.

16

If the nth character is α, then the (n−1)st character must be β, and the firstn − 2 characters then can be anything, so long as there are no consecutiveα’s; so, the formal sum of strings ending in α is Xn−4βα. So, the formalsum of all strings of length n of α’s and β’s, no consecutive α’s, is

Xn−3β + Xn−4βα.

However, from the way we defined the sequence Xn, we must have that thisalso equals Xn−2. So, we have that

Xn−2 = Xn−3β + Xn−4βα;

orXn+1 = Xnβ + Xn−1βα. (8)

Now the idea is to think of β and βα as numbers, rather than justcharacters. So, if we have a sequence

Xn+1 = c0Xn + c1Xn−1,

we can solve for α and β so as to put this into the form (8); in fact,

β = c0, and α =c1

c0.

(Here we assume c0 6= 0.) So, this general recurrence can be interpreted asa formal sum of strings of α’s and β’s much the same way that Fibonaccinumbers can be interpreted as counting certain strings of length n composedof 0’s and 1’s.

There is a problem, though, and that is that we have the initial conditionsX0 = 0 and X1 = 1, and it would be good to have an interpretation forarbitrary initial conditions. Well, there is a way to do this, but we will notbother with it here, and will be content with what we already have.

There is also a way to interpret (8) in terms of finite state machines;basically, Xn corresponds to certain weighted paths through some graph.

5 Exponential Generating Functions

It turns out that not only is there a nice form for the generating function ofa sequence Xn, but there is also a nice form for the exponential generatingfunction. Recall that the exponential generating function for a sequence Xn

is defined to be

E(x) =

∞∑

n=0

Xn

n!xn.

17

Let us start with the Fibonacci numbers. From the equation

Fn =1√5

((

1 +√

5

2

)n

−(

1 −√

5

2

)n)

,

one can easily deduce that its corresponding exponential generating functionis

1√5

(

eϕx − eϕ′x)

,

where

ϕ =1 +

√5

2, and ϕ′ =

1 −√

5

2.

It turns out that all linear recurrence sequence have exponential gener-ating functions which have a similar form to this; however, there is a muchnicer way of expressing it, than just as a sum of exponentials. In fact, onecan express it as a single exponential! To do this, we need to define theexponential of a matrix: Given an n × n matrix A, we define eAx to be acertain n × n matrix (here, x is a variable [scalar], not a column vector):

eAx = I + Ax +A2

2!x2 +

A3

3!x3 + · · · ,

where I denotes the n×n identity matrix. This matrix exponential satisfiesa number of properties, and here are a few of them

(i) If we let O denote a zero matrix, then e0x = I, the n × n identitymatrix.

(ii) If A and B are n×n matrices, then eAx and eBx are n×n matrices,and their product is eAxeBx = e(A+B)x.

(iii) For any matrix A, eAx is an invertible matrix, as its inverse is e−Ax.This is an easy consequence of (i) and (ii).

(iv) ddx

eAx = AeAx.

Now, in the case of Fibonacci numbers, we have that Fn is the the 2, 1entry of the matrix

[

1 11 0

]n

.

One can show then that the exponential generating function for Fn is the2, 1 entry of the matrix eAx.

More generally, suppose that

Xn = c0Xn−1 + · · · + ck−1Xn−k.

18

Then, the exponential generating function for Xn turns out to be

[0 0 · · · 0 1]eAx

Xk−1

Xk−2...

X0

,

where

A =

c0 c1 c2 · · · ck−2 ck−1

1 0 0 · · · 0 00 1 0 · · · 0 0...

......

. . ....

...0 0 0 · · · 1 0

.

6 The Irrationality of e

√2

The exponential generating function for the Fibonacci numbers

eϕx

√5− eϕ′x

√5

(9)

has the property that the coefficients of powers of x in are rational numbers.Here we will use a similar fact about e

√2 to prove that it is irrational!

Let us begin by reviewing the proof that e is irrational: If e were rational,say e = p/q, then q!e = (q − 1)!p is an integer. But, since

e = 2 +1

2+

1

3!+

1

4!+ · · · ,

we have that

q!e = q!

(

2 +1

2+

1

3!+ · · · + 1

q!

)

+ q!

(

1

(q + 1)!+

1

(q + 2)!+ · · ·

)

.

Now,

q!

(

2 +1

2+

1

6+ · · · + 1

q!

)

is an integer,

and

q!

(

1

(q + 1)!+

1

(q + 2)!+ · · ·

)

=1

q + 1+

1

(q + 1)(q + 2)+

1

(q + 1)(q + 2)(q + 3)+ · · ·

<1

q + 1+

1

(q + 1)2+

1

(q + 1)3+ · · ·

19

≤ 1

2+

1

4+

1

8+ · · ·

= 1.

So,q!e = I + δ,

where I is an integer, and 0 < δ < 1. So, q!e cannot be an integer, and weconclude that e is irrational.

Now we repeat the argument for e√

2. We begin by observing that ife√

2 is rational, then so is e−√

2 (being the reciprocal of e√

2). So, if e√

2 isrational, so is

e√

2 + e−√

2 =∞∑

n=0

(√

2)n + (−√

2)n

n!

= 2

∞∑

m=0

2m

(2m)!.

Next, we want to see how many power of 2 divide (2m)!. We begin byletting w(n) denote the number of times that 2 divides n; so, 2w(n) divides2, but 2w(n)+1 does not divide 2. Then, there is a simple, but useful formulafor w(n): We have that

w(n) =∑

j≥1

2j |n

1.

The power to which 2 divides n!, then, is

n∑

h=1

w(h) =

n∑

h=1

j≥1

2j |h

1 =∑

j≥1

h≤n

2j |h

1 =∑

j≥1

⌊ n

2j

.

We note that this last sum over j is actually a finite sum, because for jsufficiently large n/2j will be less than 1, and therefore bn/2jc = 0.

So, 2 divides (2m)! about

j≥1

2m

2j= 2m

times. With a little bit of work, one can show that, more precisely, if t(n)is the number of times that 2 divides n!, then

n − ` − 1 ≤ t(n) ≤ n − 1,

20

where ` is the unique integer satisfying

2`−1 < n ≤ 2`.

When n = 2` the upper bound t(n) = n − 1 attained, and when n = 2` − 1the lower bound t(n) = n − ` − 1 is attained.

It turns out that this implies (with some work) that for n = 2` − 1,

(2m)!

2mdivides

(2n)!

2n,

for all integers m ≤ n. Thus, if we let

I =(2n)!

2n

m≤n

1

(2m)!/2m, (10)

then I is an integer.To show that

e√

2 + e−√

2

2=

∞∑

m=0

1

(2m)!/2m,

cannot be rational, it suffices to show that for infinitely many j ≥ 1, if welet n = 2j − 1, then

(2n)!

2n

(

e√

2 + e−√

2

2

)

= I + δ

is not an integer, where I is as in (10), and where

δ =(2n)!

2n

∞∑

m=n+1

2m

(2m)!.

Thus, we just need to show that δ is not an integer: We have that

δ =2

n + 1+

4

(n + 1)(n + 2)+

8

(n + 1)(n + 2)(n + 3)+· · · <

1

2+

1

4+

1

8+· · · = 1

for n ≥ 3. So, δ is not an integer for δ ≥ 3, and we conclude that e√

2 isirrational.

21