normal form - unimi.it · The lemma is a standard argument in linear algebra, hence I omit the...


5

NONLINEAR OSCILLATIONS

The aim of this chapter is to investigate the dynamics in the neighbourhood of equilibria of Hamiltonian systems, both in the linear and in the nonlinear case. The argument has strong connections with the general problem of equilibria for systems of differential equations, but here emphasis will be given to the aspects that make Hamiltonian systems quite peculiar.

Consider for a moment a generic system of differential equations

$\dot x_j = X_j(x_1,\ldots,x_n)$ , $x \in D \subset \mathbb{R}^n$ ,

where $D$ is open. Assume that this system has an equilibrium point at $(\overline{x}_1,\ldots,\overline{x}_n)$. It is well known that in a neighbourhood of the equilibrium the system may often be approximated by a linear one

(5.1) $\dot x = Ax$

where $A$ is an $n\times n$ real matrix with entries $A_{j,k} = \frac{\partial X_j}{\partial x_k}(\overline{x}_1,\ldots,\overline{x}_n)$. The general method for solving such a system of differential equations was discovered by Lagrange [60]. The complete classification of the equilibria as, e.g., nodes, saddles, foci and centers is due to Poincaré [85], who used the concept of normal form for the system. The construction of the normal form is based on finding a suitable linear transformation of coordinates which changes the system into a particularly simple one.

The case of Hamiltonian systems may be treated with the same methods, of course. However, it is interesting to investigate to which extent the concept of normal form may be introduced without losing the canonical character of the equations. The question is given a more precise formulation as follows. Let the Hamiltonian $H(q,p)$ have an equilibrium at $(\overline{q},\overline{p})$. In a neighbourhood of the equilibrium let us introduce local coordinates $x = q - \overline{q}$ , $y = p - \overline{p}$ . The transformation is canonical, and we can approximate the Hamiltonian with its quadratic part

(5.2) $H(x,y) = \sum_{j,k} \bigl( A_{j,k}\,x_j x_k + B_{j,k}\,x_j y_k + C_{j,k}\,y_j y_k \bigr)$ .


For, the term $H(\overline{q},\overline{p})$ can be ignored, being a constant, the linear part vanishes at equilibrium, and we neglect contributions of order higher than two. The resulting canonical equations are linear, which takes us back to the general problem above. However, we may ask for a normal form of the quadratic Hamiltonian (5.2), i.e., for a linear transformation to normal form that is canonical. This is the subject of the first part of the chapter.

Having settled the linear problem, we may investigate to which extent the linear approximation applies to the dynamics of the complete system. In particular the question of stability of the equilibrium is raised. As a general fact the theory of Lyapounov states that in most cases the answer can be formulated by analyzing the linear system. This is indeed the case when the equilibria are nodes, saddle points or foci in the terminology of Poincaré. These points are also collectively called hyperbolic equilibrium points. However, a nontrivial exception to Lyapounov theory is represented by center points, which are the most interesting ones in the Hamiltonian case. These points are also called elliptic equilibrium points, and this is the name that I will adopt from now on. The second part of the chapter is devoted to a discussion of this problem based on the search for first integrals.

5.1 Normal form for linear systems

Let us briefly recall how a general system (5.1) may be given a normal form. Applying the linear transformation

(5.3) $x = M\xi$ , $\det M \neq 0$

the system is changed to

$\dot\xi = \Lambda\xi$ , $\Lambda = M^{-1}AM$ .

Lemma 5.1: Consider the linear system $\dot x = Ax$ with a real matrix $A$ and assume that the eigenvalues $\lambda_1,\ldots,\lambda_n$ of the matrix $A$ are distinct. Then there exists a complex matrix $M$ such that the linear transformation $x = M\xi$ with $\xi \in \mathbb{C}^n$ gives the system the normal form

(5.4) $\dot\xi = \Lambda\xi$ , $\Lambda = \mathrm{diag}(\lambda_1,\ldots,\lambda_n)$

with $\Lambda = M^{-1}AM$ . The matrix $M$ is written as

$M = (w_1,\ldots,w_n)$

where the column vectors $w_1,\ldots,w_n$ are the complex eigenvectors of $A$ . If the eigenvalues are real, then the matrix $M$ is real, and equation (5.4) is restricted to $\mathbb{R}^n$ .

The lemma is a standard argument in linear algebra, hence I omit the proof.
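As a minimal numerical sketch of lemma 5.1, restricted for simplicity to a 2×2 matrix with distinct eigenvalues, the following Python code builds M from the eigenvectors and checks that M⁻¹AM is diagonal. The sample matrix `A` and the helper names `eigen_2x2`, `matmul` and `inverse_2x2` are illustrative assumptions, not part of the text.

```python
import cmath

def eigen_2x2(a):
    """Eigenvalues and eigenvectors of a 2x2 matrix with a[0][1] != 0."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    lams = [(tr + root) / 2, (tr - root) / 2]
    # (A - lam I) w = 0 is solved by w = (a01, lam - a00) when a01 != 0
    vecs = [(a[0][1], lam - a[0][0]) for lam in lams]
    return lams, vecs

def matmul(p, q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse_2x2(m):
    """Inverse of a non-singular 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

A = [[2.0, 1.0], [1.0, 2.0]]          # sample matrix (assumption)
lams, (w1, w2) = eigen_2x2(A)
M = [[w1[0], w2[0]], [w1[1], w2[1]]]  # eigenvectors arranged as columns
Lam = matmul(inverse_2x2(M), matmul(A, M))  # should be diag(lams)
```

Rescaling either column of `M` by a nonzero factor leaves `Lam` unchanged, which is the non-uniqueness discussed in the remark that follows.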

Remark. The matrix $M$ is not uniquely determined, due to the fact that the eigenvectors are determined up to a multiplicative factor. This implies that we can multiply $M$ by a diagonal matrix with non zero diagonal elements. A general statement is: if two different matrices $M$ and $\tilde M$ both satisfy the relation $M^{-1}AM = \Lambda$ and $\tilde M^{-1}A\tilde M = \Lambda$ then

(5.5) $\tilde M = MD$ , $D = \mathrm{diag}(d_1,\ldots,d_n)$

with non vanishing $d_1,\ldots,d_n$ .

If the matrix $A$ has a pair of complex eigenvalues then the following lemma will be useful.

Lemma 5.2: Let $\lambda = \mu + i\omega$ with $\omega \neq 0$ be a complex eigenvalue of the real matrix $A$, and let $w = u + iv$ , with $u$, $v$ real vectors, be the corresponding eigenvector, so that $Aw = \lambda w$. Then the following statements hold true.

(i) $w^* = u - iv$ is the eigenvector corresponding to the eigenvalue $\lambda^*$, that is $Aw^* = \lambda^* w^*$.

(ii) The real vectors $u$ and $v$ are linearly independent.

(iii) We have

$Au = \mu u - \omega v$ , $Av = \omega u + \mu v$ .

Proof. (i) If $Aw = \lambda w$ then by calculating the complex conjugate of both members we have $(Aw)^* = (\lambda w)^*$ , which in view of $A^* = A$ coincides with $Aw^* = \lambda^* w^*$. This proves that $w^*$ is the eigenvector corresponding to $\lambda^*$.

(ii) We know that $w$ and $w^*$ are linearly independent over the complex numbers, being eigenvectors corresponding to different eigenvalues $\lambda$, $\lambda^*$ (this is a known general property). By contradiction, let us assume that $u = \alpha v$ for some real $\alpha \neq 0$ . Then we have $w = u + iv = (\alpha + i)v$ and $w^* = (\alpha - i)v$, so that $w = \frac{\alpha+i}{\alpha-i}\,w^*$, contradicting the linear independence of the eigenvectors over the complex numbers. We conclude that $u$ and $v$ must be linearly independent, as claimed.

(iii) Just calculate

$$Au = \frac{\lambda w + \lambda^* w^*}{2} = \mu\,\frac{w + w^*}{2} + i\omega\,\frac{w - w^*}{2} = \mu u - \omega v ,$$

$$Av = \frac{\lambda w - \lambda^* w^*}{2i} = \mu\,\frac{w - w^*}{2i} + i\omega\,\frac{w + w^*}{2i} = \omega u + \mu v ,$$

and the proof is complete. Q.E.D.
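A quick worked check of statement (iii) of lemma 5.2 may help. The matrix $A$ below is an arbitrary choice made for illustration; it does not come from the text.

```latex
\[
A = \begin{pmatrix} 1 & 2 \\ -2 & 1 \end{pmatrix}, \qquad
\lambda = 1 + 2i \quad (\mu = 1,\ \omega = 2), \qquad
w = \begin{pmatrix} 1 \\ i \end{pmatrix} = u + iv, \quad
u = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad
v = \begin{pmatrix} 0 \\ 1 \end{pmatrix},
\]
\[
Aw = \begin{pmatrix} 1 + 2i \\ -2 + i \end{pmatrix} = (1 + 2i)\,w, \qquad
Au = \begin{pmatrix} 1 \\ -2 \end{pmatrix} = \mu u - \omega v, \qquad
Av = \begin{pmatrix} 2 \\ 1 \end{pmatrix} = \omega u + \mu v .
\]
```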

A compact and elegant form of the flow is obtained by introducing a linear evolution operator $U(t)$ as follows. First note that the solution may be written in exponential form as¹ $x(t) = e^{tA}x_0$ , where $e^{tA} = \sum_{k\ge 0} t^k A^k/k!$ . The problem is how to explicitly calculate the exponential of the matrix $tA$ . The problem is easily solved in complex coordinates $\xi$ because we have

$$e^{t\Lambda} = I + t\Lambda + \frac{t^2\Lambda^2}{2!} + \frac{t^3\Lambda^3}{3!} + \ldots = \mathrm{diag}\bigl(e^{\lambda_1 t},\ldots,e^{\lambda_n t}\bigr) ,$$

where $I$ is the identity matrix. This is easily checked in view of $\Lambda^k = \mathrm{diag}(\lambda_1^k,\ldots,\lambda_n^k)$ (prove it by induction). For the real matrix $A$ , using $A = M\Lambda M^{-1}$ we have $A^0 = I$ and, by induction,

$$A^k = A\,A^{k-1} = \bigl(M\Lambda M^{-1}\bigr)\bigl(M\Lambda^{k-1}M^{-1}\bigr) = M\Lambda^k M^{-1} .$$

¹ It is assumed that the reader knows how to assign a meaning to the exponential of a matrix. If this is not the case, he or she may just check that the expression given here is actually the solution of the equation by formally calculating the time derivative. The question of convergence will be discussed later, in chapter 6, in the more general context of Lie series. For a reference see, e.g., Arnold’s book [6].

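Remark that $A^k$ is a power of a real matrix, hence it is real even if the eigenvalues are complex numbers. Thus we conclude that

$$e^{tA} = \sum_{k\ge 0}\frac{t^k A^k}{k!} = \sum_{k\ge 0}\frac{t^k M\Lambda^k M^{-1}}{k!} = M\Bigl(\sum_{k\ge 0}\frac{t^k\Lambda^k}{k!}\Bigr)M^{-1} = M e^{t\Lambda} M^{-1} .$$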

The resulting matrix is real. This leads to the following

Proposition 5.3: Let the eigenvalues of the real matrix $A$ be distinct. Then the flow of the linear system of differential equations $\dot x = Ax$ satisfying the initial condition $x(0) = x_0$ is

(5.6) $\phi^t x_0 = U(t)x_0$ , $U(t) = M e^{t\Lambda} M^{-1}$

where $e^{t\Lambda} = \mathrm{diag}\bigl(e^{\lambda_1 t},\ldots,e^{\lambda_n t}\bigr)$ is constructed using the eigenvalues $\lambda_1,\ldots,\lambda_n$ of $A$ and $M$ is as in lemma 5.1. The evolution operator $U(t)$ is a real matrix.
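To make formula (5.6) concrete, here is a small Python sketch for the sample matrix A = [[0, 1], [-1, 0]], whose eigenvalues are ±i. The matrix, the eigenvector normalization and the function name `evolution_operator` are assumptions chosen for illustration. The code forms M exp(tΛ) M⁻¹ with complex arithmetic and compares it with the real rotation matrix that solves the system.

```python
import cmath
import math

def matmul(p, q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def evolution_operator(t):
    """U(t) = M exp(t Lambda) M^{-1} for A = [[0, 1], [-1, 0]]."""
    lam1, lam2 = 1j, -1j                    # eigenvalues of A
    M = [[1, 1], [1j, -1j]]                 # columns: eigenvectors (1, i) and (1, -i)
    M_inv = [[0.5, -0.5j], [0.5, 0.5j]]     # inverse of M, computed by hand
    exp_t_lam = [[cmath.exp(lam1 * t), 0], [0, cmath.exp(lam2 * t)]]
    return matmul(M, matmul(exp_t_lam, M_inv))

t = 0.7
U = evolution_operator(t)
# Known real solution operator of x' = y, y' = -x
expected = [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]
```

Although M and Λ are complex, the imaginary parts of the entries of `U` cancel up to rounding, in agreement with the last statement of proposition 5.3.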

A few more considerations can be added in case the real matrix $A$ is symmetric.

Corollary 5.4: If the matrix $A$ is symmetric, then the eigenvalues are real and the matrix $M$ can be chosen to be orthogonal, i.e., $M^\top M = I$ .

This is a straightforward consequence of known properties of symmetric matrices. The purpose here is to stress the analogy with the Hamiltonian case, where the transformation matrix $M$ should be symplectic.

5.2 Normal form of a quadratic Hamiltonian

Let us now enter the discussion concerning a canonical system with a Hamiltonian of the form (5.2). As already pointed out, the problem is to look for a normal form of the linear system of Hamilton’s equations keeping the canonical form. This is tantamount to saying that we look for a normal form of the Hamiltonian.²

² This problem is a common topic in textbooks when the Lagrangian formalism is used. One is usually led to study a Lagrangian $L(q,\dot q) = T(\dot q) - V(q)$ where both the kinetic energy $T$ and the potential energy $V$ are quadratic forms in the generalized velocities $\dot q$ and in the generalized coordinates $q$, respectively. In this case the problem reduces to the known algebraic one of simultaneous diagonalization of two quadratic forms. The Hamiltonian case considered here is more general because the quadratic Hamiltonian is allowed to include mixed terms in coordinates and momenta. Moreover, we want all transformations to be canonical, which is a further constraint. In the present notes I limit the discussion to the most common case of distinct eigenvalues, following [94], § 15. A short exposition of the results for the general case of eigenvalues with multiplicity greater than one is found in [7]. A complete discussion is in [98], where results from [99] are used.

It is convenient to simplify the notation by using the compact formalism. Thus, I shall denote $z = (x_1,\ldots,x_n,y_1,\ldots,y_n)^\top$, a column vector, and the Hamiltonian is the quadratic form

(5.7) $H(z) = \frac{1}{2}\,z^\top C z$ , $C^\top = C$

where $C$ is a $2n\times 2n$ real, symmetric and non degenerate matrix. The corresponding Hamilton’s equations are

(5.8) $\dot z = JCz$ ,

where $J$ is the $2n\times 2n$ antisymmetric matrix

$$J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}$$

defining the symplectic form. Recall that we have $J^\top = J^{-1} = -J$ .

As in sect. 5.1, let us look for a linear transformation $z = M\zeta$ which gives the linear system a diagonal form. We know that if the eigenvalues of the matrix $JC$ are distinct then the matrix $M$ can be determined, and so we have

(5.9) $M^{-1}JCM = \Lambda$ .

The problem is whether the transformation matrix $M$ can be restricted to be symplectic, i.e., to satisfy the condition

(5.10) $M^\top JM = J$ .

5.2.1 The linear canonical transformation

Let us investigate the properties of the eigenvalues of the matrix $JC$, recalling that $C$ is assumed to be symmetric.

Lemma 5.5: The eigenvalues of the matrix $JC$ satisfy the following properties:

(i) if $\lambda$ is an eigenvalue, so is its complex conjugate $\lambda^*$ ;
(ii) if $\lambda$ is an eigenvalue, so is $-\lambda$ .

That is: the eigenvalues of $JC$ can be organized either in pairs, when they are real or pure imaginary, or in groups of four, as illustrated in fig. 5.1.

Proof. (i) The characteristic polynomial of a real matrix has real coefficients. Hence its roots are either real or pairs of complex conjugate numbers.

Figure 5.1. The eigenvalues of the matrix $JC$ with $C$ symmetric, shown in the complex plane (axes Re λ, Im λ; panels a, b, c). a. A pair of real eigenvalues λ, −λ, for which the complex conjugates are λ∗ = λ and −λ∗ = −λ. b. A pair of pure imaginary eigenvalues, namely λ , −λ ; we have λ∗ = −λ and −λ∗ = λ . c. Four distinct complex eigenvalues λ , −λ , λ∗ , −λ∗ with real and imaginary part both non zero.

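(ii) Consider the characteristic equation $\det(JC - \lambda I) = 0$ and remark that

$$\begin{aligned}
\bigl(JC - \lambda I\bigr)^\top &= \bigl(CJ^\top - \lambda I\bigr) = -\bigl(CJ + \lambda I\bigr) = -JJ^{-1}\bigl(CJ + \lambda I\bigr) \\
&= -J\bigl(J^{-1}CJ + J^{-1}\lambda I\bigr) = J\bigl(JCJ + \lambda IJ\bigr) = J\bigl(JC + \lambda I\bigr)J .
\end{aligned}$$

Since a matrix and its transpose have the same determinant and $(\det J)^2 = 1$ , this shows that

$\det\bigl(JC - \lambda I\bigr) = \det\bigl(JC + \lambda I\bigr)$ ,

i.e., the characteristic polynomial is symmetric in $\lambda$. The claim follows. Q.E.D.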
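For one degree of freedom the statement of lemma 5.5 can be verified by hand. The following worked example assumes a generic symmetric matrix $C$ with entries $a$, $b$, $c$ (symbols introduced here only for illustration) and the $2\times 2$ matrix $J$ of the symplectic form.

```latex
\[
C = \begin{pmatrix} a & b \\ b & c \end{pmatrix}, \qquad
J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad
JC = \begin{pmatrix} b & c \\ -a & -b \end{pmatrix},
\]
\[
\det(JC - \lambda I) = (b - \lambda)(-b - \lambda) + ac = \lambda^2 - (b^2 - ac),
\qquad \lambda = \pm\sqrt{b^2 - ac} .
\]
```

The polynomial is even in $\lambda$, as the lemma requires. A real pair (case a of fig. 5.1) occurs for $b^2 - ac > 0$ and a pure imaginary pair (case b) for $b^2 - ac < 0$; the quadruple of case c needs at least two degrees of freedom.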

The next lemma may be seen as stating the analogy between symmetric matrices, which have orthogonal eigenvectors, and matrices of the form $JC$ , which have eigenvectors possessing a natural symplectic structure.

Lemma 5.6: Let the eigenvalues of the matrix $JC$ be distinct, and let $\lambda$, $\lambda'$ be two eigenvalues corresponding to the eigenvectors $w$, $w'$ , respectively. Then we have

$w^\top J w' \neq 0$ if and only if $\lambda' = -\lambda$ .

Proof. Let us show that we have

$(\lambda + \lambda')\,w^\top J w' = 0$ .

For, in view of $J^2 = -I$ and $J^\top J = I$ , with a short calculation we get

$$(\lambda + \lambda')\,w^\top J w' = (JCw)^\top J w' + w^\top J (JCw') = w^\top C J^\top J w' + w^\top J^2 C w' = w^\top C w' - w^\top C w' = 0 .$$

If $\lambda + \lambda' \neq 0$ then we have $w^\top J w' = 0$ . If $\lambda + \lambda' = 0$ we must prove that $w^\top J w' \neq 0$ , so that $w$ is symplectic orthogonal to all eigenvectors but $w'$. By contradiction, assume that also $w^\top J w' = 0$ . This would imply that $w^\top J w'' = 0$ for all eigenvectors $w''$ of the matrix $JC$. Since the eigenvectors are a basis, in view of the non degeneration of the symplectic form this would imply that $w = 0$ , contradicting the fact that $w$ is itself an element of the basis. We conclude that $w^\top J w' \neq 0$ . Q.E.D.

We now should normalize the eigenvectors so as to be able to construct a symplectic matrix. In view of lemma 5.5 we can arrange the eigenvalues and the eigenvectors of the matrix $JC$ in the order

(5.11)
$$\begin{matrix}
\lambda_1 , & \ldots , & \lambda_n , & \lambda_{n+1} = -\lambda_1 , & \ldots , & \lambda_{2n} = -\lambda_n , \\
w_1 , & \ldots , & w_n , & w_{n+1} , & \ldots , & w_{2n} .
\end{matrix}$$

The $n$ quantities

$d_j = w_j^\top J w_{j+n}$ , $j = 1,\ldots,n$

are non zero in view of lemma 5.6. Let us construct the matrix

(5.12) $M = (w_1/d_1,\ldots,w_n/d_n,\ w_{n+1},\ldots,w_{2n})$

by arranging the eigenvectors in columns in the chosen order.

Lemma 5.7: The matrix $M$ constructed as in (5.12) is symplectic.

Proof. We must prove that $M^\top JM = J$ . To this end let us denote by $a_{j,k}$ the elements of $M^\top JM$ . Then in view of lemma 5.6 and of the definition of $d_j$ we have

$a_{j,j+n} = \frac{1}{d_j}\,w_j^\top J w_{j+n} = 1$ , $a_{j+n,j} = \frac{1}{d_j}\,w_{j+n}^\top J w_j = -1$ ,

and $a_{j,k} = 0$ in all other cases. Q.E.D.

Since the columns of M are eigenvectors of JC we conclude with the

Proposition 5.8: Let the eigenvalues of the matrix $JC$ be distinct. Then the linear transformation $z = M\zeta$ with the matrix $M$ constructed as in (5.12) is canonical, and changes the system of linear differential equations (5.8) into its diagonal form

$\dot\zeta = \Lambda\zeta$ , $\Lambda = \mathrm{diag}(\lambda_1,\ldots,\lambda_n,-\lambda_1,\ldots,-\lambda_n)$ .

5.2.2 Non uniqueness of the diagonalizing transformation

The diagonalizing transformation is not unique, for we can multiply it by a symplectic diagonal matrix $R = \mathrm{diag}(r_1,\ldots,r_{2n})$ . This is seen as follows. In view of the remark made after lemma 5.1 the matrix $\tilde M = MR$ still gives the system a diagonal form. Thus we should only check that $\tilde M$ is still symplectic, which is true because the set of symplectic matrices forms a group.

Let us look for the general form of a diagonal symplectic matrix $R$. To this end let us write it in the convenient form

$$R = \begin{pmatrix} R_0 & 0 \\ 0 & R_1 \end{pmatrix} , \quad R_0 = \mathrm{diag}(r_1,\ldots,r_n) , \quad R_1 = \mathrm{diag}(r'_1,\ldots,r'_n) .$$


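Writing the symplecticity condition we have

$$\begin{pmatrix} R_0 & 0 \\ 0 & R_1 \end{pmatrix}
\begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}
\begin{pmatrix} R_0 & 0 \\ 0 & R_1 \end{pmatrix}
=
\begin{pmatrix} R_0 & 0 \\ 0 & R_1 \end{pmatrix}
\begin{pmatrix} 0 & R_1 \\ -R_0 & 0 \end{pmatrix}
=
\begin{pmatrix} 0 & R_0R_1 \\ -R_0R_1 & 0 \end{pmatrix}
=
\begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix} .$$

The last equality gives us the condition $R_0R_1 = I$ . We conclude that the symplectic matrix $M$ which takes the linear canonical system in diagonal form may be replaced with the symplectic matrix $MR$ where $R$ is a non degenerate diagonal matrix of the form

(5.13) $R = \begin{pmatrix} R_0 & 0 \\ 0 & R_0^{-1} \end{pmatrix}$ , $R_0 = \mathrm{diag}(r_1,\ldots,r_n)$ .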
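The condition R0 R1 = I can be checked numerically for two degrees of freedom. The sketch below is only an illustration: the values in `r` and the helper names (`diag`, `is_symplectic`) are assumptions. It builds one diagonal matrix of the form (5.13) and one that violates the condition, and tests the relation RᵀJR = J for both.

```python
def matmul(p, q):
    """Product of two square matrices stored as nested lists."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(p):
    return [list(row) for row in zip(*p)]

def diag(entries):
    """Diagonal matrix with the given diagonal entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

# Matrix J of the symplectic form for n = 2 degrees of freedom
J = [[0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0],
     [-1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0]]

def is_symplectic(m, tol=1e-12):
    """Check the condition m^T J m = J entry by entry."""
    lhs = matmul(transpose(m), matmul(J, m))
    return all(abs(lhs[i][j] - J[i][j]) < tol
               for i in range(4) for j in range(4))

r = [2.0, -0.5]                              # arbitrary nonzero r_1, r_2 (assumption)
R_good = diag(r + [1.0 / x for x in r])      # R1 = R0^{-1}, as in (5.13)
R_bad = diag(r + [1.0, 1.0])                 # R0 R1 != I, not symplectic
```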

5.2.3 Complex normal form of the Hamiltonian

Let us now come to the transformation of the Hamiltonian, using the compact notation $\zeta = (\xi,\eta)^\top$ where $\xi = (\xi_1,\ldots,\xi_n) \in \mathbb{C}^n$ are the coordinates and $\eta = (\eta_1,\ldots,\eta_n) \in \mathbb{C}^n$ are the corresponding conjugate momenta. Furthermore, recalling the ordering (5.11) of the eigenvalues, it will be convenient to write the diagonal matrix $\Lambda$ as

(5.14) $\Lambda = \begin{pmatrix} \Lambda_0 & 0 \\ 0 & -\Lambda_0 \end{pmatrix}$ , $\Lambda_0 = \mathrm{diag}(\lambda_1,\ldots,\lambda_n)$ .

Proposition 5.9: Let the eigenvalues of the matrix $JC$ be distinct. Then the linear transformation generated by the matrix $M$ defined by (5.12) gives the quadratic Hamiltonian (5.7) the form

(5.15) $H(\xi,\eta) = \sum_{j=1}^{n} \lambda_j\,\xi_j\eta_j$ .

Proof. By transforming the quadratic form $\frac{1}{2}z^\top Cz$ we get $\frac{1}{2}\zeta^\top M^\top CM\zeta$. Recalling that $M^{-1}JCM = \Lambda$ we calculate

$$M^\top CM = -\bigl(M^\top J\bigr)JCM = -\bigl(M^\top JM\bigr)\bigl(M^{-1}JCM\bigr) = -J\Lambda = \begin{pmatrix} 0 & \Lambda_0 \\ \Lambda_0 & 0 \end{pmatrix} ,$$

a symmetric matrix. Thus, the transformed Hamiltonian is $H(\zeta) = -\frac{1}{2}\zeta^\top J\Lambda\zeta$ , namely (5.15) in complex coordinates $(\xi,\eta)$ . Q.E.D.
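The construction (5.12) and the result of proposition 5.9 can be tried on a system with one degree of freedom and real eigenvalues. The following Python sketch assumes a quadratic Hamiltonian H = (a x² + 2b xy + c y²)/2 with b² − ac > 0 and c ≠ 0; the coefficients and the function names `normal_form_real` and `congruence` are illustrative choices, not taken from the text.

```python
import math

def normal_form_real(a, b, c):
    """Symplectic M for H = (a x^2 + 2 b x y + c y^2)/2, b^2 - a c > 0, c != 0."""
    lam = math.sqrt(b * b - a * c)     # eigenvalues of JC = [[b, c], [-a, -b]] are +lam, -lam
    w1 = (c, lam - b)                  # eigenvector for +lam
    w2 = (c, -lam - b)                 # eigenvector for -lam
    d = w1[0] * w2[1] - w1[1] * w2[0]  # d = w1^T J w2 with J = [[0, 1], [-1, 0]]
    M = [[w1[0] / d, w2[0]], [w1[1] / d, w2[1]]]   # columns (w1/d, w2), as in (5.12)
    return lam, M

def congruence(M, S):
    """Return M^T S M for 2x2 matrices stored as nested lists."""
    SM = [[sum(S[i][k] * M[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    return [[sum(M[k][i] * SM[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

a, b, c = -1.0, 0.5, 2.0               # sample coefficients (assumption)
lam, M = normal_form_real(a, b, c)
J = [[0.0, 1.0], [-1.0, 0.0]]
C = [[a, b], [b, c]]
MtJM = congruence(M, J)                # lemma 5.7 predicts J
MtCM = congruence(M, C)                # proposition 5.9 predicts [[0, lam], [lam, 0]]
```

With these sample values `lam` is 1.5, so the transformed Hamiltonian is 1.5 ξη, which is the form (5.15) for n = 1.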

5.2.4 First integrals

The normalized Hamiltonian (5.15) possesses $n$ first integrals which are in involution, namely

(5.16) $\Psi_j(\xi_j,\eta_j) = \xi_j\eta_j$ , $j = 1,\ldots,n$ .


This is easily checked by a straightforward calculation of Poisson brackets $\{\Psi_j, H\}$ . Thus the system turns out to be integrable in Liouville’s sense.³ Further first integrals may be found by looking for a solution of the equation $\{\Phi, H\} = 0$ , namely

$$\sum_{l=1}^{n} \lambda_l \Bigl( \eta_l \frac{\partial\Phi}{\partial\eta_l} - \xi_l \frac{\partial\Phi}{\partial\xi_l} \Bigr) = 0 .$$

Choosing⁴ $\Phi = \xi^j\eta^k$ we have

$\langle k - j, \lambda\rangle\,\xi^j\eta^k = 0$ ,

which can be satisfied only if $\langle k - j, \lambda\rangle = 0$ , which is a resonance relation. Therefore further first integrals independent of those in (5.16) do exist only if the eigenvalues $\lambda$ satisfy a resonance condition. This is an extension of proposition 4.10 to the general case of quadratic Hamiltonians.
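The Poisson bracket calculation for the first integrals (5.16) can be written out explicitly. The sign convention for the bracket used below is an assumption; the conclusion does not depend on it.

```latex
\[
\{\Psi_j, H\}
= \sum_{l=1}^{n} \Bigl( \frac{\partial \Psi_j}{\partial \xi_l}\frac{\partial H}{\partial \eta_l}
 - \frac{\partial \Psi_j}{\partial \eta_l}\frac{\partial H}{\partial \xi_l} \Bigr)
= \eta_j\,(\lambda_j \xi_j) - \xi_j\,(\lambda_j \eta_j) = 0 ,
\qquad
\{\Psi_j, \Psi_k\} = 0 \quad (j \neq k),
\]
```

where the second relation holds because $\Psi_j$ and $\Psi_k$ depend on different pairs of conjugate variables.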

The canonical equations for the Hamiltonian (5.15) are

(5.17) $\dot\xi_j = \lambda_j\xi_j$ , $\dot\eta_j = -\lambda_j\eta_j$ ,

and the corresponding solutions with initial datum $\xi_{j,0}$, $\eta_{j,0}$ are

$\xi_j(t) = \xi_{j,0}\,e^{\lambda_j t}$ , $\eta_j(t) = \eta_{j,0}\,e^{-\lambda_j t}$ .

This is straightforward, but the structure of the orbits in the real phase space remains hidden until we perform all substitutions back to the original variables, or we write in explicit form the evolution operator $U(t) = Me^{t\Lambda}M^{-1}$ as in proposition 5.3.

More generally, the drawback of this section is that the functions so determined have in general complex values, while it seems better to write both the Hamiltonian and the first integrals in real variables. Thus, let us proceed by looking for a normal form in real variables.

5.2.5 The case of real eigenvalues

If the eigenvalues of the matrix $JC$ are real and distinct then the whole normalization procedure is performed without involving complex objects. Thus, considering for simplicity the case of a system with one degree of freedom, the normal form of the Hamiltonian is

(5.18) $H(\xi,\eta) = \lambda\xi\eta$ , $(\xi,\eta) \in \mathbb{R}^2$

where $\lambda$, $-\lambda$ are the eigenvalues of the matrix $JC$ .

It may be interesting however to remark that one can use as a paradigm Hamiltonian also the form

(5.19) $H(x,y) = \frac{\lambda}{2}\,(x^2 - y^2)$ .

³ Integrability in Arnold–Jost sense is not assured because the invariant surfaces determined by the first integrals need not be compact.

⁴ I use the multi-index notation, i.e., $k$, $j$ are integer vectors with non negative entries and $\xi^j\eta^k = \xi_1^{j_1}\cdots\xi_n^{j_n}\,\eta_1^{k_1}\cdots\eta_n^{k_n}$ .

This is a typical form of a Hamiltonian describing the motion of a particle on a one–dimensional manifold.

Exercise 5.1: Find a canonical transformation that changes the Hamiltonian (5.19) into (5.18).

5.2.6 The case of pure imaginary eigenvalues

Let us consider again the case of a system with one degree of freedom. My aim is to show that the Hamiltonian may be given the normal form

(5.20) $H(x,y) = \frac{\omega}{2}\,(y^2 + x^2)$ ,

which is easily recognized as the Hamiltonian of a harmonic oscillator.

Let the eigenvalues of the matrix $JC$ be $\lambda = i\omega$ and $\lambda^* = -i\omega$ . Then the corresponding eigenvectors are complex conjugate, i.e., may be written as $w = u + iv$ and $w^* = u - iv$. Recalling that the symplectic form is antisymmetric we have

$$w^\top J w^* = (u + iv)^\top J (u - iv) = i\bigl(-u^\top Jv + v^\top Ju\bigr) = -2i\,u^\top Jv ,$$

which is a pure imaginary quantity. We may always order the eigenvalues and the eigenvectors as in (5.11) with the further requirement

(5.21) $d = u^\top Jv > 0$ .

This may just require exchanging a pair $(\lambda, w)$ with its conjugate $(-\lambda, w^*)$ . More precisely, we may create the association

$i\omega \leftrightarrow u$ , $-i\omega \leftrightarrow v$

by choosing the sign of $\omega$ so that the condition $d > 0$ is satisfied.⁵ Let us now construct the transformation matrix

(5.22) $T = \bigl(u/\sqrt{d}\,,\ v/\sqrt{d}\bigr)$ .

This is a symplectic matrix, as is checked by calculating

$$T^\top JT = \frac{1}{d}\begin{pmatrix} u^\top Ju & u^\top Jv \\ v^\top Ju & v^\top Jv \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} .$$

By property (iii) of lemma 5.2 we have

$JCu = -\omega v$ , $JCv = \omega u$ ,

and we can write this relation for the columns of the matrix $T$ , thus getting

$$JCT = T\begin{pmatrix} 0 & \omega \\ -\omega & 0 \end{pmatrix} ,$$

that is

$$T^{-1}JCT = J\begin{pmatrix} \omega & 0 \\ 0 & \omega \end{pmatrix} .$$

⁵ Remark that the sign of $\omega$ in the normal form (5.20) depends precisely on this choice. Since the choice is dictated by the eigenvectors of the matrix $JC$ there is no arbitrariness here.

Thus the equations take the form

$\dot x = \omega y$ , $\dot y = -\omega x$ ,

and the transformed Hamiltonian is (5.20).
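The steps leading to (5.20) can be followed numerically for one degree of freedom. The sketch below assumes a positive definite quadratic Hamiltonian H = (a x² + 2b xy + c y²)/2, that is ac − b² > 0 and c > 0, so that ω > 0 satisfies the requirement (5.21); the coefficients and the names `elliptic_normal_form` and `congruence` are illustrative assumptions.

```python
import math

def elliptic_normal_form(a, b, c):
    """Real symplectic T for H = (a x^2 + 2 b x y + c y^2)/2, a c - b^2 > 0, c > 0."""
    omega = math.sqrt(a * c - b * b)   # eigenvalues of JC are +i omega, -i omega
    # Eigenvector of JC = [[b, c], [-a, -b]] for i*omega:
    # w = (c, i*omega - b) = u + i v
    u = (c, -b)
    v = (0.0, omega)
    d = u[0] * v[1] - u[1] * v[0]      # d = u^T J v = c * omega > 0, see (5.21)
    s = math.sqrt(d)
    T = [[u[0] / s, v[0] / s], [u[1] / s, v[1] / s]]   # columns as in (5.22)
    return omega, T

def congruence(T, S):
    """Return T^T S T for 2x2 matrices stored as nested lists."""
    ST = [[sum(S[i][k] * T[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    return [[sum(T[k][i] * ST[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

a, b, c = 2.0, 1.0, 5.0                # sample coefficients (assumption)
omega, T = elliptic_normal_form(a, b, c)
J = [[0.0, 1.0], [-1.0, 0.0]]
C = [[a, b], [b, c]]
TtJT = congruence(T, J)                # symplecticity: should equal J
TtCT = congruence(T, C)                # normal form: should equal omega times the identity
```

Here `TtCT` equal to ω times the identity means that the transformed Hamiltonian is ω(x² + y²)/2, in agreement with (5.20).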

Exercise 5.2: The Hamiltonian of a harmonic oscillator in complex variables is $H(\xi,\eta) = i\omega\xi\eta$ . Find a canonical transformation that changes the Hamiltonian (5.20) into this one.

5.2.7 The case of complex conjugate eigenvalues

The simplest case is that of a system with two degrees of freedom. My aim is to show that a good paradigm Hamiltonian is

(5.23) $H = \mu x_1y_1 + \mu x_2y_2 + \omega(x_1y_2 - x_2y_1)$

where $\mu$ and $\omega$ are real parameters.

Let me first show how we can construct a real symplectic basis. Write the four eigenvalues and the corresponding eigenvectors of the matrix $JC$ as

$$\begin{aligned}
\lambda &= \mu + i\omega , & w_+ &= u_+ + iv_+ , \\
\lambda^* &= \mu - i\omega , & w_+^* &= u_+ - iv_+ , \\
-\lambda &= -\mu - i\omega , & w_- &= u_- + iv_- , \\
-\lambda^* &= -\mu + i\omega , & w_-^* &= u_- - iv_- ,
\end{aligned}$$

the alignment in rows reflecting the correspondence. By lemma 5.6 we know that $w_+^\top Jw_- \neq 0$ and $w_+^{*\top}Jw_-^* \neq 0$ , and we can always manage that these quantities, which are complex conjugates of each other, are pure imaginary,⁶ i.e.,

(5.24) $w_+^\top Jw_- + w_+^{*\top}Jw_-^* = 0$ .

Still by lemma 5.6 we know also that, omitting four obvious relations, we have

(5.25)
$$\begin{aligned}
w_+^\top Jw_+^* &= 0 , & w_+^\top Jw_-^* &= 0 , & w_-^\top Jw_-^* &= 0 , \\
w_+^{*\top}Jw_+ &= 0 , & w_+^{*\top}Jw_- &= 0 , & w_-^{*\top}Jw_- &= 0 .
\end{aligned}$$

Lemma 5.10: With the condition (5.24), the vectors $u_+$, $u_-$, $v_+$, $v_-$ satisfy

(5.26) $u_+^\top Jv_- = v_+^\top Ju_- \neq 0$

and

(5.27) $u_+^\top Jv_+ = u_-^\top Jv_- = u_+^\top Ju_- = v_+^\top Jv_- = 0$ .

This means that $u_+$, $v_+$, $v_-$, $u_-$ , in this order, form a symplectic basis, still to be normalized.

⁶ For instance, if $w_+^\top Jw_- = e^{i\vartheta}$ it is enough to replace $w_+$ with $ie^{-i\vartheta}w_+$ , with the corresponding change for $w_+^*$.

Proof. From $w_+^\top Jw_+^* = 0$ we get

$(u_+ + iv_+)^\top J(u_+ - iv_+) = -2i\,u_+^\top Jv_+ = 0$ .

With a similar calculation, from $w_-^\top Jw_-^* = 0$ we also get $u_-^\top Jv_- = 0$ . Thus two of the relations (5.27) are proven.

Using $w_+^\top Jw_-^* = 0$ we get

$$(u_+ + iv_+)^\top J(u_- - iv_-) = \bigl(u_+^\top Ju_- + v_+^\top Jv_-\bigr) - i\bigl(u_+^\top Jv_- - v_+^\top Ju_-\bigr) = 0 ,$$

and setting separately to zero the real and the imaginary part we get

(5.28) $u_+^\top Ju_- + v_+^\top Jv_- = 0$ , $u_+^\top Jv_- - v_+^\top Ju_- = 0$ .

Now calculate

(5.29)
$$w_+^\top Jw_- = (u_+ + iv_+)^\top J(u_- + iv_-) = \bigl(u_+^\top Ju_- - v_+^\top Jv_-\bigr) + i\bigl(u_+^\top Jv_- + v_+^\top Ju_-\bigr) .$$

In view of (5.28) we get

$w_+^\top Jw_- = 2\,u_+^\top Ju_- + 2i\,u_+^\top Jv_-$ .

In view of the property (5.24) the l.h.s. is a pure imaginary quantity, and since we know that $w_+^\top Jw_- \neq 0$ we get

$u_+^\top Ju_- = 0$ , $u_+^\top Jv_- \neq 0$ .

By (5.28) we also get

$v_+^\top Jv_- = 0$ , $v_+^\top Ju_- \neq 0$ .

Furthermore by the second of (5.28) we have $u_+^\top Jv_- = v_+^\top Ju_-$ , which completes the proof. Q.E.D.

We are now ready to construct the matrix $T$ that gives the system the normal form (5.23) by suitably adapting the scheme of the previous section. Since $u_+^\top Jv_-$ is a real quantity we can always manage so that

(5.30) $u_+^\top Jv_- = v_+^\top Ju_- = d > 0$ .

For, if $d$ is negative it is enough to exchange $u_+$ with $u_-$ and $v_+$ with $v_-$ . Then we construct the $4\times 4$ matrix

(5.31) $T = \Bigl( \frac{u_+}{\sqrt{d}}\,,\ \frac{v_+}{\sqrt{d}}\,,\ \frac{v_-}{\sqrt{d}}\,,\ \frac{u_-}{\sqrt{d}} \Bigr)$ .

This is a symplectic matrix. For, in view of (5.30) and of lemma 5.10 we have

$$T^\top JT = \frac{1}{d}
\begin{pmatrix}
u_+^\top Ju_+ & u_+^\top Jv_+ & u_+^\top Jv_- & u_+^\top Ju_- \\
v_+^\top Ju_+ & v_+^\top Jv_+ & v_+^\top Jv_- & v_+^\top Ju_- \\
v_-^\top Ju_+ & v_-^\top Jv_+ & v_-^\top Jv_- & v_-^\top Ju_- \\
u_-^\top Ju_+ & u_-^\top Jv_+ & u_-^\top Jv_- & u_-^\top Ju_-
\end{pmatrix}
= J .$$

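By property (iii) of lemma 5.2 we have

$JCu_+ = \mu u_+ - \omega v_+$ , $JCv_+ = \omega u_+ + \mu v_+$ ,

$JCu_- = -\mu u_- + \omega v_-$ , $JCv_- = -\omega u_- - \mu v_-$ .

Using this we also have, reading off the columns one by one,

$$JCT = \Bigl( \frac{\mu u_+ - \omega v_+}{\sqrt{d}}\,,\ \frac{\omega u_+ + \mu v_+}{\sqrt{d}}\,,\ \frac{-\mu v_- - \omega u_-}{\sqrt{d}}\,,\ \frac{\omega v_- - \mu u_-}{\sqrt{d}} \Bigr)
= T \begin{pmatrix}
\mu & \omega & 0 & 0 \\
-\omega & \mu & 0 & 0 \\
0 & 0 & -\mu & \omega \\
0 & 0 & -\omega & -\mu
\end{pmatrix} .$$

Thus, denoting by $(x_1, x_2, y_1, y_2)$ the new variables, the system (5.8) is transformed into

$\dot x_1 = \mu x_1 + \omega x_2$ , $\dot x_2 = -\omega x_1 + \mu x_2$ ,

$\dot y_1 = -\mu y_1 + \omega y_2$ , $\dot y_2 = -\omega y_1 - \mu y_2$ ,

which are the canonical equations for a Hamiltonian of the form (5.23), with $-\omega$ in place of $\omega$ .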
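As a consistency check on the paradigm Hamiltonian (5.23), the sketch below writes down the matrix JC obtained directly from (5.23), with variables ordered as (x1, x2, y1, y2), and verifies that the quadruple λ, λ∗, −λ, −λ∗ with λ = µ + iω annihilates its characteristic polynomial. The numerical values of `mu` and `omega` and the helper names are assumptions made for illustration; the quadruple is insensitive to the sign of ω.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def char_poly(mu, omega, lam):
    """det(JC - lam I) for the Hamiltonian (5.23), variables (x1, x2, y1, y2)."""
    # Rows are the right-hand sides of Hamilton's equations derived from (5.23)
    JC = [[mu, -omega, 0, 0],
          [omega, mu, 0, 0],
          [0, 0, -mu, -omega],
          [0, 0, omega, -mu]]
    shifted = [[JC[i][j] - (lam if i == j else 0) for j in range(4)]
               for i in range(4)]
    return det(shifted)

mu, omega = 0.3, 1.7                       # sample parameters (assumption)
lam = complex(mu, omega)
quadruple = [lam, lam.conjugate(), -lam, -lam.conjugate()]
residuals = [abs(char_poly(mu, omega, z)) for z in quadruple]
not_a_root = abs(char_poly(mu, omega, 1.0))  # a generic value is not an eigenvalue
```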

Exercise 5.3: Find the canonical transformation that changes the Hamiltonian (5.23) into the Hamiltonian in complex variables $H(\xi,\eta) = \lambda\xi\eta$ .

Hint: this may be used as an example of application of the diagonalizing procedure in sections 5.2.1, 5.2.2 and 5.2.3.

5.3 Nonlinear elliptic equilibrium

The Hamiltonian in a neighbourhood of an elliptic equilibrium can be typically approximated by a quadratic Hamiltonian with the form (see 5.2.6)

(5.32) $H_0(x,y) = \frac{1}{2}\sum_{l=1}^{n} \omega_l\,(x_l^2 + y_l^2)$ ,

where $(x,y) \in \mathbb{R}^{2n}$ are the canonical variables, and $\omega = (\omega_1,\ldots,\omega_n) \in \mathbb{R}^n$ is the vector of frequencies, that are assumed not to vanish.

By assuming the Hamiltonian to be analytic and expanding it in power series we are led to consider a canonical system with Hamiltonian

(5.33) $H(x,y) = H_0(x,y) + H_1(x,y) + H_2(x,y) + \ldots$

where $(x,y) \in \mathbb{R}^{2n}$ are canonically conjugate variables, $H_0(x,y)$ has the form (5.32), and $H_s(x,y)$, for $s \ge 1$, is a homogeneous polynomial of degree $s+2$ in the canonical variables. The power series is assumed to be convergent in a neighbourhood of the origin of $\mathbb{R}^{2n}$. This is actually a perturbed system of harmonic oscillators, which describes many interesting physical models.


5.3.1 Use of action–angle variables

The canonical transformation to action–angle variables

(5.34) $x_l = \sqrt{2I_l}\,\cos\varphi_l$ , $y_l = \sqrt{2I_l}\,\sin\varphi_l$ , $1 \le l \le n$

gives $H_0$ the form (4.17) of an isochronous system, which fails to satisfy the condition of non degeneration of Poincaré’s theorem.⁷ The Hamiltonian presents a couple of minor differences with respect to that of the general problem of dynamics as stated by Poincaré, namely the Hamiltonian (4.1). For, it seems that the Hamiltonian (5.33) is not an expansion in a perturbation parameter, and moreover it is not written in action–angle variables.

The lack of a perturbation parameter is just a trivial matter, because the parameter $\varepsilon$ is easily replaced by the distance from the origin. Indeed, if we consider the dynamics inside a sphere of radius $\varrho$ centered at the origin then the homogeneous polynomial $H_s(x,y)$ is of order $O(\varrho^{s+2})$, so that $\varrho$ plays the role of perturbation parameter. More formally we may introduce a scaling transformation

$x_j = \varepsilon x'_j$ , $y_j = \varepsilon y'_j$ ,

which is not canonical, but preserves the canonical form of the equations if the new Hamiltonian is defined as

$$H'(x',y') = \frac{1}{\varepsilon^2}\,H(x,y)\Big|_{x=\varepsilon x' ,\ y=\varepsilon y'}$$

(see example 2.4). Thus the Hamiltonian is changed to

$H'(x',y') = H_0(x',y') + \varepsilon H_1(x',y') + \varepsilon^2 H_2(x',y') + \ldots$ ,

which introduces the power expansion in $\varepsilon$ . This means that the natural reordering of the power series as homogeneous polynomials corresponds exactly to the use of a parameter.
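The effect of the scaling transformation can be seen on a toy example. The Python sketch below uses an arbitrarily chosen Hamiltonian with one degree of freedom, made of a quadratic part plus homogeneous terms of degree 3 and 4; the specific polynomials, the frequency and the function names are assumptions made for illustration. It checks numerically that H(εx', εy')/ε² coincides with H0 + εH1 + ε²H2 evaluated at (x', y').

```python
def H_terms(x, y, omega=1.3):
    """Homogeneous parts (H0, H1, H2) of a sample Hamiltonian (assumption)."""
    h0 = 0.5 * omega * (x * x + y * y)   # quadratic part, as in (5.32) with n = 1
    h1 = x * x * y                       # homogeneous polynomial of degree 3
    h2 = x ** 4                          # homogeneous polynomial of degree 4
    return h0, h1, h2

def scaled_H(xp, yp, eps):
    """H'(x', y') = H(eps x', eps y') / eps^2."""
    return sum(H_terms(eps * xp, eps * yp)) / eps ** 2

xp, yp, eps = 0.4, -0.9, 0.05            # sample point and scaling factor
h0, h1, h2 = H_terms(xp, yp)
expansion = h0 + eps * h1 + eps ** 2 * h2
difference = abs(scaled_H(xp, yp, eps) - expansion)
```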

The transformation to action–angle variables is a bit more delicate. Indeed the canonical transformation (5.34) introduces a singularity at the origin which causes a loss of analyticity for $I = 0$ . Let us see this point in some more detail.

By transforming a homogeneous polynomial of degree s in x, y we obtain a Fourierseries in the angles ϕ with coefficients that depend on the actions I in a particularform. The following rules apply:8

7 The problem of a degenerate Hamiltonian of the type considered here was first investigated by Whittaker [96]. As a historical remark, it is curious that in his very exhaustive paper Whittaker did not mention the problem of the consistency of the construction pointed out in sect. 4.3.1. A few years later Cherry wrote two papers where a lot of work is devoted to the consistency problem, but without reaching a definite conclusion [20] [21]. An indirect solution was found by Birkhoff in [17], ch. III, § 8, using the method of normal form that usually goes under his name.

8 The reader will easily check that these rules apply by writing the trigonometric functions in complex form.


(i) The actions $I$ appear only as powers of $\sqrt{I_1}, \ldots, \sqrt{I_n}$, namely

$$c_{j_1,\ldots,j_n}\, I_1^{j_1/2} \cdots I_n^{j_n/2} ,$$

where $c_{j_1,\ldots,j_n} \in \mathbb{R}$.

(ii) In every term of the Fourier expansion

$$c_{j_1,\ldots,j_n}\, I_1^{j_1/2} \cdots I_n^{j_n/2}\, e^{i(k_1\varphi_1 + \ldots + k_n\varphi_n)}$$

the exponent $k_l$ may take only the values $-j_l, -j_l + 2, \ldots, j_l - 2, j_l$.

With some patience the reader will be able to check that these properties are preserved by sums, products and Poisson brackets.⁹
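As an illustration of rules (i) and (ii), here are two low-degree cases worked out explicitly. The sketch assumes that the transformation (5.34) has the usual form $x_l = \sqrt{2 I_l}\cos\varphi_l$; the actual sign and phase conventions of (5.34) are not reproduced here, so this is an assumption.

```latex
% Degree 2 in a single oscillator: j_l = 2, so k_l ranges over -2, 0, 2.
x_l^2 = 2 I_l \cos^2\varphi_l
      = I_l + \tfrac{1}{2}\, I_l \left( e^{2i\varphi_l} + e^{-2i\varphi_l} \right)

% Degree 2 mixing two oscillators: j_1 = j_2 = 1, so k_1, k_2 range over -1, 1.
x_1 x_2 = 2 \sqrt{I_1 I_2}\, \cos\varphi_1 \cos\varphi_2
        = \tfrac{1}{2}\, I_1^{1/2} I_2^{1/2}
          \sum_{k_1 = \pm 1} \sum_{k_2 = \pm 1} e^{i (k_1 \varphi_1 + k_2 \varphi_2)}
```

In both cases the actions enter only through powers $I_l^{j_l/2}$, and each $k_l$ has the same parity as $j_l$ with $|k_l| \le j_l$.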

The polynomial dependence on $\sqrt{I_1}, \ldots, \sqrt{I_n}$ keeps the property of a homogeneous polynomial of degree $s$ to be of order $s$, so that nothing essential is changed. However, the lack of analyticity may be somehow annoying when one tries to analyze the convergence. As a matter of fact all these problems actually disappear if we choose to work in cartesian coordinates, as we shall do here. In a first stage the theory will be developed at a purely formal level, in the sense that all calculations will be performed disregarding the problem of convergence of the series that will be constructed. The problem of convergence will be discussed later.

5.3.2 A formally integrable case

In view of the degeneration of the unperturbed system we may expect to be able to construct first integrals in our case by just applying the procedure of sect. 4.3.1, namely to solve the system (4.21) of equations.

Let us adapt the scheme to our case. We look for a first integral

(5.35)   $$\Phi(x, y) = \Phi_0(x, y) + \Phi_1(x, y) + \ldots$$

where $\Phi_0(x, y) = I_l = \frac{1}{2}(x_l^2 + y_l^2)$ is the action of the $l$–th oscillator, and $\Phi_s(x, y)$ is a homogeneous polynomial of degree $s + 2$. By setting $l = 1, \ldots, n$ we may construct $n$ independent first integrals. Thus, splitting the equation $\{H, \Phi\} = 0$ in homogeneous polynomials we get the recurrent system

(5.36)   $$\begin{aligned} \{H_0, \Phi_1\} &= -\{H_1, \Phi_0\} \\ \{H_0, \Phi_s\} &= -\{H_1, \Phi_{s-1}\} - \ldots - \{H_s, \Phi_0\} , \quad s > 0 . \end{aligned}$$

Looking at the algebraic aspect of the equations above makes the problem quite simple. Denote by $P$ the vector space of formal power series and by $P_r$ the subspace of

9 The direct verification is straightforward, but we may remark that there is no actual need to do all the calculations. Here is the argument. Recall that the transformation to action–angle variables is canonical and that the functions that we are considering are obtained by transforming polynomials. Since the canonical transformation preserves the Poisson brackets the calculation in action–angle variables can not affect the property that in cartesian variables $x, y$ the result is a polynomial. Thus the properties must be preserved. The same argument, with no use of the canonical structure, applies to the product.


homogeneous polynomials of degree $r$. The unperturbed Hamiltonian $H_0$ acts as a linear operator $L_{H_0}\cdot = \{H_0, \cdot\}$ from the linear space $P_r$ into itself. Moreover, using the complex canonical coordinates $(\xi, \eta) \in \mathbb{C}^{2n}$ defined by

(5.37)   $$x_l = \frac{1}{\sqrt{2}}(\xi_l + i\eta_l) , \quad y_l = \frac{i}{\sqrt{2}}(\xi_l - i\eta_l) , \quad 1 \le l \le n$$

we get

(5.38)   $$H_0 = i \sum_l \omega_l \xi_l \eta_l ,$$

so that the operator $L_{H_0}$ above takes a diagonal form. For, by applying it to a monomial $\xi^j\eta^k \equiv \xi_1^{j_1} \cdots \xi_n^{j_n} \eta_1^{k_1} \cdots \eta_n^{k_n}$ we get

$$L_{H_0}\, \xi^j\eta^k = i \langle k - j, \omega \rangle\, \xi^j\eta^k .$$
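The diagonal action of $L_{H_0}$ can be verified mechanically. The sketch below is illustrative only: polynomials are stored as dictionaries mapping exponent tuples $(j_1,\ldots,j_n,k_1,\ldots,k_n)$ to coefficients, and the Poisson bracket is assumed to be $\{f,g\} = \sum_l (\partial_{\xi_l} f\, \partial_{\eta_l} g - \partial_{\eta_l} f\, \partial_{\xi_l} g)$, the sign convention that reproduces the formula above. The helper names and the sample monomial are hypothetical.

```python
from itertools import product

def diff(p, i):
    """Partial derivative of a polynomial {exponents: coeff} w.r.t. variable i."""
    out = {}
    for m, c in p.items():
        if m[i] > 0:
            key = m[:i] + (m[i] - 1,) + m[i + 1:]
            out[key] = out.get(key, 0) + c * m[i]
    return out

def mul(p, q):
    """Product of two polynomials."""
    out = {}
    for (m1, c1), (m2, c2) in product(p.items(), q.items()):
        key = tuple(a + b for a, b in zip(m1, m2))
        out[key] = out.get(key, 0) + c1 * c2
    return out

def poisson(f, g, n):
    """{f, g} with variables ordered (xi_1..xi_n, eta_1..eta_n) (assumed convention)."""
    out = {}
    for l in range(n):
        for sign, a, b in ((1, diff(f, l), diff(g, n + l)),
                           (-1, diff(f, n + l), diff(g, l))):
            for m, c in mul(a, b).items():
                out[m] = out.get(m, 0) + sign * c
    return {m: c for m, c in out.items() if abs(c) > 1e-14}

n = 2
omega = (1.0, (5 ** 0.5 - 1) / 2)   # sample non resonant frequencies

# H0 = i * sum_l omega_l xi_l eta_l, eq. (5.38)
h0 = {}
for l in range(n):
    e = [0] * (2 * n)
    e[l] = e[n + l] = 1
    h0[tuple(e)] = 1j * omega[l]

j, k = (2, 1), (0, 3)               # arbitrary sample monomial xi^j eta^k
monomial = {j + k: 1.0}
image = poisson(h0, monomial, n)
eigenvalue = 1j * sum((kl - jl) * wl for jl, kl, wl in zip(j, k, omega))
print(image, eigenvalue)
```

The image contains the same monomial only, multiplied by $i\langle k - j, \omega\rangle$.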

Define now, as usual, $R$ as the image of $P$ by $L_{H_0}$ and, correspondingly, $R_r$ as the image of the subspace $P_r$. We see that eq. (5.36) can be solved if the r.h.s. belongs to $R_r$. On the other hand, let us define the null space $N$ as $N = L_{H_0}^{-1}(0)$, the inverse image of the null element by $L_{H_0}$, with the corresponding definition for $N_r$.

We get that both $N_r$ and $R_r$ are linear subspaces of the same space $P_r$, which are disjoint, namely satisfy $N_r \cap R_r = \{0\}$, and generate $P_r$ by direct sum, namely satisfy $N_r \oplus R_r = P_r$. Thus the system (5.36) can be solved provided the r.h.s. has no component in $N$. This is actually the consistency problem that has been pointed out in sect. 4.3.1.
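Since $L_{H_0}$ is diagonal on monomials, the splitting into null space and range, and the solution of an equation $L_{H_0}\Phi = \Psi$, reduce to a loop over coefficients. The following is a minimal sketch, not the author's implementation: the function name, the data layout (keys are pairs `(j, k)` of exponent tuples) and the numerical tolerance used to decide whether $\langle k - j, \omega\rangle$ vanishes are all assumptions.

```python
def split_and_solve(psi, omega, tol=1e-12):
    """psi: {(j, k): coeff}, representing sum of coeff * xi^j eta^k.

    Returns (kernel_part, phi), where kernel_part collects the monomials
    with <k - j, omega> = 0 and phi solves L_H0 phi = psi - kernel_part.
    """
    kernel, phi = {}, {}
    for (j, k), c in psi.items():
        lam = 1j * sum((kl - jl) * wl for jl, kl, wl in zip(j, k, omega))
        if abs(lam) < tol:
            kernel[(j, k)] = c        # component in the null space N
        else:
            phi[(j, k)] = c / lam     # component in the range R: divide by eigenvalue
    return kernel, phi

omega = (1.0, (5 ** 0.5 - 1) / 2)
psi = {
    ((1, 0), (1, 0)): 2.0,    # xi_1 eta_1, lies in N
    ((2, 0), (0, 1)): 1.0,    # xi_1^2 eta_2
    ((0, 1), (1, 1)): -3.0,   # xi_2 eta_1 eta_2
}
kernel, phi = split_and_solve(psi, omega)
print(kernel)
print(phi)
```

If `kernel` is non-empty the equation has no solution as it stands, which is the consistency problem mentioned above. In floating point the test for a vanishing divisor needs a tolerance, which is why `tol` appears.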

Dealing with the latter problem is actually not easy, in general. However a direct solution may be found in a particular but relevant case which occurs in many physical situations.¹⁰

Let us say that a function $f^{+}(x, y)$ is even in the momenta in case $f^{+}(x, y) = f^{+}(x, -y)$ and that $f^{-}(x, y)$ is odd in the momenta in case $f^{-}(x, y) = -f^{-}(x, -y)$.

10 See [30]. The reader will notice that only the non–resonant case is discussed here, and moreover the proof of the proposition can not be trivially extended to the case of a non–reversible Hamiltonian. The direct construction of formal first integrals for the resonant case encounters major difficulties due to the nontrivial structure of the null space $N$. This is indeed the main concern of the second paper of Cherry [21]: he proposes to determine the arbitrary terms in the solution of the linear equation at a given order so as to remove the unwanted terms in $N$ from the known term of the equations for higher orders. However, Cherry fails to prove that the method actually works, although exactly his method has been implemented on computer by Contopoulos in order to perform an explicit calculation (see [23], [24] and [25]). First integrals for both the non–reversible case and the resonant one may be constructed using the methods of normal form introduced by Poincaré and widely used by Birkhoff; these methods are indirect, in contrast with the direct and overall simpler approach used here. It may be interesting to remark that an attempt made in [31] to justify the method of Cherry has led to a connection with the algorithm for giving the Hamiltonian a normal form via the Lie transform methods. These methods will be the subject of chapter 6.


The formal existence of first integrals is stated by the following

Proposition 5.11: Let $H(x, y)$ be as in (5.33) where $H_0$ has the form (5.32), and assume:

i. non resonance: for $k \in \mathbb{Z}^n$ one has $\langle k, \omega \rangle = 0$ if and only if $k = 0$;

ii. reversibility: the Hamiltonian is an even function of the momenta, namely satisfies $H(x, -y) = H(x, y)$.

Then there exist $n$ independent formal integrals $\Phi^{(1)}, \ldots, \Phi^{(n)}$ of the form (5.35) which are even functions of the momenta. Moreover they form a complete involution system.

The proof is based on the following

Lemma 5.12: The Poisson bracket between even and odd functions of the momenta obeys the rule

$$\begin{array}{c|cc} \{\cdot, \cdot\} & + & - \\ \hline + & - & + \\ - & + & - \end{array}$$

i.e., the Poisson bracket between functions of the same parity is odd; the Poisson bracket between functions of different parity is even.

Proof. The product of even and odd functions clearly obeys the opposite rule, i.e., the product of functions of the same parity is even, and the product of functions of different parity is odd. For we have

$$\begin{aligned}
(f^{+} \cdot g^{+})(x, y) &= f^{+}(x, y) \cdot g^{+}(x, y) = f^{+}(x, -y) \cdot g^{+}(x, -y) = (f^{+} \cdot g^{+})(x, -y) , \\
(f^{-} \cdot g^{-})(x, y) &= f^{-}(x, y) \cdot g^{-}(x, y) = \bigl(-f^{-}(x, -y)\bigr) \cdot \bigl(-g^{-}(x, -y)\bigr) = (f^{-} \cdot g^{-})(x, -y) , \\
(f^{+} \cdot g^{-})(x, y) &= f^{+}(x, y) \cdot g^{-}(x, y) = \bigl(f^{+}(x, -y)\bigr) \cdot \bigl(-g^{-}(x, -y)\bigr) = -(f^{+} \cdot g^{-})(x, -y) .
\end{aligned}$$

On the other hand, the derivative with respect to one of the coordinates $x$ does not affect the parity, while the derivative with respect to one of the momenta changes it into the opposite one. Coming to the Poisson bracket, it is enough to take into account that it is defined as a sum of products of derivatives, each product containing exactly one derivative with respect to a momentum, and apply the rules above for derivatives and products. Q.E.D.
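The parity rule of lemma 5.12 is easy to test on explicit polynomials. The sketch below is only an illustration: it stores polynomials in $(x_1, x_2, y_1, y_2)$ as dictionaries from exponent tuples to integer coefficients, and assumes the bracket $\{f,g\} = \sum_l (\partial_{x_l} f\, \partial_{y_l} g - \partial_{y_l} f\, \partial_{x_l} g)$. The sample polynomials and helper names are hypothetical.

```python
from itertools import product

def diff(p, i):
    """Partial derivative w.r.t. variable i of a polynomial {exponents: coeff}."""
    out = {}
    for m, c in p.items():
        if m[i] > 0:
            key = m[:i] + (m[i] - 1,) + m[i + 1:]
            out[key] = out.get(key, 0) + c * m[i]
    return out

def mul(p, q):
    """Product of two polynomials."""
    out = {}
    for (m1, c1), (m2, c2) in product(p.items(), q.items()):
        key = tuple(a + b for a, b in zip(m1, m2))
        out[key] = out.get(key, 0) + c1 * c2
    return out

def poisson(f, g, n):
    """{f, g} with variables ordered (x_1..x_n, y_1..y_n) (assumed convention)."""
    out = {}
    for l in range(n):
        for sign, a, b in ((1, diff(f, l), diff(g, n + l)),
                           (-1, diff(f, n + l), diff(g, l))):
            for m, c in mul(a, b).items():
                out[m] = out.get(m, 0) + sign * c
    return {m: c for m, c in out.items() if c != 0}

def parity(p, n):
    """+1 if even in the momenta, -1 if odd, None if mixed or zero."""
    degrees = {sum(m[n:]) % 2 for m in p}
    if degrees == {0}:
        return 1
    if degrees == {1}:
        return -1
    return None

n = 2
f_even = {(2, 1, 0, 0): 1, (0, 1, 2, 0): 1, (0, 0, 1, 1): -3}  # x1^2 x2 + x2 y1^2 - 3 y1 y2
g_even = {(1, 0, 0, 2): 1, (0, 3, 0, 0): 1}                    # x1 y2^2 + x2^3
h_odd = {(1, 0, 1, 0): 1, (0, 0, 0, 3): 1}                     # x1 y1 + y2^3
k_odd = {(0, 1, 0, 1): 1}                                      # x2 y2

print(parity(poisson(f_even, g_even, n), n))  # same parity
print(parity(poisson(h_odd, k_odd, n), n))    # same parity
print(parity(poisson(f_even, h_odd, n), n))   # different parity
```

The first two brackets come out odd and the third one even, in agreement with the table of the lemma.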

Proof of proposition 5.11. The proof is based on the fact that the non resonance condition implies that every function $f \in N$ must be even in the momenta. For, it can depend only on the products $\xi_1\eta_1, \ldots, \xi_n\eta_n$, namely, by (5.37), on the action variables $I_1 = i\xi_1\eta_1, \ldots, I_n = i\xi_n\eta_n$, which in real coordinates are written $I_1 = \frac{x_1^2 + y_1^2}{2}, \ldots, I_n = \frac{x_n^2 + y_n^2}{2}$ and are clearly even functions.

Let us now proceed by induction and prove that a first integral can be constructed and must be an even function. Since the first term $\Phi_0$ must satisfy $\{H_0, \Phi_0\} = 0$ we have


$\Phi_0 \in N$, hence it must be an even function. Suppose that $\Phi_s$ has been determined for $0 \le s \le r$ as an even function of the momenta, which is true for $r = 0$. Then the r.h.s. of the equation for $\Phi_{r+1}$ is an odd function, because it is found as a sum of Poisson brackets between even functions. Therefore it has no component in $N$, and so $\Phi_{r+1}$ can also be determined and is an even function, because its Poisson bracket $\{H_0, \Phi_{r+1}\}$ must be an odd function; such a solution is unique up to an arbitrary term belonging to $N$, which adds again an even function. This completes the induction, and shows that $\Phi$ may be constructed, and must be an even function. Let $\Phi_0 = I_l$ be any of the actions and construct the corresponding first integrals. Thus we get $n$ first integrals $\Phi^{(1)}, \ldots, \Phi^{(n)}$ that are independent because the first terms $I_1, \ldots, I_n$ are independent. Let now $\Phi$ and $\Psi$ be two such first integrals; we prove that they are in involution. Let $\Upsilon = \{\Phi, \Psi\}$. By Poisson's theorem it is a first integral, hence it must be an even function, as we have proved. On the other hand $\Upsilon$ must be an odd function, because it is the Poisson bracket between even functions. Since it must be both odd and even, we conclude that it must be zero. Q.E.D.
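The first step of the induction can be carried out explicitly. The sketch below is a hedged illustration, not the author's program: it takes $H_0 = \sum_l \frac{\omega_l}{2}(x_l^2 + y_l^2)$ with the sample cubic term $H_1 = x_1^2 x_2 - \frac{1}{3}x_2^3$ and golden-mean frequencies (choices made here for the example), passes to the complex coordinates (5.37), and solves $\{H_0, \Phi_1\} = -\{H_1, \Phi_0\}$ for $\Phi_0 = I_1$ by dividing each coefficient by $i\langle k - j, \omega\rangle$. All helper names are hypothetical.

```python
from itertools import product

def add(p, q, s=1):
    """p + s*q for polynomials {exponents: coeff}; tiny coefficients are dropped."""
    out = dict(p)
    for m, c in q.items():
        out[m] = out.get(m, 0) + s * c
    return {m: c for m, c in out.items() if abs(c) > 1e-13}

def mul(p, q):
    out = {}
    for (m1, c1), (m2, c2) in product(p.items(), q.items()):
        m = tuple(a + b for a, b in zip(m1, m2))
        out[m] = out.get(m, 0) + c1 * c2
    return {m: c for m, c in out.items() if abs(c) > 1e-13}

def scale(p, s):
    return {m: s * c for m, c in p.items()}

def diff(p, i):
    out = {}
    for m, c in p.items():
        if m[i] > 0:
            key = m[:i] + (m[i] - 1,) + m[i + 1:]
            out[key] = out.get(key, 0) + c * m[i]
    return out

def poisson(f, g, n=2):
    """{f, g} in the variables (xi_1, xi_2, eta_1, eta_2) (assumed convention)."""
    out = {}
    for l in range(n):
        out = add(out, mul(diff(f, l), diff(g, n + l)))
        out = add(out, mul(diff(f, n + l), diff(g, l)), -1)
    return out

N = 2
OMEGA = (1.0, (5 ** 0.5 - 1) / 2)   # sample non resonant frequencies

def var(i):
    e = [0] * (2 * N)
    e[i] = 1
    return {tuple(e): 1.0}

r = 2 ** -0.5
xi = [var(l) for l in range(N)]
eta = [var(N + l) for l in range(N)]
# Eq. (5.37): x = (xi + i eta)/sqrt(2), y = i (xi - i eta)/sqrt(2) = (i xi + eta)/sqrt(2)
x = [add(scale(xi[l], r), scale(eta[l], 1j * r)) for l in range(N)]
y = [add(scale(xi[l], 1j * r), scale(eta[l], r)) for l in range(N)]

def action(l):
    return scale(add(mul(x[l], x[l]), mul(y[l], y[l])), 0.5)

h0 = add(scale(action(0), OMEGA[0]), scale(action(1), OMEGA[1]))
h1 = add(mul(mul(x[0], x[0]), x[1]), scale(mul(mul(x[1], x[1]), x[1]), -1 / 3))
phi0 = action(0)

rhs = scale(poisson(h1, phi0), -1)
phi1 = {}
for m, c in rhs.items():
    j, k = m[:N], m[N:]
    lam = 1j * sum((kl - jl) * wl for jl, kl, wl in zip(j, k, OMEGA))
    assert abs(lam) > 1e-9, "r.h.s. has a component in the null space"
    phi1[m] = c / lam

# The degree-3 part of {H, Phi} must vanish.
residual = add(poisson(h0, phi1), poisson(h1, phi0))
print(len(phi1), residual)
```

At this order no divisor vanishes, because a cubic monomial can never have $j = k$; the consistency problem only shows up at even degrees, where the parity argument of the proof takes care of it.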

We should keep in mind that the proposition just proved is a formal one, in the sense that all the construction is performed by simply using algebra, regardless of the convergence of the series so generated. In the same spirit, one could apply the method of Liouville and Arnold to build the action–angle variables.¹¹ Hence the system is formally integrable.

Having settled the formal aspect, one should discuss the convergence properties of the series so generated. Indeed the denominators $\langle k, \omega \rangle$, although non vanishing, are not bounded from below. On the other hand, there are known examples of series involving small denominators which are convergent. An essentially negative answer to the problem of convergence has been given by Siegel in [93]. However, it must be emphasized that the problem of convergence has puzzled the best mathematicians for a couple of centuries. In view of the complexity of the problem, let us make a detour by looking at numerical results.

5.4 Numerical exploration

The present section aims at giving a glimpse on the dynamics around a non linear equilibrium using numerical methods in order to calculate the orbits. Such an approach is motivated by history. The existence of chaotic orbits, that will be illustrated in the present section, had been predicted by Poincaré [86][87]. However, the phenomenon remained mostly unknown for more than 70 years.

At the dawn of numerical simulations of dynamics with the help of computers, between 1955 and 1960, the method of Poincaré section was used in order to visualize the dynamics of systems of two harmonic oscillators with a cubic nonlinearity, a simple model that may describe the dynamics of stars in a Galaxy. Many such studies

11 This is discussed by Whittaker in [97], ch. XVI, §199. Whittaker also includes in §198 a short discussion concerning the problem of the convergence of the formal first integrals.


Figure 5.2. The method of Poincaré section for an orbit in the three dimensional space. Taking a surface $\Sigma$ transversal to the flow and an initial point $P_0$ on that surface the orbit is followed until it crosses again the surface in the same direction at $P_1$, and then again at $P_2$, $P_3$ and so on. An orbit is thus represented by the sequence of the successive intersections.

have been performed by Contopoulos, who also had the idea of calculating the so called third integral (to be added to the two classical ones of the energy and the angular momentum) by a series expansion via the method discussed in sect. 5.3. The series had to be truncated at low order, of course, due to the limited power of computers available at that time.

The existence of chaos has been rediscovered, so to say, thanks to the work of Contopoulos (see, e.g., [23]), followed by the works of Hénon and Heiles [47] in 1964 and of Gustavson [43] in 1966, just to quote the first ones. It should be emphasized that the discovery came as an unexpected surprise, and marked the opening of a wide field of research. After 1970 the interest in numerical simulations of dynamics has risen tumultuously, perhaps chaotically.¹² It goes without saying that a long interval of more than 70 years between the discovery of Poincaré and the rediscovery and widespread diffusion of knowledge about chaotic phenomena should not be underrated. It is, I think, a clear symptom of the strong difficulty of imagining such complex phenomena on the mere basis of analytical investigations. This section may help the reader's imagination.

5.4.1 The Poincaré section

A first class of works is a numerical implementation of the method known as Poincaré section, illustrated in figure 5.2. Here we are concerned with a numerical implementation of that method for a class of Hamiltonians that has been widely investigated by

12 A comprehensive exposition of the history from the point of view of one of the protagonists, namely Contopoulos, can be found in [26].


Contopoulos. Let us see how it works making reference to the Hamiltonian

(5.39)   $$H(x, y) = \frac{\omega_1}{2}(y_1^2 + x_1^2) + \frac{\omega_2}{2}(y_2^2 + x_2^2) + x_1^2 x_2 - \frac{1}{3} x_2^3 .$$

Restricting attention for the moment to the case of positive frequencies $\omega_1, \omega_2$ the Hamiltonian has a minimum at the origin. Hence for small positive $E$ the manifold of constant energy has a connected component topologically similar to a sphere. The potential energy

$$V(x_1, x_2) = \frac{\omega_1}{2} x_1^2 + \frac{\omega_2}{2} x_2^2 + x_1^2 x_2 - \frac{1}{3} x_2^3$$

has three critical saddle points at

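$$P_1 = \left( -\frac{\sqrt{\omega_1^2 + 2\omega_1\omega_2}}{2} , -\frac{\omega_1}{2} \right) , \quad P_2 = \left( \frac{\sqrt{\omega_1^2 + 2\omega_1\omega_2}}{2} , -\frac{\omega_1}{2} \right) , \quad P_3 = (0, \omega_2) .$$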

The corresponding energies are

$$E_1 = E_2 = \frac{\omega_1^3}{24} + \frac{\omega_1^2\omega_2}{8} , \quad E_3 = \frac{\omega_2^3}{6} .$$

The smallest of these values will be called escape energy and denoted by $E_f$, for if $E$ exceeds that value then the energy surface ceases to be compact, and the orbits may escape to infinity.
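The critical points and their energies are easily checked numerically. The following sketch evaluates the gradient of $V$ at $P_1$, $P_2$, $P_3$ and computes the escape energy; the function names are illustrative, and positive frequencies are assumed, as in the text.

```python
import math

def potential(x1, x2, w1, w2):
    """Potential energy V of the Hamiltonian (5.39)."""
    return 0.5 * w1 * x1 ** 2 + 0.5 * w2 * x2 ** 2 + x1 ** 2 * x2 - x2 ** 3 / 3.0

def grad_potential(x1, x2, w1, w2):
    """Gradient (dV/dx1, dV/dx2)."""
    return (w1 * x1 + 2.0 * x1 * x2, w2 * x2 + x1 ** 2 - x2 ** 2)

def saddles(w1, w2):
    """The critical points P1, P2, P3 (valid for positive frequencies)."""
    a = math.sqrt(w1 ** 2 + 2.0 * w1 * w2) / 2.0
    return [(-a, -w1 / 2.0), (a, -w1 / 2.0), (0.0, w2)]

def escape_energy(w1, w2):
    """Smallest value of V at the saddle points."""
    return min(potential(x1, x2, w1, w2) for x1, x2 in saddles(w1, w2))

golden = (math.sqrt(5.0) - 1.0) / 2.0
for p in saddles(1.0, golden):
    print(p, grad_potential(*p, 1.0, golden), potential(*p, 1.0, golden))
print(escape_energy(1.0, golden), escape_energy(1.0, 1.0))
```

For $\omega_1 = 1$ and $\omega_2 = (\sqrt{5}-1)/2$ the smallest value is $E_3 = \omega_2^3/6$, while for $\omega_1 = \omega_2 = 1$ the three saddles share the energy $1/6$.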

Choosing as surface of section the plane $x_1 = 0$ we are led to consider the compact component close to the origin of the two dimensional manifold

(5.40)   $$H(0, x_2, y_1, y_2) = \frac{\omega_1}{2} y_1^2 + \frac{\omega_2}{2}(y_2^2 + x_2^2) - \frac{1}{3} x_2^3 = E .$$

On that manifold the vector field writes

$$J\, \mathrm{grad} H = \left( \omega_1 y_1 , \; \omega_2 y_2 , \; 0 , \; -\omega_2 x_2 + x_2^2 \right) ,$$

and is generically transversal to the section plane $x_1 = 0$. The projection of the compact component of the energy surface on the section plane is the region

$$\frac{\omega_2}{2}(y_2^2 + x_2^2) - \frac{1}{3} x_2^3 \le E ,$$

which in turn is compact for 0 < E < Ef .

The Poincaré section is represented as follows.

(i) Pick a fixed value of the energy $E$; it is convenient to choose a value $0 < E < E_f$, but calculating orbits for higher energy is not forbidden: just take into account a possible escape to infinity (that is, an overflow during the calculation).

(ii) Choose an initial point $x_2, y_2$ inside the region (5.40).

(iii) Set $x_1 = 0$ and calculate $y_1$ as a root of equation (5.40); choose, e.g., the positive sign.


(iv) Integrate numerically the orbit until it crosses again the section plane $x_1 = 0$ with positive $y_1$, and mark the coordinates $x_2, y_2$ of the section point on the graph.¹³

(v) Repeat the step (iv) as many times as are needed in order to produce a satisfactory figure.

(vi) Repeat the steps (ii)–(v) for all wanted orbits.
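Steps (i)–(v) for a single orbit can be sketched as follows. This is a minimal illustration under stated assumptions, not the program used for the figures: it uses a leap–frog (kick–drift–kick) step, which applies because (5.39) is the sum of a function of $y$ and a function of $x$, and locates the section point by linear interpolation between the two computed points that lie on opposite sides of the plane. The step size `h`, the escape threshold and the function names are arbitrary choices, and $\omega_1 > 0$ is assumed.

```python
import math

def grad_v(x1, x2, w1, w2):
    """Gradient of the potential of (5.39)."""
    return w1 * x1 + 2.0 * x1 * x2, w2 * x2 + x1 * x1 - x2 * x2

def leapfrog_step(state, h, w1, w2):
    """One kick-drift-kick step for H = T(y) + V(x), T = (w1 y1^2 + w2 y2^2)/2."""
    x1, x2, y1, y2 = state
    g1, g2 = grad_v(x1, x2, w1, w2)
    y1 -= 0.5 * h * g1
    y2 -= 0.5 * h * g2
    x1 += h * w1 * y1
    x2 += h * w2 * y2
    g1, g2 = grad_v(x1, x2, w1, w2)
    y1 -= 0.5 * h * g1
    y2 -= 0.5 * h * g2
    return x1, x2, y1, y2

def poincare_section(energy, x2, y2, w1, w2, n_points, h=0.01,
                     max_steps=2_000_000, escape_radius=1.0e3):
    """Return up to n_points section points (x2, y2) on the plane x1 = 0."""
    # Step (iii): x1 = 0 and the positive root y1 of eq. (5.40).
    y1_sq = (2.0 / w1) * (energy - 0.5 * w2 * (y2 * y2 + x2 * x2) + x2 ** 3 / 3.0)
    if y1_sq < 0.0:
        raise ValueError("initial point lies outside the admissible region")
    state = (0.0, x2, math.sqrt(y1_sq), y2)
    points = []
    for _ in range(max_steps):
        new = leapfrog_step(state, h, w1, w2)
        if max(abs(v) for v in new) > escape_radius:
            break  # the orbit is escaping to infinity
        # Step (iv): x1 changes from negative to positive, i.e. y1 > 0 for w1 > 0.
        if state[0] < 0.0 <= new[0]:
            t = -state[0] / (new[0] - state[0])   # linear interpolation weight
            points.append((state[1] + t * (new[1] - state[1]),
                           state[3] + t * (new[3] - state[3])))
            if len(points) >= n_points:
                break
        state = new
    return points

golden = (math.sqrt(5.0) - 1.0) / 2.0
pts = poincare_section(0.001, 0.02, 0.0, 1.0, golden, n_points=10)
for p in pts:
    print(p)
```

Step (vi) amounts to calling `poincare_section` for several initial points at the same energy and plotting all the returned points together.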

5.4.2 A non resonant case

The portrait of the Poincaré section for a case with non resonant frequencies is represented in figure 5.3. The frequencies are

$$\omega_1 = 1 , \quad \omega_2 = \frac{\sqrt{5} - 1}{2} \simeq 0.6180339887498948 ,$$

so that $\omega_2$ is the golden section, in some sense the most irrational number.

The amount of the perturbation due to non linear terms is parameterized by the energy. The values of the energies in the different panels are chosen so as to illustrate the considerable change in the behaviour of the orbits as the energy is increased from a value close to zero up to the critical value. The different panels correspond to the following energies:

a : E = 0.001 , b : E = 0.024 ,

c : E = 0.025 , d : E = 0.03 ,

e : E = 0.034 , f : E = 0.039344 ,

the last one being very close to the escape energy.

In the purely harmonic case (removing all non quadratic terms from the Hamiltonian) the portrait of the Poincaré section would be very simple. The system is integrable, since the actions $I_1 = \frac{x_1^2 + y_1^2}{2}$ and $I_2 = \frac{x_2^2 + y_2^2}{2}$ are first integrals; hence the orbits lie on invariant tori parameterized by $I_1, I_2$. For fixed energy, the intersection of the torus with the section plane $x_1 = 0$ projects on the plane $x_2, y_2$ as a curve $I_2 = \mathrm{const}$, namely a circle which degenerates into a point for $I_2 = 0$. The latter condition corresponds to a fixed point in the plane $x_2, y_2$ and to a periodic orbit which is a normal mode with all the energy on the oscillator with frequency 1. For $I_2 > 0$ the section points distribute over a circle, and tend to be uniformly distributed in view of the strong non resonance, so that when many points are represented one sees just a circle.

13 A couple of remarks are in order here. First, it is better to use a symplectic integration algorithm in order to avoid the possible energy drift induced by non symplectic methods. E.g., the figures in the present notes have been calculated using the simplest symplectic algorithm, known as leap–frog method. Second, the intersection point is found when two successive calculated points of the orbit lie on different sides of the section plane. This gives a segment, not a point. Choosing one of the extremes makes the figure rather fuzzy. It is better to calculate the section point by interpolation on the segment; in most cases this is enough in order to produce an aesthetically acceptable figure.


Figure 5.3. Poincaré section of orbits on different energy surfaces for the Hamiltonian (5.39), panels a to f (see text).


If the non linear system is integrable, then the orbits are expected to lie on invariant tori, and the points should be distributed over a curve which is the intersection of a two dimensional invariant torus with the plane of section. We may also expect that the curve is close to the circle that represents the section for the quadratic approximation of the Hamiltonian, because the first integrals are perturbations of the harmonic actions. Conversely, if there are no first integrals except the energy (as claimed by the theorem of Poincaré) the section points are not expected to lie on a curve, and could even be distributed at random.

Drawing the numerically calculated Poincaré section for different energies one observes the following facts.

(a) $E = 0.001$. For energy close to zero the portrait agrees with our expectations. However, the reader should remark that the periodic orbit is displaced, and the points distribute over curves close to circles that appear to be invariant, but have different centers. This is due to a deformation of the orbits. Moreover, the points distribute over the curves with different densities, due to the change of frequency induced by the perturbation. The overall impression is that invariant tori do persist, and the system is at least very close to integrable.

(b) $E = 0.024$. The curves are considerably distorted, but the successive points still appear to lie on invariant curves. We could imagine a strong deformation of the tori of a system which is still very close to integrable.

(c) $E = 0.025$. The three–loop figure is generated by a periodic orbit (period 3 on the map, ratio 2/3 between the frequencies) represented by the three isolated points at the center of the loops. A second three–periodic orbit is located at the three intersections of the loops. The first orbit is stable, and is surrounded by non periodic orbits that seem to lie on a twisted torus. The second orbit is unstable, and there are separatrices emanating from it, similar to what happens to the unstable equilibrium of the pendulum. The portrait starts to be quite different from that of the harmonic case, but still the behaviour of the orbits looks quite regular.

(d) $E = 0.03$. More periodic orbits in stable/unstable pairs are generated close to the limiting curve of the compact domain.

(e) $E = 0.034$. More and more periodic orbits are generated. But the new fact is that close to one of the separatrices there is a spot generated by an orbit that seems to exhibit a chaotic, random behaviour, for the points do not lie on a regular curve.

(f) $E = 0.039344$. The region invaded by chaotic orbits is considerably enlarged.

A further increase of $E$ causes the energy surface to open, and all orbits in the chaotic region go over the saddle point of the potential energy, thus escaping quite rapidly to infinity. However, some orbits close to the stable periodic ones may remain bounded.

5.4.3 A resonant case: the model of Hénon and Heiles

The model of Hénon and Heiles is again the Hamiltonian (5.39) with frequencies $\omega_1 = \omega_2 = 1$. The frequencies are in strong resonance (i.e., a resonance of very low order) which is expected to have a remarkable impact on the dynamics. Proposition 5.11 can


Figure 5.4. Poincaré section on different energy surfaces for the model of Hénon and Heiles, panels a to d (see text).

not be applied in a straightforward manner, which does not mean that first integrals can not be constructed: it means only that the construction is not so straightforward as in the non resonant case.¹⁴

With a moment's thought the reader will realize that the Poincaré section for

14 There are two ways out, as mentioned in note 10. The first one is to apply the method suggested by Cherry, namely adding null space terms at every step so as to remove the unwanted terms at higher orders. This has been done by Contopoulos [24]. The second way consists in applying an indirect method that goes through the construction of a normal form of the Hamiltonian. This has been first done by Gustavson [43].


the harmonic case (no cubic or higher order terms) is a little puzzling. With equal frequencies all points in the section plane are fixed points for the map generated by the Poincaré section. This is precisely because all orbits are periodic with a frequency ratio 1 : 1. More generally, if the frequencies are resonant then all orbits are still periodic, and the Poincaré map for every orbit contains a finite number of points which repeat periodically. It goes without saying that such a singular portrait will be broken by non linear terms, in general.

The case 1 : 1 of Hénon and Heiles is rather singular. The Poincaré section for different energies is represented in figure 5.4. The values of energy are the same as in the original paper [47], namely

a : $E = \frac{1}{100}$ , b : $E = \frac{1}{12}$ ,

c : $E = \frac{1}{8}$ , d : $E = \frac{1}{6}$ ,

the last one being the escape energy. The portraits of Poincaré sections in this case appear to be quite different from the ones of the non resonant case. The perplexing thing, at first sight, is the complete lack of correspondence with the harmonic case. Here are a few comments.

(a) $E = \frac{1}{100}$. The periodic orbit close to the center has been made unstable by the resonance. Most of the energy surface is filled in by orbits that rotate around two periodic orbits (period 1) represented by two points at the center of the left and right families of closed curves. This is the effect of the resonance. However, the dynamics still resembles that of an integrable system.

(b) $E = \frac{1}{12}$. At first sight, a deformation of the figure in panel a, but one can observe a spot close to unstable periodic orbits that suggests some chaos. The figure suggests that chaotic orbits show up close to unstable orbits: this was indeed the discovery of Poincaré.

(c) $E = \frac{1}{8}$. The chaotic region covers a considerable part of the energy surface: all the points scattered around belong to the same orbit. On the other hand, inside the chaotic regions there are orbits that appear to lie on invariant surfaces (curves in the section plane). An intriguing image suggested by Hénon describes this phenomenon by saying that there are islands of order inside a chaotic sea. The latter expression has become widespread.

(d) $E = \frac{1}{6}$. Only a few very small islands of order do survive.

A remark made by Hénon and Heiles is that the portion of area covered by chaotic orbits increases very fast with the size of the perturbation (i.e., with energy). On the other hand, this may be observed also in the figures for the non resonant case.

Again, higher values of $E$ cause the energy surface to open, and most orbits escape quite rapidly to infinity, with the exception of a few islands that do survive for some range of $E$.

5.4.4 A puzzling question concerning stability

The third example is provided again by Hamiltonian (5.39) with non resonant frequencies, but with opposite signs: we set $\omega_1 = 1$ and $\omega_2 = \frac{1 - \sqrt{5}}{2}$. This is a particularly


Figure 5.5. Poincaré section for the Hamiltonian system (5.39) with $\omega_1 = 1$ and $\omega_2 = (1 - \sqrt{5})/2$ on the energy surface $H(x, y) = 0.025$, panels a to d (see text).

puzzling case if one raises the problem of stability of the equilibrium.

In the harmonic approximation the equilibrium point of the Hamiltonian (5.39) is a product of two centers; therefore it is obviously stable since the harmonic actions

$$\Phi_1 = \frac{x_1^2 + y_1^2}{2} , \quad \Phi_2 = \frac{x_2^2 + y_2^2}{2}$$

are first integrals in involution, and the orbits lie on invariant tori. When we add cubic terms the equilibrium remains stable provided the frequencies have the same sign, for in that case the Hamiltonian itself may be used as a Lyapounov function, since it is a first integral. This is true for the models of the previous sections 5.4.2 (the non


resonant model) and 5.4.3 (the resonant case of Hénon and Heiles).

In the present case the Hamiltonian can not be used as a Lyapounov function.¹⁵

The non linear terms introduce a coupling between the actions, so that they can both increase while preserving the constant value of the total energy. Hence the stability of the equilibrium is questionable, and it is natural to ask whether the formal first integrals constructed as in sect. 5.3.2, proposition 5.11 may help in investigating the stability.
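The role of the signs can be made explicit on the quadratic part alone. The following short computation uses only $H_0 = \omega_1 I_1 + \omega_2 I_2$, which follows from (5.39) with $I_l = \frac{1}{2}(x_l^2 + y_l^2)$; it is meant as an illustration of the argument, not as a stability proof.

```latex
% Frequencies of the same sign: the level set H_0 = E confines both actions.
\omega_1, \omega_2 > 0 : \qquad
\omega_1 I_1 + \omega_2 I_2 = E
\;\Longrightarrow\;
0 \le I_1 \le \frac{E}{\omega_1} , \quad 0 \le I_2 \le \frac{E}{\omega_2} .

% Frequencies of opposite sign, as in the present example:
\omega_1 = 1 , \; \omega_2 = \frac{1 - \sqrt{5}}{2} : \qquad
I_1 - \frac{\sqrt{5} - 1}{2}\, I_2 = E
\;\Longrightarrow\;
I_1 = E + \frac{\sqrt{5} - 1}{2}\, I_2 ,
```

so that in the second case the level set is an unbounded half line in the plane $I_1, I_2$: conservation of energy alone does not prevent both actions from growing together.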

In order to do a first analysis let us draw again the portrait of the Poincaré section. This is represented in fig. 5.5 for the energy surface $E = 0.025$. Here are a few comments.

(a) One sees a central region that appears to be stable and three separated islands that are in fact neighbourhoods of a stable periodic orbit in resonance 2 : 3 (period 3 in the map). The region in between is full of orbits that escape to infinity.

(b) An enlargement of the island on the left that brings into evidence the complexity of the dynamics close to the border of the island. One observes in particular a proliferation of periodic orbits that create further islands in the region close to the border of the big island.

(c) A further enlargement of a small rectangle of panel b, located inside the border of the islands, in the upper right part. The enlarged zone may be identified by looking at the coordinates on the axes. One sees further islands that could hardly be represented in panel b. Also, a little spot close to an unstable point is observed.

(d) An enlargement of the spot close to the unstable point of panel c that shows a small chaotic region almost invisible in the previous figures. This points out the existence of a complicated structure of small islands inside a chaotic layer.

The overall question now is: does the proliferation of islands and of chaotic regions show up at smaller and smaller scales? The answer is: yes, it does. This is a consequence of the last geometric theorem of Poincaré [89] [16], that will not be discussed in the present notes. Roughly, what actually happens is that stable/unstable periodic orbits appear in pairs. Stable periodic orbits are surrounded by islands circumscribed by separatrices emanating from the corresponding unstable orbits. A chaotic behaviour arises close to the separatrices that can be revealed via a suitable enlargement. The islands around a stable periodic orbit exhibit essentially the same structure on a different scale. This emerges with evidence in the successive enlargements of figure 5.5. What can not be seen in the figure, but only imagined, is that this process creates a structure of boxes into boxes that repeats indefinitely on a smaller and smaller scale.

15 A remarkable example is represented by the triangular solutions of Lagrange for the circular restricted problem of three bodies. The frequencies of the harmonic approximation have different signs, and this is indeed the reason why the stability of these points is still an open problem. A similar situation occurs for the approximation of Lagrange and Laplace for the precession of the perihelia and the nodes of the planetary orbits.


Figure 5.6. Comparison between the portrait of the Poincaré section and the level lines of a formal first integral. The first figure is the Poincaré section on the energy surface $H = 0.0025$. The remaining figures (see also the figure on next page) are the level lines of the formal first integral truncated at orders 5, 9, 12, 24, 33, 38, 45, 60 and 70. The levels drawn correspond to the values of the truncated integral at the initial points of the orbits represented in the Poincaré section.

5.4.5 Use of the formal first integrals

We are now going to compare the Poincare section obtained by numerical integrationwith the curves constructed using the first integrals of sect. 5.3.2, proposition 5.11.

Figure 5.6. (continued).

The series may be explicitly constructed up to high order with the help of a computer. Indeed, all functions involved in the calculation are homogeneous polynomials, which may be easily represented in machine format by just storing the coefficients in an appropriate order. Moreover, the whole process of construction of the formal first integrals reduces to simple algebraic manipulations of the coefficients of the polynomials.¹⁶
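The kind of coefficient manipulation described above can be illustrated with a minimal sketch. It is not the representation used by the author (for that see the reference given in the footnote): here a polynomial is simply stored as a dictionary from exponent pairs to coefficients, and the names `poisson_bracket`, `action` and the frequencies in `omega` are hypothetical choices made only for this illustration.

```python
from collections import defaultdict

# A polynomial in (x_1..x_n, y_1..y_n) is stored as a dict mapping
# (j, k) -> coefficient, where j and k are tuples of the exponents of x and y.

def poisson_bracket(f, g, n):
    """Return {f, g} = sum_l (df/dx_l * dg/dy_l - df/dy_l * dg/dx_l)."""
    result = defaultdict(float)
    for (j, k), a in f.items():
        for (jp, kp), b in g.items():
            for l in range(n):
                weight = j[l] * kp[l] - jp[l] * k[l]
                if weight == 0:
                    continue
                # Product of the two monomials divided by x_l * y_l.
                new_j = tuple(j[i] + jp[i] - (1 if i == l else 0) for i in range(n))
                new_k = tuple(k[i] + kp[i] - (1 if i == l else 0) for i in range(n))
                result[(new_j, new_k)] += a * b * weight
    # Drop monomials whose coefficients cancelled exactly.
    return {m: c for m, c in result.items() if c != 0}

def action(l, n):
    """Harmonic action I_l = (x_l^2 + y_l^2) / 2 as a coefficient dict."""
    two = tuple(2 if i == l else 0 for i in range(n))
    zero = (0,) * n
    return {(two, zero): 0.5, (zero, two): 0.5}

n = 2
omega = (1.0, 0.5)  # hypothetical frequencies, for illustration only
# Harmonic Hamiltonian H0 = omega_1 * I_1 + omega_2 * I_2.
H0 = {m: omega[l] * c for l in range(n) for m, c in action(l, n).items()}
# The bracket of an action with H0 vanishes: all coefficients cancel.
print(poisson_bracket(action(0, n), H0, n))
```

A dictionary keeps only the nonzero coefficients, which is convenient for a short example; storing the coefficients of each homogeneous degree in a fixed order, as the text suggests, is the more efficient choice for expansions of high order.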

Having constructed the expansion of a first integral $\Phi(x_1, x_2, y_1, y_2)$ up to a given order we proceed as follows. We fix the energy and the initial point $(x_2^{(0)}, y_2^{(0)})$ of an orbit. Then we calculate on the one hand the Poincaré section for that orbit and, on the other hand, the level lines implicitly defined by

$$ \Phi(0, x_2, y_1, y_2)\big|_{y_1=\psi(x_2,y_2)} = \Phi(0, x_2^{(0)}, y_1, y_2^{(0)})\big|_{y_1=\psi(x_2^{(0)},y_2^{(0)})} , $$

namely for the value of $\Phi$ at the initial point. In case of convergence we expect that the points obtained by Poincaré section lie on the curves constructed via the first integral. Hence, a direct comparison may be made by inspection of the figures.

The results are presented in fig. 5.6. The energy value has been set to H(x, y) = 0.0025, and the first figure represents the Poincaré section for some orbits that surround the central periodic orbit. We disregard the orbits that surround the 2/3 resonance, which cannot be expected to be described by a power series expansion since they are separated from the central island by a chaotic region. With the purpose of identifying a possible convergence region, if any, the behaviour of the first integral is illustrated by drawing the level lines for different truncations of the series, from order 5 to order 70. One will notice that in a region around the central periodic orbit (a fixed point of the map) the curves in the Poincaré section and the level lines of the first integral are visually the same. In the external region the correspondence is quite poor, although a vague resemblance may be imagined in the figures for orders 9, 12 and 24. Successive truncations to higher orders give rise to a sort of progressive destruction of the inner curves. Such a behaviour is reminiscent of that of asymptotic series.

The figures suggest that there is a stable region around the equilibrium that can be effectively investigated using the formal integrals. But the trouble is that the process of progressive destruction of inner curves raises doubts on the convergence of the series, and so also on the effectiveness of the formal first integrals. The rest of the chapter provides some answers to the latter question, using quantitative analytic tools.

5.5 Quantitative estimates

In order to investigate the problem of convergence in its simplest form let us consider the case of a cubic perturbation, i.e., a canonical system with Hamiltonian

$$ H(x, y) = H_0(x, y) + H_1(x, y) , \tag{5.41} $$

where $H_1$ is a homogeneous polynomial of degree 3. For example, the Hamiltonian (5.39) that we have used in the numerical investigation has precisely this form. The fact that the perturbation is just a polynomial, and not a series as in (5.33), is not really relevant: dealing with the full series introduces longer formulæ, but not really new information.¹⁷ Let us also assume that the Hamiltonian is an even function of the momenta (which just guarantees the consistency of the formal construction). Moreover, and more relevant, let us introduce a non increasing sequence $\{\alpha_s\}_{s\ge 1}$ of positive real numbers such that the nonresonance condition

$$ |\langle k, \omega\rangle| \ge \alpha_s \quad \text{for } k \in \mathbb{Z}^n ,\ 0 < |k| \le s+2 \tag{5.42} $$

is satisfied. We have $\alpha_s \to 0$ for $s \to \infty$, of course. Thus, we can perform the formal construction of the n first integrals

$$ \Phi(x, y) = I_l + \Phi_1 + \ldots , \quad I_l = \tfrac{1}{2}(x_l^2 + y_l^2) , \quad 1 \le l \le n $$

as described in sect. 5.3.2. In particular, the solution will be made unique by the condition $\Phi_s \in \mathcal{R}$ for $s \ge 1$.

16 A method of representation of polynomials together with a discussion concerning writing programs of algebraic manipulation adapted to perturbation expansions may be found, e.g., in [40].

Equation (5.36) for the Hamiltonian (5.41) takes now the simpler form

$$ L_{H_0}\Phi_s = \Psi_s := \begin{cases} -\{H_1, I_l\} & \text{for } s = 1 , \\ -\{H_1, \Phi^{(l)}_{s-1}\} & \text{for } s > 1 . \end{cases} \tag{5.43} $$

5.5.1 Norms

In order to introduce quantitative estimates we must be able to evaluate the size of a function. To this end, let x, y be real or complex variables, and write a generic homogeneous polynomial f(x, y) of degree s as

$$ f(x, y) = \sum_{|j+k|=s} f_{j,k}\, x^j y^k , $$

with real or complex coefficients $f_{j,k}$. The polynomial norm of f is then defined as

$$ \|f\| = \sum_{j,k} |f_{j,k}| . \tag{5.44} $$

Considering a domain

$$ \Delta_\varrho = \bigl\{ (x, y) \in \mathbb{R}^{2n} : x_l^2 + y_l^2 \le \varrho^2 ,\ 1 \le l \le n \bigr\} , \tag{5.45} $$

namely the Cartesian product of n disks of radius $\varrho$ in the coordinate planes $x_l, y_l$, the size of the homogeneous polynomial f(x, y) is bounded in $\Delta_\varrho$ by

$$ |f(x, y)| \le \varrho^s \|f\| . $$
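The norm (5.44) and the bound on the polydisk can be checked numerically. The following is a minimal sketch, assuming the same dictionary representation of polynomials as coefficient tables indexed by exponent pairs; the helper names `poly_norm`, `evaluate` and `worst_ratio`, the sample polynomial and the radius 0.5 are all hypothetical choices made for illustration.

```python
import math
import random

def poly_norm(f):
    """Polynomial norm (5.44): sum of the absolute values of the coefficients."""
    return sum(abs(c) for c in f.values())

def evaluate(f, x, y):
    """Evaluate f = sum f_{j,k} x^j y^k at the real point (x, y)."""
    total = 0.0
    for (j, k), c in f.items():
        term = c
        for xl, yl, jl, kl in zip(x, y, j, k):
            term *= xl ** jl * yl ** kl
        total += term
    return total

def worst_ratio(f, s, rho, n, samples=2000, seed=0):
    """Largest sampled value of |f(x, y)| / (rho**s * ||f||) on the polydisk."""
    rng = random.Random(seed)
    bound = rho ** s * poly_norm(f)
    worst = 0.0
    for _ in range(samples):
        # Random point with x_l^2 + y_l^2 <= rho^2 in every coordinate plane.
        radii = [rho * math.sqrt(rng.random()) for _ in range(n)]
        angles = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
        x = [r * math.cos(t) for r, t in zip(radii, angles)]
        y = [r * math.sin(t) for r, t in zip(radii, angles)]
        worst = max(worst, abs(evaluate(f, x, y)) / bound)
    return worst

# Hypothetical cubic polynomial f = x1^2 * y2 - 3 * x1 * y1 * y2, with n = 2.
f = {((2, 0), (0, 1)): 1.0, ((1, 0), (1, 1)): -3.0}
print(poly_norm(f), worst_ratio(f, s=3, rho=0.5, n=2))
```

The sampled ratio never exceeds 1, as the bound requires; it is usually well below 1, because the estimate replaces every variable by its largest possible absolute value and ignores cancellations.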

17 The general and detailed treatment can be found in [33]. The simpler case discussed here, however, gives a full insight into the problem.


5.5.2 Technical estimates

The aim now is to translate the recursive set of equations (5.43) into a set of recursive estimates on the norms of the polynomials $\Phi^{(l)}_s$ and $\Psi^{(l)}_s$. This is done in three steps.

(i) Transform the Hamiltonian to complex variables ξ, η, defined by the canonical transformation (5.37); this will allow us to exploit the diagonal form of the linear operator $L_{H_0}$.
(ii) Construct the power expansion of a first integral Φ up to a given order r in complex variables by recursively solving (5.43). This involves two basic operations: a Poisson bracket to determine $\Psi_s$, and the inversion of the linear operator $L_{H_0}\cdot = \{H_0, \cdot\}$.
(iii) Transform back to real variables by the inverse of the transformation (5.37).

We want to know how the norms are propagated through these operations. This is given by the following technical estimates.

Lemma 5.13: The transformation (5.37) to complex variables changes the norm of a homogeneous polynomial of degree s at most by a factor $2^{s/2}$; the same holds true for the inverse.

Proof. The proof is trivial: just observe that every real variable x, y is replaced by a combination of the two complex variables ξ, η, with a division by $\sqrt{2}$. Therefore every variable is affected by a factor $\sqrt{2}$ in the norm of the transformed function. The same applies to the inverse transformation. Q.E.D.

The estimate (5.47) is a consequence of the diagonal form of $L_{H_0}$ in complex variables together with the definition of the sequence $\alpha_s$ in (5.42).

Lemma 5.14: The Poisson bracket between two homogeneous polynomials f and g of degree s and r respectively is estimated by

$$ \|\{f, g\}\| \le s r \|f\|\, \|g\| . \tag{5.46} $$

Proof. Write the explicit expression of the Poisson bracket as

$$ \{f, g\} = \sum_{j,k,j',k'} f_{jk}\, g_{j'k'}\, x^{j+j'} y^{k+k'} \sum_{l=1}^{n} \frac{j_l k'_l - j'_l k_l}{x_l y_l} . $$

Then use the definition of the norm to compute¹⁸

$$ \|\{f, g\}\| \le \sum_{j,k,j',k'} |f_{jk}|\, |g_{j'k'}| \sum_{l=1}^{n} (j_l k'_l + j'_l k_l) \le s r \Bigl(\sum_{j,k} |f_{jk}|\Bigr) \Bigl(\sum_{j',k'} |g_{j'k'}|\Bigr) = s r \|f\| \|g\| . $$

Q.E.D.

18 Use $\sum_{l=1}^{n} (j_l k'_l + j'_l k_l) \le s \sum_{l=1}^{n} (k'_l + j'_l) = s r$, the first inequality being due to $0 \le j_l \le s$ and $0 \le k_l \le s$.
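As a quick sanity check of the estimate (5.46), not taken from the text, consider one degree of freedom and the two quadratic monomials $f = x^2$ and $g = y^2$, so that $s = r = 2$ and $\|f\| = \|g\| = 1$:

$$ \{f, g\} = \frac{\partial f}{\partial x}\frac{\partial g}{\partial y} - \frac{\partial f}{\partial y}\frac{\partial g}{\partial x} = 2x \cdot 2y = 4xy , \qquad \|\{f, g\}\| = 4 = s r \|f\| \|g\| . $$

In this example the bound holds with equality, which suggests that the factor $sr$ cannot be improved in general.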

Lemma 5.15: The unique solution $\Phi_s \in \mathcal{R}$ of equation (5.43) is estimated in complex variables by

$$ \|\Phi^{(l)}_s\| \le \frac{1}{\alpha_s} \|\Psi^{(l)}_s\| , \tag{5.47} $$

with $\alpha_s$ satisfying (5.42).

Using the estimates above we readily obtain

Lemma 5.16: The unique solution $\Phi = I_l + \Phi_1 + \ldots$ of equations (5.36) with $\Phi_s \in \mathcal{R}$ satisfies

$$ \|\Phi_s\| \le a b^{s-1} \frac{(s+1)!}{\prod_{l=1}^{s} \alpha_l} \quad \text{for } s \ge 1 , \tag{5.48} $$

with some positive constants a and b and with the sequence $\alpha_1, \alpha_2, \ldots$ satisfying (5.42).

Proof. By definition of the norm we have $\|I_l\| = \|x_l^2 + y_l^2\|/2 = 1$. On the other hand $H_1$ is known, and so also $\|H_1\|$, which must be multiplied by a factor $2^{3/2}$ due to the transformation to complex variables. Taking into account that $\Phi_{s-1}$ has degree s + 1, translate the recurrent formulæ (5.43) for $\Phi_s$ into the recurrent formula for the norms

$$ \|\Psi_s\| \le 3 \cdot 2^{3/2} (s+1) \|H_1\|\, \|\Phi_{s-1}\| , \qquad \|\Phi_s\| \le \frac{\|\Psi_s\|}{\alpha_s} . $$

A recurrent application of the latter estimate and a further multiplication of $\|\Phi_s\|$ by $2^{s/2}$ due to the transformation back to real variables proves the claim, with constants a and b that could be explicitly evaluated (which is not necessary here). Q.E.D.
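The recurrence in the proof can be unrolled explicitly. The following is only a sketch of that step: the shorthand $c$ and the symbol $\Phi_0 := I_l$ are introduced here for convenience and do not appear in the text, and all norms are understood in complex variables.

$$ c := 3 \cdot 2^{3/2} \|H_1\| , \qquad \|\Phi_s\| \le \frac{c\,(s+1)}{\alpha_s} \|\Phi_{s-1}\| \;\Longrightarrow\; \|\Phi_s\| \le \|\Phi_0\| \prod_{m=1}^{s} \frac{c\,(m+1)}{\alpha_m} = \|\Phi_0\|\, c^{s}\, \frac{(s+1)!}{\prod_{m=1}^{s} \alpha_m} . $$

This has the form (5.48) with $a = c\,\|\Phi_0\|$ and $b = c$; the factor coming from the transformation back to real variables only modifies these two constants. The factorial comes from the growing degree of the polynomials, and the product of the $\alpha_m$ is the accumulation of small divisors.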

5.5.3 Truncated first integrals

The estimate (5.48) clearly does not allow us to prove the convergence of the expansions of the first integrals $\Phi^{(1)}, \ldots, \Phi^{(n)}$. This, of course, does not prove that they do not converge: it could be simply due to the fact that we are unable to find better estimates. Recall, however, that non-convergence of these first integrals as a generic fact has been proved by Siegel [93]. Nevertheless we may try to unveil the information hidden in the formal first integrals, despite the possible non-convergence.

Let us suppose that we have performed the procedure up to some arbitrary order r, and consider the truncated first integrals

$$ \Phi^{(l,r)} = I_l + \Phi^{(l)}_1 + \ldots + \Phi^{(l)}_r , \quad l = 1, \ldots, n , \tag{5.49} $$

whose time derivatives clearly are

$$ \dot\Phi^{(l,r)} = \{H_1, \Phi^{(l)}_r\} . \tag{5.50} $$

Figure 5.7. Representation of the time evolution of the harmonic action under the effect of deformation and of noise. (a) The deformation causes a quasi periodic, bounded oscillation of the action $I_l(t)$. (b) The noise adds further frequencies, and may cause a secular variation (drift), at most linear in time. The figure represents the worst case.

The purpose is to show that, even if we do not know the explicit form of the integrals $\Phi^{(l,r)}$, we can nevertheless obtain significant information from the technical estimates of the previous section. Suppose that, for a real system, we are only able to observe the values of the actions $I_1, \ldots, I_n$ over time. This is the typical situation; e.g., it is what we do when we compute the osculating elements of the orbit of a planet from the observed positions. The natural question here is how much these quantities, which are the approximate constants of our problem, can vary in time. Using the fact that the $\Phi^{(l,r)}$'s are, hopefully, better conserved than the $I_l$'s, we find the bound

|Il(t)− Il(0)| ≤ |Il(t)− Φ(l,r)(t)|+ |Φ(l,r)(t)− Φ(l,r)(0)|+ |Φ(l,r)(0)− Il(0)| .

In order to simplify the discussion, let us assume, just for a moment, that the $\Phi^{(l,r)}$'s are exact first integrals (this could, of course, be true for a very particular Hamiltonian), so that the central term in the formula above vanishes. The two remaining terms are estimated by noting that for any (x, y) in a polydisk $\Delta_\varrho$ defined by (5.45) we have

$$ \bigl|(\Phi^{(l,r)} - I_l)(x, y)\bigr| = \bigl|\Phi^{(l)}_1(x, y) + \ldots + \Phi^{(l)}_r(x, y)\bigr| < d_r \varrho^3 , $$

with a constant $d_r$ depending of course on r, and provided $\varrho$ is small enough. This expresses the fact that $\Phi^{(l,r)} - I_l$ is a polynomial starting with a term of degree 3.


Thus we get the bound

$$ |I_l(t) - I_l(0)| < 2 d_r \varrho^3 \tag{5.51} $$

for any t, provided one can guarantee that the orbit is confined for all times in $\Delta_\varrho$. A simple choice is to assume, e.g., $I_l(0) = \varrho^2/3$, and to ask $|I_l(t) - I_l(0)| < \varrho^2/6$. Thus, comparing the r.h.s. of the latter inequality and of (5.51), we conclude that the orbit is confined forever in the annulus $\varrho^2/6 < I_l(t) < \varrho^2/2$ provided $\varrho < 1/(12 d_r)$. Since we have assumed that the functions $\Phi^{(l,r)}$ are exact first integrals, the system is integrable: the orbits lie on invariant tori $\Phi^{(l,r)} = \text{const}$ instead of $I_l = \text{const}$, and the motion is quasiperiodic with frequencies depending on the torus. Therefore the variation in time of the harmonic actions $I_1, \ldots, I_n$ clearly appears to be due to a deformation of the invariant surfaces. This is illustrated in fig. 5.7–(a): the action $I_l(t)$ exhibits a quasi periodic oscillation and it is confined in the strip $I_{l,\min} \le I_l(t) \le I_{l,\max}$, whose width is estimated by (5.51). By the way, this corresponds to what we observe in panel a of figure 5.3, at the resolution allowed by the scale of the figure.

Let us now take into account the fact that the $\Phi^{(l,r)}$'s are not exact first integrals. By (5.50), the time derivative $\dot\Phi^{(l,r)}$ is a homogeneous polynomial of degree r + 3, so that for $(x, y) \in \Delta_\varrho$ we have

$$ \bigl|\dot\Phi^{(l,r)}(x, y)\bigr| < C_r \varrho^{r+3} , \tag{5.52} $$

with a constant $C_r$, depending of course on r. Thus, superimposed on the deformation, there may occur a secular variation¹⁹ in time of $I_l(t)$ due to the dynamical evolution of $\Phi^{(l,r)}$; note however that if $\varrho$ is sufficiently small then such an evolution is hopefully very slow with respect to the one due to the deformation, in view of the estimate (5.52). Following Nekhoroshev, let us refer to this effect as a noise that causes a slow drift.

19 The adjective secular has been introduced by Kepler in order to describe changes in the elements of planetary orbits that may be observed on a time scale of centuries.

The situation is illustrated in fig. 5.7–(b): the disturbing term $-\{H_1, \Phi^{(l)}_r\}$ introduces changes that we are unable to predict and that are superimposed on the quasi periodic motion, and moreover the lower and upper limits of the action may be subject to a slow change due to the drift. An a priori bound is given by the estimate (5.52), which guarantees that the width of the strip where $I_l(t)$ is confined increases at most linearly with t. The figure represents the worst case, in which the drift is actually linear. Taking into account the effect of the noise, the bound (5.51) must be changed to

$$ |I_l(t) - I_l(0)| < 2 d_r \varrho^3 + C_r \varrho^{r+3} t . \tag{5.53} $$

However, we still have to ensure that $I_l(t) < \tfrac{1}{2}\varrho^2$ (since all the estimates hold on $\Delta_\varrho$). This, of course, cannot be true for all times, but we can use the fact that the noise has a small effect in order to ensure that it holds for a quite large time. To do this, it is enough, for example, to ask $C_r \varrho^{r+3} t \le d_r \varrho^3$ (i.e., we allow $|\Phi^{(l,r)}(t) - \Phi^{(l,r)}(0)|$ to be of the same order as the deformation). Thus, by a little modification of the argument above we conclude

Proposition 5.17: Consider the Hamiltonian (5.41), and assume that the frequencies $\omega_1, \ldots, \omega_n$ satisfy the condition (5.42) for a suitable sequence $\{\alpha_s\}_{s\ge 1}$. Then, for any integer $r \ge 1$ there exist constants $d_r$ and $C_r$ such that for sufficiently small $\varrho$ and for any orbit with initial point $(x_0, y_0) \in \Delta_{2\varrho/3}$ we have

$$ |I_l(t) - I_l(0)| < d_r \varrho^3 \quad \text{for } |t| < T_r := \frac{d_r}{C_r \varrho^r} . \tag{5.54} $$

The latter proposition expresses the property of complete stability introduced by Birkhoff. As a matter of fact, Birkhoff could not do better because he did not even attempt to estimate the dependence of the constants $d_r$ and $C_r$ on r. Thanks to lemma 5.16 we do have such an estimate. We have indeed, with possibly a minor change of the constants a and b,

$$ C_r = a b^r \frac{(r+2)!}{\alpha_1 \cdots \alpha_r} . \tag{5.55} $$

The determination of $d_r$ may require a few more considerations. Observe that

$$ \bigl|\Phi^{(l)}_1(x, y) + \ldots + \Phi^{(l)}_r(x, y)\bigr| \le \|\Phi^{(l)}_1\| \varrho^3 + \ldots + \|\Phi^{(l)}_r\| \varrho^{r+2} \le \frac{ab}{\alpha_1} \varrho^3 + \ldots + \frac{a b^r}{\alpha_1 \cdots \alpha_r} \varrho^{r+2} . $$

Thus, there exists a constant $d_r$ such that for $3 \le s < r$ the coefficient of $\varrho^s$ in the estimate above is bounded, e.g., by $(d_r/2)^{s-3}$. With such a value for $d_r$ the sum above is bounded by $d_r \varrho^3$ provided the condition “$\varrho$ sufficiently small” of proposition 5.17 is understood as “$\varrho < 1/d_r$”.

5.5.4 Exponential estimates

The estimates of the proposition above exhibit a strong dependence on r, so that one expects, a priori, that the choice of r has a significant impact on the final result. We should also take into account an aesthetic remark. The truncation order r is introduced here as an extraneous element: it is clearly nonsense to introduce a “user defined parameter” in the statement of a stability result concerning a physical system. In a practical application the actual choice of a quite low r may in fact be dictated by the practical impossibility of performing an explicit expansion of a first integral up to high orders. However, from a theoretical viewpoint we have estimates concerning the dependence on r of the constants $C_r$ and $d_r$ in proposition 5.17. This suggests the possibility of sharpening the theory by looking for a choice of r which is, in some sense, the optimal one. This leads in a quite natural way to exponential estimates first obtained by Moser [79] and Littlewood [71] [72].

Let us consider the estimate (5.52), with $C_r$ as in (5.55). Recalling that $\{\alpha_s\}_{s\ge 1}$ is a non increasing sequence, one realizes that the expression $C_r \varrho^r$, considered as a function of r for a fixed $\varrho$, has a minimum. Indeed, by (5.55) we clearly have

$$ C_r \varrho^{r+3} = \frac{b \varrho (r+2)}{\alpha_r}\, C_{r-1} \varrho^{r+2} ; $$

hence the estimate is improved by adding the order r if $b \varrho (r+2)/\alpha_r < 1$. Thus, having chosen $\varrho$ so that the initial point of the orbit satisfies, e.g., $(x(0), y(0)) \in \Delta_{2\varrho/3}$, we may choose an optimal value of the truncation order r as

$$ r = \frac{\alpha_r}{b \varrho} - 2 . \tag{5.56} $$

Here, “optimal” means that this particular value minimizes the estimate for the time derivative $\dot\Phi^{(l,r)}$ in the neighbourhood $\Delta_\varrho$ of the equilibrium.

A more explicit analytic estimate can be obtained if one knows more about the sequence $\{\alpha_s\}_{s\ge 1}$. To this end, the natural choice is to make use of the diophantine condition obtained in sect. 4.2.3, namely to set

$$ \alpha_s = \gamma (s+2)^{-\tau} , \quad s \ge 1 $$

with suitable constants $\gamma > 0$ and $\tau \ge n - 1$. Then one is naturally led to choose the optimal truncation order $r_{\rm opt} = r_{\rm opt}(\varrho)$ as

$$ r_{\rm opt} = \Bigl(\frac{\varrho^*}{\varrho}\Bigr)^{1/(\tau+1)} - 2 , \qquad \varrho^* = \frac{\gamma}{b} . \tag{5.57} $$

Thus, perturbation theory is useful if $\varrho$ is smaller than $\varrho^*$ by an amount sufficient to have $r_{\rm opt} \ge 1$, namely if $\varrho \le 3^{-(\tau+1)} \varrho^*$. This allows us to remove r from the estimates of proposition 5.17, by substituting $r_{\rm opt}(\varrho)$ in place of r, so that the truncation order turns out to be determined by the size of the domain containing the initial data.
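The choice of the optimal order can be checked numerically. The sketch below evaluates the logarithm of $C_r \varrho^{r+3}$, with $C_r$ as in (5.55) and the diophantine sequence $\alpha_s = \gamma (s+2)^{-\tau}$, and looks for the integer r that minimizes it. The constants a and b are not evaluated in the text, so they are set to 1 here as placeholders; the function names and the sample values of $\varrho$, $\gamma$ and $\tau$ are likewise hypothetical.

```python
import math

def log_noise(r, rho, gamma, tau, a=1.0, b=1.0):
    """log(C_r * rho**(r+3)) with C_r = a * b**r * (r+2)! / (alpha_1 ... alpha_r)."""
    log_c = math.log(a) + r * math.log(b) + math.lgamma(r + 3)  # lgamma(r+3) = log (r+2)!
    # Subtract log(alpha_s) for alpha_s = gamma * (s + 2)**(-tau), s = 1..r.
    log_c -= sum(math.log(gamma) - tau * math.log(s + 2) for s in range(1, r + 1))
    return log_c + (r + 3) * math.log(rho)

def best_order(rho, gamma, tau, a=1.0, b=1.0, r_max=500):
    """Integer truncation order in 1..r_max that minimizes the noise estimate."""
    return min(range(1, r_max + 1), key=lambda r: log_noise(r, rho, gamma, tau, a, b))

def r_opt(rho, gamma, tau, b=1.0):
    """Real-valued optimal order from (5.57), with rho* = gamma / b."""
    return (gamma / (b * rho)) ** (1.0 / (tau + 1.0)) - 2.0

rho, gamma, tau = 1e-3, 1.0, 1.0  # hypothetical sample values
print(best_order(rho, gamma, tau), r_opt(rho, gamma, tau))
```

Working with logarithms avoids overflow of the factorial for large r. For these sample values the integer minimizer coincides with the integer part of the value given by (5.57), as expected from the ratio between consecutive terms.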

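Let me now stress only the relevant steps, and skip the technical details. Write the r.h.s. of the estimate (5.52) in a simple form as $(r!)^a (\varrho/\varrho^*)^r$, and minimize it by setting $r = (\varrho^*/\varrho)^{1/a}$. Then, using $r! \sim r^r e^{-r}$ (by Stirling's formula), we readily get

$$ (r!)^a \Bigl(\frac{\varrho}{\varrho^*}\Bigr)^r \sim \exp\Bigl[-\Bigl(\frac{\varrho^*}{\varrho}\Bigr)^{1/a}\Bigr] . $$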
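The step from Stirling's formula to the exponential can be written out as follows, with $\varrho$ the size of the domain and $\varrho^*$ the threshold introduced above. This is a rough sketch that ignores the factor $\sqrt{2\pi r}$ in Stirling's formula and treats r as a real number; it also keeps track of the constant a in the exponent, which the asymptotic relation above does not display.

$$ r = \Bigl(\frac{\varrho^*}{\varrho}\Bigr)^{1/a} \;\Longrightarrow\; \Bigl(\frac{\varrho}{\varrho^*}\Bigr)^{r} = r^{-ar} , \qquad (r!)^a \sim r^{ar} e^{-ar} , $$

$$ (r!)^a \Bigl(\frac{\varrho}{\varrho^*}\Bigr)^{r} \sim r^{ar} e^{-ar}\, r^{-ar} = e^{-ar} = \exp\Bigl[-a \Bigl(\frac{\varrho^*}{\varrho}\Bigr)^{1/a}\Bigr] . $$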

Thus, the noise becomes exponentially small with the inverse of the size of the domain. This is the simplest case of an exponential stability estimate. The formal statement of the theorem is the following:

Theorem 5.18: Consider the canonical system with Hamiltonian (5.41), and assume that the harmonic frequencies ω satisfy the nonresonance condition $|\langle k, \omega\rangle| > \gamma |k|^{-\tau}$ for nonzero $k \in \mathbb{Z}^n$, with real constants $\gamma > 0$ and $\tau \ge 0$. Then there exist positive constants $\beta_1$, $\beta_2$ and $\varrho^*$ such that the following statement holds true: if

$$ \varrho \le 3^{-(\tau+1)} \varrho^* , $$

then for any orbit with initial point $(x(0), y(0)) \in \Delta_{2\varrho/3}$ we have

$$ |I_l(t) - I_l(0)| < \beta_1 \varrho^2 \quad \text{for } |t| < T^* := \beta_2 \exp\Bigl[\Bigl(\frac{\varrho^*}{\varrho}\Bigr)^{1/(\tau+1)}\Bigr] . $$

The exponential dependence of $T^*$ on $1/\varrho$ constitutes a significant improvement with respect to the theory of complete stability of Birkhoff. In the very words of Littlewood, “while not eternity, this is a considerable slice of it”.