Section 1 - Some Mathematics

ASTR3002 (Black Holes and the Universe)
Lilia Ferrario, Department of Mathematics, Mathematical Sciences Institute
Version of August 18, 2011


Contents

Curvilinear coordinate systems
  Euclidean space 2D
  Curvilinear coordinates in 2D Euclidean space
    What is the geometrical meaning?

Riemannian Spaces
  Transformations
    Example: transformation from Cartesians to Polars
  Vectors
    Contravariant vectors
    Covariant vectors
    Visualisation in 2D Euclidean space
  Transformation Laws - summary
  Metric and Riemann Geometry
  Riemann space in Rn
    Tensor properties of gij
    Length and magnitude of a vector
    Contravariant metric tensor
    Raising and lowering of indices
    Angle between two vectors
    Coordinate basis vectors

Calculus of Variations (reading material)
    The Euler-Lagrange Equations
    Example: The Brachistochrone problem
    Example

Hamilton's Principle
  Principle of least Action
    Example: Simple Pendulum
    Application to special relativity

Geodesics
  Euler-Lagrangian equation and Christoffel symbols
  First integrals of the equations
  Parallel Displacement
  Relationship to space-time: the geodesic principle
    Example: Christoffel symbol of second kind
    Example: Christoffel symbol of second kind
  How to calculate geodesics
    Example: Geodesics on the surface of the unit sphere
  Covariant derivatives
  Intrinsic (total, absolute) derivative
  Parallel transport of a contravariant vector
  Parallel displacement and geodesics
  Inner product
  Parallel transport in matrix form
    Example: application to polar coordinates
    Parallel transport along a circle
    Example: Parallel transport around a closed loop in flat space
    Example: Parallel transport around a closed loop on curved 2-D surface
  Covariant derivatives: formal definitions
  Parallel transport of a covariant vector
    Covariant derivatives: a summary
  Riemann-Christoffel tensor
  Riemann-Christoffel tensor: symmetries
    Intrinsic curvature and its relation to parallel transport
    Riemann curvature tensor
    Ricci tensor and scalar
    Bianchi's identities
    Einstein's tensor

The low gravitational field limit

Einstein Field Equations
  Field equations of empty space
  Field equations in space with matter/radiation
  The matter-energy tensor

Cosmology
  Observables in astronomy
  Quasars
  Naive Interpretation
  Estimate of age (classical model)
  Problems with this model
  Modern point of view: the Cosmological Principle
  Model universes
    The Minkowski model
  Universes of constant positive curvature: spatial distance element
    Case A: 1D circumference of a 2-D circle
    Case B: 2D area of a 3D sphere
    Case C: 3D area of a 4D sphere
    The dimensionless area distance and radial coordinates
    3-D spaces of constant negative curvature (3D pseudospheres)
    Summary
    Further properties

Robertson-Walker metric and Friedmann Equations
  Metric tensor of the universe: Robertson-Walker metric
    Metric of space at fixed cosmic time
  Light propagation (redshift) in GR models
  Derivation of Friedmann's equations
    Density parameters

Solutions of Friedmann's equations
  Solutions of Friedmann's equations for a matter dominated universe
    Parametric solutions for Λ = 0
    The static universe: introduction of the Λ term
  More general treatment of Friedmann's equations
    Case A: Λ = 0
    Case B: k = 0 and Λ ≠ 0
  Solutions of Friedmann's equations for a radiation dominated universe
    GR equations in the radiation dominated universe
  The Steady-State Universe
    Bondi, Hoyle and Gold Universe (1948)
    De Sitter Universe (1917)

Curvilinear coordinate systems

Euclidean space 2D

In a 2D coordinate system, pairs of real numbers $(a, b)$ are attached to points or objects. Coordinates are just labels and do not need to have a geometrical significance. In a 2D Euclidean space the coordinates take a geometrical significance through the introduction of the concept of distance between two neighbouring coordinate points, $(a, b)$ and $(a + da, b + db)$. Moreover, in a Euclidean space it is possible to find a "rectangular" Cartesian coordinate system $(x, y)$ where the distance between $(x, y)$ and $(x + dx, y + dy)$ is given by

$$ds^2 = dx^2 + dy^2$$

[Figure: the position vector $\vec{r}$ of the point $(x, y)$, the displacement $d\vec{r}$ to the neighbouring point $(x + dx, y + dy)$, and the unit vectors $\vec{i}$, $\vec{j}$ along the $x$ and $y$ axes.]

with

• Position vector: $\vec{r} = (x, y)$

• Unit coordinate vectors: $\vec{i} = (1, 0)$, $\vec{j} = (0, 1)$

• General vectors: $\vec{A} = (a_x, a_y)$, $\vec{B} = (b_x, b_y)$


• Inner (scalar) product: $\vec{A} \cdot \vec{B} = a_x b_x + a_y b_y$, which gives

$$\vec{i} \cdot \vec{i} = \vec{j} \cdot \vec{j} = 1, \qquad \vec{i} \cdot \vec{j} = 0.$$

For neighbouring points:

$$\vec{r} = (x, y), \qquad \vec{r} + d\vec{r} = (x + dx, y + dy), \qquad d\vec{r} = dx\,\vec{i} + dy\,\vec{j}$$

$$\Rightarrow \quad ds^2 = d\vec{r} \cdot d\vec{r} = dx^2 + dy^2.$$

Thus, we simply take the coordinate displacements in the x and y directions, square them and add them up.

In a more general coordinate system, even in 2D Euclidean space, things are not that easy!

Curvilinear coordinates in 2D Euclidean space

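We want to transform from a Cartesian coordinate system $(x, y)$ to a curvilinear coordinate system $(u, v)$ via

$$\vec{r} = \vec{r}(u, v)$$

or, in components,

$$x = x(u, v), \qquad y = y(u, v).$$

Clearly

$$dx = \frac{\partial x}{\partial u}\,du + \frac{\partial x}{\partial v}\,dv, \qquad dy = \frac{\partial y}{\partial u}\,du + \frac{\partial y}{\partial v}\,dv.$$

We know that this is locally invertible at each point if

$$J = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} \neq 0.$$

The inverse transformation is

$$u = u(x, y), \qquad v = v(x, y).$$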

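Since we started with Cartesian coordinates (possible because space is Euclidean), then:

$$ds^2 = dx^2 + dy^2 = g_{uu}\,du^2 + 2 g_{uv}\,du\,dv + g_{vv}\,dv^2$$

with

$$g_{uu} = \left(\frac{\partial x}{\partial u}\right)^2 + \left(\frac{\partial y}{\partial u}\right)^2, \qquad g_{vv} = \left(\frac{\partial x}{\partial v}\right)^2 + \left(\frac{\partial y}{\partial v}\right)^2$$

and

$$g_{vu} = g_{uv} = \frac{\partial x}{\partial u}\frac{\partial x}{\partial v} + \frac{\partial y}{\partial u}\frac{\partial y}{\partial v}.$$

Note that $ds^2 \neq du^2 + dv^2$. Instead, we have

$$ds^2 = \begin{bmatrix} du & dv \end{bmatrix} \begin{bmatrix} g_{uu} & g_{uv} \\ g_{vu} & g_{vv} \end{bmatrix} \begin{bmatrix} du \\ dv \end{bmatrix} \tag{1}$$

If we now define

$$\begin{bmatrix} d\tilde{u} \\ d\tilde{v} \end{bmatrix} = \begin{bmatrix} g_{uu} & g_{uv} \\ g_{vu} & g_{vv} \end{bmatrix} \begin{bmatrix} du \\ dv \end{bmatrix} \tag{2}$$

then

$$ds^2 = du\,d\tilde{u} + dv\,d\tilde{v} \tag{3}$$

which is not quite $du^2 + dv^2$, but it looks like it!

The quantities $(du, dv)$ are called the contravariant components of $d\vec{r}$ and they contra-vary with the coordinate changes, while $(d\tilde{u}, d\tilde{v})$ are called the covariant components of $d\vec{r}$.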
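The construction in equations (1) to (3) can be checked numerically. The sketch below assumes a hypothetical skew coordinate system, $x = u + v\cos\alpha$, $y = v\sin\alpha$ (not taken from the notes), builds the metric components from the partial derivatives, and compares the three ways of computing $ds^2$. The names `metric_2d`, `du_cov` and `dv_cov` are illustrative only.

```python
import math

def metric_2d(dx_du, dx_dv, dy_du, dy_dv):
    """Return (g_uu, g_uv, g_vv) from the partial derivatives of x(u, v) and y(u, v)."""
    g_uu = dx_du ** 2 + dy_du ** 2
    g_vv = dx_dv ** 2 + dy_dv ** 2
    g_uv = dx_du * dx_dv + dy_du * dy_dv
    return g_uu, g_uv, g_vv

# Assumed skew coordinates: x = u + v*cos(alpha), y = v*sin(alpha)
alpha = math.radians(60.0)
g_uu, g_uv, g_vv = metric_2d(1.0, math.cos(alpha), 0.0, math.sin(alpha))

# Contravariant components of a small displacement
du, dv = 0.3, -0.2

# Covariant components, obtained by applying the metric as in equation (2)
du_cov = g_uu * du + g_uv * dv
dv_cov = g_uv * du + g_vv * dv

# Squared distance from the quadratic form (1) and from the mixed product (3)
ds2_metric = g_uu * du ** 2 + 2.0 * g_uv * du * dv + g_vv * dv ** 2
ds2_mixed = du * du_cov + dv * dv_cov

# Independent check in Cartesian coordinates
dx = du + dv * math.cos(alpha)
dy = dv * math.sin(alpha)
ds2_cartesian = dx ** 2 + dy ** 2
```

Because the assumed axes are not orthogonal, `g_uv` is non-zero and the covariant components differ from the contravariant ones, yet all three values of $ds^2$ agree.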

What is the geometrical meaning?

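Consider the point $P: \vec{r}(u_0, v_0)$; thus

$$\vec{g}_1 = \frac{\partial \vec{r}}{\partial u}(u_0, v_0) \quad \text{is tangent to the } u \text{ coordinate curve at } P,$$

$$\vec{g}_2 = \frac{\partial \vec{r}}{\partial v}(u_0, v_0) \quad \text{is tangent to the } v \text{ coordinate curve at } P.$$

These can be used as basis vectors at $P$ to expand any other vector:

$$d\vec{r} = \frac{\partial \vec{r}}{\partial u}\,du + \frac{\partial \vec{r}}{\partial v}\,dv$$

or

$$d\vec{r} = \vec{g}_1\,du + \vec{g}_2\,dv.$$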

[Figure: the $u$-coordinate curves ($v$ = const) and $v$-coordinate curves ($u$ = const) in the $(x, y)$ plane, with the tangent basis vectors $\vec{g}_1$, $\vec{g}_2$ and the displacement $d\vec{r}$ drawn at the point $P$.]

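Thus, $(du, dv)$ are the components of $d\vec{r}$ in the direction of the coordinate curves $u$ and $v$, but note that $\vec{g}_1$ and $\vec{g}_2$ are not generally unit vectors.

We can also define two basis vectors $\vec{g}^{\,1}$ and $\vec{g}^{\,2}$ orthogonal to the $u$ and $v$ coordinate curves, such that

$$d\vec{r} = \vec{g}^{\,1}\,d\tilde{u} + \vec{g}^{\,2}\,d\tilde{v},$$

where $(d\tilde{u}, d\tilde{v})$ are the covariant components of $d\vec{r}$, with

$$\vec{g}^{\,1} \cdot \vec{g}_1 = 1, \quad \vec{g}^{\,1} \cdot \vec{g}_2 = 0, \quad \vec{g}^{\,2} \cdot \vec{g}_2 = 1, \quad \vec{g}^{\,2} \cdot \vec{g}_1 = 0. \tag{4}$$

Note that we have introduced a normalisation. Then

$$d\vec{r} = \vec{g}_1\,du + \vec{g}_2\,dv = \vec{g}^{\,1}\,d\tilde{u} + \vec{g}^{\,2}\,d\tilde{v}.$$

Dot both sides with $\vec{g}_1$:

$$\vec{g}_1 \cdot \vec{g}_1\,du + \vec{g}_1 \cdot \vec{g}_2\,dv = d\tilde{u}.$$

Now dot both sides with $\vec{g}_2$:

$$\vec{g}_1 \cdot \vec{g}_2\,du + \vec{g}_2 \cdot \vec{g}_2\,dv = d\tilde{v}.$$

Hence

$$\begin{bmatrix} d\tilde{u} \\ d\tilde{v} \end{bmatrix} = \begin{bmatrix} \vec{g}_1 \cdot \vec{g}_1 & \vec{g}_1 \cdot \vec{g}_2 \\ \vec{g}_1 \cdot \vec{g}_2 & \vec{g}_2 \cdot \vec{g}_2 \end{bmatrix} \begin{bmatrix} du \\ dv \end{bmatrix} = g \begin{bmatrix} du \\ dv \end{bmatrix} \tag{5}$$

where

$$g = \begin{bmatrix} g_{uu} & g_{uv} \\ g_{uv} & g_{vv} \end{bmatrix} \tag{6}$$

is the "metric tensor" (we'll see this in great detail later).

Any vector $\vec{A}$ can be written as

$$\vec{A} = A^1 \vec{g}_1 + A^2 \vec{g}_2 \quad \text{(expansion on the covariant basis)} \tag{7}$$

$$\vec{A} = A_1 \vec{g}^{\,1} + A_2 \vec{g}^{\,2} \quad \text{(expansion on the contravariant basis)} \tag{8}$$

Thus, given a vector $\vec{A}$, $A^1$ and $A^2$ are the contravariant components of $\vec{A}$ with respect to $\vec{g}_1$, $\vec{g}_2$, while $A_1$ and $A_2$ are the covariant components of $\vec{A}$ with respect to $\vec{g}^{\,1}$, $\vec{g}^{\,2}$.

Note also that $A^1 = A_1$ and $A^2 = A_2$ hold only when the basis vectors are orthonormal, as in a rectangular coordinate system or an orthogonal curvilinear coordinate system with normalised basis vectors.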


Riemannian Spaces

Transformations

Consider a set of points or objects in an n-dimensional space. A coordinate system is a systematic rule for assigning

$$(x^1, x^2, \cdots, x^n)$$

to points or objects in this space. It is a systematic labelling which may not have any geometrical significance. A meaning is given to "distance" between points through the introduction of a metric. Thus, the purpose of coordinates is to label points, and the purpose of a metric is to connect them together geometrically. We first consider coordinate transformations and introduce the concept of a metric later.

Consider a change of coordinates from

$$(x^1, x^2, \cdots, x^n) \to (\bar{x}^1, \bar{x}^2, \cdots, \bar{x}^n)$$

according to some functional relationship

$$\begin{aligned} \bar{x}^1 &= f^1(x^1, x^2, \cdots, x^n) \;\Rightarrow\; \bar{x}^1(x^1, x^2, \cdots, x^n) \\ &\;\;\vdots \\ \bar{x}^n &= f^n(x^1, x^2, \cdots, x^n) \;\Rightarrow\; \bar{x}^n(x^1, x^2, \cdots, x^n) \end{aligned}$$

The functions $f^1, f^2, \cdots, f^n$ are restricted to be real, single valued and continuous, with continuous first and second derivatives over the region of interest.

Under the transformation $T$ above

$$d\vec{x} = (dx^1, dx^2, \cdots, dx^n) \to d\bar{\vec{x}} = (d\bar{x}^1, d\bar{x}^2, \cdots, d\bar{x}^n)$$

according to

$$d\bar{x}^j = \sum_{k=1}^{n} \frac{\partial \bar{x}^j}{\partial x^k}\,dx^k$$


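or, if we use the implicit summation over repeated indices, with non-repeated indices being assigned all possible values in turn,

$$d\bar{x}^j = \frac{\partial \bar{x}^j}{\partial x^k}\,dx^k$$

or

$$d\bar{x}^j = a^j{}_k\,dx^k, \qquad [a] = a^j{}_k = \frac{\partial \bar{x}^j}{\partial x^k}$$

that is

$$\begin{bmatrix} d\bar{x}^1 \\ d\bar{x}^2 \\ \vdots \\ d\bar{x}^n \end{bmatrix} = \begin{bmatrix} \dfrac{\partial \bar{x}^1}{\partial x^1} & \cdots & \dfrac{\partial \bar{x}^1}{\partial x^n} \\ \dfrac{\partial \bar{x}^2}{\partial x^1} & \cdots & \dfrac{\partial \bar{x}^2}{\partial x^n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial \bar{x}^n}{\partial x^1} & \cdots & \dfrac{\partial \bar{x}^n}{\partial x^n} \end{bmatrix} \begin{bmatrix} dx^1 \\ dx^2 \\ \vdots \\ dx^n \end{bmatrix} = \begin{bmatrix} a^1{}_1 & \cdots & a^1{}_n \\ a^2{}_1 & \cdots & a^2{}_n \\ \vdots & \ddots & \vdots \\ a^n{}_1 & \cdots & a^n{}_n \end{bmatrix} \begin{bmatrix} dx^1 \\ dx^2 \\ \vdots \\ dx^n \end{bmatrix}$$

If $\det[a] = J \neq 0$ in the neighbourhood of a point, then the transformation is locally invertible and

$$dx^j = \frac{\partial x^j}{\partial \bar{x}^k}\,d\bar{x}^k.$$

Thus the matrix $\dfrac{\partial x^j}{\partial \bar{x}^k}$ is the inverse of $\dfrac{\partial \bar{x}^j}{\partial x^k}$ and

$$\frac{\partial \bar{x}^k}{\partial x^i}\frac{\partial x^j}{\partial \bar{x}^k} = \delta^j_i$$

where $\delta^j_i$ is the Kronecker delta:

$$\delta^j_i = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j. \end{cases}$$

Or, in matrix notation,

$$dx^j = \bar{a}^j{}_k\,d\bar{x}^k, \qquad [\bar{a}] = \bar{a}^j{}_k = \frac{\partial x^j}{\partial \bar{x}^k}, \qquad [a]\,[\bar{a}] = [I].$$

This reflects the general requirement that any transformation followed by its inverse is the identity.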

Example: transformation from Cartesians to Polars

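Consider the transformation from Cartesians $(x, y)$ to polar coordinates $(r, \theta)$:

$$x = r\cos\theta, \qquad y = r\sin\theta$$

with the 'old' coordinates $(x, y)$ being expressed as functions of the 'new' ones $(r, \theta)$. This gives the following "inverse transformation"

$$dx = \frac{\partial x}{\partial r}\,dr + \frac{\partial x}{\partial \theta}\,d\theta = \cos\theta\,dr - r\sin\theta\,d\theta$$

$$dy = \frac{\partial y}{\partial r}\,dr + \frac{\partial y}{\partial \theta}\,d\theta = \sin\theta\,dr + r\cos\theta\,d\theta$$

thus

$$\begin{bmatrix} dx \\ dy \end{bmatrix} = \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix} \begin{bmatrix} dr \\ d\theta \end{bmatrix}$$

and for the "forward transformation" we have

$$dr = \cos\theta\,dx + \sin\theta\,dy$$

$$d\theta = -\frac{1}{r}\sin\theta\,dx + \frac{1}{r}\cos\theta\,dy$$

thus

$$\begin{bmatrix} dr \\ d\theta \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{bmatrix} \begin{bmatrix} dx \\ dy \end{bmatrix}$$

Now set

$$(x^1, x^2) = (x, y); \qquad (\bar{x}^1, \bar{x}^2) = (r, \theta)$$

so that

$$[a] = \frac{\partial \bar{x}^j}{\partial x^k} = \begin{bmatrix} \cos\bar{x}^2 & \sin\bar{x}^2 \\ -\dfrac{\sin\bar{x}^2}{\bar{x}^1} & \dfrac{\cos\bar{x}^2}{\bar{x}^1} \end{bmatrix}$$

$$[\bar{a}] = \frac{\partial x^j}{\partial \bar{x}^k} = \begin{bmatrix} \cos\bar{x}^2 & -\bar{x}^1\sin\bar{x}^2 \\ \sin\bar{x}^2 & \bar{x}^1\cos\bar{x}^2 \end{bmatrix}$$

Note that $a\,\bar{a} = I$.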
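The statement that the forward and inverse Jacobian matrices multiply to the identity can be verified at any point with $r \neq 0$. The following is a minimal sketch; the function names and the sample point $(r, \theta) = (2, 0.7)$ are illustrative choices, not part of the notes.

```python
import math

def forward_jacobian(r, theta):
    """Matrix [a]: partial derivatives of (r, theta) with respect to (x, y)."""
    return [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta) / r, math.cos(theta) / r]]

def inverse_jacobian(r, theta):
    """Inverse matrix: partial derivatives of (x, y) with respect to (r, theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def matmul2(p, q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Evaluate both matrices at the same (arbitrary) point and multiply them
product = matmul2(forward_jacobian(2.0, 0.7), inverse_jacobian(2.0, 0.7))
```

Up to rounding error, `product` is the 2x2 identity matrix for every admissible $(r, \theta)$.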


Vectors

In this section we will learn about contravariant and covariant vectors. Covariance and contravariance refer to how the components of a coordinate vector transform under a change of basis.

Contravariant vectors

If a set of numbers $\nu^1, \nu^2, \cdots, \nu^n$ in a coordinate system $x^1, x^2, \cdots, x^n$ is related to another set of numbers $\bar{\nu}^1, \bar{\nu}^2, \cdots, \bar{\nu}^n$ in another coordinate system $\bar{x}^1, \bar{x}^2, \cdots, \bar{x}^n$ according to the transformation equations

$$\bar{\nu}^j = \frac{\partial \bar{x}^j}{\partial x^k}\,\nu^k$$

under a coordinate transformation for which

$$d\bar{x}^j = \frac{\partial \bar{x}^j}{\partial x^k}\,dx^k$$

then this set of numbers is called a contravariant vector of the first rank (or first order).

Thus, a contravariant vector $A^k$ satisfies a transformation law which must be the same as that satisfied by the components of the coordinate displacement vector, that is

$$\bar{A}^j = \frac{\partial \bar{x}^j}{\partial x^k}\,A^k.$$

For example, in a rotation, stretching or dilation transformation, the vector itself is left unchanged since its components change in a manner that cancels the change introduced by the coordinate transformation. So, if the coordinate bases are rotated, say, clockwise, the vector components are rotated anti-clockwise. Thus, they contra-vary with a change of basis.

Example: the tangent vector

Consider a curve $x^j(u)$. The tangent vector to this curve,

$$\frac{dx^j(u)}{du},$$

is a contravariant vector. In fact, if

$$\bar{x}^j\left(x^1(u), x^2(u), \cdots, x^n(u)\right)$$

then from the chain rule:

$$\frac{d\bar{x}^j}{du} = \frac{\partial \bar{x}^j}{\partial x^k}\frac{dx^k}{du}$$

or, if we set $\bar{\nu}^j = \dfrac{d\bar{x}^j}{du}$ and $\nu^k = \dfrac{dx^k}{du}$:

$$\bar{\nu}^j = \frac{\partial \bar{x}^j}{\partial x^k}\,\nu^k$$

which is the contravariant transformation law.

Covariant vectors

A covariant vector $u_j$ has components which transform according to the law

$$\bar{u}_k = \frac{\partial x^j}{\partial \bar{x}^k}\,u_j$$

under a coordinate transformation $x^p \to \bar{x}^p$. That is, they transform through the inverse Jacobian matrix.

Unlike contravariant vectors, the components of a covariant vector transform in the same way as the reference axes. Thus, they co-vary with a change of basis. However, if the transformation of basis only involves rotation, then there is no difference in how the components of contravariant and covariant vectors behave. The differences become obvious when other types of transformation (e.g. stretching) come into play. Very generally speaking, contravariant vectors are "regular vectors" with units of distance (such as a displacement) or distance times some other unit (such as velocity or acceleration); covariant vectors, instead, have units of one-over-distance, such as a gradient. If you change units (a special case of a change of coordinates) from metres to, say, millimetres (a scale factor of 1/1000), a displacement of 1 m becomes 1000 mm, which gives a contravariant change in numerical value. In contrast, a gradient of 1 K/m becomes 0.001 K/mm, which is a covariant change in value. Tensors are another type of quantity that behaves in this way; in fact a vector is a special type of tensor.

Example: gradient of scalar field

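The simplest example of a covariant vector is the gradient of a scalar field

$$\Phi(x^1, x^2, \cdots, x^n)$$

given by

$$\nu_j = \frac{\partial \Phi}{\partial x^j}.$$

In fact, given the transformation

$$x^k(\bar{x}^1, \bar{x}^2, \cdots, \bar{x}^n)$$

from the chain rule we get

$$\frac{\partial \Phi}{\partial \bar{x}^k} = \frac{\partial \Phi}{\partial x^j}\frac{\partial x^j}{\partial \bar{x}^k}$$

or, if we set $\bar{\nu}_k = \dfrac{\partial \Phi}{\partial \bar{x}^k}$ and $\nu_j = \dfrac{\partial \Phi}{\partial x^j}$:

$$\bar{\nu}_k = \frac{\partial x^j}{\partial \bar{x}^k}\,\nu_j \quad \text{(covariant transformation)}$$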

Visualisation in 2D Euclidean space

[Figure: a vector $\vec{V}$ in a non-orthonormal basis $(\hat{e}_1, \hat{e}_2)$, showing its parallel projections $V^1$, $V^2$ and its perpendicular projections $V_1$, $V_2$ onto the axes.]

A point $P(\vec{V})$ in a 2D Euclidean space and a general Cartesian coordinate system that is not orthonormal (non-orthonormal basis vectors $(\vec{e}_1, \vec{e}_2)$) can be represented either by the perpendicular projection of $\vec{V}$ onto the axes

$$V_i = \vec{V} \cdot \vec{e}_i, \qquad i = 1, 2$$

or by the parallel projection defined by

$$\vec{V} = V^k \vec{e}_k = V^1 \vec{e}_1 + V^2 \vec{e}_2;$$

hence, in general,

$$V_i = V^k\,\vec{e}_k \cdot \vec{e}_i \neq V^i$$

and $V_i = V^i$ only if we use orthonormal unit vectors in Euclidean space (i.e. a rectangular Cartesian coordinate system). Now, depending on which representation is used, the components of the vector ($V^k$ or $V_k$) transform differently with respect to changes of the coordinate axes, and this is embodied in the different transformation laws that we require of covariant and contravariant vectors in a Riemann space.
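The two kinds of projection can be computed explicitly. The sketch below assumes a hypothetical basis of two unit vectors 60 degrees apart and an arbitrary vector with Cartesian components (2, 1); neither is taken from the notes. The perpendicular projections are dot products, while the parallel projections are found by solving $\vec{V} = V^1\vec{e}_1 + V^2\vec{e}_2$ with Cramer's rule.

```python
import math

def dot(a, b):
    """Euclidean dot product of two 2D vectors given in Cartesian components."""
    return a[0] * b[0] + a[1] * b[1]

# Assumed non-orthonormal basis: unit vectors separated by 60 degrees
e1 = (1.0, 0.0)
e2 = (math.cos(math.radians(60.0)), math.sin(math.radians(60.0)))
V = (2.0, 1.0)  # Cartesian components of the vector being represented

# Perpendicular projections (covariant components): V_i = V . e_i
V_cov = (dot(V, e1), dot(V, e2))

# Parallel projections (contravariant components): solve V = V^1 e1 + V^2 e2
det = e1[0] * e2[1] - e1[1] * e2[0]
V_con = ((V[0] * e2[1] - V[1] * e2[0]) / det,
         (e1[0] * V[1] - e1[1] * V[0]) / det)

# Rebuild the vector from its contravariant components as a consistency check
V_rebuilt = (V_con[0] * e1[0] + V_con[1] * e2[0],
             V_con[0] * e1[1] + V_con[1] * e2[1])

# The mixed product V_i V^i reproduces the squared Euclidean length
length_squared = V_cov[0] * V_con[0] + V_cov[1] * V_con[1]
```

In this basis the two sets of components differ, but their contraction still equals $|\vec{V}|^2 = 5$.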


Transformation Laws - summary

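• Scalar transformation: $\bar{\Phi} = \Phi$ (invariant). A scalar is a tensor of rank zero.

• Contravariant vector transformation: $\bar{B}^j = \dfrac{\partial \bar{x}^j}{\partial x^k} B^k$ (free index in numerator).

• Covariant vector transformation: $\bar{C}_j = \dfrac{\partial x^k}{\partial \bar{x}^j} C_k$ (free index in denominator).

• Contravariant 2nd rank tensor transformation: $\bar{T}^{jk} = \dfrac{\partial \bar{x}^j}{\partial x^l}\dfrac{\partial \bar{x}^k}{\partial x^m} T^{lm}$.

• Covariant 2nd rank tensor transformation: $\bar{T}_{jk} = \dfrac{\partial x^l}{\partial \bar{x}^j}\dfrac{\partial x^m}{\partial \bar{x}^k} T_{lm}$.

• Mixed tensor transformation of the second rank: $\bar{U}^j{}_k = \dfrac{\partial \bar{x}^j}{\partial x^l}\dfrac{\partial x^m}{\partial \bar{x}^k} U^l{}_m$.

• Tensors of rank greater than 2:

$$\bar{A}^{lmn}{}_{ij} = \frac{\partial \bar{x}^l}{\partial x^q}\frac{\partial \bar{x}^m}{\partial x^s}\frac{\partial \bar{x}^n}{\partial x^t}\frac{\partial x^p}{\partial \bar{x}^i}\frac{\partial x^r}{\partial \bar{x}^j}\,A^{qst}{}_{pr}$$

Each index of $\bar{A}^{lmn}{}_{ij}$ is paired with its counterpart in $A^{qst}{}_{pr}$, that is, $l$ is paired with $q$, $m$ is paired with $s$, $n$ is paired with $t$, etc. The order of the fractions is the same as the order of the indices in $\bar{A}^{lmn}{}_{ij}$, that is, $l$, $m$, $n$, $i$ and $j$. The covariant fractions are the same as the contravariant fractions, but inverted.

• Transformation of the Kronecker-delta tensor:

$$\bar{\delta}^i_j = \frac{\partial \bar{x}^i}{\partial x^l}\frac{\partial x^m}{\partial \bar{x}^j}\,\delta^l_m = \frac{\partial \bar{x}^i}{\partial x^l}\frac{\partial x^l}{\partial \bar{x}^j} = \frac{\partial \bar{x}^i}{\partial \bar{x}^j} = \delta^i_j$$

thus $\delta^i_j$ is a 2nd rank tensor.

• Transformation of scalar or inner product: Given a covariant vector $U_k$ and a contravariant vector $V^k$, the quantity

$$U_k V^k$$

is the scalar or the inner product. This quantity is invariant under coordinate transformation. Thus

$$\bar{U}_k = \frac{\partial x^j}{\partial \bar{x}^k}\,U_j, \qquad \bar{V}^k = \frac{\partial \bar{x}^k}{\partial x^l}\,V^l$$

$$\Rightarrow \quad \bar{U}_k \bar{V}^k = \frac{\partial x^j}{\partial \bar{x}^k}\frac{\partial \bar{x}^k}{\partial x^l}\,U_j V^l = \delta^j_l\,U_j V^l = U_j V^j$$

Note: The quantity $U^k V^k$ is not invariant under a general transformation. The introduction of covariant vectors allows invariants to be constructed. This plays an important role in physical laws.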
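The last two statements in the summary above can be illustrated numerically. The sketch below assumes a hypothetical linear change of coordinates (a shear combined with a stretch) and arbitrary component values; all names and numbers are illustrative and not part of the notes. It transforms one covariant and two contravariant vectors and compares the contractions before and after.

```python
def matvec(m, v):
    """Multiply a square matrix (nested lists) by a vector (list)."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def transpose(m):
    """Transpose of a matrix stored as nested lists."""
    return [list(row) for row in zip(*m)]

def contract(p, q):
    """Sum over the products of matching components."""
    return sum(x * y for x, y in zip(p, q))

# Assumed linear transformation x_bar = a x, with a[j][k] = d(x_bar^j)/d(x^k)
a = [[2.0, 1.0], [0.0, 0.5]]
a_inv = [[0.5, -1.0], [0.0, 2.0]]   # a_inv[j][k] = d(x^j)/d(x_bar^k)

U_cov = [1.0, -2.0]   # covariant components U_k
V_con = [3.0, 4.0]    # contravariant components V^k
W_con = [1.0, -2.0]   # a second contravariant vector W^k

V_bar = matvec(a, V_con)                  # contravariant law
W_bar = matvec(a, W_con)                  # contravariant law
U_bar = matvec(transpose(a_inv), U_cov)   # covariant law

mixed_before = contract(U_cov, V_con)     # U_k V^k in the old coordinates
mixed_after = contract(U_bar, V_bar)      # U_k V^k in the new coordinates
same_type_before = contract(V_con, W_con) # V^k W^k in the old coordinates
same_type_after = contract(V_bar, W_bar)  # V^k W^k in the new coordinates
```

The mixed contraction is unchanged by the transformation, whereas the sum of products of two contravariant vectors changes, which is why a covariant partner is needed to build invariants.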

Metric and Riemann Geometry

So far, there has been no notion of distance between points $x^i$ and $x^i + dx^i$ in our mathematical space. This has to be introduced in terms of the metric tensor $g_{ij}$. A metric is a rule which specifies the distance moved in terms of the coordinate changes.

Spaces of common experience are 1, 2, and 3-D Euclidean spaces. The distance between points depends on Pythagoras' law, which can be expressed in rectangular Cartesian coordinates as

$$ds^2 = dx^2 + dy^2$$

Setting

$$x^1 = x, \qquad x^2 = y$$

we have

$$ds^2 = g_{ij}\,dx^i dx^j$$

where

$$g_{ij} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

If we change to a general curvilinear coordinate system as we did in

$$x = x(u, v), \qquad y = y(u, v)$$

or

$$u = u(x, y), \qquad v = v(x, y)$$

then

$$ds^2 = \left[\frac{\partial x}{\partial u}\,du + \frac{\partial x}{\partial v}\,dv\right]^2 + \left[\frac{\partial y}{\partial u}\,du + \frac{\partial y}{\partial v}\,dv\right]^2 = a(u, v)\,du^2 + b(u, v)\,du\,dv + b(u, v)\,dv\,du + c(u, v)\,dv^2.$$

Writing $\bar{x}^1 = u$, $\bar{x}^2 = v$ and $x^1 = x$, $x^2 = y$, then

$$ds^2 = \bar{g}_{ij}(\bar{x}^k)\,d\bar{x}^i d\bar{x}^j$$

where

$$\bar{g}_{ij} = \begin{bmatrix} a(\bar{x}^1, \bar{x}^2) & b(\bar{x}^1, \bar{x}^2) \\ b(\bar{x}^1, \bar{x}^2) & c(\bar{x}^1, \bar{x}^2) \end{bmatrix}$$

Note that $\bar{g}_{ij}$ is in general not diagonal and, furthermore, its value can depend on position $(\bar{x}^1, \bar{x}^2)$. The metric looks more complicated, but in this particular case we know that we can transform back to the diagonal Euclidean form (because we started off with such a metric).

Riemann space in $R^n$

We now assume that a space with coordinates $(x^1, x^2, \cdots, x^n)$ has a metric

$$ds^2 = g_{ij}\,dx^i dx^j$$

which we assume to be an invariant under general coordinate transformation (that is, the distance is a scalar). Furthermore, we assume that $\det(g) \neq 0$ and

$$g_{ij} = g_{ji} \quad \text{and} \quad g_{ij} = g_{ij}(x^k).$$

This is quadratic in coordinate changes as in the curvilinear Euclidean case (a system in which the coordinate lines are curved), but that is where the similarity ends. In particular, it may not be possible to find a coordinate system where the metric takes the unit diagonal form. Euclidean space $E^n$ is a special case of $R^n$.

Tensor properties of $g_{ij}$

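Suppose

$$ds^2 = g_{ij}\,dx^i dx^j \tag{9}$$

and we transform coordinates

$$x^i = x^i(\bar{x}^j);$$

then, since $ds^2$ is an invariant under coordinate transformation,

$$ds^2 = g_{ij}\,\frac{\partial x^i}{\partial \bar{x}^k}\,d\bar{x}^k\,\frac{\partial x^j}{\partial \bar{x}^p}\,d\bar{x}^p = \underbrace{g_{ij}\,\frac{\partial x^i}{\partial \bar{x}^k}\frac{\partial x^j}{\partial \bar{x}^p}}_{\bar{g}_{kp}}\,d\bar{x}^k d\bar{x}^p = \bar{g}_{kp}\,d\bar{x}^k d\bar{x}^p;$$

hence, $ds^2$ has the same form as (9) if $\bar{g}_{kp} = \dfrac{\partial x^i}{\partial \bar{x}^k}\dfrac{\partial x^j}{\partial \bar{x}^p}\,g_{ij}$. Thus $g$ transforms as a second order covariant tensor.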
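As a concrete instance of this transformation law, the unit Cartesian metric can be carried over to plane polar coordinates using the Jacobian of $x = r\cos\theta$, $y = r\sin\theta$. This is a minimal sketch; `transform_metric`, `polar_jacobian` and the sample point are names and values assumed here for illustration.

```python
import math

def transform_metric(g, jac):
    """Covariant rank-2 law: g_bar[k][p] = jac[i][k] * jac[j][p] * g[i][j],
    summed over i and j, where jac[i][k] = d(x^i)/d(x_bar^k)."""
    n = len(g)
    return [[sum(jac[i][k] * jac[j][p] * g[i][j]
                 for i in range(n) for j in range(n))
             for p in range(n)]
            for k in range(n)]

def polar_jacobian(r, theta):
    """Partial derivatives of (x, y) with respect to (r, theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

g_cartesian = [[1.0, 0.0], [0.0, 1.0]]
r, theta = 2.0, 0.7   # arbitrary sample point
g_polar = transform_metric(g_cartesian, polar_jacobian(r, theta))
```

At the chosen point the result is the diagonal matrix with entries $1$ and $r^2 = 4$, in agreement with the familiar line element $ds^2 = dr^2 + r^2 d\theta^2$.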

Length and magnitude of a vector

If we use rectangular Cartesian coordinates, the magnitude of any vector $\vec{V}$ is

$$|\vec{V}|^2 = \vec{V} \cdot \vec{V} = \sum_{k=1}^{n} V^k V^k.$$

If we transform to curvilinear coordinates in $E^n$ we know that, in general, $|\vec{V}|^2 \neq \sum_{k=1}^{n} V^k V^k$ (as we have demonstrated previously in 2D). The quantity that is invariant under such a coordinate transformation is

$$|\vec{V}|^2 = \vec{V} \cdot \vec{V} = g_{jk}\,V^j V^k.$$

In $R^n$, we define the magnitude of a contravariant vector by

$$|\vec{V}|^2 = \vec{V} \cdot \vec{V} = g_{ij}\,V^i V^j$$

where $g_{ij}$ is the metric in that space. We already had by definition

$$ds^2 = g_{ij}\,dx^i dx^j$$

(we'll see later that $g_{ij}\,dx^i$ is the covariant component $dx_j$ and therefore that $ds^2 = dx_j\,dx^j$). So that

$$ds^2 = |d\vec{x}|^2 = d\vec{x} \cdot d\vec{x}$$

which we have assumed to be invariant under a coordinate transformation. As an exercise, you can check that $|\vec{V}|^2 = g_{ij}\,V^i V^j$ is also invariant under a general coordinate transformation.

Note that the metric tensor can give an indefinite form

$$g_{jk}\,V^j V^k = V_k V^k$$

which can be negative in some directions and zero in others. In this case we use

$$\sqrt{e\,g_{jk}\,V^j V^k}$$

for the magnitude, with $e = \pm 1$ chosen so that the quantity under the square root is non-negative. A vector with zero length is called a null vector.

Contravariant metric tensor

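Since $g_{ij}$ is symmetric and $\det(g) \neq 0$, we can construct its inverse. We denote the inverse by $g^{ij}$. $g^{ij}$ is in fact a contravariant tensor of second order.

$$g^{ij}\,g_{jk} = \delta^i_k$$

where

$$g^{ij} = \frac{\text{cofactor of } g_{ij}}{\det(g)}$$

Example: plane polar coordinates

In plane polar coordinates, we have

$$ds^2 = dr^2 + r^2 d\theta^2$$

thus

$$g_{ij} = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}, \qquad g^{ij} = \begin{pmatrix} 1 & 0 \\ 0 & \dfrac{1}{r^2} \end{pmatrix}.$$

Example: spherical coordinates

In spherical polar coordinates, we have

$$ds^2 = dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\,d\phi^2$$

thus

$$g_{ij} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2 \sin^2\theta \end{pmatrix}, \qquad g^{ij} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \dfrac{1}{r^2} & 0 \\ 0 & 0 & \dfrac{1}{r^2 \sin^2\theta} \end{pmatrix}$$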
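The cofactor formula for the inverse metric can be applied mechanically. The sketch below implements it for a small matrix stored as nested lists and evaluates it for the spherical polar metric at an arbitrary point; the helper names and the sample values of $r$ and $\theta$ are assumptions made for illustration.

```python
import math

def minor(m, i, j):
    """Matrix m with row i and column j removed."""
    return [[m[r][c] for c in range(len(m)) if c != j]
            for r in range(len(m)) if r != i]

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def inverse_metric(g):
    """Inverse via cofactors: entry (i, j) is the cofactor of g[j][i] over det(g).
    For a symmetric metric the transposition makes no difference."""
    n = len(g)
    d = det(g)
    return [[(-1) ** (i + j) * det(minor(g, j, i)) / d for j in range(n)]
            for i in range(n)]

def spherical_metric(r, theta):
    """Covariant metric of spherical polar coordinates (r, theta, phi)."""
    return [[1.0, 0.0, 0.0],
            [0.0, r ** 2, 0.0],
            [0.0, 0.0, (r * math.sin(theta)) ** 2]]

r, theta = 2.0, 0.5   # arbitrary sample point away from the polar axis
g = spherical_metric(r, theta)
g_inv = inverse_metric(g)

# Product of the inverse with the metric should be the Kronecker delta
delta = [[sum(g_inv[i][j] * g[j][k] for j in range(3)) for k in range(3)]
         for i in range(3)]
```

The diagonal entries of `g_inv` come out as $1$, $1/r^2$ and $1/(r^2\sin^2\theta)$, matching the matrix quoted above.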


Raising and lowering of indices

Given a metric tensor $g_{ik}$ and its inverse $g^{ik}$ and a contravariant vector $U^k$, we can construct a covariant vector by

$$V_j = g_{jk}\,U^k$$

One can check that $V_j$ is covariant by showing that it satisfies the transformation law for covariant vectors.

We can associate another vector to $U^k$:

$$W^l = g^{lp}\,V_p = g^{lp}\,g_{pk}\,U^k = \delta^l_k\,U^k = U^l$$

For this reason, we use the same letter for both quantities and say that

$$U^j \quad \text{and} \quad U_j = g_{jk}\,U^k$$

are the contravariant and covariant forms of the same vector $\vec{U}$. Note that

$$|\vec{U}|^2 = U^j g_{jk} U^k = U^j U_j$$

is the inner product of the vector with itself. This process is called "the raising and lowering of indices" and can be extended to higher order tensors:

$$U^\alpha = g^{\alpha\beta}\,U_\beta$$

$$U_\alpha = g_{\alpha\beta}\,U^\beta$$

$$T^{\alpha\beta\gamma} = g^{\alpha\mu}\,T_\mu{}^{\beta\gamma} = g^{\alpha\mu}\,g_{\mu\nu}\,T^{\nu\beta\gamma}$$

and we regard $T_\alpha{}^{\gamma\beta}$ and $T^\alpha{}_{\beta\gamma}$ as mixed contravariant and covariant components.

Note that while it is relatively easy to visualise covariant and contravariant components in a Euclidean space, it is harder to give them a visual representation in a general Riemannian space.

Angle between two vectors

In tensor calculus, the angle between two vectors $\vec{U}$ and $\vec{V}$ is defined to be

$$\cos\theta = \frac{U_j V^j}{|\vec{U}|\,|\vec{V}|}$$

by analogy with ordinary vector calculus. This is justified since the RHS is invariant under general coordinate transformations.

A curve in $R^n$ can be described by

$$x^j = x^j(s)$$

where $s$ is the arclength (distance) along the curve. The unit tangent to the curve is

$$T^j = \frac{dx^j}{ds}$$

One can check that $|\vec{T}|^2 = 1$, since $g_{ij}\,\dfrac{dx^i}{ds}\dfrac{dx^j}{ds} = \dfrac{ds^2}{ds^2}$.

The angle between two curves is defined as the angle between the unit tangents using the formula given above.

Note:

• If $g_{ij}$ is indefinite, the angle may be imaginary.

• In all cases, the vectors $\vec{U}$ and $\vec{V}$ are orthogonal if

$$U_j V^j = 0$$

Coordinate basis vectors

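Consider

$$\vec{A} = (A^1, A^2, \cdots, A^n) = A^k \vec{e}_k \tag{10}$$

The vectors $\vec{e}_k$ are covariant coordinate basis vectors.

$$\vec{A} \cdot \vec{B} = A_\alpha B^\alpha = A_\alpha \underbrace{B_\beta\,g^{\alpha\beta}}_{B^\alpha} \tag{11}$$

$$\vec{A} \cdot \vec{B} = A^\alpha B_\alpha = A^\alpha \underbrace{B^\beta\,g_{\alpha\beta}}_{B_\alpha} \tag{12}$$

But

$$\vec{A} \cdot \vec{B} = A^\alpha \vec{e}_\alpha \cdot B^\beta \vec{e}_\beta = A^\alpha B^\beta\,\vec{e}_\alpha \cdot \vec{e}_\beta \tag{13}$$

Therefore, comparison between (12) and (13) gives

$$\vec{e}_\alpha \cdot \vec{e}_\beta = g_{\alpha\beta} \tag{14}$$

These vectors are generally neither unit vectors nor orthogonal. If $\vec{A} = A^k \vec{e}_k$, then

$$\vec{A} \cdot \vec{e}_j = A^k\,\vec{e}_k \cdot \vec{e}_j = A^k g_{kj} = g_{jk} A^k = A_j$$

Thus, the scalar product of $\vec{A}$ with the vector $\vec{e}_j$ gives the covariant component of $\vec{A}$.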


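We can also define the contravariant coordinate basis $\vec{e}^{\,\alpha}$ by writing $\vec{A} = A_\alpha \vec{e}^{\,\alpha}$, etc. Then

$$\vec{A} \cdot \vec{B} = A_\alpha \vec{e}^{\,\alpha} \cdot B_\beta \vec{e}^{\,\beta} \tag{15}$$

$$\phantom{\vec{A} \cdot \vec{B}} = A_\alpha B_\beta\,\vec{e}^{\,\alpha} \cdot \vec{e}^{\,\beta} \tag{16}$$

hence comparison between (11) and (15) gives

$$\vec{e}^{\,\alpha} \cdot \vec{e}^{\,\beta} = g^{\alpha\beta} \tag{17}$$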

Example

[Figure: the vectors $\vec{A}$ and $\vec{B}$ drawn against the original axes $(x^1, x^2)$ and the stretched axes $(\bar{x}^1, \bar{x}^2)$.]

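Consider a Euclidean space $E^2$ with Cartesian coordinates $(x^1, x^2)$ (so that $ds^2 = (dx^1)^2 + (dx^2)^2$), and

$$A^i = (3, 0), \qquad g_{ij} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad B^i = (1, 1)$$

The covariant components of the vectors $\vec{A}$ and $\vec{B}$ are

$$A_i = g_{ij} A^j = (3, 0), \qquad B_i = g_{ij} B^j = (1, 1)$$

(remember that if we use orthonormal unit vectors in Euclidean space and rectangular Cartesian coordinates, then $V_i = V^i$).

Consider now the stretching, non-orthogonal transformation given by

$$\bar{x}^1 = 2x^1 + 0x^2$$

$$\bar{x}^2 = 0x^1 + x^2$$

so that the transformation matrices are

$$\frac{\partial \bar{x}^i}{\partial x^j} = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \frac{\partial x^i}{\partial \bar{x}^j} = \begin{pmatrix} 1/2 & 0 \\ 0 & 1 \end{pmatrix}$$

So, $A^i$ and $A_i$ transform to

$$\bar{A}^i = \frac{\partial \bar{x}^i}{\partial x^j} A^j = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 3 \\ 0 \end{pmatrix} = \begin{pmatrix} 6 \\ 0 \end{pmatrix}$$

$$\bar{A}_i = \frac{\partial x^j}{\partial \bar{x}^i} A_j = \begin{pmatrix} 1/2 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 3 \\ 0 \end{pmatrix} = \begin{pmatrix} 3/2 \\ 0 \end{pmatrix}$$

Note that

$$A^i A_i = 9, \qquad \bar{A}^i \bar{A}_i = 9,$$

which means that length is preserved even though the transformation is not orthogonal. Also, you can see that now

$$\bar{A}^i \neq \bar{A}_i$$

The transformation of $B^i$ and $B_i$ is

$$\bar{B}^i = \frac{\partial \bar{x}^i}{\partial x^j} B^j = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$$

$$\bar{B}_i = \frac{\partial x^j}{\partial \bar{x}^i} B_j = \begin{pmatrix} 1/2 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1/2 \\ 1 \end{pmatrix}$$

Note again that

$$\bar{B}^i \bar{B}_i = 2 \cdot \frac{1}{2} + 1 = 2 = B^i B_i$$

and

$$\bar{B}^i \neq \bar{B}_i$$

You can check that we could have equally lowered the index through the use of $\bar{g}$ in the new frame,

$$\bar{B}_i = \bar{g}_{ij}\,\bar{B}^j$$

where

$$\bar{g}_{ij} = \frac{\partial x^k}{\partial \bar{x}^i}\frac{\partial x^p}{\partial \bar{x}^j}\,g_{kp}$$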
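The numbers in this example, including the final check with the transformed metric, can be reproduced in a few lines. The sketch below uses plain nested lists; the helper names are illustrative only.

```python
def matvec(m, v):
    """Multiply a square matrix (nested lists) by a vector (list)."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def transpose(m):
    """Transpose of a matrix stored as nested lists."""
    return [list(row) for row in zip(*m)]

def contract(p, q):
    """Sum over the products of matching components."""
    return sum(x * y for x, y in zip(p, q))

a = [[2.0, 0.0], [0.0, 1.0]]       # d(x_bar^i)/d(x^j) for the stretch
a_inv = [[0.5, 0.0], [0.0, 1.0]]   # d(x^i)/d(x_bar^j)
g = [[1.0, 0.0], [0.0, 1.0]]       # metric in the original Cartesian frame

A_con, A_cov = [3.0, 0.0], [3.0, 0.0]
B_con, B_cov = [1.0, 1.0], [1.0, 1.0]

# Contravariant components use a, covariant components use the inverse (transposed)
A_con_bar = matvec(a, A_con)
A_cov_bar = matvec(transpose(a_inv), A_cov)
B_con_bar = matvec(a, B_con)
B_cov_bar = matvec(transpose(a_inv), B_cov)

# Metric in the new frame: g_bar[i][j] = a_inv[k][i] * a_inv[p][j] * g[k][p]
g_bar = [[sum(a_inv[k][i] * a_inv[p][j] * g[k][p]
              for k in range(2) for p in range(2))
          for j in range(2)]
         for i in range(2)]

# Lowering the index of B in the new frame with the transformed metric
B_cov_from_metric = matvec(g_bar, B_con_bar)
```

Both routes to the covariant components of $\vec{B}$ in the stretched frame give $(1/2, 1)$, and the contractions $A^i A_i = 9$ and $B^i B_i = 2$ are the same in both frames.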


Calculus of Variations (reading material)

We shall now deal with the problem of finding the shortest distance between two points. On a flat (Euclidean) surface, the shortest path is the straight line that connects the two points. On the other hand, if the curve lies on a curved surface in 3D, there may be more than one solution, the so-called "geodesics". This kind of problem has important ramifications into many branches of physics.

We know that the length of a plane curve $C$ between two points with abscissas $x_0$ and $x_1$ is given by

$$s = \int_{x_0}^{x_1} \left[1 + \left(\frac{df}{dx}\right)^2\right]^{1/2} dx, \qquad C: f(x_0) = 0, \; f(x_1) = b$$

where $f(x)$ belongs to a class of functions with continuous first derivatives.

We shall now look for functions $f(x)$ in this class for which the arc length $s$ is stationary. This is called the "geodesic problem".

Note that the quantities that we vary are the functions $f(x)$. For this reason, in the calculus of variations, we deal with extremals of "functionals". A functional is a special "function" which depends on the changes of one or more functions taking on the role of the arguments. Thus, functionals map from the space of functions to $\mathbb{R}$. In our case, this special function is the arc length $s$. Thus, in the above example, we can think of $f(x)$ as the "independent variable".
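To make the idea of a functional concrete, the arc length above can be evaluated numerically for several candidate curves sharing the same end points. The sketch below approximates the integral by summing chord lengths of a fine polygon; the interval $[0, 1]$, the value $b = 1$ and the three trial curves are assumptions chosen for illustration.

```python
import math

def arc_length(f, x0, x1, n=2000):
    """Approximate the arc length of y = f(x) on [x0, x1] by a polygon with n chords."""
    h = (x1 - x0) / n
    total = 0.0
    for i in range(n):
        xa = x0 + i * h
        xb = xa + h
        total += math.hypot(xb - xa, f(xb) - f(xa))
    return total

b = 1.0  # assumed end value: every trial curve has f(0) = 0 and f(1) = b

def straight(x):
    return b * x

def parabola(x):
    return b * x * x

def wiggly(x):
    return b * x + 0.1 * math.sin(math.pi * x)

s_straight = arc_length(straight, 0.0, 1.0)
s_parabola = arc_length(parabola, 0.0, 1.0)
s_wiggly = arc_length(wiggly, 0.0, 1.0)
```

Each curve is mapped to a single number, and among these trial curves the straight line gives the smallest value, $\sqrt{2}$.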

We recall the problem of finding extrema of a function of several variables

$$f(x_1, x_2, x_3, \cdots, x_n), \qquad x_i \in \mathbb{R}.$$

We know that every continuous function in a bounded closed region $R$ attains its maximum and minimum values either in the interior or on the boundary of $R$ (Weierstrass Theorem). If the function is differentiable, a necessary (but not sufficient) condition for an extremum at an interior point is

$$\frac{\partial f}{\partial x_i} = 0.$$

However, in variational calculus we are not interested in finding the minima and maxima of a function. Instead, we want to find a function that minimises a given integral. Therefore, we are concerned with finding relative extremals of the functionals. That is, extremals relative to a certain neighbourhood of the functional arguments for which the functional takes on the extremal value.

Definition 1. We say that a function $f$ is in the neighbourhood of the function $g$ if

$$|f - g| < h, \qquad h > 0$$

for all values of $(x_1, x_2, x_3, \cdots, x_n)$, $x_i \in \mathbb{R}$.

Unlike in the case of functions of real variables, where extrema are guaranteed, the same is not true in the present case, because of the limitations that may be imposed on the class of admissible functions.

The Euler-Lagrange Equations

Consider the functional

$$J(f) = \int_a^b F\left(f, \frac{df}{dx}, x\right) dx$$

where $F\left(f, \dfrac{df}{dx}, x\right)$ has continuous second order derivatives and $a$ and $b$ are fixed given end-points.

Suppose now that $J(f)$ has an extremum value (maximum or minimum) when the path taken is

$$C_0: f = f_0(x), \qquad a \le x \le b$$

and consider a neighbouring path

$$C_\epsilon: f = f_\epsilon(x) = f_0(x) + \epsilon\,\eta(x), \qquad a \le x \le b$$

where $\epsilon$ is a small constant and $\eta(x)$ is an arbitrary differentiable function of $x$. Thus, the variation introduced to move to a neighbouring path is a "small" function $\epsilon\,\eta(x)$ that is added to $f_0$ to perturb it.

Thus, the value of $J(f)$ on the neighbouring path is given by

$$C_\epsilon: J(\epsilon) = \int_a^b F\left(f_0 + \epsilon\,\eta(x), \frac{d}{dx}\left(f_0 + \epsilon\,\eta(x)\right), x\right) dx$$
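The one-parameter family $J(\epsilon)$ can be explored numerically. The sketch below is an illustration under stated assumptions, none of which come from the notes: the integrand is taken to be $F = (df/dx)^2 + f^2$ on $[0, 1]$, the trial path is $f_0 = \sinh x / \sinh 1$ (with $f_0(0) = 0$, $f_0(1) = 1$), and the perturbation is $\eta = \sin \pi x$, which vanishes at both end points. The integral is evaluated with the midpoint rule.

```python
import math

def functional(f, df, a, b, n=2000):
    """Midpoint-rule value of the integral of F = (df/dx)^2 + f^2 over [a, b].
    The form of F is an assumption made for this illustration."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        total += (df(x) ** 2 + f(x) ** 2) * h
    return total

def f0(x):
    """Assumed trial path with f0(0) = 0 and f0(1) = 1."""
    return math.sinh(x) / math.sinh(1.0)

def df0(x):
    return math.cosh(x) / math.sinh(1.0)

def eta(x):
    """Perturbation that vanishes at both end points."""
    return math.sin(math.pi * x)

def deta(x):
    return math.pi * math.cos(math.pi * x)

def J_eps(eps):
    """Value of the functional on the neighbouring path f0 + eps * eta."""
    return functional(lambda x: f0(x) + eps * eta(x),
                      lambda x: df0(x) + eps * deta(x),
                      0.0, 1.0)

# Central-difference estimate of dJ/d(eps) at eps = 0
slope_at_zero = (J_eps(1e-3) - J_eps(-1e-3)) / 2e-3

J_unperturbed = J_eps(0.0)
J_plus = J_eps(0.1)
J_minus = J_eps(-0.1)
```

For this choice the slope at $\epsilon = 0$ is zero to within the quadrature error, and perturbing the path in either direction increases $J$, which is the behaviour expected of an extremal path.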

a necessary condition of extremality is given by

dJ(ǫ)

∣∣∣∣ǫ=0

= 0, ∀η(x)

that is, there is no better path in the neighborhood of f0. This is very similarto the standard maximization/minimization problems.

dJ(ǫ)

dǫ=

∫ b

a

η(x)

∂F(f0 + ǫη(x), df0

dx+ ǫdη

dx, x)

∂x+dη

dx

∂F(

f0 + ǫη(x), df0

dx+ ǫdη(x)

dx, x)

(df

dx

)

dx

Since we want

dJ(ǫ)

∣∣∣∣ǫ=0

= 0

then we need to set

∫ b

a

η(x)

∂F(f0,

df0

dx, x)

∂x+dη

dx

∂F(f0,

df0

dx, x)

(df

dx

)

dx = 0

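Integrate by parts ($\int U\,dV = UV - \int V\,dU$) the second term of the integrand:

$$\int_a^b \eta(x)\,\frac{\partial F}{\partial f}\,dx + \left[\eta(x)\,\frac{\partial F}{\partial\left(\frac{df}{dx}\right)}\right]_a^b - \int_a^b \eta(x)\,\frac{d}{dx}\frac{\partial F}{\partial\left(\frac{df}{dx}\right)}\,dx = 0$$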

or

$$\left[\eta(x)\,\frac{\partial F}{\partial\left(\frac{df}{dx}\right)}\right]_a^b + \int_a^b \eta(x)\left[\frac{\partial F}{\partial f} - \frac{d}{dx}\frac{\partial F}{\partial\left(\frac{df}{dx}\right)}\right] dx = 0$$

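Thus, we must have

$$\left[\eta(x)\,\frac{\partial F}{\partial\left(\frac{df}{dx}\right)}\right]_a^b = 0$$

and

$$\int_a^b \eta(x)\left[\frac{\partial F}{\partial f} - \frac{d}{dx}\frac{\partial F}{\partial\left(\frac{df}{dx}\right)}\right] dx = 0$$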

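or

$$\eta(b)\left.\frac{\partial F}{\partial\left(\frac{df}{dx}\right)}\right|_{b} - \eta(a)\left.\frac{\partial F}{\partial\left(\frac{df}{dx}\right)}\right|_{a} = 0$$

and

$$\int_a^b \eta(x)\left[\frac{\partial F}{\partial f} - \frac{d}{dx}\frac{\partial F}{\partial\left(\frac{df}{dx}\right)}\right] dx = 0$$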

The integral condition, which must be satisfied for all functions $\eta(x)$ belonging to the chosen class of functions, leads to the fundamental equation that determines the extremal path:

$$\frac{\partial F}{\partial f} - \frac{d}{dx}\frac{\partial F}{\partial\left(\frac{df}{dx}\right)} = 0$$

with F evaluated at $\left(f_0, \frac{df_0}{dx}, x\right)$, which is called the “Euler-Lagrange” equation.

The boundary term depends on whether both ends of the curve, f(a) and f(b), are prescribed, or one of them is prescribed and the other varies. This gives rise to the following boundary cases for the Euler-Lagrange equation.

1. If f(a) and f(b) are both prescribed, then there is no variation of the end points (fixed end-points) so that $\eta(a) = \eta(b) = 0$.

2. If f(a) is prescribed and f(b) is variable, then one must impose

$$\left.\frac{\partial F}{\partial\left(\frac{df}{dx}\right)}\right|_{b} = 0, \qquad \eta(a) = 0$$

3. If f(b) is prescribed and f(a) is variable, then one must impose

$$\left.\frac{\partial F}{\partial\left(\frac{df}{dx}\right)}\right|_{a} = 0, \qquad \eta(b) = 0$$

4. If neither of the end-points is prescribed, then $\eta(a)$ and $\eta(b)$ are arbitrary, so that one must impose

$$\left.\frac{\partial F}{\partial\left(\frac{df}{dx}\right)}\right|_{a} = 0, \qquad \left.\frac{\partial F}{\partial\left(\frac{df}{dx}\right)}\right|_{b} = 0$$

The constraint

$$\frac{\partial F}{\partial\left(\frac{df}{dx}\right)} = 0$$

is called the “transversality condition”.

The equations we have just seen can be generalised to functionals that depend on several functions $f_1, f_2, f_3, \cdots, f_n$ of the variable x. In this case we shall have n Euler-Lagrange equations

$$\frac{\partial F}{\partial f_i} - \frac{d}{dx}\frac{\partial F}{\partial\left(\frac{df_i}{dx}\right)} = 0 \qquad \text{for } i = 1, 2, \cdots, n$$

with up to a maximum of 2n transversality conditions

$$\frac{\partial F}{\partial\left(\frac{df_i}{dx}\right)} = 0$$

at any end point where $f_i$ is not prescribed.

Example: The Brachistochrone problem

Determine the plane curve of quickest descent of a particle moving on a surface under gravity between two fixed points (0, 0) and (a, b).

Solution:

Consider a coordinate system with the origin at (0, 0) and the x-axis directed downward. For zero total energy E we get

$$E = \frac{1}{2}mv^2 - mgx = 0 \quad\Rightarrow\quad v = \sqrt{2gx}$$

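An element of distance traversed by the particle is given by:

$$ds = \sqrt{(dx)^2 + (dy)^2} = \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx$$

But

$$v = \frac{ds}{dt} \quad\Rightarrow\quad dt = \frac{ds}{v} = \frac{ds}{\sqrt{2gx}}$$

Thus, since x runs from 0 to a along the path,

$$t = \int \frac{ds}{v} = \frac{1}{(2g)^{1/2}} \int_0^a \left[\frac{1 + \left(\frac{dy}{dx}\right)^2}{x}\right]^{1/2} dx$$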

Different functions y(x) will give different values for t. We call t a functional of y(x). Our problem is to find the minimum of this functional with respect to possible functions y(x). That is, here we have

$$F\left(y, \frac{dy}{dx}, x\right) = \left[\frac{1 + \left(\frac{dy}{dx}\right)^2}{x}\right]^{1/2}$$

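In this example, F does not depend explicitly on y so that

$$\frac{\partial F}{\partial y} = 0$$

and the Euler-Lagrange equation gives

$$0 - \frac{d}{dx}\frac{\partial F}{\partial\left(\frac{dy}{dx}\right)} = 0 \quad\Rightarrow\quad \frac{\partial F}{\partial\left(\frac{dy}{dx}\right)} = C$$

$$\Rightarrow\quad \frac{\partial}{\partial\left(\frac{dy}{dx}\right)}\left[\frac{1 + \left(\frac{dy}{dx}\right)^2}{x}\right]^{1/2} = C$$

$$\Rightarrow\quad \frac{\frac{dy}{dx}}{\left[x\left(1 + \left(\frac{dy}{dx}\right)^2\right)\right]^{1/2}} = C$$

$$\Rightarrow\quad \frac{dy}{dx} = \left[\frac{C^2 x}{1 - C^2 x}\right]^{1/2}$$

$$\Rightarrow\quad y = \int \left[\frac{x}{A - x}\right]^{1/2} dx, \qquad A = \frac{1}{C^2}$$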

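Use the trigonometric substitution $x = \frac{A}{2}(1 - \cos\theta) = A\sin^2\left(\frac{\theta}{2}\right)$, for which $dx = A\sin\left(\frac{\theta}{2}\right)\cos\left(\frac{\theta}{2}\right)d\theta$, to get

$$y = \int \sqrt{\frac{\sin^2\left(\frac{\theta}{2}\right)}{1 - \sin^2\left(\frac{\theta}{2}\right)}}\; A\sin\left(\frac{\theta}{2}\right)\cos\left(\frac{\theta}{2}\right) d\theta = A\int \sin^2\left(\frac{\theta}{2}\right) d\theta = \frac{A}{2}(\theta - \sin\theta) + B$$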

To determine the constant of integration B we let θ = 0 at x = 0. But since at x = 0 we also have y = 0, then B = 0. The constant A can be determined by requiring that the curve pass through (a, b), that is, for the value of θ reached at the end point:

$$b = \frac{A}{2}(\theta - \sin\theta), \quad\text{and}\quad a = \frac{A}{2}(1 - \cos\theta)$$

Thus, the general solution in parametrised form is

$$x = \frac{A}{2}(1 - \cos\theta)$$

$$y = \frac{A}{2}(\theta - \sin\theta)$$

The path is a part of a cycloid.
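The cycloid result can be checked numerically. The sketch below is an illustration only, not part of the original notes: it assumes g = 9.81 m/s² and an end point (a, b) = (1, π/2) chosen for convenience, solves the two end-point conditions for A and the final parameter value by bisection, and compares the descent time with that along a straight chute between the same points. The function names are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def cycloid_parameters(a, b):
    """Solve a = (A/2)(1 - cos t), b = (A/2)(t - sin t) for (A, t) by bisection."""
    target = b / a

    def ratio(t):
        return (t - math.sin(t)) / (1.0 - math.cos(t))

    # ratio(t) increases monotonically from 0 to infinity on (0, 2*pi)
    lo, hi = 1e-3, 2.0 * math.pi - 1e-3
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    theta_end = 0.5 * (lo + hi)
    A = 2.0 * a / (1.0 - math.cos(theta_end))
    return A, theta_end

def cycloid_time(a, b):
    # On the cycloid ds = A sin(t/2) dt and v = sqrt(2 g A) sin(t/2),
    # so the elapsed time per unit parameter is the constant sqrt(A / (2 g)).
    A, theta_end = cycloid_parameters(a, b)
    return theta_end * math.sqrt(A / (2.0 * G))

def straight_line_time(a, b):
    # Uniform acceleration g * a / length along a straight chute (x points downward)
    length = math.hypot(a, b)
    return length * math.sqrt(2.0 / (G * a))

a, b = 1.0, math.pi / 2.0
A, theta_end = cycloid_parameters(a, b)
t_cycloid = cycloid_time(a, b)
t_line = straight_line_time(a, b)
print(A, theta_end, t_cycloid, t_line)
```

For this end point the cycloid parameter reaches θ = π with A = 1, and the cycloid time comes out shorter than the straight-line time, as the variational argument predicts.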

Example

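Find the curve that minimises

$$J = \int_0^1 \left[\left(\frac{df}{dx}\right)^2 + 1\right] dx$$

where f(0) = 1 and f(1) is free.

Solution:

Here

$$F\left(f, \frac{df}{dx}, x\right) = \left(\frac{df}{dx}\right)^2 + 1$$

$$\left.\frac{\partial F}{\partial\left(\frac{df}{dx}\right)}\right|_{x=1} = 0 \quad\text{(from the transversality condition)}$$

and

$$\frac{\partial F}{\partial f} = 0, \qquad \frac{\partial F}{\partial\left(\frac{df}{dx}\right)} = 2\left(\frac{df}{dx}\right)$$

Use the Euler-Lagrange equation

$$0 - \frac{d}{dx}\left[2\left(\frac{df}{dx}\right)\right] = 0 \quad\Rightarrow\quad \frac{d^2 f}{dx^2} = 0 \quad\Rightarrow\quad \frac{df}{dx} = C$$

Since

$$\left.\frac{\partial F}{\partial\left(\frac{df}{dx}\right)}\right|_{x=1} = 2\left.\left(\frac{df}{dx}\right)\right|_{x=1} = 0$$

from the transversality condition, the integration constant above is C = 0, and further integration will yield

$$f(x) = D$$

Since f(0) = 1, then D = 1 and the solution is

$$f(x) = 1, \qquad 0 \le x \le 1$$

With this function, we get J = 1, which is clearly a minimum, since the integrand can never be smaller than 1.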
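The extremality condition dJ/dε = 0 at ε = 0 can be illustrated for this example with a short numerical sketch. It is not part of the original notes: the perturbation η(x) = sin x (which satisfies η(0) = 0 and leaves f(1) free) and the midpoint quadrature are assumptions made only for illustration.

```python
import math

def J(f_prime, n=2000):
    """Midpoint-rule approximation of J = int_0^1 [(df/dx)^2 + 1] dx."""
    h = 1.0 / n
    return sum((f_prime((i + 0.5) * h) ** 2 + 1.0) * h for i in range(n))

def J_of_eps(eps):
    # Neighbouring path f = 1 + eps * eta with eta(x) = sin(x), so df/dx = eps * cos(x)
    return J(lambda x: eps * math.cos(x))

J0 = J_of_eps(0.0)
step = 1e-3
# Central-difference estimate of dJ/d(eps) at eps = 0
dJ_deps = (J_of_eps(step) - J_of_eps(-step)) / (2.0 * step)
J_perturbed = J_of_eps(0.1)
print(J0, dJ_deps, J_perturbed)
```

The derivative estimate vanishes at ε = 0 while any non-zero ε raises J above 1, consistent with f(x) = 1 being the minimising curve.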


Hamilton’s Principle

Principle of least Action

Of all possible paths along which a dynamical system may move from one point to another, within a specified time interval, and consistent with any constraints, the actual path followed is that which minimises the integral with respect to time of the difference between the kinetic and potential energies. This is also called the principle of least action.

$$J = \int_{t_1}^{t_2} (T - V)\,dt = \int_{t_1}^{t_2} L\,dt$$

$$L = T - V$$

where T and V are the kinetic and potential energies respectively and L is the Lagrangian. The Lagrangian contains all the information about the system and the forces that are acting on it. We can easily see that these lead to Newton's equations by looking at the 1-D motion along the x-axis. Express the gravitational force field as $\vec{f} = -\nabla V$ and write, with a dot denoting d/dt,

$$T = \frac{1}{2}m\dot{x}^2$$

$$V(x) = -\int_{x_0}^{x} f(x)\,dx$$

$$L = T - V = \frac{1}{2}m\dot{x}^2 + \int_{x_0}^{x} f(x)\,dx$$

$$J = \int_{t_0}^{t_1} L(x, \dot{x})\,dt \qquad \text{the “Action”}$$

It follows that the optimal path must satisfy (see notes on “Calculus of Variations”)

$$\frac{\partial L}{\partial x} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}}\right) = 0$$

or

$$f(x) - \frac{d}{dt}(m\dot{x}) = 0 \qquad \text{Newton's Law of Motion}$$

One can similarly verify that for general 3-D motion with respect to a rectangular coordinate system, $L(x, y, z, \dot{x}, \dot{y}, \dot{z})$, the 3D Euler-Lagrangian equations lead to Newton's Laws of motion in the 3 coordinate directions. The same is true for a generalised coordinate system (e.g. spherical, polar).

The power of Hamilton's principle is that it expresses a fundamental physical principle in a covariant (coordinate independent) form.

Example: Simple Pendulum

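$$T = \frac{1}{2}m(l\dot{\Phi})^2$$

$$V = mgl(1 - \cos\Phi)$$

$$L(\Phi, \dot{\Phi}) = \frac{1}{2}m(l\dot{\Phi})^2 - mgl(1 - \cos\Phi)$$

where m is the mass of the bob, l is the length of the rod, g is the gravitational acceleration, Φ is the angle made by the rod with the vertical and $\dot{\Phi}$ is the angular speed of the bob. The Euler-Lagrangian equations in the Φ coordinate are

$$\frac{\partial L}{\partial \Phi} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\Phi}}\right) = 0$$

$$-mgl\sin\Phi - \frac{d}{dt}(ml^2\dot{\Phi}) = 0$$

$$\ddot{\Phi} + \frac{g}{l}\sin\Phi = 0$$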
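As an illustration of the pendulum equation of motion, the following sketch integrates it with a classical fourth-order Runge-Kutta step and monitors the total energy T + V per unit mass, which should stay constant. The numerical values (g = 9.81 m/s², l = 1 m, release from rest at 1 rad) and the helper names are assumptions chosen for the example, not taken from the notes.

```python
import math

G = 9.81   # gravitational acceleration in m/s^2 (assumed value)
ELL = 1.0  # rod length in m (assumed value)

def rhs(phi, omega):
    """First-order form of  phi'' + (g/l) sin(phi) = 0  with omega = phi'."""
    return omega, -(G / ELL) * math.sin(phi)

def rk4_step(phi, omega, dt):
    k1p, k1w = rhs(phi, omega)
    k2p, k2w = rhs(phi + 0.5 * dt * k1p, omega + 0.5 * dt * k1w)
    k3p, k3w = rhs(phi + 0.5 * dt * k2p, omega + 0.5 * dt * k2w)
    k4p, k4w = rhs(phi + dt * k3p, omega + dt * k3w)
    phi += dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
    omega += dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6.0
    return phi, omega

def energy_per_mass(phi, omega):
    # (T + V)/m = (1/2)(l * omega)^2 + g * l * (1 - cos(phi))
    return 0.5 * (ELL * omega) ** 2 + G * ELL * (1.0 - math.cos(phi))

phi, omega = 1.0, 0.0            # released from rest at 1 rad
e_start = energy_per_mass(phi, omega)
max_angle = abs(phi)
for _ in range(10000):           # 10 s of motion with dt = 1 ms
    phi, omega = rk4_step(phi, omega, 1e-3)
    max_angle = max(max_angle, abs(phi))
e_end = energy_per_mass(phi, omega)
print(e_start, e_end, max_angle)
```

The energy drift over the run is at the level of the integrator's truncation error, and the swing never exceeds the release angle.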

In mechanical problems, where generalised coordinates $q_1, q_2, \cdots, q_n$ are used to specify a system, $L(q_1, \cdots, q_n, \dot{q}_1, \cdots, \dot{q}_n, t)$, the quantities

$$p_i = \frac{\partial L}{\partial \dot{q}_i}$$

are called the generalised momenta and the Euler-Lagrangian equations take the form

$$\frac{\partial L}{\partial q_i} - \frac{dp_i}{dt} = 0, \qquad i = 1, \cdots, n$$

If a coordinate $q_k$ does not appear in the Lagrangian, the coordinate is said to be ignorable, and the corresponding generalised momentum $p_k$ is then conserved.

Thus, in the pendulum example we have just seen, the generalised coordinate is Φ, the generalised speed is $\dot{\Phi}$, the generalised momentum is

$$p_\Phi = \frac{\partial L}{\partial \dot{\Phi}} = ml^2\dot{\Phi} \qquad \text{(in fact, the angular momentum)}$$

and the generalised force is

$$\frac{\partial L}{\partial \Phi} = -mgl\sin\Phi \qquad \text{(in fact, the moment of force)}$$

Note that the generalised speed, momentum and force do not have the usual dimensions of their standard counterparts.

Application to special relativity

[Figure: a path x(t) from A to B in the (x, t) plane, together with a neighbouring path x(t) + εη(t).]

The position of a particle in space-time is

$$x^i = (ct, x, y, z) = (x^0, x^1, x^2, x^3)$$

where

$$\vec{r} = (x, y, z) = (x^1, x^2, x^3)$$

is its position in 3-D space. In special relativity, the space-time interval

$$c^2 d\tau^2 = ds^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2 = (dx^0)^2 - (dx^1)^2 - (dx^2)^2 - (dx^3)^2$$

is an invariant (i.e. the same for all inertial observers), with the speed of light embodied in this assumption and where τ is the proper time measured by a clock moving with the particle. We can show that the equations of SR can be derived if we postulate that free particles move so as to extremise the space-time interval ds.

Now consider a free-falling particle (that is, no forces acting on it) with position vector $\vec{r} = \vec{r}(t)$. In order to find its free-falling path, we need to optimise the space-time path length between the initial (1) and final (2) points along its path. That is, we need to extremise the integral

$$I = \int_1^2 \sqrt{dt^2 - \frac{1}{c^2}(dx^2 + dy^2 + dz^2)}$$

If we use the proper time τ along the path for describing the position of the particle (i.e. x(τ), y(τ), z(τ)), then

$$I = \int_1^2 \sqrt{\dot{t}^2 - \frac{1}{c^2}(\dot{x}^2 + \dot{y}^2 + \dot{z}^2)}\;d\tau = \int_1^2 \sqrt{2L}\,d\tau$$

where $\dot{x} = \frac{dx}{d\tau}$, etc. and where L is defined by

$$L = \frac{1}{2}\underbrace{\left[\dot{t}^2 - \frac{1}{c^2}(\dot{x}^2 + \dot{y}^2 + \dot{z}^2)\right]}_{(d\tau/d\tau)^2} = \frac{1}{2}\left(\frac{d\tau}{d\tau}\right)^2 = \frac{1}{2}$$

by the definition of proper time. Because of the appearance of the kinetic energy in L, it is referred to as a generalised Lagrangian in special relativity (note that V = 0 because there is no force). The Euler-Lagrange equations for the 4D path $[t(\tau), x(\tau), y(\tau), z(\tau)]$ are

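$$\frac{d}{d\tau}\left(\frac{\partial\sqrt{2L}}{\partial\dot{t}}\right) - \frac{\partial\sqrt{2L}}{\partial t} = 0$$

$$\frac{d}{d\tau}\left(\frac{\partial\sqrt{2L}}{\partial\dot{x}}\right) - \frac{\partial\sqrt{2L}}{\partial x} = 0$$

$$\frac{d}{d\tau}\left(\frac{\partial\sqrt{2L}}{\partial\dot{y}}\right) - \frac{\partial\sqrt{2L}}{\partial y} = 0$$

$$\frac{d}{d\tau}\left(\frac{\partial\sqrt{2L}}{\partial\dot{z}}\right) - \frac{\partial\sqrt{2L}}{\partial z} = 0$$

or, since L is constant along the path,

$$\frac{1}{\sqrt{2L}}\left[\frac{d}{d\tau}\left(\frac{\partial L}{\partial\dot{t}}\right) - \frac{\partial L}{\partial t}\right] = 0$$

$$\frac{1}{\sqrt{2L}}\left[\frac{d}{d\tau}\left(\frac{\partial L}{\partial\dot{x}}\right) - \frac{\partial L}{\partial x}\right] = 0$$

etc.

Hence

$$\frac{d}{d\tau}(\dot{t}) = 0, \qquad \frac{d}{d\tau}(\dot{x}) = 0, \qquad \frac{d}{d\tau}(\dot{y}) = 0, \qquad \frac{d}{d\tau}(\dot{z}) = 0$$

whose solutions are

$$\dot{t} = \gamma \ \text{(constant)}, \qquad \dot{x} = e \ \text{(constant)}, \qquad \dot{y} = f \ \text{(constant)}, \qquad \dot{z} = g \ \text{(constant)}$$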

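It follows that the free-falling particle must move with constant velocity $\vec{v}$ whose components (in any inertial reference system) are given by

$$v_x = \frac{dx}{dt} = \frac{\dot{x}}{\gamma}, \qquad v_y = \frac{dy}{dt} = \frac{\dot{y}}{\gamma}, \qquad v_z = \frac{dz}{dt} = \frac{\dot{z}}{\gamma}$$

Using the constraint $L = \frac{1}{2}$ and $\dot{x} = \gamma v_x$, $\dot{y} = \gamma v_y$, $\dot{z} = \gamma v_z$, we obtain

$$\frac{1}{2}\left[\dot{t}^2 - \frac{1}{c^2}(\dot{x}^2 + \dot{y}^2 + \dot{z}^2)\right] = \frac{1}{2}\left[\gamma^2 - \gamma^2\frac{v^2}{c^2}\right] = \frac{1}{2}$$

so

$$\gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}$$

$$\frac{dt}{d\tau} = \gamma \quad\Rightarrow\quad \Delta\tau = \Delta t\,\sqrt{1 - \frac{v^2}{c^2}} \qquad \text{(time dilation in SR)}$$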
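A quick numerical check of these relations, given here only as an illustrative sketch: for an assumed speed v = 0.6c along x, it evaluates γ, confirms that the Lagrangian takes the value 1/2 when the τ-derivatives are γ and γv, and gives the proper time elapsed per second of coordinate time. The function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def lorentz_gamma(v):
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def lagrangian(t_dot, x_dot, y_dot, z_dot):
    """L = (1/2)[t_dot^2 - (x_dot^2 + y_dot^2 + z_dot^2)/c^2], dots meaning d/d(tau)."""
    return 0.5 * (t_dot ** 2 - (x_dot ** 2 + y_dot ** 2 + z_dot ** 2) / C ** 2)

v = 0.6 * C                      # assumed speed, directed along x
gamma = lorentz_gamma(v)
L_value = lagrangian(gamma, gamma * v, 0.0, 0.0)
proper_time = 1.0 / gamma        # proper time elapsed during 1 s of coordinate time
print(gamma, L_value, proper_time)
```

For v = 0.6c this gives γ = 1.25, L = 1/2 and a moving clock that advances 0.8 s per coordinate second.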

These results are well known in special relativity and are valid in a “flat” spacetime with no matter.

The whole of GR can also be derived on the assumption that free particles move so as to extremise the space-time interval, but in GR the expression for $ds^2$ is much more complicated.

In the next sections we shall develop the tools that we will need for an introduction to General Relativity and its application to cosmology.

Geodesics

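[Figure: a geodesic joining the points A and B, together with a slightly distorted curve between the same end points.]

Consider a curve $\vec{x}(\lambda)$ in $R^n$ with metric

$$ds^2 = g_{\alpha\beta}\,dx^\alpha dx^\beta$$

The arc length between fixed end points A and B is

$$\int_A^B ds = \int_{\lambda_A}^{\lambda_B} \sqrt{g_{\alpha\beta}\frac{dx^\alpha}{d\lambda}\frac{dx^\beta}{d\lambda}}\;d\lambda = \int_{\lambda_A}^{\lambda_B} \sqrt{2L}\,d\lambda \qquad (18)$$

where

$$L = \frac{1}{2}g_{\alpha\beta}\frac{dx^\alpha}{d\lambda}\frac{dx^\beta}{d\lambda} = \frac{1}{2}g_{\alpha\beta}\dot{x}^\alpha\dot{x}^\beta$$

is called the Lagrangian. Dots are used for $\frac{d}{d\lambda}$. Note that if $ds^2 = c^2 d\tau^2$ (τ = proper time), then $\dot{x}^\alpha$ represents a velocity in the reference system of the free-falling particle and $\frac{1}{2}g_{\alpha\beta}\dot{x}^\alpha\dot{x}^\beta$ becomes the kinetic energy of the particle. The potential energy is V = 0, because the particle is free-falling.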

If the curve is to be a geodesic, its length is to be stationary against any small variation (e.g. see the slightly distorted curve in the figure). Examples of geodesics are straight lines in a plane and great circles on a sphere. Note that in the latter case, geodesics could give the shortest or longest distance. In what follows we choose the parameter λ to be an affine parameter, that is, one for which L is constant along the geodesic curve. Thus, if we take λ = s, this guarantees $L = \frac{1}{2}$ throughout, which simplifies the final result. Such a choice has two advantages:

1. The geodesic equations simplify to the affine or Euler form

$$\frac{d}{ds}\left(\frac{\partial L}{\partial\dot{x}^\alpha}\right) - \frac{\partial L}{\partial x^\alpha} = 0 \qquad (19)$$

2. $L = \frac{1}{2}$ is always available as a first integral of these equations, which is of great help in their solution.

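Let’s now re-write the Euler-Lagrangian equations. We have

$$L = \frac{1}{2}g_{\beta\gamma}\dot{x}^\beta\dot{x}^\gamma$$

Thus

$$\frac{\partial L}{\partial\dot{x}^\alpha} = \frac{1}{2}g_{\beta\gamma}\left(\frac{\partial\dot{x}^\beta}{\partial\dot{x}^\alpha}\right)\dot{x}^\gamma + \frac{1}{2}g_{\beta\gamma}\dot{x}^\beta\left(\frac{\partial\dot{x}^\gamma}{\partial\dot{x}^\alpha}\right)$$

$$= \frac{1}{2}g_{\beta\gamma}\delta^\beta_\alpha\dot{x}^\gamma + \frac{1}{2}g_{\beta\gamma}\delta^\gamma_\alpha\dot{x}^\beta$$

$$= \frac{1}{2}g_{\alpha\gamma}\dot{x}^\gamma + \frac{1}{2}g_{\beta\alpha}\dot{x}^\beta$$

$$= \frac{1}{2}g_{\alpha\gamma}\dot{x}^\gamma + \frac{1}{2}g_{\alpha\beta}\dot{x}^\beta \qquad \text{since } g_{ij} = g_{ji}$$

$$= g_{\alpha\beta}\dot{x}^\beta$$

The quantity

$$p_\alpha = \frac{\partial L}{\partial\dot{x}^\alpha}$$

is the canonically conjugate momentum to the coordinate $x^\alpha$. Similarly,

$$\frac{\partial L}{\partial x^\alpha} = \frac{1}{2}\frac{\partial g_{\beta\gamma}}{\partial x^\alpha}\dot{x}^\beta\dot{x}^\gamma$$

Hence, the Euler-Lagrangian equations become

$$\frac{d}{ds}\left(g_{\alpha\beta}\frac{dx^\beta}{ds}\right) - \frac{1}{2}\frac{\partial g_{\beta\gamma}}{\partial x^\alpha}\frac{dx^\beta}{ds}\frac{dx^\gamma}{ds} = 0$$

where we have re-introduced $\frac{d}{ds}$.

Note that these correspondences and the justification of the terminology (Lagrangian, momenta, etc.) will become apparent when we identify geodesics with the trajectories of free-falling particles in general relativity.

These differential equations ($\alpha = 1, 2, \cdots, n$) must be satisfied by the functions $x^\alpha(s)$ of arc-length s along a geodesic provided $ds \neq 0$ (the case of null geodesics must be treated separately).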

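Euler-Lagrangian equation and Christoffel symbols

$$\frac{d}{ds}\left(g_{\alpha\beta}\frac{dx^\beta}{ds}\right) - \frac{1}{2}\frac{\partial g_{\beta\gamma}}{\partial x^\alpha}\frac{dx^\beta}{ds}\frac{dx^\gamma}{ds} = 0 \qquad \text{Euler-Lagrangian}$$

$$\frac{\partial g_{\alpha\beta}}{\partial x^\gamma}\frac{dx^\gamma}{ds}\frac{dx^\beta}{ds} + g_{\alpha\beta}\frac{d^2x^\beta}{ds^2} - \frac{1}{2}\frac{\partial g_{\beta\gamma}}{\partial x^\alpha}\frac{dx^\beta}{ds}\frac{dx^\gamma}{ds} = 0$$

$$g_{\alpha\beta}\frac{d^2x^\beta}{ds^2} + \left(\frac{\partial g_{\alpha\beta}}{\partial x^\gamma} - \frac{1}{2}\frac{\partial g_{\beta\gamma}}{\partial x^\alpha}\right)\frac{dx^\beta}{ds}\frac{dx^\gamma}{ds} = 0$$

Now, since

$$\frac{\partial g_{\alpha\beta}}{\partial x^\gamma} = \frac{1}{2}\left[\frac{\partial g_{\alpha\beta}}{\partial x^\gamma} + \frac{\partial g_{\beta\alpha}}{\partial x^\gamma}\right] \qquad \text{since } g_{ij} = g_{ji}$$

then

$$g_{\alpha\beta}\frac{d^2x^\beta}{ds^2} + \left[\frac{1}{2}\frac{\partial g_{\alpha\beta}}{\partial x^\gamma} + \frac{1}{2}\frac{\partial g_{\beta\alpha}}{\partial x^\gamma} - \frac{1}{2}\frac{\partial g_{\beta\gamma}}{\partial x^\alpha}\right]\frac{dx^\beta}{ds}\frac{dx^\gamma}{ds} = 0$$

Since β and γ are dummy indices, they can be interchanged

$$\frac{\partial g_{\alpha\beta}}{\partial x^\gamma}\frac{dx^\beta}{ds}\frac{dx^\gamma}{ds} = \frac{\partial g_{\alpha\gamma}}{\partial x^\beta}\frac{dx^\gamma}{ds}\frac{dx^\beta}{ds}$$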


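so we get

$$g_{\alpha\beta}\frac{d^2x^\beta}{ds^2} + \frac{1}{2}\left[\frac{\partial g_{\beta\alpha}}{\partial x^\gamma} + \frac{\partial g_{\gamma\alpha}}{\partial x^\beta} - \frac{\partial g_{\beta\gamma}}{\partial x^\alpha}\right]\frac{dx^\beta}{ds}\frac{dx^\gamma}{ds} = 0. \qquad (20)$$

The quantity

$$\Gamma_{\beta\gamma\alpha} = [\beta\gamma, \alpha] = \frac{1}{2}\left[\frac{\partial g_{\beta\alpha}}{\partial x^\gamma} + \frac{\partial g_{\gamma\alpha}}{\partial x^\beta} - \frac{\partial g_{\beta\gamma}}{\partial x^\alpha}\right]$$

is called the Christoffel symbol of the first kind.

Now we introduce the notation

$$g_{ij,k} = \frac{\partial g_{ij}}{\partial x^k}$$

so that

$$\Gamma_{\beta\gamma\alpha} = [\beta\gamma, \alpha] = \frac{1}{2}\left[g_{\beta\alpha,\gamma} + g_{\gamma\alpha,\beta} - g_{\beta\gamma,\alpha}\right]$$

Consider again

$$g_{\alpha\beta}\frac{d^2x^\beta}{ds^2} + \frac{1}{2}\left[g_{\beta\alpha,\gamma} + g_{\gamma\alpha,\beta} - g_{\beta\gamma,\alpha}\right]\frac{dx^\beta}{ds}\frac{dx^\gamma}{ds} = 0.$$

Multiply by $g^{\sigma\alpha}$ noting that $g^{\sigma\alpha}g_{\alpha\beta} = \delta^\sigma_\beta$

$$g^{\sigma\alpha}g_{\alpha\beta}\frac{d^2x^\beta}{ds^2} + \frac{1}{2}g^{\sigma\alpha}\left[g_{\beta\alpha,\gamma} + g_{\gamma\alpha,\beta} - g_{\beta\gamma,\alpha}\right]\frac{dx^\beta}{ds}\frac{dx^\gamma}{ds} = 0$$

$$\frac{d^2x^\sigma}{ds^2} + \frac{1}{2}g^{\sigma\alpha}\left[g_{\beta\alpha,\gamma} + g_{\gamma\alpha,\beta} - g_{\beta\gamma,\alpha}\right]\frac{dx^\beta}{ds}\frac{dx^\gamma}{ds} = 0$$

So, the Euler-Lagrangian equations become

$$\frac{d^2x^\sigma}{ds^2} + \Gamma^\sigma_{\beta\gamma}\frac{dx^\beta}{ds}\frac{dx^\gamma}{ds} = 0 \qquad (21)$$

where

$$\Gamma^\sigma_{\beta\gamma} = \begin{Bmatrix}\sigma \\ \beta\gamma\end{Bmatrix} = g^{\sigma\alpha}[\beta\gamma, \alpha] = g^{\sigma\alpha}\Gamma_{\beta\gamma\alpha}$$

is called the Christoffel symbol of the second kind. Explicitly

$$\begin{Bmatrix}\sigma \\ \beta\gamma\end{Bmatrix} = \frac{1}{2}g^{\sigma\alpha}\left[g_{\beta\alpha,\gamma} + g_{\gamma\alpha,\beta} - g_{\beta\gamma,\alpha}\right].$$

As you can see, $\Gamma^\sigma_{\beta\gamma} = \Gamma^\sigma_{\gamma\beta}$. Note the following about the geodesic equations

$$\frac{d^2x^\sigma}{ds^2} + \Gamma^\sigma_{\beta\gamma}\frac{dx^\beta}{ds}\frac{dx^\gamma}{ds} = 0$$

1. These are ordinary second order differential equations and the solutions are unique when the functions and the first derivatives are prescribed arbitrarily at a starting point.

2. That is, at every point in space, there exists a unique geodesic with an arbitrarily prescribed initial tangent

$$T^\alpha = \frac{dx^\alpha}{ds}$$

3. If there is a unique solution curve passing through two points in space, this curve is the shortest length joining the two points.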

First integrals of the equations

In addition to

$$L = \text{constant}$$

there may be other first integrals of the Euler equations. Generally the metric is a function of the position. In some applications, the spacetime may have some kind of symmetry which allows us to choose coordinates in which the metric is independent of one of the coordinates. Let’s say that such a coordinate is u. The Euler-Lagrangian equation corresponding to the coordinate u then simplifies to

$$\frac{d}{ds}\left(\frac{\partial L}{\partial\dot{u}}\right) = 0$$

which may be immediately integrated to

$$\frac{\partial L}{\partial\dot{u}} = \text{constant along the geodesic.}$$

A coordinate u of this kind is often known as ignorable. The corresponding first integral is a momentum (in a generalised sense). Depending on the application, such momenta may appear in a variety of ways: linear momentum, angular momentum, energy, etc.

Parallel Displacement

When a vector is moved from one point to another without changing its magnitude or direction, we say it is parallel displaced. Such an idea is perfectly clear in a flat Euclidean space.

However, if spacetime is not Euclidean it is not possible to compare directions in any unambiguous way, since “parallel” loses its meaning over an extended region.

For any curve $x^\alpha(s)$, define

$$T^\alpha = \frac{dx^\alpha}{ds}$$

This is a vector that is tangent to the curve and is in fact a unit tangent since

$$\vec{T}\cdot\vec{T} = g_{\alpha\beta}\frac{dx^\alpha}{ds}\frac{dx^\beta}{ds} = 1$$

The geodesic equations can be recast as follows

$$\frac{dT^\alpha}{ds} + \Gamma^\alpha_{\sigma\tau}T^\sigma T^\tau = 0 \qquad (22)$$

so a geodesic becomes a very special curve along which the unit tangent vector evolves according to the above equation. Along a geodesic we expect a tangent to remain “parallel” to itself. In fact, the geodesic equation can be used to define what is meant by parallel displacement in $R^n$.

Relationship to space-time: the geodesic principle

The assumption of the invariance of the space-time interval $ds^2$ in special relativity is assumed to carry over to general relativity.

• Events occur at points in a (3 + 1) dimensional space-time. The 4-D space-time is curved in the presence of matter and is specified by a metric tensor $g_{ij}(x^\alpha)$.

• The worldline of a free (inertial) observer is to be a geodesic, that is, the straightest possible worldline. Freely falling objects will always follow geodesics, which could be curved if there is matter. To follow any other path requires an external force (not gravity). The clock of such an observer registers the space-time interval measured along the world-line, namely, the proper time.

The above is referred to as the geodesic principle and is essentially a hypothesis.

Note that since the geodesic principle is stated in terms of the invariant ds, it is a covariant equation, which means that it is independent of the specific coordinate frames, as laws of physics ought to be.

The expression

$$A^\alpha = \frac{d^2x^\alpha}{ds^2} + \Gamma^\alpha_{\beta\gamma}\frac{dx^\beta}{ds}\frac{dx^\gamma}{ds} \qquad\text{or}\qquad A^\alpha = \ddot{x}^\alpha + \begin{Bmatrix}\alpha \\ \beta\gamma\end{Bmatrix}\dot{x}^\beta\dot{x}^\gamma$$

is zero along a geodesic. This is the path that a free particle will follow. $A^\alpha$ is not zero along other paths and its value gives a measure of how far any given curve departs from straightness. In general relativity, $A^\alpha$ yields the 4-acceleration of a particle. In fact, $A^\alpha$ is a contravariant 4-vector even though $\Gamma^\alpha_{\beta\gamma}$ is not a tensor.

Example: Christoffel symbol of second kind

Calculate the Christoffel symbol of second kind for plane polar coordinates.

$$ds^2 = dr^2 + r^2 d\theta^2$$

Take r = index 1 and θ = index 2. Thus

$$g_{ij} = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}, \qquad g^{ij} = \begin{pmatrix} 1 & 0 \\ 0 & \frac{1}{r^2} \end{pmatrix}$$

so

$$g_{11} = 1, \qquad g_{22} = r^2, \qquad g_{12} = 0$$

$$g^{11} = 1, \qquad g^{22} = \frac{1}{r^2}, \qquad g^{12} = 0$$

Method 1: Brute force approach

We have $\Gamma^k_{ij} = \frac{1}{2}g^{ks}\left[g_{is,j} + g_{js,i} - g_{ij,s}\right]$ where $g_{11} = 1$, $g_{22} = r^2$, $g_{12} = 0$, $g^{11} = 1$, $g^{22} = \frac{1}{r^2}$, $g^{12} = 0$, thus the quantities in brackets are

$$[11, 1] = \tfrac{1}{2}[g_{11,1} + g_{11,1} - g_{11,1}] = 0 \qquad (i = 1, j = 1, s = 1)$$

$$[11, 2] = \tfrac{1}{2}[g_{12,1} + g_{12,1} - g_{11,2}] = 0 \qquad (i = 1, j = 1, s = 2)$$

$$[12, 1] = \tfrac{1}{2}[g_{11,2} + g_{21,1} - g_{12,1}] = 0 \qquad (i = 1, j = 2, s = 1)$$

$$[12, 2] = \tfrac{1}{2}[g_{12,2} + g_{22,1} - g_{12,2}] = \tfrac{1}{2}[0 + 2r - 0] = r \qquad (i = 1, j = 2, s = 2)$$

$$[21, 1] = \tfrac{1}{2}[g_{21,1} + g_{11,2} - g_{21,1}] = 0 \qquad (i = 2, j = 1, s = 1)$$

$$[21, 2] = \tfrac{1}{2}[g_{22,1} + g_{12,2} - g_{21,2}] = \tfrac{1}{2}[2r + 0 - 0] = r \qquad (i = 2, j = 1, s = 2)$$

$$[22, 1] = \tfrac{1}{2}[g_{21,2} + g_{21,2} - g_{22,1}] = \tfrac{1}{2}[0 + 0 - 2r] = -r \qquad (i = 2, j = 2, s = 1)$$

$$[22, 2] = \tfrac{1}{2}[g_{22,2} + g_{22,2} - g_{22,2}] = 0 \qquad (i = 2, j = 2, s = 2)$$

So, we have found that the non-zero quantities are

$$[12, 2] = r, \qquad [21, 2] = r, \qquad [22, 1] = -r$$

so that the Christoffel symbols are (with $g^{11} = 1$, $g^{22} = \frac{1}{r^2}$, $g^{12} = 0$)

$$\Gamma^2_{12} = g^{2l}[12, l] = g^{21}[12, 1] + g^{22}[12, 2] = \frac{1}{r^2}\cdot r = \frac{1}{r}$$

$$\Gamma^2_{21} = g^{2l}[21, l] = g^{21}[21, 1] + g^{22}[21, 2] = \frac{1}{r^2}\cdot r = \frac{1}{r}$$

$$\Gamma^1_{22} = g^{1l}[22, l] = g^{11}[22, 1] + g^{12}[22, 2] = -r$$
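The brute-force recipe above is mechanical enough to automate. The following sketch, which is an illustration and not part of the original notes, evaluates the formula for the Christoffel symbols of the second kind for a 2-D metric, estimating the derivatives of the metric by central differences. The function names, the step size and the sample point r = 2 are assumptions; indices 0 and 1 stand for r and θ.

```python
def metric_polar(x):
    """g_ij for plane polar coordinates x = (r, theta)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_2x2(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def metric_derivative(metric, x, k, h=1e-5):
    """Central-difference estimate of g_ij,k at the point x."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def christoffel(metric, x):
    """gamma[k][i][j] = (1/2) g^{ks} (g_is,j + g_js,i - g_ij,s)."""
    g_inv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, k) for k in range(2)]  # dg[k][i][j] = g_ij,k
    gamma = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    for k in range(2):
        for i in range(2):
            for j in range(2):
                gamma[k][i][j] = 0.5 * sum(
                    g_inv[k][s] * (dg[j][i][s] + dg[i][j][s] - dg[s][i][j])
                    for s in range(2)
                )
    return gamma

gamma = christoffel(metric_polar, (2.0, 0.3))
print(gamma[0][1][1], gamma[1][0][1], gamma[1][1][0])
```

At r = 2 the three non-zero entries reproduce the values −r and 1/r found by hand, and all other entries vanish to rounding error.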

Method 2: Geodesic equation approach

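Note that

$$\frac{d}{ds}\left(\frac{\partial L}{\partial\dot{x}^i}\right) - \frac{\partial L}{\partial x^i} = 0 \quad\Leftrightarrow\quad A^i = \frac{d^2x^i}{ds^2} + \Gamma^i_{km}\dot{x}^k\dot{x}^m = 0$$

Take again r = index 1 and θ = index 2. Thus, since $g_{11} = 1$, $g_{22} = r^2$, $g_{12} = g_{21} = 0$, then

$$L = \frac{1}{2}g_{ij}\dot{x}^i\dot{x}^j = \frac{1}{2}(\dot{r}^2 + r^2\dot{\theta}^2)$$

For the variable r, we have $\frac{\partial L}{\partial\dot{r}} = \dot{r}$ and $\frac{\partial L}{\partial r} = r\dot{\theta}^2$, so

$$A^r = \frac{d^2r}{ds^2} - r\left(\frac{d\theta}{ds}\right)^2 = 0$$

Now compare with the above $A^r = \frac{d^2r}{ds^2} + \Gamma^r_{rr}\dot{r}^2 + \Gamma^r_{r\theta}\dot{r}\dot{\theta} + \Gamma^r_{\theta r}\dot{\theta}\dot{r} + \Gamma^r_{\theta\theta}\dot{\theta}^2$ to get the Christoffel symbols:

$$\Gamma^r_{rr} = \Gamma^r_{r\theta} = \Gamma^r_{\theta r} = 0, \qquad \Gamma^r_{\theta\theta} = -r$$

For the variable θ, we have $\frac{\partial L}{\partial\dot{\theta}} = r^2\dot{\theta}$ and $\frac{\partial L}{\partial\theta} = 0$, so

$$\frac{d}{ds}\left(r^2\frac{d\theta}{ds}\right) = 0 \quad\Rightarrow\quad A^\theta = \frac{d^2\theta}{ds^2} + \frac{2}{r}\frac{dr}{ds}\frac{d\theta}{ds} = 0$$

so that

$$\Gamma^\theta_{rr} = \Gamma^\theta_{\theta\theta} = 0, \qquad \Gamma^\theta_{\theta r} = \frac{1}{r}, \qquad \Gamma^\theta_{r\theta} = \frac{1}{r}$$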

Example: Christoffel symbol of second kind

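Calculate the Christoffel symbol of second kind for spherical polar coordinates.

Take r = index 1, θ = index 2 and φ = index 3. We use again

$$\frac{d}{ds}\left(\frac{\partial L}{\partial\dot{x}^i}\right) - \frac{\partial L}{\partial x^i} = 0 \quad\Leftrightarrow\quad A^i = \frac{d^2x^i}{ds^2} + \Gamma^i_{km}\dot{x}^k\dot{x}^m = 0$$

with

$$g_{11} = 1, \quad g_{22} = r^2, \quad g_{33} = r^2\sin^2\theta, \quad g_{12} = g_{13} = g_{21} = g_{31} = g_{23} = g_{32} = 0$$

thus

$$L = \frac{1}{2}g_{ij}\dot{x}^i\dot{x}^j = \frac{1}{2}\left[\dot{r}^2 + r^2\dot{\theta}^2 + r^2\sin^2\theta\,\dot{\phi}^2\right]$$

For the variable r:

$$A^r = \frac{d}{ds}\left(\frac{dr}{ds}\right) - r\left(\frac{d\theta}{ds}\right)^2 - r\sin^2\theta\left(\frac{d\phi}{ds}\right)^2 = 0$$

For the variable θ:

$$\frac{d}{ds}\left(r^2\frac{d\theta}{ds}\right) - r^2\sin\theta\cos\theta\left(\frac{d\phi}{ds}\right)^2 = 0$$

$$A^\theta = \frac{d^2\theta}{ds^2} + \frac{2}{r}\frac{dr}{ds}\frac{d\theta}{ds} - \sin\theta\cos\theta\left(\frac{d\phi}{ds}\right)^2 = 0$$

For the variable φ:

$$\frac{d}{ds}\left(r^2\sin^2\theta\,\frac{d\phi}{ds}\right) = 0$$

$$A^\phi = \frac{d^2\phi}{ds^2} + \frac{2}{r}\frac{dr}{ds}\frac{d\phi}{ds} + 2\cot\theta\,\frac{d\theta}{ds}\frac{d\phi}{ds} = 0$$

Now compare $A^r$, $A^\theta$ and $A^\phi$ with

$$A^\alpha = \frac{d^2x^\alpha}{ds^2} + \Gamma^\alpha_{\beta\gamma}\frac{dx^\beta}{ds}\frac{dx^\gamma}{ds} = 0$$

and read out the Christoffel symbols

$$\Gamma^1_{22} = -r, \qquad \Gamma^1_{33} = -r\sin^2\theta$$

$$\Gamma^2_{12} = \Gamma^2_{21} = \frac{1}{r}, \qquad \Gamma^2_{33} = -\sin\theta\cos\theta$$

$$\Gamma^3_{13} = \Gamma^3_{31} = \frac{1}{r}, \qquad \Gamma^3_{23} = \Gamma^3_{32} = \cot\theta$$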

How to calculate geodesics

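Given

$$ds^2 = dr^2 + r^2 d\theta^2$$

calculate the geodesics.

Solution

$$L = \frac{1}{2}g_{ij}\dot{x}^i\dot{x}^j = \frac{1}{2}(\dot{r}^2 + r^2\dot{\theta}^2)$$

For the variable r:

$$\frac{d}{ds}\left(\frac{\partial L}{\partial\dot{r}}\right) - \frac{\partial L}{\partial r} = 0$$

so that

$$\frac{d^2r}{ds^2} - r\left(\frac{d\theta}{ds}\right)^2 = 0 \qquad (23)$$

For the variable θ:

$$\frac{\partial L}{\partial\theta} = 0 \quad\Rightarrow\quad \theta \text{ ignorable}$$

thus

$$\frac{d}{ds}\left(\frac{\partial L}{\partial\dot{\theta}}\right) - \frac{\partial L}{\partial\theta} = 0$$

becomes

$$\frac{d}{ds}\left(\frac{\partial L}{\partial\dot{\theta}}\right) = 0$$

so that

$$\frac{\partial L}{\partial\dot{\theta}} = h = \text{constant}$$

$$r^2\frac{d\theta}{ds} = h \qquad (24)$$

We also have the condition

$$L = \frac{1}{2} \qquad (25)$$

But we only need two of the three above equations to solve the system. Substitute (24) into (23) and multiply by $\frac{dr}{ds}$:

$$\frac{dr}{ds}\frac{d^2r}{ds^2} = \frac{h^2}{r^3}\frac{dr}{ds}$$

Integrating with respect to s,

$$\frac{1}{2}\left(\frac{dr}{ds}\right)^2 = -\frac{h^2}{2r^2} + B'$$

Set $B' = \frac{B^2}{2}$, since it must be positive. So

$$\frac{dr}{ds} = \pm\sqrt{B^2 - \frac{h^2}{r^2}}$$

and

$$\frac{d\theta}{dr} = \frac{d\theta}{ds}\frac{ds}{dr} = \frac{h}{r^2}\,\frac{\pm 1}{\sqrt{B^2 - \frac{h^2}{r^2}}} = \frac{\pm 1}{|r|\sqrt{\frac{B^2r^2}{h^2} - 1}}$$

so we obtain

$$\theta = \pm\sec^{-1}\left(\frac{Br}{h}\right) + c$$

or, since the secant is an even function,

$$r = \frac{h}{B}\sec(\theta - c) \qquad (26)$$

so, as expected, the geodesics are straight lines.

[Figure: a straight-line geodesic in polar coordinates, showing the radius r, the perpendicular distance h/B from the origin, the angle c and the angle θ − c.]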
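The claim that the polar geodesics are straight lines can also be checked by direct integration. The sketch below is an illustration under assumed initial data (start at r = 1, θ = 0, unit tangent at 1 rad to the x-axis), not the method of the notes: it integrates equation (23) together with the θ equation in its second-order form, θ'' = −(2/r) r' θ', using a fourth-order Runge-Kutta step, and compares the end point with the straight line expected in Cartesian coordinates. It also monitors the first integral h of equation (24).

```python
import math

def geodesic_rhs(state):
    """r'' = r theta'^2 and theta'' = -(2/r) r' theta', written as a first-order system."""
    r, theta, r_dot, theta_dot = state
    return [r_dot, theta_dot, r * theta_dot ** 2, -2.0 * r_dot * theta_dot / r]

def rk4_step(state, ds):
    def shifted(k, factor):
        return [x + factor * dx for x, dx in zip(state, k)]
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(k1, 0.5 * ds))
    k3 = geodesic_rhs(shifted(k2, 0.5 * ds))
    k4 = geodesic_rhs(shifted(k3, ds))
    return [x + ds * (a + 2 * b + 2 * c + d) / 6.0
            for x, a, b, c, d in zip(state, k1, k2, k3, k4)]

alpha = 1.0                                   # direction of the unit tangent (assumed)
state = [1.0, 0.0, math.cos(alpha), math.sin(alpha)]
h_start = state[0] ** 2 * state[3]            # h = r^2 dtheta/ds
n_steps, ds = 2000, 1e-3                      # total arc length s = 2
for _ in range(n_steps):
    state = rk4_step(state, ds)

r, theta, r_dot, theta_dot = state
x_end, y_end = r * math.cos(theta), r * math.sin(theta)
h_end = r ** 2 * theta_dot
expected = (1.0 + 2.0 * math.cos(alpha), 2.0 * math.sin(alpha))
print((x_end, y_end), expected, h_start, h_end)
```

The integrated end point coincides with the point reached by moving a distance 2 along a Cartesian straight line, and h is conserved along the path.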

Example: Geodesics on the surface of the unit sphere

We are now going to calculate the geodesics on the surface of the unit sphere (r = constant = 1). In fact, we already know that these are the so-called great circles, but it is instructive to derive this result from the Lagrangian. The metric on the surface of the unit sphere is

$$ds^2 = r^2 d\theta^2 + r^2\sin^2\theta\,d\phi^2, \qquad r = 1$$

where θ is the colatitude and φ is the longitude. The Lagrangian is

$$L = \frac{1}{2}g_{ij}\dot{x}^i\dot{x}^j = \frac{1}{2}\left(\frac{d\theta}{ds}\right)^2 + \frac{1}{2}\sin^2\theta\left(\frac{d\phi}{ds}\right)^2$$

or

$$L = \frac{1}{2}\dot{\theta}^2 + \frac{1}{2}\sin^2\theta\,\dot{\phi}^2 = \frac{1}{2} \qquad (27)$$

Our first integral is L = 1/2, which ensures that the affine parameter on the geodesic is the distance s measured on the surface of the sphere. Then one can also see immediately that the coordinate φ does not appear explicitly in L and thus is “ignorable”. So from

$$\frac{d}{ds}\left(\frac{\partial L}{\partial\dot{\phi}}\right) - \frac{\partial L}{\partial\phi} = 0$$

since $\frac{\partial L}{\partial\phi} = 0$, we obtain another first integral:

$$\frac{\partial L}{\partial\dot{\phi}} = \text{constant} \quad\Rightarrow\quad \dot{\phi}\sin^2\theta = J \qquad (28)$$

Eliminate $\dot{\phi}$ between (27) and (28) to get

$$\dot{\theta}^2 = 1 - \frac{J^2}{\sin^2\theta} \qquad (\sin\theta \ge |J|) \qquad (29)$$

This is a first order DE. Since $\sin\theta \ge |J|$, set $|J| = \sin\theta_0$. We can solve the DE in equation (29) by separating the variables:

$$\frac{\sin\theta\,d\theta}{\sqrt{\sin^2\theta - \sin^2\theta_0}} = ds \quad\Rightarrow\quad \cos\theta = \cos\theta_0\sin(s - s_0) \qquad (30)$$

where $s_0$ is the constant of integration and fixes the origin of the parameter s. If we take s = 0 at θ = π/2, then

$$\cos\theta = \cos\theta_0\sin s \qquad (31)$$

Now we need to determine φ as a function of s. If we eliminate θ between (28) and (31) we obtain

$$\dot{\phi} = \frac{\sin\theta_0}{1 - \cos^2\theta_0\sin^2 s}$$

$$d\phi = \frac{\sin\theta_0}{1 - \cos^2\theta_0\sin^2 s}\,ds$$

$$\phi = \int\frac{\sin\theta_0}{1 - \cos^2\theta_0\sin^2 s}\,ds \quad\Rightarrow\quad \tan(\phi - \phi_0) = \sin\theta_0\tan s \qquad (32)$$

where $\phi_0$ is the longitude of the equatorial crossing. Equations (31) and (32) taken together are the parametric representation of great circles (check!).
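One way to carry out the suggested check numerically is to embed the curve in 3-D and confirm that it lies in a plane through the centre of the sphere. The sketch below is an illustration only: it uses equations (31) and (32) with an assumed θ0 = 0.7 and φ0 = 0, and it resolves the branch of the tangent with `atan2`, which is a choice made here rather than something stated in the notes.

```python
import math

def great_circle_point(s, theta0, phi0=0.0):
    """Cartesian point on the unit sphere for cos(theta) = cos(theta0) sin(s),
    tan(phi - phi0) = sin(theta0) tan(s)."""
    theta = math.acos(math.cos(theta0) * math.sin(s))
    phi = phi0 + math.atan2(math.sin(theta0) * math.sin(s), math.cos(s))
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

theta0 = 0.7
p0 = great_circle_point(0.0, theta0)
p1 = great_circle_point(1.0, theta0)
normal = cross(p0, p1)   # normal of the plane through the origin, p0 and p1

# Largest distance-like offset from that plane over one full circuit in s
max_offset = max(abs(dot(normal, great_circle_point(0.1 * k, theta0)))
                 for k in range(63))
print(max_offset)
```

All sampled points satisfy normal · p = 0 to rounding error, so the curve is the intersection of the sphere with a plane through its centre, that is, a great circle.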


Covariant derivatives

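The expression that we derive and the idea of covariant derivatives, and parallel transport, are very general and apply to any Riemannian space. As we have already seen, the metric of a Riemann space can be written as

$$g_{ij} = \vec{e}_i\cdot\vec{e}_j$$

hence

$$\frac{\partial g_{ij}}{\partial x^k} = \frac{\partial\vec{e}_i}{\partial x^k}\cdot\vec{e}_j + \frac{\partial\vec{e}_j}{\partial x^k}\cdot\vec{e}_i$$

but

$$\vec{e}_i = \frac{\partial\vec{r}}{\partial x^i}, \qquad \vec{e}_j = \frac{\partial\vec{r}}{\partial x^j}.$$

Since all mixed second order partial derivatives are continuous, the partial derivatives can be exchanged by Clairaut’s theorem to get

$$\frac{\partial\vec{e}_i}{\partial x^k} = \frac{\partial}{\partial x^k}\frac{\partial\vec{r}}{\partial x^i} = \frac{\partial}{\partial x^i}\frac{\partial\vec{r}}{\partial x^k} = \frac{\partial\vec{e}_k}{\partial x^i} \qquad\text{etc.}$$

Using this result one can verify that

$$\frac{\partial\vec{e}_i}{\partial x^j} = [ij, k]\,\vec{e}^{\,k}$$

where

$$[ij, k] = \frac{1}{2}\left(\frac{\partial g_{ik}}{\partial x^j} + \frac{\partial g_{jk}}{\partial x^i} - \frac{\partial g_{ij}}{\partial x^k}\right)$$

are the Christoffel symbols of the first kind. Then, since $\vec{e}^{\,k} = \vec{e}_\alpha g^{\alpha k}$, we have

$$\frac{\partial\vec{e}_i}{\partial x^j} = \Gamma^\alpha_{ij}\,\vec{e}_\alpha. \qquad (33)$$

Thus, we can think of $\Gamma^\alpha_{ij}$ as the component in the direction $\vec{e}_\alpha$ of the rate of change of $\vec{e}_i$ in the direction $\vec{e}_j$. Now consider a vector field $\vec{A}(x^j)$ defined at every point or in some region S. In terms of the coordinate basis vectors, we have

$$\vec{A}(x^j) = A^i(x^j)\,\vec{e}_i$$

and the change in the vector field between neighbouring points with coordinates

$$P(x^i), \qquad P(x^i + \Delta x^i)$$

is

$$\Delta\vec{A} = (A^i + \Delta A^i)(\vec{e}_i + \Delta\vec{e}_i) - A^i\vec{e}_i \quad\Rightarrow\quad d\vec{A} = \vec{e}_i\,dA^i + A^i\,d\vec{e}_i \quad\text{(to first order)}$$

Therefore, here, we need to stress that while in Minkowski spacetime with coordinates (−ct, x, y, z) the derivative of a vector $\vec{A} = A^i\vec{e}_i$ is just $\frac{\partial\vec{A}}{\partial x^j} = \frac{\partial A^i}{\partial x^j}\vec{e}_i$, in a general spacetime with arbitrary coordinates, the base vectors $\vec{e}_i(x^j)$ vary from point to point so the differential change in $\vec{A}(x^j)$ arises from two sources:

1. The change in the components $A^i(x^j)$ as the values $x^j$ change.

2. The changes in the base vectors $\vec{e}_i(x^j)$ as the values $x^j$ change.

It follows that

$$\frac{\partial\vec{A}}{\partial x^j} = \frac{\partial A^\alpha}{\partial x^j}\vec{e}_\alpha + \frac{\partial\vec{e}_\alpha}{\partial x^j}A^\alpha$$

$$= \frac{\partial A^\alpha}{\partial x^j}\vec{e}_\alpha + \frac{\partial\vec{e}_i}{\partial x^j}A^i$$

$$= \left[\frac{\partial A^\alpha}{\partial x^j} + \Gamma^\alpha_{ij}A^i\right]\vec{e}_\alpha \qquad\text{from (33)}$$

$$= A^\alpha_{\ ;j}\,\vec{e}_\alpha$$

where

$$A^\alpha_{\ ;j} = \frac{\partial A^\alpha}{\partial x^j} + \Gamma^\alpha_{ij}A^i$$

This expression defines the covariant derivative $A^\alpha_{\ ;j}$ of the contravariant vector $A^\alpha(x^i)$. Thus, $\frac{\partial\vec{A}}{\partial x^j}$ is a vector with components

$$\frac{\partial A^\alpha}{\partial x^j} + \Gamma^\alpha_{ij}A^i$$

with respect to the base system $\vec{e}_\alpha(x^j)$. Note that if the Christoffel symbols vanish identically in S, the reference frame associated with these symbols is cartesian, the base vectors are independent of the coordinates and

$$A^i_{\ ;j} = \frac{\partial A^i}{\partial x^j}$$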
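A concrete check of the covariant derivative formula: a vector field that is constant in Cartesian coordinates has polar components that vary from point to point, yet its covariant derivative should vanish. The sketch below is an illustration, not part of the notes. It assumes the field is the Cartesian unit vector along x, whose components in the polar coordinate basis are $A^r = \cos\theta$ and $A^\theta = -\sin\theta/r$, uses the polar Christoffel symbols derived earlier, and estimates the partial derivatives by central differences at an arbitrarily chosen point.

```python
import math

def christoffel_polar(r):
    """Non-zero symbols for ds^2 = dr^2 + r^2 dtheta^2; index 0 = r, index 1 = theta."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gamma[0][1][1] = -r                             # Gamma^r_{theta theta}
    gamma[1][0][1] = gamma[1][1][0] = 1.0 / r       # Gamma^theta_{r theta} = Gamma^theta_{theta r}
    return gamma

def field(x):
    """Polar coordinate-basis components of the constant Cartesian unit vector along x."""
    r, theta = x
    return [math.cos(theta), -math.sin(theta) / r]

def covariant_derivative(vec_field, x, h=1e-6):
    """Returns D[alpha][j] = dA^alpha/dx^j + Gamma^alpha_{ij} A^i at the point x."""
    gamma = christoffel_polar(x[0])
    a = vec_field(x)
    result = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        ap, am = vec_field(xp), vec_field(xm)
        for alpha in range(2):
            partial = (ap[alpha] - am[alpha]) / (2 * h)
            result[alpha][j] = partial + sum(gamma[alpha][i][j] * a[i] for i in range(2))
    return result

D = covariant_derivative(field, (1.5, 0.4))
largest = max(abs(D[alpha][j]) for alpha in range(2) for j in range(2))
print(largest)
```

All four components of the covariant derivative vanish to within the finite-difference error, even though the ordinary partial derivatives of the components do not.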


Intrinsic (total, absolute) derivative

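This result can now be used to calculate the rate of change of a vector along any curve $x^\alpha(u)$. This is referred to as the intrinsic derivative. So

$$\frac{d\vec{A}}{du} = \frac{\partial\vec{A}}{\partial x^j}\frac{dx^j}{du}$$

with $\vec{A} = A^i(x^j)\,\vec{e}_i$. Therefore

$$\frac{d\vec{A}}{du} = \frac{\partial\vec{A}}{\partial x^j}\frac{dx^j}{du} = \left(\frac{\partial A^i}{\partial x^j}\vec{e}_i + \frac{\partial\vec{e}_i}{\partial x^j}A^i\right)\frac{dx^j}{du}$$

$$= \left[\frac{\partial A^\alpha}{\partial x^j} + \Gamma^\alpha_{ij}A^i\right]\frac{dx^j}{du}\,\vec{e}_\alpha$$

$$= \left[\frac{\partial A^\alpha}{\partial x^j}\frac{dx^j}{du} + \Gamma^\alpha_{ij}A^i\frac{dx^j}{du}\right]\vec{e}_\alpha$$

$$= \left[\frac{dA^\alpha}{du} + \Gamma^\alpha_{ij}A^i\frac{dx^j}{du}\right]\vec{e}_\alpha \qquad (34)$$

which is the intrinsic derivative with respect to u.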

Parallel transport of a contravariant vector

Now consider any contravariant vector $A^\alpha_P$ defined at P and any smooth curve C given by $x^\alpha(u)$. If we construct at every point on C a vector equal to $\vec{A}$ in magnitude and parallel to its direction, we have a field of parallel transported vectors along C. Since the vector does not change along C, then

$$\frac{d\vec{A}}{du} = 0$$

Hence, parallel transported vectors along any curve must satisfy

$$\left[\frac{\partial A^\alpha}{\partial x^j} + \Gamma^\alpha_{ij}A^i\right] = 0 \qquad (35)$$

or, using the intrinsic derivative with respect to u given in equation (34),

$$\left[\frac{dA^\alpha}{du} + \Gamma^\alpha_{ij}A^i\frac{dx^j}{du}\right] = 0 \qquad (36)$$

Thus, along any curve the parallel transported vector at a neighbouring point P′ is

$$A^\alpha_{P'} = A^\alpha_P - \Gamma^\alpha_{\beta\gamma}A^\beta_P\,dx^\gamma \qquad (37)$$

or

$$dA^\alpha = -\Gamma^\alpha_{\beta\gamma}A^\beta\,dx^\gamma \qquad (38)$$

Parallel displacement and geodesics

A geodesic may be used to propagate a uniform direction in Riemannian space. That is, if a geodesic joins P and Q, by definition its unit tangents at P and Q are “parallel”.

With parallel transport one can extend the concept of “straight” lines to curved spaces. Thus, we can say that a line is “straight” if it parallel transports its own tangent vector.

[Figure: geodesics, with the unit tangent vector T drawn at two points.]

Parallel displacement provides a tool to find a geodesic curve which passes through a point in some direction.

Likewise, if you begin at any point where a contravariant vector is located and move in the direction defined by this vector, infinitesimally, parallel displacing as you go, then you will be carried over a geodesic.

As we have already seen, along a geodesic the changes in the components of the unit tangent $T^\alpha$ satisfy

$$dT^\alpha = -\Gamma^\alpha_{\beta\gamma}T^\beta\,dx^\gamma$$

or if u is any parameter along the geodesic $x^\alpha(u)$

$$\frac{dT^\alpha}{du} = -\Gamma^\alpha_{\beta\gamma}T^\beta\frac{dx^\gamma}{du}$$

Note that for an arbitrary curve (not a geodesic), the unit tangent is not parallel transported along the curve.

Inner product

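If $A^\mu$ and $B^\mu$ are two vectors, then

$$\vec{A}\cdot\vec{B} = g_{\mu\nu}A^\mu B^\nu \quad (= A_\nu B^\nu)$$

If we parallel transport both $A^\mu$ and $B^\mu$ along any curve $x^\alpha(u)$

$$\frac{dA^\mu}{du} = -\Gamma^\mu_{\alpha\beta}\frac{dx^\alpha}{du}A^\beta, \qquad \frac{dB^\mu}{du} = -\Gamma^\mu_{\alpha\beta}\frac{dx^\alpha}{du}B^\beta$$

so

$$\frac{d}{du}(\vec{A}\cdot\vec{B}) = \frac{d}{du}(g_{\mu\nu}A^\mu B^\nu) = \frac{\partial g_{\mu\nu}}{\partial x^\sigma}\frac{dx^\sigma}{du}A^\mu B^\nu + g_{\mu\nu}\frac{dA^\mu}{du}B^\nu + g_{\mu\nu}A^\mu\frac{dB^\nu}{du}$$

$$= \left[\frac{\partial g_{\mu\nu}}{\partial x^\sigma} - \Gamma^\alpha_{\mu\sigma}g_{\alpha\nu} - \Gamma^\alpha_{\nu\sigma}g_{\mu\alpha}\right]A^\mu B^\nu\frac{dx^\sigma}{du}$$

The expression in brackets is identically zero from the definition of the Christoffel symbols. Thus

$$\frac{d}{du}(\vec{A}\cdot\vec{B}) = \frac{d}{du}(g_{\mu\nu}A^\mu B^\nu) = 0$$

Hence,

1. The scalar product is invariant under parallel transport.

2. The length of a vector is preserved under parallel transport.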

Parallel transport in matrix form

Define

$$\vec{A} = \begin{pmatrix} A^1 \\ A^2 \\ \vdots \\ A^n \end{pmatrix}$$

which is a column vector of contravariant components.

The Christoffel symbols are grouped together in matrices as follows. $\|\Gamma_\sigma\|$ is a matrix whose element in row α and column β is

$$(\|\Gamma_\sigma\|)^\alpha_{\ \beta} = \Gamma^\alpha_{\sigma\beta}$$

Parallel transport along any path can be calculated from

$$d\vec{A} = -(dx^\sigma\,\|\Gamma_\sigma\|)\,\vec{A}$$

as a matrix multiplication.


Example: application to polar coordinates

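Consider polar coordinates, for which we have

$$ds^2 = dr^2 + r^2 d\theta^2$$

$$dx^r = dr, \qquad dx^\theta = d\theta$$

The eight symbols can all be put together into the following matrices (we calculated these values earlier, see for example the “brute force” approach)

$$\text{for } \sigma = r: \qquad \|\Gamma_r\| = \begin{bmatrix} \Gamma^r_{rr} & \Gamma^r_{r\theta} \\ \Gamma^\theta_{rr} & \Gamma^\theta_{r\theta} \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & \frac{1}{r} \end{bmatrix}$$

$$\text{for } \sigma = \theta: \qquad \|\Gamma_\theta\| = \begin{bmatrix} \Gamma^r_{\theta r} & \Gamma^r_{\theta\theta} \\ \Gamma^\theta_{\theta r} & \Gamma^\theta_{\theta\theta} \end{bmatrix} = \begin{bmatrix} 0 & -r \\ \frac{1}{r} & 0 \end{bmatrix}$$

Since parallel transport along any path can be calculated from $d\vec{A} = -(dx^\sigma\,\|\Gamma_\sigma\|)\,\vec{A}$, then we have

$$\begin{bmatrix} dA^r \\ dA^\theta \end{bmatrix} = -dr\begin{bmatrix} 0 & 0 \\ 0 & \frac{1}{r} \end{bmatrix}\begin{bmatrix} A^r \\ A^\theta \end{bmatrix} - d\theta\begin{bmatrix} 0 & -r \\ \frac{1}{r} & 0 \end{bmatrix}\begin{bmatrix} A^r \\ A^\theta \end{bmatrix}$$

and

$$dA^r = rA^\theta\,d\theta$$

$$dA^\theta = -\frac{A^\theta}{r}\,dr - \frac{A^r}{r}\,d\theta$$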

Parallel transport along a circle

Consider now the parallel transport of a vector $\vec{A}$ along a circle of radius R in a Euclidean space, expressed in polar coordinates. We have

$$r = R, \qquad dr = 0, \qquad d\theta = \frac{ds}{R}, \qquad s = \text{arclength parameter}$$

So, using the results we just obtained,

$$dA^r = RA^\theta\frac{ds}{R} = A^\theta\,ds$$

$$dA^\theta = -\frac{A^r}{R}\frac{ds}{R} = -A^r\frac{ds}{R^2}$$

so that we have

$$\frac{dA^r}{ds} = A^\theta$$

$$\frac{dA^\theta}{ds} = -\frac{A^r}{R^2}$$

with solutions

$$A^r = C\cos\left(\frac{s}{R} + D\right)$$

$$A^\theta = -\frac{C}{R}\sin\left(\frac{s}{R} + D\right)$$

From the metric we have $A_r = A^r$ and $A_\theta = r^2A^\theta$, with r = R, so

$$|A|^2 = A^rA_r + A^\theta A_\theta = C^2 = \text{constant}$$

The vector $\vec{A}$ does not change in magnitude, but the absolute values of its components vary along the path because the basis vectors change. The components return to the initial values after traversing a full circle. This is because space is flat.
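The transport equations along the circle can be integrated numerically as a cross-check. The sketch below is an illustration with assumed values (R = 2 and initial components $A^r = 1$, $A^\theta = 0.25$) and a fourth-order Runge-Kutta step; it is not the notes' own method. It confirms that the squared length stays constant along the way and that the components return to their starting values after one full circuit, s = 2πR.

```python
import math

R = 2.0  # radius of the circle (assumed value)

def rhs(a):
    """dA^r/ds = A^theta,  dA^theta/ds = -A^r / R^2."""
    a_r, a_theta = a
    return [a_theta, -a_r / R ** 2]

def rk4_step(a, ds):
    k1 = rhs(a)
    k2 = rhs([a[0] + 0.5 * ds * k1[0], a[1] + 0.5 * ds * k1[1]])
    k3 = rhs([a[0] + 0.5 * ds * k2[0], a[1] + 0.5 * ds * k2[1]])
    k4 = rhs([a[0] + ds * k3[0], a[1] + ds * k3[1]])
    return [a[i] + ds * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6.0 for i in range(2)]

def norm_squared(a):
    # |A|^2 = g_rr (A^r)^2 + g_theta_theta (A^theta)^2 with g_rr = 1, g_theta_theta = R^2
    return a[0] ** 2 + R ** 2 * a[1] ** 2

a_start = [1.0, 0.25]
a = list(a_start)
n = 4000
ds = 2.0 * math.pi * R / n        # one full circuit in n steps
max_norm_drift = 0.0
for _ in range(n):
    a = rk4_step(a, ds)
    max_norm_drift = max(max_norm_drift, abs(norm_squared(a) - norm_squared(a_start)))
print(a, max_norm_drift)
```

The final components agree with the initial ones and the norm never drifts, as expected for transport around a closed loop in flat space.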

Example: Parallel transport around a closed loop in flat space

What happens when we parallel-transport a vector around a closed loop? If space is intrinsically flat, parallel transport of a vector from A through B and C and back to A will result in a vector that is parallel to the original one (see figure).

[Figure: closed loop through the points A, B and C]

A cylinder may look like a curved surface. However, a cylinder can be constructed by rolling up a flat piece of cardboard without having to distort it, so the intrinsic geometry of a cylinder is, in fact, flat. The distance between any two points is the same before and after rolling the cardboard, and parallel lines stay parallel. Thus, there are two different kinds of curvature, intrinsic and extrinsic, and a cylinder has only “extrinsic curvature”.

When we talk about the curvature of space, and of space-time, we refer instead to its “intrinsic curvature” (see next example).

Example: Parallel transport around a closed loop on a curved 2-D surface

Here we show that a sphere has an intrinsically curved surface. The figure shows that the vector rotates by 90 degrees as it is transported around the closed loop formed by great circles. So parallel lines do not remain parallel, which means that the space is not flat. In curved space, one can transport a vector around a particular closed loop, and whether one ends up with a vector pointing in the same direction as the original will depend on the path that one takes.

In the case of the surface of a sphere, the reason why the vector does not go back to its initial direction is that the angles of a “triangle” on the surface of a sphere do not add up to 180 degrees. The angle that the transported vector makes with each side of the triangle jumps at each vertex by 180 degrees minus the angle at the vertex. In total, this means it ends up rotated by 540 degrees minus the sum of the angles. If that sum were 180 degrees, the net rotation would be 360 degrees and the transported vector would match the original. On a sphere, the angles of a triangle always add up to something more than 180 degrees, so the transported vector is rotated by less than 360 degrees and fails to line up with the original.
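The angle-counting argument above can be put into a few lines of arithmetic. This is only a sketch: the helper names are hypothetical, and the triangle with three 90 degree angles (one octant of the sphere, bounded by great circles) is an assumed example that reproduces the 90 degree rotation quoted above.

```python
def net_rotation_deg(angles_deg):
    """Rotation picked up around a geodesic triangle: 540 degrees minus the angle sum."""
    return 540.0 - sum(angles_deg)


def mismatch_deg(angles_deg):
    """Angle between the transported and the original vector, reduced to [0, 360)."""
    return (360.0 - net_rotation_deg(angles_deg)) % 360.0


if __name__ == "__main__":
    # Flat triangle: angles sum to 180 degrees, so the vector matches the original.
    print(mismatch_deg([60.0, 60.0, 60.0]))
    # Assumed spherical example: three right angles (one octant of the sphere).
    print(mismatch_deg([90.0, 90.0, 90.0]))
```

The mismatch equals the excess of the angle sum over 180 degrees, so it vanishes exactly when the triangle is flat.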


Covariant derivatives: formal definitions

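Consider two neighbouring points, $P$ at $x^k$ and $Q$ at $x^k + dx^k = x^k + \epsilon h^k$.

(a) For a scalar field $\phi(x^k)$ and a small vector $\epsilon\vec{h}$, a typical definition of derivative is:

$$\lim_{\epsilon\to 0}\frac{\phi(\vec{x}+\epsilon\vec{h}) - \phi(\vec{x})}{\epsilon} = h^k\frac{\partial\phi}{\partial x^k} \qquad (39)$$

$\partial\phi/\partial x^k$ is a covariant vector, as shown previously. The notation is

$$\phi_{,k} \quad\text{or}\quad \frac{\partial\phi}{\partial x^k} \quad\text{or}\quad \partial_k\phi$$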

(b) For a vector field $A^k(x^q)$ in a curved space, we face the problem that vectors at different points (even neighbouring points) cannot be compared directly since, in general, there may not be universal parallelism. Therefore, we shall compare vectors at the same place. So the actual vector $A^k(\vec{x}+\epsilon\vec{h})$ at the point $Q$ will be compared with the result of parallel-transporting the vector $A^k(\vec{x})$ at $P$ along the small displacement $\epsilon\vec{h}$ to $Q$ (see figure).

[Figure: the vector parallel-transported from P to Q compared with the actual vector at Q]

The parallel-transported vector is

$$A^k(\vec{x}) - \underbrace{\epsilon h^s}_{dx^s}\,\Gamma^k_{sp}A^p(\vec{x})$$

(see equation 37 of parallel transport), then

$$\lim_{\epsilon\to 0}\frac{A^k(\vec{x}+\epsilon\vec{h}) - \left[A^k(\vec{x}) - \epsilon h^s\Gamma^k_{sp}A^p(\vec{x})\right]}{\epsilon} = h^s\left[\frac{\partial A^k}{\partial x^s} + \Gamma^k_{sp}A^p\right] \quad\text{for any } \vec{h}$$

The RHS is a vector for all vectors $\vec{h}$, and therefore

$$A^k{}_{;s} = \partial_sA^k + \Gamma^k_{s\beta}A^\beta$$

is a tensor, the covariant derivative of the contravariant vector $A^k$. Any index which follows the semicolon is taken to imply the operation of covariant differentiation.

The previous approach to finding the covariant derivative of a contravariant vector was to differentiate both the vector components and the coordinate basis vectors. The two methods yield identical results because parallel transport takes into consideration the changes in the basis vectors.

Parallel transport of a covariant vector

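We can now derive the rule for the parallel transport of a covariant vector. The inner product must remain invariant if two vectors are parallel-transported along any curve, since the magnitudes of the vectors and the angle between them remain invariant. Thus we must have

$$\frac{d}{du}(A_\mu B^\mu) = 0$$

$$\frac{dA_\mu}{du}B^\mu + A_\mu\frac{dB^\mu}{du} = 0$$

Using the expression 36 we derived for parallel transport of contravariant components, applied to $dB^\mu/du$, we get

$$\frac{dA_\mu}{du}B^\mu - A_\mu\Gamma^\mu_{\sigma\alpha}B^\sigma\frac{dx^\alpha}{du} = 0$$

Relabelling the dummy indices in the second term gives

$$\frac{dA_\mu}{du}B^\mu - A_\alpha\Gamma^\alpha_{\mu\sigma}B^\mu\frac{dx^\sigma}{du} = 0$$

$$\left[\frac{dA_\mu}{du} - A_\alpha\Gamma^\alpha_{\mu\sigma}\frac{dx^\sigma}{du}\right]B^\mu = 0$$

Since $B^\mu$ is arbitrary, it follows that during parallel transport the covariant components evolve according to

$$\frac{dA_\mu}{du} = A_\alpha\Gamma^\alpha_{\mu\sigma}\frac{dx^\sigma}{du} \qquad (40)$$

The rules for parallel transport can be easily generalised to tensors of any rank.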


Covariant derivatives: a summary

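Covariant derivative of contravariant vector:

$$A^j{}_{;n} = \frac{\partial A^j}{\partial x^n} + \Gamma^j_{nl}A^l$$

Covariant derivative of covariant vector:

$$A_{j;n} = \frac{\partial A_j}{\partial x^n} - \Gamma^l_{jn}A_l$$

In general, each contravariant/covariant index attracts one additional positive/negative $\Gamma$-term in the covariant derivative, exactly as in the case of parallel transport. For example

$$A^j{}_{i;n} = \frac{\partial A^j{}_i}{\partial x^n} + \Gamma^j_{ln}A^l{}_i - \Gamma^l_{in}A^j{}_l$$

By the summation convention, what we have here in 4-D is a set of 64 equations, each with 9 terms on the RHS! To generalise, the covariant derivative of $A^{u_1u_2\cdots u_s}{}_{r_1r_2\cdots r_p}$ with respect to $x^n$ is

$$A^{u_1u_2\cdots u_s}{}_{r_1r_2\cdots r_p;n} = \frac{\partial A^{u_1u_2\cdots u_s}{}_{r_1r_2\cdots r_p}}{\partial x^n} + \sum_{\alpha=1}^{s}\Gamma^{u_\alpha}_{kn}A^{u_1\cdots u_{\alpha-1}ku_{\alpha+1}\cdots u_s}{}_{r_1r_2\cdots r_p} - \sum_{\beta=1}^{p}\Gamma^{l}_{r_\beta n}A^{u_1u_2\cdots u_s}{}_{r_1\cdots r_{\beta-1}lr_{\beta+1}\cdots r_p}$$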

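Covariant differentiation of products: The rules for the ordinary derivative of a product carry over to the covariant derivative. Therefore

$$(v_\mu u_\nu)_{;\sigma} = v_{\mu;\sigma}u_\nu + v_\mu u_{\nu;\sigma}$$

It follows that

$$g_{\mu\nu;\sigma} = \partial_\sigma g_{\mu\nu} - \Gamma^\alpha_{\sigma\mu}g_{\alpha\nu} - \Gamma^\alpha_{\sigma\nu}g_{\mu\alpha} = 0 \quad\text{(from the definition of } \Gamma\text{)}$$

Using the above one can show that

$$A_{\mu;\sigma} = g_{\mu\nu}A^\nu{}_{;\sigma}$$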

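Covariant second derivative of a covariant vector:

$$A_{j;n;p} = \frac{\partial}{\partial x^p}A_{j;n} - \Gamma^l_{jp}A_{l;n} - \Gamma^l_{np}A_{j;l} \qquad (41)$$

$$= \frac{\partial}{\partial x^p}\left[\frac{\partial A_j}{\partial x^n} - \Gamma^l_{jn}A_l\right] - \Gamma^l_{jp}\left[\frac{\partial A_l}{\partial x^n} - \Gamma^k_{ln}A_k\right] - \Gamma^l_{np}\left[\frac{\partial A_j}{\partial x^l} - \Gamma^k_{jl}A_k\right]$$

$$= \frac{\partial^2A_j}{\partial x^n\partial x^p} - \Gamma^l_{jn}\frac{\partial A_l}{\partial x^p} - A_l\frac{\partial\Gamma^l_{jn}}{\partial x^p} - \Gamma^l_{jp}\frac{\partial A_l}{\partial x^n} + \Gamma^l_{jp}\Gamma^k_{ln}A_k - \Gamma^l_{np}\frac{\partial A_j}{\partial x^l} + \Gamma^l_{np}\Gamma^k_{jl}A_k$$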


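Second covariant derivative of a scalar $\phi$ is

$$\phi_{;\mu\nu} = \partial_\nu\phi_{;\mu} - \Gamma^\alpha_{\nu\mu}\phi_{;\alpha} = \partial_\mu\partial_\nu\phi - \Gamma^\alpha_{\nu\mu}\partial_\alpha\phi$$

The $\Gamma$-term is present since the first derivative is a covariant vector (see the derivation of the covariant derivative of a scalar field (39)).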

Riemann-Christoffel tensor

Take now the expression for the second covariant derivative $A_{j;np}$ of an arbitrary vector $A_j$ (see equation (41)) and then take it again but with reversed order of differentiation ($A_{j;pn}$). Finally, subtract these results to obtain

$$A_{j;np} - A_{j;pn} = A_lR^l{}_{jnp} \qquad (42)$$

where

$$R^l{}_{jnp} = \frac{\partial}{\partial x^n}\Gamma^l_{jp} - \frac{\partial}{\partial x^p}\Gamma^l_{jn} + \Gamma^l_{ns}\Gamma^s_{jp} - \Gamma^l_{ps}\Gamma^s_{jn} \qquad (43)$$

is the Riemann-Christoffel tensor. In 4D this tensor has 256 elements with ten terms on each RHS, by the summation convention. Note that

1. In flat space, $R$ is zero because the second covariant derivatives are symmetric.

2. If ever the Riemann tensor is non-zero we are dealing with a curved space, and the non-zero components give information on intrinsic space curvature.

3. The Riemann tensor involves the Christoffel symbols and their derivatives and does not depend on the choice of the vector $A_j$.

Riemann-Christoffel tensor: symmetries

1. Antisymmetry in the last pair of indices

$$R^l{}_{jnp} = -R^l{}_{jpn}$$

which follows directly from the definition of the Riemann tensor (see equation 43).

2. Cyclic symmetry:

$$R^l{}_{ijk} + R^l{}_{jki} + R^l{}_{kij} = 0$$

(prove this by taking the third covariant derivative of a scalar $\Phi$ and by permuting the indices several times).


[Figure: closed path PQRSP around a small rectangle with sides $dx^j$ and $dx^k$]

Intrinsic curvature and its relation to parallel transport

Consider a rectangle with sides $dx^j$ and $dx^k$ and a vector $\vec{v}$ located at $P$ that is parallel-transported along $PQRSP$. It can be shown that the change in the $l$ component of $\vec{v}$ is

$$v^l_{||} - v^l = R^l{}_{ijk}v^idx^jdx^k$$

that is, the change is proportional to the original vector $\vec{v}$ and to the displacements $d\vec{x}$, and these are linked by the Riemann-Christoffel tensor. A non-zero change implies intrinsic curvature. In terms of the matrices $||\Gamma_\alpha||$ introduced earlier

$$R^i{}_{jkl} = \text{element in row } i \text{ and column } j \text{ of } ||B_{kl}||$$

where

$$||B_{kl}|| = \partial_k||\Gamma_l|| - \partial_l||\Gamma_k|| + ||\Gamma_k||\,||\Gamma_l|| - ||\Gamma_l||\,||\Gamma_k||$$

So, here is the recipe to calculate Rijkl by hand:

1. Assemble the four matrices ||Γα||.

2. Evaluate the six matrices ||Bkl|| as above. Note that we have six ratherthan sixteen, because of the anti-symmetry in k and l.
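The recipe can be tried out in two dimensions, where there are only two matrices $||\Gamma_\alpha||$ and a single independent $||B_{kl}||$. The sketch below is illustrative only: the helper names are made up, the derivatives are approximated by central differences rather than taken analytically, and the Christoffel symbols of the unit sphere ($\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$, $\Gamma^\phi_{\theta\phi} = \cot\theta$) are standard values assumed here rather than derived in this section.

```python
import math


def mat_mul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def b_matrix(gammas, x, k, l, h=1e-5):
    """||B_kl|| = d_k||G_l|| - d_l||G_k|| + ||G_k|| ||G_l|| - ||G_l|| ||G_k|| at point x.

    gammas(x) must return the list of matrices ||Gamma_sigma||, whose element
    in row alpha and column beta is Gamma^alpha_{sigma beta}.
    """
    n = len(x)

    def partial(direction, sigma):
        # Central-difference derivative of ||Gamma_sigma|| along one coordinate.
        xp, xm = list(x), list(x)
        xp[direction] += h
        xm[direction] -= h
        gp, gm = gammas(xp)[sigma], gammas(xm)[sigma]
        return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)]
                for i in range(n)]

    g = gammas(x)
    dk_gl, dl_gk = partial(k, l), partial(l, k)
    gk_gl, gl_gk = mat_mul(g[k], g[l]), mat_mul(g[l], g[k])
    return [[dk_gl[i][j] - dl_gk[i][j] + gk_gl[i][j] - gl_gk[i][j]
             for j in range(n)] for i in range(n)]


def polar_gammas(x):
    """||Gamma_r|| and ||Gamma_theta|| for plane polar coordinates (r, theta)."""
    r, _theta = x
    return [[[0.0, 0.0], [0.0, 1.0 / r]],
            [[0.0, -r], [1.0 / r, 0.0]]]


def sphere_gammas(x):
    """||Gamma_theta|| and ||Gamma_phi|| for the unit sphere (assumed standard values)."""
    theta, _phi = x
    cot = math.cos(theta) / math.sin(theta)
    return [[[0.0, 0.0], [0.0, cot]],
            [[0.0, -math.sin(theta) * math.cos(theta)], [cot, 0.0]]]


if __name__ == "__main__":
    print("polar  B_01:", b_matrix(polar_gammas, [2.0, 0.7], 0, 1))
    print("sphere B_01:", b_matrix(sphere_gammas, [1.0, 0.3], 0, 1))
```

For polar coordinates every entry of $||B_{r\theta}||$ should vanish to within the finite-difference error, as expected for flat space, whereas for the sphere the row $\theta$, column $\phi$ entry should come out close to $\sin^2\theta$.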

Riemann curvature tensor

We shall now introduce the wholly covariant Riemann curvature tensor

$$R_{rjnp} = g_{rl}R^l{}_{jnp} \qquad (44)$$

This tensor has many symmetries:

$$R_{rjnp} = -R_{rjpn} \qquad (45)$$

$$R_{rjnp} = -R_{jrnp} \qquad (46)$$

$$R_{rjnp} = R_{nprj} \qquad (47)$$

$$R_{rjnp} + R_{rnpj} + R_{rpjn} = 0 \qquad (48)$$

With these symmetries and antisymmetries the number of independent components of the Riemann-Christoffel tensor is not $n^4$, but a far more limited number. It can be proved that in an $n$-dimensional Riemann space, the independent covariant components are $n^2(n^2-1)/12$ in number. Thus, in a 4D space, the number of independent components is 20.
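As a quick arithmetic illustration of the counting formula, the sketch below (the function name is hypothetical) evaluates $n^2(n^2-1)/12$ next to the naive count $n^4$ for a few dimensions.

```python
def riemann_independent_components(n):
    """Number of independent components, n^2 (n^2 - 1) / 12, in n dimensions."""
    return n * n * (n * n - 1) // 12


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, n**4, riemann_independent_components(n))
```

In two dimensions a single component survives, which is why the curvature of a surface can be described by one number at each point.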

Ricci tensor and scalar

The Ricci tensor is given by

$$R_{jn} = R^l{}_{jln} = g^{sl}R_{sjln} \qquad (49)$$

and is the contraction of $R^l{}_{jnp}$ on the first and third indices.

In principle, we could instead contract on the first and second indices, or on the first and fourth, etc. However, since $R_{sjln}$ is antisymmetric on $s$ and $j$ and on $l$ and $n$, all the other possible contractions would either be identically zero or reduce to $\pm R_{jn}$. Consequently, the Ricci tensor is essentially the only contraction of the Riemann tensor.

Similarly, the Ricci scalar is defined by

$$R = g^{jn}R_{jn} \qquad (50)$$

Bianchi’s identities

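If we take the derivative of the wholly covariant Riemann tensor and evaluate the result in locally inertial coordinates (coordinates in which the Christoffel symbols all vanish at a given point), we find

$$R_{mjnp,r} = \frac{1}{2}\frac{\partial}{\partial x^r}\left(\frac{\partial^2g_{mp}}{\partial x^j\partial x^n} - \frac{\partial^2g_{mn}}{\partial x^j\partial x^p} + \frac{\partial^2g_{jn}}{\partial x^m\partial x^p} - \frac{\partial^2g_{jp}}{\partial x^m\partial x^n}\right)$$

that is,

$$R_{mjnp,r} = \frac{1}{2}\left(g_{mp,jnr} - g_{mn,jpr} + g_{jn,mpr} - g_{jp,mnr}\right)$$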


From this equation, and by using the symmetry $g_{ij} = g_{ji}$ and the fact that partial derivatives commute, it is possible to obtain the following identities by permuting $n$, $p$ and $r$:

$$R_{mjnp,r} + R_{mjrn,p} + R_{mjpr,n} = 0$$

Since in our (inertial) coordinate system the Christoffel symbols are equal to zero, this equation is equivalent to

$$R_{mjnp;r} + R_{mjrn;p} + R_{mjpr;n} = 0 \qquad (51)$$

which, being a tensor equation, is valid in any system. These are called Bianchi’s identities. They form a set of 1024 equations, most of which say nothing. There are only 24 identities of a non-trivial kind.

Einstein’s tensor

Consider again Bianchi’s identities

$$R_{mjnp;r} + R_{mjrn;p} + R_{mjpr;n} = 0$$

If we multiply by $g^{mp}g^{jn}$ we obtain

$$R_{;r} - g^{mp}R_{mr;p} - g^{jn}R_{jr;n} = 0$$

The last two terms are identical (rename the indices). Thus

$$R_{;r} - 2g^{mp}R_{mr;p} = 0$$

or

$$R_{;r} - 2R^p{}_{r;p} = 0$$

Multiply now the first term by the mixed tensor $g^p{}_r = \delta^p_r$:

$$g^p{}_rR_{;p} - 2R^p{}_{r;p} = 0$$

Note that raising an index of the metric tensor is equivalent to contracting it with its inverse, yielding the Kronecker delta. That is, $g^{ij}g_{jk} = g^i{}_k = \delta^i_k$.

Thus

$$\left(R^p{}_r - \frac{1}{2}g^p{}_rR\right)_{;p} = 0$$

The quantity in parentheses is the Einstein tensor

$$G^p{}_r = R^p{}_r - \frac{1}{2}g^p{}_rR \qquad (52)$$

The Einstein tensor has the important property of zero covariant divergence:

$$G^p{}_{r;p} = 0$$

In general relativity, Einstein uses the divergenceless tensor $G_{ij}$ and $R^i{}_j$ to write down physical equations relating to geometry in covariant form, that is, in “coordinate independent” form.

The low gravitational field limit

The position of a particle in space-time is

$$x^i = (t, x, y, z) = (x^0, x^1, x^2, x^3)$$

where

$$\vec{r} = (x, y, z) = (x^1, x^2, x^3)$$

is its position in 3-D space. We have

$$c^2d\tau^2 = ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2 = c^2(dx^0)^2 - (dx^1)^2 - (dx^2)^2 - (dx^3)^2$$

so the metric tensor is

$$g_{\mu\nu} = \begin{bmatrix} c^2 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}$$

Now we wish to obtain a metric outside the distribution of matter of a spherically symmetric star of mass $M$. For a weak field, we expect the metric to be a perturbation of the special relativity metric given above. Assuming spherical symmetry of the spatial part of the metric, we have

$$g_{\mu\nu} = \begin{bmatrix} (1+f_{00})c^2 & 0 & 0 & 0 \\ 0 & -(1+f_{11}) & 0 & 0 \\ 0 & 0 & -(1+f_{22})r^2 & 0 \\ 0 & 0 & 0 & -(1+f_{33})r^2\sin^2\theta \end{bmatrix} \qquad (53)$$

where

$$f_{\mu\mu} = f_{\mu\mu}(r), \qquad |f_{\mu\mu}| \ll 1$$

Let’s now study the motion in the radial direction of a test particle of unit mass in the field of this metric. The geodesic equation for radial acceleration is

$$\frac{d^2r}{d\tau^2} + \Gamma^r_{\nu\lambda}\frac{dx^\nu}{d\tau}\frac{dx^\lambda}{d\tau} = 0$$


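If we expand the summation we obtain (a dot denotes differentiation with respect to $\tau$):

$$\ddot{r} + \Gamma^r_{tt}\dot{t}^2 + \Gamma^r_{rr}\dot{r}^2 + \Gamma^r_{r\theta}\dot{r}\dot{\theta} + \cdots + \Gamma^r_{\phi\phi}\dot{\phi}^2 = 0$$

By assuming pure radial motion ($\dot{\theta} = 0 = \dot{\phi}$), we get

$$\ddot{r} + \Gamma^r_{tt}\dot{t}^2 + \Gamma^r_{rr}\dot{r}^2 + 2\Gamma^r_{rt}\dot{t}\dot{r} = 0 \qquad (54)$$

Remember that

$$\Gamma^\sigma_{\mu\lambda} = \frac{1}{2}g^{\sigma\nu}\left[\frac{\partial g_{\mu\nu}}{\partial x^\lambda} + \frac{\partial g_{\nu\lambda}}{\partial x^\mu} - \frac{\partial g_{\mu\lambda}}{\partial x^\nu}\right]$$

So, the Christoffel symbols needed in this equation are

$$\Gamma^r_{rr} = \frac{1}{2}g^{r\alpha}(2g_{r\alpha,r} - g_{rr,\alpha})$$

$$\Gamma^r_{rt} = \frac{1}{2}g^{r\alpha}(g_{r\alpha,t} + g_{t\alpha,r} - g_{rt,\alpha})$$

$$\Gamma^r_{tt} = \frac{1}{2}g^{r\alpha}(2g_{t\alpha,t} - g_{tt,\alpha})$$

The metric tensor $g_{\mu\nu}$ has no off-diagonal components and is assumed to be time-independent. Therefore, when the summations over $\alpha$ are carried out, the Christoffel symbols reduce to

$$\Gamma^r_{rr} = \frac{1}{2}g^{rr}g_{rr,r}$$

$$\Gamma^r_{rt} = 0$$

$$\Gamma^r_{tt} = -\frac{1}{2}g^{rr}g_{tt,r}$$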

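The contravariant metric tensor corresponding to (53) is

$$g^{\mu\nu} = \begin{bmatrix} \dfrac{1}{(1+f_{00})c^2} & 0 & 0 & 0 \\ 0 & -\dfrac{1}{(1+f_{11})} & 0 & 0 \\ 0 & 0 & -\dfrac{1}{(1+f_{22})r^2} & 0 \\ 0 & 0 & 0 & -\dfrac{1}{(1+f_{33})r^2\sin^2\theta} \end{bmatrix} \qquad (55)$$

and the Christoffel symbols are

$$\Gamma^r_{rr} = \frac{1}{2}\frac{f_{11,r}}{(1+f_{11})}, \qquad \Gamma^r_{tt} = \frac{1}{2}\frac{c^2f_{00,r}}{(1+f_{11})}$$

If we now take $1 + f_{11} \approx 1$, the above equations become

$$\Gamma^r_{rr} = \frac{1}{2}f_{11,r}, \qquad \Gamma^r_{tt} = \frac{c^2}{2}f_{00,r}$$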

Substitution of these Christoffel symbols into equation (54) yields

$$\ddot{r} + \frac{1}{2}f_{11,r}\dot{r}^2 + \frac{c^2}{2}f_{00,r}\dot{t}^2 = 0 \quad\text{(to 1st order)}$$

If the particle’s velocity is $v \ll c$, then $d\tau \approx dt$ and

$$\frac{d^2r}{dt^2} = -\frac{c^2}{2}\left[f_{00,r} + \frac{v_r^2}{c^2}f_{11,r}\right] \qquad (56)$$

where $v_r = \dot{r}$ is the radial velocity of the particle. Since $v_r \ll c$, then

$$\frac{d^2r}{dt^2} = -\frac{c^2}{2}f_{00,r} \qquad (57)$$

which is the counterpart of the Newtonian equation for the radial motion of a particle of unit mass moving in a gravitational field

$$\frac{d^2r}{dt^2} = -(\nabla V_g) \qquad (58)$$

where

$$V_g = -\frac{MG}{r}$$

is the potential due to the star of mass $M$ and $G$ is the gravitational constant. Comparison of the Newtonian equation (58) with (57) shows that

$$f_{00} = -\frac{2GM}{c^2r}.$$

In the perturbed Lorentz metric given in (53), therefore, we have

$$g_{00} = c^2(1 + f_{00})$$

and the metric is

$$c^2d\tau^2 = ds^2 = c^2\left(1 - \frac{2GM}{c^2r}\right)dt^2 - d\sigma^2 \qquad (59)$$

where $d\sigma^2$ is the squared 3-D line element. This metric is that of a static, spherically symmetric gravitational field.

Thus, we have deduced that the perturbation f00 has the character of agravitational potential. Hence, if we adopt the geodesic principle, the metricelements are effectively potentials which describe gravity.
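The identification $f_{00} = -2GM/(c^2r)$ can be checked numerically by comparing the weak-field geodesic acceleration $-\tfrac{c^2}{2}f_{00,r}$ of equation (57) with the Newtonian value $-GM/r^2$. The sketch below assumes approximate SI values for $G$ and $c$, uses a central difference for the derivative, and takes the Earth's mass and radius purely as an illustrative test case; none of these numbers come from the notes.

```python
G = 6.674e-11  # gravitational constant in SI units (approximate, assumed)
C = 2.998e8    # speed of light in m/s (approximate, assumed)


def f00(r, mass):
    """Weak-field perturbation f00 = -2GM / (c^2 r)."""
    return -2.0 * G * mass / (C**2 * r)


def geodesic_acceleration(r, mass, h=10.0):
    """-(c^2 / 2) df00/dr, with the derivative taken by central difference."""
    df_dr = (f00(r + h, mass) - f00(r - h, mass)) / (2.0 * h)
    return -0.5 * C**2 * df_dr


def newtonian_acceleration(r, mass):
    """Radial acceleration -dV_g/dr for V_g = -GM / r."""
    return -G * mass / r**2


if __name__ == "__main__":
    m_earth, r_earth = 5.972e24, 6.371e6  # illustrative values in kg and m
    print(geodesic_acceleration(r_earth, m_earth))
    print(newtonian_acceleration(r_earth, m_earth))
```

Both numbers should agree to many digits and be close to the familiar surface gravity of about 9.8 m/s², directed inwards.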


Einstein Field Equations

Field equations of empty space

From classical Newtonian gravity we know that the gravitational potential $V_g$ satisfies Laplace’s equation in empty space

$$\nabla^2V_g = 0$$

Einstein postulated that the most general equation that will reduce to Laplace’s equation in the weak field limit can be obtained by setting the Ricci tensor equal to zero:

$$R_{\alpha\beta} = 0 \qquad (R_{\alpha\beta} = R_{\beta\alpha})$$

This yields a set of 10 independent equations which determine the values of $g_{\mu\nu}$ in empty space.

A more general possibility is that in empty space we may have

$$R_{\alpha\beta} = -\Lambda g_{\alpha\beta}$$

where $\Lambda$ is a universal constant. This additional cosmological term gives space-time as a whole an extra curvature, which could be positive or negative depending on the sign of $\Lambda$. However, later on, Einstein referred to this term as the “biggest blunder of his life”. You’ll see later on in this course that this term might in fact be very important for current cosmological models.

Field equations in space with matter/radiation

Here, we shall see how we can link Einstein’s tensor to the energy tensor. Historically, this was not straightforward, since they were born out of different considerations. Einstein’s tensor was born out of geometrical considerations, while the energy tensor was born out of physical considerations.

In the presence of a distribution of matter of volume density $\rho$, from classical Newtonian gravity we know that the gravitational potential $V_g$ satisfies Poisson’s equation

$$\nabla^2V_g = -4\pi G\rho$$

In its relativistic form, the LHS of Poisson’s equation can be represented by the Einstein tensor $G_{\alpha\beta}$:

$$G_{\alpha\beta} = R_{\alpha\beta} - \frac{1}{2}g_{\alpha\beta}R$$

Therefore, on the RHS, we must have a second rank tensor representing the matter and energy density in the given region.

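Let’s designate this tensor $T_{\alpha\beta}$. So

$$G_{\alpha\beta} = AT_{\alpha\beta}$$

where $A$ is a constant to be determined. Since $G_{\alpha\beta}$ is divergenceless, the matter tensor $T_{\alpha\beta}$ must also have a vanishing covariant divergence. Anticipating later requirements, we define, tentatively, the contravariant components of this tensor for a distribution of non-interacting mass particles as

$$T^{\alpha\beta} = \rho_0\frac{dx^\alpha}{ds}\frac{dx^\beta}{ds} \qquad (60)$$

where $\rho_0$ is the volume mass density, as measured in the rest frame of the particles making up the matter, $ds$ is an element of the worldlines of the particles and the $dx^\mu/ds$ are the worldline velocities. So

$$R_{\alpha\beta} - \frac{1}{2}g_{\alpha\beta}R = AT_{\alpha\beta} \qquad (61)$$

$$g^{\alpha\gamma}R_{\alpha\beta} - \frac{1}{2}g^{\alpha\gamma}g_{\alpha\beta}R = Ag^{\alpha\gamma}T_{\alpha\beta}$$

$$R^\gamma{}_\beta - \frac{1}{2}\delta^\gamma_\beta R = AT^\gamma{}_\beta$$

Contracting on $\gamma$ and $\beta$ (in 4D, $\delta^\gamma_\gamma = 4$) gives

$$R - \frac{4}{2}R = AT$$

$$R = -AT$$

Substituting $R = -AT$ into equation (61) gives

$$R_{\alpha\beta} = A\left(T_{\alpha\beta} - \frac{1}{2}g_{\alpha\beta}T\right) \qquad (62)$$

which is an alternative version of equation (61). We can now use this to determine $A$, by requiring that it reduces to the classical Poisson’s equation in the low field limit.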


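If we assume a stationary dust cloud ($dx = dy = dz = 0$, $d\tau = dt$), the only non-zero component of $T^{\alpha\beta} = \rho_0\dfrac{dx^\alpha}{ds}\dfrac{dx^\beta}{ds}$ is $T^{00}$, which has the value

$$T^{00} = \frac{\rho_0}{c^2} \quad\text{since}\quad \frac{dt}{ds} = \frac{1}{c}$$

We also have

$$T = g_{00}T^{00} = (1+f_{00})c^2\frac{\rho_0}{c^2} \approx \rho_0$$

and

$$T_{00} = g_{00}T = (1+f_{00})c^2\rho_0 \approx c^2\rho_0$$

where we have neglected the product of $\rho_0$ and $f_{00}$. Furthermore, it can be shown that $R_{00} = -\dfrac{c^2}{2}\nabla^2f_{00}$. Thus, equation (62) gives

$$-\frac{c^2}{2}\nabla^2f_{00} = A\left(c^2\rho_0 - \frac{1}{2}c^2\rho_0\right)$$

$$\nabla^2f_{00} = -A\rho_0$$

We found earlier that for the static, spherically symmetric weak gravitational field

$$f_{00} = -\frac{2V_g}{c^2}$$

so we get

$$\nabla^2V_g = \frac{c^2A}{2}\rho_0$$

which is identical to Poisson’s equation $\nabla^2V_g = -4\pi G\rho_0$ if we set

$$A = -\frac{8\pi G}{c^2}$$

In summary, the field equations for the metric $g_{\mu\nu}$ in the presence of matter are

$$R_{\alpha\beta} - \frac{1}{2}g_{\alpha\beta}R = -\frac{8\pi G}{c^2}T_{\alpha\beta} \qquad (63)$$

These are known as the Einstein field equations.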


The matter-energy tensor

Here, we will assume again that the fluid which comprises the universe is without pressure and viscosity. Thus, it is a “dusty” universe.

We have just seen that the energy tensor is defined by

$$T^{\alpha\beta} = \rho_0\frac{dx^\alpha}{ds}\frac{dx^\beta}{ds}$$

In a Lorentz frame having coordinates $x^\mu = (ct, x, y, z)$, the metric is

$$c^2d\tau^2 = ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2 = \frac{c^2dt^2}{\gamma^2}$$

where $\tau$ is the proper time and

$$\gamma = \frac{dt}{d\tau} = \frac{1}{\sqrt{1 - \dfrac{V^2}{c^2}}}$$

Therefore, the components of $T^{\mu\nu}$ are

$$T^{\mu\nu} = \frac{\rho_0\gamma^2}{c^2}\begin{bmatrix} 1 & \dot{x} & \dot{y} & \dot{z} \\ \dot{x} & \dot{x}^2 & \dot{x}\dot{y} & \dot{x}\dot{z} \\ \dot{y} & \dot{y}\dot{x} & \dot{y}^2 & \dot{y}\dot{z} \\ \dot{z} & \dot{z}\dot{x} & \dot{z}\dot{y} & \dot{z}^2 \end{bmatrix} \qquad (64)$$

where the dot indicates differentiation with respect to the time coordinate $t$ and $\rho_0$ is the rest (proper) mass density of the dust. The two factors of $\gamma$ allow for the foreshortening of the unit volume in the direction of motion and for the relativistic increase in mass. The other components give quantities related to the energy density due to motion, similar to kinetic energy but with some cross terms.

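The justification for the above definition of the energy tensor is that, in the absence of forces and for a SR metric, the 4 equations

$$\frac{\partial T^{\mu\nu}}{\partial x^\nu} = 0 \qquad (65)$$

give the equations of the conservation of energy and momentum of special relativistic fluid mechanics. In the limit of low velocities, we get

$$c^2T^{0\nu}{}_{,\nu} = \frac{\partial\rho}{\partial t} + \frac{\partial}{\partial x}(\rho v_x) + \frac{\partial}{\partial y}(\rho v_y) + \frac{\partial}{\partial z}(\rho v_z) = 0$$

which is the continuity equation of fluid flow, which has the vector form

$$\vec{\nabla}\cdot(\rho\vec{v}) = -\dot{\rho}$$

Furthermore,

$$c^2T^{1\nu}{}_{,\nu} = \frac{\partial}{\partial t}(\rho v_x) + \frac{\partial}{\partial x}(\rho v_x^2) + \frac{\partial}{\partial y}(\rho v_xv_y) + \frac{\partial}{\partial z}(\rho v_xv_z) = 0$$

After carrying out the differentiations and using the continuity equation, this equation can be expressed as

$$\rho\frac{\partial v_x}{\partial t} + \rho\,\vec{v}\cdot\vec{\nabla}v_x = 0$$

which is the $x$ component of the equation of motion of a fluid under no external forces (Euler’s equation). The remaining component equations can also be derived from the same equation ($\partial T^{\mu\nu}/\partial x^\nu = 0$).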

The restriction of non-interacting particles may be partially lifted by the addition of a term to $T^{\mu\nu}$ proportional to the fluid pressure. This must be done in such a way that the resulting energy tensor is covariant and divergenceless. So, in the presence of pressure, an appropriate divergenceless tensor is

$$T^{\alpha\beta} = \left(\rho_0 + \frac{p}{c^2}\right)\frac{dx^\alpha}{d\tau}\frac{dx^\beta}{d\tau} - \frac{p}{c^2}g^{\alpha\beta}$$

and similar expressions can be obtained when radiation is included.


Cosmology

Observables in astronomy

Red shift: $z = \dfrac{\lambda_{obs} - \lambda_{lab}}{\lambda_{lab}} = \dfrac{\lambda_{obs}}{\lambda_{lab}} - 1 \quad \left(= \dfrac{\Delta\lambda}{\lambda}\right)$

Apparent flux: $f_\nu$

Distance: $d$

Classical result: $z = \dfrac{\Delta\lambda}{\lambda_{lab}} = \dfrac{v_r}{c}$

where $v_r$ is the recessional velocity of the object. Measuring the distance $d$ is not easy! The diagram below is Hubble’s very first diagram, published in 1929, in which he reported the famous result $V = H_0d$.

Figure 1: Hubble found $V = H_0d$. (Panel labels: close, far, very far.)

Quasars

1960s: Third Cambridge catalogue of radio sources. Mostly external galaxies emitting in the radio region.

1962: 3C48. No galaxy at its position, but a faint blue star.

3C273: Shown to be a stellar-like point source using occultation by the Moon. The abrupt disappearance of the radio emission from 3C273 allowed Cyril Hazard to identify its optical counterpart.

Maarten Schmidt: Obtained a spectrum, identified as red-shifted Balmer lines of hydrogen.

$z = 0.16$ (16% of the speed of light)

Hubble law → $d \sim 10^9$ pc ⇒ Extremely luminous!

Today we know that these quasi-stellar sources, named “Quasars”, are associated with $10^9 M_\odot$ black holes and are among the most distant objects in the universe ($z \approx 6$).

Naive Interpretation

[Figure: the distribution of matter at $t = 0$ and at $t = t_0$]

• Ignore gravity (free expansion)

• Set up an inertial frame (space Euclidean) with us at the origin O

• All matter, including us, is concentrated at the origin O at $t = 0$

• Explosion (isotropic) at $t = 0$ with a distribution of velocities in the inertial frame with $v \le c$

The objects that are initially fastest move furthest in a given time.

At $t = t_0$ (present time)

$$d_a = V_at_0, \quad d_b = V_bt_0, \quad d_c = V_ct_0, \ldots$$

for different galaxies $a, b, c, \ldots$ with velocities $V_a, V_b, V_c, \ldots$ at $t = 0$.

Generally

$$d = Vt_0$$

$$t_0 = \frac{d_a}{V_a} = \frac{d_b}{V_b} = \cdots$$

Also

$$V = \frac{d}{t_0} = H_0d$$

Since $V \le c$, the radius of the universe is

$$d_{max} = ct_0$$

This is the distance reached by the objects that had speeds $V = c$ at $t = 0$.

[Figure: the Hubble sphere of radius $d_{max} = ct_0$ at $t = t_0$ (now); objects on its surface have $v = c$]

Estimate of age (classical model)

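If $V = H_0d$, then if we extrapolate backwards we find that all objects are on top of each other at

$$t_0 = t_H = \frac{d}{V} = \frac{1}{H_0}$$

At Hubble’s time

$$H_0 = 530\ \mathrm{km\,s^{-1}\,Mpc^{-1}}$$

$$t_H = 2\times10^9\ \text{years!!}$$

Currently

$$H_0 = (50\text{–}100)\ \mathrm{km\,s^{-1}\,Mpc^{-1}}$$

For

$$H_0 = 65\ \mathrm{km\,s^{-1}\,Mpc^{-1}}$$

$$t_H = \frac{3\times10^{18}\times10^6}{65\times10^5}\,\frac{1}{3\times10^7}\ \text{years} = 15\times10^9\ \text{years}$$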
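The same unit conversion can be scripted so that other values of $H_0$ are easy to try. This is a small sketch with assumed conversion constants (kilometres per megaparsec and seconds per year, both approximate); the function name is illustrative.

```python
KM_PER_MPC = 3.0857e19     # kilometres in one megaparsec (approximate, assumed)
SECONDS_PER_YEAR = 3.156e7  # seconds in one year (approximate, assumed)


def hubble_time_years(h0_km_s_mpc):
    """Hubble time t_H = 1 / H0 in years, for H0 given in km/s/Mpc."""
    seconds = KM_PER_MPC / h0_km_s_mpc
    return seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    for h0 in (530.0, 100.0, 65.0, 50.0):
        print(h0, hubble_time_years(h0))
```

With $H_0 = 530$ the result is a little under $2\times10^9$ years, and with $H_0 = 65$ it is about $15\times10^9$ years, matching the estimates above.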


Problems with this model

• We are at the centre of the explosion, which places us in a privileged position.

• The universe will look quite different to different observers; for instance, the universe would have an edge.

• Space is infinite as in classical ideas, but there are no objects outside $d_{max} = ct_0$.

The main problem with this picture is that the universe is not homogeneous or isotropic, thus violating the cosmological principle (CP). The CP states that we do not occupy a privileged place in the Universe, as in the Copernican view, and that the Universe appears homogeneous and isotropic to every observer located anywhere in the Universe at all times.

This is obviously untrue on small scales, but on a very large scale, the CP is expected to apply.

The Perfect Cosmological Principle (PCP) goes even further: it states not only that we do not occupy a privileged place in the Universe, but also that we do not occupy a preferred time in the Universe. That is, the Universe appears homogeneous and isotropic and the same to every observer located anywhere in the Universe at all times. Thus, the Universe does not begin or die. This is the basis of the steady-state models of the Universe.

Modern point of view: the Cosmological Principle

Following what we have just seen, any viable model of the universe must have the following characteristics:

At any epoch (time) the universe must look

CP(a) homogeneous (same at every location)

CP(b) isotropic (same in every direction)

to every observer in the universe.

The above is called the Cosmological Principle (CP).

CP(c) Every observer at any given epoch must see the same Hubble law if the universe expands.

Model universes

The Minkowski model

Consider a coordinate system $S: (t, x^1, x^2, x^3)$ and a metric

$$ds^2 = c^2d\tau^2 = c^2dt^2 - dx^2 - dy^2 - dz^2$$

Now assume that

1. the galaxies are uniformly distributed in space at $t = 0$;

2. the coordinates $(x_\alpha, y_\alpha, z_\alpha)$ of any galaxy $\alpha$ are the same for all times (no mass, no gravity and thus no force acting on galaxies).

The Minkowski metric is the metric that one would have according to special relativity. Thus, for any galaxy,

$$dx = dy = dz = 0$$

therefore

$$ds^2 = c^2dt^2 \Rightarrow c\tau = \int\sqrt{ds^2} = c\int dt = ct$$

So, the coordinate $t$ measures proper time.

The “proper distance”, that is, the distance between two galaxies as measured by an observer at the same time $t$ ($dt = 0$) in the reference system $S$, is given by (1D case for illustration purposes)

$$d_{12} = \int_1^2\sqrt{-ds^2} = \int_{x_1}^{x_2}dx = (x_2 - x_1)$$

More generally

$$d_{12} = |\vec{r}_2 - \vec{r}_1|$$

This gives a static universe. To justify the special relativity metric we require a very low density of matter, that is, no gravity. This universe is an “empty space” universe.

Universes of constant positive curvature: spatial distance element

Case A: 1D circumference of a 2-D circle

$$x_1^2 + x_2^2 = R^2$$

$$x_1 = R\cos\phi$$

$$x_2 = R\sin\phi$$

so the element of length (given by the spatial distance) is

$$dl^2 = dx_1^2 + dx_2^2 = R^2d\phi^2$$

and the circumference (which we can call the “volume” of this universe) is

$$\int_0^{2\pi}R\,d\phi = 2\pi R$$

that is, the above quantity is the full extent of the universe.

Case B: 2D area of a 3D sphere

$$x_1^2 + x_2^2 + x_3^2 = R^2$$

$$x_1 = R\sin\theta\cos\phi$$

$$x_2 = R\sin\theta\sin\phi$$

$$x_3 = R\cos\theta$$

so the element of length is given by

$$dl^2 = dx_1^2 + dx_2^2 + dx_3^2 = R^2(d\theta^2 + \sin^2\theta\,d\phi^2)$$

and the area (which is again the full “volume” of the universe) is

$$\int_0^{2\pi}\int_0^{\pi}R^2\sin\theta\,d\theta\,d\phi = 4\pi R^2$$

Case C: 3D area of a 4D sphere

These are spaces defined by

$$x_1^2 + x_2^2 + x_3^2 + x_4^2 = R^2$$

with parametrisation

$$x_1 = R\sin\psi\sin\theta\cos\phi$$

$$x_2 = R\sin\psi\sin\theta\sin\phi$$

$$x_3 = R\sin\psi\cos\theta$$

$$x_4 = R\cos\psi$$

so the element of length is

$$dl^2 = dx_1^2 + dx_2^2 + dx_3^2 + dx_4^2 = R^2[d\psi^2 + \sin^2\psi(d\theta^2 + \sin^2\theta\,d\phi^2)]$$

and the 3D surface area of the 4D sphere is

$$\int_0^{\pi}\int_0^{\pi}\int_0^{2\pi}R^3\sin\theta\sin^2\psi\,d\phi\,d\psi\,d\theta = 2\pi^2R^3$$

This surface area would in fact be the 3D volume of the universe that we inhabit (but only if our universe happened to be a 4D sphere).
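The three “volumes” found in Cases A, B and C can be reproduced by direct numerical integration of the corresponding length, area and volume elements. The following is a minimal sketch using a hand-written midpoint rule; the function names are illustrative and the trivial $\phi$ integral is replaced by its value $2\pi$ where noted.

```python
import math


def midpoint(func, a, b, n=2000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


def circumference(radius):
    """Case A: integral of R dphi over 0..2pi."""
    return midpoint(lambda phi: radius, 0.0, 2.0 * math.pi)


def area_2sphere(radius):
    """Case B: integral of R^2 sin(theta) dtheta dphi (phi integral gives 2pi)."""
    return 2.0 * math.pi * midpoint(lambda th: radius**2 * math.sin(th), 0.0, math.pi)


def volume_3sphere(radius):
    """Case C: integral of R^3 sin(theta) sin^2(psi) dphi dpsi dtheta."""
    theta_part = midpoint(math.sin, 0.0, math.pi)
    psi_part = midpoint(lambda psi: math.sin(psi) ** 2, 0.0, math.pi)
    return radius**3 * 2.0 * math.pi * theta_part * psi_part


if __name__ == "__main__":
    R = 1.5
    print(circumference(R), 2.0 * math.pi * R)
    print(area_2sphere(R), 4.0 * math.pi * R**2)
    print(volume_3sphere(R), 2.0 * math.pi**2 * R**3)
```

Each numerical value should agree with the closed form printed next to it to within the accuracy of the midpoint rule.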

In this 3D space, consider now the 2D surfaces which we obtain by setting $\psi$ = constant. Then

$$x_1^2 + x_2^2 + x_3^2 = R^2 - x_4^2 = R^2 - R^2\cos^2\psi = R^2\sin^2\psi$$

Set $p = R\sin\psi$ = constant, since $\psi$ is constant. The equation of the surface is

$$x_1^2 + x_2^2 + x_3^2 = p^2$$

which is a sphere with area $4\pi p^2$. As $\psi$ increases from 0 to $\pi$, one moves outwards from the north pole ($\psi = 0$) of the 3D space through successive spheres of area $4\pi p^2$. The area increases until $\psi = \pi/2$, after which it decreases until it is zero at $\psi = \pi$. These concentric spheres of a 4D sphere correspond to the latitudinal circles of an ordinary sphere.

Note that here $p$ is not the radial distance! Nevertheless it serves as a coordinate in the radial direction. Since it fixes the area of each sphere, it is known as the area distance coordinate. We have

$$dp = R\cos\psi\,d\psi$$

$$R^2d\psi^2 = \frac{dp^2}{\cos^2\psi} = \frac{dp^2}{1 - \left(\frac{p}{R}\right)^2}$$

The distance element in this 3-D space can therefore be written as

$$dl^2 = \frac{dp^2}{1 - \left(\frac{p}{R}\right)^2} + p^2(d\theta^2 + \sin^2\theta\,d\phi^2)$$

In any radial direction ($\theta$ = const, $\phi$ = const) we have

$$dl^2 = \frac{dp^2}{1 - \left(\frac{p}{R}\right)^2}$$

Thus, it follows that the radial distance is

$$d = \int dl = \int_0^p\frac{dp}{\sqrt{1 - \left(\frac{p}{R}\right)^2}} = R\sin^{-1}\left(\frac{p}{R}\right)$$

The dimensionless area distance and radial coordinates

We can further simplify our previous results by using $R$ as a scaling factor and defining

$$\sigma = \frac{p}{R}$$

$$dl^2 = R^2\left[\frac{d\sigma^2}{1 - \sigma^2} + \sigma^2(d\theta^2 + \sin^2\theta\,d\phi^2)\right]$$

So we now have

• Area of sphere $A = 4\pi R^2\sigma^2 = 4\pi d_a^2$

• Area distance $d_a = R\sigma$

The quantity $\sigma$ is known as the dimensionless area distance coordinate. We can also define a dimensionless radial coordinate $r$ as follows

$$\frac{R\,d\sigma}{\sqrt{1 - \sigma^2}} = R\,dr, \qquad r = \sin^{-1}\sigma$$


3-D spaces of constant negative curvature (3D pseudospheres)

These spaces are defined by

$$x_1^2 + x_2^2 + x_3^2 - x_4^2 = -R^2$$

which can be parametrised by

$$x_1 = R\sinh\psi\sin\theta\cos\phi$$

$$x_2 = R\sinh\psi\sin\theta\sin\phi$$

$$x_3 = R\sinh\psi\cos\theta$$

$$x_4 = R\cosh\psi$$

Similarly to what we did earlier, we now consider the subspace given by $\psi$ = constant

$$x_1^2 + x_2^2 + x_3^2 = -R^2 + x_4^2 = R^2\sinh^2\psi = p^2$$

Now the surface area of these spheres keeps on increasing as $\psi$ increases from 0 to infinity.

The distance element in this space can therefore be written as

$$dl^2 = \frac{dp^2}{1 + \left(\frac{p}{R}\right)^2} + p^2(d\theta^2 + \sin^2\theta\,d\phi^2)$$

or, with $\sigma = p/R$ as before,

$$dl^2 = R^2\left[\frac{d\sigma^2}{1 + \sigma^2} + \sigma^2(d\theta^2 + \sin^2\theta\,d\phi^2)\right]$$

Summary

It can be shown that the only 3D surfaces of zero, negative or positive constant curvature have metric elements which can be expressed as

$$dl^2 = R^2\left[\frac{d\sigma^2}{1 - k\sigma^2} + \sigma^2(d\theta^2 + \sin^2\theta\,d\phi^2)\right] \qquad (k = -1, 0, +1)$$

or

$$dl^2 = R^2\left[dr^2 + f^2(r)(d\theta^2 + \sin^2\theta\,d\phi^2)\right]$$

with

1. $f(r) = r$ ($k = 0$, flat universe).

2. $f(r) = \sinh r$ ($k = -1$, hyperbolic universe).

3. $f(r) = \sin r$ ($k = +1$, spherical universe; we have seen this earlier).

These results will form the basis for all our cosmological models.

Further properties

$$dl^2 = R^2\left[dr^2 + f^2(r)(d\theta^2 + \sin^2\theta\,d\phi^2)\right]$$

For $r = r_s$ (radial distance coordinate frozen at a given value), we generate a sphere $S$ in this space which has a metric

$$dl^2 = R^2f^2(r_s)[d\theta^2 + \sin^2\theta\,d\phi^2]$$

and the area of the sphere is

$$A = 4\pi f^2(r_s)R^2$$
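To get a feel for how the area of the sphere $r = r_s$ depends on the geometry, the formula $A = 4\pi f^2(r_s)R^2$ can be evaluated for the three choices of $f(r)$. The sketch below uses hypothetical function names and an arbitrary scale $R = 1$.

```python
import math


def f(r, k):
    """f(r) for the flat (k = 0), spherical (k = +1) and hyperbolic (k = -1) cases."""
    if k == 0:
        return r
    if k == 1:
        return math.sin(r)
    if k == -1:
        return math.sinh(r)
    raise ValueError("k must be -1, 0 or +1")


def sphere_area(r_s, scale, k):
    """Area A = 4 pi f(r_s)^2 R^2 of the sphere at radial coordinate r_s."""
    return 4.0 * math.pi * (scale * f(r_s, k)) ** 2


if __name__ == "__main__":
    for r_s in (0.1, 1.0, 2.0):
        print(r_s, [sphere_area(r_s, 1.0, k) for k in (-1, 0, 1)])
```

For small $r_s$ the three areas are nearly equal, while for larger $r_s$ the hyperbolic area exceeds the Euclidean one and the spherical area falls below it.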

Flat space

Figure 2: In a flat, Euclidean geometry, space is divided into cubes. The apparent angular size of objects is proportional to the inverse of their distance (Credit: Stuart Lev and Tamara Munzer for Scientific American).

If $k = 0$ and $f(r) = r$ then

$$dl^2 = R^2r_s^2[d\theta^2 + \sin^2\theta\,d\phi^2]$$

$$A = 4\pi r_s^2R^2$$

Figure 3: This hyperbolic space is tiled with regular dodecahedra. In Euclidean space such a regular tiling is impossible. The size of the cells is of the same order as the curvature scale. Increasing distance is indicated by reddening. Although perspective for nearby objects in hyperbolic space is very nearly identical to Euclidean space, the apparent angular size of distant objects falls off much more rapidly, in fact exponentially (Credit: Stuart Lev and Tamara Munzer for Scientific American).

Hyperbolic case

Here we have a constant negative curvature $K = -\dfrac{1}{R^2}$, $f(r) = \sinh r$ and

$$A = 4\pi R^2\sinh^2r_s.$$

Since $\sinh^2r > r^2$, then $A > 4\pi R^2r_s^2$, so that the area is bigger than it is in Euclidean space. In terms of the radial distance $d = Rr_s$ we have

$$A = 4\pi R^2\sinh^2\left(\frac{d}{R}\right)$$

Spherical case

Here we have $K = \dfrac{1}{R^2}$, $f(r) = \sin r$ and

$$A = 4\pi R^2\sin^2r_s = 4\pi R^2\sin^2\left(\frac{d}{R}\right)$$

Figure 4: The spherical space is tiled with regular dodecahedra. The geometry of spherical space resembles the surface of the Earth, except that here a three-dimensional rather than a two-dimensional sphere is being considered. Increasing distance is indicated by reddening. Increasingly distant objects first become smaller (as in Euclidean space), reach a minimum size, and finally become larger with increasing distance. This behaviour is due to the focusing nature of the spherical geometry (Credit: Stuart Lev and Tamara Munzer for Scientific American).

Robertson-Walker metric and Friedmann Equations

Metric tensor of the universe: Robertson-Walker metric

[Figure: world-lines (geodesics) of galaxies crossing a space-like hypersurface made up of local Lorentzian frames.]

The entire universe, viewed at a fixed time (any given “epoch” or “cosmic time”), is homogeneous and isotropic. This means that every clock in every galaxy always measures the same time. This restricts the possible spaces to those of constant curvature, because otherwise there would be preferred observers. Thus, we need to define a time (cosmic time) which is valid globally. We do this by introducing a series of non-intersecting space-like hypersurfaces (see figure). We assume that all galaxies lie on such a hypersurface, where the surface of simultaneity of the local Lorentzian frame of any galaxy coincides locally with the hypersurface. Thus, the hypersurface is a mesh of these Lorentzian frames and the 4-velocity of any galaxy is perpendicular to this surface. The surfaces are labelled by the proper time recorded by any of these galaxies, that is, by a clock stationary in any of the galaxies.

Consider now two neighbouring galaxies on the same hypersurface t = constant with spatial coordinates

\[ (x^1, x^2, x^3), \qquad (x^1 + dx^1,\; x^2 + dx^2,\; x^3 + dx^3) \]


In an expanding universe model satisfying the cosmological principle, the spatial part of the metric must have the form

\[ dl^2 = R^2(t)\,\eta_{ij}(x^p)\,dx^i dx^j \]

and the space-time metric is

\[ ds^2 = c^2 dt^2 - R^2(t)\,\eta_{ij}(x^p)\,dx^i dx^j \]

At any epoch t, the hyperspaces must be homogeneous and isotropic and thus must have a constant curvature. It therefore follows that the most general metric satisfying the cosmological principle is the Robertson-Walker metric

\[ ds^2 = c^2 dt^2 - R^2(t)\left[\frac{d\sigma^2}{1 - k\sigma^2} + \sigma^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)\right] \qquad (k = -1, 0, +1) \]

Observers are initially uniformly distributed in space. They remain stationary in the spatial coordinate grid. This means that any given galaxy has the same $(\sigma_\alpha, \theta_\alpha, \phi_\alpha)$ at all times. Following any galaxy, we have that

\[ d\sigma = d\theta = d\phi = 0 \]

so that

\[ c^2 d\tau^2 = ds^2 = c^2 dt^2 \;\Rightarrow\; \tau = t \]

so t is the proper time, called the cosmic time.

Metric of space at fixed cosmic time

Take t = constant, so that dt = 0 and

\[ dl^2 = R^2(t)\left[\frac{d\sigma^2}{1 - k\sigma^2} + \sigma^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)\right] \]

where dl is the proper length element in space, which is curved with curvature

\[ K = \frac{k}{R^2(t)}, \qquad k = -1, 0, +1 \]

The curvature K is independent of position in space and is constant at any epoch t. Note that the proper distance between any two galaxies $(\sigma_1, \theta_1, \phi_1)$, $(\sigma_2, \theta_2, \phi_2)$ scales as R(t) as t changes. Therefore, the spatial density remains uniform for all t. The quantity R(t) is called the scale factor of the universe and R(t)σ is the area distance.


Light propagation (redshift) in GR models

Light follows radial null geodesics. Thus, without loss of generality, we may consider a galaxy located at $x_1$ emitting a pulse of light at time $t_e$ with the propagation occurring along the x-axis. This pulse is received at time $t_o$ by an observer in a galaxy located at $x_2$. Therefore

\[ ds^2 = 0, \quad d\theta = 0, \quad d\phi = 0 \]

\[ \Rightarrow\; c^2 dt^2 = R^2(t)\, dr^2 \]

\[ \Rightarrow\; dr = \frac{c}{R(t)}\, dt \]

Now integrate

\[ \int_{x_1}^{x_2} dr = \int_{t_e}^{t_o} \frac{c\,dt}{R(t)} \]

Assume that the same source at $x_1$ emits another pulse of light at time $t_e + \Delta t_e$, which is received by the observer at $x_2$ at time $t_o + \Delta t_o$. For this pulse we shall have

\[ \int_{x_1}^{x_2} dr = \int_{t_e + \Delta t_e}^{t_o + \Delta t_o} \frac{c\,dt}{R(t)} \]

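The left-hand sides of the above equations are the same, since the limits $x_1$ and $x_2$ are the same in both cases. This is because in co-moving coordinates, fixed coordinate values stay associated with each point (e.g. galaxy, star). Thus

\[
\begin{aligned}
\int_{t_e}^{t_o} \frac{c\,dt}{R(t)} &= \int_{t_e+\Delta t_e}^{t_o+\Delta t_o} \frac{c\,dt}{R(t)} \\
\int_{t_e}^{t_e+\Delta t_e} \frac{c\,dt}{R(t)} + \int_{t_e+\Delta t_e}^{t_o} \frac{c\,dt}{R(t)} &= \int_{t_e+\Delta t_e}^{t_o} \frac{c\,dt}{R(t)} + \int_{t_o}^{t_o+\Delta t_o} \frac{c\,dt}{R(t)} \\
\int_{t_e}^{t_e+\Delta t_e} \frac{c\,dt}{R(t)} &= \int_{t_o}^{t_o+\Delta t_o} \frac{c\,dt}{R(t)}
\end{aligned}
\]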

If $\Delta t_e$ and $\Delta t_o$ are sufficiently small, so that within these intervals R(t) can be taken constant with values $R(t_e)$ and $R(t_o)$ respectively, then

\[ \frac{\Delta t_e}{\Delta t_o} = \frac{R(t_e)}{R(t_o)} \]

If we now identify these light pulses with consecutive wave crests, then $\lambda_1 = c\,\Delta t_e$ and $\lambda_2 = c\,\Delta t_o$, so we get

\[ \frac{\lambda_1}{\lambda_2} = \frac{R(t_e)}{R(t_o)} \]


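Then, since $\Delta\lambda = \lambda_2 - \lambda_1$ we can write

\[ \frac{\lambda_1 + \Delta\lambda}{\lambda_1} = \frac{R(t_o)}{R(t_e)} \]

If we now define

\[ z = \frac{\Delta\lambda}{\lambda} \]

(with $\lambda = \lambda_1$ the emitted wavelength) we obtain

\[ z = \frac{R(t_o)}{R(t_e)} - 1 \]

Therefore, if R(t) increases over time and $R(t_o) > R(t_e)$, the shift $\Delta\lambda$ will be positive, that is, toward the red.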

If we now expand $R(t_o)$ in terms of $(t_o - t_e)$ (the “time of flight”), then

\[ R(t_o) = R(t_e) + \left(\frac{dR(t)}{dt}\right)_{t_e}(t_o - t_e) + \frac{1}{2}\left(\frac{d^2R(t)}{dt^2}\right)_{t_e}(t_o - t_e)^2 + \cdots \]

Substituting this expression into $z = \dfrac{R(t_o)}{R(t_e)} - 1$ and neglecting terms of order 2 or greater, we get

\[ z = \frac{1}{R(t_e)}\left(\frac{dR(t)}{dt}\right)_{t_e}(t_o - t_e) \]

If we replace $(t_o - t_e)$ with the classical value d/c, where d is the distance between galaxies, then

\[ z = \frac{1}{R(t_e)}\left(\frac{dR(t)}{dt}\right)_{t_e}\frac{d}{c} \]

and compare this result with Hubble’s discovery $z = \dfrac{Hd}{c}$, we find

\[ H_{t_e} = \frac{1}{R(t_e)}\frac{dR(t_e)}{dt} \tag{66} \]

which tells us how Hubble’s “constant” varies with time t. We can also define the deceleration parameter q(t) as

\[ q(t) = -\frac{1}{H^2 R(t)}\frac{d^2R(t)}{dt^2} \tag{67} \]


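Now set $t_f = (t_o - t_e)$, so that

\[
\begin{aligned}
R(t_o) &= R(t_e)\left[1 + H_{t_e} t_f - \tfrac{1}{2} q_{t_e} H_{t_e}^2 t_f^2 + O(t_f^3)\right] \\
\frac{R(t_o)}{R(t_e)} - 1 &= H_{t_e} t_f - \tfrac{1}{2} q_{t_e} H_{t_e}^2 t_f^2 + O(t_f^3) \\
z &= H_{t_e} t_f - \tfrac{1}{2} q_{t_e} H_{t_e}^2 t_f^2 + O(t_f^3)
\end{aligned}
\]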

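or, in terms of currently observable parameters (expanding $R(t_e)$ about $t_o$ instead, so that $R(t_e) = R(t_o)[1 - H_{t_o} t_f - \tfrac12 q_{t_o} H_{t_o}^2 t_f^2 + \cdots]$),

\[ z = H_{t_o} t_f + \left(1 + \tfrac{1}{2} q_{t_o}\right) H_{t_o}^2 t_f^2 + O(t_f^3) \]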

Thus, redshift and time of flight are related through the details of the model universe: local expansion rate, local deceleration, etc.
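The second-order expansion of z about the emission time can be checked against an exact model. The sketch below is illustrative only and assumes a scale factor R(t) ∝ t^(2/3) (the Einstein-De Sitter law), for which H = 2/(3t) and q = 1/2; the function names are hypothetical.

```python
def exact_redshift(t_e, t_o):
    """z = R(t_o)/R(t_e) - 1 for the assumed law R(t) ~ t**(2/3)."""
    return (t_o / t_e) ** (2.0 / 3.0) - 1.0

def series_redshift(t_e, t_o):
    """z = H t_f - (1/2) q H^2 t_f^2, with H and q evaluated at t_e."""
    H_e = 2.0 / (3.0 * t_e)   # H = (dR/dt)/R for R ~ t**(2/3)
    q_e = 0.5                 # q = -R'' R / R'^2 for R ~ t**(2/3)
    t_f = t_o - t_e           # time of flight
    return H_e * t_f - 0.5 * q_e * H_e**2 * t_f**2

if __name__ == "__main__":
    t_e, t_o = 1.0, 1.01
    print(exact_redshift(t_e, t_o), series_redshift(t_e, t_o))
```

For a short time of flight the two values agree up to a remainder of order t_f cubed, as the O(t_f^3) term indicates.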

Derivation of Friedmann’s equations

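Consider the metric

\[ ds^2 = c^2 dt^2 - f\left[dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)\right] \tag{68} \]

where

\[ f = \frac{R^2(t)}{\left[1 + \dfrac{k}{4} r^2\right]^2} \]

This is an alternative form to

\[ ds^2 = c^2 dt^2 - R^2(t)\left[\frac{d\sigma^2}{1 - k\sigma^2} + \sigma^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)\right] \]

which has been obtained by making the transformation

\[ \sigma = \frac{r}{1 + \dfrac{k}{4} r^2} \]

to a new radial coordinate r. It is however easier to work in Cartesian coordinates, so we will use the Cartesian form of the metric (68):

\[ ds^2 = c^2 dt^2 - \frac{R^2(t)}{\left[1 + \dfrac{k}{4}\left(x^2 + y^2 + z^2\right)\right]^2}\left[dx^2 + dy^2 + dz^2\right] \tag{69} \]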


Let’s now determine the properties of the above RW metric (in Cartesians) as a solution of Einstein’s field equations

\[ R_{\alpha\beta} - \frac{1}{2} g_{\alpha\beta} R + \Lambda g_{\alpha\beta} = -\frac{8\pi G}{c^2} T_{\alpha\beta} \]

(Note that the R in Einstein’s field equations is the Ricci scalar, not to be confused with the scale factor of the universe R(t)!)

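To calculate the Riemann curvature we need to find the Christoffel symbols related to the RW metric. Divide the metric (69) by $dl^2$, where l is the geodesic path parameter. Since the Lagrangian is $L = \dfrac{1}{2}\left(\dfrac{ds}{dl}\right)^2$, the Euler-Lagrange equations become

\[ \frac{d}{dl}\left[\frac{\partial}{\partial \dot{x}^\mu}\left(\frac{ds}{dl}\right)^2\right] - \frac{\partial}{\partial x^\mu}\left(\frac{ds}{dl}\right)^2 = 0 \]

(with $\dot{x}^\mu = dx^\mu/dl$)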

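and the four geodesic equations for t, x, y, and z are

\[
\begin{aligned}
\ddot{t} + \frac{f_t}{2c^2}\dot{x}^2 + \frac{f_t}{2c^2}\dot{y}^2 + \frac{f_t}{2c^2}\dot{z}^2 &= 0 &(0)\\
\ddot{x} + \frac{f_t}{f}\dot{x}\dot{t} + \frac{f_x}{2f}\dot{x}^2 + \frac{f_y}{f}\dot{x}\dot{y} + \frac{f_z}{f}\dot{x}\dot{z} - \frac{f_x}{2f}\dot{y}^2 - \frac{f_x}{2f}\dot{z}^2 &= 0 &(1)\\
\ddot{y} + \frac{f_t}{f}\dot{y}\dot{t} + \frac{f_x}{f}\dot{x}\dot{y} + \frac{f_y}{2f}\dot{y}^2 + \frac{f_z}{f}\dot{y}\dot{z} - \frac{f_y}{2f}\dot{x}^2 - \frac{f_y}{2f}\dot{z}^2 &= 0 &(2)\\
\ddot{z} + \frac{f_t}{f}\dot{z}\dot{t} + \frac{f_x}{f}\dot{x}\dot{z} + \frac{f_y}{f}\dot{y}\dot{z} + \frac{f_z}{2f}\dot{z}^2 - \frac{f_z}{2f}\dot{x}^2 - \frac{f_z}{2f}\dot{y}^2 &= 0 &(3)
\end{aligned}
\]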

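We can read off the Christoffel symbols. So, if the index 0 represents coordinate t and i or j represent x, y, and z (with $i \neq j$), then

\[ \Gamma^0_{ii} = \frac{f_t}{2c^2}, \qquad \Gamma^i_{0i} = \frac{f_t}{2f} \]

\[ \Gamma^i_{ii} = \frac{f_i}{2f}, \qquad \Gamma^i_{jj} = -\frac{f_i}{2f}, \qquad \Gamma^i_{ij} = \frac{f_j}{2f} \]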

The components of $R_{\alpha\beta}$ can be obtained by using Ricci’s tensor

\[ R_{jn} = R^l_{\;jnl} = \partial_n \Gamma^l_{jl} - \partial_l \Gamma^l_{jn} + \Gamma^l_{ns}\Gamma^s_{jl} - \Gamma^l_{ls}\Gamma^s_{jn} \]


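so

\[ R_{00} = \frac{3}{2}\frac{f_{tt}}{f} - \frac{3}{4}\left(\frac{f_t}{f}\right)^2 \]

\[ R_{ii} = \frac{f_{ii}}{2f} - \frac{3}{4}\left(\frac{f_i}{f}\right)^2 + \frac{1}{2}\frac{\nabla^2 f}{f} - \frac{1}{4}\left(\frac{\nabla f}{f}\right)^2 - \frac{f_{tt}}{2c^2} - \frac{1}{4c^2}\frac{f_t^2}{f} \]

\[ R_{ij} = \frac{f_{ij}}{2f} - \frac{3}{4}\frac{f_i f_j}{f^2} \qquad (i \neq j) \]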

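then, since

\[ f = \frac{R^2(t)}{\left[1 + \dfrac{k}{4} r^2\right]^2} \]

we get

\[ R_{00} = \frac{3R_{tt}}{R} \]

\[ R_{ij} = 0 \qquad (i \neq j) \]

\[ R_{ii} = -\frac{2k + \dfrac{2}{c^2}R_t^2 + \dfrac{1}{c^2}R R_{tt}}{\left(1 + \dfrac{kr^2}{4}\right)^2} \qquad (i = j) \]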

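The contravariant RW metric tensor is given by

\[
g^{ij} = \begin{pmatrix}
1/c^2 & 0 & 0 & 0 \\
0 & -1/f & 0 & 0 \\
0 & 0 & -1/f & 0 \\
0 & 0 & 0 & -1/f
\end{pmatrix}
\]

which allows us to calculate the Ricci scalar R:

\[
\begin{aligned}
g^{ii} R_{ii} &= g^{00}R_{00} + g^{11}R_{11} + g^{22}R_{22} + g^{33}R_{33} \\
&= \frac{6}{c^2 R^2}\left(R R_{tt} + R_t^2 + kc^2\right)
\end{aligned}
\]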

So we finally have all the quantities required by

\[ R_{ii} - \frac{1}{2} g_{ii} R + \Lambda g_{ii} = -\frac{8\pi G}{c^2} T_{ii} \]

to construct Einstein’s equations. Substitutions give

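\[ \frac{3R_t^2}{R^2} - c^2\left(\Lambda - \frac{3k}{R^2}\right) = \frac{8\pi G}{c^2} T_{00} \]

\[ \frac{kc^2 + R_t^2 + 2R R_{tt} - c^2\Lambda R^2}{c^2\left(1 + \dfrac{kr^2}{4}\right)^2} = -\frac{8\pi G}{c^2} T_{11} \]

\[ \frac{kc^2 + R_t^2 + 2R R_{tt} - c^2\Lambda R^2}{c^2\left(1 + \dfrac{kr^2}{4}\right)^2} = -\frac{8\pi G}{c^2} T_{22} \]

\[ \frac{kc^2 + R_t^2 + 2R R_{tt} - c^2\Lambda R^2}{c^2\left(1 + \dfrac{kr^2}{4}\right)^2} = -\frac{8\pi G}{c^2} T_{33} \]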

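These equations are explicit formulations of the field equations in the presence of matter. Multiply each side by $g^{ii}$

\[ \left[\frac{3R_t^2}{R^2} - c^2\left(\Lambda - \frac{3k}{R^2}\right)\right] g^{00} = \frac{8\pi G}{c^2} T_{00}\, g^{00} \]

\[ \frac{kc^2 + R_t^2 + 2R R_{tt} - c^2\Lambda R^2}{c^2\left(1 + \dfrac{kr^2}{4}\right)^2}\, g^{ii} = -\frac{8\pi G}{c^2} T_{ii}\, g^{ii} \]

to get

\[ \frac{3R_t^2}{c^2R^2} - \Lambda + \frac{3k}{R^2} = \frac{8\pi G}{c^2} T^0_{\;0} \]

\[ -\frac{2R_{tt}}{c^2R} - \frac{R_t^2}{c^2R^2} - \frac{k}{R^2} + \Lambda = -\frac{8\pi G}{c^2} T^i_{\;i} \]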

In the RW universe, the mass is in the form of homogeneously distributed dust particles which are stationary in coordinate space. So the tensor $T^i_{\;j}$ has the simple form

\[
T^i_{\;j} = \begin{pmatrix}
\rho & 0 & 0 & 0 \\
0 & -p/c^2 & 0 & 0 \\
0 & 0 & -p/c^2 & 0 \\
0 & 0 & 0 & -p/c^2
\end{pmatrix}
\]

At the present epoch, the pressure term is not important (dusty, stationary gas). However, in non-stationary models, in the distant past the universe could have been very dense and thus the pressure term was important. So, the most general form of Einstein’s equations is

\[ \frac{3R_t^2}{c^2R^2} + \frac{3k}{R^2} - \Lambda = \frac{8\pi G}{c^2}\rho \tag{70} \]

\[ \frac{2R_{tt}}{c^2R} + \frac{R_t^2}{c^2R^2} + \frac{k}{R^2} - \Lambda = -\frac{8\pi G}{c^4}p \tag{71} \]

which were found by Friedmann. Hence, these are known as “Friedmann’s equations”.

Density parameters

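If we introduce

\[ H(t) = \frac{\dot{R}(t)}{R(t)} \]

then Friedmann’s equation (70):

\[ \frac{3R_t^2}{c^2R^2} + \frac{3k}{R^2} - \Lambda = \frac{8\pi G}{c^2}\rho \]

can be re-written as

\[
\begin{aligned}
\frac{kc^2}{R^2} &= \frac{8\pi G\rho}{3} + \frac{\Lambda c^2}{3} - H^2(t) \\
&= H^2(t)\left[\frac{\rho(t)}{\rho_c(t)} + \frac{\rho_\Lambda}{\rho_c} - 1\right]
\end{aligned}
\]

where

$\rho(t)$: actual density of baryonic and non-baryonic matter at t

$\rho_c(t) = \dfrac{3H^2(t)}{8\pi G}$: “critical” density at epoch t

$\rho_\Lambda = \dfrac{\Lambda c^2}{8\pi G}$: vacuum density, independent of t

We now introduce the density parameters

\[ \Omega = \frac{\rho(t)}{\rho_c(t)}, \qquad \Omega_\Lambda = \frac{\rho_\Lambda}{\rho_c(t)} \]

and write Friedmann’s equation in terms of these parameters:

\[ \frac{kc^2}{R^2(t)} = H^2(t)\left[\Omega(t) + \Omega_\Lambda - 1\right] \tag{72} \]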

We obtain

• A flat universe (k = 0) when $\Omega(t) + \Omega_\Lambda = 1$

• An open universe (k = −1) when $\Omega(t) + \Omega_\Lambda < 1$

• A closed universe (k = +1) when $\Omega(t) + \Omega_\Lambda > 1$
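The critical density and the classification above lend themselves to a short numerical sketch. The constants, the function names and the sample value H = 70 km/s/Mpc below are assumptions chosen for illustration, not values taken from the notes.

```python
import math

G = 6.674e-11           # gravitational constant, m^3 kg^-1 s^-2 (assumed value)
MPC_IN_M = 3.0857e22    # metres per megaparsec (assumed value)

def critical_density(H):
    """rho_c = 3 H^2 / (8 pi G), with H in s^-1; result in kg/m^3."""
    return 3.0 * H**2 / (8.0 * math.pi * G)

def geometry(omega, omega_lambda, tol=1e-9):
    """Classify the universe from Omega + Omega_Lambda, as in equation (72)."""
    total = omega + omega_lambda
    if abs(total - 1.0) <= tol:
        return "flat"
    return "closed" if total > 1.0 else "open"

if __name__ == "__main__":
    H0 = 70.0e3 / MPC_IN_M          # 70 km/s/Mpc converted to s^-1
    print(critical_density(H0))     # of order 1e-26 kg/m^3
    print(geometry(0.3, 0.7), geometry(0.3, 0.0), geometry(1.2, 0.1))
```

The tolerance `tol` is only there because floating-point sums such as 0.3 + 0.7 need not equal 1 exactly.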

Solutions of Friedmann’s equations

Solutions of Friedmann’s equations for a matter dominated universe

Parametric solutions for Λ = 0

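Consider a matter dominated universe where

\[ \rho(t) = \rho_0\left(\frac{R_0}{R(t)}\right)^3 \]

and substitute this expression in Friedmann’s equation (70) with Λ = 0

\[ \frac{3R_t^2}{c^2R^2} + \frac{3k}{R^2} = \frac{8\pi G}{c^2}\rho \]

to obtain

\[ \dot{R}^2 = -kc^2 + \frac{8\pi G R^2}{3}\rho_0\left(\frac{R_0}{R(t)}\right)^3 \]

\[ \dot{R}^2 - \frac{8\pi G\rho_0 R_0^3}{3}\frac{1}{R} = -kc^2 = -R_0^2 H_0^2\left(\Omega_0 - 1\right) \]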

Here we have used the fact that k is a constant and we have used current epoch values to find its value (equation (72) with $\Omega_\Lambda = 0$). Now

\[ \Omega_0 = \frac{\rho_0}{\rho_{c,0}} = \frac{\rho_0}{3H_0^2/8\pi G} \;\Rightarrow\; \rho_0 = \frac{3H_0^2}{8\pi G}\Omega_0 \]

where $\rho_{c,0}$ is the current value of the critical density. Hence, in terms of $\Omega_0$ we have

\[ \dot{R}^2 - \Omega_0 H_0^2\frac{R_0^3}{R} = (1 - \Omega_0) R_0^2 H_0^2 \]

and

\[ \frac{1}{R_0^2 H_0^2}\dot{R}^2 - \Omega_0\left(\frac{R_0}{R}\right) = 1 - \Omega_0 \]

Now set $S(t) = \dfrac{R(t)}{R_0}$ and $\tau = H_0 t$ to obtain

\[ \left(\frac{dS}{d\tau}\right)^2 - \frac{\Omega_0}{S(\tau)} = 1 - \Omega_0 \]

Case I: Einstein-De Sitter model ($\Omega_0 = 1$)

This model was presented in a joint paper by Einstein and De Sitter in 1932. Space is assumed to be Euclidean with k = 0 (zero curvature, flat universe), which corresponds to $\Omega_0 = 1$. With this value for $\Omega_0$ we obtain

\[ \left(\frac{dS}{d\tau}\right)^2 = \frac{1}{S(\tau)} \]

\[ S(\tau) = \left(\frac{3\tau}{2}\right)^{2/3} \]

where S(τ) = 0 at τ = 0 by choice of integration constant.

From the solution of the DE we see that S(τ) = 1 at $\tau = \dfrac{2}{3}$, that is

\[ \tau_0 = \frac{2}{3} \]

so the age of the universe is

\[ t_0 = \frac{2}{3}\cdot\frac{1}{H_0} = \frac{2}{3}\, t_H \]

where $t_H = 1/H_0$ is the Hubble time.
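To put a number on the Einstein-De Sitter age, the relation t0 = (2/3) t_H can be evaluated for a chosen Hubble constant. This is a minimal sketch: the unit-conversion constants, the function names and the sample value 70 km/s/Mpc are assumptions for illustration only.

```python
KM_PER_MPC = 3.0857e19        # kilometres per megaparsec (assumed value)
SECONDS_PER_GYR = 3.156e16    # seconds per 10^9 years (assumed value)

def hubble_time_gyr(H0_km_s_mpc):
    """t_H = 1/H0 in Gyr, for H0 given in km/s/Mpc."""
    return KM_PER_MPC / H0_km_s_mpc / SECONDS_PER_GYR

def eds_scale_factor(tau):
    """S(tau) = (3 tau / 2)**(2/3), with tau = H0 t."""
    return (1.5 * tau) ** (2.0 / 3.0)

def eds_age_gyr(H0_km_s_mpc):
    """Einstein-De Sitter age t0 = (2/3) t_H in Gyr."""
    return (2.0 / 3.0) * hubble_time_gyr(H0_km_s_mpc)

if __name__ == "__main__":
    print(hubble_time_gyr(70.0), eds_age_gyr(70.0))
```

With these assumed inputs the Hubble time is close to 14 Gyr and the model age a little over 9 Gyr.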

Also note that the Einstein-De Sitter model gives

\[ R(t) \propto t^{2/3} \]

and since

\[ 1 + z = \frac{R(t_0)}{R(t)} = \frac{t_0^{2/3}}{t^{2/3}} \]

we can see that the Einstein-De Sitter model predicts a redshift that varies with the emission time t.

[Figure: S(t) against time for the Einstein-De Sitter model; labels on the plot: Now, 2/3 t_H, t_H.]

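Case (II) & (III): $\Omega_0 \gtrless 1$

Now define

\[ S^*(\tau) = S(\tau)\frac{|1 - \Omega_0|}{\Omega_0}, \qquad \tau^* = \frac{|1 - \Omega_0|^{3/2}}{\Omega_0}\,\tau \]

so that

\[ \left(\frac{dS^*}{d\tau^*}\right)^2 - \frac{1}{S^*} = \pm 1 \qquad (+\ \text{unbounded},\ -\ \text{bounded}) \]

whose parametric solutions are

\[
\begin{aligned}
S^* &= \tfrac{1}{2}(1 - \cos\eta), & \tau^* &= \tfrac{1}{2}(\eta - \sin\eta) & &\text{Bounded} \\
S^* &= \tfrac{1}{2}(\cosh\eta - 1), & \tau^* &= \tfrac{1}{2}(\sinh\eta - \eta) & &\text{Unbounded}
\end{aligned}
\tag{73}
\]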
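The parametric solutions (73) can be checked numerically against the differential equation for S*. The sketch below is illustrative: the function names and the central-difference step `h` are assumptions, and the derivative dS*/dτ* is formed as the ratio of the η-differences of S* and τ*.

```python
import math

def bounded(eta):
    """Bounded case: returns (S*, tau*)."""
    return 0.5 * (1.0 - math.cos(eta)), 0.5 * (eta - math.sin(eta))

def unbounded(eta):
    """Unbounded case: returns (S*, tau*)."""
    return 0.5 * (math.cosh(eta) - 1.0), 0.5 * (math.sinh(eta) - eta)

def ode_residual(solution, eta, sign, h=1e-5):
    """(dS*/dtau*)^2 - 1/S* - sign, which should vanish for a solution."""
    s_plus, t_plus = solution(eta + h)
    s_minus, t_minus = solution(eta - h)
    s, _ = solution(eta)
    dS_dtau = (s_plus - s_minus) / (t_plus - t_minus)
    return dS_dtau**2 - 1.0 / s - sign

if __name__ == "__main__":
    print(ode_residual(bounded, 1.0, -1.0))     # close to zero
    print(ode_residual(unbounded, 1.0, +1.0))   # close to zero
    print(bounded(2.0 * math.pi))               # S* returns to 0 at tau* = pi
```

The last line shows that the bounded model recollapses after τ* = π, which is where the factor π in the total lifetime of the bounded case comes from.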

[Figure: S(t) against time for the marginally bounded, bounded and unbounded models, with S(t) = 1 marked at time t_0.]

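One can see that

(1) As

$\Omega_0 \to 1$, $t_0 \to \dfrac{2}{3} t_H$: Critical (Einstein-De Sitter)

$\Omega_0 \to \infty$, $t_0 \to 0$: Strongly bounded

$\Omega_0 \to 0$, $t_0 \to t_H$: Weakly bounded

(2) For the bounded case,

\[ t\,(\text{big bang to big squeeze}) = \frac{\pi\,\Omega_0\, t_H}{(\Omega_0 - 1)^{3/2}} \]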

Note that

\[ q(t) = -\frac{\ddot{R}(t)}{\dot{R}^2(t)}R(t) = \frac{-\ddot{R}(t)}{H^2(t)R(t)} \]

so that q(t) > 0 means that the expansion is slowing down.

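Now set Λ = 0 and p = 0 in Friedmann’s equations and eliminate k between them to get

\[ \ddot{R}(t) = -\frac{4\pi G}{3}\rho_m(t) R(t) \]

\[ \therefore\quad q(t) = \frac{-\ddot{R}(t)}{H^2(t)R(t)} = \frac{(4/3)\pi G\rho_m(t)}{H^2(t)} = \frac{1}{2}\Omega(t) \]

So

\[ \Omega(t) = 2q(t) \]

\[ \Omega_0 = 2q_0 \quad (\text{at present epoch}) \]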

To summarise:

Open universe: $\Omega_0 < 1$, $q_0 < \dfrac{1}{2}$, $k = -1$

Closed universe: $\Omega_0 > 1$, $q_0 > \dfrac{1}{2}$, $k = +1$

Critical universe: $\Omega_0 = 1$, $q_0 = \dfrac{1}{2}$, $k = 0$

(Observations now suggest that the universe is in fact accelerating!)

The static universe: introduction of the Λ term

The field equations are independent of coordinates and determine the time-dependent factor R(t) of the RW metric. If R is assumed to be constant, independent of time, then $R_{tt} = R_t = 0$. Thus, the first static universe models were simply

\[ \frac{3k}{R^2} - \Lambda = \frac{8\pi G}{c^2}\rho \]

\[ \frac{k}{R^2} - \Lambda = -\frac{8\pi G}{c^2}p \]

If we combine these equations we get

\[ \Lambda = \frac{4\pi G}{c^2}(3p + \rho) \]

The Λ term could then explain the static universe that people believed we lived in before Hubble’s discovery. Without the Λ term (i.e. setting Λ = 0), either p = ρ = 0 or a negative pressure exists (3p = −ρ). Neither of these assumptions was considered to be physically realistic. Hence, the introduction of Λ provided a reasonable solution to the problem.

In a zero pressure (i.e. dusty), static universe, the second equation shows that

\[ \Lambda = \frac{k}{R^2} \]

If we substitute this result in the first equation we obtain a positive value for k (k = 1 in the RW metric), which corresponds to a geometrical space of finite extent. The radius is

\[ R = \frac{c}{\sqrt{4\pi G\rho}} \]

If we assume zero pressure and $\rho = 10^{-27}\ \mathrm{kg/m^3}$, then $R \approx 3 \times 10^{10}$ light years.
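The radius formula R = c/√(4πGρ) is easy to evaluate directly. The following sketch assumes rounded values for G, c and the light year, and the function name is hypothetical; with ρ = 10⁻²⁷ kg/m³ it gives a radius of a few times 10¹⁰ light years.

```python
import math

G = 6.674e-11            # gravitational constant, m^3 kg^-1 s^-2 (assumed value)
C = 2.998e8              # speed of light, m/s (assumed value)
LIGHT_YEAR_M = 9.461e15  # metres per light year (assumed value)

def einstein_static_radius_ly(rho):
    """R = c / sqrt(4 pi G rho) for the pressure-free static model, in light years."""
    radius_m = C / math.sqrt(4.0 * math.pi * G * rho)
    return radius_m / LIGHT_YEAR_M

if __name__ == "__main__":
    print(einstein_static_radius_ly(1.0e-27))
```

Because R scales as the inverse square root of ρ, a density a hundred times larger shrinks the radius by a factor of ten.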

More general treatment of Friedmann’s equations

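Take p = 0. We can find solutions for k = −1, 0, +1 with Λ = 0, Λ > 0 or Λ < 0. Friedmann’s equations are

\[ \frac{3R_t^2}{c^2R^2} + \frac{3k}{R^2} - \Lambda = \frac{8\pi G}{c^2}\rho \]

\[ \frac{2R_{tt}}{c^2R} + \frac{R_t^2}{c^2R^2} + \frac{k}{R^2} - \Lambda = 0 \]

Consider the first equation. If we take $\rho_0 R^3(t_0) = \rho R^3(t) = \text{constant}$ for a matter-dominated universe, we get

\[ R_t^2 = \frac{8\pi G}{3}\frac{\rho_0 R_0^3}{R} - kc^2 + \frac{\Lambda c^2}{3}R^2 = F(R(t)) \]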

Since Λ, $\rho_0$, $R(t_0)$ and k can, in principle, be found from observations made at the present time $t_0$, the function F(R(t)) is known, and

\[ \frac{dR}{dt} = \sqrt{F(R(t))} \]

so that

\[ t - t_1 = \int_{R(t_1)}^{R(t)} \frac{dR}{\sqrt{F(R)}} \]

This integral can in general be evaluated in terms of elliptic functions.
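When no closed form is convenient, the integral for t can be evaluated numerically. The sketch below is one possible approach, not the method of the notes: it assumes the abbreviations A = 8πGρ0R0³/3 and B = Λc²/3, so that F(R) = A/R − kc² + B R², and it uses the substitution R = u² (which keeps the integrand smooth at R = 0) together with Simpson's rule. All names are hypothetical.

```python
import math

def expansion_time(R_end, A, k_c2=0.0, B=0.0, n=1000):
    """t = integral from 0 to R_end of dR / sqrt(F), F = A/R - k c^2 + B R^2.

    With R = u**2 the integrand becomes 2 u^2 / sqrt(A - k c^2 u^2 + B u^6).
    Simpson's rule with n (even) sub-intervals is used in the variable u.
    """
    if n % 2:
        raise ValueError("n must be even for Simpson's rule")
    u_end = math.sqrt(R_end)
    h = u_end / n

    def g(u):
        return 2.0 * u * u / math.sqrt(A - k_c2 * u**2 + B * u**6)

    total = g(0.0) + g(u_end)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / 3.0

if __name__ == "__main__":
    # Flat, Lambda = 0: the closed form is (2/3) R^(3/2) / sqrt(A)
    print(expansion_time(1.0, 1.0), 2.0 / 3.0)
```

In dimensionless units (A = 1) the flat, Λ = 0 result reproduces the t ∝ R^(3/2) law, and a positive B shortens the time needed to reach a given R.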

Case A: Λ = 0

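Set Λ = 0 and $t_1 = R(t_1) = 0$ in the integral just seen.

\[
\begin{aligned}
t &= \int_0^{R(t)} \frac{dR}{\sqrt{\dfrac{8\pi G\rho_0 R_0^3}{3R} - kc^2}} \\
&= \frac{1}{c}\int_0^{R(t)} \frac{dR}{\sqrt{\dfrac{R_m}{R} - k}} \\
&= \frac{R_m}{c}\int_0^{R(t)} \frac{\left(\dfrac{R}{R_m}\right)^{1/2} d\!\left(\dfrac{R}{R_m}\right)}{\sqrt{1 - \dfrac{kR}{R_m}}}
\end{aligned}
\]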

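The solution is

\[
t = \begin{cases}
\dfrac{R_m}{c}\left[\sin^{-1}\left(\dfrac{R}{R_m}\right)^{1/2} - \left(\dfrac{R}{R_m}\left(1 - \dfrac{R}{R_m}\right)\right)^{1/2}\right] & (k = +1) \\[3ex]
\dfrac{2R_m}{3c}\left(\dfrac{R}{R_m}\right)^{3/2} & (k = 0) \\[3ex]
\dfrac{R_m}{c}\left[\left(\dfrac{R}{R_m}\left(1 + \dfrac{R}{R_m}\right)\right)^{1/2} - \sinh^{-1}\left(\dfrac{R}{R_m}\right)^{1/2}\right] & (k = -1)
\end{cases}
\]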

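where

\[
R_m = \frac{8\pi G\rho_0 R_0^3}{3c^2} =
\begin{cases}
\dfrac{2q_0 c}{H_0|2q_0 - 1|^{3/2}} = \dfrac{\Omega_0 c}{H_0|\Omega_0 - 1|^{3/2}} & (k \neq 0) \\[3ex]
\dfrac{R^3(t_0)H_0^2}{c^2} & (k = 0)
\end{cases}
\]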

The flat (k = 0) and open (k = −1) universes expand for ever, while the closed universe (k = +1) expands to a maximum radius of curvature $R_m$ at time $t_m$ and then collapses back to a dense phase at $2t_m$. The present age $t_0$ of these models with Λ = 0 can be found by setting $R = R(t_0)$ in the above solutions.

Note that this solution can be written in the parametric form we encountered earlier by setting, for k = +1 (check!),

\[ \frac{R}{R_m} = \sin^2\frac{\eta}{2} \;\Rightarrow\; \eta = 2\sin^{-1}\left(\frac{R}{R_m}\right)^{1/2} \]

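Case B: k = 0 and Λ ≠ 0

These are flat space universes that began from a condensed state (R = 0). The solution of

\[ t = \int_0^{R(t)} \frac{dR}{\sqrt{\dfrac{8\pi G}{3}\dfrac{\rho_0 R_0^3}{R} + \dfrac{\Lambda c^2}{3}R^2}} \]

is

\[
t = \begin{cases}
\dfrac{2}{\sqrt{3\Lambda c^2}}\sinh^{-1}\left[\left(\dfrac{R}{R_0}\right)^{3/2}\left(\dfrac{\Lambda c^2}{8\pi\rho_0 G}\right)^{1/2}\right] & \Lambda > 0 \\[3ex]
\dfrac{1}{\sqrt{6\pi G\rho_0 R_0^3}}\,R^{3/2} & \Lambda = 0 \\[3ex]
\dfrac{2}{\sqrt{3|\Lambda|c^2}}\sin^{-1}\left[\left(\dfrac{R}{R_0}\right)^{3/2}\left(\dfrac{|\Lambda|c^2}{8\pi\rho_0 G}\right)^{1/2}\right] & \Lambda < 0
\end{cases}
\]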


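which in this case can be written explicitly as

\[
R(t) = \begin{cases}
R(t_0)\left(\dfrac{8\pi\rho_0 G}{\Lambda c^2}\right)^{1/3}\sinh^{2/3}\left(\dfrac{1}{2}t\sqrt{3\Lambda c^2}\right) & \Lambda > 0 \\[3ex]
R(t_0)\,(6\pi G\rho_0)^{1/3}\,t^{2/3} & \Lambda = 0 \\[3ex]
R(t_0)\left(\dfrac{8\pi\rho_0 G}{|\Lambda| c^2}\right)^{1/3}\sin^{2/3}\left(\dfrac{1}{2}t\sqrt{3|\Lambda| c^2}\right) & \Lambda < 0
\end{cases}
\]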

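We now re-introduce the “dark energy” term due to the cosmological constant Λ, given by

\[ \Omega_\Lambda = \frac{\Lambda c^2}{3H_0^2} \]

Cold dark matter (CDM) behaves, gravitationally speaking, just like ordinary matter, and there may be 10 times more dark matter than luminous matter in the universe. Its nature is, as yet, unknown. We can write the matter contribution to Ω as due to ordinary baryonic matter (b) and to CDM:

\[ \Omega_m = \Omega_b + \Omega_{cdm} \]

and we can re-write our solutions in terms of $\Omega_\Lambda$ and $\Omega_m$, using $\sqrt{3\Lambda c^2} = 3H_0\sqrt{\Omega_\Lambda}$. Thus

\[
R(t) = \begin{cases}
R(t_0)\left(\dfrac{\Omega_m(t_0)}{\Omega_\Lambda}\right)^{1/3}\sinh^{2/3}\left(\dfrac{3}{2}H_0\sqrt{\Omega_\Lambda}\;t\right) & \Lambda > 0 \\[3ex]
R(t_0)\,(6\pi G\rho_0)^{1/3}\,t^{2/3} & \Lambda = 0 \\[3ex]
R(t_0)\left(\dfrac{\Omega_m(t_0)}{|\Omega_\Lambda|}\right)^{1/3}\sin^{2/3}\left(\dfrac{3}{2}H_0\sqrt{|\Omega_\Lambda|}\;t\right) & \Lambda < 0
\end{cases}
\]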

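Let’s look for a relationship between the deceleration parameter $q_0$ and $\Omega_0$, as we did earlier for Λ = 0 models. Consider Friedmann’s equations at the current epoch:

\[ \frac{3R_t(t_0)^2}{c^2R_0^2} + \frac{3k}{R_0^2} - \Lambda = \frac{8\pi G}{c^2}\rho_0 \]

\[ \frac{2R_{tt}(t_0)}{c^2R_0} + \frac{R_t(t_0)^2}{c^2R_0^2} + \frac{k}{R_0^2} - \Lambda = 0 \]

We can express $R_t(t_0)$ and $R_{tt}(t_0)$ in terms of Hubble’s constant $H_0$ and the deceleration parameter $q_0$. So these equations combine to give

\[ c^2\Lambda = 4\pi G\rho_0 - 3q_0H_0^2 \]

or, since $\Omega_\Lambda = \dfrac{\Lambda c^2}{3H_0^2}$ and $\Omega_0 = \dfrac{8\pi G\rho_0}{3H_0^2}$:

\[ \Omega_0 = 2q_0 + 2\Omega_\Lambda \]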


Now there is no unique relationship between $q_0$ and $\Omega_0$, because we have an additional parameter $\Omega_\Lambda$. However, since ρ, $q_0$ and $H_0$ are all measurable quantities, Λ and $\Omega_\Lambda$ can be calculated using the expressions just derived.

Note that it is possible to have $q_0 < 0$, that is, we can have an accelerating expansion if Λ > 0. This is because the Λ term introduces a force of cosmic repulsion.

From Friedmann’s equations at the present epoch, one can also see that in a flat universe (k = 0), we have

\[ \Omega_0 + \Omega_\Lambda = 1 \]
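For a flat universe with Λ > 0, setting R = R(t0) in the Case B solution and using Ω0 + ΩΛ = 1 gives the present age as t0 = 2/(3 H0 √ΩΛ) · sinh⁻¹(√(ΩΛ/Ω0)). The sketch below evaluates this and the relation q0 = Ω0/2 − ΩΛ; the function names, the conversion constant and the sample values (H0 = 70 km/s/Mpc, ΩΛ = 0.7) are assumptions for illustration.

```python
import math

HUBBLE_TIME_GYR_AT_H100 = 9.778   # 1/H0 in Gyr for H0 = 100 km/s/Mpc (assumed value)

def flat_lambda_age_gyr(H0_km_s_mpc, omega_lambda):
    """Age of a flat, matter plus Lambda (> 0) universe, in Gyr."""
    omega_m = 1.0 - omega_lambda                      # flatness condition
    t_hubble = HUBBLE_TIME_GYR_AT_H100 * 100.0 / H0_km_s_mpc
    x = math.sqrt(omega_lambda / omega_m)
    return 2.0 / (3.0 * math.sqrt(omega_lambda)) * math.asinh(x) * t_hubble

def deceleration(omega_0, omega_lambda):
    """q0 from Omega_0 = 2 q0 + 2 Omega_Lambda."""
    return 0.5 * omega_0 - omega_lambda

if __name__ == "__main__":
    print(flat_lambda_age_gyr(70.0, 0.7))   # roughly 13.5 Gyr
    print(deceleration(0.3, 0.7))           # negative: accelerating expansion
```

As ΩΛ tends to zero the age tends to (2/3) t_H, the Einstein-De Sitter value, which is a useful consistency check on the formula.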

Solutions of Friedmann’s equations for a radiation dominated universe

The universe contains radiation in the form of photons. The expansion of the universe causes the energy density $\epsilon_r(t)$ of these photons to decrease. The number density of photons decreases as $R^{-3}(t)$, since the volume expands as $R^3(t)$, and the energy of each photon decreases as $R^{-1}(t)$ because of the cosmological redshift. Therefore,

\[ \epsilon_r(t) = \frac{\epsilon_r(t_0)R^4(t_0)}{R^4(t)} \]

The equivalent mass density $\rho_r(t)$ of photons is

\[ \rho_r(t) = \frac{\epsilon_r(t)}{c^2} \]

The matter mass density ρ(t), however, decreases as $R^{-3}(t)$. Therefore, as we go back in time, there must have been a time $t_E$ at which the densities of matter and radiation were the same. Thus

\[ \rho(t_E) = \frac{\rho_0 R_0^3}{R^3(t_E)} = \rho_r(t_E) = \frac{\rho_r(t_0)R_0^4}{R^4(t_E)} \]

which gives

\[ \frac{R_0}{R(t_E)} = \frac{\rho_0}{\rho_r(t_0)} \]


For $t < t_E$ we have $\rho_r > \rho$. Thus, the period $0 < t < t_E$ is called the radiation dominated era of the universe. Hence, at these early epochs the dynamics of expansion was determined by radiation energy rather than by matter.

Since

\[ \frac{\rho_r(t)}{\rho(t)} \propto \frac{1}{R(t)} \]

and currently we observe

\[ \frac{\rho_r(t_0)}{\rho(t_0)} \approx \frac{1}{1000} \]

then, since $R(t) = R_0/(z + 1)$, we have

\[ \frac{\rho_r(t)}{\rho(t)} \approx 1 \quad \text{at} \quad z + 1 \approx 1000 \]

Matter and radiation are closely coupled in the radiation dominated era (through scattering of photons by electrons and ions). After z ≈ 1000 the photons decouple from matter.

Now, since

\[ \rho_r(t)c^2 = aT^4 \]

where T is the temperature and a is the radiation constant, we get

\[ T \propto \frac{1}{R(t)} \]

So, we can see that as R(t) → 0, then T → ∞ (Big Bang fireball).

Prior to decoupling (z ≈ 1000) matter and radiation were in thermodynamic equilibrium

\[ T_m = T_r \]

The universe was opaque to radiation due to the presence of free electrons which scatter photons. For physical reasons, we expect decoupling to occur when the universe has expanded so as to cool down to a temperature of ∼ 3000 K, when hydrogen recombines and becomes neutral ($p^+ + e^- \to \mathrm{H}$). Free electrons are then lost from the gas and the universe becomes effectively transparent to radiation.

If cooling continues as a blackbody ($T(t) \propto 1/R(t)$) and since $R(t) = R_0/(z + 1)$, we expect a current background radiation field at a temperature of

\[ \frac{T(t_0)}{3000\ \mathrm{K}} = \frac{1}{1 + z_{decoup}} = \frac{1}{1000} \;\Rightarrow\; T \approx 3\ \mathrm{K} \]

which is the cosmic microwave background radiation “accidentally” discovered by Penzias & Wilson in 1964!


GR equations in the radiation dominated universe

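The equations that we need to solve in the radiation dominated universe are

\[ R_{tt}(t) = -\frac{8\pi G}{3}\rho_r(t)R(t) \]

\[ R_t^2(t) = \frac{8\pi G}{3}\rho_r(t)R^2(t) - kc^2 \]

with

\[ \rho_r(t)R^4(t) = \rho_r(t_0)R^4(t_0) \]

(the first equation follows from the second by differentiating with respect to t and using $\rho_r R^4 = \text{constant}$). The solution of these equations shows that near t = 0 (the big bang) we have

\[ R(t) \propto \sqrt{t} \]

in a radiation dominated universe, regardless of the curvature k (check!). Note that

\[ H(t) = \frac{\dot{R}(t)}{R(t)}, \qquad q(t) = -\frac{\ddot{R}(t)R(t)}{\dot{R}^2(t)} \]

still apply in a radiation dominated universe, with q(t) measuring deceleration as before.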
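The claim that R(t) ∝ √t near t = 0 regardless of k can be checked by integrating the first-order equation numerically. The sketch below is illustrative only: it assumes dimensionless units in which (8πG/3) ρr R⁴ = C, so that dR/dt = √(C/R² − kc²), starts from the flat-space value at a small `t_start`, and uses a classical fourth-order Runge-Kutta step. All names and default values are hypothetical.

```python
import math

def scale_factor(t_end, C=1.0, k_c2=0.0, t_start=1e-6, steps=20000):
    """Integrate dR/dt = sqrt(C / R^2 - k c^2) from t_start to t_end (RK4)."""
    def rate(R):
        return math.sqrt(C / R**2 - k_c2)

    # Start on the flat-space solution R = (2 sqrt(C) t)^(1/2), valid for small t.
    R = math.sqrt(2.0 * math.sqrt(C) * t_start)
    h = (t_end - t_start) / steps
    for _ in range(steps):
        k1 = rate(R)
        k2 = rate(R + 0.5 * h * k1)
        k3 = rate(R + 0.5 * h * k2)
        k4 = rate(R + h * k3)
        R += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return R

def log_slope(t1, t2, k_c2):
    """d ln R / d ln t estimated between two early times."""
    return math.log(scale_factor(t2, k_c2=k_c2) / scale_factor(t1, k_c2=k_c2)) / math.log(t2 / t1)

if __name__ == "__main__":
    for k_c2 in (-1.0, 0.0, 1.0):
        print(k_c2, log_slope(1e-4, 1e-3, k_c2))   # each close to 0.5
```

At early times the curvature term kc² is negligible next to C/R², which is why all three slopes come out near 1/2.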

The Steady-State Universe

Bondi, Hoyle and Gold Universe (1948)

This model of the universe was presented in 1948 as an alternative to the Big Bang theory.

Even without a cosmological constant, it is possible to have a universe that is not static, but looks the same at all epochs.

Assume k = 0 and p and ρ constant. From equations (70) and (71), by setting $\dfrac{\dot{R}}{R} = H$ (now a constant!), we get

\[ \frac{\dot{R}}{R} = \text{constant}, \]

R

R2= 3H2 (constant)

whose solution is

\[ R = \text{const}\; e^{Ht} \]

This universe has an exponentially increasing scale factor, has q = −1 and a negative pressure. In its original form, it required the creation of matter at the expense of the negative pressure. According to this model, our universe is constantly expanding with matter being created spontaneously at a constant rate (one hydrogen atom per cubic metre per billion years) to maintain a constant average density ρ.

This model would thus satisfy the perfect cosmological principle since it would have no beginning and no end. The steady-state model was considered to be an acceptable alternative to the Big Bang model until the discovery of the cosmic microwave background.

De Sitter Universe (1917)

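Assume k = 0, $\rho_{m,0} = 0$ and $\rho_{r,0} = 0$. This universe does not have ordinary matter or radiation content but only a positive cosmological constant Λ.

For this universe,

\[ R = R_0 \exp\left[\sqrt{\frac{8\pi G\rho_\Lambda}{3}}\,(t - t_0)\right] \]

\[ H(t) = \frac{1}{R}\frac{dR}{dt} = \sqrt{\frac{8\pi G\rho_\Lambda}{3}} = H_0 \]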

In this model, the universe expands at a constant rate for ever and has q = −1 and an equation of state given by p = −ρ. Such a universe has no beginning and no end and satisfies the perfect cosmological principle. Thus, it has similarities to the steady-state model of Hoyle, Bondi and Gold. This steady-state exponentially expanding universe is nowadays used as a backdrop to the initial inflation phase of the Big Bang model.

Inflation consisted of a period of accelerated expansion (the distance between two fixed observers increases exponentially with Λ staying nearly constant). Inflation was introduced to smooth out inhomogeneities, anisotropies and the curvature of space.